Backend for new archive site
Backend for the new archive site, including ETLs from the BB Metadata DB to Elasticsearch.
The archive-backend is meant to be run from the command line.
Type archive-backend <command> -h
to see how to use each command.
archive-backend server
Execute the backend api server for the new archive site.
archive-backend version
Print the version of archive-backend.
The default config file is config.toml
in your current working directory.
See config.sample.toml
for a sample config file.
Release and Deployment
Once development is done and all tests are green, we want to go live.
All we have to do is simply execute misc/
To add a pre-release tag, add the relevant environment variable. For example,
PRE_RELEASE=rc.1 misc/
MDB models
When the MDB schema changes, we need to update the mdb
package. Run this script:
(See the section "Elasticsearch installation for Windows" below for instructions on installing Elasticsearch on Windows.)
- Hebrew plugin:
- Phonetic plugin, used instead of the standard analyzer for exact match (so that הריון is treated the same as היריון):
sudo bin/elasticsearch-plugin install analysis-phonetic
WIP - does not work yet.
- ICU plugin to transliterate Russian (and others) to enable phonetic on them:
sudo bin/elasticsearch-plugin install analysis-icu
- Ukrainian analyzer (fails with the standard analyzer; not started)
Build index
Two more dependencies are required to build the index:
- OpenOffice (soffice binary) - to convert all doc files to docx.
- python-docx Python library - to get text from docx files.
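The text-extraction step can be sketched roughly as follows. This is a minimal illustration, not the code archive-backend actually runs: the function names `join_paragraphs` and `extract_text` are hypothetical, and only body paragraphs are read (text inside tables, headers, and footers is not included). It is written to run on both Python 2.7 and Python 3.

```python
# Sketch: pull plain text out of a .docx file with python-docx.
# join_paragraphs and extract_text are hypothetical names used for
# illustration; they are not part of archive-backend.


def join_paragraphs(paragraphs):
    """Join the .text of each paragraph, skipping empty or whitespace-only ones."""
    lines = [p.text.strip() for p in paragraphs]
    return "\n".join(line for line in lines if line)


def extract_text(path):
    """Return the body-paragraph text of the .docx file at `path`."""
    # Imported lazily so join_paragraphs works without python-docx installed.
    from docx import Document  # provided by the python-docx package

    return join_paragraphs(Document(path).paragraphs)
```

Calling `extract_text("lesson.docx")` (a made-up file name) would return the document body as a single newline-separated string. Older doc files would first have to be converted to docx with soffice, since python-docx reads only the docx format.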
Elasticsearch installation for Windows
- Download and install the Java Virtual Machine for Windows from

Download and install the Elasticsearch 5.6.0 MSI from
Open CMD as administrator
Go to Elasticsearch bin directory
cd C:\Program Files\Elastic\Elasticsearch\bin
To install analysis-phonetic, type:
elasticsearch-plugin install analysis-phonetic
To install the Hebrew plugin, type:
elasticsearch-plugin install
Answer 'y' to the security question
Continue with installation? [y/N]y
To install the ICU plugin, type:
elasticsearch-plugin install analysis-icu
Download and install Python - version 2.7.x
Install python-docx (to get text from docx):
- In CMD, go to the Python directory
cd C:\Python27
python -m pip install python-docx
Download and install OpenOffice
We need soffice.exe, which is located in C:\Program Files (x86)\OpenOffice 4\program
Set the 'soffice-bin' value in the [elasticsearch] section of config.toml to the location of soffice.exe.
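For illustration, the relevant part of config.toml might look like the fragment below. It assumes that 'soffice-bin' takes the full path to the executable and that OpenOffice 4 was installed to its default location; check config.sample.toml for the authoritative layout.

```toml
[elasticsearch]
# Assumption: full path to the executable, in a TOML literal string
# (single quotes) so the backslashes need no escaping.
soffice-bin = 'C:\Program Files (x86)\OpenOffice 4\program\soffice.exe'
```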