Installation
Bonsai consists of multiple services, such as a frontend, an API, and tools for various clustering methods, that need to be configured during installation. It’s therefore recommended to install Bonsai and its services using the pre-built Docker containers stored on Docker Hub. This requires that you have installed Docker and preferably docker-compose. The pre-built images currently support the x86-64 architecture.
Installing and setting up Bonsai with docker-compose involves the following steps.
Start Bonsai with
docker-compose up -d
Please note that your docker-compose.yml file might differ from the minimal example in the documentation depending on your network and server environment. You can configure the different services using environment variables (defined in the docker-compose file). See advanced container configuration for the available options. In rare instances you might need to edit the related config files (frontend config and api config) and mount them into the container using volume mounts.
Some containers require access to directories or files on the host file system for all features to function or for data to persist across updates to container images. These can be made available using Docker volumes. For more information see the sections on data persistence, setup IGV, and the documentation of volume mounts in the advanced container configuration.
Setup Bonsai with docker-compose
Use docker-compose to create the Bonsai containers and configure their access to a Mongo database. Some containers must be configured using a combination of environment variables and volume mounts, either to function properly or for data to be persistent. See container configuration for more information on how to configure Docker containers.
services:
  mongodb:
    image: mongo:latest
    networks:
      - bonsai-net
  redis:
    image: redis:7.0.10
    networks:
      - bonsai-net
  frontend:
    image: clinicalgenomicslund/bonsai-app:0.8.0
    depends_on:
      - mongodb
      - api
    ports:
      - "8000:8000"
    environment:
      - TZ=Europe/Stockholm
    networks:
      - bonsai-net
  api:
    image: clinicalgenomicslund/bonsai-api:0.8.0
    depends_on:
      - mongodb
      - minhash_service
      - allele_cluster_service
    ports:
      - "8001:8000"
    networks:
      - bonsai-net
  minhash_service:
    image: clinicalgenomicslund/bonsai-minhash-clustering:0.3.0
    depends_on:
      - redis
    volumes:
      - "./volumes/api/genome_signatures:/data/signature_db"
    networks:
      - bonsai-net
  allele_cluster_service:
    image: clinicalgenomicslund/bonsai-allele-clustering:0.3.0
    depends_on:
      - redis
    networks:
      - bonsai-net
networks:
  bonsai-net:
    driver: bridge
Start the services with docker-compose up -d
Create database indexes
The database must be indexed for Bonsai to work correctly. The indexes support and speed up common queries and enforce restrictions on the data. For instance, the indexes prevent duplicate sample IDs and sample group IDs. The indexes are created using the Bonsai API command line interface. Note that if you are running the containerized version of Bonsai, you must execute the commands in the container.
docker-compose exec api bonsai_api index
Create an admin user
You can create an admin user either manually using the CLI or automatically on startup by configuring environment variables.
Manual Creation (CLI)
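Create an admin user with the CLI. The admin has full permission to view, create, modify and delete data, and the account can be used to log in, upload samples, and create additional users. Replace the placeholder values in the example, in particular the password, with your own.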
docker-compose exec api bonsai_api create-user -u admin \
-p admin \
--fname Place \
--lname Holder \
-m place.holder@mail.com \
-r admin
Automatic Creation on Startup
Alternatively, Bonsai can automatically create an admin user when the API starts up if no users exist in the database. Configure the following environment variables in your docker-compose.yml or environment:
BONSAI_ADMIN_USER: Username for the admin user
BONSAI_ADMIN_PASSWORD: Password for the admin user
BONSAI_ADMIN_MAIL: Email for the admin user (optional, defaults to username@example.com)
Example docker-compose environment configuration:
api:
  environment:
    - BONSAI_ADMIN_USER=admin
    - BONSAI_ADMIN_PASSWORD=securepassword
    - BONSAI_ADMIN_MAIL=admin@example.com
This feature is particularly useful for containerized deployments and automated setups.
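The way these variables combine can be summarised in a short Python sketch. It is not Bonsai's actual startup code: the function name resolve_admin_settings and the returned dictionary are hypothetical, and the sketch assumes that both the username and the password must be set before an admin user is created. Only the variable names and the email default are taken from the description above.

```python
import os
from typing import Mapping, Optional


def resolve_admin_settings(env: Mapping[str, str]) -> Optional[dict]:
    """Return the admin account settings, or None if they are incomplete.

    Hypothetical helper: it mirrors the documented variables and the
    documented email default, not Bonsai's real implementation.
    """
    username = env.get("BONSAI_ADMIN_USER")
    password = env.get("BONSAI_ADMIN_PASSWORD")
    if not username or not password:
        # Assumption: without both values no admin user is created.
        return None
    # Documented default: username@example.com when no mail is given.
    mail = env.get("BONSAI_ADMIN_MAIL") or f"{username}@example.com"
    return {"username": username, "password": password, "mail": mail}


if __name__ == "__main__":
    settings = resolve_admin_settings(os.environ)
    print("admin configured" if settings else "admin not configured")
```

Running the script in the same environment as the API container gives a quick indication of whether the variables are complete, without printing the password.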
Additional users can be created in the WebUI in the admin panel (http://your-ip/admin/users) or by using the CLI as above. For more information see create users.
Setup IGV integration
Bonsai uses IGV to visualise the read depth for called SNVs and structural variants (SVs). This can help you judge whether a called variant is a true or false positive. IGV uses the reference genome sequences with annotated genes, the mapped reads in BAM or CRAM format, and optionally called variants and regions of interest. These files are either used as assets by Jasen or generated for the sample and published in the pipeline output directory.
These files are served by the API and therefore need to be accessible to the container at the paths specified by the environment variables REFERENCE_GENOMES_DIR and ANNOTATIONS_DIR, and at the path where Jasen publishes its results.
Note
IGV needs access to fasta indexes and bam indexes in order to function well.
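A quick way to spot missing indexes before mounting the directories is to scan them with a small script. The sketch below is only an illustration and not part of Bonsai or Jasen. It assumes the common samtools naming conventions (ref.fasta.fai, sample.bam.bai or sample.bai, sample.cram.crai), and the function name find_missing_indexes is hypothetical.

```python
from pathlib import Path
from typing import List

# Index suffix expected next to each kind of data file (samtools conventions).
INDEX_SUFFIX = {
    ".fasta": ".fai",
    ".fa": ".fai",
    ".fna": ".fai",
    ".bam": ".bai",
    ".cram": ".crai",
}


def find_missing_indexes(directory: str) -> List[Path]:
    """Return data files under `directory` that have no index file beside them."""
    missing = []
    for path in sorted(Path(directory).rglob("*")):
        suffix = INDEX_SUFFIX.get(path.suffix.lower())
        if suffix is None or not path.is_file():
            continue
        # The index name is normally the data file name plus the index suffix.
        candidates = [path.with_name(path.name + suffix)]
        if path.suffix.lower() == ".bam":
            # BAM indexes are also commonly named sample.bai.
            candidates.append(path.with_suffix(".bai"))
        if not any(candidate.exists() for candidate in candidates):
            missing.append(path)
    return missing
```

Calling find_missing_indexes on the host directories you plan to mount returns the FASTA, BAM and CRAM files that still need an index.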
Reference genomes
These should be the same as the reference genomes used by Jasen. You can use the Makefile from Jasen to download the genomes, their indexes, and the tbprofiler database. Alternatively, you can copy existing files from your Jasen installation to the directories you mount to the API container.
Reference genomes and the corresponding GFF files should be copied to the directory you mount at the path in REFERENCE_GENOMES_DIR. The BED files describing regions of interest should be copied to the directory you mount at the ANNOTATIONS_DIR path.
BAM and VCF files
The Bonsai API needs access to the directory where Jasen publishes its results because the BAM and VCF files are not uploaded to the API. The result directory can be mounted using Docker volumes if it is accessible from the host machine. The expected paths can be found in the analysis result JSON file under the field names read_mapping and genome_annotation.
Accessing the web interface
To access the web interface, open http://localhost:8000 in your web browser.
If this doesn’t work, run docker container ls and make sure that a frontend container is running. Then check that there are no errors in the frontend and api container logs.
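As an additional check, you can test whether the published ports accept connections at all. The following Python sketch uses only the standard library. The helper name port_is_open is hypothetical, and the port numbers are the host ports from the minimal docker-compose example (8000 for the frontend, 8001 for the API), so adjust them if your file differs.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host ports published in the minimal docker-compose example.
    for name, port in {"frontend": 8000, "api": 8001}.items():
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A reachable port only shows that something is listening. The container logs remain the place to look for application errors.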
Upload samples to Bonsai
Use the upload_sample.py script to add an analysis result and its genome signature file to the database.
./scripts/upload_sample.py \
--api localhost:8001 \
--group <optional: group_id of group to associate sample with> \
-u <username> \
-p <password> \
--input /path/to/input.json
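To upload several results in one go, the command can be wrapped in a small script. The sketch below is a hypothetical helper, not something shipped with Bonsai. It only uses the flags shown in the command above and assumes that every *.json file in the given directory is an analysis result to upload.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_upload_command(
    input_file: Path,
    api: str,
    username: str,
    password: str,
    group: Optional[str] = None,
    script: str = "./scripts/upload_sample.py",
) -> List[str]:
    """Build the argument list for one upload_sample.py call."""
    cmd = [script, "--api", api, "-u", username, "-p", password]
    if group:
        # The group is optional, so the flag is only added when it is set.
        cmd += ["--group", group]
    cmd += ["--input", str(input_file)]
    return cmd


def upload_directory(result_dir: str, **kwargs) -> None:
    """Upload every *.json file in result_dir, stopping at the first failure."""
    for input_file in sorted(Path(result_dir).glob("*.json")):
        subprocess.run(build_upload_command(input_file, **kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with unusual file names. Because check=True is set, the loop stops as soon as one upload fails.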
Data persistence
By default, data does not persist between Docker container updates because all data is kept in the container. To make the data persistent, you have to mount the Mongo database and the API genome signature database to the host OS. The volume mounts can be configured in the docker-compose.yaml file. If you mount the databases to the host OS, you have to ensure that they have the correct permissions so that the container has read and write access to these files.
Use the following command to get the user and group id of the user in the container.
$ docker-compose run --rm mongodb id
# uid=1000(worker) gid=1000(worker) groups=1000(worker)
Use chown -R 1000:1000 /path/to/volume_dir to change the ownership of the folders you mount to the container.
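To verify the result, you can list anything in a volume directory that is still owned by another user. This Python sketch is an illustration only. The function name find_wrong_owner is hypothetical, and the default IDs of 1000 are taken from the example output above, so use the values reported by your own container.

```python
import os
from typing import List


def find_wrong_owner(root: str, uid: int = 1000, gid: int = 1000) -> List[str]:
    """Return paths under root (root included) that are not owned by uid:gid."""
    wrong = []
    # os.walk visits every directory as dirpath, so checking dirpath and
    # its files covers the whole tree.
    for dirpath, _dirnames, filenames in os.walk(root):
        paths = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in paths:
            info = os.lstat(path)
            if info.st_uid != uid or info.st_gid != gid:
                wrong.append(path)
    return wrong
```

An empty list means that every file and directory below the volume path has the expected owner and group.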
The following is an example volume mount configuration. See the docker-compose documentation for more information on volume mounts.
services:
  mongodb:
    volumes:
      - "./volumes/mongodb:/data/db"
  api:
    volumes:
      - "./volumes/api/genome_signatures:/data/signature_db"