Optional - Skip this step
We have already processed all of this data, and the resulting JSON documents are available from our FTP server for users who wish to skip this section.
Download the JSON files from:
http://bioinfo.hpc.cam.ac.uk/downloads/cellbase/v4/homo_sapiens_grch37/mongodb/
http://bioinfo.hpc.cam.ac.uk/downloads/cellbase/v4/homo_sapiens_grch38/mongodb/
Then follow the instructions in: Load Data
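For readers taking this shortcut, fetching the files could look like the minimal sketch below. It assumes `wget` is installed and that the server exposes each directory as a browsable HTTP listing, which this page does not confirm. The helper names `cellbase_json_url` and `fetch_cellbase_json` and the local target directory are hypothetical and not part of CellBase.

```shell
# Hypothetical helper: map an assembly name (GRCh37 or GRCh38) to the
# download URL listed above. The name is lower-cased to match the URL.
cellbase_json_url() {
    assembly=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$assembly" in
        grch37|grch38)
            printf 'http://bioinfo.hpc.cam.ac.uk/downloads/cellbase/v4/homo_sapiens_%s/mongodb/\n' "$assembly"
            ;;
        *)
            echo "unknown assembly: $1" >&2
            return 1
            ;;
    esac
}

# Hypothetical helper: mirror the remote directory into a local folder.
# -r   recurse into the directory listing
# -np  never ascend to the parent directory
# -nH  do not create a local directory named after the host
# -P   local directory to save files under
fetch_cellbase_json() {
    url=$(cellbase_json_url "$1") || return 1
    wget -r -np -nH -P "$2" "$url"
}

# Example (requires network access), left commented out on purpose:
# fetch_cellbase_json GRCh37 /tmp/data/cellbase/v4/json
```

Keeping the URL construction separate from the `wget` call makes it possible to check the address before starting a large transfer.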
The first step in creating a CellBase instance is to download the data files, which can be done through the CellBase CLI.
cellbase/build/bin$ ./cellbase.sh download --data genome,gene
The --data argument is required and takes a comma-separated list of data types to download. See below for the full list.
Type | Data sources |
---|---|
genome | |
gene | |
variation ** | |
variation_functional_score | |
regulation | |
protein | |
conservation ** | |
clinical_variants ** | |
repeats | |
svs | |
all ** | Downloads all of the above |
See Download Sources for details on versions and available organisms.
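Because a mistyped data type is easy to miss in a long list, it can help to check the value before passing it to `--data`. The sketch below is one way to do that in plain shell. The type names come from the table above; the variable `CELLBASE_DATA_TYPES` and the function `validate_data_types` are hypothetical helpers, not part of the CellBase CLI, and the CLI may apply its own, different validation.

```shell
# Data types copied from the table above, separated by spaces.
CELLBASE_DATA_TYPES="genome gene variation variation_functional_score regulation protein conservation clinical_variants repeats svs all"

# Hypothetical helper: succeed only if every entry of a comma-separated
# list is one of the known data types.
validate_data_types() {
    # Reject an empty list, a leading or trailing comma, and empty entries.
    case "$1" in
        ""|,*|*,|*,,*)
            echo "malformed --data list: '$1'" >&2
            return 1
            ;;
    esac
    # Split on commas and look each entry up in the space-padded list,
    # so that only whole names match.
    for item in $(printf '%s' "$1" | tr ',' ' '); do
        case " $CELLBASE_DATA_TYPES " in
            *" $item "*) ;;
            *)
                echo "unknown data type: $item" >&2
                return 1
                ;;
        esac
    done
    return 0
}
```

A possible use is `validate_data_types "genome,gene" && ./cellbase.sh download --data genome,gene`, so the download only starts when the list is well formed.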
For example, to download all human (GRCh37) data from all sources and save it into the `/tmp/data/cellbase/v4/` directory, run:
cellbase/build/bin$ ./cellbase-admin.sh download -a GRCh37 --common /tmp/data/cellbase/v4/common/ -d all -o /tmp/data/cellbase/v4/ -s hsapiens
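When the same download has to be repeated with a different assembly or output location, it may be convenient to generate the command from parameters and review it before running it. The sketch below only prints the command; it reuses the flags exactly as they appear in the example above. The function name `print_download_cmd` is hypothetical, and whether other assembly values are accepted by `-a` is an assumption to check against Download Sources.

```shell
# Hypothetical helper: print the download command for one assembly
# without running it, so it can be reviewed first.
#   $1 = assembly (e.g. GRCh37)
#   $2 = base output directory
print_download_cmd() {
    base=${2%/}   # drop one trailing slash, if present
    printf './cellbase-admin.sh download -a %s --common %s/common/ -d all -o %s/ -s hsapiens\n' \
        "$1" "$base" "$base"
}

# Reproduces the GRCh37 command shown above.
print_download_cmd GRCh37 /tmp/data/cellbase/v4/
```

The printed line can be pasted into a shell started in `cellbase/build/bin` once it looks right.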
If the download was successful, you can proceed to build the JSON objects that will be loaded into the corresponding database: Building the CellBase database