Optional - Skip this step

We have already processed all of this data, and the resulting JSON documents are available through our FTP server for users who wish to skip this section.

Download the JSON from here:

Then follow the instructions here: Load Data

The first step in creating a CellBase instance is to download the data files. This can be done through the CellBase CLI.

cellbase/build/bin$ ./ download --data genome,gene

The --data argument is required and takes a comma-separated list of data types to download. See below for the full list.
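Because --data takes a single comma-separated value, a list of data types kept in a script can be joined before it is passed to the CLI. The following is a minimal sketch in POSIX shell, not part of CellBase: `join_data_types` and `data_arg` are illustrative names, and the snippet only builds and prints the argument rather than running the download.

```shell
# Hypothetical helper (not part of the CellBase CLI): joins its arguments
# with commas. The body runs in a subshell so the IFS change stays local.
join_data_types() (
    IFS=,
    # "$*" joins the positional parameters using the first character of IFS.
    printf '%s\n' "$*"
)

data_arg=$(join_data_types genome gene)
echo "--data ${data_arg}"   # prints: --data genome,gene
```

The printed value can then be supplied to the download command in place of a hand-typed list.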

Type | Data sources
  • Ensembl
  • Ensembl
  • DGIdb
  • UniProt gene mappings
  • Gene Expression Atlas
  • HPO gene annotation
  • gnomAD
variation **
  • 1000 genomes
  • ExAC
  • GoNL
  • UK10K
  • ESP
  • CADD
  • Ensembl
  • UniProt
  • InterPro
  • PolyPhen/SIFT
conservation **
  • PhastCons
  • PhyloP
  • GERP++
clinical_variants **
  • ClinVar
  • HPO
  • DisGeNET
  • UCSC
  • DGV
all ** | Downloads all of the above

See Download Sources for details on versions and available organisms.

** Please note that many files are very large and can take several hours to download.

For example, to download all human (GRCh37) data from all sources and save it into the `/tmp/data/cellbase/v4/` directory, run:

cellbase/build/bin$ ./ download -a GRCh37 --common /tmp/data/cellbase/v4/common/ -d all -o /tmp/data/cellbase/v4/ -s
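For repeated runs, the options from this example can be kept in variables and the resulting command printed for review before it is executed. This is only a sketch: the executable name and the value for -s are not given in the text above, so they are read from the hypothetical environment variables `CELLBASE_CLI` and `CELLBASE_SPECIES` rather than guessed, and `build_download_cmd` is an illustrative helper, not a CellBase command.

```shell
# Hypothetical helper: prints a download command without running it.
# Only flags that appear in the example above are used.
build_download_cmd() {
    # $1 = assembly, $2 = output directory (no trailing slash), $3 = data types
    # ${VAR:?msg} aborts with an error message if the variable is unset or empty.
    printf '%s download -a %s --common %s/common/ -d %s -o %s/ -s %s\n' \
        "${CELLBASE_CLI:?set CELLBASE_CLI to the CLI executable}" \
        "$1" "$2" "$3" "$2" \
        "${CELLBASE_SPECIES:?set CELLBASE_SPECIES to the species value}"
}
```

With both variables set, `build_download_cmd GRCh37 /tmp/data/cellbase/v4 all` prints the assembled command line for inspection.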

If the download was successful, you can proceed to building the JSON objects to be loaded into the corresponding database: Building the CellBase database
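One simple way to sanity-check the download before moving on is to confirm that the output directory actually contains files. The sketch below assumes the output directory from the example above; `has_downloaded_files` is a hypothetical helper, and the check only detects an empty or missing directory. It does not verify that every source was downloaded completely.

```shell
# Hypothetical helper: succeeds if the directory exists and contains
# at least one regular file anywhere beneath it.
has_downloaded_files() {
    # $1 = directory that was passed to the download step as the output location
    [ -d "$1" ] && [ -n "$(find "$1" -type f | head -n 1)" ]
}

out_dir="/tmp/data/cellbase/v4/"   # output directory used in the example above
if has_downloaded_files "$out_dir"; then
    echo "Found downloaded files in $out_dir"
else
    echo "No files found in $out_dir; check the download step" >&2
fi
```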
