Installing and configuring CellBase consists of different steps, as you will see in this page you must first make sure that the server(s) have all dependencies installed, then you can configure and complete the CellBase build.
You do not need to install CellBase to run queries. See Using CellBase for more information on how to use CellBase |
Building a CellBase instance has three stages:
Stage | Description |
---|---|
Download ** | Downloads the data files for the specified data sets |
Build ** | Parses the downloaded data files, generates JSON objects, e.g. gene.json |
Load | Loads the generated JSON objects into the Mongo database |
This document will show you how to create a CellBase instance. First, you will download a set of raw files from several data sources. These raw files shall contain the core data that will populate the Cellbase knowledgebase. Then, you will build the JSON documents that should be loaded into the Cellbase knowledgebase. These three stages are described in detail below.
** We have already downloaded and processed these data, and the resulting JSON documents are available through our FTP server. For those users who wish to skip these two sections, directly download json documents from http://bioinfo.hpc.cam.ac.uk/downloads/cellbase/v4/homo_sapiens_grch37/mongodb/ and jump to the [[Load Data Models]] section.
Before you can start building CellBase, you must first install all required software dependencies.
Which sort of hardware you need depends on how much data you need, query load, etc. A full CellBase instance is 1 TB of data, but loading only genomic data is XXX GB. Also loading and querying data is very resource intensive, we recommend at least XXX GB of RAM.
Below are the software dependencies required by CellBase.
Software | Version | Purpose |
---|---|---|
Java | 8 | |
MongoDB | 3.6 | Database |
Tomcat | 8.5x | |
Docker | 18 | Building Ensembl |
There are two main ways to get CellBase for installation:
Here you can learn more about these two options.
Although most users will use stable prebuilt binaries (see below) there is still the need for different users to compile and build CellBase, for instance to test a development version. You can learn how to build from the source code at Installation Guide > Building from Source Code.
You can download stable and pre-release (beta and release candidate) versions from CellBase GitHub Releases. You will find a tar.gz file with the name of cellbase and the version, for instance to download CellBase 4.7.1 you can go to the GitHub Release at:
https://github.com/opencb/cellbase/releases/tag/v4.7.1
and download the file cellbase-4.7.1.tar.gz from the Downloads section.
Run this command to download all the data:
$ ./build/bin/cellbase-admin.sh download -d gene -s hsapiens |
See Download Sources for the details on all the data that's available to download.
Run this command to download all the data:
$ ./build/bin/cellbase-admin.sh build -d gene -s hsapiens |
See Building the CellBase database for the details on how to build.
Run this command to download all the data:
$ ./build/bin/cellbase-admin.sh load -d gene -s hsapiens |
See Load Data for the details on all the data that's available to download.
Make sure Tomcat is running, then copy the WAR file the build has created for you into Tomcat's webapps directory:
cp ./build/cellbase-5.0.0-SNAPSHOT.war ~/apache-tomcat-8.5.47/webapps/
You should see the generated API docs here:
http://localhost:8080/cellbase-5.0.0-SNAPSHOT/
For how to configure tomcat see <HERE>.
See Using CellBase for information how to run queries.
Table of Contents: