- Created by Nacho Medina, last modified by Joaquín Tárraga Giménez on Jun 26, 2018
Pre-requisites
To follow this guide, BioNetDB must be installed on your system. Please follow the steps in the installation guide to set it up.
Download test data
Download the test data from http://bioinfo.hpc.cam.ac.uk/downloads/bionetdb/bionetdb.dataset.tar.gz and extract the contents of the archive by running:
tar xvfz bionetdb.dataset.tar.gz
The content of the archive is:
/tmp$ tar xvfz bionetdb.dataset.tar.gz
bionetdb.dataset/
bionetdb.dataset/illumina_platinum.export.5k.json
bionetdb.dataset/mirna.csv
bionetdb.dataset/genes.json.gz
bionetdb.dataset/proteins.json.gz
bionetdb.dataset/illumina_platinum.export.5k.json.meta.json
bionetdb.dataset/Homo_sapiens.owl
bionetdb.dataset/10k.clinvar.json.gz
/tmp$ cd bionetdb.dataset/
/tmp/bionetdb.dataset$ ls -ltrh
total 475M
-rw-rw-r-- 1 jtarraga jtarraga  38M Jun 26 13:39 proteins.json.gz
-rw-rw-r-- 1 jtarraga jtarraga  78M Jun 26 13:39 genes.json.gz
-rw-rw-r-- 1 jtarraga jtarraga 1.2M Jun 26 13:39 mirna.csv
-rw-rw-r-- 1 jtarraga jtarraga  53K Jun 26 13:39 illumina_platinum.export.5k.json.meta.json
-rw-rw-r-- 1 jtarraga jtarraga  56M Jun 26 13:39 illumina_platinum.export.5k.json
-rw-rw-r-- 1 jtarraga jtarraga 215M Jun 26 13:39 Homo_sapiens.owl
-rw-rw-r-- 1 jtarraga jtarraga  89M Jun 26 13:39 10k.clinvar.json.gz
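Before importing, it can be useful to confirm that the extraction produced every file shown in the listing. The following is a minimal sketch; `check_dataset` is a hypothetical helper written for this guide, not part of BioNetDB, and the file names are copied from the listing above.

```shell
# Sketch: verify that the extracted dataset directory is complete.
# check_dataset is a hypothetical helper, not a BioNetDB command.
check_dataset() {
    local dir="${1:-bionetdb.dataset}"
    local missing=0
    local f
    # File names taken from the archive listing above.
    for f in \
        illumina_platinum.export.5k.json \
        illumina_platinum.export.5k.json.meta.json \
        mirna.csv \
        genes.json.gz \
        proteins.json.gz \
        Homo_sapiens.owl \
        10k.clinvar.json.gz
    do
        if [ ! -f "$dir/$f" ]; then
            # Report each missing file on stderr and remember the failure.
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_dataset /tmp/bionetdb.dataset` returns a non-zero status and names each missing file if the archive was only partially extracted.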
Import genomic data
Before you can query the BioNetDB database, you have to populate it by importing your data into the Neo4j database. BioNetDB provides a command line interface for this. Importing takes two steps: first you prepare your data, and then you load it into the BioNetDB database:
- Prepare your data, i.e., transform your genomic data files into Neo4j CSV files:
./bionetdb.sh import -i <input-directory> -o <output-csv-directory> --create-csv-files
- Load the created Neo4j CSV files into the database:
./bionetdb.sh import -i <csv-directory>
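The two steps above can be chained so that the load only runs if the CSV generation succeeded. This is a sketch under stated assumptions: `import_dataset` is a hypothetical wrapper, `BIONETDB_CMD` is an assumed variable for pointing at `bionetdb.sh`, and creating the output directory beforehand is a precaution rather than a documented requirement. Only the two `import` invocations come from this guide.

```shell
# Sketch: run both import steps back to back.
# import_dataset is a hypothetical wrapper; BIONETDB_CMD is an assumed
# override for the location of bionetdb.sh.
import_dataset() {
    local input_dir="$1" csv_dir="$2"
    local cmd="${BIONETDB_CMD:-./bionetdb.sh}"

    # Create the CSV output directory up front (harmless if it exists).
    mkdir -p "$csv_dir" || return 1

    # Step 1: transform the genomic data files into Neo4j CSV files.
    "$cmd" import -i "$input_dir" -o "$csv_dir" --create-csv-files || return 1

    # Step 2: load the generated CSV files into the database.
    "$cmd" import -i "$csv_dir"
}
```

With the test data extracted under `/tmp`, a call could look like `import_dataset /tmp/bionetdb.dataset /tmp/bionetdb.csv`, where the second path is an arbitrary example directory.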
Accessing BioNetDB from the Neo4j browser interface
You can access your BioNetDB database from the Neo4j browser interface. Open your web browser and go to http://localhost:7474.
Now that you can access the BioNetDB database, you can start working with your imported data using the Cypher query language. For a Cypher tutorial, please refer to Intro to Cypher by the Neo4j Team.
As examples, here are some Cypher queries against the BioNetDB data model:
match (n:TRANSCRIPT) return n.id, n.name, n.biotype, n.chromosome, n.start, n.end, n.annotationFlags limit 10
n.id | n.name | n.biotype | n.chromosome | n.start | n.end | n.annotationFlags |
---|---|---|---|---|---|---|
"ENST00000553557" | "TSPYL2-003" | "retained_intron" | "X" | "53111549" | "53115595" | "-" |
"ENST00000375442" | "TSPYL2-001" | "protein_coding" | "X" | "53111549" | "53117722" | "CCDS;basic" |
"ENST00000579390" | "TSPYL2-005" | "protein_coding" | "X" | "53111563" | "53115300" | "mRNA_end_NF;cds_end_NF" |
"ENST00000578306" | "TSPYL2-006" | "nonsense_mediated_decay" | "X" | "53112175" | "53115021" | "cds_start_NF;mRNA_start_NF" |
"ENST00000556808" | "TSPYL2-004" | "retained_intron" | "X" | "53112305" | "53117721" | "-" |
"ENST00000463525" | "TSPYL2-002" | "retained_intron" | "X" | "53113881" | "53115125" | "-" |
"ENST00000314888" | "TLN1-001" | "protein_coding" | "9" | "35696945" | "35732392" | "CCDS;basic" |
"ENST00000540444" | "TLN1-201" | "protein_coding" | "9" | "35697334" | "35732392" | "basic" |
"ENST00000489255" | "TLN1-003" | "processed_transcript" | "9" | "35698041" | "35699325" | "-" |
"ENST00000464379" | "TLN1-005" | "processed_transcript" | "9" | "35703556" | "35707871" | "-" |
match (n:VARIANT) return count(n)
count(n) |
---|
9010279 |
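The same queries can also be run from a terminal rather than the browser, for example with Neo4j's `cypher-shell` client. The sketch below assumes `cypher-shell` is installed and on your `PATH`; `run_cypher` is a hypothetical helper, and the Bolt address, user name and password variables are placeholders that depend on your Neo4j configuration.

```shell
# Sketch: run a Cypher query from the terminal with cypher-shell.
# run_cypher is a hypothetical helper; the address and credentials are
# placeholders (assumptions), so adjust them to your Neo4j setup.
run_cypher() {
    local query="$1"
    local cmd="${CYPHER_SHELL:-cypher-shell}"
    "$cmd" \
        -a "${NEO4J_ADDRESS:-bolt://localhost:7687}" \
        -u "${NEO4J_USER:-neo4j}" \
        -p "${NEO4J_PASSWORD:?set NEO4J_PASSWORD first}" \
        --format plain \
        "$query"
}
```

For instance, after setting `NEO4J_PASSWORD`, `run_cypher 'match (n:VARIANT) return count(n)'` repeats the count query above, and `run_cypher 'match (n:TRANSCRIPT) return n.biotype, count(n) order by count(n) desc'` would summarise transcripts by biotype, using only properties shown in the first example.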