To follow this guide, you need BioNetDB installed on your system. Please follow the steps in the installation guide to set it up.
Users can download test data from the following link. Download the tar.gz file and uncompress it on your system. Once uncompressed, you should see the following files:
Before you can query the BioNetDB database, you have to populate it by importing your data into the Neo4j database. BioNetDB provides a command-line interface for this. Importing takes two steps: first, you prepare your data by generating CSV files from the input directory, and then you load those CSV files into the BioNetDB database:
./bionetdb.sh import -i <input-directory> -o <output-csv-directory> --create-csv-files
./bionetdb.sh import -i <csv-directory>
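The two import steps above can be wrapped in a small shell function so that they always run in order. This is only a minimal sketch: the function names `run` and `import_all`, the `BIONETDB` and `DRY_RUN` variables, and the example directory names are illustrative assumptions, not part of BioNetDB. Only the `import`, `-i`, `-o` and `--create-csv-files` arguments come from the commands shown above.

```shell
# Path to the BioNetDB launcher (assumption: run from the BioNetDB directory).
BIONETDB="${BIONETDB:-./bionetdb.sh}"
# Hypothetical safety switch: 1 = only print the commands, 0 = execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print a command and, unless in dry-run mode, execute it.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Step 1 writes CSV files to csv_dir; step 2 loads them into the database.
# The second step runs only if the first one succeeds.
import_all() {
    input_dir="$1"
    csv_dir="$2"
    run "$BIONETDB" import -i "$input_dir" -o "$csv_dir" --create-csv-files &&
    run "$BIONETDB" import -i "$csv_dir"
}
```

With the default `DRY_RUN=1`, calling `import_all ./test-data ./csv-out` (both paths are placeholders) only prints the two commands so you can check them first. Setting `DRY_RUN=0` before the call executes them.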
You can access your BioNetDB database from the Neo4j browser interface. Open your web browser and go to http://localhost:7474.
Now that you can access the BioNetDB database, you can start working with your imported data using the Cypher query language. For a Cypher tutorial, please refer to Intro to Cypher by the Neo4j Team.
The following are some example Cypher queries against the BioNetDB data model, each shown with its result:
match (n:TRANSCRIPT) return n.id, n.name, n.biotype, n.chromosome, n.start, n.end, n.annotationFlags limit 10
n.id | n.name | n.biotype | n.chromosome | n.start | n.end | n.annotationFlags |
---|---|---|---|---|---|---|
"ENST00000553557" | "TSPYL2-003" | "retained_intron" | "X" | "53111549" | "53115595" | "-" |
"ENST00000375442" | "TSPYL2-001" | "protein_coding" | "X" | "53111549" | "53117722" | "CCDS;basic" |
"ENST00000579390" | "TSPYL2-005" | "protein_coding" | "X" | "53111563" | "53115300" | "mRNA_end_NF;cds_end_NF" |
"ENST00000578306" | "TSPYL2-006" | "nonsense_mediated_decay" | "X" | "53112175" | "53115021" | "cds_start_NF;mRNA_start_NF" |
"ENST00000556808" | "TSPYL2-004" | "retained_intron" | "X" | "53112305" | "53117721" | "-" |
"ENST00000463525" | "TSPYL2-002" | "retained_intron" | "X" | "53113881" | "53115125" | "-" |
"ENST00000314888" | "TLN1-001" | "protein_coding" | "9" | "35696945" | "35732392" | "CCDS;basic" |
"ENST00000540444" | "TLN1-201" | "protein_coding" | "9" | "35697334" | "35732392" | "basic" |
"ENST00000489255" | "TLN1-003" | "processed_transcript" | "9" | "35698041" | "35699325" | "-" |
"ENST00000464379" | "TLN1-005" | "processed_transcript" | "9" | "35703556" | "35707871" | "-" |
match (n:VARIANT) return count(n)
count(n) |
---|
9010279 |
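Queries like the ones above can also be generated from the command line, for instance to list transcripts of a given biotype on a given chromosome. The sketch below only builds the query text. The helper name `transcript_query` is hypothetical, and the query uses only the `TRANSCRIPT` label and the properties that appear in the example results above. Note that `chromosome` is compared as a string because the results show it stored in quotes.

```shell
# Build a Cypher query that lists transcripts by biotype and chromosome.
# Arguments: biotype, chromosome, optional row limit (default 10).
# Values are inserted as-is without escaping, so pass trusted input only.
transcript_query() {
    biotype="$1"
    chromosome="$2"
    limit="${3:-10}"
    printf 'match (n:TRANSCRIPT) where n.biotype = "%s" and n.chromosome = "%s" return n.id, n.name, n.start, n.end limit %s;\n' \
        "$biotype" "$chromosome" "$limit"
}

# Print the query for protein-coding transcripts on chromosome X.
# To run it, paste it into the Neo4j browser, or, assuming Neo4j's
# cypher-shell tool is installed and using your own credentials, pipe it:
#   transcript_query protein_coding X | cypher-shell -u neo4j -p <password>
transcript_query protein_coding X
```

Keeping query construction separate from execution makes it easy to inspect the generated Cypher before sending it to the database.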