Overview

BioNetDB models biological data as a network of nodes and relationships. The data comes in different formats and from different sources: systems biology data from Reactome, annotation data from CellBase, and human genetic variants from the clinical data of healthcare centers. BioNetDB relies on the Neo4j graph database, which lets users access biological data through the Cypher query language (similar to SQL in relational databases). Neo4j is optimized for graph queries and is designed to be scalable and reliable.
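
Because access goes through Cypher, a lookup by property can be written as a parameterized query. The following is a minimal sketch that only builds the query text and its parameter map, without connecting to a database. The helper `build_match_query` and the node label `GENE` are hypothetical names chosen for illustration; the labels used by an actual BioNetDB instance may differ.

```python
import re

# Labels and property names are placed in the query text unquoted, so they
# are restricted to simple identifiers to avoid injecting arbitrary Cypher.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_match_query(label, filters, limit=10):
    """Return (query, params) for a parameterized Cypher MATCH.

    `label` and the keys of `filters` become part of the query text; the
    filter values are passed separately as parameters ($name), so they are
    never concatenated into the query string.
    """
    for name in (label, *filters):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"invalid identifier: {name!r}")
    if "limit" in filters:
        raise ValueError("'limit' is reserved for the LIMIT parameter")
    conditions = " AND ".join(f"n.{key} = ${key}" for key in filters)
    where = f" WHERE {conditions}" if conditions else ""
    query = f"MATCH (n:{label}){where} RETURN n LIMIT $limit"
    params = dict(filters, limit=limit)
    return query, params


# Assumed label "GENE"; "name" is one of the gene node properties.
query, params = build_match_query("GENE", {"name": "BRCA2"})
print(query)   # MATCH (n:GENE) WHERE n.name = $name RETURN n LIMIT $limit
print(params)  # {'name': 'BRCA2', 'limit': 10}
```

The resulting pair can be handed to any Neo4j client that accepts a query string plus a parameter map.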

BioNetDB > Data Model > BioNetDB data model.png

Modelling

This section describes the main nodes of the BioNetDB network data model. For each node, its properties and relationships are shown.

Genes

Gene node properties:

uid
id
name
chromosome
start
end
strand
biotype
description
source
status
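
When preparing gene records for loading, it can help to compare their keys against the property list above. The sketch below does only that comparison. The function name `check_gene_record` and the sample record are hypothetical, and whether every property is mandatory is an assumption not stated on this page.

```python
# Property names copied from the gene node property list above.
GENE_PROPERTIES = (
    "uid", "id", "name", "chromosome", "start", "end",
    "strand", "biotype", "description", "source", "status",
)


def check_gene_record(record):
    """Return (missing, unexpected) property names for a gene record.

    `missing` keeps the order of GENE_PROPERTIES; `unexpected` is sorted.
    """
    keys = set(record)
    missing = [prop for prop in GENE_PROPERTIES if prop not in keys]
    unexpected = sorted(keys - set(GENE_PROPERTIES))
    return missing, unexpected


# Illustrative values only; not taken from a BioNetDB instance.
record = {
    "id": "GENE_EXAMPLE_1", "name": "EXAMPLE1", "chromosome": "1",
    "start": 100, "end": 200, "strand": "+", "colour": "red",
}
missing, unexpected = check_gene_record(record)
print(missing)     # ['uid', 'biotype', 'description', 'source', 'status']
print(unexpected)  # ['colour']
```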

Gene relationships:

BioNetDB > Data Model > neo4j.gene.png

Transcripts

Transcript node properties:

uid
id
name
biotype
chromosome
start
end
strand
proteinId
genomicCodingEnd
genomicCodingStart
annotationFlags
cdnaCodingEnd
cdnaCodingStart
cdsLength
description
status

Transcript relationships (transcript node in pink):

BioNetDB > Data Model > Screenshot_20180615_122729.png

Proteins

Protein node properties:

uid
id
name
accession
dataset
evidence
proteinExistence

Protein relationships:

BioNetDB > Data Model > neo4j.protein.png

Protein Complex

BioNetDB > Data Model > neo4j.protein.complex.png

Variants

Variant node properties:

uid
id
name
chromosome
start
end
strand
type
alternate
reference
alternativeNames
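
The chromosome, start and end properties make it possible to select variants by genomic region. The sketch below shows the interval-overlap condition both as a Cypher query string and as an equivalent Python predicate. It assumes a node label `VARIANT`, inclusive coordinates and `start <= end` for every variant; none of these is confirmed on this page, and `REGION_QUERY` and `overlaps` are hypothetical names.

```python
# Assumed label VARIANT; $chromosome, $start and $end are query parameters.
REGION_QUERY = (
    "MATCH (v:VARIANT) "
    "WHERE v.chromosome = $chromosome AND v.start <= $end AND v.end >= $start "
    "RETURN v.id, v.reference, v.alternate"
)


def overlaps(variant, chromosome, start, end):
    """True if the variant's [start, end] interval intersects the region.

    Two inclusive intervals intersect when each one starts no later than
    the other one ends. Assumes variant['start'] <= variant['end'].
    """
    return (
        variant["chromosome"] == chromosome
        and variant["start"] <= end
        and variant["end"] >= start
    )


# Illustrative single-base variant; values are not from a real dataset.
snv = {"chromosome": "1", "start": 100, "end": 100,
       "reference": "A", "alternate": "G"}
print(overlaps(snv, "1", 50, 150))   # True
print(overlaps(snv, "1", 101, 200))  # False
print(overlaps(snv, "2", 50, 150))   # False
```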

Variant relationships:

BioNetDB > Data Model > neo4j.variant.png

Regulation

Regulation node properties:

uid
id
name

Regulation relationships:

BioNetDB > Data Model > neo4j.regulation.png

Pathway

Pathway node properties:

uid
id
name

Pathway relationships (pathway nodes in yellow):

BioNetDB > Data Model > pathway.neo4j.png
