To follow this guide, you must first install BioNetDB on your system. Please follow the steps in the Installation Guide to set it up.
Download the test data from http://bioinfo.hpc.cam.ac.uk/downloads/bionetdb/bionetdb.dataset.tar.gz and extract the contents of the archive by executing:
# Download in the /tmp folder
$ cd /tmp
$ wget http://bioinfo.hpc.cam.ac.uk/downloads/bionetdb/bionetdb.dataset.tar.gz

# Extract the content
$ tar xvfz bionetdb.dataset.tar.gz
bionetdb.dataset/
bionetdb.dataset/illumina_platinum.export.5k.json
bionetdb.dataset/mirna.csv
bionetdb.dataset/genes.json.gz
bionetdb.dataset/proteins.json.gz
bionetdb.dataset/illumina_platinum.export.5k.json.meta.json
bionetdb.dataset/Homo_sapiens.owl
bionetdb.dataset/10k.clinvar.json.gz

# List the content
$ cd bionetdb.dataset/
$ ls -ltrh
total 475M
-rw-rw-r-- 1 jtarraga jtarraga  38M Jun 26 13:39 proteins.json.gz
-rw-rw-r-- 1 jtarraga jtarraga  78M Jun 26 13:39 genes.json.gz
-rw-rw-r-- 1 jtarraga jtarraga 1.2M Jun 26 13:39 mirna.csv
-rw-rw-r-- 1 jtarraga jtarraga  53K Jun 26 13:39 illumina_platinum.export.5k.json.meta.json
-rw-rw-r-- 1 jtarraga jtarraga  56M Jun 26 13:39 illumina_platinum.export.5k.json
-rw-rw-r-- 1 jtarraga jtarraga 215M Jun 26 13:39 Homo_sapiens.owl
-rw-rw-r-- 1 jtarraga jtarraga  89M Jun 26 13:39 10k.clinvar.json.gz
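If you would rather script this step, the extraction can also be done from Python. The following is a minimal sketch using only the standard library. The function names `extract_dataset` and `missing_files` are illustrative and not part of BioNetDB, and the expected file list is simply copied from the `tar` output above.

```python
import tarfile
from pathlib import Path

# File names copied from the tar output shown in this guide.
EXPECTED_FILES = {
    "illumina_platinum.export.5k.json",
    "illumina_platinum.export.5k.json.meta.json",
    "mirna.csv",
    "genes.json.gz",
    "proteins.json.gz",
    "Homo_sapiens.owl",
    "10k.clinvar.json.gz",
}


def extract_dataset(archive, dest):
    """Extract a .tar.gz archive into dest and return the extracted file names."""
    dest = Path(dest).resolve()
    with tarfile.open(archive, "r:gz") as tar:
        members = tar.getmembers()
        for member in members:
            target = (dest / member.name).resolve()
            # Refuse members that would be written outside the destination.
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return {Path(m.name).name for m in members if m.isfile()}


def missing_files(extracted):
    """Return the expected dataset files that are not in the extracted set."""
    return EXPECTED_FILES - set(extracted)
```

For example, `missing_files(extract_dataset("/tmp/bionetdb.dataset.tar.gz", "/tmp"))` should return an empty set if the download is complete.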
Before you can query the BioNetDB database, you have to populate it. Neo4j provides a mechanism for batch-importing large amounts of data into a Neo4j database from CSV files. This mechanism has been integrated into the BioNetDB command line (bionetdb.sh import), which works in two steps: first, you prepare your data by creating the Neo4j CSV files; then, these files are loaded into the database.
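As a rough illustration of this two-step workflow, the sketch below builds the two `bionetdb.sh import` command lines used later in this guide. The helper names and the default script path are assumptions made for the example. Only the `import` subcommand and the `-i`, `-o` and `--create-csv-files` options are taken from the commands shown in this guide.

```python
import shlex


def csv_creation_command(dataset_dir, csv_dir, script="./bionetdb.sh"):
    """Step 1: build the command that creates the Neo4j CSV files."""
    return [script, "import", "-i", str(dataset_dir),
            "-o", str(csv_dir), "--create-csv-files"]


def load_command(csv_dir, script="./bionetdb.sh"):
    """Step 2: build the command that loads the CSV files into the database."""
    return [script, "import", "-i", str(csv_dir)]


def as_shell(command):
    """Render an argument list as a single shell-quoted line, for display."""
    return shlex.join(command)
```

Each list can be passed to `subprocess.run(command, check=True)` to actually run the step. Building the commands separately from running them makes it easy to print and review them first.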
Creating the Neo4j CSV files
To create the Neo4j CSV files, use the BioNetDB command line: bionetdb.sh import --create-csv-files. The following command creates the Neo4j CSV files for the previously downloaded dataset.
$ mkdir /tmp/bionetdb.dataset/csv
$ ./bionetdb.sh import -i /tmp/bionetdb.dataset -o /tmp/bionetdb.dataset/csv --create-csv-files
...
...
[main] INFO org.opencb.bionetdb.core.utils.Neo4jBioPaxImporter - 2: 96%
[main] INFO org.opencb.bionetdb.core.utils.Neo4jBioPaxImporter - 2: 99%
[main] INFO org.opencb.bionetdb.core.utils.Neo4jBioPaxImporter - Processing /tmp/bionetdb.dataset/Homo_sapiens.owl containing 383790 BioPax elements in 11 s
[main] INFO org.opencb.bionetdb.core.utils.Neo4jBioPaxImporter - Processing 55847 nodes
[main] INFO org.opencb.bionetdb.core.utils.Neo4jBioPaxImporter - Processing 178398 relations
[main] INFO class org.opencb.bionetdb.app.cli.ImportCommandExecutor - Post-processing 778 dna nodes
[main] INFO class org.opencb.bionetdb.app.cli.ImportCommandExecutor - Post-processing 302 miRNA nodes
[main] INFO org.opencb.bionetdb.core.utils.Neo4jCsvImporter - Processing JSON file /tmp/bionetdb.dataset/10k.clinvar.json.gz
[main] INFO org.opencb.bionetdb.core.utils.Neo4jCsvImporter - Parsing 5000 variants...
[main] INFO org.opencb.bionetdb.core.utils.Neo4jCsvImporter - Parsing 10000 variants...
[main] INFO org.opencb.bionetdb.core.utils.Neo4jCsvImporter - Parsed 10000 variants from /tmp/bionetdb.dataset/10k.clinvar.json.gz. Done!!!
[main] INFO org.opencb.bionetdb.core.utils.Neo4jCsvImporter - Processing JSON file /tmp/bionetdb.dataset/illumina_platinum.export.5k.json
[main] INFO org.opencb.bionetdb.core.utils.Neo4jCsvImporter - Parsing 5000 variants...
[main] INFO org.opencb.bionetdb.core.utils.Neo4jCsvImporter - Parsed 5000 variants from /tmp/bionetdb.dataset/illumina_platinum.export.5k.json. Done!!!
[main] INFO class org.opencb.bionetdb.app.cli.ImportCommandExecutor - Gene indexing in 40 s
[main] INFO class org.opencb.bionetdb.app.cli.ImportCommandExecutor - Protein indexing in 13 s
[main] INFO class org.opencb.bionetdb.app.cli.ImportCommandExecutor - miRNA indexing in 0 s
[main] INFO class org.opencb.bionetdb.app.cli.ImportCommandExecutor - BioPAX processing in 27 s
[main] INFO class org.opencb.bionetdb.app.cli.ImportCommandExecutor - Variant processing in 19 s
The Neo4j CSV files are located in the output folder:
$ ls -ltr /tmp/bionetdb.dataset/csv
total 180936
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 VARIANT_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 TRANSPORT.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 TRANSCRIPT_ANNOTATION_FLAG.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 REGULATION_REGION.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 PROTEIN_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 PHYSICAL_ENTITY.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 PANEL.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 ONTOLOGY.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 INTERACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 GENE_TRAIT_ASSOCIATION.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 GENE_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 EXPRESSION.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 EXON_OVERLAP.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 DISEASE_SUBGROUP.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 DISEASE_GROUP.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 CONFIG.csv
-rw-rw-r-- 1 jtarraga jtarraga        0 Jun 26 14:14 ASSEMBLY.csv
drwxr-xr-x 2 jtarraga jtarraga     4096 Jun 26 14:15 genes.rocksdb
drwxr-xr-x 2 jtarraga jtarraga     4096 Jun 26 14:15 mirna.rocksdb
drwxr-xr-x 2 jtarraga jtarraga     4096 Jun 26 14:15 proteins.rocksdb
drwxr-xr-x 2 jtarraga jtarraga     4096 Jun 26 14:15 rocksdb
-rw-rw-r-- 1 jtarraga jtarraga 14011261 Jun 26 14:15 XREF___PROTEIN___XREF.csv
-rw-rw-r-- 1 jtarraga jtarraga 28263017 Jun 26 14:15 XREF.csv
-rw-rw-r-- 1 jtarraga jtarraga   240044 Jun 26 14:15 VARIANT__VARIANT_CALL.csv
-rw-rw-r-- 1 jtarraga jtarraga   419286 Jun 26 14:15 VARIANT__TRAIT_ASSOCIATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  3633480 Jun 26 14:15 VARIANT__POPULATION_FREQUENCY.csv
-rw-rw-r-- 1 jtarraga jtarraga    80045 Jun 26 14:15 VARIANT_FILE_INFO__FILE.csv
-rw-rw-r-- 1 jtarraga jtarraga   516837 Jun 26 14:15 VARIANT_FILE_INFO.csv
-rw-rw-r-- 1 jtarraga jtarraga   911253 Jun 26 14:15 VARIANT.csv
-rw-rw-r-- 1 jtarraga jtarraga   793845 Jun 26 14:15 VARIANT__CONSERVATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  2068937 Jun 26 14:15 VARIANT__CONSEQUENCE_TYPE.csv
-rw-rw-r-- 1 jtarraga jtarraga   390033 Jun 26 14:15 VARIANT_CALL.csv
-rw-rw-r-- 1 jtarraga jtarraga    75421 Jun 26 14:15 UNDEFINED.csv
-rw-rw-r-- 1 jtarraga jtarraga   849048 Jun 26 14:15 TRANSCRIPT__TFBS.csv
-rw-rw-r-- 1 jtarraga jtarraga    84839 Jun 26 14:15 TRANSCRIPT__PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga   739714 Jun 26 14:15 TRANSCRIPT.csv
-rw-rw-r-- 1 jtarraga jtarraga  3312486 Jun 26 14:15 TFBS.csv
-rw-rw-r-- 1 jtarraga jtarraga      826 Jun 26 14:15 TARGET_GENE___MIRNA___GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga   876916 Jun 26 14:15 SUBSTITUTION_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga     1212 Jun 26 14:15 SO.csv
-rw-rw-r-- 1 jtarraga jtarraga   130465 Jun 26 14:15 SMALL_MOLECULE.csv
-rw-rw-r-- 1 jtarraga jtarraga     8913 Jun 26 14:15 RNA.csv
-rw-rw-r-- 1 jtarraga jtarraga    10839 Jun 26 14:15 REACTANT___REACTION___UNDEFINED.csv
-rw-rw-r-- 1 jtarraga jtarraga     1708 Jun 26 14:15 REACTANT___REACTION___RNA.csv
-rw-rw-r-- 1 jtarraga jtarraga    82014 Jun 26 14:15 REACTANT___REACTION___PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga    13002 Jun 26 14:15 REACTANT___REACTION___DNA.csv
-rw-rw-r-- 1 jtarraga jtarraga    93777 Jun 26 14:15 REACTANT___REACTION___COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga       39 Jun 26 14:15 PROTEIN_VARIANT_ANNOTATION__PROTEIN_KEYWORD.csv
-rw-rw-r-- 1 jtarraga jtarraga       48 Jun 26 14:15 PROTEIN_VARIANT_ANNOTATION__PROTEIN_FEATURE.csv
-rw-rw-r-- 1 jtarraga jtarraga  1558786 Jun 26 14:15 PROTEIN__PROTEIN_KEYWORD.csv
-rw-rw-r-- 1 jtarraga jtarraga  6322047 Jun 26 14:15 PROTEIN__PROTEIN_FEATURE.csv
-rw-rw-r-- 1 jtarraga jtarraga    23573 Jun 26 14:15 PROTEIN_KEYWORD.csv
-rw-rw-r-- 1 jtarraga jtarraga 74644290 Jun 26 14:15 PROTEIN_FEATURE.csv
-rw-rw-r-- 1 jtarraga jtarraga  2145473 Jun 26 14:15 PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga     6881 Jun 26 14:15 PRODUCT___REACTION___UNDEFINED.csv
-rw-rw-r-- 1 jtarraga jtarraga    97187 Jun 26 14:15 PRODUCT___REACTION___SMALL_MOLECULE.csv
-rw-rw-r-- 1 jtarraga jtarraga     1420 Jun 26 14:15 PRODUCT___REACTION___RNA.csv
-rw-rw-r-- 1 jtarraga jtarraga   102232 Jun 26 14:15 PRODUCT___REACTION___COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga 12258089 Jun 26 14:15 POPULATION_FREQUENCY.csv
-rw-rw-r-- 1 jtarraga jtarraga    15956 Jun 26 14:15 PATHWAY_NEXT_STEP___REGULATION___REGULATION.csv
-rw-rw-r-- 1 jtarraga jtarraga    11308 Jun 26 14:15 PATHWAY_NEXT_STEP___REGULATION___CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga    33343 Jun 26 14:15 PATHWAY_NEXT_STEP___REACTION___REGULATION.csv
-rw-rw-r-- 1 jtarraga jtarraga   162451 Jun 26 14:15 PATHWAY_NEXT_STEP___REACTION___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga      963 Jun 26 14:15 PATHWAY_NEXT_STEP___REACTION___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga      127 Jun 26 14:15 PATHWAY_NEXT_STEP___PATHWAY___REGULATION.csv
-rw-rw-r-- 1 jtarraga jtarraga     1996 Jun 26 14:15 PATHWAY_NEXT_STEP___PATHWAY___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga      594 Jun 26 14:15 PATHWAY_NEXT_STEP___CATALYSIS___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga    46186 Jun 26 14:15 PATHWAY_NEXT_STEP___CATALYSIS___CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga   129897 Jun 26 14:15 PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga       34 Jun 26 14:15 MIRNA__TARGET_TRANSCRIPT.csv
-rw-rw-r-- 1 jtarraga jtarraga      968 Jun 26 14:15 MIRNA.csv
-rw-rw-r-- 1 jtarraga jtarraga      469 Jun 26 14:15 IS___RNA___MIRNA.csv
-rw-rw-r-- 1 jtarraga jtarraga    10413 Jun 26 14:15 IS___DNA___GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga    94312 Jun 26 14:15 GENE__TRANSCRIPT.csv
-rw-rw-r-- 1 jtarraga jtarraga    51330 Jun 26 14:15 GENE__DRUG.csv
-rw-rw-r-- 1 jtarraga jtarraga   988165 Jun 26 14:15 GENE__DISEASE.csv
-rw-rw-r-- 1 jtarraga jtarraga   105517 Jun 26 14:15 GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga      203 Jun 26 14:15 FILE.csv
-rw-rw-r-- 1 jtarraga jtarraga   120908 Jun 26 14:15 DRUG.csv
-rw-rw-r-- 1 jtarraga jtarraga   911319 Jun 26 14:15 DISEASE.csv
-rw-rw-r-- 1 jtarraga jtarraga      205 Jun 26 14:15 CONTROLLER___REGULATION___UNDEFINED.csv
-rw-rw-r-- 1 jtarraga jtarraga       67 Jun 26 14:15 CONTROLLER___REGULATION___RNA.csv
-rw-rw-r-- 1 jtarraga jtarraga     3529 Jun 26 14:15 CONTROLLER___CATALYSIS___UNDEFINED.csv
-rw-rw-r-- 1 jtarraga jtarraga    29158 Jun 26 14:15 CONTROLLER___CATALYSIS___PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga    41431 Jun 26 14:15 CONTROLLER___CATALYSIS___COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga    24238 Jun 26 14:15 CONTROLLED___REGULATION___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga   208268 Jun 26 14:15 CONSEQUENCE_TYPE__TRANSCRIPT.csv
-rw-rw-r-- 1 jtarraga jtarraga   512125 Jun 26 14:15 CONSEQUENCE_TYPE__PROTEIN_VARIANT_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga       38 Jun 26 14:15 CONSEQUENCE_TYPE__GENE.csv
-rw-rw-r-- 1 jtarraga jtarraga   175344 Jun 26 14:15 COMPONENT_OF_PATHWAY___REACTION___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga    32800 Jun 26 14:15 COMPONENT_OF_PATHWAY___PATHWAY___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga    17555 Jun 26 14:15 COMPONENT_OF_COMPLEX___UNDEFINED___COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga     3298 Jun 26 14:15 COMPONENT_OF_COMPLEX___RNA___COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga   236226 Jun 26 14:15 COMPONENT_OF_COMPLEX___PROTEIN___COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga     6506 Jun 26 14:15 COMPONENT_OF_COMPLEX___DNA___COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga    17216 Jun 26 14:15 CELLULAR_LOCATION___UNDEFINED___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga    44423 Jun 26 14:15 CELLULAR_LOCATION___SMALL_MOLECULE___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga     3384 Jun 26 14:15 CELLULAR_LOCATION___RNA___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga     8757 Jun 26 14:15 CELLULAR_LOCATION___DNA___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga     4396 Jun 26 14:15 CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga   156427 Jun 26 14:15 CELLULAR_LOCATION___COMPLEX___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga    22842 Jun 26 14:15 CELLULAR_LOCATION___CATALYSIS___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga   117673 Jun 26 14:15 CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga       33 Jun 26 14:15 XREF___RNA___XREF.csv
-rw-rw-r-- 1 jtarraga jtarraga   435530 Jun 26 14:15 VARIANT__FUNCTIONAL_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga   240052 Jun 26 14:15 VARIANT_CALL__VARIANT_FILE_INFO.csv
-rw-rw-r-- 1 jtarraga jtarraga  2779326 Jun 26 14:15 TRAIT_ASSOCIATION.csv
-rw-rw-r-- 1 jtarraga jtarraga       39 Jun 26 14:15 TARGET_TRANSCRIPT__TRANSCRIPT.csv
-rw-rw-r-- 1 jtarraga jtarraga       31 Jun 26 14:15 TARGET_TRANSCRIPT.csv
-rw-rw-r-- 1 jtarraga jtarraga   240043 Jun 26 14:15 SAMPLE__VARIANT_CALL.csv
-rw-rw-r-- 1 jtarraga jtarraga       97 Jun 26 14:15 SAMPLE.csv
-rw-rw-r-- 1 jtarraga jtarraga   204330 Jun 26 14:15 REGULATION.csv
-rw-rw-r-- 1 jtarraga jtarraga   887011 Jun 26 14:15 REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga   110854 Jun 26 14:15 REACTANT___REACTION___SMALL_MOLECULE.csv
-rw-rw-r-- 1 jtarraga jtarraga   635263 Jun 26 14:15 PROTEIN_VARIANT_ANNOTATION__SUBSTITUTION_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga   187645 Jun 26 14:15 PROTEIN_VARIANT_ANNOTATION__PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga   445975 Jun 26 14:15 PROTEIN_VARIANT_ANNOTATION.csv
-rw-rw-r-- 1 jtarraga jtarraga    43355 Jun 26 14:15 PRODUCT___REACTION___PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga    29473 Jun 26 14:15 PATHWAY_NEXT_STEP___REGULATION___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga      553 Jun 26 14:15 PATHWAY_NEXT_STEP___REGULATION___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga    75303 Jun 26 14:15 PATHWAY_NEXT_STEP___REACTION___CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga      538 Jun 26 14:15 PATHWAY_NEXT_STEP___PATHWAY___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga      322 Jun 26 14:15 PATHWAY_NEXT_STEP___PATHWAY___CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga     7221 Jun 26 14:15 PATHWAY_NEXT_STEP___CATALYSIS___REGULATION.csv
-rw-rw-r-- 1 jtarraga jtarraga    78151 Jun 26 14:15 PATHWAY_NEXT_STEP___CATALYSIS___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga  1366995 Jun 26 14:15 FUNCTIONAL_SCORE.csv
-rw-rw-r-- 1 jtarraga jtarraga    20391 Jun 26 14:15 DNA.csv
-rw-rw-r-- 1 jtarraga jtarraga     3052 Jun 26 14:15 CONTROLLER___REGULATION___SMALL_MOLECULE.csv
-rw-rw-r-- 1 jtarraga jtarraga     7105 Jun 26 14:15 CONTROLLER___REGULATION___PROTEIN.csv
-rw-rw-r-- 1 jtarraga jtarraga    13837 Jun 26 14:15 CONTROLLER___REGULATION___COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga      131 Jun 26 14:15 CONTROLLED___REGULATION___PATHWAY.csv
-rw-rw-r-- 1 jtarraga jtarraga       45 Jun 26 14:15 CONTROLLED___REGULATION___CATALYSIS.csv
-rw-rw-r-- 1 jtarraga jtarraga    73180 Jun 26 14:15 CONTROLLED___CATALYSIS___REACTION.csv
-rw-rw-r-- 1 jtarraga jtarraga  2225052 Jun 26 14:15 CONSERVATION.csv
-rw-rw-r-- 1 jtarraga jtarraga  2521876 Jun 26 14:15 CONSEQUENCE_TYPE__SO.csv
-rw-rw-r-- 1 jtarraga jtarraga 12201659 Jun 26 14:15 CONSEQUENCE_TYPE.csv
-rw-rw-r-- 1 jtarraga jtarraga    29393 Jun 26 14:15 COMPONENT_OF_COMPLEX___SMALL_MOLECULE___COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga   109239 Jun 26 14:15 COMPONENT_OF_COMPLEX___COMPLEX___COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga   568684 Jun 26 14:15 COMPLEX.csv
-rw-rw-r-- 1 jtarraga jtarraga     5644 Jun 26 14:15 CELLULAR_LOCATION___REGULATION___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga    68316 Jun 26 14:15 CELLULAR_LOCATION___REACTION___CELLULAR_LOCATION.csv
-rw-rw-r-- 1 jtarraga jtarraga   244996 Jun 26 14:15 CELLULAR_LOCATION___PROTEIN___CELLULAR_LOCATION.csv
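Several files in this listing have size 0, which suggests that the sample dataset contains no data for those node types. A quick way to see which CSV files actually hold data is sketched below. The function name `summarize_csv_dir` is hypothetical and not part of BioNetDB. It only inspects file sizes and makes no assumption about the CSV contents.

```python
from pathlib import Path


def summarize_csv_dir(csv_dir):
    """Return (non_empty, empty): two lists of CSV file names sorted by name."""
    non_empty, empty = [], []
    for path in sorted(Path(csv_dir).glob("*.csv")):
        if not path.is_file():
            continue  # skip directories or other entries that match *.csv
        if path.stat().st_size > 0:
            non_empty.append(path.name)
        else:
            empty.append(path.name)
    return non_empty, empty
```

Calling `summarize_csv_dir("/tmp/bionetdb.dataset/csv")` after the CSV creation step returns the two lists. The `*.rocksdb` index folders are ignored because only `*.csv` entries are matched.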
Once the CSV files have been created, load them into the database using the BioNetDB command line: bionetdb.sh import. This command can only load data into a previously unused database, so if you are using the default Neo4j database (located at $NEO4J_HOME/data/databases/graph.db), make sure that it is empty.
According to our example:
$ rm -rf $NEO4J_HOME/data/databases/graph.db
$ ./bionetdb.sh import -i /tmp/bionetdb.dataset/csv
...
...
[>:23.27 MB/s----------|NODE:22.89 MB|*PROPERTIES(3)================|LA|v:63.93 MB/s(2)=======]2.11M ∆ 764K
Done in 6s 661ms
Prepare node index, started 2018-06-26 13:31:53.186+0000
[*DETECT:30.96 MB-----------------------------------------------------------------------------]2.12M ∆2.12M
Done in 974ms
Relationships, started 2018-06-26 13:31:54.217+0000
[*>:18.40 MB/s----------------------------------------|T|PREPARE(3)==============|RE|P|v:43.21]2.60M ∆ 376K
Done in 2s 665ms
Node Degrees, started 2018-06-26 13:31:56.955+0000
[*>(3)==========================================|CALCULATE(2)=================================]2.60M ∆2.60M
Done in 326ms
Relationship --> Relationship 1-32/32, started 2018-06-26 13:31:57.324+0000
[*>---------------------------------|LINK(4)=======================|v:??----------------------]2.60M ∆2.60M
Done in 499ms
RelationshipGroup 1-32/32, started 2018-06-26 13:31:57.844+0000
[*>:??---------------------------------------------------------------|v:??--------------------]68.6K ∆68.6K
Done in 69ms
Node --> Relationship, started 2018-06-26 13:31:57.924+0000
[>:??---|>-----------------------------------|LINK|*v:??(2)===================================]2.09M ∆2.09M
Done in 285ms
Relationship --> Relationship 1-32/32, started 2018-06-26 13:31:58.244+0000
[>-----------------------------|*LINK(2)=============================|v:??(2)=================]2.60M ∆2.44M
Done in 402ms
Count groups, started 2018-06-26 13:31:58.681+0000
[*>--------------------------------------------------------------------------------|COUNT-----]67.3K ∆67.3K
Done in 53ms
Gather, started 2018-06-26 13:31:58.804+0000
[>-------------|*CACHE------------------------------------------------------------------------]67.3K ∆67.3K
Done in 67ms
Write, started 2018-06-26 13:31:58.900+0000
[>:??---------------------------------|ENCODE----|*v:??---------------------------------------]67.0K ∆67.0K
Done in 34ms
Node --> Group, started 2018-06-26 13:31:58.957+0000
[>------------|FIRST------------------|*v:??--------------------------------------------------]14.1K ∆14.1K
Done in 21ms
Node counts, started 2018-06-26 13:31:59.012+0000
[>--------------------------------------------|*COUNT:76.29 MB--------------------------------]2.12M ∆2.12M
Done in 191ms
Relationship counts, started 2018-06-26 13:31:59.224+0000
[>(2)========================================|*COUNT(2)=======================================]2.61M ∆2.61M
Done in 256ms

IMPORT DONE in 13s 446ms.
Imported:
  2117124 nodes
  2605206 relationships
  15047626 properties
Peak memory usage: 536.43 MB
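Because the load step only works on a previously unused database, it can be useful to check the target folder before running it. The sketch below is one possible check, not something BioNetDB provides. The function names are hypothetical, and the default path is the one given in this guide ($NEO4J_HOME/data/databases/graph.db).

```python
import os
from pathlib import Path


def default_database_dir(neo4j_home=None):
    """Return the default Neo4j database folder described in this guide."""
    home = neo4j_home or os.environ.get("NEO4J_HOME")
    if not home:
        raise RuntimeError("NEO4J_HOME is not set")
    return Path(home) / "data" / "databases" / "graph.db"


def database_is_unused(db_dir):
    """True if db_dir does not exist yet or is an empty directory."""
    db_dir = Path(db_dir)
    if not db_dir.exists():
        return True
    return db_dir.is_dir() and not any(db_dir.iterdir())
```

A script could call `database_is_unused(default_database_dir())` and stop with a clear message when it returns `False`, instead of deleting the existing database unconditionally.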
You can access your BioNetDB database from the Neo4j browser interface. Open your web browser and go to http://localhost:7474.
Now that you can access the BioNetDB database, you can start working with your imported data using the Cypher query language. For a Cypher tutorial, please refer to Intro to Cypher by the Neo4j Team.
Below are some example Cypher queries: