OpenCGA organises datasets in a two-level structure of Projects and Studies, and HGVA uses this structure for its data and metadata.
You can get more information about data organisation at OpenCGA Catalog Data Management. Each Project and Study has a unique alias to make it easier to reference. The table below lists the Projects and Studies (datasets) currently available in HGVA.
| Project name (alias) | Study name | Study alias | HGVA v1 (Dec. 2016) | HGVA v2 (Jan. 2018) |
|---|---|---|---|---|
| Reference GRCh37 (reference_grch37) | 1000 Genomes Project GRCh37 | 1kG_phase3 | Phase 3 2016-05 | Phase 3 2016-05 |
| Reference GRCh37 (reference_grch37) | Exome Sequencing Project (ESP6500) | ESP6500 | 2016-05 | 2016-05 |
| Reference GRCh37 (reference_grch37) | Exome Aggregation Consortium (ExAC) | EXAC | 0.3.1 2016-05 | 0.3.1 2016-05 |
| Reference GRCh37 (reference_grch37) | Genome of the Netherlands (GoNL) | GONL | Release 5 2016-05 | Release 5 2016-05 |
| Reference GRCh37 (reference_grch37) | UK10K Project | UK10k | 2016-05 | 2016-05 |
| Reference GRCh37 (reference_grch37) | DiscovEHR | DISCOVEHR | - | |
| Reference GRCh37 (reference_grch37) | Genome Aggregation Database (gnomAD Exomes) | GNOMAD_EXOMES | - | |
| Reference GRCh37 (reference_grch37) | Genome Aggregation Database (gnomAD Genomes) | GNOMAD_GENOMES | - | |
| Reference GRCh37 (reference_grch37) | Spanish Medical Genome Project (MGP) | MGP | 2016-12 | 2016-12 |
| Reference GRCh38 (reference_grch38) | 1000 Genomes Project GRCh38 | 1kG_phase3 | Phase 3 2016-10 | Phase 3 2016-10 |
| Reference GRCh38 (reference_grch38) | ESP6500 | ESP6500 | - | |
| Reference GRCh38 (reference_grch38) | UK10K Project (*) | UK10K | - | |
| Reference GRCh38 (reference_grch38) | DiscovEHR (*) | DISCOVEHR | - | |
| Reference GRCh38 (reference_grch38) | Genome Aggregation Database (gnomAD Exomes) (*) | GNOMAD_EXOMES | - | |
| Reference GRCh38 (reference_grch38) | Genome Aggregation Database (gnomAD Genomes) (*) | GNOMAD_GENOMES | - | |
| Cancer GRCh37 (cancer_grch37) | QIMR Berghofer Melanoma | QIMR_Berghofer_Melanoma | 2016-12 | 2016-12 |
| Cancer GRCh37 (cancer_grch37) | Chronic Myeloid Leukemia - Russian Academy of Medical Sciences | RAMS_CML | 2016-12 | 2016-12 |
| Platinum (platinum) | Illumina Platinum | illumina_platinum | 2015-08 | 2015-08 |
(*) A liftover carried out by Genomics England (GEL)
Variant annotation was carried out by the CellBase project. Please check the CellBase documentation for details on additional data sources: Data sources and species
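The two-level Project/Study organisation listed in the table can be captured in a small lookup structure, which may help when scripting against the aliases. The following is a minimal sketch, not part of HGVA or OpenCGA: the dictionary is transcribed from the table, while the names `HGVA_STUDIES`, `qualified_study` and `projects_containing`, and the `:` separator used to join a project alias and a study alias, are assumptions made for illustration only.

```python
# Project alias -> study aliases, transcribed from the table on this page.
# Aliases are kept exactly as listed (note "UK10k" vs "UK10K").
HGVA_STUDIES = {
    "reference_grch37": [
        "1kG_phase3", "ESP6500", "EXAC", "GONL", "UK10k",
        "DISCOVEHR", "GNOMAD_EXOMES", "GNOMAD_GENOMES", "MGP",
    ],
    "reference_grch38": [
        "1kG_phase3", "ESP6500", "UK10K", "DISCOVEHR",
        "GNOMAD_EXOMES", "GNOMAD_GENOMES",
    ],
    "cancer_grch37": ["QIMR_Berghofer_Melanoma", "RAMS_CML"],
    "platinum": ["illumina_platinum"],
}


def qualified_study(project, study, sep=":"):
    """Join a project alias and a study alias after checking both exist.

    The separator is a hypothetical convention chosen for this sketch.
    """
    if project not in HGVA_STUDIES:
        raise KeyError(f"unknown project alias: {project}")
    if study not in HGVA_STUDIES[project]:
        raise ValueError(f"study {study!r} is not listed under {project!r}")
    return f"{project}{sep}{study}"


def projects_containing(study):
    """Return every project alias that lists the given study alias (case-sensitive)."""
    return [project for project, studies in HGVA_STUDIES.items() if study in studies]


if __name__ == "__main__":
    print(qualified_study("reference_grch37", "1kG_phase3"))
    print(projects_containing("1kG_phase3"))
```

Because a study alias such as `1kG_phase3` appears under more than one project, the study alias alone is not enough to identify a dataset; pairing it with the project alias removes the ambiguity.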