Index
Annotate
Custom annotation
Calculate Statistics
Define cohorts
Remove
Export / Query
Export statistics
Export statistics is a special case of export. Instead of exporting full variants, only the variant cohort statistics are exported.
As for variants export, there are multiple possible output formats:
VCF: Standard VCF format without sample information, with the statistics stored as values in the INFO column.
##fileformat=VCFv4.2
##FILTER=<ID=.,Description="No FILTER info">
##FILTER=<ID=PASS,Description="Valid variant">
##INFO=<ID=AC,Number=A,Type=Integer,Description="Total number of alternate alleles in called genotypes, for each ALT allele, in the same order as listed">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, calculated from AC and AN, in the range (0,1), in the same order as listed">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##INFO=<ID=AFK_AF,Number=A,Type=Float,Description="Allele frequency in the C1 cohort calculated from AC and AN, in the range (0,1), in the same order as listed">
#CHROM POS ID REF ALT QUAL FILTER INFO
22 16050115 . G A . PASS AC=1;AF=0.001;AN=1000;AFK_AF=0.002008
22 16050213 . C T . PASS AC=1;AF=0.001;AN=1000;AFK_AF=0
22 16050319 . C T . PASS AC=1;AF=0.001;AN=1000;AFK_AF=0
22 16050607 . G A . PASS AC=2;AF=0.002;AN=1000;AFK_AF=0.004016
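As an illustration of how the statistics can be read back from the INFO column, here is a minimal Python sketch. The function names `parse_info` and `parse_stats_line` are hypothetical helpers written for this example, not part of OpenCGA, and the sketch assumes whitespace- or tab-separated columns in the order shown in the header line above.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string such as 'AC=1;AF=0.001' into a dict of strings."""
    stats = {}
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        # Flag-type INFO keys carry no '=value' part; store them as True.
        stats[key] = value if sep else True
    return stats


def parse_stats_line(line: str) -> dict:
    """Parse one data line of a samples-free VCF (8 fixed columns)."""
    chrom, pos, _id, ref, alt, _qual, _filter, info = line.split()[:8]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "info": parse_info(info),
    }


# First data line of the example above.
record = parse_stats_line(
    "22 16050115 . G A . PASS AC=1;AF=0.001;AN=1000;AFK_AF=0.002008"
)
print(record["pos"], record["info"]["AFK_AF"])
```

Values are kept as strings because the type of each key is declared in the `##INFO` header lines; a fuller reader would convert them according to those declarations.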
TSV (Tab Separated Values): Simple format with one group of columns per cohort, each column named after the cohort and the statistic (for example ALL_AF).
#CHR POS REF ALT ALL_AN ALL_AC ALL_AF ALL_HET ALL_HOM
22 16050213 C T 1000 1 0.001 0.002 0.0
22 16050607 G A 1000 2 0.002 0.004 0.0
22 16050740 A - 1000 1 0.001 0.002 0.0
22 16050840 C G 1000 13 0.013 0.026 0.0
22 16051075 G A 1000 2 0.002 0.004 0.0
22 16051249 T C 1000 91 0.091 0.162 0.01
22 16051453 A C 998 74 0.074 0.144 0.004
22 16051453 A G 926 2 0.002 0.144 0.004
22 16051723 A - 1000 12 0.012 0.024 0.0
22 16051816 T G 1000 2 0.002 0.004 0.0
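The TSV layout above can be loaded with a few lines of Python. This is a sketch under stated assumptions: `read_stats_tsv` is a hypothetical helper, the first four columns are taken to be CHR, POS, REF and ALT, and each remaining column name is assumed to have the form `<COHORT>_<STAT>`, as in the header shown. It also assumes no field is empty.

```python
import io

# Two rows copied from the example above, written with explicit tabs.
SAMPLE = (
    "#CHR\tPOS\tREF\tALT\tALL_AN\tALL_AC\tALL_AF\tALL_HET\tALL_HOM\n"
    "22\t16050213\tC\tT\t1000\t1\t0.001\t0.002\t0.0\n"
    "22\t16050607\tG\tA\t1000\t2\t0.002\t0.004\t0.0\n"
)


def read_stats_tsv(handle):
    """Read a statistics TSV and group the stat columns by cohort name."""
    header = handle.readline().lstrip("#").split()
    rows = []
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        values = dict(zip(header, line.split()))
        row = {
            "chrom": values["CHR"],
            "pos": int(values["POS"]),
            "ref": values["REF"],
            "alt": values["ALT"],
            "cohorts": {},
        }
        for column in header[4:]:
            # Split on the last underscore so cohort names may contain '_'.
            cohort, stat = column.rsplit("_", 1)
            row["cohorts"].setdefault(cohort, {})[stat] = float(values[column])
        rows.append(row)
    return rows


rows = read_stats_tsv(io.StringIO(SAMPLE))
print(rows[1]["cohorts"]["ALL"]["AF"])
```

Grouping by cohort keeps the reader independent of how many cohorts the export contains.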
JSON: Variant model containing only minimal information and the statistics.
{"reference":"T","chromosome":"22","alternate":"C","start":16174643,"annotation":null,"names":[],"id":"22:16174643:T:C","type":"SNV","studies":[{"format":[],"samplesData":[],"studyId":"user@p1:s1","stats":{},"files":[],"secondaryAlternates":[]}],"end":16174643,"strand":"+","sv":null,"hgvs":{},"length":1}
{"reference":"C","chromosome":"22","alternate":"T","start":16176715,"annotation":null,"names":[],"id":"22:16176715:C:T","type":"SNV","studies":[{"format":[],"samplesData":[],"studyId":"user@p1:s1","stats":{},"files":[],"secondaryAlternates":[]}],"end":16176715,"strand":"+","sv":null,"hgvs":{},"length":1}
{"reference":"C","chromosome":"22","alternate":"A","start":16176724,"annotation":null,"names":[],"id":"22:16176724:C:A","type":"SNV","studies":[{"format":[],"samplesData":[],"studyId":"user@p1:s1","stats":{},"files":[],"secondaryAlternates":[]}],"end":16176724,"strand":"+","sv":null,"hgvs":{},"length":1}
{"reference":"T","chromosome":"22","alternate":"C","start":16176769,"annotation":null,"names":[],"id":"22:16176769:T:C","type":"SNV","studies":[{"format":[],"samplesData":[],"studyId":"user@p1:s1","stats":{},"files":[],"secondaryAlternates":[]}],"end":16176769,"strand":"+","sv":null,"hgvs":{},"length":1}
{"reference":"T","chromosome":"22","alternate":"A","start":16176926,"annotation":null,"names":[],"id":"rs141065546","type":"SNV","studies":[{"format":[],"samplesData":[],"studyId":"user@p1:s1","stats":{},"files":[],"secondaryAlternates":[]}],"end":16176926,"strand":"+","sv":null,"hgvs":{},"length":1}
{"reference":"G","chromosome":"22","alternate":"T","start":16176936,"annotation":null,"names":[],"id":"22:16176936:G:T","type":"SNV","studies":[{"format":[],"samplesData":[],"studyId":"user@p1:s1","stats":{},"files":[],"secondaryAlternates":[]}],"end":16176936,"strand":"+","sv":null,"hgvs":{},"length":1}
{"reference":"A","chromosome":"22","alternate":"G","start":16176994,"annotation":null,"names":[],"id":"22:16176994:A:G","type":"SNV","studies":[{"format":[],"samplesData":[],"studyId":"user@p1:s1","stats":{},"files":[],"secondaryAlternates":[]}],"end":16176994,"strand":"+","sv":null,"hgvs":{},"length":1}
Population Frequencies (Cellbase mode): Specific JSON format for import into Cellbase variation. It is a Variant model whose VariantAnnotation contains only PopulationFrequencies.
{"names":[],"reference":"T","chromosome":"22","alternate":"C","start":16174643,"annotation":{"populationFrequencies":[{"study":"s1","population":"ALL","refAllele":"T","altAllele":"C","refAlleleFreq":0.999,"altAlleleFreq":0.001,"refHomGenotypeFreq":0.998,"hetGenotypeFreq":0.002,"altHomGenotypeFreq":0.0},{"study":"s1","population":"C1","refAllele":"T","altAllele":"C","refAlleleFreq":0.998,"altAlleleFreq":0.002,"refHomGenotypeFreq":0.996,"hetGenotypeFreq":0.004,"altHomGenotypeFreq":0.0}]},"end":16174643,"type":"SNV","studies":[],"strand":"+","hgvs":{},"length":1}
{"names":[],"reference":"C","chromosome":"22","alternate":"T","start":16176715,"annotation":{"populationFrequencies":[{"study":"s1","population":"ALL","refAllele":"C","altAllele":"T","refAlleleFreq":0.998,"altAlleleFreq":0.002,"refHomGenotypeFreq":0.996,"hetGenotypeFreq":0.004,"altHomGenotypeFreq":0.0},{"study":"s1","population":"C2","refAllele":"C","altAllele":"T","refAlleleFreq":0.99598396,"altAlleleFreq":0.004016064,"refHomGenotypeFreq":0.99196786,"hetGenotypeFreq":0.008032128,"altHomGenotypeFreq":0.0}]},"end":16176715,"type":"SNV","studies":[],"strand":"+","hgvs":{},"length":1}
{"names":[],"reference":"C","chromosome":"22","alternate":"A","start":16176724,"annotation":{"populationFrequencies":[{"study":"s1","population":"ALL","refAllele":"C","altAllele":"A","refAlleleFreq":0.999,"altAlleleFreq":0.001,"refHomGenotypeFreq":0.998,"hetGenotypeFreq":0.002,"altHomGenotypeFreq":0.0},{"study":"s1","population":"C2","refAllele":"C","altAllele":"A","refAlleleFreq":0.997992,"altAlleleFreq":0.002008032,"refHomGenotypeFreq":0.99598396,"hetGenotypeFreq":0.004016064,"altHomGenotypeFreq":0.0}]},"end":16176724,"type":"SNV","studies":[],"strand":"+","hgvs":{},"length":1}
{"names":[],"reference":"T","chromosome":"22","alternate":"C","start":16176769,"annotation":{"populationFrequencies":[{"study":"s1","population":"ALL","refAllele":"T","altAllele":"C","refAlleleFreq":0.999,"altAlleleFreq":0.001,"refHomGenotypeFreq":0.998,"hetGenotypeFreq":0.002,"altHomGenotypeFreq":0.0},{"study":"s1","population":"C2","refAllele":"T","altAllele":"C","refAlleleFreq":0.997992,"altAlleleFreq":0.002008032,"refHomGenotypeFreq":0.99598396,"hetGenotypeFreq":0.004016064,"altHomGenotypeFreq":0.0}]},"end":16176769,"type":"SNV","studies":[],"strand":"+","hgvs":{},"length":1}
{"names":[],"reference":"T","chromosome":"22","alternate":"A","start":16176926,"annotation":{"populationFrequencies":[{"study":"s1","population":"C3","refAllele":"T","altAllele":"A","refAlleleFreq":0.5,"altAlleleFreq":0.5,"refHomGenotypeFreq":0.0,"hetGenotypeFreq":1.0,"altHomGenotypeFreq":0.0},{"study":"s1","population":"ALL","refAllele":"T","altAllele":"A","refAlleleFreq":0.473,"altAlleleFreq":0.527,"refHomGenotypeFreq":0.166,"hetGenotypeFreq":0.614,"altHomGenotypeFreq":0.22},{"study":"s1","population":"C1","refAllele":"T","altAllele":"A","refAlleleFreq":0.474,"altAlleleFreq":0.526,"refHomGenotypeFreq":0.164,"hetGenotypeFreq":0.62,"altHomGenotypeFreq":0.216},{"study":"s1","population":"C2","refAllele":"T","altAllele":"A","refAlleleFreq":0.4698795,"altAlleleFreq":0.5301205,"refHomGenotypeFreq":0.16465864,"hetGenotypeFreq":0.6104418,"altHomGenotypeFreq":0.2248996}]},"end":16176926,"type":"SNV","studies":[],"strand":"+","hgvs":{},"length":1}
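Since this output is one JSON object per line, the frequencies can be extracted with the standard `json` module. The sketch below is illustrative only: `alt_freq_by_population` is a hypothetical helper, and the sample line is a shortened copy of the first record above, keeping only the fields the helper reads.

```python
import json

# Shortened copy of the first record above (unused fields dropped).
LINE = (
    '{"reference":"T","chromosome":"22","alternate":"C","start":16174643,'
    '"annotation":{"populationFrequencies":['
    '{"study":"s1","population":"ALL","refAlleleFreq":0.999,"altAlleleFreq":0.001},'
    '{"study":"s1","population":"C1","refAlleleFreq":0.998,"altAlleleFreq":0.002}'
    "]}}"
)


def alt_freq_by_population(json_line: str) -> dict:
    """Map 'study:population' to the alternate allele frequency of one variant."""
    variant = json.loads(json_line)
    # "annotation" may be null, as in the plain JSON output format.
    annotation = variant.get("annotation") or {}
    return {
        f'{pf["study"]}:{pf["population"]}': pf["altAlleleFreq"]
        for pf in annotation.get("populationFrequencies", [])
    }


freqs = alt_freq_by_population(LINE)
print(freqs)
```

Keying by both study and population avoids collisions when cohorts from several studies share a name such as ALL.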
Operations
Variant Storage Operations are responsible for leaving variant data ready for querying and analysis. Examples of operations are VCF loading, integrity checks, sample genotype aggregation, indexing, and variant annotation. Operations can only be executed by admin users. Many operations write and update indexed data, which significantly improves the quality and performance of queries and analyses.
The OpenCGA Variant Storage Engine supports several operations to work with variant datasets:
Index Pipeline