- Created by Nacho Medina, last modified by Jacobo Coll on Apr 07, 2020
Variant Stats contain basic information about each variant, computed separately for each cohort.
Implementation
Variant Stats is implemented using Hadoop MapReduce over HBase.
Input
Parameters
OpenCGA supports the following input parameters:
- Variant Query
- Sample list, cohort or query
Output
Files
If the stats are not indexed, the analysis produces a variant stats file in JSON format with the following model schema:
Variant Stats Data Model
cohortId | Unique cohort identifier within the study. |
sampleCount | Count of samples with non-missing genotypes in this variant from the cohort. |
fileCount | Count of files with samples from the cohort that reported this variant. |
alleleCount | Total number of alleles in called genotypes. It does not include missing alleles. |
refAlleleCount | Number of reference alleles found in this variant. |
refAlleleFreq | Reference allele frequency calculated from refAlleleCount and alleleCount, in the range [0,1]. |
altAlleleCount | Number of main alternate alleles found in this variant. It does not include secondary alternates. |
altAlleleFreq | Alternate allele frequency calculated from altAlleleCount and alleleCount, in the range [0,1]. |
missingAlleleCount | Number of missing alleles. |
missingGenotypeCount | Number of genotypes with all alleles missing (e.g. ./.). It does not count partially missing genotypes like "./0" or "./1". |
genotypeCount | Number of occurrences for each genotype. |
genotypeFreq | Genotype frequency for each genotype found, calculated from genotypeCount and sampleCount, in the range [0,1]. |
maf | Minor allele frequency. Frequency of the less common allele between the reference and the main alternate alleles. |
mafAllele | Allele with minor frequency. |
mgf | Minor genotype frequency. Frequency of the less common genotype seen in this variant. |
mgfGenotype | Genotype with minor frequency. |
filterCount | The number of occurrences for each FILTER value in files from samples in this cohort reporting this variant. |
filterFreq | Frequency of each filter calculated from filterCount and fileCount, in the range [0,1]. |
qualityCount | The number of files from samples in this cohort reporting this variant with valid QUAL values. |
qualityAvg | The average Quality value for files with valid QUAL values from samples in this cohort reporting this variant. |
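To illustrate how the allele and genotype fields in the model relate to each other, here is a minimal Python sketch that derives them from a genotypeCount map. It is not OpenCGA's implementation (which runs as MapReduce over HBase). The function name `cohort_stats`, the `ref_allele` and `alt_allele` parameters, the tie-breaking for mafAllele, and the choice to derive sampleCount as all genotypes minus the fully missing ones are assumptions made for this example, and the counts are made up.

```python
import re
from typing import Dict


def _alleles(genotype: str) -> list:
    """Split a genotype such as '0/1' or '0|1' into its allele codes."""
    return re.split(r"[/|]", genotype)


def cohort_stats(genotype_count: Dict[str, int],
                 ref_allele: str = "REF",
                 alt_allele: str = "ALT") -> dict:
    """Derive allele and genotype stats from a genotype -> occurrences map.

    Hypothetical helper: allele code '0' is the reference, '1' the main
    alternate, '.' a missing allele, anything else a secondary alternate.
    """
    ref = alt = secondary = missing_alleles = missing_genotypes = 0
    for genotype, n in genotype_count.items():
        alleles = _alleles(genotype)
        if all(a == "." for a in alleles):
            missing_genotypes += n  # only fully missing genotypes, e.g. ./.
        for a in alleles:
            if a == ".":
                missing_alleles += n
            elif a == "0":
                ref += n
            elif a == "1":
                alt += n
            else:
                secondary += n  # called allele, but not the main alternate

    allele_count = ref + alt + secondary  # missing alleles are excluded
    # Assumption: every genotype that is not fully missing counts as a sample.
    sample_count = sum(genotype_count.values()) - missing_genotypes
    ref_freq = ref / allele_count if allele_count else 0.0
    alt_freq = alt / allele_count if allele_count else 0.0

    # Assumption: fully missing genotypes are left out of genotypeFreq.
    genotype_freq = {
        gt: n / sample_count
        for gt, n in genotype_count.items()
        if sample_count and not all(a == "." for a in _alleles(gt))
    }
    mgf_genotype = min(genotype_freq, key=genotype_freq.get) if genotype_freq else None

    return {
        "sampleCount": sample_count,
        "alleleCount": allele_count,
        "refAlleleCount": ref,
        "refAlleleFreq": ref_freq,
        "altAlleleCount": alt,
        "altAlleleFreq": alt_freq,
        "missingAlleleCount": missing_alleles,
        "missingGenotypeCount": missing_genotypes,
        "genotypeCount": dict(genotype_count),
        "genotypeFreq": genotype_freq,
        "maf": min(ref_freq, alt_freq),
        # Assumption: on a tie the alternate allele is reported.
        "mafAllele": ref_allele if ref_freq < alt_freq else alt_allele,
        "mgf": genotype_freq[mgf_genotype] if mgf_genotype else 0.0,
        "mgfGenotype": mgf_genotype,
    }


# Made-up counts for a cohort of 12 samples, 2 of them fully missing.
example = cohort_stats({"0/0": 6, "0/1": 3, "1/1": 1, "./.": 2}, "A", "G")
print(example["alleleCount"], example["refAlleleFreq"], example["maf"], example["mafAllele"])
```

With these counts, 20 alleles are called (15 reference, 5 alternate), so refAlleleFreq is 0.75 and maf is 0.25 for the alternate allele. The 4 missing alleles do not contribute to alleleCount, which matches the field descriptions above.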
Index
Pre-computed stats are useful for filtering variants. These stats are intra-study, calculated within a given cohort.
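The kind of filtering this enables can be sketched in a few lines of Python. This is only an illustration over in-memory records, not OpenCGA's query API: the function `filter_by_maf`, the record layout (`id` and `stats` keys), the cohort names, and all values are hypothetical. Only `cohortId` and `maf` come from the data model.

```python
from typing import Dict, Iterable, List


def filter_by_maf(variants: Iterable[Dict], cohort_id: str, max_maf: float) -> List[str]:
    """Return the ids of variants whose maf in the given cohort is <= max_maf.

    Hypothetical helper: each variant is a dict with an 'id' and a list of
    per-cohort 'stats' entries carrying 'cohortId' and 'maf'.
    """
    selected = []
    for variant in variants:
        for stats in variant["stats"]:
            # Stats are per cohort, so the cohort must match before comparing.
            if stats["cohortId"] == cohort_id and stats["maf"] <= max_maf:
                selected.append(variant["id"])
                break
    return selected


# Made-up records: two cohorts within the same study.
variants = [
    {"id": "1:1000:A:G", "stats": [{"cohortId": "ALL", "maf": 0.25},
                                   {"cohortId": "CASES", "maf": 0.02}]},
    {"id": "1:2000:C:T", "stats": [{"cohortId": "ALL", "maf": 0.01}]},
    {"id": "1:3000:G:A", "stats": [{"cohortId": "CASES", "maf": 0.4}]},
]

print(filter_by_maf(variants, "ALL", 0.05))
```

Because the stats are calculated per cohort, the same variant can pass the filter in one cohort and fail it in another, as the first record does here.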