Variant Stats contain basic information about each variant within a given cohort.
Variant Stats is implemented using Hadoop MapReduce over HBase.
OpenCGA supports different input parameters:
- Variant Query
- Sample list, cohort or query
The variant stats file includes the following values:
- The total number of alleles (it does not include missing alleles)
- The number of reference alleles found in this variant
- The number of main alternate alleles found in this variant (it does not include secondary alternates)
- The reference allele frequency, i.e., the quotient of the number of reference alleles divided by the total number of alleles.
- The alternate allele frequency, i.e., the quotient of the number of alternate alleles divided by the total number of alleles.
- The number of occurrences for each genotype
- The frequency for each genotype
- The number of missing alleles
- The number of missing genotypes
- The minor allele frequency (maf)
- The minor genotype frequency (mgf)
- The allele with the minor frequency
- The genotype with the minor frequency
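The values listed above can be illustrated with a minimal sketch in plain Python. This is not OpenCGA's implementation (which runs as Hadoop MapReduce over HBase); the function name `variant_stats`, the genotype string format, and the output keys are all hypothetical. It assumes that allele `0` is the reference, `1` is the main alternate, `.` is a missing allele, that a genotype counts as missing only when all of its alleles are missing, that phase is ignored when counting genotypes, and that the minor allele is chosen between the reference and the main alternate only.

```python
from collections import Counter


def variant_stats(genotypes):
    """Compute basic stats for one variant from genotype strings.

    Hypothetical helper: genotypes look like "0/1", "1|1" or "./.".
    """
    allele_counts = Counter()
    genotype_counts = Counter()
    missing_alleles = 0
    missing_genotypes = 0

    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        called = [a for a in alleles if a != "."]
        missing_alleles += len(alleles) - len(called)
        if not called:
            # Assumption: a genotype is missing only if every allele is missing.
            missing_genotypes += 1
            continue
        allele_counts.update(called)
        # Assumption: phase is ignored, so "1|0" and "0/1" are counted together.
        genotype_counts["/".join(sorted(alleles))] += 1

    # Total number of alleles, excluding missing alleles.
    allele_total = sum(allele_counts.values())
    ref_count = allele_counts["0"]
    alt_count = allele_counts["1"]  # main alternate only, no secondary alternates
    ref_freq = ref_count / allele_total if allele_total else 0.0
    alt_freq = alt_count / allele_total if allele_total else 0.0

    genotype_total = sum(genotype_counts.values())
    genotype_freqs = {g: n / genotype_total for g, n in genotype_counts.items()}

    # Assumption: the minor allele is picked between reference and main alternate.
    maf_allele, maf = ("0", ref_freq) if ref_freq <= alt_freq else ("1", alt_freq)
    mgf_genotype = min(genotype_freqs, key=genotype_freqs.get) if genotype_freqs else None
    mgf = genotype_freqs[mgf_genotype] if mgf_genotype is not None else 0.0

    return {
        "allele_count": allele_total,
        "ref_allele_count": ref_count,
        "alt_allele_count": alt_count,
        "ref_allele_freq": ref_freq,
        "alt_allele_freq": alt_freq,
        "genotype_count": dict(genotype_counts),
        "genotype_freq": genotype_freqs,
        "missing_allele_count": missing_alleles,
        "missing_genotype_count": missing_genotypes,
        "maf": maf,
        "mgf": mgf,
        "maf_allele": maf_allele,
        "mgf_genotype": mgf_genotype,
    }


if __name__ == "__main__":
    cohort = ["0/0", "0/1", "1/1", "./.", "0/2", "0|1"]
    for key, value in variant_stats(cohort).items():
        print(key, value)
```

In this example cohort the secondary alternate `2` contributes to the total number of alleles but not to the alternate allele count, and the `./.` sample is reported only through the missing counters.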
Pre-computed stats are useful for filtering variants. These stats are intra-study, calculated within a given cohort.