A gene is considered knocked out in a sample when a set of variants disables every copy of that gene.
This analysis obtains the list of knocked-out genes for each input sample.
Whether a variant is considered to disable a gene depends on the biotype of the gene and on the variant's annotated consequence type. In protein_coding genes, the consequence type must be one of the loss-of-function Sequence Ontology terms listed below. In genes with any other biotype, the consequence type is not checked. In all cases, the variant must also meet the other quality filter criteria.
Loss-of-function consequence types:
frameshift_variant
inframe_deletion
inframe_insertion
start_lost
stop_gained
stop_lost
splice_acceptor_variant
splice_donor_variant
transcript_ablation
transcript_amplification
initiator_codon_variant
splice_region_variant
incomplete_terminal_codon_variant
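The rule described above can be sketched in a few lines of Python. This is only an illustration of the documented logic, not the OpenCGA implementation: the function name `disables_gene` and the `passes_filters` flag are hypothetical, and the quality filters themselves are not modelled.

```python
# Loss-of-function Sequence Ontology terms, copied from the list above.
LOF_CONSEQUENCE_TYPES = frozenset({
    "frameshift_variant",
    "inframe_deletion",
    "inframe_insertion",
    "start_lost",
    "stop_gained",
    "stop_lost",
    "splice_acceptor_variant",
    "splice_donor_variant",
    "transcript_ablation",
    "transcript_amplification",
    "initiator_codon_variant",
    "splice_region_variant",
    "incomplete_terminal_codon_variant",
})


def disables_gene(biotype: str, consequence_type: str,
                  passes_filters: bool = True) -> bool:
    """Return True if a variant counts as disabling under the rule above.

    `passes_filters` is a hypothetical stand-in for the quality filter
    criteria, which this sketch does not implement.
    """
    if not passes_filters:
        return False
    if biotype == "protein_coding":
        # Only protein_coding genes are checked against the LoF term list.
        return consequence_type in LOF_CONSEQUENCE_TYPES
    # For any other biotype the consequence type is not checked.
    return True
```

Under these assumptions, `disables_gene("protein_coding", "stop_gained")` is true, `disables_gene("protein_coding", "missense_variant")` is false, and any consequence type in a gene with another biotype is accepted as long as the variant passes the filters.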
There are multiple scenarios in which a set of variants can be shown to affect all copies of a gene, in which case the gene is knocked out.
Implemented at opencga#1455.
Parameters can be grouped in three categories:
The analysis will produce one JSON file per sample, and one summary JSON file with aggregated information.
....
sample
genesCount
transcriptsCount
countByType
    homAltCount
    multiAllelicCount
    compHetCount
    deletionOverlapCount
genes[]
    id
    name
    transcripts[]
        id
        biotype
        homAltVariants[]
        multiAllelicVariants[]
        compHetVariants[]
        deletionOverlapVariants[]
{
  "sample" : "NA19600",
  "genesCount" : 5,
  "transcriptsCount" : 15,
  "countByType" : { "homAltCount" : 15, "multiAllelicCount" : 2, "compHetCount" : 0, "deletionOverlapCount" : 0 },
  "genes" : [
    {
      "id" : "ENSG00000186470",
      "name" : "BTN3A2",
      "transcripts" : [
        { "id" : "ENST00000377708", "biotype" : "protein_coding", "homAltVariants" : [ "6:26370605:T:C" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] },
        { "id" : "ENST00000508906", "biotype" : "protein_coding", "homAltVariants" : [ "6:26370605:T:C" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] },
        { "id" : "ENST00000356386", "biotype" : "protein_coding", "homAltVariants" : [ "6:26370605:T:C" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] },
        { "id" : "ENST00000527422", "biotype" : "protein_coding", "homAltVariants" : [ "6:26370605:T:C" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] },
        { "id" : "ENST00000396948", "biotype" : "protein_coding", "homAltVariants" : [ "6:26370605:T:C" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] },
        { "id" : "ENST00000527639", "biotype" : "protein_coding", "homAltVariants" : [ "6:26370605:T:C" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] },
        { "id" : "ENST00000396934", "biotype" : "protein_coding", "homAltVariants" : [ "6:26370605:T:C" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] }
      ]
    },
    {
      "id" : "ENSG00000198919",
      "name" : "DZIP3",
      "transcripts" : [
        { "id" : "ENST00000361582", "biotype" : "protein_coding", "homAltVariants" : [ "3:108634973:C:A" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] },
        { "id" : "ENST00000463306", "biotype" : "protein_coding", "homAltVariants" : [ "3:108634973:C:A" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] },
        { "id" : "ENST00000479138", "biotype" : "protein_coding", "homAltVariants" : [ "3:108634973:C:A" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] }
      ]
    },
    {
      "id" : "ENSG00000215182",
      "name" : "MUC5AC",
      "transcripts" : [
        { "id" : "ENST00000621226", "biotype" : "protein_coding", "homAltVariants" : [ ], "multiAllelicVariants" : [ "11:1158073:T:C", "11:1158073:T:-" ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] }
      ]
    },
    {
      "id" : "ENSG00000147874",
      "name" : "HAUS6",
      "transcripts" : [
        { "id" : "ENST00000380496", "biotype" : "protein_coding", "homAltVariants" : [ "9:19058483:C:A" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] },
        { "id" : "ENST00000380502", "biotype" : "protein_coding", "homAltVariants" : [ "9:19058483:C:A" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] }
      ]
    },
    {
      "id" : "ENSG00000099937",
      "name" : "SERPIND1",
      "transcripts" : [
        { "id" : "ENST00000215727", "biotype" : "protein_coding", "homAltVariants" : [ "22:20780030:-:C" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] },
        { "id" : "ENST00000406799", "biotype" : "protein_coding", "homAltVariants" : [ "22:20780030:-:C" ], "multiAllelicVariants" : [ ], "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] }
      ]
    }
  ]
}
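A per-sample result with this structure can be post-processed with a few lines of Python. The following is a minimal sketch, assuming only the field names shown in the example; the helper `knockout_variants_by_gene` is hypothetical and not part of OpenCGA, and the embedded document is a trimmed copy of two genes from the example.

```python
import json
from collections import defaultdict

# Per-transcript variant lists, named as in the example result.
VARIANT_FIELDS = (
    "homAltVariants",
    "multiAllelicVariants",
    "compHetVariants",
    "deletionOverlapVariants",
)


def knockout_variants_by_gene(result: dict) -> dict:
    """Map each gene name to its non-empty variant lists, merged across transcripts."""
    summary = {}
    for gene in result.get("genes", []):
        by_type = defaultdict(set)
        for transcript in gene.get("transcripts", []):
            for field in VARIANT_FIELDS:
                by_type[field].update(transcript.get(field, []))
        # Drop knockout types with no variants and sort for stable output.
        summary[gene["name"]] = {
            field: sorted(variants)
            for field, variants in by_type.items()
            if variants
        }
    return summary


# Trimmed copy of the example (two genes only), for illustration.
example = json.loads("""
{ "sample" : "NA19600",
  "genes" : [
    { "id" : "ENSG00000215182", "name" : "MUC5AC", "transcripts" : [
      { "id" : "ENST00000621226", "biotype" : "protein_coding",
        "homAltVariants" : [ ],
        "multiAllelicVariants" : [ "11:1158073:T:C", "11:1158073:T:-" ],
        "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] } ] },
    { "id" : "ENSG00000147874", "name" : "HAUS6", "transcripts" : [
      { "id" : "ENST00000380496", "biotype" : "protein_coding",
        "homAltVariants" : [ "9:19058483:C:A" ], "multiAllelicVariants" : [ ],
        "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] },
      { "id" : "ENST00000380502", "biotype" : "protein_coding",
        "homAltVariants" : [ "9:19058483:C:A" ], "multiAllelicVariants" : [ ],
        "compHetVariants" : [ ], "deletionOverlapVariants" : [ ] } ] }
  ] }
""")

summary = knockout_variants_by_gene(example)
```

Merging across transcripts collapses the repeated variant identifiers, so each gene is reported once per knockout type rather than once per transcript.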