OpenCGA wraps the Samtools and GATK packages, so users can call variants directly from their alignment (BAM) files. This tutorial shows how to run the Samtools and GATK wrappers from the command line.

Prerequisites 

A working OpenCGA installation is required to set up a testing environment; please follow the steps in the installation guide.

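In addition, you need to download the data files used below, which the commands expect in your home directory: the reference genome (ref.fasta) and the alignment file (mother.bam).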

Variant Calling

Preparing the reference genome: index and dictionary files

In order to use the ref.fasta file, it has to be linked to the OpenCGA catalog:

$ ./opencga.sh files link -i ~/ref.fasta --path call/ --parents

Once the FASTA file is linked, index it by running the Samtools wrapper with the faidx command. The FASTA index file (ref.fasta.fai) is created in the same folder as the input FASTA file (ref.fasta):

$ ./opencga.sh alignments samtools --command faidx --input-file ref.fasta

You also need to create the sequence dictionary file for that FASTA file, this time by running the Samtools wrapper with the dict command. The sequence dictionary file (ref.dict) is created in the same folder as the input FASTA file (ref.fasta):

$ ./opencga.sh alignments samtools --command dict --input-file ref.fasta
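
For orientation, the two wrapper calls above are named after the standalone samtools subcommands faidx and dict. The sketch below runs those subcommands on a tiny made-up FASTA file so you can see which files they produce. It assumes samtools is installed locally and skips the calls otherwise; toy.fasta is a hypothetical file name, and whether the OpenCGA wrapper passes exactly these arguments is an assumption.

```shell
#!/bin/sh
# Sketch: standalone samtools equivalents of the faidx and dict steps,
# run on a tiny made-up FASTA file. Assumes samtools is installed locally;
# toy.fasta is a hypothetical file name used only for illustration.
workdir=$(mktemp -d)
printf '>chr1\nACGTACGT\nACGT\n' > "$workdir/toy.fasta"

if command -v samtools >/dev/null 2>&1; then
    # Creates toy.fasta.fai next to the input file.
    samtools faidx "$workdir/toy.fasta"
    # Writes the sequence dictionary (SAM header lines) to toy.dict.
    samtools dict -o "$workdir/toy.dict" "$workdir/toy.fasta"
    ls "$workdir"
else
    echo "samtools not found; skipping faidx and dict" >&2
fi
```

With samtools available, the temporary directory should end up holding toy.fasta, toy.fasta.fai and toy.dict, mirroring the ref.fasta.fai and ref.dict files described above.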

Preparing the alignment file: sort and index BAM file

In order to use the mother.bam file, it has to be linked to the OpenCGA catalog:

$ ./opencga.sh files link -i ~/mother.bam --path call/ --parents

Next, sort the BAM file by running the Samtools wrapper with the sort command:

$ ./opencga.sh alignments samtools --command sort --input-file mother.bam --output-filename mother.sorted.bam

Once the BAM file is sorted, index it by running the Samtools wrapper with the index command. This creates the BAM index file (mother.sorted.bam.bai):

$ ./opencga.sh alignments samtools --command index --input-file mother.sorted.bam
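
If you have a local copy of the sorted file, a quick sanity check is to confirm that its header declares coordinate sort order, which is what samtools sort writes by default. The helper below is a minimal sketch, not part of OpenCGA: header_is_coordinate_sorted is a hypothetical function name, and the commented usage line assumes a local samtools installation.

```shell
#!/bin/sh
# Sketch: check that a BAM header declares coordinate sort order.
# Reads SAM header text on stdin and succeeds only if the @HD line
# carries the SO:coordinate tag that "samtools sort" writes by default.
header_is_coordinate_sorted() {
    grep '^@HD' | grep -q 'SO:coordinate'
}

# Usage with a local copy of the file (assumes samtools is installed):
#   samtools view -H mother.sorted.bam | header_is_coordinate_sorted \
#       && echo "coordinate sorted"

# Self-contained demonstration on a hand-written header line.
if printf '@HD\tVN:1.6\tSO:coordinate\n' | header_is_coordinate_sorted; then
    echo "coordinate sorted"
fi
```

The function reads only header text, so it can be tried on hand-written lines without any BAM file at hand.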

Variant calling

You can call variants by running the GATK wrapper with the HaplotypeCaller command:

$ ./opencga.sh variant gatk --command HaplotypeCaller --fasta-file ref.fasta --bam-file mother.sorted.bam --vcf-filename mother.vcf

The variant calls are saved in the VCF file mother.vcf, which can be downloaded from the OpenCGA catalog to the local directory /tmp with the following command:

$ ./opencga.sh files download --file mother.vcf --to /tmp
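
The steps in this tutorial can be collected into a single script. The following is a minimal sketch that uses only the commands shown above. By default it prints each command instead of running it; OPENCGA, RUN, step and call_variants are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Sketch: the tutorial's commands collected into one script.
# By default each command is only printed; set RUN=1 to execute it.
# OPENCGA, RUN, step and call_variants are hypothetical names.
OPENCGA="${OPENCGA:-./opencga.sh}"

# Print the command (dry run) or execute it and stop on the first failure.
step() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@" || exit 1
    else
        echo "$*"
    fi
}

call_variants() {
    # Reference genome: link, index, create the sequence dictionary.
    step "$OPENCGA" files link -i "$HOME/ref.fasta" --path call/ --parents
    step "$OPENCGA" alignments samtools --command faidx --input-file ref.fasta
    step "$OPENCGA" alignments samtools --command dict --input-file ref.fasta

    # Alignment file: link, sort, index.
    step "$OPENCGA" files link -i "$HOME/mother.bam" --path call/ --parents
    step "$OPENCGA" alignments samtools --command sort --input-file mother.bam --output-filename mother.sorted.bam
    step "$OPENCGA" alignments samtools --command index --input-file mother.sorted.bam

    # Variant calling and download of the resulting VCF file.
    step "$OPENCGA" variant gatk --command HaplotypeCaller --fasta-file ref.fasta --bam-file mother.sorted.bam --vcf-filename mother.vcf
    step "$OPENCGA" files download --file mother.vcf --to /tmp
}

call_variants
```

Run it once as is to review the eight commands, then with RUN=1 to execute them. The script assumes each command has finished before the next one starts; if your installation queues these steps as jobs, wait for each one to complete before continuing.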
