OpenCGA uses an ARM template to automatically deploy an Azure Batch pool with a preconfigured OpenCGA Docker image. Autoscaling is enabled on this pool, so it grows as the number of variant index jobs increases. The ARM template also populates the following section of "configuration.yml", which enables the OpenCGA daemon to submit jobs (Azure tasks) to the Azure Batch service.
Configuration
configuration.yml
....
execution:
  mode: AZURE
  ...
  options:
    # Azure Batch Service information
    batchAccount: "batchAccount"
    batchKey: "batchKey"
    batchUri: "https://batchservice.uksouth.batch.azure.com"
    batchPoolId: "poolId"
    dockerImageName: "openCGADockerImageName"   # preconfigured Docker image
    dockerArgs: "dockerRunOptions"              # e.g. mount points
....
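Before starting the daemon, it can be useful to check that the ARM deployment actually filled in every Azure Batch option. The following is a minimal sketch, not part of OpenCGA: the function name `check_execution` is hypothetical, the execution section is assumed to be already loaded into a Python dict, and treating every option from the snippet as mandatory is an assumption.

```python
# Option names are taken from the configuration.yml snippet; treating all of
# them as mandatory is an assumption made for this sketch.
REQUIRED_OPTIONS = (
    "batchAccount", "batchKey", "batchUri",
    "batchPoolId", "dockerImageName", "dockerArgs",
)


def check_execution(execution):
    """Return a list of human-readable problems found in the execution section."""
    problems = []
    if execution.get("mode") != "AZURE":
        problems.append("mode is not AZURE")
    options = execution.get("options") or {}
    for name in REQUIRED_OPTIONS:
        # An absent key and an empty value are both reported as missing.
        if not options.get(name):
            problems.append("missing option: " + name)
    return problems


# Placeholder values copied from the configuration snippet.
example = {
    "mode": "AZURE",
    "options": {
        "batchAccount": "batchAccount",
        "batchKey": "batchKey",
        "batchUri": "https://batchservice.uksouth.batch.azure.com",
        "batchPoolId": "poolId",
        "dockerImageName": "openCGADockerImageName",
        "dockerArgs": "dockerRunOptions",
    },
}
print(check_execution(example))
```

An empty list means that the mode is AZURE and all six options are present. The check says nothing about whether the account, key, or pool actually exist in Azure.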
Variant Index Job Creation
Once a user creates an OpenCGA variant indexing job, it is stored in the OpenCGA catalog. For example, the following commands link a file into the catalog and start an index pipeline, which is stored internally as a catalog job:
OpenCGA Variant Index Job Creation
./opencga.sh files link -i ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz -s "study"
./opencga.sh variant index --file ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz --calculate-stats --annotate -o tmp
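When many files have to be indexed, the two commands can be generated rather than typed by hand. The sketch below only assembles the argument lists using the subcommands and flags shown above. The helper name `build_index_commands` and its parameters are hypothetical, and nothing is executed.

```python
import shlex


def build_index_commands(vcf, study, outdir="tmp", cli="./opencga.sh"):
    """Build the link and index command lines for one VCF file."""
    # Step 1: register the file in the catalog under the given study.
    link = [cli, "files", "link", "-i", vcf, "-s", study]
    # Step 2: create the variant index job, with stats and annotation.
    index = [cli, "variant", "index", "--file", vcf,
             "--calculate-stats", "--annotate", "-o", outdir]
    return [link, index]


commands = build_index_commands(
    "ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
    "study",
)
for argv in commands:
    # shlex.join quotes arguments where the shell would need it.
    print(shlex.join(argv))
```

Each list could be passed to `subprocess.run(argv, check=True)` to run the commands. Keeping the arguments as lists avoids shell quoting problems with unusual file names.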