...

On many occasions, variant annotation users want to complement the core CellBase annotation with custom annotations, either from their own files or from other resources not currently integrated in the CellBase database. For this purpose, the variant annotation command line lets the user specify a set of files with custom annotation data. Currently, only VCF files are allowed as custom input. Any attribute value in the INFO field can be used during the annotation process. Three parameters in the command line control custom annotation behaviour:

...

Code Block
languagebash
          --custom-file                STRING     String with a comma separated list (no spaces in between) of files with custom annotation to be
                                                  included during the annotation process. File format must be VCF. For example: file1.vcf,file2.vcf
          --custom-file-fields         STRING     String containing a colon separated list (no spaces in between) of field lists which indicate the
                                                  info fields to be taken from each VCF file. For example:
                                                  field1File1,field2File1:field1File2,field3File2
          --custom-file-id             STRING     String with a comma separated list (no spaces in between) of short identifiers for each custom file.
                                                  For example: fileId1,fileId2
  1. First, you need to specify the set of files to be used as custom annotation input. You can do this by using the `--custom-file` parameter and providing a comma separated list (no spaces in between) of VCF files. For example, /tmp/file1.vcf,/tmp/file2.vcf,/tmp/file3.vcf.
  2. Next, you need to specify which INFO attributes must be taken from each of the files. You can do this by using the `--custom-file-fields` parameter and providing a colon separated list (no spaces in between) of INFO tag lists, where each tag list is itself comma separated. For example, field1File1,field2File1:field1File2,field3File2.
  3. Last, you need to provide one label or tag per file. This label identifies the file in the variant annotation output. You can do this by using the `--custom-file-id` parameter and providing a comma separated list (no spaces in between) of short labels/tags/identifiers. For example, fileId1,fileId2.
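
The three values can also be assembled from shell arrays, which makes it easier to keep files, field lists and identifiers aligned. The following is a minimal sketch, not part of CellBase: `join_by` is a hypothetical helper, and the check assumes that every custom file needs exactly one field list and one identifier.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of CellBase): join arguments with a separator.
join_by() {
  local IFS="$1"   # first character of IFS is used by "$*" as the separator
  shift
  echo "$*"
}

# One entry per custom VCF file; the i-th field list and id belong to the i-th file.
files=(/tmp/file1.vcf /tmp/file2.vcf /tmp/file3.vcf)
fields=("AF,DP" "AF,DP" "AF,DP")
ids=(file1 file2 file3)

# Assumption: the three lists must have the same number of entries.
if [[ ${#files[@]} -ne ${#fields[@]} || ${#files[@]} -ne ${#ids[@]} ]]; then
  echo "each custom file needs one field list and one identifier" >&2
  exit 1
fi

custom_file=$(join_by , "${files[@]}")           # comma separated files
custom_file_fields=$(join_by : "${fields[@]}")   # colon separated field lists
custom_file_id=$(join_by , "${ids[@]}")          # comma separated identifiers

# Print the resulting options instead of running the annotation.
echo "--custom-file $custom_file"
echo "--custom-file-fields $custom_file_fields"
echo "--custom-file-id $custom_file_id"
```

The printed options can then be appended to a `variant-annotation` command line.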

A variant-annotation command including three custom files would look like:

Code Block
languagebash
cellbase$ build/bin/cellbase.sh variant-annotation -i /tmp/test.vcf.gz \
  -o /tmp/test.json.gz \
  --species hsapiens \
  --assembly GRCh37 \
  --custom-file /tmp/file1.vcf,/tmp/file2.vcf,/tmp/file3.vcf \
  --custom-file-fields AF,DP:AF,DP:AF,DP \
  --custom-file-id file1,file2,file3
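
Before choosing values for `--custom-file-fields`, it can help to check which INFO tags a custom VCF file declares in its header. The sketch below is only an illustration: `list_info_tags` is a hypothetical helper, not a CellBase command, and the sample file is a toy VCF written to a temporary path. It relies on the standard `##INFO=<ID=...>` header lines, so tags that appear in data lines without being declared in the header would not be listed.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of CellBase): print the INFO tag IDs
# declared in the header of an uncompressed VCF file, one per line.
list_info_tags() {
  grep '^##INFO=<ID=' "$1" | sed -E 's/^##INFO=<ID=([^,>]+).*/\1/'
}

# Toy VCF file used only for illustration; removed when the script exits.
sample_vcf=$(mktemp)
trap 'rm -f "$sample_vcf"' EXIT

{
  printf '##fileformat=VCFv4.2\n'
  printf '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">\n'
  printf '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf '1\t12345\t.\tA\tG\t.\t.\tAF=0.01;DP=100\n'
} > "$sample_vcf"

# Lists the tags that could be requested for this file.
list_info_tags "$sample_vcf"
```

For a file like this one, the corresponding field list in `--custom-file-fields` would be AF,DP.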

...