Overview

There are two ways to query variants in OpenCGA from the Command Line Interface (CLI):

  • opencga.sh: the user command line. It works remotely (from outside the OpenCGA cluster) by querying the REST or gRPC services, and it can also query Catalog data.
  • opencga-analysis.sh: a private, internal command line. It is not intended for users and only works inside the OpenCGA cluster.

Although both command lines provide similar functionality, users are expected to use opencga.sh. Both can be found in the _bin_ folder of the OpenCGA installation directory.

Using opencga.sh

This command line lets you query by:

  • genomic regions and feature IDs such as gene and SNP
  • variant annotation such as consequence types, conservation scores, polyphen, sift or population frequencies
  • sample genotypes
  • variant stats in the study
  • some basic aggregations such as ranks, group-by or counts

All these filters can be combined. Some query modifiers are also implemented:

  • skip and limit
  • count: this can be added to all CLIs and returns the number of results

From the $OPENCGA_HOME folder, run the following command to see all the parameters:

./bin/opencga.sh variants query -h

NOTE: for security reasons, a standard OpenCGA installation requires you to log in before using this CLI. This guarantees that you only access the data you have permission to see. To log in, execute:

./bin/opencga.sh users login -u USER -p PASSWORD

A session token will be stored in your home directory and used internally by OpenCGA Storage.
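
The login and help commands above can be wrapped in a small script when they need to be run repeatedly. The following is only a sketch: the default installation path /opt/opencga, the OPENCGA_DRY_RUN variable and the helper function names are assumptions made for illustration and are not part of OpenCGA. By default the helpers only print the command they would run.

```shell
#!/bin/sh
# Hypothetical wrapper around the two commands documented above.
# ASSUMPTION: /opt/opencga is a placeholder for your installation directory.
OPENCGA_HOME="${OPENCGA_HOME:-/opt/opencga}"
OPENCGA_CLI="$OPENCGA_HOME/bin/opencga.sh"

# Print the command (dry run, the default) or execute it when
# OPENCGA_DRY_RUN=0. OPENCGA_DRY_RUN is an illustrative variable,
# not an OpenCGA setting.
opencga_cmd() {
    if [ "${OPENCGA_DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$OPENCGA_CLI $*"
    else
        "$OPENCGA_CLI" "$@"
    fi
}

# Log in with the documented "users login" command.
opencga_login() {
    opencga_cmd users login -u "$1" -p "$2"
}

# Show every parameter accepted by the documented "variants query" command.
opencga_query_help() {
    opencga_cmd variants query -h
}
```

After sourcing the script, calling opencga_login USER PASSWORD followed by opencga_query_help prints the two commands. Setting OPENCGA_DRY_RUN=0 executes them against a real installation instead.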

