As described in the documentation, the HGVA backend is powered by the OpenCGA project. The CLI is distributed with the rest of the OpenCGA code. The OpenCGA code can be cloned to your machine by executing the following commands in your terminal. Check out the latest code (release-1.1.0 branch):
$ git clone https://github.com/opencb/opencga.git
$ git checkout v1.3.6
Alternatively, you can download tar.gz files with the code for the latest tags/releases of OpenCGA from:
https://github.com/opencb/opencga/releases
Once you have downloaded the code, follow the instructions in the How to Build section of the OpenCGA repository:
https://github.com/opencb/opencga
The CLI is accessible through the opencga.sh script:
cd opencga
cd build
cd bin
opencga/build/bin$ ./opencga.sh

Program:     OpenCGA (OpenCB)
Version:     1.1.0
Git commit:  f2dace56fcdf491efee8ebb0cb43f981e31c320e
Description: Big Data platform for processing and analysing NGS data

Usage:       opencga.sh [-h|--help] [--version] <command> [options]

Catalog commands:
    users        User commands
    projects     Project commands
    studies      Study commands
    files        File commands
    jobs         Jobs commands
    individuals  Individual commands
    families     Family commands
    samples      Samples commands
    variables    Variable set commands
    cohorts      Cohorts commands

Analysis commands:
    alignments   Implement several tools for the genomic alignment analysis
    variant      Variant commands
The CLI provides commands, subcommands and parameters to access its functionality. The commands of most interest to HGVA users are projects, studies, cohorts and samples; each is described below. Further documentation on the OpenCGA CLI can be found in the Command Lines section of the OpenCGA documentation.
As mentioned previously, the CLI makes intensive use of the RESTful API. Thus, the only configuration detail needed for the CLI to work is the URL where the Web Services API is hosted. The configuration file client-configuration.yml is used for this purpose. You will find a template of this file in the build/conf directory:
$ ll opencga/build/conf/client-configuration.yml
-rw-r--r-- 1 fjlopez fjlopez 290 Oct 24 17:49 opencga/build/conf/client-configuration.yml
Edit this file with any text editor and set the rest → host attribute to "http://bioinfo.hpc.cam.ac.uk/hgva":
---
## number of seconds that session remain open
sessionDuration: 12000

## REST client configuration options
rest:
  host: "http://bioinfo.hpc.cam.ac.uk/hgva"
  batchQuerySize: 200
  timeout: 30000
  defaultLimit: 2000

## gRPC configuration options
grpc:
  host: "localhost:9091"
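If you prefer to script this edit, the following is a minimal sketch. It assumes the file keeps the layout of the template shown here, with host: indented under a top-level rest: key. The function name set_rest_host is hypothetical and not part of OpenCGA.

```shell
# set_rest_host FILE URL
# Rewrites the "host:" entry inside the top-level "rest:" block of FILE and
# leaves any other "host:" entry (for example the one under "grpc:") untouched.
# Note: URL must not contain "&" or "\", which are special in awk replacements.
set_rest_host() {
    file=$1
    url=$2
    tmp=$(mktemp) || return 1
    awk -v url="$url" '
        # Any non-indented, non-comment line starts a new top-level section.
        /^[^[:space:]#]/ { section = $1 }
        # Only touch the indented host line while inside the rest: section.
        section == "rest:" && /^[[:space:]]+host:/ {
            sub(/host:.*/, "host: \"" url "\"")
        }
        { print }
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Copy the contents back so the original file permissions are preserved.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, from the directory that contains the opencga checkout: set_rest_host opencga/build/conf/client-configuration.yml "http://bioinfo.hpc.cam.ac.uk/hgva"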
You can query variants by using the variant command and its query subcommand. An extensive list of filtering parameters allows great flexibility in the queries. Please check the inline help provided by opencga.sh for further details. For example, get TTN variants from the Genome of the Netherlands (GONL) study, which is framed within the reference_grch37 project. We will restrict the returned study data to that of GONL. Finally, we will also limit the number of returned results to 3:
./opencga.sh variant query --gene TTN --study GONL --limit 3 --of json --output-study GONL
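To repeat the same query for several genes, the command can be wrapped in a small shell loop. This is a minimal sketch that uses only the options shown in the example above. The function name query_genes and the OPENCGA variable are illustrative assumptions, not part of the CLI.

```shell
# Path to the CLI script; override OPENCGA if you are not in opencga/build/bin.
: "${OPENCGA:=./opencga.sh}"

# query_genes STUDY LIMIT GENE...
# Runs one "variant query" per gene, restricted to STUDY, with JSON output.
# Stops at the first query that fails.
query_genes() {
    study=$1
    limit=$2
    shift 2
    for gene in "$@"; do
        "$OPENCGA" variant query --gene "$gene" --study "$study" \
            --limit "$limit" --of json --output-study "$study" || return 1
    done
}
```

For example, query_genes GONL 3 TTN BRCA2 would run the query above for TTN and then once more for BRCA2 (the second gene is only an illustration).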
You can use the command projects to query projects data.
To get all metadata for a particular project, use the info subcommand. For example, getting all metadata for the cancer_grch37 project:
./opencga.sh projects info --project cancer_grch37
To get all metadata for all studies associated with a particular project, use the studies subcommand. For example, getting all studies and their metadata for the cancer_grch37 project:
./opencga.sh projects studies --project cancer_grch37
You can use the command studies to query studies data.
To get all available studies and their metadata, use the search subcommand. For example, getting all metadata for all available studies (please note that the alias field is of special interest here: it contains the study identifier to be used whenever a study must be passed as a parameter):
./opencga.sh studies search
To get summary data for a particular study, use the summary subcommand:
./opencga.sh studies summary --study reference_grch37:1kG_phase3
To get all available metadata for a particular study, use the info subcommand. For example, getting all metadata for study GONL, which is framed within the project reference_grch37:
./opencga.sh studies info --study GONL
To get all samples metadata for a given study, use the samples subcommand. For example, getting all samples metadata for study 1kG_phase3, which is framed within project reference_grch37. Please note that not all studies contain samples data: some, such as GONL and ExAC, only provide variant lists and aggregated frequencies, i.e. no sample genotypes.
./opencga.sh studies samples --study reference_grch37:1kG_phase3
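The info, summary and samples subcommands can be chained to collect everything known about one study in a single pass. This is a minimal sketch under the assumption that each subcommand accepts the same --study value, as in the examples above. The function name study_overview and the OPENCGA variable are hypothetical.

```shell
# Path to the CLI script; override OPENCGA if you are not in opencga/build/bin.
: "${OPENCGA:=./opencga.sh}"

# study_overview STUDY
# Runs "studies info", "studies summary" and "studies samples" for STUDY.
# A short progress header goes to stderr so stdout holds only CLI output.
study_overview() {
    study=$1
    for sub in info summary samples; do
        printf 'studies %s --study %s\n' "$sub" "$study" >&2
        "$OPENCGA" studies "$sub" --study "$study" || return 1
    done
}
```

For example, study_overview reference_grch37:1kG_phase3. For studies without sample data, such as GONL, the last call may return nothing useful.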
You can use the command samples to query samples data.
./opencga.sh samples info --sample HG00096 --study reference_grch37:1kG_phase3
You can use the cohorts command to query cohorts data.
./opencga.sh cohorts samples --study reference_grch37:1kG_phase3 --cohort GBR