OpenCGA RESTful Web Services

HGVA is powered by the Open Computational Genomic Analysis (OpenCGA) project. OpenCGA implements an extensive API that enables numerous operations over metadata, samples and genomic data. The whole API specification can be accessed at:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices

A description of the API and URL design can be found in the Using OpenCGA documentation.

The tutorial Using RESTful Web Services shows practical examples of how to query the RESTful API directly. It focuses on the endpoints of most interest to HGVA users, giving examples of their use and pointing out certain peculiarities of the parameters for HGVA. Data is hierarchically organised in Projects and Studies. To understand the API behaviour, first have a look at Data: sources and HGVA organization in Datasets and Studies, which explains how data is organised into Projects, Studies and Cohorts. For details on the query parameters, please refer to the Swagger documentation linked above.

Clients

Examples

Getting information about genomic variants

Getting variant data from a given study:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/analysis/variant/query?studies={project}:{study}

An extensive list of filtering parameters allows great flexibility in the queries (see the Swagger documentation linked above). For example, to get TTN variants from the Genome of the Netherlands study, which is framed within the reference_grch37 project, limiting the number of returned results to 3:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/analysis/variant/query?gene=TTN&studies=hgvauser@reference_grch37:GONL&limit=3
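The same query can be assembled from a script instead of being typed by hand. The following is a minimal sketch using only the Python standard library. The helper names variant_query_url and fetch_json are hypothetical (they are not part of any OpenCGA client), and the response body is assumed to be JSON. The parameter order differs from the URL above, which should not matter for a query string.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL copied from the examples on this page.
BASE_URL = "http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1"


def variant_query_url(studies, **filters):
    """Build a variant query URL (hypothetical helper, for illustration only)."""
    params = {**filters, "studies": studies}
    # Keep ':' and '@' unescaped so the result looks like the URLs on this page.
    return f"{BASE_URL}/analysis/variant/query?{urlencode(params, safe=':@')}"


def fetch_json(url, timeout=30):
    """Send a GET request and decode the body, assumed here to be JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# TTN variants from the GONL study, limited to 3 results.
url = variant_query_url("hgvauser@reference_grch37:GONL", gene="TTN", limit=3)
print(url)
# To actually run the query (requires network access):
# data = fetch_json(url)
```

Any other filter listed in the Swagger documentation can be passed as an extra keyword argument in the same way.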

Getting information about projects

Getting all metadata from a particular project:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/projects/{projects}/info

For example, getting all metadata for the reference_grch37 project:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/projects/reference_grch37/info

Getting all metadata from all studies associated to a particular project:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/projects/{projects}/studies

For example, getting all studies and their metadata for the cancer_grch37 project:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/projects/cancer_grch37/studies

Getting information about studies

Getting all available studies and their metadata. Note that the alias field is of special interest here: it contains the study identifier to be used whenever a study must be passed as a parameter:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/studies/search

For example, getting all metadata for all available studies:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/studies/search
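Since the alias field is the value to reuse in later calls, it can be handy to collect it from the search output. The sketch below works on an already decoded JSON document. The layout it expects (a top-level "response" list whose items each carry a "result" list of studies) is an assumption about the response format and should be checked against real output; the payload shown is a made-up, heavily abbreviated stand-in, not actual server output.

```python
def study_aliases(payload):
    """Collect the 'alias' of every study found in a decoded response.

    Assumes (unverified) that studies are nested as
    payload["response"][i]["result"][j].
    """
    aliases = []
    for query_result in payload.get("response", []):
        for study in query_result.get("result", []):
            if "alias" in study:
                aliases.append(study["alias"])
    return aliases


# Made-up, abbreviated payload used only to illustrate the traversal.
example_payload = {
    "response": [
        {"result": [{"alias": "GONL"}, {"alias": "1kG_phase3"}]}
    ]
}
print(study_aliases(example_payload))
```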

Getting summary data from a particular study:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/studies/{project}:{study}/summary

For example, getting summary data for study 1kG_phase3 which is framed within project reference_grch37:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/studies/reference_grch37%3A1kG_phase3/summary
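The example above writes the colon between project and study as %3A, while other examples on this page leave it as a plain ":". %3A is simply the percent-encoded form of ":", so both spellings denote the same path segment. A small sketch of the two forms follows, where study_url is a hypothetical helper introduced for illustration:

```python
from urllib.parse import quote, unquote

# Base URL copied from the examples on this page.
BASE_URL = "http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1"


def study_url(project, study, resource, encode_colon=False):
    """Build a /studies/{project}:{study}/{resource} URL (hypothetical helper)."""
    study_id = f"{project}:{study}"
    if encode_colon:
        # safe="" makes quote() escape ':' as %3A.
        study_id = quote(study_id, safe="")
    return f"{BASE_URL}/studies/{study_id}/{resource}"


plain = study_url("reference_grch37", "1kG_phase3", "summary")
encoded = study_url("reference_grch37", "1kG_phase3", "summary", encode_colon=True)
print(plain)
print(encoded)
# Decoding the escaped form gives back the plain one.
print(unquote(encoded) == plain)
```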

Getting all available metadata for a particular study:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/studies/{project}:{study}/info

For example, getting all metadata for study GONL, which is framed within the project reference_grch37:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/studies/reference_grch37:GONL/info

Getting all samples metadata for a given study:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/studies/{project}:{study}/samples

For example, getting all samples metadata for study 1kG_phase3, which is framed within project reference_grch37. Please note that not all studies contain sample data: some, such as GONL and ExAC, only provide variant lists and aggregated frequencies, i.e. no sample genotypes.

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/studies/reference_grch37:1kG_phase3/samples

Getting information about samples

Get all metadata for a particular sample:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/samples/{sample}/info?study={project}:{study}

For example, get all metadata for sample HG00096 of the 1kG_phase3 study which is framed within the reference_grch37 project:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/samples/HG00096/info?study=reference_grch37:1kG_phase3
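Here the study is no longer part of the path but is passed as the study query parameter. The cohort URLs on this page follow the same shape. A minimal sketch of a builder for this pattern is given below; scoped_url and its argument names are hypothetical and only reproduce the URL layout shown in the examples.

```python
from urllib.parse import quote, urlencode

# Base URL copied from the examples on this page.
BASE_URL = "http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1"


def scoped_url(resource, identifier, action, project, study):
    """Build {resource}/{identifier}/{action}?study={project}:{study}.

    Hypothetical helper: it only mirrors the URL layout used on this page.
    """
    # Escape the identifier so it cannot break the path.
    path = f"{resource}/{quote(identifier, safe='')}/{action}"
    # Keep ':' readable in the study parameter, as in the examples.
    query = urlencode({"study": f"{project}:{study}"}, safe=":")
    return f"{BASE_URL}/{path}?{query}"


sample_url = scoped_url("samples", "HG00096", "info", "reference_grch37", "1kG_phase3")
print(sample_url)
```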

Getting information about cohorts

Getting all samples metadata in a given cohort:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/cohorts/{cohort}/samples?study={project}:{study}

For example, get all samples metadata for cohort GBR from study 1kG_phase3 which is framed within project reference_grch37:

http://bioinfodev.hpc.cam.ac.uk/hgva-1.0/webservices/rest/v1/cohorts/GBR/samples?study=reference_grch37:1kG_phase3

Likewise, a number of client libraries are provided which make intensive use of the OpenCGA RESTful web services (see Using OpenCGA). They provide fast programmatic access for genome-scale data analysis, thereby discouraging massive downloads of data to local computers. Currently supported languages include Python, Java and JavaScript. A similar design has been used in all of them in order to facilitate their use, external contributions and maintenance. Each of them provides an exhaustive API for accessing the whole set of OpenCGA web services. Please refer to the corresponding tutorials for details on how to download, install and configure the libraries, as well as practical examples of the methods which are of particular interest for HGVA users.

Java

The Java client library is distributed together with the rest of the OpenCGA code:

https://github.com/opencb/opencga/tree/develop/opencga-client

It offers a Java API to all the functionality provided by the OpenCGA RESTful web services (see Using OpenCGA). Please refer to Using the Java REST client for further details on how to get the code and how to configure, build and use the library. Only those methods which are of most interest to HGVA users are described in that tutorial.

Python (pyCGA)

pyCGA is the Python client library for the OpenCGA RESTful web services (see Using OpenCGA). All the web services are accessible through this client, which offers a quick way to query OpenCGA projects programmatically from custom scripts. The Python client library is distributed with the rest of the OpenCGA code:

https://github.com/opencb/opencga/tree/develop/opencga-client/src/main/python

GitHub provides a public issue tracker which enables users to provide comments and contributions.

Please refer to the tutorial Using the Python REST client for detailed instructions on installing and configuring it, as well as a list of the methods which are of most interest for HGVA users and practical examples of how to use them.

JavaScript client

OpencgaClient is the JavaScript client library for the OpenCGA RESTful web services (see Using OpenCGA). All the web services are accessible through this client, which offers a quick way to query OpenCGA projects from a web interface. It is available at OpenCB JSorolla.



