Overview

Understanding the URL

The general format of the REST API web services is:

http://HOST_URL/webservices/rest/{apiVersion}/{entity}/{id(s)}/{endpoint}?{options}

where HOST_URL is the host name of the server followed by the name of the Java WAR file deployed on the web server (e.g. Tomcat), for example bioinfodev.hpc.cam.ac.uk/opencga

The names inside curly braces { } are the web service parameters, and they are treated as variables. For example, consider the following URL:

http://bioinfodev.hpc.cam.ac.uk/opencga/webservices/rest/v1/users/opencga/login?password=opencga

As explained later in this documentation, this RESTful web service logs in the user opencga. Its components are:

apiVersion (v1) : indicates the OpenCGA API version to retrieve information from; data models and the API may change between versions.
entity (users) : specifies the type of resource identified by the id field. This can be any of the resources listed below.
id (opencga) : the ID of the resource we want to query.
endpoint (login) : the operation to run on the resource, or the related data to retrieve from it. For example, to query all files of a specific study (e.g. 1000genomes), use the studies resource and the files endpoint.
options (password) : variables in key-value form, passed as query parameters.
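
The way these components combine can be sketched in a few lines of Python. This is only an illustration: build_url is a hypothetical helper, not part of any OpenCGA client library, and it assumes the plain http scheme used in the example above.

```python
from urllib.parse import quote, urlencode


def build_url(host_url, api_version, entity, ids, endpoint, options=None):
    """Assemble a REST URL from the components described above (hypothetical helper)."""
    # Accept a single ID or a list of IDs; a list is joined with commas.
    if isinstance(ids, str):
        ids = [ids]
    id_part = ",".join(quote(i, safe="") for i in ids)
    path = "/".join([
        host_url.rstrip("/"),
        "webservices", "rest",
        api_version, entity, id_part, endpoint,
    ])
    # Options become key=value query parameters.
    query = urlencode(options or {})
    return f"http://{path}?{query}" if query else f"http://{path}"


url = build_url(
    "bioinfodev.hpc.cam.ac.uk/opencga", "v1", "users", "opencga",
    "login", {"password": "opencga"},
)
print(url)
```

With the values from the login example, the helper reproduces the URL shown above.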

URL parameters

apiVersion

API versions are numbered v1, v2, etc. At the moment we are working towards the second stable API version, which will be v2.

entity

Several metadata entities are implemented, such as users, samples and individuals; see below for more information.

IDs

This is the identifier being queried, and its type matches the entity in the path; it is the unique identifier of the resource we are interacting with. The plural means that a comma-separated list of IDs can be passed in a single REST call, which performs better than making multiple calls. OpenCGA returns the results in the same order as the corresponding IDs. A Boolean variable, silent, indicates whether the user wants to receive partial results (true), or expects either the complete set of results or nothing if any failure occurs (resource doesn't exist, permission denied, etc.). As a trade-off between performance and ease of use, a maximum of 100 IDs is allowed in one web service call.
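
A client with more than 100 IDs has to split them across several calls. A minimal sketch of that batching follows; batch_id_paths and the sample IDs are hypothetical, and only the limit of 100 and the comma separator come from the text above. Each yielded string would go in the {id(s)} segment of the URL.

```python
MAX_IDS_PER_CALL = 100  # limit stated in the text above


def batch_id_paths(ids, batch_size=MAX_IDS_PER_CALL):
    """Yield comma-separated ID strings, each holding at most batch_size IDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(ids), batch_size):
        yield ",".join(ids[start:start + batch_size])


sample_ids = [f"sample{i}" for i in range(250)]  # hypothetical IDs
batches = list(batch_id_paths(sample_ids))
print(len(batches))  # 250 IDs are split into batches of 100, 100 and 50
```

Because OpenCGA preserves the order of results, concatenating the results of the batches in order should line up with the original ID list.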

options

These query parameters can modify the behaviour of the query (exclude, include, limit, skip and count) or add filters to specific endpoints. The following image shows some typical options for a web service.

[Image: typical options for a web service]
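
As an illustration of how limit, skip and include might be combined to page through a large result set, here is a short sketch. The page_options helper and the field names id and name are hypothetical, and passing include as a comma-separated list is an assumption; check the Swagger documentation for the exact format each endpoint accepts.

```python
from urllib.parse import urlencode


def page_options(total, page_size, include=None):
    """Yield one options dict per page, using limit and skip."""
    for skip in range(0, total, page_size):
        options = {"limit": page_size, "skip": skip}
        if include:
            # Assumption: field names are passed as a comma-separated list.
            options["include"] = ",".join(include)
        yield options


# 56 matches numTotalResults in the response example of this page.
pages = [
    urlencode(options, safe=",")
    for options in page_options(total=56, page_size=25, include=["id", "name"])
]
for query_string in pages:
    print(query_string)
```

Each printed string is the {options} part of one request, so 56 records with a page size of 25 take three calls.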

REST Response

Most web services return the results encapsulated in a single QueryResponse object (view data model), consisting of some metadata and a list of QueryResult objects (view data model), called response, which contains the data and metadata requested. The reason for this two-level response is that some REST web services accept multiple IDs as an input parameter, which significantly improves performance by reducing the number of calls. For instance, calling the /info method with three sample IDs returns a QueryResponse object with three QueryResults. Each QueryResult can in turn contain multiple results, for instance when getting all samples from an individual or when fetching all variants from a gene.

In practice, however, most web services return a QueryResponse with a single QueryResult containing one or more results. In general, the response object looks like:

{
  "apiVersion": "v1",
  "time": 19,
  "warning": "",
  "error": "",
  "queryOptions": {
    "metadata": true,
    "skipCount": false,
    "limit": 10
  },
  "response": [
    {
      "id": "search",
      "dbTime": 18,
      "numResults": 10,
      "numTotalResults": 56,
      "warningMsg": "",
      "errorMsg": "",
      "resultType": "",
      "result": [
        {
          // result 1
        },
        {
          // result 2
        },
        // ...
        {
          // result 10
        }
      ]
    }
  ]
}

where:

Line 1: single QueryResponse object
Lines 2 and 3: show the API version (apiVersion) and the total duration in ms (time)
Lines 4 and 5: show warning and error messages; for instance, when there are network issues you could get "Catalog database not accessible"
Line 6: summary of all the option parameters provided (queryOptions)
Line 11: list of QueryResults, called response. In this example, as in most calls, there is only one QueryResult.
Line 14: database duration in ms (dbTime)
Lines 15 and 16: number of elements returned in the result list (numResults) and total number of records found in the database for the given query (numTotalResults).
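
The two-level structure can be unpacked on the client side as in the following sketch. The extract_results helper and the contents of the result entries are hypothetical; the field names are the ones shown in the response above.

```python
import json

# A trimmed response with the same fields as the example above.
raw = """
{
  "apiVersion": "v1",
  "time": 19,
  "warning": "",
  "error": "",
  "queryOptions": {"limit": 2},
  "response": [
    {
      "id": "search",
      "dbTime": 18,
      "numResults": 2,
      "numTotalResults": 56,
      "warningMsg": "",
      "errorMsg": "",
      "resultType": "",
      "result": [{"id": "s1"}, {"id": "s2"}]
    }
  ]
}
"""


def extract_results(query_response):
    """Flatten the result list of every QueryResult; raise if an error is reported."""
    if query_response.get("error"):
        raise RuntimeError(query_response["error"])
    results = []
    for query_result in query_response.get("response", []):
        if query_result.get("errorMsg"):
            raise RuntimeError(query_result["errorMsg"])
        results.extend(query_result.get("result", []))
    return results


response = json.loads(raw)
results = extract_results(response)
print(len(results), "of", response["response"][0]["numTotalResults"])
```

Comparing numResults with numTotalResults tells the caller whether more records remain to be fetched with skip and limit.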

Entities and Endpoints

The REST API is organised in two main groups of web services, Catalog and Analysis, depending on whether you want to work with metadata and permissions or run an analysis. A description of each group follows.

Catalog Web Services

Contains all the endpoints for managing and querying metadata and permissions.

Entity	Path	Description	Main Endpoints
Users	/users	Different methods to work with users	info, create, login, ...
Projects	/projects	projects are defined for each user and contain studies	info, create, studies, ...
Studies	/studies	studies are the main component of Catalog; they can be shared and contain files, samples and jobs	info, create, files, samples, jobs, variants, alignments, groups, ...
Files	/files	files are added to the study and can be indexed to be queried	info, create, index, share, ...
Jobs	/jobs	jobs are tool executions that can be queued	info, create, ...
Families	/families	families are connected collections of individuals based on their relationships	info, create, ...
Individuals	/individuals	samples come from the individuals	info, create, ...
Samples	/samples	samples are the individual experiment samples; each typically matches an NGS BAM file or a VCF sample	info, create, annotate, share, ...
Cohorts	/cohorts	these model a group of samples that share some common properties, these are used for data analysis	info, create, stats, samples, ...
Clinical Analysis	/clinical	handles the creation and search of clinical analyses	info, create, ...
Variable Set	/variableset	variables annotate samples with different information useful for data analysis	info, create, ...
Meta	/meta	gives the meta information about OpenCGA installation instance	ping, about, status
GA4GH	/ga4gh	GA4GH standard web services to search genomics data in OpenCGA	variant search, reads search, responses

Analysis Web Services

Endpoints for running alignment, variant and clinical analyses.

Category	Path	Description	Main Endpoints
Alignment Analysis	/analysis/alignment	Operations over Read Alignments to facilitate complete analysis with different tools.	index, query, stats, coverage
Variant Analysis	/analysis/variant	Operations over Genomic Variants to facilitate complete analysis with different tools.	index, stats, query, validate, ibs, facet, samples, metadata
Clinical Analysis	/analysis/clinical	Manage Clinical Analysis metadata (e.g. create a case, set permissions) or run a genome interpretation	execute

Swagger

OpenCGA has been documented using Swagger. Detailed information about resources, endpoints and options is available at:

http://bioinfodev.hpc.cam.ac.uk/opencga/webservices/

Client Libraries

Currently OpenCGA implements the following four client libraries:

Deprecation Policy

As OpenCGA is a live project that is continuously improved with new features, certain APIs are deprecated over time. The deprecation cycle starts with a warning period, which makes users aware that a service is being considered for change and will most likely be replaced, followed by a deprecation message. OpenCGA keeps deprecated services working for two releases (the release in which they are deprecated and the next one). Deprecated services are hidden in the release after deprecation and removed completely in the release after that.

Warning (working) ---> Deprecated (working) ---> Hidden (working) ---> Removed (not working)

All deprecated web services are clearly marked as deprecated, along with a warning message to help the user.

