

OpenCGA provides a Python client library called PyOpenCGA that can perform any operation through the REST web services API. PyOpenCGA gives programmatic access to all the implemented REST web services, providing an easy, lightweight, fast and intuitive way to access OpenCGA data. The library offers the convenience of an object-oriented scripting language and makes it possible to integrate the results into other Python applications.

Some of the main features include:

  • The full RESTful web service API is implemented; all endpoints are supported, including the new alignment and clinical functionality.
  • Data is returned in a new RestResponse object that contains the metadata and the results, along with some handy methods and iterators.
  • It uses the OpenCGA client-configuration.yml file.
  • Several Jupyter Notebooks are provided.

PyOpenCGA has been implemented and contributed by Pablo Marin and David Gomez, and it is based on the previous pyCGA library implemented by Antonio Rueda and Daniel Perez from Genomics England. The code is open-source and is hosted in the OpenCGA GitHub repository. It can be easily installed from PyPI. For more details on how to use the Python library, see Using the Python client.


The Python client requires at least Python 3.x, although most of the code is fully compatible with Python 2.7. You can install PyOpenCGA either from the PyPI repository or from the source code.


The PyOpenCGA client is published on PyPI. Installing it is as simple as running the following command:

## Latest stable version
pip install pyopencga
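
Once the command finishes, one quick way to confirm that the package is visible to the current interpreter is to ask Python's import machinery for it. The sketch below uses only the standard library. The helper name `is_installed` is hypothetical, and the import name `pyopencga` is taken from the import statements used elsewhere on this page.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # 'pyopencga' is the import name used in the examples on this page
    print("pyopencga installed:", is_installed("pyopencga"))
```

This only checks that the module can be located; it does not import it or verify its version.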

Source Code

The Python client source code can be found in the OpenCGA GitHub repository. To install any stable or development version of PyOpenCGA, first clone the right branch of the OpenCGA repository and then install the library from the client folder.

## Latest stable version
git clone -b master

## Move to the pyOpenCGA client folder
cd opencga/opencga-client/src/main/python/pyOpenCGA/

## Install the library
python install

Library Implementation

Developers only need to create an instance of the ClientConfiguration class and pass it as an argument to the OpenCGAClient. They can optionally pass a valid token to start making calls as an authenticated user. The only role of the OpenCGAClient class is to work as a factory of the actual clients.

## Import OpenCGAClient and ClientConfiguration classes
from pyopencga.opencga_client import OpenCGAClient
from pyopencga.opencga_config import ClientConfiguration

## Create a ClientConfiguration
# This can be done by passing the path to the main client-configuration.yml file
config = ClientConfiguration('/opt/opencga/conf/client-configuration.yml')
# Or by passing a dictionary in the format below, with the OpenCGA host to point to
config = ClientConfiguration({
    "rest": {
        "host": ""
    }
})

## Create an instance of OpenCGAClient passing the configuration
oc = OpenCGAClient(config)

## Authenticate the user. The password is optional; if it is not passed to the login method, the user is prompted for it
oc.login('user')
# or
oc.login('user', 'password')
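
When the configuration is built in code rather than read from the YAML file, it can help to wrap the dictionary format in a small function that rejects an empty host early. The following is a minimal sketch under that assumption: `build_client_config` is a hypothetical helper, not part of PyOpenCGA, and the URL is a placeholder.

```python
def build_client_config(host):
    """Build the dictionary form of the client configuration for a given host.

    Hypothetical helper: it only reproduces the {"rest": {"host": ...}} layout
    and does not contact any server.
    """
    if not isinstance(host, str) or not host.strip():
        raise ValueError("host must be a non-empty string")
    return {"rest": {"host": host.strip()}}


# Placeholder URL used purely for illustration
config_dict = build_client_config("http://localhost:8080/opencga")
print(config_dict)
```

The resulting dictionary could then be passed to ClientConfiguration in place of a file path.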


The OpenCGAClient class works as a factory containing all the different clients needed to call any REST web service.
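
To make the factory idea concrete, here is a toy sketch of a class that builds one client per resource and exposes each as an attribute. It is not the PyOpenCGA implementation: the class names `SketchClientFactory` and `_ResourceClient` and the resource list are assumptions chosen for illustration only.

```python
class _ResourceClient:
    """Stand-in for a per-resource client (users, projects, ...)."""

    def __init__(self, resource, config, token=None):
        self.resource = resource
        self.config = config
        self.token = token


class SketchClientFactory:
    """Toy factory: builds one client per resource and exposes it as an attribute."""

    # Illustrative subset of resources, not the library's real list
    RESOURCES = ("users", "projects", "studies", "files", "samples")

    def __init__(self, config, token=None):
        self.config = config
        self.token = token
        for name in self.RESOURCES:
            # Every client shares the same configuration and token
            setattr(self, name, _ResourceClient(name, config, token))


factory = SketchClientFactory({"rest": {"host": ""}})
print(factory.users.resource)     # users
print(factory.projects.resource)  # projects
```

Sharing one configuration and token across all resource clients is what lets callers write `oc.users...` or `oc.projects...` after configuring the connection only once.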

As described in RESTful Web Services#RESTResponse, most of the web services return a QueryResponse object containing a list of QueryResults. This structure has been maintained in the Python library: every time a call to any web service is made, the response is automatically encapsulated in a custom RESTResponse class that stores all the returned values and defines a few public methods to help users navigate the data.

The RESTResponse methods developed are:

## Return an iterator to help iterate over all the results.

## Return the total number of matches, taking into account all the QueryResponses.

## Return the total number of results, taking into account all the QueryResponses.
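
The behaviour described by these three comments can be sketched with a small stand-in class. Everything here is an assumption made for illustration: the class name `SketchResponse`, the method names, and the JSON field names (`response`, `result`, `numMatches`, `numResults`) are hypothetical and may not match the real RESTResponse class or the server payload.

```python
class SketchResponse:
    """Toy wrapper around a decoded JSON response (assumed layout, see above)."""

    def __init__(self, payload):
        # Each entry plays the role of one QueryResult
        self.query_results = payload.get("response", [])

    def iter_results(self):
        """Yield every individual result from every QueryResult, in order."""
        for query_result in self.query_results:
            for item in query_result.get("result", []):
                yield item

    def total_matches(self):
        """Sum the number of matches reported by each QueryResult."""
        return sum(qr.get("numMatches", 0) for qr in self.query_results)

    def total_results(self):
        """Sum the number of results actually returned in each QueryResult."""
        return sum(qr.get("numResults", 0) for qr in self.query_results)


# Made-up payload for demonstration only
payload = {
    "response": [
        {"numMatches": 10, "numResults": 2, "result": [{"id": "s1"}, {"id": "s2"}]},
        {"numMatches": 1, "numResults": 1, "result": [{"id": "s3"}]},
    ]
}
resp = SketchResponse(payload)
print([r["id"] for r in resp.iter_results()])      # ['s1', 's2', 's3']
print(resp.total_matches(), resp.total_results())  # 11 3
```

The distinction between the two counters matters when results are paginated: the number of matches can be larger than the number of results returned in a single call.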


The client implements at least one function for each available resource (user, project, study, etc.). Currently, the following functions are available:

## User
oc.users.login(user, pwd, **options)
oc.users.refresh_token(user, **options)
oc.users.logout()
oc.users.create(data, **options)
oc.users.update(query_id, data, **options)

oc.users.projects(user, **options)
oc.users.update_password(user, pwd, newpwd, **options)
oc.users.configs(user, **options)
oc.users.update_configs(user, data, action, **options)
oc.users.filters(user, **options)
oc.users.update_filters(user, data, action, **options)
oc.users.update_filter(user, filter_name, data, **options)

## Projects
oc.projects.create(data, **options)
oc.projects.update(query_id, data, **options)
oc.projects.studies(project, **options)
oc.projects.aggregation_stats(project, **options)
oc.projects.increment_release(project, **options)

## Studies
oc.studies.create(data, **options)
oc.studies.update(query_id, data, **options)

oc.studies.acl(query_id, **options)
oc.studies.update_acl(memberId, data, **options)

oc.studies.groups(study, **options)
oc.studies.scan_files(study, **options)
oc.studies.resync_files(study, **options)
oc.studies.create_groups(study, data, **options)
oc.studies.update_groups(study, data, action, **options)
oc.studies.update_users_from_group(study, group, data, action, **options)
oc.studies.permission_rules(study, entity, **options)
oc.studies.update_permission_rules(study, entity, data, action, **options)
oc.studies.variablesets(study, **options)
oc.studies.update_variablesets(study, data, action, **options)
oc.studies.aggregation_stats(study, **options)
oc.studies.update_variable_from_variableset(study, variable_set, action, **options)

## Individuals
oc.individuals.create(data, **options)
oc.individuals.update(query_id, data, **options)
oc.individuals.update_annotations(query_id, annotationset_id, data, **options)

oc.individuals.acl(query_id, **options)
oc.individuals.update_acl(memberId, data, **options)

## Samples
oc.samples.create(data, **options)
oc.samples.update(query_id, data, **options)
oc.samples.load(file, **options)
oc.samples.update_annotations(query_id, annotationset_id, data, **options)

oc.samples.acl(query_id, **options)
oc.samples.update_acl(memberId, data, **options)

## Files
oc.files.create(data, **options)
oc.files.update(query_id, data, **options)
oc.files.scan_folder(folder, **options)
oc.files.list_folder(folder, **options)
oc.files.content(file, **options)
oc.files.grep(file, **options)
oc.files.refresh(file, **options)
oc.files.tree_folder(folder, **options)
oc.files.upload(data, **options)
oc.files.update_annotations(query_id, annotationset_id, data, **options)

oc.files.acl(query_id, **options)
oc.files.update_acl(memberId, data, **options)

## Jobs

## Families
oc.families.create(data, **options)
oc.families.update(query_id, data, **options)
oc.families.update_annotations(query_id, annotationset_id, data, **options)

oc.families.acl(query_id, **options)
oc.families.update_acl(memberId, data, **options)

## Cohorts
oc.cohorts.create(data, **options)
oc.cohorts.update(query_id, data, **options)
oc.cohorts.samples(cohort, **options)
oc.cohorts.update_annotations(query_id, annotationset_id, data, **options)

oc.cohorts.acl(query_id, **options)
oc.cohorts.update_acl(memberId, data, **options)

## Disease Panels
oc.panels.create(data, **options)
oc.panels.update(query_id, data, **options)

oc.panels.acl(query_id, **options)
oc.panels.update_acl(memberId, data, **options)
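
Every function above accepts a trailing `**options` argument, which is Python's way of collecting arbitrary keyword arguments into a dictionary. The sketch below shows how such keyword arguments could be turned into a URL query string. It illustrates the language convention only and is not how PyOpenCGA builds its requests; the function `build_query_string` and the option names `limit`, `include` and `skip` are hypothetical.

```python
from urllib.parse import urlencode


def build_query_string(**options):
    """Turn keyword options into a URL query string, skipping None values."""
    params = {key: value for key, value in options.items() if value is not None}
    # Sort the keys so the output is deterministic
    return urlencode(sorted(params.items()))


# Hypothetical option names, used only to show how **options is collected
print(build_query_string(limit=5, include="id,name", skip=None))
```

Dropping `None` values lets callers pass optional settings through unchanged without sending empty parameters.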
