There are three main ways to access OpenCGA data:

RESTful Web Services and Clients

Command Line Interface (CLI)

Web data mining and visualisation

RESTful Web Services

OpenCGA implements a comprehensive REST web service API consisting of more than 200 web services for querying and operating on data in OpenCGA. You can get more information on the RESTful Web Services page.

We have implemented three different ways to query and operate OpenCGA through the REST web services API:

REST Client Libs: four client libraries have been implemented to ease the use of the REST web services, which allows bioinformaticians to integrate OpenCGA into any pipeline. The four libraries, Java, Python (available at PyPI), R and JavaScript, are equally functional and fully maintained.
Command Line: users and administrators can use the opencga.sh command line to query and operate OpenCGA.
IVA Web Application: an interactive web application called IVA has been developed to query and visualise OpenCGA data.

OpenCGA Demo

We have deployed a public demo installation to facilitate testing and development for all bioinformaticians and developers.

Data

This demo has three users. We have loaded and indexed five datasets organised in three projects and five studies. These cover the most typical data use cases today, such as multi-sample VCF, family exomes and genomes, and cancer somatic data. All documentation examples and tutorials use this demo installation.

Connecting to demo installation

The OpenCGA demo REST URL is http://bioinfo.hpc.cam.ac.uk/opencga-prod/. You can check the REST API and its documentation at http://bioinfo.hpc.cam.ac.uk/opencga-prod/webservices/.

We have created a read-only user called demouser with password demouser. As in most OpenCGA installations, where normal users are not the owners of the data, demouser has been given VIEW access to all data owned by the demo user. This is a very common configuration in OpenCGA, in which the owner of the data grants access to other users. In this demo installation the owner of the data is the user demo, while demouser is the public user created to query data.
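As a rough illustration of connecting with these credentials, the sketch below builds a login request against the demo host using only the Python standard library. The host, user and password come from the text above. The REST version segment and the login path (`/webservices/rest/v2/users/login`) and the JSON body layout are assumptions, since this page does not state them; check the web services page for the paths the installation actually exposes. The request is built but not sent.

```python
import json
import urllib.request

# Base URL taken from the text above.
HOST = "http://bioinfo.hpc.cam.ac.uk/opencga-prod"
# Assumption: the REST version segment and login path are not given on this page.
LOGIN_PATH = "/webservices/rest/v2/users/login"


def build_login_request(user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the assumed login endpoint."""
    # Assumption: the endpoint accepts a JSON body with "user" and "password" fields.
    body = json.dumps({"user": user, "password": password}).encode("utf-8")
    return urllib.request.Request(
        HOST + LOGIN_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_login_request("demouser", "demouser")
print(request.get_method(), request.full_url)

# To actually send it (requires network access to the demo server):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and body before sending anything to the server.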

Genomic Data

In this demo we have indexed five genomic datasets, organised in three projects and five studies. These represent different assemblies and data types, such as multi-sample VCF, aggregated VCF, and family genomes or exomes. You can find some useful information in this table:

Project ID and Name	Study ID - Name	VCF File Type	Samples	Variants
population - Population Studies GRCh38	1000g - 1000 Genomes phase 3	WGS Multi sample	2,504	82,587,763
population - Population Studies GRCh38	uk10k - UK10K (UK10K Project)	WGS Aggregated	10,000	46,624,127
family - Family Studies GRCh37	corpasome - Corpas Family	WES Family Multi sample	4	300,711
family - Family Studies GRCh37	platinum - Illumina Platinum	WGS Family Multi sample	17	12,263,246
cancer - Cancer Studies GRCh37	rams_cml - RAMS_CML (Chronic Myeloid Leukemia - Russian Academy of Medical Sciences)	Somatic	11	121,384
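To show how the project and study IDs from the table might be used in a query, here is a minimal sketch that builds fully qualified study names and a variant query URL. The project and study IDs, the owner `demo` and the host come from this page. The `owner@project:study` naming format, the query path `/webservices/rest/v2/analysis/variant/query` and the `region` and `limit` parameters are assumptions made for illustration and should be checked against the web services documentation.

```python
from urllib.parse import urlencode

# Host and data owner as described on this page.
HOST = "http://bioinfo.hpc.cam.ac.uk/opencga-prod"
OWNER = "demo"

# Project ID -> study IDs, copied from the table above.
PROJECTS = {
    "population": ["1000g", "uk10k"],
    "family": ["corpasome", "platinum"],
    "cancer": ["rams_cml"],
}

# Assumption: the variant query path is not given on this page.
VARIANT_QUERY_PATH = "/webservices/rest/v2/analysis/variant/query"


def study_fqn(project: str, study: str, owner: str = OWNER) -> str:
    """Return a study name in the assumed owner@project:study format."""
    return f"{owner}@{project}:{study}"


def all_study_fqns() -> list:
    """List every demo study as a fully qualified name, in table order."""
    return [study_fqn(p, s) for p, studies in PROJECTS.items() for s in studies]


def variant_query_url(study: str, **filters) -> str:
    """Build a GET URL for the assumed variant query endpoint."""
    params = {"study": study, **filters}
    return HOST + VARIANT_QUERY_PATH + "?" + urlencode(params)


for fqn in all_study_fqns():
    print(fqn)

# Hypothetical filters: a genomic region and a result limit.
url = variant_query_url(study_fqn("family", "corpasome"), region="1:1-1000000", limit=5)
print(url)
```

A real request would also need the session token obtained at login, which this sketch leaves out.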

Clinical Data

To make this demo more useful, we have loaded or simulated some clinical data. This makes it possible to run OpenCGA analyses such as GWAS or clinical interpretation. You can find the clinical data for each study in the following sections.

1000g

We loaded the 1000 Genomes pedigree file; you can find a copy at http://resources.opencb.org/opencb/opencga/templates/demo/20130606_g1k.ped
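For readers who want to inspect a pedigree file like this one, the sketch below parses a few pedigree rows and lists the founders (individuals with no recorded parents). The rows are hypothetical and follow the conventional six-column PED layout; the assumption that the file is tab-separated, starts with a header row and uses these column names should be checked against the actual file.

```python
import csv
import io

# Hypothetical rows in the conventional six-column PED layout, for illustration only.
# Assumption: tab-separated, with a header row using these column names.
SAMPLE_PED = (
    "Family ID\tIndividual ID\tPaternal ID\tMaternal ID\tGender\tPhenotype\n"
    "FAM1\tCHILD1\tFATHER1\tMOTHER1\t1\t0\n"
    "FAM1\tFATHER1\t0\t0\t1\t0\n"
    "FAM1\tMOTHER1\t0\t0\t2\t0\n"
)


def read_ped(handle):
    """Read tab-separated pedigree rows into a list of dicts keyed by the header."""
    return list(csv.DictReader(handle, delimiter="\t"))


def founders(rows):
    """Return the IDs of individuals whose parents are both recorded as '0'."""
    return [
        row["Individual ID"]
        for row in rows
        if row["Paternal ID"] == "0" and row["Maternal ID"] == "0"
    ]


rows = read_ped(io.StringIO(SAMPLE_PED))
print(len(rows), "individuals; founders:", founders(rows))
```

To run it on a downloaded copy, pass an open file handle to `read_ped` instead of the in-memory sample.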

uk10k

No clinical data is available for this study. This is a WGS aggregated dataset, so it contains no samples or genotypes and, therefore, no Individuals or Samples have been created.

corpasome

We simulated two different disorders and a few phenotypes for the different members of the family. To be documented soon.

platinum

To be documented soon.

rams_cml

To be documented soon.
