Clinical data is supported in File, Sample, Cohort, Individual and Family through a field called annotationSets. All of these entities support the operations described below, in addition to their own particular features.

In this document, we will be referring to annotationSets and annotations (the field names used in OpenCGA to store any clinical data or any other user-defined free data model).

...

In this case, the operators =, == and != are also supported, though they might give results the user does not expect. For this reason, we have also added the === and !== operators to support any possible query operation. The table below shows the results each operator would return:


Operator | Value looked for | Individuals returned | Explanation
=, ==    | B                | 1, 2                 | Fetch all the individuals containing annotationSet or variableSet B.
===      | B                | 2                    | Fetch all the individuals that contain only annotationSet or variableSet B.
!=       | B                | 1, 3, 4              | Fetch all the individuals that do not contain only annotationSet or variableSet B. Individuals containing B plus any other annotationSet or variableSet will be returned.
!==      | B                | 3, 4                 | Fetch all the individuals that have never been annotated using annotationSet or variableSet B.
  • All the annotationSet webservices have been deprecated.
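
The operator semantics in the table above can be sketched with plain Python sets. This is only an illustration, not OpenCGA code: the contents assigned to each individual are hypothetical values chosen to be consistent with the table (the original example data is not part of this section), and the function names are made up for the example.

```python
# Hypothetical annotationSets (or variableSets) held by each individual.
# These values are assumptions picked only to reproduce the table.
individuals = {
    1: {"A", "B"},
    2: {"B"},
    3: {"C", "D"},
    4: {"A"},
}


def matches(sets, operator, value):
    """Return True if an entry's sets satisfy `<operator><value>`."""
    if operator in ("=", "=="):
        return value in sets        # contains the value
    if operator == "===":
        return sets == {value}      # contains only the value
    if operator == "!=":
        return sets != {value}      # anything except "only the value"
    if operator == "!==":
        return value not in sets    # never annotated with the value
    raise ValueError(f"unsupported operator: {operator}")


def search(operator, value):
    """Return the sorted ids of the individuals matching the filter."""
    return sorted(i for i, sets in individuals.items() if matches(sets, operator, value))


for op in ("=", "===", "!=", "!=="):
    print(op, "B", "->", search(op, "B"))
```

With this data, = and != overlap on individual 1, which holds B together with another set. That overlap is the result the strict operators === and !== are meant to avoid.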

Querying annotation sets

  • Querying by annotation sets is only possible through sample|individual|family|cohort/search. The variableSet and annotationSetName parameters have been deprecated. Instead, all queries should be done through the annotation query param, which accepts a ;-separated string containing any combination of the following:
    • Filtering by an annotation: Assuming a.b is a variable of a nested object in the variableSet "tumor" that we want to query, both "a.b=4" and "tumor:a.b=4" are supported. As long as the variable is only valid in one of the variable sets defined for the study, the variableSet part can be omitted.
    • Filtering by a variableSet:
      • "variableSet=tumor" will return all the objects that have been annotated with that variableSet
      • "variableSet!=tumor" will return all the objects that have not been annotated with that variableSet
    • Filtering by annotation set name:
      • "annotationSet=pepe" will return all the objects that have an AnnotationSet with the name "pepe".
      • "annotationSet!=pepe" will return all the objects that don't have an AnnotationSet with the name "pepe".
  • Projections of annotationSets can be done using the typical include/exclude query params. In this case, we have special words to only include/exclude some concrete things:
    • Projecting annotations: "annotationSets.annotations.a.b" and "annotation.a.b" will project the result of the a.b annotations only
    • Projecting whole AnnotationSets: "annotationSets.name.pepe" or "annotationSet.pepe" will project the result of the whole AnnotationSet with name pepe
    • Projecting AnnotationSets from VariableSets: "annotationSets.variableSet.tumor" or "variableSet.tumor" will project all the existing AnnotationSets annotating the VariableSet tumor.
  • The boolean "flattenAnnotations" will be used to flatten the annotations in one single level (true) or leave it as nested objects (default - false)
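
As a rough sketch of how the query params described above fit together, the snippet below assembles an annotation value and URL-encodes it with the Python standard library. The helper name build_annotation_param is hypothetical, the host and REST prefix are deliberately left out, and the filters are just the examples from the list above.

```python
from urllib.parse import urlencode, parse_qs


def build_annotation_param(filters):
    """Join individual filters into the ';'-separated `annotation` value."""
    return ";".join(filters)


filters = [
    "tumor:a.b=4",          # annotation filter with an explicit variableSet
    "variableSet=tumor",    # objects annotated with the variableSet "tumor"
    "annotationSet!=pepe",  # objects without an AnnotationSet named "pepe"
]
annotation = build_annotation_param(filters)
print(annotation)  # tumor:a.b=4;variableSet=tumor;annotationSet!=pepe

params = {
    "annotation": annotation,
    "include": "annotation.a.b",   # project only the a.b annotations
    "flattenAnnotations": "true",  # flatten nested annotations
}
# urlencode percent-encodes ';', ':' and '=' inside the values, so the
# filter string travels as one single parameter.
query_string = urlencode(params)
print("individual/search?" + query_string)
```

Encoding the value matters because ; and = have their own meaning in a query string; left raw, the filters could be split into separate parameters.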

Creating or modifying annotationSets

  • AnnotationSets can be created when the entry that will contain them is being created or by calling the /entry/{entry}/update webservice
  • AnnotationSets can be updated by calling the /entry/{entry}/update webservice

Deleting annotationSets

  • AnnotationSets can be deleted by using the /entry/{entry}/update webservice using the new deleteAnnotationSet parameter

Deleting annotations

...

Project the annotation fields to return

Annotations, as well as any other field from the data models, can be included or excluded from the final JSON the user will get. However, because annotations contain custom data models that are not completely under OpenCGA's control, a set of reserved prefixes has been defined as explained below:

  • Include/exclude specific annotations: If we need to project some specific annotations only, users will need to add the prefixes "annotationSets.annotations" or "annotation" to the field to be projected. Example: If after running a query we only want to include the full_name and the hpo variables defined in the Individual VariableSet, users will need to write

      include: annotation.full_name,annotation.hpo

            or

      include: annotationSets.annotations.full_name,annotationSets.annotations.hpo
  • Include/exclude specific annotationSets: Let's imagine that we have several annotationSets defined, such as in the examples of Individual1 and Individual3. If we only want to project the annotations of one specific annotationSet, users will need to add the prefixes "annotationSets.id" or "annotationSet" to the id of the annotationSet to be projected. Example: To include only the annotations from the annotationSets B and D, we will need to write:
      include: annotationSets.id.B,annotationSets.id.D

            or

      include: annotationSet.B,annotationSet.D
  • Include/exclude specific variableSets: Let's say that for some entries the user has created several annotationSets using the same variableSet and wants to fetch only those, leaving out any other annotationSets. To do so, users will need to use the prefixes "annotationSets.variableSet" or "variableSet". Example: Let's imagine that we have another Individual that contains 2 annotationSets (a and b) using the template defined in the variableSet X and another annotationSet (c) annotating the variableSet Y. If the user is only interested in getting the annotationSets "a" and "b", we will need to write:
      include: annotationSets.variableSet.X

            or

      include: variableSet.X
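
The reserved prefixes listed above lend themselves to a small helper that builds the include (or exclude) value. The following is a minimal sketch under that assumption; the function name projection is hypothetical and not part of any OpenCGA client.

```python
# Reserved prefixes named in the list above (long and short forms).
RESERVED_PREFIXES = {
    "annotationSets.annotations", "annotation",
    "annotationSets.id", "annotationSet",
    "annotationSets.variableSet", "variableSet",
}


def projection(prefix, *names):
    """Build a comma-separated include/exclude value for one prefix."""
    if prefix not in RESERVED_PREFIXES:
        raise ValueError(f"unknown reserved prefix: {prefix}")
    return ",".join(f"{prefix}.{name}" for name in names)


print(projection("annotation", "full_name", "hpo"))
print(projection("annotationSet", "B", "D"))
print(projection("variableSet", "X"))
```

Each call covers a single prefix. Values for different prefixes can be concatenated with a comma in the same way if several kinds of projection are needed in one request.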

Additionally, the different /info and /search web services have a new query parameter called flattenAnnotations. That field is a simple boolean to indicate whether the annotations should be returned flattened or not. Let's imagine we have the following annotationSet:

{
  "id": "annotation_set_id",
  "variableSetId": "individual_private_details",
  "annotations": {
    "full_name": "John Smith",
    "age": 60,
    "gender": "MALE",
    "address": {
      "city": "United States",
      "zip": "99501"
    }
  }
}

The same result with flattenAnnotations set to true would be:

{
  "id": "annotation_set_id",
  "variableSetId": "individual_private_details",
  "annotations": {
    "full_name": "John Smith",
    "age": 60,
    "gender": "MALE",
    "address.city": "United States",
    "address.zip": "99501"
  }
}
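
The effect of flattenAnnotations can be reproduced on the client side with a short recursive function. This is only a sketch of the transformation shown above, not the way OpenCGA implements it; the function name flatten is an assumption made for the example.

```python
def flatten(annotations, parent=""):
    """Collapse nested objects into a single level using dotted keys."""
    flat = {}
    for key, value in annotations.items():
        path = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the dotted path along.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


nested = {
    "full_name": "John Smith",
    "age": 60,
    "gender": "MALE",
    "address": {
        "city": "United States",
        "zip": "99501",
    },
}

print(flatten(nested))
```

Applied to the annotations object of the first example, this yields the same keys as the flattened version, including "address.city" and "address.zip".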



GroupBy

  • To group by Population, we can put something like the following in the 'fields' field: annotation:29:pedigreeAnnotation:Population

...