Variables and Annotations

Clinical Data in OpenCGA is managed through what we have called Variable Sets and Annotation Sets.

Variables

A Variable Set is a free-form data model defined by the user. The fields of a Variable Set are explained below:

  • name: Unique String that can be used to identify the defined Variable Set.
  • unique: Boolean indicating whether at most one Annotation Set may annotate the Variable Set for each Annotable* entry. If false, several Annotation Sets annotating the same Variable Set are allowed per Annotable entry.
  • confidential: Boolean indicating whether the Variable Set and the Annotation Sets annotating it are confidential**. If true, only users with the CONFIDENTIAL permission will be able to access them.
  • description: String containing a description of the Variable Set defined.
  • variables: List containing all the different Variables that will form the Variable Set. Explained in detail below.

* Annotable: We consider an entry to be Annotable if the entry can have Annotation Sets. At this stage, only Sample, Individual, Cohort and Family are Annotable.

** Confidential: to be explained in the Sharing and Permissions section.
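
The effect of the unique field can be sketched in a few lines of Python. This is only an illustration of the rule described above, not OpenCGA code: the function name can_add_annotation_set and the dictionary layout of an Annotation Set (a "variableSet" key holding the name of the Variable Set it annotates) are assumptions made for the example.

```python
def can_add_annotation_set(variable_set, existing_annotation_sets):
    """Return True if one more Annotation Set for `variable_set` may be
    attached to an Annotable entry.

    `existing_annotation_sets` is the list of Annotation Sets already attached
    to that entry. Each one is a dict whose "variableSet" key holds the name
    of the Variable Set it annotates (hypothetical layout).
    """
    if not variable_set["unique"]:
        # Non-unique Variable Sets accept any number of Annotation Sets.
        return True
    # Unique Variable Sets accept an Annotation Set only if none exists yet.
    return not any(
        annotation_set["variableSet"] == variable_set["name"]
        for annotation_set in existing_annotation_sets
    )
```

For example, with a Variable Set {"name": "clinical", "unique": True}, the function accepts the first Annotation Set for an Individual and rejects a second one annotating the same Variable Set.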

A Variable Set is no more than a set of Variables. A Variable can be understood as a user-defined field that can be of any type (Boolean, String, Integer, Float, Object, List...). The different fields of a Variable are:

  • name: String containing the unique identifier of the field (Variable) defined by the user.
  • title: Human-readable label for the name. This field is intended to be used in a web application to show the field name in a nicer way.
  • category: Free String that can contain anything useful for the user to group and categorise Variables of a same Variable Set.
  • type: Type of the field (Variable) defined. It can be one of BOOLEAN, CATEGORICAL*, INTEGER, DOUBLE, TEXT, OBJECT.
  • defaultValue: Object containing the default value of the Variable in case the user has not given any value when creating the Annotation Set.
  • required: Boolean indicating whether the field is mandatory to be filled in or not.
  • multivalue: Boolean indicating whether the field being annotated holds a list of values of the given type or only a single value.
  • allowedValues: A list containing all the possible values a field could have.
  • rank: Integer containing the order in which the annotations will be shown (only for web purposes).
  • dependsOn: String containing the name of the Variable that the current Variable depends on. For example, suppose we have defined two Variables in a Variable Set called country and city. We can decide that city may only be given a value once country has been filled in, so city would depend on country.
  • description: String containing a description for the Variable.
  • variableSet: List of nested Variables, used only when the Variable being modelled is of type OBJECT. Each Variable in the list has the same fields described in this list.
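
To make the fields above more concrete, the following Python sketch writes a small Variable Set as a dictionary and checks a set of annotations against it. It uses the field names from the lists above, but everything else is an assumption made for illustration: the dictionary layout, the PYTHON_TYPES mapping, the helper names check_value and validate, and the example Variables are hypothetical and do not reproduce how OpenCGA stores or validates annotations. Keys that are not declared in the Variable Set are simply ignored in this sketch.

```python
# Python types accepted for each Variable type (illustrative mapping).
PYTHON_TYPES = {
    "BOOLEAN": bool, "CATEGORICAL": str, "INTEGER": int,
    "DOUBLE": (int, float), "TEXT": str, "OBJECT": dict,
}

# Hypothetical Variable Set built from the fields described above.
variable_set = {
    "name": "patient_origin",
    "unique": True,
    "confidential": False,
    "description": "Where the individual comes from",
    "variables": [
        {"name": "country", "title": "Country", "type": "TEXT", "required": True},
        {"name": "city", "title": "City", "type": "TEXT", "dependsOn": "country"},
        {"name": "gender", "type": "CATEGORICAL", "defaultValue": "UNKNOWN",
         "allowedValues": ["MALE", "FEMALE", "UNKNOWN"]},
        {"name": "visits", "type": "INTEGER", "multivalue": True},
        {"name": "residence", "type": "OBJECT",
         "variableSet": [{"name": "postcode", "type": "TEXT", "required": True}]},
    ],
}


def check_value(variable, value):
    """Check a single value against the type and allowedValues of a Variable."""
    name, vtype = variable["name"], variable["type"]
    # bool is a subclass of int in Python, so reject it for non-BOOLEAN types.
    if isinstance(value, bool) and vtype != "BOOLEAN":
        raise ValueError(f"{name}: expected {vtype}, got a boolean")
    if not isinstance(value, PYTHON_TYPES[vtype]):
        raise ValueError(f"{name}: expected {vtype}, got {type(value).__name__}")
    allowed = variable.get("allowedValues")
    if allowed and value not in allowed:
        raise ValueError(f"{name}: {value!r} is not one of {allowed}")
    if vtype == "OBJECT":
        # Nested Variables follow the same rules as top-level ones.
        return validate(variable.get("variableSet", []), value)
    return value


def validate(variables, annotations):
    """Return the annotations completed with default values, or raise ValueError."""
    result = {}
    for variable in variables:
        name = variable["name"]
        value = annotations.get(name, variable.get("defaultValue"))
        if value is None:
            if variable.get("required"):
                raise ValueError(f"{name}: a value is required")
            continue
        if variable.get("multivalue"):
            if not isinstance(value, list):
                raise ValueError(f"{name}: expected a list of {variable['type']}")
            result[name] = [check_value(variable, item) for item in value]
        else:
            result[name] = check_value(variable, value)
    # A Variable with dependsOn may only have a value if its parent has one.
    for variable in variables:
        parent = variable.get("dependsOn")
        if parent and variable["name"] in result and parent not in result:
            raise ValueError(f"{variable['name']}: requires a value for {parent}")
    return result
```

Calling validate(variable_set["variables"], {"country": "Spain", "city": "Valencia"}) returns the two given values plus the default for gender, while omitting country raises a ValueError because that Variable is required.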

* Categorical: A Categorical Variable can be understood as an Enum whose possible values are known in advance. Examples of categorical Variables are month, which can only take values from January to December, and gender, which can only take the values MALE, FEMALE or UNKNOWN.
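
The Enum analogy can be shown directly with Python's standard enum module. The sketch below models the gender example and derives a Variable definition from it. Pairing a CATEGORICAL Variable with allowedValues in this way is an assumption made for the example, and the helper categorical_variable is hypothetical.

```python
from enum import Enum


class Gender(Enum):
    """The closed set of values from the gender example above."""
    MALE = "MALE"
    FEMALE = "FEMALE"
    UNKNOWN = "UNKNOWN"


def categorical_variable(name, enum_cls):
    """Build a CATEGORICAL Variable whose allowedValues mirror an Enum (hypothetical helper)."""
    return {
        "name": name,
        "type": "CATEGORICAL",
        "allowedValues": [member.value for member in enum_cls],
    }


gender_variable = categorical_variable("gender", Gender)
```

Looking up a value outside the set, such as Gender("OTHER"), raises a ValueError, which is the behaviour a Categorical Variable is meant to convey.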
