Info

OpenCGA v1.3.0 will allow users to define internal releases and versions of Catalog data.

Overview

OpenCGA v1.2.0 already supported releases. A release field is included in most of the data models (#616) to indicate the release in which the data was first created and registered in the database. Version 1.3.0 adds a new feature that allows creating different versions of the data (#684). This feature was added to the Sample, Individual and Family data models.

The release field, in conjunction with the versioning of data, allows users to run queries such as:

  • Fetch data inserted in a specific release or within a range of releases.
  • Fetch all the versions (whole history) of an entry*.
  • Fetch a specific version of an entry*.
  • Look for historic data from older releases*.

* The only supported entries at the moment are Sample, Individual and Family.

Many research institutions need to create deliverables from time to time that contain everything done up to a given point. Since version 1.2.0, a release field is present in most of the data models (#616). All the data (samples, files, individuals, studies...) make sense within a Project context. Projects do not have a release field; instead, they have a new release counter field showing the current release of the data being ingested.


Releases

Project

Every new project is created with the release counter field set to 1. Only the owner of the project can increase the release counter, using a new RESTful webservice (/{version}/projects/{project}/increlease) added for this purpose.

Other entries

Studies, samples, files, individuals... are all assigned a release number matching the current value of the release counter of the project that contains them. This release number can never be modified, as it simply reflects the release in which the data was added.
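
The rules above can be sketched with a small in-memory model. This is only an illustration of the described behaviour, not OpenCGA code: the Project and Entry classes and their method names are hypothetical and do not correspond to the Catalog data models or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a study, sample, file, individual..."""
    name: str
    release: int  # fixed at creation time; frozen=True blocks later changes


@dataclass
class Project:
    """Hypothetical stand-in for a project with a release counter."""
    owner: str
    current_release: int = 1  # every new project starts at release 1
    entries: list = field(default_factory=list)

    def increment_release(self, user: str) -> int:
        # Only the project owner may move the project to the next release.
        if user != self.owner:
            raise PermissionError("only the project owner can increase the release counter")
        self.current_release += 1
        return self.current_release

    def add_entry(self, name: str) -> Entry:
        # New entries are stamped with the project's current release counter.
        entry = Entry(name=name, release=self.current_release)
        self.entries.append(entry)
        return entry


project = Project(owner="alice")
first = project.add_entry("sample_1")
project.increment_release("alice")
second = project.add_entry("sample_2")
print(first.release, second.release)  # 1 2
```

Entries created before the counter is increased keep their original release number, which is what makes it possible to tell later which data belongs to which release.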

Querying data by release

Now that a release tag is present in every entry, it is easy to query data from a specific release or from a range of releases. All the /xxx/search RESTful webservices were updated to include a new release query parameter. Some example queries can be found below:

  • Query for data created in release 2: release=2
  • Query for data created before release 4: release<4
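
To make the meaning of these two filters concrete, the sketch below applies them to a list of in-memory records. It illustrates the filter semantics only and does not call the OpenCGA webservices; the function names and the record layout are assumptions made for the example, and only the two forms shown above (release=N and release<N) are handled.

```python
import operator
import re

# Only the two comparison forms shown above are handled in this sketch.
_OPERATORS = {"=": operator.eq, "<": operator.lt}


def parse_release_filter(expression):
    """Turn 'release=2' or 'release<4' into a predicate over release numbers."""
    match = re.fullmatch(r"release\s*(=|<)\s*(\d+)", expression.strip())
    if match is None:
        raise ValueError(f"unsupported release filter: {expression!r}")
    compare = _OPERATORS[match.group(1)]
    threshold = int(match.group(2))
    return lambda release: compare(release, threshold)


def filter_by_release(entries, expression):
    """Keep the entries whose 'release' value satisfies the filter."""
    keep = parse_release_filter(expression)
    return [entry for entry in entries if keep(entry["release"])]


samples = [
    {"id": "s1", "release": 1},
    {"id": "s2", "release": 2},
    {"id": "s3", "release": 4},
]
print([s["id"] for s in filter_by_release(samples, "release=2")])  # ['s2']
print([s["id"] for s in filter_by_release(samples, "release<4")])  # ['s1', 's2']
```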

Export data to a different database

Not yet implemented

Available for next 1.3.0 release

The release concept is naturally associated with the concept of deliverables. OpenCGA can be used by a group of users who ingest new data and update it over time to meet a deadline. When the deadline arrives, the owner of the project needs to increase the release counter so that newly ingested data is associated with a new release number for the next deadline.

Once a release is finished, its data can be made available to other users who only need read-only access to it as is. One of the things OpenCGA will offer in the next 1.3.0 release is the option to export old releases of data to a read-only database, so other researchers can access that data without interfering with work that might still be in progress on the next release in the source database.

Export

The export option will export complete projects up to a specified release number. This means that if the release counter of the project is 4 and the user wants to export up to release 3, the project itself and all the studies, samples, files... created during releases 1, 2 and 3 will be exported.

Exporting a project never exports the permissions or groups associated with its studies. This information will be lost in the exported file(s). Only the data itself and the cross-references are exported.
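
As a rough illustration of the selection rule described in this section, the sketch below picks the entries to export and drops access-control information. Since the export feature is not yet implemented, everything here is hypothetical: the function name, the record layout and the "permissions" and "groups" keys are assumptions made for the example, not the real export format.

```python
# Fields that, per the description above, are never written to the export.
EXCLUDED_FIELDS = {"permissions", "groups"}


def select_for_export(entries, current_release, up_to_release):
    """Return copies of the entries created in releases 1..up_to_release."""
    if not 1 <= up_to_release <= current_release:
        raise ValueError("up_to_release must be between 1 and the current release")
    exported = []
    for entry in entries:
        if entry["release"] <= up_to_release:
            # Copy the entry without its access-control information.
            exported.append({k: v for k, v in entry.items() if k not in EXCLUDED_FIELDS})
    return exported


entries = [
    {"id": "study_1", "release": 1, "groups": ["@members"], "permissions": {"bob": ["VIEW"]}},
    {"id": "sample_1", "release": 2},
    {"id": "sample_2", "release": 3},
    {"id": "sample_3", "release": 4},
]
exported = select_for_export(entries, current_release=4, up_to_release=3)
print([e["id"] for e in exported])  # ['study_1', 'sample_1', 'sample_2']
```

The entry created in release 4 stays behind in the source database, and the exported copy of the study no longer carries its groups or permissions.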

Import

Importing data from another OpenCGA installation is much trickier than exporting it. For this reason, some restrictions need to be satisfied to guarantee that everything works properly.

  1. The very first time something is imported into another database, that database should NOT contain any project, study... It may, however, already contain users.
  2. Imports can be incremental 

To finish




Allowed operations over source database: any

Allowed operations over target database: Login, create groups, create users, assign permissions!


Versioning



Export and import

