OpenCGA is an open-source project that aims to provide a Big Data storage engine and analysis framework for genomic-scale data analysis of hundreds of terabytes. It implements the following components:
Catalog to store metadata
Variant storage engine to provide real-time queries to big data in genomics. This can use MongoDB or HBase together with Solr. A Redis server is also used to cache queries.
A complete RESTful API and gRPC services for variant and alignment (BAM) streaming
Client libraries and a command-line interface to query data
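The query cache mentioned in the component list can be illustrated with a minimal Python sketch. This is not OpenCGA code: the class QueryCache, the function fake_variant_storage and the query parameter names are hypothetical, and a plain dictionary stands in for the Redis server. The sketch only shows the general idea of keying cached results on a canonical form of the query so that repeated queries skip the storage engine.

```python
import hashlib
import json


class QueryCache:
    """Hypothetical query cache; a dict stands in for a Redis server."""

    def __init__(self, backend):
        self._backend = backend  # callable that runs the real query
        self._store = {}         # assumption: in-memory replacement for Redis
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(params):
        # Serialise with sorted keys so equivalent queries share one entry.
        canonical = json.dumps(params, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def query(self, params):
        key = self._key(params)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: run the query on the backend and remember the result.
        self.misses += 1
        result = self._backend(params)
        self._store[key] = result
        return result


def fake_variant_storage(params):
    # Hypothetical backend returning a placeholder record, not real data.
    return [{"region": params["region"], "limit": params["limit"]}]


cache = QueryCache(fake_variant_storage)
first = cache.query({"region": "13:32315000-32400000", "limit": 10})
# Same query with the parameters in a different order: served from the cache.
second = cache.query({"limit": 10, "region": "13:32315000-32400000"})
```

In this sketch the second call never reaches the backend, which is the effect a query cache is meant to have. A real deployment would also need an expiry policy so that cached results do not outlive changes to the underlying data.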
HGVA uses a small cluster for its OpenCGA installation. This consists of three servers for MongoDB and a single Solr server. HAProxy is used to balance queries across two Tomcat servers.
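As a rough illustration of what balancing queries across two servers means, the Python sketch below alternates requests between two backends in round-robin order. The balancing policy HGVA actually configures in HAProxy is not stated here, so round-robin is an assumption, and the class RoundRobinBalancer and the server names tomcat-1 and tomcat-2 are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer that hands out backends in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one backend server is required")
        self._servers = cycle(servers)

    def next_server(self):
        # Each call returns the next backend in the rotation.
        return next(self._servers)


# Assumed names for the two Tomcat servers.
balancer = RoundRobinBalancer(["tomcat-1", "tomcat-2"])
assigned = [balancer.next_server() for _ in range(4)]
```

Four consecutive requests are split evenly between the two backends. A production load balancer additionally performs health checks and removes failed servers from the rotation, which this sketch omits.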
Client User Interfaces
IVA (Interactive Variant Analysis) is a highly customisable web application for interactive variant analysis. It consists of several tools, of which HGVA activates two: Variant Browser and Facets. You can execute complex queries in IVA using any variant annotation, including full-text search of disease descriptions.
With Facets you can perform different aggregations of data: