
Welcome to the HGVA project!

Human Genomic Variation Archive (HGVA) is an open-access genomic variation resource that integrates variants from the main human reference projects. HGVA also enriches these variants with annotation: consequence types, population frequencies, protein effect predictions, variant-associated phenotypes, etc.


HGVA aims to provide a high-performance and scalable resource to store, query and visualise variants from the main open-access human datasets. We have put special emphasis on keeping HGVA responsive even with complex queries, and on making the data available to researchers and bioinformaticians in three different ways: a rich web interface based on OpenCB IVA, client libraries (Python, Java and JavaScript) and a command-line interface. All datasets are normalised and annotated using OpenCB CellBase.

HGVA does not intend to replace projects such as NCBI dbSNP or EMBL-EBI EVA, or to provide full archiving services like theirs. These projects provide excellent submission and accessioning services and play a crucial role in allowing scientists to submit variation data for many different species. Instead, HGVA focuses on human data, and only the most relevant datasets are selected and indexed. HGVA provides several high-performance user interfaces that allow researchers to query and visualise human datasets or to use the data in genomic pipelines.

We would like to thank the authors of the different projects, such as the 1000 Genomes Project or ExAC (see below for a complete list), for making all these invaluable data open and freely accessible to the biomedical community. We hope HGVA will help to make these data even more accessible to that community.

HGVA was born in 2016 in response to the need to have the most relevant human datasets centralised, normalised and annotated for different analysis pipelines. It is currently developed and maintained by researchers at the University of Cambridge and Genomics England and it is freely available at

Main Features

  • The most important high-quality variant studies normalised and integrated in one single server database
  • High-performance complex queries on variants. Faceted search is also implemented
  • Datasets are organised in four main projects: Reference Studies GRCh37, Reference Studies GRCh38, Cancer GRCh37 and Platinum (see below)
  • Rich variant annotation performed using OpenCB CellBase, including HPO terms, consequence types, substitution effect prediction scores, Gene Ontology terms, etc.
  • Population frequencies calculated, including populations and super-populations
  • Data is indexed in the server using OpenCB OpenCGA. This provides a high-performance and scalable variant storage solution for big data analysis and visualisation
  • Rich interactive web-based data mining tool based on OpenCB IVA
  • Clients in Python, Java, JavaScript for fast programmatic access
  • Command-line interface developed

Latest news:

HGVA 1.1.0 released!
  • Improved web interface appearance
  • Web interface includes new beta features:
      • Beta Genome Browser available to visualise variant context
      • Summary (beta) tab to get a quick, visual description of the filtering result
      • Facets to get graphical descriptions of the data
  • Upgraded backend. HGVA backend is now powered by OpenCGA 1.1.1
HGVA paper published!
The HGVA paper has recently been published. Please cite: Lopez, J., Coll, J., Haimel, M., Kandasamy, S., Tarraga, J., Furio-Tari, P., Rendon, A., Dopazo, J. & Medina, I. (2017). HGVA: the Human Genome Variation Archive. Nucleic Acids Research.
gnomAD genomes and gnomAD exomes population frequencies are now provided as part of the advanced annotation tab!

Current Projects and Studies

Reference Studies GRCh37

  • 1000 Genomes Phase 3
  • Exome Aggregation Consortium (ExAC)
  • Exome Sequencing Project (ESP6500)
  • Genome of the Netherlands (GoNL)
  • UK10K project
  • Spanish Medical Genome Project (MGP)

Reference Studies GRCh38

  • 1000 Genomes Phase 3

Cancer GRCh37

  • QIMR Berghofer Melanoma
  • Chronic Myeloid Leukemia - Russian Academy of Medical Sciences


Platinum

  • Illumina Platinum


More than 250M variants reported in total, of which about 120M are unique
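
The gap between reported and unique variants arises because the same variant can appear in several studies. The following is a minimal sketch of that distinction, not HGVA's actual pipeline: it assumes variants are already normalised and can be identified by a (chromosome, position, reference, alternate) key. The study labels, coordinates and the `group_by_variant` helper are hypothetical and used only for illustration.

```python
from collections import defaultdict

# Hypothetical toy records: (study, chromosome, position, reference, alternate).
# Study labels and coordinates are made up for illustration only.
reported = [
    ("study_A", "1", 69511, "A", "G"),
    ("study_B", "1", 69511, "A", "G"),
    ("study_C", "1", 69511, "A", "G"),
    ("study_B", "1", 69552, "G", "C"),
    ("study_D", "2", 41612, "T", "C"),
]


def group_by_variant(records):
    """Map each (chrom, pos, ref, alt) key to the sorted list of studies reporting it."""
    studies = defaultdict(set)
    for study, chrom, pos, ref, alt in records:
        studies[(chrom, pos, ref, alt)].add(study)
    return {key: sorted(names) for key, names in studies.items()}


unique = group_by_variant(reported)
print(len(reported), "reported,", len(unique), "unique")
```

Each entry in `reported` counts once per study, while `unique` keeps one entry per distinct variant together with the studies that contain it.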


Source Code

Web based on IVA project at

Server based on OpenCGA at


HGVA is a collaborative project that aims to integrate as many human reference studies as possible. You can contact us with feature requests. If you want to contribute to the code, you are more than welcome to contribute to IVA and OpenCGA.

