The Past, Present and Future of Genome Sequencing

The last two decades have seen a revolution in genome sequencing, with dramatic increases in speed and efficiency coupled with massive reductions in cost. There is so much going on in this area that it can be difficult to keep up, so here is what you should know.

Gene sequencing has proved its usefulness as a diagnostic and prognostic tool. Its use in the identification of BRCA1 mutations is already a gold standard in cancer research. Thanks to personalized medicine trends and collaborations between the industry and regulatory authorities, whole genome sequencing (WGS) is turning into a common practice faster than one could have originally expected. Indeed, it is now possible to get your genome sequenced through the post by firms such as Dante Labs and 24 Genetics in Europe and Veritas Genetics and Sure Genomics in the US.

Pricing by popular demand

Back in 2003, the International Human Genome Sequencing Consortium kicked off the genome analysis race by sequencing a complete human genome. It took more than 13 years and cost approximately €2.3B. By 2008, costs had already dropped to less than €1M per genome, and last year it was possible to get your genome sequenced for around €900 in a few days.

Genome sequencing - Cost per genome sequenced

The next generation sequencing (NGS) market, including but not limited to WGS, was valued at €4.6Bn in 2015 and is expected to reach €19Bn by 2020.

Illumina is one of the biggest players in the sequencing industry and at the start of 2017 announced a new NovaSeq range of sequencers that “one day will enable the $100 genome” according to Francis deSouza, President and CEO of the market-leading company. While the new range has proved extremely popular with customers, exact predictions on when the $100 genome will happen are less clear.

Illumina has a strong competitor in up-and-coming UK biotech Oxford Nanopore, one of the few biotech unicorns in Europe. Convenience and cost-effectiveness are the company’s buzzwords and it is well known for its handheld MinION sequencer. Earlier this year, the company raised an additional €113M to add to its earlier fundraising haul of €578M. Although the company was initially criticised for the accuracy of its products, its R&D team is continuing to improve quality, accuracy, and cost, making it a serious contender. It is also developing an even smaller sequencer, called the SmidgION, that could be attached to a phone.

While a MinION retails for approximately €865, other sequencers are still quite expensive. For example, Illumina’s more popular sequencers range from €800,000 to almost €1M.

Drowning in data

New partnerships, like the one between 23andMe and Roche’s Genentech in 2015 to fight Parkinson’s, or the more recent €260M deal between GSK and 23andMe, are trying to capitalize on the wealth of data. (Fun fact: Google invested in 23andMe and its co-founder married 23andMe’s CEO.)

However, sequencing genomes to generate data is only part of the job. Quality checking, preprocessing of sequenced reads and mapping to a reference genome still require powerful computing facilities, efficient algorithms and, of course, experienced staff. It is a time-consuming process.

“Everybody talks about the $1,000 genome, but they don’t talk about the $2,000 mapping problem behind the $1,000 genome,” says Peter Tonellato, Professor of Biostatistics at the University of Wisconsin.

Moreover, WGS generates huge amounts of data, which poses a challenge for data storage.

The Broad Institute in Cambridge, Massachusetts, previously stated that during one month it decoded the equivalent of one human genome every half an hour, equivalent to 200TB of raw data. Even if that quantity is smaller than what is handled daily by internet companies, it exceeds anything biologists and hospitals have ever dealt with.
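As a rough check on those figures, the short sketch below works out how many genomes "one every half an hour" adds up to over a month and how much raw data that implies per genome. The 30-day month and decimal terabytes (1TB = 1,000GB) are assumptions for illustration, not figures from the Broad Institute.

```python
# Back-of-the-envelope check of the throughput quoted above.
# Assumptions (not from the source): a 30-day month, 1 TB = 1,000 GB.
GENOMES_PER_HOUR = 2      # one genome every half an hour
DAYS_IN_MONTH = 30        # assumed
RAW_DATA_TB = 200         # raw data for the month, as quoted


def genomes_per_month(per_hour=GENOMES_PER_HOUR, days=DAYS_IN_MONTH):
    """Number of genomes decoded in a month at a constant rate."""
    return per_hour * 24 * days


def gb_per_genome(total_tb=RAW_DATA_TB, genomes=None):
    """Average raw data per genome, in gigabytes."""
    if genomes is None:
        genomes = genomes_per_month()
    return total_tb * 1000 / genomes


if __name__ == "__main__":
    print(genomes_per_month(), "genomes in the month")
    print(round(gb_per_genome()), "GB of raw data per genome")
```

Under these assumptions, the month comes to about 1,440 genomes at roughly 139GB of raw data each.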

Amazon and Google understand this need and already offer to keep a copy of any genome for €24 ($25) a year, which translates to roughly €0.02/GB per month, since a file is commonly between 100 and 400GB. In 2014, The National Cancer Institute said that it would pay €18M ($19M) to move copies of the 2.6 petabyte Cancer Genome Atlas into the cloud.
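The per-gigabyte figure follows from simple arithmetic, sketched below for both ends of the quoted file-size range. The function name is purely illustrative, and the calculation assumes the flat annual fee is spread evenly over 12 months.

```python
# Convert a flat annual storage fee into a cost per GB per month.
# Assumption: the fee is spread evenly over 12 months.
ANNUAL_FEE_EUR = 24            # yearly fee per genome, as quoted
FILE_SIZES_GB = (100, 400)     # typical file-size range, as quoted


def eur_per_gb_month(annual_fee_eur, size_gb):
    """Monthly cost per gigabyte for one stored genome file."""
    return annual_fee_eur / 12 / size_gb


if __name__ == "__main__":
    for size in FILE_SIZES_GB:
        cost = eur_per_gb_month(ANNUAL_FEE_EUR, size)
        print(f"{size} GB file: EUR {cost:.3f} per GB per month")
```

The €0.02 figure corresponds to the lower end of the range (a 100GB file). For a 400GB file, the same fee works out to about €0.005 per GB per month.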

“Our bird’s eye view is that if I were to get lung cancer in the future, doctors are going to sequence my genome and my tumor’s genome, and then query them against a database of 50 million other genomes,” said Deniz Kural, whose company, Seven Bridges, stores genome data using Amazon’s cloud system.

US-based, an online platform for people developing apps to analyze genetic data from multiple sources, is offering free genetic data storage (with the help of a partnership with Microsoft) to app developers and those who want to use its services. It and others believe free access to data is important for progress and also for ethical reasons, although some competitors (such as Helix) do not allow completely free access to your genetic data. Companies such as Nebula Genomics, Zenome and DNAtix have even taken this a step further and are using blockchain technology to anonymize a person’s DNA sequence and allow them to sell access to the data to the highest bidder.

Genome sequencing around the world

The UK was the first to launch a program dedicated to whole genome sequencing in Europe. Genomics England aims to sequence up to 100,000 whole genomes from patients with rare diseases, their families, and cancer patients from 11 Genomic Medicine Centres. Ten companies, including GSK, AstraZeneca and Roche, have signed up to be part of the GENE Consortium, giving them access to 5,000 sequenced genomes.

These collaborations have raised concerns regarding access to private health data, but there is no doubt that such a massive project would not be possible without private funding. Genomics England’s community management is impressive, with frequent updates and campaigns to raise public awareness. According to a counter that is updated monthly, almost 85,000 genomes have been sequenced so far!

Along similar lines, Australia is currently working on the €290M (AU$400M), 4-year 100,000 Genomes Project, sequencing patients with rare diseases and cancer to create a massive database for R&D.

Estonia proposed an ambitious personalized medicine program in June 2000 and thus became an unexpected pioneer. The Estonian Genome Project Foundation had collected data from 52,000 adult donors by February 2014 including a few hundred WGS. In March this year it offered a further 100,000 people free genetic testing in an attempt to dramatically increase the size of its existing biobank.

In the USA, the Precision Medicine Initiative (PMI), with its 1-million-volunteer health study, will gather a large database of health data including genetics and lifestyle factors. To cut a long story short, the Mayo Clinic will analyze and store one million blood and DNA samples.

As in the UK, some of the anonymized data will probably be made available to researchers and industry in order to stimulate the project, which started in 2016 with €52M ($55M) from the NIH to build the foundational partnerships and infrastructure needed to launch the program.

In 2016, France announced the “France Medecine Genomique 2025” program, aiming to open 12 sequencing centers and deliver 235,000 WGS a year. The French government is planning to inject €670M into this program, whose main aim is to use WGS as a diagnostic tool.

Many other western countries such as Ireland and Iceland have launched their own programs. However, when it comes to personalized medicine, taking into account genetic variability between populations is a prerequisite. Western medicine has historically targeted western populations, but nowadays western medicine is a worldwide practice.

“There is a massive bias in medical research; Europeans have been developing drugs for Europeans without asking how compatible these pharmaceuticals are for the rest of the world,” commented Stephan Schuster, Chair of the Genome Asia consortium.

Based on this observation, the non-profit consortium GenomeAsia 100K decided to generate genomic data for Asian populations. Supporters of the initiative include genomics companies Macrogen in Korea and MedGenome in India, as well as Illumina. According to the PHG Foundation, at least 50,000 DNA samples have already been collected, and initial work will focus on creating suitable reference genome sequences for key populations in countries such as Malaysia, India, Japan and Thailand.

With the same purpose, the Qatar Genome Program aims to establish the Qatari Reference Genome Map by sequencing 3,000 whole genomes, which accounts for around 1% of the Qatari population.

Last but not least, China has been an unbeatable leader in genome sequencing for years now. In 2010, the BGI genomics institute in Shenzhen was probably hosting a higher sequencing capacity than that of the entire United States. China’s sequencing program is not just aiming for thousands but rather one million human genomes and will include subgroups of 50,000 people, each with specific conditions such as cancer or metabolic disease. There will also be cohorts from different regions of China “to look at the different genetic backgrounds of subpopulations.”

How to handle the ethical implications?

It is difficult to anticipate the impact of WGS on modern medicine, but ethical issues regarding privacy of health data have already emerged. It is obvious that no one would like to see GAFA (Google, Apple, Facebook, Amazon) selling genome data as they are probably already doing with personal data from their users.

A key challenge is that ethical, legal, and social concerns raised by the most innovative technologies, including cell and gene therapy as well as sequencing, significantly differ between regions. This definitely gives a certain advantage to countries with less restrictive laws, which are usually not western countries. For example, in Europe, transparency about the purpose of sample collection and protocols is mandatory before any research is conducted. In the US, the Genetic Information Nondiscrimination Act was set up in 2008 to legally protect people from being discriminated against based on their DNA, but this is by no means widespread in other countries and the legal position in Europe remains somewhat grey.

Although it is easier said than done, regulators should be proactive and set up an appropriate framework for these promising but challenging approaches while ensuring it does not hinder R&D.

This feature was originally published on 2.3.17 and authored by Timothé Cynober. It has since been updated to reflect the latest developments in the field.

Images via whiteMocca /Shutterstock; Clark MJ et al. (2010), PLoS Genet 6(1): e1000832; GenomicsEngland; National Institutes of Health
