Unraveling Genomic Mysteries with Long Read NGS

 

Introduction

Since the landmark sequencing of the human genome in 2003, sequencing technology has made incredible strides.

Where researchers once painstakingly decoded DNA letter by letter, we now have a range of technologies that offer faster and more efficient DNA sequencing. Today, one of the pivotal decisions in any sequencing experiment is choosing between short-read and long-read sequencing. While short-read sequencing has been the go-to workhorse in NGS labs, long-read sequencing is emerging as a powerful alternative, capable of unlocking deeper genomic insights.

In this blog post, we’ll delve into long-read NGS and explore its unique strengths and applications.

Unleashing the Power of Long Read NGS

Long read sequencing is a game-changer in genomics research.

Unlike their short-read counterparts, long-read NGS instruments generate reads spanning thousands to hundreds of thousands of bases. This extended read length enables researchers to detect complex structural variations, including large insertions/deletions, inversions, repeats, duplications, and translocations. Additionally, long-read sequencing facilitates phasing of single nucleotide polymorphisms (SNPs) into haplotypes, aids in de novo assembly, and provides a comprehensive view of splicing events in full-length cDNA.
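To make the phasing idea concrete, here is a minimal Python sketch. It assumes each read has already been reduced to the alleles it shows at known heterozygous SNP positions; the function name, the positions, and the alleles are all made up for illustration, and real phasing tools use far more robust statistical models than simple counting.

```python
from collections import Counter


def phase_pair(reads, pos_a, pos_b):
    """Toy phasing of two heterozygous SNPs (illustrative helper, not a real tool).

    Each read is a dict mapping SNP position -> observed allele.
    Only reads that span BOTH positions are informative.
    """
    pairs = Counter()
    for read in reads:
        if pos_a in read and pos_b in read:
            pairs[(read[pos_a], read[pos_b])] += 1
    # Treat the two most frequently co-observed allele pairs as the haplotypes.
    return [combo for combo, _ in pairs.most_common(2)]


# Hypothetical reads: four span both SNPs (8 kb apart), two cover only one.
reads = [
    {1000: "A", 9000: "G"},
    {1000: "A", 9000: "G"},
    {1000: "A", 9000: "G"},
    {1000: "C", 9000: "T"},
    {1000: "C", 9000: "T"},
    {1000: "A"},   # too short to link the two sites
    {9000: "T"},   # too short to link the two sites
]

haplotypes = phase_pair(reads, 1000, 9000)
print(haplotypes)
```

The last two reads contribute nothing because they cover only one site each. That is the situation short reads are in whenever two variants sit further apart than the read length, and it is why reads spanning many kilobases make phasing easier.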

Overcoming Challenges and Enhancements

While long-read sequencing instruments have been available for some time, lower yield, higher per-read error rates, and higher costs initially limited their adoption. Per-read accuracy has been a particular challenge in nanopore technology, where the speed of the DNA molecule through the pore is inherently difficult to control. However, advances such as circular consensus sequencing, which reads the same molecule multiple times and builds a consensus from the passes, have significantly improved accuracy, which now rivals that of short-read platforms.

Companies like PacBio and Oxford Nanopore Technologies (ONT) have made substantial strides in making long-read sequencing more accessible. PacBio’s Sequel II instruments now offer “HiFi sequencing” through circular consensus, enabling the sequencing of larger DNA fragments (15-20 kb) with error rates approaching those of short-read sequencing. ONT, on the other hand, provides a range of platforms with varying price points, data outputs, and portability, allowing for read lengths of up to hundreds of kilobases.
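A rough way to see why repeated passes over the same molecule help is the back-of-the-envelope Python sketch below. It is a simplification and not how any vendor's consensus caller works: it assumes errors in each pass are independent, that a base is simply right or wrong, and that the consensus is a plain majority vote. The 10% per-pass error rate is an illustrative number, not a measured figure.

```python
from math import comb


def majority_error(p, n):
    """Probability that more than half of n independent passes are wrong
    at a single base, given a per-pass error probability p.

    Intended for odd n; with even n, exact ties are not counted as errors.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


# Illustrative per-pass error rate (an assumption, not a platform spec).
per_pass_error = 0.10

for passes in (1, 3, 5, 9):
    print(passes, majority_error(per_pass_error, passes))
```

Under these assumptions the consensus error falls quickly as passes are added. Real consensus algorithms use probabilistic models and must also handle systematic, non-independent errors, so actual gains differ from this idealized curve.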

Harnessing the Synergy of Short and Long Read Data

In many research projects, the combination of short and long-read data can yield powerful insights.

Short-read sequencing provides high-depth, high-quality data at a lower cost per base, making it ideal for calling SNPs and other small mutations. Layering long-read sequencing information on top of this data allows for the resolution of complex structural variations and haplotype phasing.

Although it requires more sophisticated analysis methods, this fusion of technologies is particularly valuable in de novo assembly or rare disease sequencing projects, leading to a more comprehensive understanding of genetic variation.

Conclusion

The ongoing debate between short-read and long-read sequencing underscores the unique strengths of each technology.

However, the true power lies in their integration. By combining both short and long-read sequencing approaches, researchers can attain a holistic view of their genomic data, capitalizing on the speed and affordability of short-read sequencing while unlocking the deeper insights offered by long-read technology.