Large-Scale Long-Read RNA Dataset Advances Disease Research with 750 Million Reads from 14 Human Cell Lines

Researchers led by the A*STAR Genome Institute of Singapore (A*STAR GIS) have released one of the most comprehensive long-read RNA sequencing datasets to date, aiming to accelerate discoveries in disease mechanisms. The dataset, produced by the Singapore Nanopore Expression (SG-NEx) project, encompasses over 750 million RNA reads derived from 14 human cell lines, offering detailed insight into RNA complexity.
Traditional short-read RNA sequencing has been instrumental in transcriptomics but faces limitations in capturing full-length RNA molecules, especially when it comes to intricate phenomena like alternative splicing, fusion transcripts, and specific chemical modifications. These limitations hinder the identification of key biomarkers relevant to diseases such as cancer.
The SG-NEx dataset employs long-read RNA sequencing technology, which allows scientists to observe RNA molecules at full length, revealing structural detail that short reads cannot capture. This approach improves the understanding of transcript diversity and provides a foundation for developing next-generation diagnostics and therapeutics.
"Think of reading a book torn into fragments versus reading the complete chapters," explained Chen Ying, Senior Scientist at A*STAR GIS. "Long-read sequencing lets us read full chapters, making it easier to uncover vital information linked to disease processes."
This open-access dataset aims to support academic and translational research as well as industrial applications, enabling more precise gene isoform analysis and fostering innovation in RNA-based medicine. It is intended to benefit biotech firms, pharmaceutical companies, bioinformatics developers, and healthcare policymakers.
The SG-NEx project was launched in 2018 in collaboration with institutions including Duke-NUS Medical School, the National Cancer Centre Singapore, and international research centers, with a commitment to rapid data sharing to expedite biomedical advances.
Future efforts include integrating artificial intelligence tools for automated detection of RNA features, broadening global data access, and standardizing long-read sequencing protocols for clinical use.
"By combining large-scale data with benchmarking and open access, SG-NEx is shaping the future of RNA research, bringing us closer to understanding how RNA influences health and disease," stated Dr. Wan Yue, Executive Director of A*STAR GIS.
The initiative marks a step toward harnessing long-read sequencing technologies to improve diagnostics, patient care, and the understanding of disease at the molecular level.



