Volume 18, No 6, Jun 2008
ISSN: 1001-0602
EISSN: 1748-7838
2018 impact factor 17.848*
(Clarivate Analytics, 2019)
Volume 18 Issue 6, June 2008: 695-700
ORIGINAL ARTICLES
Finding noncoding RNA transcripts from low abundance expressed sequence tags
Chenghai Xue1,2,*, Fei Li1,3,* and Fei Li1
1Department of Entomology, Nanjing Agricultural University, Nanjing 210095, China;
2MOE Key Laboratory of Bioinformatics and Bioinformatics Div, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China;
3The First Hospital of Tsinghua University, Beijing 10084, China
Correspondence: Fei Li (lifei@njau.edu.cn)
It has been shown that noncoding RNA (ncRNA) genes are much more numerous than expected. However, identifying ncRNAs remains difficult with either computational algorithms or biological experiments. Recent reports have suggested that ncRNAs may also appear in expressed sequence tag (EST) databases. Nevertheless, intergenic ESTs have received little attention and are poorly annotated owing to their low abundance. Here, we developed a computational strategy for discovering ncRNA genes from human ESTs. We first collected ESTs that are located in intergenic regions and lack detailed annotations. The intergenic regions were divided into non-overlapping 50-nt windows, and PhastCons scores obtained from the UCSC database were assigned to these windows. As seeds, we kept the conserved windows that had PhastCons scores above 0.8 and at least three supporting ESTs. Each cluster of ESTs corresponding to a seed was assembled into a long contig. We used two criteria to screen for ncRNA transcripts among these contigs: first, the longest predicted open reading frame had to be shorter than 300 nt; second, a likely Pol-II promoter had to exist within 2,000 nt upstream or downstream of the contig. As a result, 118 novel ncRNA genes were identified from human low-abundance ESTs. Of seven randomly selected candidates, six were transcribed in human 2BS cells, as shown by RT-PCR. Our work shows that ESTs are a 'hidden treasure' for detecting novel ncRNA genes.
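The filtering steps described in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation. The thresholds (50-nt windows, PhastCons above 0.8, at least three ESTs, ORF shorter than 300 nt) come from the abstract; everything else is an assumption made for illustration. In particular, the sketch assumes that a window's score is the mean of its per-base PhastCons scores, that an EST supports a window if it overlaps it by at least one base, and that an ORF runs from an ATG to the next in-frame stop codon on either strand. The function names are hypothetical, and promoter prediction is treated as an external result passed in as a boolean.

```python
from typing import List, Sequence, Tuple

WINDOW = 50       # window size in nt (from the abstract)
MIN_SCORE = 0.8   # PhastCons cutoff (from the abstract)
MIN_ESTS = 3      # minimum number of supporting ESTs (from the abstract)
MAX_ORF = 300     # ORF length cutoff in nt (from the abstract)

STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def seed_windows(scores: Sequence[float],
                 ests: Sequence[Tuple[int, int]],
                 window: int = WINDOW) -> List[int]:
    """Return start offsets of conserved, EST-supported windows.

    scores: per-base PhastCons scores for one intergenic region.
    ests:   half-open (start, end) intervals of aligned ESTs,
            in the same region coordinates.
    Assumption: window score = mean of per-base scores.
    """
    seeds = []
    for start in range(0, len(scores) - window + 1, window):
        end = start + window
        mean_score = sum(scores[start:end]) / window
        # Assumption: any overlap of one base or more counts as support.
        support = sum(1 for s, e in ests if s < end and e > start)
        if mean_score > MIN_SCORE and support >= MIN_ESTS:
            seeds.append(start)
    return seeds


def longest_orf(seq: str) -> int:
    """Length in nt of the longest ATG-to-stop ORF over all six frames.

    Assumption: the stop codon is counted in the ORF length.
    """
    seq = seq.upper()
    reverse_complement = seq.translate(_COMPLEMENT)[::-1]
    best = 0
    for strand in (seq, reverse_complement):
        for frame in range(3):
            orf_start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if orf_start is None and codon == "ATG":
                    orf_start = i
                elif orf_start is not None and codon in STOPS:
                    best = max(best, i + 3 - orf_start)
                    orf_start = None
    return best


def is_ncrna_candidate(contig: str, has_nearby_promoter: bool) -> bool:
    """Apply the two screening criteria to an assembled contig.

    has_nearby_promoter is assumed to come from a separate Pol-II
    promoter prediction within 2,000 nt of the contig.
    """
    return longest_orf(contig) < MAX_ORF and has_nearby_promoter
```

As a usage example, a region whose first 50 bases are highly conserved and overlapped by three ESTs yields a single seed at offset 0, and a contig with only a short ORF and a predicted promoter nearby passes the screen. Other definitions of window score or ORF would change which contigs are retained.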
Cell Research (2008) 18:695-700. doi: 10.1038/cr.2008.59; published online 27 May 2008