LncRNA Research Resources
Mary Johnson (han at labome dot com)
Synatom Research, Princeton, New Jersey, United States
DOI
//dx.doi.org/10.13070/mm.en.3.159
Date
last modified : 2022-10-20; original version : 2013-11-13
Cite as
MATER METHODS 2013;3:159
Abstract

A compilation of resources for lncRNA research.

LncRNA refers to long non-coding RNA molecules, usually longer than 200 bases, with features similar to those of mRNA, such as 5’ capping, splicing, and polyadenylation. However, lncRNAs have few or no open reading frames, and thus are not translated. A substantial number of lncRNAs may turn out to be mis-annotated, since they might code for essential proteins through non-canonical open reading frames, as in the case of Aw112010 [2]. In addition, lncRNAs are not well conserved across species. Even those that are conserved may display distinct processing in different species [3]. Many thousands or tens of thousands of lncRNAs have been suggested in different species. It has been estimated that there are 91,000 human lncRNAs [4], which can be downloaded from MiTranscriptome. However, definitive functions have been experimentally identified for only a handful of them. For example, lncRNA SLERT was found to regulate phase separation of fibrillar center and dense fibrillar component units in the nucleolus [5]. D Prokopenko et al identified LINC00298 as a candidate Alzheimer's disease locus [6]. M Pradas-Juni et al identified the involvement of LincIRS2 in hepatic glucose metabolism [7]. B Labonté et al identified and characterized a novel lncRNA, MAALIN, that regulates the expression of the monoamine oxidase A (MAOA) gene in the brain and may, consequently, regulate impulsive and aggressive behaviours in mice and humans [8]. LncRNA CCR5AS protects CCR5 mRNA from Raly-mediated degradation by interfering with interactions between Raly and the CCR5 3' untranslated region [9]. Dali, a 3.5-kb, CNS-expressed, mono-exonic, intergenic lncRNA, was shown to interact with a neighbouring transcription factor, Pou3f3, and distally with the DNA methyltransferase DNMT1 to affect the DNA methylation of promoters [10]. 
Long intergenic noncoding RNA HOTAIR might serve as a modular scaffold of histone modification complexes [11] and its transcription is regulated by chromatin topology modulation [12]. ARLNC1 (AR-regulated long non-coding RNA 1) interacts with and stabilizes the androgen receptor transcript and promotes prostate cancer growth [13]. The promoter of lncRNA gene PVT1 possesses tumor-suppressor function [14].

A large amount of information regarding lncRNA identities, properties, and functions, as well as many tools for their analysis, has become available during the last few years [15, 16]. Here we list databases and computational, prediction, and experimental tools related to lncRNAs. Valuable information about these resources can also be found in recent comparative reviews of databases [1, 17-19], computational and experimental methods [20], structure prediction methods [21], and lncRNA nomenclature [22].

LncRNA Databases

A lot of effort has been put into organizing the vast amount of data on lncRNAs. The information curated in these databases includes basic genomic annotation, lncRNA expression profiles, sequence variants and lncRNA-protein, lncRNA-RNA or lncRNA-DNA interactions (Figure 1).

Figure 1. Main types of information included in lncRNA databases (reproduced from [1]).
GENCODE

http://www.gencodegenes.org/ last update: ongoing. From ENCODE Consortium.

GENCODE is a large-scale effort that aims to annotate all evidence-based gene features in the entire human genome with high accuracy [23]. It is funded by the NIH and the Wellcome Trust. GENCODE combines manual curation, computational analysis, and targeted experimental validation of the GENCODE transcript database. The current human version, GENCODE 38, released in May 2021, includes 17,944 lncRNA genes with 48,752 lncRNA transcripts. The current mouse version, GENCODE M27, released in May 2021, includes 13,188 lncRNA genes with 18,838 lncRNA transcripts.
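Gene and transcript counts like those above can be recomputed from an annotation file. The following is a minimal sketch in Python, assuming a GTF file with tab-separated columns whose ninth column holds `key "value";` attributes, and assuming lncRNA records carry `gene_type "lncRNA"` and `transcript_type "lncRNA"`. The function name `count_lncrna` and the toy records are hypothetical; the attribute names and biotype labels should be checked against the release actually used.

```python
import re
from typing import Iterable, Tuple

# Attribute fields in column 9 are assumed to look like: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def count_lncrna(gtf_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of lncRNA genes, number of lncRNA transcripts)."""
    genes, transcripts = set(), set()
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        feature = cols[2]
        attrs = dict(ATTR_RE.findall(cols[8]))
        if feature == "gene" and attrs.get("gene_type") == "lncRNA":
            genes.add(attrs["gene_id"])
        elif feature == "transcript" and attrs.get("transcript_type") == "lncRNA":
            transcripts.add(attrs["transcript_id"])
    return len(genes), len(transcripts)


# Toy records for illustration only; these are not real GENCODE entries.
example_gtf = [
    "##description: toy example, not real GENCODE records",
    'chr1\tHAVANA\tgene\t100\t900\t.\t+\t.\tgene_id "TOYG1"; gene_type "lncRNA"; gene_name "TOY1";',
    'chr1\tHAVANA\ttranscript\t100\t900\t.\t+\t.\tgene_id "TOYG1"; transcript_id "TOYT1"; gene_type "lncRNA"; transcript_type "lncRNA";',
    'chr1\tHAVANA\ttranscript\t150\t900\t.\t+\t.\tgene_id "TOYG1"; transcript_id "TOYT2"; gene_type "lncRNA"; transcript_type "lncRNA";',
    'chr1\tHAVANA\tgene\t2000\t5000\t.\t-\t.\tgene_id "TOYG2"; gene_type "protein_coding"; gene_name "TOY2";',
]

print(count_lncrna(example_gtf))  # (1, 2)
```

Because the function accepts any iterable of lines, a compressed annotation file could be streamed through it with `gzip.open(path, "rt")` instead of being loaded into memory.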

LNCipedia

http://www.lncipedia.org/ last updated: Aug 2, 2018. From Ghent University, Belgium.

LNCipedia V5.2, the latest version, contains 127,802 transcripts from 56,946 genes. In addition to basic transcript information and structure, several statistics are calculated for each entry in the database, such as secondary structure information, protein-coding potential, and microRNA binding sites.

Publications about LNCipedia:

  • LNCipedia 5: towards a reference set of human long non-coding RNAs [24].
  • An update on LNCipedia: a database for annotated human lncRNA sequences [25].
  • A database for annotated human lncRNA transcript sequences and structures [26].
NONCODE

http://www.noncode.org/ last update: Sep 2017. From Beijing, China.

The current version (5.0) presents an increased collection of lncRNAs from 17 species.

Publications about NONCODE:

  • NONCODE: an integrated knowledge database of non-coding RNAs [27]
  • NONCODE v2.0: decoding the non-coding [28]
  • NONCODE v3.0: integrative annotation of long noncoding RNAs [29]
  • NONCODEv4: exploring the world of long non-coding RNA genes [30]
  • NONCODEv4: Annotation of Noncoding RNAs with Emphasis on Long Noncoding RNAs [31]
LncRNASNP2

http://bioinfo.life.hust.edu.cn/lncRNASNP/ last update: unknown. From Huazhong University of Science & Technology, Wuhan, China.

"Long non-coding RNAs (lncRNAs) are emerging as key factors in the regulation of various cellular processes and diseases. LncRNASNP is a database providing comprehensive resources of single nucleotide polymorphisms (SNPs) in human/mouse lncRNAs. It contains SNPs in lncRNAs, SNP effects on lncRNA structure, mutations in lncRNAs and lncRNA:miRNA binding.

In lncRNASNP2 [32], numbers of human lncRNAs and SNPs on them were updated to 141,353 and 10,205,295. Furthermore, we identified 859,534 Cosmic Noncoding Variations and 315,234 TCGA cancer mutations based on GRCh38 in these lncRNAs."

DIANA-LncBase

DIANA-LncBase, part of the DIANA lab tools (http://diana.imis.athena-innovation.gr/DianaTools/index.php?r=site/index), provides a comprehensive annotation of putative miRNA-lncRNA functional interactions. It includes experimentally verified (> 5000 as of Jan 2013) and computationally predicted (> 10 million as of Jan 2013) miRNA recognition elements (MREs) on human and mouse lncRNAs. For each miRNA-lncRNA pair it provides “external links, graphic plots of transcripts' genomic location, representation of the binding sites, lncRNA tissue expression as well as MREs conservation and prediction scores” [33].

An enhanced version, DIANA-LncBase v2.0, became available as of 2015 [34]. The database adds “more than 70 000 low and high-throughput, (in)direct miRNA:lncRNA experimentally supported interactions”, miRNA targets on lncRNAs, predicted with the DIANA-microT algorithm, cell type-specific miRNA:lncRNA regulation, and lncRNA expression information, derived from the analysis of more RNA-Seq reads.
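To make the notion of a miRNA recognition element concrete, the following is a deliberately simplified Python sketch that scans a transcript for perfect matches to the reverse complement of the miRNA seed (nucleotides 2-8). This is a generic seed-matching illustration and is not the DIANA-microT algorithm, which uses a far richer scoring model; the function names and the toy sequences are hypothetical.

```python
def revcomp_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence (A/C/G/U only)."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def seed_sites(mirna: str, target: str) -> list:
    """Return 0-based start positions in `target` that pair perfectly
    with the miRNA seed, taken here as nucleotides 2-8 (an assumption)."""
    mirna = mirna.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    seed = mirna[1:8]           # nucleotides 2-8 of the miRNA
    site = revcomp_rna(seed)    # sequence expected on the target RNA
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # allow overlapping hits
    return positions


# Toy sequences for illustration only.
toy_mirna = "UAGCUUAUCAGACUGAUGUUGA"
toy_target = "GGCAUAAGCUCCGGAUAAGCUAA"
print(seed_sites(toy_mirna, toy_target))  # [3, 14]
```

A scan of this kind only nominates candidate sites; conservation, site context, and experimental support, as provided by the database, are what distinguish likely functional MREs.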

Computational and Prediction Tools for LncRNAs
ATtRACT

ATtRACT (A daTabase of RNA binding proteins and AssoCiated moTifs) can be used to predict RNA-binding proteins for lncRNAs [35]. CJ Guo et al used ATtRACT to predict FAST binding proteins [3].

LncDisease

LncDisease is a sequence-based bioinformatics method to predict lncRNA-disease associations based on the crosstalk between lncRNAs and miRNAs. The most recent update was in January 2019. "Current version of LncRNADisease database integrated near 3000 lncRNA-disease entries and 475 lncRNA interaction entries, including 914 lncRNAs and 329 diseases from ~2000 publications. LncRNADisease also provided the predicted associated diseases of 1564 human lncRNAs".

LncRscan-SVM

lncRScan-SVM is a tool for predicting lncRNAs using a Support Vector Machine (SVM). To make its predictions, it integrates features derived from gene structure, transcript sequence, potential codon sequence, and conservation [36].

LncRNA-MFDL

lncRNA-MFDL is a tool to identify lncRNAs by “fusing multiple features of the open reading frame, k-mer, the secondary structure and the most-like coding domain sequence and using deep learning classification algorithms” [37].
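Two of the feature types named above, open reading frame properties and k-mer composition, are simple to compute from a transcript sequence. The Python sketch below illustrates the idea only and does not reproduce the feature definitions of lncRNA-MFDL or any other published tool; the function names, the forward-strand ATG-to-stop ORF definition, and the choice of k are assumptions.

```python
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq: str) -> int:
    """Length (nt, stop codon included) of the longest ATG...stop ORF
    in the three forward reading frames."""
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon in this frame
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None                   # look for the next ORF
    return best


def kmer_frequencies(seq: str, k: int = 3) -> dict:
    """Relative frequency of every A/C/G/T k-mer (overlapping windows)."""
    seq = seq.upper().replace("U", "T")
    total = max(len(seq) - k + 1, 0)
    counts = Counter(seq[i:i + k] for i in range(total))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    return {km: (counts[km] / total if total else 0.0) for km in kmers}


def sequence_features(seq: str, k: int = 3) -> dict:
    """Combine ORF and k-mer features into one flat feature dictionary."""
    orf = longest_orf_length(seq)
    features = {
        "length": len(seq),
        "longest_orf": orf,
        "orf_coverage": orf / len(seq) if seq else 0.0,
    }
    features.update(kmer_frequencies(seq, k))
    return features


toy = "CCATGAAATTTTAGGG"  # toy sequence, for illustration only
f = sequence_features(toy)
print(f["longest_orf"], f["orf_coverage"])  # 12 0.75
```

A feature dictionary of this kind could be passed to any classifier; transcripts with short ORFs relative to their length tend to be scored as non-coding, which is the intuition behind ORF-based features.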

LncRNA-ID

LncRNA-ID is a tool to calculate “the coding potential of a transcript using a machine learning model (random forest)”. The analysis takes into account multiple features including: “sequence characteristics of putative open reading frames, translation scores based on ribosomal coverage, and conservation against characterized protein families” [38].

LPBNI (lncRNA-protein bipartite network inference)

LPBNI is a new computational method that aims to identify potential lncRNA-protein interactions, by making full use of the known lncRNA-protein interactions [39].
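The general idea of inferring new links from a bipartite network of known interactions can be sketched as a two-step resource-allocation procedure: resource placed on the proteins known to bind a query lncRNA flows to all lncRNAs sharing those proteins, and then back to proteins. The Python code below is a generic sketch of this family of methods under those assumptions; whether it matches the details of LPBNI should be checked against [39]. The function names and the toy interaction matrix are hypothetical.

```python
def bipartite_scores(adj, query):
    """Two-step resource allocation on a lncRNA x protein 0/1 matrix.

    adj[l][p] == 1 means lncRNA l is known to interact with protein p.
    Returns one score per protein for the lncRNA with index `query`.
    """
    n_lnc, n_prot = len(adj), len(adj[0])
    prot_deg = [sum(adj[l][p] for l in range(n_lnc)) for p in range(n_prot)]
    lnc_deg = [sum(row) for row in adj]

    # Step 1: each protein bound by the query splits one unit of resource
    # equally among all lncRNAs it interacts with.
    lnc_res = [
        sum(adj[l][p] * adj[query][p] / prot_deg[p]
            for p in range(n_prot) if prot_deg[p])
        for l in range(n_lnc)
    ]
    # Step 2: each lncRNA splits what it received equally among its proteins.
    return [
        sum(adj[l][p] * lnc_res[l] / lnc_deg[l]
            for l in range(n_lnc) if lnc_deg[l])
        for p in range(n_prot)
    ]


def rank_candidates(adj, query):
    """Proteins not yet linked to the query lncRNA, best score first."""
    scores = bipartite_scores(adj, query)
    unknown = [p for p in range(len(scores)) if not adj[query][p]]
    return sorted(unknown, key=lambda p: scores[p], reverse=True)


# Toy interaction matrix (rows: lncRNAs, columns: proteins), illustration only.
toy_adj = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
]
print(bipartite_scores(toy_adj, 0))  # [1.0, 0.75, 0.25]
print(rank_candidates(toy_adj, 0))   # [2]
```

In the toy matrix the total resource is conserved (two units in, two units out), and the only protein not yet linked to the query lncRNA receives its score through the second lncRNA, which shares a binding partner with the query.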

Major Research Methods

All the common mRNA research methods can be used to study lncRNA molecules. They include 3SEQ (3’ End Sequencing for Expression Quantification) [40, 41], RNA-seq, qRT-PCR, RNA in situ hybridization (which can be done in an array format) or RNAscope in situ hybridization [8], and Northern blot. A specific category of siRNAs against lncRNAs is available from Dharmacon (Lincode SMARTpool); AA Butler et al injected Lincode SMARTpool siRNAs against Neat1 into mouse hippocampal areas to evaluate the role of lncRNA Neat1 in memory formation [42].

References
  1. Rühle F, Stoll M. Long non-coding RNA Databases in Cardiovascular Research. Genomics Proteomics Bioinformatics. 2016;14:191-9
  2. Jackson R, Kroehling L, Khitun A, Bailis W, Jarret A, York A, et al. The translation of non-canonical open reading frames controls mucosal immunity. Nature. 2018;564:434-438
  3. Guo C, Ma X, Xing Y, Zheng C, Xu Y, Shan L, et al. Distinct Processing of lncRNAs Contributes to Non-conserved Functions in Stem Cells. Cell. 2020;181:621-636.e22
  4. Iyer M, Niknafs Y, Malik R, Singhal U, Sahu A, Hosono Y, et al. The landscape of long noncoding RNAs in the human transcriptome. Nat Genet. 2015;47:199-208
  5. Wu M, Xu G, Han C, Luan P, Xing Y, Nan F, et al. lncRNA SLERT controls phase separation of FC/DFCs to facilitate Pol I transcription. Science. 2021;373:547-555
  6. Prokopenko D, Morgan S, Mullin K, Hofmann O, Chapman B, Kirchner R, et al. Whole-genome sequencing reveals new Alzheimer's disease-associated rare variants in loci related to synaptic function and neuronal development. medRxiv. 2020;:
  7. Pradas Juni M, Hansmeier N, Link J, Schmidt E, Larsen B, Klemm P, et al. A MAFG-lncRNA axis links systemic nutrient abundance to hepatic glucose metabolism. Nat Commun. 2020;11:644
  8. Labonte B, Abdallah K, Maussion G, Yerko V, Yang J, Bittar T, et al. Regulation of impulsive and aggressive behaviours by a novel lncRNA. Mol Psychiatry. 2020;:
  9. Kulkarni S, Lied A, Kulkarni V, Rucevic M, Martin M, Walker Sperling V, et al. CCR5AS lncRNA variation differentially regulates CCR5, influencing HIV disease outcome. Nat Immunol. 2019;:
  10. Chalei V, Sansom S, Kong L, Lee S, Montiel J, Vance K, et al. The long non-coding RNA Dali is an epigenetic regulator of neural differentiation. elife. 2014;3:e04530
  11. Tsai M, Manor O, Wan Y, Mosammaparast N, Wang J, Lan F, et al. Long noncoding RNA as modular scaffold of histone modification complexes. Science. 2010;329:689-93
  12. Battistelli C, Sabarese G, Santangelo L, Montaldo C, Gonzalez F, Tripodi M, et al. The lncRNA HOTAIR transcription is controlled by HNF4α-induced chromatin topology modulation. Cell Death Differ. 2019;26:890-901
  13. Zhang Y, Pitchiaya S, Cieslik M, Niknafs Y, Tien J, Hosono Y, et al. Analysis of the androgen receptor-regulated lncRNA landscape identifies a role for ARLNC1 in prostate cancer progression. Nat Genet. 2018;50:814-824
  14. Cho S, Xu J, Sun R, Mumbach M, Carter A, Chen Y, et al. Promoter of lncRNA Gene PVT1 Is a Tumor-Suppressor DNA Boundary Element. Cell. 2018;173:1398-1412.e22
  15. Nojima T, Proudfoot N. Mechanisms of lncRNA biogenesis as revealed by nascent transcriptomics. Nat Rev Mol Cell Biol. 2022;23:389-406
  16. Kopp F, Mendell J. Functional Classification and Experimental Dissection of Long Noncoding RNAs. Cell. 2018;172:393-407
  17. Fritah S, Niclou S, Azuaje F. Databases for lncRNAs: a comparative evaluation of emerging tools. RNA. 2014;20:1655-65
  18. Yotsukura S, Duverle D, Hancock T, Natsume Kitatani Y, Mamitsuka H. Computational recognition for long non-coding RNA (lncRNA): Software and databases. Brief Bioinform. 2017;18:9-27
  19. Iwakiri J, Hamada M, Asai K. Bioinformatics tools for lncRNA research. Biochim Biophys Acta. 2016;1859:23-30
  20. Housman G, Ulitsky I. Methods for distinguishing between protein-coding and long noncoding RNAs and the elusive biological purpose of translation of long noncoding RNAs. Biochim Biophys Acta. 2016;1859:31-40
  21. Yan K, Arfat Y, Li D, Zhao F, Chen Z, Yin C, et al. Structure Prediction: New Insights into Decrypting Long Noncoding RNAs. Int J Mol Sci. 2016;17:
  22. Wright M. A short guide to long non-coding RNA gene nomenclature. Hum Genomics. 2014;8:7
  23. Harrow J, Frankish A, Gonzalez J, Tapanari E, Diekhans M, Kokocinski F, et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 2012;22:1760-74
  24. Volders P, Anckaert J, Verheggen K, Nuytens J, Martens L, Mestdagh P, et al. LNCipedia 5: towards a reference set of human long non-coding RNAs. Nucleic Acids Res. 2019;47:D135-D139
  25. Volders P, Verheggen K, Menschaert G, Vandepoele K, Martens L, Vandesompele J, et al. An update on LNCipedia: a database for annotated human lncRNA sequences. Nucleic Acids Res. 2015;43:D174-80
  26. Volders P, Helsens K, Wang X, Menten B, Martens L, Gevaert K, et al. LNCipedia: a database for annotated human lncRNA transcript sequences and structures. Nucleic Acids Res. 2013;41:D246-51
  27. Liu C, Bai B, Skogerbø G, Cai L, Deng W, Zhang Y, et al. NONCODE: an integrated knowledge database of non-coding RNAs. Nucleic Acids Res. 2005;33:D112-5
  28. He S, Liu C, Skogerbø G, Zhao H, Wang J, Liu T, et al. NONCODE v2.0: decoding the non-coding. Nucleic Acids Res. 2008;36:D170-2
  29. Bu D, Yu K, Sun S, Xie C, Skogerbø G, Miao R, et al. NONCODE v3.0: integrative annotation of long noncoding RNAs. Nucleic Acids Res. 2012;40:D210-5
  30. Xie C, Yuan J, Li H, Li M, Zhao G, Bu D, et al. NONCODEv4: exploring the world of long non-coding RNA genes. Nucleic Acids Res. 2014;42:D98-103
  31. Zhao Y, Yuan J, Chen R. NONCODEv4: Annotation of Noncoding RNAs with Emphasis on Long Noncoding RNAs. Methods Mol Biol. 2016;1402:243-254
  32. Miao Y, Liu W, Zhang Q, Guo A. lncRNASNP2: an updated database of functional SNPs and mutations in human and mouse lncRNAs. Nucleic Acids Res. 2017;:
  33. Paraskevopoulou M, Georgakilas G, Kostoulas N, Reczko M, Maragkakis M, Dalamagas T, et al. DIANA-LncBase: experimentally verified and computationally predicted microRNA targets on long non-coding RNAs. Nucleic Acids Res. 2013;41:D239-45
  34. Paraskevopoulou M, Vlachos I, Karagkouni D, Georgakilas G, Kanellos I, Vergoulis T, et al. DIANA-LncBase v2: indexing microRNA targets on non-coding transcripts. Nucleic Acids Res. 2016;44:D231-8
  35. Giudice G, Sanchez Cabo F, Torroja C, Lara Pezzi E. ATtRACT-a database of RNA-binding proteins and associated motifs. Database (Oxford). 2016;2016:
  36. Sun L, Liu H, Zhang L, Meng J. lncRScan-SVM: A Tool for Predicting Long Non-Coding RNAs Using Support Vector Machine. PLoS ONE. 2015;10:e0139654
  37. Fan X, Zhang S. lncRNA-MFDL: identification of human long non-coding RNAs by fusing multiple features and using deep learning. Mol Biosyst. 2015;11:892-7
  38. Achawanantakun R, Chen J, Sun Y, Zhang Y. LncRNA-ID: Long non-coding RNA IDentification using balanced random forests. Bioinformatics. 2015;31:3897-905
  39. Ge M, Li A, Wang M. A Bipartite Network-based Method for Prediction of Long Non-coding RNA-protein Interactions. Genomics Proteomics Bioinformatics. 2016;14:62-71
  40. Beck A, Weng Z, Witten D, Zhu S, Foley J, Lacroute P, et al. 3'-end sequencing for expression quantification (3SEQ) from archival tumor samples. PLoS ONE. 2010;5:e8768
  41. Brunner A, Beck A, Edris B, Sweeney R, Zhu S, Li R, et al. Transcriptional profiling of long non-coding RNAs and novel transcribed regions across a diverse panel of archived human cancers. Genome Biol. 2012;13:R75
  42. Butler A, Johnston D, Kaur S, Lubin F. Long noncoding RNA NEAT1 mediates neuronal histone methylation and age-related memory impairment. Sci Signal. 2019;12:
ISSN : 2329-5139