A central web resource for microRNA research, including microRNA databases, microRNA prediction and microRNA target prediction tools, microRNA expression profiling, and functional annotation.
Many software systems have been developed to analyze microRNA sequence data and generate microRNA predictions, target predictions and functional annotations. This article lists some of the commonly used databases and software systems for microRNA research. More specialized and detailed reviews have been published over the years. For example, for plant miRNA analysis, Lukasik A and Zielenkiewicz P reviewed software systems for sequence analysis (mirTools 2.0, miRDeep-P, sRNAtoolbox), microRNA identification (SeqBuster, isomiR2Function, DeAnnIso), novel miRNA prediction (miRNAFold, miRNA Digger, HuntMi), target prediction (psRNATarget, TAPIR, psRobot), functional annotation (clusterProfiler, BUFET), and others (miRprimer, WMD3, miTRATA, miRBase Tracker) [1]. Web resources for RNA in general can also be used for miRNA research, for example, RNAfold WebServer [2].
miRBase (www.mirbase.org) provides published miRNA sequences, annotations, target predictions, etc., and a convenient online query interface that allows users to search known miRNA and target information by keyword or sequence [3]. It is one of the main microRNA sequence databases. For example, Pandolfini L et al analyzed small RNA sequencing data with the miRBase version 22 annotation of mature miRNAs [4]. miRBase is hosted and maintained at the University of Manchester, Faculty of Life Sciences, United Kingdom, with funding from the BBSRC, and previously by the Wellcome Trust Sanger Institute.
The current version is 22.1, released in 2019 (version 21 was released in 2014). This version contains 38,589 entries representing hairpin precursor miRNAs expressing 48,885 mature miRNA products, in 271 species. The database is actively maintained [3]. A significant number of entries, however, are likely false positives [5, 6].
miRBase is one of the expert databases participating in RNAcentral [7] and is widely used [6, 8].
MirGeneDB is a database of microRNA genes that have been validated and annotated by a group of European and American researchers [9], according to an annotation standard [10]. MirGeneDB 2.1 contains more than 1,500 miRNA families from 75 metazoan species, including 567 human, 452 mouse, 286 chicken and 414 zebrafish genes. It also includes expression data at different levels of granularity (whole animals, organs, tissues, and cell types), based on publicly available datasets. The data are publicly and freely available under the Creative Commons Zero license.
miRTarBase is a database of manually curated and experimentally validated miRNA-target interactions (MTIs) [11]. Originally the database contained over 3,500 MTIs validated experimentally by reporter assays, western blots or microarray experiments. The most recent update in 2021 [12] includes 2,200,449 miRNA-target interactions for 4,630 miRNAs from 37 species based on 13,389 articles, including 16,257 MTIs validated by reporter assays and 14,665 interactions validated by western blots.
miRWalk (http://mirwalk.umm.uni-heidelberg.de/) is a comprehensive database that provides information on human, mouse and rat miRNAs, covering both predicted and validated binding sites on their target genes [13, 14]. From Ruprecht-Karls-Universität Heidelberg, Medizinische Fakultät Mannheim, Germany.
Pandolfini L et al, for example, used the miRWalk 2.0 web server to identify in silico the target mRNAs of specific miRNAs based on 6 prediction algorithms (miRWalk, miRanda, miRDB, Pictar2, RNA22 and Targetscan) for their investigation of m7G methylation in let-7 microRNA [4].
The webpage was last updated in January, 2022.
miRGen, part of the DIANA lab tools, is a database suited to miRNA transcription regulation studies. It includes cell-line-specific miRNA gene transcription start sites (TSSs) and genome-wide maps of transcription factor (TF) binding sites [15].
TransmiR (http://www.cuilab.cn/transmir) is a database of microRNA regulation by transcription factors [16]. From Beijing University, China. The most recent version, 2.0, updated in May 2018, contains 3,730 entries covering ~623 transcription factors, ~785 miRNAs, and 19 organisms from 1,349 publications. Also included are 1,785,998 TF-miRNA regulations derived from ChIP-seq evidence in 5 species.
miRCarta, a central repository of miRNAs, includes both the miRBase entries and miRNA candidates predicted by the miRMaster prediction program [17]. It was last updated in July 2018.
YM500 is a small RNA sequencing (smRNA-seq) database for microRNA research. The most recent version, YM500v2, was released in 2015 [18].
PolymiRTS, from the University of Tennessee Health Science Center, United States, is a database of naturally occurring DNA variations in both predicted and validated miRNA target sites [19]. The data can be searched or downloaded.
miRGator (http://mirgator.kobic.re.kr/) is a navigator tool for the functional interpretation of miRNAs [20, 21]. Functional analyses and expression profiling are integrated with target gene prediction to infer the biological function of miRNAs. Version 3.0 contains 73 deep sequencing datasets on human samples from the GEO, SRA, and TCGA archives, with 4.1 billion short reads and 2.5 billion aligned reads. From Ewha Womans University, Seoul, Korea.
doRiNA (dorina.mdc-berlin.de) is a database of RNA interactions in post-transcriptional regulation [22, 23]. "In animals, RNA binding proteins (RBPs) and microRNAs (miRNAs) post-transcriptionally regulate the expression of virtually all genes by binding to RNA. Recent advances in experimental and computational methods facilitate transcriptome-wide mapping of these interactions. It is thought that the combinatorial action of RBPs and miRNAs on target mRNAs form a post-transcriptional regulatory code. We provide a database that supports the quest for deciphering this regulatory code. Within doRiNA, we are systematically curating, storing and integrating binding site data for RBPs and miRNAs. Users are free to take a target (mRNA) or regulator (RBP and/or miRNA) centric view on the data. We have implemented a database framework with short query response times for complex searches (e.g., asking for all targets of a particular combination of regulators). All search results can be browsed, inspected and analyzed in conjunction with a huge selection of other genome-wide data, because our database is directly linked to a local copy of the UCSC genome browser. At the time of writing, doRiNA encompasses RBP data for the human, mouse and worm genomes. For computational miRNA target site predictions, we provide an update of PicTar predictions."
The database was last updated in November 2014.
STarMirDB, from the Wadsworth Center in New York State, is a collection of microRNA binding sites predicted with the STarMir algorithm (see below) [24]. Predictions that are supported by actual CLIP data are indicated.
Vir-Mir (alk.ibms.sinica.edu.tw/cgi-bin/miRNA/miRNA.cgi) is a database containing predicted viral miRNA candidate hairpins [25]. From the Institute of BioMedical Science, Academia Sinica, Taipei, Taiwan. The site appears to have been last updated in 2007.
ViTa (vita.mbc.nctu.edu.tw) is a collection of viral data from miRBase, ICTV, VirGen, VBRC, etc., including known viral miRNAs and supporting host miRNA targets predicted by miRanda and TargetScan [26]. ViTa also provides annotations such as human miRNA expression and virus-infected tissues. From the Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan. The site appears to have been last updated in 2007.
S-MED (www.oncomir.umn.edu/SMED/index.php) is a repository of expression data for microRNAs in different types of human sarcoma tumor and select normal tissues. From University of Minnesota, United States.
miRPath is part of the DIANA lab tools (http://diana.imis.athena-innovation.gr/DianaTools/index.php?r=site/index). It is a web-based computational tool and database developed to identify microRNA-regulated molecular pathways and determine the functional roles of miRNAs in seven species (Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Caenorhabditis elegans, Gallus gallus and Danio rerio) [27]. It incorporates more than 600,000 experimentally validated miRNA targets from DIANA-TarBase v7.0 and is directly linked to DIANA and other external tools or databases.
SomamiR is a database for studying the roles of miRNAs in cancer [28]. It contains “germline and somatic mutations in miRNAs and their targets that have been experimentally shown to impact miRNA function and have been associated with cancer.” The 2016 update, SomamiR 2.0 [29], also includes somatic mutations affecting miRNA-circRNA and miRNA-lncRNA interactions, and a list of somatic mutations in the miRNA seed regions. The webserver miR2GO was integrated for functional analysis of the somatic mutations.
The miRGate database contains novel computationally predicted miRNA-mRNA target pairs calculated with well-established algorithms [30]. It includes a complete dataset of miRNA sequences and mRNA 3'-untranslated region sequences from human (including human viruses), mouse and rat, and experimentally validated data from other known databases.
There are multiple miRNA target prediction platforms. Researchers often utilize several platforms to generate a list of target genes for an miRNA. For example, Sillivan SE et al predicted target genes for miR-135b-5p with TargetScan, DIANA and miRDB [31].
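When several platforms are used, one common way to combine them is to keep only genes reported by a minimum number of tools. The following is a minimal sketch of that idea, not the procedure of any particular study; the function name `consensus_targets`, the `min_tools` parameter and the gene lists are hypothetical and for illustration only.

```python
from collections import Counter


def consensus_targets(predictions, min_tools=2):
    """Return genes predicted by at least `min_tools` of the given tools."""
    counts = Counter()
    for genes in predictions.values():
        counts.update(set(genes))  # count each gene at most once per tool
    return sorted(gene for gene, n in counts.items() if n >= min_tools)


# Toy, made-up gene lists; real lists would come from each tool's export.
predictions = {
    "TargetScan": ["GENE_A", "GENE_B", "GENE_C"],
    "DIANA": ["GENE_B", "GENE_C", "GENE_D"],
    "miRDB": ["GENE_C", "GENE_E"],
}

print(consensus_targets(predictions, min_tools=2))  # ['GENE_B', 'GENE_C']
print(consensus_targets(predictions, min_tools=3))  # ['GENE_C']
```

A stricter `min_tools` shortens the list and usually reduces false positives, at the cost of missing targets that only one algorithm can detect.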
miRDB (mirdb.org/miRDB/) is an online microRNA target prediction database [32]. All targets are predicted by MirTarget, which employs a support vector machine (SVM) to analyze thousands of miRNA-target interactions [33, 34]. The current version 6.0, released in June 2019 [35], is based on miRBase version 22 and MirTarget V4. It covers 7,086 microRNAs from human, mouse, rat, dog and chicken, with a total of 3,519,884 predicted targets. The whole data set can be downloaded [32].
TargetScan (www.targetscan.org/) predicts biological targets of miRNAs by searching for the presence of conserved 8mer and 7mer sites that match the seed region of each miRNA [36]. It is from Informatics and Research Computing, Whitehead Institute for Biomedical Research, United States. The most recent release is version 7.2, released in March 2018. The main home page is for mammalian target prediction; mouse, worm, fly, and fish have separate search pages linked from the home page. Moro A et al, for example, used TargetScan to link miRNA sequences identified in endothelial cell HITS-CLIP studies (with a pan-AGO2 antibody) to the HITS-CLIP mRNA sequences through putative miRNA recognition elements matching 8mer seed regions [37]. McGeary SE et al found it to be not sufficiently comprehensive, as of version 7 [38]. Others have also used it [39, 40].
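The seed-matching idea can be illustrated with a short sketch. It is not TargetScan's code and ignores conservation and context scoring; it only classifies canonical sites by the widely used definitions (8mer: match to miRNA positions 2-8 followed by an A opposite position 1; 7mer-m8: match to positions 2-8; 7mer-A1: match to positions 2-7 followed by an A). The function names and the toy UTR are assumptions made for illustration; the miRNA is the mature hsa-let-7a-5p sequence.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_seed_sites(mirna, utr):
    """Return (start, site_type) for canonical seed sites in a 3' UTR.

    `start` is the 0-based index of the first UTR base of the site.
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    core = reverse_complement(mirna[1:7])  # pairs with miRNA positions 2-7
    m8 = COMPLEMENT[mirna[7]]              # pairs with miRNA position 8
    sites = []
    start = utr.find(core)
    while start != -1:
        has_m8 = start > 0 and utr[start - 1] == m8
        has_a1 = start + 6 < len(utr) and utr[start + 6] == "A"
        if has_m8 and has_a1:
            sites.append((start - 1, "8mer"))
        elif has_m8:
            sites.append((start - 1, "7mer-m8"))
        elif has_a1:
            sites.append((start, "7mer-A1"))
        # A bare 6mer match (neither m8 nor A1) is not reported here.
        start = utr.find(core, start + 1)
    return sites


let7a = "UGAGGUAGUAGGUUGUAUAGUU"  # hsa-let-7a-5p
# Toy UTR assembled from spacer pieces and one site of each type.
utr = "AAGG" + "CUACCUCA" + "GGUUU" + "UACCUCA" + "GGCC" + "CUACCUC" + "GG"

print(find_seed_sites(let7a, utr))
# [(4, '8mer'), (17, '7mer-A1'), (28, '7mer-m8')]
```

Scanning for the shared 6-nucleotide core and then checking the two flanking positions keeps the three site types mutually exclusive, so an 8mer is never also counted as a 7mer.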
microPIECE predicts target genes of miRNAs in one species based on CLIP data from another species [41]. Mukherjee K et al searched the target genes for Galleria mellonella miRNAs based on microPIECE analysis of six Aedes aegypti AGO-CLIP libraries [42].
RNAhybrid (bibiserv.techfak.uni-bielefeld.de/rnahybrid/) is a tool for finding the minimum free energy hybridisation of a long and a short RNA [43]. The hybridisation is performed in a kind of domain mode, i.e., the short sequence is hybridised to the best-fitting part of the long one. The tool is used for microRNA target prediction. From Bielefeld University, Germany. Mukherjee K et al validated miRNA target prediction by microPIECE with RNAhybrid and RNA22 [42].
RNA22 [44, 45] is available from Computational Medicine Center, Thomas Jefferson University.
STarMir, from the Wadsworth Center, is a software program [46] that predicts miRNA binding sites based on sequence, thermodynamic and target structure features derived from CLIP data [24].
PicTar (www.pictar.org/) is an algorithm for the identification of microRNA targets [47]. The searchable website provides microRNA target predictions in vertebrates, seven Drosophila species and three nematode species, as well as human microRNA targets that are not conserved but co-expressed (i.e., the microRNA and mRNA are expressed in the same tissue). From the Rajewsky lab at NYU's Center for Comparative Functional Genomics and the Max Delbrück Centrum, Berlin, Germany.
DIANA-microT, part of the DIANA lab tools (http://diana.imis.athena-innovation.gr/DianaTools/index.php?r=site/index), is currently at its 5th version [48]. The webpage indicates that it has the "highest sensitivity at any level of specificity, when compared against other state-of-the-art implementations". "It also provides hyperlinks to on-line servers such as iHOP and expression data for the selected microRNAs in tissues and cell lines" and also links to KEGG pathways. The full dataset of predicted human and mouse target sites can be downloaded. The most recent version was built in July 2012.
TripletSVM (bioinfo.au.tsinghua.edu.cn/mirnasvm/) classifies a query sequence with a hairpin structure as either a real miRNA precursor or a pseudo hairpin [49]. The program is trained with the triplet element features of a set of real miRNA precursors and a set of pseudo-miRNA hairpins. From Tsinghua University, Beijing, China. The software is free to download.
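The triplet element features can be sketched as follows. Each feature combines the paired or unpaired status of three adjacent positions with the nucleotide in the middle, giving 4 x 8 = 32 possible elements whose frequencies describe a hairpin. This is a simplified illustration and not the TripletSVM program itself: it assumes the dot-bracket structure has already been produced by a folding tool, the function name and toy hairpin are hypothetical, and the original software may treat the terminal loop and sequence ends differently.

```python
from itertools import product


def triplet_features(seq, structure):
    """Return the 32 normalized triplet-element frequencies of a hairpin.

    `structure` is a dot-bracket string of the same length as `seq`.
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    seq = seq.upper().replace("T", "U")
    paired = structure.replace(")", "(")  # only paired vs unpaired matters
    keys = [nt + "".join(s) for nt in "ACGU" for s in product("(.", repeat=3)]
    counts = dict.fromkeys(keys, 0)
    for i in range(1, len(seq) - 1):
        key = seq[i] + paired[i - 1:i + 2]  # middle base + 3-position status
        if key in counts:                   # skips ambiguous bases such as N
            counts[key] += 1
    total = sum(counts.values())
    return {k: (v / total if total else 0.0) for k, v in counts.items()}


# Toy hairpin, made up for illustration.
features = triplet_features("GGGAAAUCCC", "(((....)))")
print(len(features))     # 32
print(features["A..."])  # 0.25
print(features["G((("])  # 0.125
```

The resulting 32-value vector is the kind of input a support vector machine could be trained on to separate real precursors from pseudo hairpins.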
PITA (genie.weizmann.ac.il/pubs/mir07/mir07_prediction.html) is a target prediction platform [50]. The last released catalog of predicted microRNA targets was on August 31, 2008, based on miRBase release 11.
Users can input UTR sequences into the web form to search for the predicted target sites, or download the executable to use locally.
miRTarVis is an interactive visual analysis tool that predicts and visualizes targets of miRNAs from miRNA-mRNA expression profile data. The resulting miRNA-mRNA network is visualized either as an interactive Treemap or as a conventional node-link diagram. As an example, Jung et al [51] reported the results of a miRNA-mRNA expression profile analysis of data from asthma patients.
SubmiRine predicts miRNA target site variants (miR-TSVs), which are genomic variants in miRNA target sites that are linked to multiple human diseases [52].
MiRComb is an R package that combines miRNA and mRNA expression data with hybridization information to find potential miRNA-mRNA targets in specific contexts (e.g., a specific disease) [53]. It acts as an efficient filter for the large volume of predicted miRNA-mRNA interactions available in databases, and it reports the results in an easy-to-read PDF file.
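The general idea of filtering predicted interactions with expression data can be sketched in a few lines. This is not the miRComb API (miRComb is an R package); it is a Python illustration under the assumption that a true target tends to be negatively correlated with its miRNA across samples. The function names, the `max_r` cutoff and all expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def candidate_pairs(mirna_expr, mrna_expr, predicted, max_r=-0.5):
    """Keep predicted miRNA-gene pairs whose expression is anti-correlated."""
    hits = []
    for mirna, gene in sorted(predicted):
        r = pearson(mirna_expr[mirna], mrna_expr[gene])
        if r <= max_r:
            hits.append((mirna, gene, round(r, 3)))
    return hits


# Toy expression values across five samples (made up).
mirna_expr = {"miR-toy": [1, 2, 3, 4, 5]}
mrna_expr = {
    "GENE_A": [10, 8, 6, 4, 2],  # falls as the miRNA rises
    "GENE_B": [1, 2, 3, 4, 5],   # rises with the miRNA
    "GENE_C": [5, 1, 4, 2, 3],   # not a predicted target
}
predicted = {("miR-toy", "GENE_A"), ("miR-toy", "GENE_B")}

hits = candidate_pairs(mirna_expr, mrna_expr, predicted)
print(hits)  # [('miR-toy', 'GENE_A', -1.0)]
```

Only pairs that are both predicted by sequence and anti-correlated in the samples at hand survive, which is what makes the result context-specific.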
miRDeep2 identifies and profiles miRNAs in deep sequencing data [54]. Mukherjee K et al, for example, identified Galleria mellonella miRNAs and calculated the expression levels of these miRNAs in larvae with miRDeep2 [42].
NovoAlign from Novocraft is designed to map short reads from high-throughput sequencing platforms, such as Illumina, Ion Torrent, and 454 NGS, onto a reference genome. Moro A et al, for example, aligned Illumina NGS reads from HITS-CLIP with a pan-AGO2 antibody against human miRNA sequences from miRBase (release 21) with NovoAlign, and identified the abundances of miRNA sequences in the samples [37].
WMD3 (wmd3.weigelworld.org) designs artificial microRNAs (amiRNAs) [55, 56]. The 21-mer amiRNAs can specifically silence single or multiple genes of interest in more than 90 plants. From the Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany. It appears to have been last updated in 2009.
MagiCMicroRNA is a user-friendly web interface for the AgiMicroRna R-package [57] for the systematic preprocessing and statistical analysis of Agilent miRNA arrays. It introduces a new data filtering approach exemplified on datasets of cancerous and normal tissues from 14 patients [58].
miRiadne is a web tool for miRNA annotation. It is based on miRBase versions 10 to 21, and manually curated annotations of 40 common profiling platforms from nine brands [59]. It uses the mature sequences of miRNAs to link miRBase versions and/or platforms to prevent nomenclature ambiguities.
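Linking names through the mature sequence can be illustrated with a small sketch. It is not miRiadne's implementation; it only shows why the sequence is a stable key when names change between releases. The function name, miRNA names and sequences below are all made up for illustration.

```python
def link_by_sequence(old_release, new_release):
    """Map each name in `old_release` to names in `new_release`
    that share exactly the same mature sequence."""
    by_seq = {}
    for name, seq in new_release.items():
        by_seq.setdefault(seq.upper(), []).append(name)
    return {
        name: sorted(by_seq.get(seq.upper(), []))
        for name, seq in old_release.items()
    }


# Toy releases with invented names and sequences.
old_release = {
    "toy-miR-1": "UGAGGUAGUAGG",
    "toy-miR-2*": "CUAUACAACC",
    "toy-miR-3": "AAAACCCCGGGG",
}
new_release = {
    "toy-miR-1-5p": "UGAGGUAGUAGG",
    "toy-miR-2-3p": "CUAUACAACC",
}

links = link_by_sequence(old_release, new_release)
print(links["toy-miR-1"])   # ['toy-miR-1-5p']
print(links["toy-miR-2*"])  # ['toy-miR-2-3p']
print(links["toy-miR-3"])   # []
```

An empty list flags a name whose sequence is absent from the newer release, for example an entry that was removed or whose sequence was revised.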
miRNA Digger is a software program for discovering novel miRNAs from available high-throughput sequencing (HTS) data. The application is based on screening for cleavage signals in miRNA precursors mapped by degradome sequencing [60]. As a test, Yu et al applied miRNA Digger to discover miRNAs from Arabidopsis. In addition to recovering most known miRNAs, they identified 30 novel miRNA-miRNA* duplexes not registered in miRBase.
Web resources for the study of miRNAs are numerous and their number is continuously rising, making compilations like this very useful. The Nucleic Acids Research (NAR) journal maintains a list of about 1000 biological databases (http://www.oxfordjournals.org/our_journals/nar/database/a/), which includes databases on miRNA. These databases have been described in its annual database issue. Comparative reviews can also be very informative. Chipman LB and Pasquinelli AE reviewed noncanonical base pairing of the 3' half of miRNAs with target mRNAs and its effect on the stability of the miRNAs [61].
- Nam S, Kim B, Shin S, Lee S. miRGator: an integrated system for functional annotation of microRNAs. Nucleic Acids Res. 2008;36:D159-64
- Li S, Shiau C, Lin W. Vir-Mir db: prediction of viral microRNA candidate hairpins. Nucleic Acids Res. 2008;36:D184-9
- Hsu P, Lin L, Hsu S, Hsu J, Huang H. ViTa: prediction of host microRNAs targets on viruses. Nucleic Acids Res. 2007;35:D381-5
- Wang X, El Naqa I. Prediction of both conserved and nonconserved microRNA targets in animals. Bioinformatics. 2008;24:325-32
- Amsel et al. microPIECE - microRNA pipeline enhanced by CLIP experiments. Journal of Open Source Software. 2018;3(24):616. Available from: doi.org/10.21105/joss.00616
- Rehmsmeier M, Steffen P, Hochsmann M, Giegerich R. Fast and effective prediction of microRNA/target duplexes. RNA. 2004;10:1507-17
- Miranda K, Huynh T, Tay Y, Ang Y, Tam W, Thomson A, et al. A pattern-based method for the identification of MicroRNA binding sites and their corresponding heteroduplexes. Cell. 2006;126:1203-17
- Chen K, Rajewsky N. Natural selection on human microRNA binding sites inferred from SNP data. Nat Genet. 2006;38:1452-6
- Xue C, Li F, He T, Liu G, Li Y, Zhang X. Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine. BMC Bioinformatics. 2005;6:310
- Kertesz M, Iovino N, Unnerstall U, Gaul U, Segal E. The role of site accessibility in microRNA target recognition. Nat Genet. 2007;39:1278-84
- Schwab R, Ossowski S, Riester M, Warthmann N, Weigel D. Highly specific gene silencing by artificial microRNAs in Arabidopsis. Plant Cell. 2006;18:1121-33
- Materials and Methods [ISSN : 2329-5139] is a unique online journal with regularly updated review articles on laboratory materials and methods.