Authors:
Malik Yousef 1; Dalit Levy 1 and Jens Allmer 2
Affiliations:
1 Community Information Systems, Zefat Academic College, Israel
2 Wageningen University and Research, Netherlands
Keyword(s):
MicroRNA, MicroRNA Target, Categorization, Sequence Features, Machine Learning.
Abstract:
Proteins define phenotypes, and their dysregulation leads to disease. Post-transcriptional regulation of protein
abundance can be achieved by microRNAs (miRNAs). Therefore, studying this mode of gene regulation is
of high importance. MicroRNAs interact with their target messenger RNA via hybridization within a
specialized molecular framework. Many miRNAs and their targets have been identified and are listed
in databases such as miRTarBase. Experimental identification of functional miRNA-mRNA pairs is
difficult; therefore, such pairs are often detected computationally, which is complicated by the lack of
negative data. Machine learning has been used for miRNA and target detection, and many features have been
described for miRNAs and miRNA:mRNA target duplexes, generally on a per-species basis. However, many
claims of cross-kingdom regulation via miRNAs have been made and, therefore, we were interested in whether
it is possible to differentiate among species based on the target sequence in the mRNA alone. Thus, we
investigated whether miRNA target sites within the 3’UTR can be differentiated between species based on
k-mer features only. Target information of one species was used as positive examples and that of the others as
negative examples to establish machine learning models. It was observed that few features were sufficient for
successful categorization of microRNA targets to species. For example, mouse versus Caenorhabditis
elegans reached up to 97% average accuracy over 100-fold cross-validation. The simplicity of the approach,
based on just k-mers, is promising for automatic categorization systems. In the future, this approach will
help scrutinize alleged cross-kingdom regulation via miRNAs with respect to miRNAs from one species
targeting mRNAs in another.
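The k-mer encoding and the one-species-versus-the-others labeling described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `kmer_features` and `label_one_vs_rest`, the default `k=2`, the normalization to relative frequencies, and the toy sequences are all assumptions made for this example, and the abstract does not state which classifier or k values were used.

```python
from itertools import product


def kmer_features(sequence, k=2, alphabet="ACGU"):
    """Return relative k-mer frequencies for one target-site sequence.

    Hypothetical helper: the vector has one entry per possible k-mer,
    in lexicographic order of the given alphabet.
    """
    seq = sequence.upper().replace("T", "U")  # accept DNA or RNA notation
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def label_one_vs_rest(targets_by_species, positive_species, k=2):
    """Build (X, y): targets of one species are positive, all others negative."""
    X, y = [], []
    for species, sequences in targets_by_species.items():
        for seq in sequences:
            X.append(kmer_features(seq, k))
            y.append(1 if species == positive_species else 0)
    return X, y


# Toy data, invented purely for illustration (not real target sites).
toy_targets = {
    "mouse": ["ACGUACGU", "AAGGCCUU"],
    "c_elegans": ["UUUUAAAA", "GGGGCCCC"],
}
X, y = label_one_vs_rest(toy_targets, positive_species="mouse", k=2)
```

The resulting `X` and `y` could then be passed to any standard classifier and evaluated with repeated cross-validation; which learner and evaluation settings suit real target data is left open here.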