Authors:
Malik Yousef 1; Waleed Khalifa 2; İlhan Erkin Acar 3 and Jens Allmer 3,4
Affiliations:
1 Zefat Academic College, Israel; 2 The College of Sakhnin, Israel; 3 Izmir Institute of Technology, Turkey; 4 Bionia Incorporated, Turkey
Keyword(s):
MicroRNA, Target Prediction, Motif, Machine Learning.
Abstract:
A disease phenotype is often due to dysregulation of gene expression. Post-transcriptional regulation of protein
abundance by microRNAs (miRNAs) is, therefore, of high importance in, for example, cancer studies.
MicroRNAs provide a complementary sequence to their target messenger RNA (mRNA) as part of a complex
molecular machinery. Known miRNAs and targets are listed in miRTarBase for a variety of organisms. The
experimental detection of such pairs is convoluted; their computational detection is therefore desirable but
is complicated by missing negative data. For machine learning, many features for parameterization of
the miRNA targets are available and k-mers and sequence motifs have previously been used. Unrelated
organisms like intracellular pathogens and their hosts may communicate via miRNAs and, therefore, we
investigated whether miRNA targets from one species can be differentiated from miRNA targets of another.
To this end, we employed target information of one species as positive and the other as negative
training and testing data. Models of species with higher evolutionary distance generally achieved better results
of up to 97% average accuracy (mouse versus Caenorhabditis elegans) while more closely related species did
not lead to successful models (human versus mouse; 60%). In the future, when more targeting data becomes
available, models can be established which will be able to more precisely determine miRNA targets in host-pathogen
systems using this approach.
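
The k-mer parameterization and the positive/negative labeling by species described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names kmer_features and build_dataset, the choice of k, the RNA alphabet, and the toy sequences in the usage note are all assumptions made for illustration, and sequence motifs and the classifier itself are left out.

```python
from collections import Counter
from itertools import product


def kmer_features(seq, k=3, alphabet="ACGU"):
    """Return normalized k-mer frequencies of seq in a fixed vocabulary order.

    Hypothetical helper: the abstract names k-mers as features but does not
    state k or any normalization, so both are assumptions here.
    """
    seq = seq.upper().replace("T", "U")  # treat DNA-style input as RNA
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in vocab)  # ignores k-mers with other symbols
    return [counts[w] / total if total else 0.0 for w in vocab]


def build_dataset(species_a_targets, species_b_targets, k=3):
    """Label targets of one species 1 (positive) and of the other 0 (negative)."""
    sequences = list(species_a_targets) + list(species_b_targets)
    X = [kmer_features(s, k) for s in sequences]
    y = [1] * len(species_a_targets) + [0] * len(species_b_targets)
    return X, y
```

Calling build_dataset(["ACGUACGU"], ["GGGGCCCC"], k=2) on these two made-up sequences yields two 16-dimensional feature vectors with labels [1, 0]; such a matrix could then be split into training and testing data for any two-class learner.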