McKinney, W. et al. (2011). pandas: a foundational Python
library for data analysis and statistics. Python for high
performance and scientific computing, 14(9):1–9.
Medina, D., Sepulveda-Yanez, J., Alvarez-Saravia, D.,
Uribe-Paredes, R., Veelken, H., and Navarrete, M.
(2023). Artificial intelligence approach for the discovery
of autoantigen recognition by B-cell lymphomas.
Blood, 142:125.
Medina-Ortiz, D., Contreras, S., Amado-Hinojosa, J.,
Torres-Almonacid, J., Asenjo, J. A., Navarrete, M.,
and Olivera-Nappa, A. (2020a). Combination of dig-
ital signal processing and assembled predictive mod-
els facilitates the rational design of proteins. arXiv
preprint arXiv:2010.03516.
Medina-Ortiz, D., Contreras, S., Amado-Hinojosa, J.,
Torres-Almonacid, J., Asenjo, J. A., Navarrete, M.,
and Olivera-Nappa, Á. (2022). Generalized property-based
encoders and digital signal processing facilitate
predictive tasks in protein engineering. Frontiers in
Molecular Biosciences, 9.
Medina-Ortiz, D., Contreras, S., Fernández, D.,
Soto-García, N., Moya, I., Cabas-Mora, G., and
Olivera-Nappa, Á. (2024). Protein language models and
machine learning facilitate the identification of
antimicrobial peptides. International Journal of Molecular
Sciences, 25(16):8851.
Medina-Ortiz, D., Contreras, S., Quiroz, C., and
Olivera-Nappa, Á. (2020b). Development of supervised
learning predictive models for highly non-linear biological,
biomedical, and general datasets. Frontiers in
Molecular Biosciences, 7:13.
Meier, J., Rao, R., Verkuil, R., Liu, J., Sercu, T., and Rives,
A. (2021). Language models enable zero-shot predic-
tion of the effects of mutations on protein function. In
Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.,
and Vaughan, J. W., editors, Advances in Neural Infor-
mation Processing Systems, volume 34, pages 29287–
29303. Curran Associates, Inc.
Mishra, A., Pokhrel, P., and Hoque, M. T. (2019).
StackDPPred: a stacking based prediction of DNA-binding
protein from sequence. Bioinformatics, 35(3):433–441.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,
Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P.,
Weiss, R., Dubourg, V., et al. (2011). Scikit-learn:
Machine learning in Python. Journal of Machine
Learning Research, 12:2825–2830.
Rahman, M. S., Shatabda, S., Saha, S., Kaykobad, M., and
Rahman, M. S. (2018). DPP-PseAAC: a DNA-binding
protein prediction model using Chou's general PseAAC.
Journal of Theoretical Biology, 452:22–34.
Rives, A., Meier, J., Sercu, T., Goyal, S., Lin, Z., Liu, J.,
Guo, D., Ott, M., Zitnick, C. L., Ma, J., and Fergus, R.
(2021). Biological structure and function emerge from
scaling unsupervised learning to 250 million protein
sequences. Proceedings of the National Academy of
Sciences, 118(15).
Roel-Touris, J., Bonvin, A. M., and Jiménez-García, B.
(2020). Lightdock goes information-driven.
Bioinformatics, 36(3):950–952.
Shadab, S., Khan, M. T. A., Neezi, N. A., Adilina, S., and
Shatabda, S. (2020). DeepDBP: deep neural networks
for identification of DNA-binding proteins. Informatics
in Medicine Unlocked, 19:100318.
Sharma, R., Kumar, S., Tsunoda, T., Kumarevel, T., and
Sharma, A. (2021). Single-stranded and double-stranded
DNA-binding protein prediction using HMM profiles.
Analytical Biochemistry, 612:113954.
Tan, C., Wang, T., Yang, W., and Deng, L. (2019). PredPSD:
a gradient tree boosting approach for single-stranded
and double-stranded DNA binding protein prediction.
Molecules, 25(1):98.
Wang, W., Sun, L., Zhang, S., Zhang, H., Shi, J., Xu,
T., and Li, K. (2017). Analysis and prediction of
single-stranded and double-stranded DNA binding proteins
based on protein sequences. BMC Bioinformatics,
18:1–10.
Wang, Y., Zhang, L., Huang, T., Wu, G.-R., Zhou, Q.,
Wang, F.-X., Chen, L.-M., Sun, F., Lv, Y., Xiong, F.,
et al. (2022a). The methyl-CpG-binding domain 2 facilitates
pulmonary fibrosis by orchestrating fibroblast
to myofibroblast differentiation. European Respiratory
Journal, 60(3).
Wang, Z., Gong, M., Liu, Y., Xiong, S., Wang, M.,
Zhou, J., and Zhang, Y. (2022b). Towards a better
understanding of TF-DNA binding prediction from genomic
features. Computers in Biology and Medicine,
149:105993.
Werner, M. H., Huth, J. R., Gronenborn, A. M., and Clore,
G. M. (1995). Molecular basis of human 46X,Y
sex reversal revealed from the three-dimensional solution
structure of the human SRY-DNA complex. Cell,
81(5):705–714.
Zaman, R., Chowdhury, S. Y., Rashid, M. A., Sharma, A.,
Dehzangi, A., Shatabda, S., et al. (2017). HMMBinder:
DNA-binding protein prediction using HMM
profile based features. BioMed Research International,
2017.
Zhang, J., Chen, Q., and Liu, B. (2020). iDRBP_MMC:
identifying DNA-binding proteins and RNA-binding proteins
based on multi-label learning model and motif-based
convolutional neural network. Journal of Molecular
Biology, 432(22):5860–5875.
Zhang, Q., Liu, P., Wang, X., Zhang, Y., Han, Y., and Yu,
B. (2021). StackPDB: predicting DNA-binding proteins
based on XGB-RFE feature optimization and stacked ensemble
classifier. Applied Soft Computing, 99:106921.
Zhang, Y., Bao, W., Cao, Y., Cong, H., Chen, B., and Chen,
Y. (2022). A survey on protein–DNA-binding sites in
computational biology. Briefings in Functional Genomics,
21(5):357–375.
KDIR 2024 - 16th International Conference on Knowledge Discovery and Information Retrieval