Fine-Tuning of Conditional Transformers Improves the Generation of Functionally Characterized Proteins

Marco Nicolini, Dario Malchiodi, Alberto Cabri, Emanuele Cavalleri, Marco Mesiti, Alberto Paccanaro, Peter Robinson, Justin Reese, Elena Casiraghi, Giorgio Valentini

2024

Abstract

Conditional transformers improve the generative capabilities of large language models (LLMs) by processing control tags that steer generation toward texts with specific features. Recently, a similar approach has been applied to the generation of functionally characterized proteins, by adding tags to the protein sequence that qualify its function (e.g., Gene Ontology terms) or other characteristics (e.g., its family or the species it belongs to). In this work, we show that fine-tuning conditional transformers, pre-trained on large corpora of proteins, on specific protein families can significantly enhance the prediction accuracy of the pre-trained models and can also generate new, potentially functional proteins that could enlarge the protein space explored by natural evolution. We obtained encouraging results on the phage lysozyme family of proteins, achieving statistically significantly better prediction results than the original pre-trained model. A comparative analysis of the primary and tertiary structures of the synthetic proteins generated by our model and of the natural ones shows that the fine-tuned model is able to generate biologically plausible proteins. Our results suggest that fine-tuned conditional transformers can be applied to other functionally characterized proteins for possible industrial and pharmacological applications.
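
The tagging idea described in the abstract can be illustrated with a minimal sketch: control tags are placed in front of the amino-acid sequence so that a conditional model sees them before the residues. The function name `build_conditioned_input`, the angle-bracket tag format, and the example tag `phage_lysozyme` are assumptions made for illustration only; the abstract does not specify the tag vocabulary or tokenization used by the authors.

```python
from typing import List, Sequence

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def build_conditioned_input(sequence: str, control_tags: Sequence[str]) -> List[str]:
    """Prepend control tags to a protein sequence, one token per residue.

    Hypothetical helper: the tag format "<tag>" is an assumption,
    not the format used in the paper.
    """
    seq = sequence.strip().upper()
    unknown = set(seq) - AMINO_ACIDS
    if unknown:
        raise ValueError(f"Non-standard residues: {sorted(unknown)}")
    # Tags come first so an autoregressive model conditions on them.
    return [f"<{tag}>" for tag in control_tags] + list(seq)


# Toy sequence fragment, not a real lysozyme.
tokens = build_conditioned_input("MKTAY", ["phage_lysozyme"])
print(tokens)
```

At generation time, a conditional model of this kind would typically be prompted with the tags alone and asked to continue with residues.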


Paper Citation


in Harvard Style

Nicolini M., Malchiodi D., Cabri A., Cavalleri E., Mesiti M., Paccanaro A., Robinson P., Reese J., Casiraghi E. and Valentini G. (2024). Fine-Tuning of Conditional Transformers Improves the Generation of Functionally Characterized Proteins. In Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIOINFORMATICS; ISBN 978-989-758-688-0, SciTePress, pages 561-568. DOI: 10.5220/0012567900003657


in Bibtex Style

@conference{bioinformatics24,
author={Marco Nicolini and Dario Malchiodi and Alberto Cabri and Emanuele Cavalleri and Marco Mesiti and Alberto Paccanaro and Peter Robinson and Justin Reese and Elena Casiraghi and Giorgio Valentini},
title={Fine-Tuning of Conditional Transformers Improves the Generation of Functionally Characterized Proteins},
booktitle={Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIOINFORMATICS},
year={2024},
pages={561-568},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012567900003657},
isbn={978-989-758-688-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIOINFORMATICS
TI - Fine-Tuning of Conditional Transformers Improves the Generation of Functionally Characterized Proteins
SN - 978-989-758-688-0
AU - Nicolini M.
AU - Malchiodi D.
AU - Cabri A.
AU - Cavalleri E.
AU - Mesiti M.
AU - Paccanaro A.
AU - Robinson P.
AU - Reese J.
AU - Casiraghi E.
AU - Valentini G.
PY - 2024
SP - 561
EP - 568
DO - 10.5220/0012567900003657
PB - SciTePress