Authors:
Saeed Khalilian 1; Zahra Moti 2; Arian Baloochestani 3; Yeganeh Hallaj 3; Alireza Chavosh 4 and Zahra Hemmatian 4
Affiliations:
1 Independent Researcher, Iran; 2 Independent Researcher, The Netherlands; 3 Independent Researcher, Norway; 4 MarWell Bio Inc., California, U.S.A.
Keyword(s):
Antibody, Nanobody, Complementarity Determining Region (CDR), SARS-CoV-2, COVID-19, Deep Generative Models, Transfer Learning, Bioinformatics, in-silico Screening.
Abstract:
The global impact of the COVID-19 pandemic underlines the importance of developing a competent machine learning (ML) approach that can rapidly design therapeutics and prophylactics, such as antibodies and nanobodies, against novel viral infections despite data shortages and sequence complexity. Here, we propose a novel end-to-end deep generative model, named VAEResTL, based on a convolutional Variational Autoencoder (VAE), a Residual Neural Network (ResNet), and Transfer Learning (TL), that can generate sequences of the third complementarity-determining region of the heavy chain (CDR-H3) for a novel target lacking sufficient training data. We further demonstrate that the proposed method generates CDR-H3 sequences for designing and developing therapeutic antibodies and nanobodies that can bind to different variants of SARS-CoV-2 despite the shortage of SARS-CoV-2 training data. The predicted CDR-H3 sequences are then screened and filtered on their developability parameters, namely viscosity, clearance, solubility, stability, and immunogenicity, through several in-silico steps, resulting in a list of highly optimized lead candidates.