Authors:
José Fernando dos Anjos Rodrigues (1); Letícia Martins Raposo (1, 2) and Flavio Fonseca Nobre (1)
Affiliations:
(1) Programa de Engenharia Biomédica, Universidade Federal do Rio de Janeiro, Av. Horácio Macedo, 2030, Rio de Janeiro, Brazil
(2) Departamento de Métodos Quantitativos, Universidade Federal do Estado do Rio de Janeiro, Av. Pasteur, 458, Rio de Janeiro, Brazil
Keyword(s):
Clinical Applications, HIV, Viral Tropism, Genotypic Classifiers.
Abstract:
The pathway of human immunodeficiency virus (HIV) infection depends on the composition of a 35-amino-acid variable region of its envelope, known as the V3 loop. Since this discovery, many tools have been developed to diagnose and predict viral tropism, from biochemical tests to various computational algorithms. To date, the greatest difficulty in developing such tools is the correct prediction of X4- or R5X4-tropic virions. In this study, we evaluated some of the recommended prediction criteria and proposed a random forest-based approach for better prediction of X4-capable viruses (i.e., either X4-only or R5X4 dual/mixed tropism). All methods achieved a specificity higher than 87%, with geno2pheno 2.5% showing the best performance (98.2%). However, its sensitivity (73.3%) was lower than that of the other approaches. The highest sensitivity (90.1%) was attained by our Complete Model with an undersampling strategy. The accuracy of all approaches ranged from 87.4% to 93.0%. The Complete Model with oversampling and the Reduced Model with no balancing showed the highest Matthews correlation coefficient (MCC), 0.796 for both. Considering the error rates and the number of explanatory variables, our main objective of increasing the ability to predict viral specimens with X4 tropism was achieved.
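The four metrics reported in the abstract (sensitivity, specificity, accuracy and MCC) can all be derived from a binary confusion matrix in which X4-capable is treated as the positive class. The following is a minimal sketch under that assumption; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, accuracy and MCC from confusion-matrix counts.

    Assumes the positive class is "X4-capable" and the negative class is "R5-only".
    """
    total = tp + fn + tn + fp
    # Sensitivity: fraction of true X4-capable samples predicted as X4-capable.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true R5-only samples predicted as R5-only.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the study's data).
metrics = binary_metrics(tp=90, fn=10, tn=180, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it is less sensitive to class imbalance than accuracy, which is presumably why it is reported alongside the other metrics when comparing balanced and unbalanced training strategies.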