Authors:
J. E. Salazar Jiménez 1; J. D. Sánchez Carvajal 1; B. Quiros-Gómez 2 and J. D. Arias-Londoño 3
Affiliations:
1 Faculty of Engineering, Universidad de Antioquia, Colombia; 2 Unidad de Investigación e Innovación, Humax Pharmaceutical S.A., Colombia; 3 Universidad de Antioquia, Colombia
Keyword(s):
Automatic Feature Selection, Bioequivalence, Drug Development, Drug Dissolution Profile Prediction, Solid Oral Pharmaceutical Forms.
Abstract:
This work addressed the problem of dimensionality reduction in the drug dissolution profile prediction task.
The learning problem is treated as a multi-output learning task, since dissolution profiles are recorded at non-uniform
sampling times, which precludes the use of basic function-on-scalar regression approaches. Ensemble-based
tree methods are used both for prediction and for selecting the most relevant features, because
they can handle high-dimensional feature spaces when the number of training samples is small.
All the drugs considered correspond to rapid-release solid oral pharmaceutical forms. Six different feature
selection schemes were tested, including sequential feature selection and genetic algorithms, along with a
feature scoring procedure proposed to reach a consensus on the best subset of variables.
Performance was evaluated in terms of the similarity factor used in the drug industry for dissolution
profile comparison. The feature selection methods reduced the dimensionality of the feature space
by 79.2% without loss in the performance of the prediction system. The results confirm that, in the dissolution
profile prediction problem, especially for different solid oral pharmaceutical forms, variables from different
components and phases of drug development must be considered.
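
As an illustration of the evaluation metric mentioned in the abstract, the following is a minimal sketch that assumes the factor in question is the f2 similarity factor commonly found in regulatory dissolution guidance; the abstract does not give the formula. The function name `f2_similarity` and the sample profiles are hypothetical and are not taken from the paper.

```python
import math


def f2_similarity(reference, test):
    """Return the f2 similarity factor between two dissolution profiles.

    Assumption: both profiles are percentages dissolved, measured at the
    same sampling times. This is an illustrative sketch, not the authors' code.
    """
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(reference)
    # Mean squared difference between the two profiles across time points.
    mean_sq_diff = sum((r - t) ** 2 for r, t in zip(reference, test)) / n
    # f2 = 50 * log10(100 / sqrt(1 + mean squared difference))
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq_diff))


# Hypothetical measured and predicted profiles (percent dissolved).
measured = [22.0, 48.0, 71.0, 88.0, 95.0]
predicted = [25.0, 45.0, 74.0, 85.0, 96.0]
print(round(f2_similarity(measured, predicted), 2))
```

Under this definition, identical profiles give a value of 100, and values of 50 or above are conventionally read as indicating similar profiles.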