Machine Learning Methods for Phenotype Prediction from High-Dimensional, Low Population Aquaculture Data

Giovanni Faldani, Enrico Rossignolo, Eleonora Signor, Alessio Longo, Sara Faggion, Luca Bargelloni, Matteo Comin, Cinzia Pizzi

2025

Abstract

Recent research has increasingly focused on classification rules within the big data framework, yet many bioinformatics applications still address prediction problems that involve small-sample, high-dimensional data. In phenotype prediction, especially with the rise of large-scale genomic data, a central challenge arises from handling high-dimensional datasets where the number of genetic features (such as SNPs) far exceeds the sample size. A significant example of such high-dimensional, low-sample datasets is found in aquaculture, a rapidly growing sector within global food production and a crucial source of high-quality protein. This study uses data from an experiment performed on European seabass as a test case, focusing on predicting resistance to Viral Nervous Necrosis (VNN) as a specific phenotype of interest. We explore a range of machine learning techniques to address the complexities of high-dimensional data, from established methods like gradient boosting, SVM, and deep learning to newer approaches. This paper evaluates various methods for associating SNPs with phenotypic traits, benchmarking their performance on challenging aquaculture genomic data to provide insight into the effectiveness of these techniques.


Paper Citation


in Harvard Style

Faldani G., Rossignolo E., Signor E., Longo A., Faggion S., Bargelloni L., Comin M. and Pizzi C. (2025). Machine Learning Methods for Phenotype Prediction from High-Dimensional, Low Population Aquaculture Data. In Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIOINFORMATICS; ISBN 978-989-758-731-3, SciTePress, pages 638-646. DOI: 10.5220/0013248000003911


in Bibtex Style

@conference{bioinformatics25,
author={Giovanni Faldani and Enrico Rossignolo and Eleonora Signor and Alessio Longo and Sara Faggion and Luca Bargelloni and Matteo Comin and Cinzia Pizzi},
title={Machine Learning Methods for Phenotype Prediction from High-Dimensional, Low Population Aquaculture Data},
booktitle={Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIOINFORMATICS},
year={2025},
pages={638-646},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013248000003911},
isbn={978-989-758-731-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIOINFORMATICS
TI - Machine Learning Methods for Phenotype Prediction from High-Dimensional, Low Population Aquaculture Data
SN - 978-989-758-731-3
AU - Faldani G.
AU - Rossignolo E.
AU - Signor E.
AU - Longo A.
AU - Faggion S.
AU - Bargelloni L.
AU - Comin M.
AU - Pizzi C.
PY - 2025
SP - 638
EP - 646
DO - 10.5220/0013248000003911
PB - SciTePress