Enhancing the Efficiency of the Grouping-Scoring-Modeling Framework with Statistical Pre-Scoring Component for Transcriptomic Data Analysis

Maham Khokhar, Burcu Bakir-Gungor, Malik Yousef

2025

Abstract

The advent of high-throughput transcriptomic technologies has produced vast datasets whose volume and complexity challenge current analytical methods. The Grouping-Scoring-Modeling (G-S-M) approach is a recent method that operates on groups (or clusters) of genes, combining prior biological knowledge with machine learning to detect the groups most significant for a classification task. G-S-M may need to score thousands or even tens of thousands of groups, which can degrade the speed and performance of the algorithm. In response, this study introduces the Pre-Scoring G-S-M model, an enhancement of the established G-S-M framework. The approach adds a Pre-Scoring component that uses the empirical Bayes methods of the Limma package to evaluate the transcriptomic data up front and then keeps a percentage-based selection of statistically significant gene groups. Aimed at reducing computational demand and streamlining feature selection, the model also addresses data redundancy by eliminating duplicate gene-disease associations. Applied to nine human gene expression datasets from the GEO database, the model showed promising results: it improved computational efficiency and analytical precision and selected fewer features per dataset than the traditional G-S-M approach, without compromising accuracy. These initial findings highlight the Pre-Scoring G-S-M model's potential to enhance transcriptomic data analysis and indicate a promising direction for future bioinformatics research.
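The pre-scoring idea described in the abstract (remove duplicate gene-disease associations, score each gene group statistically, keep only a top percentage of groups) can be illustrated with a minimal sketch. This is not the authors' implementation: Welch's t statistic stands in for Limma's moderated t statistic, the group score (mean absolute t over member genes) is an assumed aggregation, and all function names, parameter names, and data values are hypothetical.

```python
import math
from statistics import mean, variance


def dedupe_associations(pairs):
    """Drop repeated (gene, group) associations, keeping first-seen order."""
    return list(dict.fromkeys(pairs))


def welch_t(a, b):
    """Welch's t statistic; an assumed stand-in for Limma's moderated t."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def pre_score_groups(expr, labels, pairs, keep_percent):
    """Rank groups by mean |t| of their genes and keep the top keep_percent %."""
    # Build group -> genes from the de-duplicated associations.
    groups = {}
    for gene, group in dedupe_associations(pairs):
        if gene in expr:
            groups.setdefault(group, []).append(gene)

    # Score each group (assumed aggregation: mean absolute t over its genes).
    scores = {}
    for group, genes in groups.items():
        t_values = []
        for gene in genes:
            pos = [v for v, y in zip(expr[gene], labels) if y == 1]
            neg = [v for v, y in zip(expr[gene], labels) if y == 0]
            t_values.append(abs(welch_t(pos, neg)))
        scores[group] = mean(t_values)

    # Percentage-based selection: keep at least one group.
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_keep = max(1, math.ceil(len(ranked) * keep_percent / 100))
    return ranked[:n_keep]


# Toy data (hypothetical): three genes, six samples, two disease groups.
expr = {
    "G1": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
    "G2": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "G3": [3.0, 3.1, 2.9, 3.0, 2.8, 3.2],
}
labels = [0, 0, 0, 1, 1, 1]
pairs = [
    ("G1", "disease_A"),
    ("G1", "disease_A"),  # duplicate association, removed before scoring
    ("G2", "disease_B"),
    ("G3", "disease_B"),
]

selected = pre_score_groups(expr, labels, pairs, keep_percent=50)
print(selected)
```

Only the groups returned by `pre_score_groups` would then be passed on to the more expensive scoring stage, which is where the computational saving described in the abstract would come from under these assumptions.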


Paper Citation


in Harvard Style

Khokhar M., Bakir-Gungor B. and Yousef M. (2025). Enhancing the Efficiency of the Grouping-Scoring-Modeling Framework with Statistical Pre-Scoring Component for Transcriptomic Data Analysis. In Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIOINFORMATICS; ISBN 978-989-758-731-3, SciTePress, pages 479-488. DOI: 10.5220/0013192600003911


in Bibtex Style

@conference{bioinformatics25,
author={Maham Khokhar and Burcu Bakir-Gungor and Malik Yousef},
title={Enhancing the Efficiency of the Grouping-Scoring-Modeling Framework with Statistical Pre-Scoring Component for Transcriptomic Data Analysis},
booktitle={Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIOINFORMATICS},
year={2025},
pages={479-488},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013192600003911},
isbn={978-989-758-731-3},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 1: BIOINFORMATICS
TI - Enhancing the Efficiency of the Grouping-Scoring-Modeling Framework with Statistical Pre-Scoring Component for Transcriptomic Data Analysis
SN - 978-989-758-731-3
AU - Khokhar M.
AU - Bakir-Gungor B.
AU - Yousef M.
PY - 2025
SP - 479
EP - 488
DO - 10.5220/0013192600003911
PB - SciTePress