of bias stems from skin color differences, which are
most distinctly observed between Black and White
populations. This insight led us to narrow our focus
to these two groups, allowing a clearer quantification
of bias.
The XGBoost classifier achieved the highest accuracy, reaching 72.60%
when predicting race using SpO₂ and other parameters, with SpO₂
consistently ranking as a top predictive feature and showing minimal
variability across iterations. When SpO₂ was excluded from the model,
accuracy dropped from 72.60% to 70.12%. Additionally, introducing
delta SpO₂, the difference between SpO₂ and SaO₂, slightly improved
accuracy to 72.94%, indicating that the bias arises not only from
individual SpO₂ or SaO₂ values but from their interrelation. Our
findings reinforce existing clinical evidence showing that Black
patients are more susceptible to undetected hypoxemia when SpO₂ is
used as the sole diagnostic tool. While our analysis demonstrated the
ability of machine learning models to detect and quantify bias through
feature importance analysis, we emphasize that SpO₂ discrepancies
cannot be fully addressed without more granular data, such as direct
skin color measurements. These insights suggest that
race serves as a reasonable surrogate for skin color in
current datasets, but future datasets must incorporate
explicit skin pigmentation data to enable more precise
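The feature-ablation comparison described above (with SpO₂, without SpO₂, and with delta SpO₂ added) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than clinical, the group labels 0 and 1 are hypothetical, delta SpO₂ is taken as SpO₂ minus SaO₂, the three feature sets are guesses at a plausible setup, and a small Gaussian naive Bayes classifier stands in for XGBoost so that the example needs only the standard library. The resulting accuracies are therefore not comparable to the figures reported here.

```python
import math
import random
from statistics import mean, pvariance


def make_toy_rows(n=600, seed=0):
    """Generate synthetic rows (NOT clinical data) for two hypothetical groups."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        group = i % 2  # alternate groups so every slice stays balanced
        sao2 = rng.uniform(85.0, 99.0)
        # Toy assumption: SpO2 overestimates SaO2 more strongly in group 1.
        overestimate = 3.0 if group == 1 else 0.5
        spo2 = sao2 + overestimate + rng.gauss(0.0, 0.5)
        rows.append({
            "group": group,
            "spo2": spo2,
            "sao2": sao2,
            "heart_rate": rng.gauss(80.0, 10.0),
            "delta_spo2": spo2 - sao2,  # assumed sign convention
        })
    return rows


def fit_gaussian_nb(rows, features):
    """Estimate a per-group mean and variance for each feature."""
    model = {}
    for group in (0, 1):
        members = [r for r in rows if r["group"] == group]
        model[group] = {}
        for f in features:
            values = [r[f] for r in members]
            model[group][f] = (mean(values), pvariance(values) + 1e-9)
    return model


def predict(model, row, features):
    """Pick the group with the higher Gaussian log-likelihood (equal priors)."""
    def log_likelihood(group):
        total = 0.0
        for f in features:
            mu, var = model[group][f]
            total += -0.5 * math.log(2 * math.pi * var)
            total -= (row[f] - mu) ** 2 / (2 * var)
        return total
    return max((0, 1), key=log_likelihood)


def holdout_accuracy(rows, features, train_fraction=0.8):
    """Train on the first slice of rows and score on the remainder."""
    split = int(len(rows) * train_fraction)
    train, test = rows[:split], rows[split:]
    model = fit_gaussian_nb(train, features)
    hits = sum(predict(model, r, features) == r["group"] for r in test)
    return hits / len(test)


rows = make_toy_rows()
feature_sets = {
    "with_spo2": ["spo2", "sao2", "heart_rate"],
    "without_spo2": ["sao2", "heart_rate"],
    "with_delta_spo2": ["spo2", "sao2", "heart_rate", "delta_spo2"],
}
accuracies = {
    name: holdout_accuracy(rows, feats) for name, feats in feature_sets.items()
}
for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
```

Because the toy data are built so that only the SpO₂-SaO₂ gap depends on group, the sketch shows the mechanics of the comparison rather than reproducing any result; with real data the same loop would wrap whichever classifier and validation scheme the study uses.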
For future work, we propose exploring more robust solutions beyond
race-based corrections. Devices such as transcutaneous oxygen
monitors, like the prototype wearable developed by Vakhter et al.,
measure oxygen diffusion directly through the skin, bypassing the bias
introduced by skin pigmentation (Vakhter et al., 2023). Integrating
data from such devices with pulse oximetry could provide a more
accurate and skin-independent assessment of oxygen saturation.
However, our current dataset lacks specific skin color information,
which prevents the implementation of skin-specific corrections. In our
ongoing work, we will therefore continue to use race as a surrogate
for skin color, as a proof of concept to demonstrate the potential
effectiveness of machine learning-based corrections (Karli and Unluturk,
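As a rough illustration of what a group-wise correction could look like, the sketch below fits one least-squares line per group that maps SpO₂ to SaO₂ and then applies it to a new reading. This is a deliberately simple stand-in, not the correction model referred to above: the group names "A" and "B", the helper names, and the toy readings are all hypothetical, and a real correction would need paired clinical measurements and proper validation.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def fit_group_corrections(records):
    """Fit one SpO2 -> SaO2 line per group from (group, spo2, sao2) tuples."""
    by_group = {}
    for group, spo2, sao2 in records:
        xs, ys = by_group.setdefault(group, ([], []))
        xs.append(spo2)
        ys.append(sao2)
    return {group: fit_line(xs, ys) for group, (xs, ys) in by_group.items()}


def correct_spo2(corrections, group, spo2):
    """Apply the fitted line for the given group to a single SpO2 reading."""
    slope, intercept = corrections[group]
    return slope * spo2 + intercept


# Toy paired readings (hypothetical): SpO2 overestimates SaO2 by a fixed
# offset of 0.5 points in group "A" and 3.0 points in group "B".
readings = (90.0, 92.0, 94.0, 96.0, 98.0)
records = [("A", s, s - 0.5) for s in readings]
records += [("B", s, s - 3.0) for s in readings]

corrections = fit_group_corrections(records)
print(correct_spo2(corrections, "B", 95.0))
```

The same interface would work with skin pigmentation measurements in place of the group label once such data are available, which is the direction argued for above.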
This material is based upon work supported in part by
the National Science Foundation (NSF) under Grant
OAC-2203827, in part by the National Institutes of
Health (NIH) under Grant R01HL172293 and in part
by R21EB036329.
Bangash, M. N., Hodson, J., Evison, F., Patel, J. M., John-
ston, A. M., Gallier, S., Sapey, E., and Parekh, D.
(2022). Impact of ethnicity on the accuracy of mea-
surements of oxygen saturations: A retrospective ob-
servational cohort study. EClinicalMedicine, 48.
Biello, K. B., Rawlings, J., Carroll-Scott, A., Browne, R.,
and Ickovics, J. R. (2010). Racial disparities in age at
preventable hospitalization among US adults. American
Journal of Preventive Medicine, 38(1):54–60.
Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer,
W. P. (2002). SMOTE: synthetic minority over-sampling
technique. Journal of Artificial Intelligence Research,
16:321–357.
Delgado, C., Powe, N. R., Chertow, G. M., Grimes, B.,
and Johansen, K. L. (2024). Muscle mass and serum
creatinine concentration by race and ethnicity among
hemodialysis patients. Journal of the American Soci-
ety of Nephrology, 35(1):66–73.
Fawzy, A., Wu, T. D., Wang, K., Robinson, M. L., Zeger,
S. L., Tracy, R. P., Haynes, S. G., Krishnan, J. A., and
McEvoy, J. W. (2022). Racial and ethnic discrepancy
in pulse oximetry and delayed identification of treatment
eligibility among patients with COVID-19. JAMA
Internal Medicine, 182(7):730–738.
Feiner, J. R., Severinghaus, J. W., and Bickler, P. E. (2007).
Dark skin decreases the accuracy of pulse oxime-
ters at low oxygen saturation: The effects of oxime-
ter probe type and gender. Anesthesia & Analgesia,
Harskamp, R. E., Bekker, L., Himmelreich, J. C.,
De Clercq, L., Karregat, E. P., Sleeswijk, M. E., and
Lucassen, W. A. (2021). Performance of popular
pulse oximeters compared with simultaneous arterial
oxygen saturation or clinical-grade pulse oximetry: a
cross-sectional validation study in intensive care patients.
BMJ Open Respiratory Research, 8(1):e000939.
Jamali, H., Castillo, L. T., Morgan, C. C., Coult, J., Muham-
mad, J. L., Osobamiro, O. O., Parsons, E. C., and
Adamson, R. (2022). Racial disparity in oxygen satu-
ration measurements by pulse oximetry: evidence and
implications. Annals of the American Thoracic Soci-
ety, 19(12):1951–1964.
Johnson, A. E., Bulgarelli, L., Shen, L., Gayles, A., Shammout,
A., Horng, S., Pollard, T. J., Hao, S., Moody,
B., Gow, B., et al. (2023). MIMIC-IV, a freely accessible
electronic health record dataset. Scientific Data,
Johnson, A. E., Pollard, T. J., Shen, L., Lehman, L.-w. H.,
Feng, M., Ghassemi, M., Moody, B., Szolovits, P.,
Anthony Celi, L., and Mark, R. G. (2016). MIMIC-III,
a freely accessible critical care database. Scientific
Data, 3(1):1–9.
Jones, C. A., McQuillan, G. M., Kusek, J. W., Eberhardt,
M. S., Herman, W. H., Coresh, J., Salive, M., Jones,
C. P., and Agodoa, L. Y. (1998). Serum creatinine levels
in the US population: Third National Health and Nutrition
Examination Survey. American Journal of Kidney
Diseases, 32(6):992–999.
BIOSIGNALS 2025 - 18th International Conference on Bio-inspired Systems and Signal Processing