
of bias stems from skin color differences, which are
most distinctly observed between Black and White
populations. This insight led us to narrow our focus
to these two groups, allowing a clearer quantification
of bias.
The XGBoost classifier achieved the highest accuracy, reaching 72.60%
when predicting race using SpO2 and other parameters, with SpO2
consistently ranking as a top predictive feature and showing minimal
variability across iterations. When SpO2 was excluded from the model,
accuracy dropped from 72.60% to 70.12%. Additionally, introducing
delta SpO2, the difference between SpO2 and SaO2, slightly improved
accuracy to 72.94%, indicating that the bias arises not only from
individual SpO2 or SaO2 values but from their interrelation. Our
findings reinforce existing clinical evidence showing that Black
patients are more susceptible to undetected hypoxemia when SpO2 is
used as the sole diagnostic tool. While our analysis demonstrated the
ability of machine learning models to detect and quantify bias through
feature importance analysis, we emphasize that SpO2 discrepancies
cannot be fully addressed without more granular data, such as direct
skin color measurements. These insights suggest that race serves as a
reasonable surrogate for skin color in current datasets, but future
datasets must incorporate explicit skin pigmentation data to enable
more precise corrections.
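As an illustration only, the delta SpO2 feature and the accuracy changes reported above can be sketched as follows. The sign convention (SpO2 minus SaO2), the `PairedReading` type, the example readings, and the hidden-hypoxemia thresholds (SaO2 below 88% with SpO2 at or above 92%) are assumptions made for this sketch. They are not taken from our pipeline or dataset.

```python
from dataclasses import dataclass


@dataclass
class PairedReading:
    """One time-matched measurement pair (hypothetical container)."""
    spo2: float  # pulse-oximeter saturation, in percent
    sao2: float  # arterial blood gas saturation, in percent


def delta_spo2(reading: PairedReading) -> float:
    # Assumed sign convention: positive means the oximeter overestimates.
    return reading.spo2 - reading.sao2


def is_hidden_hypoxemia(reading: PairedReading,
                        sao2_below: float = 88.0,
                        spo2_at_least: float = 92.0) -> bool:
    # Thresholds are illustrative assumptions, not values from this study.
    return reading.sao2 < sao2_below and reading.spo2 >= spo2_at_least


# Made-up readings, for demonstration only.
readings = [
    PairedReading(spo2=97.0, sao2=96.0),
    PairedReading(spo2=94.0, sao2=86.5),
    PairedReading(spo2=90.0, sao2=91.0),
]
deltas = [delta_spo2(r) for r in readings]
hidden = [is_hidden_hypoxemia(r) for r in readings]

# Accuracies reported in the text, in percent.
acc_full = 72.60          # SpO2 and other parameters
acc_without_spo2 = 70.12  # SpO2 excluded
acc_with_delta = 72.94    # delta SpO2 added

# Differences in percentage points.
drop_without_spo2 = round(acc_full - acc_without_spo2, 2)
gain_with_delta = round(acc_with_delta - acc_full, 2)

print(deltas, hidden, drop_without_spo2, gain_with_delta)
```

In this sketch, excluding SpO2 costs 2.48 percentage points of accuracy, while adding delta SpO2 gains 0.34 points over the full model.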
For future work, we propose exploring more robust solutions beyond
race-based corrections. Devices such as transcutaneous oxygen
monitors, like the prototype wearable developed by Vakhter et al.,
measure oxygen diffusion directly through the skin, bypassing the bias
introduced by skin pigmentation (Vakhter et al., 2023). Integrating
data from such devices with pulse oximetry could provide a more
accurate and skin-independent assessment of oxygen saturation.
However, in our current dataset, we lack specific skin color
information, preventing the implementation of skin-specific
corrections. As a result, in our ongoing work, we will continue to use
race as a surrogate for skin color as a proof of concept to
demonstrate the potential effectiveness of machine learning-based
corrections (Karli and Unluturk, 2024).
ACKNOWLEDGEMENTS
This material is based upon work supported in part by the National
Science Foundation (NSF) under Grant OAC-2203827, in part by the
National Institutes of Health (NIH) under Grant R01HL172293, and in
part by Grant R21EB036329.
REFERENCES
Bangash, M. N., Hodson, J., Evison, F., Patel, J. M., Johnston, A. M.,
Gallier, S., Sapey, E., and Parekh, D. (2022). Impact of ethnicity on
the accuracy of measurements of oxygen saturations: A retrospective
observational cohort study. EClinicalMedicine, 48.
Biello, K. B., Rawlings, J., Carroll-Scott, A., Browne, R., and
Ickovics, J. R. (2010). Racial disparities in age at preventable
hospitalization among US adults. American Journal of Preventive
Medicine, 38(1):54–60.
Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer, W. P.
(2002). SMOTE: Synthetic minority over-sampling technique. Journal of
Artificial Intelligence Research, 16:321–357.
Delgado, C., Powe, N. R., Chertow, G. M., Grimes, B., and Johansen,
K. L. (2024). Muscle mass and serum creatinine concentration by race
and ethnicity among hemodialysis patients. Journal of the American
Society of Nephrology, 35(1):66–73.
Fawzy, A., Wu, T. D., Wang, K., Robinson, M. L., Zeger, S. L., Tracy,
R. P., Haynes, S. G., Krishnan, J. A., and McEvoy, J. W. (2022).
Racial and ethnic discrepancy in pulse oximetry and delayed
identification of treatment eligibility among patients with COVID-19.
JAMA Internal Medicine, 182(7):730–738.
Feiner, J. R., Severinghaus, J. W., and Bickler, P. E. (2007). Dark
skin decreases the accuracy of pulse oximeters at low oxygen
saturation: The effects of oximeter probe type and gender. Anesthesia
& Analgesia, 105(6):S18–S23.
Harskamp, R. E., Bekker, L., Himmelreich, J. C., De Clercq, L.,
Karregat, E. P., Sleeswijk, M. E., and Lucassen, W. A. (2021).
Performance of popular pulse oximeters compared with simultaneous
arterial oxygen saturation or clinical-grade pulse oximetry: A
cross-sectional validation study in intensive care patients. BMJ Open
Respiratory Research, 8(1):e000939.
Jamali, H., Castillo, L. T., Morgan, C. C., Coult, J., Muhammad,
J. L., Osobamiro, O. O., Parsons, E. C., and Adamson, R. (2022).
Racial disparity in oxygen saturation measurements by pulse oximetry:
Evidence and implications. Annals of the American Thoracic Society,
19(12):1951–1964.
Johnson, A. E., Bulgarelli, L., Shen, L., Gayles, A., Shammout, A.,
Horng, S., Pollard, T. J., Hao, S., Moody, B., Gow, B., et al. (2023).
MIMIC-IV, a freely accessible electronic health record dataset.
Scientific Data, 10(1):1.
Johnson, A. E., Pollard, T. J., Shen, L., Lehman, L.-w. H., Feng, M.,
Ghassemi, M., Moody, B., Szolovits, P., Celi, L. A., and Mark, R. G.
(2016). MIMIC-III, a freely accessible critical care database.
Scientific Data, 3(1):1–9.
Jones, C. A., McQuillan, G. M., Kusek, J. W., Eberhardt, M. S.,
Herman, W. H., Coresh, J., Salive, M., Jones, C. P., and Agodoa, L. Y.
(1998). Serum creatinine levels in the US population: Third National
Health and Nutrition Examination Survey. American Journal of Kidney
Diseases, 32(6):992–999.
BIOSIGNALS 2025 - 18th International Conference on Bio-inspired Systems and Signal Processing