HealthAIDE: Developing an Audit Framework for AI-Generated
Online Health Information
Tahir Hameed
Girard School of Business, Merrimack College, North Andover, MA 01845, U.S.A.
https://orcid.org/0000-0002-6824-6803
Keywords: Online Health Information, Large Language Models, Generative AI, AI Auditing, Responsible AI.
Abstract: Online health information (OHI) encompasses a wide range of public-facing content, such as information on
symptoms, diseases, medications, and treatments, while online medical information (OMI) involves more
specialized and regulated content, including clinical trial data, surgical procedures, and medical research. OMI
generation and dissemination is held to stringent standards for accuracy, transparency, and explainability,
whereas OHI often requires information-seekers to independently evaluate credibility and relevance of the
information. The rise of generative AI or large language models (LLMs) has exacerbated this disparity, as
LLMs are primarily applied to public-domain OHI without sufficient safeguards, leaving users vulnerable to
misinformation, bias, and non-transparent outputs. This paper presents a systematic literature survey on the
usage of AI and LLMs in OHI, highlighting focus areas and critical gaps in developing a robust framework
for auditing AI-generated health information. The proposed HealthAIDE Framework defines four key pillars
for oversight: reliability and accuracy, trust and acceptance, security and safety, and equity and fairness. A
short but systematic review of AI-driven health information literature reveals areas of stronger focus, such as
accuracy and trust, and areas of weaker focus, such as misuse prevention and transparency. Addressing these
gaps through comprehensive audits will enable responsible evolution of AI-driven health information systems.
1 INTRODUCTION
Online Health Information (OHI) searches have
become a primary resource for individuals seeking to
check symptoms, explore treatment options and their
efficacy, and locate healthcare providers or clinics
(Hameed, 2018). With the rise of pre-trained large
language models (LLMs), OHI seeking, its use, and
corresponding health behavior are undergoing a
transformation (Clark et al., 2024; Yan et al., 2024).
Traditionally, users conducting OHI searches on
general-purpose or specialized healthcare search
engines faced issues such as information overload,
difficulty in distinguishing credible sources, and a
lack of personalization (Swar et al., 2017; Freeman et al., 2020). They were less confident about the information they received and were therefore cautious when using it. While generative-AI platforms address
some of these issues by offering personalized,
dynamic, and conversational interfaces, they also
introduce new complications while amplifying some
existing problems. For instance, misinformation
generated by AI hallucinations, delivered in a definitive tone, can lead to misuse of OHI or undesirable health behavior (Jin et al., 2023; Shen et al., 2024). User trust also suffers
due to a lack of transparency about the sources of
information, making it difficult for users to verify its
accuracy. Additionally, the risks of potential misuse of sensitive user data increase as queries are made openly on the internet rather than within secure electronic health record systems or clinical databases, raising concerns
about privacy and ethical use. These issues not only
magnify pre-existing concerns but also raise further
dilemmas around consent, privacy, and the accuracy of recommendations (Shi et al., 2024).
AI audits assess AI systems to ensure their
reliability, ethical integrity, and compliance with
regulations and standards. These audits aim to
identify and address potential risks associated with
AI, such as bias, security vulnerabilities, privacy
concerns, and transparency issues. By analyzing an
AI system's data, algorithms, and outputs, auditors
can uncover risks, assess the impact of the systems,
and recommend improvements to enhance
trustworthiness and accountability (Falco et al., 2021;
Li et al., 2024; Mökander, 2023).
AI audits are essential for fostering trust in AI
systems and ensuring compliance with governmental
regulations such as the General Data Protection
Regulation (GDPR) in the EU, the Health Insurance
Portability and Accountability Act (HIPAA), and the
California Consumer Privacy Act (CCPA) in the US
(Forcier et al., 2019; Mulgund et al., 2021). However,
the rapid development of generative-AI systems has
complicated the definition and scope of assessments
in an ever-evolving information landscape. Proposed
initiatives, such as the Algorithmic Accountability
Act and the AI Bill of Rights in the US, along with
the EU AI Act, represent significant efforts to guide
the design, use, and deployment of AI systems,
focusing on protecting citizens' rights (Blumenthal-
Barby, 2023; Veale & Zuiderveen Borgesius, 2021). These
regulations aim to pre-emptively identify potential
ethical, legal, and operational risks prior to
deployment and ensure ongoing monitoring of AI
systems post-implementation.
Even though OHI seeking is one of the fastest-developing areas leveraging generative AI and LLM-driven platforms, and one with the potential to cause significant harm through the dissemination of misinformation, AI audits are not yet fully adopted or deeply considered in this domain.
While substantial efforts are underway to support the
curation, auditing, and sharing of Online Medical
Information (OMI)—including biomedical content,
medical databases, and repositories used by
professionals—similar mechanisms are notably
absent in the OHI domain for patients and caregivers
(Li and Goel, 2024).
This paper introduces HealthAIDE, a preliminary framework designed to audit AI-
generated OHI. Section 2 begins by exploring the
rapidly changing landscape of OHI, followed by a
scoping review of existing AI auditing literature to
identify the critical components of an AI audit
framework. Section 3 provides a systematic literature
review of peer-reviewed studies on generative AI in
the OHI domain. It evaluates the strengths and
weaknesses of the current AI-driven OHI systems and
applications against the proposed HealthAIDE
framework. Finally, Section 4 draws conclusions and
discusses future directions for improving OHI
auditing practices.
As one of the earlier papers in this area, this work aims to provide a foundation for developing HealthAIDE into a comprehensive, scalable AI auditing framework that also addresses the softer side of appropriate communication with patients and their caregivers.
By addressing the pressing need for AI audits in the
OHI domain, we hope to pave the way for safer, more
reliable, and ethically governed generative AI-driven
health information systems.
2 BACKGROUND & SCOPE OF
AI AUDITING IN ONLINE
HEALTH INFORMATION
2.1 Emergence of LLM and Generative
AI OHI Platforms
LLMs such as OpenAI's GPT series, Meta's Llama, and Google's Bard have transformed OHI platforms
by enabling conversational, personalized, and
context-aware responses to user queries. These
models utilize huge datasets of medical literature and publicly available information to generate detailed responses tailored to individual needs (Shen et al., 2024; Yan et al., 2024). Unlike traditional search
engines, LLMs excel in synthesizing information
from multiple sources and providing it in user-
friendly formats. Unsurprisingly, generative-AI-based OHI platforms offer significant benefits, including
improved accessibility, efficiency, and interactivity.
Users can ask complex health-related questions and
receive coherent explanations, making it easier to
understand medical concepts. These tools have also
shown promise in underserved regions, where limited
access to healthcare professionals makes reliable
online information crucial. Despite these advantages,
the adoption of LLM-driven platforms has raised
ethical concerns about data privacy and the
trustworthiness of AI-generated advice, such as
diagnostics and treatment recommendations (Cocci et
al., 2024). Therefore, as in other sectors, there is a strong need to ensure that LLM-based OHI platforms operate responsibly under the oversight of robust regulatory frameworks (Meskó and Topol, 2023).
2.2 Auditing Generative AI and LLM-Based Information and Systems: Towards a Framework
Auditing has long been a critical governance
mechanism to ensure that information systems and
data are managed in compliance with technical, legal,
and ethical requirements established by
manufacturers, industry organizations, and
governments. Information systems auditing has
matured as a professional discipline with established
procedures and records (Champlain, 2003). However,
the rapid evolution of generative AI and LLMs in
recent years has outpaced the development of
appropriate governance measures, leaving gaps in
defining and implementing effective management
practices.
Weidinger et al.’s (2022) taxonomy of LLM risks
highlights several critical issues, including the
perpetuation of discrimination and biases, inadvertent
information hazards such as data leaks, malicious use
such as fraudulent scams, and environmental harms
caused by excessive computing power requirements,
among others. However, the two most prevalent
hazards they identify in generative AI are distorted
human-machine interactions and misinformation
hazards. The former pertains to users overestimating
the capabilities of LLMs, leading to their unsafe or
inappropriate use. The latter poses significant risks as
less-informed users may consume misleading
information, resulting in harm and potentially eroding
public trust in AI-generated content.
Mökander et al. (2023) proposed a three-layered
approach to auditing the outputs of LLM-based
systems, aiming to ensure their effectiveness from
technical, social, and legal perspectives. Their
framework emphasizes equal attention to mitigating
social and ethical risks associated with AI systems.
The authors recommend three types of audits:
governance audits for LLM providers, model audits
conducted before the release of pre-trained LLMs,
and application audits for specific scenarios where
LLMs are deployed. This comprehensive approach
suggests that AI system audits should evolve to
address not only technological and procedural aspects
but also complementary areas that ensure the
responsible use and long-term impact of AI systems
on users, societies, and the natural environment
(Mökander et al., 2023).
We now discuss the most critical aspects of AI oversight to develop a basic framework for auditing AI in the healthcare sector. First, ensuring the accuracy and
correctness of generated content is a critical factor in
the reliability of any AI system. Techniques such as
prompt engineering, querying, and probing serve as
robust methods for generating content and comparing
it against established benchmarks. However, content
must also be assessed on qualitative and semantic
aspects, including fluency, coherence, and relevance,
which significantly contribute to overall user
satisfaction (Davis et al., 2023). The reliability of AI system outputs also involves their broader applicability, i.e., scalability, as well as continuous learning to stay abreast of current knowledge. These aspects of reliability, accuracy, and access are primarily
addressed through technology and algorithmic audits,
which have consistently been the most prevalent and
central mechanism for evaluating AI systems.
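To illustrate how such benchmark probing might be operationalized, the sketch below scores model answers against vetted reference answers using token-overlap F1. It is a minimal example rather than a prescribed method: the benchmark item, the query_model callable, and the pass threshold are hypothetical placeholders, and a real accuracy audit would complement lexical overlap with semantic similarity measures and clinician review.

```python
# Minimal sketch of benchmark-based accuracy probing for an OHI system.
# BENCHMARK and query_model are hypothetical placeholders, not a real API.
from collections import Counter
from typing import Callable

# Hypothetical benchmark: health questions paired with vetted reference answers.
BENCHMARK = [
    ("How often can adults take ibuprofen?",
     "Every 6 to 8 hours, without exceeding the labeled daily maximum."),
]

def token_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1, a simple lexical proxy for answer correctness."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if not cand or not ref or overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def accuracy_pass_rate(query_model: Callable[[str], str],
                       threshold: float = 0.5) -> float:
    """Probe the model with each benchmark question and report the pass rate."""
    passed = sum(token_f1(query_model(q), ref) >= threshold
                 for q, ref in BENCHMARK)
    return passed / len(BENCHMARK)
```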
AI systems often rely on vast amounts of sensitive
and personal data. Ensuring that this data is protected
from unauthorized access, breaches, or cyberattacks
is fundamental to maintaining the integrity of AI
systems and safeguarding user privacy (Bala et al.,
2024). Robust data security measures prevent data
leaks, tampering, or misuse throughout the data lifecycle, from collection and storage to processing and deployment. These measures include encryption,
access controls, regular security testing, and secure
data handling practices. Security audits help identify
vulnerabilities, ensure compliance with data
protection regulations, and verify that AI systems are
equipped with the necessary safeguards to protect
sensitive data (Nankya et al., 2024).
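As a small illustration of one such safeguard, the sketch below encrypts a sensitive user query at rest using the symmetric Fernet scheme from Python's cryptography package. It is a deliberately narrow fragment: key management, rotation, and access control through a key management service are assumed to exist elsewhere.

```python
# Sketch: encrypting a sensitive OHI query at rest with Fernet, the symmetric
# authenticated-encryption recipe in the `cryptography` package.
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # placeholder; real systems fetch keys from a KMS
cipher = Fernet(key)

query = "Do my symptoms match early-stage diabetes?"
token = cipher.encrypt(query.encode())           # persist only the ciphertext
assert cipher.decrypt(token).decode() == query   # recoverable only with the key
```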
Similarly, harm prevention and misuse prevention
are essential components in ensuring that AI-
generated content does not lead to negative
consequences for individuals or society (Ellaham et
al., 2020). AI systems must be designed to avoid
producing outputs that could cause physical,
psychological, or social harm, such as promoting
harmful behaviors, spreading misinformation, or
enabling discrimination. Harm prevention strategies
include incorporating ethical guidelines into the AI’s
design, continuously monitoring outputs for
unintended negative effects, and ensuring that AI
models are trained to recognize and mitigate harmful
content. Misuse prevention focuses on safeguarding
against malicious or unethical use of AI systems, such
as using AI for fraud, manipulation, or the creation of
harmful content like deepfakes. To effectively
manage these risks, safety audits are required to
assess how well AI systems prevent harm and misuse.
These audits ensure that safeguards are in place, that
ethical considerations are followed, and that AI
systems are used responsibly, reducing the risk of
harm to individuals and society (Shneiderman, 2020).
Equity and fairness in AI systems rely heavily on
systematically evaluating the technical and
organizational practices surrounding their
development and deployment. This requires
mechanisms to identify, mitigate, and monitor biases
in datasets, models, and outputs while adhering to
ethical principles, regulatory requirements, and
industry standards that promote fairness and prevent
discrimination (Rajkomar et al., 2018; Ueda et al., 2024).
Ensuring accountability structures within AI providers
is essential to integrating equity and fairness
considerations into decision-making processes at all
levels. Transparency in algorithmic design, inclusivity
of training data, and representativeness of stakeholder
engagement efforts are critical aspects that must be
prioritized. Furthermore, effective monitoring post-
deployment is necessary to detect and address
unintended biases or inequities. To comprehensively
address these requirements, process and governance
audits emerge as a vital need, providing a structured
approach to evaluate and reinforce the equitable design
and implementation of AI systems, ensuring positive
outcomes for all users and communities.
The AI auditing literature can thus be organized into a set of major aspects that must be monitored and assessed to develop and deploy reliable, trustworthy, secure, and inclusive AI systems. Figure 1 outlines the proposed framework for AI audits, specifically tailored to the online health information sector while remaining broadly applicable to other information domains.
Figure 1: HealthAIDE, a preliminary framework for auditing AI-generated health information.
3 A SYSTEMATIC REVIEW
OF OHI DRIVEN BY
GENERATIVE AI
ChatGPT was launched on November 30, 2022. While
research on transformer models and preliminary
applications existed prior, the release of ChatGPT
marked a significant turning point, driving public
access to advanced LLMs and generative AI systems.
Recognizing this milestone, we conducted a systematic
literature review of articles published on the use of
LLMs and generative AI in the domain of OHI. The
aim was to evaluate the recorded advancements and
align them with the core dimensions of the proposed
HealthAIDE framework, assessing the implications
and gaps in AI-driven OHI systems.
3.1 Methods
To that end, our literature review followed a
structured and systematic process to ensure relevance
and depth in analyzing AI-generated OHI.
Figure 2: Article selection for the systematic literature review.
Initially, a Google Advanced Search was
conducted using keywords like Online Health
Information, yielding 2,790 results. To refine the
focus, additional searches with terms such as AI
Health Information, LLM Health Information, and
Chatbot Health Information narrowed the results to
112. To ensure timeliness, only articles published
after the launch of ChatGPT in late 2022 were
considered, further reducing the scope to 55. After
reviewing article types and sources, only English
peer-reviewed journal articles and conference
proceedings were retained, resulting in 49 relevant
papers. Full-text reviews were then conducted,
focusing on AI auditing frameworks, ultimately
identifying 30 highly relevant sources for analysis.
The process is outlined in Figure 2.
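Although the screening itself was performed manually, the funnel can be summarized as a simple filter pipeline; the sketch below is illustrative only, and the record fields it assumes (published, language, type, relevance flag) are placeholders.

```python
# Illustrative sketch of the screening funnel described above. The field names
# (published, language, type, relevant_to_ai_auditing) are assumptions; the
# study's screening was performed by hand, not by this code.
from datetime import date

CHATGPT_LAUNCH = date(2022, 11, 30)

def screen(records: list[dict]) -> list[dict]:
    """Apply the inclusion criteria in the order described in the text."""
    # 1. Timeliness: keep only articles published after ChatGPT's launch.
    recent = [r for r in records if r["published"] > CHATGPT_LAUNCH]
    # 2. Source type: English peer-reviewed journal or conference papers only.
    vetted = [r for r in recent
              if r["language"] == "en" and r["type"] in {"journal", "conference"}]
    # 3. Full-text relevance to AI auditing, recorded as a reviewer judgment.
    return [r for r in vetted if r.get("relevant_to_ai_auditing")]
```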
Each paper was reviewed thoroughly to analyze
the scope of the system or device capturing or
generating personal or clinical health information.
Features, opportunities, challenges, and issues
discussed in the articles were systematically mapped
to a table aligned with the dimensions of the HealthAIDE Framework. This mapping provided a
preliminary understanding of innovation trends,
highlighting where AI-driven OHI generation and
sharing are prioritized. The analysis offers insights
into key development areas, challenges, and gaps,
forming a foundation for refining the framework
further.
At this stage, the focus was on identifying
priorities and areas of emphasis without assigning
weights or conducting detailed comparisons. A more
rigorous analysis will be conducted in the next phase
using a systematic literature selection process through
the Web of Science database. This follow-up will
enable weighted assessments, deeper comparisons,
and a broader understanding of how these align with
the HealthAIDE Framework dimensions.
A table comprehensively summarizing the mapped observations is included in Appendix A (see Table 3). This table highlights the features, opportunities, and issues addressed in each health AI/LLM-based system or device with respect to the HealthAIDE framework. It also provides a clear and structured overview of the findings of the literature review, offering a visual representation of current innovation and research priorities in AI-based OHI systems, and serves as a valuable resource for identifying trends and gaps for further exploration in other research and in future phases of this work.
4 ANALYSIS, DISCUSSION &
CONCLUSIONS
Since this is a preliminary framework and an initial
attempt at identifying the current focus areas, we
began by calculating the frequencies of each auditable
aspect of health information. All the counts are
presented in Table 1.
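The tally is straightforward to reproduce: assuming the Table 3 mapping is encoded as one set of aspect labels per reviewed paper, a few lines suffice (the two example rows below are placeholders, not the full 30-paper mapping).

```python
# Sketch of the frequency tally behind Table 1. `paper_aspects` is a
# placeholder encoding of Table 3: one set of marked aspects per paper.
from collections import Counter

paper_aspects = [
    {"Accuracy", "User Trust", "Human-AI Collaboration"},
    {"Accuracy", "Data Governance", "Safety and Harm Prevention"},
]

counts = Counter(aspect for aspects in paper_aspects for aspect in aspects)
for aspect, n in counts.most_common():
    print(f"{aspect}: {n}")
```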
Surprisingly, Human-AI Collaboration emerged
as the most emphasized focus area. This indicates that
both users and developers are deeply focused on
understanding and improving interactions with AI
systems in the health domain. Following this is
Accuracy, a critical priority for any information
system and particularly essential in healthcare, where
reliable and precise data is foundational.
Table 1: Ongoing innovation and research areas w.r.t. components of the AI/LLM-generated Online Health Information Audit Framework (HealthAIDE). Counts are of discussions (implementations, opportunities, or challenges).

Reliability and Accuracy: Accuracy (15); Currency/Updates Monitoring (9); Scalability (5)
Trust and Acceptance: Transparency (4); User Trust (13); Human-AI Collaboration (17)
Security and Safety: Data Governance (12); Safety and Harm Prevention (13); Security and Misuse Prevention (4)
Equity and Fairness: Bias and Fairness (11); Ethical and Legal Compliance (4); Environmental Sustainability (1)
User Trust and Harm Prevention ranked next,
reflecting growing concerns about misinformation
and its potential dangers to individuals and society.
These aspects highlight the need to mitigate risks
associated with the dissemination of inaccurate or
harmful content in AI-driven systems.
At the lower end, aspects such as Sustainability,
Misuse Prevention, Ethical and Legal Compliance,
and Transparency received comparatively less
attention. This is somewhat unexpected, given that
Data Governance is reasonably well-addressed, yet
related areas like Security and Misuse Prevention
remain underexplored. This disparity underscores
potential gaps in the prioritization of critical aspects
in AI-driven health information systems development
that require further investigation and emphasis.
A co-occurrence map (Figure 3) provides deeper insights into how various aspects are interlinked in research,
innovation, and oversight. For instance, Human-AI
Collaboration strongly co-occurs with Bias and
Fairness, suggesting that as efforts progress to
improve human-AI interactions, there is a growing
recognition of the need for equity and fairness in
benefiting from health information systems.
Addressing biases in design and implementation is
crucial to ensuring inclusivity and equitable access to
AI-driven health solutions.
Figure 3: Co-occurrence map of auditability aspects of AI/LLM-generated online health information.
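The map itself can be derived from the same per-paper aspect sets used for the Table 1 tally: two aspects co-occur whenever a single paper is mapped to both, and the pair's count becomes the edge weight. A minimal sketch with placeholder data:

```python
# Sketch of the co-occurrence computation behind Figure 3. Two aspects
# co-occur when one paper is mapped to both; pair counts become edge weights.
from collections import Counter
from itertools import combinations

paper_aspects = [
    {"Human-AI Collaboration", "Bias and Fairness", "User Trust"},
    {"Data Governance", "User Trust", "Privacy and Misuse Prevention"},
]

co_occurrence = Counter()
for aspects in paper_aspects:
    co_occurrence.update(combinations(sorted(aspects), 2))

for (a, b), weight in co_occurrence.most_common():
    print(f"{a} <-> {b}: {weight}")
```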
Additionally, Data Governance and User Trust
strongly co-occur with Privacy and Misuse
Prevention, highlighting that privacy concerns and
potential data breaches are central to maintaining trust
in AI-based health information systems. Robust
privacy protections and safeguards against misuse are
essential for fostering confidence and reliability in
these systems.
Transparency and Human-AI Collaboration also
consistently co-occur, emphasizing the users' need to
understand the sources of information and the
explainability of how complex content is generated
and organized. Transparency is critical in fostering
trust, as it provides clear insights into how AI systems
work and ensures that outputs are explainable and
intuitive. In health information systems, this is
particularly important as decisions based on AI-
generated content can directly impact individual well-
being. Transparency is foundational for building trust
and enabling meaningful collaboration between
humans and AI.
In conclusion, these observations demonstrate
that existing and emerging laws, regulations,
technical guidelines, and process standards must be
complemented by interconnected and synergistic
requirements to ensure that information generated by
AI systems is trustworthy, safe, secure and equitable.
Addressing these aspects holistically is vital to
fostering user confidence and minimizing risks in AI-
driven health information systems.
A scan of existing or emerging industry standards,
audit toolkits, and national-level regulations reveals
an increasing emphasis on addressing the identified
auditability aspects for AI-driven software and
information systems in general (see Table 2).
However, compliance with these auditing standards
and regulations has yet to gain widespread acceptance
in the OHI sector. This gap highlights the need for greater efforts to enforce audit frameworks in the OHI domain to ensure accountability, transparency, and trust in AI-driven health information systems.
Table 2: AI standards, regulations and audit toolkits w.r.t. OHI auditability aspects identified in HealthAIDE.

Reliability and Accuracy (Accuracy; Currency/Updates Monitoring; Scalability): ISO/IEC 23894:2023, ISO/IEC 25010, NIST AI RMF, EU AI Act, FDA SaMD
Trust and Acceptance (Transparency; User Trust; Human-AI Collaboration): IEEE 7001-2021, ISO/IEC TR 24028, ANSI/CTA-2090, NIH DISCERN, Algorithmic Accountability Act
Security and Safety (Data Governance; Safety and Harm Prevention; Security and Misuse Prevention): GDPR, EU AI Act, HIPAA, CCPA, FedRAMP, HL7 FHIR, IEEE 7010-2020
Equity and Fairness (Bias and Fairness; Ethical and Legal Compliance; Environmental Sustainability): WHO Ethics Guidelines, DoD Ethical/Responsible AI Guidelines, Ada Toolkit, ISO/IEC TR 24027:2021
This study represents a preliminary, small-scale survey and the development of an initial framework for auditing AI-generated health information. Future work aims to expand on this research by incorporating a larger dataset of papers from the Web of Science database and engaging expert reviews. This expanded approach will address technical, process, and interdisciplinary issues at a deeper level, while also considering the details of the above-noted and emerging regulations, technical standards, and professional codes to create a more comprehensive and robust framework.
REFERENCES
Alhendawi, K. M. (2024). Task-technology fit model:
Modelling and assessing the nurses’ satisfaction with
health information system using AI prediction
models. International Journal of Healthcare
Management, 17(1), 12-24.
Alkhalaf, M., Yu, P., Yin, M., & Deng, C. (2024). Applying
generative AI with retrieval augmented generation to
summarize and extract key clinical information from
electronic health records. Journal of Biomedical
Informatics, 104662.
Amin, M. S., Johnson, V. L., Prybutok, V., & Koh, C. E.
(2024). An investigation into factors affecting the
willingness to disclose personal health information
when using AI-enabled caregiver robots. Industrial
Management & Data Systems, 124(4), 1677-1699.
Aparicio, E., Nguyen, Q., Doig, A. C., Gutierrez, F. X. M.,
Sah, S., Mane, H., ... & He, X. (2023, November).
Development of Rosie the chatbot: A health information
intervention for pregnant and new mothers of color.
In APHA 2023 Annual Meeting and Expo. APHA.
Arda, S. (2024). Taxonomy to Regulation: A (Geo)
Political Taxonomy for AI Risks and Regulatory
Measures in the EU AI Act. arXiv preprint
arXiv:2404.11476.
Ascorbe, P., Campos, M. S., Domínguez, C., Heras, J., &
Reinares, A. R. T. (2023). prevenIA: a Chatbot for
Information and Prevention of Suicide and other Mental
Health Disorders. In SEPLN (Projects and
Demonstrations) (pp. 26-30).
Bala, I., Pindoo, I., Mijwil, M. M., Abotaleb, M., &
Yundong, W. (2024). Ensuring security and privacy in
Healthcare Systems: a Review Exploring challenges,
solutions, Future trends, and the practical applications
of Artificial Intelligence. Jordan Medical
Journal, 58(3).
Blumenthal-Barby, J. (2023). An AI bill of rights:
Implications for health care AI and machine learning—
A bioethics lens. The American Journal of
Bioethics, 23(1), 4-6.
Champlain, J. J. (2003). Auditing information systems. John
Wiley & Sons.
Chang, C. W., Hu, M., Ghavidel, B., Wynne, J. F., Qiu, R.
L. J., Washington, M., ... & Yang, X. (2024). An LLM-
Based Framework for Zero-Shot De-Identifying
Flexible Text Data in Protected Health Information
Enabling Potential Risk-Informed Patient Safety.
International Journal of Radiation Oncology, Biology,
Physics, 120(2), e518.
Clark, O., Reynolds, T. L., Ugwuabonyi, E. C., & Joshi, K.
P. (2024, June). Exploring the Impact of Increased
Health Information Accessibility in Cyberspace on
Trust and Self-care Practices. In Proceedings of the
2024 ACM Workshop on Secure and Trustworthy
Cyber-Physical Systems (pp. 61-70).
Cocci, A., Pezzoli, M., Lo Re, M., Russo, G. I., Asmundo,
M. G., Fode, M., ... & Durukan, E. (2024). Quality of
information and appropriateness of ChatGPT outputs
for urology patients. Prostate cancer and prostatic
diseases, 27(1), 103-108.
Davis, R., Eppler, M., Ayo-Ajibola, O., Loh-Doyle, J. C.,
Nabhani, J., Samplaski, M., ... & Cacciamani, G. E.
(2023). Evaluating the effectiveness of artificial
intelligence–powered large language models
application in disseminating appropriate and readable
health information in urology. The Journal of urology,
210(4), 688-694.
Ellaham, S., Ellahham, N., & Simsekler, M. C. E. (2020).
Application of artificial intelligence in the health care
safety context: opportunities and challenges. American
Journal of Medical Quality, 35(4), 341-348.
Falco, G., Shneiderman, B., Badger, J., Carrier, R.,
Dahbura, A., Danks, D., ... & Yeong, Z. K. (2021).
Governing AI safety through independent
audits. Nature Machine Intelligence, 3(7), 566-571.
Freeman, J. L., Caldwell, P. H., & Scott, K. M. (2020).
The role of trust when adolescents search for and
appraise online health information. The Journal of
Pediatrics, 221, 215-223.
Forcier, M. B., Gallois, H., Mullan, S., & Joly, Y. (2019).
Integrating artificial intelligence into health care
through data access: can the GDPR act as a beacon for
policymakers?. Journal of Law and the
Biosciences, 6(1), 317-335.
Hameed, T. (2018). Impact of Online Health Information
on Patient-physician Relationship and Adherence;
Extending Health-belief Model for Online Contexts.
In HEALTHINF (pp. 591-597).
Harrington, C. N., & Egede, L. (2023, April). Trust,
comfort and relatability: Understanding black older
adults’ perceptions of chatbot design for health
information seeking. In Proceedings of the 2023 CHI
Conference on Human Factors in Computing
Systems (pp. 1-18).
Hatamlah, H. (2024). Adolescents' online health
information seeking: Trust, e-health literacy, parental
influence, and AI-generated credibility. International
Journal of Data and Network Science, 8(2), 809-822.
Hong, S. J., & Cho, H. (2023). Privacy management and
health information sharing via contact tracing during
the COVID-19 pandemic: A hypothetical study on AI-
based technologies. Health Communication, 38(5),
913-924.
Jin, Q., Leaman, R., & Lu, Z. (2023). Retrieve, summarize,
and verify: how will ChatGPT affect information
seeking from the medical literature?. Journal of the
American Society of Nephrology, 34(8), 1302-1304.
Jin, E., & Eastin, M. (2024). Towards more trusted virtual
physicians: the combinative effects of healthcare
chatbot design cues and threat perception on health
information trust. Behaviour & Information
Technology, 1-14.
Kunlerd, A. (2024). Developing an Innovative Health
Information Service System: The Potential of Chatbot
Technology. Suan Sunandha Science and Technology
Journal, 11(2), 61-69.
Latt, P. M., Aung, E. T., Htaik, K., Soe, N. N., Lee, D.,
King, A. J., ... & Fairley, C. K. (2024). Real-World
Evaluation of Artificial Intelligence (AI) Chatbots for
Providing Sexual Health Information: A Consensus
Study Using Clinical Queries.
Li, S., Mou, Y., & Xu, J. (2024). Disclosing Personal
Health Information to Emotional Human Doctors or
Unemotional AI Doctors? Experimental Evidence
Based on Privacy Calculus Theory. International
Journal of Human–Computer Interaction, 1-13.
Li, Y., & Goel, S. (2024). Making it possible for the
auditing of AI: A systematic review of AI audits and AI
auditability. Information Systems Frontiers, 1-31.
Liew, T. W., Tan, S. M., Yoo, N. E., Gan, C. L., & Lee, Y.
Y. (2023). Let's talk about Sex!: AI and relational
factors in the adoption of a chatbot conveying sexual
and reproductive health information. Computers in
Human Behavior Reports, 11, 100323.
Link, E., & Beckmann, S. (2024). AI at everyone’s
fingertips? Identifying the predictors of health
information seeking intentions using AI. Communication
Research Reports, 1-11.
Liu, J., Wang, J., Huang, H., Zhang, R., Yang, M., & Zhao,
T. (2023, October). Improving LLM-Based Health
Information Extraction with In-Context Learning.
In China Health Information Processing
Conference (pp. 49-59). Singapore: Springer Nature
Singapore.
Ma, Y., Achiche, S., Pomey, M. P., Paquette, J., Adjtoutah,
N., Vicente, S., ... & MARVIN chatbots Patient Expert Committee. (2024). Adapting
and evaluating an AI-Based Chatbot through patient
and stakeholder engagement to provide information for
different health conditions: Master Protocol for an
Adaptive Platform Trial (the MARVIN Chatbots
Study). JMIR Research Protocols, 13(1), e54668.
McMahon, E., Fetters, T., Jive, N. L., & Mpoyi, M. (2023).
Perils and promise providing information on sexual and
reproductive health via the Nurse Nisa WhatsApp
chatbot in the Democratic Republic of the Congo. Sexual
and Reproductive Health Matters, 31(4), 2235796.
Mendel, T., Nov, O., & Wiesenfeld, B. (2024). Advice from
a Doctor or AI? Understanding Willingness to Disclose
Information Through Remote Patient Monitoring to
Receive Health Advice. Proceedings of the ACM on
Human-Computer Interaction, 8(CSCW2), 1-34.
Meskó, B., & Topol, E. J. (2023). The imperative for
regulatory oversight of large language models (or
generative AI) in healthcare. NPJ digital medicine,
6(1), 120.
Mökander, J. (2023). Auditing of AI: Legal, ethical and
technical approaches. Digital Society, 2(3), 49.
Mökander, J., Schuett, J., Kirk, H. R., & Floridi, L. (2023).
Auditing large language models: a three-layered
approach. AI and Ethics, 1-31.
Mulgund, P., Mulgund, B. P., Sharman, R., & Singh, R.
(2021). The implications of the California Consumer
Privacy Act (CCPA) on healthcare organizations:
Lessons learned from early compliance
experiences. Health Policy and Technology, 10(3),
100543.
Nankya, M., Mugisa, A., Usman, Y., Upadhyay, A., &
Chataut, R. (2024). Security and Privacy in E-Health
Systems: A Review of AI and Machine Learning
Techniques. IEEE Access.
Nyarko, A. J. (2024). Exploring Ghanaian Tertiary
Students’ Perceptions Towards AI as a First-Hand
Source of Health Information for Diagnosis and Self-
Medication. Journal of Health Informatics in Africa,
11(1), 64-76.
Ono, G. N., Obi, E. C., Chiaghana, C., & Ezegwu, D.
(2024). Digital Divide and Access: Addressing
Disparities in Artificial Intelligence (Ai) Health
Information for Nigerian Rural Communities. Social
Science Research, 10(3).
Park, J., Singh, V., & Wisniewski, P. (2023). Supporting
youth mental and sexual health information seeking in
the era of artificial intelligence (ai) based
conversational agents: Current landscape and future
directions. Available at SSRN 4601555.
Rajkomar, A., Hardt, M., Howell, M. D., Corrado, G., &
Chin, M. H. (2018). Ensuring fairness in machine
learning to advance health equity. Annals of internal
medicine, 169(12), 866-872.
Raddatz, N., Kettinger, W. J., & Coyne, J. (2023). Giving
to Get Well: Patients’ Willingness to Manage
and Share Health Information on AI-Driven
Platforms. Communications of the Association for
Information Systems, 52(1), 1017-1049.
Rezaee, Z., Homayoun, S., Poursoleyman, E., & Rezaee, N.
J. (2023). Giving to Get Well: Patients’ Willingness to
Manage and Share Health Information on AI-Driven
Platforms. Global Finance Journal, 55.
Sakriwattana, K. (2024). Factor affecting intention to use
chatbot for health information. Procedia of
Multidisciplinary Research, 2(7), 5-5.
Shen, S. A., Perez-Heydrich, C. A., Xie, D. X., & Nellis, J.
C. (2024). ChatGPT vs. web search for patient questions:
what does ChatGPT do better?. European Archives of
Oto-Rhino-Laryngology, 281(6), 3219-3225.
Shi, X., Liu, J., Liu, Y., Cheng, Q., & Lu, W. (2024). Know
where to go: Make LLM a relevant, responsible, and
trustworthy searchers. Decision Support Systems,
114354.
Shneiderman, B. (2020). Bridging the gap between ethics
and practice: guidelines for reliable, safe, and
trustworthy human-centered AI systems. ACM
Transactions on Interactive Intelligent Systems
(TiiS), 10(4), 1-31.
Swar, B., Hameed, T., & Reychav, I. (2017). Information
overload, psychological ill-being, and behavioral
intention to continue online healthcare information
search. Computers in human behavior, 70, 416-425.
Ueda, D., Kakinuma, T., Fujita, S., Kamagata, K., Fushimi,
Y., Ito, R., ... & Naganawa, S. (2024). Fairness of
artificial intelligence in healthcare: review and
recommendations. Japanese Journal of
Radiology, 42(1), 3-15.
Vaira, L. A., Lechien, J. R., Abbate, V., Allevi, F., Audino,
G., Beltramini, G. A., ... & De Riu, G. (2024).
Validation of the QAMAI tool to assess the quality of
health information provided by AI. medRxiv, 2024-01.
Veale, M., & Zuiderveen Borgesius, F. (2021).
Demystifying the Draft EU Artificial Intelligence
Act—Analysing the good, the bad, and the unclear
elements of the proposed approach. Computer Law
Review International, 22(4), 97-112.
Weidinger, L., Uesato, J., Rauh, M., Griffin, C., Huang, P.
S., Mellor, J., ... & Gabriel, I. (2022, June). Taxonomy
of risks posed by language models. In Proceedings of
the 2022 ACM Conference on Fairness, Accountability,
and Transparency (pp. 214-229).
Xiao, Z., Liao, Q. V., Zhou, M., Grandison, T., & Li,
Y. (2023, March). Powering an ai chatbot with expert
sourcing to support credible health information access.
In Proceedings of the 28th international conference on
intelligent user interfaces (pp. 2-18).
Yan, Y., Hou, Y., Xiao, Y., Zhang, R., & Wang, Q. (2024).
KNOWNET: Guided Health Information Seeking from
LLMs via Knowledge Graph Integration. IEEE
Transactions on Visualization and Computer Graphics.
Yiannakoulias, N. (2024). Spatial intelligence and
contextual relevance in AI-driven health information
retrieval. Applied Geography, 171, 103392.
Yin, R., & Neyens, D. M. (2024). Examining how
information presentation methods and a chatbot impact
the use and effectiveness of electronic health record
patient portals: An exploratory study. Patient
Education and Counseling, 119, 108055.
APPENDIX
Table 3: Systematic literature review mapping key features, opportunities, challenges, and issues of AI systems in the literature onto the HealthAIDE AI auditing framework. The thirteen audited aspects, grouped by pillar, are: Reliability & Accuracy (Accuracy; Monitoring/Currency; Scalability); Trust and Acceptance (Transparency; User Trust; Human-AI Collaboration); Security and Safety (Data Governance; Harm Prevention; Privacy and Misuse Prevention); Equity (Bias and Fairness; Ethical and Legal Compliance; Sustainability). The final column gives the number of aspects marked for each study.

Reference | AI Scope | Aspects marked (of 13)
1. Xiao, Z. et al. (2023) | Expert-sourced chatbot; design, evaluation, OHI | 7
2. Vaira et al. (2024) | OHI; quality assessment of medical artificial intelligence (QAMAI) based on mDISCERN | 2
3. Hatamlah, H. (2024) | Parental oversight and communication to ensure reliable and trustworthy information and behavior | 3
4. Yiannakoulias, N. (2024) | Spatial dispersion and OHI | 3
5. Amin, M. S. et al. (2024) | Caregiver robot interaction; disclosing personal health information (PHI) | 5
6. Liew, T. W. et al. (2023) | Chatbot; youth interactions on sexual and reproductive health (SRH) information | 5
7. Link, E., & Beckmann, S. (2024) | OHI seeking | 1
8. Li, S. et al. (2024) | AI doctors; disclosure of PHI | 5
9. Raddatz, N. et al. (2023) | Internet of Medical Things; disclosure of PHI | 4
10. Park, J. et al. (2023) | Sexual and reproductive health chatbots; adolescents; OHI | 6
11. Nyarko, A. J. (2024) | Engaging with OHI for self-care purposes; Ghanaian students | 4
12. Ono, G. N. et al. (2024) | Digital gaps; AI access; Nigerian rural communities | 2
13. Alhendawi, K. M. (2024) | Connected medical devices cloud network; exchange of health information | 4
14. Hong, S. J., & Cho, H. (2023) | Users considering adoption of a COVID-19 contact-tracing app | 2
15. Mendel, T. et al. (2024) | Remote patient monitoring devices; disclosure of PHI | 3
16. Latt, P. M. et al. (2024) | Sexual health; prompt-tuned chatbots; Australia | 5
17. Rezaee, Z. et al. (2023) | ESG disclosures by firms; EU and others | 3
18. Alkhalaf, M. et al. (2024) | Zero-shot prompt engineering; RAG; EHRs; AI hallucinations | 3
19. Ma, Y. et al. (2024) | MARVIN chatbots; multiple disease diagnostics and care | 4
20. Liu, J. et al. (2023) | China; health information extraction; in-context learning | 3
21. Chang, C. W. et al. (2024) | Zero-shot de-identification of health data | 4
22. Nguyen, V. C. et al. (2024) | Mental health; misinformation; few-shot learning | 2
23. Jin, E., & Eastin, M. (2024) | Virtual doctors/physicians; chatbots | 4
24. Harrington, C. N., & Egede, L. (2023) | Chatbots; gender and race associations | 5
25. Yin, R., & Neyens, D. M. (2024) | Design; chatbots; EHRs; patient portals | 2
26. Kunlerd, A. (2024) | LINE chatbots; diet regimens; health | 6
27. Sakriwattana, K. (2024) | Chatbots; intergenerational differences; age gaps | 4
28. Aparicio, E. et al. (2023) | Chatbots; prenatal and postpartum care; mothers of color | 6
29. McMahon, E. et al. (2023) | Democratic Republic of the Congo; sexual and reproductive health; Nurse Nisa chatbot | 6
30. Ascorbe, P. et al. (2023) | Mental health; suicide prevention; chatbot; WhatsApp; Spain | 5