Quantifying the Role of Active Listening and Reassurance in Virtual Health Coach Interactions

Hussain Ghulam, Brian Keegan and Robert Ross

ADAPT Centre / School of Computer Science, TU Dublin, Ireland

Keywords: Conversational Agents, LLM, Health Care, Active Listening, Reassurance.

Abstract: Conversational Agents have the potential to support healthcare by coaching exercise routines, but they still lack the authentic social behaviours needed to support engagement. To this end, we present a series of experiments investigating how automated health care coaches can be more effective when their interaction style is tailored to demonstrate qualities associated with a good bedside manner, namely active listening and reassurance. We first developed a dataset of 135 dialogue excerpts from three distinct sources, i.e., original, handcrafted and LLM-generated; the latter two were tuned to demonstrate specific types of comforting or reassuring language. Using this dataset, we conducted a study to validate whether users perceive different levels of active listening and reassurance across sources. The results indicate that users can distinctly perceive the varying levels of stimuli across the three data sources and that LLMs in particular clearly demonstrate these properties. An accompanying analysis showed no notable influence of participant personality on perception, which we argue reduces the barrier to successful system deployment.

1 INTRODUCTION

Setting exercise goals can positively affect both physical and mental health, as well as aid recovery and postoperative care for various medical conditions (Hallal et al., 2016). The use of conversational agents (CAs) as intelligent tools to deliver interventions to achieve exercise goals represents a novel and potentially inclusive approach to broadening physical activity. These CAs can, in principle, exhibit greater adaptability and personalization to user needs and offer personalized recommendations based on preferences, goals, and fitness levels to improve physical activity (Beinema et al., 2021). Despite recent progress in context-oriented conversation models, integrating language types that are adaptive and engaging remains a challenge, which hinders the ability of CAs to provide responses that feel human-like and exhibit human-level emotional intelligence (Ahmad et al., 2022).

As a manifestation of demonstrated emotional intelligence, good bedside manner in healthcare settings refers to how healthcare professionals interact with their clients. Possessing a good bedside manner is assumed to imply that clinicians and related health professionals are kind, friendly, and understanding of those in their care (Elliott, 2018). Good bedside manner has been described as characterized by qualities such as Active Listening and Reassurance (Berman and Chutka, 2016). To illustrate, two contrasting interactions are shown in Figure 1, where the interaction on the right-hand side demonstrates a higher degree of encouragement, active listening, and reassurance relative to the more basic form on the left.

Figure 1: Contrasting Conversational Excerpts: Demonstrating Presence and Absence of Active Listening and Reassurance Behaviors (Right) vs basic form (Left).

Author ORCID iDs: https://orcid.org/0009-0000-6256-5523, https://orcid.org/0000-0002-7793-398X, https://orcid.org/0000-0003-1449-1827

Hussain, G., Keegan, B. and Ross, R.

Quantifying the Role of Active Listening and Reassurance in Virtual Health Coach Interactions.

DOI: 10.5220/0013124900003911

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2025) - Volume 2: HEALTHINF, pages 449-457

ISBN: 978-989-758-731-3; ISSN: 2184-4305


Active Listening has been shown to make patients comfortable and alleviate their fears and anxieties (Fassaert et al., 2007), while reassurance has been shown to restore confidence and hope and to encourage patients to pursue their goals (Rolfe and Burton, 2013). Although a significant body of work discusses and emphasizes the importance of active listening skills and reassurance in healthcare education, there remains a notable scarcity of studies that address active listening and reassuring behaviour from the patient’s perspective (Snyder, 2008).

In the context of conversational healthcare assistants, the assumption prevails that CAs should display many of the same qualities associated with a good bedside manner, such as the ability to listen and comfort. However, it is far from clear whether CAs should exhibit these qualities for all users during healthcare communication, and indeed, it is much less clear how these behaviors can and should be fine-tuned to the user and conversational context.

Considering this, our research focuses on the design of adaptive-policy strategies (Sahijwani, 2022) for a CA-assisted health coaching system (Beinema et al., 2023), which supports users in their healthcare goals by demonstrating suitable levels of active listening and reassurance to the user. We are working specifically in the domain of exercise regime support, as this is applicable to a wide range of the population but has particular long-term benefits for the medical community (Liang et al., 2021a). This paper presents our work on an initial set of experiments to validate the realization and user perception of behavioural variants in interaction that demonstrate the ability to listen and reassure in typical coaching scenarios. The contributions of this work are as follows:

• The construction of a corpus of dialogue excerpts that demonstrate language types indicative of active listening and reassuring language in the health coaching domain.

• An evaluation of whether participants can identify differences in language types controlled against the source of that language and participant personality type.

2 BACKGROUND AND RELATED WORK

Recently, CAs have been widely used to promote physical activity and improve health outcomes in healthcare (Cohen Rodrigues et al., 2024). In such cases, CAs provide patients with personalized guidance and motivation to engage in physical activity, track their progress, and provide feedback on their performance. Additionally, CAs can be employed to deliver educational resources to patients on the benefits of physical activity and how to engage in it safely and effectively (Cohen Rodrigues et al., 2024). It has also been claimed that the use of CAs to promote physical activity has the potential to improve overall health outcomes, prevent chronic diseases, and reduce healthcare costs (Moore et al., 2023). Existing studies on the use of CAs in health and well-being suggest that the field is in its early stages of development, with some evidence of user acceptance of CAs in the physical health domain (Wutz et al., 2023). Despite the promising adoption of CAs in healthcare, research indicates a lack of human-like, effective communication and language types (Shan et al., 2022).

Current health-centric CAs primarily focus on users’ activity goals. That is, they concentrate on coaching actions, mainly providing information related to regime prescription and physical assessment, with little work to date focusing on the social aspects of interaction management. However, the use of social behaviours can help build strong relationships and user engagement by incorporating different levels of user personality aspects such as traits, persona and language styles during communication (Fernau et al., 2022). Indeed, CAs equipped with certain types of language that indicate empathy have been found to play a central role in improving physical activity by helping people overcome anxiety or concerns about physical activity (Lynch et al., 2022). Additionally, such systems have been found to contribute towards building and restoring confidence, fostering a sense of care, and ensuring a feeling of calm. This, in turn, alleviates doubts and enables people to feel safe and valued in both clinical and non-clinical settings (Hicks et al., 2014; Karlsson et al., 2012; O’Keeffe et al., 2016).

Turning to the specifics of Active Listening and Reassurance: in early work, Traeger et al. (2017) indicated that reassurance is a notable psychological aspect of good bedside manner that is very important for various patient groups, including those with long-term medical conditions and those undergoing pre- and post-treatment, as well as physical therapy and counseling. Meanwhile, active listening has been studied by Jagosh et al. (2011) as another crucial communicative behavior. This behavior is valued not only in general communication, but also in specialized health fields such as nursing, medicine, health coaching, counseling, and rehabilitation (King, 2021). In the context of physical health, active listening enables the trainer to transition from being an ‘expert’ to a helpful guide. Instead of exerting pressure, the trainer assumes the role of a supportive partner, offering encouraging and reassuring communication (Olafsson et al., 2019).

In active listening-focused activities, such as counseling, occasional feedback is essential to maintain a smooth flow of conversation. Feedback can be achieved by using supporting backchannel (BC) cues, such as ‘uh-huh’, ‘mm-hm’, ‘yeah’, ‘okay’ and ‘right’ (Ruede et al., 2017). BCs serve as verbal and nonverbal indications of attention, helping the listener to determine when it is their turn to speak. The listener can incorporate BCs to express their thoughts without interrupting the speaker (Lala et al., 2017). There are two types of backchannels: verbal backchannels, including responses like ‘mm-hm’, ‘uh-huh’ and ‘okay’, and nonverbal backchannels, consisting of cues like nodding the head, making eye contact, or laughing (Heinz, 1998). Research indicates that the inclusion of backchannels can enhance user engagement and create a more natural conversation flow. Additionally, backchannels contribute to establishing a more positive relationship between the user and the conversational agent (Ding et al., 2022).

HEALTHINF 2025 - 18th International Conference on Health Informatics

3 EXPERIMENTAL GOALS AND DESIGN

Given the lack of systematic investigation on this topic to date, the present study seeks to investigate methods to manipulate levels of reassurance and active listening in CA output, and to measure whether text designed to demonstrate active listening and reassurance is perceived as such by analyzing participants’ perceptions of the provided texts. Our goal therefore is to provide an approximate calibration of these qualities and a measurement of automatically versus manually collected data.

The experimental design of this study was structured into three elements:

In the first element, we measured users’ perception of different levels of active listening and reassurance across a corpus of dialogue excerpts sourced from three distinct pools, i.e., original, handmade, and LLM-generated content.

The second element aimed to validate whether participants can effectively discern differences among language types while controlling for variations in the source of language. The data sources were further broken down into Block A (Active Listening), Block B (Reassurance) and Block C (Neutral) within each language source.

In the third element, we validated whether different personality types differ in their perception of active listening and reassurance.

3.1 Data Synthesis and Properties

Given a lack of suitable existing datasets, we developed a dataset of dialogue excerpts that clearly state that the coach is a conversational agent rather than a human. Specifically, we built a dataset comprising 135 dialogues within the healthcare domain. The dataset has 45 dialogues sourced from original real-world interactions, 45 dialogues crafted by human annotators (handcrafted), and an additional 45 dialogues generated using an LLM (LLM).

For the original dialogue data, we utilized an existing open-source dataset comprising human-human dialogues in the context of physical healthcare counseling to ensure the inclusion of real-world complexities, clients’ concerns, and diverse language usage. This dataset was collected from a real-world physical activity intervention program for women (Liang et al., 2021b). The original dialogue dataset was not classified in any way into active listening and reassurance, as it was created to study social support for physical activity and its barriers.

Building on this real-world dataset, handmade dialogues were curated with the help of annotators to simulate various physical healthcare scenarios, incorporating a range of medical conditions and communication styles. For this work we followed the guidelines and instructions discussed by Wu et al. (2023). The curated dataset was distributed evenly across three blocks: 15 dialogues were stylised or biased towards active listening, 15 towards reassurance, and 15 featured neutral language.

To generate automatic data, we used ChatGPT 3.5 with different prompts to create dialogue excerpts in specific styles, depicting qualities related to active listening and reassurance. The designed prompts are provided in the appendix for reference. The LLM-generated excerpts were distributed across three blocks: 15 styled with active listening, 15 styled towards reassurance, and 15 featuring neutral stimuli. The data resources from this study are publicly available on GitHub to promote further research.
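As an illustration of the corpus layout described above, the following minimal Python sketch enumerates the 135 excerpts by source and styling block and checks the counts. The class name `Excerpt`, the function `build_layout`, and the label strings are hypothetical names chosen for this example; they are not taken from the released data resources. The original dialogues are labelled `none` here only because they were not classified into styling blocks.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Excerpt:
    source: str  # "original", "handmade", or "llm" (hypothetical labels)
    block: str   # "A" active listening, "B" reassurance, "C" neutral, "none" unstyled


def build_layout():
    """Return one placeholder record per excerpt in the 135-dialogue corpus."""
    # 45 original dialogues, not assigned to any styling block.
    layout = [Excerpt("original", "none") for _ in range(45)]
    # 45 handmade and 45 LLM dialogues, 15 per styling block.
    for source in ("handmade", "llm"):
        for block in ("A", "B", "C"):
            layout.extend(Excerpt(source, block) for _ in range(15))
    return layout


layout = build_layout()
per_source = Counter(e.source for e in layout)
per_block = Counter((e.source, e.block) for e in layout)
print(len(layout), dict(per_source))
```

A flat list of labelled records keeps the later source-level and block-level analyses to simple group-by operations.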

3.2 Study Design

After collecting the datasets, we conducted three surveys, one for each data source, for dialogue evaluation. Each survey included a total of six questions specifically focused on examining the perception of active listening and reassurance. The first three questions assessed active listening, while the remaining three assessed reassurance. A 5-point Likert scale was used to record participant responses. Participant ratings were calculated by averaging the responses within each three-question segment, reflecting perceived levels of active listening and reassurance. The questions are included in the appendix for reference.
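The rating computation described above can be sketched as follows. This is a minimal illustration, assuming each response is a list of six integer Likert answers in questionnaire order; the function name `score_response` and the dictionary keys are hypothetical.

```python
from statistics import mean


def score_response(answers):
    """Average the six Likert answers into two perception scores.

    answers: six integers in 1..5; the first three relate to active
    listening and the last three to reassurance.
    """
    if len(answers) != 6:
        raise ValueError("expected exactly six answers")
    if any(not 1 <= a <= 5 for a in answers):
        raise ValueError("answers must lie on the 5-point scale")
    return {
        "active_listening": mean(answers[:3]),
        "reassurance": mean(answers[3:]),
    }


# Hypothetical response from one participant for one dialogue.
example = score_response([4, 5, 3, 2, 2, 2])
print(example)
```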

Nine dialogues were randomly displayed for each user interaction. The evaluation system was deployed on the Prolific crowd-sourcing platform (Eyal et al., 2021) with informed consent. The time allotted to each participant was 20 minutes. In total, 90 participants were recruited: 30 for each of the three language sources (original, handmade, and LLM dialogues).

Following the stimuli rating activity, we asked participants to rate themselves on the Ten-Item Personality Inventory (TIPI) (Gosling et al., 2003) to assess their personality traits. This supplementary assessment aimed to validate whether different personality types show different trends in the perception of active listening and reassurance. The measure yields an estimate of each participant’s openness, conscientiousness, extroversion, agreeableness, and emotional stability, i.e., the Big 5 personality traits.
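The text above does not detail how TIPI responses are converted into trait estimates. The sketch below follows the standard scoring key published with the TIPI (Gosling et al., 2003), assuming the usual 7-point response scale: each trait is the mean of one regularly scored item and one reverse-scored item. The function name `score_tipi` and the trait labels are hypothetical, and whether the study applied exactly this procedure is an assumption.

```python
# Standard TIPI key: trait -> (regular item, reverse-scored item), 1-indexed.
TIPI_KEY = {
    "extroversion": (1, 6),
    "agreeableness": (7, 2),
    "conscientiousness": (3, 8),
    "emotional_stability": (9, 4),
    "openness": (5, 10),
}


def score_tipi(responses):
    """Return the five trait scores for ten TIPI answers on a 1..7 scale."""
    if len(responses) != 10:
        raise ValueError("TIPI has exactly ten items")
    if any(not 1 <= r <= 7 for r in responses):
        raise ValueError("answers must lie on the 7-point scale")
    scores = {}
    for trait, (regular, reverse) in TIPI_KEY.items():
        regular_score = responses[regular - 1]
        reversed_score = 8 - responses[reverse - 1]  # 1<->7, 2<->6, 3<->5
        scores[trait] = (regular_score + reversed_score) / 2
    return scores


# Hypothetical answers to items 1..10 from one participant.
example_scores = score_tipi([5, 2, 6, 3, 7, 4, 6, 1, 5, 2])
print(example_scores)
```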

3.3 Participant Demographics

In the first survey, which focused on data from original dialogues, the cohort of 30 participants included 10 men, 17 women, and 3 participants who preferred not to disclose their gender. The ages of the participants ranged from 22 to 67 years (M = 38.4, SD = 5.63). The second survey, centered on handmade dialogues, included 14 men and 16 women in the 30-participant cohort, with ages ranging from 23 to 73 years (M = 39.11, SD = 6.22). In the third survey with LLM-generated data, which also featured 30 participants, there were 12 men and 18 women, and the age range was 27 to 55 years (M = 39.72, SD = 6.16). Participants from the US, UK, Ireland, New Zealand, and Australia were sourced across the three studies.

The median time participants spent on each survey reveals a notable pattern. The median time was approximately 17.68 minutes in Survey I and approximately 16.91 minutes in Survey II. Survey III, however, stands out with a considerably shorter median time of 9 minutes. The main cause of this shorter median time may be the ease of the linguistic styles mimicked by LLMs. However, it can in part be attributed to average dialogue length. The mean lengths of dialogues vary across data sources: the original dialogues are the longest (mean length: 270.82 tokens), the handmade dialogues are of moderate length (mean length: 215.59 tokens), and the LLM-generated dialogues are the shortest (mean length: 106.18 tokens).

Figure 2: Users’ Perception of Active Listening across Language Sources (original, handmade, LLM).

Figure 3: Users’ Perception of Reassurance across Language Sources (original, handmade, LLM).

4 EXPERIMENTAL RESULTS

In this section, we present the experimental results for

the three elements of the study.


4.1 Source Analysis

The first element investigates the perception of active listening and reassurance across the three language sources: original, handcrafted, and LLM-generated dialogues. Our goal here is to determine, as a baseline, whether participants perceived any variation in overall amounts of reassurance and active listening across the sources, without taking into account any styling blocks within those sources. Figures 2 and 3 illustrate the Likert scale ratings of perceived active listening and reassurance provided by the participants.

Table 1: ANOVA with post-hoc Tukey’s HSD results for overall users’ perception of Active Listening across the data sources (original, handmade, LLM).

Group 1     Group 2     p-adj
Handmade    LLM         0.9295
Handmade    original    0.5448
LLM         original    0.3347

Table 2: ANOVA with post-hoc Tukey’s HSD results for overall users’ perception of Reassurance across the data sources (original, handmade, LLM).

Group 1     Group 2     p-adj
Handmade    LLM         0.9437
Handmade    original    0.3586
LLM         original    0.5485

Overall, the results show that users perceive active listening and reassurance as slightly higher in both the LLM and handcrafted dialogues than in the baseline original content. The perceived level is strongest in the LLM dialogues. These results can be explained by the fact that the original content was not purposefully designed to have reassurance and active listening qualities, while the other two text sources were designed (in part) to display these styles. Statistical analysis by means of ANOVA with post-hoc Tukey’s HSD (Tables 1 and 2) shows no statistically significant differences in mean perception scores for ‘Active Listening’ or ‘Reassurance’ between any of the compared language sources (‘Original’, ‘Handmade’, and ‘LLM-Generated’): every adjusted p-value is greater than the significance level. This is in line with our experimental design, since not all dialogue examples within the LLM and handcrafted sets were stylised, which results in a small, observable effect on perception that does not reach statistical significance.
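For readers who wish to run the same kind of test, a one-way ANOVA followed by Tukey’s HSD can be sketched in Python as below. The ratings are synthetic values drawn from arbitrary distributions purely for illustration; they are not the study data. The use of SciPy (`scipy.stats.f_oneway` and `scipy.stats.tukey_hsd`, the latter available from SciPy 1.8) is an assumption rather than the tooling used in the study.

```python
import numpy as np
from scipy.stats import f_oneway, tukey_hsd

# Synthetic mean ratings for 30 hypothetical participants per source.
# The means and spread below are arbitrary and NOT taken from the study.
rng = np.random.default_rng(0)
ratings = {
    "Handmade": np.clip(rng.normal(3.9, 0.6, 30), 1, 5),
    "LLM": np.clip(rng.normal(4.0, 0.6, 30), 1, 5),
    "Original": np.clip(rng.normal(3.7, 0.6, 30), 1, 5),
}
names = list(ratings)
groups = [ratings[name] for name in names]

# Omnibus test: do the group means differ at all?
f_stat, p_value = f_oneway(*groups)

# Post-hoc test: which pairs differ? pvalue is a symmetric k x k matrix.
tukey = tukey_hsd(*groups)
pairwise = {}
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        pairwise[(names[i], names[j])] = float(tukey.pvalue[i, j])

print(f"ANOVA: F = {f_stat:.3f}, p = {p_value:.4f}")
for (g1, g2), p_adj in pairwise.items():
    print(f"{g1:<10}{g2:<10}{p_adj:.4f}")
```

The printed rows mirror the Group 1, Group 2, p-adj layout of the tables above, so adjusted p-values can be read off in the same way.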

4.2 Style Analysis

To dig deeper and account for the specific stylistic biasing of the individual dialogues, the second element of the investigation aimed to validate the participant ratings of active listening and reassurance against the breakdown of the 45 dialogues distributed across three distinctively styled blocks: Block A (Active Listening biased data), Block B (Reassurance biased data), and Block C (Neutral). We present results for handmade dialogues and LLM dialogues but not for original content, since no style biasing was applied to that content.

Table 3: ANOVA with post-hoc Tukey’s HSD results for the Perception of Active Listening across the different blocks of Handmade Data. Block A = Active Listening biased data; Block B = Reassurance biased data; Block C = Neutral data.

Group 1    Group 2    p-adj
Block A    Block B    0.6812
Block A    Block C    0.7777
Block B    Block C    0.9858

Table 4: ANOVA with post-hoc Tukey’s HSD results for the Perception of Reassurance across the different blocks of Handmade Data. Block A = Active Listening biased data; Block B = Reassurance biased data; Block C = Neutral data.

Group 1    Group 2    p-adj
Block A    Block B    0.8019
Block A    Block C    0.7262
Block B    Block C    0.9907

Handmade Dialogues: For handmade dialogues, as depicted in Figure 4, participants identified slightly more active listening (blue) in the active listening biased data (Block A) than in the neutral data (Block C). Similarly, participants perceived slightly greater levels of reassurance (green) on average in the reassurance biased data (Block B) than in the neutral data (Block C). In both cases, however, the effect is not strong, and the statistical tests shown in Tables 3 and 4 demonstrate that the effect was not significant across blocks. It is also notable that perceptions of reassurance and active listening are very similar across bias blocks. In other words, participants see reassurance in Active Listening biased data and see Active Listening in Reassurance biased data. Though the reported values were lower, it is notable that users perceived active listening and reassurance even in the neutral Block C data. This may be due to various factors such as tone, context, and non-verbal cues. Neutral language does not necessarily eliminate all the cues that can influence the perception of active listening or reassurance, as no effort was made to actively engineer these cues out of the baseline dialogues.


Figure 4: Users’ Perception of Active Listening (Blue) and Reassurance (Green) across Handmade Content. Block A (Active Listening), Block B (Reassurance) and Block C (Neutral).

Figure 5: Users’ Perception of Active Listening (Blue) and Reassurance (Green) across LLM-Generated Content. Block A (Active Listening), Block B (Reassurance) and Block C (Neutral).

LLM Generated Data: Figure 5 presents a similar analysis for the LLM data. Generally, the results follow the same overall pattern as those for the handcrafted content, but with a much clearer distinction between the perception results for neutral dialogues and those for dialogues that were biased towards reassurance and active listening. As was the case for handcrafted dialogues, participants again perceive high values of active listening in dialogues that were biased for reassurance, and vice versa. Compared with the results for the handcrafted dialogues, the measures of reassurance and active listening are distinctly clearer for LLM-generated data. Statistical analysis by means of ANOVA with post-hoc Tukey’s HSD shows significant differences in both active listening and reassurance perceptions between the neutral block and each of the two biased blocks, but not between the two biased blocks themselves. These findings indicate that the engineered stimuli successfully influenced perceptions of active listening and reassurance. Tables 5 and 6 show the detailed results of the ANOVA with post-hoc Tukey’s HSD test across the three blocks.

Table 5: ANOVA with post-hoc Tukey’s HSD test results for the Perception of Active Listening across the different blocks of LLM-Generated Data. Block A (Active Listening), Block B (Reassurance) and Block C (Neutral).

Group 1    Group 2    p-adj
Block A    Block B    1.0
Block A    Block C    0.0
Block B    Block C    0.0

Table 6: ANOVA with post-hoc Tukey’s HSD test results for the Perception of Reassurance across the different blocks of LLM-Generated Data. Block A (Active Listening), Block B (Reassurance) and Block C (Neutral).

Group 1    Group 2    p-adj
Block A    Block B    0.9986
Block A    Block C    0.0
Block B    Block C    0.0

4.3 Personality Variance in Perception

While it is interesting to understand whether the overall population can perceive active listening and reassurance in designed content, it is important to recognize the potential for individual differences. Therefore, we also collected personality measures to analyze whether personality traits correlate with each participant’s perceptions of active listening and reassurance in the LLM-generated content.

We present this analysis for the LLM-sourced data. As shown in the previous section, the perceptions of active listening and reassurance were most strongly pronounced in this data, which in turn makes it the most suitable for investigating interactions with personality traits. Our hypothesis is that a linear relationship exists between the personality measures and the perception measures of active listening and reassurance. To measure the strength and direction of this relationship, we used Pearson’s correlation coefficient (r). Since different participants reviewed stimuli across three stylistic blocks, we present the results for these stylistic blocks individually. Table 7 summarizes these results.
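Pearson’s r can be computed directly from its definition, as in the following sketch. The function name `pearson_r` and the example values are hypothetical; the values are not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical trait scores and perception ratings for five participants.
extroversion = [2.5, 4.0, 5.5, 3.0, 6.0]
active_listening = [3.3, 3.7, 4.3, 3.0, 4.7]
r = pearson_r(extroversion, active_listening)
print(f"r = {r:.4f}")
```

Values of r near +1 or -1 indicate a strong linear relationship, while values near 0 indicate little or none.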


Table 7: Pearson Correlation Coefficient (r) between personality traits and blocks with active listening biased, reassurance biased and neutral data. AL = Active Listening, R = Reassurance, *p values < 0.01 and **p < 0.05.

Personality Trait     Active Listening      Reassurance           Neutral data
                      biased data           biased data
                      AL         R          AL         R          AL         R
Openness               0.2590     0.2599     0.0128     0.0128    -0.0783    -0.1623
Conscientiousness     -0.0023    -0.0069     0.0993     0.0457     0.2610     0.1638
Extroversion           0.3977     0.2676    -0.4674    -0.5339     0.0017    -0.1054
Agreeableness          0.2218     0.2596     0.0923    -0.1859    -0.0622    -0.0432
Emotional Stability    0.3937     0.3765    -0.0622    -0.0051     0.0138    -0.0437

5 DISCUSSION

Our analysis suggests that when dialogues are consciously crafted with language styles that indicate active listening and reassurance, users are more likely to perceive these dialogues as demonstrating those traits compared to dialogues lacking such intentional linguistic and social behavior cues. Our study reveals that users consistently perceived higher levels of Active Listening and Reassurance in content generated by LLMs compared to hand-crafted data. This discrepancy can likely be attributed to the advanced content generation capabilities of LLMs when guided by specific directives and instructions. Additionally, it is crucial to acknowledge the possibility that the hand-crafted data conveyed the intended language styles less effectively, owing to limitations in how those dialogues were created, which may have hindered users’ perception of the styles. In either case, the findings suggest that we can comfortably use LLM-generated content that is tuned to the factors associated with good bedside manner, and in fact these systems may be better at consistently demonstrating these qualities than a human consciously aiming to replicate these styles.

While our study did not demonstrate any strong relationship between personality traits and the perception of properties associated with good bedside manner, this is a positive outcome from the perspective of effective design of health supporting systems in the long run. The results suggest that we do not need to overthink the design of these stylistic factors and that, with respect to these elements of support, a one-size-fits-all approach to displaying support may be sufficient, rather than a dynamic style that needs to be customized to individual personality traits.

6 CONCLUSION AND FUTURE DIRECTIONS

Whilst past studies on virtual health coaching assistants have emphasised the positive impact of certain tasks, mainly providing information related to regime prescription and physical assessment, little is known about the social aspects of interaction management. To address this gap, our study focused on including and measuring social behaviours, namely active listening and reassurance, in the context of system-initiated virtual health coaching assistants. By building and analysing a dataset comprising 135 dialogues, including original, handmade and LLM-generated excerpts curated with language styles related to these qualities, we observed that users with diverse personality traits perceived varying levels of active listening and reassurance. Our findings underscore the importance of integrating these social behaviors into virtual health coaching assistants. This study lays the groundwork for filling the gap in understanding and leveraging such behaviors, paving the way for the development of virtual health coaching prototypes that prioritize active listening and reassurance.

Building upon these findings, our future research will focus on developing a virtual health coaching prototype that incorporates varying degrees of active listening and reassurance, and on validating that displaying these qualities in a controlled way is in fact beneficial to participants. Through rigorous experiments, our aim is to determine whether these qualities indeed enhance engagement and effectiveness compared to systems lacking them. Ultimately, our work aims to contribute to the advancement of virtual health coaching, offering more personalized and adaptive interventions.


ACKNOWLEDGEMENTS

This research was conducted with the financial support of Science Foundation Ireland / Research Ireland under Grant Agreement No. 13/RC/2106_P2 at ADAPT, the SFI Research Centre for AI-Driven Digital Content Technology. For the purpose of Open Access, the authors have applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

REFERENCES

Ahmad, R., Siemon, D., Gnewuch, U., and Robra-Bissantz, S. (2022). Designing personality-adaptive conversational agents for mental health care. Information Systems Frontiers, 24.

Beinema, T., op den Akker, H., Hermens, H. J., and van

Velsen, L. (2023). What to discuss?—a blueprint

topic model for health coaching dialogues with con-

versational agents. International Journal of Hu-

man–Computer Interaction, 39(1):164–182.

Beinema, T., op den Akker, H., van Velsen, L., and

Hermens, H. (2021). Tailoring coaching strategies

to users’ motivation in a multi-agent health coach-

ing application. Computers in Human Behavior,

121:106787.

Berman, A. and Chutka, D. (2016). Assessing effective

physician-patient communication skills: ”are you lis-

tening to me, doc?”. Korean journal of medical edu-

cation, 28.

Cohen Rodrigues, T. R., de Buisonjé, D. R., Reijnders, T., Santhanam, P., Kowatsch, T., Breeman, L. D., Janssen, V. R., Kraaijenhagen, R. A., Atsma, D. E., and Evers, A. W. (2024). Human cues in ehealth to promote lifestyle change: An experimental field study to examine adherence to self-help interventions. Internet Interventions, 35:100726.

Ding, Z., Kang, J., HO, T. O. T., Wong, K. H., Fung, H. H.,

Meng, H., and Ma, X. (2022). Talktive: A conversa-

tional agent using backchannels to engage older adults

in neurocognitive disorders screening.

Elliott, M. (2018). Good bedside manner. Online.

Eyal, P., David, R., Andrew, G., Zak, E., and Damer, E.

(2021). Data quality of platforms and panels for online

behavioral research. Behavior Research Methods, 54.

Fassaert, T., Dulmen, A., Schellevis, F., and Bensing,

J. (2007). Active listening in medical consulta-

tions: Development of the active listening observation

scale (alos-global). Patient education and counseling,

68:258–64.

Fernau, D., Hillmann, S., Feldhus, N., Polzehl, T., and Möller, S. (2022). Towards personality-aware chatbots. In Lemon, O., Hakkani-Tür, D., Li, J. J., Ashrafzadeh, A., García, D. H., Alikhani, M., Vandyke, D., and Dusek, O., editors, Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2022, Edinburgh, UK, 07-09 September 2022, pages 135–145. Association for Computational Linguistics.

Gosling, S. D., Rentfrow, P. J., and Swann Jr, W. B. (2003).

A very brief measure of the big-ﬁve personality do-

mains. Journal of Research in personality, 37(6):504–

528.

Hallal, P. C., Andersen, L. B., Gonçalves, L. G., Wells, J. C., Reichert, F. F., Anjos, L. A. d., Ferreira, R. C., and Victora, C. G. (2016). Physical activity and inactivity profiles in brazilian adults: results from the national health survey (pns 2013). Revista de Saúde Pública, 50:1S.

Heinz, B. M. (1998). Backchannel responses as conversa-

tional strategies in bilingual speakers’ conversations.

The University of Nebraska-Lincoln.

Hicks, K. M., Cocks, K., Martin, B. C., Elton, P. J., Macnab,

A., Colecliffe, W., and Furze, G. (2014). An interven-

tion to reassure patients about test results in rapid ac-

cess chest pain clinic: a pilot randomised controlled

trial. BMC Cardiovascular Disorders, 14.

Jagosh, J., Donald Boudreau, J., Steinert, Y., MacDon-

ald, M. E., and Ingram, L. (2011). The impor-

tance of physician listening from the patients’ per-

spective: Enhancing diagnosis, healing, and the doc-

tor–patient relationship. Patient Education and Coun-

seling, 85(3):369–374.

Karlsson, V., Forsberg, A., and Bergbom, I. (2012). Com-

munication when patients are conscious during respi-

rator treatment—a hermeneutic observation study. In-

tensive and Critical Care Nursing, 28(4):197–207.

King, G. (2021). Central yet overlooked: engaged and

person-centred listening in rehabilitation and health-

care conversations. Disability and rehabilitation,

44:1–13.

Lala, D., Milhorat, P., Inoue, K., Ishida, M., Takanashi, K., and Kawahara, T. (2017). Attentive listening system with backchanneling, response generation and flexible turn-taking. In Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, pages 127–136, Saarbrücken, Germany. Association for Computational Linguistics.

Liang, K.-H., Lange, P., Oh, Y. J., Zhang, J., Fukuoka, Y.,

and Yu, Z. (2021a). Evaluation of in-person counsel-

ing strategies to develop physical activity chatbot for

women. arXiv preprint arXiv:2107.10410.

Liang, K.-H., Lange, P., Oh, Y. J., Zhang, J., Fukuoka, Y.,

and Yu, Z. (2021b). Evaluation of in-person counsel-

ing strategies to develop physical activity chatbot for

women. In Li, H., Levow, G.-A., Yu, Z., Gupta, C.,

Sisman, B., Cai, S., Vandyke, D., Dethlefs, N., Wu, Y.,

and Li, J. J., editors, Proceedings of the 22nd Annual

Meeting of the Special Interest Group on Discourse

and Dialogue, pages 32–44, Singapore and Online.

Association for Computational Linguistics.

Lynch, J., Hughes, G., Papoutsi, C., Wherton, J., and

A’Court, C. (2022). “it’s no good but at least i’ve

always got it round my neck”: A postphenomeno-

logical analysis of reassurance in assistive technology

use by older people. Social Science and Medicine,

292:114553.



Moore, R., Al-Tamimi, A.-K., and Freeman, E. (2023). A

conversational agent (phyllis) to support adolescent

health and overcome barriers to physical activity: a

co-design and evaluation study (preprint). JMIR For-

mative Research, 8.

O’Keeffe, M., Cullinane, P., Hurley, J., Leahy, I., Bunzli,

S., O’Sullivan, P. B., and O’Sullivan, K. (2016). What

Inﬂuences Patient-Therapist Interactions in Muscu-

loskeletal Physical Therapy? Qualitative Systematic

Review and Meta-Synthesis. Physical Therapy, 96(5).

Olafsson, S., O’Leary, T. K., and Bickmore, T. W. (2019).

Coerced change-talk with conversational agents pro-

motes conﬁdence in behavior change. Proceedings of

the 13th EAI International Conference on Pervasive

Computing Technologies for Healthcare.

Rolfe, A. and Burton, C. (2013). Reassurance after diag-

nostic testing with a low pretest probability of serious

disease. JAMA internal medicine, 173:1–9.

Ruede, R., Müller, M., Stüker, S., and Waibel, A. (2017). Yeah, right, uh-huh: A deep learning backchannel predictor.

Sahijwani, H. (2022). Adaptive dialogue management for

conversational information elicitation. In Proceedings

of the 45th International ACM SIGIR Conference on

Research and Development in Information Retrieval,

pages 3495–3495.

Shan, Y., Ji, M., Xie, W., Qian, X., Li, R., Zhang, X.,

and Hao, T. (2022). Language use in conversa-

tional agent–based health communication: System-

atic review. Journal of Medical Internet Research,

24:e37403.

Snyder, U. (2008). The doctor-patient relationship ii: Not

listening. Medscape journal of medicine, 10:294.

Traeger, A., O’Hagan, E., Cashin, A., and Mcauley, J.

(2017). Reassurance for patients with non-speciﬁc

conditions – a user’s guide. Brazilian Journal of Phys-

ical Therapy, 21.

Wu, Z., Balloccu, S., Kumar, V., Helaoui, R., Refor-

giato Recupero, D., and Riboni, D. (2023). Cre-

ation, analysis and evaluation of annomi, a dataset of

expert-annotated counselling dialogues. Future Inter-

net, 15(3).

Wutz, M., Hermes, M., Winter (née Hinz), V., and Koeberlein-Neu, J. (2023). Factors influencing the acceptability, acceptance and adoption of conversational agents in healthcare: An integrative review (preprint). Journal of Medical Internet Research, 25.

APPENDIX

A: Survey Questions

The research survey included a total of six questions focused on the perception of active listening and reassurance. The first three questions assessed active listening, while the remaining three assessed reassurance.

1. How well did the therapist demonstrate active listening by paying attention and showing interest in the client’s concerns?

2. Did the therapist ask questions or provide feed-

back that demonstrated understanding and active

listening?

3. Did the therapist give cues or responses that showed they were actively listening and paying attention?

4. Did the therapist acknowledge and validate the

client’s concerns or emotions that demonstrated

reassurance?

5. How well did the therapist provide reassurance,

comfort, support, or encouragement to the client?

6. How effective was the therapist in fostering a

sense of reassurance and encouragement for the

client’s overall progress?
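The grouping of the six items into two constructs can be illustrated with a short sketch. It is not the analysis code used in the study: the function name `score_response`, the averaging of items within each subscale, and the 1-5 rating scale in the example are all assumptions made purely for illustration.

```python
from statistics import mean

# Item numbers follow the survey list above:
# items 1-3 address active listening, items 4-6 address reassurance.
SUBSCALES = {
    "active_listening": (1, 2, 3),
    "reassurance": (4, 5, 6),
}


def score_response(ratings):
    """Average one participant's item ratings into two subscale scores.

    `ratings` maps an item number (1-6) to a numeric rating.
    Averaging within a subscale is an assumed aggregation rule.
    """
    missing = [i for items in SUBSCALES.values() for i in items if i not in ratings]
    if missing:
        raise ValueError(f"missing ratings for items: {missing}")
    return {
        name: mean(ratings[i] for i in items)
        for name, items in SUBSCALES.items()
    }


# Hypothetical response on an assumed 1-5 scale (not data from the study).
example = {1: 4, 2: 5, 3: 3, 4: 2, 5: 3, 6: 4}
scores = score_response(example)
```

Keeping the item-to-construct mapping in one dictionary means a different grouping or a different aggregation rule, such as a median, could be swapped in without touching the rest of the code.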

B: Designed Prompts

To generate synthetic data, we used ChatGPT 3.5 with different prompts to generate dialogue excerpts in specific styles that depict qualities related to active listening and reassurance.

• Write a dialogue where the therapist actively lis-

tens to the client’s concerns about their progress

in therapy and reassures them with empathy and

understanding

• Craft a scenario where the client expresses anx-

iety about their ability to recover fully, and the

therapist listens attentively while providing reas-

surance and support

• Imagine a dialogue between a therapist and a

client where the client expresses frustration with

their current treatment plan. How does the ther-

apist respond with active listening and reassur-

ance?

• Create a conversation where the client shares their

fears about returning to activities that caused their

injury, and the therapist responds by actively lis-

tening and offering reassurance and guidance

• Write a dialogue where the client discusses feel-

ings of self-doubt and uncertainty about their

progress, and the therapist responds by validating

their concerns and providing reassurance.
