SMACS: Stress Management AI Chat System

Daiki Mori

, Kazuyuki Matsumoto

1 a

, Xin Kang

1 b

, Manabu Sasayama

and Keita Kiuchi

3 c

Tokushima University, Tokushima, Japan

National Institute of Technology, Kagawa College, Kagawa, Japan

Japan Organization of Occupational Health and Safety, Tokyo, Japan

Keywords: Conversational AI, Mental Health Care, Stress Detection, LLM, Chat Applications, Natural Language

Processing.

Abstract: The purpose of this study is to develop a stress management AI chat system that can connect users who want

mental health care with counselors. By means of this chat system, a conversational AI based on a large

language model (LLM) will collect data on the user's stressors through text chats with the user. The system is

personalized to the user based on the collected data. This paper describes the nature of the data collected in

the preliminary experiment conducted in March 2024 and the results of its analysis, and discusses

considerations for the main experiment to be conducted after July 2024. The preliminary experiment was

conducted with 11 students over a 3-week period. Discuss the distribution of the data collected and the issues

involved in building a model for predicting stress levels.

1 INTRODUCTION

In today's society, stress has become an unavoidable

part of many people's daily lives. According to the

Occupational Safety and Health Survey(Ministry of

Health, Labour and Welfare, 2023) conducted by the

Ministry of Health, Labour and Welfare in 2022,

82.2% of workers reported feeling anxious, worried,

or stressed about their current work or occupational

life, and the percentage is increasing every year. In

addition, while 91.4% of workers have someone they

can talk to about the stresses of their current job or

professional life, 69.4% of workers have actually

consulted with someone, showing a gap between the

two. Among them, "family/friends (62.0%),"

"coworkers (63.5%)," and "supervisors (58.5%)"

were the most frequently chosen consulting parties,

while "psychologists such as certified psychologists

(0.5%)" and "counselors (0.5%)," who are

consultants who can provide professional advice from

an objective perspective, were both very rare. This is

due to the fact that the number of patients who

consulted with a psychologist or counselor was very

https://orcid.org/0000-0002-9820-1470

https://orcid.org/0000-0001-6024-3598

https://orcid.org/0000-0003-0812-9071

low. In addition to psychological problems on the part

of patients, a shortage of professionals can be cited as

a reason for this. There are approximately 70,000

licensed psychologists and 40,000 licensed clinical

psychologists (in 2024), but more than half of them

work part-time or do not work at all. These facts

suggest that although many people are able to consult

with those close to them, they continue to feel stress

on a daily basis and have not yet reached the point of

consulting with a specialist.

There are also issues related to mental health

measures. According to the Occupational Safety and

Health Survey conducted by the Ministry of Health,

Labour and Welfare in 2022(Ministry of Health,

Labour and Welfare, 2023), 63.4% of business

establishments are working on mental health

measures. In addition, 46.1％ of the respondents

answered that they "have established an in-house

counseling system for mental health measures," while

12.4 ％ answered that they "utilize medical

institutions to implement mental health measures."

This suggests that promoting the use of outside

institutions for counseling, which is a part of mental

Mori, D., Matsumoto, K., Kang, X., Sasayama, M. and Kiuchi, K.

SMACS: Stress Management AI Chat System.

DOI: 10.5220/0012940300003838

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2024) - Volume 2: KEOD, pages 167-174

ISBN: 978-989-758-716-0; ISSN: 2184-3228

167

health measures, remains a significant challenge.

According to the Japan Inochi no Denwa Renmei

(Inochi no Denwa Renmei, 2024), which operates a

consultation dial for people suffering from loneliness

and anxiety, there were over 540,000 telephone

consultations nationwide in 2022. However, due to a

lack of manpower, it is reported that calls are difficult

to get through. Another reason why the number of

counselors has not increased is that they have to pay

the cost of attending a training course that takes more

than one year to become a counselor.

According to DataM Intelligence (DataM

Intelligence, 2023), the global mental health apps

market reached US$5.1 billion in 2022 and is

projected to reach US$14.2 billion by 2030, growing

at a CAGR of 14.1% during the forecast period of

2023-2030. These are driven by factors such as the

increasing prevalence of mental health disorders and

rising smartphone usage. In particular, the integration

of artificial intelligence and machine learning

technologies is expected to boost demand for mental

health apps market trends.

Based on the above, this study aims to develop a

stress management AI chat system that can connect

clients and counselors and support counseling

operations. By developing this system and

conducting user evaluations, we aim to confirm the

effectiveness of the proposed method and contribute

to the research and development of mental health care

AI that can handle stress in an engineering manner.

This paper presents the results of the preliminary

experiment focusing on the construction of a chat

system. The preliminary experiment was conducted

for about three weeks in March 2024, and the results

of the analysis of the data collected in the system are

discussed for the main experiment scheduled to be

conducted in July 2024 or later.

2 RELATED WORK

2.1 Effectiveness of Text Chat

2.1.1 Consultation Through SNS

According to the interim report (Nagano Prefecture,

2017) on the consultation on bullying, etc. using

LINE by Nagano Prefecture and LINE Corporation,

in August 2017, as part of the "Collaborative

Agreement on Measures against Bullying and Suicide

of Children Using LINE," consultation on bullying,

suicide, etc. was conducted for junior high and high

school students using LINE. As a result, a total of 547

consultations were received from 390 junior and

senior high school students in Nagano Prefecture

through the "Don't Worry Alone @ Nagano" account,

which was opened for two weeks from September 10

to 23, far exceeding the 259 telephone consultations

received in the previous fiscal year. This indicates

that there is a certain level of demand and

effectiveness in text-only chats. However, text-based

communication via SNS has limitations in terms of

communication, and the need to switch to telephone

counseling to continue the counseling has been

identified as an issue.

2.1.2 Online Disinhibition Effect

Suler (Suler, 2004) states that the hurdle to self-

disclosure is greatly reduced in text-based online

consultations. Online deinhibition refers to a

phenomenon in which inhibitions against behavior in

normal face-to-face situations are relaxed or

disappear on the Internet. The reasons for this are

listed below.

▪ Because it is anonymous, there is a sense of

security that individuals will not be identified

even if they confess their secrets.

▪ Because the facial expressions and tone of

voice are not transmitted to the other party in

text-based communication, the embarrassment

of having one's emotional reactions known

when confessing a secret is reduced.

▪ The fear of rejection is reduced because the

counselor is not visible.

2.2 Mental Health Care Apps

This section presents a selection of Japanese mental

health care applications that are similar to SMACS

and have more than 100,000 downloads.

2.2.1 Self

The SELF app (SELF, 2024) is an application in

which AI understands and comprehends the user's life

through natural conversation, and adapts to the user's

needs such as mental care, stress care, life logs, and

information suggestions. The seven AI characters

implemented in this application not only have

different personalities, but also change the content of

their conversations, allowing the user to select the

character that best suits his or her needs. The

application has many functions to support the user's

mental care. For example, the application can analyze

the user's characteristics, strengths and weaknesses,

and objectively communicate them to the user during

the conversation with the AI, and can suggest articles

based on the user's interests and concerns.

KEOD 2024 - 16th International Conference on Knowledge Engineering and Ontology Development

168

2.2.2 Awarefy

Awarefy (Awarefy, 2024) is a smartphone application

based on the concept of "acquiring skills to care for

the mind. The app is equipped with many practical

programs and tools based on cognitive behavioral

therapy, which has been proven and evidenced in a

variety of fields. Awarefy AI chat utilizes the large

language models GPT-3.5 and GPT-4 developed by

OpenAI, Inc. The prompts have been tuned to adapt

to Awarefy's user base.

2.3 Positioning of this Study

This research focuses on connecting users who wish

to receive mental health care with counselors. We aim

to realize a system that can provide a consistent

solution from prevention of daily stress accumulation

to countermeasures against serious stress. The

proposed chat system aims to efficiently collect

information about the causes of stress. To achieve

this, our system integrates an AI chat model that

collects user-specific information relevant to

counseling, and adapts its responses based on this

information to assist in assessment and treatment,

along with a function to eliminate as much as possible

utterances that are inconsistent with the past chat

history. In the stress management system, a stress

detection model is constructed based on the chat

history, with the aim of creating a system that directs

users with high stress levels to counselors, and of

improving the efficiency of the counselors'

counseling work. In particular, we aim to develop a

user-adaptive stress detection system by adapting the

stress detection model to each user's individual stress

level. In addition, we aim to improve the efficiency of

counseling work by collecting and visualizing

necessary information from the counselor's point of

view.

3 STRESS MANAGEMENT AI

CHAT SYSTEM (SMACS)

3.1 System Overview

The purpose of this study is to develop a stress

management AI chat system adapted to individual

users that can connect users and counselors. This

system will not only reduce the user's daily stress

accumulation, but also improve the efficiency of

counseling services. The system is mainly divided

into a chat system (see Section 3.2) and a stress

management system (see Section 3.3).

The system is developed using Python, JavaScript,

HTML, and CSS, and can use any publicly available

large language model (LLM) as a base model for AI

chat. In our preliminary experiment (see Chapter 4),

we use rinna, a Japanese LLM published by rinna

Corporation, and gpt-3.5 published by OpenAI.

When using local LLMs such as rinna and Llama,

we observed significant processing delays due to the

24GB VRAM of the GPU installed on the server

running the system. On the other hand, when using

external APIs such as gpt-3.5, the advantage is that

multiple access requests can be handled efficiently.

Figure 1: Stress management AI chat system overview.

3.2 Chat System

The chat system aims to improve the efficiency of

data collection to identify users' stress factors and to

reduce the accumulation of stress. Specifically, AI

chat based on a large language model (LLM)

efficiently collects data related to stress factors

through user-oriented chat that can take into account

user profile information and daily chat history. In

addition, users can prevent the accumulation of daily

stress by making it a habit to talk casually with AI

chat.

Figure 2 shows an example of chatting with AI. It

is a dialogue model in which system prompts (Table

1) are set to encourage self-disclosure in response to

the user's statements. Based on the data collected

from the chat, the user's profile (Name, Gender,

Occupation, Recent Interests, Recent Challenges,

Recent Enjoyments, Current Goals) in the database is

updated, and the user's utterances are made in

consideration of his/her profile. This allows the user

to speak as if he/she understands the user even if the

date changes.

SMACS: Stress Management AI Chat System

169

Figure 2: Example of AI chat in a chat system.

Table 1: System prompts (excerpts).

personality

・Newly born as an AI

・Already understands most of

the meanings of human words,

but still lacks experience and

understanding of human

emotions, so it wants to

understand them.

・It is curious about what

humans do on a daily basis, and

it listens happily and happily

when you talk to it.

constraint

・Do not use honorific

language.

・End sentences with "dayo."

・Frequently use empathetic

interjections to convey

agreement.

・Show curiosity and ask

questions eagerly.

3.3 Stress Management System

The stress management system is intended to improve

the efficiency of self-care and the work of counselors

by enabling users to become aware of their own

stress. Specifically, users can check the system usage

history and the visualization of the analysis results of

the collected data, and notice their own stress, which

is useful for self-care. In addition, a stress detection

model is constructed based on the stress level and chat

history collected in the database. This model

automatically detects users with high stress levels and

provides them with a route to a counselor. Counselors

can check the results of data analysis of the chat

history of the user in question, thereby streamlining

the counseling work.

In the current system, users can check their stress

level transition graph (Figure 3). By clicking a point

on the graph, the user can view the chat history for

that day. In the future, we aim to visualize the results

of the automatic analysis of the collected data in an

easy-to-understand format to make users aware of

their stress levels.

At this point, the detection of users with high stress

levels and the function to improve counseling work

efficiency are still in the design stage. The stress

detection model is constructed using a machine

learning algorithm with the text chat history as a

feature and the user's self-reported subjective stress

level as the correct response data. Stress is considered

to vary from user to user. Therefore, it is difficult to

construct a general-purpose stress detection model

that can be applied to any user, and high accuracy

cannot be expected. However, it has been verified that

a stress detection model adapted to each user can

maintain a certain level of accuracy. We plan to

design an analysis data sharing function to improve

the efficiency of counseling work by referring to the

work contents and judgment criteria of counselors.

Figure 3: Stress level transition graph.

The stress detection system will be based on the

predictions of five levels of stress levels by the stress

level prediction model. The flow of the stress

detection system is shown in Figure 4. The system

processes the chat history data with the user and

inputs it into the stress level prediction model to infer

the predicted value of the stress level. Users with low

predicted stress levels are encouraged to continue

self-mental health care by using the system. Users

with high predicted stress levels are preferentially

directed to counselors for treatment.

Figure 4: Stress detection system flow.

KEOD 2024 - 16th International Conference on Knowledge Engineering and Ontology Development

170

3.4 Database

The contents of the database are shown in Table 2.

The database uses SQLite. The system stores user

names in the user information table when a new user

registers. When a user logs into the system, the

system saves his/her usage history associated with the

user information. The system saves data before,

during, and after the chat phase, respectively, before

the screen is transitioned.

In particular, the user profile in the user

information table is automatically updated based on

the template for each utterance from the chat history

during the AI chat with the chat system. In addition,

the sentences generated by the dialogue model AI are

stored for system utterances during chatting. Data

other than these two will be input by the user.

Table 2: Database contents of the system (excerpts).

User

information

user name

user profile

Before

chatting

location of experiment (home or lab),

stress level (1-5),

emotion (free description),

3 emotions (neutral, negative, positive)

During

chatting

system utterance

user utterance

After

chatting

degree of chat distress (1-5),

degree of stress reduction (1-5),

topic (free description),

naturalness of chat (1-5),

response speed (1-5),

dissatisfaction (free description)

3.5 Stress Level Prediction Model

The procedure for constructing the stress level

prediction model is shown in Figure 5. Due to the

small number of data and data bias of the data

obtained in this preliminary experiment, it is expected

to be difficult to construct a model with high accuracy.

Figure 5: Stress level prediction model building flow.

4 PRELIMINARY EXPERIMENT

4.1 Outline of the Experiment

The purpose of this preliminary experiment is to

consider the setting of this experiment to be

conducted after July 2024, based on the results of the

analysis of the data collected through the users' use of

the system. The preliminary experiment was

conducted for about three weeks in March 2024,

targeting 11 laboratory students (all in their 20s). The

experiment is conducted in the following three steps

I. A questionnaire to input the stress level and

subjective feelings at the time was

administered.

II. Conduct at least 10 conversations with the AI

III. Conducting a questionnaire about chatting,

such as stress reduction level, naturalness of

chat, topics, etc.

The dialogue model of the chat system was

changed every week, and a comparison was made

based on the differences in the nature of the data

collected in each case. In particular, this time, the AI

chat system is evaluated based on the average value

of the stress reduction level in the post-chat

questionnaire. In conducting the experiment, the

research ethics review by the Tokushima University

was conducted and approved.

4.2 Experiments and Evaluation

Methods

Subjects are asked to participate in the experiment by

accessing the system from a browser on their own

terminals at home or in the laboratory. The only

conditions presented to the subjects are that they

select the chat mode of the system, answer the

pre/post-chat questionnaire, and chat with the AI for

10 dialogs (about 5 minutes). In consideration of the

subjects' privacy, they are instructed to refrain from

entering any personal information that could lead to

their identification in advance. We also recommend

the use of a handle when registering as a new user.

In the preliminary experiment of this paper, we

evaluate the following three dialogue models by

comparing them.

I. rinna/japanese-gpt-neox-3.6b-instruction-

ppo (2024/2/19 - 2024/2/25)

II. gpt-3.5 with system prompts (2/26/2024 -

3/4/2024)

III. gpt-3.5 with system prompts and user profiles

(from 2024/3/5 to 2024/3/5)

For the evaluation index, we use the average value

of the stress reduction level (5 levels from 1 to 5),

SMACS: Stress Management AI Chat System

171

which is data that can be collected in the post-chat

questionnaire.

5 EXPERIMENTAL RESULTS

5.1 Collected Data

In a preliminary experiment, we were able to collect

data for 11 subjects, each of whom was asked to enter

data for a minimum of 18 days, resulting in a total of

210 days of data.

5.1.1 Distribution of Stress Level Data

The distribution of stress level data is shown in Figure

6. Most of the data are for stress levels 3 and below,

with extremely few data for stress levels 4 and 5. It

can be seen that most of the subjects were in a low-

stress state for the experiment. It was found to be a

challenge to uniformly collect data for each stress

level level.

Figure 6: Distribution of stress level data.

5.1.2 Distribution of Emotion Label Data

The distribution of the emotion label data is shown in

Figure 7. Most of the data is neutral, with few positive

or negative data. This can be correlated with the fact

that the distribution of stress level data was skewed

toward stress level 3 and below. It is possible that

users need to be presented with more specific and

understandable emotion labels.

Figure 7: Distribution of emotion label data.

5.1.3 Sentence Length, Word Count and

Character Count

Table 3 shows the total number of

sentences/words/characters, the number of

words/characters per sentence, the number of

sentences/words/characters per subject, and the

number of sentences/words/characters per day for one

subject for the text data of system and user utterances.

A word is defined as one word that has been

morphologically analyzed and segmented by MeCab

(Taku Kudo, 2024). Overall, it is shown that the

amount of text in system utterances is larger than in

user utterances. max_tokens parameter of gpt-3.5-

turbo is set to 200.

Table 3: Sentence length, word count, and character count

for user and system utterances.

total

per

sent.

per

subject

per

subject

per day

user

sent.

1.640

149.1

8.3

word

18,754

11.4

1705.0

94.7

chara.

32,095

19.6

3066.8

170.4

sys.

sent.

1,873

170.2

9.4

word

102,305

54.6

9300.5

517.0

chara.

172,579

92.1

15859.2

881.1

5.1.4 Frequency of Occurrence of Each Part

of Speech

The frequency of occurrence of each part of speech

for the text data of system and user utterances is

shown in Figure 8. The parts of speech are the results

of morphological analysis by MeCab (Taku Kudo,

2024).

Figure 8: Frequency of occurrence of each part of speech.

KEOD 2024 - 16th International Conference on Knowledge Engineering and Ontology Development

172

5.1.5 Average Number of Words per Stress

Level

The average number of words for each stress level for

the text data of user and system utterances is shown

in Figure 9 and 10. Stress level 1 has the lowest

average word count, while stress level 4 has the

highest average word count.

Figure 9: Average number of words per stress level

(system).

Figure 10: Average number of words per stress level (user).

5.2 Comparison of Average Stress

Reduction

The distribution and mean values of stress reduction

for each interaction model are shown in Figure 11.

The vertical axis represents the number of data and

the horizontal axis represents the stress reduction

level. As a result, the average value of the stress

reduction level was the highest for the gpt-3.5 with

the system prompt.

Figure 11: Distribution and mean of stress reduction levels

for each interaction model.

5.3 Comparison of Stress Level

Transitions by User

A comparison of stress level transitions for each user

is shown in Figure 12. It can be seen that the

dispersion of stress levels differs from user to user,

and the tendency of stress transitions differs from user

to user. This indicates that there are individual

differences in the way stress is felt and the tendency

of stress change.

Figure 12: Stress level transitions extracted for 3 users.

6 DISCUSSION

In this study, we developed and evaluated a stress

management AI chat system aimed at connecting

users with counselors and adapting to individual

users' needs. Our approach involved several key

components: (1) conducting preliminary experiment

to collect chat data, (2) implementing different

dialogue models with varying system prompts and

user profile integration. The following sections

discuss the main findings, challenges, and

implications of our research.

6.1 Stress Reduction due to

Inconsistent Utterance and

Response Time Delay Caused by

User Profile Prompts

When we checked the complaints (descriptions) of

users who had low stress reduction values during the

period when we were experimenting with the

dialogue model with system prompts and user profiles

in gpt-3.5, we found that many of them said "the

response speed was slow" and "I was asked my name

repeatedly". These are thought to be caused by the

time required for the task of filling in the user profile

template and the fact that some of the constraints of

the system prompts are ignored. A possible way to

realize an AI chat system that understands user

profiles and allows users to interact with it is to ask

SMACS: Stress Management AI Chat System

173

users to enter their own profiles when they register as

new users. When adding the user profile to the

prompts, the user should be given instructions to

select utterances that are consistent with the profile.

6.2 Bias in Collected Data

In terms of stress level data, of the 210 data collected

in the preliminary experiment, there were 11 data for

Level 5, which is considered a high stress level, and

15 data for Level 4, which is very few. This is

considered to be the case. One of the reasons for this

is that many subjects did not have many opportunities

to face high stress levels during the spring vacation

period in March, when the preliminary experiment

was conducted. In order to eliminate the bias in the

data, it is necessary to conduct the experiment over a

longer period of time.

7 CONCLUSIONS

In this study, in order to develop a stress management

AI chat system adapted to individual users that can

connect users and counselors, we collected data

through the preliminary experiment and evaluated the

chat system we built. The results showed that none of

the dialogue models showed much effect on stress

reduction. The dialogue model with relatively high

average stress reduction had less inconsistency and

delay in response speed during chatting than the other

dialogue models.

In the future, we will develop a chat system that

takes user profiles into account for this experiment.

First, we will develop a method for eliminating

utterances that are inconsistent with the profile

information. In addition, we will add a function to

collect information needed by counselors in the

system. In addition to self-reported subjective stress

levels, we plan to develop a method to collect

objective stress levels.

ACKNOWLEDGEMENTS

This research was supported in part by a grant from

the Amano Institute of Industrial Technology. We

would like to express our deepest gratitude.

REFERENCES

Ministry of Health, Labour and Welfare. (2023). Patient

Survey Report, Ministry of Health, Labour and Welfare

(online), https://www.mhlw.go.jp/toukei/list/dl/r04-

46-50_gaikyo.pdf, last accessed 2024/06/24

Inochi no Denwa Renmei (General Incorporated

Association of Inochi no Denwa),

https://www.inochinodenwa.org/?page_id=4688, last

accessed 2024/06/24

DataM Intelligene, Global Mental Health Apps Market -

2023-2030. (2023).

https://www.gii.co.jp/report/dmin1319134-global-

mental-health-apps-market.html, last accessed

2024/06/24

Nagano Prefecture, LINE Corporation. (2017). Interim

Report Material on Consultation on Bullying and Other

Issues Using LINE by Nagano Prefecture and LINE

Corporation, https://d.line-

scdn.net/stf/linecorp/ja/pr/NaganoPrefectureReportMa

terial.pdf, last accessed 2024/06/24

Suler, J. (2004). The Online Disinhibition Effect.

CyberPsychology & Behavior, 7(3), 321-326,

https://doi.org/10.1089/1094931041291295, last

accessed 2024/06/24

SELF Corporation. https://self.software/, last accessed

2024/06/24

Awarefy Corporation. https://www.awarefy.com/, last

accessed 2024/06/24

Taku Kudo, https://taku910.github.io/mecab/, last accessed

2024/06/24

KEOD 2024 - 16th International Conference on Knowledge Engineering and Ontology Development

174