
health problems, which shows high potential as a data
source for early detection. A. B. R. Shatte et al.
used a scoping review to rapidly map the field of
machine learning in mental health (Shatte et al 2019).
The extracted data included information on mental
health applications, machine learning techniques, data
types, and research results. Support vector machines,
decision trees, and neural networks were used. Machine
learning has shown a range of benefits for mental
health, but most research concentrates on the
identification and treatment of mental health
disorders. Machine learning applications still have
plenty of room to grow in other areas.
Jetli Chung and Jason Teo used the PRISMA
methodology to collect relevant research articles and
studies by searching reliable databases (Chung and
Teo 2022). The review reflected the challenges and
limitations researchers face, and provided specific
suggestions for potential future research and
development. Currently, there is no model that can
predict a person's likelihood of having mental health
issues. Machine learning techniques could improve on
logistic regression, the standard prediction modelling
technique. Ashley E. Tate et al. aimed to evaluate
whether machine learning techniques are superior to
logistic regression and to create a model to forecast
mental health issues in mid-adolescence (Tate et al
2020). The research used nearly 500 predictors from
register data and parental reports. Ultimately, the
best performing model was not fit for clinical use. For
similar studies, it is not necessary to forgo logistic
regression in favour of more complex methods.
Sumathi M.R. and B. Poorna identified eight
algorithms and evaluated their efficacy in diagnosing
five basic mental health issues using various measures
(Sumathi and Poorna 2016). To train the algorithms
and assess their accuracy, a data set of sixty cases
was collected in the research. Ayako Baba and Kyosuke
Bunji obtained data from the responses of 63% of about
6000 undergraduate students at a Japanese national
university (Baba and Bunji 2023). The research
compared the results of different machine learning
models, including elastic net, logistic regression,
XGBoost, random forest, and LightGBM. According to
the comparison, the LightGBM model performed the
best, reaching adequate performance in both
conditions and analyses.
Konda Vaishnavi et al. identified five machine
learning techniques, including KNN, Random Forest,
and Decision Tree (Vaishnavi et al 2021). The
research used several accuracy criteria to assess how
well each technique identified mental health problems.
The most accurate model was based on the Stacking
technique, with a prediction accuracy of 81.75%. Jetli
Chung and Jason Teo evaluated several popular machine
learning algorithms (Chung and Teo 2023). Responses
to a survey conducted by Open Sourcing Mental Illness
were included in the data set. The machine learning
techniques included Gradient Boosting, Logistic
Regression, KNN, Neural Networks, and Support Vector
Machine, as well as an ensemble approach based on
these algorithms. Gradient Boosting reached the
highest accuracy, 88.80%, providing a highly
promising approach toward automated clinical
diagnosis for mental health professionals.
3  METHODOLOGY 
This study proposes three machine learning methods:
Random Forest, Support Vector Machine, and Logistic
Regression. Their performance on this data set is
compared in order to identify the best model.
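The comparison described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: it assumes scikit-learn is available, uses a synthetic data set from `make_classification` as a stand-in for the real data, and the hyperparameters, the feature scaling step, and the helper name `compare_models` are all illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(X, y, cv=5):
    """Return the mean cross-validated accuracy of each candidate model."""
    models = {
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
        # SVM and logistic regression are sensitive to feature scale,
        # so both are wrapped in a pipeline that standardizes the inputs.
        "Support Vector Machine": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in models.items()
    }


# Synthetic stand-in data (an assumption; not the data set used in the study).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

scores = compare_models(X, y)
best = max(scores, key=scores.get)

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
print("Best model:", best)
```

Cross-validated accuracy is used here only as one plausible selection criterion; other metrics such as recall or F1 could be substituted through the `scoring` argument.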
Support Vector Machine (SVM) is a classification
algorithm based on margin maximization: it separates
data points of different classes by finding an optimal
hyperplane. The fundamental concept of SVM is to map
data points into a high-dimensional space, which makes
them easier to separate. SVM is a commonly used
machine learning algorithm with high accuracy and
good generalization ability. The aim of SVM is to find
the hyperplane that maximizes the distance between
itself and the nearest data points of each category.
This distance is known as the margin, and the support
vectors are the data points closest to the hyperplane.
The fundamentals of SVM can be explained in the
following stages: mapping data points to a
high-dimensional space; finding an optimal hyperplane
in that space that maximizes the separation between
the hyperplane and the data points of the different
categories; classifying data points into different
categories according to the hyperplane; and
categorizing new data points. In SVM, the mapping of
data points can be implemented using different kernel
functions. Frequently used kernel functions include
the Gaussian, polynomial, and linear kernels. These
kernel functions can map data points into
higher-dimensional spaces, where the points are easier
to separate.
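To make the role of kernel functions concrete, the sketch below writes out the three kernels named above in plain Python and checks, for one small case, that a kernel value equals an inner product in a higher-dimensional space. The function names, the parameter defaults (`degree`, `coef0`, `gamma`), and the XOR-style toy points are illustrative assumptions, not taken from the study.

```python
import math


def linear_kernel(x, z):
    """Inner product in the original input space."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, coef0=0.0):
    """Polynomial kernel: (x . z + coef0) ** degree."""
    return (linear_kernel(x, z) + coef0) ** degree


def gaussian_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def phi_degree2(x):
    """Explicit feature map of the degree-2 polynomial kernel (coef0 = 0)
    for 2-D inputs: (x1, x2) -> (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


# The kernel computes the inner product of the mapped points
# without constructing the 3-D vectors explicitly.
x, z = (1.0, 2.0), (3.0, -1.0)
k_implicit = polynomial_kernel(x, z, degree=2)
k_explicit = linear_kernel(phi_degree2(x), phi_degree2(z))
print(math.isclose(k_implicit, k_explicit))

# XOR-style points cannot be split by a straight line in 2-D, but after
# the mapping the sign of the second coordinate separates the two classes.
xor_points = {(1.0, 1.0): 1, (-1.0, -1.0): 1, (1.0, -1.0): -1, (-1.0, 1.0): -1}
predictions = {p: (1 if phi_degree2(p)[1] > 0 else -1) for p in xor_points}
print(predictions == xor_points)
```

In practice a library implementation would select the separating hyperplane by maximizing the margin; here the hyperplane in the mapped space is chosen by hand purely to show why the mapping helps.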
Random forest belongs to the category of
ensemble learning, which creates a strong
supervised model by combining weak supervised
DAML 2023 - International Conference on Data Analysis and Machine Learning