2.3 Graph Centrality
Introduced by (Bavelas, 1948) while studying
communication and influence among individuals in small
social groups, the concept of centrality in graphs is
associated with the degree of importance and influence
that each vertex has on the others in the graph, and with
the bottlenecks that may exist in their connections.
(Freeman, 1978) also works with the same concepts of
centrality in social networks, investigating quantitative
measures capable of defining the importance of each
vertex.
In this context, this work seeks to identify the
importance that each phoneme has in the set of analyzed
words, considering phonetic transcriptions in which a
phoneme can appear in more than one position of the
syllable and of the word. It is then possible to identify
the most influential phonemes in the set, that is, those
that are over-represented, and to reduce their weight by
removing some of the words in which they occur, making
the set more balanced.
Each vertex holds a list of the words in which a
phoneme occurs in a given position. This makes it
possible to determine how important each word is to the
vertices, and thus which words can be removed from the
graph without leaving any phoneme under-represented.
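The vertex structure described above can be illustrated with a minimal Python sketch. The input dictionary, the position labels (such as initial_onset), and the function names are hypothetical and chosen only for illustration; they are not the structure used in this work. The count per vertex is a simple degree-style measure, assumed here as a stand-in for whichever centrality measure is applied.

```python
from collections import defaultdict

# Hypothetical input: each word mapped to its (phoneme, position) pairs.
# Words, phoneme symbols, and position labels are illustrative only.
decomposed = {
    "cavalo": [("k", "initial_onset"), ("v", "medial_onset"), ("l", "medial_onset")],
    "casa": [("k", "initial_onset"), ("z", "medial_onset")],
    "vela": [("v", "initial_onset"), ("l", "medial_onset")],
}


def build_vertices(decomposed):
    """Map each (phoneme, position) vertex to the list of words containing it."""
    vertices = defaultdict(list)
    for word, occurrences in decomposed.items():
        for phoneme, position in occurrences:
            vertices[(phoneme, position)].append(word)
    return dict(vertices)


def occurrence_counts(vertices):
    """Count how many words feed each vertex (a degree-style measure)."""
    return {vertex: len(words) for vertex, words in vertices.items()}


vertices = build_vertices(decomposed)
counts = occurrence_counts(vertices)
```

In this toy data, the vertex ("k", "initial_onset") is fed by two words, so removing one of them would still leave that vertex represented once.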
2.4 Phonological Processes
In the context of speech therapy, phonological
processes have a great influence on a child’s language
acquisition. During this stage, the child is expected to
apply several simplifications, such as replacing one
phoneme with another or omitting a phoneme. In speech
therapy, such substitutions and omissions are called
Phonological Processes (PP); some examples are
presented in Table 1.
However, if a PP persists for a long time, it can
become a phonological disorder and remain in the
child’s speech, following the child into school and the
literacy process and harming the child’s social life
(Goulart and Chiari, 2014). Therefore, some works
are dedicated to identifying possible phonological
disorders through voice recognition in phonological
evaluations (Franciscatto et al., 2019), so that
diagnosis and treatment can take place early.
In the work of (Franciscatto et al., 2019), Machine
Learning (ML) techniques were used to classify the
pronunciations of 84 words as correct or incorrect and
to recognize phonological processes through them. In
(Franciscatto et al., 2018), a case-based method,
commonly used in the health field and in ML techniques
(Tavana et al., 2022), is developed; it learns well while
allowing new cases to be stored in a database without
complications (Husain and Pheng, 2010). The method
works as an extra validation layer after the
pronunciation classification by ML, registering new
cases and validating them with an expert.
However, although studies apply Artificial Intelligence
(AI) (Iliya and Neri, 2016) and ML (Franciscatto et al.,
2019) techniques in speech therapy, these techniques are
only used to identify phonological processes and to
classify the pronunciation of spoken words as correct or
incorrect in a phonological evaluation. Before that, a
specialist must choose the set of words to be pronounced
by the children, and this set must cover all the phonemes
of the language in different positions of the word. The
choice of words must therefore follow specific criteria,
addressed in Section 3.1; for example, in a phonological
evaluation all phonemes must be analyzed at least twice
(Stoel-Gammon, 1985). Defining the smallest subset of
words that meets the same criteria as the initial set is
thus a matter of quantifying the “least effort”,
discussed in Section 3, rather than of learning from
errors and successes.
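To make the coverage requirement concrete, the following Python sketch checks whether every phoneme is still analyzed at least twice after a word is removed. It is a simplified illustration under stated assumptions, not the algorithm of Section 3: positions are ignored, an occurrence is counted once per word, the word list is hypothetical, and the greedy removal loop is a stand-in that does not guarantee the smallest possible subset.

```python
from collections import Counter

MIN_OCCURRENCES = 2  # each phoneme analyzed at least twice (Stoel-Gammon, 1985)

# Hypothetical toy data: word -> consonant phonemes (positions ignored here).
decomposed = {
    "cavalo": ["k", "v", "l"],
    "casa": ["k", "z"],
    "vela": ["v", "l"],
    "cola": ["k", "l"],
    "zelo": ["z", "l"],
    "uva": ["v"],
}


def phoneme_counts(words, decomposed):
    """Count, for each phoneme, the number of words that contain it."""
    counts = Counter()
    for word in words:
        counts.update(set(decomposed[word]))
    return counts


def meets_criterion(words, decomposed, required):
    """Check that every required phoneme occurs in at least MIN_OCCURRENCES words."""
    counts = phoneme_counts(words, decomposed)
    return all(counts[p] >= MIN_OCCURRENCES for p in required)


def greedy_reduce(words, decomposed):
    """Drop words one at a time while the criterion still holds (greedy, not optimal)."""
    initial = phoneme_counts(words, decomposed)
    required = {p for p, n in initial.items() if n >= MIN_OCCURRENCES}
    kept = list(words)
    for word in list(words):
        candidate = [w for w in kept if w != word]
        if meets_criterion(candidate, decomposed, required):
            kept = candidate
    return kept


reduced = greedy_reduce(list(decomposed), decomposed)
```

For this toy data only "cavalo" is removed, since every other word is needed to keep some phoneme at two occurrences. The result depends on the order in which words are tried, which is one reason a greedy pass is only a rough approximation of a smallest subset.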
3 DEVELOPMENT
This section presents the operation of the algorithm in
detail, from the basic input structures to the final
result.
The set of 84 words from (Ceron et al., 2020) was
used as the database. Additionally, all the words in
the set had to be phonetically decomposed in order to
detail the positions in which their phonemes appear.
For this, the JSON structure developed in the work of
(Marques, 2022), summarized in Figure 3, was used.
Figure 3: Example of the word “cavalo” [horse]
decomposed into consonant phonemes.
The main idea of the algorithm is to avoid as much
as possible that the same phoneme is over-represented
in the set of words, being analyzed in the same posi-
tion with a frequency equal to or greater than the min-
ICEIS 2023 - 25th International Conference on Enterprise Information Systems