second group of datasets (Datasets II), which complied
with nineteen rules, was 99.9815%.
Finally, the experimental results indicate that the
accuracy of the learning procedure depends on the
number of instances contained in the dataset.
Specifically, for both groups of datasets, both
algorithms showed their lowest accuracy on the
dataset with the fewest instances (“Mi” for the first
group of datasets and “Omega” for the second one)
and their highest accuracy on the dataset with the
largest number of instances (the dataset “All
Letters” for both groups of datasets).
5 CONCLUSIONS
In this paper we presented a decision-tree-based
approach for learning Greek phonetic rules. A
comparative evaluation of the ID3 divide-and-conquer
decision tree algorithm and Quinlan’s C4.5
learner was performed, using two databases
that contained 31990 and 48238 Greek words and
phrases, respectively.
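As a reminder of how the two learners differ, the following minimal sketch contrasts the attribute-selection criterion of ID3 (information gain) with that of C4.5 (gain ratio). It is an illustration only, not the implementation used in our experiments: the function names and the four toy instances are hypothetical and do not come from the databases described above.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())

def partition(rows, labels, attr):
    """Group class labels by the value each row takes on attribute `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return groups

def information_gain(rows, labels, attr):
    """ID3 criterion: entropy reduction obtained by splitting on `attr`."""
    groups = partition(rows, labels, attr)
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(rows, labels, attr):
    """C4.5 criterion: information gain divided by the split information."""
    split_info = entropy([row[attr] for row in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split
    return information_gain(rows, labels, attr) / split_info

# Hypothetical toy data: attribute 0 has two values, attribute 1 is
# unique per instance (it behaves like an identifier).
rows = [("a", "x1"), ("a", "x2"), ("b", "x3"), ("b", "x4")]
labels = ["P", "P", "Q", "Q"]

for attr in (0, 1):
    print(attr, information_gain(rows, labels, attr),
          gain_ratio(rows, labels, attr))
```

On this toy data both attributes have the same information gain, so ID3 cannot tell them apart, whereas the gain ratio penalises the many-valued attribute 1 and prefers attribute 0.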
The experimental results suggest that, although
both algorithms perform exceptionally well at the
phonetic rule-learning task, the C4.5 classifier is
considerably faster. Furthermore, the accuracy of
the phonetic rule-learning task was found to be
independent of the phonological rules according to
which the database is constructed, but strongly
dependent on the size of the dataset (i.e. the
number of instances contained in it).
REFERENCES
Babiniotis, G., 1986. Συνοπτική Ιστορία της Ελληνικής
Γλώσσας. Athens.
Busser, G., Daelemans, W., Van den Bosch, A., 1999.
Machine Learning of word pronunciation: The case
against abstraction. In Proc. 6th European Conference
on Speech Communication and Technology,
Eurospeech 99, Budapest, Hungary, pages 2123-2196.
Chomsky, N., and Halle, M., 1968. The Sound Pattern of
English. Harper & Row, New York.
Dietterich, T.G., 1997. Machine Learning research: Four
current directions. AI Magazine, 18(4):97-136.
Johnson, C. D., 1972. Formal Aspects of Phonological
Description. Mouton, The Hague.
Levinson, S.E., Liberman, M.Y., Ljolje, A., and Miller,
L.G., 1989. Speaker Independent Phonetic
Transcription of Fluent Speech for Large Vocabulary
Speech Recognition. ICASSP’89, pp. 441-444.
Mitchell, T., 1997. Decision Tree Learning. In T. Mitchell,
Machine Learning, McGraw-Hill, pp. 52-78.
Nunn, A., van Heuven, V.J., 1993. Morphon, lexicon-
based text-to-phoneme conversion and phonological
rules. In V.J. Van Heuven and L.C.W. Pols, editors,
Analysis and synthesis of speech; strategic research
towards high-quality text-to-speech generation.
Berlin, Mouton de Gruyter.
Petrounias, E., 1984. Νεοελληνική Γραμματική και
Συγκριτική Ανάλυση. University Studio Press,
Thessaloniki, Greece.
Quinlan, J. R., 1993. C4.5: Programs for Machine
Learning. San Mateo: Morgan Kaufmann Publishers.
Rentzepopoulos, P., and Kokkinakis, G., 1996. Efficient
Multilingual Phoneme-to-Grapheme Conversion
Based on HMM. Computational Linguistics, 22:3.
Robins, R. H., 1980. General Linguistics. An Introductory
Survey. 3rd Edition, Longman.
Rosenberg, C. R., 1987. Revealing the Structure of
NETtalk’s Internal Representations. In Proc. of the 9th
Annual Conf. Cognitive Science Society, pp. 537-554.
Sejnowski, T.J., Rosenberg, C.R., 1987. Parallel networks
that learn to pronounce English text. Complex
Systems, 1:145-168.
Setatos, M., 1974. Φωνολογία της Κοινής Νεοελληνικής,
Papazisis, Athens.
Sgarbas, K., Fakotakis, N., and Kokkinakis, G., 1995. A
PC-KIMMO-based Morphological Description of
Modern Greek. Lit. & Ling. Computing, 10:189-201.
Sgarbas, K., Fakotakis, N., and Kokkinakis, G., 1998. A
PC-KIMMO-based Bi-directional Graphemic/
Phonetic Converter for Modern Greek. Literary and
Linguistic Computing, Oxford University Press,
Vol. 13, No. 2, pp. 65-75.
Sgarbas, K., Fakotakis, N., 2005. A Revised Phonetic
Alphabet for Modern Greek. In Proceedings of
SPECOM 2005, 10th International Conference on
Speech and Computer, 17-19 October 2005, Patras,
Greece, pp. 273-276.
Triantafyllidis, M., 1977. Νεοελληνική Γραμματική.
ΟΕΔΒ, Athens.
Van den Bosch, A., Daelemans, W., 1993. Data-Oriented
Methods for Grapheme-to-Phoneme Conversion.
Proceedings of EACL, Utrecht, 45-53.
Winston, P., 1992. Learning by Building Identification
Trees. In P. Winston, Artificial Intelligence,
Addison-Wesley Publishing Company, pp. 423-442.
Witten, I., Frank, E., 2005. Data Mining: Practical
Machine Learning tools and techniques, 2nd Edition,
Morgan Kaufmann, San Francisco.
Xuedong, H., Acero, A., Alleva, F., Hwang, M.-Y., Jiang,
L., and Mahajan, M., 1995. Microsoft Windows
Highly Intelligent Speech Recognizer: WHISPER.
ICASSP’95, USA.
LEARNING GREEK PHONETIC RULES USING DECISION-TREE BASED MODELS