thermore, many of these tools can be integrated
with the Apache UIMA platform.
A modular, client/server based approach
proved to be necessary for the project.
A fairly large corpus of transcribed child
language is nearly impossible to obtain.
Although there are FrameNet data sets for a
couple of languages (Spanish, German,
Chinese, etc.), their number of frames and
lexical units is presumably too small to use
for semantic parsing.
6 CONCLUSIONS
First, we have to verify that autistic children react to
the prototype system in the manner expected.
Once this has been verified, much work remains
to be done on the NLP side. However, we will not pursue
further research on using FrameNet with the Semafor parser,
nor will we use database semantics (another
approach, which is not covered in this report).
Instead, we will intensify research on custom probabilistic
models with the following steps:
1. set up Apache UIMA, since the NLP tools are
easy to integrate with it,
2. obtain a domain-specific corpus,
3. split that corpus into a training and a test part,
4. annotate the corpus with semantic class labels,
5. select domain-specific and situational features,
6. incorporate the features generated by the
preprocessing tools (i.e. taggers, parsers, etc.),
7. train a probabilistic model, possibly by using
the MaxEnt library of the Apache NLP tools,
8. evaluate the performance with different feature
sets.
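Steps 3, 5, 7 and 8 can be illustrated with a minimal sketch in Python. It is not the planned implementation: the toy corpus, the class labels (REQUEST, QUESTION, STATEMENT), the situation tags and all function names are hypothetical and serve only as an illustration. Instead of the MaxEnt library mentioned in step 7, the sketch trains a small maximum entropy (multinomial logistic regression) classifier by stochastic gradient ascent, using only the Python standard library.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy corpus (not real child-language data):
# (utterance, situation tag, semantic class label).
CORPUS = [
    ("i want the ball", "play", "REQUEST"),
    ("give me the car", "play", "REQUEST"),
    ("i want juice", "meal", "REQUEST"),
    ("give me more milk", "meal", "REQUEST"),
    ("where is the ball", "play", "QUESTION"),
    ("what is that", "play", "QUESTION"),
    ("where is my cup", "meal", "QUESTION"),
    ("what is in the bowl", "meal", "QUESTION"),
    ("the ball is red", "play", "STATEMENT"),
    ("the car is fast", "play", "STATEMENT"),
    ("the juice is cold", "meal", "STATEMENT"),
    ("the milk is warm", "meal", "STATEMENT"),
]

def word_features(utterance, situation):
    """Feature set A: bag of words (the situation tag is ignored)."""
    return {"word=" + w for w in utterance.split()}

def word_and_situation_features(utterance, situation):
    """Feature set B: bag of words plus one situational feature."""
    return word_features(utterance, situation) | {"situation=" + situation}

def split_corpus(corpus, test_fraction=0.25, seed=0):
    """Step 3: shuffle reproducibly, then split into (train, test)."""
    items = list(corpus)
    random.Random(seed).shuffle(items)
    n_test = max(1, int(len(items) * test_fraction))
    return items[n_test:], items[:n_test]

def class_probabilities(weights, labels, feats):
    """Softmax over the summed feature weights of each class."""
    scores = [sum(weights[(f, y)] for f in feats) for y in labels]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_maxent(examples, extract, epochs=200, rate=0.5):
    """Step 7: maximize the log-likelihood by stochastic gradient ascent."""
    labels = sorted({y for _, _, y in examples})
    weights = defaultdict(float)
    for _ in range(epochs):
        for utterance, situation, gold in examples:
            feats = extract(utterance, situation)
            probs = class_probabilities(weights, labels, feats)
            for y, p in zip(labels, probs):
                # Gradient per active feature: observed minus expected count.
                step = rate * ((1.0 if y == gold else 0.0) - p)
                for f in feats:
                    weights[(f, y)] += step
    return weights, labels

def classify(model, extract, utterance, situation):
    """Return the most probable class label for one utterance."""
    weights, labels = model
    probs = class_probabilities(weights, labels, extract(utterance, situation))
    return labels[probs.index(max(probs))]

def accuracy(model, extract, examples):
    """Fraction of examples whose predicted label equals the gold label."""
    hits = sum(classify(model, extract, u, s) == y for u, s, y in examples)
    return hits / len(examples)

# Step 8: compare feature sets on the same train/test split.
train, test = split_corpus(CORPUS)
for name, extract in [("words", word_features),
                      ("words+situation", word_and_situation_features)]:
    model = train_maxent(train, extract)
    print(name, "train:", accuracy(model, extract, train),
          "test:", accuracy(model, extract, test))
```

With a corpus this small the test figures carry no meaning; the sketch only shows how different feature sets can be swapped in and compared on an identical split.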
6.1 Necessary Data
We need corpora covering children’s language
domains, and we have to decide which age level
and which speech domains to target. If no corpus is available,
we have to develop one. These corpora should be in
English in order to develop and stabilize the
system. Later iterations may incorporate German
and Spanish.
6.2 Further Steps
We will set up an experimental environment based
on the work already done, and gather experience and
knowledge in analyzing and parsing natural language.
Then we have to acquire or produce corpora
covering our domain of interest (child language).
Furthermore, we have to work on generating natural
sentences as part of a dialog.
ACKNOWLEDGEMENTS
This work has been partially funded by the EU
Project GAVIOTA (DCI-ALA/19.09.01/10/21526/
245-654/ALFA 111(2010)149).
REFERENCES
ELIZA, 2013. www.med-ai.com/models/eliza.html
(March 3, 2013)
Gaviota, 2012. Report on the Results of the Gaviota
Project, International Meeting, Santa Cruz, Bolivia
(unpublished presentations)
Ferari, E., Robins, B., Dautenhahn, K., 2009. Robot as a
Social Mediator - a Play Scenario Implementation
with Children with Autism, 8th International Conference
on Interaction Design and Children Workshop
on Creative Interactive Play for Disabled Children,
Como, Italy
Fillmore, C. J., 2006. Frame Semantics, in Geeraerts, D.
(ed.): Cognitive Linguistics - Basic Readings, chap.
10, Mouton de Gruyter, p. 373–400.
FrameNet, 2012. The FrameNet Project, University of
California, Berkeley, https://framenet.icsi.berkeley.edu
(Mar 06, 2013)
Hausser, R., 2000. Grundlagen der Computerlinguistik –
Mensch-Maschine-Kommunikation in natürlicher
Sprache, Springer Verlag Berlin
IROMEC, 2013. http://www.iromec.org/9.0.html (Jan 27,
2013)
JEOPARDY, 2011. http://www.nytimes.com/2011/02/17/
science/17jeopardy-watson.html?_r=0, (Jan 28, 2013)
Pina, A., 2011. New Technologies for Language and Learning
Disabilities, 17th International Conference on
Technology supported Learning & Training, Online
Educa Berlin
Rogers, C. R., 1951. Client-centered therapy, Oxford,
Houghton Mifflin
Schneider, M., 2012. Processing of Semantics of Natural
Languages – Parsing Syntax of Natural Languages,
Bachelor-Thesis, HAW Würzburg-Schweinfurt
UIMA, 2013. http://uima.apache.org/ (Jan 28, 2013)
Weizenbaum, J., 1966. ELIZA - A Computer Program for
the Study of Natural Language Communication between
Man and Machine. Communications of the
ACM. New York 9.1966,1. ISSN 0001-0782
Willoweit, B., 2012. Processing of Semantics of Natural
Languages – Analyzing Semantics, Bachelor-Thesis,
HAW Würzburg-Schweinfurt
"Artificial Communication" - Can Computer Generated Speech Improve Communication of Autistic Children?