2.3.1 Discrete Dictation
The program for discrete speech input is called
MyDictate. Its design was determined entirely by
the target group, i.e., people who cannot use
their hands. Therefore, not only the dictation itself
but also all supporting activities, such as editing, error
correction, formatting, and lexicon maintenance, had
to be designed as hands-free operations. The
program is described in detail in (Cerva et al.,
2005).
The software is distributed with a general-purpose
lexicon containing about 550 thousand words. The
employed technology allows the program to run
even on recent low-cost PCs. When a word is
uttered, the recognizer outputs an ordered list of the
10 best candidates, taking into account both the acoustic
and the language model scores. The word with the
best score is automatically added to the dictated text,
while the next 9 candidates appear in a list shown
in MyDictate’s window. In case of lexical ambiguity
or when a minor recognition error occurs, the user
can select another candidate from this list to replace
the incorrectly inserted item. About 100 other
control commands are available, e.g., to delete the
last character(s), word(s), or sentence(s), to select a
part of the text, to work with the clipboard, to move the
cursor, to spell individual letters, or to toggle
their case (lower or upper). The basic vocabulary
ensures a coverage rate of about 99 % for common Czech
texts. If a dictated word is not in the lexicon, the user
can add it by voice during the dictation session.
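The candidate handling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MyDictate's actual implementation: the names `Candidate`, `n_best` and `split_for_display`, the weight `lm_weight`, the use of a weighted sum of log scores, and the example words and scores are all hypothetical. The text does not say how the acoustic and language model scores are combined.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    word: str
    acoustic_score: float  # assumed: log-likelihood from the acoustic model
    lm_score: float        # assumed: log-probability from the language model


def n_best(candidates, n=10, lm_weight=1.0):
    """Return the n candidates with the highest combined score, best first.

    The combined score is assumed to be a weighted sum of the two log
    scores; the real combination rule is not given in the text.
    """
    def combined(c):
        return c.acoustic_score + lm_weight * c.lm_score

    return sorted(candidates, key=combined, reverse=True)[:n]


def split_for_display(ranked):
    """Top candidate goes into the text; the rest are offered as alternatives."""
    if not ranked:
        return None, []
    return ranked[0].word, [c.word for c in ranked[1:]]


# Hypothetical scores for three similar-sounding Czech word forms.
hypotheses = [
    Candidate("vodu", -12.0, -3.0),
    Candidate("voda", -11.5, -4.0),
    Candidate("vody", -13.0, -3.0),
]
inserted, alternatives = split_for_display(n_best(hypotheses))
```

With the default `n=10`, the first returned word corresponds to the item added to the dictated text and at most nine further words remain as alternatives, mirroring the list shown in the program window.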
As the program is aimed particularly at people
with physical disabilities, it must be able to cope
with less standard pronunciation. In such cases,
the user can employ the built-in speaker
adaptation module, which prompts him/her to say 300
phonetically balanced words. For most users, this
reduces the recognition error rate by up to 25 % in relative terms.
2.3.2 Fluent Dictation
The software for fluent dictation was developed
in collaboration with the Newton Technology
company, which distributes it under the name
NewtonDictate. This software is aimed at the general
public and at professionals such as lawyers, doctors, and
people working in the media. It comes with
several types of lexicons. The general-purpose one is
the largest and currently contains 500K words. The
profession-oriented lexicons are smaller (320K words for
lawyers and 140K for medicine) and come with
domain-specific language models.
Although this program was not originally
designed for hands-free use, it has been considered
for the lectures and training sessions in the center.
One reason is that some of the clients showed
interest in learning this software and using it in
their prospective jobs. These were mainly people
whose physical disability does not affect their
speech and who can use their hands at least to some
extent. For the other clients, integration with
MyVoice is being prepared. It is true, however, that
many persons with disabled hands prefer discrete
dictation to fluent dictation. They particularly appreciate
that a) they can compose the text at their own
pace (whereas fluent speech technology requires a
more or less continuous flow of words), b) they can
correct or modify the input text immediately, c) they
feel that the isolated-word decoder is more
robust to speaker-produced and background noise
as well as to hesitation sounds, and, last but not least,
d) they can easily add new words to the lexicon. It
should also be noted that in Czech, because of its
rich and complex morphology, many word forms
differ in only one or two characters, which means
that they sound very similar and can be easily
confused, particularly in fluent speech. For users,
correcting these small errors scattered throughout fluently
dictated text is a frustrating task if it must be done in
a hands-free manner. Similar observations were
also reported in (Hawley et al., 2005).
3 TRAINING CENTER
The Rainbow Bridge project is supported by
a 200K Euro grant, of which one half has been
spent on building the training center and the other
half will cover the running costs of a three-year pilot
operation. The center is situated in Prague, in a location
with good access by public and private transport.
3.1 Project Goals
The main goals of the project are:
- Promoting voice technology among
people with physical disabilities who can use it as an
alternative means of interaction with computers.
- Creating a pilot training center with certified
teaching methods (which can later be replicated at
the regional level).
- Teaching basic computer skills to people
whose disability has never allowed them to work
with PCs.
- Giving at least some of the trainees a chance to
use PCs together with voice technology in
their prospective jobs, e.g. in call and help centers,
in voice-scanning of documents, in re-speaking