However, the accuracy of machine translation is not
sufficient for it to be used in the field of medicine.
This is because any mistake in communication can di-
rectly affect a patient’s life.
Therefore, a translation system for parallel translation
collects multilingual corpora consisting of text related
to the medical field (Yoshino et al., 2009). The system
follows a three-step process: (1) A user registers the
necessary corpus with the system. (2) Other users translate
the corpus into different languages. (3) Each translation
is paired with the original corpus.
The collected corpora are then made available to
other medical support systems. As of now, this trans-
lation system is available in five languages (Japanese,
English, Chinese, Korean, and Portuguese).
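The three-step process described above can be sketched as follows. This is a minimal illustration in Python, not the cited system's implementation: the class names `CorpusEntry` and `ParallelCorpus`, the method names, and the in-memory storage are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CorpusEntry:
    """One registered sentence and its translations, keyed by language."""
    source_lang: str
    source_text: str
    translations: Dict[str, str] = field(default_factory=dict)


class ParallelCorpus:
    """Hypothetical in-memory store; the real system's storage is not described."""

    def __init__(self) -> None:
        self._entries: List[CorpusEntry] = []

    def register(self, lang: str, text: str) -> int:
        """Step 1: a user registers a sentence; returns its entry id."""
        self._entries.append(CorpusEntry(lang, text))
        return len(self._entries) - 1

    def add_translation(self, entry_id: int, lang: str, text: str) -> None:
        """Steps 2 and 3: another user translates the sentence, and the
        translation is stored alongside the original."""
        entry = self._entries[entry_id]
        if lang == entry.source_lang:
            raise ValueError("translation must be in a different language")
        entry.translations[lang] = text

    def parallel_text(self, entry_id: int) -> Dict[str, str]:
        """Return the original and all translations as one parallel record."""
        entry = self._entries[entry_id]
        return {entry.source_lang: entry.source_text, **entry.translations}
```

Keeping each translation attached to its source entry means a downstream medical support system can request a sentence in any registered language without performing translation itself.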
However, this translation system cannot support a
foreigner who is unable to read any of the languages in
which the system is available. In such a case, it is
necessary to provide voice support. Voice data have been
collected for speech translation research before. However,
using these collected voices remains one of the most
challenging issues in the field, since not every voice
can be understood by every listener.
Therefore, the purpose of this research is to collect
accurate voices for providing voice translation support
in the field of medicine (Miyabe et al., 2007).
3 ATTRIBUTES OF UTTERER
Voice data collected in the utterance collection system
are meant to be provided to other voice-based sys-
tems. Therefore, the attributes of the person whose ut-
terances are recorded (the utterer) need to be known.
This system considers the following four attributes of
an utterer.
1. Sex and Date of Birth.
This system records the sex and the date of birth of
the utterer because, before the recorded data are used,
it is necessary to know whether the utterer is male or
female and whether he/she is an adult or a child.
2. Native Language.
Since a person’s native language affects his/her
pronunciation, the system also records the native
language of the utterer.
3. Proficiency in Nonnative Language.
It is necessary to know whether the utterer is
comfortable with the nonnative language in which
he/she is recording. Therefore, we asked the per-
son to rate his/her proficiency in the nonnative
language he/she was prepared to record in. The
definitions of the levels of proficiency are given in
Table 1.
4. Dialect.
The different ways in which a language is spoken
in different countries or in different regions of the
same country are called dialects. Since the quality
of voice data also depends on these differences, the
system also records the dialect spoken by the utterer.
This system classifies the variants of a language into
the following two types:
• Variants in which both pronunciation and characters
are different
• Variants in which words are spelt in almost the
same way but their pronunciation is different
Chinese and Portuguese are examples of the first
type. Chinese has two dialects, Pekingese and
Cantonese. The Portuguese spoken in Portugal is
different from the Portuguese spoken in Brazil. These
variants differ from each other in both pronunciation
and spelling. In particular, the pronunciations of the
two Chinese dialects are so different that speakers of
one cannot understand speakers of the other. English is
an example of the second type. Some words in American
English, British English, and Australian English are
almost identical in spelling but differ in pronunciation.
This system refers to the language variants that
fall in the latter category as "dialects," and the
ones that fall in the former category are classified
as different languages.
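The four attributes above can be gathered into a single record, sketched here in Python for illustration. The field names, the integer encoding of proficiency (the levels themselves are defined in Table 1), and the adult threshold of 18 years are assumptions of this sketch and are not specified in the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Utterer:
    """Hypothetical record of the four utterer attributes."""
    sex: str                            # e.g. "female" or "male"
    date_of_birth: date
    native_language: str
    recording_language: str             # language of the recorded utterance
    proficiency: Optional[int] = None   # assumed integer level; None if native
    dialect: Optional[str] = None       # e.g. "American" for English

    def age_on(self, day: date) -> int:
        """Age in full years on the given day."""
        birthday_passed = (day.month, day.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return day.year - self.date_of_birth.year - (0 if birthday_passed else 1)

    def is_adult_on(self, day: date, adult_age: int = 18) -> bool:
        """adult_age is an assumed threshold; the text does not state one."""
        return self.age_on(day) >= adult_age

    def is_native_speaker(self) -> bool:
        """True when the utterer records in his/her native language."""
        return self.native_language == self.recording_language
```

Under the classification above, Portuguese as spoken in Portugal and in Brazil would be stored as two distinct language values, whereas American, British, and Australian English would share one language value and differ only in the `dialect` field.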
4 DESIGN OF OTOCKER
This section describes the design of a Web-based
multilingual utterance collection system called
OTOCKER. We used PHP as the development language.
In this system, voices are registered and replayed
in a Web browser.
4.1 User Management
This system defines four types of roles for users: man-
ager, voice editor, voice registrant, and voice replay
user.
Table 2 shows the system rights for each role. The
system limits the functions of each role in order to
manage the large number of users.
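Role-based restriction of this kind can be sketched as a lookup from role to permitted actions. The four role names come from the text; the action names and the rights assigned to each role below are assumptions for illustration only, since the actual rights are those given in Table 2. The sketch is written in Python and is not the system's PHP code.

```python
from enum import Enum


class Role(Enum):
    MANAGER = "manager"
    VOICE_EDITOR = "voice editor"
    VOICE_REGISTRANT = "voice registrant"
    VOICE_REPLAY_USER = "voice replay user"


# ASSUMED rights, for illustration only; the real assignment is in Table 2.
ASSUMED_RIGHTS = {
    Role.MANAGER: {"manage_users", "edit_voice", "register_voice", "replay_voice"},
    Role.VOICE_EDITOR: {"edit_voice", "register_voice", "replay_voice"},
    Role.VOICE_REGISTRANT: {"register_voice", "replay_voice"},
    Role.VOICE_REPLAY_USER: {"replay_voice"},
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if the (assumed) rights of the role include the action."""
    return action in ASSUMED_RIGHTS.get(role, set())
```

For example, `is_allowed(Role.VOICE_REPLAY_USER, "register_voice")` evaluates to `False` under these assumed rights, so a replay-only user would be refused at the registration function.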