<script>
…
Dr. Slope: <S4> Good morning! How are you today?
</S4>
Simon: <S5> I'm fine, Dr. Slope. </S5> <S6> My GP,
Dr. Harding, referred me to you. </S6> <S7> He thought
that you might be able to diagnose the problem with my
leg. </S7>
Dr. Slope: <S8> Well, let's take a look. </S8> <S9>
Hmm, I want to order some tests, but I think you may need
surgery. </S9> <S10> It's a simple procedure and it will
relieve your pain. </S10>
Simon: <S11> So, it's not a high risk operation?
</S11>
Dr. Slope: <S12> No, not at all. It's quite routine.
</S12>
Simon: <S13>Are there any other treatment options?
</S13>
Dr. Slope: <S14> Not that I'd recommend. </S14>
<S15> This is the best course of treatment, in my opinion.
</S15>
…
<description>
…
<D11> Simon says to the doctor, “So, it's not a high
risk operation?” A “high risk” (two words) means that it
could be dangerous. When we say that something is “high
risk,” that means that the surgery or the operation could
cause more problems. Of course, an operation is the noun
that means the same as surgery. </D11> <D12> Dr. Slope
says, “Not at all,” meaning not even a little bit; it's not high
risk. We say, “not at all” means “no,” “not in any way.” Dr.
Slope says, “It's quite routine.” And again, “routine” we
know means it's common, it's quite normal. Notice that the
use of the word “quite;” it's basically the same as it's
“very” routine, very common. It's a little more formal,
when someone says, “It's quite routine,” but they're used
similarly—very and quite—in this case. </D12>
<D13>Simon says, “Are there any other treatment
options?” “Treatment” is another word for what the doctor
gives you or does to you to help you. That's called the
treatment. So you go to the doctor, and the doctor
diagnoses you, and then, he or she gives you a treatment,
maybe some pills or drugs to take. It may be surgery, it
may be changing your exercise or your diet, what you eat.
(“Stop smoking,” for example; that's good advice.) So,
Simon asks what the other treatment options or choices are.
</D13> <D14>Dr. Slope says that there are no other good
treatment options. He says, “Not that I'd recommend,”
meaning there are no other ones that I'd recommend.
</D14>
…
Figure 1: An example of an ESL podcast document. The
description sentence <Di>_</Di> explains the script
sentence <Si>_</Si>.
Scripts and descriptions are distributed primarily as
speech audio files, but transcription text files are
also provided. Although these files are good sources
for ESL education in their own right, more valuable
information can be extracted from them for
DB-CALL systems. In ESL podcast files, the
speaker explains each sentence used in the script
part, and many of its phrases, in detail. If these
descriptions can be extracted together with the
corresponding script sentences or phrases, the
extracted pairs can serve as a database of feedback
information. When a user of a DB-CALL system
wants to know the meaning of a sentence or phrase
generated by the system, the system can present
similar expressions and their descriptions gathered
from ESL podcast files. These descriptions can aid
the user’s understanding more than simple dictionary
definitions, because they give practical usage
examples and alternative expressions used in
real-world conversations. For example, a user may
not understand the sentence “It was quite great”.
If the system detects that the word ‘quite’ is the
source of the difficulty, it retrieves the script and
description parts related to ‘quite’, <S12>_</S12>
and <D12>_</D12> in Figure 1. With these
explanations, the user can learn the detailed meaning
of the word ‘quite’.
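The lookup described above can be illustrated with a minimal sketch. It assumes the transcript has already been annotated with the <Si> and <Di> tags shown in Figure 1, which is the output the extraction method is meant to produce rather than something the raw podcast files contain. The function names `extract_pairs` and `lookup` are hypothetical, and the sample text is abbreviated from Figure 1.

```python
import re

# Matches <S12> ... </S12> or <D12> ... </D12>; tag names follow Figure 1.
TAG_RE = re.compile(r"<([SD])(\d+)>(.*?)</\1\2>", re.DOTALL)


def extract_pairs(text):
    """Return {index: {"script": str, "description": str}} for tagged sentences."""
    pairs = {}
    for kind, num, body in TAG_RE.findall(text):
        entry = pairs.setdefault(int(num), {"script": "", "description": ""})
        key = "script" if kind == "S" else "description"
        entry[key] = " ".join(body.split())  # collapse line breaks and extra spaces
    return pairs


def lookup(pairs, word):
    """Return indices of complete pairs whose script sentence contains `word`."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [
        i
        for i, p in sorted(pairs.items())
        if p["script"] and p["description"] and pattern.search(p["script"])
    ]


# Abbreviated sample in the format of Figure 1 (assumed to be pre-tagged).
sample = """
Dr. Slope: <S12> No, not at all. It's quite routine.
</S12>
Simon: <S13>Are there any other treatment options?
</S13>
<D12> Dr. Slope says, "It's quite routine." It's basically the same
as it's "very" routine, very common. </D12>
<D13>Simon asks what the other treatment options or choices are.
</D13>
"""

pairs = extract_pairs(sample)
hits = lookup(pairs, "quite")
for i in hits:
    print(f"S{i}: {pairs[i]['script']}")
    print(f"D{i}: {pairs[i]['description']}")
```

A real DB-CALL system would presumably match similar expressions rather than exact words; the exact-word match here only keeps the sketch short.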
To build a database from these resources for
DB-CALL systems, we propose a method that extracts
each script sentence and its descriptions from the
ESL podcast text files. The method should be
semi-automatic to reduce the construction cost. For
each sentence in the script part, the corresponding
description sentences are identified by a linear-chain
Conditional Random Field (CRF) classifier (Lafferty,
2001), which reduces the human effort required.
Several features are selected to train the
classifier. The experimental results show that the
proposed method successfully extracts pairs of a
script sentence and its corresponding descriptions.
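One way to picture the classification step is as sequence labeling: the description sentences form a sequence, and each is labeled with the script sentence it explains. The sketch below only builds the per-sentence feature dictionaries and label sequence that a linear-chain CRF toolkit would typically consume. The features shown (word overlap and the presence of a quotation) are illustrative assumptions, not the features used in this paper, which are described in Section 3. The names `tokens` and `sentence_features` are hypothetical.

```python
import re


def tokens(text):
    """Lowercased word set; apostrophes are kept so "it's" stays one token."""
    return set(re.findall(r"[a-z']+", text.lower()))


def sentence_features(desc, script_sentences):
    """Hypothetical features for one description sentence (assumed, not from the paper)."""
    desc_tokens = tokens(desc)
    overlaps = [len(desc_tokens & tokens(s)) for s in script_sentences]
    best = max(range(len(overlaps)), key=overlaps.__getitem__)
    return {
        "best_overlap_index": best,          # script sentence sharing the most words
        "best_overlap_size": overlaps[best],  # number of shared words
        "has_quotation": '"' in desc or "\u201c" in desc,
    }


# Script sentences S11 and S12 from Figure 1.
script_sentences = [
    "So, it's not a high risk operation?",
    "No, not at all. It's quite routine.",
]

# Two description sentences taken from D11 and D12 in Figure 1.
description_sentences = [
    'Simon says to the doctor, "So, it\'s not a high risk operation?"',
    'Dr. Slope says, "It\'s quite routine."',
]

# One feature dict per description sentence, in document order.
X = [sentence_features(d, script_sentences) for d in description_sentences]
# Labels a human annotator would supply for training data.
y = ["S11", "S12"]

for feats, label in zip(X, y):
    print(label, feats)
```

Modeling the sentences as a chain lets the classifier take into account that neighboring description sentences often explain the same script sentence.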
The remainder of this paper is organized as follows.
Section 2 presents related work. Section 3 describes
the proposed method and its features. Section 4
explains the experimental setup. Section 5
presents the evaluation results of our method. Finally,
we conclude the paper.
CSEDU 2010 - 2nd International Conference on Computer Supported Education