tion in our method. Our method processes the bunsetsu
sequence of an input sentence in order from the
beginning to the end, as follows:
1. We store the bunsetsu sequence of an input sentence
in the queue in input order, and empty both the shift
stack and the swap stack.
2. One of five operators (Shift, Shift-Comma, Reduce,
Reduce-Comma and Swap) is repeatedly selected and
applied to the two target bunsetsus. One of the two
target bunsetsus (hereafter, the forward bunsetsu) is
the top bunsetsu of the shift stack. The other
(hereafter, the backward bunsetsu) is the front
bunsetsu of the queue if the swap stack is empty, and
the top bunsetsu of the swap stack otherwise. In
practice, the choice of operators is limited by the
state of the two stacks and the queue, owing to
constraints arising from both the algorithm’s behavior
and Japanese grammar.
3. Step 2 is repeated until both the queue and the swap
stack are empty and only a single tree, whose root is
the end-of-sentence bunsetsu, is left in the shift
stack.
Each operator is defined as follows:
Shift operator specifies that the forward bunsetsu does
not depend on the backward one, and moves the
backward one onto the shift stack.
Shift-Comma operator behaves like the Shift operator
and also inserts a comma between the forward and
backward bunsetsus.
Reduce operator specifies that the forward bunsetsu
depends on the backward bunsetsu, removes the
forward bunsetsu from the shift stack, and adds it
as the first child node of the backward bunsetsu to
form a dependency tree.
Reduce-Comma operator performs the same operation
as the Reduce operator and also inserts a comma
between the forward and backward bunsetsus.
Swap operator decides to swap the order of the
forward and backward bunsetsus, and pushes them
onto the swap stack in this order. The swap stack
lets us reset the previous decisions on dependency
parsing and comma insertion related to the two
target bunsetsus, and redo these processes based on
the swapped word order.
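The queue, the two stacks, and the five operators described above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the names `Node`, `State`, and `flatten` are hypothetical, and the operator constraints mentioned in step 2 are not modelled. The text does not fully specify Swap, so the sketch assumes that Swap dissolves the subtrees of both target bunsetsus, discards commas involving them, and pushes the bunsetsus so that they come off the swap stack in the swapped order; this assumption was chosen to match the push order given in the Figure 2 walkthrough. The demo also assumes that the operator at time 3, which the walkthrough does not state, is Shift, and it uses only the four bunsetsus named in the walkthrough.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    idx: int                      # 1-based position in the input sentence
    text: str
    children: list = field(default_factory=list)


def flatten(node):
    """Dissolve a subtree into a flat list, dependents before their head."""
    out = []
    for child in node.children:
        out.extend(flatten(child))
    node.children = []
    out.append(node)
    return out


class State:
    def __init__(self, bunsetsus):
        self.queue = deque(Node(i, t) for i, t in enumerate(bunsetsus, 1))
        self.shift_stack, self.swap_stack = [], []
        self.commas = set()       # (forward idx, backward idx) pairs

    def backward(self):
        # Top of the swap stack if it is non-empty, else front of the queue.
        return self.swap_stack[-1] if self.swap_stack else self.queue[0]

    def _pop_backward(self):
        return self.swap_stack.pop() if self.swap_stack else self.queue.popleft()

    def shift(self, comma=False):           # Shift / Shift-Comma
        fwd = self.shift_stack[-1] if self.shift_stack else None
        bwd = self._pop_backward()
        if comma and fwd is not None:
            self.commas.add((fwd.idx, bwd.idx))
        self.shift_stack.append(bwd)

    def reduce(self, comma=False):          # Reduce / Reduce-Comma
        fwd, bwd = self.shift_stack.pop(), self.backward()
        if comma:
            self.commas.add((fwd.idx, bwd.idx))
        bwd.children.insert(0, fwd)         # forward becomes the first child

    def swap(self):
        fwd, bwd = self.shift_stack.pop(), self._pop_backward()
        # ASSUMPTION (not stated in the text): both subtrees are dissolved,
        # related commas are dropped, and the new word order is
        # "backward part, then forward part".
        new_order = flatten(bwd) + flatten(fwd)
        moved = {n.idx for n in new_order}
        self.commas = {c for c in self.commas if not moved & set(c)}
        # Push in reverse so the first bunsetsu of the new order is on top.
        self.swap_stack.extend(reversed(new_order))

    def is_final(self):
        return (not self.queue and not self.swap_stack
                and len(self.shift_stack) == 1)


# Times 1-6 of the walkthrough (time 3 is assumed to be Shift).
s = State(["I", "home", "the city", "longing for"])
s.shift(); s.shift(); s.shift()
s.reduce()
s.swap()
s.shift(comma=True)
print([n.text for n in s.shift_stack], [n.text for n in s.swap_stack], s.commas)
```

Under these assumptions, the swap stack after time 5 holds "home", "longing for", "the city" from bottom to top, which is the push order described in the text.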
We describe the concrete flow of our algorithm using
the example in Figure 2. In Figure 2, each box
represents a bunsetsu, and the boxes of the two target
bunsetsus are drawn with a bold frame. In the initial
state at time 1, the bunsetsu sequence of the input
sentence is stored in the queue, and both the shift
stack and the swap stack are empty, so only the front
of the queue is targeted, and Shift is selected. As a
result, “₁I” is pushed onto the shift stack. At time 2,
Shift is selected, and “₂home” is pushed onto the shift
stack because it is assumed that there is neither a
dependency relation nor a comma between “₁I” and
“₂home”.
At time 4, the Reduce operator is chosen and “₃the
city” is removed from the shift stack because there is
a dependency relation, but no comma, between “₃the
city” and “₄longing for”. At time 5, Swap is selected,
and therefore “₂home”, “₃the city” and “₄longing for”
are pushed onto the swap stack in the order “₂home”,
“₄longing for”, “₃the city”. At time 6, Shift-Comma is
selected; consequently, a comma is inserted after “₁I”,
and “₃the city”, which is the top of the swap stack, is
pushed onto the shift stack, since there is no
dependency relation but there is a comma between “₁I”
and “₃the city”. Finally, at time 14, the queue and the
swap stack are empty, and only the final bunsetsu is on
the shift stack, so the process ends.
3.2 Probabilistic Model
In this section, we describe the probabilistic model
used for operator selection in our proposed algorithm.
In this study, we constructed a probabilistic model
that estimates the validity of the processing result
generated by each operator; at each step, the operator
whose processing result has the highest estimated
value is selected.
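The selection rule in this paragraph amounts to an argmax over the operators that are currently allowed. The following sketch illustrates only that rule. The function `select_operator`, the `score` callback, and the toy scores are hypothetical stand-ins, and they do not represent the probabilistic model itself.

```python
from typing import Callable, Iterable

OPERATORS = ("Shift", "Shift-Comma", "Reduce", "Reduce-Comma", "Swap")


def select_operator(legal: Iterable[str], score: Callable[[str], float]) -> str:
    """Return the legal operator whose result gets the highest score.

    `score` is a placeholder for the model's estimate of how valid the
    structure produced by applying an operator would be.
    """
    legal_set = set(legal)
    candidates = [op for op in OPERATORS if op in legal_set]
    if not candidates:
        raise ValueError("no operator is applicable in this state")
    return max(candidates, key=score)


# Toy scores, invented purely for illustration.
toy_scores = {"Shift": 0.2, "Shift-Comma": 0.9, "Reduce": 0.5,
              "Reduce-Comma": 0.1, "Swap": 0.3}
chosen = select_operator(["Shift", "Reduce", "Swap"], toy_scores.__getitem__)
print(chosen)
```

In the toy call, Shift-Comma has the highest score but is not among the legal operators, so it is never considered; this mirrors the restriction on operator choice described in step 2 of the algorithm.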
In the following equation, $f_t$ represents an operator that has been selected at time $t$, and $\boldsymbol{f}_t = f_1 f_2 \cdots f_t$ indicates a sequence of operations from time 1 to time $t$. $B = b_1 b_2 \cdots b_n$ denotes the bunsetsu sequence of an input sentence, and $b_i$ denotes the $i$th bunsetsu in the word order of the input sentence. $S^{\boldsymbol{f}_t}$ represents the structure that results when $\boldsymbol{f}_t$ (the operations up to time $t$) are performed. The structure $S^{\boldsymbol{f}_t}$ is defined as a tuple $S^{\boldsymbol{f}_t} = \langle O^{\boldsymbol{f}_t}, C^{\boldsymbol{f}_t}, D^{\boldsymbol{f}_t} \rangle$, where
$O^{\boldsymbol{f}_t} = \{o^{\boldsymbol{f}_t}_{1,2}, o^{\boldsymbol{f}_t}_{1,3}, \cdots, o^{\boldsymbol{f}_t}_{1,n}, o^{\boldsymbol{f}_t}_{2,3}, \cdots, o^{\boldsymbol{f}_t}_{i,j}, \cdots, o^{\boldsymbol{f}_t}_{n-1,n}\}$,
$C^{\boldsymbol{f}_t} = \{c^{\boldsymbol{f}_t}_{1,2}, c^{\boldsymbol{f}_t}_{1,3}, \cdots, c^{\boldsymbol{f}_t}_{1,n}, c^{\boldsymbol{f}_t}_{2,3}, \cdots, c^{\boldsymbol{f}_t}_{i,j}, \cdots, c^{\boldsymbol{f}_t}_{n-1,n}\}$, and
$D^{\boldsymbol{f}_t} = \{d^{\boldsymbol{f}_t}_{1,2}, d^{\boldsymbol{f}_t}_{1,3}, \cdots, d^{\boldsymbol{f}_t}_{1,n}, d^{\boldsymbol{f}_t}_{2,3}, \cdots, d^{\boldsymbol{f}_t}_{i,j}, \cdots, d^{\boldsymbol{f}_t}_{n-1,n}\}$
are the word order, the comma positions, and the dependency structure, respectively, which were determined by $\boldsymbol{f}_t$.
Here, $o^{\boldsymbol{f}_t}_{i,j}$ ($1 \le i < j \le n$) expresses the order between $b_i$ and $b_j$ after the operation at time $t$: $o^{\boldsymbol{f}_t}_{i,j}$ is 1 if $b_i$ is located before $b_j$ after the operations $\boldsymbol{f}_t$, and is 0 otherwise. In addition, $c^{\boldsymbol{f}_t}_{i,j}$ ($1 \le i < j \le n$) is 1 if there
is a comma between $b_i$ and $b_j$, and is 0 otherwise. Fi-