Table 6: The scores of LSTM.
Excluded   Recall  Prec.  F_β    Matrix
–          0.788   0.670  0.684  (67 18 33 64)
Noun       0.642   0.612  0.616  (52 29 33 54)
PP         0.753   0.615  0.631  (64 21 40 57)
Verb       0.609   0.709  0.693  (56 36 23 67)
AV         0.545   0.714  0.685  (55 46 22 59)
Sym.       0.681   0.627  0.634  (64 30 38 49)
Adj.       0.698   0.615  0.625  (67 29 42 44)
Adv.       0.529   0.672  0.648  (45 40 22 75)
Int.       0.591   0.658  0.648  (52 36 27 66)
PA         0.653   0.711  0.702  (64 34 26 58)
Filler     0.625   0.640  0.637  (55 33 31 63)
Conj.      0.624   0.624  0.624  (53 32 32 65)
Verb, AV   0.717   0.724  0.723  (71 28 17 56)
median is 17.5, so the median tends to be lower, as
with the SVM. Also, as with the SVM and the CNN, the
difficult data predicted as easy often include URLs.
Auxiliary verbs often express the past, completion,
or affirmation, for example:
(the moon is beautiful so-so). This POS is
not useful for estimating emotions because it is hard
to associate any emotion with these auxiliary verbs.
However, sentences that include an auxiliary verb
acting as a negative expression, such as (no),
are regarded as highly difficult for emotion
estimation, so it is not always correct to exclude
all auxiliary verbs.
5.3 Compare Methods with the Baseline
Table 7 extracts the results shown in section 5.2 and
compares them with the baseline. Based on the F_β
values, the CNN trained on features excluding
interjections is adopted as the classifier for deciding
the difficulty of emotion estimation. The baseline,
which uses word similarity, decides that almost all
data are difficult; word similarities appear to make it
easy for data to be decided as difficult. In contrast,
the methods based on distributed word representations
correctly decide over 60% of both the easy and the
difficult data, so their decisions are not biased. For
the CNN in particular, the proportion of easy data
correctly predicted to be easy is over 90%.
Table 7: Comparison of the methods with the baseline.
Method (Excluded POS)   Recall  Prec.  F_β
Baseline                0.901   0.519  0.552
SVM(PP)                 0.636   0.644  0.643
CNN(Int.)               0.787   0.914  0.894
LSTM(Verb, AV)          0.717   0.724  0.723
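The recall, precision, and F_β columns can be related to a confusion matrix with a minimal sketch such as the following. It rests on two assumptions that this section does not state: that the four matrix entries in Table 6 are ordered (TP, FN, FP, TN), and that β = 0.4. Both were chosen only because they reproduce the first row of Table 6 to about three decimals; the function and class names are hypothetical.

```python
from typing import NamedTuple


class Scores(NamedTuple):
    recall: float
    precision: float
    f_beta: float


def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F_beta: weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def scores_from_matrix(tp: int, fn: int, fp: int, tn: int, beta: float) -> Scores:
    """Compute recall, precision, and F_beta from confusion-matrix counts.

    The (tp, fn, fp, tn) ordering is an assumption about Table 6.
    tn is accepted for completeness but is not needed by these metrics.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return Scores(recall, precision, f_beta(precision, recall, beta))


if __name__ == "__main__":
    # First row of Table 6 (no POS excluded); beta = 0.4 is an assumed value.
    row = scores_from_matrix(67, 18, 33, 64, beta=0.4)
    print(f"recall={row.recall:.3f} prec={row.precision:.3f} F={row.f_beta:.3f}")
```

With β below 1, precision is weighted more heavily than recall, which is consistent with the baseline scoring low in Table 7 despite its high recall.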
6 BUILDING A DIFFICULTY DECISION SYSTEM
In sections 4 and 5, we showed that the difficulty of
emotion estimation can be decided from the existence
of emotive expressions and by classifiers. In this
section, we build a system for deciding the difficulty
of emotion estimation that combines these decision
methods. Section 6.1 describes the construction of the
system, and section 6.2 describes its evaluation.
6.1 The Construction of the System
This system receives Japanese sentences and returns
the difficulty of emotion estimation, “high difficulty”
or “low difficulty”, for each sentence. Internally, the
system decides by a combination of 3 conditions:
(1) existence of negative expressions, (2) existence of
emotive expressions, and (3) prediction by classifiers.
Naive Bayes is used to decide the existence of negative
expressions (Yamashita et al., 2019). A sentence that
includes negative expressions is considered “high
difficulty”, so if Naive Bayes decides that a sentence
includes negative expressions, the sentence is decided
to be high difficulty. Whether a sentence includes
emotive expressions is decided, as proposed in
section 4, by whether the word similarity score is over
0.7. A sentence that includes emotive expressions is
regarded as easy for emotion estimation, so a sentence
predicted to include emotive expressions is decided to
be low difficulty. For prediction by classifiers, the
difficulty is decided by the classifiers proposed in
section 4.
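The combination of the three conditions can be sketched as a simple cascade. This is only an illustration under stated assumptions: the order in which the conditions are checked is assumed, and the three callables (`has_negative`, `emotive_similarity`, `classifier_says_difficult`) are hypothetical stand-ins for the Naive Bayes detector, the word similarity score, and the trained classifier, none of which are implemented here.

```python
from typing import Callable

HIGH = "high difficulty"
LOW = "low difficulty"


def decide_difficulty(
    sentence: str,
    has_negative: Callable[[str], bool],
    emotive_similarity: Callable[[str], float],
    classifier_says_difficult: Callable[[str], bool],
    threshold: float = 0.7,
) -> str:
    """Decide the difficulty of emotion estimation for one sentence.

    The check order (negative, emotive, classifier) is an assumption.
    """
    # (1) A detected negative expression makes the sentence high difficulty.
    if has_negative(sentence):
        return HIGH
    # (2) A similarity score over the threshold signals an emotive
    #     expression, so the sentence is treated as low difficulty.
    if emotive_similarity(sentence) > threshold:
        return LOW
    # (3) Otherwise fall back to the classifier's prediction.
    return HIGH if classifier_says_difficult(sentence) else LOW


if __name__ == "__main__":
    # Toy stand-ins, purely for demonstration.
    label = decide_difficulty(
        "example sentence",
        has_negative=lambda s: False,
        emotive_similarity=lambda s: 0.82,
        classifier_says_difficult=lambda s: True,
    )
    print(label)
```

Passing the three components as callables keeps the sketch independent of any particular model, so a step can be dropped (for example, by passing a function that always returns a score of 0.0) to mimic a two-step configuration.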
6.2 Evaluation of the System
To evaluate the system, we use annotation data in
which 8 people who know the writer (author) annotated
254 of the author’s tweets. These data are not included
in the data used in any of the experiments above. As in
section 4, we separate these data into difficult data
and easy data.
The evaluation results are shown in Table 8. Deciding
in 2 steps, the existence of negative expressions
followed by classifiers, gives the best score. 70% of
the difficult data are correctly predicted, but 20% of
the easy data could not be predicted correctly. In the
case of the decision that includes emotive expressions,
one of the features of the FN (= False Negative) data,
which are actually difficult but predicted easy, is
that they include (want to do). This expression shows
the writer’s hope or request, but usually it is not
written what kind of emotion the writer would actually
get from doing it, so the emotive expressions are hard
to detect. In FP (=