0.6. Comparing the approaches, we consider that the difference might stem from whether the keywords were obtained from the graph images themselves or from user-assigned keywords. The former clearly yields keywords strongly related to the information in the graphs, whereas the latter might contain less related keywords.
The precision of our method was 0.67, which was greater than that of the previous study (0.55). In our opinion, we may obtain better results if we improve the OCR process and adapt the idea of obtaining tokens from the caption (subject) by selecting other keywords with specific names, such as the name of a protein or a chemical material.
We also observed limitations during the process. The first limitation concerned the pattern of the input data. We partitioned the input into three distinct patterns: (1) one or more tokens of the title of either the X- or Y-axis appearing in the first sentence of the caption; (2) one or more tokens of the titles of both axes appearing in the first sentence of the caption; and (3) no token appearing in the first sentence of the caption. Our system supports inputs with patterns (1) and (2). For inputs with pattern (3), we need to extend our idea in future studies to find the relationship across all sentences in the caption, instead of only the first, because title tokens may appear in later sentences. Moreover, it is important to understand the pattern of the input. Hence, a text mining algorithm may be a candidate for solving this problem, because it can discover patterns in unstructured data.
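As a minimal sketch of this partitioning (the function name, the whitespace tokenization, and the case folding are our own illustration, not the implementation used in our system):

def classify_pattern(x_title_tokens, y_title_tokens, first_sentence):
    # Return the input pattern (1, 2, or 3) according to which
    # axis-title tokens appear in the first sentence of the caption.
    words = set(first_sentence.lower().split())
    x_hit = any(tok.lower() in words for tok in x_title_tokens)
    y_hit = any(tok.lower() in words for tok in y_title_tokens)
    if x_hit and y_hit:
        return 2   # tokens of both axis titles appear
    if x_hit or y_hit:
        return 1   # tokens of exactly one axis title appear
    return 3       # no axis-title token appears (unsupported)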
The second limitation arises when the subject and object are coincidentally the same word. We found only a few such cases in our study, and they had a negligible effect on our results.
The third limitation was that our method was applicable only to inputs containing a single graph. Under this condition, we could clearly understand what the caption meant. If multiple graphs were present in an image, it became difficult to identify which part of the image the caption intended to explain. A method for solving this problem remains an open question that should be addressed in future studies.
6 CONCLUSIONS
In this study, we proposed a method to extract triples from graphs. Our main objective was to address the difficulty of finding relationships between axis titles and a caption.
We applied OCR to extract the text inside the given graphs, but errors from incorrect recognition occurred. Edit distance was employed to reduce these errors by measuring the similarity between tokens in the titles and the caption. The caption token with the minimum distance was used to replace an incorrect output of the OCR process.
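To illustrate this correction step, a minimal sketch in Python follows; the function names are our own, and a production system would likely cap the acceptable distance rather than always substituting the nearest token:

def edit_distance(a, b):
    # Classic Levenshtein distance computed by dynamic programming.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def correct_ocr_token(ocr_token, caption_tokens):
    # Replace an OCR output with the caption token at minimum distance.
    return min(caption_tokens, key=lambda t: edit_distance(ocr_token, t))

For example, correct_ocr_token("lntensity", ["intensity", "time"]) returns "intensity", repairing a typical l/i confusion from OCR.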
Furthermore, we divided the dataset into two groups: one containing bar graphs and the other containing line graphs. We observed that the system could only utilize the Y-axis title in the bar graphs, because the X-axis represented individual categories rather than a single title. Unlike bar graphs, line graphs allowed us to use the titles of both axes. Therefore, the explicit triples extracted from bar graphs were created from the Y-axis title only, and we decided not to create implicit triples from the bar graphs in this study. From line graphs, we obtained both explicit and implicit triples.
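A compact way to express this distinction (the graph-type labels below are our own placeholders):

def usable_axis_titles(graph_type, x_title, y_title):
    # Bar graphs: the X-axis carries category labels rather than a
    # single title, so only the Y-axis title supports triple creation.
    if graph_type == "bar":
        return [y_title]
    return [x_title, y_title]  # line graphs: both axis titles usable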
Overall, each triple comprised a tuple containing a subject, a predicate, and an object. The subject was the first noun of the first sentence of the caption. The dependency parse tree was the crucial tool for defining the predicate: the first verb of the first sentence of the caption represented the predicate, and if we could not detect a verb in the sentence, we instead selected the nearest preposition. The object came from tokens extracted from the titles of the axes of the graph; these tokens also matched words in the caption.
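The following sketch shows how this selection might look with a dependency parser. We use spaCy purely for illustration, as the parsing tool is not the focus here, and the "nearest preposition" fallback is simplified to the first preposition in the sentence:

import spacy

nlp = spacy.load("en_core_web_sm")  # assumed parser for this sketch

def extract_explicit_triple(caption, axis_tokens):
    sent = next(nlp(caption).sents)                                # first sentence of the caption
    subject = next((t for t in sent if t.pos_ == "NOUN"), None)    # first noun
    predicate = next((t for t in sent if t.pos_ == "VERB"), None)  # first verb
    if predicate is None:                                          # no verb: fall back to a preposition
        predicate = next((t for t in sent if t.pos_ == "ADP"), None)
    obj = next((t for t in sent if t.text.lower() in axis_tokens), None)  # axis-title token in caption
    if subject and predicate and obj:
        return (subject.text, predicate.text, obj.text)
    return None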
Finally, the system could create explicit triples. The generation of implicit triples was more difficult; it applied when no axis-title token matched the words of the caption. We believe that the graph itself exhibits obvious relationships between its axes; therefore, we could still create meaningful implicit triples.
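As a hypothetical sketch only (the predicate wording "depends on" is our own placeholder, not necessarily the one used in our system):

def create_implicit_triple(x_title, y_title):
    # The graph itself relates its axes, so link the titles directly.
    return (y_title, "depends on", x_title)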
Consequently, we consider our method accurate and reliable, as reflected in the accuracy and precision values it achieved.
As a future direction, we will extend our method to support other graph types, such as pie graphs and area graphs, by investigating new techniques for detecting graph types and extracting semantic information from them.