in Table 2.
Table 2: Comparison with O-CRF.

                                                        O-CRF              Our Approach
Language                                                English            Japanese
Target data                                             Binary relation    Human activity
Types of sentences that can be handled                  S-V-O              All typical syntax: {O, C}, V; S, {O, C}, V; ...
Relation must occur between entities                    yes                no
Entities must be determined before extraction           yes                no

6 CONCLUSIONS
This paper proposed a novel approach that uses CRFs
and Self-SL to automatically extract all activity
attributes and the relationships between activities
from sentences in Japanese CGM. Without requiring any
hand-tagged data, it achieved high precision while
making only a single pass over its corpus. The paper
also explained how the approach resolves the
limitations of previous work and addresses each of the
challenges of activity extraction.
We are improving the architecture to handle sentences
with more complex or incorrect syntax. Using links
between web pages, we will try to extract relationships
between activities at the document level. As a next
step, we will evaluate our approach on a large data
set. We also plan to build a large semantic network of
human activities by mining human experiences from the
entire CGM corpus.
REFERENCES
Agichtein, E. and Gravano, L. (2000). Snowball: Extracting
relations from large plain-text collections. In Proc.
ACM DL 2000.
Banko, M., Cafarella, M. J., Soderland, S., Broadhead, M.,
and Etzioni, O. (2007). Open information extraction
from the web. In Proc. IJCAI2007, pages 2670–2676.
Banko, M. and Etzioni, O. (2008). The tradeoffs between
traditional and open relation extraction. In Proc. ACL-
08.
Brin, S. (1998). Extracting patterns and relations from the
world wide web. In Proc. EDBT-98, Valencia, Spain,
pages 172–183.
CoNLL (2000). CoNLL-2000 shared task: Chunking.
http://www.cnts.ua.ac.be/conll2000/chunking/.
Etzioni, O., Cafarella, M., Downey, D., Popescu, A.-M.,
Shaked, T., Soderland, S., Weld, D. S., and Yates, A.
(2004). Methods for domain-independent information
extraction from the web: An experimental comparison.
In Proc. AAAI-04.
Fuchi, T. and Takagi, S. (1998). Japanese morphological
analyzer using word co-occurrence - JTAG. In Proc.
ACL-98, pages 409–413.
Google (2009). Google Maps API services.
http://code.google.com/intl/en/apis/maps/documentation/geocoding/.
Jung, Y., Lim, S., Kim, J.-H., and Kim, S. (2009). Web
mining based OALF model for context-aware mobile
advertising system. The 4th IEEE/IFIP Int. Workshop on
Broadband Convergence Networks (BcN-09), pages
13–18.
Kawamura, T., The, N. M., and Ohsuga, A. (2009).
Building of human activity correlation map from weblogs.
In Proc. ICSOFT.
Kudo, T., Yamamoto, K., and Matsumoto, Y. (2004).
Applying conditional random fields to Japanese
morphological analysis. In Proc. EMNLP2004, pages
230–237.
Kurashima, T., Fujimura, K., and Okuda, H. (2009).
Discovering association rules on experiences from
large-scale weblogs entries. In Proc. ECIR 2009, LNCS
vol. 5478. Springer.
Lafferty, J., McCallum, A., and Pereira, F. (2001).
Conditional random fields: Probabilistic models for
segmenting and labeling sequence data. In Proc.
ICML2001.
Matsuo, Y., Okazaki, N., Izumi, K., Nakamura, Y.,
Nishimura, T., and Hasida, K. (2007). Inferring long-
term user properties based on users’ location history.
In Proc. IJCAI2007, pages 2159–2165.
McCallum, A. and Li, W. (2003). Early results for named
entity recognition with conditional random fields,
feature induction and web-enhanced lexicons. In Proc.
CoNLL.
NTT Docomo, Inc. (2009). My life assist service.
http://www.igvpj.jp/contents_en/activity09/ms09/list/personal/ntt-docomo-inc-1.html.
Ozok, A. A. and Zaphiris, P. (2009). Online Communities
and Social Computing. Third International Conference,
OCSC 2009, Held as Part of HCI International 2009,
San Diego, CA, USA. Springer, ISBN-10: 3642027733.
Pasca, M., Lin, D., Bigham, J., Lifchits, A., and Jain, A.
(2006). Organizing and searching the world wide web
of facts - step one: the one-million fact extraction
challenge. In Proc. AAAI-06, pages 1400–1405.
Perkowitz, M., Philipose, M., Fishkin, K., and
Patterson, D. J. (2004). Mining models of human
activities from the web. In Proc. WWW2004.
Phithakkitnukoon, S. and Dantu, R. (2009). A
dimension-reduction framework for human behavioral time
series data. AAAI'09 Spring Symposium on Technosocial
Predictive Analytics, Stanford University, CA.
Poslad, S. (2009). Ubiquitous Computing: Smart Devices,
Environments and Interactions. Wiley, ISBN:
978-0-470-03560-3.
AUTOMATIC MINING OF HUMAN ACTIVITY AND ITS RELATIONSHIPS FROM CGM