BEYOND OPINION MINING
How can Automatic Online Opinion Analysis Help in Product Design?
Ying Liu
Dept. of Industrial and Systems Engineering, Hong Kong Polytechnic University, Hong Kong SAR, China
Keywords: Opinion Mining, Sentiment Analysis, Quality Function Deployment, QFD, Product Design, Design
Informatics.
Abstract: The rapid development of the WWW, information technology and e-commerce has made Internet forums,
e-opinion portals and personal blogs widely accessible to consumers. As a result, it has become
extremely popular for consumers to share their experience and to point out their preferences and concerns
about specific products on the Web. These online customer reviews contain vital information from which
product designers can gain insights into their customers and products, and make improvements accordingly.
However, the sheer amount of data, their distributed locations and the inherent ambiguity of human language
pose great challenges to designers. In this paper, we outline an intelligent system that automatically
gathers online reviews of the products of interest, identifies the product features and customer
requirements, and, most importantly, relates them to the product’s engineering characteristics through
quality function deployment (QFD), a tool widely used by product designers in the customer-driven design
paradigm. We also highlight the challenges and relevant research issues that must be addressed to realize
such a system. We believe that this pioneering study will greatly help designers in the era of global
competition and e-commerce.
1 INTRODUCTION
In recent years, the rapid development of the
Internet, information technology and e-commerce
has made online forums, e-opinion portals and
personal blogs widely accessible to consumers
(B. Liu & Chang, 2004). As a result, it has become
extremely popular for consumers to share their
experience with products and to point out their
preferences and concerns on the Web. In the last few
years, we have witnessed enormous interest in
automatic online opinion analysis in major
research forums such as SIGKDD, WWW, SIGIR and
CIKM (Ding & Liu, 2007; Ding, Liu, & Zhang,
2009; Jiang & Yu, 2009; Qi et al., 2008).
Furthermore, attention to opinion analysis has
quickly spread from its original target, reviews of
consumer products, to many other fields, such as
movie reviews (Zhuang, Jing, & Zhu, 2006),
political and legal issues (Lu et al., 2009; B. Yu,
Kaufmann, & Diermeier, 2008) and blogs (Bossard,
Généreux, & Poibeau, 2009; Jack & Frank, 2007),
with the ultimate goal of better understanding the
preferences of “consumers”, e.g. product end users,
movie audiences and travellers, including
potential ones.
Taking the pioneering work on the analysis of
online customer reviews as an example
(B. Liu, Hu, & Cheng, 2005; Popescu & Etzioni,
2005), these reviews are vital in at least two senses
(Y. Liu, Lu, & Loh, 2007):
On the one hand, by analysing these reviews,
product designers can gain more insight not only
into their own customers and products but also into
their competitors’ customer groups and products.
Strategic adjustments as well as technical
improvements can be made accordingly.
On the other hand, these reviews often become
a major source of guidance for potential consumers
in their purchase decisions. The general perception
that potential consumers gather from such comments
is highly likely to affect their final decision in
selecting a specific brand and model.
Therefore, it is imperative to assist designers in
processing, understanding and taking advantage of
such information, although the sheer amount of data,
its heterogeneous nature, its distributed locations
and the intrinsic ambiguity of language are all
nontrivial issues.
Liu Y.
BEYOND OPINION MINING - How can Automatic Online Opinion Analysis Help in Product Design?.
DOI: 10.5220/0002860203130318
In Proceedings of the 6th International Conference on Web Information Systems and Technology (WEBIST 2010), page
ISBN: 978-989-674-025-2
Copyright © 2010 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
In the customer-driven design paradigm, quality
function deployment (QFD) is one of the essential
tools for interpreting customers’ requirements (the
voice of the customer), relating them to various
engineering characteristics and eventually producing
specific engineering requirements, e.g. target values
(Cohen, 1995). As a widely used tool, QFD has a rich
base of industrial applications, from conceptual
design to process planning, and from consumer
product design to construction project management
(Chan & Wu, 2002).
However, existing studies on opinion mining
focus mainly on feature identification and extraction
and on the associated sentiment analysis (Ding, Liu,
& Yu, 2008; Ding et al., 2009; Gamon, Aue,
Corston-Oliver, & Ringger, 2005; Minqing Hu &
Bing Liu, 2004; S.-M. Kim & Hovy, 2004; Nitin &
Bing, 2008). It has not been reported in the
literature how the design community can actually
benefit from such efforts in opinion mining. In this
paper, we aim to bridge these two communities and
show how designers can take advantage of these
largely untapped sources of customer need
information in a way that is familiar to them, i.e.
through QFD. We expect that the outcome of our
research will give design personnel the agility to
handle the large amount of valuable online
information more effectively and efficiently.
This paper is organized as follows. Section 2
reviews the status quo of both opinion mining and
quality function deployment. Section 3 explains the
overall system architecture along with our research
plan. Section 4 lists the challenges and research
issues that we are interested in. Section 5
concludes.
2 RELATED WORK
2.1 Opinion Mining
In the past few years, there has been clearly
rising interest in the automatic parsing and
analysis of online customer reviews in various major
research forums. Hu and Liu proposed a method that
uses various word features, including occurrence
frequency, part-of-speech tags and synonym sets in
WordNet (M. Hu & B. Liu, 2004). While they called
it summarization, their basic idea is to identify
pairs of a noun and its nearest opinion word.
Popescu and Etzioni proposed the OPINE system,
which uses relaxation labeling to find the semantic
orientation of words (Popescu & Etzioni, 2005).
Their work focuses more on the identification of
opinion orientations, e.g. favor or disfavor. Zhuang
et al. applied similar strategies to the analysis of
movie reviews (Zhuang et al., 2006). Ding et al.
proposed a holistic lexicon-based approach that uses
external evidence and linguistic conventions to
identify the semantic orientation of opinions (Ding
et al., 2008), and later extended the work to product
entity discovery and entity assignment (Ding et al.,
2009). Su et al. studied a mutual reinforcement
approach to feature-level opinions with the goal of
discovering hidden sentiment associations in
Chinese Web pages (Qi et al., 2008). Recently, an
iterative reinforcement scheme based on an improved
information bottleneck (Weifu & Songbo, 2009), a
lexicalized hidden Markov model based learning
framework (Wei & Hung Hay, 2009) and
summarization approaches (Bossard et al., 2009;
Zhan, Loh, & Liu, 2009) have also been reported.
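To make the idea of pairing a noun with its nearest opinion word concrete, the following minimal Python sketch links each candidate feature noun in a sentence to the closest opinion word by token distance. The tiny lexicon, the feature list and the function name pair_features_with_opinions are illustrative assumptions of ours; the cited methods rely on part-of-speech tagging, frequency statistics and WordNet rather than on hand-written word lists.

```python
# Toy opinion lexicon with polarities (assumption; real lexicons are far larger).
OPINION_WORDS = {"great": 1, "sharp": 1, "poor": -1, "weak": -1}
# Candidate feature nouns, assumed to come from an earlier extraction step.
FEATURE_NOUNS = {"battery", "screen", "lens"}


def pair_features_with_opinions(sentence):
    """Pair each feature noun with the nearest opinion word in the sentence."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    opinion_positions = [i for i, t in enumerate(tokens) if t in OPINION_WORDS]
    pairs = []
    for i, tok in enumerate(tokens):
        if tok in FEATURE_NOUNS and opinion_positions:
            # Nearest opinion word measured in token positions.
            nearest = min(opinion_positions, key=lambda j: abs(j - i))
            word = tokens[nearest]
            pairs.append((tok, word, OPINION_WORDS[word]))
    return pairs


pairs = pair_features_with_opinions("The screen is sharp but the battery is poor.")
for feature, opinion, polarity in pairs:
    print(feature, opinion, polarity)
```

Token distance is a crude proxy for syntactic attachment, so a sketch like this will mis-pair features in longer or comparative sentences; it is meant only to illustrate the basic pairing step.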
Other related studies include sentiment analysis
and subjectivity classification, for example, a
word sentiment classification approach (S.-M. Kim
& Hovy, 2004), a bootstrapping process to train a
sentiment classifier (Gamon et al., 2005; Riloff,
Wiebe, & Wilson, 2003), an unsupervised approach
coupled with a Bayesian classifier (H. Yu &
Hatzivassiloglou, 2003), an extraction pattern
learner and a probabilistic subjectivity classifier
using only un-annotated texts (Wiebe & Riloff,
2005), an approach utilizing the linguistic
constraints on the semantic orientations of adjectives
in conjunctions (Hatzivassiloglou & McKeown,
1997), a WordNet approach using the semantic
distance from a word to “good” and “bad” in WordNet
as the classification criterion (Kamps & Marx, 2002),
a latent semantic analysis based approach where
cosine distance is introduced (Turney & Littman,
2003), and finally an approach using semantic
factors (Osgood, Suci, & Tannenbaum, 1957)
and some syntactic information in the feature sets of
a support vector machine (Mullen & Collier, 2004).
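As an illustration of the distance-based idea mentioned above, the sketch below scores a word by comparing its shortest-path distance to "good" and to "bad" in a synonymy graph. The graph here is a small hand-made stand-in for WordNet, and the normalised difference used as the score is one possible formulation that we assume for illustration; it should not be read as the exact measure of the cited work.

```python
from collections import deque

# Toy undirected synonymy graph standing in for WordNet (hypothetical edges).
SYNONYMS = {
    "good": ["fine", "solid"],
    "fine": ["good", "okay"],
    "solid": ["good", "sturdy"],
    "sturdy": ["solid"],
    "okay": ["fine", "mediocre"],
    "mediocre": ["okay", "poor"],
    "poor": ["mediocre", "bad"],
    "bad": ["poor", "flimsy"],
    "flimsy": ["bad"],
}


def distance(start, goal):
    """Shortest path length between two words, found by breadth-first search."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        word, d = queue.popleft()
        if word == goal:
            return d
        for nxt in SYNONYMS.get(word, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return None  # the two words are not connected


def orientation(word):
    """Score in [-1, 1]; positive when the word is closer to 'good' than to 'bad'."""
    d_good, d_bad = distance(word, "good"), distance(word, "bad")
    if d_good is None or d_bad is None:
        return 0.0  # unknown or disconnected words are treated as neutral
    return (d_bad - d_good) / distance("good", "bad")


for w in ("sturdy", "okay", "flimsy"):
    print(w, orientation(w))
```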
2.2 Quality Function Deployment
Originally introduced in Japan in the late 1960s
(Akao, 1990), quality function deployment (QFD)
has gained international acceptance and has become
a widely used tool in the paradigm of customer-
driven product design and manufacturing (Chan &
Wu, 2002; Cohen, 1995). Through the House of
Quality (HoQ), customer requirements are identified
and related to product engineering characteristics,
product planning, part deployment and even
manufacturing operations.
Figure 1: The system architecture of the proposed system.
More recently, the analytic hierarchy process (AHP)
(Saaty, 1980) has been introduced to handle
design concept variations and to calculate the
relative importance of customer preferences (Fung,
Popplewell, & Xie, 1998). The AHP methodology
establishes a design concept hierarchy with
prioritized subordinates and then decomposes the
linguistically expressed customer requirements into
different levels of subordinates and alternatives,
reliably and consistently, according to their
correlation (Saaty, 1980). Because of the intrinsic
ambiguity of human language and the substantial
degree of subjective human judgment involved in the
voice of the customer, in the identification of
engineering features in product characteristics and
in their association (K. J. Kim, Moskowitz, Dhingra,
& Evans, 2000), it has been argued that linguistic
variables expressed as fuzzy numbers are more
appropriate for describing those inputs in QFD
(Chen, Fung, & Tang, 2006). This is in contrast to
previous efforts, where the input variables are
assumed to be precise and are treated as numerical
data only (Griffin & Hauser, 1993; Gustafsson &
Gustafsson, 1994). Recent research efforts mainly
focus on taking advantage of fuzzy set theory
(Zadeh, 1965) in QFD (Harding, Popplewell, Fung,
& Omar, 2001; Kwong & Bai, 2002; Temponi, Yen,
& Tiao, 1999).
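The way AHP turns pairwise judgments into relative importance weights can be sketched as follows. This is a minimal illustration, assuming crisp (non-fuzzy) judgments and using the normalised row geometric mean, a common approximation to the principal eigenvector used in AHP; the three requirement names and the judgment values are hypothetical.

```python
import math

# Hypothetical customer requirements.
requirements = ["battery life", "weight", "price"]

# pairwise[i][j] says how many times more important requirement i is than j.
# The matrix is reciprocal: pairwise[j][i] == 1 / pairwise[i][j].
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]


def ahp_weights(matrix):
    """Approximate AHP priorities by normalised row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


weights = ahp_weights(pairwise)
for name, w in zip(requirements, weights):
    print(f"{name}: {w:.3f}")
```

For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the eigenvector solution; with inconsistent judgments the two can differ slightly, and a consistency check would normally be added.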
3 SYSTEM FRAMEWORK
In summary, the literature review shows that,
while the importance and value of online customer
opinions are well recognized, the idea of
intelligently processing such reviews and feedback
and relating them to QFD, a prevailing tool in the
design community as well as in industry, has not
been studied before. To fill this gap, we propose an
intelligent system that integrates recent research
in data/text/Web mining, information retrieval,
machine learning and QFD.
Figure 1 shows the system architecture. It has
four key elements:
Sources of online opinions and reviews.
An engine for Web information retrieval and
parsing.
A data warehouse of customer requirements
and opinions (orientations).
A knowledge-based fuzzy QFD module.
In terms of the research plan, the proposed project
is divided into two main phases according to the
generic nature of the tasks at different stages.
In Phase 1, we will carry out the research and
development of the Web information retrieval and
parsing engine for online reviews. The major
research tasks include: (1) an information retrieval
engine that gathers the online customer reviews of
interest; (2) a comprehensive strategy that retains
genuine customer opinions by excluding Internet
fraud, invited attacks, duplication and so on; (3) a
machine learning approach that parses and identifies
both salient and obscure product features and
customers’ concerns, together with the associated
opinions, with or without the assistance of a design
knowledge base; (4) multi-facet modeling of the
information based on relevant ingredients, e.g.
brands, features and associated opinions, geographic
locations and age groups, whenever such information
is available.
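As a small illustration of one part of task (2), the sketch below removes near-duplicate reviews by comparing word shingles with the Jaccard similarity. It addresses duplication only, not fraud or invited attacks, and the shingle size, the threshold of 0.6 and all function names are assumptions made for this example rather than design decisions of the proposed system.

```python
def shingles(text, k=3):
    """Set of k-word shingles of a lower-cased review."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def drop_near_duplicates(reviews, threshold=0.6):
    """Keep the first copy of each group of near-duplicate reviews."""
    kept, kept_shingles = [], []
    for review in reviews:
        s = shingles(review)
        # Keep the review only if it is dissimilar to everything kept so far.
        if all(jaccard(s, other) < threshold for other in kept_shingles):
            kept.append(review)
            kept_shingles.append(s)
    return kept


r1 = "The battery life of this phone is really poor"
r2 = "the battery life of this phone is really poor"
r3 = "Great screen and a very solid build quality"
r4 = "The battery life of this phone is really poor indeed"
unique_reviews = drop_near_duplicates([r1, r2, r3, r4])
print(unique_reviews)
```

The pairwise comparison is quadratic in the number of reviews; for Web-scale collections an approximate scheme such as MinHash would be a more realistic choice.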
In Phase 2, we focus on the research and
development of a knowledge based QFD. The major
research tasks include: (1) Design knowledge based
engineering characteristics identification; (2) Weight
computation of product features as well as customer
concerns (of the target product); (3) Competitive
analysis of product features and their weights (of
competitors’ products); (4) Functional relationship
identification and target value specification of
engineering characteristics.
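To indicate how customer requirement weights can be related to engineering characteristics in a House of Quality, the sketch below computes a normalised importance score for each characteristic as the weighted sum of its relationship strengths. The requirements, characteristics, weights and relationship values are all hypothetical, and the 0/1/3/9 strength scale is a common QFD convention that we assume here; the knowledge-based and fuzzy aspects of the proposed module are not modelled.

```python
# Hypothetical customer requirement weights, summing to 1.
requirement_weights = {
    "long battery life": 0.5,
    "light to carry": 0.3,
    "clear display": 0.2,
}

# Hypothetical engineering characteristics (the columns of the matrix).
characteristics = ["battery capacity", "device mass", "screen resolution"]

# Relationship strength between each requirement and each characteristic,
# on an assumed 0 (none) / 1 (weak) / 3 (medium) / 9 (strong) scale.
relationships = {
    "long battery life": [9, 1, 3],
    "light to carry": [3, 9, 0],
    "clear display": [0, 0, 9],
}


def technical_importance(weights, relations, names):
    """Normalised weighted column sums of the relationship matrix."""
    raw = [
        sum(weights[req] * relations[req][j] for req in weights)
        for j in range(len(names))
    ]
    total = sum(raw)
    return {name: value / total for name, value in zip(names, raw)}


importance = technical_importance(requirement_weights, relationships, characteristics)
for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

In the envisioned system, the requirement weights would be derived from the mined opinions rather than entered by hand, and the relationship values would come from the design knowledge base.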
4 CHALLENGES AND
RESEARCH ISSUES
The research and development of the proposed
intelligent system are challenging in several ways. In
the following, we briefly outline some of these
challenges and research issues.
Although it has become a general perception
that online customer reviews are valuable, there is
still no effective and efficient strategy to identify
the review sources and to gather and further exploit
this large quantity of customer need information.
This problem is particularly severe when a company
is new to a specific market.
Internet fraud, invited attacks, duplication and
many other types of misleading messages, with either
good or bad comments, are serious problems that
should be handled carefully. While marketing
researchers may not pay enough attention to this
issue, it is critical in helping product designers
to understand accurately the messages from end
users. A typical example is that a favourable
comment and a critical note questioning certain
product features are of different value to
designers.
The contents of customer messages challenge us
most. There are quite a few tough issues, to name
some: 1) Vague language and different ways of
expression, e.g. different vocabulary and
terminology, affect the identification of product
features and the subsequent tasks that relate them
to QFD. 2) How can a comparative review be
recognized, and how can the extracted features be
attributed correctly to the products under
comparison? 3) Can we rank posts according to the
value of their contents to different interest
groups, e.g. designers?
Most existing QFD studies suffer from a very
limited number of customer surveys, in which
customers’ concerns are underrepresented for
reasons such as cost and data confidentiality.
In contrast, customer need information collected
from the Internet, once successfully filtered, will
provide a sufficient number of data samples to
represent the “big picture” of the product or of a
potential design scheme in the pipeline. This
offers a unique opportunity for further statistical
study of customer behaviors and market trends,
for which a large amount of data is often deemed
necessary. Meanwhile, we also intend to build a
multi-facet model where geographic locations and
ages are available. This provides an interesting
dimension to designers.
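A multi-facet view of this kind can be sketched as a simple aggregation of extracted opinions over whichever facets are available. In the example below, the record layout, the facet names and the use of mean polarity as the summary statistic are all assumptions made for illustration; records lacking a requested facet are skipped, reflecting the fact that such information is not always available.

```python
from collections import defaultdict

# Hypothetical extracted opinions; a facet is None when it is unavailable.
opinions = [
    {"feature": "battery", "polarity": -1, "region": "Asia", "age_group": "18-30"},
    {"feature": "battery", "polarity": -1, "region": "Asia", "age_group": None},
    {"feature": "battery", "polarity": 1, "region": "Europe", "age_group": "31-50"},
    {"feature": "screen", "polarity": 1, "region": "Asia", "age_group": "18-30"},
]


def summarise(records, facets):
    """Mean polarity and count for each combination of the chosen facets."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec.get(f) for f in facets)
        if None in key:
            continue  # skip records where a requested facet is missing
        groups[key].append(rec["polarity"])
    return {key: (sum(v) / len(v), len(v)) for key, v in groups.items()}


by_region = summarise(opinions, ["feature", "region"])
by_age = summarise(opinions, ["feature", "age_group"])
for key, (mean_polarity, count) in by_region.items():
    print(key, mean_polarity, count)
```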
It is also not an easy task to precisely relate
different customer opinions extracted to the
engineering characteristics automatically. A
knowledge mapping needs to be learned from the
design knowledge repository to indicate the relations
between the features identified and their
corresponding product engineering requirements.
With respect to competitor benchmarking, we
envision that our system can provide a rare
opportunity to gather customer comments about
competitors’ similar products and thus allow
designers to keep an eye on their competitors in a
timely and effective manner. This, however, relies
on the success of every previous step.
5 CONCLUSIONS
The analysis of online opinions has become
increasingly popular because of its obvious
implications for customer understanding, marketing
and sales, and product design. In this paper, we
highlight the gap between current approaches to
online opinion analysis and the expectations of the
design community. We argue that one ultimate goal
of opinion analysis is to offer designers a
comprehensive view of customer experience and
feelings and, most importantly, to provide a
collection of clues or evidence that helps designers
better understand the voice of the customer and
hence refine and improve their existing product
offerings accordingly. We have proposed an
intelligent system that intends to tackle this, and
outlined the challenges and research issues. We are
confident that the realization of such a system will
greatly help design.
ACKNOWLEDGEMENTS
The work described is currently supported by two
GRF grants from UGC, Hong Kong SAR, China
(RGC Ref: 520509 and 520208).
REFERENCES
Akao, Y. (1990). Quality Function Deployment:
Integrating Customer Requirements into Product
Design (translated by Glenn Mazur). Cambridge, MA:
Productivity Press.
Bossard, A., Généreux, M., & Poibeau, T. (2009).
CBSEAS, a summarization system integration of
opinion mining techniques to summarize blogs. Paper
presented at the Proceedings of the 12th Conference of
the European Chapter of the Association for
Computational Linguistics: Demonstrations Session.
Chan, L.-K., & Wu, M.-L. (2002). Quality function
deployment: A literature review. European Journal of
Operational Research, 143(3), 463-497.
Chen, Y., Fung, R. Y. K., & Tang, J. (2006). Rating
technical attributes in fuzzy QFD by integrating fuzzy
weighted average method and fuzzy expected value
operator. European Journal of Operational Research,
174(3), 1553-1566.
Cohen, L. (1995). Quality Function Deployment (1st ed.):
Prentice Hall PTR.
Ding, X., & Liu, B. (2007). The utility of linguistic rules in
opinion mining. Paper presented at the Proceedings of
the 30th annual international ACM SIGIR conference
on Research and development in information retrieval.
Ding, X., Liu, B., & Yu, P. S. (2008). A holistic lexicon-
based approach to opinion mining. Paper presented at
the Proceedings of the international conference on
Web search and web data mining.
Ding, X., Liu, B., & Zhang, L. (2009). Entity discovery
and assignment for opinion mining applications. Paper
presented at the Proceedings of the 15th ACM
SIGKDD international conference on Knowledge
discovery and data mining.
Fung, R. Y. K., Popplewell, K., & Xie, J. (1998). An
intelligent hybrid system for customer requirements
analysis and product attribute targets determination.
International Journal of Production Research, 36(1),
13-34.
Gamon, M., Aue, A., Corston-Oliver, S., & Ringger, E.
(2005). Pulse: Mining customer opinions from free
text. Paper presented at the Proceedings of Advances
in Intelligent Data Analysis VI, 6th International
Symposium on Intelligent Data Analysis, IDA 2005,
Madrid, Spain.
Griffin, A., & Hauser, J. R. (1993). The voice of the
customer. Marketing Science, 12(1), 1-27.
Gustafsson, A., & Gustafsson, N. (1994). Exceeding
customer expectations. Paper presented at the
Proceedings of the Sixth Symposium on Quality
Function Deployment.
Harding, J. A., Popplewell, K., Fung, R. Y. K., & Omar,
A. R. (2001). An intelligent information framework
for market driven product design. Computers in
Industry, 44(1), 49-63.
Hatzivassiloglou, V., & McKeown, K. R. (1997).
Predicting the semantic orientation of adjectives.
Paper presented at the Proceedings of the eighth
conference on European chapter of the Association for
Computational Linguistics, Madrid, Spain.
Hu, M., & Liu, B. (2004). Mining and summarizing
customer reviews. Paper presented at the Proceedings
of the 10th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, Seattle,
WA.
Hu, M., & Liu, B. (2004). Mining Opinion Features in
Customer Reviews. Paper presented at the Proceedings
of the Nineteenth National Conference on Artificial
Intelligence, Sixteenth Conference on Innovative
Applications of Artificial Intelligence, AAAI 2004,
San Jose.
Jack, G. C., & Frank, S. (2007). Opinion mining in legal
blogs. Paper presented at the Proceedings of the 11th
international conference on Artificial intelligence and
law.
Jiang, M., & Yu, B. (2009). Proceeding of the 1st
international CIKM workshop on Topic-sentiment
analysis for mass opinion, Hong Kong, China.
Kamps, J., & Marx, M. (2002). Words with attitude. Paper
presented at the Proceedings of the First International
Conference on Global WordNet.
Kim, K. J., Moskowitz, H., Dhingra, A., & Evans, G.
(2000). Fuzzy multicriteria models for quality function
deployment. European Journal of Operational
Research, 121, 504-518.
Kim, S.-M., & Hovy, E. (2004). Determining the
sentiment of opinions. Paper presented at the
Proceedings of the 20th international conference on
Computational Linguistics.
Kwong, C. K., & Bai, H. (2002). A fuzzy AHP approach
to the determination of importance weights of
customer requirements in quality function deployment.
Journal of Intelligent Manufacturing, 13(5), 367-377.
Liu, B., & Chang, K. C.-C. (2004). Editorial: special issue
on web content mining. ACM SIGKDD Explorations
Newsletter, 6(2), 1-4.
Liu, B., Hu, M., & Cheng, J. (2005). Opinion observer:
analyzing and comparing opinions on the Web. Paper
presented at the Proceedings of the 14th international
conference on World Wide Web.
Liu, Y., Lu, W. F., & Loh, H. T. (2007). Knowledge
Discovery and Management for Product Design
through Text Mining - A Case Study of Online
Information Integration for Designers. Paper
presented at the Proceedings of the 16th International
Conference on Engineering Design, ICED'07, Paris,
France.
Luís, S., Paula, C., Mário, J. S., Eugénio, et al. (2009).
Automatic creation of a reference corpus for political
opinion mining in user-generated content. Paper
presented at the Proceeding of the 1st international
CIKM workshop on Topic-sentiment analysis for mass
opinion.
Mullen, T., & Collier, N. (2004). Sentiment analysis using
support vector machines with diverse information
sources. Paper presented at the Proceedings of
Conference on Empirical Methods in Natural
Language Processing (EMNLP).
Nitin, J., & Bing, L. (2008). Opinion spam and analysis.
Paper presented at the Proceedings of the international
conference on Web search and web data mining.
Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1957).
The Measurement of Meaning. Urbana: University of Illinois Press.
Popescu, A.-M., & Etzioni, O. (2005). Extracting Product
Features and Opinions from Reviews. Paper presented
at the Proceedings of Human Language Technology
Conference and Conference on Empirical Methods in
Natural Language Processing, HLT/EMNLP,
Vancouver, B.C., Canada.
Qi, S., Xinying, X., Honglei, G., Zhili, G., Xian, W.,
Xiaoxun, Z., et al. (2008). Hidden sentiment
association in chinese web opinion mining. Paper
presented at the Proceeding of the 17th international
conference on World Wide Web.
Riloff, E., Wiebe, J., & Wilson, T. (2003). Learning
subjective nouns using extraction pattern
bootstrapping. Paper presented at the Proceedings of
the seventh conference on Natural language learning at
HLT-NAACL 2003, Edmonton, Canada.
Saaty, T. L. (1980). The Analytic Hierarchy Process. New
York: McGraw-Hill.
Temponi, C., Yen, J., & Tiao, W. A. (1999). House of
quality: A fuzzy logic-based requirements analysis.
European Journal of Operational Research, 117,
340-354.
Turney, P. D., & Littman, M. L. (2003). Measuring praise
and criticism: Inference of semantic orientation from
association. ACM Transactions on Information
Systems (TOIS), 21(4), 315-346.
Wei, J., & Hung Hay, H. (2009). A novel lexicalized
HMM-based learning framework for web opinion
mining. Paper presented at the Proceedings of the 26th
Annual International Conference on Machine
Learning.
Weifu, D., & Songbo, T. (2009). An iterative
reinforcement approach for fine-grained opinion
mining. Paper presented at the Proceedings of Human
Language Technologies: The 2009 Annual Conference
of the North American Chapter of the Association for
Computational Linguistics.
Wiebe, J., & Riloff, E. (2005). Creating subjective and
objective sentence classifiers from un-annotated texts.
Paper presented at the Proceedings of Conference on
Intelligent Text Processing and Computational
Linguistics (CICLing), Mexico City, Mexico.
Yu, B., Kaufmann, S., & Diermeier, D. (2008). Exploring
the characteristics of opinion expressions for political
opinion classification. Paper presented at the
Proceedings of the 2008 international conference on
Digital government research.
Yu, H., & Hatzivassiloglou, V. (2003). Towards
answering opinion questions: separating facts from
opinions and identifying the polarity of opinion
sentences. Paper presented at the Proceedings of the
2003 conference on Empirical methods in natural
language processing.
Zadeh, L. A. (1965). Fuzzy Sets. Information and
Control, 8(3), 338-353.
Zhan, J., Loh, H. T., & Liu, Y. (2009). Gather Customer
Concerns from Online Product Reviews - A Text
Summarization Approach. Expert Systems With
Applications (ESWA), 36(2 Part 1), 2107-2115.
Zhuang, L., Jing, F., & Zhu, X.-Y. (2006). Movie review
mining and summarization. Paper presented at the
Proceedings of the 15th ACM international conference
on Information and knowledge management.