maximum value and minimum value.
Currently, the prototype can only use the maximum value that exists in the dataset, or it requires a dataset with a clear boundary. For example, in the age dataset the maximum value is 90, so the maximum value in the prototype is also 90; it is based entirely on the range of the dataset.
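The dependence on the dataset range can be illustrated with a minimal sketch. It assumes the domain is simply the minimum and maximum of the data; the function name `domain_bounds` and the optional `lower`/`upper` overrides are hypothetical and are not part of the prototype described here.

```python
from typing import Optional, Sequence, Tuple


def domain_bounds(values: Sequence[float],
                  lower: Optional[float] = None,
                  upper: Optional[float] = None) -> Tuple[float, float]:
    """Return the (min, max) domain over which categories are defined.

    With no overrides, the bounds come from the data alone, which mirrors
    the limitation described above. The overrides are a hypothetical
    extension that lets a user supply a boundary the data does not reach.
    """
    if not values:
        raise ValueError("values must not be empty")
    lo = min(values) if lower is None else lower
    hi = max(values) if upper is None else upper
    if lo >= hi:
        raise ValueError("lower bound must be smaller than upper bound")
    return lo, hi


# Illustrative values only, not the age dataset used in the paper.
ages = [3, 17, 25, 42, 67, 90]
print(domain_bounds(ages))             # (3, 90): bounds taken from the data
print(domain_bounds(ages, upper=120))  # (3, 120): user-supplied upper bound
```

Keeping the data-driven bounds as the default preserves the current behavior, while an explicit override would be one possible way to handle datasets without a clear boundary.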
Limitation in categorizing quantities with rounded values: When we categorize the age dataset, the resulting category boundaries are not integers, which leads to unnatural categories, e.g., the category Children ranges from 0.99 to 17.97 years old.
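One possible remedy, sketched below under stated assumptions, is to snap each boundary to the nearest multiple of a chosen step after the categories are generated. The function `round_intervals` and its `step` parameter are hypothetical names used only for illustration; this is not the behavior of the current prototype.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]


def round_intervals(categories: Dict[str, Interval],
                    step: float = 1.0) -> Dict[str, Interval]:
    """Snap every interval boundary to the nearest multiple of `step`."""
    if step <= 0:
        raise ValueError("step must be positive")

    def snap(x: float) -> float:
        # Python's round() resolves exact .5 ties to the nearest even integer.
        return round(x / step) * step

    return {name: (snap(lo), snap(hi)) for name, (lo, hi) in categories.items()}


# The boundaries reported above for the category Children.
categories = {"Children": (0.99, 17.97)}
print(round_intervals(categories))  # {'Children': (1.0, 18.0)}
```

A step of 1 yields whole years for the age example; other datasets may call for a different step, which is why it is left as a parameter.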
Our main perspective is to validate our approach through a formal evaluation that assesses how the fuzzy membership function helps a user categorize quantities. The evaluation includes three steps: 1) manual category creation; 2) use of the fuzzy membership function; and 3) a post-study questionnaire.
In manual category creation, a major question to answer is how a user creates categories for datasets such as the characters in The Simpsons, since in this paper we only reported on typically used, manually created categories to illustrate our approach. In this evaluation step, a questionnaire will ask the user to focus on the number of categories, their intervals, and their names, and we will log the results for remote analysis.
Our study protocol will be as follows (using the prototype shown in Fig. 3(a)): Step 1: we introduce FUZZYCUT to participants and explain how to use and operate it, which will take 10 minutes. In this part, the user learns how to change the parameters of the fuzzy membership function and how to interact with FUZZYCUT to generate and adjust categories. Step 2: those parameters, the categories, and their intervals are saved remotely by recording every interaction. Step 3: all the information collected from a user is organized and analyzed to support a formalized function, which is important for future work on developing specific mapping functions, because the collected information will be the main evidence for mapping specific quantities. A post-study questionnaire will collect feedback on whether the tool influences participants’ original intent and on how their categories change after using it.
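The recording in Step 2 could be structured as in the following minimal sketch. It assumes each interaction is stored as one record holding the parameters, the category, and its interval; the class names, field names, and the JSON Lines serialization are hypothetical choices, and the remote upload itself is not shown.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Tuple


@dataclass
class InteractionEvent:
    """One logged interaction (all field names are illustrative)."""
    participant_id: str
    category: str
    interval: Tuple[float, float]
    parameters: Dict[str, float]
    timestamp: float = field(default_factory=time.time)


class InteractionLog:
    """Collects events in memory and serializes them for later analysis."""

    def __init__(self) -> None:
        self.events: List[InteractionEvent] = []

    def record(self, event: InteractionEvent) -> None:
        self.events.append(event)

    def to_json_lines(self) -> str:
        # One JSON object per line; tuples are written as JSON arrays.
        return "\n".join(json.dumps(asdict(e)) for e in self.events)


log = InteractionLog()
log.record(InteractionEvent("p01", "Children", (0.0, 18.0), {"slope": 0.5}))
print(log.to_json_lines())
```

Storing one self-contained record per interaction keeps the order of adjustments intact, which is what the analysis in Step 3 would need in order to trace how categories evolve.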
7 CONCLUSION
This paper introduced an interactive visualization technique to assist a user in categorizing quantities. We relied upon a well-known function, the fuzzy membership function from fuzzy logic theory, which we implemented as an interactive prototype. We illustrated its use in three case studies: age, temperature, and taxi speed data. As the prototype enables an explicit mapping of the categorization function, we plan to use it to trace the process a user follows when creating categories, in particular to understand whether there is consensus between groups of users regarding the choice of category intervals, labeling, and confidence. We also plan to use the interactive function to communicate this mapping, e.g., as an interactive legend (Park and Park, 2010b), for both visual communication and exploration of datasets.
ACKNOWLEDGMENTS
This work was partially supported by the Chinese Scholarship Council (CSC). This work was also partially supported by the M2I project on Urban Mobility funded by the French Agency for Durable Development (ADEME).
REFERENCES
Alsallakh, B., Hanbury, A., Hauser, H., Miksch, S., and Rauber, A. (2014). Visual methods for analyzing probabilistic classification data. IEEE transactions on visualization and computer graphics, 20(12):1703–1712.
Bostock, M., Ogievetsky, V., and Heer, J. D³ data-driven documents. 17(12):2301–2309.
Brodlie, K., Osorio, R. A., and Lopes, A. (2012). A review
of uncertainty in data visualization. In Expanding the
frontiers of visual analytics and visualization, pages
81–109. Springer.
Cao, N., Lin, Y.-R., and Gotz, D. (2015). Untangle map:
Visual analysis of probabilistic multi-label data. IEEE
transactions on visualization and computer graphics,
22(2):1149–1163.
Clifford, H. T. and Stephenson, W. (1975). An introduction to numerical classification, volume 240. Academic Press New York.
Cover, T. and Hart, P. (1967). Nearest neighbor pattern classification. IEEE transactions on information theory, 13(1):21–27.
Dong, X. and Hayes, C. C. (2012). Uncertainty visualizations: Helping decision makers become more aware of uncertainty and its implications. Journal of Cognitive Engineering and Decision Making, 6(1):30–56.
Dressel, J. and Nori, F. (2014). Certainty in Heisenberg’s uncertainty principle: revisiting definitions for estimation errors and disturbance. Physical Review A, 89(2):022106.
Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. Dbscan:
A density-based algorithm for discovering clusters in
large spatial databases with noise. In Proc. 1996
Int. Conf. Knowledge Discovery and Data Mining
(KDD’96), pages 226–231.
Lin, Y.-R., Cao, N., Gotz, D., and Lu, L. (2014). Untangle:
visual mining for data with uncertain multi-labels via
Categorizing Quantities using an Interactive Fuzzy Membership Function