measure function. In particular, we use a diverse
subset of the data instances in an optimization model
to derive a topology with maximum resolution in
discerning dominance with respect to the reference
subset (Ho and Chu, 2005).
To this end, the first step is to render the glyph
unit free by normalizing the data on each dimension
to the unit interval [0, 1]. The second step is to
render the glyph context free by harmonizing the
dimensions as follows. For each attribute, the
quartiles for the values in the entire dataset are
computed. A spline function (Cline, 1974) is
constructed to map these quartiles to the points 0.25,
0.5 and 0.75 of the unit interval. This way, a
hypothetical data instance with all attributes at the
median values of the dataset will assume the shape of a
symmetrical polygon with vertices at the mid-point
of each radial axis. In this frame of reference, all
shapes and sizes are relative to this generic
“average” glyph, and free of either units or specific
context of the attributes. For our exploratory work,
simple second-order (piecewise linear) splines are
used.
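The two rendering steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, min-max scaling is assumed for the unit-free step, the end points 0 and 1 are assumed to map to themselves, and the quartiles of each attribute are assumed to be distinct.

```python
from bisect import bisect_right
from statistics import quantiles


def normalize_unit(values):
    """Min-max scale one attribute to [0, 1] (assumed unit-free step)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("attribute is constant; it cannot be scaled")
    return [(v - lo) / (hi - lo) for v in values]


def piecewise_linear(x, xs, ys):
    """Evaluate the piecewise linear function through the knots (xs, ys)."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1          # knot interval containing x
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def harmonize(values):
    """Map the quartiles of one attribute to 0.25, 0.5 and 0.75."""
    scaled = normalize_unit(values)
    q1, q2, q3 = quantiles(scaled, n=4, method="inclusive")
    xs = [0.0, q1, q2, q3, 1.0]           # end points are an assumption
    ys = [0.0, 0.25, 0.5, 0.75, 1.0]
    return [piecewise_linear(v, xs, ys) for v in scaled]


# A skewed attribute: the outlier no longer compresses the other values.
harmonized = harmonize([1, 2, 3, 4, 100])
```

After harmonization, an instance at the median of an attribute sits at the mid-point of the corresponding radial axis, whatever the original units or skew of that attribute.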
3.1 Dichotic Dominance with respect to Reference Subsets
Next, to determine an optimal topology, we use the
concept of a reference subset of the data instances to
help define dichotic dominance. This concept is best
explained in a medical scenario. Suppose a certain
disease is monitored by a number of symptoms and
tests, with a dichotic prognosis of “life” or “death”.
From the combination of data for any
particular case, the outcome may be difficult to predict. A
reference subset is a collection of non-trivial, non-
obvious cases with known outcomes, namely life or
death. In our exploratory analysis of online auction
markets, there is no factual or expert judgment on
whether any particular case is a “buyers’” or “sellers’”
market. An initial collection from 34 diverse and
well-established markets is used on an ad hoc basis
as the reference subset. An arbitrary configuration of
the attributes within each part of the dichotomy is
selected with the attributes evenly spaced, as in
Figure 1. This is analogous to selecting a portfolio of
stocks to provide an index for a stock market. The
performance of any stock can be gauged relative to
the index, which may be arbitrarily chosen initially.
With better knowledge of the significance of
individual stocks, more useful indices can be
established. By the same token, the choice of
reference subsets for multi-attribute dichotomies can
be adaptively refined as the study progresses.
Once an optimal topology is derived with respect
to a given reference subset, any other data instance,
an online auction market in our case, can be plotted
and visualised as a maximum resolution dichotomy.
Moreover, the total enclosed area in the plot,
including both parts of the dichotomy, may be used
as a relative measure of the overall activity of all the
attributes. We can consider this as an indicator of the
“robustness” of the market. The difference
in the areas of the left and right parts of the
dichotomy, in turn, provides an index of dichotic dominance
between market conditions favouring buyers and
sellers. In our setting, a left dominance favours
sellers, and a right dominance favours buyers.
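The two area-based indicators can be illustrated with a short Python sketch. It rests on assumptions that are not stated in the text: each part of the dichotomy is treated as a fan of radial axes, the enclosed area is taken as the sum of the triangles spanned by adjacent axes, and the region along the boundary between the two parts is ignored. The function names are hypothetical.

```python
import math


def fan_area(radii, angles):
    """Area enclosed by a fan of radial axes.

    radii  -- harmonized attribute values along consecutive axes
    angles -- angle (in radians) between each adjacent pair of axes
    """
    if len(angles) != len(radii) - 1:
        raise ValueError("need one angle per adjacent pair of axes")
    # Each adjacent pair of axes spans a triangle of area (1/2) r0 r1 sin(a).
    return sum(
        0.5 * r0 * r1 * math.sin(a)
        for r0, r1, a in zip(radii, radii[1:], angles)
    )


def dichotomy_indices(left_radii, left_angles, right_radii, right_angles):
    """Return the total area (robustness) and left-minus-right area (dominance)."""
    left = fan_area(left_radii, left_angles)
    right = fan_area(right_radii, right_angles)
    return {"robustness": left + right, "dominance": left - right}


# Three attributes on each side, evenly spaced at right angles.
quarter = math.pi / 2
indices = dichotomy_indices(
    [1.0, 1.0, 1.0], [quarter, quarter],
    [0.5, 0.5, 0.5], [quarter, quarter],
)
```

With the sign convention chosen here, a positive dominance value corresponds to a left dominance (favouring sellers) and a negative value to a right dominance (favouring buyers).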
3.2 A Goal Programming Optimization Model
Subject to the constraints of preserving the
prejudged dominance in the reference subset of
dichotomies, an optimal topology (configuration of
attributes and angles between adjacent pairs) is
sought that maximises the discriminating power, or
resolution, as measured by the sum of absolute
differences in left and right areas for the reference
subset. Such an optimal configuration will be called
a maximum resolution topology (MRT). For any
given configuration of the attributes, maximization
of the discriminating power can be formulated as a
linear program (LP). However, an LP yields
extreme-point solutions, which may reduce some of
the angles between attributes to zero, thus collapsing
the glyph. To avoid such degeneracy,
maximization with bounded variation of the angles
is modelled as a goal program (GP) in Ho and Chu
(2005).
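One way to see why the problem is linear for a fixed configuration is sketched below. The notation is an assumption made for illustration and is not taken from Ho and Chu (2005): $x_{k,i}$ is the harmonized value of attribute $i$ for reference case $k$, $s_k \in \{+1, -1\}$ encodes the prejudged dominance of case $k$ ($+1$ for left), $P_L$ and $P_R$ are the sets of adjacent axis pairs in the left and right parts, and $w_p = \sin\theta_p$ stands for the angle $\theta_p$ between the axes of pair $p$. Any coupling the authors impose across the angles is omitted.

```latex
\begin{aligned}
\max_{w}\quad & \sum_{k=1}^{K} s_k \left( A^{L}_{k} - A^{R}_{k} \right) \\
\text{s.t.}\quad
& s_k \left( A^{L}_{k} - A^{R}_{k} \right) \ge 0, \qquad k = 1, \dots, K, \\
& A^{L}_{k} = \sum_{p \in P_L} a_{k,p}\, w_p, \qquad
  A^{R}_{k} = \sum_{p \in P_R} a_{k,p}\, w_p, \\
& a_{k,p} = \tfrac{1}{2}\, x_{k,i}\, x_{k,i'} \quad \text{for } p = (i, i'), \\
& 0 \le w_p \le 1 .
\end{aligned}
```

Under the dominance constraints, each term $s_k (A^{L}_{k} - A^{R}_{k})$ equals the absolute difference $|A^{L}_{k} - A^{R}_{k}|$, so the objective is the resolution measure described above. Both the objective and the constraints are linear in $w$, so an optimum lies at a vertex of the feasible region, where some $w_p$ may sit at the bound $w_p = 0$, that is, at a zero angle. Bounding the variation of the angles, as the goal program does, keeps the solution away from such vertices.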
4 DSS FOR MRT
To facilitate the computation of a maximum
resolution topology (MRT) for a given set of data
from a multi-attribute dichotomy, an easy-to-use
decision support system (DSS) has been built on
Excel spreadsheet software. The MRT-DSS
has both its front end and report routine
integrated in the same Excel spreadsheet file,
into which the input data records can be placed (for
example, imported from a database) and in which the output
values and MRT-star plots are displayed.
To find the solution, the user only needs to copy
and paste the records of training data (the “reference