uct prices) we obtain many different values. In con-
trast, merely 6% of the items in our product catalog
contain a unique value for price, and this is certainly
not atypical for e-commerce. Likewise, preference
queries in practice usually include pairs of indepen-
dent, correlated and anti-correlated attributes at the
same time, yet almost all experiments in the literature
investigate only pure settings. We also observed a
strong influence of outliers in the data set on the
performance of skyline algorithms. Again, in contrast
to synthetic data, commercial catalogs will almost
always contain strong outliers. Interestingly, the
Scalagon algorithm includes a heuristic for detecting
outliers in the pre-filter phase (Endres et al., 2015),
but to the best of our knowledge there are no detailed studies
on outliers in skyline computation. An important ad-
vantage of synthetic data is that it avoids bias (Balke
et al., 2007). Our experiments were based on a single,
albeit typical, e-commerce product catalog, so they
clearly do not allow for a universally valid
interpretation. Still, when preference queries are to be
computed in concrete commercial applications and on
data sets whose statistical properties have been
analyzed, the rich skyline literature, with all its
investigations on synthetic data, does not yet provide
helpful indications of which skyline algorithm to apply.
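The three data properties discussed above (share of unique attribute values, correlation between attribute pairs, and presence of outliers) can be measured on a catalog before choosing an algorithm. The following is only a minimal sketch of such a profiling step and is not taken from our experimental setup: the function names, the toy data, and the use of Tukey's interquartile-range rule for outliers are all assumptions made for illustration.

```python
import statistics
from collections import Counter
from math import sqrt


def unique_value_share(values):
    """Fraction of items whose value occurs exactly once in the column."""
    counts = Counter(values)
    return sum(1 for v in values if counts[v] == 1) / len(values)


def pearson(xs, ys):
    """Pearson correlation of two attributes.

    Values near 0 suggest independence, clearly positive values
    correlation, and clearly negative values anti-correlation.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def iqr_outlier_share(values, k=1.5):
    """Fraction of values outside [Q1 - k*IQR, Q3 + k*IQR].

    Tukey's rule is an assumption of this sketch, not the heuristic
    used by any particular skyline algorithm.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < lo or v > hi) / len(values)


if __name__ == "__main__":
    # Hypothetical toy catalog: prices and customer ratings.
    prices = [9.99, 9.99, 19.99, 19.99, 19.99, 24.50, 29.99, 29.99, 999.00]
    ratings = [3.0, 3.5, 4.0, 4.0, 4.5, 4.0, 4.5, 5.0, 2.0]
    print("unique price share:", unique_value_share(prices))
    print("price/rating correlation:", pearson(prices, ratings))
    print("price outlier share:", iqr_outlier_share(prices))
```

Such figures only describe the data; mapping them to a suitable skyline algorithm remains the open question noted above.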

We gratefully acknowledge the close and inspiring
collaboration with our industry partner Arcmedia AG
(www.arcmedia.ch), and we thank Roland Christen and
Daniel Pfäffli for integrating and testing skyline
algorithms in a professional e-commerce environment
and for the valuable feedback they provided.
