Since the Hamming distance between A, B ∈ 2^N is
|A∆B| = |A ∪ B| − |A ∩ B|, it may seem reasonable
to consider distances δ(P, Q) between P, Q ∈ 𝒫^N of
the form δ(P, Q) = f(P ∨ Q) − f(P ∧ Q) with f ∈ F.
Yet, this clearly does not distinguish between different
complements Q, Q′ ∈ CO(P) when P ≠ P_⊥, P^⊤.
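The following minimal sketch illustrates the remark on complements on the ground set {1, 2, 3}. It assumes that partitions are ordered by coarsening (the bottom is the finest partition, the top is the coarsest) and takes f to be the rank, n minus the number of blocks. The helper names `part`, `meet`, `join`, `rank` and `delta` are hypothetical and chosen only for this example.

```python
from itertools import product

def part(*blocks):
    """Build a partition as a frozenset of frozenset blocks."""
    return frozenset(frozenset(b) for b in blocks)

def meet(p, q):
    """Coarsest common refinement: the non-empty pairwise block intersections."""
    return frozenset(a & b for a, b in product(p, q) if a & b)

def join(p, q):
    """Finest common coarsening: merge overlapping blocks transitively."""
    blocks = []
    for new in list(p) + list(q):
        new = set(new)
        rest = []
        for b in blocks:
            if b & new:
                new |= b          # absorb every block that overlaps the new one
            else:
                rest.append(b)
        blocks = rest + [new]
    return frozenset(frozenset(b) for b in blocks)

def rank(p):
    """Rank n - |P| (assumed choice of f; order-preserving under coarsening)."""
    n = sum(len(b) for b in p)
    return n - len(p)

def delta(p, q, f):
    """Candidate distance f(P v Q) - f(P ^ Q)."""
    return f(join(p, q)) - f(meet(p, q))

P = part({1, 2}, {3})
Q1 = part({1, 3}, {2})   # a complement of P
Q2 = part({2, 3}, {1})   # another complement of P

# Both complements have meet = bottom and join = top with P,
# so delta cannot tell them apart.
print(delta(P, Q1, rank), delta(P, Q2, rank))  # prints: 2 2
```

Any other f gives the same effect, since for every complement Q of P the value f(P ∨ Q) − f(P ∧ Q) only depends on f at the top and at the bottom.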
The geometric approach makes it possible to analyze further
partition distances obtained by replacing the size with
alternative partition functions such as entropy, rank and
logical entropy, where the latter two are submodular. In
general, any symmetric and order-preserving partition
function f provides a distance between partitions P, Q by
considering f(P), f(Q) and the value taken either on their
meet, f(P ∧ Q), or on their join, f(P ∨ Q). Specifically, f
defines weights on the edges of the Hasse diagram of
partitions such that the corresponding partition distance
between any P, Q is the weight of a lightest P–Q path.
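The lightest-path view can be sketched on a small ground set as follows. This is an illustration under stated assumptions, not a definitive implementation: the weight of a covering edge is assumed to be the difference of the f values at its endpoints, the lightest path is found with Dijkstra's algorithm, and the helper names (`partitions`, `upper_covers`, `lightest_path_weight`, `rank`, `same_block_pairs`) are hypothetical. The function `same_block_pairs` counts unordered pairs of elements lying in a common block and is used only as a second example of a symmetric, order-preserving f.

```python
import heapq
from itertools import combinations, count

def partitions(elements):
    """Yield every partition of the list `elements` as a frozenset of frozensets."""
    if not elements:
        yield frozenset()
        return
    first, rest = elements[0], elements[1:]
    for smaller in partitions(rest):
        yield smaller | {frozenset([first])}           # `first` alone in a block
        for block in smaller:                          # or added to an existing block
            yield (smaller - {block}) | {block | {first}}

def upper_covers(p):
    """Partitions covering p: merge exactly two blocks of p."""
    for a, b in combinations(p, 2):
        yield (p - {a, b}) | {a | b}

def rank(p):
    """n minus the number of blocks."""
    return sum(len(b) for b in p) - len(p)

def same_block_pairs(p):
    """Number of unordered pairs of elements sharing a block."""
    return sum(len(b) * (len(b) - 1) // 2 for b in p)

def lightest_path_weight(p, q, f, ground):
    """Weight of a lightest p-q path in the Hasse diagram of partitions of `ground`."""
    adj = {x: [] for x in partitions(ground)}
    for x in list(adj):
        for y in upper_covers(x):
            w = f(y) - f(x)   # assumed edge weight; non-negative as f is order-preserving
            adj[x].append((y, w))
            adj[y].append((x, w))
    tie = count()             # tie-breaker so the heap never compares partitions
    dist = {p: 0}
    heap = [(0, next(tie), p)]
    while heap:
        d, _, x = heapq.heappop(heap)
        if x == q:
            return d
        if d > dist[x]:
            continue          # stale heap entry
        for y, w in adj[x]:
            nd = d + w
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(heap, (nd, next(tie), y))
    return None

ground = [1, 2, 3, 4]
P = frozenset({frozenset({1, 2}), frozenset({3, 4})})
Q = frozenset({frozenset({1, 3}), frozenset({2, 4})})
print(lightest_path_weight(P, Q, rank, ground))              # prints: 2
print(lightest_path_weight(P, Q, same_block_pairs, ground))  # prints: 4
```

In this example the lightest path under the rank passes through the join (the top partition), whereas under `same_block_pairs` it passes through the meet (the bottom partition), which matches the two alternatives, meet or join, mentioned in the text.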
ICPRAM 2016 - International Conference on Pattern Recognition Applications and Methods