3 COLLECTIVE CLASSIFICATION
It is an important issue to investigate how objects in a network influence each other, such as how a user's emotions are affected by his/her relationships on Twitter. This can be generalized to the problem of inferring the labels of the entities in the network.
Collective classification can be seen as a relational optimization task for networked data. Depending on the nature of the algorithm, different relational objective functions are optimized within collective inference techniques. Constructed relational features can be used for the inference task. However, most local classifiers accept only fixed-size feature vectors, whereas neighbor counts vary considerably in networked data; a Twitter user, for instance, can have many followers. Although it is not a preferred method, fixed-size feature vectors could be constructed by considering a limited (equal) number of connections for each user, as in the sketch below.
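A minimal sketch of this truncation approach; the graph and attribute representations, the cutoff k, and the padding value are illustrative assumptions:

```python
def fixed_size_neighbor_features(node, graph, attributes, k=5, pad=0.0):
    """Build a fixed-size feature vector from at most k neighbors,
    truncating larger neighborhoods and padding smaller ones."""
    neighbors = sorted(graph[node])[:k]       # equal number of connections per node
    feats = [attributes[n] for n in neighbors]
    feats += [pad] * (k - len(feats))         # pad when fewer than k neighbors exist
    return feats
```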
A more desirable solution is to apply aggregation techniques that summarize a node's neighborhood information. For example, the number of neighbors in each class could be counted and added as a new feature of the node. Class labels may be replaced or complemented by local attributes. For numerical attributes, it is also possible to use statistics such as the minimum, maximum, median, mode, or ratio.
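The following sketch illustrates such neighborhood aggregation; the dictionary-based graph representation and the numerical attribute are assumptions for illustration only:

```python
from collections import Counter
from statistics import median

def aggregate_neighborhood(node, graph, labels, values):
    """Summarize a variable-size neighborhood into fixed-size features:
    per-class neighbor counts plus statistics of a numerical attribute."""
    neighbors = graph[node]
    # Count neighbors per class label (unlabeled neighbors are skipped).
    class_counts = Counter(labels[n] for n in neighbors if n in labels)
    # Summary statistics over a numerical local attribute of the neighbors.
    nums = [values[n] for n in neighbors if n in values]
    stats = {"min": min(nums), "max": max(nums),
             "median": median(nums)} if nums else {}
    return class_counts, stats
```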
On the other hand, for each pair of neighboring nodes, the similarity of their local attributes can be computed exactly. In this study, a similar method is discussed; it is not only implemented as an aggregation method but also used as a weight in relational probability calculations. Perlich and Provost (Perlich and Provost, 2003) discuss aggregation-based feature construction as a relational concept in more detail.
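One way to realize such a similarity weight is sketched below; the cosine measure and the vector-valued attribute representation are assumptions, since the text does not fix a particular similarity function here:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two local attribute vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def edge_weights(graph, attributes):
    """Weight each edge by the attribute similarity of its endpoints,
    for later use in relational probability calculations."""
    return {(u, v): cosine_similarity(attributes[u], attributes[v])
            for u in graph for v in graph[u]}
```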
We can divide collective classification into three components, described as follows (a sketch of how they interact is given after the list):
1. Local (Non-relational) Model. This model is learned for the target (class) variable by using the local attributes of the nodes in the network. For this purpose, classical supervised learning methods such as naive Bayes or decision trees can be employed. In this study, the priors are estimated with a Bayesian approach that uses the available local attributes of a node to estimate its class probabilities.
2. Relational Model. Relational features and links among entities come into prominence in this component. It builds different objective functions to estimate a node's target attribute probabilities from its neighborhood. It is also possible to benefit from the local attributes of the neighboring nodes.
3. Collective Inference. The constructed relational objective functions are generally joint probability distributions based on Markov random fields. Computing these objective functions requires collective inference methods such as iterative classification and relaxation labeling. The aim is to find out how a node's classification is influenced by the classifications of its neighbors in a collaborative setting.
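The sketch below shows how the three components fit together; the function signatures and the simple convergence test are illustrative assumptions rather than a particular toolkit's API:

```python
def collective_classification(graph, attributes, unlabeled,
                              local_model, relational_model, max_iters=100):
    """Three-component pipeline: a local model bootstraps class-probability
    estimates, a relational model re-estimates each node from its
    neighborhood, and collective inference iterates until stable."""
    # 1. Local model: priors from each node's own attributes.
    estimates = {n: local_model(attributes[n]) for n in unlabeled}
    # 2 + 3. Relational model applied inside a collective inference loop.
    for _ in range(max_iters):
        new = {n: relational_model(n, graph, estimates) for n in unlabeled}
        if new == estimates:          # estimates stabilized
            break
        estimates = new
    return estimates
```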
Netkit-SRL (or Netkit) (Macskassy and Provost, 2007) is an open-source network learning toolkit for statistical relational learning. It is written in Java and can be integrated with the Weka (Hall et al., 2009) data mining tool. It allows different types of components to be combined for relational classification on networked data, and new classifier components to be designed and used with different configurations.
Within the scope of this work, we use the following relational models. The weighted-vote relational neighbor classifier (wvrn) produces a weighted mean of the class membership probability estimates of a node's neighbors. The probabilistic relational neighbor classifier (prn) estimates the class label probability of a particular node by multiplying the class prior probabilities of its neighboring nodes. The class-distribution relational neighbor classifier (cdrn-norm-cos) creates an average class vector for each class and then estimates a label for a new node by calculating how close that node's class vector is to each of these class reference vectors. The network-only Bayes classifier (no-bayes) counts the class labels of a node's neighbors and multiplies this value by the prior class distributions; the estimate requires the product, over all neighbors, of the probability of each neighbor's observed class value conditioned on the given node's class value, with each factor raised to the power of the corresponding edge weight. The network-only link-based relational classifier (nolb-lr-distrib) first creates a normalized feature vector for a training node by aggregating its neighbors' class attributes, and then uses logistic regression for relational modeling.
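As an example, a minimal sketch of the wvrn estimate; the data structures are assumptions, and Netkit's actual implementation differs in detail:

```python
def wvrn_estimate(node, graph, weights, estimates, classes):
    """Weighted-vote relational neighbor: a node's class membership
    probability is the weighted mean of its neighbors' current
    class-probability estimates."""
    scores = {c: 0.0 for c in classes}
    total = 0.0
    for nbr in graph[node]:
        w = weights.get((node, nbr), 1.0)    # edge weight, default 1
        total += w
        for c in classes:
            scores[c] += w * estimates[nbr][c]
    return {c: s / total for c, s in scores.items()} if total else scores
```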
The main goal of collective inference algorithms is to infer the unknown class labels of nodes by maximizing the marginal probability distribution represented by the objective functions learned by the relational classifiers. In the null inference setting, the local classifier is applied first, and then the relational classifier is applied only once. Iterative classification predicts the unknown class labels by updating the current state of the graph in each iteration until every node's label stabilizes or a maximum iteration count is reached. Relaxation labeling uses the class estimates of the learned models directly rather than a constant labeling (e.g., null). In this way, it does not discard the information carried by the estimated class probabilities.