where k is a predefined number of nearest neighbours
to be considered and $k_c(x)$ is the number of nearest
neighbours of x that are labelled with c. The k-NN
classifier uses (2) to approximate (1); it assigns x to
the most voted class among its k nearest neighbours.
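The voting rule just described can be sketched in a few lines of Python. This is a minimal illustration, not the oracle's implementation: the Euclidean distance, the tie-breaking rule, and the toy `training` set are assumptions introduced here.

```python
import math
from collections import Counter


def knn_classify(x, training, k):
    """Assign x to the most voted class among its k nearest neighbours.

    training is a list of (vector, label) pairs.
    """
    # Assumption: neighbours are ranked by Euclidean distance.
    neighbours = sorted(training, key=lambda s: math.dist(x, s[0]))[:k]
    # votes[c] plays the role of k_c(x): neighbours of x labelled with c.
    votes = Counter(label for _, label in neighbours)
    # Assumption: ties are broken in favour of the smallest class label.
    return min(votes, key=lambda c: (-votes[c], c))


# Hypothetical two-class toy data, for illustration only.
training = [
    ((0.0, 0.0), 1), ((0.1, 0.2), 1),
    ((1.0, 1.0), 2), ((0.9, 1.1), 2), ((1.2, 0.8), 2),
]
label_near_origin = knn_classify((0.2, 0.1), training, k=3)
label_near_ones = knn_classify((1.0, 1.0), training, k=3)
```

With k = 3, the first query has two neighbours of class 1 and one of class 2, so it is assigned to class 1.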
4.2 The Linear Classifier
The second approximation to the Bayes rule is the
well-known linear classifier (for vectorial data):
$$c^*(\vec{x}) \approx \operatorname*{arg\,max}_{c=1,\dots,C} \; g_c(\vec{x}) \qquad (3)$$
where, for each class c, $g_c(\vec{x})$ is its linear discriminant.
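As a minimal sketch of rule (3), assume each linear discriminant has the usual form of a weight vector plus a bias term, i.e. the dot product of the weights with the input, plus a constant. The names `weights` and `biases` and the example values are hypothetical; the file format actually accepted by the oracle is not reproduced here.

```python
def linear_classify(x, weights, biases):
    """Return the class c that maximises the linear discriminant g_c(x).

    weights[c] is the weight vector of class c and biases[c] its bias
    (assumed parametrisation of g_c).
    """
    scores = {
        c: sum(w_i * x_i for w_i, x_i in zip(weights[c], x)) + biases[c]
        for c in weights
    }
    # Iterating over sorted labels makes ties resolve to the smallest label.
    return max(sorted(scores), key=scores.get)


# Hypothetical two-class example in two dimensions.
weights = {1: (1.0, 0.0), 2: (0.0, 1.0)}
biases = {1: 0.0, 2: -0.5}
first = linear_classify((2.0, 1.0), weights, biases)   # g_1 = 2.0, g_2 = 0.5
second = linear_classify((0.0, 2.0), weights, biases)  # g_1 = 0.0, g_2 = 1.5
```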
4.3 The HMM-based Classifier
In contrast to the previous classification techniques,
the third approximation to the Bayes rule is devoted
to symbolic (string) data. This approximation is best
described by first rewriting the Bayes rule as:
$$c^*(x) = \operatorname*{arg\,max}_{c=1,\dots,C} \; \log p(c) + \log p(x \mid c) \qquad (4)$$
where p(c) is the prior probability of class c, and
p(x | c) is its class-conditional probability function.
Then, we assume that each class-conditional proba-
bility function p(x | c) is given by a class-conditional
Hidden Markov Model (HMM) $M_c$, thus:
$$c^*(x) \approx \operatorname*{arg\,max}_{c=1,\dots,C} \; \log p(c) + \log p(x \mid M_c) \qquad (5)$$
This is referred to as the HMM-based classifier.
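Rule (5) can be illustrated with a short Python sketch. It assumes discrete HMMs given as (initial probabilities, transition matrix, per-state emission tables) with no explicit final state, non-empty input strings, and a scaled forward algorithm to compute the log-likelihood. These choices, and the two toy models, are assumptions made here for illustration and not the oracle's model format.

```python
import math


def hmm_log_likelihood(x, init, trans, emit):
    """log p(x | M) for a discrete HMM, via the scaled forward algorithm."""
    n = len(init)
    # alpha[j]: (rescaled) probability of the prefix read so far, ending in state j.
    alpha = [init[j] * emit[j].get(x[0], 0.0) for j in range(n)]
    log_lik = 0.0
    for t in range(len(x)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j].get(x[t], 0.0)
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return -math.inf  # the model cannot generate x
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


def hmm_classify(x, priors, models):
    """Return the class maximising log p(c) + log p(x | M_c)."""
    scores = {
        c: math.log(priors[c]) + hmm_log_likelihood(x, *models[c])
        for c in models
    }
    return max(sorted(scores), key=scores.get)


# Hypothetical left-to-right models over the alphabet {a, b}.
left_to_right = [[0.7, 0.3], [0.0, 1.0]]
models = {
    1: ([1.0, 0.0], left_to_right, [{"a": 0.9, "b": 0.1}, {"a": 0.2, "b": 0.8}]),
    2: ([1.0, 0.0], left_to_right, [{"a": 0.1, "b": 0.9}, {"a": 0.8, "b": 0.2}]),
}
priors = {1: 0.5, 2: 0.5}
label_aabb = hmm_classify("aabb", priors, models)
label_bbaa = hmm_classify("bbaa", priors, models)
```

Model 1 favours strings that start with a's and end with b's, and model 2 the reverse, so the two strings are assigned to classes 1 and 2 respectively.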
5 THE ORACLE
The APP oracle is implemented on a Web-based inter-
face comprising five main pages: start, data, classi-
fiers, submissions, and scores. As its name indicates,
the start page is the initial page to visit (see Fig. 1). It
includes a navigation bar with links to the main pages,
and a body with the evaluation schedule (every day at
23:35 in Fig. 1) and a section of best results for each
classifier-task pair. Each result corresponds to a dif-
ferent submission and includes the test-set error, in
percentage and absolute terms, as well as the submis-
sion date, hour and file name. Also, each section of
best results includes a link to a page where all results
for its corresponding classifier-task pair are listed in
non-decreasing order of test-set error.
The data page simply lists both the vectorial and
symbolic datasets described in Section 3, together
with brief descriptions and links to their training sets.
Figure 1: Start page of the APP oracle.
Analogously, the classifiers page describes the clas-
sification techniques discussed in Section 4, file for-
mats for submissions, and a few examples of baseline
classifiers for different tasks.
The submissions page allows the students to sub-
mit their classifiers individually, or in groups of two.
Each submission is actually an uploaded file associ-
ated with a certain classifier-task pair; that is, learnt
from the training samples of a specific task, and ap-
propriately written in a specific classifier format. The
APP oracle runs periodic evaluations in accordance
with the evaluation schedule shown on the start
page. At each evaluation, the oracle tests uploaded
classifiers on their corresponding test sets and
updates all oracle pages accordingly. Students are not
allowed to submit a new classifier while a previously
submitted one still awaits evaluation. This prevents
“training on the test data” by repeatedly testing
minor classifier variations on the test data.
Finally, the scores page contains a table of student
scores. Although the oracle maintains a complete log
of evaluation results, only the best (test-set) error for
each student in each classifier-task pair is taken into
account. This best error receives a score from 0.1 to
1 only if it is not below a predefined minimum error
for its corresponding classifier-task pair; otherwise, it
is ignored. The precise value from 0.1 to 1 assigned
to it depends on the quality of the error (1 = highest
quality), as compared with other student errors.
The table of student scores shows, for each student
(row), the student identifier (unknown to other
students), the current scores for all classifier-task
pairs, and the global score, which is simply the sum of
the current scores at classifier-task level. It is sorted
in non-increasing order of global scores.
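The bookkeeping described above can be sketched as follows. The sketch covers only the two steps the text specifies: keeping the best error per student and classifier-task pair, and summing per-pair scores into a global score sorted in non-increasing order. The mapping from an error to a score between 0.1 and 1 is not detailed here, so per-pair scores are taken as given. The record layout and all names and values are hypothetical.

```python
def best_errors(log):
    """Keep the lowest test-set error per (student, classifier-task pair).

    log is an iterable of (student, pair, error) records.
    """
    best = {}
    for student, pair, error in log:
        key = (student, pair)
        if key not in best or error < best[key]:
            best[key] = error
    return best


def score_table(pair_scores):
    """Build (student, global score) rows, sorted by non-increasing global score.

    pair_scores maps each student to a dict {pair: score in [0.1, 1]}.
    """
    rows = [(student, sum(scores.values())) for student, scores in pair_scores.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical evaluation log and per-pair scores.
log = [
    ("s1", ("knn", "task1"), 5.0),
    ("s1", ("knn", "task1"), 3.5),
    ("s2", ("knn", "task1"), 4.0),
]
best = best_errors(log)
table = score_table({
    "s2": {("knn", "task1"): 0.5},
    "s1": {("knn", "task1"): 1.0, ("linear", "task1"): 0.5},
})
```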
6 LATEST RESULTS
The oracle stores a complete log file of evaluation re-
sults. The analysis of this file draws interesting con-