our context that related tasks share some common structure or similar model parameters (Evgeniou and Pontil, 2004), assuming that one task is the former system and the second one is the updated system. The idea has also been used to solve one-class classification problems by (Yang et al., 2010; He et al., 2014), but both of them are restricted to the situation where the related tasks lie in the same feature space. In (Xue and Beauseroy, 2016), a new multi-task learning model is proposed to solve the detection problem when an additional new feature is added, providing a smooth transition from the old detection system to the new modified one. However, in some cases the kernel matrix in that model is not positive semi-definite, which means that some approximation in a positive semi-definite subspace must be considered to perform the detection.
In this paper, a new approach is proposed to avoid that issue. As shown in section 2.2, we can divide the kernel matrix into two parts: one part is based on the old features and the second part is based on the newly added feature. After a typical estimation method is applied to fill in the corresponding new feature in the old detection system, in order to obtain a positive semi-definite matrix, a specific variable kernel is used in the second kernel matrix (which is based on the new feature) to control the impact of the new feature on the detection according to the amount of collected new data.
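This mechanism can be sketched in numpy. The data, the bandwidths and the product combination of the two kernel parts below are illustrative assumptions, not the exact model of section 2.2; the product is a natural choice because a Gaussian kernel over concatenated features factorizes into a Gaussian kernel on the old features times one on the new feature:

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between row-sample arrays A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

rng = np.random.default_rng(0)
X_old = rng.normal(size=(8, 3))   # samples described by the old features
x_new = rng.normal(size=(8, 1))   # newly added feature (imputed for old samples)

K_old = gaussian_kernel(X_old, X_old, sigma=1.0)    # part based on the old features
sigma_new = 5.0                                     # large bandwidth -> weak influence
K_new = gaussian_kernel(x_new, x_new, sigma_new)    # part based on the new feature

# Elementwise product of two PSD Gaussian kernel matrices is PSD (Schur product)
K = K_old * K_new
```

Because the Hadamard product of two positive semi-definite kernel matrices is again positive semi-definite, the combined matrix avoids the indefiniteness issue mentioned above; shrinking `sigma_new` as more new data is collected increases the new feature's influence, while a very large `sigma_new` makes `K_new` close to the all-ones matrix, so the new feature is essentially ignored.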
The paper is organised as follows. In section 2, we propose the approach of using the multi-task learning idea to solve one-class SVM problems with the same features and with additional new features, respectively. Then we demonstrate the effectiveness of the proposed approach through experimental results in section 3. Finally, we give conclusions and future work in section 4.
2 MULTI-TASK LEARNING FOR
ONE CLASS SVM
For the one-class transfer learning classification problem, two kinds of situations may occur depending on whether the source task and the target task share the same feature space (homogeneous case) or not (heterogeneous case). To study the heterogeneous case, we consider the situation of adding new features one by one in the target task to simulate the modification or evolution of an existing detection system.
2.1 Homogeneous Case
Consider the case of a source task (with data set $X_1 \in \mathbb{R}^p$) and a target task (with data set $X_2 \in \mathbb{R}^p$) in the same space. For the source task, a good detection model can be trained based on a large number of samples $n_1$. After the maintenance or modification of the system, we have only a limited number of samples $n_2$ during a period of time. Intuitively, we may either try to solve the problem by considering independent separate tasks or treat them together as one single task.
Inspired by references (Evgeniou and Pontil, 2004) and (He et al., 2014), a multi-task learning method which tries to balance between the two extreme cases was proposed by (Xue and Beauseroy, 2016). The decision function for each task $t \in \{1, 2\}$ (where $t = 1$ corresponds to the source task and $t = 2$ corresponds to the target task) is defined as:
$$f_t(x) = \mathrm{sign}(\langle w_t, \varphi(x) \rangle - 1), \qquad (1)$$
where $w_t$ is the normal vector to the decision hyperplane and $\varphi(x)$ is the non-linear feature mapping. In the chosen multi-task learning approach, the weight vector of each task $w_t$ can be divided into two parts: one part is the common mean vector $w_0$ shared among all the learning tasks, and the other part is the specific vector $v_t$ for a specific task:
$$w_t = \mu w_0 + (1 - \mu) v_t, \qquad (2)$$
where $\mu \in [0, 1]$. When $\mu = 0$, then $w_t = v_t$, which corresponds to two separate tasks, while $\mu = 1$ implies that $w_t = w_0$, which corresponds to one single global task. Based on this setting, the primal one-class problem can be formulated as:
$$
\begin{aligned}
\min_{w_0, v_t, \xi_{it}} \quad & \frac{1}{2}\mu \|w_0\|^2 + \frac{1}{2}(1-\mu)\sum_{t=1}^{2} \|v_t\|^2 + C \sum_{t=1}^{2}\sum_{i=1}^{n_t} \xi_{it} \\
\text{s.t.} \quad & \langle \mu w_0 + (1-\mu) v_t, \varphi(x_{it}) \rangle \ge 1 - \xi_{it}, \quad \xi_{it} \ge 0,
\end{aligned} \qquad (3)
$$
where $t \in \{1, 2\}$, $x_{it}$ is the $i$-th sample from task $t$, $\xi_{it}$ is the corresponding slack variable and $C$ is the penalty parameter.
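As a sketch of the standard derivation, included for completeness, the Lagrangian of (3) and its stationarity conditions are:

```latex
L = \frac{\mu}{2}\|w_0\|^2 + \frac{1-\mu}{2}\sum_{t=1}^{2}\|v_t\|^2
    + C\sum_{t=1}^{2}\sum_{i=1}^{n_t}\xi_{it}
    - \sum_{t,i}\alpha_{it}\bigl(\langle \mu w_0 + (1-\mu)v_t,
      \varphi(x_{it})\rangle - 1 + \xi_{it}\bigr)
    - \sum_{t,i}\beta_{it}\,\xi_{it},

\frac{\partial L}{\partial w_0} = 0 \;\Rightarrow\;
  w_0 = \sum_{t,i}\alpha_{it}\,\varphi(x_{it}),
\qquad
\frac{\partial L}{\partial v_t} = 0 \;\Rightarrow\;
  v_t = \sum_{i}\alpha_{it}\,\varphi(x_{it}),

\frac{\partial L}{\partial \xi_{it}} = 0 \;\Rightarrow\;
  \alpha_{it} + \beta_{it} = C \;\Rightarrow\; 0 \le \alpha_{it} \le C.
```

Substituting these expressions back eliminates $w_0$ and $v_t$: same-task inner products receive the coefficient $\mu + (1-\mu) = 1$ and cross-task ones the coefficient $\mu$, which yields exactly the matrix $K_\mu$ appearing in the dual.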
Based on the Lagrangian, the dual form could be
given as:
$$
\begin{aligned}
\max_{\alpha} \quad & -\frac{1}{2}\alpha^T K_\mu \alpha + \alpha^T \mathbf{1} \\
\text{s.t.} \quad & 0 \le \alpha \le C\mathbf{1},
\end{aligned} \qquad (4)
$$
where $\alpha^T = [\alpha_{11}, \ldots, \alpha_{n_1 1}, \alpha_{12}, \ldots, \alpha_{n_2 2}]$ and
$$
K_\mu = \begin{pmatrix} K_{ss} & \mu K_{st} \\ \mu K_{st}^T & K_{tt} \end{pmatrix} \qquad (5)
$$
is a modified Gram matrix, with $K_{ss} = \langle \varphi(X_1), \varphi(X_1) \rangle$, $K_{st} = \langle \varphi(X_1), \varphi(X_2) \rangle$ and $K_{tt} = \langle \varphi(X_2), \varphi(X_2) \rangle$, which means that we can solve the problem with a classical one-class SVM using a specific kernel (we use a Gaussian kernel in this paper).
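Assembling the modified Gram matrix $K_\mu$ of Eq. (5) from Gaussian kernels can be sketched in numpy as follows (the sample sizes, dimension and bandwidth are arbitrary illustrative choices):

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between row-sample arrays A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

rng = np.random.default_rng(0)
n1, n2, p = 40, 10, 5
X1 = rng.normal(size=(n1, p))   # source task: many samples
X2 = rng.normal(size=(n2, p))   # target task: few samples
mu = 0.5

K_ss = gaussian_kernel(X1, X1)
K_st = gaussian_kernel(X1, X2)
K_tt = gaussian_kernel(X2, X2)

# Modified Gram matrix of Eq. (5): off-diagonal blocks damped by mu
K_mu = np.block([[K_ss,        mu * K_st],
                 [mu * K_st.T, K_tt     ]])
```

The resulting matrix can then be passed to any off-the-shelf one-class SVM solver that accepts a precomputed kernel, e.g. scikit-learn's `OneClassSVM(kernel="precomputed")`, bearing in mind that its $\nu$-parameterisation differs slightly from the $C$-penalised problem (3).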
Accordingly, the decision function for the target task can be defined as:
$$
f_2(x) = \mathrm{sign}\left(\alpha^T \begin{pmatrix} \mu \langle \varphi(X_1), \varphi(x) \rangle \\ \langle \varphi(X_2), \varphi(x) \rangle \end{pmatrix} - 1\right). \qquad (6)
$$
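Evaluating Eq. (6) for a new target-task sample can be sketched as follows; note that `alpha` below is a random placeholder standing in for a solution of the dual (4), and the data and bandwidth are illustrative assumptions:

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between row-sample arrays A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

rng = np.random.default_rng(1)
n1, n2, p = 40, 10, 5
X1 = rng.normal(size=(n1, p))             # source-task training data
X2 = rng.normal(size=(n2, p))             # target-task training data
mu = 0.5
alpha = rng.uniform(0, 1, size=n1 + n2)   # placeholder: would come from solving (4)

def f2(x):
    """Decision function of Eq. (6) for one target-task sample x of shape (p,)."""
    k1 = gaussian_kernel(X1, x[None, :]).ravel()   # <phi(X1), phi(x)>
    k2 = gaussian_kernel(X2, x[None, :]).ravel()   # <phi(X2), phi(x)>
    return np.sign(alpha @ np.concatenate([mu * k1, k2]) - 1.0)

label = f2(rng.normal(size=p))   # +1: accepted as normal, -1: rejected
```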