Keywords: Correlation filter, Channel reliability, Tracker, Validator, Siamese convolutional neural network
Abstract: Most tracking algorithms suffer performance degradation under fast motion, illumination
changes, target deformation, occlusion, out-of-plane rotation, low-resolution images, and similar
conditions. This paper proposes a tracking-verification algorithm based on channel reliability. The tracker
part of the algorithm is a correlation filter based on channel reliability: a reliability weight is computed for
each feature channel fed to the correlation filter, and each channel's response is multiplied by its weight to
obtain the final response, which makes target localization more accurate. The validator part uses a Siamese
(dual-input) convolutional neural network. Every few frames, the validator checks the result produced by
the tracker. If the result is verified as reliable, it is left unchanged. Otherwise, the validator re-detects the
target location and verifies its reliability through the Siamese network; the tracker then takes this location
as the new target position and continues tracking from it, which makes tracking more durable and robust.
Experimental evaluation on the OTB13 video sequences shows that the proposed algorithm adapts well to
fast target motion, illumination change, target deformation, occlusion, and out-of-plane rotation, and is
robust.
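The two mechanisms summarized in the abstract can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper. It assumes that the reliability weights are already computed and that the weighted channel responses are summed. All names (`fuse_channel_responses`, `peak_location`, `track_with_verification`, `track`, `verify`, `redetect`, `interval`) are hypothetical, and the correlation filter and the Siamese network are represented only by caller-supplied functions.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Response = List[List[float]]   # one 2-D response map (rows x cols)
Position = Tuple[int, int]     # (row, col) of the estimated target centre


def fuse_channel_responses(responses: Sequence[Response],
                           weights: Sequence[float]) -> Response:
    """Multiply each channel response by its weight and sum the results.

    Assumption: the weights are given; how they are estimated is not shown.
    """
    if len(responses) != len(weights):
        raise ValueError("need exactly one weight per channel")
    rows, cols = len(responses[0]), len(responses[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for resp, w in zip(responses, weights):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += w * resp[r][c]
    return fused


def peak_location(response: Response) -> Position:
    """Return (row, col) of the largest value in a response map."""
    cols = len(response[0])
    flat_index = max(range(len(response) * cols),
                     key=lambda i: response[i // cols][i % cols])
    row, col = divmod(flat_index, cols)
    return (row, col)


def track_with_verification(frames: Iterable,
                            initial: Position,
                            track: Callable[[object, Position], Position],
                            verify: Callable[[object, Position], bool],
                            redetect: Callable[[object], Position],
                            interval: int = 10) -> List[Position]:
    """Run the tracker on every frame and the validator every `interval` frames.

    `track`, `verify` and `redetect` are hypothetical stand-ins for the
    correlation-filter tracker and the Siamese-network validator.
    """
    position = initial
    positions = []
    for index, frame in enumerate(frames, start=1):
        position = track(frame, position)
        # Verification step: keep a reliable result, otherwise re-detect.
        if index % interval == 0 and not verify(frame, position):
            position = redetect(frame)  # tracking continues from here
        positions.append(position)
    return positions
```

In this sketch, `peak_location(fuse_channel_responses(responses, weights))` gives the estimated target position for one frame, and a re-detected position replaces the tracker output so that the next call to `track` starts from it.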
1 INTRODUCTION
As one of the basic technologies of computer vision,
target tracking is widely used in video
surveillance, human-computer interaction, robotics
(Smeulders and Chu, 2014), and other fields.
Although target tracking has achieved a series of
results in recent years, it still faces many
difficulties and challenges, such as occlusion,
rotation, illumination changes, and pose changes.
Existing model-free visual tracking algorithms
are often classified as discriminative or generative.
Discriminative algorithms can be learned by
multiple-instance learning (MIL), compressed sensing,
P-N learning, structured output SVM (Hare, Golodetz,
Saffari, Vineet, Cheng, Hicks, and Torr, 2016),
online boosting, and the like. In contrast,
generative trackers typically treat tracking as a
search for the region most similar to the target.
To this end, various object appearance modeling
methods have been proposed, such as incremental
subspace learning and sparse representation (Fan
and Xiang, 2017). Currently, one of the new trends in
improving tracking accuracy is the use of deep
learning tracking methods (Fan and Ling, 2017; Ma,
Huang and Yang, 2015; Nam and Han, 2016)
because they have strong discriminative power, as
shown in (Nam and Han, 2016). However, deep
learning-based tracking algorithms are
computationally intensive and struggle to run in
real time.
Since the MOSSE algorithm was proposed, the
correlation filter (CF) has been regarded as a
robust and efficient approach to visual tracking
(Bolme, Beveridge, Draper and Lui, 2010).
Improvements built on the MOSSE algorithm
include kernels and HOG features, color name
features or color histograms (Bertinetto, Valmadre,
Golodetz, Miksik, and Torr, 2016), sparse fusion
tracking (Zhang, Bibi and Ghanem, 2016), adaptive
scale estimation, mitigation of boundary effects
(Danelljan, Hager, Shahbaz Khan, and Felsberg, 2015),
context-aware correlation filters (Mueller, Smith,
and Ghanem, 2017), and fusion of deep convolutional
network features (Ma, Huang and Yang, 2015).
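For reference, the MOSSE filter of Bolme et al. (2010) has a closed-form solution in the Fourier domain, which is the main reason correlation filters are efficient. The notation here is an assumption of this sketch rather than notation defined in this paper: $F_i$ and $G_i$ are the Fourier transforms of the $i$-th training patch and of its desired response (typically a Gaussian peak centred on the target), $\odot$ and the fraction bar are element-wise operations, ${}^{*}$ is complex conjugation, $\epsilon$ is a small regularization constant, and $\eta$ is a learning rate.

```latex
% Closed-form MOSSE filter (minimum output sum of squared error)
H^{*} = \frac{\sum_{i} G_i \odot F_i^{*}}{\sum_{i} F_i \odot F_i^{*} + \epsilon}

% Online update with running numerator A_t and denominator B_t
H_t^{*} = \frac{A_t}{B_t}, \qquad
A_t = \eta \, G_t \odot F_t^{*} + (1 - \eta) \, A_{t-1}, \qquad
B_t = \eta \, F_t \odot F_t^{*} + (1 - \eta) \, B_{t-1}
```

For a new search patch with Fourier transform $Z$, the response is the inverse transform of $Z \odot H^{*}$, and the location of its peak gives the new target position.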