2. Removing the Unmatched Parts from Evaluation: We observe a high number of false negatives in our first evaluation. As mentioned in the method section, these false negatives come from parts that have zero matches; since such parts receive no matches, our algorithm has no control over their prediction. To assess the performance of our algorithm more accurately, we remove these parts from the evaluation and recompute the accuracy. We also compute the Superpoint accuracy under the same conditions. The results are shown in Table 4.
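The filtering step above can be sketched as follows. This is a minimal illustration, not the authors' code: the names `matches_per_part`, `predictions`, and `labels` are assumed, standing for per-part match counts and per-part binary outputs.

```python
def accuracy_without_unmatched(matches_per_part, predictions, labels):
    """Accuracy computed only over parts that received at least one match.

    Parts with zero matches are excluded, since the algorithm has no
    control over their prediction (hypothetical helper for illustration).
    """
    kept = [i for i, m in enumerate(matches_per_part) if m > 0]
    if not kept:
        return 0.0
    correct = sum(1 for i in kept if predictions[i] == labels[i])
    return correct / len(kept)
```

Running the same filter before scoring both methods keeps the comparison with Superpoint consistent.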
3. Keeping a Basic Heuristic on Part Count: We hypothesize that if the network predicts the presence of X or more parts of the template image in the test image, we can presume that all parts of the template image are present. To validate this hypothesis, we add the heuristic on top of the final output and compare the results. We define the heuristic as
Output_part = [1]*len(Np), if len(Mp) ≥ len(Np)*P
              [0]*len(Np), otherwise
Here, Np is the list of all parts present in the template image, Mp is the list of parts with matches, and P is a percentage threshold. On our dataset, P = 0.2 gives the best results for both methods. Results are shown in Table 5.
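The heuristic above translates directly into code. A minimal sketch, assuming Np and Mp are Python lists as in the equation:

```python
def apply_part_heuristic(Np, Mp, P=0.2):
    """Part-count heuristic: if at least a fraction P of the template
    parts have matches, presume every part is present.

    Np -- list of all parts in the template image
    Mp -- list of parts that received matches
    P  -- percentage threshold (0.2 worked best on our dataset)
    """
    if len(Mp) >= len(Np) * P:
        return [1] * len(Np)   # all parts presumed present
    return [0] * len(Np)       # otherwise all parts marked absent
```

The all-or-nothing output mirrors the piecewise definition: the heuristic never predicts a partial subset of parts.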
5 CONCLUSION
We have established a framework for training a verification method over local feature matches and presented a method to learn the interaction between matched descriptor pairs. Our experiments demonstrate that local feature matches can be verified correctly using global context. Future work will investigate handling noisy and incorrect matches from the matching algorithm to make the verification more robust.
REFERENCES
Arroyo, R., Jiménez-Cabello, D., and Martínez-Cebrián, J.
(2020). Multi-label classification of promotions in
digital leaflets using textual and visual information.
In Proceedings of Workshop on Natural Language
Processing in E-Commerce, pages 11–20, Barcelona,
Spain. Association for Computational Linguistics.
Bai, M., Luo, W., Kundu, K., and Urtasun, R. (2016). Ex-
ploiting semantic information and deep matching for
optical flow. In European Conference on Computer
Vision, pages 154–170. Springer.
DeTone, D., Malisiewicz, T., and Rabinovich, A. (2018).
Superpoint: Self-supervised interest point detection
and description. In Proceedings of the IEEE Con-
ference on Computer Vision and Pattern Recognition
(CVPR) Workshops.
Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., and Lu,
H. (2019). Dual attention network for scene segmen-
tation. In Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition, pages
3146–3154.
Harris, C., Stephens, M., et al. (1988). A combined corner
and edge detector. In Alvey vision conference, vol-
ume 15, pages 10–5244. Citeseer.
Hu, J., Shen, L., and Sun, G. (2018). Squeeze-and-
excitation networks. In Proceedings of the IEEE con-
ference on computer vision and pattern recognition,
pages 7132–7141.
Lowe, D. G. (2004). Distinctive image features from scale-
invariant keypoints. International journal of computer
vision, 60(2):91–110.
Luo, W., Schwing, A. G., and Urtasun, R. (2016). Efficient
deep learning for stereo matching. In 2016 IEEE Con-
ference on Computer Vision and Pattern Recognition
(CVPR), pages 5695–5703.
Radford, A., Narasimhan, K., Salimans, T., and Sutskever,
I. (2018). Improving language understanding by gen-
erative pre-training.
Rosten, E. and Drummond, T. (2006). Machine learning for
high-speed corner detection. In European conference
on computer vision, pages 430–443. Springer.
Rublee, E., Rabaud, V., Konolige, K., and Bradski, G.
(2011). ORB: An efficient alternative to SIFT or SURF.
In 2011 International Conference on Computer Vision,
pages 2564–2571. IEEE.
Talmi, I., Mechrez, R., and Zelnik-Manor, L. (2017). Tem-
plate matching with deformable diversity similarity. In
Proceedings of the IEEE Conference on Computer Vi-
sion and Pattern Recognition, pages 175–183.
Tang, S., Andres, B., Andriluka, M., and Schiele, B. (2016).
Multi-person tracking by multicut and deep matching.
In European Conference on Computer Vision, pages
100–111. Springer.
Thewlis, J., Zheng, S., Torr, P. H., and Vedaldi, A. (2016).
Fully-trainable deep matching. In British Machine Vision
Conference (BMVC).
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,
L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I.
(2017). Attention is all you need. In Advances in
neural information processing systems, pages 5998–
6008.
Wang, X., Girshick, R., Gupta, A., and He, K. (2018). Non-
local neural networks. In Proceedings of the IEEE
conference on computer vision and pattern recogni-
tion, pages 7794–7803.
Wu, Y., Abd-Almageed, W., and Natarajan, P. (2017). Deep
matching and validation network: An end-to-end so-
lution to constrained image splicing localization and
detection. In Proceedings of the 25th ACM interna-
tional conference on Multimedia, pages 1480–1502.
IMPROVE 2022 - 2nd International Conference on Image Processing and Vision Engineering