The dictionary selection video summarization approach
(Ma et al., 2019) starts from the assumption of a
linear relationship between frames. A nonlinear model
is constructed, and the video is then mapped into a
high-dimensional feature space with the help of a
kernel so that the nonlinearity is converted into
linearity. Furthermore, two greedy algorithms with a
backtracking strategy were suggested to solve the
dictionary selection model, but the final results were
not better than those of our proposed approach. The
GAN dpp (Mahasseni et al., 2017) summarization
approach works through a summarizer and a
discriminator. A long short-term memory (LSTM)
network acts as both the summarizer and the
discriminator. As a discriminator, it distinguishes
between the summary generated by the system and the
original summary. This approach is based on a
generative adversarial network, but it is not as
effective as our approach. A comparison of all of
these approaches with our approach on the SumMe
dataset is given in Table 2, where our approach
outperforms all the other approaches.
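As a generic illustration of the kernel mapping mentioned for the
dictionary selection approach, the sketch below checks that a kernel
evaluated in the input space equals an ordinary inner product in a
higher-dimensional feature space. The degree-2 polynomial kernel and
the names phi, dot and poly2_kernel are assumptions chosen for this
example only; they are not taken from Ma et al. (2019), whose kernel
and optimization procedure are not specified in this section.

```python
import math

def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(x):
    """Explicit feature map of the homogeneous degree-2 polynomial
    kernel for 2-D inputs (illustrative choice, not from the paper)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly2_kernel(x, y):
    """Kernel value computed directly in the 2-D input space."""
    return dot(x, y) ** 2

# Two toy 2-D "frame features" (hypothetical values).
x = (1.0, 2.0)
y = (3.0, -1.0)

explicit = dot(phi(x), phi(y))   # inner product in the 3-D feature space
implicit = poly2_kernel(x, y)    # same quantity without computing phi

print(math.isclose(explicit, implicit))
```

A quantity that is nonlinear in the original features (the squared
inner product) is therefore linear in the mapped features, which is
the sense in which a kernel can turn a nonlinear relationship into a
linear one.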
5 CONCLUSIONS
In this paper, we proposed an improved video
summarization method that outperformed several
state-of-the-art methods. The residual network
ResNet-152 was employed together with a gated
recurrent unit (GRU) having two RNN layers. We
performed a detailed comparison of our approach with
DR-DSN by providing results on every video in the
SumMe dataset. Furthermore, we compared the overall
average F-score of our approach with the average
F-scores of several other state-of-the-art video
summarization methods and concluded that our method
performs best among them in terms of generating
representative summaries of the original videos.
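The F-score used in such comparisons can be sketched as the harmonic
mean of precision and recall over the frames shared by a generated
summary and a reference summary. The snippet below is a minimal
illustration under stated assumptions: summaries are given as
per-frame 0/1 flags, each video has a single reference summary, and
the function names f_score and average_f_score are hypothetical. How
multiple user annotations per video are aggregated is an evaluation
protocol choice that this section does not specify.

```python
def f_score(predicted, reference):
    """F-score between two summaries given as per-frame 0/1 flags."""
    if len(predicted) != len(reference):
        raise ValueError("summaries must cover the same number of frames")
    # Frames selected in both the predicted and the reference summary.
    overlap = sum(p * r for p, r in zip(predicted, reference))
    if overlap == 0:
        return 0.0
    precision = overlap / sum(predicted)
    recall = overlap / sum(reference)
    return 2 * precision * recall / (precision + recall)

def average_f_score(pairs):
    """Mean F-score over (predicted, reference) pairs, one per video."""
    scores = [f_score(p, r) for p, r in pairs]
    return sum(scores) / len(scores)

# Toy data (hypothetical, not results from the paper).
video_a = ([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
video_b = ([0, 1, 1, 0], [0, 1, 1, 0])

score_a = f_score(*video_a)
score_b = f_score(*video_b)
mean_score = average_f_score([video_a, video_b])
print(score_a, score_b, mean_score)
```

Averaging the per-video scores, as average_f_score does, yields a
single figure of the kind that can be compared across methods.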
ACKNOWLEDGEMENT
We thank Kaiyang Zhou for a detailed discussion about
his paper (Zhou et al., 2018a). This work is
supported by the Video Surveillance Lab, Karachi,
Pakistan, affiliated with the National Center of Big
Data and Cloud Computing, Pakistan.
REFERENCES
Bhalla, A., Ahuja, A., Pant, P., and Mittal, A. (2019). A
multimodal approach for automatic cricket video sum-
marization. In 2019 6th International Conference on
Signal Processing and Integrated Networks (SPIN),
pages 146–150. IEEE.
Chen, S.-C., Lin, K., Lin, S.-Y., Chen, K.-W., Lin, C.-W.,
Chen, C.-S., and Hung, Y.-P. (2013). Target-driven
video summarization in a camera network. In 2013
IEEE International Conference on Image Processing,
pages 3577–3581. IEEE.
Chen, X., Li, X., and Lu, X. (2015). Representative and
diverse video summarization. In 2015 IEEE China
Summit and International Conference on Signal and
Information Processing (ChinaSIP), pages 142–146.
IEEE.
Chung, J., Gulcehre, C., Cho, K., and Bengio, Y.
(2014). Empirical evaluation of gated recurrent neu-
ral networks on sequence modeling. arXiv preprint
arXiv:1412.3555.
Dimou, A., Matsiki, D., Axenopoulos, A., and Daras, P.
(2015). A user-centric approach for event-driven sum-
marization of surveillance videos.
Elsayed, N., Maida, A. S., and Bayoumi, M. (2018).
Deep gated recurrent and convolutional network hy-
brid model for univariate time series classification.
arXiv preprint arXiv:1812.07683.
Gygli, M., Grabner, H., Riemenschneider, H., and
Van Gool, L. (2014). Creating summaries from user
videos. In European conference on computer vision,
pages 505–520. Springer.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep resid-
ual learning for image recognition. In Proceedings of
the IEEE conference on computer vision and pattern
recognition, pages 770–778.
Jadon, S. and Jasim, M. (2019). Video summarization us-
ing keyframe extraction and video skimming. arXiv
preprint arXiv:1910.04792.
Ji, Z., Ma, Y., Pang, Y., and Li, X. (2017). Query-aware
sparse coding for multi-video summarization. arXiv
preprint arXiv:1707.04021.
Khan, R. U., Zhang, X., and Kumar, R. (2019). Analy-
sis of resnet and googlenet models for malware de-
tection. Journal of Computer Virology and Hacking
Techniques, 15(1):29–37.
Khan, R. U., Zhang, X., Kumar, R., and Aboagye, E. O.
(2018). Evaluating the performance of resnet model
based on image recognition. In Proceedings of the
2018 International Conference on Computing and Ar-
tificial Intelligence, pages 86–90.
Kingma, D. P. and Ba, J. (2014). Adam: A
method for stochastic optimization. arXiv preprint
arXiv:1412.6980.
Lai, P. K., Décombas, M., Moutet, K., and Laganière, R.
(2016). Video summarization of surveillance cameras.
In 2016 13th IEEE International Conference on Ad-
vanced Video and Signal Based Surveillance (AVSS),
pages 286–294. IEEE.
Lei, J., Luan, Q., Song, X., Liu, X., Tao, D., and Song,
M. (2018). Action parsing-driven video summariza-
tion based on reinforcement learning. IEEE Transac-
tions on Circuits and Systems for Video Technology,
29(7):2126–2137.
Ma, M., Mei, S., Wan, S., Wang, Z., and Feng, D. (2019).