Authors:
Greg Olmschenk 1; Hao Tang 2 and Zhigang Zhu 3, 1
Affiliations:
1 The Graduate Center of the City University of New York, New York, U.S.A.
2 Borough of Manhattan Community College - CUNY, New York, U.S.A.
3 The City College of New York - CUNY, New York, U.S.A.
Keyword(s):
Crowd Counting, Convolutional Neural Network, k-Nearest Neighbor, Upsampling.
Abstract:
Gatherings of thousands to millions of people frequently occur for an enormous variety of events, and automated counting of these high-density crowds is useful for safety, management, and measuring the significance of an event. In this work, we show that the commonly accepted labeling scheme of crowd density maps for training deep neural networks is less effective than our alternative inverse k-nearest neighbor (ikNN) maps, even when used directly in existing state-of-the-art network structures. We also provide a new network architecture, MUD-ikNN, which uses multi-scale, drop-in replacement upsampling via transposed convolutions to take full advantage of the ikNN labeling. This upsampling, combined with the ikNN maps, further improves crowd counting accuracy. Our new network architecture performs favorably in comparison with the state-of-the-art. Moreover, our labeling and upsampling techniques are generally applicable to existing crowd counting architectures.
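To make the labeling scheme concrete, the sketch below shows one way an inverse k-nearest neighbor (ikNN) label map could be computed from point head annotations: for every pixel, the mean distance to the k nearest annotated heads is inverted, producing a map that is large near people and decays smoothly with distance. The function name inverse_knn_map, the default k, and the 1 / (1 + d) inversion are illustrative assumptions, not necessarily the paper's exact formulation.

    import numpy as np
    from scipy.spatial import cKDTree

    def inverse_knn_map(head_points, height, width, k=3):
        """Illustrative ikNN label map from (x, y) head annotations."""
        # Enumerate every pixel as an (x, y) coordinate, one row per pixel.
        ys, xs = np.mgrid[0:height, 0:width]
        pixels = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float32)

        # Mean distance from each pixel to its k nearest annotated heads.
        tree = cKDTree(np.asarray(head_points, dtype=np.float32))
        k = min(k, len(head_points))
        distances, _ = tree.query(pixels, k=k)
        distances = distances.reshape(len(pixels), -1)  # also handles k == 1

        # Invert the distance so values peak at head locations and stay bounded.
        iknn = 1.0 / (1.0 + distances.mean(axis=1))
        return iknn.reshape(height, width)

    # Example: three annotated heads in a 240 x 320 frame.
    label_map = inverse_knn_map([(120, 40), (130, 45), (300, 200)], height=240, width=320)

The multi-scale upsampling described for MUD-ikNN relies on transposed convolutions to bring coarse feature maps back to the resolution of the label map. The module below is a generic PyTorch sketch of one such upsampling head, not the paper's exact architecture; the paper applies this kind of upsampling at multiple scales as drop-in modules, whereas a single head is shown here for clarity.

    import torch
    from torch import nn

    class UpsamplingHead(nn.Module):
        """Hypothetical transposed-convolution head mapping coarse features
        to a single-channel, full-resolution prediction (e.g., an ikNN map)."""

        def __init__(self, in_channels, scale_steps=3):
            super().__init__()
            layers, channels = [], in_channels  # in_channels should be divisible by 2**scale_steps
            for _ in range(scale_steps):
                # Each transposed convolution doubles the spatial resolution.
                layers += [
                    nn.ConvTranspose2d(channels, channels // 2, kernel_size=4, stride=2, padding=1),
                    nn.ReLU(inplace=True),
                ]
                channels //= 2
            layers.append(nn.Conv2d(channels, 1, kernel_size=1))
            self.head = nn.Sequential(*layers)

        def forward(self, features):
            return self.head(features)

    # Example: a 512-channel, 30 x 40 feature map upsampled 8x to 240 x 320.
    prediction = UpsamplingHead(in_channels=512)(torch.randn(1, 512, 30, 40))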