model. Using Structured Edge Detection as input to
the DGHT, this model scaling approach showed similar
performance to traditional image scaling at reduced
runtime, even on pedestrian databases (TUD Pedestrians,
INRIA) other than the one used in training (IAIR). We
also showed that an additional proposal rejection step
operating in the Hough space, the shape consistency
measure (SCM), can be used to significantly reduce the
number of proposals per image without performance loss.
Our framework generates between 50 and 350 proposals
per image, depending on the database, which is far fewer
than current proposal generation approaches. Furthermore,
when using only 25% of the IAIR training images
(352 pedestrians) for DGHT and SCM training, we observed
only a moderate degradation in detection accuracy.
Currently, we do not perform any bounding box refinement,
which would further improve the detection accuracy.
Still, our detection results compare well to other
state-of-the-art approaches (taking into account the
different training sets). First results in a car detection
task suggest that our detection framework can be
successfully applied to other object detection tasks as well.
Thus, our framework could be especially useful for
detecting specific object categories with limited
available training material.
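The difference between model scaling and image scaling can be illustrated with a minimal sketch. It assumes that "model scaling" means scaling the offsets of the model points relative to the reference point and voting on the unscaled edge image, so that the image never has to be resampled. The sketch uses a plain, unweighted GHT: it does not include the discriminative model point weights of the DGHT or the SCM rejection step, and all function names and the toy data are hypothetical.

```python
from collections import Counter


def scale_model(model_offsets, scale):
    """Scale model point offsets (dx, dy), given relative to the reference point."""
    return [(round(dx * scale), round(dy * scale)) for dx, dy in model_offsets]


def hough_vote(edge_points, model_offsets):
    """Let every edge point vote for the reference positions edge - offset."""
    votes = Counter()
    for ex, ey in edge_points:
        for dx, dy in model_offsets:
            votes[(ex - dx, ey - dy)] += 1
    return votes


def detect_multiscale(edge_points, model_offsets, scales):
    """Vote once per scaled model; return (votes, position, scale) of the best peak."""
    best = None
    for s in scales:
        votes = hough_vote(edge_points, scale_model(model_offsets, s))
        if not votes:  # no edge points or empty model: nothing to vote on
            continue
        pos, count = votes.most_common(1)[0]
        if best is None or count > best[0]:
            best = (count, pos, s)
    return best


# Toy example (hypothetical data): a four-point model and an object that
# appears at twice the model size around the reference point (20, 30).
model = [(-2, -4), (2, -4), (-2, 4), (2, 4)]
edges = [(20 + 2 * dx, 30 + 2 * dy) for dx, dy in model]
result = detect_multiscale(edges, model, scales=[1.0, 2.0, 3.0])
```

In this toy setting only the model scaled by 2.0 concentrates all four votes in a single Hough cell, so the peak identifies both the position and the scale of the object. The edge image is processed once, while the comparatively small model is rescaled per scale level.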
VISAPP 2018 - International Conference on Computer Vision Theory and Applications