Authors: Yiqiang Chen 1; Stefan Duffner 1; Andrei Stoian 2; Jean-Yves Dufour 2 and Atilla Baskurt 1
        
        
            Affiliations: 1 Université de Lyon, France; 2 Thales Services, France
                
        
        
        
        
        
            Keyword(s): Pedestrian Attributes, Convolutional Neural Networks, Multi-label Classification.
        
        
            
            Related Ontology Subjects/Areas/Topics: Computer Vision, Visualization and Computer Graphics; Motion, Tracking and Stereo Vision; Video Surveillance and Event Detection
                    
            
        
        
            
                Abstract: 
                In video surveillance, pedestrian attributes such as gender, clothing or hair types are useful cues to identify
people. The main challenge in pedestrian attribute recognition is the large variation of visual appearance and
location of attributes due to different poses and camera views. In this paper, we propose a neural network combining
high-level learnt Convolutional Neural Network (CNN) features and low-level handcrafted features to
address the problem of highly varying viewpoints. We first extract low-level robust Local Maximal Occurrence
(LOMO) features and learn a body part-specific CNN to model attribute patterns related to different
body parts. For small datasets with little training data, we propose a new learning strategy in which the CNN is
pre-trained in a triplet structure on a person re-identification task and then fine-tuned on attribute recognition.
Finally, we fuse the two feature representations to recognise pedestrian attributes. Our approach achieves
state-of-the-art results on three public pedestrian attribute datasets.
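
The abstract mentions pre-training the CNN "in a triplet structure" on person re-identification but does not give the loss. As an illustration only, the following sketch assumes the common margin-based triplet loss on squared Euclidean distances; the function name `triplet_margin_loss`, the `margin` value, and the use of NumPy are assumptions for this example and are not taken from the paper.

```python
import numpy as np


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Mean hinge loss over a batch of embedding triplets.

    Each argument is an array of shape (batch, dim). The anchor and the
    positive are embeddings of the same person; the negative is an
    embedding of a different person. (Hypothetical helper, not the
    authors' implementation.)
    """
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)

    # Squared Euclidean distance from the anchor to each partner.
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)

    # Penalise triplets where the negative is not at least `margin`
    # farther from the anchor than the positive.
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))
```

Under this assumed formulation, a triplet contributes zero loss once the different-person embedding is farther from the anchor than the same-person embedding by at least the margin. The embeddings learned this way would then serve as the starting point for fine-tuning on attribute labels.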