
decodes the streams and performs processing on each of them. With respect to those configurations, the need to introduce a distributed intelligent system is motivated by several requirements, namely (Remagnino et al., 2004):
• Speed: in-network distributed processing is inherently parallel; in addition, the specialization of modules reduces the computational burden at the higher levels of the network. In this way, the role of the central server is relieved, and the server might actually be omitted in a fully distributed architecture.
• Bandwidth: in-node processing reduces the amount of transmitted data, since only information-rich parameters about the observed scene are transferred, rather than the redundant video data stream.
• Redundancy: a distributed system may be re-configured in case of failure of some of its components, while still preserving the overall functionality.
• Autonomy: each node may process the images asynchronously and react autonomously to perceived changes in the scene.
In particular, these issues suggest moving part of the intelligence towards the camera nodes. In these nodes, artificial intelligence and computer vision algorithms are able to provide autonomy and adaptation to internal conditions (e.g. hardware and software failures) as well as to external conditions (e.g. changes in weather and lighting). It can be stated that in an SCN the nodes are not merely collectors of information from the sensors: they have to distill significant and compact descriptors of the scene from the bulky raw data contained in a video stream.
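To make this concrete, the following fragment sketches the kind of in-node processing loop we have in mind. It is only an illustrative example, not the firmware of our prototype: it assumes an OpenCV-capable node and uses a hypothetical send() stub in place of the real network link, transmitting a small JSON descriptor of the detected blobs instead of the raw frames.

import json
import cv2   # assumption: OpenCV 4.x is available on the camera node

def send(payload):
    # Hypothetical transport stub: a real node would push the message
    # over its network link (radio, Ethernet, ...).
    print(payload)

cap = cv2.VideoCapture(0)                     # on-board camera
bg = cv2.createBackgroundSubtractorMOG2()     # background model kept on the node

frame_id = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = bg.apply(frame)                    # per-pixel change mask
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Keep only a compact descriptor of the scene: blob count and boxes.
    boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 200]
    descriptor = {"frame": frame_id, "objects": len(boxes), "boxes": boxes}
    send(json.dumps(descriptor))              # a few hundred bytes per frame,
    frame_id += 1                             # instead of a full image

A descriptor of this kind weighs a few hundred bytes per frame, whereas even a low-resolution compressed video stream requires orders of magnitude more bandwidth.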
This  naturally  requires  the solution  of  computer 
vision  problems  such  as  change  detection  in  image 
sequences,  object  detection,  object  recognition, 
tracking,  and  image  fusion  for  multi-view  analysis. 
Indeed, no understanding of a scene may be accomplished without dealing with some of the above tasks. As is well known, for each of these problems there is an extensive corpus of already-implemented methods provided by the computer vision and video surveillance communities.
However, most of the techniques currently available are not suitable for use in an SCN, due to the high computational complexity of the algorithms or to excessively demanding memory requirements. Therefore, ad hoc algorithms should be designed for SCNs, as we will explore in the next sections; a minimal example of the kind of lightweight operation we target is sketched below.
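As an illustration (ours, not drawn from the cited literature, and assuming only NumPy), change detection can be implemented with a running-average background model that stores a single 8-bit image and updates it with integer bit-shifts, keeping both the memory footprint and the per-pixel cost compatible with an embedded node:

import numpy as np

class RunningAverageDetector:
    """Lightweight change detector: one uint8 background image, integer ops only."""

    def __init__(self, shape, alpha_shift=4, threshold=25):
        self.bg = np.zeros(shape, dtype=np.uint8)  # background model (grayscale)
        self.alpha_shift = alpha_shift             # learning rate = 1 / 2**alpha_shift
        self.threshold = threshold                 # minimum per-pixel difference

    def apply(self, gray):
        # Flag pixels that deviate from the background model.
        diff = np.abs(gray.astype(np.int16) - self.bg.astype(np.int16))
        mask = diff > self.threshold
        # Update the background with a bit-shift instead of a float multiply:
        # bg += (gray - bg) / 2**alpha_shift
        step = (gray.astype(np.int16) - self.bg.astype(np.int16)) >> self.alpha_shift
        self.bg = (self.bg.astype(np.int16) + step).astype(np.uint8)
        return mask

# Usage sketch: feed 8-bit grayscale frames from the node's camera driver.
detector = RunningAverageDetector(shape=(120, 160))
frame = np.zeros((120, 160), dtype=np.uint8)       # placeholder frame
changed = detector.apply(frame)
print(int(changed.sum()), "changed pixels")

A model of this kind needs only one extra image of memory and a handful of integer operations per pixel, which is the order of complexity an SCN node can sustain.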
In particular, after describing the possible role of SCNs in urban scenarios, we present in Section 3 a sample application, namely the estimation of vehicular flows on a road, proposing a lightweight method suitable for embedded systems. We then introduce the sensor prototype we designed and developed in Section 4. In Section 5 we report the experimental results gathered during a field test, and we finally conclude the paper in Section 6.
2  SCN IN URBAN SCENARIOS 
According to (Buch et al., 2011), there has been an increased scope for the automatic analysis of urban traffic activity. This is partially due to the growing number of cameras and other sensors, enhanced infrastructure, and the consequent accessibility of data.
In addition, the advances in analytical techniques for 
processing  video  streams  together  with  increased 
computing power  have  enabled  new applications in 
ITS. Indeed, video cameras have been deployed  for 
a  long  time  for  traffic  and  other  monitoring 
purposes,  because  they  provide  a  rich  information 
source for human understanding. Video analytics may now provide added value to cameras by automatically extracting relevant information. In this way, computer vision and video analytics become increasingly important for ITS.
In highway traffic scenarios, the use of cameras is now widespread and existing commercial systems have excellent performance. Cameras are tethered to ad hoc infrastructures, sometimes together with Variable Message Signs (VMS), Road Side Units (RSUs) and other devices typical of the ITS domain. Traffic analysis is often performed remotely, using dedicated broadband connections, encoding, multiplexing and transmission protocols to send the data to a central control room, where powerful dedicated hardware is used to process the multiple incoming video streams (Lopes et al., 2010). The usual monitoring scenario consists of estimating traffic flows, distinguished by lane and vehicle type, together with more advanced analyses such as the detection of stopped vehicles, accidents and other anomalous events for safety, security and law enforcement purposes.
Conversely, traffic analysis in the urban environment appears to be much more challenging than on highways. In addition, several extra monitoring objectives can be supported, at least in principle, by the application of computer vision and pattern recognition techniques. For example, these include the detection of complex traffic violations (e.g. illegal turns, one-way streets, restricted lanes) (Guo et al., 2011; Wang et al., 2013), the identification of road users (e.g. vehicles, motorbikes and pedestrians) (Buch et al., 2010) and of their