Exploring Deep Spiking Neural Networks

for Automated Driving Applications

Sambit Mohapatra, Heinrich Gotzig, Senthil Yogamani, Stefan Milz and Raoul Zöllner

Valeo Bietigheim, Germany

Valeo Vision Systems, Ireland

Valeo Kronach, Germany

Heilbronn University, Germany

Keywords:

Visual Perception, Efficient Networks, Automated Driving.

Abstract:

Neural networks have become the standard model for various computer vision tasks in automated driving, including semantic segmentation, moving object detection, depth estimation and visual odometry. The most commonly used flavors of neural networks are convolutional (CNN) and recurrent (RNN). In spite of rapid progress in embedded processors, power consumption and cost are still bottlenecks. Spiking Neural Networks (SNNs) are gradually progressing towards low-power, event-driven hardware architectures with a potential for high efficiency. In this paper, we explore the role of deep SNNs for automated driving applications. We provide an overview of progress on SNNs and argue why they can be a good fit for automated driving applications.

1 INTRODUCTION

Autonomous driving is a rapidly progressing area of automobile engineering that aims to gradually reduce human interaction in driving. Autonomy is commonly divided into 5 levels, of which levels 4 and 5 target the ultimate goal of automated driving, namely the complete removal of human interaction in vehicle driving. The overall task may be subdivided into 3 key groups of activities: (1) environmental sensing, (2) environmental perception from sensor data and (3) actuation of drive action according to perception. In practice, the type of sensor and its output often determine the approach most suitable for perceiving the environment from the sensor data.

CNNs (Convolutional Neural Networks) have made huge leaps in accuracy for various computer vision tasks like object recognition and semantic segmentation (Siam et al., 2017). They are also becoming dominant in geometric tasks like depth estimation, motion estimation and visual odometry. This high accuracy is critical for safe automated driving systems. However, CNNs are computationally expensive and power consumption is becoming a bottleneck. For example, the recently announced Nvidia platform Xavier provides 30 Tera-ops (TOPS) of compute power but consumes 30 Watts. This necessitates an active cooling system, which consumes more power and adds to operating costs.

SNNs (Spiking Neural Networks) have been progressing gradually as a power-efficient class of neural networks. The functional capabilities of the SNN neuron model are discussed in detail in (Chou et al., 2018). SNNs were shown to be effective on several problems but remained less competitive than CNNs. Recently, (Sengupta et al., 2018) demonstrated that a deep SNN can achieve accuracy competitive with CNNs on the challenging ImageNet dataset. A detailed overview of deep learning in SNNs is given in (Tavanaei et al., 2018). (Wunderlich et al., 2018) discuss the power consumption advantages of SNNs when implemented in neuromorphic hardware. In (Zhou and Wang, 2018), an SNN was shown to be effective for LiDAR object detection directly on analog signals. Motivated by this recent progress, we study the potential of SNNs for automated driving applications in this paper.

The rest of the paper is structured as follows. Section 2 provides an overview of Spiking Neural Networks (SNNs) and compares them with the popular types of neural networks, namely CNNs and RNNs. Section 3 discusses opportunities for SNNs in automated driving applications and provides motivating use cases. Finally, Section 4 summarizes the paper and provides potential future directions.

Mohapatra, S., Gotzig, H., Yogamani, S., Milz, S. and Zöllner, R.

Exploring Deep Spiking Neural Networks for Automated Driving Applications.

DOI: 10.5220/0007469405480555

In Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2019), pages 548-555

ISBN: 978-989-758-354-4

2 RELATED WORK ON SNN

Sensors such as camera, lidar and radar generate enormous amounts of data. Machine learning has proven to be highly successful in tasks involving such high-dimensional data. Since most objects in the environment can be grouped into certain classes such as pedestrians, cars, etc., the data presents patterns which can be used to train classifiers that then classify objects with great accuracy. Historically, Convolutional Neural Networks (CNNs) have been the mainstay of all major machine learning approaches to object detection. A prominent example is the R-CNN family, including Fast R-CNN (Girshick, 2015), in which a fine-tuned CNN extracts features from object proposals that are then classified; the original R-CNN used Support Vector Machines (SVM) for this classification step.

R-CNN based algorithms use a two-step process for object detection, namely region proposal and region classification. Recently, one-shot methods such as YOLO (Redmon et al., 2016) and SSD have also been proposed. All these methods first generate feature maps, which makes up the bulk of the computation, and then classify the feature maps.

Spiking Neural Networks (SNNs) are the most recent addition to the family of neural networks and machine learning. Since they are still at a preliminary stage of research, large-scale practical applications and hardware implementations are rather few. Among the most notable is (Diehl et al., 2015), who use conversion techniques to turn a CNN into an SNN and achieve an impressive error rate of 0.9% in MNIST digit recognition. Another notable application is (Hunsberger and Eliasmith, 2015), where the Leaky Integrate and Fire (LIF) neuron model with a smoothed response is used to convert a CNN to an SNN for object recognition on the CIFAR-10 dataset.

2.1 Overview of SNN

Widely considered the third generation of neural models, spiking neural models differ from conventional neural models in the very way that information is represented and processed. They are inspired by information representation and processing in biological neurons, where information is converted into a voltage spike train, generally of equal amplitude. The duration and timing of the spikes encode the actual information. In its most basic form, information arrives at a neuron from preceding neurons in the form of spikes, which are integrated over time. Once the accumulated voltage reaches a certain threshold, a voltage spike is sent out as the output of the neuron. Figure 1 illustrates a basic representation of a spiking neuron model.

Figure 1: Spiking neuron model.

Spiking networks are capable of processing a large pool of data using a small number of spikes (Thorpe et al., 2001). Previous work has demonstrated that SNNs can be applied effectively to all common tasks to which CNNs are applied (Maass, 1997). Spiking neuron models are strongly motivated by biological neurons and the way they function (Bois-Reymond et al., 1848). They have 3 main characteristics: (1) A neuron accepts inputs from many incoming synapses and produces a single output spike. (2) Inputs can be excitatory, if they increase the firing rate of a neuron, or inhibitory, if they reduce it. (3) The neuron model is governed by at least one state variable. In spiking models, the spike timings carry the information rather than the amplitude or shape (Gerstner and Kistler, 2002).

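A spike train can be described as

$$S(t) = \sum_{f} \delta\left(t - t^{(f)}\right) \tag{1}$$

where $f = 1, 2, \ldots$ is the label of the spike, $t^{(f)}$ is its firing time, and $\delta(\cdot)$ is the Dirac function, which is zero everywhere except at the origin and whose area is 1:

$$\delta(t) = 0 \ \text{for } t \neq 0, \qquad \int_{-\infty}^{\infty} \delta(t)\,dt = 1 \tag{2}$$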
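As an illustration of Eq. (1), the following Python sketch stores a spike train as a list of firing times and, for simulation on a discrete time grid, approximates each Dirac impulse by a pulse of height 1/dt so that it keeps unit area. The function names, the bin width `dt` and the example spike times are assumptions made for this sketch and are not taken from the cited works.

```python
def discretize_spike_train(spike_times, duration, dt):
    """Sample S(t) = sum_f delta(t - t_f) on a grid of width dt.

    Each spike adds a pulse of height 1/dt to its bin, so every pulse
    has unit area, mirroring the unit area of the Dirac function.
    """
    n_bins = int(round(duration / dt))
    train = [0.0] * n_bins
    for t_f in spike_times:
        if t_f < 0.0:
            continue  # spikes before the start of the window are ignored
        index = int(t_f / dt)
        if index < n_bins:
            train[index] += 1.0 / dt
    return train


def integrate(train, dt):
    """Rectangle-rule integral of the sampled train over the whole window."""
    return sum(value * dt for value in train)


# Hypothetical example: three spikes within a 10 ms window, 1 ms bins.
spike_times = [0.0015, 0.0042, 0.0091]
train = discretize_spike_train(spike_times, duration=0.010, dt=0.001)
print(integrate(train, dt=0.001))  # approximately the number of spikes
```

Integrating the sampled train over the window recovers the spike count, which is the discrete counterpart of each impulse having area 1.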

Neuron models are used to represent the dynamics of signal processing in a neuron mathematically. For spiking neurons, three commonly used models are the Hodgkin-Huxley model, the Izhikevich model and the Integrate and Fire model. While the Hodgkin-Huxley model provides the closest modelling of actual biological neurons, its mathematical complexity makes it unsuitable for use in applications. A version of the Integrate and Fire model known as the Leaky Integrate and Fire (LIF) model is the most widely used neuronal model for spiking neurons, as it provides a balance between mathematical complexity of implementation and closeness to the biological process (Gerstner and Kistler, 2002). LIF neurons are mathematically represented as:

$$C \frac{du}{dt}(t) = -\frac{1}{R} u(t) + \Big( i_{o}(t) + \sum_{j} w_{j}\, i_{j}(t) \Big) \tag{3}$$

where $u(t)$ is the state variable (membrane potential), $C$ is the membrane capacitance, $R$ is the input resistance, $i_{o}(t)$ is the external current, $i_{j}(t)$ is the input current from the $j$-th synaptic input and $w_{j}$ is the strength of the $j$-th synapse. A neuron fires a spike at time $t$ if the membrane potential $u$ reaches the threshold $v$. Immediately after a spike, the membrane potential is reset to a value below the threshold and held there for a time known as the refractory period. An SNN can be represented as a directed graph $(V, S)$, with $V$ being a set of neurons and $S$ a set of synapses (Maass, 1997). The set $V$ contains subsets of input and output neurons.
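To make the LIF dynamics of Eq. (3) concrete, the following Python sketch integrates the membrane potential with a forward-Euler step and applies the threshold, reset and refractory rules described above. It is a minimal sketch under stated assumptions: the function name `simulate_lif`, the time step and all parameter values (capacitance, resistance, threshold, refractory time) are illustrative choices, not values from the cited literature, and `input_current` is taken to be the total current (external plus weighted synaptic inputs) at each step.

```python
def simulate_lif(input_current, dt=1e-4, C=1e-9, R=1e7,
                 v_th=0.02, u_reset=0.0, t_ref=2e-3):
    """Forward-Euler simulation of C du/dt = -u/R + i(t).

    input_current: total input current per time step, in amperes.
    Returns (spike_times, membrane_trace).
    """
    refractory_steps = int(round(t_ref / dt))
    u = u_reset
    hold = 0  # remaining refractory steps
    spike_times = []
    trace = []
    for step, i_t in enumerate(input_current):
        if hold > 0:
            # Membrane is held at the reset value during the refractory period.
            u = u_reset
            hold -= 1
        else:
            u += dt * (-u / R + i_t) / C
            if u >= v_th:
                spike_times.append(step * dt)
                u = u_reset
                hold = refractory_steps
        trace.append(u)
    return spike_times, trace


# Hypothetical constant inputs: 1 nA stays below threshold, 3 nA does not.
weak_spikes, _ = simulate_lif([1e-9] * 2000)
strong_spikes, _ = simulate_lif([3e-9] * 2000)
print(len(weak_spikes), len(strong_spikes))
```

With these assumed values the steady-state potential is R times the input current, so the weaker input settles below the threshold and never fires, while the stronger input produces a regular spike train whose rate grows with the current.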

Spiking Network Topologies:

1. Feedforward networks - The data ﬂows from

input to output in a unidirectional manner across se-

veral layers. Applications include sensory systems,

e.g. in vision (Escobar et al., 2009), olfaction (Fu

et al., 2007) or tactile sensing (Cassidy and Ekana-

yake, 2006).

2. Recurrent networks - In this case, neuron

groups have feedback connections. This allows dyna-

mic temporal behavior of the network. However, this

feedback arrangement makes control more difﬁcult in

such networks (Hertz et al., 1991).

3. Hybrid networks - Some of the neurons have feedback connections while others are connected in a feedforward fashion.

Spike Coding Techniques:

Information available from sensors is generally not in a form suitable for SNN processing, although in some cases, such as event-based camera data, it arrives in a form more suitable for SNNs. Hence, coding such data into spike trains is a major factor in the entire architecture. To address this problem, several neural coding strategies based on spike timing have been proposed. Some of these strategies are listed below and visualized in Figure 2.

1. Time to First Spike: Information is encoded as the time between the beginning of the stimulus and the first spike in response. As can be seen from Figure 2-a, a group of three neurons N1, N2 and N3 spike in response to a stimulus. The time from the start of the stimulus to the first spike by neuron N2 encodes the type of the stimulus. This encoding scheme is generally used in applications such as artificial tactile and olfactory sensors (Chen et al., 2011).

2. Rank-Order Coding (ROC): Here, information is coded by the firing order of the spikes from the group of neurons that encode the information. As seen in Figure 2-b, the neurons fire in the order N1, N3, N2. This firing sequence of the three neurons encodes the type of stimulus.

3. Latency Code: Information is coded by the difference in time between the firing of neurons. It is a highly efficient method of encoding large amounts of information using only a few spikes (Borst and Theunissen, 1999), because a slight change in the timings can be used to encode a completely different data sample. Figure 2-c shows the latency between the firing of neurons N1 and N2 as δ(t1). Similarly, the latency in the firing of N3 after N2 is depicted as δ(t2).
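The three coding schemes can be sketched in a few lines of Python. This is an illustration only: the linear mapping from a normalized intensity to a first-spike time, the window length `t_max` and all function names are assumptions of this sketch, and real encoders (for example in event-based sensors) may use different mappings.

```python
def time_to_first_spike(intensity, t_max=0.1):
    """Map a normalized intensity in [0, 1] to a first-spike time in seconds.

    Assumed linear mapping: stronger stimuli fire earlier.
    """
    clipped = min(max(intensity, 0.0), 1.0)
    return t_max * (1.0 - clipped)


def rank_order(first_spike_times):
    """Rank-order code: neuron indices sorted by firing time, earliest first."""
    return sorted(range(len(first_spike_times)),
                  key=lambda n: first_spike_times[n])


def latencies(first_spike_times):
    """Latency code: time differences between consecutive spikes in firing order."""
    ordered = sorted(first_spike_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical stimulus intensities for neurons N1, N2, N3 (indices 0, 1, 2).
intensities = [0.9, 0.2, 0.5]
times = [time_to_first_spike(x) for x in intensities]
print(rank_order(times))   # [0, 2, 1], i.e. N1, then N3, then N2
print(latencies(times))    # gaps between consecutive spikes
```

The same set of first-spike times thus yields three different codes: the absolute time of one neuron, the firing order of the group, or the gaps between consecutive spikes.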

Training Spiking Neurons: The connections between subsequent neurons are called synapses. In spiking neuron models, these synapses have weights or strengths associated with them that determine the strength of the input that the post-synaptic neuron receives from its pre-synaptic neuron. These weights can be changed, and this phenomenon is called synaptic plasticity. Several strategies for adjusting the weights have been suggested, for example depending on the history of a neuron's response to certain inputs from a particular pre-synaptic neuron. Other variants may use the simultaneous firing of a pre-synaptic and a post-synaptic neuron as a criterion for increasing the synaptic weight. Synaptic plasticity is the key principle by which learning is achieved in SNNs. Both supervised and unsupervised forms of learning can be modelled using synaptic plasticity. Figure 3 shows a sample application of supervised learning using the ReSuMe algorithm. Here the objective was to learn the target firing times of a group of 10 spiking neurons. As seen in the figure, after 15 epochs most of the neurons have achieved the desired firing times, depicted as gray lines.

We briefly review the main unsupervised learning approach. Donald Hebb famously formulated a rule for changing synaptic weights depending on pre-synaptic and post-synaptic activity. According to Hebb's rule, the synaptic weight between neurons

VISAPP 2019 - 14th International Conference on Computer Vision Theory and Applications


(a) Time to ﬁrst spike

(b) Rank-order coding

Figure 2: Different encoding schemes of information in spi-

kes - adapted from (Ponulak and Kasinski, 2011).

i and j, $w_{ij}$, is increased if neurons i and j are simultaneously active. This method of changing synaptic weights is dictated purely by the input spike train and can lead to pattern recognition in an unsupervised way, with no correction based on error evaluation needed within the network (Hinton et al., 1999; Hertz et al., 1991). The spike-based form of this rule, in which the relative timing of pre-synaptic and post-synaptic spikes determines whether a synaptic weight is increased or decreased, is called Spike-Timing-Dependent Plasticity (STDP).
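A common way to write STDP down is the pair-based rule with exponential time windows, sketched below in Python. This specific form is an assumption of the sketch and is not spelled out in the text above; the amplitudes `a_plus` and `a_minus`, the time constants and the weight bounds are illustrative values, and the function names are hypothetical.

```python
import math


def stdp_delta_w(t_pre, t_post, a_plus=0.01, a_minus=0.012,
                 tau_plus=0.02, tau_minus=0.02):
    """Weight change for one pre/post spike pair (times in seconds).

    Pre before post strengthens the synapse, post before pre weakens it,
    and the effect decays exponentially with the time difference.
    """
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


def apply_stdp(w, pre_spikes, post_spikes, w_min=0.0, w_max=1.0):
    """Accumulate the pair-based updates over all spike pairs and clip the weight."""
    for t_pre in pre_spikes:
        for t_post in post_spikes:
            w += stdp_delta_w(t_pre, t_post)
    return min(max(w, w_min), w_max)


# Hypothetical example: the pre-synaptic neuron fires 5 ms before the post-synaptic one.
print(apply_stdp(0.5, pre_spikes=[0.000], post_spikes=[0.005]))
```

No error signal appears anywhere in the update: the weight change depends only on the observed spike times, which is what makes the rule suitable for unsupervised learning.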

2.2 CNN vs SNN

CNNs have shown tremendous progress in vision and image-based tasks such as image recognition, object detection and pattern recognition. However, the key elements of these networks, such as convolution, feature map generation and max pooling, involve a large number of matrix multiplications and additions and are compute intensive. Also, the frame-based operation of CNNs involves processing the entire input in a batch, hence individual input channels have to wait until the entire frame of inputs is available. This introduces latency. Further, the inputs are processed in a layered fashion and an output can only be produced when all layers have finished processing a batch of inputs. This causes latency on the output side. Due to these latencies and compute-intensive operations, inference on data sets such as ImageNet (Russakovsky et al., 2015) is not real-time and is computationally uneconomical. However, meeting real-time constraints is mandatory for autonomous driving applications.

(a) Spike timings before training

(b) Spike firing rates after 5 epochs training

Figure 3: Supervised learning with the ReSuMe algorithm. A single-layer feedforward network with 10 spiking neurons. The task is to learn a sample target sequence of spikes assigned individually to each LIF neuron. The gray vertical bars are the target firing times and the black bars are the actual firing times. After 15 epochs, the target and actual times are almost identical, indicating that learning has taken place (Ponulak and Kasinski, 2011).

Unlike CNNs, SNNs are event based, i.e., events are processed as they are generated. This reduces latency in input processing. Also, only those input channels that have had a change or an event are evaluated and processed. This reduces the number of inputs that have to be processed in each cycle, as sensors do not typically produce new data on every channel, which greatly reduces computational load and power consumption (Farabet et al., 2012).

Figure 4: Spiking CNN architecture (Cao et al., 2015).
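The saving from processing only changed channels can be illustrated with a simple weighted sum. In the Python sketch below, the frame-based path recomputes the sum over every channel, while the event-based path updates a stored result using only the channels that changed. The function names, the change threshold and the example data are assumptions for illustration; this is a toy model of the principle, not an implementation of any particular neuromorphic system.

```python
def changed_channels(previous, current, threshold=0.05):
    """Return (index, delta) events for channels that changed by more than threshold."""
    return [(i, c - p)
            for i, (p, c) in enumerate(zip(previous, current))
            if abs(c - p) > threshold]


def frame_based_sum(weights, frame):
    """Recompute the full weighted sum; cost is one multiply-add per channel."""
    return sum(w * x for w, x in zip(weights, frame)), len(frame)


def event_based_update(weights, state, events):
    """Update a stored weighted sum; cost is one multiply-add per event."""
    for index, delta in events:
        state += weights[index] * delta
    return state, len(events)


# Hypothetical 8-channel input in which only two channels change between frames.
weights = [0.5, -1.0, 2.0, 0.25, 1.5, -0.5, 1.0, 0.75]
previous = [1.0, 2.0, 0.0, 4.0, 0.0, 1.0, 3.0, 2.0]
current = [1.0, 2.0, 1.0, 4.0, 0.0, 1.0, 2.0, 2.0]

state, _ = frame_based_sum(weights, previous)
events = changed_channels(previous, current)
updated, event_ops = event_based_update(weights, state, events)
dense, frame_ops = frame_based_sum(weights, current)
print(updated, dense, event_ops, frame_ops)
```

Both paths arrive at the same result for the new input, but the event-based path touches only the changed channels, so its cost scales with the number of events rather than with the frame size.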

CNNs can be implemented both in software and in hardware and, due to their frame-based information processing, the hardware resources can be multiplexed. Thus, higher memory bandwidth and faster data transfer are key for real-time performance. SNNs, in contrast, process events instead of frames, hence the hardware needs to be always available, as event generation is not predictable. Though this may seem to be a limitation, it means that the network is tightly coupled to the hardware and can produce a faster response than an equivalent CNN. To improve the efficiency of an SNN architecture, modular and re-configurable hardware is more suitable (Farabet et al., 2012).

Given the potential benefits of SNNs, a general question arises: can CNNs be adapted to SNNs? In fact, adapting pre-trained CNNs to equivalent SNNs is easier and produces better results than building an SNN with STDP and unsupervised or supervised learning. Such adaptations have some key benefits:

1) A spiking convolution operator, analogous to the convolution operator in CNNs, would operate much faster due to event-based processing, while producing results similar to a traditional CNN.

2) Since events are asynchronous, each convolution operator, supported by its linked modules, can operate independently of the others whenever it has an event to process. This eliminates the need for global synchronization among the operators. Such an asynchronous convolution operator may then be implemented as a standard block in hardware for re-usability.

3) Since information is processed on a per-event basis, power is also consumed on a per-event basis. Since sensors typically produce a lot of redundant and sparse data, this could bring a significant reduction in power consumption and computational load.
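One intuition for why a trained CNN can be adapted to an SNN is that the firing rate of a simple integrate-and-fire neuron driven by a constant input approximates a ReLU activation. The Python sketch below illustrates this under stated assumptions: a non-leaky neuron, a threshold of 1, reset by subtracting the threshold, and activations scaled to the range [0, 1]. These choices and the function names are assumptions of the sketch and are not claimed to be the exact procedure of the conversion methods cited in this paper.

```python
def if_neuron_rate(activation, steps=1000, v_th=1.0):
    """Firing rate of a non-leaky integrate-and-fire neuron with constant input.

    The membrane potential accumulates the input each step; on reaching
    the threshold the neuron emits one spike and the threshold is subtracted.
    """
    v = 0.0
    spikes = 0
    for _ in range(steps):
        v += activation
        if v >= v_th:
            spikes += 1
            v -= v_th
    return spikes / steps


def relu(x):
    """Rectified linear unit used in conventional CNNs."""
    return max(0.0, x)


# Hypothetical activations: negative inputs never fire, positive ones fire
# at a rate close to their value, and inputs above 1 saturate at one spike per step.
for a in [-0.5, 0.0, 0.25, 0.7, 1.5]:
    print(a, relu(a), if_neuron_rate(a))
```

The approximation improves with the number of simulation steps, which reflects the trade-off between latency and accuracy in rate-coded spiking networks; the saturation above the threshold is one reason conversion methods normalize weights or thresholds.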

Finally, SNNs can be queried for results at any time after the first spikes are produced at the output, since information processing is not frame based (Rueckauer et al., 2017). Several implementations of deep SNNs on neuromorphic hardware such as SpiNNaker and BrainChip have demonstrated sensor applications that support this potential of SNNs.

Table 1: Comparison of ANN and SNN error rates on various computer vision datasets (Rueckauer et al., 2017).

Data set      ANN error rate (%)    SNN error rate (%)
MNIST [12]    0.86                  0.86
CIFAR-10      11.13                 11.18
ImageNet      23.88                 25.4

Table 1 presents some evidence for the strong potential of SNN-based networks for object detection. It is based on an implementation by (Rueckauer et al., 2017) and compares the classification error rates of CNNs and their SNN implementations on state-of-the-art data sets (Cao et al., 2015).

3 SNNs IN AUTOMATED DRIVING

3.1 Use Cases in Automated Driving

Event Driven Computing: Automated driving covers a wide variety of scenarios. At a high level, the main scenarios are parking, highway driving and urban driving (Heimberger et al., 2017). The scene dynamics and the required scene understanding typically differ between these scenarios, and a customized model is generally used for each of them. There are also various scenarios based on environmental conditions such as rain, fog, and day or night. The number of combinations of environmental conditions grows exponentially, so it is difficult to have a customized model for each scenario. At the same time, the transfer function can be shared across these different scenarios and an event-triggered mechanism can be used to adapt the regions used. This can be accomplished loosely using a shared encoder and gating mechanisms within a CNN. However, an SNN naturally captures the event-triggered model. There is also a class of cameras, called event-based cameras, which encode information as events at the sensor level. Recently, deep learning algorithms were demonstrated on event-based camera data (Maqueda et al., 2018).

Point Cloud: Light Detection and Ranging (LiDAR) sensors have recently gained prominence as state-of-the-art sensors for sensing the environment. They produce a 3D representation of the objects in the field of view as distances of points from the source. This collection of points over a 3D space is called a 3D point cloud. Though cameras have been used for a long time and provide a more direct representation of the surroundings, LiDARs have gained ground because of some critical advantages such as long range, robustness to ambient light conditions and accurate localization of objects in 3D space. They produce sparse data and are hence suitable for SNNs.

3.2 Opportunities

SNNs have shown great potential to either aid or replace CNNs in real-time tasks such as object detection, posture recognition, etc. (Hu et al., 2016). Large SNN architectures can be implemented on neuromorphic spiking platforms such as Neurogrid (Benjamin et al., 2014), SpiNNaker (Furber et al., 2014) and TrueNorth. TrueNorth has been demonstrated to consume as little as a couple of hundred mW while packing a million neurons (Sawada et al., 2016). Driven by the strong motivation to reduce the power consumption of integrated circuits, implementations of spiking models have been shown to consume energy in the order of nJ or even pJ (Azghadi et al., 2014) for signal transmission and processing (Indiveri et al., 2006). Some neuromorphic designs also feature on-chip learning (Indiveri and Fusi, 2007).

Spiking applications and spike-based learning are also suited to dynamic applications like speech recognition systems. In such systems, training at manufacture is not sufficient, as the system has to adapt to dynamic conditions such as accents. Similarly dynamic input comes from sensors such as the event-based Dynamic Vision Sensor (DVS) (Lichtsteiner et al., 2008) (Lenero-Bardallo et al., 2010). Some applications in object detection and perception tasks that are of direct relevance to the automotive industry are mentioned briefly below.

1. Object classification on the CIFAR-10 dataset: (Cao et al., 2015) designed a spiking equivalent of a CNN for object recognition on the CIFAR-10 data set. The CNN was trained on the dataset and the trained model was then converted into a spiking model, with each individual block, such as convolution, max pooling and ReLU, replaced by its spiking equivalent. Their transformed model achieves an error rate of 22.57%. CIFAR-10 is a collection of 60,000 labeled images of 10 classes of objects (Cao et al., 2015). The network architecture is illustrated in Figure 4.

2. Human action recognition: (Zhao et al., 2015) constructed a network to recognize human actions and posture and tested it successfully. The network was trained on an event-based dataset of short video sequences with simple human actions like sitting, walking or bending. They achieved a recognition accuracy of 99.48%. This work is an indication of how SNNs may be applied to such event-based inference tasks.

We summarize the key benefits of SNNs for automated driving:

• Event-driven mechanism, which brings adaptation to different scenarios.

• Low power consumption when realized as neuromorphic hardware.

• Simpler learning algorithms, which open the possibility of on-chip learning for longer-term adaptation.

• Ability to interface directly with analog signals, leading to a tightly integrated system.

• Lower latency in the algorithm pipeline, which is important for high-speed braking and maneuvering.

4 CONCLUSION

Spiking Neural Networks (SNNs) are biologically inspired networks in which neuronal activity is sparse and event driven in order to optimize power consumption. In this paper, we provided an overview of SNNs, compared them with CNNs and argued how they can be useful in automated driving systems. Overall power consumption over the driving cycle is a critical constraint, and the available power has to be used efficiently, especially in electric vehicles. Event-driven architectures for various scenarios in automated driving can also have accuracy advantages.

REFERENCES

Azghadi, M. R., Iannella, N., Al-Sarawi, S. F., Indiveri, G.,

and Abbott, D. (2014). Spike-based synaptic plasti-

city in silicon: design, implementation, application,

and challenges. Proceedings of the IEEE, 102(5):717–

737.

Benjamin, B. V., Gao, P., McQuinn, E., Choudhary, S.,

Chandrasekaran, A. R., Bussat, J.-M., Alvarez-Icaza,

R., Arthur, J. V., Merolla, P. A., and Boahen, K.

(2014). Neurogrid: A mixed-analog-digital multichip

system for large-scale neural simulations. Procee-

dings of the IEEE, 102(5):699–716.

Bois-Reymond, Y. et al. (1848). Investigations on animal electricity. Annalen der Physik, 151:463–464.

Borst, A. and Theunissen, F. E. (1999). Information theory

and neural coding. Nature neuroscience, 2(11):947.

Cao, Y., Chen, Y., and Khosla, D. (2015). Spiking deep

convolutional neural networks for energy-efﬁcient ob-

ject recognition. International Journal of Computer

Vision, 113(1):54–66.


Cassidy, A. and Ekanayake, V. (2006). A biologically inspi-

red tactile sensor array utilizing phase-based computa-

tion. In Biomedical Circuits and Systems Conference,

2006. BioCAS 2006. IEEE, pages 45–48. IEEE.

Chen, H. T., Ng, K. T., Bermak, A., Law, M. K., and Mar-

tinez, D. (2011). Spike latency coding in biologically

inspired microelectronic nose. IEEE transactions on

biomedical circuits and systems, 5(2):160–168.

Chou, C.-N., Chung, K.-M., and Lu, C.-J. (2018). On the

algorithmic power of spiking neural networks. arXiv

preprint arXiv:1803.10375.

Diehl, P. U., Neil, D., Binas, J., Cook, M., Liu, S.-C., and

Pfeiffer, M. (2015). Fast-classifying, high-accuracy

spiking deep networks through weight and threshold

balancing. In Neural Networks (IJCNN), 2015 Inter-

national Joint Conference on, pages 1–8. IEEE.

Escobar, M.-J., Masson, G. S., Vieville, T., and Kornprobst,

P. (2009). Action recognition using a bio-inspired

feedforward spiking network. International Journal

of Computer Vision, 82(3):284.

Farabet, C., Paz, R., Pérez-Carrasco, J., Zamarreño, C., Linares-Barranco, A., LeCun, Y., Culurciello, E., Serrano-Gotarredona, T., and Linares-Barranco, B. (2012). Comparison between frame-constrained fix-pixel-value and frame-free spiking-dynamic-pixel convnets for visual processing. Frontiers in neuroscience, 6:32.

Fu, J., Li, G., Qin, Y., and Freeman, W. J. (2007). A pattern

recognition method for electronic noses based on an

olfactory neural network. Sensors and Actuators B:

Chemical, 125(2):489–497.

Furber, S. B., Galluppi, F., Temple, S., and Plana, L. A.

(2014). The spinnaker project. Proceedings of the

IEEE, 102(5):652–665.

Gerstner, W. and Kistler, W. M. (2002). Spiking neuron

models: Single neurons, populations, plasticity. Cam-

bridge university press.

Girshick, R. (2015). Fast r-cnn. In Proceedings of the IEEE

international conference on computer vision, pages

1440–1448.

Heimberger, M., Horgan, J., Hughes, C., McDonald, J., and

Yogamani, S. (2017). Computer vision in automated

parking systems: Design, implementation and chal-

lenges. Image and Vision Computing, 68:88–101.

Hertz, J., Krogh, A., and Palmer, R. G. (1991). Intro-

duction to the theory of neural computation. Addison-

Wesley/Addison Wesley Longman.

Hinton, G. E., Sejnowski, T. J., and Poggio, T. A. (1999).

Unsupervised learning: foundations of neural compu-

tation. MIT press.

Hu, Y., Liu, H., Pfeiffer, M., and Delbruck, T. (2016). Dvs

benchmark datasets for object tracking, action recog-

nition, and object recognition. Frontiers in neuros-

cience, 10:405.

Hunsberger, E. and Eliasmith, C. (2015). Spiking

deep networks with lif neurons. arXiv preprint

arXiv:1510.08829.

Indiveri, G., Chicca, E., and Douglas, R. J. (2006). A vlsi

array of low-power spiking neurons and bistable syn-

apses with spike-timing dependent plasticity. IEEE

transactions on neural networks, 17(1).

Indiveri, G. and Fusi, S. (2007). Spike-based learning in

vlsi networks of integrate-and-ﬁre neurons. In Circuits

and Systems, 2007. ISCAS 2007. IEEE International

Symposium on, pages 3371–3374. IEEE.

Lenero-Bardallo, J. A., Serrano-Gotarredona, T., and

Linares-Barranco, B. (2010). A signed spatial contrast

event spike retina chip. In Circuits and Systems (IS-

CAS), Proceedings of 2010 IEEE International Sym-

posium on, pages 2438–2441. IEEE.

Lichtsteiner, P., Posch, C., and Delbruck, T. (2008). A

128x128 120 db 15 microsec latency asynchronous

temporal contrast vision sensor. IEEE journal of solid-

state circuits, 43(2):566–576.

Maass, W. (1997). Networks of spiking neurons: the third

generation of neural network models. Neural net-

works, 10(9):1659–1671.

Maqueda, A. I., Loquercio, A., Gallego, G., García, N., and Scaramuzza, D. (2018). Event-based vision meets deep learning on steering prediction for self-driving cars. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5419–5427.

Ponulak, F. and Kasinski, A. (2011). Introduction to spi-

king neural networks: Information processing, lear-

ning and applications. Acta neurobiologiae experi-

mentalis, 71(4):409–433.

Redmon, J., Divvala, S., Girshick, R., and Farhadi, A.

(2016). You only look once: Uniﬁed, real-time object

detection. In Proceedings of the IEEE conference on

computer vision and pattern recognition, pages 779–

788.

Rueckauer, B., Lungu, I.-A., Hu, Y., Pfeiffer, M., and Liu,

S.-C. (2017). Conversion of continuous-valued deep

networks to efﬁcient event-driven networks for image

classiﬁcation. Frontiers in neuroscience, 11:682.

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S.,

Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bern-

stein, M., et al. (2015). Imagenet large scale visual

recognition challenge. International Journal of Com-

puter Vision, 115(3):211–252.

Sawada, J., Akopyan, F., Cassidy, A. S., Taba, B., De-

bole, M. V., Datta, P., Alvarez-Icaza, R., Amir, A.,

Arthur, J. V., Andreopoulos, A., et al. (2016). Tru-

enorth ecosystem for brain-inspired computing: sca-

lable systems, software, and applications. In Procee-

dings of the International Conference for High Perfor-

mance Computing, Networking, Storage and Analysis,

page 12. IEEE Press.

Sengupta, A., Ye, Y., Wang, R., Liu, C., and Roy,

K. (2018). Going deeper in spiking neural net-

works: Vgg and residual architectures. arXiv preprint

arXiv:1802.02627.

Siam, M., Elkerdawy, S., Jagersand, M., and Yogamani, S.

(2017). Deep semantic segmentation for automated

driving: Taxonomy, roadmap and challenges. In In-

telligent Transportation Systems (ITSC), 2017 IEEE

20th International Conference on, pages 1–8. IEEE.


Tavanaei, A., Ghodrati, M., Kheradpisheh, S. R., Masque-

lier, T., and Maida, A. S. (2018). Deep lear-

ning in spiking neural networks. arXiv preprint

arXiv:1804.08150.

Thorpe, S., Delorme, A., and Van Rullen, R. (2001). Spike-

based strategies for rapid processing. Neural net-

works, 14(6-7):715–725.

Wunderlich, T., Kungl, A. F., Hartel, A., Stradmann, Y., Aamir, S. A., Grübl, A., Heimbrecht, A., Schreiber, K., Stöckel, D., Pehle, C., et al. (2018). Demonstrating advantages of neuromorphic computation: A pilot study. arXiv preprint arXiv:1811.03618.

Zhao, B., Ding, R., Chen, S., Linares-Barranco, B., and

Tang, H. (2015). Feedforward categorization on aer

motion events using cortex-like features in a spiking

neural network. IEEE Trans. Neural Netw. Learning

Syst., 26(9):1963–1978.

Zhou, S. and Wang, W. (2018). Object detection based on

lidar temporal pulses using spiking neural networks.

arXiv preprint arXiv:1810.12436.
