transform domain, there are many speech encryption
methods. For instance, methods such as fast Fourier
transform, discrete cosine transform and wavelet
transform are widely used. Recently, some new
voice encryption methods were developed based on
chaotic maps and on circular transformations.
Speech encryption algorithms can also be
classified into digital encryption and analogue
encryption methods. Analogue encryption operates
on the voice samples themselves. The main
advantage of analogue encryption is the fact that no
modem or voice compression method is required for
transmission. Moreover, the quality of the recovered
voice is independent of the language. This type of
encryption is recommended for existing analogue
channels such as telephone, satellite or mobile
communication links.
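As an illustration of operating directly on the voice
samples, the following minimal Python sketch permutes the
samples of one frame with a keyed permutation and inverts
it at the receiver. It is a sketch under our own
assumptions, not one of the algorithms evaluated in this
paper: the function names, the frame length and the use of
Python's random module (which is not cryptographically
secure) are hypothetical choices made for illustration.

```python
import random


def make_permutation(key, frame_len):
    """Derive a keyed permutation of sample positions.

    Assumption: random.Random is used only for illustration;
    it is NOT a cryptographically secure generator.
    """
    perm = list(range(frame_len))
    random.Random(key).shuffle(perm)
    return perm


def scramble(frame, perm):
    """Reorder the samples: output position i takes frame[perm[i]]."""
    return [frame[p] for p in perm]


def descramble(scrambled, perm):
    """Invert scramble() by writing each sample back to its source position."""
    out = [0] * len(perm)
    for dst, src in enumerate(perm):
        out[src] = scrambled[dst]
    return out


# Usage: scramble one 16-sample frame and recover it.
frame = list(range(16))
perm = make_permutation(1234, len(frame))
recovered = descramble(scramble(frame, perm), perm)
```

Because only the order of the samples changes, no
digitization-specific processing or compression is
involved, which matches the property stated above.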
Digital encryption first digitizes the input voice
signal. The digitized signal is then compressed to
produce a bit stream at a suitable bit rate. The
resulting bit stream is encrypted and transmitted
through insecure channels. This type of encryption
offers high voice quality and low distortion, and is
considered cryptanalytically stronger than analogue
encryption.
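The three steps described above (digitization,
compression, encryption of the bit stream) can be
sketched in Python as follows. This is a minimal
illustration under stated assumptions, not the
implementation used in this paper: zlib stands in for a
speech coder, the keystream is a toy construction
(SHA-256 in counter mode) rather than one of the ciphers
evaluated here, and all function names are hypothetical.

```python
import hashlib
import math
import struct
import zlib


def digitize(samples):
    """Quantize floats in [-1, 1] to 16-bit little-endian PCM bytes."""
    ints = [max(-32768, min(32767, int(round(s * 32767)))) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


def keystream(key, nbytes):
    """Toy keystream (assumption): SHA-256 over key || counter."""
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:nbytes])


def xor_with_keystream(data, key):
    """XOR the data with the keystream; the same call encrypts and decrypts."""
    ks = keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))


def encrypt_voice(samples, key):
    pcm = digitize(samples)
    compressed = zlib.compress(pcm)  # stand-in for a speech coder
    return xor_with_keystream(compressed, key)


def decrypt_voice(ciphertext, key):
    pcm = zlib.decompress(xor_with_keystream(ciphertext, key))
    return list(struct.unpack("<%dh" % (len(pcm) // 2), pcm))


# Usage: one 20 ms frame of a 440 Hz tone sampled at 8 kHz.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(160)]
ciphertext = encrypt_voice(tone, b"demo key")
decoded = decrypt_voice(ciphertext, b"demo key")
```

In a real system the lossless zlib step would be replaced
by a vocoder, so the decoded signal would approximate
rather than reproduce the quantized input.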
Complex digital speech encryption algorithms
were developed following the advent of Very Large
Scale Integration (VLSI) and DSP chips, and are
nowadays used in applications such as voice
activated security, personal communication systems,
secure voice mail and so on. Some of these
applications require devices that have limited
resources, which means that their implementation is
subject to constraints such as memory, size and
power consumption. In this context, because of the
advantages they offer, DSPs represent the best
solution for obtaining high performance speech
encryption under real time requirements. Moreover,
hardware implementations of cryptographic
algorithms are physically more secure, which makes
it hard for an attacker to read or modify
information.
The purpose of this paper is to optimize and to
compare the performance of six speech encryption
algorithms which can be easily embedded in low
power, portable systems and which can be used in
real time. This paper focuses on the following
speech encryption methods: three stream ciphers
(Mickey 2.0, Grain v1, Trivium), a scrambling
encryption algorithm, the Robust Secure Coder (RSC)
algorithm, an encryption algorithm based on a
chaotic map and the Blowfish algorithm. An important
aspect presented in this paper is the optimization
of the implementations of the previously mentioned
voice encryption algorithms on DSP platforms. All
the algorithms were ported onto a fixed-point DSP
and a stage-by-stage optimization was performed to
meet the real time requirements. The goal was to
determine which of the evaluated encryption
algorithms is best suited for real time secure
communications (in terms of performance).
This paper is organized as follows. The
necessary background for our work is presented in
Section 2. Related work is described in Section 3.
Details regarding the architecture and
implementation of voice encryption algorithms are
presented in Section 4. The experimental results for
the un-optimized code and for the optimized code of
the speech encryption algorithms are described in
Section 5. Conclusions are summarized in Section 6
together with our future work.
2 BACKGROUND
This section includes a brief description of Mixed
Excitation Linear Prediction (MELP), a speech
coding algorithm, of stream ciphers such as Mickey
2.0, Trivium and Grain v1, and of recently developed
voice encryption algorithms, together with general
aspects of DSP architectures.
2.1 MELP Algorithm
Voice coders are widely used in digital
telecommunications systems to reduce the required
transmission bandwidth.
Since the late 1970s, vocoders have been
implemented using linear prediction, a technique
for representing the spectral envelope, which led
to linear predictive coding (LPC) (Tremain, 1982).
The main disadvantage of the LPC method is that it
sometimes sounds buzzy or mechanical, because a
simple pulse train cannot reproduce all kinds of
voiced speech.
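To make the linear prediction step concrete, the
following minimal Python sketch estimates predictor
coefficients with the autocorrelation method and the
Levinson-Durbin recursion. It is an illustration under
our own assumptions, not the LPC implementation of
(Tremain, 1982); the function names and the synthetic
test signal are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag], where r[lag] = sum_n x[n] * x[n - lag]."""
    n = len(x)
    return [
        sum(x[i] * x[i - lag] for i in range(lag, n))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err) with a[0] = 1, such that the prediction error is
    e[n] = x[n] + a[1] * x[n-1] + ... + a[order] * x[n-order],
    and err is the remaining prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predicted frame: stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient of stage i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


# Usage: a decaying exponential x[n] = 0.9 * x[n-1] is predicted
# almost exactly by a first-order predictor, so a[1] is close to -0.9.
signal = [0.9 ** n for n in range(200)]
coeffs, residual_energy = levinson_durbin(autocorrelation(signal, 2), 2)
```

In an LPC vocoder the coefficients describe the spectral
envelope of a frame, while the excitation (a pulse train
or noise) is modelled separately; that simple excitation
model is the source of the buzzy quality mentioned above.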
The MELP vocoder (McCree, 1996; Supplee, 1997) is
based on the LPC model, but has some additional
features: mixed excitation, pulse dispersion,
adaptive spectral enhancement and aperiodic pulses.
Mixed excitation reduces the buzz which is generally
encountered in LPC vocoders. Aperiodic pulses ensure
smooth transitions between unvoiced and voiced
segments of the signal. More precisely, the
synthesizer can reproduce erratic glottal pulses
without inserting tonal noise. Pulse dispersion is
generally implemented using a filter, which
disperses the excitation energy within a pitch
period. This feature is important for synthetic
speech, because the harsh
ICISSP 2016 - 2nd International Conference on Information Systems Security and Privacy