Performance Analysis of Real Time Implementations of Voice

Encryption Algorithms using Blackfin Processors

Cristina-Loredana Duta, Laura Gheorghe and Nicolae Tapus

Department of Computer Science and Engineering, University Politehnica of Bucharest, Bucharest, Romania

Keywords: Speech Processing, Voice Encryption, Digital Signal Processor, Grain V1, Trivium, Mickey 2.0, Blackfin

Processor.

Abstract: A large part of the latest research in speech coding and speech encryption algorithms is motivated by the

need of obtaining secure military communications, to allow effective operation in a hostile environment.

Since the bandwidth of the communication channel is a sensitive problem in military applications, low bit-

rate speech compression methods and high throughput encryption algorithms are mostly used. Several

speech encryption methods are characterized by very strict requirements in power consumption, size, and

voltage supply. These requirements are difficult to fulfill, given the complexity and number of functions to

be implemented, together with the real time requirement and large dynamic range of the input signals. To

meet these constraints, careful optimization should be done at all levels, ranging from algorithmic level,

through system and circuit architecture, to layout and design of the cell library. The key points of this

optimization are among others, the choice of the algorithms, the modification of the algorithms to reduce

computational complexity, the choice of a fixed-point arithmetic unit, the minimization of the number of

bits required at every node of the algorithm, and a careful match between algorithms and architecture. This

paper describes the performance analysis on Digital Signal Processor (DSP) platform of some of the

recently proposed voice encryption algorithms, as well as the performance of stream ciphers such as Grain

v1, Trivium and Mickey 2.0 (which are suited for real time voice encryption). The algorithms were ported

onto a fixed point DSP, Blackfin 537, and stage by stage optimization was performed to meet the real time

requirements. Memory optimization techniques such as data placement and caching were also used to

reduce the processing time. The goal was to determine which of the evaluated encryption algorithms is best

suited for real time secure communications.

1 INTRODUCTION

Speech represents the fundamental form of

communication between humans. There are two

methods to represent the speech: through its message

content (as information) and as an acoustic

waveform (the signal which carries the message

information).

In the last years, due to the advancements in

communication technology and the increasing

demand of speech based applications, security has

become an important aspect. The purpose of secure

communication is to overcome unwanted disclosure

and unauthorized modifications while transmitting

speech through insecure channels.

The redundancy of the language plays an

important role in secure speech communication

systems such that if the language is highly

redundant, an intruder can decipher much easier the

information. Traditional solutions to ensure

communications confidentiality were based on

scrambling techniques (which include simple

permutations and affine transformations in

frequency or time domain). Due to the fact that in

the last decade, the computing power has quickly

increased, these scrambling algorithms became

vulnerable to attacks. In this context, many real-

world cryptographic implementations shifted to

integrating encryption with compression algorithms

in order to reduce the size of the signal before

encryption and to eliminate the redundancy.

In general, there are four main categories of

speech encryption: frequency domain scrambling,

time domain scrambling, amplitude scrambling and

two-dimensional scrambling (combination of time-

domain and frequency-domain scrambling). In the

Duta, C-L., Gheorghe, L. and Tapus, N.

Performance Analysis of Real Time Implementations of Voice Encryption Algorithms using Blackﬁn Processors.

DOI: 10.5220/0005744601570166

In Proceedings of the 2nd International Conference on Information Systems Security and Privacy (ICISSP 2016), pages 157-166

ISBN: 978-989-758-167-0

157

transform domain, there are many speech encryption

methods. For instance, methods such as fast Fourier

transform, discrete cosine transform and wavelet

transform are widely used. Recently, some new

voice encryption methods were developed based on

chaotic maps and on circular transformations.

Speech encryption algorithms can also be

classified into digital encryption and analogue

encryption methods. Analogue encryption operates

on the voice samples themselves. The main

advantage of analogue encryption is the fact that no

modem or voice compression method is required for

transmission. Moreover, the quality of the voice

which is recovered is independent of the language.

This type of encryption is recommended to be used

for the existing analog channels such as telephone,

satellite or mobile communication links.

Digital encryption performs as a first step the

digitization of the input voice signal. Then, the

digitized signal will be compressed to produce a bit

stream at suitable bit rate. The resulting bit stream

will be encrypted and transmitted through insecure

channels. This type of encryption ensures high voice

quality, low distortion and is considered

cryptanalytically stronger than analogue encryption.

Complex digital speech encryption algorithms

were developed due to the appearance of Very Large

Scale Integration (VLSI) and DSP chips and are

nowadays used in applications such as voice

activated security, personal communication systems,

secure voice mail and so on. A part of these

applications require devices that have limited

resources, which means that their implementation is

dependent on constraints such as memory, size and

power consumption. In this context, because of the

advantages offered, DPSs represent the best solution

for obtaining high performance speech encryption,

under real time requirements. Moreover, hardware

cryptographic algorithms are more physically

secure, which makes it hard for an attacker to read

information or to modify it.

The purpose of this paper was to optimize and to

compare the performance of six speech encryption

algorithms which can be easily embedded in low

power, portable systems and which can be used in

real time. This paper focuses on the following

speech encryption methods: three stream ciphers

(Mickey 2.0, Grain v1, Trivium), scrambling

encryption algorithm, Robust Secure Coder (RSC)

algorithm, encryption algorithm based on chaotic

map and Blowfish algorithms. An important aspect

presented in this paper is solving the problem of

optimizing the implementations of previously

mentioned voice encryption algorithms on DSP

platforms. All the algorithms were ported onto a

fixed point DSP and a stage by stage optimization

was performed to meet the real time requirements.

The goal was to determine which of the evaluated

encryption algorithms is best suited for real time

secure communications (in terms of performance).

This paper is organized as follows. The

necessary background for our work is presented in

Section 2. Related work is described in Section 3.

Details regarding the architecture and

implementation of voice encryption algorithms are

presented in Section 4. The experimental results for

the un-optimized code and for the optimized code of

the speech encryption algorithms are described in

Section 5. Conclusions are summarized in Section 6

together with our future work.

2 BACKGROUND

This section includes a brief description of Mixed

Excitation Linear Prediction (MELP), a speech

coding algorithm, of stream ciphers such as Mickey

v2, Trivium, Grain v1.0, of recently developed voice

encryption algorithms and the description of general

aspects of DSP architectures.

2.1 MELP Algorithm

Voice coders are widely used in digital

telecommunications systems to reduce the required

transmission bandwidth.

Since the late 1970s, vocoders have been

implemented using linear prediction which is a

technique of representing the spectral envelope, a

method conducting to linear predictive coding (LPC)

(Tremain, 1982). The main disadvantage of LPC

method is the fact that sometimes it sounds buzzy or

mechanical because of the inability to reproduce all

kinds of voiced speech using a simple pulse train.

MELP vocoder (McCree, 1996) and (Supplee,

1997) is based on LPC model, but has some

additional features such as: mixed-excitation, pulse

dispersion, adaptive spectral enhancement and

aperiodic pulses. The mixed-excitation reduces the

buzz which is in general encountered in LPC

vocoders. Aperiodic pulses ensure easy transitions

between unvoiced and voiced segments of the

signal. More exactly, the synthesizer can reproduce,

without having tonal noises inserted, erratic glottal

pulses. The pulse dispersion is, in general,

implemented using a filter, which disperses the

excitation energy with a pitch period. This feature is

important for synthetic speech, because the harsh

ICISSP 2016 - 2nd International Conference on Information Systems Security and Privacy

158

quality of it is reduced. The filter for adaptive

spectral enhancement provides a more natural

quality to the outputted speech signal, by improving

the match between natural and synthetic waveforms.

2.2 Trivium Stream Cipher

Trivium stream cipher (Canniere, 2006) has 80-bit

initialization vector (IV) and a key of 80 bits and can

generate a keystream up to 2

bits. The secret state

of the algorithm has 288 bits which includes three

non-linear feedback shift registers of different

lengths: 93, 84 and 111 bits. As the majority of

stream ciphers, Trivium has two phases: the setup of

key and IV phase and the keystream generation

phase. The keystream generation operates in each

clock cycle on three input bits and produces one

output bit. The initialization includes 1152 steps of

the clocking procedure. The algorithm is designed

such that it allows improvement of the throughput

using parallelization (64 iterations can be computed

at once), without increasing the area necessary for

implementation.

2.3 Mickey 2.0 Stream Cipher

Mickey 2.0(Babbage, 2006) is a synchronous stream

cipher, which stands for “Mutual Irregular Clocking

KEYstream generator”. The cipher works with an IV

with length up to 80 bits and accepts keys of 80 bits

length. Mickey produces the ciphertext by

performing bitwise XOR between the plaintext and

the keystream bits. The keystream sequence can be

of maximum 2

bits. The state of the algorithm

includes two 100-bit shift registers (one nonlinear

and the other linear) which are clocked one by the

other in an irregular mode. The designers have also

created a version of the cipher (MICKEY1-28 2.0)

which accepts an IV up to 128 bits and a 128-bit

key. Regarding the implementation, the authors

mention that they were able to generate, using a PC

with 3.4 GHz Pentium 4 processor, 10

bits in

approximately 3.81 seconds.

2.4 Grain V1 Stream Cipher

Grain stream cipher (Hell, 2007) uses an 80-bit key

and a 64-bit IV. This initial version was revised

because vulnerabilities were discovered in its

structure and a new version Grain v1 was created

which includes two stream ciphers: one for 128-bit

keys (with 80-bit IV) and one for 80-bit keys (with

64-bit IV). These ciphers include a non-linear

feedback register and a linear feedback register

which are coupled using lightweight boolean

functions. Even though, the Grain family ciphers

design includes an ingenious multiplication of

throughput speed, this feature increases the space

consumed.

2.5 Scrambling Speech Encryption

Algorithm

In (Ravikrindi, 2011), the authors use for speech

encryption a scrambling technique. For this, they

have developed a software program in assembly

language programming techniques for Digital Signal

Processor ADSP 2181 (which is a 16 bit fixed point

processor).

The algorithm works as follows. The first step is

to acquire the voice signals, then to digitally code

them and store the values in the memory of the

processor (128 samples are stored). The scrambling

of the speech signal is performed using Fast Fourier

Transform (FFT) and Inverse Fast Fourier

Transform (IFFT) techniques.

The next step is to perform decoding, in order to

obtain the speech signal again, which will be

transmitted to the receiver. After applying FFT, the

signal is converted into spikes (in frequency

domain).

The signal spikes will be stored in the memory

and based on circular buffers, their positions will be

interchanged, ensuring in this manner the scrambling

of the original signal (first 64 samples are displaced

into next 64 samples position and vice versa). After

scrambling, IFFT is applied to convert the signal

from frequency domain, back into the time domain.

The last step is to convert the digital signal into

the analog signal and to transmit it. Figure 1

illustrates the speech encryption process, where

ADC represents the Analog-to-Digital Converter and

DAC represents the Digital-to-Analog Converter.

Figure 1: Speech encryption process (Scrambling).

The receiver performs the same steps previously

described, obtaining at the end of the process the

original speech signal. The original signal is

obtained when the spikes (mentioned in the

encryption process) are placed into their original

positions, process which happens using the circular

buffers. Figure 2 shows the speech decryption

Performance Analysis of Real Time Implementations of Voice Encryption Algorithms using Blackﬁn Processors

159

process.

Figure 2: Speech decryption process (Descrambling).

2.6 Robust Secure Coder (RSC) Speech

Encryption Algorithm

In (Babu, 2012), the authors present their scheme for

speech encryption, called RSC, which includes

MELP compression algorithm, Triple Data

Encryption Standard (3DES) encryption algorithm

and a Forward Error Correction (FEC) algorithm.

Figure 3 present the speech encryption process. The

original speech is passed through MELP encoder in

frames of 22.5 milliseconds. The frame is coded into

54 bits of compressed speech frame. Because MELP

algorithm ensures FEC for unvoiced mode only, the

designers of RSC use 10 parity bits to ensure error

correction only for voiced mode. The 54 bits of

speech previously compressed and the 10 bits of

FEC are given as input for 3DES encryption process.

The result is 64 bits of encrypted and compressed

speech, which will be transmitted to the receiver.

Figure 3: Speech encryption process (RSC algorithm).

The decryption process is illustrated in Figure 4

and is similar with the encryption process. The 64

bits of encrypted speech are given as input for 3DES

decryption process, resulting 64 bits which will then

enter in the FEC. 10 of 64 bits are used to correct

errors and the rest of 54 bits are separated and given

as input to MELP decoder. The output of the

decoder is a synthesized speech frame of 22.5

milliseconds.

Figure 4: Speech decryption process (RSC algorithm).

2.7 Chaotic Map and Blowfish Speech

Encryption Algorithms

In (Ulkareem, 2012), the authors describe a new

solution to encrypt speech signal which includes

chaotic encryption algorithm based on logistic map

and Blowfish encryption algorithm. The advantage

of using chaotic function is that it increases the

security of the algorithm and the complexity of the

encryption and decryption functions. The solution

proposed by the authors’ works as follows. For the

encryption process, which is described in Figure 5,

as a first step, the raw speech signal is divided into

frames, each frame containing 256 values. Then, the

speech frames chosen for encryption are

decomposed using wavelet packet transform (WPT),

to determine the decomposed frames coefficients of

the level 2. At the end of this process, 256

coefficients are found for each selected frame. The

next step is to use chaotic logistic map to generate a

chaotic key which will be XORed with each frame

value. The Blowfish algorithm provides two parts of

128 chaotic encrypted values which are merged to

obtain an encrypted frame of 256 values.

Figure 5: Speech encryption process (Chaotic map and

Blowfish algorithms).

The decryption process is illustrated in Figure 6.

In the first step, the encrypted speech signal is

divided into frames (each containing 256 values) and

each frame is split into left part (128 values) and

right part (128 values). The two parts are given as

input for Blowfish algorithm and then the decrypted

frames are XORed with the chaotic key. At the end

of the Blowfish decryption process, a frame of 256

values is obtained, which is passed through Inverse

WPT (IWPT) restoring in this way the decrypted

speech signal.

ICISSP 2016 - 2nd International Conference on Information Systems Security and Privacy

160

Figure 6: Speech decryption process (Chaotic map and

Blowfish algorithms).

2.8 DSP Architecture

In general, speech coding and speech encryption

algorithms include intensive processing operations

and for this reason it’s recommended to implement

them on dedicated DSP which have instructions to

handle these types of computations.

Several real-time speech encryption applications

are characterized by very tight requirements in size,

voltage supply and power consumption. In order to

fulfill these constraints, thorough optimization

should be performed at all levels, starting from

algorithmic level, then at system and circuit

architecture level, and also at layout and design of

the cell library.

A block diagram of embedded DSP architecture

is shown in Figure 7. This contains the processor

core, the peripherals, the memory and others (Direct

Memory Access controller, event controller etc.).

The DSP core includes Data Address Generator

(DAG), Arithmetic Logic Unit (ALU), register sets

and sequencer.

Figure 7: A diagram of DSP architecture.

The most important aspect regarding the DSP is

to decide between floating point and fixed point

computational core. Very fast implementations can

be obtained by using floating-point processors, but

these are not bit-exact. In this context, the best

option for real time implementations is a specialized,

fixed-point processor and this is what we have used

for the implementations of previously described

speech encryption algorithms.

We have chosen for our project’s implementation

Blackfin ADSP-BF537 (ADSP-BF357, 2013)

processor core architecture, because it combines:

flexible single instruction multiple data capabilities

for parallel computations, an orthogonal RISC-like

microprocessor instruction set, zero-overhead loops,

a dual-MAC (Multiply and Accumulate) signal

processing engine, and multiple timed features into a

single instruction set. Blackfin contains an internal

ADC and is much faster than microcontrollers.

VisualDSP++ software can be used to simulate the

behavior of the DSP chip. Unfortunately, even with

this specialized DSP, optimization techniques are

still necessary in the implementation of stable

speech encryption algorithms with real-time

performances.

3 RELATED WORK

To give a better perspective about the importance

and utility of our project this section describes the

results obtained by other researchers and other

developed speech encryption algorithms.

(Wollinger, 2000) presents the results obtained

after comparing the optimized implementations of

Advanced Encryption Standard (AES) finalists on

DPS. The evaluated algorithms were Mars, RC6,

Rijndael, Serpent and Twofish and the

implementations were done using TMS320C64x

platform. The test scenarios included single-block

mode and multi-block mode (two blocks at a time

were encrypted) and the results were measured in:

the number of cycles (necessary for the encryption

process of each algorithm), the throughput

(Mbit/sec) and memory usage.

In (Good, 2008), the hardware performances of

the eSTREAM competition finalists are presented.

The framework designed by the authors takes into

considerations the following evaluation elements:

compactness, throughput, power consumption,

simplicity and scalability. Their evaluation shows

that the simplest algorithm is Mickey128, the most

flexible one is Trivium and that Grain80 offers the

best results for two samples application of future

wireless network and low-end of radio frequency

identification tags/ wireless sensor network nodes.

In paper (Servetti, 2002), the authors propose a

speech encryption technique which uses low

Performance Analysis of Real Time Implementations of Voice Encryption Algorithms using Blackﬁn Processors

161

complexity perception based on partial schemes. The

speech signal is compressed using ITU-T G.729

standard (ITU-t, 1996) and the result is divided into

two classes: one is encrypted and the other is

unprotected. Also, there are two level partial

encryption techniques used, one low protection (for

eavesdropping prevention) and one high-protection

(for full encryption of the compressed bit stream).

In (Chen, 2007), a speech encryption method

based on vector quantization (VQ) of LPC

coefficients is presented. The secret key is generated

using the indices of VQ corresponding to the

neighboring frames derived from the natural

speech’s characters. (Girin, 2007) presents an

encryption algorithm based on time-trajectory model

of the sinusoidal components corresponding to

voiced speech signals. This method uses the

amplitude and phase parameters of the discrete

cosine functions which are applied for each voiced

segment of the speech.

A method for speech encryption based on

augmented identity matrix is presented in (Tingting,

2009). Enhanced encryption can be achieved by

analyzing the redundancy parameters of the coded

speech signal, with low computation complexity in

real time applications. In (Merit, 2012), a voice

encryption method called “DES with Random

permutation and Inversion” is described, which

solves the problem of penetrating the RPE-LTP

vocoder by the encrypted voice. The solution

proposed ensures secure communication in Global

System for Mobile Communications (GSM) and a

good compatibility to all GSM networks.

The authors describe in (Kaur, 2012) a new

speech encryption algorithm which integrates a

personalized time domain scrambling scheme and is

based on four level of hash based encryption. They

encrypt the original signal four times using different

algorithms (repositioning of bits, using twice

random number generation and amplitude ascending

ordering) at each level. Based on their experimental

results, it can be seen that the proposed algorithm

ensures a high level of security.

In (Knezevi, 2013) the authors illustrate how

signal processing techniques can be used to design

and implement cryptographic and security

applications. Implementations on DSP processors of

well known hash functions, public-key algorithms

are described in detail. Moreover, using the special

features of DSP processor, they present a key

derivation technique and other methods of pre-

processing data which can be very useful in

performing side-channel attacks.

4 IMPLEMENTATION

In this section we present the details regarding real-

time and offline implementations of six speech

encryption algorithms (described in Section 2).

A point of interest in this area is getting DSP

microprocessors, embedded in a system which

performs speech encryption and decryption.

Implementation in C is structured and easy to follow

and can be an important starting point for

implementing these algorithms on various platforms.

The algorithms were implemented at the

beginning in C language using Microsoft Visual

Studio2012, which makes the software processor

independent and can be linked with any processor if

the corresponding assembler is accessible. After we

thoroughly analyzed their functionality, the code

was transferred to the integrated development

environment, called VisualDSP++, on a DSP

platform. In the first stage, the algorithms were

tested offline, using a single processor so that we

could verify if the implementations remain

functional even when their included in this new

environment.

After we implemented the speech encryption

algorithms on a DSP platform, we optimized the

programs so that it allows real-time communication.

This was done using two fixed point DSPs from

Analog Devices, ADSP-BF537, as it can be seen in

Figure 8. The communication between the DPSs is

done using a serial transmission (through UART)

and MELP algorithm was used to compress the

speech signal. This block diagram is not available

for the implementation of scrambling speech

encryption algorithm (where the digital signal is

given as input for the FFT function without

compressing it) and for encryption algorithm based

on chaotic map and Blowfish cipher (it uses WPT

for compression).

Figure 8: Real-time communication on BF537 block

diagram.

ICISSP 2016 - 2nd International Conference on Information Systems Security and Privacy

162

Several optimizations were necessary to meet the

real time requirement of completing all computation

processes within frame duration. The source code of

speech encryption algorithms were thoroughly

optimized at the C Level.

5 OPTIMIZATION AND

EXPERIMENTAL RESULTS

Since most embedded systems are real-time systems,

code optimization in terms of execution speed is an

important performance index which will result in

lower power consumption. It is always easier to use

a C compiler optimization. However, in cases where

we have to save more MIPS or memory processing,

assembly code optimization is the only way to

achieve this level of performance. The development

cost of writing 100% of the entire program in

assembly language, far exceeds the performance

gain.

A better approach is to start writing code in C,

then create a detailed profile to identify time critical

code sections and replace these code segments with

code in assembly language. A common rule 80% -

20% says that 80% of processing time is used for

20% of code. So if we can identify those 20% of the

code and optimize it using assembly language,

significant performance gains can be achieved. A

method for identifying the region of interest is to use

the statistical profiler (for EZ-KIT) or the linear

profiler (for a simulator) that exist in the

VisualDSP++. Running the applications in

VisualDSP++ for the first time, generated the

execution times (in milliseconds) seen in Table 1,

for compressing and encrypting a single frame.

Table 1: Execution time per frame before code

optimization.

Speech Encryption

Algorithm

Compression

Algorithm

Execution

time/frame (in ms)

Trivium MELP 168.312

Mickey v2.0 MELP 160.125

Grain v1 - 80 bit key MELP 166.129

Grain v1 -128 bit key MELP 168.934

Scrambling - 162.331

RSC MELP 172.624

Chaotic map&

Blowfish

WPT 174.753

We started by applying different optimization

techniques at C level (Table 2) then we applied

different hardware optimizations which can be seen

in Table 3. The execution time (given in

milliseconds) decreased significantly for all

encryption algorithms as it can be seen in Table 4.

All performed computational processes lasted more

than 170 ms before code optimization, result which

is inacceptable, given the time available for a frame

(22.5 ms). After the optimization, the execution time

per frame was reduced to less than 21.5 ms for all

implemented algorithms.

Table 2: C level optimizations.

Optimization technique

Enable: optimization for C code, automatic inlining,

interprocedural optimization from VisualDSP++ options

Use pragma for optimizing loops

Use pragma for data alignment

Use pragma for different memory banks

Use pragma for no alias

Use volatile and static data types

Use arithmetic data types (int, short, char, unsigned int,

unsigned char, unsigned short)

Using runtime C/C++ and DSP libraries

Use pragma to optimize for speed

Using intrinsic functions and inline assembly

Profile Guided Optimization (PGO) - the compiler uses

the data collected during program execution for a

optimization analysis. PGO informs the compiler abou

the functions that affect branch prediction, improve loops

transformation and reduce code size.

Table 3: Hardware level optimizations.

Optimization technique

Special addressing modes – using different data sections

(for add() function)

Using assembly code

Using hardware loops

Using parallel instructions

Using software pipeline

Table 4: Execution time per frame after C level and

hardware level optimizations.

Speech Encryption

Algorithm

Execution

time/frame (in ms)

C level

Execution

time/frame (in

ms)

Hardware level

Trivium 71.034 15.884

Mickey v2.0 61.977 6.349

Grain v1 - 80 bit key 67.893 11.815

Grain v1 -128 bit key 70.102 14.209

Scrambling 65.340 10.053

RSC 74.298 18.644

Chaotic map&

Blowfish

78.112 21.115

As it can be seen in Table 4, the smallest

execution time per frame (in ms) at C level is

obtained for the implementations of Grain v1 (80 bit

Performance Analysis of Real Time Implementations of Voice Encryption Algorithms using Blackﬁn Processors

163

and 128 bit key) and by the scrambling algorithm

(approximately 65.5 ms). Mickey v2.0 algorithm has

the smallest execution time at hardware level, only

6.439 ms. The highest execution time is obtained at

software level and at hardware level by chaotic map

&Blowfish algorithm (78 ms, respectively 21 ms).

Memory optimization techniques such as caching

and data placement were also used to bring down the

processing time. It was not possible to include the

entire data inside the internal RAM of the DSP. For

this reason, less frequently accessed data was kept in

SDRAM which was comparatively slower.

Frequently accessed functions were cached. A

better optimization is achieved by writing the

functions which are computationally intensive in

assembly language. The result is a substantial

reduction of the number of cycles needed for

computation.

In Table 5, the CPU time for optimized and non-

optimized speech encryption implementations is

shown for each time consuming function.

Table 5: CPU time for optimized and non-optimized

algorithms implementations.

Algorithm

Time consuming

functions

Optimi-

zation

With

Optimiza-

tion

Trivium Multiplying the

algebraic normal

form of two

boolean functions

3.25

Mcycles

Kcycles

Mickey v2.0

2.0

Mcycles

Kcycles

Keystream

derivation

Grain v1 - 80

Initialization phase

2.95

Mcycles

Kcycles

Grain v1 -128

Initialization phase

3.20

Mcycles

Kcycles

Scrambling FFT

2.80

Mcycles

Kcycles

IFFT

RSC Forward Error

Correction

3.55

Mcycles

120

Kcycles

Chaotic map

& Blowfish

Chaotic key

generation

4.2

Mcycles

155

Kcycles

Subkey generation

for Blowfish

For Trivium algorithm the most consuming

function includes the multiplication of two Boolean

functions using their algebraic normal form.

Optimizing this function reduces the number of

cycles with more than 2 Mcycles. For Mickey v2.0

cipher there are two time consuming functions:

optimized save approximately 1.5 Mcycles. The

initialization phase is the most consuming for Grain

v1 algorithm (80 bit key or 128 bit key). Optimizing

this function the number of cycles decreases with

almost 2 Mcyles. For the scrambling technique, the

FFT and IFFT functions consume approximately 3

Mcycles. After the optimization, these functions

require less than 1 Mcycles. RSC algorithm has only

one time consuming function, which is the Forward

Error Correction. It’s optimization is significant,

from 3.55 Mcycles to 120 Kcycles. The cipher

which uses chaotic map and Blowfish algorithm for

voice encryption has two important functions:

chaotic key generation and subkey generation for

Blowfish. The reduction of clock cycles is high,

from 4.2 Mcycles to 155 Kcycles.

Based on the results in Table 5, we can calculate

the Clock Rate Reduction (CRR). This is defined as

in equation (1), where X represents the number of

clock cycles consumed by original code (before

optimization) and Y is the number of clock cycles

consumed by the optimized code. The CRR values

for all implemented encryption algorithms can be

seen in Table 6. As it can be observed, the best CRR

was obtained for Trivium algorithm (41.21%),

followed by Mickey v2.0 (35.36%). The smallest

CRR was obtained for chaotic map and Blowfish

algorithm, 26.09%.

 =

( − )



× 100%

(1)

Table 6: CRR value for all speech encryption algorithms.

Algorithm

CRR

Trivium

41.21%

Mickey v2.0

35.36%

Grain v1 - 80

34.11%

Grain v1 -128

31.65%

Scrambling

28.16%

RSC

28.58%

Chaotic map &

Blowfish

26.09%

We have also performed subjective analysis for

the offline and real-time implementation of the

evaluated speech encryption algorithms and the

results are shown in Figure 9.

In subjective analysis, the encrypted speech

signal is listened and the quality of it will be

determined only based on the listener’s opinion. Ten

listeners have graded the six algorithms. Each

person has listened to 10 distinct audio files and then

they gave grades on a scale of 0 to 10. As it can be

seen from Figure 9, the scores confirm the fact that

ICISSP 2016 - 2nd International Conference on Information Systems Security and Privacy

164

offline speech encryption algorithms’

implementations have better performances than the

real-time implementations. “O” stands for offline

implementation and “RT” stands for real-time

implementation. The best results in subjective

analysis were for offline Mickey v2.0 algorithm

(9.1), followed very closely by offline RSC (8.92)

and by offline Trivium (8.91). The worst results

were for real-time implementations, especially of

algorithms such as Grain v1 (7.14), RSC (7.32) and

Trivium (7.34).

Figure 9: Scores for offline and real-time implementations.

6 CONCLUSIONS

Because secure communications are extremely

important nowadays, we presented in this paper six

speech encryption techniques that can be used in

real-time applications. The implementations were

performed in C and the code was ported on Blackfin

ADSP-BF537. Thorough optimization of the code

was carried out to achieve real-time

implementations. The optimization process was

performed step by step by using different methods

which includes compiler tools, small function inline

expansion, and intrinsic functions and so on. We

were able for all algorithms to reduce the execution

time per frame to less than 21.5 milliseconds. The

results are ideal and meet the needs of engineering

applications: real-time implementations of speech

encryption algorithms on hardware platforms.

However, it can be seen that the best results

(smallest number of cycles and execution time per

frame) were obtained for Mickey v2.0 stream cipher

implementation and the worst results were obtained

for the encryption algorithm proposed in

(Ulkareem, 2012), which is based on chaotic map

and Blowfish cipher.

Taking into consideration the subjective analysis

results, it can be observed that offline

implementations of the algorithms provide better

speech quality than real-time implementation. All in

all, the proposed speech encryption algorithms have

a good audio quality, which is an extremely

important feature for communication applications.

This work can be extended to developing

hardware products which can be used to ensure

secure real-time communications. Also, the

algorithms we have selected for speech encryption

can be implemented on other DSP platforms, which

will offer more optimization methods and better

performances.

ACKNOWLEDGEMENTS

The work has been funded by the Sectoral

Operational Programme Human Resources

Development 2007-2013 of the Ministry of

European Funds through the Financial Agreement

POSDRU/187/1.5/S/155536, by the Sectoral

Operational Programme Human Resources

Development 2007-2013 of the Ministry of

European Funds through the Financial Agreement

POSDRU/159/1.5/S/134398, and by the program

Partnerships in priority areas – PN II carried out by

MEN-UEFISCDI, project No. 47/2014.

REFERENCES

Tremain, T. E., 1982. The Government Standard Linear

Predictive Coding LPC-10. In: Speech Technology,

pp.40-49.

McCree, A., Kwan, T., George, E. B.,Viswanathan, V.,

1996.A 2.4 kbit’s MELP coder candidate for the new

U.S. Federal Standard. In: Acoustics, Speech, and

Signal Processing, IEEE International Conference,

vol.1, pp. 200-203.

Supplee, L. M., Cohn, R. P., Collura, J. S.,McCree, A.,

1997. MELP: the new Federal Standard at 2400bps.

In: Acoustics, Speech, and Signal Processing, IEEE

International Conference, vol.2, pp. 1591-1594.

Canniere, C., Preneel, B., 2006.Trivium Specifications. In:

eSTREAM, ECRYPT Stream Cipher Project.

Babbage, S., Dodd, M., 2006. The stream cipher MICKEY

2.0. In eSTREAM, ECRYPT Stream Cipher Project.

Hell, M., Johansson, T., Meier, W., 2007. Grain- A Stream

Cipherfor Constrained Environments. In: International

Journal of Wireless and Mobile Computing, vol. 2, pp.

86-93.

Ravikrindi, R., Nalluri, S., 2015. Digital Signal

Processing, Speech encryption and decryption,

availableat https://www.scribd.com/doc/23336087/

Performance Analysis of Real Time Implementations of Voice Encryption Algorithms using Blackﬁn Processors

165

11-speech-encryption-and-decryption (Accessed: July

2015).

Babu, A. A., Yellasiri, R., 2012.Symmetric Encryption

Algorithm in Speech Coding for Defence

Communications. In: Journal of Computer Science &

Information technology, vol. 4, pp. 369-376.

Ulkareem, M., Abduljaleel, I. Q., 2013.Speech Encryption

Using Chaotic Map and Blowfish Algorithms. In:

Journal of Basrah Researches, vol. 39, no. 2, pp. 68-

76.

ADSP-BF537 Blackfin Processor Hardware Reference

manual, Revision 3.4 (2013).

Wollinger, T. J., Wang, M., Guajardo, J., Paar, C., 2000.

How Well Are High-End DSPs Suited for the AES

Algorithms? In: Proceedings of the Third Advanced

Encryption Standard Candidate Conference, pp. 94-

105.

Good, T., Benaissa, M., 2008.Hardware performance of

eStream phase-III stream cipher candidates. In: State

of the Art of Stream Ciphers (SASC), pp. 163-173.

Servetti, A., De Martin, J.C., 2002.Perception-based

partial encryption of compressed speech. In: IEEE

Transactions on Speech and Audio Processing, vol.10,

pp. 637 – 643.

ITU-t recommendation g.729, 1996, coding of speech at 8

kbit/s using conjugate-structure algebraic-codeexcited-

linear-prediction (cs-acelp).

Chen, N., Zhu, J., 2007.Robust speech watermarking

algorithm. In: Electronics Letters, vol. 3, pp. 1393-

1395.

Girin, L., Firouzmand, M., Marchand, S., 2007.Perceptual

Long-Term Variable-Rate Sinusoidal Modeling of

Speechǁ. In: IEEE Transactions on Audio, Speech, and

Language Processing, vol.15, pp. 851 – 861.

Tingting, X., Zhen, Y., 2009.Simple and effective speech

steganography in G.723.1 low-rate codes. In:

International Conference on Wireless

Communications & Signal Processing, pp. 1 – 4.

Merit, K., Ouamri, A., 2012.Securing Seech in GSM

Networks using DES with Random Permutation and

Inversion Algorithms. In:International Journal of

Distributed and Parallel Systems (IJDPS), vol.3, no.4,

pp.157-164.

Kaur, H., Sekhon,G. S., 2012.A Four Level Speech Signal

Encryption Algorithm. In: International Journal of

Computer Science and Communication (IJCSC), vol.

3, no. 1, pp. 151-153.

Knezevi, M., Batina, L., Mulder, E., Fan, J., Gierlichs, B.,

Lee, Y. K., Maes, R., Verbauwhede I., 2013.Signal

Processing for Cryptography and Security

Applications.In Handbook of Signal Processing

Systems, pp. 223-241.

Olausson, M.,Dake, L., 2003.The ADSP-21535 Blackfin

and Speech Coding.InProceedings of the Swedish

System-on-chip Conference 2003.

Blackfin DSP Instruction Set Reference, 2002.

ADSP-21535 Blackfin DSP Hardware Reference, 2002.

ITU-t recommendation on g.723.1, 1996, dual rate speech

coder for multimedia communications transmitting at

5.3 and 6.3 kbit/s.

ETSI GSM Fullrate Speech Codec for Analog

DevicesBlackfin, Bayer DSP Solutions, 2008.

Bertini, G.,Fontata, F., Gonzalez, D.,Grassi, L., Magrini,

M., 2004.Voice Transformation Algorithms with Real

Time DSP Rapid Prototyping Tools, unpublished.

Shaked, Y., Cole, A. L., 2004. Implementation of MELP

based Vocoder for 1200/2400 bps, The EE Project

Contest 2000, Technion Signal and Image Processing

Lab, unpublished.

ICISSP 2016 - 2nd International Conference on Information Systems Security and Privacy

166