letters², from DNA letters to protein sequence
presented by English letters³, where ciphering
technique takes place. The resulting ciphered
English letters are then transformed back to DNA
letters with extra overhead bits, generally known as
the ambiguity bits. Finally, the resulting DNA letters
are concealed in a real DNA sequence through a
hiding technique. The whole process is then reversed
at the receiver's node to extract the original
message. As can easily be seen, the whole process is
a long and complicated sequence of steps that consumes
considerable computational effort without adding real
value to the security strength.
In this paper, we propose an enhanced DNA-based
steganography algorithm that is more efficient and
faster than the current technique and offers a higher
hiding capacity. In the proposed algorithm, we enhance
the commonly used Playfair cipher by defining a novel
sequence of preprocessing steps and eliminating the
overhead. We also utilize a more efficient technique
to enhance the hiding process (Khalifa and Atito,
2012). The proposed algorithm redefines the whole
process as a simpler and more straightforward
mechanism, resulting in better performance, lower
execution time, and a higher hiding capacity.
Moreover, the security strength has been carefully
checked and verified through the calculation of the
cracking probability. The performance of the proposed
algorithm is demonstrated through extensive
experimental studies.
The rest of this paper is organized as follows.
Section 2 reviews the background and related work
on current steganography techniques. Section 3
presents the proposed technique in detail. Section 4
discusses its performance analysis. Finally, the paper
is concluded in Section 5.
2 BACKGROUND
In this section, we provide a brief review of DNA
and the related work. In addition, we discuss in
detail the main problems of the current DNA-based
steganography techniques.
2.1 DNA Overview
DNA is the magic code for life (Smith, 2003); it
contains the genetic instructions used in the
development and functioning of all living organisms.
Inspired by nature, the fact that the DNA molecule
carries all the genetic information gave rise to the
idea of using DNA itself as a data carrier. The
information in DNA is stored as a code made up of four
chemical bases, known as nucleotides: adenine (A),
guanine (G), cytosine (C), and thymine (T). The
sequence of these four bases encodes the genetic
information (Alberts and Johnson, 2008). Each group of
three consecutive nucleotides is called a codon; since
there are 4x4x4 possible letter combinations, there
are 64 codons in nature.

² The DNA letters are A, G, C, and T.
³ The protein sequence is composed of amino acids, each
abbreviated by an English letter.

Key = “PLAY”

Before using the key:
A B C D E
F G H I/J K
L M N O P
Q R S T U
V W X Y Z

After using the key:
P L A Y B
C D E F G
H I/J K M N
O Q R S T
U V W X Z

Figure 1: 5x5 Playfair Cipher Grid before and after using
the Key.
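As an illustration of how the keyed grid in Figure 1 can be derived, the following is a minimal sketch of the classical Playfair grid construction: the distinct key letters are written first, followed by the remaining alphabet letters, with I and J sharing one cell. The function name `playfair_grid` is hypothetical and is not taken from the paper; the sketch covers only the standard grid layout, not the DNA-based variant discussed later.

```python
import string


def playfair_grid(key: str) -> list[list[str]]:
    """Build a classical 5x5 Playfair grid (hypothetical helper).

    Distinct key letters come first, then the unused alphabet letters.
    I and J share a single cell, displayed as "I/J".
    """
    seen: list[str] = []
    for ch in key.upper() + string.ascii_uppercase:
        if ch not in string.ascii_uppercase:
            continue  # ignore anything that is not A-Z
        if ch == "J":
            ch = "I"  # merge J into the I cell
        if ch not in seen:
            seen.append(ch)
    cells = ["I/J" if ch == "I" else ch for ch in seen]
    return [cells[r * 5:(r + 1) * 5] for r in range(5)]


for row in playfair_grid("PLAY"):
    print(" ".join(row))
```

Calling `playfair_grid("")` yields the plain alphabetical grid, which corresponds to the grid before the key is applied.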
DNA has two main advantages that make it efficient
for data hiding and transmission. First, it has a high
storage capacity, as shown by (Adleman, 1994).
Second, the simplicity of converting data to a DNA
sequence makes it a good choice for encrypting data
within it. By exploiting the advantages of DNA as an
efficient data carrier, in addition to using a
well-suited encryption technique, researchers have
arrived at many solutions for secure data
communication and transmission. DNA steganography is
one of these promising solutions (Peterson, 2001),
(Catherine et al., 1999), (Leier et al., 2000),
(Shimanovsky et al., 2002), (SAEB et al., 2007).
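To make the conversion of data to a DNA sequence concrete, the sketch below encodes each pair of bits as one nucleotide and also enumerates the 4x4x4 = 64 codons mentioned in Section 2.1. The particular bit-to-base assignment (00 = A, 01 = C, 10 = G, 11 = T) and the function names are assumptions made for illustration only; the mapping used by any specific technique may differ.

```python
from itertools import product

# Assumed two-bit mapping (illustrative; not taken from the paper).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def text_to_dna(message: str) -> str:
    """Encode a text message as DNA letters, two bits per nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in message.encode("utf-8"))
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_text(dna: str) -> str:
    """Decode DNA letters produced by text_to_dna back into text."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


# All triplets over the four DNA letters: 4 * 4 * 4 = 64 codons.
CODONS = ["".join(triplet) for triplet in product("AGCT", repeat=3)]

encoded = text_to_dna("Hi")
print(encoded)               # DNA form of the message
print(dna_to_text(encoded))  # round trip back to the original text
print(len(CODONS))           # number of distinct codons
```

Under this assumed mapping every byte becomes exactly four nucleotides, so the conversion is lossless and trivially reversible, which is the simplicity the text refers to.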
2.2 Related Work
In 1999, (Catherine et al., 1999) introduced DNA
steganography, in which data is encrypted in DNA and
hidden in microdots. In 2000, (Leier et al., 2000)
proposed a hiding technique in which data is encoded
into a DNA sequence; however, the original data can be
easily recovered once the primer sequence is known.
In 2001, (Peterson, 2001) proposed another scheme for
secret data hiding, but it can be cracked through a
frequency-based cryptanalysis technique. In 2010,
(Shiu et al., 2010) proposed three reversible data
hiding schemes based on DNA sequences; the most
significant one was the substitution method, yet its
hiding capacity remains limited.
In May 2012, (Khalifa and Atito, 2012) proposed
a steganography technique in which data is encrypted
using a DNA-based Playfair cipher and then hidden in a
real DNA sequence using a modified substitution
technique that increases the hiding capacity. Although
it achieved a higher hiding capacity than the original sub-
An Enhanced DNA-based Steganography Technique with a Higher Hiding Capacity