03.03.2015 Views

Steganography in Arabic Text Using Kashida ... - ResearchGate

Steganography in Arabic Text Using Kashida ... - ResearchGate

Steganography in Arabic Text Using Kashida ... - ResearchGate

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

andom algorithm to distribute the hidden bits <strong>in</strong>side the<br />

message.<br />

B. Ma<strong>in</strong> Contributions and Paper Organization<br />

A promis<strong>in</strong>g algorithm is presented <strong>in</strong> this paper. The ma<strong>in</strong><br />

idea is to use <strong>Kashida</strong> <strong>in</strong> <strong>Arabic</strong> that enables us to hide bits<br />

and remove any <strong>in</strong>truder suspicions. We first <strong>in</strong>troduce four<br />

scenarios to add <strong>Kashida</strong> letters and randomly select one of the<br />

four scenarios for each round. Message segmentation is<br />

applied, enabl<strong>in</strong>g the sender to select more than one strategy<br />

for each block of message. At the other end, the recipient can<br />

recognize which algorithm was applied and can then decrypt<br />

the message content and aggregate it.<br />

The rest of this paper is organized as follows. In Section II<br />

we discuss previous text <strong>Steganography</strong> techniques. <strong>Kashida</strong><br />

variation algorithm <strong>in</strong> <strong>Arabic</strong> hidden algorithm is discussed <strong>in</strong><br />

Section III. Section IV will describe our algorithm and provide<br />

analysis regard<strong>in</strong>g the proposed work. F<strong>in</strong>ally, conclud<strong>in</strong>g<br />

remarks are offered <strong>in</strong> Section V.<br />

<strong>Steganography</strong><br />

Protect Data<br />

Image<br />

Audio<br />

<strong>Text</strong><br />

Cryptography<br />

Symmeric<br />

Asymmertic<br />

Figure 1.Hidden Data classification<br />

II. PRIOR WORKS<br />

<strong>Text</strong> <strong>Steganography</strong> can be classified <strong>in</strong>to three strategies<br />

as shown <strong>in</strong> Figure 1. Each one of these techniques have<br />

advantages and drawbacks. In the follow<strong>in</strong>g section, we will<br />

present some of the prior works and provide some discussions<br />

on them.<br />

Different l<strong>in</strong>guistic methods have been classified <strong>in</strong>to two<br />

categories. The first one is syntax and the other one is<br />

semantic [9]. This work has been developed by creat<strong>in</strong>g a<br />

dictionary of synonyms, and creat<strong>in</strong>g a representation for each<br />

word by bit. Authors <strong>in</strong> [9] presented a novel synonym<br />

algorithm to hide data <strong>in</strong> Bahasa Melayu language, where the<br />

hidden algorithm is divided <strong>in</strong>to two phases. The first step<br />

converts hidden message <strong>in</strong>to b<strong>in</strong>ary codes by us<strong>in</strong>g ASCII<br />

codes. Then a synonyms file is created, where the sender and<br />

recipient must have the same word list to encrypt and decrypt<br />

the message. If the sender wants to <strong>in</strong>sert a Zero, then there is<br />

no need for word replacement. Otherwise, the word from<br />

synonyms file is replaced. The same strategy will be iterated<br />

until the end of the secret message is reached. The recipient<br />

can decrypt the message by an <strong>in</strong>vert<strong>in</strong>g strategy and<br />

compar<strong>in</strong>g if a replacement has occurred, <strong>in</strong> which case the<br />

secret code is 1.<br />

Other similar techniques have been presented <strong>in</strong> [12]. The<br />

algorithm consists of three <strong>in</strong>put sources; natural language;<br />

secret message and the key, and one output which is Stegoobject.<br />

By creat<strong>in</strong>g lexical substitutions set and variant forms<br />

of the same word, and after the first scan, the system will<br />

recognize each word and to which set it belongs to. The lexical<br />

analyzer for Ch<strong>in</strong>ese language is then used to embed the<br />

correct word <strong>in</strong> the carrier file and take the context <strong>in</strong>to<br />

consideration.<br />

L<strong>in</strong>e shift<strong>in</strong>g techniques have been presented <strong>in</strong> [13] by<br />

vertically shift<strong>in</strong>g the l<strong>in</strong>e. In some measurements , are<br />

employed so that secret data can be passed onto it. The ma<strong>in</strong><br />

drawback of this strategy is that character recognition<br />

programs can detect l<strong>in</strong>e shift<strong>in</strong>g. In addition, hidden data will<br />

be discarded by retyp<strong>in</strong>g the carrier document.<br />

Authors <strong>in</strong> [14] <strong>in</strong>troduced Syntactic methods us<strong>in</strong>g<br />

punctuations to pass the secret message by plac<strong>in</strong>g<br />

punctuations <strong>in</strong> proper positions to hide data. This method will<br />

not affect the mean<strong>in</strong>g of the message or the data embedded<br />

<strong>in</strong>side it. On other hand, little amount of data can be embedded<br />

by us<strong>in</strong>g this method.<br />

Authors <strong>in</strong> [12] employed one of the new technology<br />

techniques for Short Messag<strong>in</strong>g Service (SMS). The technique<br />

relies on us<strong>in</strong>g message abbreviation for words, and creat<strong>in</strong>g a<br />

table of abbreviation and description of the message, similar to<br />

Table I.<br />

TABLE I. SMS ABBREVIATION<br />

Abbreviation<br />

Description<br />

AFAIK<br />

As Far As I Know<br />

ABT<br />

About<br />

B4<br />

Before<br />

BTW<br />

By The Way<br />

EOL<br />

End of Lecture<br />

CRAP<br />

Cheap, redundant assorted<br />

products<br />

In [15], a novel text hid<strong>in</strong>g algorithm was proposed us<strong>in</strong>g<br />

chat properties; especially nowadays where chatt<strong>in</strong>g rooms are<br />

very popular. The ma<strong>in</strong> idea <strong>in</strong> this research was to use<br />

emoticons to hide data where tremendous number of those<br />

symbols could be used to hide data <strong>in</strong>side the file. Most of<br />

chat users use emoticons <strong>in</strong>stead words nowadays. So the first<br />

step of this algorithm is to classify symbols semantically and<br />

then control the symbol order by us<strong>in</strong>g a secret key. The work<br />

creates 4 sets of emoticons to hide data. As an example, if the<br />

symbol is <strong>in</strong>serted at the beg<strong>in</strong>n<strong>in</strong>g of sentence, it would pass<br />

0, otherwise the secret bit is 1. Other strategies used <strong>in</strong> this<br />

work rely on symbol order where data can be extracted from<br />

the symbol depend<strong>in</strong>g on the symbol order number <strong>in</strong> its set.<br />

For example, if it is <strong>in</strong> set 4 and order 3, the secret data is

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!