
Signal Analysis Research (SAR) Group - RNet - Ryerson University


Fig. 3. Log–spectrogram of the test signal. Male utterance of the word “bingo” with a sampling rate of 8 kHz.

formant referenced by the black arrow. The watermark is embedded as described in Sec. 3.1. First, the data (τ)₂ = 010011 is used to generate the amplitude of the watermark by means of Eq. 16. Then the insertion zone is chosen manually in order to define the warping operator used to generate the time–warped watermark. Finally, the time–warped watermark is added to the original signal. The result of the watermark embedding is shown in Fig. 4. As can be seen, the time–warped watermark is very close to the original formant. As expected, the frequency spread decreases very quickly, thanks to the smoothness of the amplitude of the watermark sequence.

Fig. 4. Zoomed log–spectrogram of the (a) original and (b) watermarked test signal.
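The embedding chain described above (binary data → watermark amplitude → time–warping → addition to the signal) can be sketched as follows. Since Eq. 16 and the exact warping operator are not reproduced in this excerpt, the amplitude envelope, the formant-like instantaneous-frequency trajectory, and the strength parameter `alpha` below are illustrative placeholders, not the authors' actual formulas.

```python
import numpy as np

def embed_watermark(signal, bits, insert_start, insert_len, fs=8000, alpha=0.01):
    """Illustrative time-warped watermark embedding.

    bits         : binary payload, e.g. [0, 1, 0, 0, 1, 1]
    insert_start : first sample of the (manually chosen) insertion zone
    insert_len   : length of the insertion zone in samples
    alpha        : watermark strength (placeholder for Eq. 16)
    """
    n = np.arange(insert_len)

    # Placeholder for Eq. 16: a smooth, tapered amplitude envelope modulated
    # by the bits, so the frequency spread of the watermark decays quickly.
    taper = np.hanning(insert_len)
    seg = insert_len // len(bits)
    amp = np.repeat([1.0 if b else 0.5 for b in bits], seg)
    amp = np.pad(amp, (0, insert_len - len(amp)), mode="edge") * taper * alpha

    # Placeholder warping: a slowly varying instantaneous frequency that
    # follows a formant-like trajectory inside the insertion zone.
    inst_freq = 500.0 + 200.0 * np.sin(2 * np.pi * n / insert_len)  # Hz
    phase = 2 * np.pi * np.cumsum(inst_freq) / fs
    watermark = amp * np.cos(phase)

    # Add the time-warped watermark to the original signal.
    out = signal.copy()
    out[insert_start:insert_start + insert_len] += watermark
    return out, watermark
```

The watermark energy is confined to the insertion zone and vanishes at its edges, mirroring the smooth amplitude behaviour noted above.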

We found the embedding strategy satisfactory, since we were not able to guess whether the signal was watermarked or not during blind tests. In order to provide a more objective comparison criterion, we make use of the “Auditory Toolbox” [4] to generate auditory representations of the original and watermarked signals; this is a pseudo–time–frequency representation based on physiological aspects of human hearing. Auditory representations of the original and watermarked signals are depicted in Fig. 5. Both representations are very similar, which confirms the stealthiness of the watermark.
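The paper's objective comparison uses Slaney's Auditory Toolbox; as a much simpler hedged stand-in (it does not reproduce the auditory model), one can correlate log-magnitude spectrograms of the two signals to quantify their similarity:

```python
import numpy as np

def spectrogram_similarity(x, y, win=256, hop=128):
    """Crude objective similarity: correlation of log-magnitude STFTs.

    A simple stand-in for the auditory-model comparison in the text;
    it does NOT reproduce the Auditory Toolbox representation.
    """
    def logspec(s):
        w = np.hanning(win)
        frames = [s[i:i + win] * w for i in range(0, len(s) - win + 1, hop)]
        mag = np.abs(np.fft.rfft(np.asarray(frames), axis=1))
        return np.log(mag + 1e-9)

    a, b = logspec(np.asarray(x, float)), logspec(np.asarray(y, float))
    a, b = a - a.mean(), b - b.mean()
    # Normalized cross-correlation of the two centered log-spectrograms.
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```

A value near 1 indicates that the watermarked spectrogram is close to the original, in the same spirit as the visual comparison in Fig. 5.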

The next step consists in the recovery of the watermark sequence. We tested the proposed approach on the true watermarked signal, and on two different deteriorated versions of the watermarked signal: the first is an MP3 compression attack, and the second is an additive Gaussian noise attack with a signal–to–noise ratio of 0 dB.
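The 0 dB additive white Gaussian noise attack used above can be reproduced as follows (a minimal sketch; the `seed` parameter is only for reproducibility and is not part of the attack itself):

```python
import numpy as np

def awgn_attack(signal, snr_db=0.0, seed=0):
    """Add white Gaussian noise at the requested signal-to-noise ratio."""
    rng = np.random.default_rng(seed)
    sig_power = np.mean(signal ** 2)
    # At 0 dB, the noise power equals the signal power.
    noise_power = sig_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise
```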

Results of the matched-filtering estimation are presented in Tab. 1. The estimation results show that the watermark is perfectly extracted and has resisted the MP3 attack as well as the white–noise attack.
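The matched-filtering estimation of the set {τᵢ} is not detailed in this excerpt. As a hedged sketch of the general technique, each bit can be decided by correlating the received segment with the two candidate watermark templates (one per bit hypothesis) and keeping the better match; the segment/template structure below is an assumption, not the authors' exact procedure.

```python
import numpy as np

def estimate_bits(received, templates0, templates1):
    """Matched-filter bit decisions.

    received   : list of received signal segments, one per bit slot
    templates0 : candidate watermark template for bit 0 in each slot
    templates1 : candidate watermark template for bit 1 in each slot
    """
    bits = []
    for r, t0, t1 in zip(received, templates0, templates1):
        # The matched-filter output is the inner product with each
        # template, normalized by the template energy.
        c0 = np.dot(r, t0) / (np.linalg.norm(t0) + 1e-12)
        c1 = np.dot(r, t1) / (np.linalg.norm(t1) + 1e-12)
        bits.append(1 if c1 > c0 else 0)
    return bits
```

Because the correlation against the wrong template stays small even under heavy additive noise, this decision rule is consistent with the 0 dB robustness reported in Tab. 1.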


Fig. 5. Auditory representation of the (a) original signal and the (b) watermarked signal.

τ             τ₁  τ₂  τ₃  τ₄  τ₅  τ₆
True          0   1   0   0   1   1
No attack     0   1   0   0   1   1
MP3 attack    0   1   0   0   1   1
Noise attack  0   1   0   0   1   1

Table 1. Results of the estimation of the set {τᵢ} by matched filtering.

5. CONCLUSION

In this paper we have proposed a new watermarking method for speech signals, based on the time–warping signal processing concept. We have shown that it is possible to exploit physiological aspects of human hearing in order to carry information while keeping the inserted watermark stealthy. We have then developed a complete extraction method, based on time–varying filters, time–warping operators and matched filtering, to recover the watermark sequence. Numerical results show that the watermark is likely inaudible and that the numerical information it carries is retrievable. Future work will include a closer study of the robustness of the method against various attacks. For real applications, another topic is the unsupervised embedding of the watermark according to the position of formants. This issue is left for future work.

6. REFERENCES

[1] M. Arnold, “Audio watermarking: Features, applications and algorithms,” in IEEE International Conference on Multimedia and Expo, New York, USA, July 2000.

[2] R. Baraniuk, “Unitary equivalence: A new twist on signal processing,” IEEE Trans. on Signal Processing, vol. 43, no. 10, pp. 2269–2282, Oct. 1995.

[3] M.C. Botte, G. Canevet, L. Demany, and C. Sorin, Psychoacoustique et perception auditive, Inserm, 1989.

[4] M. Slaney, “Auditory Toolbox, version 2.0,” Available at http://www.slaney.org/malcolm/pubs.html, 1994.
