27.05.2015 Views

SPHSC 503 - Introduction to plots in Matlab 7 - ISDL@UW


SPHSC 503 – Speech Signal Processing UW – Summer 2006

Homework 1 – due Wednesday 6/28 in class

Since this is the first homework in this course, you may wonder what to hand in. Your homework
should include all requested plots, and it should provide answers to all questions and assignments
in the homework. For example, for problem 1.1, you don’t need to provide anything for part a, but
you need to provide a plot for each of parts b and c, the duration of the signal in seconds for
part d, and estimates at t=0.2 and t=0.4 of the distance in seconds between peaks (part e, 2
estimates) and of the fundamental frequency (part f, 2 estimates). You’re free to include any
additional information, such as Matlab commands, supporting plots and comments, but those are not
required.

You may either bring a hard copy of your homework to class on Wednesday or email it to me. If you
email it, I recommend collecting all your plots and answers in a single word-processing document,
such as a Microsoft Word file.

Problem 1.1 – Loading and analyzing a speech signal

Download the file ex1_3.wav from the class website, and save it in a folder on your computer
(if you’re in SAV 137, the suggested location for the file is C:\Temp\SPHSC503\ex1_3.wav).
The file contains the spoken word “zero” sampled at 10 kHz. Change the current directory in
Matlab to the directory that contains the saved file.

a. Load the ex1_3.wav file into Matlab. You can either use Matlab’s Import Wizard, by
   double-clicking on the filename in Matlab’s current directory window, or use the
   wavread command (see help wavread for details).

b. Plot the speech signal against its index (n) by using the interactive tools or by using the
   plot command. Label the axes and the plot appropriately.

c. Plot the speech signal as a function of time, and label the axes and the plot appropriately.
   Hint: you need to create a ‘time’ vector t, and then plot the signal with “plot(t,y)”.
   You can create the correct time vector by dividing the speech signal’s index vector by the
   sampling frequency of the signal, for example:

   t = n / fs;

d. What is the duration of the signal in seconds? It may be helpful to use the figure’s zoom,
   pan and data cursor tools.

Voiced speech, such as vowels, is characterized by a series of high-energy peaks in the
speech signal. Those peaks are created by the repeated opening and closing of the vocal
cords.

e. Make a rough estimate of the distance in seconds between the peaks in the “zero” speech
   signal around t=0.2 and t=0.4 seconds, corresponding to the two vowels. Again, it may be
   helpful to use the figure’s zoom, pan and data cursor tools.

f. Convert the measured distances in seconds from part e into an estimate of the
   fundamental frequency (in Hz) of the speech signal around t=0.2 and t=0.4 seconds.
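The conversions in parts c to f can be tried out on a signal whose answer is known in advance. The sketch below is not a solution for ex1_3.wav: it assumes a synthetic 125 Hz sine as a stand-in for speech and an index that starts at 0, and the names durationSec, periodSec and f0Est are illustrative only. It builds the time vector from the index, computes the duration, and turns the spacing between two neighbouring peaks into a frequency estimate.

```matlab
fs = 10000;                      % sampling frequency in Hz (the handout's file is sampled at 10 kHz)
f0 = 125;                        % ASSUMPTION: pitch of the synthetic stand-in signal, in Hz
n  = 0:4999;                     % sample index, assumed here to start at 0
t  = n / fs;                     % time vector in seconds (the hint from part c)
y  = sin(2*pi*f0*t);             % synthetic periodic signal, not real speech
% For the real file, y and fs would come from wavread('ex1_3.wav') in Matlab 7
% (or audioread('ex1_3.wav') in current releases).

durationSec = numel(y) / fs;     % part d: number of samples divided by fs

% Locate local maxima, mimicking what one reads off with the data cursor.
inner     = y(2:end-1);
isPeak    = [false, (inner > y(1:end-2)) & (inner >= y(3:end)), false];
peakTimes = t(isPeak);

periodSec = peakTimes(2) - peakTimes(1);   % part e: distance between two peaks in seconds
f0Est     = 1 / periodSec;                 % part f: frequency is the reciprocal of the period

plot(t, y);
xlabel('Time (s)'); ylabel('Amplitude'); title('Synthetic test signal versus time');
```

With real speech the peaks are less regular than in a sine, so a manual reading of two or three neighbouring peaks is only a rough estimate, which is all parts e and f ask for.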


Problem 1.2 – Measuring fundamental frequency with correlation

In problem 1.1f, you’ve manually found an estimate for the fundamental frequency of the speech
signal. In this problem, we will use a technique called correlation to estimate the fundamental
frequency automatically. Correlation is a measure of the degree to which two sequences are
similar. It is related to convolution, and its mathematical expression looks like the convolution
sum. There are two kinds of correlation: auto-correlation (correlation between a signal and itself)
and cross-correlation (correlation between two different signals). They are defined as follows:

Auto-correlation: $r_{xx}[l] = \sum_{n=-\infty}^{\infty} x[n]\, x[n-l]$

Cross-correlation: $r_{xy}[l] = \sum_{n=-\infty}^{\infty} x[n]\, y[n-l]$

In Matlab, both types of correlation can be computed using the xcorr function from the Signal
Processing Toolbox. For example,

>> x = [1 1 1]; lmax = 3;
>> [rxx,l] = xcorr(x,lmax);
>> stem(l,rxx); % a triangle

computes and plots the auto-correlation of x, $r_{xx}[l]$, for l=-lmax,…,lmax. Similarly,

>> x = [1 1 1]; y = [-1 0 1]; lmax = 3;
>> [rxy,l] = xcorr(x,y,lmax);
>> stem(l,rxy); % 2 up, 2 down

computes and plots the cross-correlation of x and y, $r_{xy}[l]$, for l=-lmax,…,lmax.
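The remark that correlation is related to convolution can be checked directly on the two example vectors. For real-valued sequences, the correlation sums defined above equal a convolution of x with a time-reversed copy of the second sequence. The sketch below uses only conv and fliplr from base Matlab; the lag vectors lxx and lxy are illustrative names, and the result covers only the lags where the sequences overlap (xcorr with lmax = 3 additionally returns a zero at each end).

```matlab
x = [1 1 1];
y = [-1 0 1];

% Auto-correlation: convolve x with its own time-reversed copy.
rxx = conv(x, fliplr(x));
lxx = -(numel(x)-1):(numel(x)-1);    % lags covered by the convolution result

% Cross-correlation: convolve x with the time-reversed copy of y.
rxy = conv(x, fliplr(y));
lxy = -(numel(y)-1):(numel(x)-1);    % lags covered by the convolution result

disp([lxx; rxx]);    % first row: lags, second row: auto-correlation values
disp([lxy; rxy]);    % first row: lags, second row: cross-correlation values
```

The auto-correlation values form the triangle and the cross-correlation values the "2 up, 2 down" pattern mentioned in the example comments. For complex-valued signals the reversed copy would also have to be conjugated, which this sketch does not do.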

a. Clear the workspace (clear), load the speech signal from problem 1.1 again, and extract
   a voiced section of the speech signal using yvoiced1 = y(1900:2300); . Plot the
   voiced section against its index, nvoiced1 = 1900:2300; .

b. Compute and plot the autocorrelation of the voiced section for lmax=250. You can
   modify the example code above to do this. Label your x-axis as ‘Lag (in samples)’.

Notice the following in your plot of the autocorrelation: the autocorrelation has its highest peak
at zero lag (l=0). This is a necessary property of all autocorrelations. In addition, the
autocorrelation has strong positive peaks at equal distances to the left and right of the zero-lag
point.

c. Determine the value of the lag for the next strongest peak to the left and right. The zoom
   tools of the plot window may be helpful for this.

d. Divide the value of the lag you found in part c by the sampling frequency to get the value
   of the lag in seconds.

e. The value of the lag in seconds should correspond more or less to the distance between
   the peaks you found in problem 1.1e for t=0.2. Is that the case for you? Convert the value
   of the lag in seconds to a frequency in Hz. This value should correspond to the
   fundamental frequency from problem 1.1f for t=0.2.

f. Repeat a-e for the second voiced section around t=0.4, using nvoiced2 = 3800:4200; .
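Steps b to e can be rehearsed on a synthetic segment before applying them to the speech signal. The sketch below is an illustration under assumptions, not the expected homework answer: the segment is a 125 Hz tone with one harmonic rather than real speech, and minLag is a hypothetical threshold that keeps the search away from the zero-lag peak. It uses xcorr in the same way as the examples above, so it needs the Signal Processing Toolbox.

```matlab
fs   = 10000;                          % sampling frequency in Hz
f0   = 125;                            % ASSUMPTION: pitch of the synthetic segment, in Hz
n    = 0:400;                          % 401 samples, the same length as y(1900:2300)
yseg = sin(2*pi*f0*n/fs) + 0.5*sin(2*pi*2*f0*n/fs);   % stand-in for a voiced section

lmax   = 250;
[r, l] = xcorr(yseg, lmax);            % autocorrelation for lags -lmax..lmax

% Find the strongest peak to the right of zero lag.
minLag    = 20;                        % ASSUMPTION: skip lags that belong to the zero-lag peak
candidate = (l >= minLag);
rc        = r(candidate);
lc        = l(candidate);
[~, k]    = max(rc);
peakLag   = lc(k);                     % part c: lag in samples

lagSec = peakLag / fs;                 % part d: lag in seconds
f0Est  = 1 / lagSec;                   % part e: frequency in Hz

stem(l, r);
xlabel('Lag (in samples)'); ylabel('Autocorrelation');
```

Because the autocorrelation of a finite segment tapers off with increasing lag, the detected peak can land a sample to either side of the true period, so the estimate should be read as approximate.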


Problem 1.3 – Measuring fundamental frequency with a pitch estimator

It is possible to fully automate the estimation of the fundamental frequency of a speech signal
with the methods used in problems 1.1 and 1.2. In this problem, we will use a pitch estimator to
estimate the pitch of the entire speech signal.

a. Download the file pitchestimate.m from the class website. This m-file contains the
   Matlab function pitchestimate, which can be used as follows:

   % y, fs are the input signal and its sampling frequency
   >> [t,p]=pitchestimate(y,fs);
   % t contains the times at which the fundamental frequency was estimated
   % p contains the estimates of the fundamental frequency
   >> plot(t,p)

   See help pitchestimate for details.

b. Clear the workspace, load the speech signal again, and estimate and plot its fundamental
   frequency using the pitchestimate function.

c. [optional] If you have access to a microphone, it could be interesting to record your own
   voice and determine your own fundamental frequency.
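To give a feel for how the method of problem 1.2 might be automated, the sketch below applies the autocorrelation peak search frame by frame. It is not the class's pitchestimate.m, whose internals are not described here. The frame length, hop size and 60 to 400 Hz search range are assumptions chosen for illustration, and the test signal is synthetic (100 Hz for the first half second, 200 Hz for the second). It needs xcorr from the Signal Processing Toolbox.

```matlab
fs = 10000;                                   % sampling frequency in Hz
th = (0:fs/2-1) / fs;                         % time axis for half a second
y  = [sin(2*pi*100*th), sin(2*pi*200*th)];    % synthetic signal: 100 Hz, then 200 Hz

frameLen = 400;                               % ASSUMPTION: 40 ms analysis frames
hop      = 200;                               % ASSUMPTION: 20 ms step between frames
minLag   = round(fs/400);                     % ASSUMPTION: highest pitch searched is 400 Hz
maxLag   = round(fs/60);                      % ASSUMPTION: lowest pitch searched is 60 Hz

nFrames = floor((numel(y) - frameLen) / hop) + 1;
tp = zeros(1, nFrames);                       % frame centre times in seconds
p  = zeros(1, nFrames);                       % pitch estimates in Hz

for k = 1:nFrames
    idx    = (k-1)*hop + (1:frameLen);        % sample indices of this frame
    [r, l] = xcorr(y(idx), maxLag);           % autocorrelation of the frame
    keep   = (l >= minLag);                   % positive lags inside the search range
    rk     = r(keep);
    lk     = l(keep);
    [~, m] = max(rk);                         % strongest peak away from zero lag
    p(k)   = fs / lk(m);                      % lag in samples converted to Hz
    tp(k)  = (idx(1) - 1 + frameLen/2) / fs;
end

plot(tp, p);
xlabel('Time (s)'); ylabel('Estimated fundamental frequency (Hz)');
```

A sketch like this produces an estimate for every frame, including unvoiced or silent ones where no pitch exists, and frames that straddle the change in pitch give unreliable values. A practical estimator would add a voicing decision on top.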
