
Linear Prediction Analysis of Speech Sounds - Berlin Chen


Linear Predictive Coefficients (LPC)

• An all-pole filter with a sufficient number of poles is a good approximation to model the vocal tract (filter) for speech signals:

  $H(z) = \frac{X(z)}{E(z)} = \frac{1}{1 - \sum_{k=1}^{p} a_k z^{-k}} = \frac{1}{A(z)}$

  $x[n] = \sum_{k=1}^{p} a_k x[n-k] + e[n]$

  $\tilde{x}[n] = \sum_{k=1}^{p} a_k x[n-k]$

  where $a_1, a_2, \ldots, a_p$ are the vocal tract parameters.

  – It predicts the current sample as a linear combination of its several past samples.

• Also known as linear predictive coding, LPC analysis, or autoregressive modeling.

2004 SP- Berlin Chen 2
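The prediction step above, $\tilde{x}[n] = \sum_k a_k x[n-k]$, can be sketched in a few lines of pure Python. This is an illustrative sketch only; the function name `lpc_predict` and the toy second-order coefficients are mine, not from the slides:

```python
# Minimal sketch of linear prediction: predict x[n] from its p past samples.
def lpc_predict(x, a):
    """Predict x[n] as sum_k a[k] * x[n-1-k]; return predictions and errors e[n]."""
    p = len(a)
    pred, err = [], []
    for n in range(p, len(x)):
        xn_hat = sum(a[k] * x[n - 1 - k] for k in range(p))
        pred.append(xn_hat)
        err.append(x[n] - xn_hat)
    return pred, err

# Toy signal that exactly obeys x[n] = 0.5*x[n-1] + 0.3*x[n-2] after the start:
a = [0.5, 0.3]
x = [1.0, 2.0]
for n in range(2, 10):
    x.append(a[0] * x[n - 1] + a[1] * x[n - 2])

pred, err = lpc_predict(x, a)
print(max(abs(e) for e in err))  # 0.0 -- a matching AR signal is predicted exactly
```

With the true model coefficients the prediction error vanishes; for real speech the coefficients must be estimated per frame, which is what the following slides derive.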


Short-Term Analysis: Algebra Approach

• Estimate the corresponding LPC coefficients as those that minimize the total short-term prediction error (minimum mean squared error).

• After framing/windowing, the total short-term prediction error for a specific frame $m$ is

  $E_m = \sum_n e_m^2[n] = \sum_n \left( x_m[n] - \tilde{x}_m[n] \right)^2 = \sum_n \left( x_m[n] - \sum_{j=1}^{p} a_j x_m[n-j] \right)^2, \quad 0 \le n \le N-1$

• Take the derivative and set it to zero:

  $\frac{\partial E_m}{\partial a_i} = \frac{\partial}{\partial a_i} \sum_n \left( x_m[n] - \sum_{j=1}^{p} a_j x_m[n-j] \right)^2 = 0, \quad \forall\, 1 \le i \le p$

  $\Rightarrow \sum_n \left( x_m[n] - \sum_{j=1}^{p} a_j x_m[n-j] \right) x_m[n-i] = 0, \quad \forall\, 1 \le i \le p$

  $\Rightarrow \sum_n e_m[n]\, x_m[n-i] = 0, \quad \forall\, 1 \le i \le p$

• The error vector is orthogonal to the past vectors. This property will be used later on!


Short-Term Analysis: Algebra Approach

  $\sum_n \left( x_m[n] - \sum_{j=1}^{p} a_j x_m[n-j] \right) x_m[n-i] = 0, \quad \forall\, 1 \le i \le p$

  $\Rightarrow \sum_{j=1}^{p} a_j \sum_n x_m[n-i]\, x_m[n-j] = \sum_n x_m[n-i]\, x_m[n], \quad \forall\, 1 \le i \le p$

• Define the correlation coefficients $\phi_m[i,j] = \sum_n x_m[n-i]\, x_m[n-j]$. Then

  $\sum_{j=1}^{p} a_j \phi_m[i,j] = \phi_m[i,0], \quad \forall\, 1 \le i \le p \quad \Rightarrow \quad \Phi a = \psi$   (to be used on the next page!)

  or, in matrix form:

  $\begin{bmatrix} \phi_m[1,1] & \phi_m[1,2] & \cdots & \phi_m[1,p] \\ \phi_m[2,1] & \phi_m[2,2] & \cdots & \phi_m[2,p] \\ \vdots & \vdots & \ddots & \vdots \\ \phi_m[p,1] & \phi_m[p,2] & \cdots & \phi_m[p,p] \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} \phi_m[1,0] \\ \phi_m[2,0] \\ \vdots \\ \phi_m[p,0] \end{bmatrix}$


Short-Term Analysis: Algebra Approach

• The minimum error for the optimal $a_j$, $1 \le j \le p$:

  $E_m = \sum_n e_m^2[n] = \sum_n \left( x_m[n] - \tilde{x}_m[n] \right)^2$

  $\quad = \sum_n x_m^2[n] - 2 \sum_{j=1}^{p} a_j \sum_n x_m[n]\, x_m[n-j] + \sum_{j=1}^{p} \sum_{k=1}^{p} a_j a_k \sum_n x_m[n-j]\, x_m[n-k]$

  Using the orthogonality property derived on the previous page, the quadratic term equals $\sum_{j=1}^{p} a_j \phi_m[j,0]$, so the total prediction error is

  $E_m = \sum_n x_m^2[n] - \sum_{j=1}^{p} a_j \sum_n x_m[n]\, x_m[n-j] = \phi_m[0,0] - \sum_{j=1}^{p} a_j \phi_m[0,j]$

• The error can be monitored to help establish $p$.
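The algebra above can be checked numerically: build the correlation coefficients $\phi_m[i,j]$, solve the $p \times p$ normal equations, and verify that the residual energy equals $\phi_m[0,0] - \sum_j a_j \phi_m[0,j]$. A pure-Python sketch (the helper names, the random toy frame, and the small Gaussian-elimination solver are mine, for illustration only):

```python
# Sketch: solve the LPC normal equations and verify the minimum-error formula.
import random

def phi(x, i, j, start, N):
    """Correlation coefficient phi[i, j] = sum_n x[n-i] * x[n-j] over the frame."""
    return sum(x[n - i] * x[n - j] for n in range(start, start + N))

def solve(A, b):
    """Gaussian elimination with partial pivoting (fine for small p)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    a = [0.0] * n
    for r in range(n - 1, -1, -1):
        a[r] = (M[r][n] - sum(M[r][k] * a[k] for k in range(r + 1, n))) / M[r][r]
    return a

random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(120)]  # toy "speech" frame
p, start, N = 4, 10, 100

A = [[phi(x, i, j, start, N) for j in range(1, p + 1)] for i in range(1, p + 1)]
b = [phi(x, i, 0, start, N) for i in range(1, p + 1)]
a = solve(A, b)

E_direct = sum((x[n] - sum(a[j - 1] * x[n - j] for j in range(1, p + 1))) ** 2
               for n in range(start, start + N))
E_formula = phi(x, 0, 0, start, N) - sum(a[j - 1] * phi(x, 0, j, start, N)
                                         for j in range(1, p + 1))
print(abs(E_direct - E_formula) < 1e-6)  # True: both error expressions agree
```

The agreement of the two error values is exactly the consequence of the orthogonality property used in the derivation.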


Short-Term Analysis: Geometric Approach

• Vector representations of error and speech signals; the past vectors appear as column vectors:

  $x_m[n] = \sum_{k=1}^{p} a_k x_m[n-k] + e_m[n], \quad 0 \le n \le N-1$

  $\begin{bmatrix} x_m[-1] & x_m[-2] & \cdots & x_m[-p] \\ x_m[0] & x_m[-1] & \cdots & x_m[1-p] \\ \vdots & \vdots & & \vdots \\ x_m[N-2] & x_m[N-3] & \cdots & x_m[N-1-p] \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} + \begin{bmatrix} e_m[0] \\ e_m[1] \\ \vdots \\ e_m[N-1] \end{bmatrix} = \begin{bmatrix} x_m[0] \\ x_m[1] \\ \vdots \\ x_m[N-1] \end{bmatrix}$

  i.e. $X a + e_m = x_m$, where $X = (x_m^1, x_m^2, \ldots, x_m^p)$ with columns $x_m^i = (x_m[-i], x_m[1-i], \ldots, x_m[N-1-i])^T$ and $e_m = (e_m[0], e_m[1], \ldots, e_m[N-1])^T$.

• $e_m$ is minimal if $X^T e_m = 0$: the prediction error vector must be orthogonal to the past vectors (this property has been shown previously, p. 3).

  $X^T (x_m - X a) = 0 \ \Rightarrow\ X^T X a = X^T x_m \ \Rightarrow\ a = (X^T X)^{-1} X^T x_m$
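The geometric and algebraic views coincide because $X^T X$ is exactly the matrix of correlation coefficients $\phi_m[i,j]$. A short pure-Python check (the toy frame and index choices are mine, for illustration only):

```python
# Sketch: the data matrix X of delayed "past" vectors satisfies
# (X^T X)[i-1][j-1] == phi[i, j] from the algebraic derivation.
x = [float(v) for v in range(1, 21)]  # toy frame; the values are arbitrary
p, N, start = 3, 10, 5

# Row n of X holds x[start + n - i] for i = 1..p (column i is the i-delayed vector).
X = [[x[start + n - i] for i in range(1, p + 1)] for n in range(N)]
XtX = [[sum(X[n][r] * X[n][c] for n in range(N)) for c in range(p)]
       for r in range(p)]

def phi(i, j):
    return sum(x[start + n - i] * x[start + n - j] for n in range(N))

ok = all(abs(XtX[i - 1][j - 1] - phi(i, j)) < 1e-9
         for i in range(1, p + 1) for j in range(1, p + 1))
print(ok)  # True
```

So solving $X^T X a = X^T x_m$ by least squares is the same computation as solving $\Phi a = \psi$.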


Short-Term Analysis: Autocorrelation Method

• $x_m[n]$ is identically zero outside $0 \le n \le N-1$:

  $x_m[n] = \begin{cases} x[n + mL]\, w[n], & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases}$

  where $L$ is the frame period, the length of time between successive frames. Framing/windowing: first shift, $\tilde{x}_m[n] = x[n + mL]$, then window, $x_m[n] = \tilde{x}_m[n]\, w[n]$.

• The mean-squared error is calculated within $n = 0 \sim N-1+p$.
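The framing/windowing step $x_m[n] = x[n + mL]\,w[n]$ can be sketched directly. This is an assumed pure-Python illustration; the function name `frames` and the toy signal are mine, and the Hamming window formula is the standard one also used in the take-home exercise:

```python
# Sketch of framing/windowing: hop L samples per frame, apply a length-N window.
import math

def frames(x, L, N):
    """Split x into Hamming-windowed frames of length N with frame period L."""
    w = [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]
    out = []
    m = 0
    while m * L + N <= len(x):                 # frame m covers x[mL .. mL+N-1]
        out.append([x[m * L + n] * w[n] for n in range(N)])
        m += 1
    return out

x = [float(n % 7) for n in range(50)]          # toy signal
f = frames(x, L=10, N=20)
print(len(f), len(f[0]))  # 4 20
```

With $L < N$ successive frames overlap, which is the usual choice for short-term analysis.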


Short-Term Analysis: Autocorrelation Method

• The mean-squared error will be

  $E_m = \sum_{n=0}^{N-1+p} e_m^2[n] = \sum_{n=0}^{N-1+p} \left( x_m[n] - \tilde{x}_m[n] \right)^2$

  Why? $x_m[n]$ is nonzero only on $0 \le n \le N-1$, so $e_m[n]$ can be nonzero on $0 \le n \le N+p-1$.

• Take the derivative $\partial E_m / \partial a_i$ as before:

  $\sum_{j=1}^{p} a_j \phi_m[i,j] = \phi_m[i,0], \quad \forall\, 1 \le i \le p$

  where, for $i \ge j$,

  $\phi_m[i,j] = \sum_{n=0}^{N+p-1} x_m[n-i]\, x_m[n-j] = \sum_{n=i}^{N-1+j} x_m[n-i]\, x_m[n-j] = \sum_{n=0}^{N-1-(i-j)} x_m[n]\, x_m[n+(i-j)]$


Short-Term Analysis: Autocorrelation Method

• Alternatively, $\phi_m[i,j] = R_m[i-j]$, where $R_m[k]$ is the autocorrelation function of $x_m[n]$:

  $R_m[k] = \sum_{n=0}^{N-1-k} x_m[n]\, x_m[n+k]$, and $R_m[k] = R_m[-k]$ (why?)

• Therefore:

  $\sum_{j=1}^{p} a_j \phi_m[i,j] = \phi_m[i,0] \ \Rightarrow\ \sum_{j=1}^{p} a_j R_m[i-j] = R_m[i], \quad \forall\, 1 \le i \le p$

  $\begin{bmatrix} R_m[0] & R_m[1] & \cdots & R_m[p-1] \\ R_m[1] & R_m[0] & \cdots & R_m[p-2] \\ \vdots & \vdots & \ddots & \vdots \\ R_m[p-1] & R_m[p-2] & \cdots & R_m[0] \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} R_m[1] \\ R_m[2] \\ \vdots \\ R_m[p] \end{bmatrix}$

  A Toeplitz matrix: symmetric, and all elements of each diagonal are equal.
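The short-term autocorrelation $R_m[k]$ is a one-liner in pure Python. A sketch (the function name and toy frame are mine); the symmetry $R[k] = R[-k]$ holds because the frame is zero outside $0 \le n \le N-1$:

```python
# Sketch: short-term autocorrelation R[k] = sum_{n=0}^{N-1-k} x[n] * x[n+k].
def autocorr(x, k):
    k = abs(k)                       # R[k] = R[-k] for a finite (windowed) frame
    return sum(x[n] * x[n + k] for n in range(len(x) - k))

x = [1.0, 2.0, 3.0, 4.0]
print(autocorr(x, 0))                     # 30.0  (1 + 4 + 9 + 16)
print(autocorr(x, 1))                     # 20.0  (1*2 + 2*3 + 3*4)
print(autocorr(x, 1) == autocorr(x, -1))  # True
```

Because the normal-equation matrix depends only on $R_m[0..p-1]$, it is Toeplitz, which is what makes the Levinson-Durbin recursion on the next slide possible.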


Short-Term Analysis: Autocorrelation Method

• Levinson-Durbin Recursion

  1. Initialization: $E^{(0)} = R_m[0]$

  2. Iteration: for $i = 1, \ldots, p$ do the following recursion:

     $k_i = \dfrac{R_m[i] - \sum_{j=1}^{i-1} a_j^{(i-1)} R_m[i-j]}{E^{(i-1)}}$

     $a_i^{(i)} = k_i$   (a new, higher-order coefficient is produced at each iteration $i$)

     $a_j^{(i)} = a_j^{(i-1)} - k_i\, a_{i-j}^{(i-1)}, \quad \text{for } 1 \le j \le i-1$

     $E^{(i)} = \left( 1 - k_i^2 \right) E^{(i-1)}, \quad \text{where } -1 \le k_i \le 1$

  3. Final solution: $a_j = a_j^{(p)}$ for $1 \le j \le p$
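The recursion above transcribes almost line for line into pure Python. A sketch (function name and test signal are mine; `R` is the autocorrelation sequence $R_m[0..p]$):

```python
# Sketch: Levinson-Durbin recursion for the Toeplitz normal equations.
def levinson_durbin(R, p):
    a = [0.0] * (p + 1)              # a[1..p] hold the current coefficients
    E = R[0]                         # step 1: E^(0) = R[0]
    for i in range(1, p + 1):        # step 2: order recursion
        k = (R[i] - sum(a[j] * R[i - j] for j in range(1, i))) / E
        a_new = a[:]
        a_new[i] = k                 # new, higher-order coefficient a_i^(i) = k_i
        for j in range(1, i):
            a_new[j] = a[j] - k * a[i - j]
        a = a_new
        E = (1 - k * k) * E          # |k_i| <= 1, so E never increases
    return a[1:], E                  # step 3: a_j = a_j^(p), plus the final error

# Check against a signal that obeys x[n] = 0.6 * x[n-1] (almost) exactly:
x = [0.6 ** n for n in range(64)]
R = [sum(x[n] * x[n + k] for n in range(len(x) - k)) for k in range(3)]
a, E = levinson_durbin(R, 2)
print(abs(a[0] - 0.6) < 1e-9, abs(a[1]) < 1e-9)  # True True
```

For a first-order signal the second coefficient comes out (numerically) zero, illustrating how the monitored error $E^{(i)}$ stops decreasing once the model order is sufficient.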


Short-Term Analysis: Covariance Method

• $x_m[n]$ is not identically zero outside $0 \le n \le N-1$:
  – no window function is applied; $x_m[n] = x[n + mL]$

• The mean-squared error is calculated within $n = 0 \sim N-1$:

  $E_m = \sum_{n=0}^{N-1} e_m^2[n] = \sum_{n=0}^{N-1} \left( x_m[n] - \tilde{x}_m[n] \right)^2$


Short-Term Analysis: Covariance Method

• Take the derivative $\partial E_m / \partial a_i$:

  $\sum_{j=1}^{p} a_j \phi_m[i,j] = \phi_m[i,0], \quad \forall\, 1 \le i \le p$

  where

  $\phi_m[i,j] = \sum_{n=0}^{N-1} x_m[n-i]\, x_m[n-j] = \sum_{n=-i}^{N-1-i} x_m[n]\, x_m[n+(i-j)]$

  $\begin{bmatrix} \phi_m[1,1] & \phi_m[1,2] & \cdots & \phi_m[1,p] \\ \phi_m[2,1] & \phi_m[2,2] & \cdots & \phi_m[2,p] \\ \vdots & \vdots & \ddots & \vdots \\ \phi_m[p,1] & \phi_m[p,2] & \cdots & \phi_m[p,p] \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} \phi_m[1,0] \\ \phi_m[2,0] \\ \vdots \\ \phi_m[p,0] \end{bmatrix}$

• Not a Toeplitz matrix: it is symmetric, but not all elements of a diagonal are equal, e.g. $\phi_m[1,1] \ne \phi_m[2,2] \ne \cdots \ne \phi_m[p,p]$.
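The non-Toeplitz structure is easy to see numerically: with no window, samples before the frame start enter the sums, so the diagonal entries of $\Phi$ differ. A pure-Python sketch (helper name, toy signal, and index choices are mine, for illustration only):

```python
# Sketch: covariance-method correlation matrix -- symmetric but not Toeplitz.
def phi(x, i, j, start, N):
    """phi_m[i, j] = sum_{n=0}^{N-1} x[start + n - i] * x[start + n - j]."""
    return sum(x[start + n - i] * x[start + n - j] for n in range(N))

x = [float((3 * n) % 11) for n in range(40)]   # toy signal
start, N, p = 8, 24, 3
P = [[phi(x, i, j, start, N) for j in range(1, p + 1)]
     for i in range(1, p + 1)]

print(P[0][1] == P[1][0])   # True:  phi[1,2] == phi[2,1] (symmetric)
print(P[0][0] != P[1][1])   # True:  phi[1,1] != phi[2,2] (not Toeplitz)
```

Because the matrix is not Toeplitz, the Levinson-Durbin recursion does not apply; a general solver (e.g. Cholesky decomposition, since $\Phi$ is symmetric) is used instead.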


LPC Spectra

• The LPC spectrum matches the peaks more closely than the valleys.

• By Parseval's theorem,

  $E_m = \sum_{n=0}^{N-1+p} e_m^2[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| E_m(e^{j\omega}) \right|^2 d\omega = \frac{G^2}{2\pi} \int_{-\pi}^{\pi} \frac{\left| X_m(e^{j\omega}) \right|^2}{\left| \hat{X}(e^{j\omega}) \right|^2}\, d\omega$, where $\hat{X}(e^{j\omega}) = G \cdot H(e^{j\omega})$.

  – The regions where $|X_m(e^{j\omega})| > |\hat{X}(e^{j\omega})|$ contribute more to the error than those where $|X_m(e^{j\omega})| < |\hat{X}(e^{j\omega})|$, so minimizing $E_m$ favors fitting the spectral peaks.
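Evaluating the model spectrum $|\hat{X}(e^{j\omega})|^2 = G^2 / |A(e^{j\omega})|^2$ on a frequency grid takes only a few lines. A pure-Python sketch (the function name, the toy first-order coefficients, and the grid size are mine, for illustration only):

```python
# Sketch: evaluate the all-pole LPC spectrum G^2 / |1 - sum_k a_k e^{-jwk}|^2.
import cmath
import math

def lpc_spectrum(a, G, n_freq=8):
    """Return the LPC power spectrum at n_freq points in [0, pi)."""
    spec = []
    for m in range(n_freq):
        w = math.pi * m / n_freq
        A = 1 - sum(ak * cmath.exp(-1j * w * (k + 1)) for k, ak in enumerate(a))
        spec.append(G * G / abs(A) ** 2)
    return spec

a, G = [0.5], 1.0            # illustrative first-order model, pole at z = 0.5
s = lpc_spectrum(a, G, 4)
print(s[0] > s[-1])          # True: a single real pole peaks at w = 0
```

Plotting such a spectrum against the FFT of the frame (as in the take-home exercise) shows the peak-matching behavior described above.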


LPC Spectra

• LPC provides an estimate of the gross shape of the short-term spectrum.

[Figures: LPC spectra for model orders 6, 14, 24, and 128]


LPC Prediction Errors

[Figures: LPC prediction errors]


MFCC vs. LPC Cepstrum Coefficients

• MFCCs outperform LPC cepstrum coefficients:
  – The perceptually motivated mel-scale representation indeed helps recognition.

• Higher-order MFCCs do not further reduce the error rate compared with 13th-order MFCCs.

• Other perceptually motivated features, such as first- and second-order delta features, can significantly reduce recognition errors.


Take-Home Exercise

• Try to implement short-term linear predictive coding (LPC) for speech signals.

• Follow these instructions:
  1. Use the autocorrelation method with the Levinson-Durbin recursion and rectangular/Hamming windowing.
  2. Analyze the vowel (or FINAL) portions of the speech signal with different model orders (different $p$, e.g. $p$ = 6, 14, 24, and 128).
  3. Plot the LPC spectra as well as the original speech spectrum.
  4. Use the speech wave file bk6_1.wav (no header, PCM 16 kHz raw data) as the exemplar.


Take-Home Exercise

• Hints:
  1. Once the LPC coefficients $a_j$ are derived, you can construct the impulse response signal $h[n]$, $0 \le n \le N-1$ ($N$: frame size), by driving the all-pole filter with a unit impulse:

     $h[n] = \sum_{j=1}^{p} a_j h[n-j] + \delta[n]$

  2. The prediction error $E$ can be expressed by the autocorrelation function:

     $E = R_m[0] - \sum_{j=1}^{p} a_j R_m[j]$
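Hint 2 can be verified directly: for the autocorrelation method, the minimum prediction error is $R_m[0] - \sum_j a_j R_m[j]$, which is also the final $E^{(p)}$ of the Levinson-Durbin recursion. A pure-Python sketch (the helper names and the first-order test signal are mine, for illustration only):

```python
# Sketch: minimum prediction error from the autocorrelation sequence.
def autocorr(x, k):
    return sum(x[n] * x[n + k] for n in range(len(x) - k))

def pred_error(x, a):
    """E = R[0] - sum_j a_j * R[j] for predictor coefficients a[0..p-1]."""
    return autocorr(x, 0) - sum(aj * autocorr(x, j + 1) for j, aj in enumerate(a))

# Signal obeying x[n] = 0.6 * x[n-1]: only x[0] cannot be predicted, so E ~ x[0]^2.
x = [0.6 ** n for n in range(64)]
print(round(pred_error(x, [0.6]), 6))  # 1.0
```

Monitoring this quantity while increasing $p$ (as suggested earlier in the slides) shows when additional coefficients stop paying off.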


Take-Home Exercise

3. MATLAB example code:

x=[184.6400 184.1251 . . . 197.7890 -26.8000]; % original signal, dimension: frame size
y=[1.0000 2.0105 . . . 0.0738 0.0565];         % filter's impulse response h[n], dimension: frame size
gain=valG;                   % valG: the prediction error E
X=fft(x,512);                % fast Fourier transform, so the frame size < 512
Y=fft(y,512);                % fast Fourier transform
X(1)=[];                     % remove X(1), the DC component
Y(1)=[];                     % remove Y(1), the DC component
M=512;
powerX=abs(X(1:M/2)).^2;     % the power spectrum of X
logPX=10*log10(powerX);      % the power spectrum of X in dB
powerY=abs(Y(1:M/2)).^2;     % the power spectrum of Y
logPY=10*log10(powerY)+10*log10(gain); % the power spectrum of Y in dB,
                             % plus the gain (error) in dB
nyquist=8000;                % maximal frequency index
freq=(1:M/2)/(M/2)*nyquist;  % an array storing the frequency indices
figure(1);
plot(freq,logPX,'b',freq,logPY,'r'); % plot the result:
                             % b: blue line, power spectrum of the original signal
                             % r: red line, power spectrum of the filter


Take-Home Exercise

• Example figures of LPC spectra

[Figures: LPC spectra for orders 6, 14, 24, and 128 with a rectangular window, no pre-emphasis; order 128 with a rectangular window and pre-emphasis; order 128 with a Hamming window, with and without pre-emphasis. Plotted by Roger Kuo, Fall 2002]
