Homomorphic Homomorphic Signal Processing [ ] [ ] [ ]
Homomorphic Homomorphic Signal Processing [ ] [ ] [ ]
Homomorphic Homomorphic Signal Processing [ ] [ ] [ ]
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
Digital <strong>Signal</strong> <strong>Processing</strong> Design<br />
— Lecture 7<br />
<strong>Homomorphic</strong> <strong>Signal</strong><br />
<strong>Processing</strong><br />
Superposition Principle<br />
+ +<br />
x[ n]<br />
x 1[<br />
n]<br />
+ x2[<br />
n]<br />
L{<br />
}<br />
y[ n]<br />
= L{<br />
x[<br />
n]}<br />
L { x1[ n]<br />
} + L{<br />
x2[<br />
n]<br />
}<br />
xn ( ) = ax1( n) + bx2( n)<br />
yn ( ) = L xn ( ) = aL x( n) + bL x( n)<br />
[ ] [ ] [ ]<br />
1 2<br />
<strong>Homomorphic</strong> Filter<br />
• homomorphic filter => homomorphic system ⎡<br />
⎣H⎤ ⎦ that<br />
passes the desired signal unaltered, while removing the<br />
undesired signal<br />
x( n) = x1( n) ∗x2(<br />
n) - with x1( n)<br />
the "undesired" signal<br />
H ⎡⎣xn ( ) ⎤⎦ = H ⎡⎣x1(<br />
n) ⎦⎤∗H⎣⎡x2( n)<br />
⎦⎤<br />
H ⎡⎣x1( n) ⎤⎦<br />
→ δ ( n) - removal of x1( n)<br />
H ⎡⎣x2( n) ⎤⎦<br />
→ x2( n)<br />
H ⎡⎣xn ( ) ⎤⎦<br />
= δ(<br />
n) ∗ x2( n) = x2( n)<br />
• for linear systems this is analogous to additive noise removal<br />
1<br />
3<br />
5<br />
General Discrete Discrete-Time Time Model of<br />
Speech Production<br />
pL( n) = p( n) ∗hV( n)<br />
−− Voiced Speech<br />
hV( n) = AV ⋅g( n) ∗v( n) ∗r(<br />
n)<br />
pL( n) = u( n) ∗hU( n)<br />
−− Unvoiced Speech<br />
hU( n) = AN ⋅v( n) ∗r(<br />
n)<br />
2<br />
Generalized Superposition for Convolution<br />
x[ n]<br />
*<br />
H { }<br />
*<br />
y[ n]<br />
= H { x[<br />
n]<br />
}<br />
x1[ n]<br />
∗ x2[<br />
n]<br />
H { x1[ n]<br />
} ∗H<br />
{ x2[<br />
n]<br />
}<br />
• for LTI systems we have the result<br />
yn ( ) = xn ( ) ∗ hn ( ) = xkhn ( ) ( −k)<br />
∞<br />
∑<br />
k =−∞<br />
• "generalized" superposition => addition replaced by convolution<br />
xn ( ) = x1( n) ∗x2(<br />
n)<br />
yn ( ) = H ⎡⎣xn ( ) ⎤⎦ = H ⎡⎣x1( n) ⎤⎦∗H ⎡⎣x2( n)<br />
⎤⎦<br />
• homomorphic system<br />
for convolution<br />
Canonic Form for <strong>Homomorphic</strong> Convolution<br />
x[ n]<br />
x n x n<br />
* + + + + *<br />
D ∗{<br />
} L{<br />
}<br />
−1<br />
D { }<br />
xˆ[ n]<br />
yn ˆ[ ] ∗ yn [ ]<br />
ˆ[ ] ˆ [ ]<br />
ˆ [ ] ˆ [ ]<br />
1[ ] ∗ 2[<br />
] x1 n + x2 n y1 n + y2 n<br />
1 ∗ 2<br />
4<br />
y [ n] y [ n]<br />
• any homomorphic system can be represented as a cascade<br />
of three systems, e.g., for convolution<br />
1. system takes inputs combined by convolution and transforms<br />
them into additive outputs<br />
2. system is a conventional linear system<br />
3. inverse of first system--takes additive inputs and transforms<br />
them into convolutional outputs<br />
6<br />
1
Canonic Form for <strong>Homomorphic</strong> Convolution<br />
x[ n]<br />
x n x n<br />
* + + + + *<br />
D ∗{<br />
} L{<br />
}<br />
−1<br />
D { }<br />
xˆ[ n]<br />
yn ˆ[ ] ∗ yn [ ]<br />
ˆ ˆ<br />
ˆ ˆ<br />
1[ ] ∗ 2[<br />
] x1[ n] + x2[ n]<br />
y1[ n] + y2[ n]<br />
1 ∗ 2<br />
y [ n] y [ n]<br />
xn ( ) = x1( n) ∗x2(<br />
n)<br />
- convolutional relation<br />
xn ˆ( ) = D ˆ ˆ<br />
∗ ⎡⎣xn ( ) ⎤⎦<br />
= x1( n) + x2( n)<br />
- additive relation<br />
yn ˆ( ) = L ⎡ ⎣xˆ ˆ ˆ ˆ<br />
1( n) + x2( n) ⎤ ⎦ = y1( n) + y2( n)<br />
- conventional linear system<br />
−1<br />
yn ( ) = D ˆ ˆ<br />
∗ ⎡ ⎣y1( n) + y2( n) ⎤ ⎦ = y1( n) ∗y2(<br />
n)<br />
- inverse of convolutional relation<br />
=> design converted back to linear system, L<br />
D∗<br />
⎡⎣ ⎤⎦<br />
- fixed (called the characteristic system for homomorphic deconvolution)<br />
−1<br />
D∗<br />
⎡⎣ ⎤⎦<br />
- fixed (characteristic system<br />
for inverse homomorphic deconvolution)<br />
x[ n]<br />
*<br />
Characteristic Systems for <strong>Homomorphic</strong> Deconvolution<br />
D∗{<br />
}<br />
• • + +<br />
Z { } log{<br />
} -1<br />
X( z)<br />
Xˆ Z { }<br />
( z)<br />
+<br />
7<br />
xˆ[ n]<br />
x1[ n]<br />
∗ x2[<br />
n]<br />
X ( z) X ( z)<br />
Xˆ ( z) + Xˆ ( z)<br />
x ˆ1[ n]<br />
+ xˆ<br />
2[<br />
n]<br />
∞<br />
∑<br />
Xz ( ) = xnz ( )<br />
n=−∞<br />
1<br />
ˆ( ) ˆ ( )<br />
1 ⋅ 2<br />
1 2<br />
−n<br />
[ ] [ ]<br />
Xˆ ( z) = log X( z) = log X( z) + jarg X( z)<br />
∫<br />
n−1<br />
xn = Xzz dz<br />
2π<br />
j C<br />
Issues with Logarithms<br />
• it is essential that the logarithm obey the equation<br />
log ⎡⎣X1( z) ⋅ X2( z) ⎤⎦ = log ⎡⎣X1( z) ⎤⎦+ log ⎡⎣X2( z)<br />
⎤⎦<br />
• this is trivial if X1( z) and X2( z)<br />
are real -- however usually<br />
X1( z) and X2( z)<br />
are complex<br />
ω<br />
• consider<br />
evaluating ( ) only on the unit circle, e.g., = j<br />
Xz z e<br />
• then the complex log can be written in the form:<br />
arg ⎡ jω<br />
( ) ⎤<br />
jω jω<br />
j X e<br />
Xe ( ) = | Xe ( )| e ⎣ ⎦<br />
ω<br />
log ⎡ j<br />
( ) ⎤ ˆ jω ω ω<br />
= ( ) = log ⎡ j<br />
| ( )| ⎤+ arg ⎡ j<br />
Xe Xe Xe j Xe ( ) ⎤<br />
⎣ ⎦ ⎣ ⎦ ⎣ ⎦<br />
• no problems<br />
with log magnitude term; uniqueness problems<br />
arise in defining the imaginary part of the log; can show that<br />
the imaginary part (the phase angle of the z-transform) needs<br />
to be a continuous odd function of ω<br />
9<br />
11<br />
X( z)<br />
Frequency Domain Canonic Form<br />
X z X z<br />
• need to find a system that converts convolution to addition<br />
xn ( ) = x1( n) ∗x2(<br />
n)<br />
Xz ( ) = X1( z) ⋅ X2( z)<br />
• since<br />
D ( ) ˆ ˆ ˆ<br />
∗ ⎡⎣xn⎤⎦ = x1( n) + x2( n) = xn ( )<br />
D ( ) ˆ ˆ ˆ<br />
∗ ⎡⎣Xz⎤⎦ = X1( z) + X2( z) = Xz ( )<br />
=> use log function which converts products to sums<br />
Xz ˆ ( ) = log ⎡⎣⎡⎣ Xz ( ) ⎤⎦⎤⎦ = log ⎡⎣⎡⎣ X 1 ( z ) ⋅ X 2 ( z ) ⎤⎦⎤⎦<br />
= log ⎡ ˆ ˆ<br />
⎣X1( z) ⎤⎦+ log ⎡⎣X2( z) ⎤⎦<br />
= X1( z) + X2( z)<br />
Yz ˆ( ) = L ⎡Xˆ ˆ ˆ ˆ<br />
1( z) + X2( z) ⎤ = Y1( z) + Y2( z)<br />
⎣ ⎦<br />
Yz ( ) = exp ⎡Yˆ ˆ<br />
1( z) + Y2( z) ⎤ = Y1( z) ⋅Y2(<br />
z)<br />
⎣ ⎦<br />
• + + + + •<br />
log { } L<br />
Xˆ z{<br />
} exp{<br />
}<br />
( z)<br />
Yˆ ( z)<br />
Y( z)<br />
ˆ ˆ<br />
ˆ ˆ<br />
1( ) ⋅ 2(<br />
) X1( z) + X2( z)<br />
Y1( z) + Y2( z)<br />
1 ⋅ 2<br />
Y( z) Y ( z)<br />
Characteristic System for <strong>Homomorphic</strong> Deconvolution<br />
−1<br />
D∗<br />
{ }<br />
+ + + • • *<br />
Z { } exp{<br />
} -1<br />
yn ˆ[ ]<br />
Yˆ Z { }<br />
( z)<br />
Y( z)<br />
yn [ ]<br />
y ˆ [ n] + y ˆ [ n]<br />
Y ˆ ˆ( z) + Yˆ ˆ ( z)<br />
Y y1 [ n ] ∗ y2<br />
[ n ]<br />
1( z) ⋅Y2(<br />
z)<br />
1 2<br />
1 2<br />
∞<br />
∑<br />
Yz ˆ( ) = ynz ˆ(<br />
)<br />
n=−∞<br />
−n<br />
Yz ( ) = exp ⎡ ˆ(<br />
) ⎤<br />
⎣<br />
Yz<br />
⎦<br />
( )<br />
1<br />
( )<br />
∫<br />
n−1<br />
y n = Y z z dz<br />
2π<br />
j C<br />
Problems with arg Function<br />
jω jω jω<br />
{ ( ) } ≠ { 1( ) } +<br />
{ 2(<br />
) }<br />
ARG X e ARG X e ARG X e<br />
10<br />
12<br />
2
Terminology<br />
• spectrum – Fourier transform of signal autocorrelation<br />
• cepstrum – Inverse Fourier transform of logarithm of<br />
spectrum<br />
• analysis – determining the spectrum of a signal<br />
• alanysis y – determining g the cepstrum p of a signal g<br />
• filtering – linear operations on time signal<br />
• liftering – linear operations on log spectrum<br />
• frequency – independent variable of spectrum<br />
• quefrency – independent variable of cepstrum<br />
• harmonic – integer multiple of fundamental frequency<br />
• rahmonic – integer multiple of fundamental quefrency<br />
Fourier Transform Representation: The<br />
Complex Cepstrum<br />
D∗{<br />
}<br />
* • • + + +<br />
F {} log{<br />
}<br />
-1<br />
F {}<br />
( )<br />
ω j<br />
x [n]<br />
xˆ<br />
[ n]<br />
X e<br />
ˆ jω<br />
( ) X ( e )<br />
j<br />
X e<br />
( )<br />
j<br />
X e<br />
∞<br />
∑<br />
jω − jωn jω<br />
Xe ( ) = xne ( ) = Xe ( ) e<br />
n=−∞<br />
1 ˆ jω jωn xn ˆ( ) = ( ) ω<br />
2π<br />
∫ Xe e d<br />
jω<br />
{ }<br />
jarg X( e )<br />
ˆ jω jω jω jω<br />
Xe ( ) = log ⎡<br />
⎣<br />
Xe ( ) ⎤<br />
⎦<br />
= log Xe ( ) + jarg ⎡<br />
⎣<br />
Xe ( ) ⎤<br />
⎦<br />
π<br />
−π<br />
The Complex Cepstrum Cepstrum-DFT DFT Implementation<br />
2π j k<br />
N<br />
∞<br />
∑<br />
n n=−∞∞<br />
2π<br />
− j kn<br />
N<br />
X ( k) = X( e ) = x( n) e k = 01 , ,..., N −1,<br />
p<br />
jω<br />
i Xp( k) is the N point DFT corresponding to X( e )<br />
Xˆ ( k) = log X ( k) = log X ( k) + jarg X ( k)<br />
{ } { }<br />
N−1<br />
∑ ˆ<br />
2π<br />
j kn<br />
N<br />
∞<br />
∑<br />
p p p p<br />
1<br />
xˆ ( n) = X ( k) e = xˆ( n+ rN) n = 01 , ,..., N −1<br />
p p<br />
N k= 0<br />
r=−∞<br />
• xˆp( n) is an aliased version of xn ˆ(<br />
)<br />
⇒use<br />
as large a value of N as possible to minimize aliasing<br />
13<br />
15<br />
17<br />
Complex and Real Cepstrum<br />
define the inverse Fourier transform of ˆ jω<br />
•<br />
Xe ( ) as<br />
π<br />
1<br />
ˆ( ) ˆ jω jωn xn = Xe ( ) e dω<br />
2π<br />
∫<br />
−π<br />
• where xn ˆ(<br />
) called the "complex cepstrum" since a complex<br />
logarithm g is involved in the computation<br />
• can also define a "real<br />
cepstrum" using just the real part of<br />
the logarithm, giving<br />
π<br />
1 ˆ jω jωn cn ( ) = Re ⎡Xe ( ) ⎤ e dω<br />
2π<br />
∫ ⎣ ⎦<br />
−π<br />
π<br />
1<br />
jω jωn = log | Xe ( ) | e dω<br />
2π<br />
∫<br />
−π<br />
• can show that cn ( ) is the even part of xn ˆ(<br />
)<br />
Fourier Transform Representation:<br />
The Complex Cepstrum<br />
−1<br />
D∗<br />
{ }<br />
+ + + • • *<br />
F { } exp{<br />
}<br />
-1<br />
F { }<br />
yn [ ]<br />
ˆ jω<br />
Y ( e ) ( )<br />
ω j<br />
yn ˆ[ ]<br />
( ) Y e<br />
j<br />
Y e<br />
( )<br />
j<br />
Y e<br />
jω ∞<br />
= ∑<br />
n=−∞<br />
− jωn jω jω jω jω<br />
Ye ˆ( ) yne ˆ[<br />
]<br />
Ye ( ) = exp ⎡ ˆ ⎤ = + ⎡<br />
⎣<br />
⎤<br />
⎣<br />
Ye ( )<br />
⎦<br />
log Ye ( ) jarg Ye ( )<br />
⎦<br />
π<br />
1<br />
jω jωn yn [ ] =<br />
ω<br />
2π<br />
∫ Ye ( ) e d<br />
−π<br />
The Cepstrum-DFT Cepstrum DFT Implementation<br />
2π 2π<br />
j k<br />
∞<br />
− j kn<br />
N N<br />
Xp( k) = X( e ) = ∑ x( n) e n = 01 , ,..., N −1<br />
n=−∞<br />
jarg { Xp( k)<br />
}<br />
Xp( k) = Xp( k) e<br />
jω jω<br />
Ce ( ) = log Xe ( )<br />
2π<br />
j k<br />
N<br />
Cp( k) = log Xp( e )<br />
π<br />
1<br />
jω jωn cn ( ) = ( ) ω ˆ( ) ˆ(<br />
) / 2<br />
2π<br />
∫ Ce e d = ⎡⎣xn + x−n⎤⎦ −π<br />
1<br />
c ( n) = C ( k) e = c( n+ rN) n = 01 , ,..., N −1<br />
N−12π<br />
j kn<br />
∞<br />
N<br />
p ∑ p ∑<br />
N k= 0<br />
r=−∞<br />
• cp( n) is an aliased version of c( n) ⇒use<br />
as large a value of N<br />
as possible to minimize aliasing<br />
14<br />
16<br />
18<br />
3
z-Transform Transform Cepstrum Alanysis<br />
i The main z-transform formula for cepstrum alanysis is based on<br />
the power series expansion:<br />
∞ ( −1)<br />
∑<br />
n+<br />
1<br />
n<br />
log( 1+ x) = x x < 1<br />
n=<br />
1 n<br />
i EExample l 1<br />
--Apply A l thi this fformula l tto th the exponential ti l sequence<br />
n<br />
1<br />
x1( n)<br />
= aun ( ) ⇔ X1( z)<br />
= −1<br />
1−<br />
az<br />
n+<br />
1<br />
∞<br />
ˆ −1( −1)<br />
n −n<br />
X1( z) = log[ X1( z)] = −log( 1−<br />
az ) = −∑( −a)<br />
z<br />
n=<br />
1 n<br />
n ∞ n<br />
a 1<br />
ˆ 1( ) ( 1) ˆ<br />
− ⎛a ⎞ −n<br />
x n = u n− ⇔ X1( z) = −log( 1−<br />
az ) = ∑⎜<br />
⎟z<br />
n n=<br />
1 ⎝ n ⎠<br />
z-Transform Transform Cepstrum Alanysis<br />
• the complex logarithm of Xz ( ) is<br />
⎡⎣ ⎤⎦ ⎡⎣ ⎤⎦<br />
⎡ r ⎤<br />
⎣ ⎦<br />
Mi<br />
∑<br />
k = 1<br />
M N<br />
N<br />
−1<br />
( 1 k )<br />
Xz ˆ ( ) = log Xz ( ) = log A + log z + log −az<br />
0 i<br />
0<br />
− 1<br />
∑ ∑log ( 1 bz k ) ∑ ∑log ( 1 cz k ) ∑ ∑log<br />
( 1 dz k )<br />
+ − − − − −<br />
k= 1 k= 1 k=<br />
1<br />
• evaluating Xz ˆ ( ) on the unit circle we can ignore the term<br />
related to log ⎡ jωr e ⎤ (as this contributes only to the imaginary<br />
⎣ ⎦<br />
part and is a linear phase shift)<br />
z-Transform Transform Cepstrum Alanysis for 2<br />
Pulses<br />
• Example 2--an<br />
input sequence of two pulses of the form<br />
x3( n) = δ( n) + αδ( n− Np)<br />
( 0 < α < 1)<br />
−Np<br />
X3( z) = 1+<br />
αz<br />
ˆ<br />
−Np<br />
X3( z) = log ⎡⎣X3( z) ⎤⎦<br />
= log(<br />
1+<br />
αz<br />
)<br />
∞ n + 1<br />
( −1)<br />
n −nNp<br />
= ∑ α z<br />
n<br />
n=<br />
1<br />
∞<br />
k<br />
+ 1 α<br />
ˆ k<br />
x3( n) = ∑(<br />
−1) δ ( n−kNp) k<br />
k = 1<br />
• the cepstrum is an impulse train with impulses spaced at Np<br />
samples<br />
N p<br />
2N 2Np 3N 3Np 19<br />
21<br />
23<br />
z-Transform Transform Cepstrum Alanysis<br />
• consider digital systems with rational z-transforms of the general type<br />
Xz ( ) =<br />
M<br />
M<br />
Az a z b z<br />
i<br />
r<br />
∏( 1− k<br />
0<br />
−1<br />
) ∏(<br />
1−<br />
k )<br />
k= 1 k=<br />
1<br />
Ni<br />
N0<br />
− 1<br />
∏ ( 1 1−cz k ) ∏ ( 1 1−<br />
d dkz )<br />
k= 1 k=<br />
1<br />
• with all coefficients ak, bk, ck, dk < 1 => all ck<br />
poles and<br />
ak zeros<br />
are inside the unit circle; all dk poles and bk<br />
zeros<br />
r<br />
are outside the unit circle; the factor z is just a shift of the time<br />
origin and has no effect<br />
z-Transform Transform Cepstrum Alanysis<br />
• we can then evaluate the remaining terms, use power series<br />
expansion for logarithmic terms (and take the inverse<br />
transform to give the complex cepstrum) giving:<br />
π<br />
1<br />
ˆ( ) ˆ jω jωn xn ( ) = ∫ Xe ( ) e dωω<br />
2π<br />
∫ Xe e d<br />
−π<br />
= log⎡⎣A⎤⎦<br />
n = 0<br />
Ni n Mi<br />
n<br />
ck ak<br />
= ∑ −<br />
n ∑ n<br />
k= 1 k=<br />
1<br />
n > 0<br />
M0 −n N0<br />
−n<br />
bk dk<br />
= ∑ −<br />
n ∑ n<br />
n < 0<br />
k= 1 k=<br />
1<br />
Cepstrum for Train of Impulses<br />
• an important special case is a train of impulses<br />
of the form:<br />
∑αδ = 0<br />
M<br />
r p<br />
r<br />
M<br />
N<br />
α<br />
0<br />
−rNp<br />
∑ r<br />
r =<br />
xn ( ) = ( n−rN) Xz ( ) = z<br />
−Np −1<br />
• clearly Xz ( ) is a polynomial in z rather than z ;<br />
thus Xz ( ) can be expressed as a product of factors<br />
−Np<br />
Np<br />
of the form ( 1−az ) and ( 1−bz<br />
), giving a complex<br />
cepstrum, xn ˆ(<br />
), that is non-zero only at integer multiples of N<br />
20<br />
22<br />
p<br />
24<br />
4
z-Transform Transform Cepstrum Alanysis for<br />
Convolution of 3 Sequences<br />
i Example 3 -- consider the convolution of three sequences, i.e.,<br />
xn ( ) = x1( n) ∗x2( n) ∗x3(<br />
n)<br />
n<br />
= ⎡<br />
⎣<br />
aun ( ) ⎤<br />
⎦<br />
∗ [ δ( n) + bδ( n+ 1)<br />
] ∗ ⎡<br />
⎣δ( n) + αδ(<br />
n−N) ⎤ p ⎦<br />
n n−N p n<br />
n− N p + 1<br />
α p α<br />
p<br />
= a u ( n ) + a u ( n n− N ) + ba u ( n n+ 1 ) + ba u ( n n− N + 1 )<br />
i The complex cepstrum<br />
is therefore the sum of the complex cepstra<br />
of the three sequences<br />
xn ˆ( ) = xˆ1( n) + xˆ ˆ<br />
2( n) + x3( n)<br />
n ∞ k+ 1 k n+ 1 n<br />
a ( −1) α ( −1)<br />
b<br />
= un ( − 1) + ∑ δ ( n− kNp) + u( −n−1) n k n<br />
k = 1<br />
Properties of the Complex Cepstrum<br />
• in general, the complex cepstrum is non-zero and of<br />
infinite extent for both positive and negative n,<br />
even when<br />
xn ( ) is causal, stable, and even of finite duration<br />
• the complex cepstrum is a decaying sequence which is<br />
bounded byy<br />
| n|<br />
α<br />
xn ( ) < β for | n|<br />
→∞<br />
| n |<br />
• where α is the maximum absolute value of the quantities<br />
ak, bk, ck, dk<br />
and β is a constant multiplier<br />
• if Xz ( ) has no poles or zeros outside the unit circle ( bk = dk<br />
= 0)<br />
then xn ˆ(<br />
) = 0 for n<<br />
0 => such signals are called "minimum phase"<br />
signals<br />
27<br />
The Phase Issue in Computation<br />
i In order for the complex cepstrum to be computable, we need<br />
jω jω jω<br />
arg { Xe ( ) } = arg { X1( e ) } + arg { X2( e ) }<br />
i when<br />
Xe ( ) = X( e ) + X( e )<br />
jω jω jω<br />
1 2<br />
jω jω jω<br />
{ ( ) } ≠ { 1( ) } + { 2(<br />
) }<br />
ARG X e ARG X e ARG X e<br />
25<br />
29<br />
Example: a=.9, b=.8, α=.7, N p=15<br />
( 1)<br />
+<br />
− b<br />
n<br />
n 1 n<br />
a<br />
n<br />
n<br />
∞<br />
∑<br />
k = 1<br />
−1<br />
( )<br />
( 1 α )<br />
−Np<br />
( 1+<br />
bz)<br />
Xz ( ) = + z<br />
1−<br />
az<br />
k+ 1 k<br />
( −1)<br />
α<br />
δ [ n−kNp] k<br />
Relation of Cepstrum to Complex Cepstrum<br />
i for the sequence xn ( ) we have:<br />
∞<br />
jω<br />
jω − jωn jω<br />
jarg { X( e ) }<br />
Xe ( ) = ∑ xne ( ) = Xe ( ) e<br />
n=−∞<br />
ˆ jω jω jω jω<br />
Xe ( ) = log Xe ( ) = log Xe ( ) + jarg Xe ( )<br />
i giving the complex cepstrum<br />
{ } { }<br />
π<br />
1 ˆ jω jωn xˆ( n) =<br />
2 ∫ ( ) ω<br />
π ∫ X e e d<br />
−π<br />
i with cepstrum<br />
π<br />
1<br />
jω jωn cn ( ) = log ( ) ω<br />
2π<br />
∫ Xe e d<br />
−π<br />
jω jω<br />
i because of the symmetries of log Xe ( ) (even), and jarg { Xe ( ) } (odd),<br />
along with the symmetries of cos( ωn) (even) and sin( ωn)<br />
(odd), it can easily<br />
be shown<br />
that<br />
xn ˆ( ) + xˆ( −n)<br />
cn ( ) = Ev{ xn ˆ(<br />
) } =<br />
2<br />
The “Phase Unwrapping” Problem<br />
26<br />
28<br />
30<br />
5
Cepstrum for Minimum Phase <strong>Signal</strong>s<br />
• for minimum phase signals (no poles or zeros outside unit circle) the complex cepstrum<br />
can be completely represented by the real part of the Fourier transforms<br />
• this means we can represent the complex<br />
cepstrum of minimum phase signals by the log<br />
of the magnitude of the FT alone<br />
• since the real part of the FT is the FT of the even part of the sequence<br />
ˆ<br />
ˆ( ) ˆ(<br />
)<br />
Re ⎡ jω ⎡ + − ⎤<br />
( ) ⎤<br />
xn x n<br />
Xe = FT<br />
⎣ ⎦ ⎢<br />
2<br />
⎥<br />
⎣ 2 ⎦<br />
jω<br />
FT ⎡⎣c( n)<br />
⎤⎦<br />
= log Xe ( )<br />
xn ˆ( ) + xˆ( −n)<br />
cn ( ) =<br />
2<br />
• giving<br />
xn ˆ(<br />
) = 0 n<<br />
0<br />
= cn ( ) n=<br />
0<br />
= 2cn ( ) n><br />
0<br />
• thus the complex cepstrum (for minimum phase signals) can be computed by computing<br />
the cepstrum and using the equation above<br />
31<br />
Complex Cepstrum for Speech<br />
• the models for the speech components are as follows:<br />
Mi<br />
M0<br />
−M−1 ∏ 1− k ∏ 1−<br />
k<br />
k= 1 k=<br />
1<br />
Ni<br />
−1<br />
∏(<br />
1−<br />
cz k )<br />
k = 1<br />
a k = b k = 0<br />
Az ( a z ) ( b z)<br />
1. vocal tract: Vz ( ) =<br />
--for o voiced o ced speec speech, , oonly y po poles es => a b , aall<br />
k<br />
--unvoiced speech and nasals,<br />
need pole-zero model but all poles are<br />
inside the unit circle => ck<br />
< 1<br />
--all speech has complex poles and zeros that occur in complex conjugate<br />
pairs<br />
−1<br />
2. radiation model: Rz ( ) ≈1−z(high frequency emphasis)<br />
3. glottal pulse model: finite duration pulse with transform<br />
Li<br />
L0<br />
−1<br />
Gz ( ) = B∏( 1−αkz) ∏(<br />
1−βkz)<br />
k= 1 k=<br />
1<br />
with zeros both inside and outside the unit circle<br />
Complex Cepstrum for Voiced Speech<br />
33<br />
35<br />
Characteristic System for<br />
<strong>Homomorphic</strong> Convolution<br />
• still need to define (and design) the L<br />
operator part (the linear system<br />
component) p ) of the system y to completely p y<br />
define the characteristic system for<br />
homomorphic convolution for speech<br />
– to do this properly and correctly, need to look<br />
at the properties of the complex cepstrum for<br />
speech signals<br />
Complex Cepstrum for Voiced Speech<br />
Complex Cepstrum for Voiced Speech<br />
32<br />
34<br />
36<br />
6
Complex Cepstrum for Voiced Speech<br />
Complex Cepstrum for Voiced Speech<br />
Complex Cepstrum for Voiced Speech<br />
37<br />
39<br />
41<br />
Complex Cepstrum for Voiced<br />
Speech<br />
• combination of vocal tract, glottal pulse and<br />
radiation will be non-minimum phase =><br />
complex cepstrum exists for all values of n<br />
• the complex cepstrum will decay rapidly for<br />
large n (due to polynomial terms in expansion<br />
of complex cepstrum)<br />
• effect of the voiced source is a periodic pulse<br />
train for multiples of the pitch period<br />
Complex/Real Cepstrum for Voiced Speech<br />
<strong>Homomorphic</strong> Filtering of Voiced<br />
Speech<br />
• goal is to separate out the<br />
excitation impulses from the<br />
remaining components of the<br />
complex cepstrum<br />
• use cepstral window, l(n), to<br />
separate excitation pulses from<br />
combined vocal tract<br />
– l(n)=1 for |n|
<strong>Homomorphic</strong> Analysis of Voiced Speech<br />
<strong>Homomorphic</strong> Analysis of Voiced Speech<br />
<strong>Homomorphic</strong> Analysis of Voiced Speech<br />
43<br />
45<br />
47<br />
<strong>Homomorphic</strong> Analysis of Voiced Speech<br />
<strong>Homomorphic</strong> Analysis of Voiced Speech<br />
<strong>Homomorphic</strong> Analysis of Unvoiced Speech<br />
44<br />
46<br />
48<br />
8
<strong>Homomorphic</strong> Analysis of Unvoiced Speech<br />
<strong>Homomorphic</strong> Analysis<br />
Example 11—single<br />
single pole sequence<br />
(computed using all 3 methods)<br />
[ ] = [ ]<br />
n<br />
xn aun<br />
n<br />
a<br />
xn ˆ[ ] = un [ −1]<br />
n<br />
49<br />
51<br />
53<br />
<strong>Homomorphic</strong> Analysis<br />
<strong>Homomorphic</strong> Analysis of a sequence of 15<br />
frames from /SH/ to /EY/ (between the two<br />
demarked windows in the above plot).<br />
Review of Cepstral Calculation<br />
• 3 potential methods for computing cepstral<br />
coefficients, xn ˆ[ ] , of sequence x[n]<br />
– analytical method; assuming X(z) is a rational<br />
function; find poles and zeros and expand using log<br />
power series i<br />
– recursion method; assuming X(z) is either a minimum<br />
phase (all poles and zeros inside unit circle) or<br />
maximum phase (all poles and zeros outside unit<br />
circle) sequence<br />
– DFT implementation; using windows, with phase<br />
unwrapping (for complex cepstra)<br />
Example 22—voiced<br />
voiced speech frame<br />
50<br />
52<br />
54<br />
9
Example 33—low<br />
low quefrency liftering<br />
Example 4—effects 4 effects of low quefrency lifter<br />
Example 66—phase<br />
phase unwrapping<br />
55<br />
57<br />
59<br />
Example 33—high<br />
high quefrency liftering<br />
Example 55—phase<br />
phase unwrapping<br />
<strong>Homomorphic</strong> Spectrum Smoothing<br />
56<br />
58<br />
60<br />
10
<strong>Homomorphic</strong> Spectrum Smoothing<br />
3200 Hz<br />
Delta Cepstrum<br />
200 Hz<br />
Δ cmel [ n, ] = cmel [ n, ] −cmel [ n−1,<br />
]<br />
Δ log ( E [ n, ] ) = log ( E [ n, ] ) −log ( E [ n−1,<br />
]<br />
)<br />
mel mel mel<br />
Differencing removes effect of linear filtering.<br />
61<br />
63<br />
x[n]<br />
w[ n − m]<br />
Mel Cepstrum<br />
Xnω [ , k ]<br />
1<br />
mel [ , ] = ∑ ( ω ) [ , ω ]<br />
U<br />
E n Vk X n k<br />
A<br />
Vl ( ωk<br />
)<br />
R filters<br />
2<br />
U<br />
<br />
l k= L<br />
k= L<br />
mel[ ,<br />
1 1<br />
] log (<br />
0<br />
mel[<br />
, ] ) cos<br />
2 1 ( 2 )<br />
π<br />
R−<br />
=<br />
R =<br />
+<br />
R<br />
= ∑ ( ω )<br />
<br />
A V<br />
⎡ ⎤<br />
c n m ∑ E n ⎢ m⎥<br />
<br />
⎣ ⎦<br />
Summary<br />
Emel[ n,<br />
l]<br />
• Introduced the concept of the cepstrum of a signal,<br />
defined as the inverse Fourier transform of the log of the<br />
signal spectrum<br />
1<br />
j<br />
ˆ[ ] log ( )<br />
xn F X e ω<br />
− = ⎡<br />
⎣<br />
⎤<br />
⎦<br />
• Showed cepstrum p reflected pproperties p of both the<br />
excitation (high quefrency) and the vocal tract (low<br />
quefrency)<br />
– short quefrency window filters out excitation; long quefrency<br />
window filters out vocal tract<br />
• Mel-scale cepstral coefficients used as feature set for<br />
speech recognition<br />
• Delta and delta-delta cepstral coefficients used as<br />
indicators of spectral change over time 64<br />
k<br />
2<br />
62<br />
11