Optimal kernels for time-frequency analysis 

Richard G. Baraniuk Douglas L. Jones 

Electrical and Computer Engineering 

252 Engineering Research Laboratory 

University of Illinois 

Coordinated Science Laboratory 

University of Illinois 

1101 W. Springfield Ave. 

Urbana, IL 61801 Urbana, IL 61801 

(217) 333-0766 (217) 244-6823 

ABSTRACT 

Current bilinear time-frequency representations apply a fixed kernel to smooth the Wigner distribution. However, the choice of a fixed kernel limits the class of signals that can be analyzed effectively. This paper presents optimality criteria for the design of signal-dependent kernels that suppress cross-components while passing as much auto-component energy as possible, irrespective of the form of the signal. A fast algorithm for the optimal kernel solution makes the procedure competitive computationally with fixed kernel methods. Examples demonstrate the superior performance of the optimal kernel for a frequency modulated signal. 

1. INTRODUCTION 

Time-Frequency Distributions (TFD's), which indicate the energy content of a signal as a function of both time and 

frequency, are a powerful tool for time-varying signal analysis. The Wigner Distribution (WD) 

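$$W(t,\omega) = \int_{-\infty}^{\infty} s\!\left(t + \tfrac{\tau}{2}\right) s^{*}\!\left(t - \tfrac{\tau}{2}\right) e^{-j\omega\tau}\, d\tau \qquad (1)$$

is of great interest due to a number of attractive properties [1]. However, it also has spurious cross-components and 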

high noise sensitivity, both of which obscure the true signal features. Therefore, the WD is often convolved with a 

two-dimensional smoothing function that suppresses cross-components at the expense of signal energy concentration. 

It is well known that all bilinear TFD's can be represented as smoothed versions of the WD [2]; that is, if $P(t,\omega)$ is a bilinear TFD, then 

$$P(t,\omega) = W(t,\omega) ** \phi(t,\omega) \qquad (2)$$

for some function $\phi$ ("$**$" denotes two-dimensional convolution). Equation (2) may be rewritten using the two-dimensional inverse Fourier transform as 

$$C(\theta,\tau) = A(\theta,\tau)\,\Phi(\theta,\tau), \qquad (3)$$

where $C(\theta,\tau)$, the inverse Fourier transform of $P(t,\omega)$, is known as the characteristic function of the distribution; $A(\theta,\tau)$, the transform of the WD, is called the Ambiguity Function (AF); and $\Phi(\theta,\tau)$, the transform of the smoothing function, is known as the kernel of the TFD. The AF is also given directly by 

$$A(\theta,\tau) = \int_{-\infty}^{\infty} s\!\left(t + \tfrac{\tau}{2}\right) s^{*}\!\left(t - \tfrac{\tau}{2}\right) e^{j\theta t}\, dt. \qquad (4)$$

Equation (3) indicates that we can interpret any bilinear TFD as the two-dimensional Fourier transform of a weighted 

version of the AF. 
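As an illustration of (3) and (4), the following Python (NumPy) sketch computes a discrete AF and forms a TFD as the two-dimensional Fourier transform of the kernel-weighted AF. The function names, the integer-lag sampling (lag index m stands for a delay of 2m samples), the zero padding at the signal edges, and the FFT-ordered axes are assumptions of this sketch rather than details taken from the paper. With an all-ones kernel the result reduces to a discrete WD.

```python
import numpy as np

def local_autocorrelation(s):
    """R[n, k] = s[n + m_k] * conj(s[n - m_k]), with lags m_k in FFT order.

    Products that would reach outside the signal are set to zero.
    """
    s = np.asarray(s, dtype=complex)
    N = len(s)
    lags = (np.arange(N) + N // 2) % N - N // 2   # 0, 1, ..., -2, -1
    n = np.arange(N)[:, None]
    ip, im = n + lags[None, :], n - lags[None, :]
    valid = (ip >= 0) & (ip < N) & (im >= 0) & (im < N)
    R = np.zeros((N, N), dtype=complex)
    R[valid] = s[ip[valid]] * np.conj(s[im[valid]])
    return R

def ambiguity_function(s):
    """Discrete analogue of eq. (4): sum over time n with kernel e^{+j 2 pi theta n / N}."""
    R = local_autocorrelation(s)
    return np.fft.ifft(R, axis=0) * len(s)

def tfd_from_kernel(A, kernel):
    """Two-dimensional Fourier transform of the weighted AF, as in eq. (3)."""
    N = A.shape[0]
    C = A * kernel                              # characteristic function
    R_smoothed = np.fft.fft(C, axis=0) / N      # Doppler -> time
    return np.fft.fft(R_smoothed, axis=1)       # lag -> frequency

# Usage: a linear chirp; an all-ones kernel reproduces the discrete WD.
n = np.arange(64)
s = np.exp(1j * np.pi * 0.005 * n ** 2)
A = ambiguity_function(s)
P = tfd_from_kernel(A, np.ones(A.shape))
```

Any other kernel array of the same shape can be passed in place of the all-ones array; the kernel is assumed to be sampled on the same FFT-ordered grid as the AF.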

The kernel is frequently chosen to weight the AF such that the auto-components of the distribution are passed while 

the cross-components and noise are suppressed. In principle, this is possible when the auto-components and cross-components 

do not overlap. Many kernels have been proposed, but selection of a fixed kernel limits the class of signals 

for which the representation will perform well. That is, for any fixed kernel, it is always possible to find signals for 

SPIE Vol. 1348 Advanced Signal-Processing Algorithms, Architectures, and Implementations (1990) / 181 


which the TFD exhibits poor auto-component concentration or little cross-component suppression. (The same problem 

limits the performance of wavelet time-frequency analysis [3], since the choice of a fixed analyzing wavelet restricts the 

class of signals which can be analyzed effectively.) 

The limitations of fixed kernel time-frequency analysis can be illustrated by analyzing a simple signal with several 

different TFD's. The WD of the sum of two chirp signals of large effective time envelope is shown in Fig. 1. Although 

the auto-components are highly concentrated, there is a large cross-component. The Choi-Williams distribution [4], which has an exponential kernel 

$$\Phi_{cw}(\theta,\tau) = e^{-\theta^{2}\tau^{2}/\sigma}, \qquad (5)$$

works well for signals whose components have distributions nearly parallel with the time or frequency axis. It performs poorly, however, for signals with substantial frequency modulation, as seen in Fig. 2, since the kernel severely truncates the auto-components of such signals. The kernel generating the spectrogram is related to the AF of the analysis window $w(t)$ by 

$$\Phi_{S}(\theta,\tau) = \int_{-\infty}^{\infty} w\!\left(t + \tfrac{\tau}{2}\right) w^{*}\!\left(t - \tfrac{\tau}{2}\right) e^{j\theta t}\, dt. \qquad (6)$$

Results are excellent for signal components that resemble the window [5], but all mismatched components are distorted. 

Figure 3 displays a spectrogram computed using a Hamming window of length similar to the effective time width of 

the signal. It is poorly concentrated and obscures the true nature of the signal. If the analysis window, and hence 

the kernel, is matched to one of the signal components, the matched-filter spectrogram results. See Fig. 4. While 

the matched-filter technique can yield excellent results, it works for only one type of signal component and requires a 

priori knowledge of the form of the component. 
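To make the behaviour of the exponential kernel (5) concrete, the short Python sketch below evaluates it on a small grid. The grid, its arbitrary units, and the value sigma = 20.0 are illustrative assumptions. The kernel equals one along both axes of the ambiguity plane and decays rapidly away from them, which is consistent with the truncation of frequency-modulated auto-components described above.

```python
import numpy as np

def choi_williams_kernel(theta, tau, sigma):
    """Exponential kernel of eq. (5): exp(-(theta * tau)^2 / sigma)."""
    return np.exp(-((theta * tau) ** 2) / sigma)

# Hypothetical sampling of the (theta, tau) plane in arbitrary units.
axis = np.linspace(-4.0, 4.0, 9)
theta, tau = np.meshgrid(axis, axis, indexing="ij")
kernel = choi_williams_kernel(theta, tau, sigma=20.0)

on_axis = kernel[4, :]          # theta = 0: passed without attenuation
on_diagonal = np.diag(kernel)   # theta = tau: where a chirp-like AF ridge could lie
```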

Since the best kernel function depends on the signal to be analyzed, we expect to obtain good performance for a broad class of signals only by using a signal-dependent kernel. Signal-dependent kernels have been proposed by several authors. The adaptive spectrogram representation for speech signals developed by Glinski [6] adapts the window based on a segmentation (provided by the user) of the signal into pitch periods. Jones and Boashash [7] adapt the modulation rate of a fixed window to match an estimate of the signal's instantaneous frequency. Optimal smoothing kernels are considered by Andrieux et al. [8], but only for simple signals of the form $s(t) = e^{j2\pi v(t)}$, and only for the restrictive class of Gaussian kernels. Kadambe, Boudreaux-Bartels and Duvaut [9] utilize an adaptive filtering technique coupled with AR modeling and clustering to design kernels. Nuttall [10] designs a kernel composed of Gaussian components based on information that the user provides after viewing the WD. Jones and Parks [11] develop a technique using Gaussian kernels which vary with time and frequency to maximize a local measure of signal-energy concentration. 

Each of the methods described above either is ad hoc, excessively restricts the class of allowable kernels, is computationally 

expensive or requires human intervention. We propose a new procedure for selecting a signal-dependent 

kernel. Given a signal, the method automatically designs a kernel that is optimal with respect to a set of performance 

criteria. Since the class of kernels that we consider is large, good performance is expected for a wide range of signals. 

The procedure also has a computational complexity that is comparable to fixed-kernel techniques. 

2. OPTIMAL KERNEL DESIGN 

Rather than choosing an ad hoc method for signal-dependent kernel selection, it seems appropriate to formulate the 

procedure as an optimization problem. The problem formulation requires a class of kernels from which the optimal 

kernel is chosen, and a performance index that measures the quality of the time-frequency representation with respect 

to criteria deemed important by the designer. The kernel that maximizes the value of the performance measure is 

selected as the optimal kernel for the signal. 

The class of kernels must be large enough to allow for good performance for all signals of interest in a given application. 

Likewise, the performance measure must be chosen to yield a tractable optimization problem that can be 

solved efficiently. An example of a useful performance index is a measure of the signal-energy concentration of the 

distribution [11]. Clearly, the choice of kernel class and performance measure is crucial to the success of the method. 

However, once a satisfactory class and measure are found, kernel design for a wide range of signals is reduced to solving 

an optimization problem. 

The optimal design concept can be generalized to classes of TFD's other than the bilinear by defining a subclass of 


allowable TFD's and a performance index. The formulation of optimization problems for TFD design is relatively 

simple in the bilinear case, because a bilinear TFD is completely specified by its two-dimensional smoothing kernel. 

Thus, we can find the optimal bilinear TFD for a signal simply by solving for the optimal kernel. 

2.1 Continuous optimization formulation 

This section develops an optimization problem formulation for kernel design that relies on the AF of the signal and 

the characteristic function representation of a TFD indicated by (3). We propose optimality criteria based on the AF 

for three reasons. First, the multiplicative operation of the kernel on the AF is easier to visualize than convolution of 

the WD with the smoothing function, which simplifies the construction of a quality measure. Second, the AF serves 

to separate the auto- and cross-components [5]. Third, the AF may lead to efficient computation of the optimal TFD, 

since the TFD is merely the two-dimensional Fourier transform of the product of the optimal kernel and the AF. 

We consider an optimal kernel in the continuous case to be one that satisfies the following optimization problem: 

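$$\max_{\Phi} \int_{0}^{2\pi}\!\!\int_{0}^{\infty} |A(r,\psi)\,\Phi(r,\psi)|^{2}\, r\, dr\, d\psi \qquad (7)$$

subject to $\Phi(0,0) = 1$ and 

$$\Phi(r_1,\psi) \ge \Phi(r_2,\psi) \quad \forall\, r_1 < r_2,\ \forall\, \psi, \qquad (8)$$

and subject to 

$$\int_{0}^{2\pi}\!\!\int_{0}^{\infty} |\Phi(r,\psi)|^{2}\, r\, dr\, d\psi \le \alpha, \qquad (9)$$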

where $A(r,\psi)$ is the AF of the signal in polar coordinates, and the kernel $\Phi(r,\psi)$ is assumed to be real and positive. The performance measure (7) expresses our desire to pass as much auto-component energy as possible into the TFD for a kernel of fixed volume $\alpha$. The second constraint (8) forces the kernel to be radially nonincreasing. Since the AF auto-components are centered at the origin, this encourages the kernel to preferentially pass auto-components. The final constraint (9) restricts the size of the kernel so that cross-components are suppressed. An advantage of this formulation is that the constraints are insensitive to both the time-scale and orientation of the signal in time-frequency. 

2.2 Discrete optimization formulation 

In practice, TFD's are computed at discrete time and frequency locations, so we reformulate the optimization problem 

by discretizing equations (7)-(9). With suitably dense sampling, the discrete formulation converges to the continuous 

formulation. Performing the discretization, we define an optimal discrete kernel to be one that satisfies: 

$$\max_{\Phi_d} \sum_{m}\sum_{n} |A_d(m,n)\,\Phi_d(m,n)|^{2} \qquad (10)$$

subject to $\Phi_d(0,0) = 1$ and 

$$\Phi_d(m,n) \ \text{is radially nonincreasing}, \qquad (11)$$

and subject to 

$$\sum_{m}\sum_{n} |\Phi_d(m,n)|^{2} \le \alpha_d, \qquad (12)$$

where $A_d(m,n)$ is the $N \times N$ discrete AF of the signal to be analyzed, and the kernel $\Phi_d(m,n)$ is assumed to be real and positive. Note that since the AF is conjugate symmetric through the origin, 

$$A_d(m,n) = A_d^{*}(-m,-n), \qquad (13)$$

the optimal kernel can be computed from a half-plane of AF samples. 
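The symmetry (13) follows from the definition (4). A short derivation for the continuous AF is given here as a reading aid; the discrete statement is assumed to follow in the same way for the sampled AF.

```latex
\begin{aligned}
A^{*}(-\theta,-\tau)
  &= \left[\int s\!\left(t-\tfrac{\tau}{2}\right) s^{*}\!\left(t+\tfrac{\tau}{2}\right) e^{-j\theta t}\,dt\right]^{*} \\
  &= \int s^{*}\!\left(t-\tfrac{\tau}{2}\right) s\!\left(t+\tfrac{\tau}{2}\right) e^{j\theta t}\,dt
   = A(\theta,\tau).
\end{aligned}
```

Consequently $|A|^{2}$ takes equal values at points mirrored through the origin, so the objective (10) weights the two half-planes identically.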

The constraint that the kernel be radially nonincreasing can be implemented exactly only on a polar grid of samples. 

However, computing the AF and resulting TFD on a polar grid requires either the computation of a polar Fourier 

transform, for which no fast algorithm exists, or a costly interpolation from a rectangular grid. Therefore, we approximate 

the polar grid by a set of paths on a rectangular grid. Figure 5 illustrates a tree structure that approximates the 

radial dependencies of the kernel for the upper half-plane of a 64 x 64 rectangular grid. The nonincreasing constraint 

is enforced along each path from the origin to the edge. The branches of the tree are constructed to minimize the 

maximum deviation from the branch to the true radial line. 
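The idea of enforcing the nonincreasing constraint along paths on a rectangular grid can be sketched in Python as follows. This is not the tree construction of Fig. 5: as a simpler stand-in, each path is obtained by rounding points on a radial line to the nearest grid sample. The function names, the number of angles, and the tolerance are assumptions of the sketch.

```python
import numpy as np

def radial_path(angle, radius):
    """Grid offsets (m, n) approximating the ray at `angle`, by nearest-sample rounding."""
    r = np.arange(radius + 1)
    m = np.rint(r * np.cos(angle)).astype(int)
    n = np.rint(r * np.sin(angle)).astype(int)
    # Rounding can repeat a sample; drop repeats while keeping the order.
    return list(dict.fromkeys(zip(m.tolist(), n.tolist())))

def is_radially_nonincreasing(kernel, origin, radius, n_angles=64, tol=1e-12):
    """Check the constraint along each approximate path in the upper half-plane."""
    m0, n0 = origin
    for angle in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        values = [kernel[m0 + m, n0 + n] for m, n in radial_path(angle, radius)]
        if np.any(np.diff(values) > tol):
            return False
    return True

# Usage: an isotropic Gaussian passes; adding a bump away from the origin fails.
g = np.arange(-32, 33)
mm, nn = np.meshgrid(g, g, indexing="ij")
gauss = np.exp(-(mm ** 2 + nn ** 2) / 200.0)
bumped = gauss.copy()
bumped[42, 32] = 2.0
```

Only the upper half-plane is scanned, in keeping with the half-plane symmetry noted in Section 2.2.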

3. OPTIMAL KERNEL SOLUTION 

Since the performance measure and constraints are linear in $|\Phi_d|^{2}$, the optimal kernel may be found by applying linear programming [12] to solve for the $N^{2}$ unknowns $|\Phi_d(m,n)|^{2}$ (since $\Phi_d$ is assumed to be real and positive, knowing $|\Phi_d|^{2}$ is equivalent to knowing $\Phi_d$). Moreover, it can be shown that the optimal kernel takes on essentially only the values of one and zero. 

The optimal TFD can thus be determined as follows. First, the discrete AF of the signal to be analyzed is computed. 

Next, the linear program (10)-(12) is solved for the optimal kernel, which is then multiplied by the AF. The two-dimensional Fourier transform of the product is the optimal TFD. 
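The one-zero structure of the solution can be seen in a simplified setting. The Python sketch below solves a relaxed version of (10)-(12) in which the radially nonincreasing constraint (11) is dropped and only the bounds 0 <= |Phi_d|^2 <= 1 are kept (these bounds are implied by Phi_d(0,0) = 1 together with (11)). For that relaxed linear program, filling the samples with the largest |A_d|^2 first is optimal, and at most one sample takes a fractional value. This is neither the full problem nor the fast algorithm of Section 3.1; the function name and the assumption that the AF origin is stored at index [0, 0] are hypothetical.

```python
import numpy as np

def relaxed_kernel(A, alpha):
    """Greedy solution of the relaxed linear program (constraint (11) omitted).

    Assumes alpha >= 1 and that the AF origin is at index [0, 0].
    """
    weights = np.abs(A).ravel() ** 2
    x = np.zeros(weights.size)        # x holds |Phi_d|^2
    x[0] = 1.0                        # Phi_d(0, 0) = 1
    budget = alpha - 1.0              # volume left after the origin
    for k in np.argsort(weights)[::-1]:
        if budget <= 0.0:
            break
        if k == 0:
            continue
        x[k] = min(1.0, budget)
        budget -= x[k]
    return np.sqrt(x).reshape(A.shape)

# Usage on a small made-up AF magnitude array.
A = np.array([[1.0, 5.0, 2.0],
              [4.0, 0.5, 3.0]])
Phi = relaxed_kernel(A, alpha=3.5)
```

The resulting kernel would then multiply the AF before the two-dimensional Fourier transform, as in the procedure described above.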

3.1 Fast Algorithm for Solution 

A solution for the optimal kernel using standard linear programming methods may be simple, but it is also computationally 

expensive. Use of the simplex algorithm would cause the optimal kernel computation to dominate the total 

cost of computing the optimal TFD. However, we have found an extremely efficient inductive procedure that computes the optimal kernel of size $\alpha_d$ with $O[N^{2}]$ operations. Since this number is small in comparison to the $O[N^{2} \log N]$ computations required to find the AF or WD, time-frequency analysis with signal-dependent kernels is competitive computationally with traditional fixed kernel methods. 

3.2 Implementation Issues 

Although the one-zero kernel is optimal according to the constraints stated in section 2.2, its sharp cutoff may introduce 

ringing in the optimal TFD. Thus, some form of smoothing may be desired. One simple approach, used in the examples, 

tapers the kernel. 

Adjustment of the parameter $\alpha_d$ controls the tradeoff between cross-component suppression and smearing of the 

auto-components. A lower bound on reasonable values for $\alpha$ can be derived from uncertainty principle arguments. 

4. EXAMPLES 

In order to compare the results of the optimal kernel design procedure with other TFD's, the optimal kernel was 

computed for the same signal discussed in the introduction. The AF and kernel were of size 64 x 64, the parameter $\alpha_d$ was set to 30 and tapering was applied. The AF, optimal kernel and resulting TFD are shown in Figs. 6, 7 and 8. The cross-component visible in all of the other TFD's except the matched-filter spectrogram (see Fig. 4) is virtually eliminated, yet the distribution is still quite concentrated, much more so than the matched-filter spectrogram. 

Figure 9 illustrates the WD of the same signal corrupted by additive white Gaussian noise. The SNR of the resulting 

signal is 0 dB. The optimal kernel was computed using the same parameters as above. The cross-component and noise 

suppression of the optimal TFD, shown in Fig. 10, are excellent, indicating that the kernel design procedure is robust 

in the presence of significant additive noise. 
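For readers who wish to set up a comparable noise test, the Python sketch below adds white Gaussian noise to a signal at a prescribed SNR. The function name, the use of complex circular noise, and the definition of SNR as the ratio of average signal power to average noise power are assumptions; the paper does not state how its noisy test signal was generated.

```python
import numpy as np

def add_white_gaussian_noise(s, snr_db, rng):
    """Return s plus complex white Gaussian noise at the requested SNR (in dB)."""
    s = np.asarray(s, dtype=complex)
    signal_power = np.mean(np.abs(s) ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    # Split the noise power equally between real and imaginary parts.
    noise = np.sqrt(noise_power / 2.0) * (
        rng.standard_normal(s.shape) + 1j * rng.standard_normal(s.shape)
    )
    return s + noise

# Usage: a hypothetical linear chirp corrupted at 0 dB SNR.
rng = np.random.default_rng(0)
n = np.arange(20000)
s = np.exp(1j * np.pi * 1e-5 * n ** 2)
y = add_white_gaussian_noise(s, snr_db=0.0, rng=rng)
```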

5. CONCLUSION 

An optimization procedure has been presented for the automatic determination of signal-dependent smoothing kernels 

for time-frequency analysis. Due to the signal-dependent nature of the kernel, the quality of the resulting time-frequency representation is insensitive to the time-scale and orientation of the signal. A fast algorithm for the optimal 

kernel solution makes the method competitive computationally with traditional fixed kernel methods. The procedure 

appears to yield excellent results for a much larger class of signals than any fixed kernel representation. The technique 

performs well even in the presence of substantial additive noise. 


6. ACKNOWLEDGMENTS 

This work was supported by the Music Group of the Computer-based Education Research Laboratory and the Joint 

Services Electronics Program, Grant No. N00014-90-J-1270. 

7. REFERENCES 

1. T.A.C.M. Claasen and W.F.G. Mecklenbräuker, "The Wigner Distribution - A Tool for Time-Frequency Signal Analysis - Part I: Continuous-Time Signals," Philips Journal of Research 35(3), pp. 217-250, 1980. 

2. L. Cohen, "Time-Frequency Distributions - A Review," Proceedings of the IEEE 77(7), pp. 941-981, July 1989. 

3. P. Flandrin and O. Rioul, "Affine Smoothing of the Wigner-Ville Distribution," IEEE ICASSP-1990, pp. 2455-2458, 1990. 

4. H.-I. Choi and W.J. Williams, "Improved Time-Frequency Representation of Multicomponent Signals Using Exponential Kernels," IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-37(6), pp. 862-871, June 1989. 

5. P. Flandrin, "Some Features of Time-Frequency Representations of Multicomponent Signals," IEEE ICASSP-1984, pp. 41.B.4.1-41.B.4.4, 1984. 

6. S.C. Glinski, "Diphone Speech Synthesis Based on a Pitch-Adaptive Short-Time Fourier Transform," Ph.D. Thesis, University of Illinois, Urbana, 1981. 

7. G. Jones and B. Boashash, "Instantaneous Frequency, Instantaneous Bandwidth and the Analysis of Multicomponent Signals," IEEE ICASSP-1990, pp. 2467-2470, 1990. 

8. J.C. Andrieux, M.R. Felix, G. Mourgues, P. Bertrand, B. Izrar and V.T. Nguyen, "Optimum Smoothing of the Wigner-Ville Distribution," IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-35(6), pp. 764-769, June 1987. 

9. S. Kadambe, G.F. Boudreaux-Bartels and P. Duvaut, "Window Length Selection for Smoothing the Wigner Distribution by Applying an Adaptive Filter Technique," IEEE ICASSP-1989, pp. 2226-2229, 1989. 

10. A.H. Nuttall, "Wigner Distribution Function: Relation to Short-Term Spectral Estimation, Smoothing, and Performance in Noise," NUSC Technical Report 8225, February 16, 1988. 

11. D.L. Jones and T.W. Parks, "A High Resolution Data-Adaptive Time-Frequency Representation," IEEE ICASSP-87, Dallas TX, pp. 681-684, April 1987. 

12. D.G. Luenberger, Introduction to Linear and Nonlinear Programming, Addison-Wesley Co., Reading, MA, 1973. 



Fig. 1. Magnitude of the Wigner distribution. 

Fig. 2. Magnitude of the Choi-Williams distribution computed with smoothing parameter σ = 20. 

Fig. 3. Magnitude of the spectrogram computed with a Hamming window of duration equal to the effective time width of the signal. 

Fig. 4. Magnitude of the matched-filter spectrogram. 

Fig. 5. Approximation of a radial dependency graph on a rectangular grid. 

Fig. 6. Magnitude of the ambiguity function. 

Fig. 7. The tapered optimal kernel, a
