24.02.2013 Views

least squares theory and design of optimal noise shaping filters


Verhelst and De Koning Least Squares Noise Shaping

[Figure 4: plot of magnitude (dB, -30 to 30) versus frequency (kHz, 0 to 20).]

Figure 4: An FIR noise shaping design of order […] (dashed line) compared to the F-weighting equi-loudness curve. The equi-loudness curve was shifted downward by 16 dB for easy comparison: […] log $W$ […] was plotted.

Because the noise shaping filter is minimum phase, its spectrum on a log scale averages to zero [7].
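The zero-average property can be checked numerically. The sketch below is only an illustration under stated assumptions: the filter is taken to be monic (first coefficient equal to 1), and the zero locations and the helper name `mean_log_magnitude_db` are hypothetical, chosen for the example rather than taken from the paper.

```python
import numpy as np

def mean_log_magnitude_db(h, n_fft=4096):
    """Average of 20*log10|H| over a uniform grid covering the full frequency circle."""
    H = np.fft.fft(h, n_fft)
    return float(np.mean(20.0 * np.log10(np.abs(H))))

# Hypothetical zeros, all strictly inside the unit circle (minimum phase).
zeros = np.array([0.9 * np.exp(1j * 0.3), 0.9 * np.exp(-1j * 0.3), -0.5, 0.7])
h_min = np.real(np.poly(zeros))        # monic FIR: h_min[0] == 1

# Same filter with one zero reflected outside the unit circle (no longer minimum phase).
zeros_nonmin = zeros.copy()
zeros_nonmin[3] = 1.0 / 0.7
h_nonmin = np.real(np.poly(zeros_nonmin))

avg_min = mean_log_magnitude_db(h_min)
avg_nonmin = mean_log_magnitude_db(h_nonmin)
print(avg_min, avg_nonmin)
```

For the minimum-phase filter the average is zero up to rounding error, whereas moving a zero outside the unit circle raises the average by the log magnitude of that zero. This is why a shaped noise spectrum cannot lie below 0 dB everywhere.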

mum one. However, this is only valid under the assumption that the quantizer's error spectrum is constant. Because SBM uses a non-dithered quantizer, it is doubtful that this would be satisfied for the more critical low-level signals. This suggests that SBM could be improved by applying the LPC modelling to the inverse of the desired noise shaping spectrum multiplied by the quantizer's error spectrum, or an estimate thereof.

3. EXPERIMENTS

3.1. Duplication of the F-weighted design

To illustrate its optimality, we applied the proposed design procedure to the problem of designing a minimally audible dither signal using the F-weighting curve as the perceptual weighting function $W$, as was discussed in section 1.3. Fig. 4 illustrates the result for a sampling frequency of 44.1 kHz. A […]th order filter approximation was used. The filter coefficients we obtained differed by less than 0.1% from those obtained by direct optimization of (5) that were published in [3]. This small deviation is probably due to the approximation of the autocorrelation function by inverse FFT, and could be further reduced by using more frequency samples of $W$.
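The design step described here can be sketched numerically. The fragment below is a minimal illustration, not the authors' code. It assumes that the least squares problem reduces to the Toeplitz normal equations of a linear-prediction fit, with the autocorrelation obtained by inverse FFT of the sampled weighting function; the equations themselves are not shown in this excerpt. The weighting curve `W`, the order 8 and the function names are made up for the example and are not the F-weighting data.

```python
import numpy as np

def design_noise_shaper(W, order):
    """Monic FIR [1, h1..hM] minimising mean(W * |H|^2) over the FFT grid."""
    r = np.real(np.fft.ifft(W))                       # autocorrelation via inverse FFT
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    R = r[lags]                                       # Toeplitz matrix, R[i, j] = r[|i - j|]
    a = np.linalg.solve(R, -r[1:order + 1])           # normal equations
    return np.concatenate(([1.0], a))

def weighted_noise_power(h, W):
    """Mean of W * |H|^2 on the grid on which W is sampled."""
    H = np.fft.fft(h, len(W))
    return float(np.mean(W * np.abs(H) ** 2))

N = 512                                               # number of frequency samples
f = np.arange(N) / N                                  # normalised frequency, full circle
fold = np.minimum(f, 1.0 - f)                         # makes W symmetric, so r is real
W = 1.0 + 50.0 * np.exp(-((fold - 0.08) / 0.04) ** 2) # made-up weighting, not F-weighting

h = design_noise_shaper(W, order=8)
print(h)
print(weighted_noise_power([1.0], W), weighted_noise_power(h, W))
```

Because the cost is quadratic in the coefficients, the solution of the linear system is the global minimum for the chosen order, and with a strictly positive weighting the resulting filter is minimum phase. Using more frequency samples `N` makes the inverse FFT a closer approximation of the true autocorrelation.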

3.2. Noise shaping experiment

3.2.1. Experimental setup<br />

A software version of the psychoacoustical noise shaping requantizer (Fig. 3) was implemented. A 44.1 kHz sampling frequency and […]th order FIR designs for $H(z)$ were used. The filter coefficients were obtained by solving (13) as described above. The autocorrelation $r$ was approximated by a 512-point inverse FFT of the weighting function $W$ sampled at $N$ frequency points […].

$W$ was updated every 256 input samples and corresponded to the inverse masking curve of a simplified psychoacoustic model:

$W$ = […] $P_{xx}$ […] $P_{TQ}$ […]    (16)

[…] = […] $N$ […]    (17)

Here $P_{xx}$ represents the energy spectrum of a 512-point Hanning-windowed input segment. Its spectral resolution is comparable to the critical bandwidth of hearing at about […] kHz. $P_{TQ}$ is the hearing threshold in quiet, and […] is the number of bits of the input signal representation. In our experiments […], and $P_{TQ}$ was approximated by the energy spectrum of the […]-order F-weighting filter (dashed curve in Fig. 4). As with other perception models, the factor given in (17) depends on the expected sound pressure level of the input signal. The value in (17) compensates for the scale factor incurred with our choice of $P_{TQ}$ and is appropriate when the loudest signal portions are played at 84 dB SPL.
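The block-wise energy spectrum used as input to the masking model can be sketched as follows. This is a hedged illustration only: it assumes that the 512-point windows simply advance by 256 samples (the text states the update interval but not the exact framing), and the function name `frame_energy_spectra` and the 1 kHz test tone are hypothetical. The weighting function itself is not computed, since (16) and (17) are not legible in this copy.

```python
import numpy as np

def frame_energy_spectra(x, frame_len=512, hop=256):
    """Energy spectrum |FFT|^2 of Hanning-windowed frames taken every `hop` samples."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    spectra = np.empty((n_frames, frame_len // 2 + 1))
    for i in range(n_frames):
        segment = x[i * hop : i * hop + frame_len] * window
        spectra[i] = np.abs(np.fft.rfft(segment)) ** 2
    return spectra

fs = 44100.0                                   # sampling frequency in Hz
t = np.arange(4096) / fs
x = 0.5 * np.sin(2 * np.pi * 1000.0 * t)       # hypothetical 1 kHz test tone
P = frame_energy_spectra(x)
peak_bin = int(np.argmax(P[0]))
print(P.shape, peak_bin * fs / 512)            # frame count and bins, peak frequency in Hz
```

At 44.1 kHz a 512-point transform has a bin spacing of about 86 Hz, and the Hanning window widens the main lobe to a few bins, which gives the order of magnitude of the spectral resolution discussed in the text.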

Sixteen-bit test data from good-quality CDs was used in the experiments. Six different fragments were used, containing different types of instruments, vocals, and musical styles. They had a total duration of 1 minute 33 seconds. The fragments were first requantized using straightforward requantization (i.e., rounding) to a precision where the requantization errors became clearly audible. The fragments were then requantized to the same precision (between 5 and 7 bits, depending on the fragment) with three additional methods, resulting in four different versions:

Version 1. Straightforward requantization

Version 2. Requantization with standard dither

Version 3. Non-dithered requantization with fixed noise shaping (F-weighting)

Version 4. Non-dithered requantization with adaptive psychoacoustic noise shaping
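The first three versions can be sketched in a few lines. The code below is an illustration under explicit assumptions, not the implementation used in the experiments: "standard dither" is taken to be triangular (TPDF) dither spanning plus or minus one quantization step, the noise shaper uses a hypothetical first-order filter instead of the F-weighted design (whose coefficients are not given in this excerpt), and all function names are made up.

```python
import numpy as np

def quantize(v, step):
    """Round to the nearest multiple of `step`."""
    return step * np.round(v / step)

def requantize_plain(x, step):
    return quantize(x, step)

def requantize_dithered(x, step, rng):
    # Assumption: TPDF dither of +/- one step; the source only says "standard dither".
    d = (rng.random(len(x)) - rng.random(len(x))) * step
    return quantize(x + d, step)

def requantize_noise_shaped(x, step, h):
    """Error feedback: total error equals the quantizer error filtered by monic FIR h."""
    h = np.asarray(h, dtype=float)
    M = len(h) - 1
    e = np.zeros(len(x) + M)            # e[n + M] holds the quantizer error at time n
    y = np.empty(len(x))
    for n in range(len(x)):
        past = e[n : n + M][::-1]       # e[n-1], e[n-2], ..., e[n-M]
        v = x[n] + np.dot(h[1:], past)  # add filtered past errors before quantizing
        y[n] = quantize(v, step)
        e[n + M] = y[n] - v
    return y, e[M:]

rng = np.random.default_rng(0)
n = np.arange(2000)
x = 0.4 * np.sin(2 * np.pi * 440.0 * n / 44100.0)
step = 2.0 ** (1 - 6)                   # 6-bit grid on [-1, 1)
h = np.array([1.0, -1.0])               # hypothetical first-order shaper

y1 = requantize_plain(x, step)
y2 = requantize_dithered(x, step, rng)
y3, e3 = requantize_noise_shaped(x, step, h)
```

In this structure the output error `y3 - x` is exactly the quantizer error `e3` convolved with `h`, so the filter coefficients alone determine the spectral shape of the requantization noise. An adaptive variant along the lines of Version 4 would replace `h` with a newly designed filter at each update interval, which is not shown here.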

3.2.2. Informal diagnostic evaluation<br />

Two adults with normal hearing participated in this informal evaluation. They were allowed to listen to the different quantized and original sound fragments in any desired order and as often as they desired. The experiment was performed in a quiet office and the sound files were

AES 22nd International Conference on Virtual, Synthetic and Entertainment Audio 5
