ERRORS IN REAL-TIME ROOM ACOUSTICS DEREVERBERATION

PANAGIOTIS D. HATZIANTONIOU AND JOHN N. MOURJOPOULOS

Audio Group, Wire Communications Laboratory
Electrical and Computer Engineering Department
University of Patras, Patras, 265 00 Greece
Tel.: +30 261 0 996474
Fax: +30 261 0 991855
E-mail: mourjop@wcl.ee.upatras.gr

Significant measurable audible distortions are generated during real-time room acoustics dereverberation, far greater in magnitude than those expected from identical simulated (off-line) dereverberation experiments. Long inverse filters worsen this effect. Possible mechanisms for such discrepancy between simulated and real-time tests are examined. It is also shown that dereverberation based on Complex Smoothing is immune to such errors.

0. INTRODUCTION

The term dereverberation is used here to define the broad category of digital signal processing methods which attempt to undo (partially or completely) the physical or perceptual artifacts generated by audio signal reproduction inside reverberant spaces. This term is comparable to other, alternative terms for


similar methods, such as: “room correction”, “room compensation”, “reverberation reduction”, “room equalization”, etc. Also, the term dereverberation can include similar methods applied to restricted audio bandwidth such as “low-frequency correction”, “modal equalization”, etc. Given the mixed-phase nature of room response due to room boundary reflections, this paper will mainly address the inversion (“undoing”) of such mixed-phase responses, since such methods present the most general and difficult case of the problem. The conclusions of this work will also apply to the simpler case of minimum-phase room response inversion, which is usually referred to as “room equalization”.

The theoretical aspects of room dereverberation based on room impulse response inverse filtering were introduced more than two decades ago for minimum-phase [1] and for mixed-phase response inversion [2]. These early studies highlighted a number of theoretical problems, namely, the acausal nature of such inverse filters when they are chosen to compensate for the non-minimum phase room response (practically realized via the introduction of appropriate delay [2, 3]), the sensitivity of the processing with respect to the measured source / receiver placement inside the room [4, 5], as well as the large number of inverse filter coefficients required. 
For the first of those problems it has been found that, in principle, non-minimum phase compensation is beneficial compared to magnitude (minimum-phase) correction, not only for reducing reverberation energy, but also for phase linearization, which is especially desirable for (anechoic) loudspeaker response equalization [6, 7]. The disadvantage of such acausal filters is that


methods achieve by definition a non-ideal RTF inversion and hence lower dereverberation performance than could theoretically be achieved by the “ideal” dereverberation, but generally are optimized to minimize audible degradations due to the previously discussed problems. These authors have recently produced results for such a method, based on the Complex Smoothing of room responses [20], and tested it over a large variety of different spaces [29]. This method is free to a large extent of all of the above problems affecting practical dereverberation, i.e. audible artifacts, displacement sensitivity and measurement variations.

During those past studies, the results were largely derived via off-line simulation tests and the issue of real-time dereverberation has been only passingly examined, often found to be highly sensitive to many practical factors, so that the robustness of these methods and the perceived gains after their real-time application could not be easily quantified. An often-discussed, but never formally investigated, aspect of these methods relates to the audible distortions generated during real-time dereverberation which, although reported to be severe, could not be accounted for by the expected theoretical errors, such as the previously discussed ones, which were measured by off-line simulations and tests. 
Such distortions were initially described in early real-time listening dereverberation tests, by Neely and Allen [1] (for minimum-phase response inversion) and Mourjopoulos [4] (for mixed-phase response inversion). It is often felt that these distortions were audible manifestations of


the previously discussed error artifacts [7-9], although such conclusions were not derived from formal in-situ measurements of the inversion errors.

Here it will be shown that such distortions are manifestations of large-scale measurable errors generated during practical real-time dereverberation, which are far greater in magnitude than those expected from the theoretical studies and which could be evaluated during any identical simulated dereverberation experiment. It will also be shown that such large discrepancy between simulated and real-time tests is not a result of system non-linearity; instead, the inverted room path must be practically treated as a “weakly non-stationary” system [42], that is, a system whose properties will practically vary between measurement and test even if source / receiver placement and other physical room parameters have not been intentionally changed. This indicates that an “uncertainty” will appear between the response measurement and subsequent inverse filter realizations and hence such findings are related to previously published results concerning room response measurement [34, 37, 38]. It will be shown experimentally that those factors become more critical when long inverse filters are employed, which generate larger orders of measurable errors and audible degradations in the dereverberated signals. 
Possible sources for such errors will be discussed and analytical mechanisms for their generation will be examined. It will then also be shown that the proposed dereverberation method, based on Complex Smoothing of room responses, is immune to such discrepancies between simulated and real-time tests.


The paper is organized as follows: section 1 establishes the experimental set-up and theoretical background for real-time and simulated (off-line) tests; section 2 provides experimental results which show the quantitative and qualitative aspects of differences between the dereverberation performance in each of the above cases; section 3 will analytically explain the possible nature of such error; section 4 will illustrate why dereverberation based on room response Complex Smoothing is free of such errors; section 5 will provide discussion and conclusions.

1. THEORY: DIFFERENCES BETWEEN REAL-TIME AND OFF-LINE DEREVERBERATION

To investigate any discrepancy between simulated and real-time dereverberation methods, it is necessary at first to define the measurement and test chains for each class of experiments: (a) the simulated dereverberation experiments are conducted within a computer, employing measured room responses, subsequently also referred to as “off-line” or “theoretical” tests and, (b) the real-time dereverberation measurements similarly employ previously measured responses, but are conducted in-situ (inside a room).

To compare these two types of tests directly, all relevant parameters were defined and kept identical in both classes. Nevertheless, given the large number of such parameters which potentially can affect the outcome of the


results, these parameters are noted in the corresponding Figures (1, 3-6) and are also listed in Table 2, in the Appendix.

In the above Figures the detailed experimental set-up is shown and all potentially adjustable and variable parameters are indicated inside square brackets. Each module in the experimental chain, this being electrical (analog or digital), electroacoustical or acoustical, is considered to be a linear system, characterized by its corresponding measurable impulse response function, so that well-established expressions can be employed, as is illustrated below.

1.1 Room response measurement

Following Figure 1, it is:

$$
\begin{aligned}
s_{te}(t) &= \mathrm{DAC}\left[s_{te}(n)\right] && \text{(i)}\\
s'_{te}(t) &= g \cdot s_{te}(t) && \text{(ii)}\\
s_{ta}(t) &= s'_{te}(t) \ast h_{L}(t) && \text{(iii)}\\
r_{ta}(t) &= \left[s_{ta}(t) \ast h_{R}(t)\right] + n_{m}(t) && \text{(iv)}\\
r_{te}(t) &= r_{ta}(t) \ast h_{M}(t) && \text{(v)}\\
r'_{te}(t) &= g \cdot r_{te}(t) && \text{(vi)}\\
r'_{te}(n) &= \mathrm{ADC}\left[r'_{te}(t)\right] && \text{(vii)}
\end{aligned}
\tag{1}
$$

where the symbol $\ast$ denotes the continuous-time convolution. All other symbols are listed in Table 2 in the Appendix.
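As a rough illustration of the linear-chain assumption behind eq. (1), the following Python sketch models the measurement path in discrete time as a cascade of convolutions with additive room noise. It is a minimal sketch under stated assumptions, not the authors' measurement code: the DAC and ADC steps are omitted, the function name `measurement_chain`, the gain names `g_src` and `g_rec`, the `noise_std` parameter and all toy responses are hypothetical.

```python
import numpy as np

def measurement_chain(s, h_L, h_R, h_M, g_src=1.0, g_rec=1.0,
                      noise_std=0.0, rng=None):
    """Discrete-time stand-in for eq. (1), steps (ii)-(vi).

    h_L, h_R, h_M: loudspeaker, room and microphone impulse responses.
    g_src, g_rec:  hypothetical names for the source and receiver gains.
    noise_std:     standard deviation of additive noise in the room (n_m).
    """
    s = np.asarray(s, dtype=float)
    s_ta = np.convolve(g_src * s, h_L)      # (ii) gain, (iii) loudspeaker
    r_ta = np.convolve(s_ta, h_R)           # (iv) room path
    if noise_std > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        r_ta = r_ta + noise_std * rng.standard_normal(r_ta.size)
    r_te = np.convolve(r_ta, h_M)           # (v) microphone
    return g_rec * r_te                     # (vi) receiver gain

# Toy responses (illustrative values only, not measured data).
h_L = np.array([1.0, 0.4])
h_R = np.array([1.0, 0.0, 0.6, 0.0, -0.3])
h_M = np.array([0.9, 0.1])
s = np.zeros(8)
s[0] = 1.0                                  # unit impulse excitation

measured = measurement_chain(s, h_L, h_R, h_M, g_src=2.0, g_rec=0.5)
# Without noise, the chain collapses to one overall impulse response.
cascade = 2.0 * 0.5 * np.convolve(np.convolve(h_L, h_R), h_M)
print(np.allclose(measured[:cascade.size], cascade))
```

With `noise_std` set above zero, two runs of the same chain return slightly different responses, which is the situation the later error analysis is concerned with.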


Following the well-established direct approach for ideal inversion of the measured response $h(n)$ [3, 33] (for a specific source/receiver position), an “inverse filter” $h_i(n)$ with length $L_i$ (samples) has to be introduced, such that a new response $d_i(n)$ will be produced, according to the following expression:

$$
d_i(n) = h(n) \otimes h_i(n) \cong \delta(n), \quad n = 0, 1, \ldots, L_d \tag{2}
$$

where $\delta(n)$ is the ideal impulse response, $L_d = L + L_i - 1$ (samples) is the length of the deconvolved response and the symbol $\otimes$ denotes the discrete-time linear convolution. The inverse filter response $h_i(n)$ was calculated here to have the best (least-square) approximation for the ideal response $\delta(n)$ [3]. Other inversion strategies (e.g. via direct DFT inversion) may also produce comparable results, noting potential sources of practical errors [9].

The test chain for such inversion is shown in Figure 3, where it can be observed that all processing is carried out within a simulated computer environment. The majority of results presented in the dereverberation literature were derived by such off-line tests.

1.2.2 Real-time room response inversion

In this case, the source excitation signal is pre-filtered in real-time by the response inverse filter so that after acoustic reproduction in the room, the measured response at the microphone would produce the inverted


(deconvolved) room impulse response, as is shown in Figure 4 and described below:

$$
\begin{aligned}
s_{fte}(n) &= s_{te}(n) \otimes h_i(n) && \text{(i)}\\
s_{fte}(t) &= \mathrm{DAC}\left[s_{fte}(n)\right] && \text{(ii)}\\
s'_{fte}(t) &= g \cdot s_{fte}(t) && \text{(iii)}\\
s_{fta}(t) &= s'_{fte}(t) \ast h_{L}(t) && \text{(iv)}\\
r_{fta}(t) &= \left[s_{fta}(t) \ast h_{R}(t)\right] + n_{i}(t) && \text{(v)}\\
r_{fte}(t) &= r_{fta}(t) \ast h_{M}(t) && \text{(vi)}\\
r'_{fte}(t) &= g \cdot r_{fte}(t) && \text{(vii)}\\
r'_{fte}(n) &= \mathrm{ADC}\left[r'_{fte}(t)\right] && \text{(viii)}
\end{aligned}
\tag{3}
$$

All symbols are listed in Table 2 in the Appendix.

The recorded dereverberated excitation signal defined by eq. (3viii) is appropriately processed (see previous case), yielding the measured real-time inverted discrete-time system (loudspeaker/room/microphone) response $\hat{d}_i(n)$, $n = 0, 1, \ldots, L_d$. This response can then be analyzed as in Figure 3.

1.2.3 Error criteria in room response dereverberation

The dereverberation performance can be monitored via the time domain difference error between the desired response $\delta(n)$ and the off-line ($d_i(n)$) or real-time ($\hat{d}_i(n)$) inverted response:

$$
e(n) = \delta(n) - d_i(n), \quad n = 0, 1, \ldots, L_d - 1 \tag{4a}
$$

$$
\hat{e}(n) = \delta(n) - \hat{d}_i(n), \quad n = 0, 1, \ldots, L_d - 1 \tag{4b}
$$


The term “dereverberation error” will be used from now on to describe the above error sequences.

Any discrepancy between real-time and off-line inversion, as described in the previous sections, is defined here as the difference error $e_d(n)$ between the two corresponding inverted responses (equations (2) and (3)), i.e.:

$$
e_d(n) = \hat{d}_i(n) - d_i(n), \quad n = 0, 1, \ldots, L_d - 1 \tag{5}
$$

The term “dereverberation discrepancy error” will be used from now on to describe the above error sequence. Equivalently, in the discrete frequency domain:

$$
E_d(\omega_k) = \hat{D}_i(\omega_k) - D_i(\omega_k), \quad \omega_k = 2\pi k / L_d, \quad k = 0, 1, \ldots, L_d - 1 \tag{6}
$$

where $d_i(n)$, $D_i(\omega_k)$ and $\hat{d}_i(n)$, $\hat{D}_i(\omega_k)$ are Discrete Fourier Transform pairs.

For the objective analysis of the dereverberation performance in the time domain, the off-line dereverberation error energy $J$ (dB) and the real-time dereverberation error energy $\hat{J}$ (dB) can be derived from eq. (4), defined respectively as:

$$
J = 10 \log_{10} \left\{ \frac{1}{L_d} \sum_{n=0}^{L_d - 1} \left[ e(n) \right]^2 \right\} \tag{7a}
$$

$$
\hat{J} = 10 \log_{10} \left\{ \frac{1}{L_d} \sum_{n=0}^{L_d - 1} \left[ \hat{e}(n) \right]^2 \right\} \tag{7b}
$$

The dereverberation discrepancy error energy $J_d$ (dB) is defined, then, as:

$$
J_d = \hat{J} - J \tag{8}
$$
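The time-domain criteria of eqs. (2), (4), (7) and (8) can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the inverse filter is obtained with a generic least-squares solve against a delayed unit impulse (the modelling delay `delay` is an assumption, in the spirit of the delay mentioned in the introduction), the three-tap mixed-phase response is a toy example, and the "real-time" response is only a hypothetical stand-in produced by slightly perturbing that response.

```python
import numpy as np

def ls_inverse_filter(h, Li, delay):
    """Least-squares h_i of length Li such that h (*) h_i ~ delta(n - delay)."""
    h = np.asarray(h, dtype=float)
    L = h.size
    Ld = L + Li - 1                       # length of the deconvolved response
    C = np.zeros((Ld, Li))                # convolution matrix: C @ h_i == convolve(h, h_i)
    for j in range(Li):
        C[j:j + L, j] = h
    target = np.zeros(Ld)
    target[delay] = 1.0                   # delayed ideal impulse (assumption)
    h_i, *_ = np.linalg.lstsq(C, target, rcond=None)
    return h_i, target

def error_energy_db(e):
    """Eq. (7): 10 log10 of the mean squared dereverberation error."""
    e = np.asarray(e, dtype=float)
    return 10.0 * np.log10(np.mean(e ** 2))

# Toy mixed-phase response: one zero inside and one outside the unit circle.
h = np.array([1.0, -1.5, 0.3])
h_i, target = ls_inverse_filter(h, Li=32, delay=16)

e = target - np.convolve(h, h_i)          # eq. (4a), off-line error
J = error_energy_db(e)

# Hypothetical stand-in for the real-time path: the response seen at test
# time differs slightly from the one used to design h_i.
rng = np.random.default_rng(0)
h_test = h + 1e-2 * rng.standard_normal(h.size)
e_hat = target - np.convolve(h_test, h_i) # eq. (4b) with the perturbed path
J_hat = error_energy_db(e_hat)

J_d = J_hat - J                           # eq. (8)
print(f"J = {J:.1f} dB, J_hat = {J_hat:.1f} dB, J_d = {J_d:.1f} dB")
```

Because the design and the off-line test use the same `h`, `J` only reflects the finite filter length, whereas `J_hat` also reflects the mismatch between the designed and the tested path.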


In the frequency domain, the standard deviation $V$ of the DFT modulus is used as an objective criterion of the spectral flatness of the off-line or real-time inverted response, defined respectively as:

$$
V = \left\{ \frac{1}{L_d} \sum_{k=0}^{L_d - 1} \left[ 10 \log \left| D_i(\omega_k) \right| - \frac{1}{L_d} \sum_{k=0}^{L_d - 1} 10 \log \left| D_i(\omega_k) \right| \right]^2 \right\}^{0.5} \tag{9a}
$$

$$
\hat{V} = \left\{ \frac{1}{L_d} \sum_{k=0}^{L_d - 1} \left[ 10 \log \left| \hat{D}_i(\omega_k) \right| - \frac{1}{L_d} \sum_{k=0}^{L_d - 1} 10 \log \left| \hat{D}_i(\omega_k) \right| \right]^2 \right\}^{0.5} \tag{9b}
$$

The term “dereverberation spectral deviation” will be used from now on to describe this function.

Any discrepancy between off-line and real-time dereverberation spectral deviation can then be defined as:

$$
V_d = \hat{V} - V \tag{10}
$$

The term “dereverberation discrepancy spectral deviation” will be used to describe the above function.

1.3 Dereverberation of audio signals

When a typical audio segment substitutes the source excitation signal, subjective evaluation of dereverberation can be carried out. As with the previous analysis, such tests can be carried out under simulated (“theoretical” or “off-line”) conditions, or via real-time reproduction and audition.
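Returning briefly to the frequency-domain criterion of eqs. (9) and (10), the spectral deviation can be sketched in Python as the standard deviation of the log-magnitude DFT. This is a minimal sketch, assuming a base-10 logarithm and a small magnitude floor to avoid taking the logarithm of zero; the function name and the two short test responses are hypothetical.

```python
import numpy as np

def spectral_deviation_db(d, floor=1e-12):
    """Eq. (9): standard deviation of 10*log10|D(w_k)| over all DFT bins.

    `floor` (an assumption, not part of eq. (9)) guards against log(0).
    """
    D = np.fft.fft(np.asarray(d, dtype=float))
    mag_db = 10.0 * np.log10(np.maximum(np.abs(D), floor))
    return float(np.sqrt(np.mean((mag_db - mag_db.mean()) ** 2)))

# A perfectly inverted response (unit impulse) has a flat spectrum.
d_off = [1.0, 0.0, 0.0, 0.0]
# A response with a residual echo is no longer flat.
d_hat = [1.0, 0.5, 0.0, 0.0]

V = spectral_deviation_db(d_off)
V_hat = spectral_deviation_db(d_hat)
V_d = V_hat - V                       # eq. (10)
print(V, V_hat, V_d)
```

A flat magnitude spectrum gives a deviation of zero, so larger values of `V_hat` indicate stronger residual coloration in the inverted response.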


1.3.1 Off-line audio dereverberation

As is shown in Figure 5, a simulated audio dereverberation test can be carried out within a computer using a measured room response, so that the processed signal can be auditioned, typically via headphones. In such a case, it is:

$$
\begin{aligned}
r_{t}(n) &= s(n) \otimes h(n) && \text{(i)}\\
s_{i}(n) &= r_{t}(n) \otimes h_{i}(n) && \text{(ii)}\\
s_{i}(t) &= \mathrm{DAC}\left[s_{i}(n)\right] && \text{(iii)}\\
s'_{i}(t) &= g \cdot s_{i}(t) && \text{(iv)}\\
s'_{ih}(t) &= s'_{i}(t) \ast h_{p}(t) && \text{(v)}
\end{aligned}
\tag{11}
$$

All symbols are listed in Table 2 in the Appendix.

1.3.2 Real-time audio dereverberation

In this case, the source audio signal is pre-filtered in real-time by the response inverse filter so that after acoustic reproduction in the room, the audio signal at the listener / microphone position would in theory be devoid of room acoustic distortions, as is shown in Figure 6 and described below:

$$
\begin{aligned}
s_{f}(n) &= s(n) \otimes h_{i}(n) && \text{(i)}\\
s_{f}(t) &= \mathrm{DAC}\left[s_{f}(n)\right] && \text{(ii)}\\
s'_{f}(t) &= g \cdot s_{f}(t) && \text{(iii)}\\
s_{fa}(t) &= s'_{f}(t) \ast h_{L}(t) && \text{(iv)}\\
r_{fa}(t) &= \left[s_{fa}(t) \ast h_{R}(t)\right] + n_{i}(t) && \text{(v)}
\end{aligned}
\tag{12}
$$


All symbols are listed in Table 2 in the Appendix.

Clearly this experiment is appropriate for realistic subjective evaluation of any dereverberation method.

1.3.3 Response Spectral Complex Smoothing and dereverberation

As was noted in the introduction, many different methods have been proposed as alternatives to the previously described “ideal inversion”. For some time now [14, 20], room response pre-conditioning via spectral smoothing has been proposed, so that a modified, perceptually compliant version of this response can be derived which is appropriate for dereverberation [29].

According to this approach, the discrete-time measured room impulse response $h(n)$ can be transformed into a smoothed response $h_{cs}(n)$ by the application of the Complex Smoothing operation on the discrete-frequency response $H(\omega_k)$ [20]. Specifically, a spectral smoothing window function $W_{sm}(m, \omega_k)$ operates on the complex discrete-frequency response $H(\omega_k)$ according to the following convolution expression:

$$
H_{cs}(\omega_k) = \sum_{l=0}^{L-1} H(\omega_k - \omega_l) \cdot W_{sm}(m, \omega_l) \tag{13}
$$

In case the spectral smoothing window function $W_{sm}(m, \omega_k)$ has the form of a rectangular (spectral) window function, the above expression can be simplified as:


$$
H_{cs}(\omega_k) = \frac{1}{2m + 1} \sum_{l=k-m}^{k+m} H(\omega_l) \tag{14}
$$

The discrete variable $m$ (samples) is defined as a function of the discrete-frequency index $k$, allowing a variable degree of spectral averaging for each frequency, typically employing fractional octave or other non-uniform frequency smoothing profiles. In this way Complex Smoothing may be considered as a “generalized” form of the more traditional fractional octave analysis (e.g. via 1/3 octave filter banks), employed for at least half a century in room acoustics and audio engineering. Following such an approach, $m(k)$ can be considered as a half-bandwidth function that expresses the dependence of the desirable spectral averaging on frequency. It must be noted that the smoothing operation is practically meaningful when the smoothing index $m(k)$ is defined for the range of values $0 < m(k) \leq L/4$. The above operation allows mapping of the Complex Smoothed room frequency response into a corresponding smoothed room impulse response. Typical results of a smoothed version of the room response (shown originally in Figure 1) are shown in Figure 7.

The proposed dereverberation scheme is based on such pre-processing of room responses [29]. After the application of this operation on the measured room response $h(n)$, a reduced-order smoothed response $h_{cs}(n)$ with a Transfer Function $H_{cs}(\omega_k)$ is derived. 
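The rectangular-window form of eq. (14) can be sketched in Python as a running average over complex DFT bins with a frequency-dependent half-width $m(k)$. This is a minimal sketch under stated assumptions rather than the published algorithm: bin indices are wrapped circularly, the helper `fractional_octave_halfwidth` is a hypothetical 1/3-octave style profile mirrored about the Nyquist bin so that the smoothed impulse response stays real, and the lowest bins receive $m(k) = 0$ (no smoothing), which falls outside the range quoted above.

```python
import numpy as np

def complex_smooth(H, m_of_k):
    """Eq. (14): average the complex bins l = k-m(k) .. k+m(k) for every k."""
    H = np.asarray(H, dtype=complex)
    L = H.size
    H_cs = np.empty(L, dtype=complex)
    for k in range(L):
        m = int(m_of_k[k])
        idx = np.arange(k - m, k + m + 1) % L   # circular indexing (assumption)
        H_cs[k] = H[idx].mean()
    return H_cs

def fractional_octave_halfwidth(L, fraction=3):
    """Hypothetical half-bandwidth profile m(k), symmetric: m(k) == m(L-k)."""
    k = np.arange(L)
    k_sym = np.minimum(k, L - k)                # mirror about the Nyquist bin
    width = 2.0 ** (1.0 / (2 * fraction)) - 2.0 ** (-1.0 / (2 * fraction))
    m = np.floor(0.5 * width * k_sym).astype(int)
    return np.minimum(m, L // 4)                # keep m(k) <= L/4

# Toy "room response": exponentially decaying noise (not measured data).
rng = np.random.default_rng(0)
L = 256
h = rng.standard_normal(L) * np.exp(-np.arange(L) / 40.0)

H = np.fft.fft(h)
m = fractional_octave_halfwidth(L)
H_cs = complex_smooth(H, m)
h_cs = np.fft.ifft(H_cs).real                   # smoothed impulse response
print(h_cs[:4])
```

Averaging the complex values, rather than the magnitudes, is what lets the result be mapped back to a smoothed impulse response through the inverse DFT.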
Then, an inverse filter $h_{cs,i}(n)$ is evaluated, which inverts the Complex Smoothed response, i.e.:


$$
\begin{aligned}
d_{cs,i}(n) &= h_{cs}(n) \otimes h_{cs,i}(n) \cong \delta(n)\\
D_{cs,i}(k) &= H_{cs}(\omega_k) \cdot H_{cs,i}(\omega_k) = 1
\end{aligned}
\tag{15}
$$

In theory such an inverse filter response $h_{cs,i}(n)$ will achieve a compromised result when employed on the original measured room response $h(n)$, according to the following expression:

$$
\begin{aligned}
d'_{i}(n) &= h(n) \otimes h_{cs,i}(n)\\
D'_{i}(k) &= H(\omega_k) \cdot H_{cs,i}(\omega_k)
\end{aligned}
\tag{16}
$$

Such compromised, but nevertheless perceptually more beneficial, performance can be observed in the (off-line) dereverberation results of Figure 8. This dereverberation method was found to produce objective and subjective improvement, enhancing all known acoustic parameters related to measured responses, irrespective of the room size, without introducing audible distortions (see [29] and off-line audio demonstrations in http://www.wcl.ee.upatras.gr/audiogroup/Equalization/index.html).

2. RESULTS: ERROR IN REAL-TIME DEREVERBERATION

Tests were carried out for all types of experiments described in section 1. Real-time tests were conducted in a professional laboratory room with dimensions and acoustics close to those recommended by the IEC 268-13 standard for loudspeaker evaluation, i.e. L: 7.15 m × W: 4.60 m × H: 2.90 m. Reverberation time and ambient noise data for this room are listed in the Appendix. For the room response measurements, a small 2-way closed-box loudspeaker (Yamaha NS-10) was located at the center of the room, at 3 m from an omnidirectional microphone and at a height of 1.7 m above the floor.


The excitation signal was a maximum-length sequence (MLS) reproduced and averaged 16 times to reduce the effects of additive background noise. The original room response (Figure 1) was derived from calculations based on double-precision floating-point arithmetic. The overall system response was obtained to a length of $L$ = 512K samples. The real-time inverted system responses were also obtained using the above method. Pre-filtering was applied on such excitation sequences by using the appropriate inverse filters, having different length $L_i$ ranging from 1K to 512K samples. The measured real-time inverted responses were trimmed to an analysis length $L_d = L + L_i - 1$ (samples). Off-line inverted responses were obtained by linear convolution between the original room and the inverse filter response. From the results of those tests, the following conclusions were derived:

(i) In all cases the dereverberation error (as defined by eq. (4)) was significantly larger during real-time inversion than during the identical-parameter off-line inversion. Typical results are shown in Figure 9, where the time and frequency domain response inversion results are shown for filter lengths of $L_i$ = 1K (short filter) (Figure 9(a), (b)) and $L_i$ = 512K (long filter) (Figure 9(c), (d)). 
As can be observed, the real-time inversion errors were up to 20 dB higher, especially for low frequencies, resulting in low-frequency ringing in the time domain. On detailed examination, it can be observed that during real-time tests the compensating poles of the inverse filter do not fully cancel the RTF zeros (dips), hence generating additional mismatched ringing


poles. Such effects were found to be significantly smaller during the off-line tests.

(ii) As a result of this, in informal listening tests the audio material was less distorted during off-line inversion (equations (11)) than during real-time inversion (equations (12)). For the case of off-line inversion, the overall flattening of the spectrum expected from the diagrams was also observed (heard), resulting in reduced coloration, but some low-amplitude post-ringing together with just audible pre-ringing were also noticeable. This finding is in agreement with past references [4, 8]. However, during real-time tests, the audio material was significantly more degraded, with additional and more prominent harsh pre- and post-ringing resonances. Typical audio examples are available at the electronic address: http://www.wcl.ee.upatras.gr/audiogroup/Equalization/index.html

(iii) By increasing inverse filter length $L_i$ (samples), as is well-known [3, 33], the theoretical dereverberation error can be reduced in both time and frequency domain, albeit with the expected increase of the low-level pre-echo length corresponding to the partial deconvolution of the now-lengthier acausal room response error component. 
Again, this finding is in agreement with past references [2, 3]. However, rather unexpectedly and not previously noted in the literature, the real-time inversion results were significantly worse for longer inverse filters (see Figures 9(c) and 9(d)). As can be observed in those figures, the previously observed low-frequency errors were now combined with large errors spread to most frequency ranges, generating broadband


approximately, not dramatically increasing thereafter. This difference between real-time and off-line error (dereverberation discrepancy error energy, equation (8)) can be more clearly observed in Figure 11(a), where this unexpected increase in real-time error is clearly shown. Significantly, Complex Smoothing based dereverberation error (results of corresponding tests using such smoothed responses are also plotted in these figures) remains largely independent of filter length or test class. Similar trends can also be observed in the frequency domain, where dereverberation spectral deviation (see equation (9)) increases for longer filters in real-time tests, whereas it remains independent of filter length for Complex Smoothed dereverberation (Figures 10(b) and 11(b)).

These findings, believed not to have been previously reported, at first consideration cannot be accommodated within the existing theoretical analysis of the room response inversion problem. Hence, subsequent sections of this paper will examine in more detail the factors affecting this discrepancy and will propose possible mechanisms responsible for the generation of such errors.

3. ERROR ANALYSIS FOR REAL-TIME DEREVERBERATION

In order to interpret those results, it is suggested here that real-time dereverberation differs from off-line dereverberation possibly due to the instantaneously time-varying noise amplified by the inverse filter compensating poles. These narrow bandwidth (high-Q) large-gain poles, in a frequency domain sense, are


attempt<strong>in</strong>g to compensate for high-Q RTF zeros, where signals to noise ratio(SNR), as well as spectral resolution appear to be extremely criticalparameters, susceptible to variations between response measurement andtest conditions. Such mismatch between response measurement and<strong>dereverberation</strong> test noise level at those critical regions will severely degradeperformance due to any <strong>in</strong>stantaneous small variations. In a <strong>time</strong> doma<strong>in</strong>sense, these <strong>dereverberation</strong> mismatched components can be very lengthydue to r<strong>in</strong>g<strong>in</strong>g of mismatched poles, (<strong>in</strong>clud<strong>in</strong>g pre-r<strong>in</strong>g<strong>in</strong>g for the case ofmixed-phase <strong>in</strong>verse), so that the distortion is f<strong>in</strong>ally manifested as modulatednarrow-band noise. Practical tests maybe also affected by otherenvironmental as well as equipment-related factors which may generate smallshifts <strong>in</strong> the signal frequency. Such mechanisms may be due to <strong>room</strong>temperature variations between response measurement and <strong>dereverberation</strong>test or digital audio equipment jitter.These two potentially dom<strong>in</strong>ant error mechanisms, namely the effect ofbackground noise and of spectral shift, will be analyzed below.3.1 Effect of background noiseAs it was shown <strong>in</strong> previous sections dur<strong>in</strong>g practical tests the measuredimpulse response will always conta<strong>in</strong> a level of background noise. Hence, themeasured discrete-<strong>time</strong> <strong>room</strong> response h (n), hav<strong>in</strong>g a discrete-frequency21


RTF H(ω k) with ω = 2πkL,k = 0,1, K L , can be considered as the sum ofk,the noiseless system response h o(n)and of the background noise (n), i.e.:h( n)= h ( n)n ( n), n = 0,1,K,L(17)o+mand <strong>in</strong> the discrete frequency doma<strong>in</strong>:H ω ) = H ( ω ) + N ( ω )(18)(k o k m kwhere h o(n), Ho( ωk) and n m(n), N m(n)are Discrete Fourier Transform pairs.Follow<strong>in</strong>g the ideal <strong>dereverberation</strong> discussed earlier, the <strong>in</strong>verse filter withresponseh i(n) of length Liwill be <strong>in</strong>troduced, so that the theoretical (off-l<strong>in</strong>e)<strong>in</strong>verted system response d i(n)will be:ii[ ho( n)+ nm(n)] ⊗ hi( n),n = 0,1, Ldd ( n)= h(n)⊗ h ( n)=K,(19)and, equivalently, <strong>in</strong> the frequency doma<strong>in</strong>:D (ωik[ ) N (ω )] ⋅ H (ω ))= H (ωok+ (20)mkikwhere ωk= 2πk L , k = 0,1 KLd−1and Ld= L + Li(samples) is the length ofthe now <strong>in</strong>verted system response.dn mDur<strong>in</strong>g <strong>real</strong>-<strong>time</strong> equalization, provided all parameters rema<strong>in</strong> the same, it canbe assumed that the noiseless system response h o(n)also rema<strong>in</strong>sunchanged, but the measured background noise n m(n)it is possible to havedifferent per sample values (although it may reta<strong>in</strong> the same longer-termstatistics as for the previously measured response noise) and hence a newnoise sequence n i(n)is present <strong>in</strong> the equalized system response dˆ i( n).Then, accord<strong>in</strong>g to equations (19) and (20):22


$$ \hat{d}_i(n) = [h_o(n) + n_i(n)] \otimes h_i(n), \quad n = 0, 1, \ldots, L_d \qquad (21) $$

$$ \hat{D}_i(\omega_k) = [H_o(\omega_k) + N_i(\omega_k)] \cdot H_i(\omega_k) \qquad (22) $$

Combination of equations (20) and (22) yields:

$$ \hat{D}_i(\omega_k) = \frac{H_o(\omega_k) + N_i(\omega_k)}{H_o(\omega_k) + N_m(\omega_k)} \cdot D_i(\omega_k) \qquad (23) $$

From the above expression it is clear that the real-time inverted room response differs from the off-line one mainly due to the effect of the background noise. Combination of equations (6), (20) and (22) yields:

$$ E_d(\omega_k) = [N_i(\omega_k) - N_m(\omega_k)] \cdot H_i(\omega_k) \qquad (24) $$

Equivalently, in the discrete-time domain, the dereverberation discrepancy error function $e_d(n)$ will be:

$$ e_d(n) = [n_i(n) - n_m(n)] \otimes h_i(n) \qquad (25) $$

Given that $n_m(n)$ and $n_i(n)$ may have different per-sample values, the difference $n_i(n) - n_m(n)$ is a non-zero sequence. Hence, a noise sequence filtered by the inverse system response will be generated during real-time response inversion. In contrast, the off-line equalized response will only generate the significantly lower theoretical equalization error.

From a physical point of view, the above analysis leads to the conclusion that in spectral bins where the power of the measured system response is low relative to the acoustic noise floor, the effect of any inverse will largely depend on the noise properties: any small (instantaneous) noise variation is likely to generate mismatch error, which will be significantly amplified by the now-mismatched compensating poles. This conclusion can partially explain the results described in Section 2. However, the interdependence of noise and inverse filter length may be further explained via an idealized example in which a room response null is represented by a simple notch filter. This example shows how such a noise amplification mechanism becomes more dominant for longer inverse filters. To create such a null at a specific frequency $f_0$ (Hz), a pair of complex-conjugate zeros $z_{1,2} = e^{\pm j\omega_0}$ is introduced on the unit circle at a normalized angular frequency $\omega_0$, where $\omega_0 = 2\pi f_0 / f_s$. Then the discrete frequency response $R(\omega_k)$ of such a notch system will be:

$$ R(\omega_k) = (1 - e^{j\omega_0} e^{-j\omega_k}) \cdot (1 - e^{-j\omega_0} e^{-j\omega_k}) \qquad (26) $$

In order to simulate realistic measurement conditions, a spectral component for the background noise must be added, so that:

$$ R(\omega_k) = (1 - e^{j\omega_0} e^{-j\omega_k}) \cdot (1 - e^{-j\omega_0} e^{-j\omega_k}) + N_m(\omega_k) \qquad (27) $$

The ideal inverse filter response $R_i(\omega_k)$ for the above system will be:

$$ R_i(\omega_k) = \frac{1}{(1 - e^{j\omega_0} e^{-j\omega_k}) \cdot (1 - e^{-j\omega_0} e^{-j\omega_k}) + N_m(\omega_k)} \qquad (28) $$

such that the off-line equalized notch response will be:

$$ D_i(\omega_k) = R(\omega_k) \cdot R_i(\omega_k) \qquad (29) $$

Assuming now that this ideal inverse filter is applied to the original notch system response under noisy test conditions, and recalling the previous expression for the dereverberation discrepancy error (equation (24)), such filtering yields:

$$ E_d(\omega_k) = [N_i(\omega_k) - N_m(\omega_k)] \cdot R_i(\omega_k), \quad \omega_k = 2\pi k / L_d, \quad k = 0, 1, \ldots, L_d - 1 \qquad (30) $$
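Equation (30) can be explored numerically with a minimal sketch such as the one below. It is not the authors' test code: the function names, the noise level and the white complex Gaussian noise model are assumptions made only for illustration, while the notch parameters ($f_0$ = 1 kHz, $f_s$ = 44.1 kHz) and the DFT lengths of 1K, 8K and 64K samples follow the idealized example of this section.

```python
import numpy as np

def notch_response(dft_len, f0=1000.0, fs=44100.0):
    """Noiseless notch of equation (26), sampled at omega_k = 2*pi*k/dft_len."""
    omega_k = 2.0 * np.pi * np.arange(dft_len) / dft_len
    omega_0 = 2.0 * np.pi * f0 / fs
    z_inv = np.exp(-1j * omega_k)
    return (1 - np.exp(1j * omega_0) * z_inv) * (1 - np.exp(-1j * omega_0) * z_inv)

def complex_noise(rng, dft_len, rms):
    """Assumed noise model: white complex Gaussian spectrum with the given RMS."""
    real, imag = rng.standard_normal((2, dft_len))
    return rms * (real + 1j * imag) / np.sqrt(2.0)

def discrepancy_error(dft_len, noise_rms=1e-4, seed=0):
    """Equation (30): E_d = (N_i - N_m) * R_i, with R_i = 1/(R + N_m) from (28)."""
    rng = np.random.default_rng(seed)
    n_m = complex_noise(rng, dft_len, noise_rms)  # noise during response measurement
    n_i = complex_noise(rng, dft_len, noise_rms)  # new noise during the real-time test
    r_inverse = 1.0 / (notch_response(dft_len) + n_m)
    return (n_i - n_m) * r_inverse

def peak_inverse_gain(dft_len):
    """Largest gain of the noiseless inverse 1/R over all DFT bins."""
    return 1.0 / np.min(np.abs(notch_response(dft_len)))

if __name__ == "__main__":
    for dft_len in (1024, 8192, 65536):
        e_d = discrepancy_error(dft_len)
        print(dft_len,
              f"peak noiseless inverse gain: {peak_inverse_gain(dft_len):.3g}",
              f"max |E_d|: {np.max(np.abs(e_d)):.3g}")
```

With these assumed parameters, the bin closest to the null moves nearer to $\omega_0$ as the DFT length grows, so the peak gain of the noiseless inverse rises with length. The value of max |E_d| depends on the particular noise realization and on the assumed noise level.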


The discrete inverse filter response $R_i(\omega_k)$ is maximized when evaluated near the null angular frequency $\omega_0$. Clearly, as the bin spacing of the Discrete Fourier Transform (DFT) approaches zero, the maximum absolute value of the inverse filter response $R_i(\omega_k)$ approaches infinity. Hence the worst case for generating such error will occur when the difference $N_i(\omega_k) - N_m(\omega_k)$ is large (i.e. the noise differs between response measurement and equalization test) and, simultaneously, the frequency spacing is very fine, i.e. when the inverse filter length approaches infinity for a given room response length and sampling frequency.

Figure 12(a) shows the magnitude response of the idealized notch system for DFT lengths of 1K, 8K and 64K samples, each resulting in a different frequency bin separation. For illustration purposes, a typical spectrum of acoustic noise is also drawn. Figure 12(b) shows the response of the ideal inverse filters (see equation (28)) generated for different filter length values $L_i$, which compensate for the noisy notch system. As can be observed, increasing the filter length also increases the gain and mismatch of the compensating inverse filter, due to the combined effect of noise and increased spectral resolution. This increases the mismatch spectral distortion in the deconvolution result, shown in Figure 13(a).

3.2 Effect of spectral shift


Recalling the example response of the previous section, let us now assume that for some reason (such as room temperature changes, or equipment-related factors such as digital audio jitter) the original simplified notch response $R(\omega_k)$ (see equation (26)) has slightly changed, so that there is a small shift of the pair of complex-conjugate zeros along the frequency axis (a shift of the angular frequency on the unit circle), i.e. the new null angular frequency is now $\tilde{\omega}_0 = \omega_0 \pm d\omega$, where $d\omega$ denotes a small shift. Then, the “shifted” response $\tilde{R}(\omega_k)$ will be:

$$ \tilde{R}(\omega_k) = (1 - e^{j\tilde{\omega}_0} e^{-j\omega_k}) \cdot (1 - e^{-j\tilde{\omega}_0} e^{-j\omega_k}) \qquad (31) $$

The ideal inverse filter response $R_i(\omega_k)$ for the noiseless response $R(\omega_k)$ is defined as:

$$ R_i(\omega_k) = \frac{1}{(1 - e^{j\omega_0} e^{-j\omega_k}) \cdot (1 - e^{-j\omega_0} e^{-j\omega_k})} \qquad (32) $$

such that the ideal off-line equalized notch response will be:

$$ D_i(\omega_k) = R(\omega_k) \cdot R_i(\omega_k) \qquad (33) $$

Thus the filtering of $\tilde{R}(\omega_k)$ by the inverse filter $R_i(\omega_k)$ will now produce an imperfectly equalized frequency response $\hat{D}_i(\omega_k)$:

$$ \hat{D}_i(\omega_k) = \tilde{R}(\omega_k) \cdot R_i(\omega_k) \qquad (34) $$

The dereverberation discrepancy error $E_d(\omega_k)$ between $\hat{D}_i(\omega_k)$ and $D_i(\omega_k)$ will be:

$$ E_d(\omega_k) = \hat{D}_i(\omega_k) - D_i(\omega_k) = \tilde{R}(\omega_k) \cdot R_i(\omega_k) - R(\omega_k) \cdot R_i(\omega_k) = (\tilde{R}(\omega_k) - R(\omega_k)) \cdot R_i(\omega_k) $$

$$ = (1 - 2\cos(\tilde{\omega}_0) \cdot e^{-j\omega_k} + e^{-j2\omega_k} - 1 + 2\cos(\omega_0) \cdot e^{-j\omega_k} - e^{-j2\omega_k}) \cdot R_i(\omega_k) $$

$$ = 2 \cdot (\cos(\omega_0) - \cos(\tilde{\omega}_0)) \cdot e^{-j\omega_k} \cdot R_i(\omega_k) \Leftrightarrow $$

$$ E_d(\omega_k) = 4 \cdot \sin(\omega_0 \pm d\omega/2) \cdot \sin(\pm d\omega/2) \cdot e^{-j\omega_k} \cdot R_i(\omega_k) \qquad (35) $$

From equation (35) it is clear that the error function $E_d(\omega_k)$ is zero-valued when the spectral shift $d\omega$ is zero. However, even small values of the spectral shift $d\omega$ will cause $E_d(\omega_k)$ to approach infinity for values of the discrete angular frequency $\omega_k$ approaching the null frequency $\omega_0$. Moreover, by decreasing the discrete frequency spacing (i.e. increasing the inverse filter length for a given sample rate), the error function $E_d(\omega_k)$ becomes more sensitive to smaller angular frequency shifts around the original angular frequency $\omega_0$.

Figure 13(b) shows the deconvolution result for different filter lengths, applied to the notch system response whose notch frequency was shifted by 0.1% prior to deconvolution. Clearly, the longer filters generate more severe distortion after inverse filtering.

4. IMPROVING REAL-TIME DEREVERBERATION VIA RESPONSE SMOOTHING

In order to overcome the previously discussed practical problems, it becomes clear that any robust room acoustics dereverberation / equalization scheme must employ techniques that avoid compensation for sharp response dips and that have controlled resolution in an inverse spectral sense, i.e. that employ a moderate length and order for the inverse filter at any given sampling rate. At the same time, perceptually significant spectral resolution must not be compromised by such inverse filters, ideally so that low-frequency narrow-Q room resonances can be tackled [21, 27, 28, 41]. The real-time application of such a filter should be able to improve direct / reverberant signal ratios without adding any unwanted artifacts; hence a mixed-phase inverse filter must be employed. These requirements can be met by Complex Smoothing dereverberation, as described in Section 1.3.3. This procedure corrects gross spectral effects due to early room reflections without attempting to compensate for many of the original narrow-bandwidth spectral dips. In the time domain, being primarily a mixed-phase compensation scheme, it shapes more power into the direct and early-reflection path sounds and less power into some of the later reverberant components. As was previously noted, the method allows user-defined variable frequency resolution / smoothing so that full-bandwidth dereverberation can be achieved; however, in such a case low-frequency resolution may be compromised. To illustrate the immunity of this method to the previously discussed problems, let us consider the effect of the smoothing operation on the notch filter example response.
Equation (27), defining the notch filter frequency response under realistic (noisy) conditions, is now written as:

$$ R(\omega_k) = 1 - 2\cos(\omega_0) \cdot e^{-j\omega_k} + e^{-2j\omega_k} + N_m(\omega_k) \qquad (36) $$

Application of the Complex Smoothing operation to the above response, as defined by equation (14), yields:


$$ R_{cs}(\omega_k) = 1 - 2\cos(\omega_0) \cdot \alpha_1 \cdot e^{-j\omega_k} + \alpha_2 \cdot e^{-2j\omega_k} + N_{mcs}(\omega_k) \qquad (37) $$

where

$$ \alpha_1 = \frac{1}{2m(k)+1} \cdot \frac{\sin[\pi (2m(k)+1) / L]}{\sin(\pi / L)}, \qquad (38) $$

$$ \alpha_2 = \frac{1}{2m(k)+1} \cdot \frac{\sin[2\pi (2m(k)+1) / L]}{\sin(2\pi / L)}, \qquad (39) $$

$$ N_{mcs}(\omega_k) = \frac{1}{2m(k)+1} \sum_{l=k-m(k)}^{k+m(k)} N_m(\omega_l) \qquad (40) $$

the latter being the smoothed background noise spectrum.

The ideal inverse filter $R_{csi}(\omega_k)$ for the above system will be:

$$ R_{csi}(\omega_k) = \frac{1}{1 - 2\cos(\omega_0) \cdot \alpha_1 \cdot e^{-j\omega_k} + \alpha_2 \cdot e^{-2j\omega_k} + N_{mcs}(\omega_k)} \qquad (41) $$

Thus the filtering of $R(\omega_k)$ by the inverse filter $R_{csi}(\omega_k)$ will now produce an imperfectly equalized frequency response $D'_i(\omega_k)$:

$$ D'_i(\omega_k) = R(\omega_k) \cdot R_{csi}(\omega_k) \qquad (42) $$

From equations (38)-(41) it is evident that the inverse filter response $R_{csi}(\omega_k)$ is controlled by the parameters $\alpha_1$ and $\alpha_2$, so that when evaluated near the null angular frequency $\omega_0$ the inverse filter response will always be finite in amplitude and always smaller than the original (ideal) inverse $R_i(\omega_k)$. To illustrate this, let us evaluate this response for the case when the smoothing window function $m(k)$ is a 1/3 octave half-bandwidth function of the discrete frequency index $k$, choosing also a large value for the inverse filter length, $L_i$ = 512K samples. Then, for a notch angular frequency $\omega_0 = 0.1425$ rad (corresponding to $f_0$ = 1 kHz and $f_s$ = 44.1 kHz), for which $\lim_{\omega_k \to \omega_0} \{m(k)\} = 2734$ samples, equations (38)-(41) give:

$$ \lim_{\omega_k \to \omega_0} \{R_{csi}(\omega_k)\} = \frac{1}{-8.5394 \cdot 10^{-5} + 3.7934 \cdot 10^{-5} \cdot i + \lim_{\omega_k \to \omega_0} \{N_{mcs}(\omega_k)\}} \qquad (43) $$

The corresponding ideal inverse filter response defined by equation (28), near the notch angular frequency $\omega_0 = 0.1425$, will be:

$$ \lim_{\omega_k \to \omega_0} \{R_i(\omega_k)\} = \frac{1}{-1.2910 \cdot 10^{-6} + 1.8520 \cdot 10^{-7} \cdot i + \lim_{\omega_k \to \omega_0} \{N_m(\omega_k)\}} \qquad (44) $$

As can be observed, recalling the error functions defined by equations (30) and (35), the inverse filter for the Complex Smoothed response will always amplify any noise to a lesser degree, so that any distortion due to mismatch or frequency shifting will be reduced. Figure 14(a) shows the magnitude spectrum for the above example for the noisy original and the smoothed notch response. After inversion with Complex Smoothed filters of different lengths no mismatch artifacts are generated, as is shown in Figure 14(b).

5. DISCUSSION AND CONCLUSIONS

Although the principles of room acoustics dereverberation have been known for at least 20 years, the robust application of such methods to real-life cases was often unsuccessful, since their performance was degraded by undesirable processing artifacts. Over that period it was found that the audible benefits of real-time dereverberation were lower than the off-line (simulated) measured performance of such methods. However, this discrepancy had not previously been formally investigated, and it was often attributed to compromised inverse filter design caused by the inadequate real-time processing power available at the time for such tests. This discrepancy between real-time and off-line dereverberation performance has been measured and analyzed here, illustrating some novel aspects not discussed in earlier publications, which were largely based on simulated tests.

It was found here that, with all parameters kept identical and unchanged between room response measurement and dereverberation test, the mismatch errors were significantly higher for real-time tests than for the corresponding off-line tests. It was also found that increasing the inverse filter length (i.e. the number of filter coefficients) further degraded real-time dereverberation performance; furthermore, the discrepancy between real-time and off-line performance increased, since for such lengthier filters the theoretical off-line mismatch error decreases.
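The dependence of the mismatch on filter length can be checked on the spectral-shift model of Section 3.2 with a short sketch like the following. It evaluates the discrepancy error of equation (35) both directly, as $(\tilde{R} - R) \cdot R_i$, and through the closed form, for a notch shifted by 0.1% as in the Figure 13(b) test. The function names are illustrative assumptions, only the positive shift $+d\omega$ is evaluated, and the noiseless inverse of equation (32) is used.

```python
import numpy as np

def notch(omega, omega_null):
    """Notch response 1 - 2cos(omega_null) e^{-j omega} + e^{-j 2 omega} (expanded eq. (26))."""
    return 1 - 2 * np.cos(omega_null) * np.exp(-1j * omega) + np.exp(-2j * omega)

def shift_error(dft_len, rel_shift=1e-3, f0=1000.0, fs=44100.0):
    """Return E_d of equation (35), computed directly and via the closed form."""
    omega_k = 2.0 * np.pi * np.arange(dft_len) / dft_len
    omega_0 = 2.0 * np.pi * f0 / fs
    d_omega = rel_shift * omega_0                   # shift of the null frequency (+d_omega case)
    r_inverse = 1.0 / notch(omega_k, omega_0)       # equation (32)
    direct = (notch(omega_k, omega_0 + d_omega) - notch(omega_k, omega_0)) * r_inverse
    closed = (4 * np.sin(omega_0 + d_omega / 2) * np.sin(d_omega / 2)
              * np.exp(-1j * omega_k) * r_inverse)
    return direct, closed

if __name__ == "__main__":
    for dft_len in (1024, 8192, 65536):
        direct, closed = shift_error(dft_len)
        print(dft_len, f"max |E_d|: {np.max(np.abs(direct)):.3g}")
```

For these assumed lengths the largest error occurs at the DFT bin closest to the null, which lies closer to $\omega_0$ for the longer transforms, so the peak error grows with length even though the shift itself is unchanged.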
It is also known that allowing more inverse filter coefficients mainly implements ringing poles, which attempt to compensate for the original RTF zeros. Given that the perceptual benefits from such compensation of narrow RTF zeros are questionable, it becomes clear that such filters will usually have a negative effect on practical dereverberation performance. Specifically, it was found that under practical real-time conditions such “high-Q” ringing poles are extremely sensitive to small variations in system (room) properties, which can change between response measurement and dereverberation test even if source / receiver positions and equipment settings remain identical, so that any such test will suffer from significant mismatch error. Hence, from this point of view, the room may be considered a “weakly non-stationary system”. Although the exact nature of such system variation mechanisms could not be fully identified, it was proposed and analytically discussed in Section 3 of this paper that possible factors are the varying environmental acoustic noise and/or small drifts in signal frequency caused by environmental or equipment-related factors. It was illustrated that such variations affect proportionally more the low-SNR bins associated with high-resolution discrete-frequency RTF zeros. Due to the above factors, it has been shown that the “uncertainty” between measurement and test increases with the discrete-time inverse filter length and spectral resolution, so that under such conditions the coupling between the discrete and continuous-time acoustic domains seems to break down.
It must also be noted here that averaged MLS room response measurements may themselves generate results with inconsistent and biased properties in low-SNR regions (such as response tails), due to such time-dependent variations in room noise and air currents, which can cause variable phase modulations in the received signal.

In order to further assess and discuss the findings of these tests, it is necessary to note that the known distortions (artifacts) generated by dereverberation methods can be broadly grouped into: (a) “mismatch errors”, which up to now have been mainly associated with source / receiver displacements with respect to the original RTF measurement positions, or with insufficient inverse filter performance (typically due to length restrictions), and (b) pre-echoes, mainly due to the effects of acausal components of the inverse filters employed to tackle the non-minimum phase RTF.

For the first of these distortions, it has been shown here that such errors will be present during ideal real-time dereverberation, irrespective of processing power (inverse filter length) and of source / receiver position coincidence. These errors would obviously increase if dereverberation is carried out outside the RTF measurement area, as has been repeatedly shown by past studies, but in the first place they would always be generated due to the sensitivity of the compensating poles to the ambient noise appearing around the low-SNR RTF zeros. Such errors would become more prominent when the inverse filter length is increased, even though under similar conditions the theoretical (off-line) dereverberation performance would always seem to improve.

Over the past years, most of the proposed dereverberation methods have attempted to avoid compensating for sharp RTF dips. The findings of this work also indicate that, given that mismatch error will be generated at all “high-Q” RTF regions, any robust dereverberation method must be based on moderate-resolution inverse filters, i.e. filters having a “smoothed” spectral profile with respect to the measured RTF and hence a rather short length (in the region of 2K samples for a 44.1 kHz sampling rate). Ideally, a careful frequency-varying smoothing profile would not sacrifice significant resolution at low frequencies, so that “narrow Q” room resonances can also be corrected.
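The effect of such a smoothed spectral profile on the idealized notch of Section 3 can be sketched numerically as below. This is only an illustration under stated assumptions, not the Complex Smoothing implementation of the cited work: the smoothing is taken as the rectangular average over bins $k - m(k)$ to $k + m(k)$ implied by equations (38)-(40), the half-window `half_window` is an assumed 1/3-octave rule, noise is omitted, and all function names are hypothetical.

```python
import numpy as np

def notch(dft_len, f0=1000.0, fs=44100.0):
    """Noiseless notch 1 - 2cos(w0) e^{-j w_k} + e^{-2j w_k} on the DFT grid."""
    omega_k = 2.0 * np.pi * np.arange(dft_len) / dft_len
    omega_0 = 2.0 * np.pi * f0 / fs
    return 1 - 2 * np.cos(omega_0) * np.exp(-1j * omega_k) + np.exp(-2j * omega_k)

def half_window(dft_len, fraction=3):
    """Assumed m(k): half the width, in bins, of a 1/fraction-octave band around bin k."""
    k = np.arange(dft_len)
    k_folded = np.minimum(k, dft_len - k)  # mirror the negative-frequency bins
    width = 2.0 ** (0.5 / fraction) - 2.0 ** (-0.5 / fraction)
    return np.floor(0.5 * width * k_folded).astype(int)

def complex_smooth(spectrum, m):
    """Circular rectangular average of the complex spectrum over bins k-m(k)..k+m(k)."""
    dft_len = len(spectrum)
    k = np.arange(dft_len)
    tiled = np.concatenate([spectrum, spectrum, spectrum])  # handles wrap-around
    running = np.concatenate([[0.0], np.cumsum(tiled)])     # running[i] = sum(tiled[:i])
    lower = k + dft_len - m
    upper = k + dft_len + m + 1
    return (running[upper] - running[lower]) / (2 * m + 1)

def peak_gains(dft_len):
    """Peak gain of the plain inverse and of the inverse of the smoothed notch."""
    response = notch(dft_len)
    smoothed = complex_smooth(response, half_window(dft_len))
    return 1.0 / np.min(np.abs(response)), 1.0 / np.min(np.abs(smoothed))

if __name__ == "__main__":
    for dft_len in (1024, 8192, 65536):
        plain, smooth = peak_gains(dft_len)
        print(dft_len, f"plain inverse: {plain:.3g}", f"smoothed inverse: {smooth:.3g}")
```

In this sketch the peak gain of the plain inverse keeps rising with DFT length, whereas the inverse of the smoothed notch levels off, because the smoothed response no longer passes through zero. How closely this reproduces the behaviour of the actual Complex Smoothing filters depends on the assumed window rule.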


It has also been shown that this desirable form of spectral compensation can be achieved by inverse filters derived from a Complex Smoothed RTF having a frequency-dependent profile, which also results in shorter filters, immune to the undesirable noise sensitivity associated with longer inverse filters. Such filters can be mixed-phase, in which case it has been previously shown that they can remove some of the unwanted reflection energy, improving direct / reverberant signal ratios and other similar room acoustic metrics up to the limit at which pre-echo artifacts become perceptually detrimental. Furthermore, Complex Smoothing based dereverberation appears to be beneficial across the complete audio frequency range.

Given that the Complex Smoothing method appears to be largely immune to the first class of errors and hence to present a viable solution, the second class of dereverberation artifacts, namely the pre-echoes, appears at present to be the most challenging aspect of the problem of removing the time-dispersed late reflection energy. Currently, it appears practically impossible to fully recover the anechoic source signal via real-time and distortion-free processing. However, this should not be regarded as a realistic engineering target for dereverberation methods, since they should aim at improving the audible impression without removing all evidence of the specific listening space’s reverberance.
Provided that sophisticated and robust perceptual models evolve which can clearly outline the complex interactions between the multiple parameters affecting the perception of reverberation (e.g. see [39]), a better compromise between dereverberation performance and perceived effect could be achieved by such future methods.

6. ACKNOWLEDGEMENT

The authors express their gratitude to the anonymous reviewer for pointing out the potential biasing of averaged MLS measurements due to room noise variations.

7. REFERENCES

[1] S.T. Neely and J.B. Allen, “Invertibility of a Room Impulse Response”, J. Acoust. Soc. Am., Vol. 66, pp. 165-169, (1979).
[2] J. Mourjopoulos, P.M. Clarkson, J.K. Hammond, “A Comparative Study of Least-Squares and Homomorphic Techniques for the Inversion of Mixed-Phase Signals”, Proc. IEEE ICASSP’82, pp. 1858-1861, (1982).
[3] P. Clarkson, J. Mourjopoulos, J. Hammond, “Spectral, Phase, and Transient Equalisation for Audio Systems”, J. Audio Eng. Soc., Vol. 33, pp. 127-132, (1985).
[4] J. Mourjopoulos, “On the Variation and Invertibility of Room Impulse Response Functions”, Journal of Sound and Vibration, Vol. 102, pp. 217-228, (1985).
[5] B.D. Radlovic, R.C. Williamson, R.A. Kennedy, “Equalization in an Acoustic Reverberant Environment: Robustness Results”, IEEE Trans. Speech and Audio Processing, Vol. 8, No 3, pp. 311-319, (2000).


[6] R. Greenfield, M.O. Hawksford, “The Audibility of Loudspeaker Phase Distortion”, Proc. of the 88th AES Conv., preprint 2927, (1990).
[7] B.D. Radlovic, R.A. Kennedy, “Non-minimum Phase Equalization and its Subjective Importance in Room Acoustics”, IEEE Trans. Speech and Audio Processing, Vol. 8, No 6, pp. 728-737, (2000).
[8] L.D. Fielder, “Analysis of traditional and reverberation-reducing methods of room equalization”, J. Audio Eng. Soc., Vol. 51, No 1/2, pp. 3-26, (2003).
[9] J.N. Mourjopoulos, “Comments on Analysis of traditional and reverberation-reducing methods of room equalization”, to be published in J. Audio Eng. Soc., Vol. 51, No 12, (2003).
[10] M. Miyoshi and Y. Kaneda, “Inverse Filtering of Room Acoustics”, IEEE Trans. Acoust., Speech & Signal Process., Vol. 36, pp. 145-152, (1988).
[11] J. Mourjopoulos, M. Paraskevas, “Pole and Zero Modeling of the Room Transfer Function”, Journal of Sound and Vibration, Vol. 146(2), pp. 281-302, (1991).
[12] R.P. Genereux, “Adaptive Loudspeaker Systems: Correcting for the Acoustic Environments”, Proc. of the AES 8th Int. Conf., Washington D.C., (1990).
[13] P.G. Craven, M.A. Gerzon, “Practical Adaptive Room and Loudspeaker Equalizer for Hi-Fi Use”, Proc. of the AES 92nd Conv., preprint 3346, (1992).
[14] S. Salamouris, K. Politopoulos, V. Tsakiris, and J. Mourjopoulos, “Digital System for Loudspeaker and Room Equalization”, Proc. of the AES 98th Conv., preprint 3976, (1995).


[15] R. Wilson, “Equalization of Loudspeaker Drive Units Considering Both On- and Off-Axis Responses”, J. Audio Eng. Soc., Vol. 39, No 3, pp. 127, (1991).
[16] F. Asano, Y. Suzuki, and T. Sone, “Sound Equalization Using Derivative Constraints”, Acustica, Vol. 2, pp. 311-320, (1996).
[17] Y. Haneda, S. Makino, Y. Kaneda, “Multiple-Point Equalisation of Room Transfer Functions by Using Common Acoustical Poles”, IEEE Trans. on Speech and Audio Processing, Vol. 5, No 4, (1997).
[18] J.N. Mourjopoulos, “Digital Equalization of Room Acoustics”, J. Audio Eng. Soc., Vol. 42, No 11, pp. 884-900, (1994).
[19] M. Karjalainen, E. Piirilä, A. Järvinen, J. Huopaniemi, “Comparison of Loudspeaker Equalisation Methods Based on DSP Techniques”, J. Audio Eng. Soc., Vol. 47, No 1/2, pp. 14-31, (1999).
[20] P. Hatziantoniou and J. Mourjopoulos, “Generalised Fractional-Octave Smoothing of Audio and Acoustic Responses”, J. Audio Eng. Soc., Vol. 48, No 4, pp. 259-280, (2000).
[21] A. Mäkivirta, P. Antsalo, M. Karjalainen, V. Välimäki, “Low-Frequency Modal Equalization of Loudspeaker-Room Responses”, Proc. of the AES 111th Conv., preprint 5480, (2001).
[22] O. Kirkeby and P.A. Nelson, “Digital Filter Design for Inversion Problems in Sound Reproduction”, J. Audio Eng. Soc., Vol. 47, pp. 583-595, (1999).
[23] O. Kirkeby, P. Rubak, and A. Farina, “Analysis of ill-conditioning of multichannel deconvolution problems”, Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 155-158, (1999).


[24] C. Kyriakakis, S. Bharitkar, P. Hilmes, “Robustness of Multiple Listener Equalization with Magnitude Response Averaging”, Proc. AES 113th Conv., preprint 5669, (2002).
[25] L.G. Johansen and P. Rubak, “Listening test results from a new digital loudspeaker/room correction system”, Proc. of the 110th AES Conv., preprint 5323, (2001).
[26] A. Azzali, A. Bellini, E. Carpanoni, M. Romagnoli, and A. Farina, “AQTtool an automatic tool for design and synthesis of psychoacoustic Equalizers”, Proc. of the 114th Conv., preprint 5835, (2003).
[27] R.J. Wilson, M.D. Capp and J.R. Stuart, “The Loudspeaker-Room Interface - Controlling Excitation of Room Modes”, Proc. of the 23rd AES Conf., Copenhagen, (2003).
[28] J.A. Pedersen, “Adaptive Bass Control - The ABC Room Adaptation System”, Proc. of the AES 23rd Conf., Copenhagen, (2003).
[29] P. Hatziantoniou and J. Mourjopoulos, “Results for Room Acoustics Equalisation Based on Smoothed Responses”, Proc. of the 114th AES Conv., preprint 5779, (2003).
[30] D. Rife and J. Vanderkooy, “Transfer-Function Measurement with Maximum-Length Sequences”, J. Audio Eng. Soc., Vol. 37, No 6, pp. 419-443, (1989).
[31] S. Müller, P. Massarani, “Transfer-Function Measurement with Sweeps”, J. Audio Eng. Soc., Vol. 49, No 6, pp. 443, (2001).
[32] A. Farina, “Simultaneous measurement of impulse response and distortion with a swept-sine technique”, Proc. of the 108th AES Conv., (2000).


[33] P.M. Clarkson, Optimal and Adaptive Signal Processing, (CRC Press, ISBN 0-8493-8609-8), pp. 89-131, (1993).
[34] A. Farina, G. Cibelli, A. Bellini, “AQT - A New Objective Measurement of The Acoustical Quality of Sound Reproduction in Small Compartments”, Proc. of the 110th AES Conv., preprint 5283, (2001).
[35] S. Spors, A. Kunz and R. Rabenstein, “An approach to listening room compensation with wave field synthesis”, Proc. of the 24th AES Int. Conf., Banff, CA, (2003).
[36] A. Härmä, M. Karjalainen, L. Savioja, V. Välimäki, U.K. Laine, J. Huopaniemi, “Frequency-Warped Signal Processing for Audio Applications”, J. Audio Eng. Soc., Vol. 48, No 11, pp. 1011, (2000).
[37] A. Lundeby, T.E. Vigran, H. Bietz, and M. Vorländer, “Uncertainties of Measurements in Room Acoustics”, Acustica, Vol. 81, pp. 344-355, (1995).
[38] M. Karjalainen, P. Antsalo, A. Mäkivirta, T. Peltonen, and V. Välimäki, “Estimation of Modal Decay Parameters from Noisy Response Measurements”, Proc. of the 110th AES Conv., preprint 5290, (2001).
[39] J. Buchholz, J. Mourjopoulos, J. Blauert, “Room Masking: Understanding and Modelling the Masking of Reflections in Rooms”, Proc. of the AES 110th Conv., preprint 5312, (2001).
[40] L.G. Johansen, “Correcting Room Acoustics Using Digital Signal Processing”, Ph.D. Thesis, Aalborg University, Denmark, (2003).
[41] M. Tyril, J.A. Pedersen, P. Rubak, “Digital Filters for Low-Frequency Equalization”, J. Audio Eng. Soc., Vol. 49, No 1, pp. 36, (2001).


[42] L.J. Ziomek, Fundamentals of Acoustic Field Theory and Space-Time Signal Processing, CRC Press, ISBN 0-8493-9455-4, pp. 651-662, (1995).
[43] G. Cibelli, E. Ugolotti, A. Bellini, A. Farina, C. Morandi, “Experimental Validation of Loudspeaker Equalization Inside Car Cockpits”, Proc. of the 106th AES Conv., preprint 4898, (1999).
[44] A. Bellini, A. Farina, G. Cibelli, E. Ugolotti, “Experimental Validation of Equalizing Filters for Car Cockpits Designed with Warping Techniques”, Proc. of the 109th AES Conv., preprint 5278, (2000).
[45] M. Karjalainen, P.A.A. Esquef, P. Antsalo, A. Mäkivirta, V. Välimäki, “AR/ARMA Analysis and Modeling of Modes in Resonant and Reverberant Systems”, Proc. of the 112th AES Conv., preprint 5590, (2002).

APPENDIX

Frequency      31.5 Hz   63 Hz   125 Hz   250 Hz   500 Hz   1 kHz   2 kHz   4 kHz   8 kHz   Average
L_p (dB-SPL)   40        38      32       31       25       22      21      20      20      38
RT (s)         −         −       0.468    0.396    0.349    0.342   0.424   0.371   0.371   0.368

Table 1


f_s (Hz)            discrete-time signal sampling frequency
ADC[ ], DAC[ ]      signal analog to digital & digital to analog conversion
L (samples)         discrete-time signal length
L_i (samples)       discrete-time inverse filter length
L_d (samples)       discrete-time inversion length
Q (bit)             discrete-time signal quantization resolution
g (dB)              amplification gain
x_s, x_r            source and receiver position coordinates inside the room
θ (°), φ (°)        source horizontal and vertical angle (with respect to acoustic axis)
T (°C)              ambient room temperature
h(n)                measured discrete-time room-path, loudspeaker & microphone impulse response
h_L(t)              loudspeaker impulse response
h_R(t)              room-path impulse response
h_M(t)              microphone impulse response
h_P(t)              headphone impulse response
n_m(t)              ambient room acoustic noise during response measurement
n_i(t)              ambient room acoustic noise during dereverberation test
d_i(n)              discrete-time room-path, loudspeaker & microphone impulse response, post-filtered by response inverse
d̂_i(n)              discrete-time room-path, loudspeaker & microphone impulse response, using pre-filtering by response inverse
s(n), s(t)          discrete and continuous time audio test signals


s_f(n), s_f(t), s'_f(t)                       audio signals, pre-filtered by response inverse
s_te(n), s_te(t), s'_te(t)                    test excitation signals
s_fte(n), s_fte(t), s'_fte(t)                 test excitation signal, pre-filtered by response inverse
r_t(n)                                        audio signal, reverberated by measured response
s_i(n), s_i(t), s'_i(t), s'_ih(t)             reverberated audio signals, post-filtered by response inverse
s_ta(t)                                       acoustic test excitation signal, after loudspeaker
s_fta(t)                                      acoustic test excitation signal, after loudspeaker, pre-filtered by response inverse
s_fa(t)                                       acoustic audio signal, after loudspeaker, pre-filtered by response inverse
r_ta(t)                                       acoustic test excitation signal response, after loudspeaker, including room-path distortions
r_fa(t)                                       acoustic audio signal, after loudspeaker, including room-path distortions, pre-filtered by response inverse
r_te(t), r'_te(t), r'_te(n)                   received test excitation response signals
r_fta(t), r_fte(t), r'_fte(t), r'_fte(n)      received test excitation response signals, pre-filtered by response inverse

Table 2


Table captions

Table 1: Sound Pressure Level (L_p) and Reverberation Time (RT) versus frequency for the laboratory room employed for the real-time tests.
Table 2: List of symbols employed for room response measurement, inversion analysis and audio signal dereverberation.

Figure captions

Figure 1: Room impulse response measurement.
Figure 2: Measurements corresponding to the test laboratory room: (a) impulse response (energy), (b) magnitude spectrum level (in dB-SPL), together with background acoustic noise.
Figure 3: Off-line room impulse response inversion.
Figure 4: Real-time room impulse response inversion.
Figure 5: Off-line audio signal dereverberation.
Figure 6: Real-time audio signal dereverberation.
Figure 7: Comparison of Complex Smoothed vs original room response: (a) time domain (energy), (b) frequency domain (magnitude spectrum).


Figure 8: Typical off-line dereverberation results, using Complex Smoothing: (a) time domain (energy), (b) frequency domain (magnitude spectrum).
Figure 9: Results for real-time and off-line room response inversion, as a function of different inverse filter lengths: (a) and (b) inverted impulse response (energy) and magnitude spectrum for 1K filter length, (c) and (d) inverted impulse response (energy) and magnitude spectrum for 512K filter length. Note the expected increased pre-ringing in (c) compared to (a), due to the long acausal filter component.
Figure 10: Time and frequency domain results for real-time and off-line dereverberation tests for ideal and Complex Smoothing-based inversion, as a function of inverse filter length: (a) dereverberation error energy (see equation (7)), (b) dereverberation spectral deviation (see equation (9)).
Figure 11: Time and frequency domain results for real-time vs off-line dereverberation discrepancy error for ideal and Complex Smoothing-based inversion, as a function of inverse filter length: (a) dereverberation discrepancy error energy (see equation (8)), (b) dereverberation discrepancy spectral deviation (see equation (10)).
Figure 12: Example of a notch system response at the frequency of 1 kHz: (a) discrete-frequency magnitude spectrum of the noiseless system as a function of different DFT length L, (b) inverse filter magnitude spectrum compensating for the noisy notch system response as a function of different inverse filter length L_i.
Figure 13: Mismatched equalized magnitude response for the notch system of Figure 12, as a function of different inverse filter length L_i: (a) noisy system inverse filtering, (b) inverse filtering after 0.1 % shifting of the notch frequency.
Figure 14: Example of a Complex Smoothing-based notch response inversion (notch frequency of 1 kHz): (a) original (noisy) and Complex Smoothed spectra, (b) result of inversion by using Complex Smoothed inverse filters of different length L_i.


P.D. Hatziantoniou and J.N. Mourjopoulos, “Errors in Real-Time Room Acoustics Dereverberation”

Figure 1


Figure 2(a): [plot: Impulse Response (dB) versus Time (msec)]
Figure 2(b): [plot: Magnitude Response (dB-SPL) versus log Frequency (Hz); curves: Room Response, Background Noise]


Figure 3
Figure 4
Figure 5
Figure 6


Figure 7(a): [plot: Time Energy (dB) versus Time (msec); curves: Original, Complex Smoothed]
Figure 7(b): [plot: Magnitude (dB) versus log Frequency (Hz); curves: Original, Complex Smoothed]


Figure 8(a): [plot: Time Energy (dB) versus Time (msec); curves: Original, Dereverberated]
Figure 8(b): [plot: Magnitude (dB) versus log Frequency (Hz); curves: Original, Dereverberated]


Figure 9(a): [plot: Impulse Response (dB) versus Time (msec); curves: Real-Time, Off-Line]
Figure 9(b): [plot: Magnitude Response (dB) versus log Frequency (Hz); curves: Real-Time, Off-Line]


Figure 9(c): [plot: Impulse Response (dB) versus Time (msec); curves: Real-Time, Off-Line]
Figure 9(d): [plot: Magnitude Response (dB) versus log Frequency (Hz); curves: Real-Time, Off-Line]


Figure 10(a): [plot: Dereverberation Error Energy (dB) versus log2 Filter Length (samples); curves: Ideal (Off-Line), Ideal (Real-Time), Complex Smoothed (Off-Line), Complex Smoothed (Real-Time)]
Figure 10(b): [plot: Dereverberation Spectral Deviation (dB) versus log2 Filter Length (samples); curves: Ideal (Off-Line), Ideal (Real-Time), Complex Smoothed (Off-Line), Complex Smoothed (Real-Time)]


Figure 11(a): [plot: Dereverberation Discrepancy Error Energy (dB) versus log2 Filter Length (samples); curves: Ideal, Complex Smoothed]
Figure 11(b): [plot: Dereverberation Discrepancy Spectral Deviation (dB) versus log2 Filter Length (samples); curves: Ideal, Complex Smoothed]


Figure 12(a): [plot: Magnitude (dB) versus Frequency (Hz); curves: L = 1K, L = 8K, L = 64K, Background Noise]
Figure 12(b): [plot: Magnitude (dB) versus Frequency (Hz); curves: Noiseless Notch, Background Noise, L_i = 1K, L_i = 8K, L_i = 64K]


Figure 13(a): [plot: Magnitude (dB) versus Frequency (Hz); curves: L_i = 1K, L_i = 8K, L_i = 64K, Background Noise, Notch Response]
Figure 13(b): [plot: Magnitude (dB) versus Frequency (Hz); curves: L_i = 1K, L_i = 8K, L_i = 64K, Notch Response]


Figure 14(a): [plot: Magnitude (dB) versus Frequency (Hz); curves: Original (noisy), Smoothed]
Figure 14(b): [plot: Magnitude (dB) versus Frequency (Hz); curves: Original (noisy), L_i = 1K, L_i = 8K, L_i = 64K]
