Computational Models of Music Similarity and their ... - OFAI


[...] representation this might not be trivial. For example, some of the similarity measures described in this section (which use features with a frame-level scope) use Monte Carlo sampling or the Kullback-Leibler divergence to compare pieces.

Computational Limits

In general, it is not possible to model every nerve cell in the human auditory system when processing music archives with terabytes of data. Again, the intended application defines the requirements. A similarity measure that runs on a mobile device has different constraints than one that can be run in parallel on a massive server farm. Furthermore, it makes a big difference whether the similarities are computed for a collection of a few hundred pieces or for a catalog of a few million pieces. Finding the optimal trade-off between required resources (including memory and processing time) and quality might not be trivial.

Structure of this Section

This section is structured as follows. The next subsection gives a simple introduction to similarity computations using the Zero Crossing Rate as an example. Subsection 2.2.2 describes how the time domain representation of the audio signals is transformed to the frequency domain. Subsections 2.2.3–2.2.5 describe different features and how they are used to compute similarity. The main focus is on spectral similarity (which is somewhat related to timbre) and Fluctuation Patterns (which are somewhat related to rhythmic properties). Subsection 2.2.6 describes how the different approaches are combined linearly. Subsection 2.2.7 describes anomalies in the similarity space. In particular, the triangle inequality does not always hold, and a few pieces are estimated to be highly similar to a very large number of pieces while others are highly dissimilar to almost all other pieces.

2.2.1 The Basic Idea (ZCR Illustration)

This subsection illustrates the concept of audio-based music similarity using the Zero Crossing Rate (ZCR) as an example.
The ZCR is very simple to compute and has been applied to speech processing to distinguish voiced sections from noise. Furthermore, it has been applied to MIR tasks such as classifying percussive sounds or genres. For example, the winning entry of the MIREX 2005 genre classification contest used the ZCR among other features.[7]

[7] The MIREX contest will be discussed in more detail in Subsection 2.3.1.1.
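As a rough illustration of how simple the ZCR is to compute, the following Python sketch counts sign changes between consecutive samples and normalizes by the duration in milliseconds. It is not the thesis implementation: the function names, the choice to treat samples equal to zero as non-negative, and the synthetic 1500 Hz test tone are all assumptions made for this example.

```python
import math


def zero_crossings(samples):
    """Count sign changes between consecutive samples.

    Assumption: a sample equal to zero is treated as non-negative.
    """
    signs = [s >= 0 for s in samples]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def zcr_per_ms(samples, sample_rate_hz):
    """Zero crossings per millisecond of audio."""
    duration_ms = 1000.0 * len(samples) / sample_rate_hz
    return zero_crossings(samples) / duration_ms


# Hypothetical test signal: one second of a 1500 Hz sine at 44.1 kHz.
# A sine at f Hz crosses zero 2*f times per second, so the result
# should be close to 3 crossings per millisecond.
SAMPLE_RATE_HZ = 44100
tone = [
    math.sin(2 * math.pi * 1500 * n / SAMPLE_RATE_HZ + 0.1)
    for n in range(SAMPLE_RATE_HZ)
]

if __name__ == "__main__":
    print(f"ZCR: {zcr_per_ms(tone, SAMPLE_RATE_HZ):.2f}/ms")
```

On real recordings the samples would come from a decoded audio file rather than a synthetic tone; the counting step stays the same.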
Figure 2.1: Illustration of the ZCR computation using a 5-millisecond audio excerpt (y-axis: amplitude; x-axis: time in ms). The dotted line marks zero amplitude. The 15 circles mark the zero crossings (ZCR: 15/5ms).

Figure 2.2: Audio excerpts with a length of 10 seconds each and corresponding ZCR values. The amplitude is plotted on the y-axis, time on the x-axis.

Dave Brubeck Quartet - Blue Rondo A La Turk: ZCR 1.4/ms
Dave Brubeck Quartet - Kathy's Waltz: ZCR 1.1/ms
Bon Jovi - Bad Medicine: ZCR 3.7/ms
Evanescence - Bring Me To Life: ZCR 2.7/ms
DJs @ Work - Someday: ZCR 1.3/ms
Maurice Ravel - Bolero: ZCR 1.9/ms
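To make the step from a single feature to a similarity ranking concrete, the sketch below ranks the six excerpts of Figure 2.2 by how close their ZCR values are. The ZCR values are taken from the figure; using the absolute difference of two ZCR values as the distance is an assumption made for this illustration, and the names `FIGURE_2_2_ZCR`, `zcr_distance`, and `nearest_neighbors` are hypothetical.

```python
# ZCR values (crossings per millisecond) as reported in Figure 2.2.
FIGURE_2_2_ZCR = {
    "Dave Brubeck Quartet - Blue Rondo A La Turk": 1.4,
    "Dave Brubeck Quartet - Kathy's Waltz": 1.1,
    "Bon Jovi - Bad Medicine": 3.7,
    "Evanescence - Bring Me To Life": 2.7,
    "DJs @ Work - Someday": 1.3,
    "Maurice Ravel - Bolero": 1.9,
}


def zcr_distance(piece_a, piece_b):
    """Assumed distance: absolute difference of the two ZCR values."""
    return abs(FIGURE_2_2_ZCR[piece_a] - FIGURE_2_2_ZCR[piece_b])


def nearest_neighbors(query):
    """All other pieces, ordered from most to least similar to the query."""
    others = [piece for piece in FIGURE_2_2_ZCR if piece != query]
    return sorted(others, key=lambda piece: zcr_distance(query, piece))


if __name__ == "__main__":
    query = "Dave Brubeck Quartet - Blue Rondo A La Turk"
    for piece in nearest_neighbors(query):
        print(f"{zcr_distance(query, piece):.1f}  {piece}")
```

Under this assumed distance, "Someday" ranks closer to "Blue Rondo A La Turk" than "Kathy's Waltz" does, even though the latter is by the same artist.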
