LIBRO DE ACTAS (pdf) - Universidad de Sevilla


Melodic Transcription of Flamenco Singing from Monophonic and Polyphonic Music Recordings

$$L_P = L_{N_0} \cdot \prod_{i \geq 1} L_{N_i} \cdot L_{N_{i-1},N_i}$$

Figure 2: The matrix M used by the short note segmentation process (axes: frames and note nominal pitch, from $c_{min}$ to $c_{max}$). The figure illustrates how the best path for the node with frame $k$ and note $j$ is determined. All possible note durations between $n_{min}$ and $n_{max}$ are considered, as well as all possible jumps to previous notes. In this example, the values subscripted "max" are the most likely note duration and the index of the previous note.

In our approach, no particular characteristic is assumed a priori for the sung melody; therefore, all possible note jumps have the same likelihood $L_{N_{i-1},N_i}$. On the other hand, the likelihood $L_{N_i}$ of a note $N_i$ is determined as the product of several likelihood functions based on the following criteria: duration ($L_{dur}$), fundamental frequency ($L_{pitch}$), existence of voiced and unvoiced frames ($L_{voicing}$), and low-level features related to stability ($L_{stability}$). For a note $N_i$, its likelihood is computed as

$$L_{N_i} = L_{dur} \cdot L_{pitch} \cdot L_{voicing} \cdot L_{stability}$$

The duration likelihood $L_{dur}$ is set so that it is small for short and for long durations. The pitch likelihood $L_{pitch}$ is defined so that the likelihood is higher the closer the estimated pitch contour values are to the note nominal pitch $c_i$, and vice versa, giving more relevance to frames with lower values of the first derivative of the pitch contour. The voicing likelihood $L_{voicing}$ is defined so that segments with a high percentage of unvoiced frames are unlikely to be a voiced note, while segments with a high percentage of voiced frames are unlikely to be an unvoiced note. Finally, the stability likelihood $L_{stability}$ considers that a voiced note is unlikely to have fast and significant timbre or energy changes in the middle.
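The best-path search over the matrix M described above can be sketched as a small dynamic program. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the Gaussian-like shapes chosen for the duration and pitch likelihoods, the constant voicing and stability terms, and the toy pitch contour are all hypothetical. Log-likelihoods are summed instead of multiplying likelihoods, and because all note jumps are equally likely, the transition term is dropped.

```python
import math
from typing import Callable, List, Tuple

Note = Tuple[int, int, int]  # (start frame, end frame exclusive, pitch index)


def segment_notes(num_frames: int, num_pitches: int, n_min: int, n_max: int,
                  note_likelihood: Callable[[int, int, int], float]) -> List[Note]:
    """Return the most likely note sequence covering frames 0..num_frames."""
    neg_inf = float("-inf")
    # M[k][j]: best log-likelihood of a path whose last note has pitch j
    # and ends right before frame k.
    M = [[neg_inf] * num_pitches for _ in range(num_frames + 1)]
    back = [[(-1, -1)] * num_pitches for _ in range(num_frames + 1)]
    best_prev = [-1] * (num_frames + 1)  # argmax_j M[k][j], -1 if unreachable

    for k in range(n_min, num_frames + 1):
        for j in range(num_pitches):
            for d in range(n_min, min(n_max, k) + 1):
                start = k - d
                if start > 0 and best_prev[start] < 0:
                    continue  # no valid path ends at this frame
                lik = note_likelihood(start, k, j)
                if lik <= 0.0:
                    continue
                # All note jumps are equally likely, so the transition term
                # adds nothing and the best predecessor is the row maximum.
                prev = M[start][best_prev[start]] if start > 0 else 0.0
                score = prev + math.log(lik)
                if score > M[k][j]:
                    M[k][j] = score
                    back[k][j] = (start, best_prev[start])
        top = max(range(num_pitches), key=lambda p: M[k][p])
        if M[k][top] > neg_inf:
            best_prev[k] = top

    # Backtrack from the best node in the last row.
    notes: List[Note] = []
    k, j = num_frames, best_prev[num_frames]
    while k > 0 and j >= 0:
        start, prev_j = back[k][j]
        notes.append((start, k, j))
        k, j = start, prev_j
    return notes[::-1]


contour = [0.1, 0.0, -0.1, 0.0, 2.0, 2.1, 1.9, 2.0]  # toy f0, in semitones
nominal = [0.0, 1.0, 2.0]                             # candidate nominal pitches


def toy_likelihood(start: int, end: int, j: int) -> float:
    """Product of the four note likelihoods, with assumed functional forms."""
    d = end - start
    l_dur = math.exp(-((d - 4) ** 2) / 8.0)  # assumed shape, peaks at 4 frames
    err = sum((contour[t] - nominal[j]) ** 2 for t in range(start, end)) / d
    l_pitch = math.exp(-err / 0.5)           # assumed shape
    l_voicing = l_stability = 1.0            # placeholders for this sketch
    return l_dur * l_pitch * l_voicing * l_stability


notes = segment_notes(len(contour), len(nominal), 2, 4, toy_likelihood)
print(notes)
```

On the toy contour, the search returns two four-frame notes, one at the lowest and one at the highest candidate pitch. Splitting the contour into shorter notes is penalized by the duration term, and any note straddling the pitch jump is penalized by the pitch term.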
Note that the stability criterion is not in contradiction with the typical characteristic of flamenco singing of changing the vowel at ending notes, since those changes are mostly smooth.

2.5 Iterative note consolidation and tuning frequency refinement

In the last step, consecutive notes with the same pitch and a smooth transition are consolidated, the estimated tuning frequency is refined according to the obtained notes, and the note nominal pitch is re-estimated based on the new tuning frequency. This whole process is repeated until there are no more consolidations.

Note consolidation: The notes obtained in the previous step have a limited duration, between $n_{min}$ and $n_{max}$, although longer notes are likely to have been sung. Therefore, it makes sense to consolidate consecutive voiced notes into longer notes if they have the same pitch. However, significant and fast energy or timbre changes around the note connection boundary may be indicative of phonetic changes that are unlikely to happen within a note, and thus may indicate that those consecutive notes are different ones. Thus, consecutive notes are consolidated only if they have the same pitch and the stability measure of their connection falls below a certain threshold.

Tuning frequency refinement: In a previous step, the tuning frequency was estimated from the fundamental frequency contour. However, once notes have been segmented, it may be beneficial to use the note segmentation to refine the tuning frequency. For this purpose, we compute a pitch deviation for each voiced note, and then estimate the new tuning frequency from a one-semitone histogram of weighted note pitch deviations. Weights are determined as a measure of the salience of each note, giving more weight to longer and louder notes.

Figure 3 shows an example of the note transcription of a monophonic music recording. The system transcribes according to an equal-tempered scale, as requested by flamenco experts. This means that, even if the performer is out of tune, we approximate the scale used by a chromatic scale, i.e., mistuning is not transcribed.

Figure 3: Example of the visualization tool for melodic transcription. Audio waveform (top), estimated f0 and pitch (middle) and energy (bottom).

3. Evaluation strategy

3.1 Music collections

For monophonic music recordings, we gathered a music collection of 72 sung excerpts representative of different a cappella singing styles (Tonás). This collection was built in the context of a study on similarity and style classification of flamenco a cappella singing styles. We refer to (Mora et al. 2010) for a comprehensive description of the considered styles and their musical characteristics. All 72 excerpts are monophonic, their average duration is 30 seconds, and there is enough variability for a proper evaluation of our methods, including a variety of singers, recording conditions, presence of percussion, clapping, background voices and noise. The files contain a total of 211047 frames and 2803 notes. In addition, we built a small control data set of pop/jazz a cappella singing, consisting of 5 musical phrases by different singers, recorded in good conditions. This control data set serves to evaluate the difficulty of transcribing flamenco material and to test the algorithms in easier conditions, in order to establish a performance ceiling.
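Returning to the note consolidation and tuning frequency refinement of Section 2.5, one pass of that step can be sketched as follows. This is a minimal illustration under assumptions of our own, not the authors' code: the `Note` fields, the boundary change measure passed in as a plain list, the 5-cent histogram bins, and the salience weight (duration times energy) are all hypothetical choices. The repetition of the step until no more consolidations occur, and the re-estimation of nominal pitches, are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Note:
    start: int         # first frame of the note
    end: int           # one past the last frame
    pitch: int         # nominal pitch, in semitones on the current tuning grid
    mean_cents: float  # measured mean pitch, in cents on the same grid
    energy: float      # mean energy, used here as a loudness proxy


def consolidate(notes: List[Note], boundary_change: List[float],
                threshold: float) -> List[Note]:
    """Merge neighbouring notes with equal pitch and a stable connection.

    boundary_change[i] is an assumed timbre/energy change measure at the
    boundary between notes[i] and notes[i + 1].
    """
    if not notes:
        return []
    merged = [notes[0]]
    for i in range(1, len(notes)):
        prev, cur = merged[-1], notes[i]
        if cur.pitch == prev.pitch and boundary_change[i - 1] < threshold:
            d_prev, d_cur = prev.end - prev.start, cur.end - cur.start
            total = d_prev + d_cur
            # Duration-weighted averages for the merged note.
            merged[-1] = Note(
                prev.start, cur.end, prev.pitch,
                (prev.mean_cents * d_prev + cur.mean_cents * d_cur) / total,
                (prev.energy * d_prev + cur.energy * d_cur) / total,
            )
        else:
            merged.append(cur)
    return merged


def tuning_offset_cents(notes: List[Note], bin_cents: float = 5.0) -> float:
    """Peak of a one-semitone histogram of weighted note pitch deviations."""
    n_bins = int(round(100.0 / bin_cents))
    hist = [0.0] * n_bins
    for n in notes:
        dev = (n.mean_cents + 50.0) % 100.0 - 50.0  # deviation in [-50, 50)
        weight = (n.end - n.start) * n.energy       # assumed salience measure
        hist[min(int((dev + 50.0) / bin_cents), n_bins - 1)] += weight
    peak = max(range(n_bins), key=lambda b: hist[b])
    return -50.0 + (peak + 0.5) * bin_cents         # centre of the peak bin


def refine_tuning(tuning_hz: float, offset_cents: float) -> float:
    """Shift the tuning frequency by the estimated offset."""
    return tuning_hz * 2.0 ** (offset_cents / 1200.0)


raw = [
    Note(0, 10, 60, 6012.0, 1.0),
    Note(10, 25, 60, 6013.0, 1.0),
    Note(25, 30, 62, 6211.0, 0.8),
    Note(30, 32, 64, 6360.0, 0.1),  # short, quiet, badly tuned note
]
merged = consolidate(raw, boundary_change=[0.1, 0.2, 0.9], threshold=0.5)
offset = tuning_offset_cents(merged)
new_tuning = refine_tuning(440.0, offset)
print(len(merged), offset, new_tuning)
```

In this toy input the first two notes share a pitch and a stable boundary, so they are merged. The long, loud notes dominate the histogram, so the short mistuned note has almost no influence on the estimated offset.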
For polyphonic music recordings, we gathered approximately 20 minutes of music, consisting of 20 excerpts of singing voice with guitar accompaniment (Fandango style). This collection was also built in the context of the COFLA project (http://mtg.upf.edu/research/projects/cofla). It, too, contains a variety of male and female singers and recording conditions. The average duration of the analyzed samples is 57 seconds, and the files contain a total of 189454 frames and 1680 notes.

3.2 Ground truth gathering

We collected manual annotations from a musician with limited knowledge of flamenco music, so that no implicit knowledge was applied in the transcription process. In order to gather manual annotations, we provided a user interface for visualizing the waveform and the fundamental frequency in cents (in a piano roll), as shown in Figure 3. Since transcribing everything from
