CCRMA OVERVIEW - CCRMA - Stanford University
CCRMA OVERVIEW - CCRMA - Stanford University
CCRMA OVERVIEW - CCRMA - Stanford University
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
6.3.5 Integrated Online-Offline Methods for Audio Segmentation<br />
Harvey Thornburg<br />
Segmentation refers to the identification of temporally homogeneous regions in a recorded signal. To<br />
this end, we discuss three frameworks concerning segment boundaries: online (sequential) detection,<br />
offline estimation and offline detection. In particular, a method testing for a parameter in a nested<br />
sequence of linear subspaces is developed for offline detection. Two integrated approaches are presented:<br />
the first interfaces online detectors in forward and backward scan to identify regions where change<br />
points may occur, applying offline methods in a refinement stage; the second pursues a local "predictorcorrector"<br />
approach. The latter may be adapted for real-time use with lookahead. Both methods run in<br />
approximately linear time, yielding detection/estimation performance approaching that of pure offline<br />
estimation despite the inherent quadratic cost of the latter. Applications are shown for speech and other<br />
highly transient audio signals.<br />
6.3.6 Rhythm Tracking and Segmentation with Priors: A Unified Approach<br />
Harvey Thornburg<br />
Audio segmentation is a fundamental problem in model-based analysis of sound. Segmentation means<br />
partitioning the time line into intervals over which the model parameters are relatively constant. Within<br />
these intervals, model parameters may vary continuously up to some maximum rate.<br />
In segmentation, a fundamental decision must be made as to whether measured parameter variations<br />
are classified as continuous or discontinuous. Errors in this decision can result in noticeable artifacts<br />
in model-based resynthesis. For example, portamento in the measured pitch of recorded piano tones is<br />
clearly unacceptable.<br />
It is thus important that segmentation be as accurate as possible, and that it make use of all appropriate<br />
constraints when the context is known.<br />
We have presented a segmentation algorithm which exploits global information about the temporal<br />
pattern of parameter changes. For example, in rhythmically predictable music, the history of prior<br />
change events influences the probability of change events at future times. In an offline setting (i.e.,<br />
out of real time), future change events can also be used to estimate the probability of change at a<br />
given point in time. Thus, for example, the segmentation can be carried out iteratively, using past and<br />
future change-point information from the previous iteration to optimally detect change points on the<br />
current iteration. (This is an example of what is sometimes called a batch relaxation method in system<br />
identification.)<br />
To use this global information, we develop a Bayesian segmentation method which makes use of prior<br />
likelihoods concerning the location and possible presence of change points. To propagate these likelihoods,<br />
we develop a rhythm tracker to operate in tandem with the segmentation method. The rhythm<br />
tracker uses a combination of extended Kalman filter and hidden Markov model methods to update prior<br />
likelihoods for future change point events given those previously identified in segmentation.<br />
There are basically two frameworks for segmentation: online and offline. In the online case, little or<br />
no look-ahead is allowed, while in the offline case, all data are available at all times. Even though<br />
the problem setting is usually offline, it is worthwhile to develop online methods in the interest of fast<br />
approximation.<br />
The complete segmentation software system includes three separate optimal Bayes estimators covering<br />
estimation and detection in the offline setting, and detection in the online setting. The Bayesian<br />
formulation of each problem can be described as follows:<br />
1. The problem of offline estimation assumes a fixed data window and a known number of changes:<br />
one finds the joint maximum a-posteriori (MAP) estimate of the change point locations using<br />
location priors.<br />
42