Quicktime File Format (2012-08-14).pdf

Audio Priming - Handling Encoder Delay in AAC

This appendix describes temporal positioning of a source audio signal after AAC encoding into a sound track for QuickTime media files. The mechanisms described here are specified in ISO MPEG-4 standards (ISO/IEC 14496-12, 2008) and are used here with additional constraints.

Note on language use

AAC implementations typically represent 1024 PCM audio samples in one AAC packet (synonymous in this context with a QuickTime media sample, and also referred to in ISO documents as an “access unit”). The terms “sample” and “audio sample” in this appendix refer to PCM samples. For the encoded audio data, the terms “AAC packet” and QuickTime “media sample” are used.

Background – AAC Encoding

Because of the nature of the encoding algorithm, AAC requires data beyond the source PCM audio samples in order to encode and decode those samples correctly. AAC encoding uses a transform over consecutive sets of 2048 audio samples, applied every 1024 audio samples (overlapped). For correct audio to be decoded, both transforms for any period of 1024 audio samples are needed. For this reason, encoders add at least 1024 samples of silence before the first ‘true’ audio sample, and often add more. This is variously called “priming”, “priming samples”, or “encoder delay”. Two definitions for use in this discussion:

● Encoder delay is the delay incurred during encoding to produce properly formed, encoded audio packets. It typically refers to the number of silent audio samples (priming samples) added to the front of an AAC encoded bitstream.
● Decoder delay is the number of “pre-roll” audio samples required to reproduce an encoded source audio signal for a given time index. For AAC this number is typically 1024 and is algorithmically based. This is in contrast to encoder delay, which is determined by the encoder and encoding configuration used.
However, decoder delay establishes the minimum encoder delay possible (that is, 1024 for AAC).

The common practice is to propagate the encoder delay in the AAC bitstream. When these audio packets are then decoded back to the PCM domain, the source waveform represented will be offset in its entirety by this encoder delay amount. Since encoded audio packets hold a fixed number of audio samples (for instance, 1024 samples), additional trailing or ‘remainder’ silent samples following the last source sample are required to pad the final audio packet to the required length.

2012-08-14 | © 2004, 2012 Apple Inc. All Rights Reserved.
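
The arithmetic behind priming and remainder samples can be illustrated with a short Python sketch. It is not part of the specification: the names `PacketLayout`, `packet_layout`, and `trim_decoded` are hypothetical, and the sketch assumes 1024 PCM samples per AAC packet, a decoder that returns exactly `packets * 1024` samples, and no further delay added by the decoder itself.

```python
from dataclasses import dataclass

SAMPLES_PER_PACKET = 1024  # PCM samples per AAC packet, as stated in this appendix
MIN_PRIMING = 1024         # decoder delay, the lower bound on encoder delay


@dataclass(frozen=True)
class PacketLayout:
    """Hypothetical container for the sample counts discussed above."""
    priming: int    # silent samples placed before the first source sample
    source: int     # source PCM samples
    remainder: int  # silent samples padding the final packet
    packets: int    # AAC packets needed to hold all of the above


def packet_layout(source_samples: int,
                  priming_samples: int = MIN_PRIMING) -> PacketLayout:
    """Lay out priming, source, and remainder samples in whole packets."""
    if source_samples < 0:
        raise ValueError("source_samples must be non-negative")
    if priming_samples < MIN_PRIMING:
        raise ValueError("encoder delay cannot be less than the decoder delay")
    total = priming_samples + source_samples
    packets = -(-total // SAMPLES_PER_PACKET)  # ceiling division
    remainder = packets * SAMPLES_PER_PACKET - total
    return PacketLayout(priming_samples, source_samples, remainder, packets)


def trim_decoded(decoded, layout: PacketLayout):
    """Drop priming and remainder samples from decoded PCM.

    Assumes `decoded` holds exactly layout.packets * SAMPLES_PER_PACKET samples.
    """
    start = layout.priming
    return decoded[start:start + layout.source]


layout = packet_layout(source_samples=5000)
print(layout)
# PacketLayout(priming=1024, source=5000, remainder=120, packets=6)
```

With 5000 source samples and the minimum priming of 1024, the encoded stream spans 6 packets (6144 samples), of which the last 120 are remainder padding. Slicing the decoded output from index `priming` for `source` samples recovers the source waveform at its original position.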
