XMP Specification Part 3: Storage in Files - Adobe
There are three important “flavors” of PostScript files that can affect how XMP is written, found, and used. They
are:
• DSC PostScript (or just “PostScript”): PostScript conforming to the DSC conventions defined in Appendix
G of the PostScript Language Reference.
• Raw PostScript: PostScript following no particular structural conventions. The use of raw PostScript is
discouraged. As mentioned in 2.6.2.1.1, “Ordering of content”, a special DSC comment is required to
support fast and reliable location of the main XMP.
• EPS: PostScript conforming to the EPS conventions defined in Appendix H of the PostScript Language
Reference. EPS is a subset of DSC PostScript.
Because of common usage issues, document-level XMP should be written differently for PostScript and EPS.
Object-level XMP is written identically for PostScript and EPS.
The XMP in a PostScript/EPS file must be encoded as UTF-8.
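The UTF-8 requirement can be checked before a packet is embedded. The following is a minimal sketch, not a normative procedure: the function name is_utf8 and the sample string are hypothetical, chosen only for illustration.

```python
def is_utf8(packet_bytes: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        packet_bytes.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Placeholder text standing in for a serialized XMP packet (assumption).
sample = "placeholder XMP content: Café"

utf8_bytes = sample.encode("utf-8")
utf16_bytes = sample.encode("utf-16")  # Python prepends a byte order mark

print(is_utf8(utf8_bytes))
print(is_utf8(utf16_bytes))
```

A clean decode is a necessary condition only. It shows that the bytes are well-formed UTF-8, not that they form a valid XMP packet.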
2.6.2.1 Document-level metadata in PostScript
As with any file format, locating contained XMP in PostScript or EPS is most reliably done by fully processing
the file format. For PostScript, this means executing the PostScript interpreter. Packet scanning is not reliable
whenever a file contains multiple XMP packets, or object XMP without main XMP.
It is often worthwhile to find the main XMP and ignore (at least temporarily) object XMP. Interpretation of the
entire PostScript file to locate the main XMP can be very expensive. A hint and careful ordering are used to
allow a combination of XMP packet scanning and PostScript comment scanning to reliably find the main XMP.
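The limitation of packet scanning alone can be illustrated with a short sketch. It assumes that each packet starts with the byte sequence `<?xpacket begin=`, which comes from the XMP packet wrapper definition and is not stated in this section. The function name find_packet_offsets and the sample data are hypothetical.

```python
# Assumed start-of-packet marker, taken from the XMP packet wrapper
# definition (not from this section).
PACKET_HEADER = b"<?xpacket begin="


def find_packet_offsets(data: bytes) -> list[int]:
    """Return the byte offset of every packet header found in data."""
    offsets = []
    pos = data.find(PACKET_HEADER)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(PACKET_HEADER, pos + 1)
    return offsets


# Hypothetical file content holding two packets.
data = (
    b"%!PS\n"
    b"<?xpacket begin='' id='a'?>A<?xpacket end='w'?>\n"
    b"<?xpacket begin='' id='b'?>B<?xpacket end='w'?>\n"
)

offsets = find_packet_offsets(data)
if len(offsets) > 1:
    # A byte scan cannot tell which packet is the main XMP.
    print("ambiguous: several packets at", offsets)
```

Even when the scan finds exactly one packet, that packet might be object XMP rather than main XMP, so a scan of this kind needs the additional hint described in this section.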
To write document-level metadata in PostScript, an application must:
• Write the %ADO_ContainsXMP comment as described under 2.6.2.1.1, “Ordering of content”.
• Write the XMP packet as described under 2.6.2.1.2, “Document-level XMP in PostScript”.
To write document-level metadata in EPS, an application must:
• Write the %ADO_ContainsXMP comment as described under 2.6.2.1.1, “Ordering of content”.
• Write the XMP packet as described under 2.6.2.1.3, “Document-level XMP in EPS”.
Use of raw PostScript is discouraged specifically because it lacks the %ADO_ContainsXMP comment. If raw
PostScript must be used, the XMP must be embedded as described under 2.6.2.1.2, “Document-level XMP in
PostScript”.
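The three sets of writing rules above can be summarized as a lookup keyed by file flavor. This is only an illustrative sketch: the Flavor enumeration and the writing_steps function are hypothetical names, and the step strings simply restate the section references given above.

```python
from enum import Enum


class Flavor(Enum):
    DSC = "DSC PostScript"
    EPS = "EPS"
    RAW = "Raw PostScript"


MARKER_STEP = "Write the %ADO_ContainsXMP comment (2.6.2.1.1)"
PS_PACKET_STEP = "Write the XMP packet (2.6.2.1.2)"
EPS_PACKET_STEP = "Write the XMP packet (2.6.2.1.3)"


def writing_steps(flavor: Flavor) -> list[str]:
    """Return the required steps, in the order listed in the text."""
    if flavor is Flavor.DSC:
        return [MARKER_STEP, PS_PACKET_STEP]
    if flavor is Flavor.EPS:
        return [MARKER_STEP, EPS_PACKET_STEP]
    # Raw PostScript lacks the marker comment; only the packet is written,
    # following the PostScript rules.
    return [PS_PACKET_STEP]


for flavor in Flavor:
    print(flavor.value, "->", writing_steps(flavor))
```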
2.6.2.1.1 Ordering of content
Many large publications use PostScript extensively. It is common to have very large layouts with hundreds or
thousands of placed EPS files. Because PostScript is text, locating XMP embedded within PostScript in
general requires parsing the entire PostScript program, or at least scanning all of its text. Placed PostScript
files can be quite large. They can even represent compound documents, and might contain multiple XMP
packets. Even for a PostScript file that contains no XMP at all, the entire file would have to be searched to
make that simple determination.
All of this presents performance challenges for layout programs that want to process XMP embedded in
PostScript. As a pragmatic partial solution, a special marker comment can be placed in the PostScript header
comments to provide advice about locating the main XMP. This marker must be before the %%EndComments
line.
The purpose of this marker is to tell applications consuming the PostScript whether a main XMP is present at
all, and how to look for the main XMP. The form of the XMP marker is:
%ADO_ContainsXMP: ...
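A consumer can look for the marker by reading only the header comments and stopping at the %%EndComments line. The sketch below shows one way this might be done. The function name read_xmp_marker is hypothetical, and the value after the colon is returned as an uninterpreted string because the permitted values are not given here; `<value>` in the sample is a placeholder, not a real option.

```python
MARKER = "%ADO_ContainsXMP:"
END_COMMENTS = "%%EndComments"


def read_xmp_marker(lines):
    """Return the raw marker value from the header comments, or None.

    Scanning stops at %%EndComments, since the marker must precede it.
    """
    for line in lines:
        stripped = line.rstrip("\r\n")
        if stripped.startswith(END_COMMENTS):
            return None
        if stripped.startswith(MARKER):
            return stripped[len(MARKER):].strip()
    return None


# Hypothetical header; "<value>" stands in for the elided marker value.
header = [
    "%!PS-Adobe-3.0\n",
    "%ADO_ContainsXMP: <value>\n",
    "%%EndComments\n",
    "showpage\n",
]

print(read_xmp_marker(header))
```

Because the function accepts any iterable of lines, an open text file can be passed directly, and reading stops as soon as the header comments end, so the cost does not depend on the size of the rest of the file.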
44 ©Adobe Systems Incorporated, 2010