Unix Power Tools

The ampersand makes it possible to reference the entire match in the replacement string.

In the next example, the backslash is used to escape the ampersand, which appears literally in the replacement section:

s/ORA/O'Reilly \& Associates, Inc./g

It’s easy to forget about the ampersand appearing literally in the replacement string. If we had not escaped it in this example, the output would have been:

O'Reilly ORA Associates, Inc.
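The difference between the escaped and unescaped ampersand can be checked with a short sketch. The input line and the variable names (`line`, `escaped`, `unescaped`, `wrapped`) are hypothetical and chosen only for illustration; the substitution itself is the one shown above.

```shell
# Hypothetical sample input; not taken from the original article.
line='ORA publishes Unix books.'

# Escaped ampersand: \& puts a literal "&" into the replacement.
escaped=$(printf '%s\n' "$line" | sed "s/ORA/O'Reilly \& Associates, Inc./g")

# Unescaped ampersand: & is replaced by the whole match ("ORA").
unescaped=$(printf '%s\n' "$line" | sed "s/ORA/O'Reilly & Associates, Inc./g")

# Deliberate use of &: wrap the matched text in parentheses.
wrapped=$(printf '%s\n' "$line" | sed 's/ORA/(&)/')

printf '%s\n' "$escaped" "$unescaped" "$wrapped"
```

With this input, the three lines printed should be `O'Reilly & Associates, Inc. publishes Unix books.`, `O'Reilly ORA Associates, Inc. publishes Unix books.`, and `(ORA) publishes Unix books.`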

—DD<br />

34.11 Referencing Portions of a Search String

In sed, the substitution command provides metacharacters to select any individual portion of a string that is matched and recall it in the replacement string. A pair of escaped parentheses is used in sed to enclose any part of a regular expression and save it for recall. Up to nine “saves” are permitted for a single line. \n is used to recall the portion of the match that was saved, where n is a number from 1 to 9 referencing a particular “saved” string in order of use. (Article 32.13 has more information.)
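As a minimal sketch of saving and recalling, the following command swaps two saved strings. The input `Smith, Jane` and the variable names are hypothetical; only the `\(...\)` and `\n` mechanics come from the text above.

```shell
# Hypothetical input: a "last, first" name to be reordered.
name='Smith, Jane'

# \(...\) saves each word; \2 and \1 recall them in reverse order.
swapped=$(printf '%s\n' "$name" | sed 's/\([A-Za-z]*\), \([A-Za-z]*\)/\2 \1/')

printf '%s\n' "$swapped"
```

Here the first pair of parentheses saves `Smith` as \1 and the second saves `Jane` as \2, so the command should print `Jane Smith`.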

For example, when converting a plain-text document into HTML, we could convert section numbers that appear in a cross-reference into an HTML hyperlink. The following expression is broken onto two lines for printing, but you should type all of it on one line:

s/\([sS]ee \)\(Section \)\([1-9][0-9]*\)\.\([1-9][0-9]*\)/
\1<a href="#SEC-\3_\4">\2\3.\4<\/a>/

Four pairs of escaped parentheses are specified. String 1 captures the word see with an upper- or lowercase s. String 2 captures the word Section (because this is a fixed string, it could have been simply retyped in the replacement string). String 3 captures the part of the section number before the decimal point, and String 4 captures the part of the section number after the decimal point. The replacement string recalls the first saved substring as \1. Next it starts a link in which the two parts of the section number, \3 and \4, are separated by an underscore (_) and preceded by the string SEC-. Finally, the link text replays the section number, this time with a decimal point between its parts. Note that although a dot (.) is special in the search pattern and has to be quoted with a backslash there, it’s not special on the replacement side and can be typed literally.
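The substitution can be tried on a single line with plain sed. This is a sketch under two assumptions: the link is written as an `<a href="#SEC-...">` anchor, which matches the description above but is not confirmed by it, and the test sentence is hypothetical rather than taken from the book's test document.

```shell
# Hypothetical test line; the book's own test document is not reproduced here.
text='For details, see Section 12.3 of this manual.'

# Assumed link markup: <a href="#SEC-major_minor">Section major.minor</a>.
# The slash in </a> is escaped because / is the s command's delimiter.
linked=$(printf '%s\n' "$text" |
  sed 's/\([sS]ee \)\(Section \)\([1-9][0-9]*\)\.\([1-9][0-9]*\)/\1<a href="#SEC-\3_\4">\2\3.\4<\/a>/')

printf '%s\n' "$linked"
```

With this input the command should print `For details, see <a href="#SEC-12_3">Section 12.3</a> of this manual.`, with the underscore in the link target and the dot in the link text.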

Here’s the script run on a short test document, using checksed (34.4):

Copyright © 2009 O’Reilly & Associates, Inc. All rights reserved.
