21.07.2015 Views

GAWK: Effective AWK Programming


If you need to replace bits and pieces of a string, combine substr with string concatenation, in the following manner:

    string = "abcdef"
    ...
    string = substr(string, 1, 2) "CDE" substr(string, 6)

tolower(string)
    This returns a copy of string, with each uppercase character in the string replaced with its corresponding lowercase character. Nonalphabetic characters are left unchanged. For example, tolower("MiXeD cAsE 123") returns "mixed case 123".

toupper(string)
    This returns a copy of string, with each lowercase character in the string replaced with its corresponding uppercase character. Nonalphabetic characters are left unchanged. For example, toupper("MiXeD cAsE 123") returns "MIXED CASE 123".

8.1.3.1 More About ‘\’ and ‘&’ with sub, gsub, and gensub

When using sub, gsub, or gensub, and trying to get literal backslashes and ampersands into the replacement text, you need to remember that there are several levels of escape processing going on.

First, there is the lexical level, which is when awk reads your program and builds an internal copy of it that can be executed. Then there is the runtime level, which is when awk actually scans the replacement string to determine what to generate.

At both levels, awk looks for a defined set of characters that can come after a backslash. At the lexical level, it looks for the escape sequences listed in Section 2.2 [Escape Sequences], page 25. Thus, for every ‘\’ that awk processes at the runtime level, type two backslashes at the lexical level. When a character that is not valid for an escape sequence follows the ‘\’, Unix awk and gawk both simply remove the initial ‘\’ and put the next character into the string. Thus, for example, "a\qb" is treated as "aqb".

At the runtime level, the various functions handle sequences of ‘\’ and ‘&’ differently. The situation is (sadly) somewhat complex.
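The substr splice and the case-conversion examples given earlier, along with the rule of doubling backslashes at the lexical level, can be checked with a short script. This is only a sketch: it assumes a POSIX shell with an awk on the PATH, and the shell variable names (spliced, lower, upper, escaped) are made up for illustration.

```shell
#!/bin/sh
# Replace characters 3 through 5 of "abcdef" by splicing with substr
# and string concatenation.
spliced=$(awk 'BEGIN {
    string = "abcdef"
    string = substr(string, 1, 2) "CDE" substr(string, 6)
    print string
}')

# Case conversion leaves nonalphabetic characters (space, digits) alone.
lower=$(awk 'BEGIN { print tolower("MiXeD cAsE 123") }')
upper=$(awk 'BEGIN { print toupper("MiXeD cAsE 123") }')

# Two backslashes in the program text become one backslash in the string.
escaped=$(awk 'BEGIN { print "a\\b" }')

# printf is used instead of echo so the shell does not reinterpret
# the backslash in the last value.
printf '%s\n' "$spliced"   # abCDEf
printf '%s\n' "$lower"     # mixed case 123
printf '%s\n' "$upper"     # MIXED CASE 123
printf '%s\n' "$escaped"   # a\b
```

The awk programs are single-quoted so that the shell passes every backslash through to awk untouched; only awk's own lexical level is in play.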
Historically, the sub and gsub functions treated the two-character sequence ‘\&’ specially; this sequence was replaced in the generated text with a single ‘&’. Any other ‘\’ within the replacement string that did not precede an ‘&’ was passed through unchanged. This is illustrated in Table 8.1.
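The difference between a bare ‘&’ and the sequence ‘\&’ in a replacement string can be observed directly. The following sketch assumes a POSIX shell with an awk on the PATH; the shell variable names (matched, literal) are illustrative only. Note that the replacement is typed as "\\&" in the program text, because the lexical level reduces the two backslashes to the single ‘\’ that gsub sees at runtime.

```shell
#!/bin/sh
# A bare & in the replacement stands for the text that was matched.
matched=$(awk 'BEGIN { s = "hello world"; gsub(/o/, "[&]", s); print s }')

# "\\&" in the program text reaches gsub as \& and produces a literal &.
literal=$(awk 'BEGIN { s = "hello world"; gsub(/o/, "\\&", s); print s }')

printf '%s\n' "$matched"   # hell[o] w[o]rld
printf '%s\n' "$literal"   # hell& w&rld
```

Sequences with more backslashes in front of the ‘&’ are where awk implementations diverge, so they are left out of this sketch.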
