21.07.2015 Views

GAWK: Effective AWK Programming


write two backslashes. Therefore, write ‘\\&’ in a string constant to include a literal ‘&’ in the replacement. For example, the following shows how to replace the first ‘|’ on each line with an ‘&’:

    { sub(/\|/, "\\&"); print }

As mentioned, the third argument to sub must be a variable, field or array reference. Some versions of awk allow the third argument to be an expression that is not an lvalue. In such a case, sub still searches for the pattern and returns zero or one, but the result of the substitution (if any) is thrown away because there is no place to put it. Such versions of awk accept expressions such as the following:

    sub(/USA/, "United States", "the USA and Canada")

For historical compatibility, gawk accepts erroneous code, such as in the previous example. However, using any other nonchangeable object as the third parameter causes a fatal error and your program will not run.

Finally, if the regexp is not a regexp constant, it is converted into a string, and then the value of that string is treated as the regexp to match.

gsub(regexp, replacement [, target])

This is similar to the sub function, except gsub replaces all of the longest, leftmost, nonoverlapping matching substrings it can find. The ‘g’ in gsub stands for “global,” which means replace everywhere. For example:

    { gsub(/Britain/, "United Kingdom"); print }

replaces all occurrences of the string ‘Britain’ with ‘United Kingdom’ for all input records.

The gsub function returns the number of substitutions made. If the variable to search and alter (target) is omitted, then the entire input record ($0) is used. As in sub, the characters ‘&’ and ‘\’ are special, and the third argument must be assignable.

gensub(regexp, replacement, how [, target]) #

gensub is a general substitution function. Like sub and gsub, it searches the target string target for matches of the regular expression regexp. Unlike sub and gsub, the modified string is returned as the result of the function and the original target string is not changed. If how is a string beginning with ‘g’ or ‘G’, then it replaces all matches of regexp with replacement. Otherwise, how is treated as a number that indicates which match of regexp to replace. If no target is supplied, $0 is used.

gensub provides an additional feature that is not available in sub or gsub: the ability to specify components of a regexp in the replacement text. This is done by using parentheses in the regexp to mark the components and then specifying ‘\N’ in the replacement text, where N is a digit from 1 to 9. For example:

    $ gawk '
    > BEGIN {
    >     a = "abc def"
    >     b = gensub(/(.+) (.+)/, "\\2 \\1", "g", a)
    >     print b
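The differences among sub, gsub, and gensub described in this section can be checked with a short shell session. The following is a minimal sketch, not taken from the book: the sample input strings and the shell variable names (sub_out, gsub_out) are made up for illustration. It assumes a POSIX-style awk is available as awk, and it runs the gensub part only if gawk is installed, because gensub is a gawk extension.

```shell
#!/bin/sh
# Illustrative sketch: the inputs and variable names are hypothetical.

# sub: "\\&" in the awk string constant becomes a literal '&' in the
# replacement, and only the first '|' on the line is replaced.
sub_out=$(printf 'a|b|c\n' | awk '{ sub(/\|/, "\\&"); print }')
echo "$sub_out"

# gsub: every match in $0 is replaced, and the return value is the
# number of substitutions made.
gsub_out=$(printf 'Britain and Britain\n' |
    awk '{ n = gsub(/Britain/, "United Kingdom"); print n ": " $0 }')
echo "$gsub_out"

# gensub: returns the modified string and leaves the target unchanged.
# It is gawk-specific, so skip it when gawk is not installed.
if command -v gawk >/dev/null 2>&1; then
    gawk 'BEGIN {
        a = "abc def"
        b = gensub(/(.+) (.+)/, "\\2 \\1", "g", a)
        print b    # the two parenthesized components, swapped
        print a    # the original string, still intact
    }'
fi
```

The first command should print ‘a&b|c’ and the second ‘2: United Kingdom and United Kingdom’. Capturing the gsub return value in n is one way to confirm how many substitutions took place; printing a after the gensub call shows that the target is not modified.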
