
GAWK: Effective AWK Programming


If an awk program has only a BEGIN rule and no other rules, then the program exits after the BEGIN rule is run.[1] However, if an END rule exists, then the input is read, even if there are no other rules in the program. This is necessary in case the END rule checks the FNR and NR variables.

6.1.4.2 Input/Output from BEGIN and END Rules

There are several (sometimes subtle) points to remember when doing I/O from a BEGIN or END rule. The first has to do with the value of $0 in a BEGIN rule. Because BEGIN rules are executed before any input is read, there simply is no input record, and therefore no fields, when executing BEGIN rules. References to $0 and the fields yield a null string or zero, depending upon the context. One way to give $0 a real value is to execute a getline command without a variable (see Section 3.8 [Explicit Input with getline], page 52). Another way is simply to assign a value to $0.

The second point is similar to the first but from the other direction. Traditionally, due largely to implementation issues, $0 and NF were undefined inside an END rule. The POSIX standard specifies that NF is available in an END rule. It contains the number of fields from the last input record. Most probably due to an oversight, the standard does not say that $0 is also preserved, although logically one would think that it should be. In fact, gawk does preserve the value of $0 for use in END rules. Be aware, however, that Unix awk, and possibly other implementations, do not.

The third point follows from the first two. The meaning of ‘print’ inside a BEGIN or END rule is the same as always: ‘print $0’. If $0 is the null string, then this prints an empty line. Many long-time awk programmers use an unadorned ‘print’ in BEGIN and END rules, to mean ‘print ""’, relying on $0 being null.
Although one might generally get away with this in BEGIN rules, it is a very bad idea in END rules, at least in gawk. It is also poor style, since if an empty line is needed in the output, the program should print one explicitly.

Finally, the next and nextfile statements are not allowed in a BEGIN rule, because the implicit read-a-record-and-match-against-the-rules loop has not started yet. Similarly, those statements are not valid in an END rule, since all the input has been read. (See Section 6.4.8 [The next Statement], page 108, and see Section 6.4.9 [Using gawk’s nextfile Statement], page 109.)

6.1.5 The Empty Pattern

An empty (i.e., nonexistent) pattern is considered to match every input record. For example, the program:

     awk '{ print $1 }' BBS-list

prints the first field of every record.

6.2 Using Shell Variables in Programs

awk programs are often used as components in larger programs written in shell. For example, it is very common to use a shell variable to hold a pattern that the awk program searches for. There are two ways to get the value of the shell variable into the body of the awk program.

[1] The original version of awk used to keep reading and ignoring input until the end of the file was seen.
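
As a rough illustration of the BEGIN and END points from Section 6.1.4.2, the following sketch runs a few one-line awk programs from a POSIX shell and captures their output. The sample data and the shell variable names (sample, begin_only, and so on) are made up for this example, and it assumes some awk is available on the PATH. The last two commands are shown without a promised result: the text above notes that the value of $0 in an END rule differs between implementations.

```shell
#!/bin/sh
# Hypothetical sample input: two records, with 2 and 3 fields.
sample='one two
three four five'

# A BEGIN-only program exits without reading any input.
begin_only=$(awk 'BEGIN { print "done" }' </dev/null)

# An END rule forces the input to be read, so NR is meaningful there.
end_count=$(printf '%s\n' "$sample" | awk 'END { print NR }')

# In BEGIN there is no record yet: $0 is empty and NF is zero.
begin_empty=$(awk 'BEGIN { print length($0), NF }' </dev/null)

# Assigning to $0 in BEGIN gives it a value and splits it into fields.
begin_assign=$(awk 'BEGIN { $0 = "alpha beta gamma"; print NF, $2 }' </dev/null)

# A plain getline (no variable) in BEGIN reads the first record into $0.
begin_getline=$(printf '%s\n' "$sample" |
    awk 'BEGIN { if ((getline) > 0) print NF, $0 }')

# In END, POSIX makes NF available and gawk also keeps $0; other awks
# may not, so the result of this command is implementation dependent.
end_last=$(printf '%s\n' "$sample" | awk 'END { print NF ": " $0 }')

# When an empty line is wanted, print it explicitly instead of relying
# on a bare print and an empty $0.
blank=$(awk 'BEGIN { print ""; print "after blank" }' </dev/null)

printf 'BEGIN only:        %s\n' "$begin_only"
printf 'NR in END:         %s\n' "$end_count"
printf 'length($0), NF:    %s\n' "$begin_empty"
printf 'after $0 = "...":  %s\n' "$begin_assign"
printf 'after getline:     %s\n' "$begin_getline"
printf 'NF and $0 in END:  %s\n' "$end_last"
printf 'explicit blank:%s\n' "$blank"
```

Capturing each result in its own variable keeps the individual cases separate, so a difference between awk implementations shows up on a single output line rather than changing the whole run.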
