21.07.2015 Views

GAWK: Effective AWK Programming

GAWK: Effective AWK Programming

GAWK: Effective AWK Programming

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

304 <strong>G<strong>AWK</strong></strong>: <strong>Effective</strong> <strong>AWK</strong> <strong>Programming</strong>⊣ 0000051582Ctrl-dThis shows that some values can be represented exactly, whereas others are only approximated.This is not a “bug” in awk, but simply an artifact of how computers representnumbers.Another peculiarity of floating-point numbers on modern systems is that they often havemore than one representation for the number zero! In particular, it is possible to represent“minus zero” as well as regular, or “positive” zero.This example shows that negative and positive zero are distinct values when storedinternally, but that they are in fact equal to each other, as well as to “regular” zero:$ gawk ’BEGIN { mz = -0 ; pz = 0> printf "-0 = %g, +0 = %g, (-0 == +0) -> %d\n", mz, pz, mz == pz> printf "mz == 0 -> %d, pz == 0 -> %d\n", mz == 0, pz == 0> }’⊣ -0 = -0, +0 = 0, (-0 == +0) -> 1⊣ mz == 0 -> 1, pz == 0 -> 1It helps to keep this in mind should you process numeric data that contains negativezero values; the fact that the zero is negative is noted and can affect comparisons.D.3.3 Standards Versus Existing PracticeHistorically, awk has converted any non-numeric looking string to the numeric value zero,when required. Furthermore, the original definition of the language and the original POSIXstandards specified that awk only understands decimal numbers (base 10), and not octal(base 8) or hexadecimal numbers (base 16).As of this writing (February, 2007), changes in the language of the current POSIXstandard can be interpreted to imply that awk should support additional features. Thesefeatures are:• Interpretation of floating point data values specified in hexadecimal notation(‘0xDEADBEEF’). (Note: data values, not source code constants.)• Support for the special IEEE 754 floating point values “Not A Number” (NaN), positiveInfinity (“inf”) and negative Infinity (“−inf”). In particular, the format for these valuesis as specified by the ISO C99 standard, which ignores case and can allow machinedependentadditional characters after the ‘nan’ and allow either ‘inf’ or ‘infinity’.The first problem is that both of these are clear changes to historical practice:• The gawk maintainer feels that hexadecimal floating point values, in particular, is ugly,and was never intended by the original designers to be part of the language.• Allowing completely alphabetic strings to have valid numeric values is also a very severedeparture from historical practice.The second problem is that the gawk maintainer feels that this interpretation of thestandard, which requires a certain amount of “language lawyering” to arrive at in the firstplace, was not intended by the standard developers, either. In other words, “we see howyou got where you are, but we don’t think that that’s where you want to be.”Nevertheless, on systems that support IEEE floating point, it seems reasonable to providesome way to support NaN and Infinity values. The solution implemented in gawk, as ofversion 3.1.6, is as follows:

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!