php|architect's Guide to Web Scraping with PHP - Wind Business ...
PCRE Extension
Ranges are defined with respect to ASCII (American Standard Code for Information Interchange).
In other words, the ASCII value for the beginning character must precede
the ASCII value for the ending character. Otherwise, the warning “Warning:
preg_match(): Compilation failed: range out of order in character class at offset n” is
emitted, where n is the character offset within the regular expression.
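The ordering rule can be seen in a minimal sketch like the one below. The subject string 'scrape' and the variable names are illustrative assumptions, and the @ operator is used only to suppress the warning so that the return value can be inspected.

```php
<?php
// Valid range: 'a' (ASCII 97) precedes 'z' (ASCII 122).
$valid = preg_match('/[a-z]/', 'scrape');

// Invalid range: 'z' (ASCII 122) does not precede 'a' (ASCII 97), so the
// pattern fails to compile. Without @, PHP would emit the
// "range out of order in character class" warning described above.
$invalid = @preg_match('/[z-a]/', 'scrape');

var_dump($valid);   // int(1)
var_dump($invalid); // bool(false)
```

Because preg_match() returns false on a compilation failure and 0 on a non-match, comparing the result with === false distinguishes a broken pattern from a pattern that simply did not match.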
Within square brackets, single characters and special ranges are simply listed side
by side with no delimiter, as shown in the second example above. Additionally, the
escape sequences mentioned earlier such as \w can be used both inside and outside
square brackets.
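As a rough illustration of listing ranges and escape sequences side by side, the sketch below uses two made-up subject strings; the patterns and variable names are assumptions chosen for the example rather than taken from the text.

```php
<?php
// Three ranges listed side by side with no delimiter: A-F, a-f and 0-9.
preg_match('/[A-Fa-f0-9]+/', '#1a2B3c', $hex);

// The \w escape sequence used inside square brackets, next to a literal dot.
preg_match('/[\w.]+/', 'file_name.txt extra', $name);

var_dump($hex[0]);  // string(6) "1a2B3c"
var_dump($name[0]); // string(13) "file_name.txt"
```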
ASCII Ranges
For an excellent ASCII lookup table, see http://www.asciitable.com.
There are two other noteworthy points about character ranges, as illustrated in the
examples below.

• To use a literal ] character in a character range, escape it in the same manner
in which other meta-characters are escaped.
• To negate a character range, use ^ as the first character in that character range.
(Yes, this can be confusing since ^ is also used to denote the beginning of a line
or entire string when it is not used inside a character range.) Note that negation
applies to all characters in the range. In other words, a negated character
range means “any character that is not any of these characters.”
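
Both points can be sketched as follows. The subject strings and variable names here are hypothetical and chosen only to make the behavior visible.

```php
<?php
// An escaped ] (and an escaped [) inside a character range is a literal.
preg_match('/[\]\[]+/', 'a[]b', $brackets);

// ^ as the first character negates the range: one or more characters
// that are not any of the digits 0 through 9.
preg_match('/[^0-9]+/', '2024-scrape', $nonDigits);

// Outside a character range, ^ anchors the match to the start of the string.
// The string starts with a digit, so the negated range cannot match there.
$startsWithNonDigit = preg_match('/^[^0-9]/', '2024-scrape');

var_dump($brackets[0]);        // string(2) "[]"
var_dump($nonDigits[0]);       // string(7) "-scrape"
var_dump($startsWithNonDigit); // int(0)
```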