php|architect's Guide to Web Scraping with PHP - Wind Business ...
PCRE Extension
POSIX Extended Regular Expressions
Many PHP developers cut their teeth on regular expressions using the POSIX regular
expression extension, also called the ereg extension. The functions from this extension
were deprecated in PHP 5.3 (and removed in PHP 7.0) in favor of those in the PCRE
extension, which are faster and provide a more powerful feature set. Aside from
differences in syntax for some special character ranges, most ereg expressions require
only the addition of expression delimiters to work with preg functions.
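As a rough illustration of that conversion, the sketch below wraps an ereg-style pattern in delimiters so it can be passed to preg_match(). The pattern and the $username value are hypothetical, and the ereg() call appears only in a comment because that function is not available in PHP 7.0 and later.

```php
<?php
// Hypothetical input, used only for illustration.
$username = 'jsmith42';

// POSIX (ereg) form, left as a comment because ereg() no longer exists
// in current PHP versions:
//   $valid = ereg('^[a-z0-9]+$', $username);

// PCRE form: the same expression with / added on either side as the delimiter.
$valid = (preg_match('/^[a-z0-9]+$/', $username) === 1);

var_dump($valid);
```

The expression itself is unchanged here; only the surrounding delimiters differ between the two calls.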
Pattern Basics
Let's start with something simple: detection of a substring anywhere within a string.
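A minimal sketch of such a check, written once with strpos() and once with preg_match(). The $string value and the substring foo are assumptions chosen for illustration.

```php
<?php
// Hypothetical sample data, chosen only for illustration.
$string = 'a string containing foo somewhere';

// Standard string function: strpos() returns an offset, or false if absent.
$presentStr = (strpos($string, 'foo') !== false);

// Regular expression: preg_match() returns 1 when the pattern matches.
$presentPreg = (preg_match('/foo/', $string) === 1);

var_dump($presentStr, $presentPreg);
```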
Notice that the pattern in the preg_match() call is fairly similar to the string used in
the strpos() call. In the former, / is used on either side of the pattern to indicate its
beginning and end. The first character in the pattern string is considered to be the
pattern delimiter and can be almost any character you specify (anything other than a
letter, digit, backslash, or whitespace). When choosing what you want to use for this
character (/ is the most common choice), bear in mind that you will have to escape it
(covered in the Escaping section later) if you use it within the pattern. This will make
more sense a little later in the chapter.
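To make the delimiter trade-off concrete, the sketch below matches the same hypothetical URL three ways: with / as the delimiter and escaped slashes, with # as the delimiter, and with preg_quote() doing the escaping. The URL and variable names are assumptions. preg_quote() is a standard PCRE function that this passage does not itself discuss.

```php
<?php
// Hypothetical URL, used only for illustration.
$url = 'http://example.com/path/to/page';

// With / as the delimiter, every literal / inside the pattern must be escaped.
$withSlash = preg_match('/^http:\/\/example\.com\//', $url);

// With # as the delimiter, the slashes can be written as-is.
$withHash = preg_match('#^http://example\.com/#', $url);

// preg_quote() escapes regex metacharacters in literal text; its second
// argument names the delimiter so that it gets escaped as well.
$literal = preg_quote('example.com/path', '/');
$quoted = preg_match('/' . $literal . '/', $url);

var_dump($withSlash, $withHash, $quoted);
```

Picking a delimiter that does not occur in the pattern, as in the second call, tends to keep the expression easier to read.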
A difference between the two functions used in this example is that strpos() returns
the location of the substring within the string, beginning at 0, or false if the
substring is not contained within the string. This requires the use of the === operator
to tell the difference between the substring being matched at the beginning of
the string or not at all. By contrast, preg_match() returns the number of matches it
found (or false if an error occurs). This number will be either 0 or 1, since
preg_match() stops searching once it finds a match.
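The difference in return values can be sketched as follows. The sample string is hypothetical, and preg_match_all() is included only as a contrast to show that preg_match() stops at the first match; it is not part of the example discussed above.

```php
<?php
// Hypothetical string in which the substring sits at offset 0.
$string = 'foo bar';

$pos     = strpos($string, 'foo'); // int 0: found at the very start
$missing = strpos($string, 'baz'); // false: not found at all

// A loose comparison cannot tell the two cases apart, because 0 == false.
$looseLooksMissing = ($pos == false);
// A strict comparison can.
$strictFound = ($pos !== false);

// preg_match() stops after the first match, so it reports 1 here even
// though the letter o occurs twice in the string.
$first = preg_match('/o/', $string);
$none  = preg_match('/baz/', $string);

// For contrast, preg_match_all() keeps searching and counts every match.
$all = preg_match_all('/o/', $string, $matches);

var_dump($pos, $missing, $looseLooksMissing, $strictFound, $first, $none, $all);
```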