03.02.2014

php|architect's Guide to Web Scraping with PHP - Wind Business ...



PCRE Extension


POSIX Extended Regular Expressions

Many PHP developers cut their teeth on regular expressions using the POSIX regular expression extension, also called the ereg extension. The functions from this extension were deprecated in PHP 5.3 (and removed in PHP 7.0) in favor of those in the PCRE extension, which are faster and provide a more powerful feature set. Aside from differences in syntax for some special character ranges, most ereg expressions require only the addition of expression delimiters to work with preg functions.

Pattern Basics

Let’s start with something simple: detection of a substring anywhere within a string.
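The listing that accompanied this passage is not present here. As a minimal sketch of the kind of comparison being discussed, assuming a hypothetical `$string` and the substring `foo` (neither comes from the original text), the two approaches might look like this:

```php
<?php
// Hypothetical input, chosen for illustration only.
$string = 'some food for thought';

// Substring check with strpos(): a return value of false means "not found".
$presentStrpos = (strpos($string, 'foo') !== false);

// The same check with preg_match(): a return value of 1 means "found".
$presentPreg = (preg_match('/foo/', $string) === 1);

var_dump($presentStrpos, $presentPreg);
```

Both variables end up holding the same boolean; the only visible difference in the call is the pair of / characters wrapped around the pattern.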


Notice that the pattern in the preg_match() call is fairly similar to the string used in the strpos() call. In the former, / is used on either side of the pattern to indicate its beginning and end. The first character in the pattern string is treated as the pattern delimiter; it can be almost any character you choose, as long as it is not alphanumeric, a backslash, or whitespace. When choosing this character (/ is the most common choice), bear in mind that you will have to escape it (covered in the Escaping section later) if you use it within the pattern. This will make more sense a little later in the chapter.
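To make the delimiter trade-off concrete, here is a small sketch. The `$url` value and the choice of # as an alternative delimiter are assumptions made for illustration, not examples taken from the original text:

```php
<?php
// Hypothetical input, chosen for illustration only.
$url = 'http://example.com/index.php';

// With / as the delimiter, every literal / inside the pattern must be escaped.
$withSlash = preg_match('/http:\/\//', $url);

// With # as the delimiter, the slashes need no escaping at all.
$withHash = preg_match('#http://#', $url);

// preg_quote() can do the escaping; its second argument names the delimiter
// so that it is escaped along with the regex metacharacters.
$withQuote = preg_match('/' . preg_quote('http://', '/') . '/', $url);

var_dump($withSlash, $withHash, $withQuote);
```

All three calls look for the same literal text. Picking a delimiter that does not occur in the pattern usually keeps it easier to read.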

A difference between the two functions used in this example is that strpos() returns the position of the substring within the string, counting from 0, or false if the substring is not contained within the string. This requires the use of the === (or !==) operator to tell the difference between the substring being matched at the beginning of the string and not being matched at all. By contrast, preg_match() returns the number of matches it found. This will be either 0 or 1, since preg_match() stops searching once it finds a match (it returns false only if an error occurs).
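The return-value difference can be sketched as follows. The input string and variable names are hypothetical, picked so that the substring sits at offset 0 and occurs twice:

```php
<?php
// Hypothetical input: 'foo' appears at offset 0 and again later.
$string = 'foobar foo';

$pos = strpos($string, 'foo');        // 0, a valid position

// Loose comparison treats 0 as false and wrongly reports "not found".
$looseSaysMissing = ($pos == false);

// Strict comparison distinguishes position 0 from false.
$strictSaysMissing = ($pos === false);

// A substring that is genuinely absent yields false.
$absent = strpos($string, 'baz');

// 'foo' occurs twice, but preg_match() stops after the first match.
$count = preg_match('/foo/', $string);

// No match at all gives 0.
$none = preg_match('/baz/', $string);

var_dump($pos, $looseSaysMissing, $strictSaysMissing, $absent, $count, $none);
```

Because preg_match() reports 0 for "no match" and 1 for "match", a plain truthiness check is usually enough, whereas strpos() needs the strict comparison.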
