03.02.2014 Views

php|architect's Guide to Web Scraping with PHP - Wind Business ...

Chapter 13

CSS Selector Libraries

This chapter will review several libraries that are built on top of the XML extensions described in previous chapters. These libraries provide interfaces that use CSS selector expressions to query markup documents rather than a programmatic API or XPath expressions. Don’t be concerned if you aren’t familiar with CSS selectors, as part of this chapter showcases basic expressions alongside their XPath equivalents.
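To make the idea of "XPath equivalents" concrete, here is a minimal sketch that uses PHP's DOM extension to run hand-written XPath expressions corresponding to three common CSS selectors. The sample markup and the particular selector-to-XPath mapping are assumptions chosen for illustration. They are not taken from the chapter's own examples, and a selector library may generate different (but equivalent) XPath.

```php
<?php
// Hypothetical markup, used only for this illustration.
$html = '<div id="content"><p class="intro">Hello</p><p>World</p></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// CSS selector => hand-written XPath equivalent (illustrative mapping).
$equivalents = [
    '#content' => '//*[@id="content"]',
    'div > p'  => '//div/p',
    // Class matching must tolerate multiple space-separated class names.
    'p.intro'  => '//p[contains(concat(" ", normalize-space(@class), " "), " intro ")]',
];

// Run each XPath expression and record how many nodes it matches.
$counts = [];
foreach ($equivalents as $css => $expr) {
    $counts[$css] = $xpath->query($expr)->length;
    printf("%-10s %-75s %d match(es)\n", $css, $expr, $counts[$css]);
}
```

Against this sample markup, the three expressions match one, two, and one element respectively. The class selector shows why CSS can be attractive: `p.intro` is far shorter than the XPath needed to match a class name safely.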

CSS Versions

There are multiple versions of the CSS standard and supported selectors vary with each version. This chapter will cover a subset of those available in CSS3. Versions of the CSS standard supported by particular libraries are noted where available. A list of differences between the two common versions, CSS2 and CSS3, can be found at http://www.w3.org/TR/css3-selectors/#changesFromCSS2.

Reason to Use Them

Before getting into the “how” of using CSS selector libraries, it’s probably best to get the “why” (and “why not”) out of the way first. It goes without saying that these libraries add a layer of complexity to applications that use them, introducing another potential point of failure. They implement expression parsers in order to take CSS
