php|architect's Guide to Web Scraping with PHP - Wind Business ...
CSS Selector Libraries
• [href] matches all nodes that have an attribute node with the name href.
• [href="/home"] matches all nodes with an attribute node named href that has a value of "/home".
• [href!="/home"] matches all nodes with an attribute node named href whose value is not "/home".
• [href^="/"] matches all nodes with an attribute node named href whose value starts with "/".
• [href$="-us"] matches all nodes with an attribute node named href whose value ends with "-us".
• [href*="-us"] matches all nodes with an attribute node named href whose value contains "-us" anywhere within it.
• [src*="ad"][alt^="Advertisement"] matches all nodes that have both an attribute node named src with a value containing "ad" and an attribute node named alt with a value starting with "Advertisement".
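The matching rules in the list above can be sketched in plain PHP. This is only an illustration of the semantics, not the implementation of any selector library: `attributeMatches()` is a hypothetical helper, and the sample attribute values are made up. It follows the list in requiring the attribute to exist for every operator, including `!=` (which is a library extension and not part of the W3C Selectors specification).

```php
<?php
// Hypothetical helper (not part of any library): applies one attribute
// filter. $value is the attribute's value, or null if it is absent.
function attributeMatches(?string $value, string $operator, string $expected = ''): bool
{
    if ($value === null) {
        return false; // every filter in the list requires the attribute to exist
    }
    if ($operator === '') {
        return true;  // [href]
    }
    // In standard CSS, substring operators with an empty value match nothing.
    if (in_array($operator, ['^=', '$=', '*='], true) && $expected === '') {
        return false;
    }
    switch ($operator) {
        case '=':  // [href="/home"]
            return $value === $expected;
        case '!=': // [href!="/home"]
            return $value !== $expected;
        case '^=': // [href^="/"]
            return strncmp($value, $expected, strlen($expected)) === 0;
        case '$=': // [href$="-us"]
            return substr($value, -strlen($expected)) === $expected;
        case '*=': // [href*="-us"]
            return strpos($value, $expected) !== false;
    }
    throw new InvalidArgumentException("Unknown operator: $operator");
}

// Multiple filters are combined with a logical AND, as in
// [src*="ad"][alt^="Advertisement"]. Sample attributes are invented.
$attributes = ['src' => '/ads/banner.png', 'alt' => 'Advertisement: sale'];
$isAd = attributeMatches($attributes['src'] ?? null, '*=', 'ad')
     && attributeMatches($attributes['alt'] ?? null, '^=', 'Advertisement');
```

Passing `null` for a missing attribute keeps the "has attribute" check in one place, so each operator branch only has to compare strings.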
Selector                                      CSS                                 XPath
has attribute                                 [href]                              //*[@href]
has attribute value                           [href="/home"]                      //*[@href="/home"]
has different attribute value                 [href!="/home"]                     //*[@href!="/home"]
has attribute value starting with substring   [href^="/"]                         //*[starts-with(@href, "/")]
has attribute value ending with substring     [href$="-us"]                       //*[ends-with(@href, "-us")]
has attribute value containing substring      [href*="-us"]                       //*[contains(@href, "-us")]
multiple attribute filters                    [src*="ad"][alt^="Advertisement"]   //*[contains(@src, "ad") and starts-with(@alt, "Advertisement")]
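As a rough check of the XPath column, the expressions can be run through PHP's DOM extension. The sketch below assumes a small made-up HTML fragment, and `countMatches()` is a hypothetical helper. One caveat: `ends-with()` is an XPath 2.0 function, while `DOMXPath` implements XPath 1.0, so the example emulates it with `substring()` and `string-length()`.

```php
<?php
// Made-up sample markup for illustration only.
$html = <<<'HTML'
<html><body>
  <a href="/home">Home</a>
  <a href="/about-us">About</a>
  <a href="http://example.com/contact-us/form">Contact</a>
  <img src="/ads/banner.png" alt="Advertisement: sale">
  <img src="/img/logo.png" alt="Logo">
</body></html>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Hypothetical helper: number of nodes matched by an XPath expression.
function countMatches(DOMXPath $xpath, string $expression): int
{
    return $xpath->query($expression)->length;
}

$hasHref    = countMatches($xpath, '//*[@href]');
$isHome     = countMatches($xpath, '//*[@href="/home"]');
$notHome    = countMatches($xpath, '//*[@href!="/home"]');
$startsWith = countMatches($xpath, '//*[starts-with(@href, "/")]');
$contains   = countMatches($xpath, '//*[contains(@href, "-us")]');

// XPath 1.0 stand-in for ends-with(@href, "-us"): compare the last
// string-length("-us") characters of the attribute value.
$endsWith = countMatches(
    $xpath,
    '//*[substring(@href, string-length(@href) - string-length("-us") + 1) = "-us"]'
);

// Multiple attribute filters are joined with "and" inside one predicate.
$ads = countMatches(
    $xpath,
    '//*[contains(@src, "ad") and starts-with(@alt, "Advertisement")]'
);
```

Note that `//*[@href!="/home"]` only matches elements that actually carry an `href` attribute, since the comparison is false for an empty node set.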