- Page 1 and 2: php|architect’s Guide to Web Scra
- Page 3: php|architect’s Guide to Web Sc
- Page 7 and 8: vi ” CONTENTS Referring URLs . .
- Page 9 and 10: viii ” CONTENTS HTTP Authenticati
- Page 11: x ” CONTENTS Chapter 14 — PCRE
- Page 15: xiv ” CONTENTS pleted. Each had a
- Page 19: xviii ” CONTENTS Today, there are
- Page 22 and 23: 2 ” Introduction How to Read T
- Page 24 and 25: 4 ” Introduction Appropriate Us
- Page 27 and 28: Chapter 2 HTTP The first task that
- Page 29 and 30: HTTP ” 9 • “HTTP Pocket Refe
- Page 31 and 32: HTTP ” 11 should be used for the
- Page 33 and 34: HTTP ” 13 i URL Encoding One trai
- Page 35 and 36: HTTP ” 15 Headers An all-purpose
- Page 37 and 38: HTTP ” 17 what is sent when the a
- Page 39 and 40: HTTP ” 19 disabling a primary sit
- Page 41 and 42: HTTP ” 21 Digest HTTP Authenticat
- Page 43 and 44: HTTP ” 23 likely “auth”) from
- Page 48 and 49: 28 ” HTTP Streams Wrapper code (
- Page 50 and 51: 30 ” HTTP Streams Wrapper )); )
- Page 52 and 53: 32 ” HTTP Streams Wrapper $conte
- Page 55 and 56: Chapter 4 cURL Extension The cURL P
- Page 57 and 58: cURL Extension ” 37 $data = array
- Page 59 and 60: cURL Extension ” 39 CURLOPT_HTTP
- Page 61 and 62: cURL Extension ” 41 that file and
- Page 63 and 64: cURL Extension ” 43 Byte Ranges
- Page 65 and 66: cURL Extension ” 45 $ch1 = curl_i
- Page 69 and 70: Chapter 5 pecl_http PECL Extension
- Page 71 and 72: pecl_http PECL Extension ” 51 $fi
- Page 73 and 74: pecl_http PECL Extension ” 53
- Page 75 and 76: pecl_http PECL Extension ” 55 No
- Page 77 and 78: pecl_http PECL Extension ” 57 Re
- Page 79: pecl_http PECL Extension ” 59 •
- Page 82 and 83: 62 ” PEAR::HTTP_Client i PEAR Dev
- Page 84 and 85: 64 ” PEAR::HTTP_Client individual
- Page 86 and 87: 66 ” PEAR::HTTP_Client Using the
- Page 88 and 89: 68 ” PEAR::HTTP_Client } ?> } } d
- Page 91 and 92: Chapter 7 Zend_Http_Client Zend Fra
- Page 93 and 94: Zend_Http_Client ” 73 // Returns
- Page 95 and 96: Zend_Http_Client ” 75 // Another
- Page 97 and 98: Zend_Http_Client ” 77 // All cook
- Page 101 and 102: Chapter 8 Rolling Your Own
- Page 103 and 104: Rolling Your Own ” 83 • The
- Page 105 and 106: Rolling Your Own ” 85 ?> See S
- Page 109 and 110: Chapter 9 Tidy Extension At this
- Page 111 and 112: Tidy Extension ” 91 $tidy = ne
- Page 113 and 114: Tidy Extension ” 93 compare ti
- Page 115 and 116: Tidy Extension ” 95 Array ( [0
- Page 119 and 120: Chapter 10 DOM Extension Once the r
- Page 121 and 122: DOM Extension ” 101 // Buffer DOM
- Page 123 and 124: DOM Extension ” 103 Elements and
- Page 125 and 126: DOM Extension ” 105 will become t
- Page 127 and 128: DOM Extension ” 107 Relativ
- Page 129 and 130: DOM Extension ” 109 // Returns th
- Page 133 and 134: Chapter 11 SimpleXML Extens
- Page 135 and 136: SimpleXML Extension ” 115 echo $s
- Page 137 and 138: SimpleXML Extension ” 117 Compari
- Page 141 and 142: Chapter 12 XMLReader Extension The
- Page 143 and 144: XMLReader Extension ” 123 • LIB
- Page 145 and 146: XMLReader Extension ” 125 Element
- Page 147: XMLReader Extension ” 127 DOM In
- Page 150 and 151: 130 ” CSS Selector Libraries sele
- Page 152 and 153: 132 ” CSS Selector Libraries Hi
- Page 154 and 155: 134 ” CSS Selector Libraries No
- Page 156 and 157: 136 ” CSS Selector Libraries Chil
- Page 158 and 159: 138 ” CSS Selector Libraries Libr
- Page 160: 140 ” CSS Selector Libraries Un
- Page 164 and 165: 144 ” PCRE Extension i POSIX Exte
- Page 166 and 167: 146 ” PCRE Extension Alternation
- Page 168 and 169: 148 ” PCRE Extension // Matches
- Page 170 and 171: 150 ” PCRE Extension Escaping The
- Page 172 and 173: 152 ” PCRE Extension Ranges are r
- Page 174: 154 ” PCRE Extension • u: For
- Page 178 and 179: 158 ” Tips and Tricks sin
- Page 180 and 181: 160 ” Tips and Tricks gen
- Page 182: 162 ” Tips and Tricks cat
- Page 186 and 187: 166 ” Legality of Web Scraping
- Page 189 and 190: Appendix B Multiproc
- Page 191: Multiprocessing ” 1