- Page 1 and 2:
php|architect’s Guide to Web Scra
- Page 3:
php|architect’s Guide to Web Sc
- Page 7 and 8:
vi ” CONTENTS Referring URLs . .
- Page 9 and 10:
viii ” CONTENTS HTTP Authenticati
- Page 11:
x ” CONTENTS Chapter 14 — PCRE
- Page 15 and 16:
xiv ” CONTENTS pleted. Each had a
- Page 18 and 19:
Foreword Web scraping is the fut
- Page 21 and 22:
Chapter 1 Introduction If you are l
- Page 23 and 24:
Introduction ” 3 in some instance
- Page 25:
Introduction ” 5 • Chapters 3-7
- Page 28 and 29:
8 ” HTTP Requests The HTTP proto
- Page 30 and 31:
10 ” HTTP http://en.wikipedia.org
- Page 32 and 33:
12 ” HTTP i Query String Limits M
- Page 34 and 35:
14 ” HTTP Server: Apache X-Powere
- Page 36 and 37:
16 ” HTTP set, it will persist fo
- Page 38 and 39:
18 ” HTTP Content Caching Two met
- Page 40 and 41:
20 ” HTTP as 0-499. To specify fr
- Page 42 and 43:
22 ” HTTP • Initialize a reques
- Page 44:
24 ” HTTP Wrap-Up At this point
- Page 49 and 50: HTTP Streams Wrapper ” 29 Let
- Page 51 and 52: HTTP Streams Wrapper ” 31 Error
- Page 53: HTTP Streams Wrapper ” 33 ); ?>
- Page 56 and 57: 36 ” cURL Extension Simple Reque
- Page 58 and 59: 38 ” cURL Extension Setting Mult
- Page 60 and 61: 40 ” cURL Extension • CURLOPT_R
- Page 62 and 63: 42 ” cURL Extension containing th
- Page 64 and 65: 44 ” cURL Extension operate unpre
- Page 66: 46 ” cURL Extension • The sessi
- Page 70 and 71: 50 ” pecl_http PECL Extension bal
- Page 72 and 73: 52 ” pecl_http PECL Extension •
- Page 74 and 75: 54 ” pecl_http PECL Extension Deb
- Page 76 and 77: 56 ” pecl_http PECL Extension ass
- Page 78 and 79: 58 ” pecl_http PECL Extension );
- Page 81 and 82: Chapter 6 PEAR::HTTP_Client The PH
- Page 83 and 84: PEAR::HTTP_Client ” 63 • sendRe
- Page 85 and 86: PEAR::HTTP_Client ” 65 • By def
- Page 87 and 88: PEAR::HTTP_Client ” 67 } ?> $url
- Page 89: PEAR::HTTP_Client ” 69 • http:/
- Page 92 and 93: 72 ” Zend_Http_Client // Another
- Page 94 and 95: 74 ” Zend_Http_Client Configurat
- Page 96 and 97: 76 ” Zend_Http_Client getLastResp
- Page 101 and 102: Chapter 8 Rolling Your Own
- Page 103 and 104: Rolling Your Own ” 83 • The
- Page 105 and 106: Rolling Your Own ” 85 ?> See S
- Page 109 and 110: Chapter 9 Tidy Extension At this
- Page 111 and 112: Tidy Extension ” 91 $tidy = ne
- Page 113 and 114: Tidy Extension ” 93 compare ti
- Page 115 and 116: Tidy Extension ” 95 Array ( [0
- Page 119 and 120: Chapter 10 DOM Extension Once the r
- Page 121 and 122: DOM Extension ” 101 // Buffer DOM
- Page 123 and 124: DOM Extension ” 103 Elements and
- Page 125 and 126: DOM Extension ” 105 will become t
- Page 127 and 128: DOM Extension ” 107 Relativ
- Page 129 and 130: DOM Extension ” 109 // Returns th
- Page 133 and 134: Chapter 11 SimpleXML Extens
- Page 135 and 136: SimpleXML Extension ” 115 echo $s
- Page 137 and 138: SimpleXML Extension ” 117 Compari
- Page 141 and 142: Chapter 12 XMLReader Extension The
- Page 143 and 144: XMLReader Extension ” 123 • LIB
- Page 145 and 146: XMLReader Extension ” 125 Element
- Page 147: XMLReader Extension ” 127 DOM In
- Page 150 and 151:
130 ” CSS Selector Libraries sele
- Page 152 and 153:
132 ” CSS Selector Libraries Hi
- Page 154 and 155:
134 ” CSS Selector Libraries No
- Page 156 and 157:
136 ” CSS Selector Libraries Chil
- Page 158 and 159:
138 ” CSS Selector Libraries Libr
- Page 160:
140 ” CSS Selector Libraries Un
- Page 164 and 165:
144 ” PCRE Extension i POSIX Exte
- Page 166 and 167:
146 ” PCRE Extension Alternation
- Page 168 and 169:
148 ” PCRE Extension // Matches
- Page 170 and 171:
150 ” PCRE Extension Escaping The
- Page 172 and 173:
152 ” PCRE Extension Ranges are r
- Page 174:
154 ” PCRE Extension • u: For
- Page 178 and 179:
158 ” Tips and Tricks sin
- Page 180 and 181:
160 ” Tips and Tricks gen
- Page 182:
162 ” Tips and Tricks cat
- Page 186 and 187:
166 ” Legality of Web Scraping
- Page 189 and 190:
Appendix B Multiproc
- Page 191:
Multiprocessing ” 1