- Page 1 and 2:
php|architect’s Guide to Web Scra
- Page 3:
php|architect’s Guide to Web Sc
- Page 7 and 8:
vi ” CONTENTS Referring URLs . .
- Page 9 and 10:
viii ” CONTENTS HTTP Authenticati
- Page 11:
x ” CONTENTS Chapter 14 — PCRE
- Page 15 and 16:
xiv ” CONTENTS pleted. Each had a
- Page 18 and 19:
Foreword Web scraping is the fut
- Page 21 and 22:
Chapter 1 Introduction If you are l
- Page 23 and 24:
Introduction ” 3 in some instance
- Page 25:
Introduction ” 5 • Chapters 3-7
- Page 28 and 29:
8 ” HTTP Requests The HTTP proto
- Page 30 and 31:
10 ” HTTP http://en.wikipedia.org
- Page 32 and 33:
12 ” HTTP i Query String Limits M
- Page 34 and 35:
14 ” HTTP Server: Apache X-Powere
- Page 36 and 37:
16 ” HTTP set, it will persist fo
- Page 38 and 39:
18 ” HTTP Content Caching Two met
- Page 40 and 41:
20 ” HTTP as 0-499. To specify fr
- Page 42 and 43:
22 ” HTTP • Initialize a reques
- Page 44:
24 ” HTTP Wrap-Up At this point
- Page 49 and 50:
HTTP Streams Wrapper ” 29 Let
- Page 51 and 52:
HTTP Streams Wrapper ” 31 Error
- Page 53:
HTTP Streams Wrapper ” 33 ); ?>
- Page 56 and 57:
36 ” cURL Extension Simple Reque
- Page 58 and 59:
38 ” cURL Extension Setting Mult
- Page 60 and 61:
40 ” cURL Extension • CURLOPT_R
- Page 62 and 63:
42 ” cURL Extension containing th
- Page 64 and 65:
44 ” cURL Extension operate unpre
- Page 66:
46 ” cURL Extension • The sessi
- Page 70 and 71:
50 ” pecl_http PECL Extension bal
- Page 72 and 73:
52 ” pecl_http PECL Extension •
- Page 74 and 75:
54 ” pecl_http PECL Extension Deb
- Page 76 and 77:
56 ” pecl_http PECL Extension ass
- Page 78 and 79:
58 ” pecl_http PECL Extension );
- Page 81 and 82:
Chapter 6 PEAR::HTTP_Client The PH
- Page 83 and 84:
PEAR::HTTP_Client ” 63 • sendRe
- Page 85 and 86:
PEAR::HTTP_Client ” 65 • By def
- Page 87 and 88:
PEAR::HTTP_Client ” 67 } ?> $url
- Page 89:
PEAR::HTTP_Client ” 69 • http:/
- Page 92 and 93:
72 ” Zend_Http_Client // Another
- Page 94 and 95:
74 ” Zend_Http_Client Configurat
- Page 96 and 97:
76 ” Zend_Http_Client getLastResp
- Page 98:
78 ” Zend_Http_Client HTTP Authe
- Page 102 and 103:
82 ” Rolling Your Own $stream
- Page 104 and 105:
84 ” Rolling Your Own Logic to
- Page 106:
86 ” Rolling Your Own See RFC
- Page 110 and 111:
90 ” Tidy Extension direct inp
- Page 112 and 113:
92 ” Tidy Extension public fun
- Page 114 and 115:
94 ” Tidy Extension There are
- Page 116:
96 ” Tidy Extension Output Obt
- Page 120 and 121: 100 ” DOM Extension Typ of Pe
- Page 122 and 123: 102 ” DOM Extension ties include
- Page 124 and 125: 104 ” DOM Extension // A slightly
- Page 126 and 127: 106 ” DOM Extension // Also retur
- Page 128 and 129: 108 ” DOM Extension • //@id add
- Page 130: 110 ” DOM Extension • DOM Level
- Page 134 and 135: 114 ” SimpleXML Extension The co
- Page 136 and 137: 116 ” SimpleXML Extension foreach
- Page 138: 118 ” SimpleXML Extension Wra
- Page 142 and 143: 122 ” XMLReader Extension Loading
- Page 144 and 145: 124 ” XMLReader Extension false o
- Page 146 and 147: 126 ” XMLReader Extension cate to
- Page 149 and 150: Chapter 13 CSS Selector Libraries T
- Page 151 and 152: CSS Selector Libraries ” 131 Abou
- Page 153 and 154: CSS Selector Libraries ” 133 •
- Page 155 and 156: CSS Selector Libraries ” 135 •
- Page 157 and 158: CSS Selector Libraries ” 137 •
- Page 159 and 160: CSS Selector Libraries ” 139 It
- Page 163 and 164: Chapter 14 PCRE Extension There are
- Page 165 and 166: PCRE Extension ” 145 Anchors Yo
- Page 167 and 168: PCRE Extension ” 147 // Matches
- Page 169: PCRE Extension ” 149 if (preg_mat
- Page 173 and 174: PCRE Extension ” 153 • To use a
- Page 177 and 178: Tips and Tricks Chapter 15 C
- Page 179 and 180: Tips and Tricks ” 159 not
- Page 181 and 182: Tips and Tricks ” 161 We
- Page 185 and 186: Appendix A Legality of Web S
- Page 187: Legality of Web Scraping ” 167
- Page 190 and 191: 170 ” Multiprocessin