- Page 1 and 2:
php|architect’s Guide to Web Scra
- Page 3:
php|architect’s Guide to Web Sc
- Page 7 and 8:
vi ” CONTENTS Referring URLs . .
- Page 9 and 10:
viii ” CONTENTS HTTP Authenticati
- Page 11:
x ” CONTENTS Chapter 14 — PCRE
- Page 15 and 16:
xiv ” CONTENTS pleted. Each had a
- Page 18 and 19:
Foreword Web scraping is the fut
- Page 21 and 22:
Chapter 1 Introduction If you are l
- Page 23 and 24:
Introduction ” 3 in some instance
- Page 25:
Introduction ” 5 • Chapters 3-7
- Page 28 and 29:
8 ” HTTP Requests The HTTP proto
- Page 30 and 31:
10 ” HTTP http://en.wikipedia.org
- Page 32 and 33:
12 ” HTTP i Query String Limits M
- Page 34 and 35:
14 ” HTTP Server: Apache X-Powere
- Page 36 and 37:
16 ” HTTP set, it will persist fo
- Page 38 and 39:
18 ” HTTP Content Caching Two met
- Page 40 and 41:
20 ” HTTP as 0-499. To specify fr
- Page 42 and 43:
22 ” HTTP • Initialize a reques
- Page 44:
24 ” HTTP Wrap-Up At this point
- Page 49 and 50:
HTTP Streams Wrapper ” 29 Let
- Page 51 and 52:
HTTP Streams Wrapper ” 31 Error
- Page 53:
HTTP Streams Wrapper ” 33 ); ?>
- Page 56 and 57:
36 ” cURL Extension Simple Reque
- Page 58 and 59:
38 ” cURL Extension Setting Mult
- Page 60 and 61:
40 ” cURL Extension • CURLOPT_R
- Page 62 and 63:
42 ” cURL Extension containing th
- Page 64 and 65:
44 ” cURL Extension operate unpre
- Page 66:
46 ” cURL Extension • The sessi
- Page 70 and 71:
50 ” pecl_http PECL Extension bal
- Page 72 and 73:
52 ” pecl_http PECL Extension •
- Page 74 and 75:
54 ” pecl_http PECL Extension Deb
- Page 76 and 77:
56 ” pecl_http PECL Extension ass
- Page 78 and 79:
58 ” pecl_http PECL Extension );
- Page 81 and 82:
Chapter 6 PEAR::HTTP_Client The PH
- Page 83 and 84:
PEAR::HTTP_Client ” 63 • sendRe
- Page 85 and 86:
PEAR::HTTP_Client ” 65 • By def
- Page 87 and 88: PEAR::HTTP_Client ” 67 } ?> $url
- Page 89: PEAR::HTTP_Client ” 69 • http:/
- Page 92 and 93: 72 ” Zend_Http_Client // Another
- Page 94 and 95: 74 ” Zend_Http_Client Configurat
- Page 96 and 97: 76 ” Zend_Http_Client getLastResp
- Page 98: 78 ” Zend_Http_Client HTTP Authe
- Page 102 and 103: 82 ” Rolling Your Own $stream
- Page 104 and 105: 84 ” Rolling Your Own Logic to
- Page 106: 86 ” Rolling Your Own See RFC
- Page 110 and 111: 90 ” Tidy Extension direct inp
- Page 112 and 113: 92 ” Tidy Extension public fun
- Page 114 and 115: 94 ” Tidy Extension There are
- Page 116: 96 ” Tidy Extension Output Obt
- Page 120 and 121: 100 ” DOM Extension T y p of P e
- Page 122 and 123: 102 ” DOM Extension ties include
- Page 124 and 125: 104 ” DOM Extension // A slightly
- Page 126 and 127: 106 ” DOM Extension // Also retur
- Page 128 and 129: 108 ” DOM Extension • //@id add
- Page 130: 110 ” DOM Extension • DOM Level
- Page 134 and 135: 114 ” SimpleXML Extension The co
- Page 136 and 137: 116 ” SimpleXML Extension foreach
- Page 141 and 142: Chapter 12 XMLReader Extension The
- Page 143 and 144: XMLReader Extension ” 123 • LIB
- Page 145 and 146: XMLReader Extension ” 125 Element
- Page 147: XMLReader Extension ” 127 DOM I n
- Page 150 and 151: 130 ” CSS Selector Libraries sele
- Page 152 and 153: 132 ” CSS Selector Libraries H i
- Page 154 and 155: 134 ” CSS Selector Libraries N o
- Page 156 and 157: 136 ” CSS Selector Libraries Chil
- Page 158 and 159: 138 ” CSS Selector Libraries Libr
- Page 160: 140 ” CSS Selector Libraries U n
- Page 164 and 165: 144 ” PCRE Extension i POSIX Exte
- Page 166 and 167: 146 ” PCRE Extension Alternation
- Page 168 and 169: 148 ” PCRE Extension // Matches
- Page 170 and 171: 150 ” PCRE Extension Escaping The
- Page 172 and 173: 152 ” PCRE Extension Ranges are r
- Page 174: 154 ” PCRE Extension • u: For
- Page 178 and 179: 158 ” Tips and Tricks sin
- Page 180 and 181: 160 ” Tips and Tricks gen
- Page 182: 162 ” Tips and Tricks cat
- Page 186 and 187: 166 ” Legality of Web Scraping
- Page 189 and 190:
Appendix B Multiproc
- Page 191:
Multiprocessing ” 1