

238 CHAPTER 10 ■ BATTERIES INCLUDED

The question mark means that the subpattern can appear once or not at all. There are a few other operators that allow you to repeat a subpattern more than once:

(pattern)*        pattern is repeated zero or more times
(pattern)+        pattern is repeated one or more times
(pattern){m,n}    pattern is repeated from m to n times

So, for example, r'w*\.python\.org' matches 'www.python.org', but also '.python.org', 'ww.python.org', and 'wwwwwww.python.org'. Similarly, r'w+\.python\.org' matches 'w.python.org' but not '.python.org', and r'w{3,4}\.python\.org' matches only 'www.python.org' and 'wwww.python.org'.
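The three repetition patterns above can be checked with a short script. This is only a sketch: it assumes Python 3.4 or later, where re.fullmatch is available to test a pattern against an entire string, and the helper name matches_entirely and the list of host names are made up for illustration.

```python
import re

# The three patterns discussed in the text.
patterns = {
    'star': r'w*\.python\.org',       # zero or more 'w'
    'plus': r'w+\.python\.org',       # one or more 'w'
    'range': r'w{3,4}\.python\.org',  # three or four 'w'
}

def matches_entirely(pattern, text):
    """Return True if the pattern matches the whole of text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# Sample strings chosen for illustration.
hosts = ['.python.org', 'w.python.org', 'www.python.org',
         'wwww.python.org', 'wwwwwww.python.org']

for name, pattern in patterns.items():
    accepted = [h for h in hosts if matches_entirely(pattern, h)]
    print(name, accepted)
```

The 'star' pattern accepts every string in the list, the 'plus' pattern rejects only '.python.org', and the 'range' pattern accepts only the strings with three or four leading w characters.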

■Note The term match is used loosely here to mean that the pattern matches the entire string. The match function, described in the text that follows, requires only that the pattern matches the beginning of the string.
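The distinction drawn in this note can be seen by comparing re.match with re.fullmatch. The following is a minimal sketch, assuming Python 3.4 or later (re.fullmatch does not exist in earlier versions); the sample string with trailing text is invented for the example.

```python
import re

pattern = r'w+\.python\.org'
text = 'www.python.org/doc'  # hypothetical string with extra text after the host name

# re.match only requires the pattern to match at the beginning of the string.
prefix_match = re.match(pattern, text)

# re.fullmatch requires the pattern to account for the entire string.
whole_match = re.fullmatch(pattern, text)

print(prefix_match.group())  # the part of the string that was matched
print(whole_match)           # None, because '/doc' is left over
```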

The Beginning and End of a String

Until now, you’ve only been looking at a pattern matching an entire string, but you can also try to find a substring that matches the pattern, such as the substring 'www' of the string 'www.python.org' matching the pattern 'w+'. When you’re searching for substrings like this, it can sometimes be useful to anchor the substring at either the beginning or the end of the full string. For example, you might want to match 'ht+p' at the beginning of a string, but not anywhere else. Then you use a caret ('^') to mark the beginning: '^ht+p' would match 'http://python.org' (and 'htttttp://python.org', for that matter) but not 'www.http.org'. Similarly, the end of a string may be indicated by the dollar sign ('$').
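To see the anchors at work, the sketch below uses re.search, which looks for a match anywhere in the string, so that the caret and dollar sign are the only things restricting the position. The helper name found, the r'\.org$' pattern, and the 'python.org/doc' string are illustrative additions, not taken from the text.

```python
import re

def found(pattern, text):
    """Return True if the pattern matches some substring of text (hypothetical helper)."""
    return re.search(pattern, text) is not None

# An unanchored pattern may match anywhere in the string.
print(re.search('w+', 'www.python.org').group())  # www
print(found('ht+p', 'www.http.org'))              # True

# The caret ties the match to the beginning of the string.
print(found('^ht+p', 'http://python.org'))        # True
print(found('^ht+p', 'htttttp://python.org'))     # True
print(found('^ht+p', 'www.http.org'))             # False

# The dollar sign ties the match to the end of the string.
print(found(r'\.org$', 'www.python.org'))         # True
print(found(r'\.org$', 'python.org/doc'))         # False
```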

■Note For a complete listing of regexp operators, see the standard library reference, in the section “Regular Expression Syntax” (http://python.org/doc/lib/re-syntax.html).

Contents of the re Module

Knowing how to write regular expressions isn’t much good if you can’t use them for anything. The re module contains several useful functions for working with regular expressions. Some of the most important ones are described in Table 10-8.
