

CHAPTER 10 ■ BATTERIES INCLUDED

The function re.escape is a utility function used to escape all the characters in a string that might be interpreted as a regexp operator. Use this if you have a long string with lots of these special characters and you want to avoid typing a lot of backslashes, or if you get a string from a user (for example, through the raw_input function) and want to use it as a part of a regexp. Here is an example of how it works:

>>> re.escape('www.python.org')
'www\\.python\\.org'
>>> re.escape('But where is the ambiguity?')
'But\\ where\\ is\\ the\\ ambiguity\\?'
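To see why escaping matters when a string is used as part of a regexp, the following minimal sketch compares an unescaped search with an escaped one. The helper name find_literal and the sample text are made up for illustration; only re.escape and re.findall come from the re module.

```python
import re

def find_literal(needle, haystack):
    """Hypothetical helper: find needle in haystack as plain text, not as a regexp."""
    pattern = re.escape(needle)  # neutralize '.', '?', and other operators
    return re.findall(pattern, haystack)

text = 'Visit www.python.org or wwwxpythonxorg'

# Unescaped, each '.' is a wildcard, so the second word matches as well.
unescaped = re.findall('www.python.org', text)

# Escaped, each '.' matches only a literal period.
escaped = find_literal('www.python.org', text)

print(unescaped)
print(escaped)
```

The unescaped search returns both words, while the escaped one returns only the real address. The same approach applies to a string typed in by a user.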

■Note In Table 10-8 you’ll notice that some of the functions have an optional parameter called flags. This parameter can be used to change how the regular expressions are interpreted. For more information about this, see the standard library reference, in the section about the re module at http://python.org/doc/lib/module-re.html. The flags are described in the subsection “Module Contents.”
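As a small illustration of the flags parameter, the sketch below uses re.IGNORECASE, one of the flags defined in the re module, to make a search case-insensitive. The sample text is an assumption chosen for the example.

```python
import re

text = 'Welcome to PYTHON.org'

# Without flags, the lowercase pattern does not match the uppercase text.
strict = re.search('python', text)

# With re.IGNORECASE, letter case is ignored during matching.
relaxed = re.search('python', text, re.IGNORECASE)

print(strict)            # None
print(relaxed.group(0))  # PYTHON
```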

Match Objects and Groups<br />

The re functions that try to match a pattern against a section of a string all return MatchObjects when a match is found. These objects contain information about the substring that matched the pattern. They also contain information about which parts of the pattern matched which parts of the substring, and these “parts” are called groups.

A group is simply a subpattern that has been enclosed in parentheses. The groups are numbered by the position of their left parenthesis, counting from left to right. Group zero is the entire pattern. So, in the pattern

'There (was a (wee) (cooper)) who (lived in Fyfe)'<br />

the groups are as follows:<br />

0 There was a wee cooper who lived in Fyfe<br />

1 was a wee cooper<br />

2 wee<br />

3 cooper<br />

4 lived in Fyfe<br />
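The numbering above can be checked with a short sketch that matches the pattern against the sentence itself and prints each group, using the group method of the resulting match object. The variable names are chosen for illustration only.

```python
import re

pattern = 'There (was a (wee) (cooper)) who (lived in Fyfe)'
m = re.match(pattern, 'There was a wee cooper who lived in Fyfe')

# Group 0 is the whole match; groups 1 to 4 follow the order of the
# left parentheses in the pattern.
for number in range(5):
    print('%d %s' % (number, m.group(number)))
```

Because group 1 opens before groups 2 and 3, it is numbered first even though it encloses them.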

Typically, the groups contain special characters such as wildcards or repetition operators, and thus you may be interested in knowing what a given group has matched. For example, in the pattern

r'www\.(.+)\.com$'<br />

group 0 would contain the entire string, and group 1 would contain everything between 'www.' and '.com'. By creating patterns like this, you can extract the parts of a string that interest you.
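A minimal sketch of this kind of extraction follows. The two addresses are made-up sample input; the group method returns the text matched by the group with the given number.

```python
import re

pattern = r'www\.(.+)\.com$'

m = re.match(pattern, 'www.example.com')
print(m.group(0))  # www.example.com
print(m.group(1))  # example

# The '.+' in group 1 may itself contain periods.
m2 = re.match(pattern, 'www.mail.example.com')
print(m2.group(1))  # mail.example
```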

Some of the more important methods of re match objects are described in Table 10-9.
