15.04.2013 Views

Core Python Programming (2nd Edition)


You may have seen raw strings used in some of the examples above. Regular expressions were a strong motivation for the advent of raw strings. The reason is a conflict between ASCII escape characters and regular expression special characters. In an ordinary string, "\b" represents the ASCII backspace character, but "\b" is also a regular expression special symbol meaning "match on a word boundary." For the RE compiler to see the two characters "\b" as your string rather than a (single) backspace, you need to escape the backslash in the string with another backslash, resulting in "\\b".
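As a minimal sketch of what the RE compiler actually receives in each case, the snippet below compares the three spellings. The variable names and the sample text 'wind blow' are illustrative assumptions, not part of the original examples.

```python
import re

backspace = '\b'   # one character: ASCII backspace (0x08)
escaped = '\\b'    # two characters: a backslash followed by 'b'
raw = r'\b'        # the same two characters, written as a raw string

print(len(backspace), len(escaped), len(raw))
print(escaped == raw)

# Only the two-character form reaches the RE compiler as a word boundary.
print(re.search(escaped + 'blow', 'wind blow') is not None)

# The one-character form asks for a literal backspace, which the text lacks.
print(re.search(backspace + 'blow', 'wind blow') is not None)
```

The escaped and raw spellings produce identical strings, so the choice between them is purely about readability of the source.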

This can get messy, especially if your string contains many special characters. Raw strings, introduced back in Chapter 6, can be (and often are) used to keep REs looking somewhat manageable. In fact, many Python programmers swear by them and use only raw strings when defining regular expressions.

Here are some examples that differentiate between the backspace "\b" and the regular expression "\b", with and without raw strings:

>>> import re
>>> m = re.match('\bblow', 'blow') # backspace, no match
>>> if m is not None: m.group()
...
>>> m = re.match('\\bblow', 'blow') # escaped \, now it works
>>> if m is not None: m.group()
...
'blow'
>>> m = re.match(r'\bblow', 'blow') # use raw string instead
>>> if m is not None: m.group()
...
'blow'

You may recall that we had no trouble using "\d" in our regular expressions without raw strings. That is because "\d" has no equivalent ASCII special character, so Python leaves the backslash in the string and the regular expression compiler sees "\d" and knows you mean a decimal digit.
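The difference between "\b" and "\d" can be checked directly. The sketch below is one way to do it, under a few assumptions: the helper parse_literal is hypothetical (not from the book), and it uses ast.literal_eval to read a string literal the way the compiler would. Recent Python 3 releases emit a warning for unrecognized escapes such as "\d" in ordinary strings, so the helper silences warnings; that is one more reason raw strings remain the safer habit.

```python
import ast
import re
import warnings

def parse_literal(source):
    """Hypothetical helper: evaluate a string literal as the compiler would."""
    with warnings.catch_warnings():
        warnings.simplefilter('ignore')  # newer Pythons warn about '\d'
        return ast.literal_eval(source)

plain_b = parse_literal(r"'\b'")  # recognized escape: becomes a backspace
plain_d = parse_literal(r"'\d'")  # unrecognized escape: backslash is kept

print(len(plain_b))                     # length of the backspace string
print(len(plain_d), plain_d == r'\d')   # backslash plus 'd' survives intact

# The surviving two characters are what the RE compiler reads as "a digit".
print(re.match(plain_d + '+', '42 apples').group())
```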
