15.04.2018 Views

programming-for-dummies


Finding Strings with Regular Expressions

Before you can manipulate a string, you first must find it. Although some programming languages include string searching functions, most of these functions can find only exact matches of strings.

To remedy this problem, many programming languages (such as Perl and Tcl) use regular expressions. (A regular expression is just a series of symbols that tells the computer how to find a specific pattern in a string.)

If a programming language doesn’t offer built-in support for regular expressions, you can often find a subprogram library, written by other programmers, that lets you add regular expressions to your program. By using regular expressions, your programs can perform more sophisticated text searching than exact-match string functions allow.
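As a rough illustration of that difference, here is a minimal Python sketch (Python is chosen only for the example; the sample sentence and variable names are made up). It uses Python's standard re module and previews the (.) wildcard described in the next section. The exact search has to be told every word in advance, whereas a single pattern covers them all.

```python
import re

# Hypothetical sample text, invented for this example.
text = "The bat sat on a mat, but the bot did not."

# Exact search: str.find reports the position of one literal substring,
# or -1 when that exact substring is absent.
first_bat = text.find("bat")
first_bit = text.find("bit")

# To catch several similar words, each one must be listed by hand.
exact_hits = [word for word in ("bat", "but", "bot") if word in text]

# One regular expression describes the whole family: b, any character, t.
pattern_hits = re.findall(r"b.t", text)

print(first_bat, first_bit)
print(exact_hits)
print(pattern_hits)
```

The exact search only finds the words it was given, while the pattern would also pick up any other b, one character, t combination that happened to appear in the text.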

Pattern matching with the single-character (.) wildcard

The simplest pattern to search for is one that contains a single unknown character. For example, you might want to know whether a certain string begins with the letter b, ends with the letter t, and contains exactly one character in between. Although you could check separately for every three-character string that begins with b and ends with t, like bat or but, it’s much easier to use the single-character wildcard instead, which is a dot or period character (.).

So if you want to find every three-character string that begins with a b and ends with a t, you’d use this regular expression:

b.t
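To see the b.t pattern at work, here is a small sketch using Python's re module (the word list is invented for the example). It uses fullmatch, which only succeeds when the entire string fits the pattern, so each word must be exactly three characters long.

```python
import re

# The pattern from the text: b, then any one character, then t.
pattern = re.compile(r"b.t")

# Hypothetical candidate strings, invented for this example.
words = ["bat", "but", "bit", "b9t", "bt", "boot", "cat"]

# fullmatch succeeds only if the whole string fits the pattern.
three_char_matches = [w for w in words if pattern.fullmatch(w)]

print(three_char_matches)
```

Note that b9t is accepted too: the dot stands for any single character, not just a letter. (In Python, the dot skips the newline character unless you ask otherwise; other languages may differ.)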

To match more than one unknown character, use the (.) wildcard once for each character. So the pattern b..t matches the strings boot and boat, with the two (..) wildcards representing the two characters between the b and the t.

Of course, the b..t pattern doesn’t match bat because bat has only one character between the b and the t. Nor does it match boost because boost has more than two characters between the b and the t.
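The four words discussed above can be checked with a short sketch, again assuming Python's re module. The sample sentence at the end is made up to show the same pattern searching inside longer text.

```python
import re

# Two dots: exactly two characters between the b and the t.
pattern = re.compile(r"b..t")

# Test each word from the text against the whole pattern.
results = {w: bool(pattern.fullmatch(w)) for w in ["boot", "boat", "bat", "boost"]}

# Searching inside a longer (hypothetical) sentence finds only the
# four-character matches; "but" and "bat" are too short to qualify.
found = pattern.findall("A boat and a boot, but no bat.")

print(results)
print(found)
```

Here boot and boat come back as matches, while bat and boost do not, because each dot consumes exactly one character.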

When using the (.) wildcard, you must know the exact number of characters to match.
