15.04.2018 Views

programming-for-dummies




Although you could type molecular sequences by hand, it’s far easier to let the computer do it for you, especially if you want to compare a large number of sequences with BLAST. After BLAST gets through comparing your sequences, it returns a list of matching sequences.

Using BLAST to compare sequences to a database of known sequences is an example of data mining. (See Chapter 1 of this mini-book for more information about data mining.)

You could scan through this list of matches yourself, but once again, that’s likely to be tedious, slow, and error-prone. Writing a program that parses the reports generated by BLAST and looks for certain characteristics is much simpler. Essentially, you can use the computer to automate sending data to BLAST and then have the computer filter the results so you see only the sequences that you care about.
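To make the filtering idea concrete, here is a minimal sketch in Python. It assumes the BLAST report was saved as tab-delimited text with the 12 default columns of the BLAST+ `-outfmt 6` option; the function name `filter_hits`, the two thresholds, and the sample rows are all made up for illustration and aren't taken from any real search.

```python
import csv
import io

# Default column names of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(report, min_identity=90.0, max_evalue=1e-5):
    """Yield hits from a tab-delimited report that pass both thresholds.

    The thresholds are illustrative defaults, not recommended values.
    """
    for row in csv.reader(report, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(COLUMNS, row))
        identity = float(hit["pident"])
        evalue = float(hit["evalue"])
        if identity >= min_identity and evalue <= max_evalue:
            yield hit


# Invented rows standing in for a real report file.
SAMPLE_REPORT = (
    "query1\tsubjA\t98.50\t200\t3\t0\t1\t200\t10\t209\t1e-100\t370\n"
    "query1\tsubjB\t75.00\t120\t30\t2\t5\t124\t50\t169\t2e-10\t95.0\n"
    "query1\tsubjC\t99.00\t30\t0\t0\t1\t30\t1\t30\t0.5\t40.1\n"
)

if __name__ == "__main__":
    for hit in filter_hits(io.StringIO(SAMPLE_REPORT)):
        print(hit["sseqid"], hit["pident"], hit["evalue"])
```

With the sample rows, only `subjA` passes both tests. To filter a real report, you would pass an open file instead of the `io.StringIO` object and adjust the thresholds to suit your question.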

Now you could write another program to skim or parse through the database results and pick out only the results you’re looking for. Because every database stores information in a slightly different format, you might need to write yet another program that converts files from one database’s format into another’s.
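A format converter can be just as small. The sketch below assumes, purely for illustration, that one database hands you sequences in FASTA format (a `>` header line followed by lines of sequence) and that the other expects one tab-delimited record per line. The function names `read_fasta` and `fasta_to_tsv` and the sample sequences are hypothetical.

```python
import io


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header ends the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Keep only the first word of the header as the identifier.
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def fasta_to_tsv(src, dst):
    """Write one 'identifier<TAB>sequence' line per record; return the count."""
    count = 0
    for name, sequence in read_fasta(src):
        dst.write(f"{name}\t{sequence}\n")
        count += 1
    return count


# Invented sequences standing in for a real FASTA file.
SAMPLE_FASTA = ">seq1 example record\nATGC\nGGTA\n>seq2\nTTAA\n"

if __name__ == "__main__":
    output = io.StringIO()
    fasta_to_tsv(io.StringIO(SAMPLE_FASTA), output)
    print(output.getvalue(), end="")
```

The reader and the writer are kept separate on purpose: to target a different output format, you would replace only `fasta_to_tsv` and reuse `read_fasta` unchanged.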

Because every biologist uses different information to look for different results, there’s no single standard bioinformatics program in the way that Microsoft Word has become the standard word processor. As a result, bioinformatics involves writing a lot of little custom programs to work with an ever-growing library of standard programs that biologists need and use every day.

Some biologists can learn programming and do much of this work themselves, but it’s far more common for biologists to give their data to an army of bioinformatics technicians who take care of the programming details. That way the biologists can focus on what they do best (studying biology) while the programmers can focus on what they do best (writing custom programs). The only way these two groups can communicate is if biologists understand how programming can help them and the programmers understand what type of data and results the biologists need.

Bioinformatics Programming

Because biologists use a wide variety of computers (UNIX, Windows, Linux, and Macintosh), they need a programming language that’s portable across all platforms. In addition, biologists need to work with existing programs, such as online databases. Finally, because most biologists aren’t trained as programmers, they need a simple language that gets the job done quickly.
