12.07.2015 Views

View - ResearchGate



Details of these tasks are described in Methods. These programs have to be coded by the user.

3. Methods

Both the computational methods described in this chapter have common initial requirements (described in Subheading 3.1.), such as the need for a database of reference genomes, and parsed results of BLAST searches against the database. Steps specific to the individual methods are described in Subheading 3.2.

3.1. Common Initial Steps for the Phylogenetic Profiling Method and the Rosetta Stone Method

3.1.1. Creating a Database of Reference Genomes

Both methods require a database of genomes against which the query sequences are compared for similarity. It is important to note that this database should contain genomes that are fully sequenced, as opposed to, say, creating a database similar to the BLAST nonredundant database, which is essentially a repository of all known protein sequences. Use of a database that contains proteins without regarding genome sequence status will generate incorrect profiles and create complications when applying statistical tests of confidence to Rosetta stone linkages.

Complete protein complements of fully sequenced genomes can be downloaded from NCBI (ftp://ftp.ncbi.nih.gov) or from websites of individual genome sequencing centers (see Note 3). Users should ensure that all amino acid sequences included in the database possess unique identifiers (see Note 4). The more genomes included in the sequence file, the better the methods will perform. Amino acid sequences of all proteins from the downloaded genomes are concatenated into a single file for ease of use and more accurate calculation of BLAST expectation (E) values. This file can also include proteins encoded by bacterial plasmids, if so desired.

For illustration purposes, this database will be referred to as “myDatabaseFile,” populated with the following sequences:

>ssolfataricus|gi|15896972
MIVPVKNEERVLPRLLDRLVNLEYDKSKYEIIVVEDGSTDRTFQICKEYEIKYN
NLIRCYSLPR
>ecoli_K12|gi|16127996
MRVLKFGGTSVANAERFLRVADILESNARQGQVATVLSAPAKITNHLVAMIEKTISGQDALPNI
>hsapiens|gi|20093441
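
The concatenation and unique-identifier requirements described in this subheading could be scripted roughly as follows. This is a minimal sketch, not the chapter's own program: it assumes each downloaded genome is a plain FASTA file of protein sequences, and the function names `read_fasta` and `build_database` are hypothetical names introduced here for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted lines.

    The identifier is the header text up to the first whitespace,
    without the leading '>'. Wrapped sequence lines are joined.
    """
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def build_database(genome_files: Iterable[str], output_path: str) -> int:
    """Concatenate per-genome FASTA files into a single database file.

    Raises ValueError at the first repeated identifier, because every
    sequence in the database must have a unique identifier.
    Returns the number of sequences written.
    """
    seen = set()
    with open(output_path, "w") as out:
        for path in genome_files:
            with open(path) as handle:
                for identifier, sequence in read_fasta(handle):
                    if identifier in seen:
                        raise ValueError(f"duplicate identifier: {identifier}")
                    seen.add(identifier)
                    out.write(f">{identifier}\n{sequence}\n")
    return len(seen)
```

A call such as `build_database(["ssolfataricus.fasta", "ecoli_K12.fasta"], "myDatabaseFile")` (the input file names are placeholders) would write one combined file. Note that the sketch stops at the first duplicate and leaves a partially written output file, so identifiers should be corrected and the script rerun.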
