12.07.2015 Views

View - ResearchGate


Estimating Protein Function Using Protein–Protein Relationships

        system "myBlastParser.pl -i myBLASTOutputForProtein_j >> myParsedBLASTOutputForProtein_j";
        compress myBLASTOutputForProtein_j;
        move myBLASTOutputForProtein_j.compressed to dir storeRawBLASTData/;
    }
    close myInputFile;

3.1.4. Parsing BLAST Results

Parsing of BLAST results is required so that only the information necessary for generating phylogenetic profiles and identifying Rosetta stone sequences is retained from sequence matches against the database. This greatly reduces the size of the input required for subsequent steps. For every match of the query sequence against the database, at least five important details need to be captured and retained from the raw output:

1. The unique identifier of the subject sequence.
2. The genome to which the subject sequence belongs.
3. The BLAST expectation value of the high-scoring pair (HSP).
4. The start and stop positions of the HSP on the query sequence.
5. The start and stop positions of the HSP on the subject sequence.

Besides these attributes, other pieces of information, such as raw scores or the percentage of sequence identity, can also be captured (see also Note 2). As the user becomes more familiar with the methods, these additional fields can be used as filters, or even as substitutes for the primary attributes, when deciding the quality of a match or hit against the reference database.

One possible form of output from a parser program is described next:

    >query >subject
    raw_score: value | E-value: value |
    query_start: value | query_end: value |
    subject_start: value | subject_end: value |
    match_length: value | identity_percentage: value |
    similarity_percentage: value | query_length: value |
    subject_length: value

    >hsapiens|gi|20093443 >hsapiens|gi|20093443
    raw_score: 300 | E-value: 1e-155 |
    query_start: 1 | query_end: 140 |
    subject_start: 1 | subject_end: 140 |
    match_length: 140 | identity_percentage: 100 |
    similarity_percentage: 100 | query_length: 140 |
    subject_length: 140
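
To make the record layout above concrete, the following is a minimal Python sketch of how one such parsed record could be read back and reduced to the five primary attributes. It is an illustration under stated assumptions, not the chapter's own parser: the function name `parse_record`, the returned dictionary keys, and the idea that the genome label is the first `|`-delimited part of the identifier (as in `hsapiens|gi|20093443`) are all assumptions made for this example.

```python
# Hypothetical helper (not from the chapter): reads one record in the
# ">query >subject key: value | key: value ..." layout described above.

EXAMPLE_RECORD = """>hsapiens|gi|20093443 >hsapiens|gi|20093443
raw_score: 300 | E-value: 1e-155 |
query_start: 1 | query_end: 140 |
subject_start: 1 | subject_end: 140 |
match_length: 140 | identity_percentage: 100 |
similarity_percentage: 100 | query_length: 140 |
subject_length: 140"""


def parse_record(record):
    """Return the five primary attributes of one parsed BLAST hit."""
    # Collapse line breaks so a record wrapped over several lines
    # is handled the same way as a single-line record.
    text = " ".join(record.split())
    query_token, subject_token, rest = text.split(" ", 2)
    if not (query_token.startswith(">") and subject_token.startswith(">")):
        raise ValueError("record must start with '>query >subject'")
    subject_id = subject_token[1:]

    # The identifiers contain '|' too, but they were removed above,
    # so the remainder can safely be split on '|' into key: value pairs.
    fields = {}
    for chunk in rest.split("|"):
        key, _, value = chunk.partition(":")
        if key.strip():
            fields[key.strip()] = value.strip()

    return {
        "query_id": query_token[1:],
        "subject_id": subject_id,
        # Assumption: the genome label is the identifier's first component.
        "subject_genome": subject_id.split("|", 1)[0],
        "evalue": float(fields["E-value"]),
        "query_span": (int(fields["query_start"]), int(fields["query_end"])),
        "subject_span": (int(fields["subject_start"]), int(fields["subject_end"])),
    }


if __name__ == "__main__":
    print(parse_record(EXAMPLE_RECORD))
```

Keeping the E-value as a float and the HSP coordinates as integer pairs means later steps can apply thresholds or overlap checks directly, without re-parsing text.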
