View - ResearchGate

Sybil: Multiple Genome Comparison and Visualization

3. Methods

3.1. Protein Clustering

3.1.1. “All-vs-All” BLASTP Analysis

1. xdformat is used to create a BLASTP-searchable database of the predicted polypeptide sequences from all of the input genomes: xdformat -p -I -o all-peptides all-peptides.fsa (15). It is assumed that each polypeptide has been assigned a unique identifier and can be related back to the gene of which it is a product.
2. Each of the predicted polypeptide sequences is searched against the database from step 1 with WU-BLASTP (15,16) (see Note 5) and the results are stored for use in subsequent steps (see Note 6): blastp all-peptides pep-1.fsa -E 1e-5 -matrix BLOSUM62 -wordmask none -B 150 -V 150 -gspmax 5 -shortqueryok -novalidctxok -cpus 1 > pep-1-vs-all-blastp.raw.

3.1.2. Clustering Phase 1: Jaccard Coefficient-Based Protein Clustering

The first phase of the protein clustering algorithm is run on each input genome separately. In this phase, a subset of the all-vs-all BLASTP matches is used to compute a Jaccard similarity coefficient (10) for every pair of polypeptides from the same genome. All pairs of polypeptides whose Jaccard coefficient meets or exceeds a specified threshold are then subjected to a straightforward graph analysis to determine the resulting clusters. For each input genome:

1. Identify the subset of the BLASTP matches to be used. By default only BLASTP matches with at least 80% sequence identity and an E-value of at most 1 × 10⁻⁵ are used in the subsequent steps (see Note 7).
2. Use the BLASTP matches from step 1 to determine which pairs of polypeptides are “related” to one another; by definition, two polypeptides are considered related if either one has a BLASTP match to the other that meets the conditions described in step 1. Every polypeptide is also considered to be related to itself, regardless of whether a BLASTP self-match was found in step 1.
3. Compute and record a Jaccard similarity coefficient for each pair of predicted polypeptides. Fig. 3 illustrates how this is done for a representative pair of polypeptides. For any two polypeptides P1 and P2, the Jaccard similarity coefficient is the ratio of the number of polypeptides (including P1 and P2 themselves) that are related to both P1 and P2 to the number of polypeptides that are related to either P1 or P2. Therefore, the Jaccard similarity coefficient for any pair of polypeptides P1 and P2 is a number between zero and one that reflects how similarly connected P1 and P2 are to the other polypeptides in the same data set (in this case, a single genome).
4. Create a graph (see Fig. 4) in which each node corresponds to one of the polypeptides from the selected input genome, and an edge is drawn between two polypeptides P1 and P2 only if the Jaccard similarity coefficient of P1 and P2 is equal to or more than a predetermined threshold (set to 0.6 by default) (see Note 8).
5. The connected components of the graph generated in step 4, when treated as sets of polypeptides, are referred to as “Jaccard clusters,” or “JACs” for short. These
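
The five steps of Clustering Phase 1 can be illustrated with a minimal Python sketch. It is not the Sybil implementation. It assumes the BLASTP matches for one genome have already been parsed into (query, subject, percent identity, E-value) tuples; parsing the raw WU-BLASTP output is not shown. The function names (related_sets, jaccard, jaccard_clusters) and the tuple layout are hypothetical choices made for this example. The defaults of 80% identity, an E-value of 1e-5, and a threshold of 0.6 are the ones stated in the steps above.

```python
from itertools import combinations


def related_sets(polypeptides, matches, min_identity=80.0, max_evalue=1e-5):
    """Steps 1-2: map each polypeptide to the set of polypeptides related to it."""
    # Every polypeptide is related to itself, even without a BLASTP self-match.
    related = {p: {p} for p in polypeptides}
    for query, subject, identity, evalue in matches:
        if query not in related or subject not in related:
            continue  # ignore matches involving polypeptides from other genomes
        if identity >= min_identity and evalue <= max_evalue:
            # A qualifying match in either direction makes the pair related.
            related[query].add(subject)
            related[subject].add(query)
    return related


def jaccard(related, p1, p2):
    """Step 3: |related to both| / |related to either|, a value in [0, 1]."""
    both = related[p1] & related[p2]
    either = related[p1] | related[p2]
    return len(both) / len(either)


def jaccard_clusters(polypeptides, matches, threshold=0.6):
    """Steps 4-5: build the thresholded graph and return its connected components."""
    related = related_sets(polypeptides, matches)
    graph = {p: set() for p in polypeptides}
    for p1, p2 in combinations(polypeptides, 2):
        if jaccard(related, p1, p2) >= threshold:
            graph[p1].add(p2)
            graph[p2].add(p1)

    clusters, seen = [], set()
    for start in polypeptides:
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


if __name__ == "__main__":
    # Made-up identifiers and scores, for illustration only.
    peptides = ["A", "B", "C", "D"]
    hits = [
        ("A", "B", 95.0, 1e-20),
        ("B", "C", 90.0, 1e-10),
        ("C", "D", 70.0, 1e-30),  # rejected: below 80% identity
    ]
    print([sorted(c) for c in jaccard_clusters(peptides, hits)])
```

In this toy input, A and C have no direct qualifying match and a Jaccard coefficient of only 1/3, yet they fall into the same cluster because each is linked to B with a coefficient of 2/3. This is a consequence of taking connected components rather than requiring every pair in a cluster to pass the threshold.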