Modeling Transcription Factor Target Promoters 137

class representation in that group. The resulting model is a highly interpretable decision tree, which helps design further experiments. Some of the principal limitations of CART are low accuracy (because of the use of piece-wise constant approximations) and high variance or instability. In particular, when the number of variables (TFBSs) is much larger than the number of observations (promoters), CART would fail to give a robust classification model (see Note 3).

In order to limit the number of variables for CART analysis, one can use the Random Forest program (65) to preselect the most discriminative variables from a large number of input variables. Random Forest is an ensemble of many decision trees, such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. To classify a new object from an input vector, the algorithm applies the input vector to each tree of the forest. Each tree is a separate classification model, and each tree "votes" for a class. The forest then chooses the classification having the most votes over all of the trees in the forest. The forest error rate depends on the correlation between any two trees in the forest (increasing the correlation increases the forest error rate) and the strength of each individual tree in the forest (a tree with a low error rate is a strong classifier, and increasing the strength of the individual trees decreases the forest error rate).

Random Forest can handle thousands of input variables without variable selection and gives estimates of which variables are important in the classification. Although Random Forest is a robust classifier, the black-box nature of the algorithm makes it impracticable to infer the decision rules from thousands of trees. In the present case, it is critical to understand the interaction of variables (TFs) that provides the predictive accuracy.
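The majority-vote scheme described above can be sketched in a few lines of Python. The toy "forest" below consists of three hand-written decision stumps over binary TFBS indicators; it is purely illustrative and is not the Random Forest implementation cited in the text.

```python
from collections import Counter

# Hypothetical toy trees: each maps a feature vector of binary TFBS
# indicators (1 = site present, 0 = absent) to a class label.
def tree_a(x): return "target" if x[0] == 1 else "non-target"
def tree_b(x): return "target" if x[1] == 1 else "non-target"
def tree_c(x): return "target" if x[0] == 1 and x[2] == 1 else "non-target"

def forest_predict(trees, x):
    """Apply the input vector to every tree and return the majority vote."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

forest = [tree_a, tree_b, tree_c]
print(forest_predict(forest, [1, 0, 1]))  # two of three trees vote "target"
```

In a real forest each tree is grown on a bootstrap sample with random feature subsets, but the final prediction step is exactly this vote over all trees.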
Hence, the use of Random Forest for variable selection followed by application of the CART algorithm is recommended. The commercially available CART program (66) is perhaps the best and most user-friendly, and the authors have used it in their earlier studies (3,24). If the commercial program is not available, the user may use rpart, a free implementation of CART in the R statistical package. Similarly, the freely available implementation of Random Forest in R can be used for variable selection. The author suggests the "Gini" method as the splitting method for growing the tree and 10-fold cross-validation to obtain the optimal minimal tree. TFBSs predicted by MATCH and conserved in the human and mouse orthologous promoters can be used as predictor variables, wherein each binding site may be considered a binary variable, such that it is either 1 or 0, depending on its presence or absence within a specified region.

3.2. Worked Example

Various methods to predict TFBSs in a given promoter, as well as decision tree classification methods, were discussed in the previous sections. Now will be
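To make the recommended setup concrete, the Gini splitting criterion applied to binary presence/absence variables can be sketched as follows. This is a minimal stand-alone sketch with made-up promoter data, not the rpart or CART implementation; tree growing, pruning, and 10-fold cross-validation are omitted.

```python
def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(X, y):
    """Return the index of the binary variable whose 0/1 split
    minimizes the weighted Gini impurity of the two child nodes."""
    n = len(y)
    return min(
        range(len(X[0])),
        key=lambda j: sum(
            (len(side) / n) * gini(side)
            for side in (
                [y[i] for i in range(n) if X[i][j] == 0],
                [y[i] for i in range(n) if X[i][j] == 1],
            )
        ),
    )

# Hypothetical data. Rows: promoters; columns: presence (1) or
# absence (0) of each conserved TFBS within the specified region.
X = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
y = ["target", "target", "non-target", "non-target"]
print(best_split(X, y))  # variable 0 separates the two classes perfectly
```

CART grows the tree by applying this choice recursively to each child node, which is what makes the final model readable as a set of TFBS presence/absence rules.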
