12.07.2015 Views

View - ResearchGate


Estimating Gene Function With LS-NMF 45

variability within a pattern. Herein the process using a spreadsheet program is described, as these are widely available.

1. Output the A (distribution) matrix for the best simulation from the PattRun program by pressing the Export button once the results are viewed. Provide a name for the output file in the Save dialog. Check only the box for the Distribution and then press the Save button.

2. Open the output file with a spreadsheet program. The top lines of the file will look like

   cdc25-sep1
   60147.00
   Wed Sep 13 13:42:57 2006
   Distribution
   SPAC222.09   10.03   3.53   5.75    0      1.28   0
   SPAC977.10   10.79   0      0       10.98  0      0.12
   SPAC821.06   5.59    0.02   5.10    0.31   6.88   2.09
   SPAC821.09   0       0      14.49   0      0      7.31

   with a header providing the name of the data set, the χ² value, and the date of the LS-NMF analysis. After the header, each row provides the gene name and the strength of assignment of that gene to each of the six patterns.

3. Choose the pattern of interest; herein again focus is on pattern 3 (column D in the spreadsheet). Calculate the mean and standard deviation of the column by replacing cell D3 with “=average(d6:dN)”, where N is the last row with data, and replacing cell D4 with “=stdev(d6:dN)”. Using copy and paste, these can be calculated for all patterns, if one wishes.

4. In an empty column, move to the sixth cell, enter “=(d6-$d$3)/$d$4” (the value minus the mean in D3, divided by the standard deviation in D4), and press Return. Then copy this cell and fill down. These are the Z-scores for the genes.

5. Copy the first column (gene names) and the Z-scores, so they are side-by-side. If pasting into a new spreadsheet or page, choose to paste values. Sort the columns by the Z-score.

6. Again, one must choose a cutoff to produce a gene list; however, in general, the larger the magnitude of the Z-score, the more strongly a gene is associated with a pattern. This allows one to compare the strength of association of a gene across different patterns. Comparison with the list from ClutrFree will show that the genes are in the same order, but the values have changed.
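For readers who prefer a script to a spreadsheet, the Z-score steps above can be sketched in Python. This is only an illustration under stated assumptions: the exported file is taken to be whitespace-delimited, the gene rows are taken to start right after the line reading "Distribution", and the function names `read_distribution` and `pattern_z_scores` are hypothetical helpers, not part of PattRun or LS-NMF. The sample text reuses the four gene rows shown in step 2.

```python
import statistics

# Excerpt of an exported file, reusing the gene rows from step 2.
# Assumption: fields are separated by whitespace (tabs or spaces).
SAMPLE = """Wed Sep 13 13:42:57 2006
Distribution
SPAC222.09 10.03 3.53 5.75 0 1.28 0
SPAC977.10 10.79 0 0 10.98 0 0.12
SPAC821.06 5.59 0.02 5.10 0.31 6.88 2.09
SPAC821.09 0 0 14.49 0 0 7.31
"""


def read_distribution(text):
    """Return {gene: [strength per pattern]} from the rows after 'Distribution'."""
    rows = {}
    in_data = False
    for line in text.splitlines():
        if not in_data:
            # Skip header lines until the 'Distribution' marker is seen.
            in_data = line.strip() == "Distribution"
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # ignore blank or malformed lines
        rows[fields[0]] = [float(x) for x in fields[1:]]
    if not rows:
        raise ValueError("no gene rows found after a 'Distribution' line")
    return rows


def pattern_z_scores(rows, pattern):
    """Z-score each gene within one pattern (1-based), sorted high to low."""
    values = {gene: strengths[pattern - 1] for gene, strengths in rows.items()}
    mean = statistics.mean(values.values())
    # Sample standard deviation (n - 1), the same estimator as spreadsheet STDEV.
    sd = statistics.stdev(values.values())
    scored = [(gene, (v - mean) / sd) for gene, v in values.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


rows = read_distribution(SAMPLE)
ranked = pattern_z_scores(rows, pattern=3)  # pattern 3 is column D in the spreadsheet
for gene, z in ranked:
    print(f"{gene}\t{z:.3f}")
```

Sorting in descending order is a choice made here so that the genes most strongly assigned to the pattern come first; the cutoff discussed in step 6 would then be applied to this ranked list.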
4. Notes

1. In addition, two command-line C++ versions, one for a single workstation (Desktop LS-NMF) and one for a Beowulf cluster (LAM/MPI LS-NMF), are available for advanced users. Both versions are written in C++ and should be compiled with a standard C++ compiler and mpiCC. The packages are downloadable as tarballs, and a README file is included with all necessary steps for installation. Other than the LS-NMF code itself, two Perl scripts are included under the API subdirectory for
