
View - ResearchGate


3

Estimating Gene Function With Least Squares Nonnegative Matrix Factorization

Guoli Wang and Michael F. Ochs

Summary

Nonnegative matrix factorization is a machine learning algorithm that has extracted information from data in a number of fields, including imaging and spectral analysis, text mining, and microarray data analysis. One limitation with the method for linking genes through microarray data in order to estimate gene function is the high variance observed in transcription levels between different genes. Least squares nonnegative matrix factorization uses estimates of the uncertainties on the mRNA levels for each gene in each condition, to guide the algorithm to a local minimum in normalized χ², rather than a Euclidean distance or divergence between the reconstructed data and the data itself. Herein, application of this method to microarray data is demonstrated in order to predict gene function.

Key Words: Clustering; least squares; microarray data analysis; nonnegative matrix factorization (NMF); pattern recognition; machine learning.

1. Introduction

Nonnegative matrix factorization (NMF) was introduced by Lee and Seung for image decomposition (1). Because of benefits in both interpretation and implementation, NMF was soon adopted in other research, including text mining (2), spectral decomposition (3), multiple sequence alignment (4), and neurophysiology (5). The application of NMF to microarray data analysis showed that it could be superior to clustering techniques for prediction of gene function (6,7). One issue that has limited application of NMF in many areas is that the patterns found within the data are diffuse, leading to attempts to limit the distributions through sparse matrix methods (e.g., see ref. 8). In addition, because measurements on mRNA levels of different genes show large differences in variance, a method that utilizes variance estimates was recently introduced to improve predictions of gene

From: Methods in Molecular Biology, vol. 408: Gene Function Analysis
Edited by: M. Ochs © Humana Press Inc., Totowa, NJ
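
The idea stated in the Summary, using per-measurement uncertainty estimates to steer the factorization toward a local minimum in χ² rather than in plain Euclidean distance, can be illustrated with a minimal Python sketch. This excerpt does not give the update rules, so the code below assumes the commonly used multiplicative updates for a variance-weighted least squares objective. The names `chi2` and `weighted_nmf`, the parameters `rank`, `n_iter`, `seed`, and `eps`, and the random initialization are all hypothetical choices for illustration, not the authors' implementation. The χ² here is unnormalized; how it is normalized is not stated in this excerpt.

```python
import numpy as np


def chi2(V, sigma, W, H):
    """Chi-squared misfit: sum over entries of ((V - WH) / sigma)^2."""
    return float(np.sum(((V - W @ H) / sigma) ** 2))


def weighted_nmf(V, sigma, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor nonnegative V (genes x conditions) as W @ H, weighting each
    entry by 1 / sigma^2 (assumed multiplicative updates, see lead-in)."""
    rng = np.random.default_rng(seed)
    n_genes, n_conditions = V.shape
    # Random strictly positive start; multiplicative updates keep signs.
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_conditions)) + eps
    inv_var = 1.0 / sigma ** 2      # per-entry weights
    V_weighted = V * inv_var
    history = []
    for _ in range(n_iter):
        # Update patterns H with W fixed, then amplitudes W with H fixed.
        H *= (W.T @ V_weighted) / (W.T @ ((W @ H) * inv_var) + eps)
        W *= (V_weighted @ H.T) / (((W @ H) * inv_var) @ H.T + eps)
        history.append(chi2(V, sigma, W, H))
    return W, H, history


if __name__ == "__main__":
    # Synthetic data: 20 "genes", 8 "conditions", 3 underlying patterns,
    # with a different (made-up) noise level for each gene.
    rng = np.random.default_rng(1)
    W_true = rng.random((20, 3))
    H_true = rng.random((3, 8))
    sigma = (0.05 + 0.1 * rng.random((20, 1))) * np.ones((20, 8))
    noisy = W_true @ H_true + sigma * rng.standard_normal((20, 8))
    V = np.clip(noisy, 0.0, None)
    W, H, history = weighted_nmf(V, sigma, rank=3)
    print("chi2 after first and last iteration:", history[0], history[-1])
```

Because each entry is divided by its own variance, genes with noisy measurements pull less on the fit than genes measured precisely, which is the motivation the text gives for moving away from an unweighted Euclidean distance.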
