Using Lomb-Scargle Periodograms to Identify Periodic Genes in ...

Using Lomb-Scargle Periodograms to Identify Periodic Genes in Somitogenesis
Stowers Institute for Medical Research
Earl F. Glynn, Scientific Programmer
29 March 2006


Using Lomb-Scargle Periodograms to Identify Periodic Genes in Somitogenesis
• Periodic Patterns in Biology
• Simple Periodic Gene Expression Model
• Introduction to Lomb-Scargle Periodogram
• Data Pipeline
• Methodology Validation Study: Bozdech’s Plasmodium dataset
• Application to Mary-Lee’s somitogenesis dataset
• Conclusions


Periodic Patterns in Biology
A vertebrate’s body plan: a segmented pattern.
Segmentation is established during somitogenesis.
[Photograph taken at Reptile Gardens, Rapid City, SD, www.reptile-gardens.com]


Introduction to Lomb-Scargle Periodogram
• What is a Periodogram?
• Why Lomb-Scargle Instead of Fourier?
• Example Using Cosine Expression Model
• Mathematical Details
• Lomb-Scargle Experiments
  - Single Dominant Frequency
  - Multiple Frequencies
  - Mixtures: Signal and Noise
  - Multiple Hypothesis Testing


What is a Periodogram?
• A graph showing frequency “power” for a spectrum of frequencies
• “Peak” in periodogram indicates a frequency with significant periodicity
[Figure: a periodic signal (log2 expression vs. time) is turned by computation into a periodogram (spectral “power” vs. frequency).]


Why Lomb-Scargle Instead of Fourier?
Lomb-Scargle Method | Fourier Method
Weights data points | Weights frequency intervals
Data can be unevenly sampled | Requires uniform spacing
No data imputation | Missing data imputed
Any number of data points | 2^N points for FFT; zero padding
Known statistical properties | Permutation tests needed to assess statistical properties
“p” value | Ad hoc scoring rules
Need estimate of number of “independent frequencies” but explore using continuum | Usually only look at “independent” Fourier frequencies


Lomb-Scargle Periodogram: Example Using Cosine Expression Model
[Figure: cosine curve, N = 48 evenly spaced time points (expression vs. time in hours); Lomb-Scargle periodogram (normalized power spectral density vs. frequency in 1/hour, with significance levels from p = 0.05 down to p = 1e-06); peak significance (probability vs. frequency in 1/hour).]
Period at peak = 48 hours (T = 1/f); p = 3.3e-009 at peak.
A small value for the false-alarm probability indicates a highly significant periodic signal.


Lomb-Scargle Periodogram: Example Using Noisy Cosine Expression Model
[Figure: cosine curve + noise, N = 48 unevenly spaced time points (expression vs. time in hours); time interval variability (frequency vs. log10(delta T)); Lomb-Scargle periodogram (normalized power spectral density vs. frequency in 1/hour); peak significance (probability vs. frequency in 1/hour).]
Period at peak = 45.7 hours; p = 2.54e-007 at peak.


Lomb-Scargle Periodogram: Example Using Noise
[Figure: noise, N = 48 (expression vs. time in hours); time interval variability (frequency vs. log10(delta T)); Lomb-Scargle periodogram (normalized power spectral density vs. frequency in 1/hour); peak significance (probability vs. frequency in 1/hour).]
Period at peak = 7.4 hours; p = 0.973 at peak.


Lomb-Scargle Periodogram Mathematical Details
P_N(ω) has an exponential probability distribution with unit mean.
Source: Numerical Recipes in C (2nd Ed), p. 577
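For orientation, the normalized periodogram as it is usually stated in the literature can be sketched as follows. The notation is an assumption made here, not taken from the slide: h_j are the N expression values measured at times t_j, τ is a frequency-dependent time offset, and M is the number of independent frequencies. The exponential distribution with unit mean applies under the null hypothesis of independent Gaussian noise, which gives the false-alarm probability in the last line.

```latex
\[
\bar{h} = \frac{1}{N}\sum_{j=1}^{N} h_j ,
\qquad
\sigma^2 = \frac{1}{N-1}\sum_{j=1}^{N} \left(h_j - \bar{h}\right)^2
\]
\[
P_N(\omega) = \frac{1}{2\sigma^2}
\left\{
\frac{\left[\sum_{j} (h_j - \bar{h}) \cos \omega (t_j - \tau)\right]^2}
     {\sum_{j} \cos^2 \omega (t_j - \tau)}
+
\frac{\left[\sum_{j} (h_j - \bar{h}) \sin \omega (t_j - \tau)\right]^2}
     {\sum_{j} \sin^2 \omega (t_j - \tau)}
\right\}
\]
\[
\tan(2\omega\tau) = \frac{\sum_{j} \sin 2\omega t_j}{\sum_{j} \cos 2\omega t_j}
\]
\[
P(> z) = 1 - \left(1 - e^{-z}\right)^{M}
\]
```

Here z is the height of the periodogram peak, so a tall peak gives a small false-alarm probability.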


Lomb-Scargle Periodogram Experiment: Single Dominant Frequency
Expression = Cosine(2πt/24)
[Figure: cosine curve, N = 48 (expression vs. time in hours); Lomb-Scargle periodogram (normalized power spectral density vs. frequency in 1/hour); peak significance (probability vs. frequency in 1/hour).]
Period at peak = 24 hours; p = 3.3e-009 at peak.
Single “peak” in periodogram. Single “valley” in significance curve.
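The single-frequency experiment can be approximated with a minimal pure-Python sketch of the normalized periodogram. This is not the code behind the slides. The frequency grid, the function names, and the use of the Horne and Baliunas (1986) empirical estimate for the number of independent frequencies are all assumptions made for illustration.

```python
import math


def lomb_scargle(times, values, freqs):
    """Normalized Lomb-Scargle power at each frequency (cycles per time unit)."""
    n = len(times)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    powers = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # The offset tau makes the power independent of the time origin.
        s2 = sum(math.sin(2.0 * w * t) for t in times)
        c2 = sum(math.cos(2.0 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * w)
        ch = sh = cc = ss = 0.0
        for t, v in zip(times, values):
            c = math.cos(w * (t - tau))
            s = math.sin(w * (t - tau))
            ch += (v - mean) * c
            sh += (v - mean) * s
            cc += c * c
            ss += s * s
        powers.append((ch * ch / cc + sh * sh / ss) / (2.0 * var))
    return powers


def false_alarm_probability(peak_power, n_points):
    """p = 1 - (1 - exp(-z))^M for a peak of height z."""
    # Assumption: M from the Horne & Baliunas (1986) empirical formula.
    m = -6.362 + 1.193 * n_points + 0.00098 * n_points ** 2
    return 1.0 - (1.0 - math.exp(-peak_power)) ** m


# 48 hourly samples of Cosine(2*pi*t/24), as in the experiment above.
times = [float(t) for t in range(48)]
values = [math.cos(2.0 * math.pi * t / 24.0) for t in times]
# Hypothetical grid: 1/192 to 38/192 per hour, roughly the 0 to 0.2 plot range.
freqs = [k / 192.0 for k in range(1, 39)]

powers = lomb_scargle(times, values, freqs)
peak_index = max(range(len(freqs)), key=lambda i: powers[i])
peak_period = 1.0 / freqs[peak_index]
p_value = false_alarm_probability(powers[peak_index], len(times))
print(f"period at peak = {peak_period:.1f} hours, p = {p_value:.3g}")
```

With these assumptions the peak falls at a 24-hour period and the false-alarm probability comes out close to the 3.3e-009 reported above, which suggests a similar estimate of M, though the slides do not state which one was used.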


Lomb-Scargle Periodogram Experiment: Multiple Frequencies
Expression = Cosine(2πt/48) + Cosine(2πt/24) + Cosine(2πt/8)
[Figure: sum of 3 cosines, N = 48 (expression vs. time in hours); Lomb-Scargle periodogram (normalized power spectral density vs. frequency in 1/hour); peak significance (probability vs. frequency in 1/hour).]
Period at peak = 21.8 hours; p = 0.00246 at peak.
Multiple peaks in periodogram. Corresponding valleys in significance curve.


Lomb-Scargle Periodogram Experiment: Multiple Frequencies
Expression = 3*Cosine(2πt/48) + Cosine(2πt/24) + Cosine(2πt/8)
[Figure: sum of 3 cosines, N = 48 (expression vs. time in hours); Lomb-Scargle periodogram (normalized power spectral density vs. frequency in 1/hour); peak significance (probability vs. frequency in 1/hour).]
Period at peak = 48 hours; p = 2.37e-007 at peak.
“Weaker” periodicities cannot always be resolved statistically.


Lomb-Scargle Periodogram Experiment: Multiple Frequencies: “Duty Cycle”
[Figure, one column per duty cycle: expression vs. time in hours (N = 48); Lomb-Scargle periodogram (normalized power spectral density vs. frequency in 1/hour, 0.0 to 0.5); peak significance (probability vs. frequency in 1/hour).]
• Duty cycle 1/2 (50%): period at peak = 24 hours; p = 2.54e-007 at peak
• Duty cycle 2/3 (66.6%, e.g., human sleep cycle): period at peak = 24 hours; p = 5.06e-006 at peak
One peak with symmetric “duty cycle”. Multiple peaks with asymmetric cycle.


Lomb-Scargle Experiment: Mixtures: Periodic Signal Vs. Noise
[Figure: three log10(p) histograms for 5000 simulated expression profiles (N = 48), where p corresponds to the maximum periodogram power spectral density.]
• 100% periodic genes
• 50% periodic, 50% noise
• 100% noise


Lomb-Scargle Experiment: Mixtures: Periodic Signal Vs. Noise
Multiple-Hypothesis Testing
[Figure: multiple testing correction methods, log10(p) vs. rank order of sorted p values, for 50% periodic and 50% noise simulated genes.]
Methods, ordered from more false negatives to more false positives:
• Bonferroni
• Holm
• Hochberg
• Benjamini & Hochberg FDR
• None
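The four corrections compared in the figure can be sketched in a few lines of pure Python. The function names are illustrative, and the arithmetic follows the usual step-down (Holm) and step-up (Hochberg, Benjamini and Hochberg) definitions rather than any code from the slides.

```python
def bonferroni(pvals):
    """Multiply every p value by the number of tests."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def holm(pvals):
    """Step-down: the smallest p is multiplied by m, the next by m - 1, ..."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running = 0.0
    for rank, i in enumerate(order):
        running = max(running, (m - rank) * pvals[i])
        adjusted[i] = min(1.0, running)
    return adjusted


def hochberg(pvals):
    """Step-up: the largest p is multiplied by 1, the next by 2, ..."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running = 1.0
    for rank, i in enumerate(order):
        running = min(running, (rank + 1) * pvals[i])
        adjusted[i] = running
    return adjusted


def benjamini_hochberg(pvals):
    """FDR step-up: the k-th smallest p is multiplied by m / k."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running = 1.0
    for rank, i in enumerate(order):
        running = min(running, m / (m - rank) * pvals[i])
        adjusted[i] = running
    return adjusted


raw = [0.01, 0.02, 0.03, 0.04, 0.05]  # hypothetical p values
for name, fn in [("bonferroni", bonferroni), ("holm", holm),
                 ("hochberg", hochberg), ("fdr", benjamini_hochberg)]:
    print(name, [round(p, 4) for p in fn(raw)])
```

For any input the adjusted values satisfy Bonferroni ≥ Holm ≥ Hochberg ≥ FDR ≥ raw p, which is the ordering from more false negatives to more false positives shown in the figure.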


Data Pipeline to Apply to Microarray Dataset
1. Apply quality control checks to data
2. Apply Lomb-Scargle algorithm to all expression profiles
3. Apply multiple hypothesis testing to define “significant” periodic genes
4. Analyze biological significance of periodic genes


Methodology Validation Study: Bozdech’s Plasmodium dataset
[Figure from Bozdech, et al, Fig. 1A, PLoS Biology, Vol 1, No 1, Oct 2003, p 3: Intraerythrocytic Developmental Cycle of Plasmodium falciparum]
Critical Assessment of Microarray Data Analysis Conference (CAMDA), Nov 2004
http://research.stowers-institute.org/efg/2005/LombScargle/


Bozdech’s Plasmodium dataset: 1. Apply Quality Control Checks
Global views of experiment.
Remove certain outliers.


Bozdech’s Plasmodium dataset: 1. Apply Quality Control Checks
Many missing data points require imputation for Fourier analysis.


Bozdech’s Plasmodium dataset: 2. Apply Lomb-Scargle Algorithm
Periodic Expression Patterns
[Figure, one column per probe: expression profile with phase; time interval variability; Lomb-Scargle periodogram; peak significance.]
• opfi17638: N = 46; period at peak = 45.7 hours; p = 1.19e-008 at peak
• i3518_1: N = 46; period at peak = 45.7 hours; p = 1.48e-008 at peak
Examples of highly significant periodic expression profiles.


Bozdech’s Plasmodium dataset: 2. Apply Lomb-Scargle Algorithm
Aperiodic/Noise Expression Patterns
[Figure, one column per probe: expression profile; time interval variability; Lomb-Scargle periodogram; peak significance.]
• j167_5: N = 35; period at peak = 17.8 hours; p = 0.998 at peak
• f35105_2: N = 45; period at peak = 32 hours; p = 0.516 at peak


Bozdech’s Plasmodium dataset: 2. Apply Lomb-Scargle Algorithm
Small “N”
[Figure, one column per probe: expression profile; time interval variability; Lomb-Scargle periodogram; peak significance.]
• f58149_1: N = 39; period at peak = 48 hours; p = 8.54e-006 at peak
• n170_1: N = 32; period at peak = 64 hours; p = 2.74e-005 at peak


Bozdech’s Plasmodium dataset: 2. Apply Lomb-Scargle Algorithm
Signal and Noise Mixture
[Figure: 'p' histogram (number of probes vs. log10(p)) for the complete Bozdech set of 6875 probes, annotated “Periodic Probes”, “Aperiodic Probes or Noise”, and “Missing Noise Spike?”; shown next to the log10(p) histogram for 5000 simulated expression profiles (N = 48) with 0% simulated periodic genes. Source file: histogram-log10p.pdf, 2004-11-06 10:26]


Bozdech’s Plasmodium dataset: 3. Apply Multiple-Hypothesis Testing
Multiple Testing Correction Methods (Using R's p.adjust methods)
[Figure: log10(p) vs. rank order of sorted p values (0 to 7000), with significance level α = 1E-4. Source file: p-adjust.pdf, 2004-11-06 10:12]
Methods, ordered from more false negatives to more false positives:
• Bonferroni
• Holm
• Hochberg
• Benjamini & Hochberg FDR
• None


Bozdech’s Plasmodium dataset: 4. Analyze Biological Significance
[Venn diagram]
• Bozdech Complete Set: 6875 probes
• Small "N": 1795
• Quality Control Set: 5080
• Overview Set
• Lomb-Scargle Periodic: 4355
• Bozdech Periodic: 3719
• Region counts: 243, 501, 3611, 108


Bozdech’s Plasmodium dataset: 4. Analyze Biological Significance
“Phaseograms”
[Figure, two panels: probes ordered by phase vs. time.]
• Lomb-Scargle Results: 4355 probes
• Bozdech “Overview” Dataset: 2714 genes, 3395 probes


Bozdech’s Plasmodium dataset: Bozdech’s Ad Hoc Scoring
[Figure: powerMAX vs. Power at Peak Frequency; label: Periodic Genes]


Bozdech’s Plasmodium dataset: Bozdech’s Ad Hoc Scoring Vs. Lomb-Scargle p values


Bozdech’s Plasmodium dataset: Bozdech’s “Phase” Vs. Peak of Smoothed Time Series


Mary-Lee’s Somitogenesis Dataset
We wanted to validate the Lomb-Scargle method with Bozdech’s dataset before applying it to our somitogenesis problem, since the Fourier technique could not be used.
Scargle (1982): “surprising result is that the … spectrum of a process can be estimated … [with] only the order of the samples …”


Somitogenesis Dataset
[Figure: cosine curve, N = 10 (expression vs. time in minutes, 0 to 100); time interval variability (frequency vs. log10(delta T)); Lomb-Scargle periodogram (normalized power spectral density vs. frequency in 1/minute); peak significance (probability vs. frequency in 1/minute).]
Period at peak = 119 minutes; p = 0.1058.
N     p
10    0.1
17    0.006
20    0.002
48    3E-9


Somitogenesis Dataset
[Figures: Mary-Lee, Science Club, 2004]


Somitogenesis Dataset
Do we care about time order, or only periodic genes?
22,690 Affy Probesets


Somitogenesis Dataset
Do we care about time order, or only periodic genes?
Reorder
Plan: Use only known periodic genes
[Source file: TimeOrderingExperiment1.doc, Nov 2004]


Somitogenesis Dataset
Perturbation (“Jitter”) and Permutation Tests
[Source file: Order Nveau.ppt]
Perturbation (times):
27.75443 33.08618 42.6607 11.04462 -0.6288767 18.55415 12.93994 44.90501 59.98529 67.2538
46.87758 55.91625 74.72591 78.9921 83.98062 104.9910 78.12926 77.38472 111.5297 109.2331
Permutation (order for fixed times):
7 5 6 8 4 3 2 1 | 11 12 9 10 | 13 | 17 15 16 18 14 | 20 19
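The two resampling schemes can be illustrated with a short sketch. The block sizes (8, 4, 1, 5, 2) are read off the permutation shown above. The nominal sample times, the Gaussian jitter, and its standard deviation are hypothetical, since the slide does not state how the perturbed times were generated.

```python
import random


def jitter_times(times, sd, rng):
    """Perturbation: add independent noise to each sample time.

    Assumption: Gaussian noise with standard deviation `sd`.
    """
    return [t + rng.gauss(0.0, sd) for t in times]


def permute_within_blocks(block_sizes, rng):
    """Permutation: shuffle the sample order inside each block only.

    Returns a 1-based order; samples never leave their own block.
    """
    order = []
    start = 1
    for size in block_sizes:
        block = list(range(start, start + size))
        rng.shuffle(block)
        order.extend(block)
        start += size
    return order


rng = random.Random(2005)                  # fixed seed for repeatability
nominal = [6.0 * i for i in range(20)]     # hypothetical times in minutes
print(jitter_times(nominal, 5.0, rng))     # hypothetical 5-minute jitter
print(permute_within_blocks([8, 4, 1, 5, 2], rng))
```

Either resampled ordering can then be scored with the periodogram in the same way as the original ordering.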


Somitogenesis Dataset
Perturbation (“Jitter”) and Permutation Tests
[Figures: rank product histograms, each axis running from Best to Worst. Source files: JitterSpreadHistograms.pdf, Feb 2005; PermuteSpreadHistograms.pdf]


Somitogenesis Dataset
Perturbation (“Jitter”) and Permutation Tests
[Source file: PermutePerturbComparison.xls, Jan 2005]


Somitogenesis Dataset
Perturbation (“Jitter”) and Permutation Tests
[Figure: Top 50: Somites Vs Random; Top 50 Score (0 to 10000) vs. Rank (1 to 97), with one curve for Somites and one for Random. Source file: StatComparison1000.xls: Top50, Jan 2005]


Somitogenesis Dataset
Perturbation (“Jitter”) and Permutation Tests
[Figure: Somites Vs Random; LogRankSum (0 to 35000) vs. Rank (1 to 97), with one curve for Random and one for Somites. Source file: StatComparison1000.xls: TopLogRankSum]


Somitogenesis Dataset
How many periodic genes are in the dataset?
[Figure annotation: Approx Noise Cutoff]


Somitogenesis Dataset
What periodicities are present?
Periodogram Clustering
[Figure: hierarchical clustering of periodograms over frequencies F1 to F37, with 119 min and 59 min marked. Source file: PeriodogramClustering.doc, Aug 2004]


Somitogenesis Dataset
Unconstrained Permutations to Estimate False Hit Rate
(10,000 Permutations, 17-point time series, 7544 Affy probes)
Use p-value cutoffs from “baseline” with permutations
About 35-40 cyclic genes detected?
[Source file: FinalStatsPeriodRange.xls, March 2006]
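A simplified, single-profile version of an unconstrained permutation test might look like the sketch below. It is not the analysis behind the slide: the sample times, the frequency grid, the number of permutations, and the choice of peak periodogram power as the test statistic are assumptions made for illustration.

```python
import math
import random


def peak_power(times, values, freqs):
    """Largest normalized Lomb-Scargle power over the frequency grid."""
    n = len(times)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    best = 0.0
    for f in freqs:
        w = 2.0 * math.pi * f
        tau = math.atan2(sum(math.sin(2.0 * w * t) for t in times),
                         sum(math.cos(2.0 * w * t) for t in times)) / (2.0 * w)
        ch = sh = cc = ss = 0.0
        for t, v in zip(times, values):
            c = math.cos(w * (t - tau))
            s = math.sin(w * (t - tau))
            ch += (v - mean) * c
            sh += (v - mean) * s
            cc += c * c
            ss += s * s
        best = max(best, (ch * ch / cc + sh * sh / ss) / (2.0 * var))
    return best


def permutation_p_value(times, values, freqs, n_perm, rng):
    """Fraction of shuffled profiles whose peak power reaches the observed one."""
    observed = peak_power(times, values, freqs)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # unconstrained: any time order is allowed
        if peak_power(times, shuffled, freqs) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical 17-point, unevenly sampled profile with a 120-minute period.
times = [0, 5, 12, 20, 26, 33, 41, 50, 57, 63, 72, 80, 88, 95, 103, 110, 119]
values = [math.cos(2.0 * math.pi * t / 120.0) for t in times]
freqs = [k / 480.0 for k in range(1, 25)]  # hypothetical grid, 1/minute

rng = random.Random(2006)
print(permutation_p_value(times, values, freqs, 200, rng))
```

Repeating this over all probes and counting how many shuffled profiles pass a given cutoff gives an estimate of the false hit rate at that cutoff. The demo uses 200 permutations to keep the run short, while the slide reports 10,000.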


Conclusions
• The Lomb-Scargle periodogram is an effective tool to identify periodic gene expression profiles
• Results are comparable with Fourier analysis
• Lomb-Scargle can help when data are missing or not evenly spaced


Conclusions
• Conclusions should not be drawn using the individual p-value calculated for each profile. A multiple comparison procedure, such as the False Discovery Rate (FDR), must be used to control the error rate.
• Expression profiles may be more complex than simple cosine curves
• Power spectra of non-sinusoid rhythms may be difficult to interpret


Acknowledgements
Pourquie Lab
• Olivier Pourquié
• Mary-Lee Dequéant
Bioinformatics
• Arcady Mushegian
• Galina Glasko
• Jie Chen
