CHAPTER 5: Estimating of the Jet Energy Scale Calibration Factor

Figure 5.1: The number of ISR jets with $p_T$ > 30 GeV per selected signal event. (Histogram of the number of events versus the number of ISR jets; 28988 entries.)

Figure 5.2: The number of ISR jets that can be matched to the four leading jets per selected signal event. (Histogram of the number of events versus the number of ISR jets; 28988 entries.)

leading jets cannot be matched to the four quarks in the final state of the $t\bar{t}$ system. Another reason why the partons produced in the hard-scattering $t\bar{t}$ process may not match the four leading reconstructed jets is the acceptance requirement. Since reconstructed jets have already passed the cuts on η and are not allowed to lie in the forward region, events in which at least one parton is produced outside the acceptance region have no matched jet for that parton. Final State Radiation (FSR) can also spoil the matching of partons to reconstructed jets.
Since FSR can split the initial parton and thereby yield two separate jets, the jets might be reconstructed far from the direction of the original parton and cannot be found by the matching algorithm.

5.1.2 Likelihood Ratio Method

The normalized distributions of the various discriminating variables, which serve as the Probability Density Functions (PDFs) of these variables, are used to define the signal and background likelihoods, $L_S$ and $L_B$ respectively. A likelihood function is defined as the product of the PDFs $P_k(x_k)$ of all $n_{\mathrm{var}}$ input variables $x_k$, and can be expressed for signal and background as

\[ L_S = \prod_{k=1}^{n_{\mathrm{var}}} P_{S,k}(x_k) \]

and

\[ L_B = \prod_{k=1}^{n_{\mathrm{var}}} P_{B,k}(x_k), \]

respectively. For each of the twelve possible jet combinations, the likelihood ratio $y_L$ is defined as the signal likelihood divided by the sum of the signal and background likelihoods,

\[ y_L = \frac{L_S}{L_S + L_B}. \]
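The definitions of $L_S$, $L_B$ and $y_L$ can be illustrated with a minimal Python sketch. It is not the implementation used in this analysis: the Gaussian PDFs, the function names (`gaussian_pdf`, `likelihood`, `likelihood_ratio`) and the choice to return 0.5 when both likelihoods vanish are assumptions made purely for illustration.

```python
import math
from typing import Callable, Sequence

Pdf = Callable[[float], float]


def gaussian_pdf(mean: float, sigma: float) -> Pdf:
    """Return a normalized Gaussian density (hypothetical stand-in for a real PDF)."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))

    def pdf(x: float) -> float:
        return norm * math.exp(-0.5 * ((x - mean) / sigma) ** 2)

    return pdf


def likelihood(x: Sequence[float], pdfs: Sequence[Pdf]) -> float:
    """Product of the per-variable PDFs P_k(x_k) over all n_var input variables."""
    if len(x) != len(pdfs):
        raise ValueError("need exactly one PDF per input variable")
    return math.prod(p(xk) for p, xk in zip(pdfs, x))


def likelihood_ratio(
    x: Sequence[float],
    signal_pdfs: Sequence[Pdf],
    background_pdfs: Sequence[Pdf],
) -> float:
    """y_L = L_S / (L_S + L_B)."""
    l_s = likelihood(x, signal_pdfs)
    l_b = likelihood(x, background_pdfs)
    total = l_s + l_b
    # Assumption: treat the case L_S = L_B = 0 as undecided (y_L = 0.5).
    return l_s / total if total > 0.0 else 0.5


# Illustrative PDFs for n_var = 2 input variables (made-up parameters).
signal_pdfs = [gaussian_pdf(0.0, 1.0), gaussian_pdf(1.0, 1.0)]
background_pdfs = [gaussian_pdf(2.0, 1.0), gaussian_pdf(3.0, 1.0)]

# A point at the signal peaks and a point halfway between the two hypotheses.
y_signal_like = likelihood_ratio([0.0, 1.0], signal_pdfs, background_pdfs)
y_ambiguous = likelihood_ratio([1.0, 2.0], signal_pdfs, background_pdfs)
```

By construction $y_L$ lies between 0 and 1: values close to 1 indicate a signal-like combination, and a point that is equally compatible with both hypotheses gives $y_L$ = 0.5.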
The ratio $y_L$ is then calculated for every jet combination in each event. Per event, the jet combination with the largest $y_L$ is chosen and returned by the MVA method. Almost all MVA methods, including the Likelihood Ratio method, need to be trained before they are applied to data. In the training phase, one introduces the signal and background PDFs to the MVA method. The MVA method then trains itself and learns which behaviours can be extracted from the signal and background PDFs. In the application phase, using the information collected in the training phase, the MVA method returns a value per event and categorizes that event as signal or background.
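The training and application phases can be sketched as follows. This is a simplified illustration under stated assumptions, not the MVA implementation used in this analysis: the PDFs are estimated with fixed-width normalized histograms, and the names `HistogramPdf`, `train_pdfs` and `best_combination`, the binning and the sample values are all hypothetical.

```python
from typing import List, Sequence, Tuple


class HistogramPdf:
    """Normalized fixed-width histogram used as a one-dimensional PDF estimate."""

    def __init__(self, samples: Sequence[float], lo: float, hi: float, n_bins: int):
        self.lo, self.hi, self.n_bins = lo, hi, n_bins
        self.width = (hi - lo) / n_bins
        counts = [0] * n_bins
        for s in samples:
            if lo <= s < hi:
                counts[self._bin(s)] += 1
        total = sum(counts)
        # Divide by (total * width) so that the histogram integrates to one.
        self.density = [c / (total * self.width) if total else 0.0 for c in counts]

    def _bin(self, x: float) -> int:
        # min() guards against rounding just below the upper edge.
        return min(int((x - self.lo) / self.width), self.n_bins - 1)

    def __call__(self, x: float) -> float:
        if not (self.lo <= x < self.hi):
            return 0.0
        return self.density[self._bin(x)]


def train_pdfs(rows, ranges, n_bins: int = 10) -> List[HistogramPdf]:
    """Training phase: build one histogram PDF per input variable (column)."""
    columns = list(zip(*rows))
    return [HistogramPdf(col, lo, hi, n_bins) for col, (lo, hi) in zip(columns, ranges)]


def likelihood_ratio(x, signal_pdfs, background_pdfs) -> float:
    """y_L = L_S / (L_S + L_B), with L the product of the per-variable PDFs."""
    l_s = l_b = 1.0
    for xk, p_s, p_b in zip(x, signal_pdfs, background_pdfs):
        l_s *= p_s(xk)
        l_b *= p_b(xk)
    total = l_s + l_b
    return l_s / total if total > 0.0 else 0.5  # assumption: 0.5 if undecided


def best_combination(candidates, signal_pdfs, background_pdfs) -> Tuple[int, float]:
    """Application phase: index and y_L of the candidate with the largest y_L."""
    scores = [likelihood_ratio(x, signal_pdfs, background_pdfs) for x in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Made-up training samples with a single input variable in the range [0, 1).
signal_rows = [[0.1], [0.2], [0.15], [0.8]]
background_rows = [[0.9], [0.85], [0.7], [0.2]]
ranges = [(0.0, 1.0)]

signal_pdfs = train_pdfs(signal_rows, ranges, n_bins=2)
background_pdfs = train_pdfs(background_rows, ranges, n_bins=2)

# Three candidate combinations of one event (twelve in the jet combination study).
candidates = [[0.9], [0.3], [0.6]]
best_index, best_score = best_combination(candidates, signal_pdfs, background_pdfs)
```

In this toy setup the second candidate falls in the bin that the signal sample populates most, so it receives the largest $y_L$ and is the one returned.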
In a jet combination study, where one signal and eleven backgrounds are present per event, the signal is defined as the jet combination with the highest value returned by the MVA method. In some cases, however, the MVA method does not return the “true” jet combination, that is, the combination that is matched with the hard-scatter partons.

5.1.3 Likelihood Concept in Bayesian Statistics

According to Bayes' theorem, the posterior probability $p(Y|X)$ is related to the prior probability $p(Y)$ by

\[ p(Y|X) = \frac{p(X|Y)\,p(Y)}{p(X)}, \]

where $p(X|Y)$ is the conditional probability, also referred to as the Likelihood. In the above equation, $p(X|Y)$ is the probability distribution of the parameter $X$, usually a continuous variable, for events that belong to the class $Y$, and $p(Y|X)$ is the probability of assigning a newly observed event to the class $Y$ given that the value $X$ is measured for that event. Therefore, to obtain the posterior probability $p(Y|X)$, $p(X|Y)$ must be known in addition to the prior probability $p(Y)$.
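As a minimal sketch of how Bayes' theorem is evaluated in practice, the snippet below computes $p(Y|X)$ for every class from the likelihoods and priors. The function name `posterior` and the numerical values are hypothetical; the normalization $p(X)$ is obtained from the law of total probability, $p(X) = \sum_Y p(X|Y)\,p(Y)$, which is not written out in the text above.

```python
from typing import Dict


def posterior(likelihoods: Dict[str, float], priors: Dict[str, float]) -> Dict[str, float]:
    """Return p(Y|X) for every class Y.

    likelihoods[Y] is p(X|Y) evaluated at the measured X; priors[Y] is p(Y).
    """
    # p(X) from the law of total probability: sum over classes of p(X|Y) p(Y).
    evidence = sum(likelihoods[y] * priors[y] for y in priors)
    if evidence == 0.0:
        raise ValueError("measured X has zero probability under every class")
    return {y: likelihoods[y] * priors[y] / evidence for y in priors}


# Illustrative numbers only (not taken from the analysis).
example_likelihoods = {"S": 0.8, "B": 0.4}
example_priors = {"S": 0.25, "B": 0.75}
example_posterior = posterior(example_likelihoods, example_priors)
```

With these made-up inputs the background class remains more probable after the measurement even though the likelihood favours signal, because the prior probability of background is three times larger.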
In the language of multi-variate techniques, $p(X|Y)$ is obtained in the training phase. During training, the classes $Y$ with their properties $X$ are introduced to the trainer and subsequently the corresponding PDFs are extracted.

In order to clarify how a generic multi-variate method works, a simplified example based on Bayes' theorem is explained here. Consider a space with one variable $X$ that can take only two values, namely “X = 1” or “X = 2”. In this space, events are categorized in either the signal or the background class, hence “Y = S” or “Y = B”. An event is said to be measured when the $X$ value of that particular event is determined. Assume 10 such events are selected and fed to an MVA method. The results of training over these 10 events are summarized in Figure 5.3.

The information that can be derived from Figure 5.3, containing the conditional as well as the prior probabilities, is listed below.

\[ p(1|S) = \frac{3}{5}, \quad p(1|B) = \frac{1}{5}, \]

\[ p(2|S) = \frac{2}{5}, \quad p(2|B) = \frac{4}{5}, \]