CHAPTER 5: Estimating of the Jet Energy Scale Calibration Factor

Figure 5.3: A simple example to show the basic concepts of the MVA method. In this example, the circles, which represent events, are observed in two different bins, labeled “1” and “2”, and each belongs to one of the classes “S” or “B”.

$$p(S) = \frac{5}{10}, \qquad p(B) = \frac{5}{10}.$$

From the above elements, p(1) and p(2) can also be extracted according to the “sum rule” of probability theory, as follows:

$$p(1) = p(1|S)\,p(S) + p(1|B)\,p(B) = \frac{2}{5},$$

$$p(2) = p(2|S)\,p(S) + p(2|B)\,p(B) = \frac{3}{5}.$$

Now that all the input information is complete, one can calculate the posterior probabilities as expressed below:

$$p(S|1) = \frac{p(1|S)\,p(S)}{p(1)} = \frac{9}{12}, \qquad p(S|2) = \frac{p(2|S)\,p(S)}{p(2)} = \frac{4}{12},$$

$$p(B|1) = \frac{p(1|B)\,p(B)}{p(1)} = \frac{3}{12}, \qquad p(B|2) = \frac{p(2|B)\,p(B)}{p(2)} = \frac{8}{12}.$$

The above numbers can be interpreted as follows. Any new event whose measured X value is X = 1 would be assigned to class S, since the probability of being a type “S” event is three times that of being a type “B” event when X = 1 is measured.
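The posterior calculation above can be checked numerically. The following is a minimal Python sketch, not part of the original analysis. The per-bin event counts are an assumption: they are not quoted in this section and were chosen only because they reproduce the stated priors, p(1) = 2/5, p(2) = 3/5, and the four posteriors. The function name `posteriors` is likewise hypothetical.

```python
from fractions import Fraction

# Assumed event counts per (class, bin), chosen to reproduce the
# probabilities quoted in the text; they are not read off Figure 5.3.
counts = {
    ("S", 1): 3, ("S", 2): 2,
    ("B", 1): 1, ("B", 2): 4,
}

def posteriors(counts):
    """Return p(class | bin) for every (class, bin) pair via Bayes' theorem."""
    total = sum(counts.values())
    classes = sorted({c for c, _ in counts})
    bins = sorted({b for _, b in counts})
    # Priors p(class) and likelihoods p(bin | class) estimated from the counts.
    n_class = {c: sum(counts[(c, b)] for b in bins) for c in classes}
    prior = {c: Fraction(n_class[c], total) for c in classes}
    like = {(b, c): Fraction(counts[(c, b)], n_class[c])
            for c in classes for b in bins}
    # Sum rule: p(bin) = sum over classes of p(bin | class) * p(class).
    evidence = {b: sum(like[(b, c)] * prior[c] for c in classes) for b in bins}
    # Bayes' theorem: p(class | bin) = p(bin | class) * p(class) / p(bin).
    return {(c, b): like[(b, c)] * prior[c] / evidence[b]
            for c in classes for b in bins}

post = posteriors(counts)
for (c, b), p in sorted(post.items()):
    print(f"p({c}|{b}) = {p}")
```

Exact fractions are used so that the results can be compared directly with the values in the text; they are printed in reduced form, so 9/12 appears as 3/4.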
With the same reasoning, any new event is assigned to class B if the measurement of the X property of that event results in X = 2.

The above example shows the basic idea of assigning a newly observed event to a specific class in a one-dimensional phase space. In most cases, more than one input variable is used. As a result, the problem of labeling an event as signal or background is no longer that simple. Defining a single variable out of the many input variables can provide a possible solution. Hence the Likelihood function $L(\vec{x})$, which combines the information of all input variables $\vec{x} = (x_1, \ldots, x_n)$ as explained in Section 5.1.2, is introduced in an n-dimensional phase space.

By definition, L can take values in the range [0, 1]. Therefore, in the language of the Likelihood Ratio, an event is assigned as signal if the measurement of its X property falls in a bin whose Likelihood Ratio value is greater than 0.5; the event is then more signal-like than background-like. Otherwise, it is labeled as background. In the case of finding the correct jet combination, where there are eleven backgrounds (the wrong combinations) versus one signal (the true combination), the chosen combination is defined as the one whose Likelihood Ratio value is the largest among all combinations.

5.1.4 Input Variables for Training

Various observable variables can be used to train the Likelihood Ratio method. A combination of two or three variables is also allowed.
Among the many candidate variables that can be used in the training, the most discriminating ones, which differentiate maximally between signal and background, are desired. The discriminating power of a generic variable can be defined using different methods. As an example, the “separation” S, which is a measure of the discriminating power between signal and background, is defined as

$$S = \frac{1}{2} \int \frac{\left(P_S(X) - P_B(X)\right)^2}{P_S(X) + P_B(X)}\, dX,$$

where $P_S(X)$ and $P_B(X)$ are the PDFs of the observable X for signal and background, respectively. By definition, for identical signal and background shapes, the separation is zero. If the signal and background PDFs do not overlap, the separation takes a value equal to one. Therefore, a higher value of the separation indicates that the variable is more likely to be a good discriminant.

Since the four-vector of the neutrino is not known, the leptonic W boson and consequently the leptonic top quark cannot be fully reconstructed. Therefore, the input variables chosen to train the MVA method should be independent of the reconstructed neutrino. This limits the input candidates to a smaller collection.
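As an illustration of the separation S defined earlier in this section, the integral can be approximated by a sum over the bins of two histograms. The sketch below is a minimal pure-Python example under stated assumptions, not the implementation used in this analysis: the function name `separation`, the common binning, and the toy bin contents are all hypothetical.

```python
def separation(hist_s, hist_b):
    """Binned estimate of S = 1/2 * sum (p_s - p_b)^2 / (p_s + p_b).

    hist_s and hist_b are bin contents on a common binning. Each is
    normalized to unit sum here, so the bin width cancels in the ratio.
    """
    if len(hist_s) != len(hist_b):
        raise ValueError("histograms must share the same binning")
    norm_s, norm_b = sum(hist_s), sum(hist_b)
    total = 0.0
    for n_s, n_b in zip(hist_s, hist_b):
        p_s, p_b = n_s / norm_s, n_b / norm_b
        if p_s + p_b > 0:  # bins empty in both histograms contribute nothing
            total += (p_s - p_b) ** 2 / (p_s + p_b)
    return 0.5 * total

# Toy bin contents (assumed, for illustration only).
identical = separation([1, 2, 3, 2, 1], [2, 4, 6, 4, 2])  # same shape
disjoint = separation([5, 5, 0, 0], [0, 0, 3, 7])         # no overlap
partial = separation([4, 3, 2, 1], [1, 2, 3, 4])          # partial overlap
```

The three toy cases give values of approximately 0, 1 and 0.2, matching the limiting behavior described above: identical shapes give zero separation and non-overlapping shapes give one.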
Different kinematic quantities of the objects, such as transverse momentum $p_T$, pseudorapidity $\eta$, polar angle $\theta$ and azimuthal angle $\phi$, can be used to compare the candidates in the $t\bar{t}$ system. In this analysis, the variable $\Theta$ is used, which is defined as the space angle between two vectors in a three-dimensional phase space and is expressed as

$$\Theta(\vec{v}_1, \vec{v}_2) = \cos^{-1}\left(\frac{\vec{v}_1 \cdot \vec{v}_2}{|\vec{v}_1|\,|\vec{v}_2|}\right) = \cos^{-1}\left(\sin\theta_1 \sin\theta_2 \cos(\phi_1 - \phi_2) + \cos\theta_1 \cos\theta_2\right).$$

The two-component variables in the $t\bar{t}$ system that are considered in this analysis, excluding those related to the kinematics of the neutrino, are listed below.

• $\Theta(t_h, W_h)$, which is the space angle between the hadronic top quark and the hadronic W boson,