10 - H1 - Desy
step adjusts the free parameters of the MVA to the particular problem, the testing step
checks the quality of the trained MVA, and the evaluation step is used in the determination
of the final results. Ideally, all three steps should be performed with statistically
independent samples. Owing to the high statistical demand of the MVA, the training and
testing samples in this analysis are single-particle MC events, with single photons for
signal (MC set XX in table 2.5) and single hadrons for background (MC sets XXI-XXX). This
simplification is possible because the MVA is based purely on the shower shapes and is
insensitive to other activity in the detector.² To minimise possible bias, however, the
evaluation sample uses the full prompt photon signal simulation (MC sets I-II) and the
full background simulation (MC set XVI).
Table 6.2 summarises the number of MC events used in the MVA analysis. The training
sample consists of 60% randomly chosen events from the single-particle MC samples. The
remaining 40% define the testing sample. The full MC simulation of prompt photon and
background events was used for the evaluation sample. Because the evaluated discriminator
is used in a highly demanding way, binned very finely in several dimensions (see section
8.2.1), the signal evaluation sample exceeds the testing sample by a factor of roughly 75.
The background evaluation sample is comparatively small because of the extremely low
background selection ratio (of order $10^{-5}$). The numbers of events listed in table 6.2
differ from those in tables 2.3–2.5 because of the selection criteria applied to the
training, testing and evaluation samples.
             training   testing   evaluating    evaluating
                                  (inclusive)   (exclusive)
signal       221′709    147′456   16′981′407    11′309′443
background   246′512    164′453       72′354        58′445

Table 6.2: Number of signal and background events used for MVA.
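
The 60%/40% random division into training and testing samples described above can be illustrated with a minimal sketch. The function name `split_train_test`, the fixed seed and the toy list of event indices are assumptions made for illustration only; the analysis itself relies on the TMVA package, whose internal splitting procedure is not reproduced here.

```python
import random

def split_train_test(events, train_fraction=0.6, seed=12345):
    """Randomly divide events into a training and a testing list.

    Hypothetical helper: shuffles a copy of the input and cuts it at
    train_fraction, so the two samples are disjoint.
    """
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Toy stand-in for a single-particle MC sample: plain event indices.
events = range(1000)
train, test = split_train_test(events)
print(len(train), len(test))
```

Because the cut is applied to a shuffled copy, no event enters both samples, which keeps the training and testing steps statistically independent in the sense required above.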
6.3 Classifier³
The MVA method chosen for this analysis is commonly known as maximum likelihood or
Naïve Bayes [117]. It builds a model out of probability density functions that reproduce
the input variables for signal and background. For a given event $i$, the combined signal
(background) probability $p^{\mathrm{MVA}}_{S(B)}(i)$ is obtained by multiplying the signal
(background) probability densities $p_{S(B)}$ of all $n_{\mathrm{var}}$ input variables.
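
As an illustration of the product described above, the following sketch estimates one-dimensional probability densities from normalised histograms and multiplies them over all input variables. It is a toy under stated assumptions, not the TMVA implementation: the helper names, the Gaussian toy data, the binning and the small floor on empty bins are all hypothetical. The final ratio $p_S/(p_S+p_B)$ is one common way to turn the two products into a single discriminator and is likewise an assumption here, since the passage only defines the products themselves.

```python
import random

def build_pdf(values, lo, hi, n_bins):
    """Normalised histogram used as a one-dimensional density estimate."""
    counts = [1e-9] * n_bins  # tiny floor so empty bins do not give zero
    width = (hi - lo) / n_bins
    for v in values:
        # Values outside [lo, hi) are clamped into the edge bins.
        b = min(max(int((v - lo) / width), 0), n_bins - 1)
        counts[b] += 1.0
    norm = sum(counts) * width
    return {"lo": lo, "width": width, "density": [c / norm for c in counts]}

def pdf_value(pdf, x):
    """Density of the bin that contains x."""
    b = int((x - pdf["lo"]) / pdf["width"])
    b = min(max(b, 0), len(pdf["density"]) - 1)
    return pdf["density"][b]

def combined_probability(pdfs, event):
    """Product of the per-variable densities for one event."""
    p = 1.0
    for pdf, x in zip(pdfs, event):
        p *= pdf_value(pdf, x)
    return p

# Toy samples with two input variables each (hypothetical shapes).
rng = random.Random(1)

def toy_events(mean, n):
    return [(rng.gauss(mean, 0.1), rng.gauss(mean, 0.1)) for _ in range(n)]

signal = toy_events(0.3, 5000)
background = toy_events(0.7, 5000)
n_var = 2

sig_pdfs = [build_pdf([e[k] for e in signal], 0.0, 1.0, 20) for k in range(n_var)]
bkg_pdfs = [build_pdf([e[k] for e in background], 0.0, 1.0, 20) for k in range(n_var)]

event = (0.32, 0.28)  # a signal-like toy event
p_sig = combined_probability(sig_pdfs, event)
p_bkg = combined_probability(bkg_pdfs, event)
ratio = p_sig / (p_sig + p_bkg)  # assumed discriminator definition
print(p_sig, p_bkg, ratio)
```

Multiplying one-dimensional densities treats the input variables as independent, which is the simplification behind the name Naïve Bayes.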
² A detailed study [66] shows that this assumption is not fully true. In a full MC event there is
a certain probability that a soft final-state particle is counted in the selected cluster. This effect
produces statistically wider, more asymmetric and less compact clusters. Nevertheless, the
single-particle approximation has been found to be good enough for the training and testing of the
MVA. This approach yields a discriminator that is technically easier to obtain and, though no longer
optimal, still correct as long as the evaluation sample correctly describes real cluster shapes.
³ All classifier methods were taken in full from the TMVA package [116].