23.05.2014 Views

10 - H1 - Desy



step adjusts the free parameters of the MVA to the particular problem, the testing step checks the quality of the trained MVA and the evaluation step is used in the determination of the final results. Ideally, all three steps should be performed with statistically independent samples. Because the MVA demands high statistics, the training and testing samples in this analysis are single particle MC events, with single photons for signal (MC set XX in table 2.5) and single hadrons for background (MC sets XXI-XXX). This simplification is possible since the MVA is based purely on the shower shapes and is insensitive to other activity in the detector.² To minimise possible bias, however, the evaluation sample uses the full prompt photon signal simulation (MC sets I-II) and the full background simulation (MC set XVI).

Table 6.2 summarises the number of MC events used in the MVA analysis. The training sample consists of 60% randomly chosen events from the single particle MC samples. The remaining 40% define the testing sample. The full MC simulation of prompt photon and background events was used for the evaluation sample. Because the evaluated discriminator is used in a highly demanding way, binned very finely in several dimensions (see section 8.2.1), the signal evaluation sample exceeds the testing sample by a factor of roughly 75. The background evaluation sample is comparatively smaller because of the extremely low background selection ratio (of order $10^{-5}$). The numbers of events listed in Table 6.2 differ from those in tables 2.3-2.5 because of the selection criteria applied to the training, testing and evaluation samples.

             training    testing    evaluating     evaluating
                                    (inclusive)    (exclusive)
signal       221'709     147'456    16'981'407     11'309'443
background   246'512     164'453        72'354         58'445

Table 6.2: Number of signal and background events used for MVA.<br />
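
The random 60%/40% division into training and testing samples described above can be illustrated with a minimal sketch. It is not the procedure used in the analysis: the function name `split_train_test`, the fixed seed and the toy sample of 1000 event indices are assumptions made purely for illustration.

```python
import random


def split_train_test(events, train_fraction=0.6, seed=12345):
    """Randomly split events into a training and a testing sample.

    Hypothetical helper: the name, the seed and the exact splitting
    rule are illustrative assumptions, not taken from the analysis.
    """
    shuffled = list(events)
    # A dedicated generator keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Toy stand-in for a single particle MC sample: 1000 event indices.
toy_events = range(1000)
train, test = split_train_test(toy_events)
print(len(train), len(test))
```

Every event ends up in exactly one of the two samples, so the training and testing samples remain statistically independent of each other.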

6.3 Classifier³

The MVA method chosen for this analysis is commonly known as maximum likelihood or Naïve Bayes [117]. It builds a model out of probability density functions that reproduces the input variables for signal and background. For a given event $i$, the combined signal (background) probability $p^{\mathrm{MVA}}_{S(B)}(i)$ is obtained by multiplying the signal (background) probability densities $p_{S(B)}$ of all $n_{\mathrm{var}}$ input variables.
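
The product of per-variable densities described above can be sketched in a few lines. This is a toy illustration, not the TMVA implementation: the function names, the use of plain normalised histograms as density estimates, the Gaussian toy samples and the final ratio $p_S/(p_S+p_B)$ are all assumptions introduced here.

```python
import numpy as np


def build_pdfs(sample, bins, ranges):
    """Build one normalised 1D histogram per input variable.

    `sample` has shape (n_events, n_var). Using plain histograms as
    density estimates is an assumption made for this sketch.
    """
    pdfs = []
    for k in range(sample.shape[1]):
        density, edges = np.histogram(
            sample[:, k], bins=bins, range=ranges[k], density=True
        )
        pdfs.append((density, edges))
    return pdfs


def combined_probability(event, pdfs, floor=1e-12):
    """Multiply the per-variable densities evaluated for one event."""
    p = 1.0
    for x, (density, edges) in zip(event, pdfs):
        idx = np.searchsorted(edges, x, side="right") - 1
        # Clamp values outside the histogram range into the edge bins.
        idx = min(max(idx, 0), len(density) - 1)
        # The floor avoids a zero product from a single empty bin.
        p *= max(density[idx], floor)
    return p


# Toy training samples with two input variables (hypothetical shapes).
rng = np.random.default_rng(0)
signal = rng.normal(0.0, 1.0, size=(5000, 2))
background = rng.normal(1.5, 1.0, size=(5000, 2))
ranges = [(-5.0, 7.0), (-5.0, 7.0)]

signal_pdfs = build_pdfs(signal, bins=30, ranges=ranges)
background_pdfs = build_pdfs(background, bins=30, ranges=ranges)

event = np.array([0.0, 0.0])
p_s = combined_probability(event, signal_pdfs)
p_b = combined_probability(event, background_pdfs)
# One common way to form a discriminator from the two products
# (an assumption here; the text above defines only the products).
ratio = p_s / (p_s + p_b)
print(p_s > p_b, ratio)
```

Multiplying one-dimensional densities ignores correlations between the input variables, which is the "naïve" part of the method's name.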

² A detailed study [66] shows that this assumption does not hold exactly. In a full MC event there is a certain probability that a soft final-state particle is counted in the selected cluster. This effect produces statistically wider, more asymmetric and less compact clusters. Nevertheless, the single particle approximation has been found to be good enough for the training and testing of the MVA. This approach yields a discriminator that is technically easier to obtain and, although no longer optimal, still correct, as long as the evaluation sample correctly describes the real cluster shapes.

³ All classifier methods were taken from the TMVA package [116].
