14.06.2012

Rating Models and Validation - Oesterreichische Nationalbank

ing development stages. However, data pooling in the validation and continued<br />

development of rating models is usually less comprehensive, because in this case<br />

only the rating criteria actually in use need to be retrieved. Additional data<br />

requirements may nevertheless arise if the pool-based rating model needs further<br />

development.<br />

5.1.3 Definition of the Sample<br />

The data gathered in the data collection and cleansing stages represent the overall<br />

sample, which has to be divided into an analysis sample and a validation sample.<br />

The analysis sample supports the actual development of the scoring functions,<br />

while the validation sample serves exclusively as a hold-out sample to test<br />

the scoring functions after development. In general, scoring functions can be expected<br />

to show sound discriminatory power on the data records used to develop them. Testing the<br />

model's applicability to new (i.e. previously unseen) data is thus the basic prerequisite<br />

for the recognition of any classification procedure. In this context, it is<br />

possible to divide the overall sample into the analysis and validation samples in<br />

two different ways:<br />

— Actual division of the database into the analysis and validation samples<br />

— Application of a bootstrap procedure<br />

In cases where sufficient data (especially regarding bad cases) are available to<br />

enable actual division into two sufficiently large subsamples, the first option<br />

should be preferred. This ensures the strict separation of the data records in<br />

the analysis and validation samples. In this way, it is possible to check the quality<br />

of the scoring functions (developed using the analysis sample) using the<br />

unknown data records in the validation sample.<br />
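
Where too few bad cases are available for an actual division, the bootstrap procedure listed above is the alternative. The text does not specify which bootstrap variant is meant, so the following is only a minimal sketch of one common variant, assumed here for illustration: each round draws the analysis sample with replacement and validates on the records that were not drawn ("out-of-bag"). The data, the one-variable toy scoring function `develop`, and the choice of AUC as the quality measure are all hypothetical.

```python
import random

def auc(scores, bads):
    """Share of (bad, good) pairs in which the bad case has the higher score."""
    bad_scores = [s for s, b in zip(scores, bads) if b]
    good_scores = [s for s, b in zip(scores, bads) if not b]
    wins = sum((sb > sg) + 0.5 * (sb == sg)
               for sb in bad_scores for sg in good_scores)
    return wins / (len(bad_scores) * len(good_scores))

def develop(sample):
    """Toy 'scoring function' (assumption): orient a single ratio so that
    higher scores indicate higher risk, based on the analysis sample only."""
    bad = [x for x, b in sample if b]
    good = [x for x, b in sample if not b]
    sign = 1.0 if sum(bad) / len(bad) > sum(good) / len(good) else -1.0
    return lambda x: sign * x

def has_both_classes(sample):
    flags = [b for _, b in sample]
    return any(flags) and not all(flags)

def bootstrap_auc(records, n_rounds=200, seed=1):
    """Average out-of-bag AUC over n_rounds bootstrap rounds."""
    rng = random.Random(seed)
    n = len(records)
    aucs = []
    for _ in range(n_rounds):
        drawn = [rng.randrange(n) for _ in range(n)]  # n draws with replacement
        drawn_set = set(drawn)
        in_bag = [records[i] for i in drawn]
        out_of_bag = [records[i] for i in range(n) if i not in drawn_set]
        if not (has_both_classes(in_bag) and has_both_classes(out_of_bag)):
            continue  # skip rounds in which good or bad cases are missing
        score = develop(in_bag)
        aucs.append(auc([score(x) for x, _ in out_of_bag],
                        [b for _, b in out_of_bag]))
    return sum(aucs) / len(aucs), len(aucs)

# Hypothetical data: (equity ratio, bad flag); bad cases tend to have lower ratios.
data_rng = random.Random(0)
records = ([(data_rng.gauss(0.30, 0.10), False) for _ in range(270)]
           + [(data_rng.gauss(0.15, 0.10), True) for _ in range(30)])
mean_auc, rounds_used = bootstrap_auc(records)
print(f"mean out-of-bag AUC over {rounds_used} rounds: {mean_auc:.3f}")
```

Because every record serves in development in some rounds and in testing in others, this approach uses scarce bad cases more fully, at the price of giving up the strict separation that an actual division provides.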

In order to avoid bias due to subjective division, the sample should be split<br />

up by random selection (see chart 35). In this process, however, it is necessary<br />

to ensure that the data are representative in terms of their defined structural<br />

characteristics (see section 5.1.2).<br />
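
One possible way to combine random selection with representativeness is a stratified random split, in which the same share of cases is drawn at random from every stratum. The sketch below assumes this technique for illustration; the record layout, the stratum definition (sector and good/bad status) and the 30% validation share are hypothetical and not taken from the guideline.

```python
import random
from collections import defaultdict

def stratified_split(records, stratum_key, validation_share=0.3, seed=42):
    """Randomly split records into analysis and validation samples,
    drawing (approximately) the same share from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_key(rec)].append(rec)

    analysis, validation = [], []
    for key in sorted(strata):          # fixed order keeps the split reproducible
        members = strata[key][:]
        rng.shuffle(members)            # random selection within the stratum
        n_val = round(len(members) * validation_share)
        validation.extend(members[:n_val])
        analysis.extend(members[n_val:])
    return analysis, validation

# Hypothetical data: 1,000 cases with a sector and a bad (default) flag.
data_rng = random.Random(0)
records = [
    {"case_id": i,
     "sector": data_rng.choice(["trade", "industry", "services"]),
     "bad": data_rng.random() < 0.05}
    for i in range(1000)
]
analysis, validation = stratified_split(
    records, stratum_key=lambda r: (r["sector"], r["bad"]))
print(len(analysis), len(validation))
```

Stratifying on the good/bad flag as well as on the structural characteristic keeps the proportion of bad cases nearly identical in both samples, which matters when bad cases are scarce.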

Only those cases which fulfill certain minimum data quality requirements<br />

can be used in the analysis sample. In general, this is already ensured during<br />

the data collection and cleansing stage. In cases where quality varies within a<br />

database, the higher-quality data should be used in the analysis sample. In such<br />

cases, however, the results obtained using the validation sample will be considerably<br />

less reliable.<br />

Borrowers included in the analysis sample must not be used in the validation<br />

sample, even if different cutoff dates are used. The analysis and validation samples<br />

thus have to be disjoint with regard to borrowers.<br />
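
The borrower-level requirement can be met by assigning borrowers, rather than individual data records, to the two samples. The following is a minimal sketch under assumed names: the fields `borrower_id` and `cutoff` and the 30% validation share are hypothetical.

```python
import random

def split_by_borrower(records, validation_share=0.3, seed=7):
    """Assign whole borrowers to one sample so that no borrower appears in both,
    regardless of how many cutoff dates are recorded for that borrower."""
    rng = random.Random(seed)
    borrower_ids = sorted({rec["borrower_id"] for rec in records})
    rng.shuffle(borrower_ids)
    n_val = round(len(borrower_ids) * validation_share)
    validation_ids = set(borrower_ids[:n_val])
    analysis = [r for r in records if r["borrower_id"] not in validation_ids]
    validation = [r for r in records if r["borrower_id"] in validation_ids]
    return analysis, validation

# Hypothetical records: 50 borrowers, each observed at three cutoff dates.
records = [
    {"borrower_id": b, "cutoff": year}
    for b in range(50) for year in (2001, 2002, 2003)
]
analysis, validation = split_by_borrower(records)

# Check: the two samples share no borrower.
overlap = ({r["borrower_id"] for r in analysis}
           & {r["borrower_id"] for r in validation})
print(len(analysis), len(validation), overlap)
```

Splitting on individual records instead would let the same borrower enter both samples at different cutoff dates, which is exactly what the requirement rules out.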

With regard to weighting good and bad cases in the analysis sample, two different<br />

procedures are conceivable:<br />

— The analysis sample can be created in such a way that the proportion of bad<br />

cases is representative of the rating segment to be analyzed. In this case, calibrating<br />

the scoring function becomes easier (cf. section 5.3). For example,<br />

the result of logistic regression can be used directly as a probability of<br />

default (PD) without further processing or rescaling. This approach is advisable<br />

whenever the number of cases is not subject to restrictions in the data<br />

collection stage, and especially when a sufficient number of bad cases can be<br />

collected.<br />

72 Guidelines on Credit Risk Management