12.07.2015 Views

1 Studies in the History of Statistics and Probability ... - Sheynin, Oscar


calculational sense method of estimation of those parameters, the method of maximal likelihood. It can also be realized and it is believed that, for the data given, both methods provide results very close to each other. It is interesting, however, to see what practical conclusions were made in the cited source. After estimating somehow the values of the parameters, we can apply formula (2.9) to find out the approximate value of the probability p̂ of developing the IHD for each examined person during the next 12 years.

The highest probabilities were observed for men of 30 – 39 and 40 – 49 years (0.986 and 0.742 that the IHD developed) and 50 – 62 years (0.770 did not develop). For women the probabilities of developing the disease were 0.838 for ages 30 – 49 and 0.773 for ages 50 – 62 [that it did not develop?]. To a certain extent these results refute the classical saying which served as the title of this § 2.2. True, it should be borne in mind that formally these figures concern the forecasting of events that have already happened.
Such a forecast of future events is only possible if the coefficients of function (2.9) are roughly the same in another place or time as those established by the authors. Such a supposition is probable but not yet verified.

Then, having arranged the values of p̂ for all the examined persons in order, we can subdivide these persons into several equally numerous groups (of ten people, for example) such that those with the lowest values of p̂ are placed in the first of them, and people having higher, still higher, ... values comprise the second, third, ... group. Had there been no connection between the considered linear combination of risk factors and the IHD, the number of cases of that disease in all groups would have been approximately the same, but the picture that actually emerged is different in principle; see, for example, Table 3, borrowed from Truett et al (1967).
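The grouping just described can be illustrated by a minimal sketch. It assumes that the values of p̂ have already been computed (formula (2.9) itself is not reproduced here) and that the expected number of cases in a group is taken as the sum of p̂ over its members. The function name `risk_groups` and all the data are hypothetical, invented purely for illustration; they are not taken from Truett et al (1967).

```python
def risk_groups(p_hat, developed, n_groups=10):
    """Sort people by estimated risk and split them into equally sized groups.

    Returns one (expected, observed) pair per group, lowest risk first:
    expected is the sum of p_hat in the group, observed is the number
    of people in the group who actually developed the disease.
    """
    if len(p_hat) != len(developed):
        raise ValueError("p_hat and developed must have the same length")
    if len(p_hat) % n_groups != 0:
        raise ValueError("number of people must be divisible by n_groups")

    # Indices of people ordered from the lowest to the highest estimated risk.
    order = sorted(range(len(p_hat)), key=lambda i: p_hat[i])
    size = len(p_hat) // n_groups

    table = []
    for g in range(n_groups):
        members = order[g * size:(g + 1) * size]
        expected = sum(p_hat[i] for i in members)
        observed = sum(1 for i in members if developed[i])
        table.append((expected, observed))
    return table


# Hypothetical data: eight people, invented for illustration only.
p_hat = [0.02, 0.40, 0.05, 0.10, 0.30, 0.01, 0.20, 0.08]
developed = [False, True, False, False, False, False, True, False]

for group, (expected, observed) in enumerate(
        risk_groups(p_hat, developed, n_groups=4), start=1):
    print(f"group {group}: expected {expected:.2f}, observed {observed}")
```

If the risk factors had no connection with the disease, the observed counts would be roughly equal across the groups; a steady increase from the first group to the last is what indicates a connection.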
The expected number of cases of the IHD was determined by summing up the probabilities p̂ of all people in the appropriate group.

In that table, we are surprised first of all by the great difference (amounting to a few dozen times) between the sickness rate in groups of high and low risk. Second, in spite of the obvious non-normality of the distributions, the results obtained by means of a normality model agree well with the actual data. The isolation of groups of people with a higher danger of developing the IHD is thus possible on the basis of the most simple clinical examination (providing the risk factors listed above). The same conclusion can be made when considering the data represented in separate age groups.

However, it should not be thought that those results are really suitable for individual forecasts. Those can only be successful for cases of very high or very low individual risk p̂, but for the totality as a whole the result would have been bad.
This is connected with the IHD nevertheless occurring rarely (11.8% in the mean for 12 years). Indeed, issuing from the values of p̂ we can only forecast the disease in people with a sufficiently high p̂ and the opposite for all the others. When choosing the boundary of the group with the highest risk in Table 3 as the critical value of p̂, a forecast of the disease would have been wrong in 100 – 37.5 = 62.5% of cases. And on the other hand
