04.06.2013 Views

Introduction to Stata 8 - (GRIPS


Or you can extract key information from a CPR number read as one string variable (cprstr):

    generate bday = real(substr(cprstr,1,2))
    gen bmon = real(substr(cprstr,3,2))
    gen byear = real(substr(cprstr,5,2))
    gen control = real(substr(cprstr,7,4))
    gen pos7 = real(substr(cprstr,7,1)) // to find century

Before creating bdate you must decide the century of birth; see the rules below:

    generate century = 19
    replace century = 20 if pos7 >= 4 & byear <= 36
    replace century = 18 if pos7 >= 5 & pos7 <= 8 & byear >= 58
    replace byear = 100*century + byear
    generate bdate = mdy(bmon,bday,byear)

The information on sex can be extracted from control; the mod function calculates the remainder after division by 2 (male=1, female=0):

    generate sex = mod(control,2)
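As a worked illustration, the extraction steps can be run on a few CPR strings. This is only a sketch: the three numbers below are made up for the example (they do not come from the text), and the century rule is written out from the table of century information in Danish CPR numbers.

```stata
* Three made-up CPR strings, for illustration only
clear
input str10 cprstr
"0101701212"
"1503054031"
"2412995050"
end

* Day, month, two-digit year, control digits and 7th digit
generate bday    = real(substr(cprstr,1,2))
generate bmon    = real(substr(cprstr,3,2))
generate byear   = real(substr(cprstr,5,2))
generate control = real(substr(cprstr,7,4))
generate pos7    = real(substr(cprstr,7,1))

* Century of birth from the 7th digit and the two-digit year
generate century = 19
replace century = 20 if pos7 >= 4 & byear <= 36
replace century = 18 if pos7 >= 5 & pos7 <= 8 & byear >= 58
replace byear = 100*century + byear

* Birth date as a Stata date, and sex (male=1, female=0)
generate bdate = mdy(bmon,bday,byear)
format bdate %d
generate sex = mod(control,2)

list cprstr byear bdate sex
```

With these inputs the birth years come out as 1970, 2005 and 1899, and sex as 0, 1 and 0.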

Century information in Danish CPR numbers

The 7th digit (the first control digit) indicates the century of birth:

            Pos. 5-6 (year of birth)
    Pos. 7  00-36   37-57      58-99
    0-3     19xx    19xx       19xx
    4, 9    20xx    19xx       19xx
    5-8     20xx    not used   18xx

Source: www.cpr.dk
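The table can be checked cell by cell with a small sketch like the one below. The (pos7, byear) pairs are arbitrary values chosen to hit different cells. Setting century to missing for the "not used" combinations is an assumption made here so that such numbers stand out; it is not a rule stated in the table.

```stata
* One arbitrary (pos7, byear) pair per table cell of interest
clear
input pos7 byear
0 20
3 99
4 36
4 37
9 10
9 80
5 36
6 45
8 58
end

generate century = 19
replace century = 20 if pos7 >= 4 & byear <= 36
replace century = 18 if pos7 >= 5 & pos7 <= 8 & byear >= 58

* Assumption: flag the "not used" cell (pos7 5-8, years 37-57) as missing
replace century = . if pos7 >= 5 & pos7 <= 8 & byear >= 37 & byear <= 57

list pos7 byear century
```

Listing the observations with a missing century (list if century == .) then shows any CPR numbers that fall in the unused cell.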

Validation of Danish CPR numbers

To do the modulus 11 test for Danish CPR numbers first multiply the digits by 4, 3, 2, 7, 6, 5, 4, 3, 2, 1; next sum these products; finally check whether the sum can be divided by 11. Assume that the CPR numbers were split into 10 one-digit numbers c1-c10. Explanation of for: see section 7.

    generate test=0
    for C in varlist c1-c10 \ X in numlist 4/2 7/1 : ///
        replace test=test+C*X
    replace test=mod(test,11) // Remainder after division by 11
    list id cpr test if test !=0

To extract c1-c10 from the string cprstr:

    for C in newlist c1-c10 \ X in numlist 1/10 : ///
        gen C=real(substr(cprstr,X,1))
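The digit extraction and the modulus 11 test can also be combined in one loop. The sketch below uses forvalues and a local macro of weights instead of for; this is an alternative formulation, not the method used in the text. The two CPR strings are made up for the example: the first is constructed to pass the test, the second differs from it in the last digit only.

```stata
* Two made-up CPR strings, for illustration only
clear
input str10 cprstr
"0101701212"
"0101701213"
end

* Weighted digit sum: weights 4 3 2 7 6 5 4 3 2 1
generate test = 0
local weights "4 3 2 7 6 5 4 3 2 1"
forvalues i = 1/10 {
    local w : word `i' of `weights'
    replace test = test + `w' * real(substr(cprstr,`i',1))
}

* Remainder after division by 11; a valid number gives 0
replace test = mod(test,11)
list cprstr test if test != 0
```

Here the weighted sums are 66 and 67, so the first number passes (remainder 0) and the second is listed with remainder 1.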

To extract c1-c10 already when reading data:

    infix str10 cprstr 1-10 c1-c10 1-10 using c:\...\dfile.txt

I developed an ado-file (cprcheck.ado) that extracts birth date and sex information and checks the validity of a CPR number. Find and download it by:

    findit cprcheck
