Introduction to Stata 8 - (GRIPS
15.2. String variables [U] 15.4; [U] 26<br />
Throughout this text I have demonstrated the use of numeric variables, but Stata also handles<br />
string (text) variables. It is almost always easier and more flexible to use numeric variables,<br />
but sometimes you might need string variables. String values must be enclosed in quotes:<br />
replace ph=45 if nation == "Danish"<br />
"Danish", "danish", and "DANISH" are different string values.<br />
A string can contain any character, including digits; however, number strings are not interpreted<br />
by their numeric value, only as a sequence of characters. Strings are sorted in dictionary<br />
sequence, except that all uppercase letters come before lowercase letters, and digits come before letters.<br />
This principle also applies to relations: "12" < "2" < "A" < "AA" < "Z" < "a".<br />
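The ordering rules above can be checked directly with display. This is a minimal sketch; the parentheses are only there to make sure each comparison is evaluated as an expression, and a result of 1 means true, 0 means false:<br />

```stata
* Number strings compare character by character, not by numeric value
display ("12" < "2")
* Uppercase letters sort before lowercase letters
display ("Z" < "a")
* String comparisons are case-sensitive
display ("Danish" == "danish")
```

The first two commands display 1 and the last displays 0.<br />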
String formats [U] 15.5.5<br />
%10s displays a 10 character string, right-justified; %-10s displays it left-justified.<br />
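As a small illustration of the two formats, the sketch below builds a throwaway dataset (the values are made up for the example, and clear discards any data in memory) and lists the same variable left-justified:<br />

```stata
* Hypothetical two-observation dataset, for illustration only
clear
input str10 nation
"Danish"
"Swedish"
end

* Left-justify the string variable in listings
format nation %-10s
list nation
```

Replacing %-10s with %10s returns to right-justified display.<br />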
Reading string variables into Stata<br />
In the commands reading ASCII data (see section 8) the default data type is numeric. String<br />
variables should be defined in the input command. str5 means a 5 character text string:<br />
infix id 1-4 str5 icd10 5-9 using c:\dokumenter\p1\a.txt<br />
Generating new string variables<br />
The first time a string variable is defined it must be declared by its length (str10):<br />
generate str10 nation = "Danish" if ph==45<br />
replace nation = "Swedish" if ph==46<br />
Conversion between string and numeric variables<br />
Number strings to numbers<br />
If a CPR number is recorded in cprstr (type string), no calculations can be performed.<br />
Conversion to a numeric variable cprnum can be obtained by:<br />
generate double cprnum = real(cprstr)<br />
format cprnum %10.0f<br />
cprnum is a 10 digit number and must be declared double for sufficient precision (see<br />
section 6.2). Another option is destring (it automatically selects a storage type with sufficient precision):<br />
destring cprstr , generate(cprnum)<br />
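To see why the storage type matters, the sketch below converts the same number string twice, once into the default float type and once into double. The 10-digit value is an arbitrary illustration, not a real CPR number, and cprfloat is a hypothetical variable name:<br />

```stata
* One made-up 10-digit number string, for illustration only
clear
input str10 cprstr
"1234567890"
end

* Default storage type (float): too few significant digits
generate cprfloat = real(cprstr)
* Explicit double: all 10 digits are preserved
generate double cprnum = real(cprstr)

format cprfloat cprnum %10.0f
list
```

Here cprfloat is listed as 1234567936, while cprnum keeps the exact value 1234567890.<br />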
Non-number strings <strong>to</strong> numbers<br />
If a string variable sex is coded as e.g. "M" and "F", convert to a numeric variable<br />
gender (with the original string codes as value labels) by:<br />
encode sex , generate(gender)<br />
[R] encode<br />
Display the meaning of the numeric codes by:<br />
label list gender<br />
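A self-contained sketch of the whole conversion, using three made-up observations (clear discards any data in memory), might look like this:<br />

```stata
* Hypothetical data, for illustration only
clear
input str1 sex
"M"
"F"
"F"
end

* encode numbers the distinct strings in alphabetical order
encode sex, generate(gender)

* Show the mapping from numeric codes to the original strings
label list gender

* Tabulate the underlying numeric codes rather than the labels
tabulate gender, nolabel
```

Because the codes are assigned alphabetically, "F" becomes 1 and "M" becomes 2 in this example.<br />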