STATA 11 for Windows SAMPLE SESSION - Food Security Group ...
Stata 11 Sample Session Section 2 – Restructuring Data Files – Table Lookup & Aggregation
Rename any key variables in both files to the same name
each case in the production file (c-q4.dta), we need to
look up the product and unit in the conver.dta file. We
will merge the information from this file into the file in
memory (the production file). The variable with the
conversion factor will then be available to calculate the
total kgs produced. In Stata we want to use the “joinby”
command for this merge. It can be found through the
menus with the following choice:
Data
Combine datasets
Form all pairwise combinations within groups
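In command form, the menu choice above corresponds to a joinby call. The lines below are only a preview sketch, not a step to run at this point in the session. They assume the production file is already in memory, that conver.dta is in the current working directory, and that the unit variable in memory is renamed from p1a to unit so the key names match. The rename direction is an assumption made for illustration.

```stata
* Preview sketch only; the production file (c-q4.dta) is assumed to be in memory.
* Assumption: give the unit variable the same name it has in conver.dta.
rename p1a unit

* Put the file in memory in key order.
sort prod unit

* Look up each prod-unit combination in conver.dta.
* unmatched(master) keeps production cases that have no match in conver.dta
* and creates the variable _merge to flag them.
joinby prod unit using "conver.dta", unmatched(master)

* _merge is 3 for matched cases and 1 for cases found only in the file in memory.
tabulate _merge
```

Keeping unmatched cases from the file in memory is a design choice: production records with no conversion factor stay visible for checking instead of being dropped silently.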
The input files for a merge must be sorted by the key
variable(s) (key variables are the variables you use to
match cases between the two files). Since there is a
unique conversion factor for each product-unit
combination, both our product variable and our unit
variable are key variables. The conver.dta file is
already sorted by prod and unit. We must sort the
current working file in memory the same way,
taking into account that its unit variable is
named p1a and not unit. To sort the cases:
1. From the Data menu select
Sort
Ascending data
The Sort - Sort data dialog box will open.
2. In the Variables: box select prod and p1a.
3. Click on the “copy” icon and then click on OK.
4. Switch to the do-file editor and paste the command.
The Stata command is:
sort prod p1a
Let’s look at the two variables using the tab1 command.
We can type in the Command window:
tab1 prod p1a
There are 1,693 cases and many products. In the
tabulation of p1a, two values (0 and 1) have no labels,
and only 1,670 cases contain a value for p1a. These are
possible data problems: we would expect to see a value
in p1a for every crop that was harvested. How would you
determine if there are missing data in the p1a variable?
If possible, corrections should be made before
proceeding further.
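One possible way to answer the question above is sketched here. It assumes the production file is still in memory and that p1a is a numeric variable with value labels, as the tabulation suggests. If the counts reported above hold, the first command would be expected to report 23 cases (1,693 minus 1,670).

```stata
* Count the cases where p1a is missing.
count if missing(p1a)

* Tabulate p1a again, this time showing missing values as their own row.
tabulate p1a, missing

* List the cases with missing p1a; prod is shown so the crop can be identified.
list prod p1a if missing(p1a)

* List the cases carrying the unlabeled codes 0 and 1.
list prod p1a if inlist(p1a, 0, 1)
```

The listings show which cases to check against the original questionnaires before any correction is made.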
We cannot merge the two files unless the variables that