
Hawai‘i's Green Workforce: A Baseline Assessment, December 2010


After critical review of the remaining NAICS codes by both the Hawai‘i Labor Market Research Section and the Hawai‘i Green Jobs Initiative Team, several additional industries were designated as green. All told, there were 113 NAICS 4-digit level industry codes that Hawai‘i classified as green. However, it should be noted that all the remaining non-green industries at the 2-digit NAICS level were also sampled, though at a much lower rate than those in the 4-digit NAICS green industries.

For purposes of stratification, “green” means that at least a small number of codes at the 6-digit level in that particular 4-digit NAICS were likely green. The entire 4-digit NAICS was categorized green, even though most of the jobs within those codes are likely to be non-green.
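The roll-up described above can be sketched in a few lines. This is only an illustration of the rule, not the survey team's actual code: the function names and the three 6-digit codes are hypothetical and are not taken from the report's list of green industries.

```python
def green_four_digit_codes(green_six_digit_codes):
    """Return every 4-digit NAICS prefix that contains a likely-green 6-digit code."""
    return {code[:4] for code in green_six_digit_codes}


def is_green_stratum(naics_code, green_four_digit):
    """A worksite is in a green stratum if its 4-digit NAICS group was flagged."""
    return naics_code[:4] in green_four_digit


# Hypothetical 6-digit codes assumed to be "likely green" (illustration only).
likely_green = ["221119", "238220", "541330"]
green4 = green_four_digit_codes(likely_green)

print(sorted(green4))                        # ['2211', '2382', '5413']
print(is_green_stratum("238210", green4))    # True: shares the 4-digit group 2382
print(is_green_stratum("722110", green4))    # False: no green code under 7221
```

The second call shows why most jobs in a green stratum can still be non-green: a code is swept in because of its 4-digit group, not because it was itself judged green.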

Industry (NAICS)

Not including Public Administration, there are 23 2-digit NAICS codes that cover 19 industrial sectors. Of these 2-digit NAICS codes, 16 contained at least some green 4-digit NAICS codes. Because the remainder of each of these 16 NAICS codes is not classified green, two separate sampling cells, green and non-green, are required for each of them. In addition, there were seven 2-digit NAICS codes that had no “green” 4-digit NAICS codes. Thus, each of these 2-digit NAICS strata requires only one cell, non-green, for purposes of sample selection. For the sample size of each NAICS stratum, see Table 21.
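As a quick check on the cell arithmetic in this paragraph, the sketch below counts sampling cells under the assumption that cells are defined only by 2-digit NAICS and green status. The function name is hypothetical, and Table 21 may break the strata down further (for example by size), so the total here is illustrative rather than a figure from the report.

```python
def count_sampling_cells(codes_with_green, codes_without_green):
    """Two cells (green, non-green) per mixed 2-digit code, one cell otherwise."""
    return 2 * codes_with_green + codes_without_green


total_two_digit_codes = 16 + 7
cells = count_sampling_cells(16, 7)

print(total_two_digit_codes)  # 23, matching the count stated in the text
print(cells)                  # 39 cells under the stated assumption
```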

Estimation<br />

After random sampling and data collection, the following estimation procedure was used.

1) Sum across green job descriptions in the sample data and remove non-unique survey IDs (multiple job descriptions for one employer).

a. Out-of-business worksites (OOBs) are counted as 0 jobs for purposes of summing and weighting green jobs in all categories. For example, if ½ of a sample cell is composed of OOBs, then infer ½ OOBs in the universe cell (0 jobs for half of the cell population).

2) Divide data into Wave 1 (data received prior to June 14) and Wave 2 (data received on or after June 14). The cut-off date, June 14, was chosen because it provided respondents with a ten-day grace period and coincided with the start of an intensive campaign to improve response using phone calls, emails, postcard reminders, and additional survey mailings to nonrespondents.

3) Make histograms and compute summary statistics comparing Wave 1 covariates (NAICS, Size, Green, etc.) to Wave 2 covariates, and run a Z-test to determine whether a systematic difference between Wave 1 and Wave 2 data is likely. The Z-test indicated, at the 90 percent confidence level, that Wave 1 and Wave 2 differed in terms of size category. This nonresponse bias is corrected in the estimation of green jobs in the universe, described below.

4) Load data from the QCEW universe.

a. Use file “UNIVERSEEQUI093.csv”.

5) Estimate a logit model from the sample data, stratified between Wave 1 and Wave 2, to produce the propensity-to-respond variable (propensity score) for the universe of data. Wave 1 sample data will then be used to infer green jobs in the universe of likely responders, and Wave 2 sample data will be used to infer green jobs in the universe of likely non-responders. This procedure is intended to remove any nonresponse bias that may exist.

a. Linear model: logit(y) = BX + e. Here, logit(Responder) = B(Green + C2.NAICS + County + Size) + e, where B is a vector of four coefficients estimated by logistic regression.

b. The model fitted to the Wave 1 and Wave 2 sample data is used to predict which unsampled observations would have been likely to respond or not respond (given their covariates: Green, NAICS, County, and Size). This is the unsampled worksite’s “propensity” to respond. Those with the highest propensity to respond (above a cutoff propensity x) are coded as Responders. The cutoff propensity is determined such that the proportion of responders in the universe of data equals the proportion of responders in the sample,
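Three pieces of the procedure above lend themselves to a short sketch: the wave split in step 2, the Z-test in step 3, and the cutoff rule in step 5b. The code below is a minimal illustration under several assumptions, not the report's implementation. The year 2010 for the June 14 cutoff is inferred from the report date; the report does not say which Z-test was used, so a pooled two-proportion test is assumed; Wave 1 is treated as the "responder" group; and all function names, record fields, counts, and propensity scores are hypothetical. The propensity scores are taken as given, as if they came from the fitted logit model.

```python
import math
from datetime import date

# Assumption: the source gives only "June 14"; the year is inferred, not stated.
WAVE_CUTOFF = date(2010, 6, 14)


def split_waves(responses, cutoff=WAVE_CUTOFF):
    """Wave 1: received before the cutoff. Wave 2: received on or after it."""
    wave1 = [r for r in responses if r["received"] < cutoff]
    wave2 = [r for r in responses if r["received"] >= cutoff]
    return wave1, wave2


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion Z statistic comparing x1/n1 with x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def code_responders(propensities, responder_share):
    """Flag the highest-propensity worksites as Responders so that their
    share of the universe matches the responder share in the sample."""
    n_resp = round(responder_share * len(propensities))
    ranked = sorted(range(len(propensities)),
                    key=lambda i: propensities[i], reverse=True)
    flags = [False] * len(propensities)
    for i in ranked[:n_resp]:
        flags[i] = True
    cutoff = propensities[ranked[n_resp - 1]] if n_resp else None
    return flags, cutoff


# Hypothetical survey returns (illustration only).
responses = [
    {"id": 1, "received": date(2010, 6, 1)},
    {"id": 2, "received": date(2010, 6, 10)},
    {"id": 3, "received": date(2010, 6, 14)},
    {"id": 4, "received": date(2010, 6, 20)},
    {"id": 5, "received": date(2010, 6, 13)},
]
wave1, wave2 = split_waves(responses)
share = len(wave1) / len(responses)

# Hypothetical counts: worksites in one size category, Wave 1 versus Wave 2.
z = two_proportion_z(30, 100, 15, 100)

# Hypothetical propensity scores for five unsampled universe worksites.
flags, cutoff = code_responders([0.9, 0.2, 0.7, 0.4, 0.6], share)

print(len(wave1), len(wave2))   # 3 2
print(abs(z) > 1.645)           # True: differs at the two-sided 90 percent level
print(flags, cutoff)            # [True, False, True, False, True] 0.6
```

Ranking and taking the top share avoids solving for the cutoff directly: the cutoff is simply the lowest propensity among the worksites coded as Responders.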

