Hawai‘i's Green Workforce: A Baseline Assessment, December 2010
After critical review of remaining NAICS by both
the Hawai‘i Labor Market Research Section and the
Hawai‘i Green Jobs Initiative Team, several additional
industries were designated as green. All told, there
were 113 NAICS 4-digit level industry codes that
Hawai‘i classified as green. However, it should be
noted that all the remaining non-green industries at
the 2-digit NAICS level were sampled, though at a
much lower rate than those in the 4-digit NAICS green
industries.
For purposes of stratification, “green” means that at<br />
least a small number of codes at the 6-digit level in<br />
that particular 2-digit NAICS were likely green. The<br />
entire 4-digit NAICS was categorized green, even<br />
though most of the jobs within those codes are likely<br />
to be non-green.<br />
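
The classification rule above can be sketched in a few lines of Python. This is only an illustration of the stated rule (a 4-digit NAICS is treated as green when at least one 6-digit code beneath it is likely green); the function names and the example codes are hypothetical and are not taken from the report's list of 113 industries.

```python
def green_4digit_codes(likely_green_6digit):
    """Return the 4-digit NAICS prefixes that contain at least one
    6-digit code judged likely to be green."""
    return {code[:4] for code in likely_green_6digit}


def stratum_label(naics_code, green_4digit):
    """Label a worksite's NAICS code as 'green' or 'non-green' for
    stratification, based only on its 4-digit prefix."""
    return "green" if naics_code[:4] in green_4digit else "non-green"


# Hypothetical example codes, for illustration only.
likely_green = ["221119", "238220", "541620"]
green4 = green_4digit_codes(likely_green)

# A different 6-digit code under the same 4-digit prefix is still
# placed in the green stratum, even if its jobs are mostly non-green.
print(stratum_label("221112", green4))
print(stratum_label("722110", green4))
```

The sketch makes the consequence noted above explicit: the green stratum is defined by the 4-digit prefix, so it necessarily includes many worksites whose jobs are not green.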
Industry (NAICS)
Not including Public Administration, there are 23<br />
2-digit NAICS codes that cover 19 industrial sectors.<br />
Of these 2-digit NAICS, 16 contained
at least some green 4-digit NAICS codes. Because the
remaining 4-digit codes within these 16 NAICS are not classified green,
two separate sampling cells, green and non-green,
are required for each of these 2-digit NAICS.
In addition, seven 2-digit NAICS
had no “green” 4-digit NAICS. Each of these
2-digit NAICS strata therefore requires only one cell,
non-green, for purposes of sample selection. For the
sample size of each NAICS stratum, see Table 21.
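
As a quick check on the cell arithmetic described above, the following sketch enumerates the industry sampling cells. The sector labels are placeholders rather than actual NAICS codes, and the sketch covers only the industry dimension; any further stratification used in the survey is not represented here.

```python
def build_sampling_cells(sectors_with_green, sectors_without_green):
    """Build (sector, cell) pairs: two cells for each sector that
    contains green 4-digit codes, one cell for each sector that
    contains none."""
    cells = []
    for sector in sectors_with_green:
        cells.append((sector, "green"))
        cells.append((sector, "non-green"))
    for sector in sectors_without_green:
        cells.append((sector, "non-green"))
    return cells


# Placeholder labels standing in for the 16 + 7 two-digit NAICS codes.
with_green = [f"G{i:02d}" for i in range(1, 17)]
without_green = [f"N{i:02d}" for i in range(1, 8)]

cells = build_sampling_cells(with_green, without_green)
print(len(cells))  # 16 * 2 + 7 = 39
```

Under these counts the industry dimension yields 39 cells: 32 from the 16 sectors with both green and non-green cells, plus 7 non-green-only cells.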
Estimation<br />
After random sampling and data collection, the<br />
following estimation procedure was followed.<br />
1) Sum across Green job descriptions in the sample
data and remove non-unique survey IDs (multiple<br />
job descriptions for one employer)<br />
a. Out-of-business worksites (OOBs) are
counted as 0 jobs for purposes of summing<br />
and weighting of green jobs for all<br />
categories. For example, if ½ of a sample<br />
cell is composed of OOBs, then infer ½<br />
OOBs in the universe cell (0 jobs for half<br />
of the cell population).<br />
2) Divide data into Wave 1 (data received prior to<br />
June 14) and Wave 2 (data received on or after<br />
June 14). The cut-off date, June 14, was chosen<br />
because it provided respondents with a ten-day<br />
grace period, and coincided with the start of an<br />
intensive campaign to improve response using<br />
phone calls, emails, postcard reminders, and<br />
additional survey mailings to nonrespondents.<br />
3) Produce histograms and summary statistics
comparing Wave 1 covariates (NAICS, Size,
Green, etc.) to Wave 2 covariates, and perform a Z-test to
determine whether systematic bias between Wave 1 and
Wave 2 data is likely. The Z-test indicated, at the 90
percent confidence level, that Wave 1 and Wave 2
differed systematically in terms of size category. This
nonresponse bias is corrected in the estimation of
green jobs in the universe below.
4) Load data from QCEW universe<br />
a. Use file “UNIVERSEEQUI093.csv”<br />
5) Estimate a logit model from the sample data,
distinguishing Wave 1 from Wave 2 responses, to obtain
the propensity to respond variable (propensity
score) for the universe of data. Wave 1 sample data
are then used to infer green jobs in the universe
of likely responders, and Wave 2 sample data are
used to infer green jobs in the universe of likely
non-responders. This procedure is intended to remove any
nonresponse bias that may exist.
a. Linear model: logit(y) = BX + e
logit(Responder) = B(Green + C2.NAICS +
County + Size) + e, where B is a vector of
four coefficients estimated by logistic
regression.
b. The model resulting from the Wave 1 and<br />
Wave 2 sample data is used to predict<br />
which unsampled observations would<br />
have been likely to respond or not respond<br />
(given their covariates – Green, NAICS,
County, and Size). This is the unsampled<br />
worksite’s “propensity” to respond. Those<br />
with the highest propensity to respond<br />
(with cutoff propensity = x) are coded<br />
as Responders. The cutoff propensity is<br />
determined such that the proportion of<br />
responders in the universe of data equals<br />
the proportion of responders in the sample,<br />