A Method to Estimate the Human Capital from Sample Surveys on ...

Indirect indicators Γ:
x2 = H Gender; x3 = H Race; x6 = S Age; y5 = H Years of Not Full-Time Work; y7 = S Years of Not Full-Time Work; y8 = H Job Status; y9 = H Occupation; y10 = H Industry; y11 = S Job Status; y12 = S Occupation; y13 = S Industry

Formative indicators F = (Ψ, y14):
Ψ: x1 = H Age; x4 = Region; x5 = H Marital Status; x7 = S Gender; y1 = H Years of Schooling; y2 = S Years of Schooling; y3 = Number of Children; y4 = H Years of Full-Time Work; y6 = S Years of Full-Time Work; y14 = Household Total Wealth; y15 = Household Total Debts

Reflective indicator:
y17 = Household Income

H: Household Head; S: Spouse

Table 1: Observed indicators

The statistical definition of the LV HC

We have already stated (Dagum and Vittadini 1996) that, from the statistical point of view, HC can be expressed as an LV. But there are different ways an LV can be defined. Traditionally, a variable can be defined as an LV if the equations cannot be manipulated into expressing the variable as a function of manifest variables (Bentler 1982). In other words, in this definition, an LV is a factor that underlies and causes reflective indicators and accounts for their observed variance in a measurement model (typically the factor model), given the effects of other explicative indicators (in this case the reflective indicator Income, given the effect of the explicative indicator Wealth in equation (3)).
Alternatively, we can define HC as a latent variable caused and measured (with errors) by a linear combination of the formative indicators F in equation (2). Finally, we can propose a third, more complete definition of an LV, as in this case, where it is connected with both formative and reflective indicators in a Path Diagram. Hence the latent variable HC can be defined as the linear combination of formative indicators F that best fits the reflective indicator earning income, as in equations (2)-(3).

The proposed methodology

This approach completes the methodology proposed by Dagum and Slottje (2000), who combine a zerodimensional latent variable approach (part A) and an actuarial mathematical approach (part B).

The Latent Variable approach proposes a new methodology able to obtain the zerodimensional HC latent variable, then transforms the estimated latent variable into an accounting monetary value, and finally estimates the mean value of HC. The Path Analysis and the Latent Variable Approach are shown in Figure 1.

The Actuarial Mathematical approach starts with the actuarial estimation, in monetary values, of the average human capital by age of economic units and finally estimates the average of the population in monetary units.
The synthesis gives the final HC estimation and the distribution of HC among American households.

[Figure 1: Path Analysis and Latent Variables approach. Elements of the path diagram: indirect indicators Γ; indicators of household investment in education, i.e. the formative indicators Ψ (H and S years of schooling; H and S years of total time work; H age; region; H marital status; S gender; number of children); wealth y14; the latent variable human capital (HC); and the indicator of the effects of HC, i.e. the reflective indicator income y17.]

$$
\begin{aligned}
y_1 &= g_1(x_1, x_3, x_4, x_5) + u_1 \\
y_2 &= g_2(x_1, x_2, x_3, x_4, x_5, y_1) + u_2 \\
y_3 &= g_3(x_1, x_3, x_4, x_5, y_1) + u_3 \\
y_4 &= g_4(x_1, x_2, x_3, x_4, x_5, y_2, y_3) + u_4 \\
y_5 &= g_5(x_1, x_2, x_3, x_4, x_5, y_1, y_4) + u_5 \\
y_6 &= g_6(x_2, x_3, x_4, x_5, x_6, y_2, y_3, y_4) + u_6 \\
y_7 &= g_7(x_2, x_4, x_5, x_6, y_2, y_5, y_6) + u_7 \\
y_8 &= g_8(x_1, x_2, x_3, x_4, x_5, y_1, y_3, y_4) + u_8 \\
y_9 &= g_9(x_1, x_2, x_3, y_8) + u_9 \\
y_{10} &= g_{10}(x_2, x_3, x_4, y_1, y_4, y_5, y_9) + u_{10} \\
y_{11} &= g_{11}(x_1, x_2, x_4, x_5, x_6, y_2, y_3, y_6, y_9) + u_{11} \\
y_{12} &= g_{12}(x_1, x_2, x_4, x_5, x_6, y_2, y_3, y_9, y_{11}) + u_{12} \\
y_{13} &= g_{13}(x_3, x_4, y_6, y_{12}) + u_{13} \\
y_{14} &= g_{14}(x_4, y_1, y_2, y_4, y_7, y_8, y_9, y_{10}, y_{11}, y_{12}, y_{13}) + u_{14} \\
y_{15} &= g_{15}(x_1, x_3, x_4, y_1, y_2, y_3, y_4, y_9, y_{10}, y_{12}, y_{14}) + u_{15}
\end{aligned}
\qquad (1)
$$

$$HC = F g + u_{16} = [y_{14}, \Psi]\, g + u_{16} \qquad (2)$$
$$y_{17} = y_{14} k_1 + HC\, k_2 + u_{17} \qquad (3)$$

The Latent Variable Approach: previous proposal

The traditional proposal in the statistical literature is to obtain the latent variable HC as a latent cause underlying the observed indicators by means of Factor Analysis. In this case, starting from (3), we obtain:

$$Q_{y_{14}} y_{17} = HC\, k_2^{\#} + u_{17} \qquad (4)$$

where $Q_{y_{14}} = I - P_{y_{14}}$ is the projector onto the orthogonal complement of the column space of $y_{14}$, with $P_{y_{14}} = y_{14} (y_{14}' y_{14})^{-1} y_{14}'$.

By means of Factor Analysis we obtain HC as the latent cause of the reflective indicator earning income. This approach has two drawbacks. First, HC is defined without taking into account the amount of investment in education measured by the formative indicators F. Second, under general conditions, given earning income net of wealth, $Q_{y_{14}} y_{17}$, the parameter $k_2^{\#}$ is not identified and the scores of the latent variable HC are not unique. In a Factorial or Structural model in which the expected values of the latent variables are null, the identification problem is essentially whether or not the vector ϑ of parameters and of variances and covariances of latent variables and errors is uniquely determined by the covariance matrix Σ of the indicators, whose elements are $\sigma_{ij}$.
In other words, if the vector ϑ can be uniquely determined from Σ (and therefore if Σ is generated by one and only one vector ϑ) by solving the equations $\sigma_{ij} = \sigma_{ij}(\vartheta)$, $i \le j$, or a subset of them, then this parameter vector is identified and the whole model is said to be identified; otherwise it is not. With p manifest variables, there are ½ p(p + 1) equations in n(ϑ) unknown parameters. Anderson and Rubin showed that a necessary condition for identification is that the number of equations $\sigma_{ij} = \sigma_{ij}(\vartheta)$, $i \le j$, must be at least as large as the order of the vector ϑ: p ≥ 2t n(ϑ) + 1. However, since the equations above are often non-linear, the solution is often complicated and tedious, and explicit solutions for all the ϑ's seldom exist. "No general and practically useful necessary and sufficient conditions for identification are available" (Everitt 1984).

If a model is not completely identified, appropriate restrictions may be imposed on ϑ to make it identifiable. The choice of restrictions may affect the interpretation of the results of an estimated model. Under general conditions for the Factor Model, apart from a few very restricted cases in which the conditions for identifiability are studied analytically, e.g. where the endogenous variables are measured without error (Geraci 1976), the problem cannot be resolved.
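The counting argument above can be sketched in a few lines. This is only an illustration under stated assumptions: a single-factor model with the factor variance fixed to one and uncorrelated errors, so that the free parameters are p loadings and p error variances. The function names are hypothetical, and the rule checked is a necessary condition only, not a sufficient one.

```python
def covariance_equations(p: int) -> int:
    """Number of distinct equations sigma_ij = sigma_ij(theta), i <= j."""
    return p * (p + 1) // 2


def one_factor_parameters(p: int) -> int:
    """Free parameters of the assumed one-factor model:
    p loadings plus p error variances (factor variance fixed to 1)."""
    return 2 * p


def passes_counting_rule(p: int) -> bool:
    """Necessary (not sufficient) condition: at least as many equations
    as unknown parameters."""
    return covariance_equations(p) >= one_factor_parameters(p)


if __name__ == "__main__":
    for p in (1, 2, 3, 4):
        print(p, covariance_equations(p), one_factor_parameters(p),
              passes_counting_rule(p))
```

With a single manifest variable (p = 1) there is one equation and two unknowns, so the counting rule already fails under these assumptions.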
In practice, it is suggested (Jöreskog 1981b) that "The identification problem can be studied on a case by case basis by examining the equations", choosing the restrictions, not only in number but also in position, in order to obtain unique solutions. This is also true in the case of local identifiability of the parameters (Wegge 1965, 1991; Fisher 1976; Rothenberg 1971; Geraci 1976; Bekker and Pollock 1986; Shapiro 1985; Bekker 1989, 1991; Wegge and Feldman 1983).

In our case, we have one equation $\sigma_{ij} = \sigma_{ij}(\vartheta)$:

$$\sigma_{Q_{y_{14}} y_{17}} = (k_2^{\#})^2 + \sigma_{u_{17}} \qquad (5)$$

with two unknown values: the square of the parameter $k_2^{\#}$, $(k_2^{\#})^2$, and the variance of the error $u_{17}$, $\sigma_{u_{17}}$. Therefore, under general conditions, when the Reliability Ratio between $\sigma_{Q_{y_{14}} y_{17}}$ and $(k_2^{\#})^2$ is unknown, or the variance of the error $\sigma_{u_{17}}$ or Instrumental Variables are not available, the model (4) is not identifiable (Fuller 1987).

Regarding the problem of indeterminacy, we can verify that, under general conditions, the rank of the matrix of observed indicators is less than the rank of the matrix of latent scores and errors. Therefore, it can be demonstrated that even if the model is identified the latent scores are indeterminate: there are infinitely many sets of latent scores for the same identified model. It can be proved that some of them can even be negatively correlated with each other (Reiersol 1950; Guttman 1955; Anderson and Rubin 1956; Lawley and Maxwell 1963; Jöreskog 1967; Schönemann and Wang 1972; Schönemann and Steiger 1978; Steiger 1979; Schönemann and Haagen 1987).
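The indeterminacy of the latent scores can be illustrated numerically. The sketch below is not the authors' procedure. It uses simulated (hypothetical) wealth and income data, fixes the loading at an assumed value, normalises the variance of HC to one, and builds two score vectors that both satisfy a model of the form (4) exactly in the sample. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical centred data standing in for wealth (y14) and income (y17).
wealth = rng.normal(size=n)
income = 0.6 * wealth + rng.normal(size=n)
y14 = wealth - wealth.mean()
y17 = income - income.mean()

# Projector P = y14 (y14' y14)^{-1} y14' and its complement Q = I - P.
P = np.outer(y14, y14) / (y14 @ y14)
Q = np.eye(n) - P
z = Q @ y17                      # income net of wealth

var_z = z @ z / n
k = 0.5 * np.sqrt(var_z)         # assumed loading, treated as known
rho2 = k**2 / var_z              # implied reliability ratio (0.25 here)

# Arbitrary direction s: centred, uncorrelated with z, unit variance.
s = rng.normal(size=n)
s -= s.mean()
s -= (s @ z) / (z @ z) * z
s /= np.sqrt(s @ s / n)


def hc_scores(sign: float) -> np.ndarray:
    """Scores with unit variance whose covariance with z equals k."""
    return (k / var_z) * z + sign * np.sqrt(1.0 - rho2) * s


hc_a, hc_b = hc_scores(+1.0), hc_scores(-1.0)
u_a, u_b = z - k * hc_a, z - k * hc_b    # implied errors

# Both decompositions reproduce z exactly, with errors uncorrelated
# with the scores, yet the two score vectors differ.
corr_ab = np.corrcoef(hc_a, hc_b)[0, 1]  # equals 2 * rho2 - 1
print(round(corr_ab, 6))
```

Under these assumptions the correlation between the two admissible score vectors is 2ρ² − 1, where ρ² is the reliability ratio, so it is negative whenever the reliability is below one half.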
In this case, given $Q_{y_{14}} y_{17}$ and $k_2^{\#}$, we can obtain infinitely many sets of scores of HC; moreover, some of them can be negatively correlated.

An alternative proposal is given by the Partial Least Squares Method (from here on referred to as PLS): PLS provides estimates of the parameters g in (2), defining and estimating an LV "by deliberate approximation as a linear aggregate of its observed indicators" (Wold 1982). In this definition the HC appearing in (2) is not a factor of the observed reflective indicators (3) but an unobserved theoretical construct, approximated by a linear combination of observed formative indicators, e.g. following equation (2):

$$\widehat{HC} = F \hat{g} \qquad (6)$$

where $\widehat{HC}$ is the proxy obtained by reducing the loss of information with respect to the unobservable HC.

There are two alternatives for obtaining the solutions of $\widehat{HC}$ in (6) by means of PLS. PLS mode A is based on iterative multivariate regressions of the LVs on the observed indicators; therefore, if there is a single LV, it cannot be used, because it causes "circular solutions" without improvements in the iterations. PLS mode B is based on simple iterative regressions on the observed indicators $F = (y_{14}; \Psi)$. It can be proved that the estimate of $\widehat{HC}$ is equivalent to the first principal component of F (Wold 1982). Therefore we have in (6):

$$\widehat{HC} = F v_1 = y_{14} v_{11} + \Psi v_{12} \qquad (7)$$
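Equation (7) can be sketched as a first-principal-component computation. The example below is only an illustration: the data are simulated, the indicators are standardised before the decomposition (an assumption, since the text does not say how F is scaled), and the function name is hypothetical.

```python
import numpy as np


def first_principal_component(F: np.ndarray):
    """Return (v1, scores, eigenvalue) for the first principal component
    of the column-standardised indicator matrix F."""
    Z = (F - F.mean(axis=0)) / F.std(axis=0)   # standardisation is an assumption
    cov = Z.T @ Z / Z.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    v1 = eigvecs[:, -1]                        # weights of the first component
    if v1.sum() < 0:                           # fix the arbitrary sign
        v1 = -v1
    return v1, Z @ v1, eigvals[-1]


rng = np.random.default_rng(1)
n = 300

# Hypothetical indicators sharing a common component.
common = rng.normal(size=n)
y14 = common + 0.5 * rng.normal(size=n)        # stands in for wealth
psi = np.column_stack(
    [common + 0.5 * rng.normal(size=n) for _ in range(3)]
)                                              # stands in for Psi

F = np.column_stack([y14, psi])
v1, hc_hat, lam1 = first_principal_component(F)

# v1[0] plays the role of v11 (weight on y14), v1[1:] that of v12.
print(v1.round(3))
```

The weight vector has unit length, and the variance of the resulting scores equals the largest eigenvalue of the correlation matrix of F.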