Table 1: Observed indicators

Indirect indicators Γ:
$x_2$ = H Gender; $x_3$ = H Race; $x_6$ = S Age; $y_5$ = H Years of Not Full-Time Work; $y_7$ = S Years of Not Full-Time Work; $y_8$ = H Job Status; $y_9$ = H Occupation; $y_{10}$ = H Industry; $y_{11}$ = S Job Status; $y_{12}$ = S Occupation; $y_{13}$ = S Industry.

Formative indicators F = (Ψ, $y_{14}$):
Ψ: $x_1$ = H Age; $x_4$ = Region; $x_5$ = H Marital Status; $x_7$ = S Gender; $y_1$ = H Years of Schooling; $y_2$ = S Years of Schooling; $y_3$ = Number of Children; $y_4$ = H Years of Full-Time Work; $y_6$ = S Years of Full-Time Work; $y_{14}$ = Household Total Wealth; $y_{15}$ = Household Total Debts.

Reflective indicator:
$y_{17}$ = Household Income.

H: Household Head; S: Spouse.

The statistical definition of the LV HC

We have already stated (Dagum and Vittadini 1996) that, from the statistical point of view, HC can be expressed as an LV. There are, however, different ways in which an LV can be defined. Traditionally, a variable is defined as an LV if the equations cannot be manipulated so as to express the variable as a function of manifest variables (Bentler 1982).
In other words, in this definition an LV is a factor that underlies and causes reflective indicators and accounts for their observed variance in a measurement model (typically the factor model), given the effects of other explicative indicators (in this case the reflective indicator income, given the effect of the explicative indicator wealth in equation (3)). Alternatively, we can define HC as a latent variable caused and measured (with errors) by a linear combination of the formative indicators F in equation (2). Finally, we can propose a third, more complete, definition of an LV, as in this case, where it is connected with both formative and reflective indicators in a Path Diagram.
Hence the latent variable HC can be defined as the linear combination of the formative indicators F that best fits the reflective indicator earning income, as in equations (2)-(3).

The proposed methodology

This approach completes the methodology proposed by Dagum and Slottje (2000), who combine a zerodimensional latent variable approach (part A) and an actuarial mathematical approach (part B).

The Latent Variable approach proposes a new methodology that obtains the zerodimensional HC latent variable, then transforms the estimated latent variable into an accounting monetary value, and finally estimates the mean value of HC.
The Path Analysis and the Latent Variable Approach are shown in Figure 1.

The Actuarial Mathematical approach starts with the actuarial estimation, in monetary values, of the average human capital by age of the economic units, and finally estimates the population average in monetary units. The synthesis of the two approaches gives the final estimate of HC and its distribution across American households.

[Figure 1: Path Analysis and Latent Variables approach. Elements of the path diagram: indirect indicators Γ; indicators of household investment in education (formative indicators Ψ: H and S years of schooling; H and S years of total time work; H age; region; H marital status; S gender; number of children); wealth $y_{14}$; the latent variable Human Capital (HC); and the indicator of the effects of HC (reflective indicator income $y_{17}$).]

$$
\begin{aligned}
y_1 &= g_1(x_1, x_3, x_4, x_5) + u_1 \\
y_2 &= g_2(x_1, x_2, x_3, x_4, x_5, y_1) + u_2 \\
y_3 &= g_3(x_1, x_3, x_4, x_5, y_1) + u_3 \\
y_4 &= g_4(x_1, x_2, x_3, x_4, x_5, y_2, y_3) + u_4 \\
y_5 &= g_5(x_1, x_2, x_3, x_4, x_5, y_1, y_4) + u_5 \\
y_6 &= g_6(x_2, x_3, x_4, x_5, x_6, y_2, y_3, y_4) + u_6 \\
y_7 &= g_7(x_2, x_4, x_5, x_6, y_2, y_5, y_6) + u_7 \\
y_8 &= g_8(x_1, x_2, x_3, x_4, x_5, y_1, y_3, y_4) + u_8 \\
y_9 &= g_9(x_1, x_2, x_3, y_8) + u_9 \\
y_{10} &= g_{10}(x_2, x_3, x_4, y_1, y_4, y_5, y_9) + u_{10} \\
y_{11} &= g_{11}(x_1, x_2, x_4, x_5, x_6, y_2, y_3, y_6, y_9) + u_{11} \\
y_{12} &= g_{12}(x_1, x_2, x_4, x_5, x_6, y_2, y_3, y_9, y_{11}) + u_{12} \\
y_{13} &= g_{13}(x_3, x_4, y_6, y_{12}) + u_{13} \\
y_{14} &= g_{14}(x_4, y_1, y_2, y_4, y_7, y_8, y_9, y_{10}, y_{11}, y_{12}, y_{13}) + u_{14} \\
y_{15} &= g_{15}(x_1, x_3, x_4, y_1, y_2, y_3, y_4, y_9, y_{10}, y_{12}, y_{14}) + u_{15}
\end{aligned}
\tag{1}
$$

$$
HC = F g + u_{16} = [y_{14}, \Psi]\, g + u_{16} \tag{2}
$$
$$
y_{17} = y_{14} k_1 + HC\, k_2 + u_{17} \tag{3}
$$

The Latent Variable Approach: previous proposal

The traditional proposal in the statistical literature is to obtain the latent variable HC, by means of Factor Analysis, as a latent cause that underlies the observed indicators. In this case, starting from (3), we obtain:

$$
Q_{y_{14}}\, y_{17} = HC\, k_2^{\#} + u_{17} \tag{4}
$$

where $Q_{y_{14}} = I - P_{y_{14}}$ is the projector onto the orthogonal complement of the column space of $y_{14}$, with $P_{y_{14}} = y_{14} (y_{14}' y_{14})^{-1} y_{14}'$.

By means of Factor Analysis we obtain HC as the latent cause of the reflective indicator earning income. First of all, in this way we define HC without taking into account the amount of investment in education measured by the formative indicators F. Secondly, under general conditions, given earning income net of the effect of wealth, $Q_{y_{14}} y_{17}$, the parameter $k_2^{\#}$ is not identified and the scores of the latent variable HC are not unique.
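The projection used in equation (4) can be sketched numerically. The following is a minimal illustration, assuming NumPy and synthetic data; the function name `residual_maker` and the simulated `wealth` and `income` vectors are hypothetical and are not taken from the survey data used in the paper.

```python
import numpy as np

def residual_maker(y14):
    """Return Q = I - P, where P = y14 (y14' y14)^{-1} y14' projects
    onto the column space of y14 (treated here as a single column)."""
    y14 = np.asarray(y14, dtype=float).reshape(-1, 1)
    n = y14.shape[0]
    P = y14 @ np.linalg.inv(y14.T @ y14) @ y14.T
    return np.eye(n) - P

# Synthetic stand-ins for wealth (y14) and income (y17): assumptions only.
rng = np.random.default_rng(0)
wealth = rng.normal(size=50)
income = 0.8 * wealth + rng.normal(size=50)

Q = residual_maker(wealth)
income_net = Q @ income  # income with the linear effect of wealth removed

# By construction, the residualised income is orthogonal to wealth.
print(abs(wealth @ income_net) < 1e-8)
```

Because Q is symmetric and idempotent, applying it twice changes nothing, which is why the left-hand side of (4) no longer contains a wealth term.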
In a Factorial or in a Structural model, when the expected values of the latent variables are null, the identification problem is essentially whether or not the vector $\vartheta$ of parameters and of variances and covariances of latent variables and errors is uniquely determined by the covariance matrix $\Sigma$ of the indicators, whose elements are $\sigma_{ij}$. In other words, if a vector $\vartheta$ can be uniquely determined from $\Sigma$ (and therefore if $\Sigma$ is generated by one and only one vector $\vartheta$) by solving the equations $\sigma_{ij} = \sigma_{ij}(\vartheta)$, $i \le j$, or a subset of them (with $p$ manifest variables, there are $\tfrac{1}{2} p(p+1)$ equations in $n(\vartheta)$ unknown parameters), then this vector of parameters is identified and the whole model is said to be identified; otherwise it is not.
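The count of equations against unknowns can be made concrete with a small sketch. The helper names below are hypothetical, and the check shown is only the elementary counting rule (at least as many equations as unknowns), which is necessary but not sufficient for identification.

```python
def count_moment_equations(p):
    """Number of distinct equations sigma_ij = sigma_ij(theta), i <= j,
    available from the covariance matrix of p manifest variables."""
    return p * (p + 1) // 2

def passes_counting_check(p, n_params):
    """True if there are at least as many equations as unknown parameters.
    Passing this check does not by itself guarantee identification."""
    return count_moment_equations(p) >= n_params

# One manifest variable and two unknowns: one equation only,
# so the counting check already fails.
print(count_moment_equations(1), passes_counting_check(1, 2))

# Six manifest variables give 21 equations.
print(count_moment_equations(6))
```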
Anderson and Rubin showed that a necessary condition for identification is that the number of equations $\sigma_{ij} = \sigma_{ij}(\vartheta)$, $i \le j$, must be greater than the order of the vector $\vartheta$: p ≥ 2t n(ϑ)+1. However, since the equations above are often non-linear, solving them is often complicated and tedious, and explicit solutions for all the elements of $\vartheta$ seldom exist. "No general and practically useful necessary and sufficient conditions for identification are available" (Everitt 1984).

If a model is not completely identified, appropriate restrictions may be imposed on $\vartheta$ to make it identifiable. The choice of restrictions may affect the interpretation of the results of an estimated model. Under general conditions for the Factor Model, apart from a few very restricted cases in which the conditions for identifiability are studied analytically, e.g. where the endogenous variables are measured without error (Geraci 1976), the problem cannot be resolved.
In practice, it is suggested (Jöreskog 1981b) that "The identification problem can be studied on a case by case basis by examining the equations", choosing the restrictions, not only in number but also in position, so as to obtain unique solutions. This is also true in the case of local identifiability of the parameters (Wegge 1965, 1991; Fisher 1976; Rothenberg 1971; Geraci 1976; Bekker and Pollock 1986; Shapiro 1985; Bekker 1989, 1991; Wegge and Feldman 1983).

In our case, we have one equation $\sigma_{ij} = \sigma_{ij}(\vartheta)$:

$$
\sigma_{Q_{y_{14}} y_{17}} = (k_2^{\#})^2 + \sigma_{u_{17}} \tag{5}
$$

This equation has two unknowns: the square of the parameter $k_2^{\#}$, $(k_2^{\#})^2$, and the variance of the error $u_{17}$, $\sigma_{u_{17}}$.
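The under-determination of equation (5) can be illustrated with a few lines of arithmetic. This is a sketch under stated assumptions: the observed variance of 4.0 and the candidate values of the parameter are made up for illustration, and `admissible_pairs` is a hypothetical helper.

```python
def admissible_pairs(observed_var, k_values):
    """For each candidate k, return (k, sigma_u) with k**2 + sigma_u equal to
    the observed variance, keeping only non-negative error variances."""
    pairs = []
    for k in k_values:
        sigma_u = observed_var - k ** 2
        if sigma_u >= 0:
            pairs.append((k, sigma_u))
    return pairs

# Hypothetical observed variance of income net of wealth.
pairs = admissible_pairs(4.0, [0.5, 1.0, 1.5, 2.0])
for k, sigma_u in pairs:
    # Every pair reproduces the same observed variance exactly.
    print(k, sigma_u, k ** 2 + sigma_u)
```

Every candidate value of the parameter between 0 and the square root of the observed variance is paired with an error variance that fits the single equation equally well, so the data cannot choose among them.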
Therefore, under general conditions, when none of the following is available: the reliability ratio between $\sigma_{Q_{y_{14}} y_{17}}$ and $(k_2^{\#})^2$, the variance of the error $\sigma_{u_{17}}$, or instrumental variables, the model (4) is not identifiable (Fuller 1987).

Regarding the problem of indeterminacy, we can verify that, under general conditions, the rank of the matrix of observed indicators is lower than that of the matrix of latent scores and errors. Therefore, it can be demonstrated that even if the model is identified, the latent scores are indeterminate: there are infinitely many sets of latent scores for the same identified model. It can be proved that some of these sets can even be negatively correlated with each other (Reiersol 1950; Guttmann 1955; Anderson and Rubin 1956; Lawley and Maxwell 1963; Jöreskog 1967; Schonemann and Wang 1972; Schonemann and Steiger 1978; Steiger 1979; Schonemann and Haagen 1987).
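The possibility of negatively correlated score sets can be sketched for a single indicator. The construction below is an illustration only, assuming NumPy, a unit-variance latent variable, a known loading `k` and synthetic data; the function name `two_score_sets` is hypothetical. It builds two score vectors that reproduce the same moments with the indicator yet correlate at $2\rho^2 - 1$, where $\rho^2 = k^2 / \mathrm{var}(y)$ is the share of indicator variance explained by the latent variable.

```python
import numpy as np

def two_score_sets(y, k, rng):
    """Build two latent score vectors f1, f2 with unit variance and
    covariance k with y, differing only in an arbitrary component w
    that is uncorrelated with y."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    n = len(y)
    var_y = y @ y / n
    rho2 = k ** 2 / var_y
    if not 0 < rho2 <= 1:
        raise ValueError("k**2 must not exceed the variance of y")
    w = rng.normal(size=n)
    w = w - w.mean()
    w = w - (w @ y) / (y @ y) * y      # make w orthogonal to y
    w = w / np.sqrt(w @ w / n)         # scale w to unit variance
    f_hat = (k / var_y) * y            # part of the score determined by y
    s = np.sqrt(1.0 - rho2)            # weight of the arbitrary part
    return f_hat + s * w, f_hat - s * w, rho2

rng = np.random.default_rng(1)
y = rng.normal(size=200)
y = y - y.mean()
y = 2.0 * y / np.sqrt(y @ y / len(y))  # rescale so that var(y) = 4

f1, f2, rho2 = two_score_sets(y, k=1.0, rng=rng)
print(rho2, np.corrcoef(f1, f2)[0, 1])  # correlation equals 2 * rho2 - 1
```

In this sketch the latent variable explains a quarter of the indicator variance, so the two equally admissible score sets are negatively correlated.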
In this case, given $Q_{y_{14}} y_{17}$ and $k_2^{\#}$, we can obtain an infinite number of sets of HC scores; moreover, some of them can be negatively correlated.

An alternative proposal is given by the Partial Least Squares Method (from here on referred to as PLS). PLS provides estimates of the parameters $g$ in (2), defining and estimating an LV "by deliberate approximation as a linear aggregate of its observed indicators" (Wold 1982). In this definition the HC appearing in (2) is not a factor of the observed reflective indicators (3) but an unobserved theoretical construct, approximated by a linear combination of observed formative indicators, e.g. following equation (2):

$$
\widehat{HC} = F \hat{g} \tag{6}
$$

where $\widehat{HC}$ is the proxy obtained by reducing the loss of information with respect to the unobservable HC.

There are two alternatives for obtaining the solutions of $\widehat{HC}$ in (6) by means of PLS.
The PLS mode A is based on iterative multivariate regressions of the LVs on the observed indicators; therefore, if there is a single LV, it cannot be used, because it causes "circular solutions" without improvements in the iterations. The PLS mode B is based on simple iterative regressions on the observed indicators $F = (y_{14}, \Psi)$. It can be proved that the estimate $\widehat{HC}$ is equivalent to the first principal component of F (Wold 1982). Therefore we have in (6):

$$
\widehat{HC} = F v_1 = y_{14} v_{11} + \Psi v_{12} \tag{7}
$$
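Equation (7) can be sketched as a first-principal-component computation. This is a minimal illustration, assuming NumPy, standardised indicators and a synthetic matrix standing in for F; the function name `first_principal_component` is hypothetical, and the standardisation step is an assumption that the text does not specify.

```python
import numpy as np

def first_principal_component(F):
    """Return the weight vector v1 and the scores F v1 of the first
    principal component of the column-standardised matrix F."""
    F = np.asarray(F, dtype=float)
    Fc = F - F.mean(axis=0)
    Fs = Fc / Fc.std(axis=0, ddof=1)       # standardise each indicator
    _, _, Vt = np.linalg.svd(Fs, full_matrices=False)
    v1 = Vt[0]                             # first right singular vector
    return v1, Fs @ v1

# Synthetic stand-in for F = (y14, Psi): three correlated columns.
rng = np.random.default_rng(2)
common = rng.normal(size=(100, 1))
F = common + 0.5 * rng.normal(size=(100, 3))

v1, hc_hat = first_principal_component(F)
print(v1.shape, hc_hat.shape)
```

The sign of `v1` is arbitrary, as with any principal component, so in practice the scores may need to be oriented so that higher values correspond to higher human capital.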