Decomposing Household Income by Source and Subgroup - Alex Eble
<strong>Decomposing</strong> <strong>Household</strong> <strong>Income</strong> <strong>by</strong> <strong>Source</strong> <strong>and</strong> <strong>Subgroup</strong>:<br />
A Methodological Investigation Using Survey Data from Rural China<br />
<strong>Alex</strong> <strong>Eble</strong> 1<br />
Indiana University, Bloomington<br />
1 <strong>Alex</strong> <strong>Eble</strong>, correspondence Address: 9093 Sweet Bay Court, Indianapolis, IN 46260.<br />
Thanks are due to Ethan Michelson for impeccable guidance <strong>and</strong> access to his data.<br />
The Department of East Asian Languages <strong>and</strong> Cultures <strong>and</strong> the Hutton Honors<br />
College at Indiana University, Bloomington, provided much needed resources to help<br />
with the production of this paper.
I. Introduction:<br />
Gross inequality is widely considered an undesirable social condition to be mitigated<br />
<strong>by</strong> policy, as evidenced <strong>by</strong> diverse social programs instituted <strong>by</strong> government to<br />
encourage more equitable distribution of resources. This social conviction, in turn,<br />
encourages policy makers <strong>and</strong> social scientists to better underst<strong>and</strong> the current<br />
condition <strong>and</strong> sources of inequality. Massive ascent from poverty <strong>and</strong> rapid uneven<br />
development experienced in countries such as China <strong>and</strong> India only amplify this need.<br />
With the cooperation <strong>and</strong> guidance of Ethan Michelson I have been working to draw<br />
on economics to make a contribution to sociological research on income inequality.<br />
Sociologists focusing on China have paid a great deal of attention to inequality<br />
at the individual and regional (and even international) levels while largely overlooking<br />
the sources of income. This paper examines several methods, designed<br />
by economists, for describing inequality in terms of its contributing factors. It then<br />
applies these methods to a data set of rural household survey responses from China.<br />
The inequality indices used can “decompose,” meaning that the overall measure can<br />
be further explained or broken down into the sum of individual contributions to<br />
inequality. In one method of break-down, the paper looks at the contribution to<br />
inequality from each of five sources of income. In another method, the paper shows<br />
what proportion of inequality comes from income differences within villages, within<br />
counties but between villages, and between counties. The first method can identify<br />
income sources that are equalizing or, conversely, disproportionate contributors to<br />
overall inequality. The second method can help identify where geographical inequality<br />
merits further investigation.<br />
The second section of this paper briefly explains the relevant research on inequality<br />
indices <strong>and</strong> how it can be used to choose a few indices from a broad field of<br />
possibilities. The third <strong>and</strong> fourth sections explain two of these indices’ strengths <strong>and</strong><br />
weaknesses. The fifth gives a brief explanation of how these indices are applied to the<br />
data. The sixth section displays the results <strong>and</strong> the last section interprets issues raised<br />
<strong>by</strong> the data while suggesting further investigation.<br />
II. Narrowing the Field of <strong>Income</strong> Inequality Indices<br />
There are numerous mathematical measures to describe inequality and several<br />
methods with which to choose among these measures. Among the most successful attempts at<br />
the latter are those of François Bourguignon and Anthony Shorrocks. In<br />
specifying the conditions for selecting an inequality index for income distribution,<br />
Bourguignon and Shorrocks agree upon a set of simple characteristics that define a<br />
sound inequality index. Namely, they suggest that the index, a function of the set of<br />
incomes:<br />
a) Be continuous and differentiable over all individuals’ incomes<br />
b) Be symmetric (the identity of income earners should not affect the value of the<br />
index)<br />
c) Not vary when all incomes are multiplied by the same positive scalar<br />
d) Satisfy the symmetry axiom for population (the index for a given distribution must<br />
be the same as that for a distribution obtained by replicating any number of times<br />
each individual income in the initial distribution)<br />
e) Satisfy the Pigou-Dalton condition (the inequality measure must decrease with a<br />
transfer from richer to less rich people that does not reverse their relative position in<br />
the distribution)<br />
f) Be decomposable (the index must be able to be expressed as the sum of a within-group<br />
inequality term and a between-group inequality term) (Bourguignon 1979,<br />
and Shorrocks 1980, 1982)<br />
These conditions, agreed upon by Bourguignon and Shorrocks, limit the possible<br />
indices to the class of generalized entropy measures given by the following equation:<br />
\[ GE(\alpha) = \frac{1}{\alpha^{2} - \alpha} \left[ \frac{1}{n} \sum_{p=1}^{n} \left( \frac{y_p}{\bar{y}} \right)^{\alpha} - 1 \right] \tag{1} \]<br />
\(y_p\) represents the income of an individual p, \(\bar{y}\) is the group’s mean income and n is the<br />
population. The most widely used indices in the generalized entropy family are the<br />
two Theil indices and half the square of the coefficient of variation (CV), 2<br />
corresponding to \(\alpha\) equal to zero, one, and two, respectively, in the above class.<br />
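The family in equation (1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the routine used for the analysis: the function name `generalized_entropy` and the income values are hypothetical, incomes are assumed strictly positive, and because equation (1) is undefined at α = 0 and α = 1 the sketch uses the usual limiting forms for those two cases.

```python
import math


def generalized_entropy(incomes, alpha):
    """GE(alpha) from equation (1) for a list of strictly positive incomes."""
    n = len(incomes)
    mean = sum(incomes) / n
    ratios = [y / mean for y in incomes]
    if alpha == 0:
        # Limiting case: the second Theil index (mean log deviation)
        return -sum(math.log(r) for r in ratios) / n
    if alpha == 1:
        # Limiting case: Theil's T
        return sum(r * math.log(r) for r in ratios) / n
    # General case, equation (1)
    return (sum(r ** alpha for r in ratios) / n - 1) / (alpha ** 2 - alpha)


# Hypothetical household incomes, for illustration only
incomes = [500.0, 1200.0, 2500.0, 4000.0, 9000.0]
for a in (0, 1, 2):
    print(f"GE({a}) = {generalized_entropy(incomes, a):.4f}")
```

For α = 2 the result equals the variance divided by twice the squared mean, which is the quantity this paper calls the CV. Multiplying every income by the same positive constant, or replicating the whole population, leaves each value unchanged, in line with conditions c) and d).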
Shorrocks looked more closely into the nature of decomposing income inequality <strong>and</strong><br />
the traits of these indices. Going one step beyond those assumptions agreed upon <strong>by</strong><br />
Bourguignon, Shorrocks suggests the following two assumptions to further narrow the<br />
field of acceptable, decomposable inequality indices:<br />
• The contribution from a source to inequality must be zero if income from that<br />
source is equal for all individuals<br />
• Two income components that are identical and together comprise total income<br />
must make the same contribution to income inequality<br />
The importance of these extra constraints is that they ensure that the decomposition<br />
rule for any inequality index is unique <strong>and</strong> that the relative value of income<br />
components’ contribution to inequality is independent of the choice of index. This is a<br />
major asset to social scientists wishing to decompose inequality as it ensures that<br />
comparison can be made across inequality indices that satisfy these assumptions. Both<br />
Theil’s T index <strong>and</strong> the CV satisfy these constraints <strong>and</strong> are the two indices which will<br />
be used in the following analyses. (Shorrocks 1982)<br />
2 Half of the square of the coefficient of variation will be referred to as “the CV” in this paper.<br />
III. Theil’s T<br />
Theil’s T index, the member of the generalized entropy family of inequality indices<br />
corresponding to \(\alpha = 1\), is the sum of each individual’s contribution to total inequality.<br />
Theil’s T index weights a data point’s (individual’s) population share and distance<br />
from the mean through the following equation:<br />
\[ T = \sum_{p=1}^{n} \frac{1}{n} \cdot \frac{y_p}{\bar{y}} \cdot \ln\left( \frac{y_p}{\bar{y}} \right) \tag{2} \]<br />
In this index, a data point gives a contribution to the overall index based on a<br />
decreasing function of the probability of its occurrence. In other words, given a<br />
normal distribution, the further from the mean an individual’s income is the greater is<br />
that individual’s contribution to the inequality index. (Theil 1967)<br />
Another interesting characteristic is Theil’s T’s non-linearity. As the richest half of the<br />
population’s share of income increases linearly Theil’s T index increases at a more<br />
than linear rate. This phenomenon is due to the decreasing nature of the negative<br />
contribution to overall inequality. When there is total equality in the group, Theil’s T<br />
reaches its minimum, zero. (Conceição <strong>and</strong> Ferreira, 2000) 3<br />
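The non-linearity described above can be checked numerically. The following is a small sketch under illustrative assumptions: the population is split into two equal halves, each half shares its income evenly, and the helper names `theil_t` and `two_halves` are hypothetical.

```python
import math


def theil_t(incomes):
    """Theil's T, equation (2), for strictly positive incomes."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum((y / mean) * math.log(y / mean) for y in incomes) / n


def two_halves(rich_share, n=100, total=1000.0):
    """Hypothetical population: the richer half splits rich_share of total income evenly."""
    half = n // 2
    poor = [total * (1 - rich_share) / half] * half
    rich = [total * rich_share / half] * half
    return poor + rich


# Raise the richer half's income share in equal steps of ten percentage points
shares = [0.5, 0.6, 0.7, 0.8, 0.9]
values = [theil_t(two_halves(s)) for s in shares]
steps = [later - earlier for earlier, later in zip(values, values[1:])]

for s, t in zip(shares, values):
    print(f"richer-half share {s:.1f}: T = {t:.4f}")
print("increase per step:", [round(d, 4) for d in steps])
```

With equal shares the index is zero, and each further equal-sized step in the richer half's share adds more to T than the step before it.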
According to scholars at the University of Texas Inequality Project (UTIP), a think<br />
tank focusing primarily on the measurement and analysis of inequality, the main advantage<br />
of Theil’s T index is the ease with which it decomposes inequality into between- and<br />
within-group components. Another strength of Theil’s T is its capacity to analyze<br />
inequality from aggregated data in this manner. Several other indices, including the<br />
Gini coefficient and the CV, require comprehensive individual-level data, which are<br />
often unavailable to social scientists. (UTIP 2005) Morduch and Sicular (2002)<br />
show that Theil’s T index can also readily be decomposed among factor incomes. A<br />
prior complaint about Theil’s T was that, due to its logarithmic nature, it would be<br />
undefined under negative and zero incomes. Morduch and Sicular show that Theil’s T<br />
can be decomposed for income components and, furthermore, that this decomposition is in fact<br />
defined for zero and negative factor income values in the following equation:<br />
\[ s^{k}_{TT} = \frac{\frac{1}{n} \sum_{p=1}^{n} y^{k}_{p} \ln\left( \frac{y_p}{\bar{y}} \right)}{\frac{1}{n} \sum_{p=1}^{n} y_{p} \ln\left( \frac{y_p}{\bar{y}} \right)} \tag{3} \]<br />
Here \(y^{k}_{p}\) is the income individual p receives from component k.<br />
They go on to laud the benefits of Theil’s T as compared to other more frequently<br />
used inequality measures: “The Gini coefficient falls if an income source is increased<br />
by a constant amount for all members of a population”, a desirable characteristic, “but<br />
none of the components of the standard decomposition of the Gini are affected,”<br />
ignoring what we hope to measure as a decrease in income inequality for the given<br />
component. Decomposed, Theil’s T shows such a reduction in inequality. Morduch and<br />
Sicular add that “the Theil-T decomposition provides a better indicator of why the<br />
overall index takes its given value in the first place…the Theil-T index is thus<br />
potentially of greater use to researchers.” (Morduch and Sicular, 2002)<br />
3 For graphical representation of this phenomenon, please also refer to Conceição and Ferreira, 2000.<br />
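The source decomposition in equation (3) can be illustrated with a short Python sketch. The function name `theil_source_shares`, the source labels and the four households below are hypothetical and chosen only to show that zero and negative component values cause no difficulty, provided each household's total income is positive.

```python
import math


def theil_source_shares(sources):
    """Proportional contribution of each income source to Theil's T, equation (3).

    sources maps a source name to a list of per-household incomes from that
    source. Component values may be zero or negative, but every household's
    total income must be strictly positive for the logarithm to be defined.
    """
    names = list(sources)
    n = len(sources[names[0]])
    totals = [sum(sources[k][p] for k in names) for p in range(n)]
    mean = sum(totals) / n
    logs = [math.log(y / mean) for y in totals]
    denominator = sum(y * l for y, l in zip(totals, logs))
    return {
        k: sum(yk * l for yk, l in zip(sources[k], logs)) / denominator
        for k in names
    }


# Hypothetical data for four households, for illustration only
sources = {
    "agriculture": [800.0, 900.0, 1000.0, 700.0],
    "family_business": [0.0, -100.0, 500.0, 6000.0],
    "remittances": [300.0, 0.0, 1500.0, 0.0],
}
shares = theil_source_shares(sources)
for name, share in shares.items():
    print(f"{name}: {share:+.3f}")
```

The shares sum to one by construction. A source with a negative share, such as agriculture in this made-up example, is equalizing: it is received disproportionately by households whose total income lies below the mean.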
The downsides of Theil’s T, according to UTIP, are that it has no intuitive motivating<br />
picture, cannot directly compare populations with different sizes or group structures,<br />
<strong>and</strong> that it is comparatively mathematically complex. (University of Texas Inequality<br />
Project 2005) In the words of Amartya Sen, Theil’s T “is an arbitrary formula, <strong>and</strong><br />
the average of the logarithms of the reciprocals of income shares weighted <strong>by</strong> income<br />
is not a measure that is exactly overflowing with intuitive sense.” (Sen 1997)<br />
Theil’s T index also yields a simple hierarchical decomposition among nested regions.<br />
Decomposing Theil’s T index by regions renders the following equations. Total<br />
inequality as the sum of between- and within-group inequality is given by:<br />
\[ T = T_B + T_W \tag{4} \]<br />
with between-group inequality given by:<br />
\[ T_B = \sum_{g=1}^{k} \frac{Y_g}{Y} \ln\left( \frac{Y_g / Y}{n_g / N} \right) \tag{5} \]<br />
\(Y_g\) is income in group g, Y is total income, \(n_g\) is the population of group g and N is total<br />
population. The between-group component is obtained by replacing the income of an<br />
individual with the mean income of the individual’s respective subgroup. (Shorrocks<br />
and Wan 2004)<br />
Within-group inequality is given by:<br />
\[ T_W = \sum_{g=1}^{k} \frac{Y_g}{Y} \cdot T^{w}_{g} \tag{6} \]<br />
where<br />
\[ T^{w}_{g} = \sum_{p=1}^{n_g} \frac{y_{gp}}{Y_g} \ln\left( \frac{y_{gp} / Y_g}{1 / n_g} \right) \tag{7} \]<br />
and \(y_{gp}\) is the income of individual p in group g.<br />
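Equations (4) to (7) translate directly into code. The sketch below is illustrative only: the function names, the six incomes and the two group labels are hypothetical, and incomes are assumed strictly positive.

```python
import math
from collections import defaultdict


def theil_t(incomes):
    """Theil's T, equation (2)."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum((y / mean) * math.log(y / mean) for y in incomes) / n


def theil_between_within(incomes, groups):
    """Return (T_B, T_W) following equations (5) to (7)."""
    members_by_group = defaultdict(list)
    for y, g in zip(incomes, groups):
        members_by_group[g].append(y)

    total_income = sum(incomes)   # Y
    total_pop = len(incomes)      # N
    t_between = 0.0
    t_within = 0.0
    for members in members_by_group.values():
        group_income = sum(members)   # Y_g
        group_pop = len(members)      # n_g
        income_share = group_income / total_income
        # Equation (5): compare the group's income share with its population share
        t_between += income_share * math.log(income_share / (group_pop / total_pop))
        # Equation (7): Theil's T inside the group
        t_group = sum(
            (y / group_income) * math.log((y / group_income) / (1 / group_pop))
            for y in members
        )
        # Equation (6): weight each group's index by its income share
        t_within += income_share * t_group
    return t_between, t_within


# Hypothetical incomes for six households in two villages
incomes = [400.0, 600.0, 900.0, 1500.0, 2000.0, 5200.0]
groups = ["A", "A", "A", "B", "B", "B"]
t_b, t_w = theil_between_within(incomes, groups)
print(f"T_B = {t_b:.4f}, T_W = {t_w:.4f}, T = {theil_t(incomes):.4f}")
```

The two components add up to the overall index from equation (2), which is the identity stated in equation (4).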
Akita (2000), recognizing that even this decomposition method resulted in undesirable<br />
use of regional mean incomes as opposed to household level incomes, devised a<br />
method of hierarchical inequality decomposition <strong>by</strong> which a three-tiered country<br />
structure (region, province, <strong>and</strong> district) could be decomposed into within province,<br />
between province, <strong>and</strong> between region contributions to overall inequality. 4 The<br />
equations for further decomposition are then:<br />
\[ T = T_{WP} + T_{BP} + T_{BR} \tag{8} \]<br />
given<br />
\[ T_{WP} = \sum_{i} \sum_{j} \frac{Y_{ij}}{Y} \sum_{k} \frac{y_{ijk}}{Y_{ij}} \ln\left( \frac{y_{ijk} / Y_{ij}}{n_{ijk} / N_{ij}} \right) \tag{9} \]<br />
\[ T_{BP} = \sum_{i} \sum_{j} \frac{Y_{ij}}{Y} \ln\left( \frac{Y_{ij} / Y_{i}}{N_{ij} / N_{i}} \right) \tag{10} \]<br />
and<br />
\[ T_{BR} = \sum_{i} \frac{Y_{i}}{Y} \ln\left( \frac{Y_{i} / Y}{N_{i} / N} \right) \tag{11} \]<br />
where i, j, and k represent regions, provinces, and districts, respectively. \(y_{ijk}\) and \(n_{ijk}\) are the<br />
income and population of district k in province j of region i, \(Y_{ij}\) and \(N_{ij}\) are the province totals,<br />
\(Y_{i}\) and \(N_{i}\) are the region totals, and Y and N are the overall totals. (Akita 2000)<br />
4 Note that the sum of between-province and within-province contributions is identically equal to the sum of<br />
the within-region contributions.<br />
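The three-level decomposition in equations (8) to (11) can be sketched as follows. The record layout (one tuple per district holding its region, province, income and population) and all of the figures are assumptions made for illustration, not part of Akita's paper or of the survey data.

```python
import math
from collections import defaultdict


def hierarchical_theil(records):
    """Return (T_WP, T_BP, T_BR) following equations (9) to (11).

    records is a list of (region, province, income, population) tuples,
    one per district; this layout is a hypothetical choice.
    """
    total_income = sum(r[2] for r in records)   # Y
    total_pop = sum(r[3] for r in records)      # N
    region_income = defaultdict(float)          # Y_i
    region_pop = defaultdict(float)             # N_i
    prov_income = defaultdict(float)            # Y_ij
    prov_pop = defaultdict(float)               # N_ij
    for region, province, y, n in records:
        region_income[region] += y
        region_pop[region] += n
        prov_income[(region, province)] += y
        prov_pop[(region, province)] += n

    # Equation (9): (Y_ij / Y) * (y_ijk / Y_ij) simplifies to y_ijk / Y
    t_wp = sum(
        (y / total_income)
        * math.log((y / prov_income[(i, j)]) / (n / prov_pop[(i, j)]))
        for i, j, y, n in records
    )
    # Equation (10): provinces compared within their own region
    t_bp = sum(
        (prov_income[(i, j)] / total_income)
        * math.log(
            (prov_income[(i, j)] / region_income[i])
            / (prov_pop[(i, j)] / region_pop[i])
        )
        for (i, j) in prov_income
    )
    # Equation (11): regions compared with the country as a whole
    t_br = sum(
        (region_income[i] / total_income)
        * math.log((region_income[i] / total_income) / (region_pop[i] / total_pop))
        for i in region_income
    )
    return t_wp, t_bp, t_br


# Hypothetical districts: (region, province, income, population)
records = [
    ("east", "E1", 9000.0, 10), ("east", "E1", 4000.0, 8),
    ("east", "E2", 3000.0, 6), ("east", "E2", 2500.0, 7),
    ("west", "W1", 1200.0, 5), ("west", "W1", 2000.0, 9),
    ("west", "W2", 900.0, 4), ("west", "W2", 1500.0, 6),
]
t_wp, t_bp, t_br = hierarchical_theil(records)
print(f"T_WP = {t_wp:.4f}, T_BP = {t_bp:.4f}, T_BR = {t_br:.4f}")
```

Summing the three terms reproduces the overall district-level index, as equation (8) requires. The same structure applies to any three nested levels, for example households within villages within counties.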
IV. The CV - Half of the Square of the Coefficient of Variation<br />
Another widely used inequality index is built on the coefficient of variation, the<br />
standard deviation divided by the mean value of the set of responses. Its square is:<br />
\[ I_{CV}(y) = \frac{\mathrm{var}(y)}{\mu^{2}} \tag{12} \]<br />
where \(\mu\) is mean income. One half of this quantity is the member of the generalized<br />
entropy class of inequality indices for α = 2. This index, herein referred to as the CV<br />
as mentioned before, also satisfies the requirements set out by Bourguignon and<br />
Shorrocks.<br />
Initial arguments for use of the CV included the fact that it is much more intuitive<br />
than Theil’s T. Also, in using group data weighted <strong>by</strong> population size, the CV is not<br />
easily skewed <strong>by</strong> small outliers. Furthermore, it is defined under any form of zero <strong>and</strong><br />
negative incomes, a characteristic that many thought Theil’s T lacked. The main<br />
drawback of the CV is that it is particularly sensitive to income transfers in the upper<br />
tail of the income distribution, a less than ideal condition. (University of Texas<br />
Inequality Project 2005) Bourguignon explains that the CV “offers the inconvenience<br />
of referring implicitly to a utilitarian welfare function with convex individual<br />
utilities,” pointing to the upper tail sensitivity. (Bourguignon 1979)<br />
Decomposing the CV, proportional contributions of income factor components are<br />
given by:<br />
\[ S^{k}_{CV} = \frac{\mathrm{cov}(y^{k}, y)}{\mathrm{var}(y)} \tag{13} \]<br />
\(y^{k}\) is income received from income component k and y is total income.<br />
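Equation (13) amounts to a ratio of a covariance to a variance, which the following Python sketch computes with population moments. The function name `cv_source_shares`, the source labels and the household figures are hypothetical.

```python
def cv_source_shares(sources):
    """Proportional contribution of each source to the CV, equation (13).

    sources maps a source name to a list of per-household incomes from
    that source; total income is the sum across sources.
    """
    names = list(sources)
    n = len(sources[names[0]])
    totals = [sum(sources[k][p] for k in names) for p in range(n)]
    mean_total = sum(totals) / n
    var_total = sum((y - mean_total) ** 2 for y in totals) / n

    shares = {}
    for k in names:
        mean_k = sum(sources[k]) / n
        cov_k = sum(
            (yk - mean_k) * (y - mean_total) for yk, y in zip(sources[k], totals)
        ) / n
        shares[k] = cov_k / var_total
    return shares


# Hypothetical data for four households, for illustration only
sources = {
    "agriculture": [800.0, 900.0, 1000.0, 700.0],
    "family_business": [0.0, -100.0, 500.0, 6000.0],
    "remittances": [300.0, 0.0, 1500.0, 0.0],
}
cv_shares = cv_source_shares(sources)
for name, share in cv_shares.items():
    print(f"{name}: {share:+.3f}")
```

Because the covariances of the components with total income add up to the variance of total income, the shares sum to one.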
Morduch and Sicular (2002) suggest that satisfying the property of uniform additions<br />
is another desirable characteristic for an inequality decomposition. The property of uniform<br />
additions “holds that measured inequality should fall if everyone in the population<br />
receives a positive transfer of equal size (or, conversely, that inequality should<br />
increase if everyone receives an equal, negative transfer).” They go on to show that<br />
the decompositions of the coefficient of variation and of the CV do not satisfy the property of uniform<br />
additions: a source that pays everyone the same amount has zero covariance with total<br />
income and so registers no contribution in equation (13). The decomposition of Theil’s T index,<br />
on the other hand, does satisfy this property. (Morduch and<br />
Sicular, 2002)<br />
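A short worked derivation makes the contrast concrete. It assumes a hypothetical source k that pays every individual the same amount c > 0, with \(y_p\) denoting total income including that payment, and applies equations (13) and (3) in turn.

```latex
\begin{align*}
% Hypothetical uniform source: the same payment c for everyone
y^{k}_{p} &= c > 0 \quad \text{for all } p \\
% Equation (13): a constant has zero covariance with total income
S^{k}_{CV} &= \frac{\mathrm{cov}(c, y)}{\mathrm{var}(y)} = 0 \\
% Numerator of equation (3): the mean of logs is at most the log of the mean (Jensen)
\frac{1}{n} \sum_{p=1}^{n} c \, \ln\left( \frac{y_p}{\bar{y}} \right)
  &= c \left[ \frac{1}{n} \sum_{p=1}^{n} \ln y_p - \ln \bar{y} \right] \le 0 \\
% Denominator of equation (3) equals \bar{y} T, positive whenever incomes are unequal
s^{k}_{TT} &= \frac{c \left[ \frac{1}{n} \sum_{p=1}^{n} \ln y_p - \ln \bar{y} \right]}{\bar{y} \, T} < 0
  \quad \text{when } T > 0
\end{align*}
```

The CV decomposition therefore records no effect from the uniform payment, while the Theil's T decomposition assigns it a negative, that is equalizing, contribution.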
V. Application to the Data<br />
Data from the 2002 Rural Household Survey are used for the following analysis. This<br />
survey was conducted in five provinces – Shaanxi, Jiangsu, Henan, Hunan, and<br />
Shandong – and in rural areas within the jurisdiction of the municipality of<br />
Chongqing. These locations were chosen so as to maximize regional and economic<br />
variation within the sample <strong>and</strong> thus are not necessarily representative of greater rural<br />
China. Differences within the sample, both geographic <strong>and</strong> economic, are great.<br />
The respondents comprise almost 3,000 households clustered in 37 villages.<br />
Incidentally, these responses are surprisingly representative of rural China as a whole<br />
when compared to those official estimates for Chinese income listed in the China<br />
County (City) Social <strong>and</strong> Economic Statistical Yearbook 2002. Per-capita household<br />
income within the survey responses, for example, differs from official values <strong>by</strong> a<br />
narrow margin of one to seven percent depending on income measure. The survey<br />
respondents themselves are not representative of the actual distribution in the counties surveyed<br />
(the mean age was 64 and 55% of respondents were male); however, the household<br />
information seems to resemble the general regional distribution much more closely.<br />
(Michelson 2005)<br />
The main data used in this analysis are household income data. The data are responses<br />
to six questions that inquire about income values. Overall income was asked for, as well<br />
as revenues from five possible component parts of income. These parts are agriculture,<br />
sidelines 5 , family business, remittances, and craftsmanship. The overall income will<br />
be called “Single Income Response” for the purposes of this paper, and the sum of the<br />
five component parts will be called “Income Composite.”<br />
5 The sidelines entry includes income from animal husbandry, fish farming and other sources.<br />
In this exercise I calculated Theil’s T and the CV for the data set. 6 In terms of<br />
regional contributions to inequality, I calculated “within group” and “between group”<br />
values for decomposing inequality at both the village and county level. The next<br />
calculation is the “within-county, between-village” component of<br />
income inequality. This is the difference between between-village inequality and<br />
between-county inequality. The sum of within-village inequality, within-county,<br />
between-village inequality, and between-county inequality is overall inequality. 7<br />
6 Within the Stata® statistical software package, I used Stephen Jenkins’s “ineqdeco” program to calculate income<br />
inequality and its various decompositions. This program calculates a variety of inequality indices, including<br />
three of the family of Atkinson’s indices and four forms of the generalized entropy family. In light of the previous<br />
discussion, Theil’s T index and the CV are the inequality indices used in this analysis. Due to the logarithm in<br />
Theil’s T and the nature of Jenkins’s algorithm, “ineqdeco” cannot include negative and zero values for income or<br />
any of its components. To adjust, zero and negative values were assigned an arbitrarily low positive value, 1% of<br />
the mean. This method was suggested by Glenn Firebaugh of Pennsylvania State University in personal<br />
correspondence with Ethan Michelson on Thursday, April 28th.<br />
7 This method was verified by calculation and initially suggested by Ethan Michelson.<br />
A discussion of sample size<br />
Decomposing inequality across income components raised concern about issues of<br />
sample size. Very few survey respondents recorded income from all<br />
five sources. It seemed that decomposing income across the sources meant looking at<br />
income inequality within groups of only those who earned income from a given<br />
source. This would lead to differences in sample size, as the number of farmers was<br />
not equal to that of entrepreneurs, for example.<br />
After discussion and consultation, I decided to count those not reporting income from<br />
a given source as having zero income from that source. With time-series data, it is<br />
reasonable to assume that most farmers would have a positive mean income from<br />
farming whereas non-farmers would still have no farming income. The fact that our<br />
data are cross-sectional leaves open the question of how to differentiate between those<br />
farmers who earned no income from farming that year and non-farmers. This is a<br />
limitation imposed by the nature of the data. To determine an income source’s<br />
contribution to total income inequality, the index by definition must include all<br />
individuals’ income from the given source. Inherently, then, the sample size for all<br />
income components will be identical. 8<br />
8 This method was suggested by several social scientists in personal correspondence with Alex Eble and Ethan<br />
Michelson. These individuals include Jonathan Morduch (June 19th, 2005), Stephen Jenkins (June 20th, 2005),<br />
Dwayne Benjamin (June 21st, 2005), and Terry Sicular (June 24th, 2005).<br />
VI. Results<br />
Using the “ineqdeco” program in Stata, I investigated the regional<br />
decomposition and income component decomposition of two inequality indices,<br />
Theil’s T index and the CV. The results are shown numerically (proportionally) in<br />
table one:<br />
Table 1 – Income Components Decomposed Across Regions<br />
Table 2 – Income Component Contributions to Total Inequality<br />
Table 3 – Regional Decomposition of Inequality<br />
The analysis begins with regional contributions to inequality, the results of which are<br />
graphically displayed in figures three and four. For both Theil’s T index and the CV,<br />
inequality within villages contributed the most to total inequality. This finding is in<br />
harmony (indeed, almost perfect correspondence) with the large-scale income<br />
inequality investigation recently performed by Benjamin, Brandt and Giles (2005).<br />
Inequality between counties also contributed significantly to overall inequality.<br />
Contribution to inequality from the differences within a county between villages,<br />
however, was low. The high within-village contribution shows large inequality<br />
between individuals within a village, perhaps the difference between government<br />
officials/businessmen and farmers. The contribution from inter-county inequality<br />
could also suggest preferential political treatment for certain counties, although this is<br />
much more likely the product of differences in physical and human capital<br />
endowments and the physical characteristics of the area that condition the amount of<br />
government investment received. (Jalan and Ravallion 2002: 343) Inequality between<br />
villages within a county does not exceed ten percent of total inequality for either<br />
index, a marginal contribution when compared to the other two regional contributors.<br />
This suggests that the average incomes of villages are fairly evenly distributed within<br />
each county.<br />
[Figure: stacked bar chart titled “Contribution to Income Inequality (Theil’s T)”. Vertical axis: 0% to 100%. Bars: Single Income Response, Income Composite. Series: inter-county inequality; within-county, inter-village inequality; intra-village inequality.]<br />
Figure 1 – Regional Contributions to Overall Income for the CV<br />
[Figure: stacked bar chart titled “Contribution to Inequality (CV)”. Vertical axis: 0% to 100%. Bars: Single Income Response, Income Composite. Series: inter-county inequality; within-county, inter-village inequality; intra-village inequality.]<br />
Figure 2 – Regional Contributions to Overall Income for Theil’s T<br />
To further test theories about natural resources I examine the regional contributions to<br />
inequality for different income factors. As shown in figures five and six, inequality<br />
stemming from within-village income differences still dominates inequality<br />
contributions for each income factor. The character of its dominance, however, varies<br />
among factors. In agriculture, differences between villages within a county and<br />
between counties account for 30 percent of overall inequality according to Theil’s T<br />
index. 9 Again, this seems likely to be the result of geographical and political factors<br />
as in Jalan and Ravallion, i.e. geographically-conditioned access to public resources.<br />
Inequality in earnings from family business, on the other hand, stems almost entirely<br />
from differences within a village. This is also to say that revenues from family<br />
business are fairly equally distributed between villages and counties.<br />
Income from remittances, however, has a different structure of contributions to<br />
inequality. Over 20 percent of inequality in income from remittances is accounted for<br />
by differences across counties. To understand this, one has to better understand the<br />
nature of migration in China. Regional economic disparity, a local population’s level<br />
of human capital, a location’s proximity to urban centers, and a given location’s<br />
population density are all positive correlates of the likelihood to migrate and thus of<br />
the amount of remittances as they are calculated in this analysis. 10 (Fan 2005) Over<br />
10 percent of inequality springing from income from handicrafts can be accounted for<br />
by revenue differences in handicrafts between villages. This could be indicative of<br />
cultural differences between villages as well as the presence of village-level<br />
cottage industries. According to field studies of Thai ethnic craft making, the success<br />
of these crafts depends on commercial ties outside the village, which in turn<br />
depend upon infrastructure and, to an extent, proximity to population centers.<br />
(Cohen 2000)<br />
Of income inequality in revenue from sidelines, a mere 5 percent can be explained by<br />
differences between villages within counties and between counties. This is also to say<br />
that almost all of the inequality in revenues from sidelines, as in family business, arises<br />
from differences between individuals within a village. Said differently, revenue from<br />
sidelines is relatively equally distributed across villages and counties.<br />
9 Notice that this is only 10 percent according to the CV. The significance of this difference will be discussed<br />
shortly.<br />
10 Remittances are the response to question A6 of the survey: “In Question A1, you mention that some household<br />
members are leaving the village to work. What’s their total income this year?”<br />
[Figure: stacked bar chart titled “Contribution to Inequality (CV)”. Vertical axis: 0% to 100%. Horizontal axis: Source of Income (Agriculture, Family Business, Remittances, Handicrafts, Sidelines). Series: inter-county inequality; within-county, inter-village inequality; intra-village inequality.]<br />
Figure 5 – Income Factor Inequality Decomposed by Regional Contribution Given by<br />
the CV<br />
[Figure 6. Vertical axis: Contribution to Inequality (Theil’s T), 0% to 100%. Horizontal axis: Source of Income (Agriculture, Family Business, Remittances, Handicrafts, Sidelines). Legend: inter-county inequality; within-county, inter-village inequality; intra-village inequality.]<br />
Figure 6 – Income Factor Inequality Decomposed by Regional Contribution Given by Theil’s T<br />
The next task is to analyze how each income source contributes to overall inequality. 11<br />
Figures seven through twelve illustrate these contributions graphically. Several<br />
conclusions stand out from these six graphs:<br />
• Family business contributes the most to inequality within villages, by a<br />
very large margin.<br />
• Agriculture contributes very little to inequality at any level.<br />
• Within counties but between villages, sidelines, handicrafts and family<br />
business all contribute substantially to inequality, although family<br />
business still contributes the most.<br />
• Between counties, remittances are by far the largest contributor to inequality.<br />
This also shows that, between counties, the incidence of and/or revenue from<br />
family business and sidelines is more equal than income from individuals<br />
leaving home to work.<br />
• In the overall decomposition, under Theil’s T as under the<br />
CV, family business is by far the main contributor to inequality. This is an<br />
unsurprising finding: Jenkins (1995) and Papatheodorou (1998) find<br />
similar contributions in their analyses of income factors’<br />
contributions to inequality in the United Kingdom and Greece, respectively.<br />
One of the more interesting results of this investigation is that the two indices<br />
yield different proportional contributions to<br />
inequality. As stated before, Shorrocks (1982) asserts that “the relative [sic]<br />
importance of different income components is independent of the choice of inequality<br />
measure.” If “relative importance” is taken to mean proportional contribution, the<br />
values that ineqdeco yields for the contributions of income components under Theil’s T<br />
and the CV violate Shorrocks’ assertion. Numerous tests of Jenkins’ ineqdeco<br />
algorithm all yielded the same result: the proportional contributions of income<br />
components are not independent of the choice of inequality measure. This warrants<br />
further investigation into the nature of Jenkins’ algorithm and its relation to<br />
Shorrocks’ assertions, both of which lie beyond the scope of this paper.<br />
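One possible reading of this discrepancy can be sketched numerically. Shorrocks (1982) derives a<br />
decomposition rule in which the share of component k is cov(Y_k, Y) / var(Y); this share does not depend<br />
on the index, and it coincides with the natural decomposition of the squared coefficient of variation.<br />
The natural decomposition of Theil’s T, which weights each component by ln(y_i / μ), generally gives<br />
different shares. The Python sketch below compares the two on made-up data. The function names and the<br />
data are hypothetical, and nothing here is taken from, or claims to reproduce, Jenkins’ ineqdeco code.<br />

```python
import math


def _totals(sources):
    """Household total income: the sum of all components per household."""
    return [sum(values) for values in zip(*sources.values())]


def covariance_shares(sources):
    """Shorrocks (1982) rule: s_k = cov(Y_k, Y) / var(Y)."""
    totals = _totals(sources)
    n = len(totals)
    mu = sum(totals) / n
    variance = sum((y - mu) ** 2 for y in totals) / n
    shares = {}
    for name, values in sources.items():
        mu_k = sum(values) / n
        cov = sum((v - mu_k) * (y - mu) for v, y in zip(values, totals)) / n
        shares[name] = cov / variance
    return shares


def natural_theil_shares(sources):
    """Natural Theil shares: (1/n) * sum_i (y_ik / mu) * ln(y_i / mu), over T.

    Requires strictly positive total incomes; components may be zero.
    """
    totals = _totals(sources)
    n = len(totals)
    mu = sum(totals) / n
    theil = sum((y / mu) * math.log(y / mu) for y in totals) / n
    return {
        name: sum((v / mu) * math.log(y / mu)
                  for v, y in zip(values, totals)) / n / theil
        for name, values in sources.items()
    }


# Hypothetical data: four households, two income components.
incomes = {
    "agriculture": [5.0, 6.0, 5.0, 6.0],
    "family_business": [0.0, 1.0, 2.0, 20.0],
}
cov_shares = covariance_shares(incomes)
theil_shares = natural_theil_shares(incomes)
```

Both sets of shares sum to one, yet they differ component by component. If the figures reported for<br />
Theil’s T and the CV come from index-specific decompositions of this kind, differing proportions would<br />
not by themselves contradict Shorrocks’ rule; whether that is what happens here remains an open question.<br />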
11 Shorrocks cautions against two ways of interpreting these data. In the past, this contribution has been interpreted<br />
as either a) the inequality that would be observed if income component k were the only source of income<br />
differences; or b) the amount by which inequality would fall if differences in a factor’s income receipts were<br />
eliminated. Unfortunately, neither provides a consistent decomposition rule. This inconsistency leads to the<br />
conclusion that both are to be avoided when making policy recommendations.<br />
[Figure 7. Vertical axis: Contribution to Total Inequality (CV), 0 to 40. Horizontal axis: Source of Income (Agriculture, Family Business, Remittances, Handicrafts, Sidelines). Legend: inter-county inequality; within-county, inter-village inequality; intra-village inequality.]<br />
Figure 7 – Absolute Income Factor Inequality Decomposed by Regional Contribution Given by the CV<br />
[Figure 8. Vertical axis: Contribution to Inequality (Theil’s T), 0 to 3.5. Horizontal axis: Source of Income (Agriculture, Family Business, Remittances, Handicrafts, Sidelines). Legend: inter-county inequality; within-county, inter-village inequality; intra-village inequality.]<br />
Figure 8 – Absolute Income Factor Inequality Decomposed by Regional Contribution Given by Theil’s T<br />
[Figure 9. Vertical axis: Income Source Contribution to Inequality (CV), 0% to 100%. Horizontal axis: intra-village inequality; within-county, inter-village inequality; inter-county inequality; total inequality. Legend: Sidelines, Handicrafts, Remittances, Family Business, Agriculture.]<br />
Figure 9 – Regional Contributions to Income Inequality Decomposed Proportionally by Income Component Given by the CV<br />
[Figure 10. Vertical axis: Income Source Contribution to Inequality (Theil’s T), 0% to 100%. Horizontal axis: intra-village inequality; within-county, inter-village inequality; inter-county inequality; total inequality. Legend: Sidelines, Handicrafts, Remittances, Family Business, Agriculture.]<br />
Figure 10 – Regional Contributions to Income Inequality Decomposed Proportionally by Income Component Given by Theil’s T<br />
[Figure 11. Vertical axis: Income Source Contribution to Inequality (CV), 0 to 50. Horizontal axis: Regional Contribution (intra-village inequality; within-county, inter-village inequality; inter-county inequality). Legend: Sidelines, Handicrafts, Remittances, Family Business, Agriculture.]<br />
Figure 11 – Absolute Regional Contributions to Income Inequality Decomposed by Income Component Given by the CV<br />
[Figure 12. Vertical axis: Income Source Contribution to Inequality (Theil’s T), 0 to 6. Horizontal axis: Regional Contribution (intra-village inequality; within-county, inter-village inequality; inter-county inequality). Legend: Sidelines, Handicrafts, Remittances, Family Business, Agriculture.]<br />
Figure 12 – Absolute Regional Contributions to Income Inequality Decomposed by Income Component Given by Theil’s T<br />
VII. Conclusions<br />
This paper offers a foundation from which to choose possible measures of income<br />
inequality. From this basis, it goes on to explain the capabilities of those measures<br />
that appear most attractive under Shorrocks’ and Bourguignon’s rules<br />
for a measure of inequality. Two of these measures, Theil’s T index and half the<br />
square of the coefficient of variation, herein referred to as the CV, are then applied to<br />
the responses to the 2002 Rural Household Survey. Inequality across geographical<br />
regions is measured through decomposition of the two indices. The same process is<br />
applied to identify and analyze contributions to inequality from five mutually<br />
exclusive income sources.<br />
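For reference, the two indices named here are usually written as follows. These are the standard<br />
textbook forms, with n households, household income y_i, mean income μ and variance σ²; the notation is<br />
assumed for this illustration and may differ from that used earlier in the paper.<br />

```latex
T = \frac{1}{n}\sum_{i=1}^{n} \frac{y_i}{\mu}\,\ln\frac{y_i}{\mu},
\qquad
\frac{CV^2}{2}
  = \frac{1}{2}\left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{y_i}{\mu}\right)^{2} - 1\right]
  = \frac{\sigma^2}{2\mu^2}
```

Both belong to the generalized entropy family (parameters 1 and 2, respectively), which is why each can<br />
be split additively into between-group and within-group terms.<br />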
Overall, inequality stemming from income differences within villages contributes<br />
far more to overall inequality than differences in income<br />
between villages within a county or between counties. Much of this within-village<br />
inequality is accounted for by inequality in earnings from family business. Differences<br />
between counties, however, still contribute almost 35 percent of overall inequality<br />
according to Theil’s T index. This may reflect inequality in resource allocation,<br />
likely both political and natural.<br />
According to Theil’s T, income inequality from<br />
remittances and handicrafts accounts for 70 percent of inequality between counties. The prominence<br />
of these two factors suggests differences in natural resource distribution and proximity<br />
to infrastructure as likely causes of between-county income inequality. Pro-poor<br />
migration policies, targeted investment programs, and expansion of infrastructure are<br />
all possible means to alleviate this inequality. Income inequality arising from family<br />
business, sidelines and handicrafts was responsible for all but 15 percent of inequality<br />
between villages within counties. This again points to differences in natural resources<br />
as a potential source of income inequality (some areas lend themselves<br />
more easily to fish farming or souvenir sales than others, for example) and also points<br />
to the aforementioned means of alleviating it.<br />
Income inequality within counties but between villages, however, contributed only a<br />
very small portion of overall income inequality and so should not form a policy focal point.<br />
The main income-factor contributor to overall inequality was clearly<br />
family business. Income inequality stemming from family business revenues accounts<br />
for nearly 80 percent of overall inequality according to the CV and nearly 40 percent<br />
according to Theil’s T index. The policy conclusion to be drawn from this mirrors that<br />
of Papatheodorou (1998):<br />
“The reduction of the inequality of entrepreneurial income appears to be<br />
the most effective way to reduce total inequality in Greece. It is, therefore,<br />
of great importance to redesign the current tax system in Greece in order<br />
to eliminate the tax evasion among the recipients of entrepreneurial<br />
income. This policy could prove the most efficient, if not only way, to<br />
significantly reduce income inequality.”<br />
Similarly, in the areas investigated in this paper, family business is a significant<br />
contributor to every regional slice of inequality and also contributes the lion’s share of<br />
overall inequality. By instituting new tax policy and/or revitalizing existing<br />
policy to curb the gross inequality stemming from this source of income, China could<br />
make enormous strides in reducing income inequality. Conversely, without attention<br />
to income inequality stemming from family business, government<br />
policy aimed at reducing income inequality could at best finish only a little more than<br />
half of the job.<br />
Looking at the other end of the spectrum, agriculture contributes almost nothing to<br />
overall inequality (never more than five percent in any index). This finding has<br />
equally strong policy implications. Policies geared at increasing income have<br />
potentially inequality-inducing effects. Whereas a government-sponsored business<br />
incubator would likely increase inequality in light of the results of this paper,<br />
government investment in raising income from agriculture would almost certainly<br />
reduce inequality.<br />
Also of interest, remittances are the main contributor to inequality between counties.<br />
Further investigation into the literature on migration could reveal what factors contribute<br />
to an individual’s decision to leave a village and remit money. Existing networks of<br />
villagers, limited information, prosperity, and proximity to travel infrastructure are all<br />
possible inputs. Policy aimed at alleviating inequality could improve the poor’s access<br />
to these and thus affect the amount of resources flowing into a region from<br />
remittances, evening out their distribution across counties.<br />
The main surprise that came from this investigation is a methodological one. The<br />
relative contributions of income components and of regions to income inequality, as<br />
measured by the ineqdeco function in Stata®, may violate Shorrocks’ assertion that the<br />
relative contribution to income inequality of various components is independent of the<br />
choice of inequality index. The question remains: is family business<br />
responsible for 80 percent of all income inequality or 40 percent? As can be seen<br />
numerically in figure one and graphically in several of the preceding graphs, there are<br />
numerous other instances in which the relative contribution of an income component<br />
or region according to the CV differs from that given by Theil’s T.<br />
In most cases the general structure of inequality contributions remains the same.<br />
Family business and intra-village inequality, for example, contribute the lion’s share<br />
of overall inequality under both indices. The proportional contributions of the several<br />
income components, however, differ frequently and noticeably between the two<br />
indices, as does the regional decomposition. This result suggests one of four possibilities.<br />
The most obvious possibility is that I committed a mathematical error in my calculations.<br />
It is also quite possible that I am misinterpreting Shorrocks’ use of the phrase “relative<br />
contribution.” The other two possibilities, less likely though more interesting, are that<br />
Shorrocks made an error in his assertion or that Jenkins made one in writing his<br />
algorithm.<br />
For further investigation, time series data would help distinguish non-farmers from<br />
farmers and yield a more reliable data set from which to draw analyses. The analyses<br />
themselves could be improved greatly by resolving the apparent contradiction<br />
between Shorrocks’ claim and Jenkins’ algorithm. As it stands, the differences in the<br />
results between the two indices are large enough to weaken any policy advice derived<br />
from this analysis. Also, adapting Morduch and Sicular’s decomposition of Theil’s T<br />
for the ineqdeco or ineqdec0 algorithm would allow for greater precision by<br />
removing the need to substitute one percent of the mean for negative values and<br />
zeroes. Akita’s nested decomposition equation would also add to the precision and<br />
scope of the decomposition algorithm.<br />
References:<br />
1. Akita, Takahiro. 2000. “Decomposing Regional Income Inequality using a<br />
Two-Stage Nested Theil Decomposition Method.” International Development<br />
Working Paper Series 2, IUJ Research Institute, International University of<br />
Japan.<br />
2. Benjamin, Dwayne, Loren Brandt and John Giles. 2005. “The Evolution of<br />
Income Inequality in Rural China.” Economic Development and Cultural<br />
Change 53(4).<br />
3. Bourguignon, Francois. 1979. “Decomposable Income Distribution<br />
Measures.” Econometrica 47.<br />
4. Conceição, Pedro and Pedro Ferreira. 2000. “The Young Person’s Guide to the Theil<br />
Index: Suggesting Intuitive Interpretations and Exploring Analytical<br />
Applications.” UTIP Working Paper Number 14, February 29, 2000.<br />
Available: http://utip.gov.utexas.edu/abstract.htm#UTIP14<br />
5. Cohen, Erik. 2000. The Commercialized Crafts of Thailand: Hill Tribes and<br />
Lowland Villages; Collected Articles. Honolulu: University of Hawaii Press.<br />
6. Fan, C. Cindy. 2005. “Modeling Interprovincial Migration in China,<br />
1985-2000.” Eurasian Geography and Economics 46(3). Palm Beach: Apr/May<br />
2005.<br />
7. Jenkins, Stephen. 1995. “Accounting for inequality trends: decomposition<br />
analyses for the UK 1971-86.” Economica 62.<br />
8. Michelson, Ethan. 2005. “‘Peasants’ Burdens’ and State Response: A<br />
Preliminary Explanation of State Concession to Popular Tax Resistance in<br />
Rural China.” Unpublished manuscript.<br />
9. Morduch, Jonathan and Terry Sicular. 2002. “Rethinking Inequality<br />
Decomposition, with Evidence from Rural China.” The Economic Journal<br />
112(93).<br />
10. Papatheodorou, Christos. 1998. “Inequality in Greece: An Analysis by Income<br />
Source.” Discussion Paper No. DARP 39.<br />
11. Sen, Amartya. 1997. On Economic Inequality. Oxford: Clarendon Press.<br />
12. Shorrocks, Anthony. 1982. “Inequality Decomposition by Factor<br />
Components.” Econometrica 50(1).<br />
13. Shorrocks, Anthony. 1980. “The Class of Additively Decomposable Inequality<br />
Measures.” Econometrica 48.<br />
14. Shorrocks, Anthony and Guanghua Wan. 2004. “Spatial Decomposition of<br />
Inequality.” WIDER Discussion Paper No. 2004/01. Available at<br />
http://www.wider.unu.edu/publications/publications.htm<br />
15. Theil, Henri. 1967. Economics and Information Theory. Amsterdam:<br />
North-Holland.<br />
16. University of Texas Inequality Project. 2005. “Measuring Inequality.” Austin,<br />
Texas: University of Texas Inequality Project. Retrieved June 12th, 2005.<br />
Available:<br />
http://utip.gov.utexas.edu/web/Tutorials_Techniques/Introduction%20to%20Inequality%20Studies.ppt<br />