Labour Market Bulletin 2000–02 Special Issue, Pages 26–30<br />
Using administrative data sources in labour market<br />
research: an introduction<br />
SYLVIA DIXON 1<br />
INTERNATIONALLY, SOCIAL AND ECONOMIC RESEARCHERS are making<br />
increasing use of administrative data sources to investigate a wide range of<br />
issues. A large amount of administrative data is routinely collected by<br />
government agencies as the by-product of their administrative functions, and the<br />
costs of accessing and utilising these data for research are usually quite low<br />
compared with the costs of collecting new data. As statistical and computing tools<br />
have improved, the number of empirical studies that analyse administrative data<br />
has grown. Examples of the types of administrative data that have been used in<br />
labour market research include social security payment or income support<br />
records, unemployment registration records, income tax data, workers’ accident<br />
insurance and compensation records, the enrolment records of educational<br />
institutions and company personnel records.<br />
This special issue of the Labour Market Bulletin features three articles that<br />
illustrate the potential of administrative data sources for empirical research. Maria<br />
Gobbi and David Rea analyse unemployment register data collected by the New<br />
Zealand Employment Service (NZES) to investigate the duration of unemployment<br />
spells and the distribution of unemployment across jobseekers. Dave Maré uses a<br />
different selection of NZES data to examine the effects of active labour market<br />
assistance programmes on the subsequent unemployment experiences<br />
of jobseekers. Dave Maré and Kerry Papps analyse data collected by the Occupational<br />
Safety and Health Service (OSH) in order to evaluate the effectiveness of OSH<br />
information and enforcement activities in reducing the number of workplace<br />
accidents.<br />
These studies illustrate the value of administrative data in policy or<br />
programme evaluation research. Frequently, the administrative data that are<br />
generated during the delivery of a social programme provide a rich source of<br />
information on the participants and the services they receive, and are a key source<br />
of information for the evaluation of the programme’s operation and effectiveness.<br />
Typically, records are available for all participants in the programme,<br />
and the information that is recorded on key aspects of programme participation,<br />
such as the financial support or other assistance received, is fairly detailed and<br />
accurate. The registered unemployment data analysed by Gobbi and Rea<br />
contained information on everyone who registered as unemployed in the period<br />
1993 to 1997, representing around 1.2 million individuals and 2.8 million distinct<br />
spells of registration. The NZES data contain information on demographic<br />
characteristics such as age, gender, ethnicity, educational attainment and place of<br />
residence, as well as detailed information on registration history and employment<br />
assistance received. The data analysed by Maré provide complete information on<br />
all assistance that was provided to registered jobseekers over a nine-year period<br />
(1989–97).<br />
1 Sylvia Dixon is a Research Advisor in the Labour Market Policy Group of the New<br />
Zealand Department of Labour.<br />
© 2002, Department of Labour, New Zealand, unless specified otherwise<br />
One of the most important properties of administrative datasets is that the<br />
individuals or firms that interact with the administrative agency are usually<br />
assigned unique identifiers. Consequently, the records relating to each unique<br />
respondent can be accurately linked across time. This enables the creation of<br />
datasets that include time-related and spell variables, such as the duration of time<br />
that jobseekers spend unemployed, or the number of times a firm has been<br />
investigated by OSH. The longitudinal nature of many administrative datasets also<br />
enables researchers to apply panel data techniques in their modelling of outcomes<br />
or policy effects, which can be useful in dealing with classic estimation problems<br />
such as unmeasured heterogeneity among programme participants. Maré and<br />
Papps use a range of different fixed effect specifications of their basic statistical<br />
model in an attempt to estimate the effects of OSH compliance assessments,<br />
education, information and advice on the incidence of accidents, while controlling<br />
for firm and industry heterogeneity.<br />
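As a rough illustration of how unique identifiers allow spell variables to be derived, the sketch below groups registration records by identifier and computes the number of spells and the days registered for each person. The identifiers, dates, field layout and the function name `summarise_spells` are all invented for illustration; they are not drawn from the NZES data or from the studies in this issue.

```python
from collections import defaultdict
from datetime import date

# Hypothetical register extract: (jobseeker_id, spell_start, spell_end).
# All identifiers and dates are invented for illustration.
records = [
    ("A001", date(1993, 2, 1), date(1993, 5, 1)),
    ("A001", date(1994, 8, 15), date(1995, 1, 15)),
    ("B002", date(1996, 3, 10), date(1996, 4, 9)),
]


def summarise_spells(rows):
    """Link records by identifier and derive simple spell variables."""
    by_person = defaultdict(list)
    for person_id, start, end in rows:
        by_person[person_id].append((start, end))

    summary = {}
    for person_id, spells in by_person.items():
        spells.sort()  # chronological order within each person
        durations = [(end - start).days for start, end in spells]
        summary[person_id] = {
            "n_spells": len(spells),
            "total_days": sum(durations),
            "longest_spell_days": max(durations),
        }
    return summary


spell_summary = summarise_spells(records)
```

Real register data would also need rules for overlapping or open-ended spells, which this sketch does not attempt to handle.<br />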
An important disadvantage of administrative data sources is that the range of<br />
variables collected on each individual or firm is often quite limited, hindering the<br />
capacity of the researcher to explore all relevant hypotheses about the causes of<br />
the relationship or outcome of interest, or control for potentially confounding<br />
factors. For example, the OSH data analysed by Maré and Papps contained very<br />
little information on the range of firm characteristics that may be associated with<br />
differences in the level of hazards faced by firms, or differences in the costs of<br />
preventing accidents. This central weakness in the data source limits the weight<br />
that can be attached to the conclusions of their study. A key limitation of the<br />
unemployment register data analysed in the other two studies is that very little<br />
information is recorded on the activities and circumstances of those who leave the<br />
register. This means that transitions from unemployment into employment cannot<br />
be properly distinguished from less favourable outcomes, such as exit from the<br />
labour market.<br />
One response to the problem of omitted variables in administrative datasets is<br />
to link them to other data sources. While the studies in this issue of the Labour<br />
Market Bulletin do not involve any matching of data from different sources, other<br />
New Zealand researchers are currently engaged in innovative projects that are<br />
linking administrative with survey data to investigate problems that could not<br />
otherwise be adequately researched. For example, a team of researchers at the<br />
Wellington School of Medicine, led by Tony Blakely, is studying socio-economic<br />
variations in mortality patterns by linking census data on socio-economic factors<br />
to individual mortality records (Blakely et al, 1999). Overseas experience indicates<br />
that linking administrative data on key labour market outcomes such as<br />
employment, annual taxable incomes or benefit payments to survey data on<br />
individual or household characteristics can greatly extend the research potential<br />
of the administrative data. Statistics New Zealand (1998) is a useful reference<br />
source on the techniques of integrating administrative data and the potential costs<br />
and benefits of integration exercises.<br />
Because administrative datasets are compiled for operational reasons rather<br />
than for research, and typically contain data that were keyed in by a large number<br />
of people who are not data-entry specialists, deficiencies in data coverage and<br />
quality are common. Administrative decisions can also cause discontinuities in the<br />
concepts and populations that are measured. For example, changes in the jobsearch<br />
obligations that different classes of beneficiary are required to meet have<br />
led to changes over time in the size and composition of the registered<br />
unemployed population. Inconsistent application of questions, missing fields of<br />
data, inconsistent coding of responses, the duplication of records or identities,<br />
and inaccuracies in the linking of records over time are common problems faced<br />
by researchers who are analysing a particular administrative data source for the<br />
first time. The up-front time required to identify and, where possible, resolve<br />
these data quality issues before analysis can proceed is often quite large. Indeed,<br />
it is not uncommon for researchers to spend more time assessing the quality of<br />
the data and attempting to resolve inconsistencies than actually analysing them.<br />
At worst, major changes in the data over time that were driven by administrative<br />
decisions can undermine attempts to use the data to understand real world<br />
trends or identify the behavioural effects of a policy change.<br />
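The kinds of up-front checks described above can be sketched in a few lines. The example below flags duplicated identifiers, missing fields and codes that fall outside an expected coding frame. The field names, the codes and the function name `quality_report` are assumptions made for illustration, not features of any dataset discussed in this issue.

```python
from collections import Counter

# Hypothetical extract; field names and codes are invented for illustration.
records = [
    {"id": "A001", "gender": "F", "region": "Wellington"},
    {"id": "A001", "gender": "F", "region": "Wellington"},  # duplicated record
    {"id": "B002", "gender": "female", "region": ""},       # odd code, missing region
    {"id": "C003", "gender": "M", "region": "Auckland"},
]

VALID_GENDER_CODES = {"F", "M"}  # assumed coding frame


def quality_report(rows):
    """List identifiers affected by three common data quality problems."""
    id_counts = Counter(row["id"] for row in rows)
    return {
        "duplicate_ids": sorted(i for i, n in id_counts.items() if n > 1),
        "missing_region": sorted({r["id"] for r in rows if not r["region"]}),
        "invalid_gender": sorted(
            {r["id"] for r in rows if r["gender"] not in VALID_GENDER_CODES}
        ),
    }


report = quality_report(records)
```

A report of this kind only locates problems; deciding how to resolve them still depends on knowledge of how the agency collected and coded the data.<br />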
The increased use of administrative data in social and economic research is<br />
prompting the agencies that compile such data to consider more carefully what<br />
access and data protection rules should apply. Access to administrative data is<br />
typically conditioned by the obligation of the data holder to protect the<br />
confidentiality of the subjects on the one hand, and to allow legitimate access for<br />
research purposes on the other. The NZES and OSH data that are analysed in this<br />
issue were collected by the Department of Labour, and Department of Labour<br />
standards for the protection of administrative data collected internally have<br />
evolved over time. Current practices limit research access to authorised<br />
researchers, who must be employees of the Department and must have agreed to<br />
observe a set of principles and procedures for protecting confidentiality. Records<br />
are ‘confidentialised’ by the removal of identifying information such as names<br />
and addresses. Data are stored on an isolated and secure data server, so that only<br />
authorised people can view them. Confidentiality standards also require that<br />
research outputs are screened and any information that might permit individuals<br />
or firms to be identified is removed.<br />
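A minimal sketch of these two steps follows: identifying fields are dropped from each record, and a tabulated output is screened before release. The screening rule used here (masking cells with fewer than three records) is an assumption chosen for illustration; the text does not say how the Department screens outputs, and the field names and threshold are hypothetical.

```python
from collections import Counter

IDENTIFYING_FIELDS = {"name", "address"}  # the fields named in the text
MIN_CELL_SIZE = 3  # hypothetical screening threshold, not from the source


def confidentialise(record):
    """Return a copy of the record without directly identifying fields."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}


def screened_counts(rows, field):
    """Tabulate a field, masking cells with fewer than MIN_CELL_SIZE records."""
    counts = Counter(row[field] for row in rows)
    return {value: (n if n >= MIN_CELL_SIZE else None) for value, n in counts.items()}


# Invented records for illustration only.
raw = [
    {"name": "Person 1", "address": "Street 1", "region": "North"},
    {"name": "Person 2", "address": "Street 2", "region": "North"},
    {"name": "Person 3", "address": "Street 3", "region": "North"},
    {"name": "Person 4", "address": "Street 4", "region": "South"},
]

clean = [confidentialise(r) for r in raw]
table = screened_counts(clean, "region")
```

Removing names and addresses does not by itself guarantee anonymity, which is why the output screening step is kept separate in this sketch.<br />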
Other related <strong>research</strong><br />
For a broader view on the potential of administrative data for innovative labour<br />
market research, a number of other recent New Zealand and Australian studies<br />
can be consulted. Dean Hyslop has used longitudinally linked data from the<br />
Inland Revenue Department’s tax database to analyse individual income mobility<br />
(2000b) and to estimate the effects of receiving welfare benefits on individuals’<br />
future incomes (2000a). Moira Wilson has used longitudinally linked data on<br />
working-aged beneficiaries collected by the Department of Social Welfare (and<br />
more recently the Department of Work and Income) to analyse the dynamics of<br />
benefit receipt. Wilson (1999) investigates the duration patterns and rates of<br />
return to income support that are associated with different types of benefit.<br />
Wilson (2000) uses data on successive cohorts of Domestic Purposes Benefit (DPB)<br />
beneficiaries to examine the extent to which patterns of DPB receipt were affected<br />
by a series of policy changes that strengthened the incentives for DPB recipients to<br />
undertake part-time work and increased their obligations to search for work.<br />
Wilson’s work on the dynamics of benefit receipt is paralleled in Australia by<br />
a number of studies commissioned recently by the Department of Family and<br />
Community Services (Barrett, 2001; Flatau and Dockery, 2001; Strombeck and<br />
Dockery, 2001). Using data extracted from the Department’s client and benefit<br />
payment records, these studies examine the durations of time that different types<br />
of recipient spend on different types of benefit (unemployment, sole parent<br />
pension) and the factors that are associated with exit from the benefit system.<br />
New Zealand Conference on Database Integration and Linked<br />
Employer-Employee Data<br />
To promote greater understanding of the techniques and potential benefits of<br />
matching administrative and survey data across data sources, a group of New<br />
Zealand government departments have jointly organised a conference on this<br />
theme, which will take place in Wellington on 21–22 March 2002.<br />
The Conference on Database Integration and Linked Employer-Employee Data<br />
features a number of respected American researchers who have pioneered the<br />
development of integrated employer-employee datasets, such as John Abowd and<br />
John Haltiwanger. They will offer insights into their experiences. Other<br />
international speakers will discuss methods for the preparation and linking of<br />
administrative data and the practical lessons they have learned in data integration<br />
projects. Several speakers will present papers on the confidentiality protection<br />
issues that are faced by agencies with ownership of important survey or<br />
administrative datasets. A series of overseas and New Zealand-based researchers<br />
who are currently engaged in data integration exercises will present results from<br />
their work.<br />
Further information about the conference is available at the conference<br />
website, http://www.dileed.govt.nz. Leading papers from the conference will be<br />
published in the New Zealand Economic Papers.<br />
References<br />
Barrett, Gary (2001) ‘The Dynamics of Participation in Parenting Payment (Single) and Sole Parent<br />
Pension’, Department of Family and Community Services, Policy Research Paper No 14, Commonwealth<br />
Government of Australia. Available at http://www.facs.gov.au.<br />
Blakely, Tony; Salmond, Clare and Woodward, Alistair (1999) ‘Anonymous Record Linkage of<br />
1991 Census Records and 1991–94 Mortality Records’, NZCMS Technical Report No 1, Department of<br />
Public Health, Wellington School of Medicine.<br />
Flatau, Paul and Dockery, Mike (2001) ‘How Do Income Support Recipients Engage with the<br />
Labour Market?’, Department of Family and Community Services, Policy Research Paper No 12,<br />
Commonwealth Government of Australia. Available at http://www.facs.gov.au.<br />
Hyslop, Dean (2000a) ‘Does Benefit Receipt Affect Future Income? An econometric explanation’,<br />
Treasury Working Paper No 2000/14. Available at http://www.treasury.govt.nz/workingpapers/.<br />
Hyslop, Dean (2000b) ‘A Preliminary Analysis of the Dynamics of Individual Market and<br />
Disposable Incomes’, Treasury Working Paper No 2000/15. Available at<br />
http://www.treasury.govt.nz/workingpapers/.<br />
Statistics New Zealand (1998) Final Report on the Feasibility Study into the Costs and Benefits of<br />
Integrating Cross-sectional Administrative Data to Produce New Social Statistics, Statistics New<br />
Zealand, Wellington.<br />
Strombeck, Thorsten and Dockery, Mike (2001) ‘The Duration of Unemployment Benefit Spells: A<br />
comparison of indigenous and non-indigenous persons’, Department of Family and Community<br />
Services, Policy Research Paper No 10, Commonwealth Government of Australia. Available at<br />
http://www.facs.gov.au.<br />
Wilson, Moira (1999) ‘The Duration of Benefit Receipt: New findings from the benefit dynamics<br />
data set’, Social Policy Journal of New Zealand, Issue 13, pp 59–82.<br />
Wilson, Moira (2000) ‘The Policy Response to the Employment Taskforce and Changing Patterns of<br />
Domestic Purpose Benefit Receipt: A cohort analysis’, Social Policy Journal of New Zealand, Issue 14,<br />
pp 78–103.