
Labour Market Bulletin 2000–02 Special Issue, Pages 26–30

Using administrative data sources in labour market research: an introduction

SYLVIA DIXON 1

INTERNATIONALLY, SOCIAL AND ECONOMIC RESEARCHERS are making increasing use of administrative data sources to investigate a wide range of issues. A large amount of administrative data is routinely collected by government agencies as the by-product of their administrative functions, and the costs of accessing and utilising these data for research are usually quite low compared with the costs of collecting new data. As statistical and computing tools have improved, the number of empirical studies that analyse administrative data has grown. Examples of the types of administrative data that have been used in labour market research include social security payment or income support records, unemployment registration records, income tax data, workers’ accident insurance and compensation records, the enrolment records of educational institutions and company personnel records.

This special issue of the Labour Market Bulletin features three articles that illustrate the potential of administrative data sources for empirical research. Maria Gobbi and David Rea analyse unemployment register data collected by the New Zealand Employment Service (NZES) to investigate the duration of unemployment spells and the distribution of unemployment across jobseekers. Dave Maré uses a different selection of NZES data to examine the effects of active labour market assistance programmes on the subsequent unemployment experiences of jobseekers. Dave Maré and Kerry Papps analyse data collected by the Occupational Safety and Health Service (OSH) in order to evaluate the effectiveness of OSH information and enforcement activities in reducing the number of workplace accidents.

These studies illustrate the value of administrative data in policy or programme evaluation research. Frequently, the administrative data that are generated during the delivery of a social programme provide a rich source of information on the participants and the services they receive, and are a key source of information for the evaluation of the programme’s operation and effectiveness. Typically, a universal sample of all participants in the programme is available, and the information that is recorded on key aspects of programme participation, such as the financial support or other assistance received, is fairly detailed and accurate. The registered unemployment data analysed by Gobbi and Rea contained information on everyone who registered as unemployed in the period 1993 to 1997, representing around 1.2 million individuals and 2.8 million distinct spells of registration. The NZES data contain information on demographic characteristics such as age, gender, ethnicity, educational attainment and place of residence, as well as detailed information on registration history and employment assistance received. The data analysed by Maré provide complete information on all assistance that was provided to registered jobseekers over a nine-year period (1989–97).

1 Sylvia Dixon is a Research Advisor in the Labour Market Policy Group of the New Zealand Department of Labour.

© 2002, Department of Labour, New Zealand, unless specified otherwise

One of the most important properties of administrative datasets is that the individuals or firms that interact with the administrative agency are usually assigned unique identifiers. Consequently, the records relating to each unique respondent can be accurately linked across time. This enables the creation of datasets that include time-related and spell variables, such as the duration of time that jobseekers spend unemployed, or the number of times a firm has been investigated by OSH. The longitudinal nature of many administrative datasets also enables researchers to apply panel data techniques in their modelling of outcomes or policy effects, which can be useful in dealing with classic estimation problems such as unmeasured heterogeneity among programme participants. Maré and Papps use a range of different fixed effect specifications of their basic statistical model in an attempt to estimate the effects of OSH compliance assessments, education, information and advice on the incidence of accidents, while controlling for firm and industry heterogeneity.
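As a rough illustration of how a unique identifier allows spell variables to be built, the following Python sketch links a few invented register records by person and derives the number and length of each person's spells. The record layout, identifiers and dates are hypothetical and are not taken from the NZES data; real register data would need far more careful handling of open or overlapping spells.

```python
from collections import defaultdict
from datetime import date

# Hypothetical register records: (person_id, registration_date, exit_date).
# The layout and values are invented for illustration only.
records = [
    ("A001", date(1993, 2, 1), date(1993, 5, 1)),
    ("B002", date(1994, 1, 10), date(1994, 1, 31)),
    ("A001", date(1995, 7, 15), date(1995, 8, 14)),
]


def spell_durations(rows):
    """Link records on the unique identifier and return each person's
    spell lengths in days, ordered by registration date."""
    durations = defaultdict(list)
    for person_id, start, end in sorted(rows, key=lambda r: (r[0], r[1])):
        durations[person_id].append((end - start).days)
    return dict(durations)


def summarise(durations):
    """Derive simple spell variables: number of spells and total days."""
    return {
        person_id: {"n_spells": len(days), "total_days": sum(days)}
        for person_id, days in durations.items()
    }


durations = spell_durations(records)
summary = summarise(durations)
for person_id, stats in summary.items():
    print(person_id, stats)
```

Once records are grouped in this way, the same per-identifier structure could in principle feed duration or panel models, although the sketch stops at descriptive variables.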

An important disadvantage of administrative data sources is that the range of variables collected on each individual or firm is often quite limited, hindering the capacity of the researcher to explore all relevant hypotheses about the causes of the relationship or outcome of interest, or control for potentially confounding factors. For example, the OSH data analysed by Maré and Papps contained very little information on the range of firm characteristics that may be associated with differences in the level of hazards faced by firms, or differences in the costs of preventing accidents. This central weakness in the data source limits the weight that can be attached to the conclusions of their study. A key limitation of the unemployment register data analysed in the other two studies is that very little information is recorded on the activities and circumstances of those who leave the register. This means that transitions from unemployment into employment cannot be properly distinguished from less favourable outcomes, such as exit from the labour market.

One response to the problem of omitted variables in administrative datasets is to link them to other data sources. While the studies in this issue of the Labour Market Bulletin do not involve any matching of data from different sources, other New Zealand researchers are currently engaged in innovative projects that are linking administrative with survey data to investigate problems that could not otherwise be adequately researched. For example, a team of researchers at the Wellington School of Medicine, led by Tony Blakely, is studying socio-economic variations in mortality patterns by linking census data on socio-economic factors to individual mortality records (Blakely et al, 1999). Overseas experience indicates that linking administrative data on key labour market outcomes such as employment, annual taxable incomes or benefit payments to survey data on individual or household characteristics can greatly extend the research potential of the administrative data. Statistics New Zealand (1998) is a useful reference source on the techniques of integrating administrative data and the potential costs and benefits of integration exercises.
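The basic idea of attaching an administrative outcome to survey characteristics can be sketched in a few lines of Python. This is a minimal illustration that assumes both sources share an exact identifier, which is often not the case in practice (linkage may instead rely on anonymous or probabilistic matching). All identifiers, field names and values below are hypothetical.

```python
# Hypothetical administrative extract: annual taxable income keyed by a
# shared identifier. Values are invented for illustration.
admin_income = {"P1": 31000, "P2": 18500, "P4": 42000}

# Hypothetical survey responses describing household characteristics.
survey = [
    {"id": "P1", "household_size": 3, "qualification": "school"},
    {"id": "P2", "household_size": 1, "qualification": "degree"},
    {"id": "P3", "household_size": 4, "qualification": "none"},
]


def link(survey_rows, admin_lookup):
    """Attach the administrative outcome to each survey record that has a
    matching identifier, and report the identifiers left unmatched."""
    linked, unmatched = [], []
    for row in survey_rows:
        if row["id"] in admin_lookup:
            linked.append({**row, "taxable_income": admin_lookup[row["id"]]})
        else:
            unmatched.append(row["id"])
    return linked, unmatched


linked, unmatched = link(survey, admin_income)
match_rate = len(linked) / len(survey)
print(f"matched {len(linked)} of {len(survey)} survey records")
print("unmatched identifiers:", unmatched)
```

Reporting the unmatched identifiers alongside the linked records is a deliberate choice here, since the match rate is one of the first things a researcher would want to know about an integrated dataset.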

Because administrative datasets are compiled for operational reasons rather than for research, and typically contain data that were keyed in by a large number of people who are not data-entry specialists, deficiencies in data coverage and quality are common. Administrative decisions can also cause discontinuities in the concepts and populations that are measured. For example, changes in the job-search obligations that different classes of beneficiary are required to meet have led to changes over time in the size and composition of the registered unemployed population. Inconsistent application of questions, missing fields of data, inconsistent coding of responses, the duplication of records or identities, and inaccuracies in the linking of records over time are common problems faced by researchers who are analysing a particular administrative data source for the first time. The up-front time required to identify and, where possible, resolve these data quality issues before analysis can proceed is often quite large. Indeed, it is not uncommon for researchers to spend more time assessing the quality of the data and attempting to resolve inconsistencies than actually analysing them. At worst, major changes in the data over time that were driven by administrative decisions can undermine attempts to use the data to understand real world trends or identify the behavioural effects of a policy change.
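Some of the data quality problems listed above lend themselves to simple automated checks before analysis begins. The Python sketch below is one possible first pass, assuming an invented extract with hypothetical field names and an assumed two-value coding frame for gender. It flags duplicated identifiers, missing fields and inconsistently coded responses, but it does not attempt to resolve them.

```python
from collections import Counter

# Hypothetical register extract. Field names and values are invented.
rows = [
    {"id": "A001", "gender": "F", "region": "Wellington"},
    {"id": "B002", "gender": "female", "region": None},
    {"id": "A001", "gender": "F", "region": "Wellington"},
    {"id": "C003", "gender": "M", "region": "Auckland"},
]

# Assumed coding frame for this illustration only.
VALID_GENDER_CODES = {"F", "M"}


def quality_report(records):
    """Flag duplicated identifiers, missing region fields and gender
    values that fall outside the assumed coding frame."""
    id_counts = Counter(r["id"] for r in records)
    return {
        "duplicate_ids": sorted(i for i, n in id_counts.items() if n > 1),
        "missing_region": sum(1 for r in records if r["region"] is None),
        "invalid_gender_codes": sorted(
            {r["gender"] for r in records if r["gender"] not in VALID_GENDER_CODES}
        ),
    }


report = quality_report(rows)
for check, result in report.items():
    print(check, result)
```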

The increased use of administrative data in social and economic research is prompting the agencies that compile such data to consider more carefully what access and data protection rules should apply. Access to administrative data is typically conditioned by the obligation of the data holder to protect the confidentiality of the subjects on the one hand, but allow legitimate access for research purposes on the other. The NZES and OSH data that are analysed in this issue were collected by the Department of Labour, and Department of Labour standards for the protection of administrative data collected internally have evolved over time. Current practices limit research access to authorised researchers, who must be employees of the Department and must have agreed to observe a set of principles and procedures for protecting confidentiality. Records are ‘confidentialised’ by the removal of identifying information such as names and addresses. Data are stored on an isolated and secure data server, so that only authorised people can view them. Confidentiality standards also require that research outputs are screened and any information that might permit individuals or firms to be identified is removed.
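The two protective steps described here, removing identifying fields and screening outputs, can be illustrated with a short Python sketch. It is not a description of the Department's actual procedures. The field names and records are hypothetical, and the use of a minimum cell size to screen tabulated output is an assumption introduced for the example, as is the threshold value.

```python
from collections import Counter

# Identifying fields of the kind named in the text.
IDENTIFYING_FIELDS = {"name", "address"}

# Hypothetical screening threshold, assumed for illustration only.
MIN_CELL_SIZE = 3


def confidentialise(record):
    """Return a copy of the record without the identifying fields."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}


def screened_counts(records, field):
    """Tabulate a field and suppress (set to None) any cell whose count
    falls below the assumed minimum cell size."""
    counts = Counter(r[field] for r in records)
    return {k: (n if n >= MIN_CELL_SIZE else None) for k, n in counts.items()}


# Invented records for illustration.
raw = [
    {"name": "Person One", "address": "1 Example St", "region": "North"},
    {"name": "Person Two", "address": "2 Example St", "region": "North"},
    {"name": "Person Three", "address": "3 Example St", "region": "North"},
    {"name": "Person Four", "address": "4 Example St", "region": "South"},
]

cleaned = [confidentialise(r) for r in raw]
table = screened_counts(cleaned, "region")
print(cleaned[0])
print(table)
```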

Other related <strong>research</strong><br />

For a broader view on the potential of administrative data for innovative labour market research, a number of other recent New Zealand and Australian studies can be consulted. Dean Hyslop has used longitudinally linked data from the Inland Revenue Department’s tax database to analyse individual income mobility (2000b) and to estimate the effects of receiving welfare benefits on individuals’ future incomes (2000a). Moira Wilson has used longitudinally linked data on working-aged beneficiaries collected by the Department of Social Welfare (and more recently the Department of Work and Income) to analyse the dynamics of benefit receipt. Wilson (1999) investigates the duration patterns and rates of return to income support that are associated with different types of benefit. Wilson (2000) uses data on successive cohorts of Domestic Purposes Benefit (DPB) beneficiaries to examine the extent to which patterns of DPB receipt were affected by a series of policy changes that strengthened the incentives for DPB recipients to undertake part-time work and increased their obligations to search for work.

Wilson’s work on the dynamics of benefit receipt is paralleled in Australia by a number of studies commissioned recently by the Department of Family and Community Services (Barrett, 2001; Flatau and Dockery, 2001; Strombeck and Dockery, 2001). Using data extracted from the Department’s client and benefit payment records, these studies examine the durations of time that different types of recipient spend on different types of benefit (unemployment, sole parent pension) and the factors that are associated with exit from the benefit system.

New Zealand Conference on Database Integration and Linked Employer-Employee Data

To promote greater understanding of the techniques and potential benefits of matching administrative and survey data across data sources, a group of New Zealand government departments have jointly organised a conference on this theme, which will take place in Wellington on 21–22 March 2002.

The Conference on Database Integration and Linked Employer-Employee Data features a number of respected American researchers who have pioneered the development of integrated employer-employee datasets, such as John Abowd and John Haltiwanger. They will offer insights into their experiences. Other international speakers will discuss methods for the preparation and linking of administrative data and the practical lessons they have learned in data integration projects. Several speakers will present papers on the confidentiality protection issues that are faced by agencies with ownership of important survey or administrative datasets. A series of overseas and New Zealand-based researchers who are currently engaged in data integration exercises will present results from their work.

Further information about the conference is available at the conference website, http://www.dileed.govt.nz. Leading papers from the conference will be published in the New Zealand Economic Papers.

References

Barrett, Gary (2001) ‘The Dynamics of Participation in Parenting Payment (Single) and Sole Parent Pension’, Department of Family and Community Services, Policy Research Paper No 14, Commonwealth Government of Australia. Available at http://www.facs.gov.au.

Blakely, Tony; Salmond, Clare and Woodward, Alistair (1999) ‘Anonymous Record Linkage of 1991 Census Records and 1991–94 Mortality Records’, NZCMS Technical Report No 1, Department of Public Health, Wellington School of Medicine.

Flatau, Paul and Dockery, Mike (2001) ‘How Do Income Support Recipients Engage with the Labour Market?’, Department of Family and Community Services, Policy Research Paper No 12, Commonwealth Government of Australia. Available at http://www.facs.gov.au.

Hyslop, Dean (2000a) ‘Does Benefit Receipt Affect Future Income? An econometric explanation’, Treasury Working Paper No 2000/14. Available at http://www.treasury.govt.nz/workingpapers/.

Hyslop, Dean (2000b) ‘A Preliminary Analysis of the Dynamics of Individual Market and Disposable Incomes’, Treasury Working Paper No 2000/15. Available at http://www.treasury.govt.nz/workingpapers/.

Statistics New Zealand (1998) Final Report on the Feasibility Study into the Costs and Benefits of Integrating Cross-sectional Administrative Data to Produce New Social Statistics, Statistics New Zealand, Wellington.

Strombeck, Thorsten and Dockery, Mike (2001) ‘The Duration of Unemployment Benefit Spells: A comparison of indigenous and non-indigenous persons’, Department of Family and Community Services, Policy Research Paper No 10, Commonwealth Government of Australia. Available at http://www.facs.gov.au.

Wilson, Moira (1999) ‘The Duration of Benefit Receipt: New findings from the benefit dynamics data set’, Social Policy Journal of New Zealand, Issue 13, pp 59–82.

Wilson, Moira (2000) ‘The Policy Response to the Employment Taskforce and Changing Patterns of Domestic Purpose Benefit Receipt: A cohort analysis’, Social Policy Journal of New Zealand, Issue 14, pp 78–103.
