All Ireland Traveller Health Study Our Geels - Department of Health ...
Technical Report 1
Analysis Strategy for Health Survey Findings
Data Cleaning and Preparation
The data was received from the field in a raw electronic format (SQL database), with the responses for
the census and all the surveys in a single database. The database also included files for a number of
interview outcomes, including refusals and unavailable or moved families. The questionnaire and database
programmer therefore had to clean and restructure the data at the level of the SQL database before
converting it to an SPSS database. The main tasks in the data cleaning and preparation process were:
• Removal of duplicate and invalid files, and addition of files that had not been uploaded but were
retrieved from laptops returned from the field.
• Renaming and restructuring of variables into an analysable and interpretable format.
• Combining the proxy-answered variables of the adult Health Status and Service Utilisation Surveys
with their directly answered counterparts.
• Creation of individual-level demographic data files from family-level files that nested demographic
data for several individuals of the same family.
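The tasks listed above can be illustrated with a minimal Python sketch. It is not the study's own syntax, which was written for SQL and SPSS. All field names (`family_id`, `members`, `person_no`) are hypothetical, and the rule of preferring a direct answer over a proxy answer is an assumption made for illustration only.

```python
def combine_proxy(direct, proxy):
    """Merge a directly answered value with its proxy-answered counterpart.

    Assumption: the direct answer is kept when present, otherwise the proxy.
    """
    return direct if direct is not None else proxy


def drop_duplicates(records, key):
    """Keep only the first record seen for each value of `key`."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


def flatten_families(families):
    """Turn family-level records with nested members into one row per person."""
    rows = []
    for family in families:
        for position, member in enumerate(family["members"], start=1):
            rows.append({
                "family_id": family["family_id"],
                "person_no": position,
                "age": member["age"],
                "sex": member["sex"],
            })
    return rows


# Hypothetical family-level file; the second record is a duplicate upload.
families = [
    {"family_id": "F1", "members": [{"age": 34, "sex": "F"}, {"age": 6, "sex": "M"}]},
    {"family_id": "F1", "members": [{"age": 34, "sex": "F"}, {"age": 6, "sex": "M"}]},
    {"family_id": "F2", "members": [{"age": 51, "sex": "M"}]},
]
people = flatten_families(drop_duplicates(families, "family_id"))
```

Here `people` holds one row per individual, each carrying the identifier of the family it was nested in, so individual-level rows can still be linked back to the family-level file.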
Because ROI data was received before the NI data, all the processes were initially carried out on the ROI
database. The data cleaning and transformation syntax generated was then customised to suit the NI
variables and any issues specific to the NI database, before it was run on the NI database.
Analysis
The analysis was carried out in two rounds. The first round ran in tandem with the data cleaning and
preparation process. ROI data was initially partitioned into the constituent census, child and adult
surveys. The study team data analysts attended a training session where the structure of the databases
and the questionnaire were explained. They were supplied with an analysis and reporting programme
prepared by the study’s information technology (IT) group. The programme produced standard reports
of census/survey response percentages as well as age and gender breakdowns for the survey responses.
Each analyst had to customise the programme into a census- or survey-specific programme. They
received training on how to run their analysis, and each analyst submitted a complete report,
including the analysis programmes, once finished. Finalised census- and survey-specific analysis
programmes were then generated, including an adjusted version for the NI database. Finally, the
cleaned and finalised analysis syntaxes were rerun on the final databases in a second round of analysis.
Respondents did not answer every question, so response rates varied across questions. Missing values
were not imputed, and the reported percentages are based only on the number of people who responded
to the question, excluding those who responded with ‘Don’t know’ or ‘Refused’ as well as those who did
not respond. As a result, the number of responses reported differs from variable to variable. For
multiple response sets, the percentages were based on those who selected at least one response,
excluding those selecting ‘Don’t know’ or ‘Refused’ and those who did not select any option.
Furthermore, the total number of Travellers who responded to each sub-questionnaire is calculated
from the maximal valid response to an item within that sub-questionnaire.
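The percentage rules described above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the study's SPSS syntax. The item names (`smoking`, `gp_visit`) and all answers are made up, and missing answers are assumed to be stored as `None`.

```python
from collections import Counter

NON_ANSWERS = {"Don't know", "Refused"}


def valid_percentages(answers):
    """Return (valid n, percentages), excluding missing, 'Don't know' and 'Refused'."""
    valid = [a for a in answers if a is not None and a not in NON_ANSWERS]
    n = len(valid)
    counts = Counter(valid)
    return n, {answer: 100.0 * c / n for answer, c in counts.items()}


def multi_response_percentages(selections):
    """Percentages for a multiple response set (one set of options per respondent).

    The base is respondents who selected at least one option and did not
    select 'Don't know' or 'Refused'.
    """
    base = [s for s in selections if s and not (s & NON_ANSWERS)]
    n = len(base)
    counts = Counter(option for s in base for option in s)
    return n, {option: 100.0 * c / n for option, c in counts.items()}


def respondents_total(items):
    """Total respondents for a sub-questionnaire: the largest valid n of any item."""
    return max(valid_percentages(answers)[0] for answers in items.values())


# Hypothetical answers from seven respondents to two items.
smoking = ["Yes", "No", "No", "Don't know", None, "Refused", "No"]
gp_visit = ["Yes", "Yes", None, "No", "Yes", "No", "Yes"]

n_smoking, pct_smoking = valid_percentages(smoking)
total = respondents_total({"smoking": smoking, "gp_visit": gp_visit})

# Hypothetical multiple response item: services used.
services = [{"GP", "Hospital"}, {"GP"}, set(), {"Don't know"}, {"Hospital"}, {"GP", "Pharmacy"}]
n_services, pct_services = multi_response_percentages(services)
```

In this example the smoking item has four valid answers out of seven, so its percentages use a denominator of 4, while the sub-questionnaire total is taken from `gp_visit`, the item with the most valid answers (6). Percentages for a multiple response set can sum to more than 100 because each respondent may select several options.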