
Evolution of Audio Recording in Field Surveys - RTI International


Evolution of Audio Recording in Field Surveys

M. Rita Thissen, Sridevi Sattaluri, Emily McFarlane, and Paul P. Biemer
RTI International, Research Triangle Park, NC 27709

Abstract:

The tools of field survey administration change quickly. By taking advantage of new technology and adapting it for time-honored needs, survey managers can boost the effectiveness, efficiency and quality of data collection. One method which has evolved rapidly is computer audio-recorded interviewing (CARI), an approach to ensuring the quality of data through unobtrusive recording by the computer of the audio portion of in-person interviews, much as silent monitoring has been used to ensure quality at call centers.

Several developments in the past few years have improved the technical feasibility of CARI for routine and inexpensive use in field studies. Advances in file compression and available bandwidth enable collection of longer recordings with little strain on transmission capacity and no burden to the interviewer.
Use of a simple external file for specifying items to be recorded in a Blaise instrument offers great flexibility in selecting portions of the interview for auditing, even permitting modification of the recorded-item list while an instrument is in production. A web-based monitoring application, for use by trained reviewers in evaluating the audio files, can now provide access to centrally located audio files by geographically distributed staff.

Progress has also been made from an operational viewpoint. Work has been done to determine the minimum amount of recording needed to achieve agreement among reviewers as to the authenticity of the recorded session, and cost modeling shows that CARI can provide quality assurance at equal or reduced costs compared to more traditional approaches of re-interview or telephone verification.

Use of CARI on several national surveys has provided production experience to bolster laboratory tests. This article reviews the progress of CARI technology in the years since it was introduced, with an emphasis on feasibility for routine use with field surveys.

Key Words: survey technology; audio recording; computer audio-recorded interviewing (CARI); sound file; quality assurance; performance management; field interview; in-person interview

1. Introduction

Monitoring the performance of field staff and the quality of data collection has been challenging since the earliest in-person surveys. Traditionally, field staff have worked largely unobserved, with occasional shadowing by supervisory personnel or re-contacting the respondent to confirm the interview’s authenticity and inquire about the professionalism of the interviewer.

It can also be difficult to evaluate or confirm the effectiveness of questionnaire items in field surveys, whether from a usability perspective, such as the ability of interviewers to read the questions in a fluent and understandable manner, or from the perspective of clarity for the respondent, such that the response provides the desired information without the need for explanation or probing. While focus groups or cognitive interviews in advance of data collection may offer insight into presentation and response patterns, the practice does not fully anticipate field conditions.

Now the situation has changed. Many computers now have audio recording capabilities, and some have built-in microphones.
With this technology, surveys can be set up to collect digital audio recordings in an unobtrusive manner while the interview is taking place. With computer audio-recorded interviewing (CARI), sound files can be created electronically without the need for external equipment and can be transmitted along with response data files and tracking information. Because the recording process is “invisible,” once consent has been given, it can provide a faithful representation of the reality of in-person data collection. The technology provides a potent tool for deterring and detecting falsification, providing performance feedback and enabling study of questionnaire item effectiveness.

2. Audio Recording Technology, Past and Present

From the marketing of the Dictaphone in 1907 (Nuance Communications 2005) to the availability of miniature recorders embedded in portable electronic devices today (Dwyer et al., 1998), people have been discovering ways to take advantage of audio recording tools to capture voices for later review.
While the early acoustic recorders proved helpful for journalistic interviews, they were not usable for large-scale research surveys; the introduction of cassette tapes improved convenience for interviewing (Stockdale, 2002).

With the advent of digital recording, and as computers began offering built-in sound cards, the task of capturing audio records became easier and sound quality improved. Sound files now can be recorded electronically through sound cards and software on laptops, handhelds and other portable devices, making this technology handy for use in field surveys.

Figure 1. Milestones of audio recording history

| Audio Storage | Invention | Widespread Use |
|---|---|---|
| Wax cylinder | 1885 | Early 1900’s to 1940’s |
| Magnetized wire | 1898 | 1940’s |
| Magnetic coatings on plastic tape | 1928 | 1930’s to present |
| Compact cassette | 1963 | 1960’s to 1990’s |
| Pulse code modulation | 1937 | 1990’s |
| Digital audio microprocessor | 1971 | 1990’s |
| SoundBlaster audio card | 1989 | 1990’s |
| First use of CARI | 1999 | Present |
| Portable digital voice recorders | 2003 | Present |

In 1999, use of digital audio recording was first developed and deployed on a national field survey, as the result of innovative work by RTI developers R. Suresh, A. Bethke and P. Cooley. Use of CARI has grown since then, as the feasibility and utility of the approach have been confirmed. Electronic recording requires little attention during the interview, as there are no tapes to change, no additional equipment to set up and no distraction during the interview. Feedback from respondents and interviewers indicates that most people forget about digital recording when the microphone is internal, once the interview gets underway.

Many laptops now have built-in microphones, sound cards and adequate disk space for conducting audio recording.
Handheld digital recorders from some companies offer audio recording capabilities but function much like analog tape recorders, requiring the user to switch them on and off manually. A few handhelds offer programmed recording capabilities plus an internal microphone. Most laptop and many handheld computers allow use of an external microphone instead of the internal one, for improved audio fidelity. However, the visible hardware calls attention to the recording process and may be more likely to affect the respondent’s and interviewer’s behavior.

Audio fidelity from any device depends on a number of factors. When using a laptop with internal microphone, these include
• Placement of the microphone with respect to noise-producing hardware (keyboard, fan and disk drive)
• Placement of the microphone with respect to the interviewer and respondent.
• Microphone control settings.

Some internal microphones are adequate to capture voices within 8 feet or so of the laptop when configured properly, at a quality level that allows a listener to distinguish among multiple voices and discern the spoken content.

3. Audio Recording for Quality Assurance

Although there are several advantages offered by implementing CARI, perhaps the most compelling reason is to confirm the authenticity of data for a reduced cost compared to traditional verification methods.
CARI can act as a deterrent to curbstoning and as a tool for detecting questionable interviews. Interviewers who are aware that monitors may listen to parts of each interview may be less likely to falsify data, because the audio file acts as a “witness” to their actions. In this way, the simple presence of CARI can reduce cheating.

Speech patterns heard in audio files provide information to the monitors about the veracity of the interview, as indicated by the timing and phrasing of questions and responses. In a normal interchange, people pause between words, phrases or sentences, as they consider their answers or express their views (Kowal et al., 1975; O’Connell and Kowal, 1983).

Figure 2. Indicators of questionable authenticity

| Silence | No voices can be heard, although room noises and key clicks are audible |
| Mumbling | The interviewer can be heard, but appears to be speaking to him or herself |
| Unnatural patterns | The respondent answers too quickly or laughs in inappropriate places |
| Comments | The respondent or interviewer makes comments suggesting the interview is being falsified |
| Same voice | The same respondent’s voice is heard in multiple interviews or does not match the stated sex or age of the respondent. |

For example, when an interviewer acts alone and falsifies data, there may be no voice at all in the recording or only one voice, without the pausing, inflection or clarity of voice which would be expected
in a two-way exchange. If the interviewer enlists someone to pose as the respondent, the accomplice may display inappropriate attitudes or emotions, make unexpected remarks, respond without pausing to understand the question or pause at unnatural places while inventing an answer. CARI monitors listen to the recordings, and quickly become adept at distinguishing between recordings of normal interviewing circumstances and suspicious ones, by listening for characteristics such as those in Figure 2 (Thissen and Rodriguez, IBUC 2004).

Using CARI, a survey organization may reduce its re-contact efforts and costs. CARI monitoring may replace most telephone verification calls or field re-interview. However, it remains important to have a second means for following up a small sample of the cases since some respondents may refuse to allow audio recording, and interviewers may attempt to use that option to prevent detection of poor interviewing habits or curbstoning. The benefits of CARI plus optional re-contact are twofold: it tells the interviewing staff that they cannot avoid monitoring even if they discourage their respondents from allowing CARI, and it allows comparison of the two approaches to confirm the validity of the results.

4. Data Collection Methodology

Another benefit of CARI is that it provides a method for identifying questionnaire problems and data collection difficulties in interviewer-respondent interactions.
Field staff do not always conduct interviews in an optimal manner, and it can be difficult to obtain reliable information about their performance. While personal observation can provide a wealth of information, the presence of an observer may bias the evaluation. CARI offers a unique opportunity to listen to the interview exactly as it took place, without observation effects.

During the first few weeks in the field, feedback can be an important tool for reinforcing lessons learned during training. A CARI monitor may be able to provide feedback, either praise or constructive criticism, about the way in which the interview was conducted.

Improper question administration which can be detected through CARI includes
• Paraphrasing
• Improper probing
• Suggested responses
• Poor enunciation
• Improper commentary

CARI can be used to identify positive behavior such as
• Precise adherence to protocol
• Adept handling of difficult situations
• Consistency, honesty, and professional behavior

CARI can also be used to evaluate the usability of questionnaire items.
The audio recording of an interviewer’s presentation of an item and the subject’s response provides a clear indication of whether the item succeeds in several ways:
• Readability – based on the interviewer’s fluency in presenting the item
• Clarity of content – based on the respondent’s ease of understanding

Survey items which evoke negative reactions or require frequent explanations are detrimental to the response rate and increase the level of burden on both interviewer and subject. Using CARI, especially during field testing of an instrument, allows the survey specialist to evaluate the success of the questionnaire items in eliciting the desired information.

5. Privacy, Security, Consent and Legalities

For CARI to be used during an interview, participants must give express consent for the interview to be recorded. Respondents are told that their participation is voluntary and that their information and responses are confidential and will only be used for statistical purposes. In two national field studies using CARI, approximately 83% of respondents in one survey on a highly sensitive topic agreed to allow the interview to be recorded, and 93% of respondents on another less sensitive survey agreed (Wrenn-Yorker and Thissen, FedCASIC 2005).
For those who do not allow recording, traditional verification methods such as telephone verification interviews are used.

All survey data, including CARI recordings, must be safeguarded. In addition to design considerations based on user needs, careful attention must be paid to security and privacy issues when dealing with human data. In the United States, laws and regulations direct the management of personal identification information, health records and other specific types of data. Computing professionals must be aware of federal, state and local requirements for confidentiality in storage, transmission and release of personal information. Institutional review boards oversee all research data on human subjects, to ensure that the studies contribute to the greater good without harming individuals. To comply with guidelines and regulations, information systems may need to include authentication mechanisms, audit histories and user records. These are regulatory rather than usability requirements but are essential components of survey information systems.

Given heightened consciousness of confidentiality and security concerns, care is required in handling audio recordings. Even though the survey may not deliberately record personally identifying information, it cannot be guaranteed to avoid it. For this reason, audio files are best treated as sensitive data, much the way response data is handled. Encryption may be desirable for digital recordings, and special care may need to be taken in handling tapes, if the recording is by analog device.

6. Audio File Formats

Digital audio recording can take place with various levels of sound quality, and the resulting files may be stored in various electronic formats. The sound recording algorithm affects the following:
• Audio file size and storage requirements
• Required software for recording and playback
• Quality of sound on playback
• Platform requirements and CPU demands
• Cost and licensing issues.

Many audio file formats have been developed over the years, and their sheer variety may seem baffling to the new observer. Recent attention has been given to mp3 (Moving Picture Experts Group Audio Layer 3) format, but many other file formats exist as well. A few of the common formats are listed in Figure 3.

Microsoft Windows operating systems include Sound Recorder software which writes to the wave file format, and the Windows Media Player which can play back wave files and a number of other non-proprietary formats. The PCM (pulse code modulation) digital recording algorithm is used in various encoders including Sound Recorder, and records uncompressed sound with no required licensing.

Figure 3. Common audio file formats

| Name | File Extension | Use |
|---|---|---|
| Wave | .wav | Windows uncompressed |
| MP3 | .mp3 | Compressed audio |
| RealMedia | .rm | Compressed audio |
| RealAudio | .ra | Compressed, for streaming audio |
| AIFF | .aiff | Macintosh default uncompressed |
| CD Audio | .cda | Music CD tracks |
| Active Streaming Format | .asf | Streaming audio |

Wave files are not especially efficient at storage, but the recording process places little demand on the computer. The size of a particular wave file depends on the recording parameters selected in its creation. For each available audio file format, there is a choice of sampling rate, bandwidth, number of channels and other parameters. For RTI’s current CARI system, the standard configuration is 16-bit bandwidth, 11.25 kHz sampling rate and a single channel. Recording two channels (stereo) would require twice the storage space and provide no extra quality, since a single microphone is generally used. Audio quality also is affected by sampling rate, compression and audio file format, and the settings given above are minimal for useful files.

To reduce the space for audio files, a compression technique may be used. Coder-decoder algorithms (CODECS) offer ways to store recordings in less space. They eliminate silence and mathematically map the sampled analog sound frequencies instead of preserving the actual data points.
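The storage cost of the uncompressed wave configuration described above follows directly from the recording parameters. The sketch below is illustrative only: the function name is hypothetical, the sampling rate is taken as stated in the text (11.25 kHz), a megabyte is assumed to be 1,000,000 bytes, and the small wave file header is ignored.

```python
def wav_bytes_per_minute(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Size in bytes of one minute of uncompressed PCM audio (header ignored)."""
    # Each sample occupies bits_per_sample / 8 bytes, once per channel.
    bytes_per_second = sample_rate_hz * bits_per_sample * channels // 8
    return bytes_per_second * 60


# Parameters as stated in the text: 16 bit, 11.25 kHz, single channel.
mono = wav_bytes_per_minute(11_250, 16, 1)
# Same settings with two channels (stereo), for comparison.
stereo = wav_bytes_per_minute(11_250, 16, 2)

print(f"mono:   {mono / 1_000_000:.2f} MB per minute")
print(f"stereo: {stereo / 1_000_000:.2f} MB per minute")
```

Under these assumptions the single-channel setting comes to a little over one megabyte per minute, and the two-channel setting to exactly twice that, which is the doubling noted in the text.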
CODECS may be employed as a post-processing step after creation of the sound file or as a real-time action at the time of recording.

CODECS (compression–decompression techniques) were developed for use with audio recording, to reduce the size of sound files. It is possible for audio recording to combine the digitizing process and compression at once. For use in surveys, the system designer can choose among simple recording with no compression, simultaneous recording and compression or recording followed by compression. Section 11 discusses these approaches in a comparison of post-recording compression with simultaneous recording and compression.

7. Integrating Audio Recording with Survey Software

A variety of technologies have been in use to implement survey instruments, such as Blaise (Statistics Netherlands), CASES (University of California, Berkeley) and web-based technologies like ASP.NET (Microsoft). Audio recording components have been successfully incorporated in all these environments.

One of the challenges of incorporating audio recording in a survey instrument is to make the process unnoticeable to the interviewer.
The recording process must not slow the system or provide any visual or audible clue as to when it starts and stops.

Audio recording can be added to Blaise instruments by using either of two programming approaches. One approach uses a Blaise procedure which in turn invokes an external application to start and stop the recorder. Using this approach requires complex programming within Blaise in every place the recording application needs to be invoked, to keep track of whether recording is already in progress or needs to be started or stopped (Thissen and Rodriguez, IBUC 2004).

The second approach uses the Blaise alien router. Starting with version 4.6, Blaise introduced the alien router as part of the Blaise component pack. The alien router technology allows the invocation of an external component before and after every survey item. Use of the alien router externalizes the complexities of tracking the recorder state. It also opens up the possibility of maintaining a text list of items to be recorded, external to the instrument. This reduces the complexity of instrument programming and allows easy modification of the list of items to be recorded, without any need to modify the data model or recompile the instrument (Thissen and Sattaluri, 2006b).

For CASES instruments the recording can be integrated by spawning a separate application to start and stop an external recorder (Wrenn-Yorker and Thissen, 2005).

When a survey is offered in multiple modes by using a web-based instrument, field interviewing may take place through a website running on the laptop without continuous connection to the internet.
In that case, the audio recording component can be achieved by installing a client-side Java applet and Java scripting, similar to the way in which CARI can be implemented for internet-based surveys (Suresh, 2005).

Once a survey instrument has been enabled with CARI technology, survey information systems (Thissen, 2004) must also be expanded to handle the audio data files. From a case management and data security perspective, CARI files are no more than response data stored in a different format. Issues and concerns are the same for files containing audio response data as they are for files of textual responses. File protection on the laptop, transmission to a central site, central storage, access by authorized researchers and eventual deletion must all be planned with the same security and confidentiality used for traditional response files.

9. Transmission

There are several options for transferring audio files from the field laptop to a central management system. The files can be sent using dialup transmission, broadband, or removable media like flash drives shipped by secure delivery methods. For small surveys, it may be practical to leave audio files on the laptops until the end of data collection. With the pervasiveness of broadband access at homes through cable modem or DSL (digital subscriber line telephone service), the capacity for transmitting large files has greatly increased. Still, researchers must plan for transmission when using CARI, since audio files can be large.

The choice of transmission option may depend on the size of files being transmitted.
Uncompressed audio recording has been found to consume about one megabyte of disk space for each minute of recorded dialog. (See Section 11 below for a comparison of recording parameters and file sizes.) Assuming an instrument were programmed to collect three one-minute recordings which were compressed to 100KB each, the case management system would have 300KB to transmit for every case. If the interviewer transmits one case each day, these files can be sent using a dialup connection. The use of broadband allows transmitting a larger number of files or larger size files at a faster rate.

The third option, using removable external media and shipment, can be used when entire interviews or lengthy sections are recorded. However, security concerns, the effort of handling external media and the possibility of loss make this approach less desirable than automatic transmission via dialup or broadband. Still, it may prove useful when recording interviews in their entirety or when other forms of file transfer are not available.

Audio recordings may contain personal identifying information, whether by intention or by accident, and so it is important to protect these files by using encryption tools while they reside in any location accessible to unauthorized individuals.
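The per-case transmission arithmetic given earlier in this section can be sketched as follows. This is an illustration under stated assumptions, not a measurement: the function names are hypothetical, a kilobyte is taken as 1,000 bytes, the 56 kbit/s dialup rate and 1 Mbit/s broadband rate are assumed nominal link speeds that do not come from the paper, and protocol overhead is ignored, so the times are idealized lower bounds.

```python
KB = 1_000  # bytes per kilobyte (decimal convention, assumed)


def case_payload_bytes(recordings: int, bytes_each: int) -> int:
    """Total audio bytes to transmit for one case."""
    return recordings * bytes_each


def transfer_seconds(payload_bytes: int, link_bits_per_second: int) -> float:
    """Idealized transfer time: payload bits divided by link rate."""
    return payload_bytes * 8 / link_bits_per_second


# Assumed nominal link rates (hypothetical values, not taken from the paper).
ASSUMED_LINKS_BPS = {"dialup": 56_000, "broadband": 1_000_000}

# From the text: three one-minute recordings compressed to 100KB each.
compressed = case_payload_bytes(3, 100 * KB)
# From the text: about one megabyte per minute when uncompressed.
uncompressed = case_payload_bytes(3, 1_000 * KB)

for label, bps in ASSUMED_LINKS_BPS.items():
    print(f"{label}: compressed {transfer_seconds(compressed, bps):.0f} s, "
          f"uncompressed {transfer_seconds(uncompressed, bps):.0f} s")
```

With these assumed rates, the 300KB compressed case fits comfortably in a short dialup session, while the same three minutes left uncompressed would take roughly ten times as long.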
In addition, if files are transferred over the internet, secure socket layer (SSL) certification can be used, which provides a way to encrypt the data stream during transmission.

10. CARI Monitoring

After audio files are received at a central location, the monitoring process may be as simple as opening up the files using a free player tool like Windows Media Player or Real Player. However, since manual case management is impractical for all but the smallest of surveys, it is best to build a system that provides an interface for reviewing the files and a database for recording evaluations.

The monitoring system might be a client-server application or a browser-based application located on an internal or external network. Client-server applications restrict access to an organization’s internal network and locally located users, due to poor performance of database connections over long distances. A web-based approach has the advantage of being available from any workstation which has access to the network, supporting organizations with review staff distributed nationally or even internationally (Thissen and Sattaluri, 2006a).

Regardless of the implementation, it should provide role-based access to protect the security of the information stored in the audio files. For example, three levels of access might be designed into the system:

• CARI monitoring staff, who listen to and evaluate audio files
• Supervisory staff, who designate monitors, manage caseloads and track review-completion status
• System administrators, who configure new surveys and create new logins and passwords.

For large surveys, the system may also include an algorithm to select a specified percentage of files to be reviewed per interviewer. Ideally, it would offer the flexibility to adjust review rates for any field interviewer for any active survey, so that quality assurance personnel can increase monitoring of any interviewer who has been suspected of improper data collection practices (Hartman et al, 2006).

11. Audio and Operational Results

In this section, we present some results of RTI’s experience with CARI technology. The data given below were obtained by lab test, field test and production survey use of CARI processes.

A comparison of recording alternatives is shown in Figure 4, with an indication of the resulting playback sound quality. The column labeled “MB Per Min” lists the number of megabytes of storage required for one minute of sound when using the uncompressed wave file format. Similar patterns of relative file size can be found for other file formats.

Figure 4. Recording parameters

Bandwidth   Sampling     Channels   Sound Quality   MB Per Min
8 bit       11.25 KHz    1          Low             0.66
16 bit      11.25 KHz    1          Medium          1.31
8 bit       22.5 KHz     1          Medium          1.79
16 bit      22.5 KHz     1          High            1.19
16 bit      44.1 KHz     1          Very High       5.25
16 bit      44.1 KHz     2          Very High       12.3

We have looked at alternative processes for compressing existing audio files. A wave file was compressed as a separate step after recording, using a specific CODEC and selected recording parameters. In terms of a CARI system, this process might be performed by the case management system after the interview was completed but prior to transmission. Using this approach, compression ratios ranged from a factor of 2 to 75. In general, if the recording was of very high fidelity stereo, the original file would be very large and compress greatly. Lowering the recording quality produces a smaller file originally but proportionally less compression.

At RTI, files are recorded with the Windows native Sound Recorder software called from Blaise or CASES, resulting in file sizes of about one MB/minute uncompressed. Use of the LAME (The LAME Project) open source compression algorithm and appropriate parameters yields an average compression ratio of approximately 11:1 without noticeable loss of audio quality, resulting in about 100KB files for one minute of audio.

Figure 5. File sizes obtained by concurrent recording and compression

CODEC       Input Sound Quality   Number Of Files Tested   Average MB/Min
MPEGRec     Low                   4                        0.98
MPEGRec     Mod                   3                        1.68
MPEGRec     V.High, Mono          2                        0.96
MPEGRec     V.High, Stereo        1                        1.80
RealMedia   Low                   24                       0.34
RealMedia   Mod                   3                        0.51
RealMedia   V.High, Mono          2                        0.34
RealMedia   V.High, Stereo        1                        0.47

In another experiment, we recorded sound directly to a compressed format, without intervening storage as a wave file. In a CARI system, this requires the instrument to call a specific recording application and CODEC, such as MPEGRec (mp3), producing a compact file that is ready to encrypt and transmit. The simplicity of this approach was attractive because compression was immediate and effective, as shown in Figure 5. On the down side, simultaneous compression and recording tax the computer’s processing power. This reduces system performance, produces lag and visible indication of recording processes, and limits its usefulness.

Figure 6. Loudness Effect on File Size

File Format   Sound Level   Averaged Over # of Files   MB Per Minute
Wave          Silent        6                          1.30
Wave          Quiet voice   9                          1.31
Wave          Voice         6                          1.32
MP3           Silent        6                          0.97
MP3           Quiet voice   8                          0.96
MP3           Voice         6                          0.97
RM            Silent        6                          0.34
RM            Quiet voice   8                          0.34
RM            Voice         6                          0.34
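The file sizes reported in this paper can be combined into a rough estimate of the transmission load per case. The following Python sketch is illustrative only and is not part of the RTI systems described here: the function names are hypothetical, the default values (about 1 MB per minute uncompressed, an 11:1 compression ratio) are taken from the figures quoted in the text, and the link speeds are assumed nominal values that ignore protocol overhead and real-world line quality.

```python
def case_payload_kb(recordings_per_case, minutes_each,
                    uncompressed_mb_per_min=1.0, compression_ratio=11.0):
    """Approximate compressed audio payload for one case, in KB (1 MB taken as 1000 KB)."""
    uncompressed_kb = (recordings_per_case * minutes_each
                       * uncompressed_mb_per_min * 1000)
    return uncompressed_kb / compression_ratio


def transfer_seconds(payload_kb, link_kbps):
    """Idealized transfer time: payload in kilobytes, link speed in kilobits per second."""
    return payload_kb * 8 / link_kbps


if __name__ == "__main__":
    # Three one-minute recordings per case, as in the example in Section 9.
    payload = case_payload_kb(recordings_per_case=3, minutes_each=1.0)
    print(f"Estimated payload per case: {payload:.0f} KB")

    # Hypothetical nominal link speeds, for illustration only.
    for name, kbps in [("dialup (56 kbps)", 56), ("broadband (1000 kbps)", 1000)]:
        print(f"{name}: {transfer_seconds(payload, kbps):.1f} s")
```

Under these assumptions, three one-minute recordings come to a little under 300KB per case, which is consistent with the rounded figure used in Section 9 and takes well under a minute on an idealized dialup link. Recording entire interviews changes only the `minutes_each` and `recordings_per_case` inputs, which makes it easy to see when dialup stops being practical.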


We tested whether loudness had any effect on the size of the recorded output file by looking at the level of sound in audio files compared to file size, for CARI files which were all recorded under identical configuration settings on the same laptop. Figure 6 shows the results of the comparison, demonstrating that there was no apparent effect of loudness on audio file size.

The quality of the sound files from the field is of interest, as an indicator of the feasibility of gathering information for large numbers of interviews. Figure 7 shows results from reviewing a sample of 11% of the first 1500 completed interviews from a survey. The asterisk (*) indicates that the default rating was chosen, as opposed to an explicitly defined score. Rating the file quality was optional through the monitoring interface if the quality was acceptable for review (Hartman et al, 2006).

Figure 7. CARI sound file quality distribution

Sound Quality     Number of Interviews
1 – Poor          4
2 – Passable      5
3 – Adequate      21
* – Acceptable    48
4 – Good          49
5 – Excellent     37

Problems noted with audio files included background noise, static, faintness of voices, key tapping, hum and other recording problems which interfered with detection of vocal content. Audio files were considered adequate if voices could be plainly heard and understood, regardless of other noises. This definition of quality differs from any commonly used to rate the quality of audio recording for other purposes, such as musical entertainment, but it is appropriate for survey evaluation purposes.

Figure 8. Field performance problems detected through CARI

Count   % of Cases   Problem Definition
13      0.2          Authenticity Questionable
217     3.9          Reading - Minor Deviation
72      1.3          Reading - Major Deviation
73      1.3          Recording Errors
44      0.8          Unprofessional Behavior
86      1.5          Inappropriate Probing
79      1.4          Feedback not Neutral
1       0.01         Incorrect Incentive Provided

We have also gathered operational information on field staff performance from production use of CARI. Figure 8 shows the distribution of field performance problems found in one study after review of approximately 5600 interviews. A single case might be assigned multiple problem codes, and so the problem count total is greater than the number of affected cases (Wrenn-Yorker and Thissen, FedCASIC, 2005).

In general, field interviewers and respondents have been accepting of the technology.
In a feedback study, 82% of interviewers felt neutral or positive about use of CARI, and a post-interview survey of 283 respondents found that 70% of the respondents reported they had no reaction one way or the other, 15% reported liking the idea, while 13% disliked the idea (Herget et al, 2005). As noted above, assent to CARI by respondents ranged from around 83% in one survey to 93% in another. This assent was independent of consent to conduct the interview (Wrenn-Yorker and Thissen, 2005).

A small experiment was conducted to determine the minimum number of CARI audio files required for making consistent monitoring evaluations, that is, how many audio files were required before reaching a point where listening to additional audio files for an interview had no effect on the determinations. This work suggested that three audio files, each of 30-second duration, may be adequate for verification purposes. After review of three files, CARI monitors reached 97% agreement with the ratings found by review of five files, indicating that three files provide sufficient information for evaluation purposes.

It is difficult to compare costs precisely between CARI operations and more traditional re-interview or verification processes, because the traditional systems tend to be well established while CARI systems are still evolving.
A theoretical cost-analysis model was created to compare the expected costs of operating both systems at the same “steady state” in which all systems had been implemented. Analysis of that model suggests that the steady-state cost of verification is less with CARI than for the traditional approach, but actual data were not available for that comparison.

12. Visions of the Future

Looking forward, we see expanded use of CARI in field surveys, for monitoring survey quality and also as an integral part of data collection. Advances in digital signal processing may support automation of activities now being done by CARI monitors or coders.

With regard to data quality monitoring, it may be possible one day to screen a large portion of the audio files automatically for evidence of falsification. For example, software may be able to distinguish between audio files with and without voices and to identify the number of differing voices within a single recording. This technology could be employed for a population census or large survey that requires many interviews to be screened very quickly for falsification. Audio processing software may be able to determine respondent qualities such as whether a voice is male or female, or to match spoken interviewer words with the predefined question text, for evaluation of how well the interviewer followed protocol.

CARI can also be used as a data collection tool. A number of surveys tape record respondent responses that are subsequently coded, and CARI offers a convenient, unobtrusive alternative for collecting these recordings. Matching audio responses to a dictionary of expected words might allow automated coding of open-ended items or of an “other-specify” option of multiple-choice items.

Farther in the future, recordings may be transcribed automatically to text which can be parsed and analyzed. Current commercial software often requires “training” the package to recognize the user’s voice, which limits usefulness in the field.
However, research is underway on speech-to-text conversion tools in uncontrolled or “noisy” surroundings (Ming et al, 2006), which may broaden their applicability to include home environments.

Acknowledgements

The authors would like to acknowledge the work of Albert Bethke, Phil Cooley and R. Suresh in the invention of CARI, the contributions of Frank Mierzwa in cost modeling and of Pauline Robinson in file compression studies. Finally, we would like to recognize the contributions of the U.S. Census Bureau to the field of audio-recorded interviewing.

References

Biemer, P.P., Hergert, D., Morton, J. and Willis, W. (2000), “The Feasibility of Monitoring Field Interview Performance Using Computer Audio Recorded Interviewing (CARI)”, Proceedings of the American Statistical Association’s Section on Survey Research Methods, pp. 1068-1073.
Dwyer, J.J., Godin, D.K., Colon, R.S., Sr., Rothschild, S., Pawlowski, J.J., and Vaughan, J.C. (1998), “Voice file management in portable digital audio recorder”, United States Patent 6671567.
Hartman, P., Wrenn-Yorker, C., Sattaluri, S. and Thissen, M.R. (2006), “Research and Development in Audio-Recorded Interviewing”, Presented at Federal Computer Assisted Survey Information Collection (FedCASIC) Conference.
Herget, D., Biemer, P.P., Morton, J. and Sand, K. (2005), “Computer Audio Recorded Interviewing (CARI): Additional Feasibility Efforts of Monitoring Field Interview Performance”, Presented at Federal Conference on Statistical Methods.
Kowal, S., O'Connell, D.C. and Sabin, E.J. (1975), “Development of Temporal Patterning and Vocal Hesitation in Spontaneous Narratives”, Journal of Psycholinguistic Research, Vol. 4, pp. 195-207.
The LAME Project, LAME Compression Software, http://lame.sourceforge.net/index.php
Ming, J., Hazen, T.J. and Glass, J.R. (2006), “Speaker Verification Over Handheld Devices with Realistic Noisy Speech Data”, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, pp. I-637 to I-640.
Nuance Communications, Inc. (2005), “About Dictaphone”, http://www.dictaphone.com/aboutus/history.asp
O’Connell, D.C. and Kowal, S. (1983), “Pausology”. In Computers in Language Research, Sedelow, W.A. Jr. and Sedelow, S.Y. (eds), Berlin-New York: Walter de Gruyter & Co., pp. 221-301.
Statistics Netherlands, Statistical Informatics Department, P.O. Box 4000, 2270 JM Voorburg, The Netherlands.
Stockdale, A. (2002), “Tools for digital audio recording in qualitative research”, Social Research Update, pp. 1-4.
Suresh, R. (2005), “Web-Based Computer Audio Recorded Interview (Web-CARI)”, Presented at the International Field Directors and Technology Conference 2005, Atlanta, GA.
Thissen, M.R. and Rodriguez, G. (2004), “Recording Interview Sound Bites Through Blaise Instruments”, Proceedings of the International Blaise Users’ Conference, pp. 411-423.
Thissen, M.R. and Sattaluri, S. (2006a), “Computer Audio-Recorded Interviewing (CARI)”, Presented at The International Field Directors and Technologies Conference, Montreal.
Thissen, M.R. and Sattaluri, S. (2006b), “Research and Development in Audio-Recorded Interviewing, Part II”, Presented at The International Field Directors and Technologies Conference, Montreal, Canada.
University of California, Berkeley, Software Support Services, “Computer-Assisted Survey Execution System (CASES)”, CSM Program, 358 Barrows Hall #3820, Berkeley, CA 94720.
Wrenn-Yorker, C. and Thissen, M.R. (2005), “Computer Audio Recorded Interviewing (CARI) Technology”, Presented at the Federal Computer-Assisted Survey Information Collection (FedCASIC) Conference.
