Thoracic Imaging 2003 - Society of Thoracic Radiology
Voice Recognition
Theresa C. McLoud, M.D.
Associate Radiologist-in-Chief, Director of Education; Thoracic Radiologist, Massachusetts General Hospital
Professor of Radiology, Harvard Medical School, Boston, Massachusetts
Voice recognition technology allows radiologists to achieve
efficiency goals and become more competitive in the digital
environment. It provides a link that can improve the speed of
communication between radiologists and their referring physicians.
Speech recognition systems were introduced into mainstream
large radiology departments in the United States in the
mid-1990s. Most systems require limited amounts of learning
and adaptation compared to other transcription systems and
methods. Most software packages allow easy integration into
existing radiology and hospital information systems. The major
disadvantage of voice recognition comes from radiologists’
resistance to change and fear of technology. In addition, voice
recognition does require more radiologist time and secretarial
skills. However, the benefits to the hospital, referring clinicians,
and patients are impressive because radiology reports are
immediately available on the radiology and hospital information
systems.
Voice recognition software at the human interface level comprises
four core technologies: 1) the recognition of spoken
human speech, 2) the synthesis of the spoken speech into readable
characters, 3) the identification of the speaker and author
verification, and 4) the understanding of the recognized word.
These technologies are often referred to as speech recognition
or speech-to-text; speech synthesis or text-to-speech; speaker
identification and verification; and natural language understanding.
Most voice recognition systems have certain hardware
requirements. These include specific processors and speeds, usually
a Pentium 200 MHz chip, as well as an operating system,
usually a Windows NT platform. In addition to individual
workstations, the system requires integration into a network.
The network allows integration into the radiology information
system and ultimately into PACS and the hospital information
system. Most systems also require the use of high-quality
microphones and sound cards to achieve high accuracy rates.
Integration of a voice recognition system into a radiology
department is not a trivial process. Most systems offer several
key features. The first is an RIS interface, i.e., an interface
with links to the radiology information system, the hospital
information system, PACS and the billing system. The second
component consists of standardized reports. These allow radiologists
to create pre-defined reports for individual radiologists or
the institution. The third component is templates and macros.
Most systems allow the creation of not only standard reports but
also customizable templates and fields. The template capability
gives the radiologist the flexibility to create a form with blank
areas that vary with each dictation. Feature-filled macros allow
standard phrases or components of a report to be easily inserted in
the text. These functions allow greater efficiency. The fourth
component is customizable fields. Most packages permit customized
definitions by the institution and multiple fields associated
with the report. These fields may include ICD-9, CPT, BI-RADS,
ACR/NEMA codes or ACR pathology identifiers. The
fifth component includes bar code interfaces. Most voice recognition
software packages support the use of a bar code laser
reader or an integrated microphone laser reader with the system.
The final component is a system for security.
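
The template and macro features described above can be illustrated with a minimal Python sketch. It is not taken from any particular voice recognition product; the macro triggers, the template layout and the helper names (`MACROS`, `CHEST_TEMPLATE`, `expand_macros`, `build_report`) are all hypothetical and chosen only to show a form with blank areas being filled per dictation, with macros expanded into standard phrases.

```python
from string import Template

# Hypothetical macro table: a short spoken trigger expands to a standard phrase.
MACROS = {
    "macro normal chest": "The lungs are clear. The heart size is normal.",
    "macro no change": "No significant change from the prior study.",
}

# Hypothetical template: fixed wording with blank areas ($name)
# that vary with each dictation.
CHEST_TEMPLATE = Template(
    "EXAMINATION: $exam\n"
    "FINDINGS: $findings\n"
    "IMPRESSION: $impression"
)


def expand_macros(text: str) -> str:
    """Replace every macro trigger found in the dictated text with its phrase."""
    for trigger, phrase in MACROS.items():
        text = text.replace(trigger, phrase)
    return text


def build_report(template: Template, **fields: str) -> str:
    """Expand macros in each dictated field, then fill the template blanks."""
    expanded = {name: expand_macros(value) for name, value in fields.items()}
    # substitute() raises KeyError if a blank area was left undictated.
    return template.substitute(expanded)


report = build_report(
    CHEST_TEMPLATE,
    exam="PA and lateral chest radiographs",
    findings="macro normal chest",
    impression="macro no change",
)
print(report)
```

In this sketch a missing blank raises an error rather than producing an incomplete report. That is one possible design choice, not a description of how commercial systems behave.
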
The most important feature of speech recognition is report
turnaround. In our own institution, reports produced with the
previously used dictation system required an average of over
three days to complete.
This included transcription and review by the radiologist and
the resident. The introduction of voice recognition eliminates
the transcription and correction steps, and a report by a staff
radiologist is finalized immediately after being dictated
and edited. The turnaround time in our own department has
now dropped to 0.4 days.
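
To put the turnaround figures in proportion, the short Python sketch below computes the relative improvement. It assumes 3.0 days as a lower bound for "over three days", so the results are minimum values; the function name `turnaround_improvement` is illustrative only.

```python
from typing import Tuple


def turnaround_improvement(before_days: float, after_days: float) -> Tuple[float, float]:
    """Return (percent reduction, speed-up factor) between two turnaround times."""
    reduction_pct = (before_days - after_days) / before_days * 100
    speedup = before_days / after_days
    return reduction_pct, speedup


# Assumption: 3.0 days stands in for "over three days", so both results are lower bounds.
reduction_pct, speedup = turnaround_improvement(before_days=3.0, after_days=0.4)
print(f"Reduction in turnaround time: at least {reduction_pct:.0f}%")
print(f"Speed-up factor: at least {speedup:.1f}x")
```

Under that assumption, turnaround time fell by at least about 87%, a speed-up of at least 7.5-fold.
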
REFERENCES:
Mehta A. Voice Recognition. In: Dreyer KJ, Mehta A, Thrall JH, eds. PACS: A Guide to the Digital Revolution. New York: Springer-Verlag; 2002. Chapter 11, pp. 281-302.