Rome Wasn't Digitized in a Day - Council on Library and Information ...
in the checking of all the texts, both to correct transcriptional errors and to provide a consistent markup
scheme.[5] This generation also saw the development of BetaCode by classicists to capture ancient
languages such as Greek and Coptic. A third class of corpora, which evolved in the 1980s, involved
taking professionally entered text and semantically marking it up in SGML/XML, such as with the
markup designed by the Text Encoding Initiative (TEI);[6] an example is the Perseus Digital Library
(PDL). A fourth generation of corpora involved image-front collections that provided users with page
images that included hidden uncorrected optical character recognition (OCR) that could be searched.
This strategy, popularized in the 1990s, has driven mass-digitization projects such as Google Books[7]
and the Open Content Alliance (OCA).[8] Stewart et al. call for a fifth generation of corpora that
synthesize the strengths of the four previous generations while also allowing decentralized
contributions from users; using automated methods to create both scalable and semantic markup; and
synthesizing “the scholarly demands of capital intensive, manually constructed collections” such as
Perseus, the TLG, and the PHI databank of Latin literature, with “the industrial scale of very large,
‘million book’ libraries now emerging.”

In an article written in 1959, James McDonough explored the potential of classics and computing. He
opened by noting that it took James Turney Allen almost 43 years to create a concordance of
Euripides, a task that a newly available IBM computer could do in 12 hours. McDonough continued
with the now-canonical example of how Father Roberto Busa used a computer to create a concordance
to the works of Thomas Aquinas.[9] McDonough used these examples to explain that computers could
help revolutionize studies by performing exceptionally time-consuming manual tasks such as the
creation of concordances, textual emendation, auto abstraction of articles, and, most important, the
collection and collation of manuscripts. Although McDonough observed, “machines now make
economically feasible a critical edition in which the exact reading of every source could be printed in
full,” this phenomenon has yet to occur, a point to which we return in our discussions of digital critical
editions and manuscripts.

McDonough optimistically predicted that new computing technologies would convince classicists to
take on new forms of research that were not previously possible, arguing that classicists were entering
“a new era in scholarship, a golden age in which machines perform the servile secretarial tasks, and so
leave the scholar free for his proper function, interpretive scholarly re-search. ...” He concluded with
three recommendations: (1) all classicists should request that the machine tape for their editions be
given to them; (2) classical studies associations should work together to found and support a center that
will record the complete texts of at least all major Latin and Greek authors; and (3) relevant parties
should increase their comprehensive bibliographic efforts. As a final thought, McDonough returned to
the lifetime work of James Turney Allen. “That such techniques as this article attempts to sketch were
not available to Professor Allen at the turn of the century is tragic,” McDonough offered, adding that
“if they be not extensively employed from this day forth by all interested in scholarship, it will indeed
by [sic] a harsh commentary on our intelligence” (McDonough 1959). McDonough’s points about the
importance of making all primary data such as manuscripts and texts available, the need for classical

[5] A special open-source tool named Diogenes (http://www.dur.ac.uk/p.j.heslin/Software/Diogenes/) was created by Peter Heslin to work with these two
corpora (TLG, PHI Latin databank), as many scholars had criticized the limited usability as well as searching and browsing features of these two
“commercial” databases.
[6] http://www.tei-c.org/index.xml
[7] http://books.google.com
[8] http://www.archive.org
[9] The work of Father Busa is typically considered to be the beginning of classical computing (Crane 2004) as well as of literary computing and corpus
linguistics (Lüdeling and Zeldes 2007).