26.12.2014 Views

Rome Wasn't Digitized in a Day - Council on Library and Information ...


in the checking of all the texts, both to correct transcriptional errors and to provide a consistent markup scheme. 5 This generation also saw the development of BetaCode by classicists to capture ancient languages such as Greek and Coptic. A third class of corpora, which evolved in the 1980s, involved taking professionally entered text and semantically marking it up in SGML/XML, such as with the markup designed by the Text Encoding Initiative (TEI); 6 an example is the Perseus Digital Library (PDL). A fourth generation of corpora involved image-front collections that provided users with page images that included hidden uncorrected optical character recognition (OCR) that could be searched. This strategy, popularized in the 1990s, has driven mass-digitization projects such as Google Books 7 and the Open Content Alliance (OCA). 8 Stewart et al. call for a fifth generation of corpora that synthesize the strengths of the four previous generations while also allowing decentralized contributions from users; using automated methods to create both scalable and semantic markup; and synthesizing “the scholarly demands of capital intensive, manually constructed collections” such as Perseus, the TLG, and the PHI databank of Latin literature, with “the industrial scale of very large, ‘million book’ libraries now emerging.”

In an article written in 1959, James McDonough explored the potential of classics and computing. He opened by noting that it took James Turney Allen almost 43 years to create a concordance of Euripides, a task that a newly available IBM computer could do in 12 hours. McDonough continued with the now-canonical example of how Father Roberto Busa used a computer to create a concordance to the works of Thomas Aquinas. 9 McDonough used these examples to explain that computers could help revolutionize studies by performing exceptionally time-consuming manual tasks such as the creation of concordances, textual emendation, auto abstraction of articles, and, most important, the collection and collation of manuscripts. Although McDonough observed, “machines now make economically feasible a critical edition in which the exact reading of every source could be printed in full,” this phenomenon has yet to occur, a point to which we return in our discussions of digital critical editions and manuscripts.

McDonough optimistically predicted that new computing technologies would convince classicists to take on new forms of research that were not previously possible, arguing that classicists were entering “a new era in scholarship, a golden age in which machines perform the servile secretarial tasks, and so leave the scholar free for his proper function, interpretive scholarly re-search. ...” He concluded with three recommendations: (1) all classicists should request that the machine tape for their editions be given to them; (2) classical studies associations should work together to found and support a center that will record the complete texts of at least all major Latin and Greek authors; and (3) relevant parties should increase their comprehensive bibliographic efforts. As a final thought, McDonough returned to the lifetime work of James Turney Allen. “That such techniques as this article attempts to sketch were not available to Professor Allen at the turn of the century is tragic,” McDonough offered, adding that “if they be not extensively employed from this day forth by all interested in scholarship, it will indeed by [sic] a harsh commentary on our intelligence” (McDonough 1959). McDonough’s points about the importance of making all primary data such as manuscripts and texts available, the need for classical

5 A special open-source tool named Diogenes (http://www.dur.ac.uk/p.j.heslin/Software/Diogenes/) was created by Peter Heslin to work with these two corpora (TLG, PHI Latin databank), as many scholars had criticized the limited usability as well as searching and browsing features of these two “commercial” databases.

6 http://www.tei-c.org/index.xml

7 http://books.google.com<br />

8 http://www.archive.org<br />

9 The work of Father Busa is typically considered to be the beginning of classical computing (Crane 2004) as well as of literary computing and corpus linguistics (Lüdeling and Zeldes 2007).
