"Rome Wasn't Digitized in a Day":

Building a Cyberinfrastructure

for Digital Classics

by Alison Babeu

August 2011

Council on Library and

Information Resources


“Rome Wasn’t Digitized in a Day”:

Building a Cyberinfrastructure for Digital Classicists

Alison Babeu

August 2011



ISBN 978-1-932326-38-3

CLIR Publication No. 150

Published by:

Council on Library and Information Resources

1752 N Street, NW, Suite 800

Washington, DC 20036

Web site at http://www.clir.org

This publication is available online at http://www.clir.org/pubs/abstract/pub150abst.html.

“Rome Wasn’t Digitized in a Day”: Building a Cyberinfrastructure for Digital Classicists, by Alison Babeu, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Based on a work at www.clir.org.


Contents

ABOUT THE AUTHOR

ACKNOWLEDGMENTS

AUTHOR’S NOTE

KEY ACRONYMS

FOREWORD

INTRODUCTION

CLASSICS AND COMPUTERS: A LONG HISTORY

MULTIDISCIPLINARY CLASSICAL DIGITAL LIBRARIES: ADVANCED TECHNOLOGIES AND SERVICES

Bibliographies/Catalogs/Directories

Document Analysis, Recognition, and Optical Character Recognition for Historical Languages

Ancient Greek

Latin

Sanskrit

Syriac

Cuneiform Texts and Sumerian

Digital Editions and Text Editing

Introduction

Theoretical Issues of Modeling and Markup for Digital Editions

New Models of Collaboration, Tools, and Frameworks for Digital Editions

The Challenges of Text Alignment and Text Variants

Computational Linguistics and Natural Language Processing

Treebanks

Morphological Analysis

Lexicons

Canonical Text Services, Citation Detection, and Citation Linking

Text Mining, Quotation Detection, and Authorship Attribution

THE DISCIPLINES AND TECHNOLOGIES OF DIGITAL CLASSICS

Ancient History

Classical Archaeology

Overview

Electronic Publishing and Traditional Publishing

Data Creation, Data Sharing, and Digital Dissemination

Data Integration, Digital Repositories, and Cyberinfrastructure for Archaeology

Designing Digital Infrastructures for the Research Methods of Archaeology

Visualization and 3-D Reconstructions of Archaeological Sites

Classical Art and Architecture

Classical Geography

The Ancient World Mapping Center

The Pleiades Project

The HESTIA Project

Epigraphy

Overview: Epigraphy Databases, Digital Epigraphy, and EpiDoc

Online Epigraphy Databases



EpiDoc-Based Digital Epigraphy Projects

The Challenges of Linking Digital Epigraphy and Digital Classics Projects

Advanced Imaging Technologies for Epigraphy

Manuscript Studies

Digital Libraries of Manuscripts

Digital Challenges of Individual Manuscripts and Manuscript Collections

Digital Manuscripts, Infrastructure, and Automatic Linking Technologies

Numismatics

Numismatics Databases

Numismatic Data Integration and Digital Publication

Palaeography

Papyrology

Digital Papyri Projects

Integrating Digital Collections of Papyri and Digital Infrastructure

EpiDoc, Digital Papyrology, and Reusing Digital Resources

Collaborative Workspaces, Image Analysis, and Reading Support Systems

Philology

Tools for Electronic Philology: BAMBI and Aristarchus

Infrastructure for Digital Philology: The Teuchos Project

Prosopography

Issues in the Creation of Prosopographical Databases

Network Analysis and Digital Prosopography

Relational Databases and Modeling Prosopography

Other Prosopographical Databases

THE USE AND USERS OF RESOURCES IN DIGITAL CLASSICS AND THE DIGITAL HUMANITIES

Citation of Digital Classics Resources

The Research Habits of Digital Humanists

Humanist Use of Source Materials: Digital Library Design Implications

Creators of Digital Humanities Resources: Factors for Successful Use

“Traditional” Academic Use of Digital Humanities Resources

The CSHE Study

The LAIRAH Project

The RePAH Project

The TIDSR Study

OVERVIEW OF DIGITAL CLASSICS CYBERINFRASTRUCTURE

Requirements of Cyberinfrastructure for Classics

Open-Access Repositories of Secondary Scholarship

Open Access, Collaboration, Reuse, and Digital Classics

Undergraduate Research, Teaching, and E-Learning

Looking Backward: State of Digital Classics in 2005

Looking Forward: Classics Cyberinfrastructure, Themes, and Requirements in 2010

Classics Cyberinfrastructure Projects

APIS—Advanced Papyrological Information System

CLAROS—Classical Art Research Online Services

Concordia

Digital Antiquity

Digital Classicist

eAQUA

eSAD—e-Science and Ancient Documents



Integrating Digital Papyrology and Papyri.info

Interedition: An “Interoperable Supranational Infrastructure for Digital Editions”

LaQuAT—Linking and Querying of Ancient Texts

PELAGIOS: Enable Ancient Linked Geodata in Open Systems

SPQR—Supporting Productive Queries for Research

BUILDING A HUMANITIES CYBERINFRASTRUCTURE

Defining Digital Humanities, Cyberinfrastructure, and the Future

Open Content, Services, and Tools as Infrastructure

New Evaluation and Incentive Models for Digital Scholarship and Publishing

Challenges of Humanities Data and Digital Infrastructure

“General” Humanities Infrastructures, Domain-Specific Needs, and the Needs of Humanists

Virtual Research Environments in the Humanities: A Way to Address Domain-Specific Needs

New Models of Scholarly Collaboration

Sustainable Preservation and Curation Infrastructures for Digital Humanities

Levels of Interoperability and Infrastructure

The Future of Digital Humanities and Digital Scholarship

OVERVIEW OF LARGE CYBERINFRASTRUCTURE PROJECTS

Alliance of Digital Humanities Organizations

arts-humanities.net

centerNET

CLARIN

DARIAH Project

Digital Humanities Observatory

DRIVER

NoC-Network of Expert Centres

Project Bamboo

SEASR

TextGrid

TextVRE

REFERENCES



ABOUT THE AUTHOR

Alison Babeu has served as the Digital Librarian and research coordinator for the Perseus Project since 2004. Before coming to Perseus, she worked as a librarian at both the Harvard Business School and the Boston Public Library. She has a BA in History from Mount Holyoke College and an MLS from Simmons College.

ACKNOWLEDGMENTS

Thanks to Gregory Crane, Lisa Cerrato, David Bamman, Rashmi Singhal, Marco Büchler, Monica Berti, and Marion Lame, and to all Perseus Project staff members, present and past, who made the writing of this work far easier, not only by providing a wonderful and supportive work environment but also by continuing to produce far more interesting and exciting work than any author could ever summarize. In addition, I would like to thank Amy Friedlander as well as Kathlin Smith, Brian Leney, and Jessica Wade at the Council on Library and Information Resources for their help in publishing both the draft and the final version of this review. Thanks are also due to my family and friends who make all the work I do possible. This work was funded through a generous grant from the Institute for Museum and Library Services with institutional support from Tufts University.

AUTHOR’S NOTE

The final version of this report has benefited greatly from the thorough commentary of many individuals, and the author greatly appreciates the time reviewers took to offer suggestions. Owing both to time constraints and to the object of this work, which was to provide a summative and recent overview of the use of digital technologies in classics rather than a critical and comprehensive history of the use of computing in each of the many related disciplines of classics, a number of important articles and books have not received the level of attention they deserve. I have attempted in this final version to include references to all missed works that commentators brought to my attention, but I have not been able to adequately address all of the thoughtful and comprehensive comments received. As with any literature review, the goal has been to highlight important themes and projects, a difficult challenge for such a dynamic and evolving field as digital classics. All omissions are solely the fault of the author, but as this final version has been published under a Creative Commons Attribution Share-Alike license, it is hoped that others can make use of this work, improve it, and help keep it up-to-date.

KEY ACRONYMS

ACLS American Council of Learned Societies

ADHO Alliance of Digital Humanities Organizations

ADS Archaeology Data Service

AIA Archaeological Institute of America

AHDS Arts and Humanities Data Service

AHRC Arts and Humanities Research Council

ANS American Numismatic Society

APA American Philological Association

APh L’Année Philologique

API application programming interface

APIS Advanced Papyrological Information System



ARL Association of Research Libraries
BAMBI Better Access to Manuscripts and Browsing of Images
CC Creative Commons
CDLI Cuneiform Digital Library Initiative
CLARIN Common Language Resources and Technology Infrastructure
CLAROS Classical Art Research Online Services
CMS content-management system
CSE creativity support environment
CSHE Center for Studies in Higher Education
CTS Canonical Text Services
DC Dublin Core
DHO Digital Humanities Observatory
DOI digital object identifier
DARIAH Digital Research Infrastructure for the Arts and Humanities
DDbDP Duke Data Bank of Documentary Papyri
DRIVER Digital Repository Infrastructure Vision for European Research
DTD document type definition
EAD Encoded Archival Description
eAQUA Extraktion von strukturiertem Wissen aus Antiken Quellen für die Altertumswissenschaft
eSAD E-science and Ancient Documents
ETANA Electronic Tools and Ancient Near Eastern Archives
ETCSL Electronic Text Corpus of Sumerian Literature
FRBR Functional Requirements for Bibliographic Records
GIS geographical information system
HCA History, Classics and Archaeology Study Centre
HEML Historical Event Markup and Linking Project
HMT Homer Multitext Project
IAph Inscriptions of Aphrodisias
IADB Integrated Archaeological Database
ICA independent component analysis
ICT information and communication technology
IDP Integrating Digital Papyrology
ISAW Institute for the Study of the Ancient World
ISS interpretation support system
JISC Joint Information Systems Committee
LaQuAT Linking and Querying of Ancient Texts
LAIRAH Log Analysis of Internet Resources in the Arts and Humanities
LGPN Lexicon of Greek Personal Names
NEH National Endowment for the Humanities
NLP natural language processing
NSF National Science Foundation
OAI Open Archives Initiative
OAI-ORE Open Archives Initiative-Object Reuse and Exchange
OAI-PMH Open Archives Initiative-Protocol for Metadata Harvesting
OAIS Open Archival Information System
OCA Open Content Alliance
OCLC Online Computer Library Center
OCHRE Online Cultural Heritage Environment
OCR optical character recognition
OER online education resource
OPAC Online Public Access Catalog



PBW Prosopography of the Byzantine World
PCA principal component analysis
PDB prosopographical database
PDL Perseus Digital Library
PHI Packard Humanities Institute
PLANETS Preservation and Long-Term Access Through Networked Services
PN Papyrological Navigator
PSD Pennsylvania Sumerian Dictionary
PSWPC Princeton Stanford Working Papers in Classics
RDF resource description framework
RePAH Research Portal for the Arts and Humanities
RRDL Roman de la Rose Digital Library
SEASR Software Environment for the Advancement of Scholarly Research
SKOS Simple Knowledge Organization System
SNA social network analysis
SPQR Supporting Productive Queries for Research
SQL structured query language
tDAR the Digital Archaeological Record
TEI Text Encoding Initiative
TIDSR Toolkit for the Impact of Digitised Scholarly Resources
TILE Text Image Linking Environment
TLG Thesaurus Linguae Graecae
TLL Thesaurus Linguae Latinae
URI uniform resource identifier
URL uniform resource locator
URN uniform resource name
VERA Virtual Environment for Research in Archaeology
VMR Virtual Manuscript Room
VRE virtual research environment
VRE-SDM Virtual Research Environment for the Study of Documents and Manuscripts



FOREWORD

“My purpose is to tell of bodies which have been transformed into shapes of a different kind.”

Ovid, Metamorphoses, trans. R. Humphries.

Cogent and insightful, Rome Wasn’t Digitized in a Day: Building a Cyberinfrastructure for Digital Classicists rewards the reader with a many-faceted exploration of classical studies: the history of this complex and multidimensional field, its development of computer-based resources and tools over the last several decades, its current opportunities and needs in a digital era, and prospects for its future evolution as envisioned by digital classicists. Alison Babeu reminds us early in her report of the astonishing reach of classical studies, a field that includes the disciplines of history, literature, linguistics, art, anthropology, science, and mythology, among others, bounded by the Mycenean culture at its most distant past and continuing to the seventh century C.E. Not surprisingly, within this historical compass the sources for classicists are equally complex: stone fragments, papyri, pottery shards, the plastic arts, coins, and some of the most breathtaking physical structures the world has known.

In the course of this report, the substantial gains in the use of digital technologies in service to classical studies become obvious. Over the past 40 years, remarkable resources have been built, including large-scale text databases in a variety of languages; digital repositories for archeological data, as well as for coins and cuneiform tablets; and datasets of texts for paleography and epigraphical studies. Applications that assist the scholar in morphological analysis, citation linking, text mining, and treebank construction, among others, are impressive. The challenges are also significant: there persist problems with the integrity of OCR scans; the interoperability of multimedia data that contain texts, images, and other forms of cultural expression; and the daunting magnitude of so many languages in so many different scripts.

The intellectual return on this investment in technology as a service to classical studies is equally startling and complex. One of the more salient developments has been the reconceptualization of the text. As recently as a generation ago, the “text” in classics was most often defined as a definitive edition, a printed artifact that was by nature static, usually edited by a single scholar, and representing a compilation and collation of several extant variations. Today, through the power and fluidity of digital tools, a text can mean something very different: there may be no canonical artifact, but instead a dataset of its many variations, with none accorded primacy. A work of ancient literature is now more often deeply contextualized, its transmission over time more nuanced, and its continuity among the various instantiations more accurately articulated. The performative nature of some of the great works (the epics of Homer are a prime example) can be captured more rigorously by digital technology, which can layer the centuries of manuscript fragments to produce a sharper understanding of what was emphasized in the epics over time and what passages or stories appear less important from one era to another, affording new insight into the cultural appropriation of these fundamental expressions of the human condition.

Achieving these new perspectives has required a cultural change in the classics. Scholarship in the digital environment is more collaborative, and can include students as integral contributors to the research effort. The connections, continuities, and cultural dialogue to which classical works were subject are reflected by new teams of scholars, working across traditional disciplines (which can often include computer science) to develop new methodological approaches and intellectual strategies in pursuit of knowledge about the ancient world. In this regard, the digital classics encompass new alignments of traditional hierarchies, academic boundaries, and technologies.

The Council on Library and Information Resources is pleased to publish this far-reaching study. The issues and perspectives to which it gives voice pertain significantly to the humanities at large. Its appearance is especially relevant as plans to build very large digital libraries in Europe and the United States flourish. Indeed, a transdisciplinary approach will be essential in constructing a digital environment with the scale and sophistication necessary to support advanced research, teaching, and lifelong learning. As this study suggests, we must continue to engage humanists, engineers, scientists, and all manner of pedagogical expertise in pursuit of a new, transformative educational ecology.

Chuck Henry
President
Council on Library and Information Resources



INTRODUCTION

“You would think that classics is classics, but, in fact, the sensibilities inside the different subdisciplines can be radically different” (Unnamed scholar cited in Harley et al. 2010, 74).

“It has always been the case that scholars need to cite primary and secondary texts in retraceable form and argue cogently and replicably from the data to the conclusions, just as it has long been the case that academic language, jargon, abbreviations, and conventions ought to be standardised within (if not between) disciplines. None of the philosophies and practices of the Digital Classics community need therefore be seen as new or unfamiliar” (Bodard 2008).

Classics is a complicated and interdisciplinary field with a wide-ranging group of subdisciplines and a seemingly endless variety of technical challenges. Digital classics, for the purpose of this report, is broadly defined as the use of digital technologies in any field related to the study of classical antiquity. This report concentrates on projects that have focused on classical Greece, Rome, and the ancient Middle and Near East, and generally on the period up to about 600 AD. It offers brief coverage of projects with content from the medieval era, largely in the area of manuscript studies. The report explores the state of the art in digital classics in terms of what projects exist and how they are used; examines the infrastructure that currently exists to support digital classics as a discipline; and investigates larger humanities cyberinfrastructure projects and existing tools or services that might be repurposed for the digital classics.

To set the context, this report opens with an overview of the history of computing in classics. This is followed by a look at large multidisciplinary classical digital libraries (i.e., those of the ancient Near East, Greece, and Rome) and the types of advanced services and technologies they use and are likely to require. Next is a summary of the various disciplines of digital classics, the major projects in each, and the major technologies in use. Then, an overview of digital humanities user studies attempts to identify the needs of users of these projects, as no user studies of digital classicists could be found. An overview of requirements for a cyberinfrastructure for digital classics, along with a survey of relevant projects, follows. The report concludes by presenting recommendations for a humanities cyberinfrastructure and relevant national and international cyberinfrastructure projects.

CLASSICS AND COMPUTERS: A LONG HISTORY

The field of classical studies is broad and includes a variety of related disciplines, such as ancient history, archaeology, epigraphy, numismatics, papyrology, philology, and prosopography, and the impact of computing has varied greatly among them. 1 In 2000, Lorna Hardwick offered a useful definition of the field:

As an academic discipline it is broad in terms of chronology, in geographical provenance and in the range of specialisms on which it relies. It involves the study of the languages, literatures, histories, ideas, religion, science and technology, art, architecture and all aspects of the material and intellectual cultures of the peoples living in and around the Aegean and Mediterranean from the age of Mycenae (c. 1400–1200 BCE) until roughly the seventh century CE (Hardwick 2000).

1 While each subsection of this report provides an overview of the important projects and issues for the discipline in question, a quick perusal of the table of contents (http://www.worldcat.org/isbn/0754677737) of the recently published Digital Research in the Study of Classical Antiquity (Bodard and Mahony 2010) illustrates the diversity of research in digital classics.

Melissa Terras recently offered a helpful definition of a classicist:

Often understood as ‘one who advocates the school study of the Latin and Greek classics’, this definition belies the complex range of sources and associated research techniques often used by academic Classicists. Varied archaeological, epigraphic, documentary, linguistic, forensic and art historical evidence can be consulted in the course of everyday research into history, linguistics, philology, literature, ethnography, anthropology, art, architecture, science, mythology, religion and beyond. Classicists, have by nature and necessity, always been working across disciplinary boundaries in a data-intensive research area, being ‘interdisciplinary, rather than simply un-disciplined’. The addition of advanced digital and computational tools to many a Classicist’s arsenal of skills should therefore not really come as a surprise, given the efficiencies they afford in the searching, retrieval, classification, labeling, ordering, display and visualization of data (Terras 2010, 172).

Classical studies is thus an inherently interdisciplinary field and one that has long made use of advanced technology. Various studies 2 have considered the impact of computing on classical studies. This section provides an overview of several articles in order to set the context for an examination of more recent developments in the field of digital classics.

The evidence on which classicists draw can range from buildings, inscriptions, artifacts, and art objects to written evidence such as poetry, drama, narrative histories, and philosophical works. One particular challenge is that many of the classicist’s textual sources are fragmentary. Hardwick notes:

Even where we possess a more or less complete form, texts have generally survived via manuscripts copied and recopied in late antiquity or medieval times. Almost all of the material evidence has to be excavated and/or reconstructed. Even objects that have survived relatively intact often lack their original contexts. Thus, far from consisting of a fixed, closed body of knowledge, as used to be imagined when Classical Studies had the reputation of being the ultimate ‘canonical’ field of study, the discipline often involves considerable experimentation, conjecture and hypothesis (Hardwick 2000).

Although classics is often considered to be a field of fixed knowledge, the research discussed in this paper illustrates the inaccuracy of that belief.

In an effort to trace the long history of classics and computing, Stewart et al. (2007) have defined several generations of digital corpora in classics. The first generation simply sought to make texts available online (such as the Latin Library 3) and relied on community contributions. A second generation of corpora, such as the Thesaurus Linguae Graecae (TLG) and the Packard Humanities Institute (PHI) 4 Latin Library on CD-ROM, invested in professional data entry and involved scholars

2 A full review of all the important studies of classics <strong>and</strong> comput<str<strong>on</strong>g>in</str<strong>on</strong>g>g. or <str<strong>on</strong>g>in</str<strong>on</strong>g>deed even a comprehensive bibliography of this topic, is bey<strong>on</strong>d the scope of<br />

this paper. For several of the larger studies see Bagnall (1980); Brunner (1993), found <str<strong>on</strong>g>in</str<strong>on</strong>g> Solom<strong>on</strong> (1993); <strong>and</strong> Edmunds (1995). For a brief but slightly<br />

more recent overview, see Latousek (2001.<br />

3 http://thelat<str<strong>on</strong>g>in</str<strong>on</strong>g>library.com/<br />

4 The Packard Humanities Institute is a n<strong>on</strong>profit <str<strong>on</strong>g>in</str<strong>on</strong>g>stituti<strong>on</strong> that was established <str<strong>on</strong>g>in</str<strong>on</strong>g> 1987 (http://en.wikipedia.org/wiki/Packard_Humanities_Institute) <strong>and</strong><br />

has funded a range of projects <str<strong>on</strong>g>in</str<strong>on</strong>g> archaeology <strong>and</strong> historic c<strong>on</strong>servati<strong>on</strong>, <str<strong>on</strong>g>in</str<strong>on</strong>g>clud<str<strong>on</strong>g>in</str<strong>on</strong>g>g the above-menti<strong>on</strong>ed database of Lat<str<strong>on</strong>g>in</str<strong>on</strong>g> literature, an extensive <strong>on</strong>l<str<strong>on</strong>g>in</str<strong>on</strong>g>e<br />

Greek epigraphy project, the Duke Databank of Documentary Papyri, <strong>and</strong> a database of “Persian Literature <str<strong>on</strong>g>in</str<strong>on</strong>g> Translati<strong>on</strong>”<br />

(http://persian.packhum.org/persian/ma<str<strong>on</strong>g>in</str<strong>on</strong>g>).


3<br />

<str<strong>on</strong>g>in</str<strong>on</strong>g> the check<str<strong>on</strong>g>in</str<strong>on</strong>g>g of all the texts, both to correct transcripti<strong>on</strong>al errors <strong>and</strong> to provide a c<strong>on</strong>sistent markup<br />

scheme. 5 This generati<strong>on</strong> also saw the development of BetaCode by classicists to capture ancient<br />

languages such as Greek <strong>and</strong> Coptic. A third class of corpora, which evolved <str<strong>on</strong>g>in</str<strong>on</strong>g> the 1980s, <str<strong>on</strong>g>in</str<strong>on</strong>g>volved<br />

tak<str<strong>on</strong>g>in</str<strong>on</strong>g>g professi<strong>on</strong>ally entered text <strong>and</strong> semantically mark<str<strong>on</strong>g>in</str<strong>on</strong>g>g it up <str<strong>on</strong>g>in</str<strong>on</strong>g> SGML/XML, such as with the<br />

markup designed by the Text Encod<str<strong>on</strong>g>in</str<strong>on</strong>g>g Initiative (TEI); 6 an example is the Perseus Digital <strong>Library</strong><br />

(PDL). A fourth generati<strong>on</strong> of corpora <str<strong>on</strong>g>in</str<strong>on</strong>g>volved image-fr<strong>on</strong>t collecti<strong>on</strong>s that provided users with page<br />

images that <str<strong>on</strong>g>in</str<strong>on</strong>g>cluded hidden uncorrected optical character recogniti<strong>on</strong> (OCR) that could be searched.<br />

This strategy, popularized <str<strong>on</strong>g>in</str<strong>on</strong>g> the 1990s, has driven mass-digitizati<strong>on</strong> projects such as Google Books 7<br />

<strong>and</strong> the Open C<strong>on</strong>tent Alliance (OCA). 8 Stewart et al. call for a fifth generati<strong>on</strong> of corpora that<br />

synthesize the strengths of the four previous generati<strong>on</strong>s while also allow<str<strong>on</strong>g>in</str<strong>on</strong>g>g decentralized<br />

c<strong>on</strong>tributi<strong>on</strong>s from users; us<str<strong>on</strong>g>in</str<strong>on</strong>g>g automated methods to create both scalable <strong>and</strong> semantic markup; <strong>and</strong><br />

synthesiz<str<strong>on</strong>g>in</str<strong>on</strong>g>g “the scholarly dem<strong>and</strong>s of capital <str<strong>on</strong>g>in</str<strong>on</strong>g>tensive, manually c<strong>on</strong>structed collecti<strong>on</strong>s” such as<br />

Perseus, the TLG, <strong>and</strong> the PHI databank of Lat<str<strong>on</strong>g>in</str<strong>on</strong>g> literature, with “the <str<strong>on</strong>g>in</str<strong>on</strong>g>dustrial scale of very large,<br />

“milli<strong>on</strong> book” libraries now emerg<str<strong>on</strong>g>in</str<strong>on</strong>g>g.”<br />

In an article written in 1959, James McDonough explored the potential of classics and computing. He opened by noting that it took James Turney Allen almost 43 years to create a concordance of Euripides, a task that a newly available IBM computer could do in 12 hours. McDonough continued with the now-canonical example of how Father Roberto Busa used a computer to create a concordance to the works of Thomas Aquinas.⁹ McDonough used these examples to explain that computers could help revolutionize studies by performing exceptionally time-consuming manual tasks such as the creation of concordances, textual emendation, auto abstraction of articles, and, most important, the collection and collation of manuscripts. Although McDonough observed, “machines now make economically feasible a critical edition in which the exact reading of every source could be printed in full,” this phenomenon has yet to occur, a point to which we return in our discussions of digital critical editions and manuscripts.
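To make concrete the kind of task a concordance involves, the following minimal Python sketch builds a keyword-in-context concordance: every word becomes a headword, listed with a few words of surrounding text for each occurrence. It is an illustration only, not a reconstruction of the IBM or Busa workflows; the function name `build_concordance`, the `width` parameter, and the sample line (the opening of the Aeneid) are assumptions introduced here.

```python
import re
from collections import defaultdict


def build_concordance(text, width=3):
    """Map each word to its occurrences, each shown with `width` words of context."""
    # Tokenize on runs of letters and lowercase them,
    # so that "Arma" and "arma" share one headword.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    concordance = defaultdict(list)
    for i, word in enumerate(tokens):
        left = tokens[max(0, i - width):i]
        right = tokens[i + 1:i + 1 + width]
        # Bracket the headword so it stands out inside its context.
        concordance[word].append(" ".join(left + ["[" + word + "]"] + right))
    return dict(concordance)


# Hypothetical sample input: the first line of Virgil's Aeneid.
sample = "Arma virumque cano, Troiae qui primus ab oris"
concordance = build_concordance(sample, width=2)

# Print headwords alphabetically, as a printed concordance would list them.
for headword in sorted(concordance):
    for context in concordance[headword]:
        print(f"{headword:10} {context}")
```

A usable scholarly concordance would also need lemmatization, so that inflected forms share a headword, and references to the lines of a cited edition; both are omitted from this sketch.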

McDonough optimistically predicted that new computing technologies would convince classicists to take on new forms of research that were not previously possible, arguing that classicists were entering “a new era in scholarship, a golden age in which machines perform the servile secretarial tasks, and so leave the scholar free for his proper function, interpretive scholarly re-search. ...” He concluded with three recommendations: (1) all classicists should request that the machine tape for their editions be given to them; (2) classical studies associations should work together to found and support a center that will record the complete texts of at least all major Latin and Greek authors; and (3) relevant parties should increase their comprehensive bibliographic efforts. As a final thought, McDonough returned to the lifetime work of James Turney Allen. “That such techniques as this article attempts to sketch were not available to Professor Allen at the turn of the century is tragic,” McDonough offered, adding that “if they be not extensively employed from this day forth by all interested in scholarship, it will indeed by [sic] a harsh commentary on our intelligence” (McDonough 1959). McDonough’s points about the importance of making all primary data such as manuscripts and texts available, the need for classical associations to work together, and the need for scholars to give up obsessing over “slavish” tasks and return to the more important work of humanistic interpretation of sources still ring true.

5 A special open-source tool named Diogenes (http://www.dur.ac.uk/p.j.heslin/Software/Diogenes/) was created by Peter Heslin to work with these two corpora (TLG, PHI Latin databank), as many scholars had criticized the limited usability as well as searching and browsing features of these two “commercial” databases.

6 http://www.tei-c.org/index.xml

7 http://books.google.com

8 http://www.archive.org

9 The work of Father Busa is typically considered to be the beginning of classical computing (Crane 2004) as well as of literary computing and corpus linguistics (Lüdeling and Zeldes 2007).

More than 30 years later, J. D. Bolter offered a detailed analysis of how one feature of the Internet, hypertext, offered great new potential for recording and presenting scholarship (Bolter 1991). Bolter contended that studies for the past two centuries had been defined by the qualities of the printed book, where the main goal of scholarship had been “to fix each text of each ancient author: to determine the authenticity of works ascribed to an author and for each work to establish the Urtext, what the author actually wrote, letter for letter.” This is the essential work of a classical philologist or creator of a critical edition: to study all the extant manuscripts of an author, to “reconstruct” a text, and then to list all the text variants (or at least the most important) found in the original sources with explanations in an apparatus criticus. While postmodern literary theory had challenged the ideal of the Urtext in many disciplines, Bolter submitted, classical studies had remained largely unaffected. Nonetheless, Bolter believed that the nature of hypertext was affecting how classicists perceived the nature of their texts. “Hypertext now challenges the Urtext not in the jargon of postmodern theory,” he explained, “but practically and visibly in the way that it handles text.” Hypertext, he posited, might lead scholars to focus less on establishing an exact Urtext and more on exploring the connections between texts to “emphasize the continuity between the ancient text and its ancient, medieval, and modern interpretations.” The need for digital editions to reflect a more sophisticated tradition of textual transmission and textual variation is a theme that is echoed throughout the literature of digital classics.

An even more extensive exploration of how new technologies might affect the discipline of classics was provided by Karen Ruhleder (Ruhleder 1995), albeit with an exclusive focus on the TLG. One challenge Ruhleder underscored was that “humanists themselves have been more interested in applying computing technologies to their work [than] in detailed studies of the impact of computing technologies on their disciplines.” Christine Borgman echoes this criticism in her recent examination of the digital humanities (Borgman 2009). Ruhleder surveyed how using the TLG had affected the daily work of classicists, how it had changed their relationship to the textual materials they used, and how it had affected both social relations among classicists and their disciplinary infrastructures. She conducted 60 unstructured interviews with classicists and concentrated on work in literary scholarship and textual criticism. Ruhleder observed that the work of classical scholarship was often like detective work, and that scholars’ questions typically included “manuscript authorship and authenticity, social relationships between different groups or classes in ancient Greek society, and the different meanings of a word or phrase over time.” Classical scholars used analytical techniques to weigh and interpret evidence and used tools “to locate particular pieces of evidence within texts,” Ruhleder noted. As Bolter explained previously, the nature of textual evidence for classicists is complicated because materials are often fragmentary or questionable, their transmission is disputable, and the “reconstruction of the original, or urtext, is an important primary activity within classical scholarship.”

While the TLG did offer amazing new searching opportunities as well as both breadth and depth of material, Ruhleder criticized its unexamined use by classicists. The TLG uses one “best edition” chosen by a special committee of the American Philological Association (APA)¹⁰ for each Greek author and includes no commentaries or apparatus criticus, a practice challenged by Ruhleder:

The TLG has altered not only the form (book to electronic medium) but also the content and the organization of materials presented in the package. It includes neither critical notes nor other elements of an apparatus criticus and includes only a single edition of each text. These limitations have led to serious criticism, particularly where there is dispute over the version used by the TLG (Ruhleder 1995).

10 http://www.apaclassics.org/

Ruhleder also noted that while the corpus may have been broadened in one sense, it is also far shallower as critical information has been “decoupled” from the texts. Similar criticism of the TLG has also been offered more recently by Notis Toufexis:

In the absence of detailed contextualization information accompanying the online version of each text, the user who wishes to check the reliability of a given edition (if, for instance, it uses all extant manuscripts of a text or not) has to refer to the printed edition or other handbooks. The same applies to any attempt to put search results obtained by the TLG within the wider context of a literary genre or a historical period. The TLG assumes in a sense that its users have a broad knowledge of Greek literature and language of all historical periods and are capable of contextualizing each search result on their own (Toufexis 2010, 110).

Criticisms such as these have been aimed at the TLG since its founding in the 1970s, and project founder Theodore Brunner has also acknowledged that access to the TLG does not exempt scholars from checking printed editions of classical texts for the apparatus criticus (Brunner 1987). Brunner cited both the desire to enter as many texts as possible and the relatively high costs of data entry (10 cents a word in 1987) as reasons for the approach the TLG chose to take.

In addition to the possible lack of contextual or critical information, many classicists whom Ruhleder interviewed were concerned about the authority that was afforded to texts in the TLG, owing to their electronic nature. Ruhleder hypothesized that the TLG had affected the work of classicists in terms of (1) the beliefs and expectations they had of the materials with which they worked, (2) the nature of their skill sets and expertise, and (3) the infrastructure of their discipline. In terms of beliefs regarding materials, classicists had previously assumed that gaining familiarity with a corpus was a life’s work and happened only through constant reading and rereading of the text, and that adding to that corpus was a collaborative act.

Ease of searching and finding texts in the TLG, Ruhleder proposed, left scholars free to pursue other work such as scholarly tool building or the creation of electronic texts. But this process was not without its problems:

Of course, tool building is a form of scholarly work in itself, and databanks and electronic texts are a form of “scholarly production.” However, this kind of activity has traditionally ranked low; developing an index or a concordance ranks above developing teaching materials, but below writing articles, books, commentaries and producing new textual editions. Developing computer-based tools is not even on the list (Ruhleder 1995).

The challenge of evaluating scholarly work in digital classics, and indeed all of digital humanities, as well as the unwillingness of many traditional tenure evaluations to consider digital scholarship, are themes that are seen throughout the literature.

In discussing the second major change, that of shifting skill sets and expertise, Ruhleder considered how technical expertise was being substituted for experience gained over time and how classicists increasingly needed more sophisticated technical knowledge to understand the limitations of tools such as the TLG. She used the third major change, that of challenges to disciplinary infrastructure, to briefly explore issues that are now major discussion points, particularly the challenges of electronic publication to the traditional print infrastructure and the new level of technical infrastructure and support required to create, disseminate, and preserve electronic texts.

Ruhleder concluded by identifying a number of larger issues about the impact of computing on classics. To begin with, she contended that the ways in which the TLG has “flattened,” and in some ways “misrepresented,” the corpus of Greek negatively affect the amount of information available to scholars. In addition, as the TLG moved scholars another step away from the original source text, Ruhleder criticized scholars for simply accepting other scholars’ readings of a text and never consulting the primary text to draw their own conclusions. This second criticism in particular has inspired projects such as Demos,¹¹ an electronic publication of the Stoa Consortium on Athenian Democracy, where every statement is linked back to the primary textual evidence (making particular use of the Perseus Project) on which it is based. Ruhleder wondered why more scholars did not challenge the nature of the TLG corpus or explore larger questions of how computing was affecting their discipline:

Fundamental paradigmatic questions, methodological discussions, and mechanisms for resource allocation and infrastructural development that are appropriately discussed at the level of the community-of-practice often masquerade as individual problems of skill or resources. “We need to rethink what it means for our discipline to take on a technological character” is reduced to “I feel detached from the text” (Ruhleder 1995).

Ruhleder hoped that larger discussions would begin to take place and concluded that the main problem was not so much in what systems could or could not do but in their users’ unwillingness to explore the limitations of such systems and set them in a broader disciplinary context. Many of the challenges raised by Ruhleder are explored in our discussion of the various disciplines of digital classics.

By the early part of this century, many explorations of classics and computers had turned from individual considerations of particular databases to broader examinations of the Internet. In 2000, Hardwick argued that the impact of the Internet on classics had been largest in the areas of communication, publication, dissemination of research, and the development of specialist research tools. She listed several important advances, including improved access to primary texts and secondary research, rapid search tools, research databases, rapidly updated specialist bibliographies, electronic journals, more quickly reviewed academic publications, and new potential with electronic discussion lists and conferences. Teaching in classical studies had also changed over the past 30 years, Hardwick argued, with an increasing focus on history and culture and less focus on language-based learning. Hardwick noted that whereas students used to enter college with a fair amount of linguistic learning, many were now first encountering Greek and Roman culture and history in a variety of venues, which had consequently reinvigorated their interest in language learning. Hardwick also observed that the growing availability of data online was “blurring the lines” between teaching and research, and helping support a growing movement toward new forms of undergraduate research and teaching. McManus and Rubino (2003) also provided a brief overview of the state of Internet resources available in classics, pointing to the impact of these resources on pedagogy. 12

11 http://www.stoa.org/projects/demos/home.

12 For an earlier discussion of both the pitfalls and promise of technology in the teaching of classics see Neal (1990).

Instead of focusing on the uniqueness of classical computing, Greg Crane proposed that classical studies no longer needed its own separate computing history. He noted that the study of antiquity has always been a “data-intensive” enterprise and that all the reference works and critical editions created by classicists were well suited to an electronic environment. He argued that the needs of classicists in 2004 were not so different from those of scholars in other disciplines, and that classicists would need to learn to adapt the tools of other disciplines.

There should not be a history of classics and the computer, for the needs of classicists are simply not so distinctive as to warrant a separate “informatics.” Disciplinary specialists learning the strengths and weaknesses have, in the author's experience, a strong tendency to exaggerate the extent to which their problems are unique and to call for a specialized, domain-specific infrastructure and approach (Crane 2004).

While Melissa Terras has recently agreed with Crane’s call for classicists to work in an interdisciplinary manner and adapt the computational advances and infrastructure from other disciplines, she also noted how logistical and personal issues of disciplinarity need to be considered when pursuing cross-disciplinary work (Terras 2010), adding that these issues have received little research attention. 13 Disciplinarity has presented digital classicists with two key challenges, Terras argued. The first challenge is the difficulty of “forging an identity and gaining recognition” for their work within the traditional discipline of classics; the second, faced by scholars who go beyond traditional classics, is the difficulty in engaging with experts in various computer science disciplines:

Classicists using digital technologies in their research are regularly at the forefront of research in digital humanities, given the range of primary and secondary sources consulted and the array of tools and techniques necessary to interrogate them. However, to adopt new and developing techniques, and to adopt and adapt emergent technologies, the Digital Classicist has to work in the interdisciplinary space between Classics and computing science (Terras 2010, 178).

Two projects that Terras believed illustrated the interdisciplinary vision necessary to pursue successful work in digital classics were eSAD and VERA, both of which require large-scale interdisciplinary teams to conduct their work.

Balancing the needs of individual disciplines such as classics within a larger scholarly cyberinfrastructure that is still useful across disciplines is a challenge echoed by many humanities cyberinfrastructure projects, and will be discussed later in this report. Before moving to questions of infrastructure, whether for classics or for the humanities as a whole, however, the report looks at some large multidisciplinary classical digital libraries, the services they provide, and the state of the art in potential services they might develop.

MULTIDISCIPLINARY CLASSICAL DIGITAL LIBRARIES: ADVANCED TECHNOLOGIES AND SERVICES

The extensive interest in the language, literature, and history of the ancient world is evidenced by the large number of resources available online. Digital collections of classical texts (particularly in Greek and Latin) abound online and have been created both by enthusiasts and by academics. For Latin, some of the larger collections are Corpus Scriptorum Latinorum, IntraText, Lacus Curtius, and the Latin Library; for Latin and Greek, there are the Bibliotheca Augustana and the Internet Classics Archive; and for English translations of Greek mythology, there is the Theoi E-Text collection. 14 Typically these collections involve digital provision of texts that have been typed in manually, and the main form of access is a browsable list of authors and works. Basic services such as keyword searching are also typically implemented.

13 Terras cites Siemens (2009) as one of the few research articles in this area.

One way to distinguish between a digital collection and a digital library is that a library provides a variety of services in addition to an organized collection of objects and texts. Candela et al. (2007) proposed that the definition of digital libraries has expanded recently because “generally accepted conceptions have shifted from a content-centric system that merely supports the organization and provision of access to particular collections of data and information, to a person-centric system that delivers innovative, evolving, and personalized services to users.” Focusing on this concept of services, this section provides overviews of some multidisciplinary classical digital libraries that are large in scope and also provide specialized services to their users. They include the Cuneiform Digital Library Initiative (CDLI), 15 the PDL, 16 and the TLG. 17 We then look at a number of technologies that could be used to provide advanced services for classical digital libraries in general.

The CDLI is a joint project of the University of California, Los Angeles, and the Max Planck Institute for the History of Science. According to its website, CDLI “represents the efforts of an international group of Assyriologists, museum curators and historians of science to make available through the internet the form and content of cuneiform tablets dating from the beginning of writing, ca. 3350 B.C., until the end of the pre-Christian era.” While estimating that there are about 500,000 documents available in private and public collections, the CDLI currently provides online access to more than 225,000 that have been cataloged. The CDLI maintains an extensive website with a full list of included collections, educational resources, a list of related publications, 18 project partners, tools and resources, and extensive documentation about how data are entered and transliterated and how the online catalog was created. Access to the collection is supported by both a general and an advanced search option. The basic search interface supports searching transliterations, the catalog, or both simultaneously; the advanced option also supports searching by publication information, physical information, text content (with language limits), provenience, and chronology. The record for each document includes extensive catalog information, hand-drawn or digital images, and extensive transliterations. 19

The PDL, currently in version 4.0, began in 1985 and has evolved from CD-ROM to the current version of its online digital library. While its flagship collection is a classical collection that includes a large number of Greek and Latin texts (currently 8,378,421 Greek words and 8,696,429 Latin words), multiple English translations, and various reference works such as commentaries, histories, and lexicons, it has other collections in Arabic, Old Norse, and English. An art and archaeology browser also provides access to an extensive collection of images. All the texts in the PDL are encoded in TEI-XML, and most can be downloaded, along with the source code or “hopper” that runs the production digital library. 20 For most Latin and Greek collections, the PDL presents an online reading environment that provides parallel texts (Greek and English, Latin and English) where a user can read a Greek or Latin text aligned with a public domain English translation. Each text in the various collections is extensively hyperlinked to relevant entries in lexicons, dictionaries, and commentaries. The classics collection can be browsed by author name, and a number of sophisticated searching options are available, including multilingual searching (English, Greek, Latin, Old English, German, and Old Norse) of the whole digital library, individual collections, or individual texts. Named-entity searching (people, places, and dates) is available. The PDL also offers several useful linguistic tools, including an English look-up of words in Greek, Latin, and Arabic based on their English definitions, a vocabulary tool, and a word-study tool. While a full overview of the history, services, and collections available at the PDL is beyond the scope of this review, a full list of publications (with most available for download) is available online. 21

14 For a list of these URLs plus some other selected collections, see http://www.delicious.com/AlisonBabeu/clir-review+digital_collections

15 http://cdli.ucla.edu/

16 http://www.perseus.tufts.edu/hopper/

17 http://www.tlg.uci.edu/

18 These publications include the CDLJournal, the CDLBulletin, and CDLNotes. Many of these scholarly and peer-reviewed publications (all of which are freely available online) represent fairly traditional research that has been made possible through the availability of the online collection, but there is also research that has made use both of the CDLI and computational techniques; see, for example, Jaworski (2008).

19 http://cdli.ucla.edu/search/result.ptid_text=P020005&start=0&result_format=single&-op_id_text=eq&size=100

20 http://www.perseus.tufts.edu/hopper/opensource/download
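As a rough illustration of what TEI-XML encoding makes possible for downloaded texts, the following minimal Python sketch reads a small TEI-style fragment and returns each verse line with its book and line number. The sample fragment, its element layout, and the function name extract_lines are assumptions made for this example; they are not taken from the PDL's files or from the hopper source code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified TEI-style fragment (not an actual PDL file).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="book" n="1">
        <l n="1">Arma virumque cano, Troiae qui primus ab oris</l>
        <l n="2">Italiam, fato profugus, Laviniaque venit</l>
      </div>
    </body>
  </text>
</TEI>"""

# Namespace prefix mapping used in the search paths below.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def extract_lines(tei_xml):
    """Return (book, line_number, text) tuples for every <l> inside a <div>."""
    root = ET.fromstring(tei_xml)
    lines = []
    for div in root.iterfind(".//tei:div", TEI_NS):
        for line in div.iterfind("tei:l", TEI_NS):
            text = "".join(line.itertext()).strip()
            lines.append((div.get("n"), line.get("n"), text))
    return lines


for book, number, text in extract_lines(SAMPLE_TEI):
    print(f"{book}.{number}: {text}")
```

Because the book and line numbers are carried in the markup rather than in the page layout, the same citation (for example, book 1, line 2) can be resolved in any edition or translation that is encoded in the same way, which is one reason structured encoding suits aligned parallel texts.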

Founded in 1972, the TLG is arguably the best-known digital library in classics. 22 Based at the University of California, Irvine, the TLG has “collected and digitized most texts written in Greek from Homer (8th c. BC) until the fall of Byzantium in AD 1453 and beyond.” The main goal of the TLG “is to create a comprehensive digital library of Greek literature from antiquity to the present era.” The TLG was first available on magnetic tapes and then on CD-ROM, and since 2001 it has been available online by subscription with its own search engine. 23 The TLG contains more than “105 million words from over 10,000 works associated with 4,000 authors.” The corpus can be browsed by author or searched by author, work title, or TLG number. It recently started supporting lemmatized searching.
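To make the distinction between plain keyword searching and lemmatized searching concrete, the following minimal Python sketch matches every inflected form that shares a dictionary headword with the query. The tiny lemma table, the sample tokens, and the function names are assumptions made for illustration only; they do not describe how the TLG search engine is implemented.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so visually identical Greek strings compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical lemma table: inflected form -> dictionary headword (lemma).
# A real system would derive this from a morphological analyzer.
_RAW_LEMMAS = {
    "λόγος": "λόγος",
    "λόγου": "λόγος",
    "λόγον": "λόγος",
    "λόγοι": "λόγος",
    "ἄνθρωπος": "ἄνθρωπος",
    "ἀνθρώπου": "ἄνθρωπος",
}
LEMMAS = {nfc(form): nfc(lemma) for form, lemma in _RAW_LEMMAS.items()}


def lemma_of(token):
    """Look up the lemma of a token; unknown tokens stand for themselves."""
    token = nfc(token)
    return LEMMAS.get(token, token)


def keyword_search(tokens, query):
    """Return positions of tokens that match the query string exactly."""
    query = nfc(query)
    return [i for i, tok in enumerate(tokens) if nfc(tok) == query]


def lemma_search(tokens, query):
    """Return positions of tokens that share a lemma with the query."""
    target = lemma_of(query)
    return [i for i, tok in enumerate(tokens) if lemma_of(tok) == target]


tokens = ["λόγον", "ἀνθρώπου", "λόγοι", "λόγος"]
print(keyword_search(tokens, "λόγος"))  # only the exact form
print(lemma_search(tokens, "λόγος"))    # every form of the same headword
```

In a highly inflected language such as Greek, an exact-string search for the headword misses most occurrences of a word, which is why lemmatized searching matters for a corpus of this kind.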

Large digital libraries of classical materials such as those listed above are highly curated and offer specialized services. As mass-digitization projects such as Google Books and the OCA continue to put materials in Greek, Latin, and other historical languages such as Sanskrit online, scaling these services to meet the needs of larger collections is increasingly important. In addition, there are many reference tools online such as bibliographies and directories that will likely need to adapt to the scale of million-book classical digital libraries. The rest of this section focuses on some of the important types of cross-disciplinary services, tools, and technologies available for digital classics and on how such domain-specific tools might be of use in a larger classical cyberinfrastructure.

Bibliographies/Catalogs/Directories

The wealth of research and finding tools for digital classics matches the large number of digital collections and digital libraries of classical materials available. This subsection provides an overview of some of these tools, including bibliographical research databases, bibliographies, catalogs, and portals/resource directories.

L'Année Philologique (APh) 24 is considered to be the preeminent research tool for secondary literature within the fields of classics and is published by the Société Internationale de Bibliographie Classique (overseen by Eric Rebillard) along with the APA and the Database of Classical Bibliography (managed by Dee L. Clayman), with support from both the Centre National de la Recherche Scientifique and the National Endowment for the Humanities (NEH). The goal of the APh is to annually collect scholarly works relating to all aspects of Greek and Roman civilization, not just within classics but also from the “auxiliary” disciplines of archaeology, epigraphy, numismatics, papyrology, and palaeography. The APh has collaborating teams in France, Germany, Italy, Spain, Switzerland, and the United States, and item abstracts and database searching are available in the languages of these countries. Every year a printed volume is created of the entire bibliography; this is then uploaded into the database where it can be searched by author of the scholarly work, full text (of the abstract), ancient author and text that is referenced, and subject and discipline. Currently, the APh includes bibliographic records for works published through 2008. It is available through subscription.

21 http://www.perseus.tufts.edu/hopper/about/publications

22 Numerous articles have provided overviews of the TLG and its history; see, for example, Brunner (1987), Brunner (1991), and Pantelia (2000).

23 There are several other large subscription databases of classical texts, including the Brepols Library of Latin Texts-Series A and B (http://www.brepols.net/publishers/pdf/Brepolis_LLT_En.pdf), the Bibliotheca Teubneriana Latina (http://www.degruyter.de/cont/fb/at/detailEn.cfmisbn=9783110214567), and the Thesaurus Linguae Latinae.

24 http://www.annee-philologique.com/aph/

Two freely available bibliographic databases also exist for classical studies. Gnomon Online 25 is maintained by Jürgen Malitz of Catholic University Eichstätt-Ingolstadt and Gregor Weber of the University of Augsburg. Its main interface is in German, and the bibliographic metadata in the database can be searched or browsed by a thesaurus. TOCS-IN 26 provides access to the tables of contents from about 185 journals (and thus more than 45,000 articles) in classics, Near Eastern Studies, and religion, both in a text format and through a web program. According to the website, access is provided to full-text articles about 15 percent of the time. TOCS-IN is an entirely volunteer project that began to archive tables of contents in 1992 and is currently managed by PMW Matheson in Toronto. Some 80 volunteers from 16 countries contribute tables of contents to this service. TOCS-IN can be either searched or browsed.

The digital environment has provided a useful way of updating and providing access to bibliographies. One of the oldest bibliographies online is Abzu, 27 which since 1994 has provided a guide to “networked open access data relevant to the study and public presentation of the Ancient Near East and the Ancient Mediterranean world.” Charles E. Jones, head librarian at ISAW, manages Abzu, and resources include websites as well as open-access electronic publications. The collection can either be browsed by author or searched by a variety of criteria.

Another significant online bibliography is the “Checklist of Greek, Latin, Demotic and Coptic Papyri, Ostraca and Tablets,” 28 which was created to provide for both librarians and scholars a comprehensive “bibliography of all monographic volumes, both current and out-of-print, of Greek, Latin, Demotic and Coptic documentary texts on papyrus, parchment, ostraca or wood tablets.” According to the website, this checklist also sought to “establish a standard list of abbreviations for editions of Greek texts” and thus serves as a convenient source for finding the abbreviations used for various monograph collections as well as a number of periodicals that are often required to fully decipher citations (e.g., BKT = Berliner Klassikertexte).

A major bibliography regarding the reception of classical texts 29 by later authors is “The Traditio Classicorum,” 30 created by Charles H. Lohr of the University of Freiburg. Available in German and English, this website contains “a bibliography of secondary literature concerning the fortuna of classical authors to the year 1650.” The bibliography is arranged by the common Latin names of authors (whether the author wrote in Arabic, Greek, or Latin) and the entries for authors have been divided into general works with specific titles arranged chronologically.

25 http://www.gnomon.ku-eichstaett.de/Gnomon/en/Gnomon.html

26 http://www.chass.utoronto.ca/amphoras/tocs.html

27 http://www.etana.org/abzu/

28 http://scriptorium.lib.duke.edu/papyrus/texts/clist.html

29 Another project that explores the continuing reception of classical texts, albeit with a focus on classical drama, is the Archive of Performances of Greek and Roman Drama Database (http://www.apgrd.ox.ac.uk/database.htm), which “offers information on more than 9,000 productions of ancient Greek and Roman drama performed internationally on stage, screen, and radio from the Renaissance to the present day.”

30 http://www.theol.uni-freiburg.de/forschung/projekte/tcdt/index_en.html

31 http://www.trismegistos.org/ldab/index.php

The LDAB (Leuven Database of Ancient Books), 31 now a participant in the Trismegistos portal, also supports research into the reception of classical texts. This database collects basic information on ancient literary texts or works (rather than documents) from the fourth century BC to 800 AD and includes more than 3,600 “anonymous” texts. According to the website:

Text editions by classical philologists and patristic scholars are usually based upon medieval manuscripts, dating many centuries after the work in question was first written down and transmitted by copies from copies from copies. Here the user will find the oldest preserved copies of each text. At the same time he will get a view of the reception of ancient literature throughout the Hellenistic, Roman and Byzantine period: which author was read when, where and by whom throughout Antiquity.

Because LDAB focuses on books, this project has excluded documentary texts and references to inscriptions. The database has a variety of advanced searching features, including publication, editor, catalogs, ancient author, book, century, date, provenance, nome/region, material (papyrus, parchment), bookform, language/script, and script type (Greek, Latin, Demotic, Coptic). For example, searching on the author Herodotus provides a list of documents that have discussed or made references to his history. The LDAB provides an excellent way to study the reception of various classical authors and provides links into various papyri collections.

Rather than providing a bibliography on a particular topic, Pinax Online 32 offers an annotated list of links to online bibliographies regarding the ancient Greek world. This website is maintained by Marc Huys of the Department of Classical Studies, Katholieke Universiteit Leuven, and links are included to general bibliographies of the Greek world, bibliographies for individual Greek authors, and thematic bibliographies (literature, linguistics, mythology and religion, history, and archaeology).

In addition to bibliographies, there are a number of important online catalogs for finding digital classical and medieval materials. LATO, or Library of Ancient Texts Online, 33 has the goal of providing the Internet’s “most thorough catalogue of online copies of ancient Greek texts, both in Greek and in translation.” This website does not host any actual texts but instead maintains a set of extensive links to Greek texts and their translations on other sites such as Bibliotheca Augustana, Perseus, Project Gutenberg, and Theoi Greek Mythology. The links to texts are organized in alphabetical order by author, with a list of the author’s works then organized by title. This catalog also includes links to many fragmentary authors.

Another useful resource is the “Catalogue of Digitized Medieval Manuscripts,” 34 which was created by the University of California, Los Angeles. This catalog attempts to provide a straightforward way of discovering medieval manuscripts on the web and is labeled as a “work in progress.” The catalog currently contains more than 3,000 manuscript descriptions that can be searched by keyword, and the catalog can also be browsed by location, shelfmark, author, title, or language (including Arabic, Greek, and Latin). Each manuscript description contains a link to the digitized manuscript.

A number of specialized research portals and directories to classical materials that encompass catalogs are also available online. The German website KIRKE (Katalog der Internetressourcen für die Klassische Philologie), 35 created by Ulrich Schmitzer, includes an extensive online directory of resources on the Internet for classicists. Another helpful resource is SISYPHOS, 36 a searchable directory of more than 2,100 cataloged Internet resources created by UB Heidelberg. This resource provides access to “Classical Archaeological, Ancient Near Eastern and Egyptological websites,” including subject portals, databases of images, mailing lists, and discussion forums. One useful feature is that this site provides a “full-text search” of the websites it has cataloged, rather than just a search of the metadata of the resource descriptions. Interfaces to this website are available in English and German.

32 https://perswww.kuleuven.be/~u0013314/pinaxonline.html - Specifiek

33 http://sites.google.com/site/ancienttexts/

34 http://manuscripts.cmrs.ucla.edu/languages_list.php

35 http://www.kirke.hu-berlin.de/ressourc/ressourc.html

36 http://vifa.ub.uni-heidelberg.de/sisyphos/servlet/de.izsoz.dbclear.query.browse.Query/domain=allg//lang=enquerydef=query-simple

One of the most extensive resources that provides coverage of multiple disciplines within classics is Propylaeum: A Virtual Library of Classical Studies. 37 This subject portal 38 encompasses eight areas of classical studies: Egyptology, Ancient History, Ancient Near Eastern Studies, Byzantine Studies, Classical Archaeology, Classical Philology, Medieval and Neo-Latin Philology, and Pre- and Early History. The entire collection of multidisciplinary resources can be searched at one time or a full list of resources for individual disciplines can be browsed. Each subject has its own subportal that includes a definition of the subject, a list of specialist library catalogs, new acquisitions for partner collections, a list of traditional and e-journals, a list of subject databases and digital collections, links to more general Internet resources, and a list of specialized academic and research services. Six academic and museum project partners, including the Bavarian State Library (Munich) and the Institute of Classical Philology at the Humboldt University, Berlin, are involved in developing and maintaining this academic portal.

Finally, DAPHNE (Data in Archaeology, Prehistory and History on the Net) 39 is a freely available portal that provides a single point of access to subject-oriented bibliographic databases in prehistory, protohistory, archaeology, and the sciences of antiquity, until about 1000 AD. DAPHNE combines resources from three French databases: BAHR (Bulletin Analytique d’histoire Romaine), FRANCIS, and FRANTIQ-CCI. Users of DAPHNE can search across the bibliographic records in these databases in Dutch, English, French, German, Italian, and Spanish.

Document Analysis, Recognition, and Optical Character Recognition for Historical Languages

While the major classical digital libraries described above support searching across their collections in the original languages such as Latin and Greek, implementing such techniques in a scalable manner for ever-growing and mass-digitized collections is a far more challenging task. Conventional OCR systems have only a limited ability to work with historical languages such as Latin, and languages such as Greek and Sanskrit are even more problematic. Special document-recognition-and-analysis systems have been developed to deal with many of the issues these languages present, and the research literature on historical document analysis and recognition is extensive. 40 Some research is starting to explore how technologies that have been developed might be used to solve problems across historical languages or types of object, whether it is an inscription, a papyrus, or a palimpsest. 41 This section summarizes the use of document analysis and OCR technologies to support information access to documents in classical languages and dialects such as ancient Greek, Latin, Sanskrit, Sumerian, and Syriac.

37 http://www.propylaeum.de/interdisciplinary.html

38 It also searches both KIRKE and SISYPHOS.

39 http://www.daphne.cnrs.fr/daphne/search.html;jsessionid=5636B7A687E5E01429A1FD79CC88168B

40 A full review of this literature is beyond the scope of this paper. For some recent overviews in terms of digital libraries see Sankar et al. (2006) and Choudhury et al. (2006).

41 A conference held in 2010, “Digital Imaging of Ancient Textual Heritage: Technological Challenges and Solutions” (http://www.eikonopoiia.org/home.html) explored these issues in depth, and the full proceedings are available online (http://www.eikonopoiia.org/files/Eikonopoiia-2010-Proceedings.pdf). For a multilingual approach to manuscripts, see Leydier et al. (2009).

One major project that has recently been funded in this area is “New Technology for Digitization of Ancient Objects and Documents,” a joint project of the Archaeological Computing Research Group (ACRG) and the School of Electronics and Computer Science (ECS), Southampton; the Centre for the Study of Ancient Documents (CSAD), Oxford; the CDLI, Los Angeles-Philadelphia-Oxford-Berlin; and the Electronic Text Corpus of Sumerian Literature (ETCSL), Oxford. 42 This project has received a 12-month Arts and Humanities Research Council (AHRC) grant to develop a “Reflectance Transformation Imaging (RTI) System for Ancient Documentary Artefacts.” The team plans to develop two RTI systems that can be used to capture high-quality digital images of documentary texts and archaeological materials. The initial testing will be conducted on stylus tablets from Vindolanda, stone inscriptions, Linear B, and cuneiform tablets.

Other relevant research is being conducted by the IMPACT (Improving Access to Text) 43 project. Funded by the European Commission, this project is exploring how to develop advanced OCR methods for historical texts, particularly in terms of the use of OCR in mass digitization processes. 44 While their research is not specifically focused on developing techniques for classical languages, Latin was the major language of intellectual discourse in Europe for centuries, so techniques adapted for either manuscripts or early printed books would be useful to classical scholarship and beyond.

Ancient Greek

Only a limited amount of work has considered using automatic techniques in the optical recognition of ancient or classical Greek. While some recent research has focused on the development of OCR for “Old Greek” historical manuscripts, 45 little work has explored developing techniques for either manuscripts or printed editions of Ancient Greek texts.

Some preliminary work in developing an automatic-recognition methodology for Ancient Greek is detailed by Stewart et al. (2007). In these authors’ initial survey of Greek editions, they found that on average almost 14 percent of the Greek words on a text page were found in the notes or apparatus criticus. The authors first used a multi-tiered approach to OCR that applied two major post-processing techniques to the output of two commercial OCR packages, ABBYY FineReader (8.0) 46 and Anagnostis 4.1. During this experiment, they found that character accuracy on simple uncorrected text averaged about 98.57 percent. Other preliminary experiments with OCR-generated text revealed that the uncorrected OCR could serve as searchable corpora. Even when working with a mid-nineteenth-century edition of Aristotle in a nonstandard Greek font, searches of the OCR-generated text typically provided better recall than searches of texts that had been manually typed because the OCR text included variant readings found in the notes. In a second experiment, the automatic correction of single texts was performed using a list of one million Greek words and the Morpheus Greek morphological analyzer that was developed by the PDL.
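The idea behind single-text correction can be illustrated with a short Python sketch. It is not the authors' implementation: the three-word list below is a hypothetical stand-in for the one-million-word list, the standard library's difflib similarity stands in for whatever scoring was actually used, and the Morpheus analyzer is not called at all. The function names and the cutoff value are assumptions made for illustration.

```python
import difflib
import unicodedata


def normalize(word):
    """Use one Unicode form (NFC) so accented Greek letters compare consistently."""
    return unicodedata.normalize("NFC", word)


def correct_token(token, lexicon, cutoff=0.75):
    """Return the token if the lexicon attests it, else the closest attested form."""
    token = normalize(token)
    if token in lexicon:
        return token
    # get_close_matches ranks candidates by difflib's similarity ratio.
    matches = difflib.get_close_matches(token, sorted(lexicon), n=1, cutoff=cutoff)
    return matches[0] if matches else token  # leave unmatched words untouched


def correct_text(tokens, word_list):
    """Correct every OCR token independently against the word list."""
    lexicon = {normalize(w) for w in word_list}
    return [correct_token(t, lexicon) for t in tokens]


# Hypothetical word list and OCR output (the second token ends in a misread letter).
word_list = ["μῆνιν", "ἄειδε", "θεά"]
ocr_tokens = ["μῆνιν", "ἄειδc", "θεά"]
print(correct_text(ocr_tokens, word_list))
```

A word that has no sufficiently similar entry in the list is left unchanged, which matters for Greek editions because proper names and rare forms are often missing from any word list.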

For their third experiment, Stewart <strong>and</strong> colleagues used the OCR output of multiple editi<strong>on</strong>s of the<br />

same work to correct <strong>on</strong>e another <str<strong>on</strong>g>in</str<strong>on</strong>g> a three-step process. First, different editi<strong>on</strong>s of a text were aligned<br />

by f<str<strong>on</strong>g>in</str<strong>on</strong>g>d<str<strong>on</strong>g>in</str<strong>on</strong>g>g unique str<str<strong>on</strong>g>in</str<strong>on</strong>g>gs <str<strong>on</strong>g>in</str<strong>on</strong>g> each. Sec<strong>on</strong>d, if an error word was found <str<strong>on</strong>g>in</str<strong>on</strong>g> <strong>on</strong>e text, a fuzzy search was<br />

performed <str<strong>on</strong>g>in</str<strong>on</strong>g> the aligned parallel text to try to locate the correct form. F<str<strong>on</strong>g>in</str<strong>on</strong>g>ally, <strong>on</strong>ce error words <str<strong>on</strong>g>in</str<strong>on</strong>g> a<br />

base text had been matched aga<str<strong>on</strong>g>in</str<strong>on</strong>g>st potential ground truth counterparts <str<strong>on</strong>g>in</str<strong>on</strong>g> the parallel texts, rules<br />

42 http://www.southampt<strong>on</strong>.ac.uk/archaeology/news/news_2010/acrg_dedefi_ma<str<strong>on</strong>g>in</str<strong>on</strong>g>.shtml<br />

43 http://www.impact-project.eu/home/<br />

44 For a recent overview of some of the IMPACT project’s research, see Ploeger et al. (2009).<br />

45 For an example, see Ntzios et al. (2007).<br />

46 http://www.abbyy.com/


14<br />

generated by the decision tree program (C4.5) were used to determine the more likely variant. The<br />

authors found that the parallel-text correction rate was consistently higher than the single-text<br />

correction rate by between 5 percent and 16 percent. Baseline character accuracy in this final<br />

experiment rose to 99.49 percent.<br />
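The second step of this process, the fuzzy search in an aligned parallel text, can be illustrated with a minimal sketch. It assumes the two editions have already been aligned into corresponding token lists, and it uses the similarity ratio of Python's standard difflib module as a stand-in for whatever fuzzy matching Stewart et al. (2007) actually employed. The function name, the lexicon, the transliterated sample tokens, and the 0.6 cutoff are all illustrative assumptions, and the C4.5 rule step is not modeled.

```python
import difflib


def correct_from_parallel(base_tokens, parallel_tokens, lexicon, cutoff=0.6):
    """Replace tokens missing from `lexicon` with their closest parallel-text token.

    A token counts as an "error word" when it is not in the lexicon
    (an assumption made for this sketch). Tokens with no sufficiently
    similar counterpart are left unchanged.
    """
    corrected = []
    for token in base_tokens:
        if token in lexicon:
            corrected.append(token)
            continue
        matches = difflib.get_close_matches(token, parallel_tokens, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return corrected


# Hypothetical OCR output of one edition and an aligned parallel edition.
base = ["menin", "aeicle", "thea"]
parallel = ["menin", "aeide", "thea"]
lexicon = {"menin", "aeide", "thea"}
print(correct_from_parallel(base, parallel, lexicon))
```

In a fuller pipeline the candidate returned here would be only one input to a later decision step rather than an automatic replacement.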

The ability to search text variants and to automatically collate various editions of the same work in a<br />

digital library through the use of OCR and a number of automated techniques offers new research<br />

opportunities. In addition, the work of Stewart et al. provides useful lessons in how curated digital<br />

corpora, automated methods, and million-book libraries can be used to create new, more sophisticated<br />

digital libraries:<br />

By situating corpus production within a digital library (i.e., a collection of authenticated digital<br />

objects with basic cataloging data), exploiting the strengths of large collections (e.g., multiple<br />

editions), and making judicious use of practical automated methods, we can start to build new<br />

corpora on top of our digital libraries that are not only larger but, in many ways, more useful<br />

than their manually constructed predecessors (Stewart et al. 2007).<br />

Further research reported by Boschetti et al. (2009) was informed by the preliminary techniques<br />

reported in Stewart et al. (2007) but also extended them, since the initial work did not include the<br />

recognition of Greek accents and diacritical marks.<br />

Boschetti et al. (2009) conducted a series of experiments in an attempt to create a scalable workflow<br />

for producing highly accurate OCR of Greek text. This workflow used progressive multiple alignment<br />

of the OCR output of two commercial products (Anagnostis, Abbyy FineReader) and one open-source 47<br />

OCR engine (OCRopus), which was not available when Stewart et al. (2007) conducted their<br />

research. Multiple editions of Athenaeus’ Deipnosophistae, one edition of Aeschylus, and a 1475<br />

edition of Augustine’s De Civitate Dei were used for the OCR experiments. This research determined<br />

that the accuracy of each engine depended heavily on the training set created for it, but it also<br />

revealed that in several cases OCRopus obtained better results than either commercial option. The<br />

highest accuracy level (99.01 percent), which was for sample pages from the fairly recent Loeb edition<br />

of Athenaeus, was obtained through the use of multiple progressive alignment and a spell-checking<br />

algorithm. (Accuracy levels on earlier editions of Athenaeus ranged from 94 percent to 98 percent.)<br />

The addition of accents did produce lower character accuracy results than those reported by Stewart et<br />

al., but at the same time, accents are an important part of Ancient Greek, and any OCR system for this<br />

language will ultimately need to include them. This research also demonstrated that OCRopus, a<br />

relatively new open-source OCR engine, could produce results comparable to those of expensive<br />

commercial products.<br />
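To make the idea of combining several engines concrete, the following is a minimal sketch of a voting step that could follow alignment. It assumes the outputs of the three engines have already been aligned into strings of equal length, with a gap symbol marking positions where an engine produced nothing. The gap symbol, the function name, and the sample strings are assumptions for illustration. The progressive multiple alignment and spell-checking stages described by Boschetti et al. (2009) are not reproduced here.

```python
from collections import Counter

GAP = "-"  # assumed placeholder inserted by an earlier alignment step


def vote(aligned_outputs):
    """Choose the most frequent character in each column of aligned OCR outputs."""
    if len({len(s) for s in aligned_outputs}) != 1:
        raise ValueError("aligned outputs must be non-empty and of equal length")
    result = []
    for column in zip(*aligned_outputs):
        counts = Counter(column)
        winner = max(counts, key=counts.get)
        if winner != GAP:  # a winning gap means "no character here"
            result.append(winner)
    return "".join(result)


# Hypothetical aligned outputs from three engines for the same word.
outputs = ["Deipnosophistae", "Deipnosoph1stae", "Deipn0sophistae"]
print(vote(outputs))
```

A simple majority vote treats all engines as equally reliable. Because the source reports that accuracy depended on each engine's training set, a weighted vote would be a natural refinement.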

While both Stewart et al. (2007) and Boschetti et al. (2009) focused on using OCR to recognize printed<br />

editions of Ancient Greek, a variety of both classical scholarship and document-recognition research 48<br />

has been conducted on the Archimedes Palimpsest, 49 a thirteenth-century prayer book that contains<br />

erased texts that were written several centuries before, including previously “lost” treatises by<br />

Archimedes and Hypereides. This manuscript has since been digitized, and the images created of the<br />

47 http://code.google.com/p/ocropus/<br />

48 A palimpsest is a manuscript “on which more than one text has been written with the earlier writing incompletely erased and still visible”<br />

(http://wordnetweb.princeton.edu/perl/webwns=palimpsest). For a full list of research publications using the Archimedes Palimpsest, see<br />

http://www.archimedespalimpsest.org/bibliography1.html<br />

49 http://www.archimedespalimpsest.org/



manuscript pages and the transcriptions of the text are available for download online. 50 Scholars are<br />

working with digital images rather than the manuscript itself, and scholars from diverse disciplines,<br />

including palaeography, the history of mathematics and science, and Byzantine liturgy, have done<br />

extensive work with this palimpsest. Much of the image-processing work with the palimpsest has<br />

focused on developing algorithms to extract the text of Archimedes in particular from page images.<br />

Salerno et al. (2007) used principal component analysis (PCA) and independent component analysis<br />

(ICA) techniques to extract “clean maps of the primary Archimedes text, the overwritten text, and the<br />

mold pattern present in the pages” from 14 hyperspectral images of the Archimedes Palimpsest. Their goals were<br />

to provide better access to the text and to develop techniques that could be used in other palimpsest-digitization<br />

projects. The authors also report that:<br />

A further aspect of the problem is to partly automate the reading and transcription tasks. This<br />

cannot be intended as a substitution of the human experts in a task where they perform better<br />

than any presently conceivable numerical strategy, but as an acceleration of the human work<br />

(Salerno et al. 2007).<br />

The importance of not replacing expert scholars with systems but rather of developing tools that assist<br />

them in their traditional tasks is a theme seen throughout the literature.<br />

Other significant work in the area of providing access to fragile manuscripts has been conducted by the<br />

EDUCE (Enhanced Digital Unwrapping for Conservation and Education) Project. 51 Investigators on<br />

this National Science Foundation–funded project have been working to develop systems that support<br />

the “virtual unwrapping and visualization of ancient texts.” According to their website:<br />

The overall purpose is to capture in digital form fragile 3D texts, such as ancient papyrus and<br />

scrolls of other materials using a custom built, portable, multi-power CT scanning device and<br />

then to virtually “unroll” the scroll using image algorithms, rendering a digital facsimile that<br />

exposes and makes legible inscriptions and other markings on the artifact, all in a non-invasive<br />

process.<br />

Some of the EDUCE Project’s image-processing techniques have been used by the Homer Multitext 52<br />

Project as described by Baumann and Seales (2009), who presented an application of image-registration<br />

techniques, or the “process of mapping a sensed image into the coordinate system of a<br />

reference image,” to the Venetus A manuscript of the Iliad used in this project. The Homer Multitext<br />

Project included 3-D scanning as part of its digitization strategy, but as the 3-D scanning system<br />

acquired un-textured 3-D models, a “procedure to register the 2D photography to the 3D scans was<br />

performed periodically.” During one photography session it was discovered that technical issues had<br />

produced a number of images of poor quality. Although these images were reshot, time constraints<br />

prevented performing the 3-D geometry capture for these pages again. The result was a number of<br />

folios that had two sets of data (a “dirty” image that had registered 3-D geometry and a “clean” image<br />

with no associated geometry) to which the project wished to apply digital flattening algorithms. The<br />

main computational problem was thus to determine a means of obtaining a “high-quality deformation<br />

of the ‘clean image’ such that the text was in the same position as the ‘dirty image’” that would then<br />

allow them to “apply digital flattening using the acquired corresponding 3D geometry.”<br />

50 http://archimedespalimpsest.net/<br />

51 http://www.stoa.org/educe/<br />

52 http://chs.harvard.edu/wa/pageRtn=ArticleWrapper&bdc=12&mn=1169



The image-registration algorithm developed by Baumann and Seales was successful, and the authors<br />

rightly concluded that:<br />

High-resolution, multispectral digital imaging of important documents is emerging as a<br />

standard practice for enabling scholarly analysis of difficult or damaged texts. As imaging<br />

techniques improve, documents are revisited and re-imaged, and registration of these images<br />

into the same frame of reference for direct comparison can be a powerful tool (Baumann and<br />

Seales 2009).<br />

The work of the EDUCE Project illustrates how the state of the art is being used to provide new levels<br />

of access to valuable and damaged manuscripts.<br />

Latin<br />

In light of the extensive digitization of cultural heritage materials such as manuscripts and the large<br />

number of Latin texts that are becoming available through massive digitization projects, the development of techniques for<br />

improving access to these materials is an area of growing research that is examined in this subsection.<br />

A variety of approaches have been explored for improving access to Latin manuscripts. Leydier et al.<br />

(2007) explored the use of “word-spotting” to improve information retrieval of textual data in<br />

primarily Latin medieval manuscript images. They describe the technique as follows:<br />

In practice, word-spotting consists in retrieving all the occurrences of an image of a word. This<br />

template word is selected by the user by outlining one occurrence on the document. It results in<br />

the system proposing a sorted list of hits that the user can prune manually. … Word-spotting is<br />

based on a similarity or a distance between two images, the reference image defined by the user<br />

and the target images representing the rest of the page or all the pages of a multi-page<br />

document. Contrary to text query on a document processed by OCR, a word-image query can<br />

be sensitive to the style of the writing or the typography used. This technique is used when<br />

word recognition cannot be done, for example on very deteriorated printed documents or on<br />

manuscripts (Leydier et al. 2007).<br />

The authors report that the main drawback of this approach is that a user has to select a keyword in a<br />

manuscript image (typically based on an ASCII transcript) as a basis for further image retrieval, which limits<br />

their approach to retrieval of other images by word only.<br />
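The core of word-spotting as described in the quotation, ranking regions of a page by their distance from a user-selected template image, can be sketched in a few lines. This toy version assumes binarized images stored as rows of "#" and "." characters and uses a plain pixel-mismatch count as the distance. The function name and the sample data are hypothetical, and Leydier et al. (2007) use considerably more robust image features than this.

```python
def spot_word(page, template):
    """Return (distance, row, col) for every template-sized window, best first.

    `page` and `template` are lists of equal-length strings made of "#"
    (ink) and "." (background). Distance is the number of mismatching pixels.
    """
    th, tw = len(template), len(template[0])
    hits = []
    for r in range(len(page) - th + 1):
        for c in range(len(page[0]) - tw + 1):
            distance = sum(
                page[r + i][c + j] != template[i][j]
                for i in range(th)
                for j in range(tw)
            )
            hits.append((distance, r, c))
    return sorted(hits)


# Hypothetical page with one exact and one slightly degraded occurrence.
page = [
    "..##....##..",
    "..#.....###.",
    "..##....##..",
]
template = ["##", "#.", "##"]
for distance, row, col in spot_word(page, template)[:2]:
    print(distance, row, col)
```

The sorted list corresponds to the "sorted list of hits that the user can prune manually". The degraded occurrence is still ranked near the top, which is what makes the approach usable on damaged pages.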

Another approach, presented by Edwards et al. (2004), trained a generalized Hidden Markov Model<br />

(gHMM), using a transcription of a Latin manuscript to obtain a transition model and one example<br />

each of 22 letters to create an emission model. Their transition model for unigrams, bigrams, and<br />

trigrams was fitted using the Latin Library’s electronic version of Caesar’s Gallic Wars, and their<br />

emission model was trained on 22 glyphs taken from a twelfth-century manuscript of Terence’s<br />

Comoediae. In contrast to Leydier et al., the authors argued that word-spotting was not entirely<br />

appropriate for a highly inflected language such as Latin:<br />

Manmatha et al. … introduce the technique of “word spotting,” which segments text into word<br />

images, rectifies the word images, and then uses an aligned training set to learn<br />

correspondences between rectified word images and strings. The method is not suitable for a<br />

heavily inflected language, because words take so many forms. In an inflected language, the<br />

natural unit to match to is a subset of a word, rather than a whole word, implying that one



should segment the text into blocks, which may be smaller than words, while recognizing<br />

(Edwards et al. 2004).<br />

In their model, Edwards and colleagues chose not to model word-to-word transition probabilities since<br />

word order in Latin is highly arbitrary. The method had reasonable accuracy: 75 percent of the letters<br />

were correctly transcribed, and the searching ability was reported to be relatively strong.<br />
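The division of labor between a transition model fitted on electronic text and an emission model fitted on glyph examples can be illustrated with an ordinary letter-level hidden Markov model. The sketch below is a simplification under stated assumptions: it uses bigrams only (the source also mentions unigrams and trigrams), a uniform starting distribution, add-one smoothing, and hand-set toy emission probabilities. It is a plain HMM with Viterbi decoding, not the generalized HMM of Edwards et al. (2004), and all names are hypothetical.

```python
import math
from collections import Counter


def bigram_model(text, alphabet):
    """Estimate add-one-smoothed log P(next letter | current letter) from text."""
    letters = [ch for ch in text.lower() if ch in alphabet]
    pairs = Counter(zip(letters, letters[1:]))
    firsts = Counter(letters[:-1])
    return {
        a: {
            b: math.log((pairs[(a, b)] + 1) / (firsts[a] + len(alphabet)))
            for b in alphabet
        }
        for a in alphabet
    }


def viterbi(observations, alphabet, log_trans, log_emit):
    """Return the most likely letter string for a sequence of observed glyphs.

    log_emit[letter][glyph] is log P(glyph | letter); the start
    distribution is assumed uniform and therefore omitted.
    """
    best = {s: (log_emit[s][observations[0]], [s]) for s in alphabet}
    for obs in observations[1:]:
        step = {}
        for s in alphabet:
            prev = max(alphabet, key=lambda p: best[p][0] + log_trans[p][s])
            score = best[prev][0] + log_trans[prev][s] + log_emit[s][obs]
            step[s] = (score, best[prev][1] + [s])
        best = step
    return "".join(max(best.values(), key=lambda item: item[0])[1])


# A transition model fitted on a short sample of Latin text.
sample = "gallia est omnis divisa in partes tres"
latin_alphabet = sorted(set(sample) - {" "})
latin_trans = bigram_model(sample, latin_alphabet)

# Toy decoding: glyph "x" is equally likely to be either letter,
# so the transition model has to resolve the ambiguity.
toy_trans = {"a": {"a": math.log(0.2), "b": math.log(0.8)},
             "b": {"a": math.log(0.7), "b": math.log(0.3)}}
toy_emit = {"a": {"A": math.log(0.49), "B": math.log(0.01), "x": math.log(0.5)},
            "b": {"A": math.log(0.01), "B": math.log(0.49), "x": math.log(0.5)}}
print(viterbi(["A", "x", "x"], "ab", toy_trans, toy_emit))
```

Working at the level of letters rather than words matches the authors' decision not to model word-to-word transitions.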

Some research with document analysis of Latin manuscripts has focused on assisting palaeographers.<br />

The discipline of palaeography is explored further in its own subsection, but in general, palaeography<br />

is the study of the writing styles of ancient documents. 53 Moalla et al. (2006) conducted automatic analysis of<br />

the writing styles of ancient Latin manuscripts from the eighth to the sixteenth centuries and focused<br />

on the extraction of “sufficiently discriminative features” to be able to differentiate between<br />

sufficiently large numbers of Latin writings. A number of problems complicated their image analysis,<br />

including the complexity of the shapes of letters, hybrid writing styles, poor manuscript quality,<br />

overlapping lines and words, and poor-quality manuscript images. Their discriminant analysis of 15<br />

Latin classes achieved a classification-accuracy rate of only 59 percent in their first iteration, but the<br />

elimination of four classes that were not statistically well-represented increased the rate to 81 percent.<br />

Another key area of technology research is the development of techniques for digitizing and searching<br />

incunabula, or early printed books, a large number of which were printed in Latin. One major project<br />

in this area is CAMENA: Latin Texts of Early Modern Europe, 54 hosted by the University of<br />

Mannheim. Their digital library includes five collections: a collection of Neo-Latin poetry composed<br />

by German authors available as images and machine-readable texts; a collection of Latin historical and<br />

political writing from early modern Germany; a reference collection of dictionaries and handbooks<br />

from 1500–1750 that helps provide a reading environment; a corpus of Latin letters written by German<br />

scholars between 1530 and 1770; and a collection of early printed editions of Italian Renaissance<br />

humanists born before 1500. This project also includes the Termini and Lemmata databases, which are<br />

now part of the eAQUA Project. The wealth of Neo-Latin materials online is well documented by the<br />

“Philological Museum: An Analytic Bibliography of On-Line Neo Latin Texts,” 55 an extensive<br />

website created by Dana F. Sutton of the University of California, Irvine, that since 1999 has served as<br />

an “analytic bibliography of Latin texts written during the Renaissance and later that are freely<br />

available to the general public on the Web” and includes more than 33,960 records.<br />

Digitizing incunabula, or books printed before 1500, poses a number of challenges, as outlined by<br />

Schibel and Rydberg-Cox (2006) and Rydberg-Cox (2009). As they explained:<br />

The primary challenges arise from the use of nonstandard typographical glyphs based on<br />

medieval handwriting to abbreviate words. Further difficulties are posed by the practice of<br />

inconsistently marking word breaks at the end of lines and of reducing or even eliminating<br />

spacing between some words (Rydberg-Cox 2009).<br />

In addition, such digitized texts are often presented to a modern audience only after an extensive<br />

amount of editing and annotation has occurred, a level of editing that is not scalable to million-book<br />

libraries.<br />

53 An excellent resource for exploring ancient writing systems is Mnamon: Ancient Writing Systems in the Mediterranean<br />

(http://lila.sns.it/mnamon/index.phppage=Home&lang=en), which not only provides extensive descriptions of various writing systems but also includes<br />

selected electronic resources.<br />

54 http://www.uni-mannheim.de/mateo/camenahtdocs/camena.html<br />

55 http://www.philological.bham.ac.uk/bibliography/



Schibel and Rydberg-Cox argued that good bibliographic description is required for this historical<br />

source material (ideally so that such collections can be sorted by period, place, language, literary<br />

genre, publisher, and audience), particularly since many digitized texts will be reused in other<br />

contexts. A second recommendation made by Schibel and Rydberg-Cox (2006) is to identify<br />

at least basic structural metadata for such books (e.g., front, body, back) or to create a rough table of<br />

contents that provides a framework by which to make page images available. They suggested that such<br />

structural metadata would support new research into traditional questions of textual influence for<br />

researchers who could use automatic text-similarity measures to recognize text families and trace either<br />

the influence of major authors or the purposes of a given document. Despite such new opportunities,<br />

problems remain. An initial analysis by the authors of digital libraries of page images of early modern<br />

books revealed that page images produced were often inaccurate or inadequate, OCR tools were not<br />

yet flexible enough to produce transcriptions, and automated tagging and linking was far more difficult<br />

with “pre-standardized language.”<br />

Schibel and Rydberg-Cox concluded, however, that the greatest challenge faced in providing access to early modern books is that linguistic tools for Early Modern Latin are considerably underdeveloped:

Aside from the issues outlined above, two major challenges face humans and computers alike. First, we have no comprehensive dictionary of Neo-Latin. Readers must cope with neologisms or, often much harder to decipher, idioms and turns of expression of particular groups. Second, aside from morphological analyzers such as Morpheus—the Latin morphological analyzer found in the Perseus Digital Library—we have few computational tools for Latin. Even Morpheus does not use contextual clues to prioritize analyses, and we are not aware of any substantive work on named entity recognition in Latin. We do not yet have mature electronic authority lists for the Greco-Roman world, much less the people, places, etc. of the early modern period (Schibel and Rydberg-Cox 2006).

Some of the issues listed here, such as the development of linguistic tools for early modern Latin, have received further attention in the past four years by authors such as Reddy and Crane (2006). They tested the abilities of the commercial OCR program ABBYY FineReader and the open-source document recognition system Gamera 56 to recognize glyphs in early modern Latin documents. They found that after extensive training Gamera could recognize about 80 percent of glyphs, while FineReader could recognize about 84 percent. To improve the character-recognition output, they recommended the use of language modeling in future work.

Rydberg-Cox (2009) also explored some of the computational challenges in creating a corpus of early modern Latin and reported on work from the NEH project “Approaching the Problems of Digitizing Latin Incunables.” The primary aim of this project was to examine the “challenges associated with representing in digital form the complex and non-standard typefaces used in these texts to abbreviate words,” a practice that imitated medieval handwriting. Such features of early typography occurred at varying rates in different books, Rydberg-Cox noted, but they appear so frequently that no digitization project can fail to consider them. This issue was also faced by the Archimedes Digital Library 57 project, which, when digitizing texts published between 1495 and 1691, discovered between three and five abbreviations on every printed page. Rydberg-Cox emphasized that when digitizing early modern books, a project needs to determine how much functionality users will require from a digital facsimile and how much human intervention will be required to create it.

56 http://gamera.informatik.hsnr.de/. In March 2011, the Gamera Project announced the release of a GreekOCR Toolkit (http://gamera.informatik.hsnr.de/addons/greekocr4gamera/), an OCR system that can be used for “polytonal Greek text documents.” Although still in the testing stage, it includes extensive documentation and the ability to recognize accents.

57 http://archimedes.fas.harvard.edu/

In analyzing these questions, Rydberg-Cox proposed five possible approaches: (1) image books with simple page images; (2) image books with minimal structural data; (3) image-front transcriptions (such as those found in the Making of America 58 project) with page images that have searchable uncorrected OCR; (4) carefully edited and tagged transcriptions (generally marked up in XML); and (5) scholarly and critical editions. Ultimately, the project decided to create sample texts in all of these genres except the scholarly critical edition, because of the cost of creating such editions. The decision to digitize the text, rather than just provide page images with limited OCR, raised its own issues, including the need to photograph pages manually rather than scan them and the question of how to address characters and glyphs that could not be represented in Unicode. The project had to create a method that data entry contractors could use to represent characters as they typed up texts. The first step was to create a catalog of all the brevigraphs that appeared in the printed books; the catalog assigned a unique entity identifier to each nonstandard character, which data entry personnel could use to represent the glyph.

In addition to this catalog, a number of computational tools were created to assist the data entry operators:

Because the expansion of these abbreviations is an extremely time-consuming and painstaking task, we developed three tools to facilitate the tagging process. These tools suggest possible expansions for Latin abbreviations and brevigraphs, help identify words that are divided across lines, and separate words that are joined as the results of irregular spacing. All three programs can return results in HTML for human readability or by XML in response to remote procedure call as part of a program to automatically expand abbreviations in these texts (Rydberg-Cox 2009).
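The catalog-and-tools workflow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's software: the entity identifiers (`&pbar;`, `&abar;`, `&que;`), their expansions, and the function names are all hypothetical, and the optional `lexicon` argument stands in for whatever word list a project might use to rank suggestions.

```python
import re

# Hypothetical catalog: entity identifier -> candidate expansions.
# Neither the identifiers nor the expansions come from the project.
BREVIGRAPH_CATALOG = {
    "&pbar;": ["per", "par"],
    "&abar;": ["an", "am"],
    "&que;": ["que"],
}

ENTITY = re.compile(r"&[a-z]+;")


def suggest_expansions(token, lexicon=frozenset()):
    """Return candidate readings of a token, known words first."""
    candidates = [""]
    pos = 0
    for match in ENTITY.finditer(token):
        literal = token[pos:match.start()]
        # Unknown entities are left in place so an editor can spot them.
        options = BREVIGRAPH_CATALOG.get(match.group(), [match.group()])
        candidates = [c + literal + o for c in candidates for o in options]
        pos = match.end()
    readings = [c + token[pos:] for c in candidates]
    # sorted() is stable, so catalog order is kept within each group.
    return sorted(readings, key=lambda r: r not in lexicon)


def rejoin_divided_words(lines):
    """Move a word split by a line-final hyphen onto the following line."""
    merged = []
    carry = ""
    for line in lines:
        line = carry + line.strip()
        carry = ""
        if line.endswith("-"):
            head, _, tail = line[:-1].rpartition(" ")
            if head:
                merged.append(head)
            carry = tail
        else:
            merged.append(line)
    if carry:
        merged.append(carry)
    return merged


if __name__ == "__main__":
    print(suggest_expansions("c&abar;pus", lexicon={"campus"}))
    print(rejoin_divided_words(["arma virumque ca-", "no Troiae"]))
```

A tool of this kind only proposes readings; as the passage notes, a human editor still has to choose among them and a second editor has to proofread the result.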

Another important point raised by Rydberg-Cox was that while the project needed to develop tools such as these, if such tools were shared in a larger infrastructure, they could be reused by the numerous projects digitizing Latin books. Ultimately, Rydberg-Cox concluded that this work showed that a large-scale project that created image-front editions (e.g., using uncorrected data that were manually typed to support searching rather than uncorrected OCR) could be affordably managed. In the workflow of his own project, Rydberg-Cox found that the most significant expense was having human editors tag abbreviations and a second editor proofread the work.

Nonetheless, Rydberg-Cox convincingly argued that a certain level of transcription is typically worth the cost because it provides better searchability and, even more important, supports automatic hypertext and linking to dictionaries and other reading support tools. Such tools can help students and scholars read texts in Greek and Latin without expert knowledge of such languages, and they are particularly important for early modern books, many of which have never been translated. Furthermore, Rydberg-Cox noted that larger collections of lightly edited text often reach far larger audiences than small collections of closely edited texts or critical editions. In addition, this model does not preclude the development of critical editions, for as long as the images and transcriptions are made available as open content they can be reused by scholars in support of their own editions.

In contrast to using digitized images and typed-in transcriptions, recent research reported by Simone Marinai (2009) explored the use of automatic text indexing and retrieval methods to support information retrieval from early modern books. Marinai tested these methods on the Latin Gutenberg Bible and reported the same problems as Schibel and Rydberg-Cox, namely, the high density of text on each page, the limited spacing between words, and, most important, the use of many abbreviations and ligatures. Such issues, Marinai noted, limit not just automatic techniques but human reading as well. The Gutenberg Bible alone included 75 types of ligatures, with two dense columns of text per page, each containing 42 lines. The methodology proposed, Marinai hoped, would support information retrieval beyond this one text:

… our aim is not to deal only with the Gutenberg Bible, but to design tools that can process early printed books, that can adopt different ligatures and abbreviations. We therefore designed a text retrieval tool that deals with the text in a printed document in a different way, trying to identify occurrences of query words rather than recognizing the whole text (Marinai 2009).

58 http://moa.umdl.umich.edu/

Instead of segmenting words, Marinai’s technique extracted “character objects” from documents that were then clustered together using self-organizing maps so that “symbolic” classes could be assigned to indexed objects. User query terms were selected from “one-word” images in the collection that were then compared against “indexed character objects with a Dynamic Time Warping (DTW) based approach.” This “query by example” approach did face one major challenge in that it could not find occurrences of query words that were printed with different ligatures.
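For readers unfamiliar with Dynamic Time Warping, the following is a minimal sketch of the general technique, not a reconstruction of Marinai's system. It assumes, purely for illustration, that each word image has already been reduced to a sequence of numbers (for example, one hypothetical class code per character object); the function names and the sample data are invented.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW cost between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest warping cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def rank_by_example(query, indexed):
    """Order indexed sequences from most to least similar to the query."""
    return sorted(indexed, key=lambda name: dtw_distance(query, indexed[name]))


if __name__ == "__main__":
    # Hypothetical index: identifier -> sequence of class codes
    index = {"word_17": [1, 2, 2, 3], "word_42": [9, 9, 9]}
    print(rank_by_example([1, 2, 3], index))
```

Because DTW lets one element align with several, a stretched or compressed rendering of the same sequence still scores as close. A word printed with a different ligature, however, yields a different sequence altogether, which is consistent with the limitation reported above.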

As this subsection indicates, the development of tools for the automatic recognition and processing of Latin is a research area that still has many challenges.

Sanskrit

The issues involved in the digitization of Sanskrit texts and the development of tools to study and present them online are so complicated that an annual international Sanskrit computational linguistics symposium was established in 2007. 59 This subsection provides an overview of some of the major digital Sanskrit projects and current issues in digitization in that language.

The major digital Sanskrit project online is the Sanskrit Library, a “digital library dedicated to facilitating education and research in Sanskrit by providing access to digitized primary texts in Sanskrit and computerized research and study tools to analyze and maximize the utility of digitized Sanskrit text.” 60 The Sanskrit Library is part of the International Digital Sanskrit Library Integration project, which seeks to connect various Sanskrit digital archives and tool projects as well as to establish encoding standards, enhance manuscript access, and develop OCR technology and display software for Devanagari text. On an individual basis, the Sanskrit Library supports philological research and education in Vedic and Classical Sanskrit language and literature and provides access to Sanskrit texts in digital form. The Sanskrit Library currently contains independent-study Sanskrit readers, grammatical literature, morphological software, instructional materials, and a digital version of W. D. Whitney’s The Roots, Verb-Forms, and Primary Derivatives of the Sanskrit Language. The Library’s current areas of research include “linguistic issues in encoding, computational phonology and morphology, OCR for Indic scripts, and markup of digitized Sanskrit lexica.” Free access to this library is provided, but users must register.

59 http://www.springerlink.com/content/p665684g40h7/p=967bbca4213c4cb6988c40c0e3ae3a95&pi=0

60 http://sanskritlibrary.org/



Another major scholarly online collection of Sanskrit is the Digital Corpus of Sanskrit (DCS), 61 which provides access to a searchable collection of lemmatized Sanskrit texts and to a partial version of the database of the SanskritTagger software. SanskritTagger is a “part-of-speech (POS) and lexical tagger for post-Vedic Sanskrit,” and it is able to analyze unprocessed digital Sanskrit text both lexically and morphologically. 62 The DCS was automatically created from the most recent version of the SanskritTagger database with a corpus chosen by the software creator, Oliver Hellwig (the website notes that this corpus makes no attempt to be exhaustive). The DCS was designed to support research in Sanskrit philology, and it is possible to search for lexical units and their collocations in a corpus of 2,700,000 words.

A variety of research has been conducted into the development of tools for Sanskrit, and this subsection reviews only some of it. The need for digitized Sanskrit lexicons 63 as part of a larger computational linguistics platform is an area of research for the Sanskrit Library, and the issue receives substantial attention in Huet (2004). Huet’s article provides an overview of work to develop both a Sanskrit lexical database and various automatic-tagging tools to support a philologist:

The first level of interpretation of a Sanskrit text is its word-to-word segmentation, and our tagger will be able to assist a philology specialist to achieve complete morphological mark-up systematically. This will allow the development of concordance analysis tools recognizing morphological variants, a task which up to now has to be performed manually (Huet 2004).

Huet also asserted that the classical Sanskrit corpus is extensive and presents computational linguistics with many analytical challenges.

In addition to the challenges Sanskrit presents for developing computational tools, the features of the language itself make the creation of critical editions difficult. As Csernel and Patte (2009) explain, a “critical edition” must take “into account all the different known versions of the same text in order to show the differences between any two distinct versions.” 64 The creation of critical editions is challenging in any language, particularly if there are many manuscript witnesses, but Sanskrit presents some unique problems. In their paper, Csernel and Patte present an approach based on paragraphs and sentences extracted from a collection of manuscripts known as the “Banaras” gloss. This gloss was written in the seventh century AD and is the most famous commentary on the “notorious” Panini grammar, which was known as the first “generative” grammar and was written around the fifth century BC. One major characteristic of Sanskrit described by Csernel and Patte is that it is “not linked to a specific script,” and while the Brahmi script was used for a long time, Devanagari is now the most common. The authors reported that they used the transliteration scheme of Sanskrit for TeX that was developed by Frans Velthuis 65 wherein each Sanskrit letter is written using between one and three Latin characters.
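To make the "one to three Latin characters" property concrete, here is a small sketch that splits a Velthuis-style string into letter tokens by greedy longest match. The letter inventory reflects the commonly documented Velthuis conventions and should be checked against the package documentation before any real use; the function name is hypothetical, and nothing here is drawn from Csernel and Patte's own software.

```python
# Assumed Velthuis letter inventory (verify against the package documentation).
VELTHUIS_LETTERS = frozenset(
    'a aa i ii u uu .r .rr .l .ll e ai o au .m .h '
    'k kh g gh "n c ch j jh ~n .t .th .d .dh .n '
    't th d dh n p ph b bh m y r l v "s .s s h'.split()
)


def tokenize_velthuis(text):
    """Split transliterated text into letters, preferring the longest match."""
    tokens = []
    pos = 0
    while pos < len(text):
        for width in (3, 2, 1):
            chunk = text[pos:pos + width]
            if chunk in VELTHUIS_LETTERS:
                tokens.append(chunk)
                pos += len(chunk)
                break
        else:
            # No letter of any width matched at this position.
            raise ValueError(f"unrecognized input at position {pos}: {text[pos]!r}")
    return tokens


if __name__ == "__main__":
    print(tokenize_velthuis("sa.msk.rta"))
    print(tokenize_velthuis("paa.nini"))
```

Greedy matching is a simplification: it always reads `ai` as the diphthong and `kh` as the aspirate, so sequences that are genuinely two letters would need extra handling.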

An interesting insight provided by these authors was how one problematic feature of Sanskrit texts, namely, text written without spaces, was also found in other ancient texts:

In ancient manuscripts, Sanskrit is written without spaces, and from our point of view, this is an important graphical specificity, because it increases greatly the complexity of text comparison algorithms. One may remark that Sanskrit is not the only language where spaces are missing in the text: Roman epigraphy and European Middle Age manuscripts are also good examples of that (Csernel and Patte 2009).

61 http://kjc-fs-cluster.kjc.uni-heidelberg.de/dcs/

62 For more details on this tagger, see Hellwig (2007); for one of its research uses in philology see Hellwig (2010).

63 The NEH has recently funded a first step in this direction. A project entitled “Sanskrit Lexical Sources: Digital Synthesis and Revision” will support an “international partnership between the Sanskrit Library (Maharishi University of Management) and the Cologne Digital Sanskrit Lexicon (CDSL) project (Institute of Indology and Tamil Studies, Cologne University) to establish a digital Sanskrit lexical reference work.” http://www.neh.gov/news/archive/201007200.html

64 Further discussion of this issue can be found in the section on Digital Editions.

65 http://www.ctan.org/tex-archive/language/devanagari/velthuis/

The solution that the authors ultimately proposed for creating a critical edition of a Sanskrit text involved the lemmatization by hand of one of the two texts, specifically, the text of the edition. Alignments between this lemmatized text and other texts then made use of the longest common subsequence (LCS) algorithm. The authors are still experimenting with their methodology, but pointed out that the absence of a Sanskrit lexicon limited their approach.
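The LCS step can be illustrated with a short sketch. It makes a strong simplifying assumption that the authors' setting does not grant: both the edition text and the witness are treated here as ready-made token lists, whereas the manuscripts are written without spaces. The function name and the sample tokens are hypothetical.

```python
def lcs_alignment(reference, witness):
    """Return index pairs (i, j) where reference[i] and witness[j] are aligned."""
    n, m = len(reference), len(witness)
    # table[i][j] = LCS length of reference[i:] and witness[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if reference[i] == witness[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the start to recover one optimal alignment.
    pairs = []
    i = j = 0
    while i < n and j < m:
        if reference[i] == witness[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # reference token has no counterpart in the witness
        else:
            j += 1  # witness token is absent from the reference
    return pairs


if __name__ == "__main__":
    edition = ["a", "b", "c", "d"]
    manuscript = ["a", "c", "x", "d"]
    print(lcs_alignment(edition, manuscript))
```

Tokens left out of the returned pairs are the candidate variants (omissions, additions, or substitutions) that an editor would then review.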

The development of OCR tools that will process Sanskrit scripts is a highly sought-after goal. Very little work has been done in this area, but Thomas Breuel recently reported not only on the use of OCRopus to recognize the Devanagari script but also on its application both to primary texts in classical languages and to secondary classical scholarship. As was discussed previously in Boschetti et al. (2009), preliminary work with OCRopus produced promising results with Ancient Greek.

Breuel (2009) described OCRopus as an OCR system that is designed to be both omnil<str<strong>on</strong>g>in</str<strong>on</strong>g>gual <strong>and</strong><br />

omniscript <strong>and</strong> that advances the state of the art <str<strong>on</strong>g>in</str<strong>on</strong>g> that new text-recogniti<strong>on</strong> <strong>and</strong> layout-analysis<br />

modules can be easily plugged <str<strong>on</strong>g>in</str<strong>on</strong>g> <strong>and</strong> that it uses an adaptive <strong>and</strong> user-extensible character recogniti<strong>on</strong><br />

module. Breuel acknowledged that there are many challenges to recogniz<str<strong>on</strong>g>in</str<strong>on</strong>g>g Devanagari script,<br />

<str<strong>on</strong>g>in</str<strong>on</strong>g>clud<str<strong>on</strong>g>in</str<strong>on</strong>g>g the large number of ligatures, complicated diacritics, <strong>and</strong> the “large <strong>and</strong> unusual vocabulary<br />

used <str<strong>on</strong>g>in</str<strong>on</strong>g> academic <strong>and</strong> historical texts” (Breuel 2009). In additi<strong>on</strong> to Sanskrit texts, Breuel made the<br />

important po<str<strong>on</strong>g>in</str<strong>on</strong>g>t that historical scholarship about Sanskrit <strong>and</strong> other classical languages is frequently<br />

multil<str<strong>on</strong>g>in</str<strong>on</strong>g>gual <strong>and</strong> multiscript <strong>and</strong> can mix Devanagari <strong>and</strong> Lat<str<strong>on</strong>g>in</str<strong>on</strong>g> as well as Greek. Breuel thus proposed<br />

that OCRopus has a number of potential applicati<strong>on</strong>s <str<strong>on</strong>g>in</str<strong>on</strong>g> the field of classical scholarship, <str<strong>on</strong>g>in</str<strong>on</strong>g>clud<str<strong>on</strong>g>in</str<strong>on</strong>g>g the<br />

recogniti<strong>on</strong> of orig<str<strong>on</strong>g>in</str<strong>on</strong>g>al documents (written records), orig<str<strong>on</strong>g>in</str<strong>on</strong>g>al primary source texts (pr<str<strong>on</strong>g>in</str<strong>on</strong>g>ted editi<strong>on</strong>s of<br />

classical texts), <strong>and</strong> both modern <strong>and</strong> historical sec<strong>on</strong>dary scholarship, <str<strong>on</strong>g>in</str<strong>on</strong>g>clud<str<strong>on</strong>g>in</str<strong>on</strong>g>g commentaries <strong>and</strong><br />

textbooks, <strong>and</strong> reference works such as dicti<strong>on</strong>aries <strong>and</strong> encyclopedias.<br />

He also explained that OCRopus uses a “strictly feed-forward system,” an important feature that supports the plug-in of other layout-analysis and text-recognition modules. Other features include the use of only a small number of data types to support reuse, “weighted finite state transducers” (WFSTs) to represent the output of text line recognition, and final output in the hOCR format, which “encodes OCR information in completely standards-compliant HTML files.” This open-source system can be hosted through a web service or run from the command line or shell scripts, and users can customize how it performs by scripting “the OCR engine in Lua.”
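
Because hOCR is ordinary HTML, its word boxes can be read back with a generic HTML parser. The sketch below uses only the Python standard library and assumes the `ocrx_word` class and the `bbox` property of the `title` attribute as defined in the hOCR specification. The sample markup and the names `HocrWords` and `parse_bbox` are invented for illustration, and the source does not say which hOCR elements OCRopus itself emits.

```python
from html.parser import HTMLParser


def parse_bbox(title):
    """Extract (x0, y0, x1, y1) from an hOCR title such as 'bbox 10 20 110 45; ...'."""
    for prop in title.split(";"):
        parts = prop.split()
        if parts and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:5])
    return None


class HocrWords(HTMLParser):
    """Collect (text, bbox) for every span whose class includes 'ocrx_word'."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._open = False  # True while inside a word span
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            self._open = True
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._open:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._open:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._open = False


# Invented sample: one page, one line, two words.
SAMPLE = """<div class='ocr_page' title='bbox 0 0 600 800'>
<span class='ocr_line' title='bbox 10 20 300 45'>
<span class='ocrx_word' title='bbox 10 20 110 45'>arma</span>
<span class='ocrx_word' title='bbox 120 20 300 45'>virumque</span>
</span></div>"""

parser = HocrWords()
parser.feed(SAMPLE)
words = parser.words
```

Keeping the coordinates alongside the text is what lets a downstream tool link a recognized word back to its place on the page image.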

The basic stages in using OCRopus are image preprocessing, layout analysis, text-line recognition, and statistical language modeling. Each stage offers a variety of customization options that make it particularly useful for historical languages. For text-line recognition in historical texts, it is very important that OCRopus has both built-in text-line recognizers and the ability to add external text-line recognizers for different scripts because, as Breuel articulated:

Some historical texts may use different writing systems, since Devanagari is not the only script in historical use for Sanskrit. Scholarly writing on Sanskrit almost always uses Latin script, and Latin script is also used for writing Sanskrit itself, including extended passages. Sanskrit written in Devanagari and Latin scripts also makes use of numerous diacritics that need to be recognized. In addition, IPA may be used for pronunciation, Greek letters may be used for classical Greek quotations, and Greek letters and other special characters may be used for indicating footnotes or other references (Breuel 2009).

The challenge of multilingual document recognition is a significant one for classical scholarship that has been reported by many digital classics projects. OCRopus has built-in line recognizers for Latin scripts, and unlike those of many other OCR systems, these recognizers make few assumptions about character sets and fonts and are instead “trained” on text-line input that is then aligned against ground truth data and can be used to automatically train “individual character shape models.” For Devanagari, OCRopus handled diacritics by treating “character+diacritic” combinations as novel characters.

The final processing stage of OCRopus is language modeling, which is based on WFSTs. WFSTs allow language models and character recognition alternatives to be “manipulated algebraically,” and such language models can be learned from training data or constructed manually. One important use of such models for mixed-language classical texts is that they can be used to automatically identify languages within a digital text. “We can take existing language models for English and Sanskrit and combine them,” Breuel explicated. “As part of the combination, we can train or specify the probable locations and frequencies of transitions between the two language models, corresponding to, for example, isolated foreign words within one language, or long quotations” (Breuel 2009).
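
The effect of combining two language models with a transition cost can be shown with a much simpler stand-in than a WFST. The sketch below is not Breuel's method: it runs a plain Viterbi search over two toy word-list models, and the vocabularies, the probabilities, and the names `vocab_model` and `label_languages` are all hypothetical.

```python
import math


def vocab_model(vocab, p_known=0.9, p_unknown=0.001):
    """Toy language model: log-probability of a token given a word list."""
    def logp(token):
        return math.log(p_known if token in vocab else p_unknown)
    return logp


def label_languages(tokens, models, p_switch=0.1):
    """Pick the most probable language label for each token (Viterbi search)."""
    if not tokens:
        return []
    stay, switch = math.log(1 - p_switch), math.log(p_switch)
    # best[lang] = (score, labels) for the best labeling that ends in lang
    best = {lang: (models[lang](tokens[0]), [lang]) for lang in models}
    for token in tokens[1:]:
        step = {}
        for lang in models:
            score, labels = max(
                (best[prev][0] + (stay if prev == lang else switch), best[prev][1])
                for prev in models
            )
            step[lang] = (score + models[lang](token), labels + [lang])
        best = step
    return max(best.values())[1]


# Hypothetical vocabularies; real models would be trained on corpora.
models = {
    "en": vocab_model({"the", "word", "means", "law"}),
    "sa": vocab_model({"dharma", "artha"}),
}
labels = label_languages("the word dharma means law".split(), models)
```

Raising `p_switch` makes isolated foreign words cheaper to label, while lowering it favors long runs in one language, which mirrors the choice between isolated words and long quotations described in the quotation above.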

As this subsection has indicated, the computational challenges of processing Sanskrit are being actively researched, and some of the technical solutions may be adaptable to other historical languages as well.

Syriac<br />

The Syriac dialect belongs to the Aramaic branch of the Semitic languages and flourished between the third and seventh centuries AD, although it continued to be used as a written language through the nineteenth century. Research on document analysis and recognition for Syriac is somewhat less extensive than for the other ancient dialects and languages covered in this review, but it is active and growing. Although there are fewer digital texts available in Syriac online than for Greek, Latin, Sumerian, or Sanskrit, some texts written in this dialect can be found in many papyri and manuscript collections. 66 In addition, a number of printed editions and reference works on Syriac can be found online in both Google Books and the Internet Archive. 67

Document-recognition work with Syriac has been reported by Bilane et al. (2008), who have investigated the use of word-spotting 68 for handwriting analysis in digitized Syriac manuscripts. They noted that Syriac presents a particularly interesting case for automatic historical-document analysis because it combines the word structure and calligraphy of Arabic handwriting while also being intentionally written at an angle. Bilane et al. (2008) used a word-spotting method so that they would not need to rely on any prior information or be dependent on specific word- or character-segmentation algorithms. Tse and Bigun (2007) have also reported on work in the automatic recognition of Syriac, with a focus on developing an initial character-recognition system that can serve as a baseline OCR for Syriac-Aramaic texts that use the Serto script. Their system does not require the use of segmentation algorithms; instead, it uses “linear symmetry with a threshold of correlation for each character, and an ordered sequence of characters to be searched for” (Tse and Bigun 2007). Tse and Bigun offered a fairly detailed explanation for why they chose to avoid an approach using segmentation algorithms, which are often employed in Latin text recognition:

The system proposed in this paper uses a segmentation-free approach because the Serto script has characters that are cursive with difficult to determine start and end points for characters. This is one difference between Serto and Arabic or the cursive form of Latin languages. Since segmentation in these languages is difficult and not easy as in scripts like printed Latin, removing the need for segmentation becomes an alternative way of dealing with the problem of segmentation, at least to obtain a quick baseline recognition scheme (Tse and Bigun 2007).

66 For example, see the Syriac manuscripts available through the Virtual Manuscript Room, http://vmr.bham.ac.uk/Collections/Mingana/part/Syriac/

67 See, for example, Breviarium: juxta ritum ecclesiæ Antiochenæ Syrorum (http://books.google.com/booksid=w-UOAAAAQAAJ) in Google Books, or The book of consolations; or, The pastoral epistles; the Syriac text (with both the Syriac text and the English translation) in the Internet Archive (http://www.archive.org/details/bookofconsolatio00ishouoft).

68 Word-spotting is a technique that has also been used with Latin manuscripts and is explained in detail earlier in this paper.

The Serto script OCR system that Tse and Bigun ultimately developed produced character-recognition rates of approximately 90 percent. Earlier work by Clocksin (2003) had also described methods for the automatic recognition of Syriac handwriting, albeit for texts written in the Estrangelo script, and used a collection of historical manuscript images. This system reported recognition rates that ranged from 61 percent to 100 percent, depending on both the techniques used and the manuscript source.

One research project that seeks to build an initial cyberinfrastructure or online hub for Syriac is the Syriac Research Group, a joint project of the University of Alabama and Princeton University. 69 The group’s major goal is to produce a new generation of tools and information resources that will help alleviate the access and discovery problem that currently hinders “scholarly research on Syriac language, cultures and history.” An international team of scholars is working to create an “online reference source” that will meet the needs both of advanced Syriac scholars and of the interested public, and their website offers a useful mockup of the potential portal as well as a list of potential user scenarios (e.g., a Syriac researcher working on manuscripts, a nonspecialist researcher).

According to the project website, the “Syriac Reference Portal” 70 will serve as an “information hub” that includes an ontology or classification system that can be used for creating Syriac reference works, an online encyclopedia, a gazetteer that includes both geographic information and maps that are relevant to Syriac studies, an extensive bibliography, and a multilingual authority file that will support “standardizing references to Syriac authors, texts, and place names.” Other related research work includes revision of the Unicode standard for Syriac, a TEI adaptation for the description of Syriac manuscripts, and plans to add a prosopographical tool. While the Syriac Research Group hopes to make all the resources listed above available in the first generation of the portal, the ultimate goal is to open these resources to the scholarly community for “collaborative augmentation and annotation.” The project hopes that, as the portal grows, the team will be able to add to the hub iteratively by linking to digitized Syriac content that is available online, such as the Syriac corpus of literature being prepared by Brigham Young University (BYU), 71 the eBeth Arké Syriac Studies Collection, 72 and the Manumed project in the European Union. 73

69 http://www.syriac.ua.edu/

70 The “Syriac Reference Portal” has been based on the Indiana Philosophy Ontology Project (InPhO) (https://inpho.cogs.indiana.edu/).

71 http://cpart.byu.edu/page=112&sidebar

72 http://www.hmml.org/vivarium/BethArke.htm

73 The Manumed Project (http://www.manumed.org/en/) is building an extensive virtual library of digitized cultural heritage documents from the Euro-Mediterranean region with a particular focus on manuscripts in languages including “Arabic, Greek, Latin, Syriac, Hebrew, Aramaic, Coptic, Berber, Armenian.”



The eBeth Arké Syriac Studies Collection, described by its project website as “an electronic online resource library for Syriac studies,” complements the Syriac Studies Reference Library 74 that is hosted by BYU. Holdings for both collections have come from the Semitics/Institute of Christian Oriental Research Library 75 at the Catholic University of America in Washington, DC. When complete, eBeth Arké will include approximately 650 digitized items. 76 This collection of Syriac texts includes early printed catalogs, grammars, and lexicons, as well as many other rare volumes. The eBeth Arké Collection is hosted on Vivarium, the digital library of the Hill Museum & Manuscript Library (HMML), 77 which uses the proprietary software CONTENTdm 78 to manage its digital collection, and can be searched or browsed by a variety of options (e.g., keyword, name of author). Each digitized item can be viewed online as an image book (in PDF format), and users can also create a collection of favorites. Access to these books is also available through the Syriac Studies Reference Library at BYU, and the collection can be browsed by ancient author (e.g., Cyril of Alexandria, Philoxenus) or topic, or searched by keyword.

The largest research project in terms of providing access to digitized Syriac texts is currently under way at BYU and seeks to create a comprehensive Syriac corpus of electronic texts. The project website notes that no “coordinated and large scale effort has yet been attempted,” 79 so BYU began working to create a Syriac corpus in 2001, and its efforts were joined by David G. K. Taylor of Oxford University in 2004. The project is working with both printed editions of Syriac and manuscript collections. 80 An electronic lexicon will be fully integrated into this corpus, and the project has chosen to use Jessie Payne-Smith’s Compendious Syriac Dictionary. This dictionary is being converted into a lexical database, and each word that is tagged in the corpus will be linked to its appropriate lexicon entry.

Several recent publications address the work currently being undertaken in the development of the “BYU-Oxford Corpus of Syriac Literature.” McClanahan et al. (2010) describe Syriac as an “underresourced” dialect in that there are few or no language tools (e.g., morphological analyzers, POS taggers) available to work with and there is relatively little “labeled data” available (e.g., tagged corpora, digitized Syriac texts) upon which to train and test algorithms. 81 Despite these challenges, McClanahan et al. sought to replicate the type of manual annotation used to create the Peshitta New Testament (Kiraz 1994) on a far larger scale using a number of computational tools and the data from this single labeled resource, or essentially to automatically annotate Syriac texts in a “data-driven fashion,” an approach they labeled Syromorph. They created a probabilistic morphological analyzer for Syriac that made use of “five probabilistic sub-models that can be trained in a supervised fashion and combined in a joint model of morphological annotation” (McClanahan et al. 2010).

One other major c<strong>on</strong>tributi<strong>on</strong> of their work is the <str<strong>on</strong>g>in</str<strong>on</strong>g>troducti<strong>on</strong> of “novel algorithms” for the important<br />

natural language process<str<strong>on</strong>g>in</str<strong>on</strong>g>g (NLP) subtasks of segmentati<strong>on</strong> (often described as tokenizati<strong>on</strong>),<br />

dicti<strong>on</strong>ary l<str<strong>on</strong>g>in</str<strong>on</strong>g>kage, <strong>and</strong> morphological tagg<str<strong>on</strong>g>in</str<strong>on</strong>g>g. All these algorithms made use of maximum entropy<br />

74 http://www.lib.byu.edu/dlib/cua/

75 http://libraries.cua.edu/semicoll/index.html

76 Information about the digitization of these materials is available in a project report that was published in Hugoye, the online journal of Syriac studies (http://syrcom.cua.edu/Hugoye/Vol8No1/HV8N1PRBlanchard.html). Both the digitization and preparation of initial metadata were undertaken by a project team from Beth Mardutho of the Syriac Institute.

77 http://www.hmml.org/

78 http://www.contentdm.org/

79 The project notes that the most significant effort so far has been the work of the Comprehensive Aramaic Lexicon with the Peshitta, a Syriac translation of the Bible (http://cal1.cn.huc.edu/).

80 See http://cpart.byu.edu/page=114&sidebar for a list of initial texts that will be available.

81 The work reported here made use of an annotated version of the Peshitta New Testament as well as of a concordance described in Kiraz (1994) and (2000).



Markov Models (MEMM) 82 and outperformed previous baseline methods. “We hope to use this combined model for preannotation in an active learning setting to aid annotators in labeling a large Syriac corpus,” wrote McClanahan et al., concluding that “this corpus will contain data spanning multiple centuries and a variety of authors and genres. Future work will require addressing issues encountered in this corpus. In addition, there is much to do in getting the overall tag accuracy closer to the accuracy of individual decisions.” 83

Another challenge in building NLP tools for Syriac is that it is written in an abjad, a writing system that omits vowels and other diacritics; yet the automatic addition of diacritics to Syriac text, according to Haertel et al. (2010), could greatly enhance the utility of these texts for historical and linguistic research. They consequently developed an automatic-diacritization system that utilized conditional Markov models (CMMs) and a number of already-diacritized texts as training data. When testing their system with related Semitic languages (e.g., Hebrew) as well as resource-rich languages such as English, they reported that their system outperformed other low-resource approaches and achieved nearly state-of-the-art results when compared with resource-rich systems.

The TURGAMA Project, 84 based at the Institute for Religious Studies at Leiden University, is also exploring the use of “computer-assisted linguistic analysis” for both Aramaic and Syriac translations of the Bible. In contrast to creating a tagged corpus, Project Director Wido van Peursen has explained that the TURGAMA Project has focused on developing a highly specific model of encoding for their research. He noted that their project started one level below that of the Oxford-BYU project; that is, at the level of morphology rather than POS tagging.

Van Peursen outlined a number of challenges that arise when conducting linguistic analysis with ancient texts such as Syriac Biblical manuscripts, including that there are no native speakers of the languages involved, there are only written sources, there are multiple manuscript witnesses for individual “texts,” and the corpora are typically quite small. These challenges, according to van Peursen, lead to two concrete dilemmas when developing ancient text corpora: (1) Should computational analysis be “data-oriented or theory-driven”? and (2) What is the priority for the corpus of the language? The analysis of these dilemmas led van Peursen to argue that an encoding model, rather than a tagging approach, should be taken:

The challenges and dilemmas mentioned above require a model that is deductive rather than inductive; that goes from form (the concrete textual data) to function (the categories that we do not know a priori); that entails registering the distribution of linguistic elements, rather than merely adding functional labels—in other words, that involves encoding rather than tagging; that registers both the paradigmatic forms and their realizations; that allows grammatical categories and formal descriptions to be redefined on the basis of corpus analysis; and that involves interactive analytical procedures, which are needed for the level of accuracy we aim for (van Peursen 2009).

Consequently, the TURGAMA Project’s linguistic analysis of Hebrew and Syriac is conducted from the bottom up; i.e., at the levels of individual words, phrases, clauses, and the entire text. The workflow of their system involves pattern-recognition programs, “language-specific auxiliary files”

82 http://en.wikipedia.org/wiki/Maximum_entropy_Markov_model

83 An earlier discussion of the complicated process of combining machine learning, human annotation, and ancient corpus creation by this project can be found in Carroll et al. (2007).

84 The full project name is TURGAMA, “Computer-Assisted Analysis of the Peshitta and the Targum: Text, Language and Interpretation.” http://www.hum.leiden.edu/religion/research/research-programmes/antiquity/turgama.html



such as a word grammar, and an “Analytical Lexicon” (a data file that contains previous scholarly analyses), data sets that are gradually built up from patterns that are registered in the Analytical Lexicon, and active human intervention where researchers must decide to accept or reject proposals (or add a new analysis) made by an automatic morphology program called Analyse.

The major reason to utilize an encoding procedure rather than a tagging methodology, according to van Peursen, is that it will guarantee consistent morphological analysis since all functional deductions are automatically produced. “It has the advantage that not only the interpretation of a word, but also the data which led to a certain interpretation can be retrieved,” van Peursen argued, “whereas the motivation behind a tagging is usually not visible. It also has the advantage that both the surface forms and the functional analysis are preserved” (van Peursen 2009). In addition, van Peursen stated that by using language-specific files such as the Analytical Lexicon, their system was utilizing existing scholarly knowledge regarding Semitic studies. The encoding system ultimately deployed also supported what van Peursen insisted was a first within Semitic studies: the ability to test alternative interpretations or scholarly assumptions against actual data. This ability to represent multiple textual interpretations was particularly important as the manuscript evidence for Syriac typically includes a large number of “orthographic variants.” 85 The encoding system used by TURGAMA thus supports searching for both “attested surface forms” found in actual manuscript witnesses and the abstract morphemes for these words.
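
The dual search described above, by attested surface form and by abstract morpheme, can be pictured with a small data structure. The following Python sketch is an illustration only and is not the TURGAMA Project's actual encoding or software: the class name `EncodedWord`, the function `build_indexes`, the witness sigla, and the sample strings are all hypothetical placeholders rather than real manuscript readings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodedWord:
    """One word as attested in one witness (all field values are placeholders)."""
    witness: str      # manuscript siglum, hypothetical
    surface: str      # form exactly as written in that witness
    morphemes: tuple  # abstract morphemes assigned by the encoding


def build_indexes(words):
    """Index the same records twice: by surface form and by abstract morpheme."""
    by_surface, by_morpheme = {}, {}
    for word in words:
        by_surface.setdefault(word.surface, []).append(word)
        for morpheme in word.morphemes:
            by_morpheme.setdefault(morpheme, []).append(word)
    return by_surface, by_morpheme


# Invented sample: two witnesses spell the same abstract word differently.
SAMPLE = [
    EncodedWord("A", "form-1", ("LEXEME-X", "SUFFIX-Y")),
    EncodedWord("B", "form-1a", ("LEXEME-X", "SUFFIX-Y")),  # orthographic variant
    EncodedWord("B", "form-2", ("LEXEME-Z",)),
]

by_surface, by_morpheme = build_indexes(SAMPLE)
print([w.witness for w in by_surface["form-1a"]])    # witnesses attesting this spelling
print([w.surface for w in by_morpheme["LEXEME-X"]])  # all spellings of one lexeme
```

Because each record keeps the surface form next to its analysis, a query on the abstract morpheme returns every orthographic variant, while a query on a spelling returns only the witnesses that attest it.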

“This way of encoding the verb forms attested in multiple textual witnesses provides us with a large database from which language variation data can be retrieved,” van Peursen explained. “In some cases language development is involved as well, and the data can be used for diachronic analysis” (van Peursen 2010). Van Peursen cautioned, however, that all encodings developed by the TURGAMA Project should ultimately be considered “hypotheses” that can be tested against the data they have created.

Cuneiform Texts and Sumerian

Generally considered to be the earliest writing system known in the world, cuneiform script was used in the Ancient Near East from about 3200 BC to about 100 AD. While the largest number of cuneiform texts represent the Sumerian language, the cuneiform script was adapted for other languages, including Akkadian, Elamite, and Hittite. Sumero-Akkadian cuneiform is the most common by far and is a complex “syllabic and ideographic writing system, with different signs for the various syllables” (Cohen et al. 2004). There are approximately 1,000 different cuneiform signs that form a complex script system where most signs are also “polyvalent,” or have multiple phonemic and semantic realizations. Additional glyphs have also shown great “palaeographic development” over their three millennia of use (Cohen et al. 2004). Sumerian has also been described as a “language isolate”: one for which no other related languages have been identified and that therefore lacks resources such as a “standardized sign list and comprehensive dictionary” (Ebeling 2007). These various factors make digitizing, transliterating, and presenting cuneiform online a complicated task.
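
Polyvalence is one reason transliteration resists simple automation: every sign with several possible values multiplies the number of candidate readings of a line. A minimal Python sketch of that arithmetic follows. The two Sumerian values given for the sign AN (an, "sky," and dingir, "god") are commonly cited; the second sign and its values are invented placeholders, and the names `SIGN_VALUES` and `candidate_readings` are assumptions made for this illustration.

```python
from itertools import product

# Sign name -> possible values. Only the AN entry reflects real readings;
# SIGN_X and its values are invented for the example.
SIGN_VALUES = {
    "AN": ["an", "dingir"],
    "SIGN_X": ["x1", "x2", "x3"],
}


def candidate_readings(signs):
    """Return every combination of values for a sequence of sign names."""
    options = [SIGN_VALUES[sign] for sign in signs]
    return ["-".join(combo) for combo in product(*options)]


readings = candidate_readings(["AN", "SIGN_X"])
print(len(readings))  # 2 values x 3 values = 6 candidates
print(readings)
```

With roughly a thousand signs, many of them polyvalent, the candidate space grows multiplicatively with line length, which is why context and lexical resources are needed to choose among readings.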

As indicated by the size of the previously described CDLI, there are hundreds of thousands of cuneiform tablets and other texts around the world in both private and public collections. In addition to the CDLI, there are a number of significant digital collections and corpora of cuneiform texts online. This section briefly describes them, along with relevant literature.

85 The topic of textual variants found within various witnesses to a text and the need to develop appropriate encoding/markup models to represent them was also reported in the sections on Greek and Sanskrit, and is more fully discussed in the section on Digital Editions.



One major project to emerge recently from the CDLI is the Open Richly Annotated Cuneiform Corpus (Oracc), 86 which has utilized technology developed by the Pennsylvania Sumerian Dictionary (PSD). 87 According to its website, Oracc was created by Steve Tinney, Eleanor Robson, and Niek Veldhuis and “comprises a workspace and toolkit for the development of a complete corpus of cuneiform whose rich annotation and open licensing support the next generation of scholarly research.” In addition to CDLI and PSD, a number of other digital cuneiform projects are involved in Oracc, 88 including Assyrian Empire Builders (AEB), 89 the Digital Corpus of Cuneiform Mathematical Texts (DCCMT), 90 and the Geography of Knowledge in Assyria and Babylonia (GKAB). 91 Oracc was designed as a “corpus building cooperative” that will provide both infrastructure and technical support for “the creation of free online editions of cuneiform texts.” Since Oracc wishes to promote both open and reusable data, it recommends that all participating projects make use of Creative Commons (CC) 92 licensing, and all default Oracc projects have been placed under a CC “Attribution-Share Alike” license. Oracc was designed to complement the CDLI and allows scholars to “slice” groups of texts from the larger CDLI corpus and then study those intensively within what they have labeled “projects.” Among its various features, Oracc provides multilingual translation support; enables projects to be turned into Word files, PDFs, or books using the “ISO OpenDocument” standard; and allows data to be exported in the TEI format. Any cuneiform tablet transliterations that are created within Oracc will also be automatically uploaded to the CDLI.

The Oracc Project recognizes six major roles 93 and has developed specific documentation for each: (1) user (a scholar using the Oracc corpora); (2) builder (someone working on texts to help build up Oracc, e.g., lemmatizing or data entry); (3) manager (someone managing or administering an Oracc project); (4) developer (someone contributing code to the Oracc project); (5) system administrator; and (6) steerer (senior Oracc users). Significant documentation is freely available for all but the last two roles.

Oracc is a growing project, and researchers are invited to contribute texts to it through either a donation or curation model. Through the donation model, text editions and any additional information are simply sent to Oracc, and the project team installs, converts, and maintains them (Oracc reserves the right to perform minor editing but promises to provide proper identification and credit for all data as well as to identify all revisers of data). Through the curation model, the Oracc team helps users to set up their cuneiform texts as a separate project on the Oracc server, and the curator is then responsible for lemmatizing and maintaining their texts (this model also gives the user greater control over subsequent edits to materials). 94 Various web services assist those who are contributing corpora to Oracc.

Oracc is an excellent example of a project that supports reuse of its data through the use of CC licenses, commonly adopted technical standards, and extensive documentation as to how the data are

86 http://oracc.museum.upenn.edu/index.html

87 The Pennsylvania Sumerian Dictionary project (http://psd.museum.upenn.edu/epsd/index.html) is based at the Babylonian Section of the University of Pennsylvania Museum of Anthropology and Archaeology. In addition to their work with Oracc and the CDLI, they have collaborated with the Electronic Text Corpus of Sumerian Literature (ETCSL).

88 For the full list, see http://oracc.museum.upenn.edu/project-list.html.

89 http://www.ucl.ac.uk/sargon

90 http://oracc.museum.upenn.edu/dccmt/

91 http://oracc.museum.upenn.edu/gkab

92 Creative Commons is a “nonprofit corporation dedicated to making it easier for people to share and build upon the work of others, consistent with the rules of copyright” (http://creativecommons.org/about/) and provides free licenses and legal tools that can be used by creators of intellectual works that wish to provide various levels of reuse of their work, including attribution-only, share-alike, noncommercial, and no-derivative works.

93 http://oracc.museum.upenn.edu/doc/

94 For technical details on the curation model, see http://oracc.museum.upenn.edu/contributing.html#curation; for their extensive corpus-builder documentation, see http://oracc.museum.upenn.edu/contributing.html#curation; and for the guide to project management, see http://oracc.museum.upenn.edu/doc/manager/



created, stored, and maintained. By recognizing different roles of users and designing specific documentation for them, Oracc also illustrates the very different skills and needs of its potential users. Finally, by offering two contribution models (both of which encourage sharing and provide attribution), the Oracc Project has recognized that there may be many scholars who wish to share their data but either do not want to maintain it in the long term or lack the technical skill to do so.

Both CDLI and Oracc illustrate that while there are many digital cuneiform projects under way and thousands of digitized cuneiform tablets with both transliterations and translations online, the need to digitize thousands of cuneiform tablets and to provide long-term access to them is an ongoing challenge. The importance of considering 3-D scanning as a means of preserving cuneiform tablets has also been discussed by Kumar et al. (2003). The authors observed that cuneiform documents typically exhibit three-dimensional writing on three-dimensional surfaces, so the Digital Hammurabi 95 Project described in this article sought to create high-resolution 3-D models of tablets not only to preserve them but also to provide better access to scholars. Typically, cuneiformists have had two main techniques for representing and archiving cuneiform documents, “2D photography and hand-drawn copies, or autographs.” Many such autographs can be found in collections such as the CDLI. These autographs, however, have several disadvantages as outlined by Kumar et al., including the fact that they represent one author’s interpretation of the signs on a tablet, cannot be used for collation, and are not very useful for palaeography. The authors thus conclude that:

It is no wonder then that we are also seeing a number of recent forays into 3D surface scanning of cuneiform tablets, including by our Digital Hammurabi project. … Accurate, detailed, and efficient 3D visualization will enable the virtual “autopsy” of cuneiform tablets and will revolutionize cuneiform studies, not only by making the world’s tablet collections broadly available, but also by limiting physical contact with these valuable and unique ancient artifacts, while at the same time providing redundant archival copies of the originals (Kumar et al. 2003).

The Digital Hammurabi Project was founded in 1999 and is based at Johns Hopkins University. According to its website, this project has “pioneered basic research on digitizing ancient cuneiform tablets.” Their research has focused on solving three technological problems: (1) creating a standard computer encoding for cuneiform text; (2) creating comprehensive cuneiform collections; and (3) developing solutions for 3-D scanning and visualization of the tablets. As of this writing, the project has successfully invented a “3D surface scanner that scans cuneiform tablets at 4 times the resolution of any comparable technology,” 96 developed algorithms designed for “cuneiform tablet reconstruction and 3D visualization,” and successfully overseen the Unicode adoption of “the first international standard for the representation of cuneiform text on computers” (Cohen et al. 2004).

Cohen et al. (2004) have described some of these algorithms, the development of the encoding standard for cuneiform by the “Initiative for Cuneiform Encoding” (ICE), 97 and iClay, 98 “a cross-platform, Internet-deployable, Java applet that allows for the viewing and manipulation of 2D+ images of cuneiform tablets.” At the time ICE was formed, there was no standard computer encoding for cuneiform text and Sumerologists had to create Latin transliterations for cuneiform texts. To support “automated cuneiform text processing,” Cohen et al. stated that a “simple context-free description of the text provided by a native cuneiform computer encoding” was needed. Consequently, ICE developed a cuneiform sign repertoire that merged the three most important sign lists in the world (all unpublished), which was subsequently adopted by Unicode.

95 http://www.jhu.edu/digitalhammurabi/index.html
96 For more on this scanner, see Hahn et al. (2006)
97 http://www.jhu.edu/digitalhammurabi/ice/ice.html
98 http://www.jhu.edu/digitalhammurabi/iclay/iclayalert.html
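To make the practical effect of a native encoding concrete, the following minimal Python sketch looks up cuneiform signs in the Unicode character database that ships with Python. The two block ranges are those defined in the Unicode Standard; the helper names `cuneiform_block` and `describe` and the sample string are illustrative assumptions and are not part of the ICE or Digital Hammurabi software.

```python
import unicodedata

# Block ranges as defined in the Unicode Standard.
CUNEIFORM_BLOCKS = {
    "Cuneiform": (0x12000, 0x123FF),
    "Cuneiform Numbers and Punctuation": (0x12400, 0x1247F),
}


def cuneiform_block(ch):
    """Return the name of the cuneiform block containing ch, or None."""
    code_point = ord(ch)
    for name, (start, end) in CUNEIFORM_BLOCKS.items():
        if start <= code_point <= end:
            return name
    return None


def describe(text):
    """List (code point, character name) for each cuneiform sign in text."""
    return [
        (f"U+{ord(ch):05X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if cuneiform_block(ch) is not None
    ]


# A Latin transliteration followed by the natively encoded sign (hypothetical sample).
sample = "a \U00012000"
print(describe(sample))
```

Because each sign has its own code point, a sign can be searched for or counted directly in a text, without first agreeing on a Latin transliteration.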

Other research into assisting the effective analysis of cuneiform texts has been conducted by the Cuneiform Digital Palaeography (CDP) Project. 99 CDP is a joint research effort between an interdisciplinary team at the University of Birmingham and the British Museum that “aims to establish a detailed palaeography for the cuneiform script.” The website notes that while palaeography has long been taken for granted in other disciplines, it is in its infancy for Assyriology. This project has constructed an online database that includes digital images of individual cuneiform signs taken directly from the original sources and has used only those sources that can be dated to the reign of a particular king and are “broadly provenanced.” The CDP database can be either browsed or searched, and items that are found can be saved to a clipboard and annotated with personal notes. Users can access the database as guests or they can create a registered account.

In addition to research projects on preserving and digitizing cuneiform texts, there are a number of significant cuneiform databases and corpora that are online. 100 The Database of Neo-Sumerian Texts (BDTNS) 101 has been developed by the Centro de Ciencias Humanas y Sociales of the Consejo Superior de Investigaciones Científicas in Madrid. They have created an open database (registration is required to view the unpublished tablets) that manages over 88,000 administrative cuneiform tablets written in the Sumerian language (c. 74,000 published and 14,000 unpublished). The tablets are from the Neo-Sumerian period (c. 2100–2000 BC) and come primarily from five southern cities of Ancient Mesopotamia. A catalog for both the database and transliterations of the texts provides a variety of searching options, and full records for tablets include extensive descriptive information, original publication details, a drawing of the tablet or digital image, and a link to the CDLI (as there are records for many of the same tablets in both collections).

The Electronic Text Corpus of Sumerian Literature (ETCSL), 102 a project of the University of Oxford, provides access to nearly 400 literary compositions from the late third and early second millennia BC. 103 This corpus contains Sumerian texts that have been transliterated and also includes English prose translations 104 and bibliographic information for each text. The ETCSL can be browsed or searched and also includes an impressive list of over 700 signs that provides sign names, an image of the sign, and the ETCSL values for searching the corpus. 105

Ebeling (2007) has provided an overview of the development of this corpus and the technical challenges therein. Four features make the ETCSL different from other cuneiform projects: (1) it is a corpus of literary Sumerian texts rather than administrative Sumerian texts such as those found in the CDLI and the BDTNS; (2) it is a corpus of compositions, where many “of the individual documents in the corpus are put together from several copies, often damaged or fragmented, of the same text”; (3) it provides English translations; and (4) it is the “only corpus of any Ancient Middle Eastern language that has been tagged and lemmatized” (Ebeling 2007). These literary texts also differ from administrative texts in that they spell out the morphology in detail and provide a source for cultural and religious vocabulary.

99 http://www.cdp.bham.ac.uk/
100 In addition to the larger projects and databases discussed in this subsection, there are many smaller online exhibitions, such as “Cuneiform Tablets: From the Reign of Gudea of Lagash to Shalmanassar III” from the Library of Congress, http://international.loc.gov/intldl/cuneihtml/, and the Nineveh Tablet Collection from the British Museum, http://fincke.uni-hd.de/nineveh/
101 http://bdts.filol.csic.es/
102 http://etcsl.orinst.ox.ac.uk/
103 While the ETCSL focuses on a specific time period, a related project that appears to have just begun is the “Diachronic Corpus of Sumerian Literature” (DCSL) (http://dcsl.orinst.ox.ac.uk/), which seeks to create a “web-based corpus of Sumerian Literature spanning the entire history of Mesopotamian civilization over a range of 2500 years.”
104 English translations of cuneiform texts are fairly uncommon, but another online collection is eTACT, “Electronic Translations of Akkadian Cuneiform Texts” (http://www.etana.org/etact/). This collection (part of ETANA) provides access to translations of 28 Akkadian texts, along with full bibliographic information regarding the original cuneiform text.
105 http://etcsl.orinst.ox.ac.uk/edition2/signlist.php

The ETCSL, like the CDLI and the BDTNS, contains transliterations of Sumerian, where the original cuneiform has been converted into and is represented by a sequence of Roman characters. As noted above, it also contains English translations, and transliterations and translations are linked at the paragraph level. This supports a parallel reading of the original text and translation. In addition, the entire corpus is marked up in TEI (P4) with some extensions in order to accommodate textual variants and linguistic annotations, which, Ebeling admitted, had “sometimes stretched the descriptive apparatus to the limit.” One challenge of presenting the text and transliteration side-by-side in the ETCSL was that the transliteration was often put together from several fragmentary sources. This was solved by using a pair of tags from the TEI. The “type” attribute is used to indicate whether a variant is primary or secondary. A special format was also developed for encoding broken and damaged texts.
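The general technique of recording alternative readings in markup and distinguishing them by a “type” attribute can be sketched as follows. The example uses the `<app>` and `<rdg>` elements of the TEI critical apparatus purely for illustration; this choice is an assumption made here and may not be the pair of tags that the Oxford corpus itself adopted. The sample line, the witness sigla, and the function name `readings_by_type` are likewise invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical line of transliteration with one point of variation.
SAMPLE = """
<l n="1">
  <w>lugal</w>
  <app>
    <rdg type="primary" wit="A">e2-gal</rdg>
    <rdg type="secondary" wit="B">e2</rdg>
  </app>
</l>
"""


def readings_by_type(line_xml):
    """Group the readings in one line by the value of their type attribute."""
    root = ET.fromstring(line_xml.strip())
    grouped = {}
    for rdg in root.iter("rdg"):
        grouped.setdefault(rdg.get("type"), []).append((rdg.get("wit"), rdg.text))
    return grouped


print(readings_by_type(SAMPLE))
```

A display layer could then show the primary reading in the running text and list secondary readings, with their witnesses, on request.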

One major advance of the ETCSL for corpus studies is that the transliterations were lemmatized with an automatic morphological parser (developed by the PSD Project) and the output was then proofread. Although this process took a year, it supports lemmatized searching of the ETCSL, and when a user clicks on an individual lemma in the ETCSL it can launch a search in the PSD and vice versa. In sum, the ETCSL serves as a “diachronic, annotated, transliterated, bilingual, parallel corpus of literature or as an all-in-one corpus” (Ebeling 2007). The further development of linguistic analysis and corpus search tools for the ETCSL has also been detailed by Tablan et al. (2006):

The main aim of our work is to create a set of tools for performing automatic morphological analysis of Sumerian. This essentially entails identifying the part of speech for each word in the corpus (technically, this only involves nouns and verbs which are the only categories that are inflected), separating the lemma part from the clitics and assigning a morphological function to each of the clitics (Tablan et al. 2006).
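The three steps named in this passage (identify the part of speech, separate the lemma from the clitics, and label each clitic) can be illustrated with a deliberately small sketch. It is not the authors' GATE pipeline: the two-entry lexicon, the short table of nominal case clitics, and the function name `analyse` are assumptions chosen for illustration, and only nominal forms whose clitics are separated by hyphens in the transliteration are handled.

```python
# Toy table of nominal case clitics (illustrative, far from complete).
CASE_CLITICS = {
    "e": "ergative",
    "ra": "dative",
    "a": "locative",
    "ta": "ablative",
    "da": "comitative",
}

# Toy lexicon mapping lemmata to a part of speech (hypothetical entries).
LEXICON = {"lugal": "N", "e2-gal": "N"}


def analyse(token, lexicon=LEXICON):
    """Split a hyphenated transliteration into lemma and labelled clitics.

    Returns None when no analysis is found in the toy resources.
    """
    signs = token.split("-")
    # Prefer the longest lemma, since a lemma may itself span several signs.
    for cut in range(len(signs), 0, -1):
        lemma = "-".join(signs[:cut])
        tail = signs[cut:]
        if lemma in lexicon and all(sign in CASE_CLITICS for sign in tail):
            return {
                "lemma": lemma,
                "pos": lexicon[lemma],
                "clitics": [(sign, CASE_CLITICS[sign]) for sign in tail],
            }
    return None


for form in ("lugal-e", "e2-gal-ta", "dub"):
    print(form, analyse(form))
```

The form that returns no analysis shows why evaluation data matters: without a lexicon and a set of checked analyses, there is no way to tell a genuine gap from a parser error.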

The authors used the open-source GATE (General Architecture for Text Engineering), 106 which was developed at the University of Sheffield, and found that one of the biggest problems in evaluating the success of their methods was that they lacked a morphological gold standard for Sumerian against which to evaluate their data. Many of the challenges faced by the ETCSL thus illustrate some of the common issues faced when creating corpora for historical languages, including a lack of lexical resources and gold standard training and evaluation data, the difficulties of automatic processing, and the need to represent physically fragmented sources.

A recently started literary text project is the SEAL (Sources of Early Akkadian Literature) 107 corpus, which is composed of Akkadian (Babylonian and Assyrian) literary texts from the third and second millennia BC that were documented on cuneiform tablets. The goal of this project, which is funded by the German Israeli Foundation for Scientific Research and Development (GIF), is to “compile a complete and indexed corpus of Akkadian literary texts from the 3rd and 2nd Millennia BCE.” Its creators hope that this corpus will form the basis for both a history and a glossary of early Akkadian literature. Around 150 texts are available and they are organized by genre classifications (such as epics, hymns, and prayers), and edited texts are downloadable as PDFs. While no images of the tablets are given, those texts that are available as downloads include basic catalog information, transliterations, an English translation, a commentary, and a full bibliography. There are also various indexes to the texts, including words, deity and personal names, and geographical names (there are separate indexes for epics and incantations).

106 http://gate.ac.uk/
107 http://www.seal.uni-leipzig.de/

Another growing research project is the Persepolis Fortification Archive Project (PFAP), 108 which is based at the University of Chicago’s Oriental Institute. Archaeologists from the Institute working at Persepolis in the 1930s exposed ruins of the palaces of Darius, Xerxes, and their successors and found tens of thousands of clay tablets in a fortification wall. These tablets were administrative records produced around 500 BC. This archive is being made available for study through PFA Online. 109 Primary access to the PFA Online is provided through OCHRE (Online Cultural Heritage Research Environment) and requires the Java Runtime Environment. Since 2002, the PFA project has captured and edited almost 2,000 digital images of Elamite tablets, created and edited high resolution digital images of more than 600 Aramaic tablets among a variety of other research, 110 and has created a blog to track its progress. 111 In addition to using OCHRE, the PFAP makes digital material available through InscriptiFact and is in the process of submitting images and texts to the CDLI and to Achemenet. 112 Around 15,000 high-quality images of approximately 515 Persepolis documents are currently available through InscriptiFact, and the PFAP is adding images to this database. While InscriptiFact provides access to images and object-level metadata, access to the PFA Online through OCHRE also presents images linked to text editions (including transliterations and translations), cataloging, and analytical information.

As indicated by this overview, the challenge of working with cuneiform tablets and the languages represented by them is an area of active and growing interest. One theme also illustrated throughout the document-recognition section was the importance of creating digital editions of texts that represent variant readings and make editorial decisions transparent, two topics that receive further discussion in the next section.

Digital Editions and Text Editing

Introduction

Two of the most extensive and frequently debated questions in the field of digital classics, and indeed in all of the digital humanities, are how to build an infrastructure that supports the creation of “true” digital critical editions, and what constitutes a “digital critical edition.” Several long-standing projects serve as examples of creating digital editions for the entire corpus of an author (Chicago Homer), 113 for the selected works of an author (Homer and the Papyri, 114 Homeric Multitext 115) or for individual works by individual authors (Electronic Boethius, 116 Ovid’s Metamorphoses, 117 and the Vergil Project). 118

108 http://ochre.lib.uchicago.edu/PFA_Online/
109 http://ochre.lib.uchicago.edu/ochre6.jspproject=65801673-ad89-6757-330b-fd5926b2a685
110 http://oi.uchicago.edu/research/projects/pfa/
111 http://persepolistablets.blogspot.com/
112 http://www.achemenet.com/
113 http://www.library.northwestern.edu/homer/
114 http://chs.harvard.edu/wb/86/wo/XNM7918Nrebkz3KQV0bTHg/4.0.0.0.19.1.7.15.1.1.0.1.2.0.4.1.7.1.0.0.1.3.3.1
115 http://chs.harvard.edu/wa/pageRtn=ArticleWrapper&bdc=12&mn=1169

In their discussion of multitexts and digital editions, Blackwell and Crane (2009) offered an overview of digital editions and what an ideal digital library infrastructure might provide for them:

… digital editions are designed from the start to include images of the manuscripts, inscriptions, papyri and other source materials, not only those available when the editor is at work but those which become available even after active work on the edition has ceased. … This is possible because a true digital edition will include a machine actionable set of sigla. Even if we do not yet have an internationally recognized set of electronic identifiers for manuscripts, the print world has often produced unique names (e.g., LIBRARY + NUMBER) that can later be converted into whatever standard identifiers appear. A mature digital library system managing the digital edition will understand the list of witnesses and automatically search for digital exemplars of these witnesses, associating them with the digital edition if and when they come on-line (Blackwell and Crane 2009).

They stated that digital editions need to include images of all their primary sources of data, and that “true” digital editions will be dynamic and will automatically search for digital facsimiles of manuscript witnesses as they become available, provided standard sets of machine-actionable identifiers are used.
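The behavior that Blackwell and Crane describe, in which an edition's list of witnesses is checked against a growing catalogue of facsimiles, can be sketched in a few lines. This is only an illustration under stated assumptions: the `Witness` class, the function `link_new_facsimiles`, the placeholder shelfmarks in LIBRARY + NUMBER form, and the example.org addresses are all hypothetical, and a plain dictionary stands in for whatever digital library service would actually be queried.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Witness:
    siglum: str       # the editor's short label for the witness
    shelfmark: str    # LIBRARY + NUMBER style identifier
    facsimiles: List[str] = field(default_factory=list)


def link_new_facsimiles(witnesses: List[Witness],
                        catalogue: Dict[str, List[str]]) -> List[str]:
    """Attach facsimiles not yet linked; return sigla that gained images."""
    updated = []
    for witness in witnesses:
        gained = False
        for url in catalogue.get(witness.shelfmark, []):
            if url not in witness.facsimiles:
                witness.facsimiles.append(url)
                gained = True
        if gained:
            updated.append(witness.siglum)
    return updated


# Hypothetical edition with two witnesses, only one of them digitized so far.
witnesses = [Witness("A", "EXAMPLE LIBRARY 1"), Witness("B", "EXAMPLE LIBRARY 2")]
catalogue = {"EXAMPLE LIBRARY 1": ["https://example.org/facsimiles/1"]}
print(link_new_facsimiles(witnesses, catalogue))

# Later, a facsimile of the second witness comes online.
catalogue["EXAMPLE LIBRARY 2"] = ["https://example.org/facsimiles/2"]
print(link_new_facsimiles(witnesses, catalogue))
```

The sketch depends entirely on the shelfmark serving as a stable key, which is the point of the authors' proviso about machine-actionable identifiers.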

Whether in a print or digital infrastructure, Borgman has also noted the continuing importance of access to various editions of a text, commenting that the humanities scholar or student “also makes the finest distinctions among editions, printings, and other variants – distinctions that are sometimes overlooked in the transition from print to digital form” (Borgman 2009). This section provides an overview of some of the theoretical and technical issues involved in creating a cyberinfrastructure for digital editions in classics and beyond.

Theoretical Issues of Modeling and Markup for Digital Editions

Creating a critical edition in classics typically involves the consultation of a variety of sources (manuscripts, printed editions, etc.) where editors seek to reconstruct the Urtext (original text, best text, etc.) of a work as “originally” created by an ancient author, while at the same time creating an apparatus criticus that records all major variants found in the sources and the reasons they chose to reconstruct the text as they did. The complicated reality of textual variants among manuscript and other witnesses of a text is one of the major reasons behind the development of modern printed critical editions, as stated by Paolo Monella:

Critical editions, i.e., editions of texts with a text-critical apparatus, respond to the necessity of representing one aspect of the complex reality of textual tradition: the textual variance. Their function is double: on the one hand, they present the different versions of a text within the context of the textual tradition; on the other hand, they try to ‘extract’, out of the different texts born by many carriers (manuscripts, incunabula, modern and contemporary print editions), a reconstructed Text, the closest possible to the ‘original’ one prior to its ‘corruption’ due to the very process of textual tradition, thus ideally recovering the intentio auctoris (Monella 2008).

116 http://beowulf.engl.uky.edu/~kiernan/eBoethius/inlad.htm

117 http://etext.lib.virginia.edu/latin/ovid/

118 http://vergil.classics.upenn.edu/



Digital critical editions, however, offer a number of advantages over their print counterparts, according to Monella, the most important of which is the ability to better present textual variance in detail (for example, by linking critical editions to the sources of variants, such as transcriptions and images of manuscripts). Two other benefits of digital critical editions are that they enable the reader to verify and question the work of an editor and allow scholars to build an “open” model of the text where the version presented by any one editor is not considered to be “the” text.

At the same time, Monella noted that most original sources (whether manuscript or printed), in addition to including a “main text” of an ancient author, also included a variety of “paratexts” that commented on it, such as interlinear annotations, glosses, scholia, 119 footnotes, commentaries, and introductions. Scholia, in particular, were often considered so important and were also so vast that the scholia on a number of major authors have appeared in their own editions. 120 To represent the complicated nature of such sources, Monella proposed a model for “document-based digital critical editions” that includes both main texts and paratexts as they appear in different individual sources:

Such a model should include both the maintexts and the paratexts of each source, expressing explicitly the relation between single portions of each paratext and the precise portions of main text to which they refer. This implies that, rather than a traditional edition of scholia, it would be both an edition of the text and of its ancient (and modern) commentaries—and the relationships between the text and its commentaries (Monella 2008).

This model for digital critical editions then includes the need to publish each “main text” (e.g., each “reconstructed” text of an ancient author in an individual witness/source) with all of its paratexts such as scholia. Nonetheless, Monella admitted that developing a markup strategy that supports linking each paratext to the exact portion of the main text it refers to is very difficult, and this has led to the development of a number of project-specific markup strategies as well as to debates over what level of “paratextuality” should be marked up in the transcriptions. Developing project-specific markup is to be avoided whenever possible, Monella insisted, and the raw input data (typically manuscript transcriptions in this case) should be based on existing standards so that the data can be reused by other projects.

Monella ultimately recommended a fairly complicated model of transcription and markup that clearly separates the roles of transcriber and editor. Transcribers who create primary-source transcriptions must confine themselves to encoding “information neutral with regards to the paratextuality levels” or else only append such information to any necessary elements with an “interpretative” attribute. An editor, who is assumed to be working interactively with a specific software tool, takes this transcription and assigns paratextuality levels to pertinent places in the transcriptions, generates an Alignment-Text of all the main texts in the transcriptions, and stores the linking information necessary to align the main text Alignment-Text with all of the different paratexts. The next phase is to create custom software that

119 As defined by the Oxford Dictionary of the Classical World, “Scholia are notes on a text, normally substantial sets of explanatory and critical notes written in the margin or between the lines of manuscripts. Many of them go back to ancient commentaries (which might fill volumes of their own). Scholia result from excerption, abbreviation, and conflation, brought about partly by readers' needs and partly by lack of space.” Oxford Dictionary of the Classical World. Ed. John Roberts. Oxford University Press, 2007. Oxford Reference Online. Oxford University Press. Tufts University. 19 May 2010 http://www.oxfordreference.com/views/ENTRY.htmlsubview=Main&entry=t180.e1984

120 For example, one recently released project created by Donald Mastronarde, professor of classics at the University of California, Berkeley, the “Euripides Scholia Demonstration” presents a new open-access digital edition of all the scholia on the plays of Euripides that were found on more than 29 manuscripts and printed in 10 different editions. For a distributed-editing approach to scholia, a new project, Scholiastae (http://www.scholiastae.org/scholia/Main_Page), has extended MediaWiki with easier word and phrase annotation in order to support individuals who wish to create their own scholia online for public domain classical texts.



can use these objects to support dynamic and customizable access for readers to both the literary work (main text) and its different commentary (paratexts).
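The separation of roles that Monella describes can be illustrated with a small data-model sketch. The Python below is only one possible reading of that workflow: the class names (`Segment`, `Source`, `Link`), the level labels "main" and "paratext", and the toy gloss are assumptions introduced here for illustration, not part of Monella's proposal or of any existing markup standard. With a single source the alignment text is trivially that source's main text; aligning several witnesses would need a real collation step that this sketch omits.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Segment:
    """One stretch of transcribed text from a source."""
    seg_id: str
    text: str
    # The transcriber leaves this unset; the editor later assigns
    # "main" or "paratext" (hypothetical labels).
    level: Optional[str] = None


@dataclass
class Source:
    """A single witness (manuscript or printed edition)."""
    siglum: str
    segments: List[Segment] = field(default_factory=list)

    def main_tokens(self) -> List[str]:
        """Words of every segment the editor marked as main text."""
        return [w for s in self.segments if s.level == "main" for w in s.text.split()]


@dataclass
class Link:
    """Editor-supplied link from a paratext segment to a token range."""
    paratext_id: str
    start: int  # first alignment-text token, inclusive
    end: int    # last alignment-text token, exclusive


def paratexts_for(source: Source, links: List[Link],
                  alignment: List[str]) -> List[Tuple[str, str]]:
    """Pair each linked paratext with the main-text words it comments on."""
    by_id = {s.seg_id: s for s in source.segments}
    pairs = []
    for link in links:
        seg = by_id.get(link.paratext_id)
        if seg is not None and seg.level == "paratext":
            pairs.append((" ".join(alignment[link.start:link.end]), seg.text))
    return pairs


# Transcriber's output: segments with no paratextuality level assigned.
ms_a = Source("A", [Segment("a1", "arma virumque cano"),
                    Segment("a2", "id est bella")])  # invented gloss

# Editor's pass: assign levels, build the alignment text, store links.
ms_a.segments[0].level = "main"
ms_a.segments[1].level = "paratext"
alignment = ms_a.main_tokens()  # trivial here: only one source
links = [Link("a2", 0, 1)]      # the gloss is linked to the first word

print(paratexts_for(ms_a, links, alignment))
```

Keeping the links outside the transcription, as stand-off records, is a design choice made for this sketch: it leaves the transcriber's data untouched, so another project could reuse the transcription with a different set of editorial links.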

The model of open-source critical editions (OSCE) that has recently been described by Bodard and Garcés (2009) supported many of the conclusions reached by Monella, particularly the need to make all the critical decisions of a scholarly editor transparent to the reader and the importance of better representing the complicated nature of primary textual sources, variants, and their textual transmission. The term OSCE and its definition grew out of a meeting of scholars in the digital classics community, who met in 2006 to discuss the needs of new digital critical editions in Greek and Latin:

Our proposal is that Classical scholarship should recognize OSCEs as a deeper, richer and potentially different kind of publication from printed editions of texts, or even from digitized and open content online editions. OSCEs are more than merely the final representations of finished work; in their essence they involve the distribution of raw data, of scholarly tradition, of decision-making processes, and of the tools and applications that were used in reaching these conclusions (Bodard and Garcés 2009, 84-85).

In sum, OSCEs are a new form of digital edition, in that from the very beginning, they should be designed to include access to all the raw data (page images, transcriptions), previous editions, and scholarship on which the edition is based, as well as any algorithms or tools used in its creation. This argument illustrates the important need for all digital scholarship (whether it be the creation of an archaeological reconstruction or a digital edition) to be recognized as an interpretative act.

Another significant issue addressed by Bodard and Garcés is the continuing restriction of copyright. While they recognize that an apparatus criticus deserves copyright protection, they argue that there should be no restrictions on the public domain text of a classical author. OSCEs also highlight the changing nature of traditional scholarship and the creation of editions, by declaring that the database or XML text behind an edition may be in many ways far more important than the scholarly prose that accompanies it:

In the digital age, however, there is more to scholarship than simply abstract ideas expressed in elegant rhetoric and language; sometimes the most essential part of an academic work is precisely the actual words and codes used in the expression of that work. … A database or XML-encoded text is not merely an abstract idea, it is itself both the scholarly expression of research and the raw data upon which that research is based, and which must form the basis of any derivative research that attempts to reproduce or refute its conclusions (Bodard and Garcés 2009, 87).

In other words, the data upon which an edition’s conclusions are founded are what scholars need more than anything else; thus, not only the text but also all the data and the tools that produce it must be open source. The authors also observed through their brief history of the development of critical editions in the nineteenth and twentieth centuries that the apparatus criticus itself embodies a long tradition of contributions by many scholars, and that all criticism is in the end a “communal enterprise.” In fact, the authors asserted that text editions should be seen as “fully critical” only if all “interpretative decisions that led to the text” were made both transparent and accessible. 121

121 Similar arguments have been made by Henriette Roued-Cunliffe in regard to creating a reading-support system for papyrologists that models and records their interpretative processes as they create an edition of a text: “It is very important that this evaluation of the evidence for and against the different readings is conducted. However, the commentary only presents the conclusions of this exercise. It would be a great aid, both for editors as they go through



While these requirements may seem onerous for the creation of print editions, Bodard and Garcés argued that digital editions could far more easily meet all these demands. In addition to providing all the data and interpretative decisions, however, these authors recommended formalizing critical editions into a machine-readable format. In sum, an OSCE provides access to an open text, to the data and software used in making an edition, and to the editorial interventions made and the scholarship behind the decisions.

The authors also listed another major advantage of digital editions, namely, the ability to get back to the materiality of actual manuscripts and move away from the “ideal” of reconstructing an Urtext, as previously discussed by Ruhleder (1995) and Bolter (1991), to focus instead on textual transmission. Bodard and Garcés posited that papyrologists better understand the nature of the transcription process and how creating a text is an editorial process, where there is typically not just one correct reading. Scholarship has increasingly challenged the idea that an editor can ever get back to the “original text” of an author, and Bodard and Garcés stressed that attention would be better focused on how to present a text with multiple manuscript witnesses to a reader in a digital environment:

Digital editions may stimulate our critical engagement with such crucial textual debate. They may push the classic definition of the ‘edition’ by not only offering a presentational publication layer but also by allowing access to the underlying encoding of the repository or database beneath. Indeed, an editor need not make any authoritative decisions that supersede all alternative readings if all possibilities can be unambiguously reconstructed from the base manuscript data, although most would in practice probably want to privilege their favoured readings in some way. The critical edition, with sources fully incorporated, would potentially provide an interactive resource that assists the user in creating virtual research environments (Bodard and Garcés 2009, 96).

Thus, the authors hoped that digital or virtual research environments would support the creation of “ideal” digital editions where the editor does not have to decide on a “best text” since all editorial decisions could be linked to their base data (e.g., manuscript images, diplomatic transcriptions). They also argued that the creation of such ideal digital editions must be a collaborative enterprise, where all modifications and changes are made explicit, are attributed to individuals, and are both citable and permanent. Bodard and Garcés concluded that future research should examine what methodologies and technology are necessary to make this vision a reality. A key part of this research will be to explore the relationship between OSCEs and the materials found in massive digital collections and million-book libraries with little or no markup. OSCEs and other small curated collections, Bodard and Garcés insisted, could be used as training data to enrich larger collections, a theme to which we return in the discussion of classics and cyberinfrastructure.
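The idea that every reading can be reconstructed from the base data, while editorial choices remain explicit and attributed, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the format proposed by Bodard and Garcés: the names `Unit`, `Decision`, `witness_text`, and `edited_text` are hypothetical, and the witnesses, readings, and rationale are invented placeholder data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Decision:
    """An attributed editorial choice for one place in the text."""
    editor: str
    preferred: str   # siglum of the witness whose reading is preferred
    rationale: str


@dataclass
class Unit:
    """One place in the text, with every witness's reading recorded."""
    readings: Dict[str, str]                      # siglum -> reading
    decisions: List[Decision] = field(default_factory=list)


def witness_text(units: List[Unit], siglum: str) -> str:
    """Rebuild the text exactly as one witness transmits it."""
    return " ".join(u.readings[siglum] for u in units)


def edited_text(units: List[Unit], editor: str, fallback: str) -> str:
    """Rebuild one editor's text; where that editor recorded no
    decision, the reading of the fallback witness is used."""
    words = []
    for u in units:
        choice = next((d for d in u.decisions if d.editor == editor), None)
        words.append(u.readings[choice.preferred if choice else fallback])
    return " ".join(words)


# Invented placeholder data: two witnesses, one point of variation.
units = [
    Unit({"A": "the", "B": "the"}),
    Unit({"A": "quick", "B": "quik"},
         [Decision("editor1", "A", "B shows a likely scribal slip")]),
    Unit({"A": "fox", "B": "fox"}),
]

print(witness_text(units, "B"))            # text as transmitted by B
print(edited_text(units, "editor1", "B"))  # editor1's preferred text
```

Because no reading is ever overwritten, a second editor could add a competing `Decision` to the same unit without disturbing the first, which is one way to keep the text “open” in the sense described above.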

One project that embodies some of the principles discussed here is the Homer Multitext Project (HMT), 122 defined by Blackwell and Smith (2009) as an “effort to bring together a comprehensive record of the Homeric tradition in a digital library.” The website for the HMT is hosted by the Center for Hellenic Studies and offers free access to a library of text transcriptions and images of Homeric manuscripts, with its major component being digital images of the tenth-century manuscript of the Iliad known as the Venetus A from the Marciana Library in Venice. According to the website:

this process, and for future editors of the same text, if it were possible to present this evaluation for each character and word in a structured format” (Roued-Cunliffe 2010).

122 http://chs.harvard.edu/wa/pageRtn=ArticleWrapper&bdc=12&mn=1169. While the discussion here focuses largely on the technical architecture and innovative approach to digital editions of the HMT, a long-term view of the scholarly potential and future of the project has recently been offered by Nagy (2010).



This manuscript, the oldest and best, on which all modern editions are ultimately based, contains in its marginal commentaries a wealth of information about the history of the text. These commentaries in the margin, or scholia, derive mainly from the work of scholars at the Library at Alexandria in Egypt during the Hellenistic and Roman eras. It has been a central goal of the project to obtain and publish high-resolution digital images of the manuscript, together with an electronic edition of the Greek text of the scholia along with an English translation. 123

In order to represent such a complicated manuscript, Blackwell and Smith reported that “the HMT has focused not on building a single-purpose application to support a particular theoretical approach, but on defining a long-term generic digital library expressly intended to encourage reuse of its contents, services, and tools.”

As was earlier observed by Bodard and Garcés, the twentieth century began a movement from scholarship based on manuscripts toward the creation of critical editions. Blackwell and Smith listed some of the most prominent Homeric editions, including editions of the scholia, and argued that they all suffer from the same major flaw: they are works based on selection, in which the main goal of the editors was to present a unified text that represented their best judgment, with the result that many scholia or variants were excluded. Recent changes in Homeric textual criticism and the development of digital technology, however, have allowed these questions of textual variation and types of commentary to be revisited:

The very existence of variation in the text has become a matter of historical interest (rather than a problem to be removed). The precise relationship between text and commentary, as expressed on the pages of individual manuscripts, hold promise to shed light on the tradition that preserved these texts, the nature of the texts in antiquity, and therefore their fundamental nature. … The best scholarly environment for addressing these questions would be a digital library of facsimiles and accompanying diplomatic editions. This library should also be supplemented by other texts of related interest such as non Homeric texts that include relevant comments and quotations and other collections of data and indices. Thus our focus on both collection of data and on building a scalable, technologically agnostic, infrastructure for publishing collections of data, images, texts, and extensions to these types (Blackwell and Smith 2009).

Thus, the HMT stresses the importance of providing access to the original data used to create critical editions, such as manuscript images; the need for diplomatic transcriptions that can be reused; and the value of a related body of textual and other material that helps place these manuscripts more firmly in their historical context and better supports the exploration of their textual transmission. The HMT makes extensive use of the CTS protocol and its related Reference Indexing service, both of which exist as “JavaServlets and Python applications running in the Google AppEngine.” The HMT has also developed PanDect, its own web-based interface to its library. 124

The HMT’s inclusion of scholia (and work done by the related Homer and the Papyri projects) 125 also highlights the fact that manuscripts included many “paratexts” as previously described by Monella (2008). In fact, Blackwell and Smith noted that the Venetus A, like many Byzantine codices, contained

123 http://chs.harvard.edu/wa/pageRtn=ArticleWrapper&bdc=12&mn=1381

124 http://pandect.sourceforge.net/

125 http://chs.harvard.edu/wb/93/wo/1DSibj9gCANI20WSihcJNM/0.0.0.0.19.1.7.15.1.1.0.1.2.0.4.1.3.3.1



many discrete texts, including a copy of Proclus’s Chrestomathy, the Iliad, summaries of books of the Iliad, four different scholiastic texts, and later notes. In these authors’ system of text linking and retrieval through abstract citation, all of these contents are described as separate texts and the CTS protocol is used to refer to the structure of each text. Indexes are then created that associate these texts with digital images of the folio sides that make up the manuscript. 126 Ultimately, Blackwell and Smith convincingly argued that this approach to primary sources, which favors diplomatic or facsimile editions and simple indexing over complicated markup, would support the greatest possible amount of reuse for the data, positing that it “requires less knowledge to integrate texts with simple markup and simple, documented indices, than to disaggregate an elaborately marked up texts that embeds links to other digital objects.”
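The preference for simple, documented indices over embedded links can be illustrated with a minimal sketch. It is not the HMT's actual data model or the CTS protocol itself: the citation strings, folio labels, and the function name `build_indices` are all hypothetical, chosen only to show how a flat list of (citation, folio) pairs kept apart from the texts can be turned into lookups in either direction.

```python
from collections import defaultdict

# Hypothetical identifiers, for illustration only: each passage is cited
# abstractly as "<text>:<reference>", and each folio side has its own label.
# Neither scheme is taken from the HMT or from the CTS specification.
INDEX_ROWS = [
    ("iliad:1.1", "folio-12r"),
    ("iliad:1.2", "folio-12r"),
    ("scholia-main:1.1", "folio-12r"),
    ("iliad:1.26", "folio-12v"),
]


def build_indices(rows):
    """Build lookups in both directions from flat (citation, folio) pairs."""
    folio_for = {}                    # citation -> folio side showing it
    passages_on = defaultdict(list)   # folio side -> citations found on it
    for citation, folio in rows:
        folio_for[citation] = folio
        passages_on[folio].append(citation)
    return folio_for, dict(passages_on)


folio_for, passages_on = build_indices(INDEX_ROWS)
print(folio_for["iliad:1.26"])
print(passages_on["folio-12r"])
```

Because the index lives outside the transcriptions, the same texts could in principle be paired with a different set of images, or the same images with a different set of texts, without touching either.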

While all primary sources might benefit from such an infrastructure, the Homeric poems in particular require one, according to Dué and Ebbott (2009), largely because of the complicated oral-performance tradition that created the poems. Traditional printed editions of the poems with a main text and an apparatus that records all alternative interpretations, they argued, create the misleading impression that there is “one” correct text and then there is everything else. Dué and Ebbott also criticized the fact that a critical apparatus could be deciphered only by a specialist audience with years of training. The digital medium can be used to better represent the nature of the Homeric texts, however, for as they contended:

The Homeric epics were composed again and again in performance: the digital medium, which can more readily handle multiple texts, is therefore eminently suitable for a critical edition of Homeric poetry—indeed, the fullest realization of a critical edition of Homer may require a digital medium (Dué and Ebbott 2009).

According to their argument that the Homeric poems were composed anew in every performance, there is no single original author’s composition or text to attempt to reconstruct. They proposed that the variations found in different manuscript witnesses are not necessarily copying “errors,” and that in fact the traditional term “variants” as used by textual critics is not appropriate for the compositional process used to create the poems. Dué and Ebbott asserted that the digital medium could support a superior form of textual criticism for these epics:

Textual criticism as practiced is predicated on selection and “correction” as it creates the fiction of a singular text. The digital criticism we are proposing for the Homer Multitext maintains the integrity of each witness to allow for continual and dynamic comparison, better reflecting the multiplicity of the textual record and of the oral tradition in which these epics were created (Dué and Ebbott 2009).

Instead of the term “variants,” they coined the term “performance multiforms” to describe the variations found in the manuscript witnesses. Standard printed critical editions could never reflect the complexity of such multiforms, Dué and Ebbott submitted, but they insisted that a digital “edition” such as the HMT supports the representation of various manuscript witnesses and could more clearly indicate where variations occur, since no “main text” must be selected for presentation as with the printed page.
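One way to picture a presentation with no privileged “main text” is the following sketch, which treats every witness equally and reports the lines where their readings differ. It is an illustration under stated assumptions, not the HMT's method: the witness names, line references, and placeholder readings are invented, and the function name `find_multiforms` is hypothetical.

```python
def find_multiforms(witnesses):
    """Return {line_ref: {reading: [witness, ...]}} for lines whose readings differ.

    No witness is used as a base text: every reading is grouped with the
    witnesses that attest it, and a line is reported only if more than one
    distinct reading exists.
    """
    by_line = {}
    for name, lines in witnesses.items():
        for ref, reading in lines.items():
            by_line.setdefault(ref, {}).setdefault(reading, []).append(name)
    return {ref: readings for ref, readings in by_line.items() if len(readings) > 1}


# Placeholder data only: "W1".."W3" and the readings are not real witnesses
# or real Homeric text. A fragmentary witness (W3) may lack some lines.
witnesses = {
    "W1": {"1.1": "alpha beta gamma", "1.2": "delta epsilon"},
    "W2": {"1.1": "alpha beta gamma", "1.2": "delta zeta"},
    "W3": {"1.2": "delta epsilon"},
}

multiforms = find_multiforms(witnesses)
for ref, readings in multiforms.items():
    for reading, names in readings.items():
        print(ref, repr(reading), "attested in", ", ".join(names))
```

The sketch compares whole lines by exact string match, which is far cruder than real collation; it is meant only to show that grouping readings by attestation does not require choosing one witness as the reference.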

Yet another advantage of the multitextual approach, Dué and Ebbott contended, is that it can be far more explicit about the many channels of textual transmission. For example, many quotations of Homer in Plato and the Attic orators, as well as fragmentary papyri, are quite different from the medieval texts that served as the basis for all modern printed editions of Homer. A multitext approach

126 For detail on these text structures, data models, and technical implementation, see Smith (2010).



allows each of these channels to be placed in a historical or cultural framework that can help the reader better understand how they vary, rather than in an apparatus that often obfuscates these differences.

Nonetheless, Dué and Ebbott acknowledged that building a multitext that moves from a “static perception to a dynamic presentation” and attempts to present all manuscript witnesses to a reader without an intervening apparatus faces a number of technical challenges, including how to highlight multiforms so they are easy to find and compare, and how to display hexameter lines (the unit of composition in the Homeric epic) as parts of whole texts rather than just pointing out the differences (as in an apparatus). While these issues are still being worked out, the authors concluded that three main principles drive their ongoing work: collaboration, open access, and interoperability.

Similar criticism of modern critical editions and their inability to accurately represent the manuscript tradition of texts has been offered by Stephen Nichols. Nichols stated that the modern editorial practice of attempting to faithfully reconstruct a text as the original author intended it has little to do with the “reality of medieval literary practice” and is instead an “artefact of analogue scholarship” where the limitations of the printed page required editors to choose a base manuscript to transcribe and to banish all interesting variants from other manuscripts to the apparatus (Nichols 2009). He also observed that there was very little interest in providing access to original manuscripts, as many scholars considered the scribes who produced them to have introduced both copying errors and their own thoughts and thus to have “corrupted” the original text. The advent of digital technology, however, Nichols concluded, had produced new opportunities for studying literary production:

The Internet has altered the equation by making possible the study of literary works in their original configurations. We can now understand that manuscripts designed and produced by scribes and artists—often long after the death of the original poet—have a life of their own. It was not that scribes were ‘incapable’ of copying texts word-for-word, but rather that this was not what their culture demanded of them. This is but one of the reasons why the story of medieval manuscripts is both so fascinating, and so very different from the one we are accustomed to hearing. But it requires rethinking concepts as fundamental as authorship, for example. Confronted with over 150 versions of the work, no two quite alike, what becomes of the concept of authorial control? And how can one assert with certainty which of the 150 or so versions is the ‘correct’ one, or even whether such a concept even makes sense in a pre-print culture? (Nichols 2009).

Thus, the digitization of manuscripts and the creation of digital critical editions have not only provided new opportunities for textual criticism but also might even be viewed as enabling a type of criticism that better respects the traditions of the texts or objects of analysis themselves.

While Monella (2008), Bodard and Garcés (2009), and Dué and Ebbott (2009) focused largely on the utility of digital editions for philological study and textual criticism, Notis Toufexis has recently argued that digital editions are central to the work of historical linguistics as well. As he explained, historical linguistics “examines and evaluates the appearance of new—that is changed—linguistic forms next to old (unchanged) ones in the same text or in texts of the same date and/or geographical evidence” (Toufexis 2010, 111). Like Stephen Nichols, Toufexis criticized modern critical editions for creating a far simpler linguistic picture than was actually the case within medieval manuscripts. He described how scribes might have unconsciously used newer forms of language and not copied the old forms found in a manuscript or how they might have made specific decisions to use older forms as a stylistic choice to elevate the register of the text. For these reasons, the inclusion of all text variants in the apparatus criticus is necessary not just for philologists but also for historical linguists who wish to examine how linguistic features have changed across historical corpora. Toufexis argued that digital editions could thus solve the problems of both philologists and historical linguists:

A technology-based approach can help us resolve this conflict: in a digital environment ‘economy of space’ is no longer an issue. By lifting the constraints of printed editions, a digital edition can serve the needs of both philologists and historical linguistics (or for that matter any other scholar who has an interest in approaching ancient texts). A ‘plural’ representation of ancient texts in digital form, especially those transmitted in ‘fluid’ form, is today a perfectly viable alternative to a printed edition. Only a few years ago such a digital endeavor seemed technologically impossible or something reserved for the very few computer-literate editors (Toufexis 2010, 114-115).

Even if critical editors could not be convinced to change the way they edited texts, Toufexis hoped that most of the problems of critical editions could at least be ameliorated by being transposed to a digital medium, because digital editions could make editorial choices transparent by linking the apparatus criticus to the electronic text and could be accompanied by digital images of manuscript witnesses. Digital editions, Toufexis argued, were ultimately far better for readers as well, because “a pluralistic digital edition encourages readers to approach all transmitted texts equally, even if one text is highlighted among the many texts included in the edition” (Toufexis 2010, 117-118).

New Models of Collaboration, Tools, and Frameworks for Digital Editions

Digital tools create new opportunities for textual editing and the creation of digital editions, and one key area of opportunity is the ability to support new types of collaboration. Tobias Blanke has recently suggested that “traditional humanities activities such as the creation of critical editions could benefit from the collaboration in research enabled by new infrastructures that e-Science promises to deliver” (Blanke 2010). Indeed, Peter Robinson has argued that the single greatest shift in editing practice brought about by the digital world is “that it is creating new models of collaboration: it changes who we collaborate with, how we collaborate, and what we mean by collaboration” (Robinson 2010). Robinson convincingly argued that the first digital editions did not challenge the traditional editorial model, where a single editor frequently gathered materials and made all final editing decisions, even if he or she had a number of partners in terms of publishing. In the digital world, Robinson proposed, a new model is possible. In such a model, libraries could put up images of manuscripts; various scholars, students, or experts could make transcriptions that link to these images; other scholars could collate these transcriptions and publish editions online linking to both the transcriptions and images; still other scholars could analyze these collations and create an apparatus or commentaries; and yet other scholars could then link to these commentaries. All these activities could occur independently or together.

Robinson granted that more traditional, single-editor-controlled editions can be made in the digital world. He argued, however, that such editions are far too expensive, particularly since an editor cannot simply present samples online but also needs to provide access to all images and transcriptions of the text. The single-editor model, he believed, would also lead inevitably to the creation of only a limited number of digital editions of major works by well-studied authors. Robinson reported that there had been a scholarly backlash against the creation of such high-profile and expensive digital editions in the past few years, where the divide was largely between those with access to expensive tools and those without. He concluded that this backlash directly contributed to the closing of the Arts and Humanities Data Service (AHDS).



In this new digital world, where there is an endless amount of editing to be done, Robinson urged humanists to actively guide the building of the necessary infrastructure by providing tools, examples of good practice, and key parts of their own editions, since, after all, it is in their own best interest as well to have a say in any infrastructure designed for them. He stressed, however, that “there is not and will not be a Wikipedia for editions, nor indeed will there ever be any one tool or software environment which does everything for everyone. What there might be is a set of tools and resources, built on agreed naming conventions” (Robinson 2010). He thus argued for basic standards and naming conventions but against any massive universal infrastructure. Robinson defined what he was calling for as “distributed, dynamic and collaborative editions” (Robinson 2009) and argued that:

The concept is not that there is a single system, a single set of software tools, which everybody uses. Instead, across the web we have a federation of separate but co-operating resources, all within different systems, but all interlinked so that to any user anywhere it appears as if they were all on the one server (Robinson 2009).

In fact, Robinson expressed frustration that most funding agencies were obsessed with what he considered to be “Grand Single Solutions.” He saw little utility in projects such as SEASR or Bamboo, and summed up his opinion thus: “Let me say this clearly, as most scholars seem afraid to say it: projects like these are vast wastes of time, effort and money” (Robinson 2009). He argued that the future did not lie with massive, single-purpose infrastructure, but instead with projects such as Interedition 127 and the Virtual Manuscript Room 128 that are seeking not only to put more resources such as manuscript images online but also to link together disparate parts (images, tools, transcriptions) that are already online, rather than trying to force them all into one new infrastructure.

The Interedition Project is seeking to create an “interoperable supranational infrastructure for digital editions” across Europe that will promote interoperability of the tools and methodologies used in the field of digital textual editing. Still in its early stages, this project is working on a tool called CollateX, 129 “a java library for collating textual sources” that is the latest version of the tool Collate. Collate was created by Peter Robinson to support the collation of multiple versions of an electronic text in order to create scholarly editions, and its functionalities are being enhanced in CollateX (Robinson 2009).

In his discussion of Collate, Robinson emphasized that developers of scholarly collation tools should recognize two essential facts, the first and foremost of which is that “scholarly collation is not Diff” and that any attempt to build a collation program that meets all scholars’ needs will meet with failure. He listed two major reasons for this. The first is that while automated programs can easily identify differences between texts, these differences are not necessarily variants; the second is that teaching a machine to determine the best way to present “any given sequence of variants, at any particular moment” would be a monumental task. One of the features, Robinson argued, that allowed Collate to receive a reasonable measure of uptake was that it “allows the scholar to fix the collation exactly as he or she wants” (Robinson 2009).
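Robinson's first reason can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not how Collate or CollateX works: it uses the standard library's `difflib` to list every word-level difference between two witnesses, then applies a hypothetical editorial rule (`normalize`, which ignores case and the u/v spelling alternation) to show that only some of those differences would count as variants for a given editor. The sample readings and the rule itself are invented for the example.

```python
from difflib import SequenceMatcher


def raw_differences(base, witness):
    """Return (base_words, witness_words) pairs wherever the token lists differ."""
    a, b = base.split(), witness.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def normalize(word):
    # Hypothetical editorial rule: ignore case and the u/v spelling alternation.
    return word.lower().replace("v", "u")


def substantive_variants(base, witness):
    """Keep only the differences that survive the editor's normalization rule."""
    return [
        (old, new)
        for old, new in raw_differences(base, witness)
        if [normalize(w) for w in old] != [normalize(w) for w in new]
    ]


# Invented sample witnesses: one purely orthographic difference, one substantive.
base = "arma uirumque cano Troiae qui primus ab oris"
witness = "arma virumque cano Troiae qui primis ab oris"

print(raw_differences(base, witness))       # every mechanical difference
print(substantive_variants(base, witness))  # what this editor treats as a variant
```

A different editor might well choose a different `normalize` rule, which is the point of letting the scholar, rather than the program, decide what the collation records.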

127 http://www.interedition.eu/

128 http://vmr.bham.ac.uk/

129 https://launchpad.net/collatex



The second fact that Robinson felt developers should recognize was that “collation is more than visualization.” While many collation programs can beautifully show variation, Robinson acknowledged, they can present it in one format only. For this reason, he designed Collate differently:

Again, one thing I did right with Collate, right back in the very beginning, was that I did not design the program so it could produce just one output. I designed it so that you could generate what you might call an intelligent output. Essentially, this distinguished all the different components of an apparatus - the lemma, the variant, the witness sigil, and much more - and allowed you to specify what might appear before and after each component, and in what order the various components might appear (Robinson 2009).

Collate allows scholars to output the apparatus in forms ready for processing by various analysis programs. At the same time, it has a number of issues that have required the development of CollateX, including difficulties handling transpositions and, even more critically, the inability to support collaborative work.
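The idea of an output built from separately addressable components can be sketched as follows. This is a hypothetical Python illustration, not Collate's actual interface: the `Reading` class, the `render` function, and its `order`, `before`, and `after` parameters are all assumptions made for the example. It shows how one stored reading could be emitted in a print-style form or in a tagged form simply by changing what surrounds each component and the order in which components appear.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One apparatus entry, kept as separate components (hypothetical model)."""
    lemma: str
    variant: str
    sigla: list


def render(reading, order, before=None, after=None):
    """Render an entry; `order` lists component names, and `before`/`after`
    map a component name to the text placed around that component."""
    before = before or {}
    after = after or {}
    values = {
        "lemma": reading.lemma,
        "variant": reading.variant,
        "sigla": " ".join(reading.sigla),
    }
    return "".join(
        before.get(name, "") + values[name] + after.get(name, "")
        for name in order
    )


entry = Reading(lemma="primus", variant="primis", sigla=["B", "C"])

# A print-style apparatus line: lemma, closing bracket, variant, witnesses.
print_style = render(entry, ["lemma", "variant", "sigla"],
                     after={"lemma": "] ", "variant": " "})

# A loosely TEI-like tagged form for downstream analysis programs.
tagged_style = render(entry, ["sigla", "variant"],
                      before={"sigla": "<rdg wit='"},
                      after={"sigla": "'>", "variant": "</rdg>"})

print(print_style)
print(tagged_style)
```

Because the reading itself is never stored as a formatted string, adding a further output format only means supplying another set of `order`, `before`, and `after` values.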

A number of smaller research projects are also attempting to build environments where humanists can work together on texts and digital editions. The Humanities Research Infrastructure and Tools (HRIT) 130 project, based at the Center for Textual Studies and Digital Humanities at Loyola University Chicago, is working on e-Carrel, a tool that will create a secure and distributed infrastructure for access, preservation, and reuse of humanities texts. The e-Carrel tool will also support the use of a collaborative annotation tool and standoff markup. In particular, the creators of e-Carrel want to support interoperability with other humanities text projects and to promote collaborative work on a series of “core texts” that will likely exist as multiversioned documents (Thiruvathukal et al. 2009).

In addition to collaborative tools, a number of other tools exist to support the creation of digital editions. Among them is “Edition Production & Presentation Technology” (EPPT), 131 a set of XML tools designed to assist editors in creating image-based electronic editions. The EPPT is a stand-alone application that editors can install on their own computers and that supports more effective “image-based encoding,” in which users link descriptive markup (such as a TEI transcription of a manuscript) to material evidence (an image of that manuscript) through XML. Templates are automatically generated from the data of individual projects so that scholars and students need little training in TEI/XML to get started. EPPT has two basic tools for integrating images and text, ImagText (with OverLay) and xTagger, and enables very precise linking of both full images and image sections with structural and descriptive metadata. To create a basic image-based edition, a user simply needs images, corresponding plain text, and a document type definition (DTD). Consequently, EPPT can be used to create image-based editions using images and data available in different archives, and can also be used by scholars to “prepare, collate and search” variant manuscript versions of texts. A number of projects, including the Roman de la Rose and Electronic Boethius, 132 have already made use of this tool.

A related project in terms of desired functionality is TILE (Text Image Linking Environment), 133 a collaborative project of the Maryland Institute for Technology in the Humanities, 134 the Digital Humanities Observatory (DHO), 135 and Indiana University Bloomington. This project is seeking to “develop a new web-based, modular, collaborative image markup tool for both manual and semi-automated linking between encoded text and image of text, and image annotation” and has just announced the release of TILE 0.9. 136 Doug Reside of this project recently outlined on the TILE blog a “four-layer model for image-based editions” that was designed to address long-term preservation issues and clearly outline responsibilities for digital librarians and scholars (Reside 2010).

130 http://text.etl.luc.edu/HRIT_CaTT/overview.php

131 http://www.eppt.org/eppt/. For more on the technical approach behind the EPPT and using XML for “image-based electronic editions,” see Dekhytar et al. (2005).

132 http://beowulf.engl.uky.edu/~kiernan/eBoethius/inlad.htm

133 http://mith.umd.edu/tile/

134 http://mith.info/

135 http://dho.ie/

The first level involves the digitization of source materials, particularly their long-term curation and distribution in open formats with the use of regular and progressive naming systems. Reside made the useful suggestion that granting agencies should consider requiring content providers to maintain stable uniform resource identifiers (URIs) for at least 10 to 15 years for all digital objects. The second level for image-based editions involves metadata creation, and Reside argued that all metadata external to the file itself (e.g., descriptive rather than technical metadata) belongs at this level. He also proposed that institutions or individuals other than those that created the digital files should probably create such metadata:

While the impulse towards quality assurance and thorough work is laudable, a perfectionist policy that delays publication of preliminary work is better suited for immutable print media than an extensible digital archive. In our model, content providers need not wait to provide content until it has been processed and catalogued (Reside 2010).

By opening the task of cataloging and resource description to a larger audience, Reside hypothesized that far more content could get online quickly and be available for reuse. Separating metadata and content would also allow multiple transcriptions or metadata to point to the same item’s URI.
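The separation of content from metadata described here amounts to a simple data structure: the digitized object keeps one stable URI, and any number of independently created records refer to it. The Python sketch below is a minimal illustration under assumptions; the URI, the `contribute` helper, and the record fields are hypothetical and are not drawn from TILE.

```python
from collections import defaultdict

# Hypothetical stable URI for one digitized page image.
PAGE_URI = "http://example.org/images/ms-001/f1r"

# Independently contributed records, indexed by the URI they describe.
records_by_uri = defaultdict(list)


def contribute(uri, kind, author, body):
    """Attach a record to a URI without touching the image file itself."""
    records_by_uri[uri].append({"kind": kind, "author": author, "body": body})


# The content provider publishes the image first; description follows later,
# and two scholars add competing transcriptions of the same page.
contribute(PAGE_URI, "description", "library", "Folio 1 recto, parchment")
contribute(PAGE_URI, "transcription", "scholar-a", "arma uirumque cano")
contribute(PAGE_URI, "transcription", "scholar-b", "arma virumque cano")

transcriptions = [
    record for record in records_by_uri[PAGE_URI]
    if record["kind"] == "transcription"
]
print(len(transcriptions), "transcriptions point to", PAGE_URI)
```

The arrangement only works if the URI stays stable, which is why Reside's suggestion about long-lived identifiers belongs to the first level of the model.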

The third level of the TILE model involves the interface layer, an often-ignored feature in the move to get open content available online. While Reside granted that having more transcriptions and files in open repositories is a useful first step, he noted that many humanities scholars still need interfaces that do more than access one file at a time. He also recognized that while Software Environment for the Advancement of Scholarly Research (SEASR) is trying to create a sustainable model for interoperable digital humanities tools, its work has not yet met with wide-scale adoption. At this most critical layer, Reside outlined the TILE approach:

We propose a code framework for web-based editions, first implemented in JavaScript using the popular jQuery library, but adaptable to other languages when the prevalent winds of web development change. An instance of this framework is composed of a manifest file (probably in XML or JSON 137 format) that identifies the locations of the relevant content and any associated metadata and a core file (similar to, but considerably leaner than, the core jQuery.js file at the heart of the popular JavaScript library) with a system of “hooks” onto which developers might hang widgets they develop for their own editions. A widget, in this context, is a program with limited functionality that provides well-defined responses to specific input (Reside 2009).

136 This software and its functionality are discussed later in this paper.

137 JSON, or JavaScript Object Notation, is a “lightweight data-interchange format” that is based on the JavaScript programming language but is also a “text format that is completely language independent.” (http://www.json.org/)

This model thus includes a manifest file that contains all content locations and associated metadata, and a core file, or base text, that can be used by different developers to create their own digital editions utilizing their own tools or “widgets.” Widgets should depend only on the core files, Reside argued, not on each other, and ideally they could be shared between scholars. Reside admitted that basically they are proposing the development of a “content management system” (CMS) for managing “multimedia scholarly editions.” Reside acknowledged that the market is currently crowded with CMS options but noted that none of the current options quite meets the needs of scholarly editions.
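The manifest-plus-hooks design that Reside describes can be sketched briefly. Reside's proposal was for JavaScript with jQuery; the version below is written in Python purely for illustration and is not TILE's code. The manifest fields, the `Core` class, the hook name `page_opened`, and both widgets are hypothetical. What the sketch tries to show is the dependency rule: widgets receive the manifest from the core and never call one another.

```python
import json

# Hypothetical manifest: where the content and metadata for one edition live.
MANIFEST_JSON = """
{
  "title": "Sample edition",
  "pages": [
    {"image": "http://example.org/images/f1r.jpg",
     "transcription": "http://example.org/tei/f1r.xml"}
  ]
}
"""


class Core:
    """Minimal core: loads the manifest and dispatches named hooks to widgets."""

    def __init__(self, manifest_text):
        self.manifest = json.loads(manifest_text)
        self._hooks = {}

    def register(self, hook, widget):
        # Widgets attach to the core only; they never reference each other.
        self._hooks.setdefault(hook, []).append(widget)

    def fire(self, hook, payload):
        # Call every widget hung on this hook, in registration order.
        return [widget(self.manifest, payload)
                for widget in self._hooks.get(hook, [])]


def image_widget(manifest, page_index):
    """Limited functionality: report which image to display for a page."""
    return manifest["pages"][page_index]["image"]


def transcription_widget(manifest, page_index):
    """Limited functionality: report which transcription accompanies a page."""
    return manifest["pages"][page_index]["transcription"]


core = Core(MANIFEST_JSON)
core.register("page_opened", image_widget)
core.register("page_opened", transcription_widget)
results = core.fire("page_opened", 0)
print(results)
```

Because each widget depends only on the core and the manifest, either one could be removed or shared with another edition without changes to the other.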

The fourth and final layer of the TILE model involves user-generated data layers, which Reside considered possibly to be the most “volatile data in current digital humanities scholarship.” The open nature of many sites makes it hard to distinguish the contributions of inexperienced users from those of expert scholars. Thus, while their framework argued for the “development of repositories of user-generated content,” since not all content contributed by users can be permanently stored, they suggested “sandbox” databases where only the best user-generated content is selected for inclusion and publication.

One partner in TILE, the DHO, has also conducted some independent research into developing a framework for scholarly editions (Schreibman 2009). Schreibman offered criticisms similar to those of Robinson and Reside, stating that not only were many early digital editions typically one-off productions where the content was tightly integrated with the chosen software, but that:

We also don’t, as a scholarly editing community, have agreed upon formats, protocols, and methodologies for digital scholarly editing and editions. Moreover, many of the more mature first-generation digital projects creating online editions from print sources have more in common with digital library projects - i.e., editions created with a light editorial hand, minimally encoded and with little more contextualization than their print counterparts (Schreibman 2009).

To address some of these issues, the DHO held a one-day symposium in 2009 on the issue of digital editions that was followed by a weeklong scholarly editing school to determine “a set of protocols, methodologies, rights management and technical procedures to create a shared infrastructure for digital scholarly editions in Ireland.” They plan to follow relevant developments from TextGrid and Interedition, so that the infrastructure and tools developed in Ireland can link up with these other national and international projects.

The Challenges of Text Alignment and Text Variants

As illustrated by the preceding discussion, any infrastructure developed for digital classics and for the creation of digital editions will need to consider the challenges of both text alignment and textual variants. The research literature on this topic is extensive. This subsection briefly describes two recent state-of-the-art approaches to deal with these issues. 138

While the introduction to this review illustrated that there are a large number of digital corpora available in both Greek and Latin, Federico Boschetti (2007) has criticized the fact that although these corpora are typically based on authoritative editions, they provide no access to the apparatus criticus. 139 He reminded his readers that when using a literary corpus such as the TLG, they must remember they are dealing with the text of an author that has been created by editorial choices. This makes it particularly difficult to study linguistic or stylistic phenomena because without access to the apparatus criticus it is impossible to know what variants an editor may have suppressed. This can render digital corpora useless for philologists:

138 For a thorough bibliography of over 50 papers in this area see the list of references in Schmidt and Colomb (2009).

139 This criticism has also been made by Ruhleder (1995) and Stewart, Crane, and Babeu (2007).



Philologists use digital corpora but they must verify results on printed editions, in order to evaluate if the text retrieved is attested in every manuscript, only in the codex optimus, in an error prone family of manuscripts, in a scholium, in the indirect tradition or if it is conjectured by a modern scholar. In short, the text of the reference edition has no scientific value without the apparatus. … (Boschetti 2007).

He noted that two exceptions to this phenomenon are the Homer Multitext Project and Musisque Deoque, 140 both of which are seeking to enrich the corpora they create with variants and conjectures.

Boschetti identified two basic methods for adding apparatus critici to digital critical editions. The first method is based on the automatic collation of diplomatic editions, where digital diplomatic editions are defined as “complete transcriptions of single manuscripts” with encoded information about text layout and position (typically encoded in TEI-XML). In agreement with Monella (2008), Boschetti commented that one of the most useful features of markup such as TEI is that it makes it “possible to separate the actual text of the manuscript from its interpretations.” This method is particularly useful, Boschetti argued, for texts with a limited number of manuscripts. The second method involves the manual filling of forms by data entry operators, an approach utilized by Musisque Deoque that, according to Boschetti, is “useful if the aim is the acquisition of large amounts of apparatus’ information, on many texts of different authors.” Both approaches have shortcomings, Boschetti pointed out: the collation of diplomatic editions must be integrated with other techniques, and manual form filling is subject to human error.

Neither approach would be feasible for an author like Aeschylus, whose work is accompanied by an extensive body of secondary analysis and large numbers of conjectures registered in various commentaries and reviews. Boschetti thus proposed a third approach combining automatic collation and information extraction:

The automatic parsing of apparatuses and repertories, in addition to the automatic collation for a group of relevant diplomatic transcriptions, should be an acceptable trade-off. Subjective choices by operators in this case is limited to the correction phases. This third approach has a double goal: on one hand it aims to parse automatically existing critical apparatuses and repertories of conjectures of Aeschylus and on the other hand it aims to discover heuristics useful for any collection of variants and/or conjectures with a similar structure (Boschetti 2007).

140 Musisque Deoque is “a digital archive of Latin poetry, from its origins to the Italian Renaissance” that was established in 2005 and is creating a Latin poetry database that is “supplemented and updated with critical apparatus and exegetical equipments.” http://www.mqdq.it/mqdq/home.jsp?lingua=en.

Boschetti designed a complete methodology that began with the automatic collation of three reference editions for Aeschylus so that there would be a unified reference edition on which to map the apparatuses and repertories of conjectures. The second step was to conduct a manual survey of various apparatuses in order to identify typical structures (e.g., verse number, reading to substitute a word in text, manuscript, and scholar). After identifying references to verses, Boschetti developed a typology of readings and sources for the information in the apparatus. He noted that the most frequent case involved readings where an orthographic or a morphological variant would “substitute a single word in the reference edition.” The most common other operations were deletion, addition, and transposition of text. Sources were typically “one or more manuscripts for variants” and “one or more scholars for conjectures,” occasionally followed by accurate bibliographical information. One major difficulty, Boschetti noted, was that the same manuscript or author could be abbreviated differently in different apparatuses. In Boschetti’s system, names had to “match items of a table that contains the canonical form of the name, abbreviations, orthographical variants and possible declinations.”
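The table-matching idea can be illustrated with a minimal sketch. It assumes a simple dictionary that maps each canonical name to its known abbreviations and variants; the entries, the `normalize` rule, and the function names are hypothetical illustrations and are not taken from Boschetti's actual tables.

```python
# Hypothetical name table: canonical form -> abbreviations, spelling
# variants, and inflected forms as they might appear in an apparatus.
# The entries are illustrative only.
NAME_TABLE = {
    "Hermann": ["Herm.", "Hermann", "Hermannus", "Hermanni"],
    "Wecklein": ["Weckl.", "Wecklein"],
    "Mediceus": ["Med.", "Mediceus"],
}


def normalize(token):
    """Lower-case a token and drop a trailing abbreviation dot."""
    return token.strip().rstrip(".").lower()


def build_index(table):
    """Invert the table so every variant points to its canonical form."""
    index = {}
    for canonical, variants in table.items():
        for variant in variants:
            index[normalize(variant)] = canonical
    return index


INDEX = build_index(NAME_TABLE)


def canonical_name(token):
    """Return the canonical name for a token, or None if it is unknown."""
    return INDEX.get(normalize(token))
```

With such an index, `canonical_name("Herm.")` and `canonical_name("Hermanni")` both resolve to the same entry, while an unknown token returns `None` and can be left for manual review.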

The next major step was to develop a set of heuristics to be used in automatically parsing the different apparatuses. Each item in the apparatus was separated by a new line and all items were then tokenized into one of the following categories: verse number, Greek word, Greek punctuation mark, metrical sign, Latin word, Latin punctuation mark, scholar name, manuscript abridgment, and bibliographic reference. All scholars’ names, manuscript abridgments, and bibliographic references were compared with information from the tables created in the previous step. The rest of the tokens were then aggregated to identify verse references, readings, and sources. The final step was the use of an alignment algorithm to parse text substitutions “in order to map the readings on the exact position of the verse in the reference edition.” Boschetti reported that about 90 percent of readings found in apparatuses were substitutions, or chunks of text that should replace one or more lines in a reference edition. His algorithm utilized the concept of “edit distance” 141 to align readings from the apparatus with the portion of text in the reference edition where the edit distance was lowest. Boschetti also chose to use a “brute force” combinatorial algorithm that “reconstructs all the combinations of adjacent words in the reference text (capitalised and without spaces) and it compares them with the reading and its permutations.” One limitation of his work, Boschetti reported, was that the current system is applied only to “items constituted by Greek sequences, immediately followed by source,” and excludes those cases where items included Latin language explanations of the textual operations to perform.
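The alignment step can be sketched roughly as follows. This is a simplified illustration rather than Boschetti's implementation: it uses the standard Levenshtein distance, tries every run of adjacent words in a verse (capitalised and joined without spaces), and keeps the run closest to the reading. It omits the comparison against permutations of the reading, uses Latin letters for readability, and all function names are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance: single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def squash(words):
    """Join words without spaces and capitalise them."""
    return "".join(words).upper()


def align_reading(verse_words, reading_words):
    """Brute-force search over all runs of adjacent verse words.

    Returns (distance, start, end), where verse_words[start:end] is the
    run whose squashed form is closest to the squashed reading."""
    target = squash(reading_words)
    best = None
    for start in range(len(verse_words)):
        for end in range(start + 1, len(verse_words) + 1):
            d = edit_distance(squash(verse_words[start:end]), target)
            if best is None or d < best[0]:
                best = (d, start, end)
    return best


# A toy verse and a reading that differs by one character.
verse = ["the", "quick", "brown", "fox"]
print(align_reading(verse, ["quik", "brown"]))
```

For the toy verse, the reading maps onto the second and third words, which is the position where it would replace text in the reference edition. The exhaustive search is quadratic in the number of words per verse, which is tolerable because individual verses are short.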

To test his system, Boschetti calculated its performance against 56 verses of Wecklein’s edition of Aeschylus’ Persae and evaluated it by hand. For processed items (excluding items with Latin predicates), 88 percent of conjectures were mapped onto the reference text correctly, and 77 percent of conjectures were mapped correctly in the total collection. This work illustrates that although an automated system does require a fair amount of preliminary manual analysis, the heuristics and algorithms that were created provide encouraging results that deserve further exploration.

Recent work by Schmidt and Colomb (2009) has taken a different approach to the challenge of textual variation, one that also addresses related issues with overlapping hierarchies in markup. According to Schmidt and Colomb, there are two basic forms of textual variation: that found in multiple copies of a work, such as in the case of multiple manuscripts; and that arising from physical alterations introduced by an author or copyist in a single manuscript. Early printed books and handwritten medieval manuscripts often have high levels of variation, and the techniques of textual criticism grew up around the desire to create a single, definitive text. Despite the fact that the digital environment provided new possibilities for representing multiple versions of a text, significant disagreement among textual editors continued, as Schmidt and Colomb related:

With the arrival of the digital medium the old arguments gradually gave way to the realisation that multiple versions could now coexist within the same text. … This raised the prospect of a single model of variation that might at last unite the various strands of text-critical theory. However, so far no generally accepted technique of how to achieve this has been developed. This failure perhaps underlies the commonly held belief among humanists that any computational model of a text is necessarily temporary, subjective and imperfect (Schmidt and Colomb 2009).

141 Edit distance has been defined as a “string distance,” or the number of operations required to transform one string into another (with typical allowable operations including the insertion, deletion, or substitution of a single character) (http://en.wikipedia.org/wiki/Edit_distance).



Additionally, Schmidt and Colomb postulated that the lack of an “accurate model of textual variation” and the ability to implement such a model in a digital world have continued to frustrate many humanists.

A related problem identified by Schmidt and Colomb is that of overlapping hierarchies, or when different markup structures (e.g., generic structural markup, linguistic markup, literary markup) overlap in a text. Markup is said to overlap in that “the tags in one perspective are not always well formed with respect to tags in another” (e.g., as in well-formed XML). Schmidt and Colomb proposed that the term “overlapping hierarchies” is essentially incorrect: “Firstly, not all overlap is between competing hierarchies, and secondly what is meant by the term ‘hierarchy’ is actually ‘trees’, that is a specific kind of hierarchy in which each node, except for the root, has only one parent.” They observed that although there have been over 50 papers dealing with this topic, one fundamental and common weakness in the proposed approaches was that they offered solutions to problematic markup by using markup itself. The authors further asserted that all cases of overlapping hierarchies are also cases of textual variation, even if the reverse is not always true. “The overlapping hierarchies problem, then, boils down to variation in the metadata,” Schmidt and Colomb declared, adding that “it is entirely subsumed by the textual variation problem because textual variation is variation in the entire text, not only in the markup” (Schmidt and Colomb 2009). They thus concluded that textual variation was the problem that needed solving.
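What it means for two markup perspectives to fail to nest can be shown with a small sketch. It assumes each perspective is reduced to a list of character spans `(start, end)` over the same text; the span values and function names are hypothetical and are not drawn from Schmidt and Colomb.

```python
def crosses(a, b):
    """True if two half-open spans partially overlap, so that neither
    contains the other and they cannot be nested in a single tree."""
    (a0, a1), (b0, b1) = sorted([a, b])
    return a0 < b0 < a1 < b1


def conflicting_pairs(layer_one, layer_two):
    """List every pair of spans, one from each layer, that cross."""
    return [(x, y) for x in layer_one for y in layer_two if crosses(x, y)]


# Two views of the same 40-character passage: verse lines and sentences.
verse_lines = [(0, 20), (20, 40)]
sentences = [(0, 30), (30, 40)]
print(conflicting_pairs(verse_lines, sentences))
```

In this toy case the first sentence runs past the end of the first verse line, so the second line and the first sentence cross. A single well-formed XML tree cannot contain both elements without splitting one of them.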

Maintaining that neither version control systems nor multiple sequence alignment (inspired by bioinformatics) can adequately address the problem of text variants, Schmidt and Colomb proposed modeling text variation as either a “minimally redundant directed graph” or as an “ordered list of pairs” where each pair contains a “set of versions and a fragment of text or data.” The greatest challenge with variant graphs, they explained, is how to process them efficiently. At a minimum, users would need functions for reading a single version of a text, searching a multiversion text, comparing two versions of a text, determining what was a variant of what else, creating and editing a variant graph, and separating content and variation. The solution proposed by Schmidt (2010) is the multiversion document format (MVD):

The Multi-Version Document or MVD model represents all the versions of a work, whether they arise from corrections to a text or from the copying of one original text into several variant versions, or some combination of the two, as four atomic operations: insertion, deletion, substitution, and transposition. … An MVD can be represented as a directed graph, with one start node and one end-node. … Alternatively it can be serialized as a list of paired values, each consisting of a fragment of text and a set of versions to which that fragment belongs. As the number of versions increases, the number of fragments increases, their size decreases, and the size of their version-sets increases. This provides a good scalability as it trades off complexity for size, something that modern computers are very good at handling. By following a path from the start-node to the end-node any version can be recovered. When reading the list form of the graph, fragments not belonging to the desired version are merely skipped over (Schmidt 2010).
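The list-of-pairs serialization described in this passage can be sketched in a few lines. This is a toy illustration under simplifying assumptions: version identifiers are plain strings, fragments are plain text, and transpositions are not modeled. It is not the actual MVD file format, and the names are hypothetical.

```python
# Each pair holds the set of versions a fragment belongs to and the
# fragment itself, in document order.
mvd = [
    ({"A", "B", "C"}, "The quick "),
    ({"A"}, "brown"),
    ({"B", "C"}, "red"),
    ({"A", "B", "C"}, " fox"),
    ({"C"}, " jumps"),
]


def read_version(pairs, version):
    """Recover one version by skipping fragments that do not belong to it."""
    return "".join(text for versions, text in pairs if version in versions)


def variant_fragments(pairs, all_versions):
    """Return the fragments that are not shared by every version."""
    return [(sorted(v), text) for v, text in pairs if v != all_versions]


print(read_version(mvd, "A"))
print(read_version(mvd, "C"))
```

Text shared by all versions is stored once, and only the points of divergence carry smaller version sets, which is the trade-off between fragment count and fragment size that the passage describes.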

Schmidt listed a number of benefits of the MVD format for humanists, including the following: (1) it supports the automatic computation of insertions, deletions, variants, and transpositions between a set of versions; (2) MVDs are agnostic about the content format of individual versions, so they can be used with any generalized markup or plain text; (3) an MVD is “not a collection of files” and instead stores “only the differences between all the versions of a work as one digital entity and interrelates them” (Schmidt 2010); (4) since the MVD stores the overlapping structures of a set of versions, the markup of individual texts can be much simpler; and (5) “an MVD is the format of an application not a standard.” Schmidt suggests that MVD documents should be stored in a binary format, particularly if the content of each text is in XML. In their current work, they have created a MultiVersion wiki tool where scholars can work on cultural heritage texts that exist in multiple versions.

Computational Linguistics and Natural Language Processing

Computational linguistics 142 has been defined as “the branch of linguistics in which the techniques of computer science are applied to the analysis and synthesis of language and speech.” 143 NLP has been defined as an “area of computer science that develops systems that implement natural language understanding,” and it is often listed as a subdiscipline of computational linguistics. 144 The use of computational linguistics and of NLP has grown enormously in the humanities over the past 20 years, and they have an even longer history in classical computing, as described in the introduction to this review. 145 Bamman and Crane (2009) have argued that both computational linguistics and NLP will be necessary components of any cyberinfrastructure for classics:

In deciding how we want to design a cyberinfrastructure for Classics over the next ten years, there is an important question that lurks between “where are we now” and “where do we want to be”: where are our colleagues already? Computational linguistics and natural language processing generally perform best in high-resource languages—languages like English, on which computational research has been focusing for over sixty years, and for which expensive resources (such as treebanks, ontologies and large, curated corpora) have long been developed. Many of the tools we would want in the future are founded on technologies that already exist for English and other languages; our task in designing a cyberinfrastructure may simply be to transfer and customize them for Classical Studies (Bamman and Crane 2009).

This section describes three applications from computational linguistics and NLP in terms of services for digital classics as a whole: treebanks, automatic morphological analysis, and lexicons.

Treebanks

A treebank can be def<str<strong>on</strong>g>in</str<strong>on</strong>g>ed as a “database of sentences which are annotated with syntactic <str<strong>on</strong>g>in</str<strong>on</strong>g>formati<strong>on</strong>,<br />

often <str<strong>on</strong>g>in</str<strong>on</strong>g> the form of a tree.” 146 Treebanks can be either manually or automatically c<strong>on</strong>structed, <strong>and</strong><br />

they are used to support a variety of computati<strong>on</strong>al tasks such as those <str<strong>on</strong>g>in</str<strong>on</strong>g>volved <str<strong>on</strong>g>in</str<strong>on</strong>g> corpus l<str<strong>on</strong>g>in</str<strong>on</strong>g>guistics,<br />

the study of syntactic features <str<strong>on</strong>g>in</str<strong>on</strong>g> computati<strong>on</strong>al l<str<strong>on</strong>g>in</str<strong>on</strong>g>guistics, <strong>and</strong> tra<str<strong>on</strong>g>in</str<strong>on</strong>g><str<strong>on</strong>g>in</str<strong>on</strong>g>g <strong>and</strong> test<str<strong>on</strong>g>in</str<strong>on</strong>g>g parsers. There has<br />

been a large growth <str<strong>on</strong>g>in</str<strong>on</strong>g> the number of historical treebanks <str<strong>on</strong>g>in</str<strong>on</strong>g> recent years, <str<strong>on</strong>g>in</str<strong>on</strong>g>clud<str<strong>on</strong>g>in</str<strong>on</strong>g>g treebanks <str<strong>on</strong>g>in</str<strong>on</strong>g> Greek<br />

<strong>and</strong> Lat<str<strong>on</strong>g>in</str<strong>on</strong>g>. Currently there are two major treebank projects for Lat<str<strong>on</strong>g>in</str<strong>on</strong>g>, the Perseus Project’s Lat<str<strong>on</strong>g>in</str<strong>on</strong>g><br />

Dependency Treebank (classical Lat<str<strong>on</strong>g>in</str<strong>on</strong>g>) <strong>and</strong> the Index Thomisticus (IT) Treebank (medieval Lat<str<strong>on</strong>g>in</str<strong>on</strong>g>), <strong>and</strong><br />

142 Relatively little work has been done utilizing computational linguistics for historical languages such as Latin and Greek, but for some fairly recent<br />

experiments with Latin, see Sayeed and Szpakowicz (2004) and Casadio and Lambek (2005).<br />

143 “computational linguistics plural noun.” The Oxford Dictionary of English (revised edition). Ed. Catherine Soanes and Angus Stevenson. Oxford<br />

University Press, 2005. Oxford Reference Online. Oxford University Press. Tufts University. 12 April 2010<br />

<br />

144 “natural-language processing.” A Dictionary of Computing. Ed. John Daintith and Edmund Wright. Oxford University Press, 2008. Oxford Reference<br />

Online. Oxford University Press. Tufts University. <br />

145 For some recent examinations of the potential of computational linguistics and of NLP for the humanities, see Sporleder (2010), de Jong (2009), and<br />

Lüdeling and Zeldes (2007).<br />

146 http://en.wiktionary.org/wiki/treebank



one in Greek, the Perseus Ancient Greek Dependency Treebank (AGDT). 147 This section describes<br />

these treebanks and their uses within classical scholarship.<br />

The Latin Dependency Treebank is a 53,143-word collection of syntactically parsed Latin sentences,<br />

and it currently stands at version 1.5 with excerpts from eight authors. Because Latin is a heavily<br />

inflected language with a great degree of variability in its word order, the annotation style of the Latin<br />

Dependency Treebank was based on that of the Prague Dependency Treebank (PDT), which was then<br />

tailored for Latin using the grammar of Pinkster (Bamman and Crane 2006). According to Bamman<br />

and Crane (2006), there are a variety of potential uses for a Latin treebank, including “the potential to<br />

be used as a knowledge source in a number of traditional lines of inquiry, including rhetoric,<br />

lexicography, philology and historical linguistics.” In their initial research they explored using the<br />

Latin Dependency Treebank to detail the use of Latin rhetorical devices and to quantify the change<br />

over time in Latin from a subject-object-verb word order to a subject-verb-object order. Later research<br />

with the Latin Dependency Treebank drew on the resources within the PDL to provide<br />

advanced reading support and more-sophisticated levels of lemmatized and morphosyntactic<br />

searching (Bamman and Crane 2007).<br />
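
To make the word-order measurement concrete, the following is a minimal sketch of how the relative order of subject, object, and verb might be tallied from dependency-annotated sentences. The `Token` structure, the relation labels (`PRED`, `SBJ`, `OBJ`), and the two toy Latin sentences are assumptions made for illustration; they are not the treebank's actual file format or the procedure used by Bamman and Crane.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Token(NamedTuple):
    id: int        # 1-based position of the word in the sentence
    form: str      # surface form
    head: int      # id of the governing token (0 for the sentence root)
    relation: str  # dependency label; "PRED", "SBJ", "OBJ" are assumed here


def word_order(sentence: list) -> Optional[str]:
    """Return e.g. 'SOV' for the first main verb that governs exactly one
    subject and one object; return None if no such verb is found."""
    for verb in sentence:
        if verb.relation != "PRED":
            continue
        subjects = [t for t in sentence if t.head == verb.id and t.relation == "SBJ"]
        objects = [t for t in sentence if t.head == verb.id and t.relation == "OBJ"]
        if len(subjects) == 1 and len(objects) == 1:
            ordered = sorted([(subjects[0].id, "S"), (objects[0].id, "O"), (verb.id, "V")])
            return "".join(label for _, label in ordered)
    return None


def count_orders(sentences: list) -> Counter:
    """Tally the word-order patterns found in a collection of sentences."""
    return Counter(order for s in sentences if (order := word_order(s)) is not None)


# Hypothetical toy data: "puer puellam amat" and "puer amat puellam".
sov = [Token(1, "puer", 3, "SBJ"), Token(2, "puellam", 3, "OBJ"), Token(3, "amat", 0, "PRED")]
svo = [Token(1, "puer", 2, "SBJ"), Token(2, "amat", 0, "PRED"), Token(3, "puellam", 2, "OBJ")]
orders = count_orders([sov, svo, sov])
```

Running the same tally separately over texts from different periods would give the kind of per-period proportions needed to track a shift from one order to the other.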

The IT Treebank is an ongoing project that will include all of the works of Thomas Aquinas as well as<br />

those of 61 authors related to him and will ultimately include 179 texts and 11 million tokens. According to its<br />

website, the IT Treebank is “presently composed of 82,141 tokens, for a total of 3,714 syntactically<br />

parsed sentences excerpted from Scriptum super Sententiis Magistri Petri Lombardi, Summa contra<br />

Gentiles and Summa Theologiae.” Their most recent work has explored the development of a valency<br />

lexicon, and the authors argue that although many classical-language projects exist, few have<br />

annotated texts above the morphological level (McGillivray and Passarotti 2009). Nonetheless, the<br />

authors insist that “nowadays it is possible and indeed necessary to match lexicons with data from<br />

(annotated) corpora, and vice versa. This requires the scholars to exploit the vast amount of textual<br />

data from classical languages already available in digital format … and particularly those annotated at<br />

the highest levels.”<br />

Rather than each developing its own annotation standards, these two Latin treebank projects worked together<br />

to develop a common standard set of guidelines that they have published online. 148 This provides an<br />

important example of the need for different projects with similar goals not only to collaborate but also<br />

to make the results of that collaboration available to others. Another important collaborative feature of<br />

these treebanks is that, particularly in the case of the Latin Dependency Treebank, a large number of<br />

graduate and undergraduate students have contributed to this knowledge base.<br />

Other work on collaborative treebanks has been conducted by the Perseus Project, which has<br />

created the AGDT. The AGDT, currently in version 1.1, is a “192,204-word collection of syntactically<br />

parsed Greek sentences” from Hesiod, Homer, and Aeschylus. The development of the AGDT has<br />

focused on a new model of treebanking, that of the creation of scholarly treebanks (Bamman,<br />

Mambrini, and Crane 2009). While traditional linguistic-annotation projects have focused on creating<br />

the single best annotation (often enforcing interannotator agreement), such a model is a poor fit when the<br />

object of annotation itself is an object of intense scholarly debate:<br />

147 For more on the Perseus Latin Dependency Treebank and the AGDT (as well as to download them), see http://nlp.perseus.tufts.edu/syntax/treebank/,<br />

and for the Index Thomisticus Treebank, see http://itreebank.marginalia.it/<br />

148 For the most recent version of the guidelines, see http://hdl.handle.net/10427/42683; for more on the collaboration, see Bamman, Passarotti, and Crane<br />

(2008).



In these cases we must provide a means for encoding multiple annotations for a text and<br />

allowing scholars who disagree with a specific annotation to encode their disagreement in a<br />

quantifiable form. For historical texts especially, scholarly disagreement can be found not only<br />

on the level of the correct syntactic parse, but also on the form of the text itself (Bamman,<br />

Mambrini, and Crane 2009).<br />

The text of Aeschylus serves as a useful example, they argue, for many scholars would disagree not<br />

only on how a text had been annotated but also on the reconstructed text, or the specific edition that<br />

was used as the source for annotation. The authors argue that the process of creating scholarly<br />

treebanks is similar to that of creating critical editions:<br />

As the product of scholarly labor, a critical edition displays the text as it is reconstructed by an<br />

editor; it is thus an interpretative hypothesis whose foundations lie on the methods of textual<br />

criticism. A scholarly treebank may be defined by analogy as a syntactically annotated corpus<br />

that again reflects an interpretation of a single scholar, based not only on the scholar’s<br />

philological acumen but also on a degree of personal taste and opinions that are culturally and<br />

historically determined. A scholarly treebank thus distances itself from the notion that linguistic<br />

annotations can be absolute; when dealing with non-native historical languages especially, a<br />

syntactic interpretation of a sentence is always the interpretation of an individual and therefore<br />

subject to debate (Bamman, Mambrini, and Crane 2009).<br />

To address this issue, the AGDT focused on creating a model that allowed for assigning<br />

authorship to all interpretative annotations. By doing so, the authors hoped to achieve two goals. First,<br />

by publicly releasing the data with citable ownership, they wanted to provide a core data set around<br />

which scholars could add their own annotations; second, they hoped that by publicly acknowledging<br />

the creators of annotations they could promote the idea of scholarly treebanking as an act of scholarly<br />

publication that is similar in form to publishing a critical edition or commentary. They also hoped that<br />

their approach, which gave individual recognition to student contributions to a treebank, would serve as a<br />

model for incorporating undergraduate research into classical teaching. Many of these issues are<br />

revisited throughout this review, specifically the need for digital infrastructure to support multiple<br />

annotations of different scholars (regarding opinion, certainty, etc.), the ability to show that the<br />

creation of digital objects is in itself an act of interpretative scholarship, the importance of attributable<br />

and citable scholarship, and the need to support new models of collaboration.<br />
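
As an illustration of what attributable, multiple annotations might look like in practice, here is a minimal sketch of a data structure that records who made each syntactic annotation and reports where annotators diverge. The `Annotation` fields, the annotator names, and the agreement measure are hypothetical choices for this example and do not reproduce the AGDT's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    sentence_id: str  # hypothetical sentence identifier
    token_id: int     # position of the annotated word
    head: int         # id of the word this token depends on
    relation: str     # dependency label
    annotator: str    # person credited with this interpretation


def group_by_token(annotations):
    """Collect all annotations that concern the same token."""
    grouped = defaultdict(list)
    for a in annotations:
        grouped[(a.sentence_id, a.token_id)].append(a)
    return grouped


def disagreements(annotations):
    """Return the tokens for which annotators proposed different analyses."""
    return {
        key: sorted(group, key=lambda a: a.annotator)
        for key, group in group_by_token(annotations).items()
        if len({(a.head, a.relation) for a in group}) > 1
    }


def agreement_rate(annotations):
    """Share of multiply annotated tokens on which all annotators agree."""
    multi = [g for g in group_by_token(annotations).values() if len(g) > 1]
    if not multi:
        return 1.0
    agreed = sum(1 for g in multi if len({(a.head, a.relation) for a in g}) == 1)
    return agreed / len(multi)


# Hypothetical annotations by two scholars for one two-word sentence.
sample = [
    Annotation("s1", 1, 2, "SBJ", "annotator_a"),
    Annotation("s1", 1, 2, "SBJ", "annotator_b"),
    Annotation("s1", 2, 0, "PRED", "annotator_a"),
    Annotation("s1", 2, 1, "ATR", "annotator_b"),
]
```

Because every record carries its annotator, competing interpretations can coexist in the same data set, each remains citable, and disagreement can be expressed as a number rather than resolved away.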

Morphological Analysis<br />

Some of the challenges of automatic morphological processing for Sanskrit (Huet 2004) and Sumerian<br />

(Tablan et al. 2006) have already been discussed. This subsection focuses on some recent research<br />

on Greek and Latin.<br />

Classical Greek is a highly inflected language, and this poses challenges for both students and scholars,<br />

as detailed by John Lee:<br />

Indeed, a staple exercise for students of ancient Greek is to identify the root form of an<br />

inflected verb. This skill is essential; without knowing the root form, one cannot understand the<br />

meaning of the word, or even look it up in a dictionary. For Classics scholars, these myriad<br />

forms also pose formidable challenges. In order to search for occurrences of a word in a corpus,<br />

all of its forms must be enumerated, since words do not frequently appear in their root forms.



This procedure becomes extremely labor-intensive for small words that overlap with other<br />

common words (Lee 2008).<br />

Morpheus, the Greek morphological parser for the PDL, has been in development since 1990<br />

and was created by Gregory Crane (Crane 1991). 149 Crane worked with a database of 40,000 stems,<br />

13,000 inflections, and 2,500 irregular forms. By 1991, Morpheus had been used to analyze almost 3<br />

million words with texts that ranged in date from the eighth century BC until the second century AD.<br />

Since then, Morpheus has been an integral part of the online PDL, which has expanded to cover over<br />

8 million words in Greek. Crane argued that the parser was developed not just to address problems in<br />

Ancient Greek but also to serve as a possible approach to developing morphological tools for ancient<br />

languages.<br />
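
The general idea of analyzing a word against tables of stems, inflections, and irregular forms can be sketched in a few lines. The toy below is only an illustration of that rule-based principle, not Morpheus's actual design or data: the three tables, the transliterated forms, and the `analyze` function are all assumptions made for this example.

```python
# Toy lookup tables (transliterated); real tables would hold thousands of entries.
STEMS = {"lu": "luo", "graph": "grapho"}            # stem -> dictionary form
ENDINGS = {                                         # ending -> morphological tag
    "o": "1st sg pres act ind",
    "eis": "2nd sg pres act ind",
    "omen": "1st pl pres act ind",
}
IRREGULAR = {"esti": [("eimi", "3rd sg pres act ind")]}  # forms stored whole


def analyze(form: str) -> list:
    """Return every (lemma, tag) pair licensed by the tables for this form."""
    analyses = list(IRREGULAR.get(form, []))
    # Try every split of the form into a known stem plus a known ending.
    for i in range(1, len(form)):
        stem, ending = form[:i], form[i:]
        if stem in STEMS and ending in ENDINGS:
            analyses.append((STEMS[stem], ENDINGS[ending]))
    return analyses
```

For example, `analyze("luomen")` finds the split `lu` + `omen`, while `analyze("esti")` is resolved through the irregular table. A form whose stem or ending is missing from the tables receives no analysis, which is why such systems depend on extensive hand-built lexical data.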

More-recent work in automatic morphological analysis of Greek has utilized Morpheus as well as other<br />

resources available from Perseus. Dik and Whaling (2009) have discussed their implementation of<br />

Greek morphological searching over the Perseus Greek corpus that made use of two disambiguated<br />

Greek corpora, the open-source part-of-speech analyzer TreeTagger 150 and output from Morpheus. The<br />

backbone of their implementation is a SQLite database containing tokens and parses for the<br />

full corpus that connects the three main components: the Perseus XML files with unique token IDs;<br />

TreeTagger, “which accepts token sequences from the database and outputs parses and probability<br />

weights, which are stored in their own table”; and PhiloLogic. 151 According to Dik and Whaling,<br />

their system used PhiloLogic because:<br />

… it serves as a highly efficient search and retrieval front end, by indexing the augmented<br />

XML files as well as the contents of the linked SQLite tables. PhiloLogic’s highly optimized<br />

index architecture allows near-instantaneous results on complex inquiries such as ‘any<br />

infinitive forms within 25 words of (dative singulars of) lemma X and string Y’, which would<br />

be a challenge for typical relational database systems (Dik and Whaling 2009).<br />

The results of their work are available at “Perseus Under PhiloLogic,” 152 a website that supports<br />

morphological searching of both the Latin and Greek texts of Perseus. Although Dik and Whaling<br />

noted that they were continuing to explore the possibilities of natural language searching against the<br />

Greek corpus in place of the highly technical query forms supported through PhiloLogic, their system<br />

nonetheless supports full morphological searching, string searching, and lemmatized searching, and<br />

these features have been integrated into a reading and browsing environment for the texts.<br />
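
To show what a combined lemma and morphology proximity query involves, the sketch below builds a tiny in-memory SQLite database and asks for infinitives within 25 words of a dative singular of a given lemma. The table layout, column names, tag strings, and transliterated sample tokens are assumptions for illustration only; they are not the schema described by Dik and Whaling, and the query does not use PhiloLogic's index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tokens (token_id INTEGER PRIMARY KEY, doc TEXT, position INTEGER, form TEXT);
CREATE TABLE parses (token_id INTEGER REFERENCES tokens(token_id),
                     lemma TEXT, morph TEXT, weight REAL);
""")
# Hypothetical sample data: one dative singular noun and two infinitives.
conn.executemany("INSERT INTO tokens VALUES (?, ?, ?, ?)", [
    (1, "demo", 1, "anthropo"),
    (2, "demo", 10, "legein"),
    (3, "demo", 60, "graphein"),
])
conn.executemany("INSERT INTO parses VALUES (?, ?, ?, ?)", [
    (1, "anthropos", "noun dative singular", 0.9),
    (2, "lego", "verb present active infinitive", 0.8),
    (3, "grapho", "verb present active infinitive", 0.8),
])

QUERY = """
SELECT DISTINCT inf.form, inf.position
FROM tokens AS inf
JOIN parses AS inf_parse ON inf_parse.token_id = inf.token_id
JOIN tokens AS anchor
     ON anchor.doc = inf.doc
    AND ABS(anchor.position - inf.position) <= :window
JOIN parses AS anchor_parse ON anchor_parse.token_id = anchor.token_id
WHERE inf_parse.morph LIKE '%infinitive%'
  AND anchor_parse.lemma = :lemma
  AND anchor_parse.morph = 'noun dative singular'
ORDER BY inf.position
"""


def infinitives_near(connection, lemma, window=25):
    """Infinitives within `window` words of a dative singular of `lemma`."""
    return connection.execute(QUERY, {"lemma": lemma, "window": window}).fetchall()
```

On a corpus of millions of tokens, a self-join of this kind compares a very large number of token pairs unless it is carefully indexed, which gives some sense of why a dedicated index architecture is attractive for such queries.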

Some other recent research has focused on the use of machine learning and large, unlabeled corpora to<br />

perform automatic morphological analysis on classical Greek. Lee (2008) has developed an analyzer of<br />

Ancient Greek that “infers the root form of a word” and has made two major innovations over previous<br />

systems:<br />

149 Far earlier but also highly significant work on the development of a morphological parser for Ancient Greek was conducted by David Packard in the<br />

1970s (Packard 1973).<br />

150 http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/<br />

151 PhiloLogic (http://www.lib.uchicago.edu/efts/ARTFL/philologic/) is a software tool that has been developed by the Project for American and French<br />

Research on the Treasury of the French Language at the University of Chicago, and in “its simplest form serves as a document retrieval or look up<br />

mechanism whereby users can search a relational database to retrieve given documents and, in some implementations, portions of texts such as acts,<br />

scenes, articles, or head-words.”<br />

152 http://perseus.uchicago.edu/



First, it utilizes a nearest neighbor framework that requires no hand-crafted rules, and provides<br />

analogies to facilitate learning. Second, and perhaps more significantly, it exploits a large,<br />

unlabelled corpus to improve the prediction of novel roots (Lee 2008).<br />

Lee observed that many students of Ancient Greek memorized “paradigmatic” verbs that could be used<br />

as analogies to identify the roots of unseen verbs. Building on this insight, Lee utilized a “nearest-neighbor”<br />

machine-learning framework to model this process. When given a word in an inflected form, the<br />

algorithm searched for the root form among its “neighbors” by making substitutions to its prefix and<br />

suffix. Valid substitutions were harvested from pairs of inflected and root forms in a training set of data,<br />

and these pairs were then used to serve as “analogies to reinforce learning.” Nonetheless, Ancient Greek<br />

still posed some challenges that complicated a minimally supervised approach. Lee explained that<br />

heavily inflected languages such as Greek suffer from “data sparseness” since many inflected forms<br />

appear at most a few times and many root forms may not appear at all in a corpus. As a rule-based<br />

system, Morpheus needed a priori knowledge of possible stems and affixes, all of which had to be<br />

crafted by hand. To provide a more scalable alternative, Lee used a data-driven approach that<br />

automatically determined stems and affixes from training data (morphology data for the Greek<br />

Septuagint from the University of Pennsylvania) and then used the TLG as a source of unlabeled data<br />

to guide prediction of novel roots.<br />
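
The substitution idea described above can be sketched as follows. This is a deliberately simplified illustration rather than Lee's algorithm: it handles suffix substitutions only (prefix changes such as the augment are ignored), it ranks candidates by whether they are attested in an unlabeled word list and then by how often the substitution was seen in training, and all function names and transliterated sample data are assumptions.

```python
from collections import Counter


def harvest_rules(pairs):
    """Derive suffix substitutions (inflected suffix -> root suffix) from
    (inflected form, root form) training pairs, with their frequencies."""
    rules = Counter()
    for inflected, root in pairs:
        k = 0  # length of the prefix shared by the two forms
        while k < min(len(inflected), len(root)) and inflected[k] == root[k]:
            k += 1
        rules[(inflected[k:], root[k:])] += 1
    return rules


def predict_root(word, rules, attested=frozenset()):
    """Apply every matching substitution and return the best candidate root.
    Candidates found in the unlabeled corpus (`attested`) are preferred,
    then candidates produced by more frequent substitutions."""
    candidates = []
    for (old, new), count in rules.items():
        if word.endswith(old):
            candidate = word[: len(word) - len(old)] + new
            candidates.append((candidate in attested, count, candidate))
    return max(candidates)[2] if candidates else None


# Hypothetical training pairs (transliterated Greek verbs).
training = [("luomen", "luo"), ("graphomen", "grapho"), ("luete", "luo"),
            ("philei", "phileo"), ("legei", "lego")]
rules = harvest_rules(training)
```

In this toy setup, `poiei` matches two substitutions and so yields two candidate roots, `poieo` and `poio`. Supplying an unlabeled word list that contains `poieo` settles the choice, which mirrors in miniature the role that a large unlabeled corpus can play in predicting roots that never appeared in the training data.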

While Lee made use of machine learning and unlabeled corpora, Tambouratzis (2008) automated the<br />

morphological segmentation of Greek by “coupling an iterative pattern-recognition algorithm with a<br />

modest amount of linguistic knowledge, expressed via a set of interactions associated with weights.”<br />

He used an “ant colony optimization (ACO) metaheuristic” to automatically determine optimal weight<br />

values and found that in several cases the automatic system provided better results than those that had<br />

been manually determined by scholars. In contrast to Lee, Tambouratzis used only a subset of the TLG<br />

for training data (in this case, the speeches of several Greek orators).<br />

In addition to the work done by Dik and Whaling for “Perseus Under PhiloLogic,” other research into<br />

automatic morphological analysis of Latin has been conducted by Finkel and Stump (2009). These<br />

authors reported on computational experiments to generate the morphology of Latin verbs.<br />

Lexicons<br />

Lexicons are reference tools that have long played an important role in classical scholarship and<br />

particularly in the study of historical languages. 153 As previously noted, the lack of a computational<br />

lexicon for Sanskrit is a major research challenge. This section explores some important lexicons for<br />

classical languages and suggests new roles for these traditional reference works in a digital<br />

environment.<br />

The Comprehensive Aramaic Lexicon 154 (CAL) aims to serve as a “new dictionary of the Aramaic<br />

language.” Aramaic is a Semitic language, and numerous inscriptions and papyri, as well as Biblical<br />

and other religious texts, are written in it. This project, currently in preparation by an international<br />

team of scholars, is based at Hebrew Union College in Cincinnati. The goal is to create a<br />

comprehensive lexicon that will take all of ancient Aramaic into account, be based on a compilation of<br />

all Aramaic literature, and include extensive references to modern scholarly literature. Although a<br />

153 This section focuses on larger projects that plan to create online or digital lexicons in addition to printed ones, but there are also a number of lexicons<br />

for classical languages that have been placed online as PDFs or in other static formats, such as the Chicago Demotic Dictionary<br />

(http://oi.uchicago.edu/research/projects/dem/); other projects have scanned historical dictionaries and provided online searching capabilities, such as<br />

Sanskrit, Tamil and Pahlavi Dictionaries, http://webapps.uni-koeln.de/tamil/<br />

154 http://cal1.cn.huc.edu/


53<br />

pr<str<strong>on</strong>g>in</str<strong>on</strong>g>ted publicati<strong>on</strong> is ultimately planned, various databases of textual, lexical, <strong>and</strong> bibliographical<br />

<str<strong>on</strong>g>in</str<strong>on</strong>g>formati<strong>on</strong> will be available <strong>on</strong>l<str<strong>on</strong>g>in</str<strong>on</strong>g>e. Currently a limited versi<strong>on</strong> of the lexic<strong>on</strong> <strong>and</strong> the bibliographical<br />

archives can be searched <strong>on</strong>l<str<strong>on</strong>g>in</str<strong>on</strong>g>e.<br />

The Thesaurus Linguae Latinae (TLL) 155 is working to produce “the first comprehensive scholarly dictionary of ancient Latin from the earliest times down to AD 600.” This work is based on an archive of about 10 million slips and takes into account all surviving texts. While in older texts there is a slip for every word occurrence, later texts are generally covered by a selection of “lexicographically relevant examples.” As Hillen (2007) has explained, from around the time of Apuleius until AD 600, textual sources have been excerpted by marking noteworthy usages rather than every usage of the word (with the exceptions of major texts by Augustine, Tertullian, and Commodian). Hillen observed that, to speed up the work, methods must be found to reduce the number of slips that will be given comprehensive treatment, since these outnumber the excerpted material by a ratio of 4.5 to 1, and that emphasis must be given to texts that did not conform to grammatical or stylistic norms. Hillen saw new digital collections as having some value for the work of the TLL:

The main value of digital databanks for the work of the Thesaurus cannot, therefore, be a systematic increase in the raw material. Rather, they are useful in three specific areas: reproduction, checking, and regulated expansion of our sources (Hillen 2007).

By the end of 2009, the project had reached the end of the letter P, and approximately two thirds of the work had been completed. The TLL has been issued in a print version and also has an electronic version that is available by subscription from De Gruyter/Saur. 156

Similar to the TLL’s plan of controlled expansion through documenting unusual or noteworthy usages of words, the “Poorly Attested Words in Greek” (PAWAG) 157 project based at the University of Genoa is setting up an electronic dictionary that “gathers together words of Ancient Greek that are either only scantily attested (i.e., with one or few occurrences), inadequately (i.e., characterized by some sort of uncertainty) or in any case problematically, both from a formal and semantic point of view.” This database is intended to supplement traditional dictionaries that cannot pay sufficient attention to the issue of poorly attested words. The database currently contains 1,548 headwords, which can be searched with a string in either Greek or Latin.

A larger-scale endeavor is the Greek Lexicon project, 158 which is being overseen by the Faculty of Classics at Cambridge University. It plans to release a new Ancient Greek-English lexicon of intermediate size that will take into account the most recent scholarship, replace archaic terminology with up-to-date English, and both reexamine original source material and add new material that has been discovered since the end of the nineteenth century. This project has adopted a semantic method of organizing articles and plans to publish a print edition through Cambridge University Press as well as an online version in the PDL. Fraser (2008) has provided more information about the creation of this lexicon and the challenges that this new semantic organization created. In addition to making use of the Perseus Morpheus database, the project developed an additional resource:

… because we can predict every word-search that we will eventually want to perform, a program was designed to conduct these searches in advance. Our corpus of texts has been entirely pre-searched for each lemma-form, and the results archived in static HTML (Hypertext Mark-up Language) pages. This constitutes a digital archive of lexicographic ‘slips’, providing the dictionary writers with immediate access to the searches, and also enabling the citations and their contexts to be archived in a generic format that is not tied to any particular operating system or database program (Fraser 2008).

155 http://www.thesaurus.badw.de/english/index.htm

156 http://www.degruyter.de/cont/fb/at/detail.cfm?id=IS-9783110229561-1

157 http://www.aristarchus.unige.it/pawag/index.php

158 http://www.classics.cam.ac.uk/faculty/research_groups_and_societies/greek_lexicon/

This digital archive of Greek lemma searches has helped speed the process of writing entries. Interestingly, because this lexicon has been designed for students, Fraser noted that it gives fewer Greek quotations and more space to semantic description. Citations have also been restricted to a canon of 70 authors, with no examples taken from fragmentary authors or inscriptions. Fraser also reported that while dictionary entries are stored in XML, the project created a new DTD for its system based on a “provisional entry structure.”
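The pre-searching approach that Fraser describes can be illustrated with a minimal sketch. Nothing here comes from the project itself: the function names, the toy transliterated corpus, and the lemma-form table are all hypothetical, and the real system draws its forms from a morphological database rather than a hand-written dictionary. The sketch only shows the general idea of running every lemma search in advance and rendering each result set as a static HTML page of "slips."

```python
from collections import defaultdict
from html import escape

# Hypothetical toy data: (citation, passage) pairs and a lemma -> forms table.
TOY_CORPUS = [
    ("Sample 1.1", "en arche en ho logos"),
    ("Sample 2.3", "ton logon akouo"),
]
TOY_FORMS = {
    "logos": {"logos", "logon", "logou"},
    "arche": {"arche", "arches"},
}

def presearch(corpus, lemma_forms):
    """Map each lemma to the (citation, passage) pairs containing any of its forms."""
    form_to_lemma = {
        form: lemma for lemma, forms in lemma_forms.items() for form in forms
    }
    slips = defaultdict(list)
    for citation, passage in corpus:
        seen = set()  # record each lemma at most once per passage
        for token in passage.split():
            lemma = form_to_lemma.get(token.strip(".,;:").lower())
            if lemma and lemma not in seen:
                seen.add(lemma)
                slips[lemma].append((citation, passage))
    return dict(slips)

def render_slip_page(lemma, hits):
    """Render one lemma's archived search results as a static HTML page."""
    items = "\n".join(
        f"<li><cite>{escape(c)}</cite>: {escape(p)}</li>" for c, p in hits
    )
    return f"<html><body><h1>{escape(lemma)}</h1>\n<ul>\n{items}\n</ul></body></html>"

SLIPS = presearch(TOY_CORPUS, TOY_FORMS)
LOGOS_PAGE = render_slip_page("logos", SLIPS["logos"])
```

Because each page is plain HTML, it could be written to disk once and consulted later without any database, which is the portability benefit the quotation emphasizes.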

The PDL already contains digital versions of lexicons for some individual authors 159 as well as several major classical lexicons, such as the Lewis & Short Latin Dictionary 160 and the Liddell-Scott-Jones Greek-English Lexicon (LSJ). 161 The lexicons that are a part of Perseus, however, differ from the projects described above. Rather than being designed for both print and electronic distribution, the lexicons, and indeed all reference works that are part of Perseus, have been created from the start to serve both as hyperlinked tools in an integrated Greek and Latin reading environment and as knowledge sources that can be mined to support a variety of automated processes. 162

In addition to turning “traditional” printed lexicons into dynamic reference works, current research at Perseus is exploring how to create a new kind of “dynamic lexicon” that is generated not from just one printed text but from all the texts in a digital library (Bamman and Crane 2008). Bamman and Crane first used the large, aligned, parallel corpus of English and Latin in Perseus to induce a word-sense inventory and determined how often certain definitions of a word were actually manifested, while using the context surrounding words to determine which definitions were used in a given instance. The treebank was then used to train an automatic syntactic parser for the Latin corpus, in particular to extract information about each word’s subcategorization frames and selectional preferences. Clustering was then used to establish semantic similarity between words, as determined by their appearance in similar contexts (Bamman and Crane 2009). This automatically extracted lexical information can be used in a variety of ways:

A digital library architecture interacts with this knowledge in three ways: first, it lets us further contextualize our source texts for the users of our existing digital library; second, it allows us to present customized reports for word usage according to the metadata associated with the texts from which they’re drawn, enabling us to create a dynamic lexicon that not only notes how a word is used in Latin in general, but also in any specific author, genre, or era (or combination of those). And third, it lets us continue to mine more texts for the knowledge they contain as they’re added to the library collection, essentially making it an open-ended service (Bamman and Crane 2008).
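The second point in the quotation, customized word-usage reports driven by text metadata, can be sketched as follows. This is an illustration only and not Bamman and Crane's implementation: the sense labels, author names, and counts are invented placeholders that do not reflect any actual findings, and the function name `sense_report` is hypothetical. The sketch assumes that each occurrence of a word has already been assigned a sense and carries author and genre metadata.

```python
from collections import Counter, defaultdict

# Invented placeholder data: (lemma, sense, author, genre). Not real results.
OCCURRENCES = [
    ("libero", "sense_1", "author_a", "classical"),
    ("libero", "sense_1", "author_b", "classical"),
    ("libero", "sense_2", "author_c", "patristic"),
    ("libero", "sense_2", "author_c", "patristic"),
    ("libero", "sense_1", "author_c", "patristic"),
]

FIELDS = {"author": 2, "genre": 3}  # tuple position of each metadata field

def sense_report(occurrences, lemma, group_by):
    """Return relative sense frequencies for `lemma`, grouped by a metadata field."""
    field = FIELDS[group_by]
    counts = defaultdict(Counter)
    for occ in occurrences:
        if occ[0] == lemma:
            counts[occ[field]][occ[1]] += 1
    return {
        group: {sense: n / sum(c.values()) for sense, n in c.items()}
        for group, c in counts.items()
    }

REPORT_BY_GENRE = sense_report(OCCURRENCES, "libero", "genre")
```

Grouping by a different field, or by a combination of fields, only changes the grouping key, which is why such reports can be generated on demand as new texts are added to the collection.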

159 For example, Pindar: http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3atext%3a1999.04.0072

160 http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3atext%3a1999.04.0059

161 http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3atext%3a1999.04.0057; for more on the development of the LSJ, see Crane (1998) and Rydberg-Cox (2002).

162 For more on the need to design “dynamic reference works,” see Crane (2005) and Crane and Jones (2006).

As one example, Bamman and Crane (2008) traced how the use of the word libero changed over time and across genre (e.g., classical authors vs. Church Fathers). The Perseus corpus is somewhat small, and these authors noted that even more interesting results could be gained by using such techniques with the large corpus of Latin that is growing online, from multiple editions of classical Latin authors to neo-Latin texts. In a larger corpus, a dynamic lexicon could be used to explore how classical Latin authors such as Caesar and Ovid used words differently, or the use of a word could be compared between classical and neo-Latin texts. Another advantage of a dynamic lexicon is that rather than presenting several highly illustrative examples of word usage (as is done with the Cambridge Greek English Lexicon), it can present as many examples as are found in the corpus. Finally, the dynamic lexicon’s ability to search across Latin and Greek texts using English translations of Greek and Latin words is a “close approximation to real cross-language information retrieval.”

Perhaps most important, Bamman and Crane argue that their work to create a dynamic lexicon illustrates how even small structured-knowledge sources can be used to mine interesting patterns from larger collections:

The application of structured knowledge to much larger but unstructured collections addresses a gap left by the massive digitization efforts of groups such as Google and the Open Content Alliance (OCA). While these large projects are creating truly million-book collections, the services they provide are general (e.g., key term extraction, named entity analysis, related works) and reflect the wide array of texts and languages they contain. By applying the language specific knowledge of experts (as encoded in our treebank), we are able to create more specific services to complement these general ones already in place. In creating a dynamic lexicon built from the intersection of a 3.5 million word corpus and a 30,457 word treebank, we are highlighting the immense role that even very small structured knowledge sources can play (Bamman and Crane 2008).

The authors also observed that since many of the technologies used to build the lexicon, such as word-sense disambiguation and syntactic parsing, are modular, any separate improvements made to these algorithms could be incorporated back into the lexicon. Similarly, as tagging and parsing accuracy improve with the size of a corpus and as the training corpus of Latin grows, so will the treebank. In addition, this work illustrates how small-domain tools might be repurposed to work with larger collections.

Bamman and Crane (2009) have investigated these issues further in their overview of computational linguistics and lexicography. They noted that while the TLG and Perseus provide “dirty results,” or the ability to find all the instances of a lemma in their collections, the TLL gives a smaller subset of impeccably precise results. Bamman and Crane argued that in the future, a combination of these two approaches will be necessary, and lexicography will need to utilize both machine-learning techniques that learn from large textual collections and the knowledge and labor invested in handcrafted lexicons to help such techniques learn. The authors also noted that new lexicons built for a classical cyberinfrastructure would need to support new levels of research:

Manual lexicography has produced fantastic results for Classical languages, but as we design a cyberinfrastructure for Classics in the future, our aim must be to build a scaffolding that is essentially enabling: it must not only make historical languages more accessible on a functional level, but intellectually as well; it must give students the resources they need to understand a text while also providing scholars the tools to interact with it in whatever ways they see fit (Bamman and Crane 2009).



As this research indicates, lexicons and other traditional reference tools will need to be redesigned as knowledge sources that can be used not just by scholars but also by students.

Canonical Text Services, Citation Detection, and Citation Linking

Digital libraries of classics typically contain both primary and secondary materials (commentaries, dictionaries, lexicons, etc.). Many of these secondary materials, as well as journal articles in JSTOR 163 and historical books in Google Books and the Internet Archive, will contain a fair amount of latent semantic information, including references to canonical texts (typically primary sources), historical persons, and place names, as well as a variety of other information.

For such references to be linked effectively to primary sources, however, those sources must not only be available online but also be structured in a uniform or at least machine-actionable way. One proposed solution to this problem is the Canonical Text Services (CTS) protocol. 164 Developed by Neel Smith and Christopher Blackwell, “the Canonical Text Services (CTS) are part of the CITE architecture” and the “specification defines a network service for identifying texts and for retrieving fragments of texts by canonical reference expressed as CTS URNs.” 165 Canonical references have been defined as “references to discrete corpora of ancient texts that are written by scholars in a canonical citation format” (Romanello 2008), so, for example, “Hom. Il.” typically refers to Homer’s Iliad. One major function of CTS, which was previously known as the Classical Text Services protocol, “is to define a network service enabling use of a distributed collection of texts according to notions that are traditional among classicists” (Porter et al. 2006).
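To make the idea of a canonical reference expressed as a CTS URN more concrete, the sketch below turns an abbreviated citation such as “Hom. Il. 1.1” into a URN and splits that URN into its parts. It rests on assumptions that are not stated in the text above: the colon-separated layout (`urn:cts:` namespace, work hierarchy, passage) follows the commonly published CTS URN syntax, the identifier `tlg0012.tlg001` is the conventional one for Homer's Iliad, and the one-entry abbreviation table and both function names are hypothetical. It is not an implementation of the CTS network service itself.

```python
from typing import NamedTuple, Optional

class CtsUrn(NamedTuple):
    namespace: str            # e.g. "greekLit"
    work: tuple               # text group, work, and any version identifiers
    passage: Optional[str]    # e.g. "1.1", or None for a whole work

# Hypothetical abbreviation table; a real one would be far larger.
ABBREVIATIONS = {
    "Hom. Il.": "urn:cts:greekLit:tlg0012.tlg001",
}

def citation_to_urn(citation: str) -> str:
    """Expand an abbreviated canonical citation into a CTS URN string."""
    for abbrev, work_urn in ABBREVIATIONS.items():
        if citation.startswith(abbrev):
            passage = citation[len(abbrev):].strip()
            return f"{work_urn}:{passage}"
    raise ValueError(f"unknown citation: {citation!r}")

def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN into namespace, work hierarchy, and passage."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    passage = parts[4] if len(parts) == 5 and parts[4] else None
    return CtsUrn(parts[2], tuple(parts[3].split(".")), passage)

ILIAD_URN = citation_to_urn("Hom. Il. 1.1")
PARSED = parse_cts_urn(ILIAD_URN)
```

A citation-linking tool would need much more than prefix matching (variant abbreviations, ranges, and ambiguous author names, for instance), but the round trip from human citation to structured identifier is the core step.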

The CTS protocol is part of a larger CITE 166 architecture that has been designed to encompass collections of structured objects, indexes, texts, and extended objects. CTS is one of three services defined by this architecture; the other two are Collection Services and a Reference Index Service. 167 While Collection Services is still being defined and seeks to “provide an abstract interface to sets of similarly structured objects,” the more explicitly defined and mature Reference Indexing or RefIndex service “associates a permanent reference (a CTS URN, or a Collection identifier) with either a second permanent reference, or a raw data value.” Reference indexing services encompass mappings traditionally called indexes, such as a lemmatized index of a text, as well as other kinds, such as the mapping of a commentary onto relevant parts of the text.
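The two kinds of mapping described for the RefIndex service can be pictured with a small in-memory structure. This is only a conceptual sketch: the class `ReferenceIndex`, its methods, and the commentary identifier are hypothetical, the CTS URN layout is assumed from the commonly published syntax, and the actual service is a network protocol rather than a Python object.

```python
from collections import defaultdict

class ReferenceIndex:
    """Toy index from a permanent reference to other references or raw values."""

    def __init__(self):
        self._entries = defaultdict(list)

    def add(self, source, target):
        """Associate `source` with a second reference or a raw data value."""
        self._entries[source].append(target)

    def lookup(self, source):
        """Return everything associated with `source` (empty list if nothing)."""
        return list(self._entries.get(source, []))

PASSAGE = "urn:cts:greekLit:tlg0012.tlg001:1.1"  # assumed URN for Iliad 1.1

# A lemmatized index maps a passage reference to raw values (lemmata).
lemma_index = ReferenceIndex()
lemma_index.add(PASSAGE, "lemma-placeholder-1")
lemma_index.add(PASSAGE, "lemma-placeholder-2")

# A commentary index maps the same passage to a second reference.
commentary_index = ReferenceIndex()
commentary_index.add(PASSAGE, "urn:example:commentary:note-1")  # hypothetical id
```

Keeping each index separate from the text it points into means that new indexes, such as a further commentary, can be added without touching the text itself.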

The most thoroughly defined component of the CITE architecture is the CTS specification/protocol/service. The CTS protocol extends the hierarchy of the Functional Requirements for Bibliographic Records (FRBR) conceptual model for bibliographical information developed by the International Federation of Library Associations (IFLA). 168 FRBR defines a work as a “distinct intellectual or artistic creation,” an expression as “the intellectual or artistic realization of a work,” a manifestation as “the physical embodiment of an expression of a work,” and an item as “a single exemplar of a manifestation” (IFLA 1998). In other words, Homer’s Iliad is a work, but an English translation by a particular translator is an expression, an 1890 printing of that particular translation by Macmillan is a manifestation, and an individual copy of that printing on the library shelf is an item.

163 http://www.jstor.org/
164 http://chs75.chs.harvard.edu/projects/diginc/techpub/cts
165 For an overview of how CTS URNs and the CITE architecture have been used in the HMT project, see Smith (2010).
166 CITE is not a formal acronym, but instead represents that range of material that the CTS protocol intends to support, specifically: “Collections of structured objects, indexes, texts, extended objects.”
167 http://chs75.chs.harvard.edu/projects/diginc/techpub/cite
168 http://www.ifla.org/publications/functional-requirements-for-bibliographic-records



While the FRBR hierarchy includes Works, Expressions, Manifestations, and Items, the CTS protocol
uses the terms Work, Edition, Translation, and Exemplar. As described by Porter et al. (2006),
CTS extends the FRBR hierarchy upward by “grouping Works under a notional entity called
‘TextGroup,’” an entity that can refer to authors for literary texts or to corpora such as inscriptions
(e.g., “Berlin” for a published corpus of papyri). It also extends the FRBR hierarchy downward to
support the “identification and abstraction of citable chunks of text (Homer, Iliad Book 1, Line 123), or
ranges of citable chunks (Hom. Il. 1.123-2.22).” This extension of FRBR both upward and downward
is important, for as Romanello (2008) underscores, “it allows one to reach a higher granularity when
accessing documents hierarchically and supports the use of a citation scheme referring to each level of
the entire document hierarchical structure.” Romanello notes that another important feature of CTS is
that it enables the differentiation of “different exemplars” of the same text.

As noted above, citations in CTS are expressed as CTS URNs. CTS URNs “provide the permanent
canonical references that Canonical Text Services (CTS) rely on in order to identify or retrieve
passages of text. These references are a kind of Uniform Resource Name (URN).” 169 URNs, according
to RFC2141, 170 “are intended to serve as persistent, location-independent, resource identifiers.” Smith
(2009) provides extensive information on the syntax of CTS URNs and their role in the CTS.
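
The layered structure described above (TextGroup, Work, version, and then a passage or range of passages) can be illustrated with a small parser. This is only a minimal sketch under assumptions: it presumes a colon-delimited layout of the form urn:cts:namespace:textgroup.work.version:passage, and the sample identifier tlg0012.tlg001 for the Iliad is used purely for illustration. Neither detail is given in this review, so the CTS URN documentation cited in note 169 remains the authority. The names CtsUrn and parse_cts_urn are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CtsUrn:
    """Illustrative container for the parts of a CTS URN (hypothetical type)."""
    namespace: str
    text_group: str
    work: Optional[str]
    version: Optional[str]        # an edition or translation, if given
    passage_start: Optional[str]
    passage_end: Optional[str]


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN into its hierarchy, assuming the layout
    urn:cts:<namespace>:<textgroup>[.<work>[.<version>]][:<passage>[-<passage>]]
    """
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")

    # Upward extension of FRBR: TextGroup > Work > version
    hierarchy = parts[3].split(".")
    text_group = hierarchy[0]
    work = hierarchy[1] if len(hierarchy) > 1 else None
    version = hierarchy[2] if len(hierarchy) > 2 else None

    # Downward extension: a single citable chunk or a range of chunks
    start = end = None
    if len(parts) == 5 and parts[4]:
        start, _, range_end = parts[4].partition("-")
        end = range_end or start  # a single passage is a range of length one

    return CtsUrn(parts[2], text_group, work, version, start, end)


if __name__ == "__main__":
    # Hom. Il. 1.123-2.22, written with assumed sample identifiers
    print(parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001:1.123-2.22"))
```

Because the passage component is separate from the work hierarchy, the same reference can in principle be resolved against any edition or translation of the work that shares the citation scheme.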

This same article also addressed the importance of standards such as CTS (Smith 2009). “Source
citation is just one part of scholarly publication, and conventions for citing resources digitally must be
viewed as part of a larger architectural design,” Smith explained, “when the digital library is the global
internet, the natural architecture for scholarly publications is a hierarchy of service.” Beyond the
need for standards and conventions for citing resources or parts of resources, there is also a need for
standard conventions or ontologies for semantically encoding named entities (such as citations)
in secondary literature and web pages and then linking them to online representations of primary
and other sources. This need has been the subject of a series of recent blog posts by Sebastian Heath of the
American Numismatic Society (ANS). 171

In a 2010 post entitled “RDFa Patterns for Ancient World References,” 172 Heath described his efforts
to encode the year, different named entities such as Polemon, an imperial cult, and two text citations
within a chosen text. He wanted to embed this information into a sample web page using standards
such as RDFa 173 to make the data “automatically recognizable by third-parties.” To encode this
information he utilized RDFa and a number of ontologies with resolvable namespaces (e.g.,
DBpedia, CITO, FOAF, FRBR, OWL, RDFS, SKOS). Heath listed two references within his sample
text, the first to a published inscription and the second to a recently published book. Interestingly,
while Heath was able to link to a bibliographic description of the published book within WorldCat,
encoding the reference to the published inscription with CITO (Citation Typing Ontology) was
problematic because there was no value for “cites as a primary source” available within this ontology.
An additional complication was that Heath simply wanted to cite the individual inscription, and there
was no way to cite just one inscription or “work” within the larger published volume of inscriptions.
“The concept of ‘Primary Source’ and references thereto is important for the Humanities and we need
a way of indicating its usage,” Heath noted. “It's also important that I'm referring to the publication of

169 http://chs75.chs.harvard.edu/projects/diginc/techpub/cts-urn-overview
170 RFC2141 is a “memo” released by the Internet Engineering Task Force (IETF) that specified the “canonical syntax” for URNs (http://tools.ietf.org/html/rfc2141)
171 http://numismatics.org
172 http://mediterraneanceramics.blogspot.com/2010/01/rdfa-patterns-for-ancient-world.html
173 “RDFa is a specification for attributes to express structured data in any markup language” (http://www.w3.org/TR/rdfa-syntax/). See also http://www.w3.org/TR/xhtml-rdfa-primer/



the inscription, not the inscription itself. When a digital surrogate becomes available, I can point to
that. In the meantime, a way of standardizing references to parts of a work would be useful,” he added.

Other recent research has examined some potential methods for resolving the issues of semantic
encoding and linking. Romanello (2008) proposed the use of microformats 174 and the CTS to provide
semantic linking between classics e-journals and the primary sources/canonical texts they referenced.
One of the first challenges was simply to detect the canonical references themselves, for as Romanello
demonstrated, references to ancient texts were often abridged, the abbreviations used for author and
work names varied greatly, only some citations included the editors’ names, and the reference schemes
could differ (e.g., for Aeschylus Persae, variant citations included A. Pers., Aesch. Pers., and Aeschyl.
Pers.). For this reason, Romanello et al. (2009a) explored the use of machine learning to extract
canonical references to primary classical sources from unstructured texts. Although references to
primary sources within the secondary literature can vary greatly, as seen above, they noted that a
number of similar patterns could often be detected. They thus trained conditional random fields (CRF)
to identify references to primary source texts within larger unstructured texts. CRF was a particularly
suitable algorithm because of its ability to consider a large number of token features when classifying
training data as either “citations” or “not citations.” Preliminary results on a sample of 24 pages
achieved a precision of 81 percent and a recall of 94.1 percent. 175
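
To make the abbreviation problem concrete, the variant forms cited above for Aeschylus' Persae can be collapsed to a single author and work with a simple lookup. The sketch below is a rule-based illustration only and is not the CRF approach of Romanello et al. (2009a). The lookup tables, the regular expression, and the function name normalize_citation are all hypothetical, and the tables contain only the variants mentioned in the text.

```python
import re
from typing import Optional, Tuple

# Hypothetical lookup tables holding only the variants mentioned in the text.
AUTHOR_VARIANTS = {"a": "Aeschylus", "aesch": "Aeschylus", "aeschyl": "Aeschylus"}
WORK_VARIANTS = {"pers": "Persae"}

# Abbreviated author, abbreviated work, and an optional numeric passage,
# e.g. "Aesch. Pers. 353". Real references are far more varied than this.
CITATION = re.compile(
    r"\b([A-Z][a-z]*)\.\s+([A-Z][a-z]+)\.(?:\s+(\d+(?:\.\d+)*))?"
)


def normalize_citation(text: str) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return (author, work, passage) for the first known citation in text,
    or None when nothing in the lookup tables matches."""
    match = CITATION.search(text)
    if match is None:
        return None
    author = AUTHOR_VARIANTS.get(match.group(1).lower())
    work = WORK_VARIANTS.get(match.group(2).lower())
    if author is None or work is None:
        return None
    return author, work, match.group(3)


if __name__ == "__main__":
    for variant in ("A. Pers.", "Aesch. Pers. 353", "Aeschyl. Pers."):
        print(variant, "->", normalize_citation(variant))
```

A hand-built table of this kind breaks down as soon as new authors, works, or citation styles appear, which helps explain why a learned model that weighs many token features was considered instead.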

Even when references are successfully identified, the challenges of encoding and linking remain.
Romanello (2008) stated that most references to primary texts within electronic secondary sources
were hard linked “through a tightly coupled linking system” and were also rarely encoded in a
machine-readable format. Other obstacles to semantic linking included the lack of shared standards or
best practices for encoding primary references in most corpora served as XHTML documents
and the lack of common protocols to support interoperability among different text collections that
would allow the linking of primary and secondary sources. To allow as much interoperability as
possible, Romanello promoted using “a common protocol to access collections of texts and a shared
format to encode canonical references within web online resources” (Romanello 2008). The other
requirements of a semantic linking system were that it must be open ended, interoperable, and
semantic- and language-neutral. Language-neutral and unique identifiers for authors and works (such
as those of the TLG) were also recommended to support cross-linking across languages.

The basic system for semantic linking outlined by Romanello thus made use of three components: the CTS URN scheme,
which uses the TLG Canon 176 of Greek authors for identifiers; a series of microformats that he
specifically developed to embed canonical references in HTML elements; and open protocols such as
the CTS text-retrieval protocol to retrieve either whole texts or parts of texts in order to support various
value-added services such as reference indexing. Romanello proposed three microformats for his
system: ctauthor (references to canonical authors, or statements that can be made machine-readable
through the CTS URN structure); ctwork (references to works without author names); and ctref, “a
compound microformat to encode a complete canonical reference” that requires the use of ctauthor,
ctwork, and a range property to specify the text sections that were referred to. While implementation of
such microformat encoding and CTS protocols would enable a number of interesting value-added
services such as semantic linking, granular text retrieval, and cross-lingual reference indexing (e.g.,

174 According to the microformats website, “microformats are a set of simple, open data formats built upon existing and widely adopted standards” that have been designed to be both human and machine readable (http://microformats.org/about)
175 Work by Romanello continues in this area through crefex (Canonical REFerences Extractor, http://code.google.com/p/crefex/) and was presented at the Digital Classicist/ICS Work in Progress Seminar in July 2010. See Matteo Romanello, “Towards a Tool for the Automatic Extraction of Canonical References.” http://www.digitalclassicist.org/wip/wip2010-04mr.pdf
176 http://www.tlg.uci.edu/canon/fontsel



find all articles that reference Verg. Aen. regardless of the language they are written in), it would also
require, as Romanello admitted, a high level of participation by both classical e-journals and relevant
digital collections in terms of implementing such microformats and CTS protocols, a situation that
seems unlikely.
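
As a rough illustration of how such a compound reference might appear in a web page, the sketch below renders a ctref element that wraps ctauthor, ctwork, and a range. Only the three microformat names come from Romanello's proposal as summarized here. The choice of span elements, the class name for the range property, the use of a title attribute to carry the CTS URN, and the sample URN itself are all assumptions made for illustration, not the published markup.

```python
from html import escape


def render_ctref(author: str, work: str, passage: str, urn: str) -> str:
    """Render one canonical reference as nested HTML spans.

    The element and attribute layout is hypothetical; only the class names
    ctref, ctauthor, and ctwork are taken from the proposal described above.
    """
    return (
        f'<span class="ctref" title="{escape(urn, quote=True)}">'
        f'<span class="ctauthor">{escape(author)}</span> '
        f'<span class="ctwork">{escape(work)}</span> '
        f'<span class="range">{escape(passage)}</span>'
        "</span>"
    )


if __name__ == "__main__":
    # The URN below is an assumed sample value, not taken from the source.
    print(render_ctref("Verg.", "Aen.", "1.1-1.7",
                       "urn:cts:latinLit:phi0690.phi003:1.1-1.7"))
```

Because the machine-readable identifier travels with the human-readable abbreviation, an indexing service could collect references to the same work from articles written in different languages, which is the cross-lingual service described above.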

Another approach to the challenge of semantic linking has been introduced by a joint project of the
Classics Department at Cornell University, which hosts the editor of L’Année Philologique on the
Internet, and Cornell University Library (Ruddy 2009). This project was awarded a planning grant
from The Andrew W. Mellon Foundation to explore using OpenURL 177 to provide links from
canonical citations in L’Année Philologique to their full text in both commercial and open-access
classics digital libraries. OpenURL was chosen because it provided a uniform linking syntax that was
system- and vendor-independent and minimized the cost of creating and maintaining links. Other
stated advantages of OpenURL were that it easily allowed both one-to-many linking and “appropriate
copy linking,” or the ability to link a user to content they are licensed to see. For example, if a user
from outside the library community tried to access a restricted resource such as the TLG, he or she
would instead be directed to the PDL.

One of the major project tasks therefore was to create a metadata format that could “reliably reference
canonical citations” (Ruddy 2009). The encoding of classical author names was particularly
problematic since OpenURL metadata presupposes a modern Western name for an author. Ruddy
reasoned that any metadata format would have to allow multiple ways of encoding author forms. In
terms of citation structure itself, they did not adopt the CTS but instead chose an abstract approach to
recognize the typical hierarchical structure of works.

In addition to metadata challenges there were a number of implementation issues. With the normal use
of OpenURL, the resolution of a link to a resource is left to a user’s local link resolver. The type of
solution chosen by this project, however, requires an extra level of detailed knowledge, and,
as Ruddy noted, there was only “uncertain commercial incentive for link resolver vendors” to provide it. To solve this issue,
Ruddy proposed and consequently created a “domain-specific community-supported knowledge base”
that was ultimately titled the Classical Works Knowledge Base (CWKB). 178 The final prototype
solution implemented by the project was to use the CWKB as an intermediate resolver/knowledge base
that could augment and normalize metadata values, provide specialized linking information, and
support access to free resources for users without a local link resolver. The basic user scenario they
envisioned began with a user clicking on a canonical text reference, encoded with an OpenURL, in a
JSTOR article or L’Année Philologique article abstract. The OpenURL would direct the user first to the CWKB (which
would provide a normalized authority form of author and title and provide a list of services for that
work), then to the local link resolver (if one was available), and finally to an HTML page with link
options such as the library catalog, interlibrary loan, text in original language, or text in translation. If
some users did not have a link resolver, the service could direct them to appropriate free resources by
redirecting them to the CWKB. Ruddy argued that this model could have wide applications, since it
could be “useful to any discipline that cites works independent of specific editions or translations” and
also offered one solution for chaining link resolvers and knowledge bases together to provide enhanced
services to users.
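
For readers unfamiliar with OpenURL, the following sketch shows how a canonical citation might be packed into a key/encoded-value query string of the general kind the standard defines. It is not the metadata format developed by the Cornell project, which this review does not reproduce. The resolver address is a placeholder, the function name build_openurl is hypothetical, and the use of the generic book keys rft.au, rft.btitle, and rft.pages to carry an ancient author, work, and passage is an assumption that also illustrates the poor fit noted above.

```python
from urllib.parse import urlencode

# Placeholder address for an intermediate resolver; not the real CWKB endpoint.
RESOLVER_BASE = "https://resolver.example.org/openurl"


def build_openurl(author: str, title: str, passage: str,
                  base: str = RESOLVER_BASE) -> str:
    """Encode a canonical citation as an OpenURL-style query string.

    Generic book-format keys stand in for whatever fields a dedicated
    canonical-citation format would define.
    """
    params = {
        "url_ver": "Z39.88-2004",                    # OpenURL 1.0 version tag
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",  # generic book metadata format
        "rft.au": author,
        "rft.btitle": title,
        "rft.pages": passage,  # assumption: passage carried in the pages field
    }
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_openurl("Vergil", "Aeneid", "1.1-1.7"))
```

In the scenario described above, a link of this kind would be handled first by the intermediate knowledge base, which would normalize the author and title before handing the request to a local link resolver or to a free resource.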

177 http://www.oclc.org/research/activities/openurl/default.htm
178 http://www.cwkb.org/



Text Mining, Quotation Detection, and Authorship Attribution

A number of potential technologies could benefit both from automatic citation detection and from the
broader use of more standardized citation encoding in digital corpora; these include text-mining
applications, such as the study of text reuse, as well as quotation detection and authorship attribution.
While the research presented in this section made use of various text-mining and NLP techniques with
unlabeled corpora, digital texts with large numbers of citations either automatically or manually
marked up could provide useful training data for this kind of work. Regardless of how the information
is detected and extracted, the ability to examine text reuse, trace quotations, 179 analyze individual
authors, and study different patterns of authorship will be an increasingly important service expected not
only by users of mass-digitization projects but also by users of classical digital libraries.

The eAQUA project, 180 based in Germany, is broadly investigating how text-mining technologies
might be used in the analysis of classical texts through six specific subprojects (reconstruction of the
lost works of the Atthidographers, text reuse in Plato, papyri classification, extraction of templates for
inscriptions, metrical analysis of Plautus, and text completion of fragmentary texts). 181 “The main
focus of this project is to break down research questions from the field of Classics in a reusable format
fitting with NLP algorithms,” Büchler et al. (2008) stated, “and to apply this type of approach to
the data from the Ancient sources.” This approach of first determining how classical scholars actually
conduct research and then attempting to match those processes with appropriate algorithms shows the
importance of understanding the discipline for which tools are being designed. This essential point
recurs throughout this review.

The basic vision of eAQUA is to present a unified approach consisting of “Data, Algorithms and Applications,” and this project specifically addresses the development of both applications (research questions) and algorithms (NLP, text mining, co-occurrence analysis, clustering, classification). Data or corpora from research partners will be imported through standardized data interfaces into an eAQUA portal that is being developed. This portal will also provide access, through a variety of web services that scholars can use, to all the structured data that are extracted. 182

One area of active research being conducted by the eAQUA project is the use of citation detection and textual reuse in the TLG corpus to investigate “the reception of Plato as a case study of textual reuse on ancient Greek texts” (Büchler and Geßner 2009). In their work, they first extracted word-by-word citations by combining n-gram overlaps and significant terms for several works of Plato; second, they loosened the constraints on syntactic word order to find citations. The authors emphasized that developing appropriate visualization tools is essential to study textual reuse since text-mining approaches to corpora typically produce a huge amount of data that simply cannot be explored manually. Their paper thus offers several intriguing visualizations, including highlighting the differences in citations to works of Plato across time (from the Neo-Platonists to the Middle

179 Preliminary research on quotation identification and tracking has been reported for Google Books (Schilit and Kolak 2008).

180 http://www.eaqua.net/en/index.php

181 The computational challenges of automatic metrical analysis and fragmentary texts have received some research attention. For metrical analysis, see Deufert et al. (2010), Eder (2007), Fusi (2008), and Papakitsos (2011). For fragmentary texts, see Berti et al. (2009) and Romanello et al. (2009b). The use of digital technology for inscriptions and papyri is covered in their respective sections.

182 According to the W3C, a web service can be defined as “a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards.” (http://www.w3.org/TR/ws-arch/#whatis). Two important related standards are SOAP (Simple Object Access Protocol), a “lightweight protocol intended for exchanging structured information in a decentralized, distributed environment” (http://www.w3.org/TR/soap12-part1/) and WSDL (Web Services Description Language), an “XML format for describing network services as a set of endpoints operating on messages containing either document-oriented or procedure-oriented information.” (http://www.w3.org/TR/wsdl)



Platonists). John Lee (2007) has conducted other work in textual reuse and explored sentence alignment in the Synoptic Gospels of the Greek New Testament. He pointed out that exploring ancient text reuse is a difficult but important task since ancient authors rarely acknowledged their sources and often quoted from memory or combined multiple sources. “Identifying the sources of ancient texts is useful in many ways,” Lee stressed: “It helps establish their relative dates. It traces the evolution of ideas. The material quoted, left out or altered in a composition provides much insight into the agenda of its author” (Lee 2007).
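The two-pass idea described above for Büchler and Geßner (strict n-gram overlap first, then a pass that relaxes word order) can be illustrated with a minimal sketch. This is not the eAQUA implementation: the function names, the trigram window, and the English sample sentences are assumptions made for illustration, and the weighting by significant terms mentioned in the text is left out.

```python
def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,;:!?\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def shared_ngrams(source, candidate, n=3):
    """Strict pass: n-grams that occur word for word in both passages."""
    return set(ngrams(tokenize(source), n)) & set(ngrams(tokenize(candidate), n))


def shared_windows_unordered(source, candidate, n=3):
    """Relaxed pass: compare n-token windows as sorted tuples, ignoring word order."""
    def bags(text):
        return {tuple(sorted(gram)) for gram in ngrams(tokenize(text), n)}
    return bags(source) & bags(candidate)


# Hypothetical sample passages (English stand-ins for Greek text).
source = "the unexamined life is not worth living"
verbatim = "he said the unexamined life is not worth living for a man"
reordered = "not worth living is the life unexamined"

print(len(shared_ngrams(source, verbatim)))              # 5
print(len(shared_ngrams(source, reordered)))             # 1
print(len(shared_windows_unordered(source, reordered)))  # 2
```

The relaxed pass recovers a window that the strict pass misses because its words appear in a different order. On a large corpus, both passes would produce far more candidate matches than can be read by hand, which is the motivation the authors give for visualization tools.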

Authorship attribution, the use of manual or automatic techniques to determine the authorship of anonymous texts, has been previously explored in classical studies (Rudman 1998) and remains a topic of interest. Forstall and Scheirer (2009) presented new methods for authorship attribution based on sound rather than text and applied them to Greek and Latin poets and prose authors:

We present the functional n-gram as a feature well-suited to the analysis of poetry and other sound-sensitive material, working toward a stylistics based on sound rather than text. Using Support Vector Machines (SVM) for text classification, we extend the expression of our results from a single marginal distance or a binary yes/no decision to a more flexible receiver-operator characteristic curve. We apply the same feature methodology to Principle Component Analysis (PCA) in order to validate PCA and to explore its expressive potential (Forstall and Scheirer 2009).

The authors discovered that sound-based features tested with SVMs performed as well as, if not better than, function words in every experiment, and thus concluded that “sound can be captured and used effectively as a feature for attributing authorship to a variety of literary texts.” Forstall and Scheirer also reported some interesting initial results in exploring the Homeric poems, including testing the argument that this poetry was composed without aid of writing, an issue explored at length by the Homeric Multitext Project. “When the works of Thucydides, a literate prose historian, were projected using the principal components derived from Homer, Thucydides' work not only clustered together but had a much smaller radius than either of the Homeric poems,” Forstall and Scheirer contended, adding that “this result agrees with philological arguments for the Homer's works having been produced by a wholly different, oral mode of composition.” The work of Forstall and Scheirer is just one of many examples among digital classics projects of how computer science methodologies can shed new light on old questions.
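To make the notion of a sound-oriented feature more concrete, the following sketch builds character-bigram frequency profiles and attributes an unknown passage to the closest known profile. It is only a rough stand-in under stated assumptions: plain character bigrams substitute for the functional n-grams of Forstall and Scheirer, cosine similarity to a single profile substitutes for their SVM classifier, and the sample strings are toy syllables rather than real verse.

```python
import math
from collections import Counter


def char_ngram_profile(text, n=2):
    """Relative frequencies of character n-grams, computed over letters only."""
    letters = "".join(ch for ch in text.lower() if ch.isalpha())
    counts = Counter(letters[i:i + n] for i in range(len(letters) - n + 1))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()} if total else {}


def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(value * q.get(gram, 0.0) for gram, value in p.items())
    norm_p = math.sqrt(sum(value * value for value in p.values()))
    norm_q = math.sqrt(sum(value * value for value in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def attribute(unknown, known_samples, n=2):
    """Return the label of the known sample whose profile is most similar."""
    target = char_ngram_profile(unknown, n)
    return max(
        known_samples,
        key=lambda label: cosine(target, char_ngram_profile(known_samples[label], n)),
    )


# Toy training samples; the labels "A" and "B" are hypothetical authors.
known = {"A": "la la la lala la", "B": "tum ti tum ti tum"}
print(attribute("lala la la", known))  # A
```

A real experiment would need many samples per author, held-out test passages, and a classifier that reports a graded score rather than a single best label, which is what the receiver-operator characteristic curve in the quoted passage provides.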

The PDL has conducted some of its own experiments in automatic quotation identification. Ernst-Gerlach and Crane (2008) introduced an algorithm for the automatic analysis of citations but found that they needed to first manually analyze the structure of quotations in three different reference works of Latin texts to determine text quotation alternation patterns. Their experience confirmed Lee’s earlier point that text reuse is rarely word for word, though in this case it was the quotation practices of nineteenth-century reference works, rather than those of ancient authors, that proved problematic:

Quotations are, in practice, often not exact. In some cases, our quotations are based on different editions of a text than those to which we have electronic access and we find occasional variations that reflect different versions of the text. We also found, however, that some quotations – especially in reference works such as lexica and grammars – deliberately modify the quoted text – the goal in such cases is not to replicate the original text but to illustrate a point about lexicography, grammar, or some other topic (Ernst-Gerlach and Crane 2008).



This manual analysis provided a classification of the different types of text variation, including regular text differences, irregular text differences, word omission, text insertion, and word substitution. The algorithm that was ultimately developed has not yet been put into the production PDL.
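Three of the variation types named above (word omission, text insertion, and word substitution) can be illustrated with a word-level comparison of a source passage and a quotation. The sketch below is not the algorithm of Ernst-Gerlach and Crane; it relies on Python's standard `difflib.SequenceMatcher`, the function name `classify_variation` is an assumption, and the regular and irregular text differences (for example, orthographic variants) are not modeled.

```python
from difflib import SequenceMatcher


def classify_variation(source, quotation):
    """Label each word-level difference between a source and a quotation."""
    src, quo = source.split(), quotation.split()
    matcher = SequenceMatcher(a=src, b=quo, autojunk=False)
    labels = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":      # words in the source that the quotation drops
            labels.append(("word omission", src[i1:i2], []))
        elif tag == "insert":    # words the quotation adds
            labels.append(("text insertion", [], quo[j1:j2]))
        elif tag == "replace":   # words the quotation swaps for others
            labels.append(("word substitution", src[i1:i2], quo[j1:j2]))
    return labels


# Hypothetical modified quotations of a single source sentence.
source = "gallia est omnis divisa in partes tres"
print(classify_variation(source, "gallia omnis divisa in partes tres"))
# [('word omission', ['est'], [])]
print(classify_variation(source, "gallia est omnis divisa in partes quattuor"))
# [('word substitution', ['tres'], ['quattuor'])]
```

A quotation identical to its source yields an empty list, so the length of the result gives a crude measure of how freely a reference work has treated the text it quotes.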

Other research by Perseus has explored automatic citation and quotation identification by utilizing quotation indexes, or indices scriptorum, which typically list all of the authors quoted within a classical text and were manually created by editors for critical editions of classical texts. Fuzzy parsing techniques were applied to the OCR transcription of one such index from the Deipnosophistae of Athenaeus in order to mark up automatically, in a digital version of the text, all the quotations listed in the index (Romanello et al. 2009c).
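One small part of such a pipeline, tolerating OCR misreadings of author names in an index, can be sketched with approximate string matching. This illustrates the general idea only and is not the method of Romanello et al.: the authority list, the cutoff value, and the function name `normalize_author` are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical authority list; a real project would draw on a curated catalog.
KNOWN_AUTHORS = ["Homerus", "Hesiodus", "Plato", "Aristophanes"]


def normalize_author(ocr_name, cutoff=0.8):
    """Map a possibly misread author name to the closest known name, or None."""
    matches = get_close_matches(ocr_name, KNOWN_AUTHORS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(normalize_author("Homcrus"))  # Homerus (OCR read "e" as "c")
print(normalize_author("Zzzz"))     # None
```

Returning `None` below the cutoff, rather than forcing a match, leaves doubtful entries for manual review.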

THE DISCIPLINES AND TECHNOLOGIES OF DIGITAL CLASSICS

This section explores a variety of subdisciplines or related disciplines of digital classics and presents an overview of some important projects and relevant literature in each. These overviews are by no means exhaustive; the goal was to identify the major projects in each field that illustrate the key challenges faced.

Ancient History

In many ways the study of ancient history is less a subdiscipline of classical studies than an overarching field that makes use of all the sources that are studied intensively in each of the other subdisciplines; consequently, various aspects of this topic are covered in many of the subdiscipline sections rather than in this section exclusively. The popularity of this topic is evidenced by innumerable academic and enthusiast websites on the history of Greece, Rome, and the Ancient Near East. One of the larger enthusiast websites is Attalus.org, 183 which provides detailed “lists of events and sources for the history of the Hellenistic world and the Roman republic” and includes translations of many of the relevant sources, such as Livy and Tacitus. The Livius 184 website, managed by Dutch historian Jona Lendering, offers a search engine for a large number of online articles on Roman and Greek history.

The subject of ancient history makes up a large component of many digital classics projects, albeit not necessarily as a specific focus; instead, the sources provided (whether primary texts or documentary sources such as inscriptions, papyri, or coins) support the study of ancient history. As one report recently noted, digital archives are providing access to all of these materials at a rapid rate:

Scholars of ancient history study particular documentary remains (such as inscriptions, papyri, ancient maps, classical literature and drama, and art and architecture) to examine early societies. These materials may be excavated from multiple archaeological sites, and are generally found in archives. While documentary editions have traditionally provided scholars with wider access to archival sources, the growth of digital archives is perceived as a great boon. On the one hand, digital archives are allowing scholars to search, study, and make connections between more archival materials. On the other hand, different archival materials are being digitized at various rates, and looking at a digital surrogate may not replace the value of seeing the physical artifact. The “next generation” of digital archives, according to some

183 http://www.attalus.org/

184 http://www.livius.org/



scholars, will integrate archival materials with computational linguistics and other text-mining capabilities (Harley et al. 2010, 118).

These themes of “next-generation” digital archives and the need to combine them with sophisticated language technologies are revisited in later sections of this report.

To conclude this section we describe one major technology project that seeks to support the encoding of historical information wherever it is found, namely, the Historical Event Markup and Linking Project, or HEML (Robertson 2009), which seeks to provide markup standards and tools with which to encode historical information on the web. In a state of evolution since 2001, HEML now has an RDF data model that allows it to represent nested events and relations of causality between events. The HEML data format supports collections of events, each tagged with heml:Event; at their simplest, events are bound to machine-readable spans of time and to references to evidence. Other important features of the model include the assignment of a URI 185 to each entity and the utilization of an evidence element:

Persons, roles, locations, and keywords are assigned mandatory URIs so that they may be referred to in multiple events. Finally, one or more heml:Evidence elements must be attributed to each event, and within these there is a means by which different editions and linguistic representations of the same text may be grouped together for the researcher's benefit (Robertson 2009).

These encoding choices illustrate the importance of using unique URIs to identify specific entities so that they can not only be referred to in multiple encoded historical events but also be reused as “linked data” 186 by other applications. The ability to link encoded events to attestations in primary texts is also of critical importance. One criticism often made of HEML, Robertson stated, is that “it is not possible to encode the variations in opinions regarding historical events.” The ability to encode multiple scholarly opinions and to indicate the uncertainty of knowledge regarding dates or other information is an important feature of any markup for historical texts. Robertson did argue, however, that URIs could eventually be created for scholars, and specific encoded arguments could be linked to those URIs.
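The features of the HEML model described above (events bound to spans of time, mandatory evidence, URIs for entities, nested events, and causal relations) can be pictured with a small data-structure sketch. It does not reproduce the HEML schema or its RDF vocabulary: the class and field names, the `example.org` URIs, and the convention of writing years BCE as negative integers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Entity:
    """A person, role, location, or keyword, identified by a mandatory URI."""
    uri: str
    label: str


@dataclass
class Evidence:
    """A reference to a source passage, with the editions that attest it."""
    citation: str
    editions: List[str] = field(default_factory=list)


@dataclass
class Event:
    """An event bound to a span of time and to at least one piece of evidence."""
    label: str
    start_year: int  # negative integers stand for years BCE in this sketch
    end_year: int
    evidence: List[Evidence]
    entities: List[Entity] = field(default_factory=list)
    sub_events: List["Event"] = field(default_factory=list)
    caused_by: List["Event"] = field(default_factory=list)

    def __post_init__(self):
        # Mirrors the rule that every event must carry evidence.
        if not self.evidence:
            raise ValueError("an event needs at least one piece of evidence")


# The same entity object is shared by both events through its URI.
athens = Entity("http://example.org/place/athens", "Athens")
expedition = Event("Sicilian Expedition", -415, -413,
                   [Evidence("Thucydides 6-7")], [athens])
war = Event("Peloponnesian War", -431, -404,
            [Evidence("Thucydides")], [athens], sub_events=[expedition])

print(war.sub_events[0].label)  # Sicilian Expedition
```

Because the entity carries a URI rather than a bare name, a second data set that uses the same URI can refer to the same place without ambiguity, which is the property that makes such records reusable as linked data. Attaching a scholar's URI to an individual assertion, as Robertson suggests, would require a further field that this sketch omits.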

Classical Archaeology

Overview

Archaeology is the study of the material remains and environmental effects of human behavior throughout prehistory to the modern era. Scholarship in archaeology is divided into a large number of subdisciplines, many defined geographically (e.g., North America, Egypt, Near East, Oceania) and/or by time period (e.g., Paleolithic, Neolithic, Classical). A moderately sized field, archaeology overlaps with a range of other scholarly disciplines, including biological anthropology, ethnobotany, paleozoology, geology, and classics (in particular, palaeography, philology, papyrology, epigraphy, numismatics, history of the ancient world, Hellenic literature, and art and architectural history) (Harley et al. 2010, 30).

185 A URI has been defined as a “compact sequence of characters that identifies an abstract or physical resource.” For more on the syntax and architecture of URIs, see http://labs.apache.org/webarch/uri/rfc/rfc3986.html - overview

186 Tim Berners-Lee has described linked data as essential to the creation of the Semantic Web, and the creation of linked data must follow four essential rules: 1. “Use URIs as names for things,” 2. “Use HTTP URIs so that people can look up those names,” 3. Use standards such as RDF and SPARQL so that when someone looks up a URI it provides useful information, and 4. Include links to other URIs to support further discovery. http://www.w3.org/DesignIssues/LinkedData.html



As illustrated by this definition, archaeology is a complex and interdisciplinary field that has many of its own specializations but is also closely related to classics. The innovative use of digital technology in archaeology has a history that is at least three decades old. 187 Proceedings and reports of numerous conferences contain papers involving the use of 3-D visualizations, digital reconstructions, and electronic publication in archaeology; these conferences include Computer Applications and Quantitative Methods in Archaeology (CAA), 188 the Virtual Systems and Multimedia Society (VSMM), 189 Virtual Reality, Archaeology and Cultural Heritage (VAST), and the International Committee for Documentation of Cultural Heritage (CIPA). 190 In addition, the importance of cyberinfrastructure and of digital preservation for archaeology has been addressed by a number of recent projects as well as by longer-standing organizations. This section looks at several major digital classical archaeology projects and provides an overview of some of the literature on this topic.

Electronic Publishing and Traditional Publishing

One of the oldest e-journals in archaeology is Internet Archaeology, 191 a peer-reviewed journal that was established in 1996 and sought to make full use of electronic publishing. There are some other interesting examples of electronic publication in archaeology, including FastiOnline. 192 Nonetheless, a major new report from the Center for Studies in Higher Education (CSHE) at the University of California, Berkeley, 193 found that most archaeologists were still very distrustful of electronic publication, peer-reviewed or not (Harley et al. 2010). The report investigated the potential of digital scholarship and scholarly communication across a number of disciplines, including archaeology, by interviewing scholars at elite institutions.

This same report surveyed how digital technology was affecting the nature of publishing, tenure, and scholarship in archaeology. Traditional publishing of archaeological scholarship is typically done through monographs and less frequently through journal articles, although in more technical fields such as epigraphy or papyrology, journal articles are more likely to be the standard. The complicated nature of archaeological data, with its extensive use of images and other multimedia, together with the limitations of print publishing, has nevertheless led many projects to create complex archaeological websites or to pursue more sophisticated digital publishing options. Two interesting examples are Ostia: Harbour City of Ancient Rome 194 and the Pylos Regional Archaeological Project. 195

Despite this growing practice, the CSHE researchers documented a fairly strong resistance to the consideration of digital publications for tenure dossiers; this was largely because of the lack of precedent, a general uncertainty regarding how to peer review electronic publications, and a belief that such projects often failed to make scholarly arguments. A quote from one scholar reflects this general view:

But I would say the test for me is not do you have computer wizards doing classics—the answer is yes—but instead, are there works in classics that have come up, which are excellent particularly because of their technological connections I don’t know of one. So I think that there is a basic mistrust toward the digital medium in academia, and particularly with regard to tenure and promotion. And I think that to some extent it’s justified. … At best, the digital medium produces data that are structured sometimes very well and with a great deal of interactive opportunities, search capabilities, and whatnot. But a website does not really develop an argument, and what we expect of a scholar, young or old, is to be able to develop an argument (Harley et al. 2010, 38).

187 For a detailed overview of how digital technologies can be used in archaeology, including data recording, GIS, CAD modeling, the digitization of both legacy and newly collected data, and digital preservation, see Eiteljorg and Limp (2008). An earlier article by Eiteljorg serves as a useful introduction to the topic of archaeological computing (Eiteljorg 2004). A still earlier but also exhaustive survey of the use of computing applications in archaeology can be found in Richards (1998).

188 http://www.leidenuniv.nl/caa/

189 http://www.vsmmsociety.org/site/

190 http://cipa.icomos.org/

191 http://intarch.ac.uk/

192 http://www.fastionline.org/index.php

193 http://cshe.berkeley.edu/

194 http://www.ostia-antica.org/

195 http://classics.uc.edu/PRAP/

Harley and colleagues' report noted that one department head suggested that digital publications be represented in a “new category” between service and teaching. Many archaeologists who were interviewed did not seem to realize that a number of online journals, such as Internet Archaeology, are peer-reviewed. 196 While traditional publishing largely remains the rule, the Center for Hellenic Studies (CHS) 197 has founded its own digital publishing program that includes journals such as Classics@, and has also put electronic books on its website. 198 On the other hand, the APA and Archaeological Institute of America (AIA) released a report on electronic publications in 2007 in which the most “revolutionary” move proposed was for the APA to “explore a new digitally-distributed series of APA monographs” (APA/AIA 2007). 199

Not all scholars interviewed by the CSHE were pessimistic about the potential of electronic publishing, however, and some believed the potential was largely dependent on the discipline. One scholar interviewed felt it was reasonable to expect that within 10 years “most of the scholarly life cycle in papyrology” would be integrated and accessible on the same technological platform. This scholar was also very hopeful regarding the digital potential for epigraphy and the development of a comprehensive digital environment for Greek, but argued that far more work needed to be done for Latin. A final insight offered by this individual was that “these digital projects make it possible for the rest of us to get access to the material that makes the books possible” (Harley et al. 2010, 64). Increasing digital access to archaeological data, according to this scholar, not only supported digital scholarship but also had great benefits for traditional monograph publishing.

Despite many traditional scholars’ reluctance to evaluate web-based publications, the CSHE report maintained that new models of online publication not only offer better ways of integrating all types of data, from site reports and databases to videos and digital reconstructions, but also can provide almost immediate access to data as they are discovered. Even more important, “these initiatives are truly data-driven, collaborative, and require a shift in the traditional thinking of the ‘monograph as the final publication on a site’” (Harley et al. 2010). Online publication thus offers new opportunities for data-driven scholarship, collaboration, and almost real-time updating. The report authors assert that an additional advantage of the dynamic nature of online publication is that scholarly arguments can evolve as more data are published and this process can be made much more transparent to the reader.

Meckseper and Warwick (2003) make a similar point, noting that electronic publication can help archaeology as a discipline move toward a better integration of data and published interpretations. They reflected that the practice of treating excavation reports as sole data archives had come under heavy criticism by the 1980s, as more archaeologists came to realize “the distinction between data and interpretation was often not as easy to maintain as previously assumed” (Meckseper and Warwick 2003). The ability to represent uncertainty regarding individual scholarly interpretations or statements is one of the reasons the authors chose to use TEI to encode archaeological reports.

196 A recent article by Martin Carver has examined the issue of peer review, open access, and journal publication in archaeology (Carver 2007).

197 The CHS is a classical research institute that is affiliated with Harvard University and is based in Washington, DC. Founded in 1962 through an independent endowment made “exclusively for the establishment of an educational center in the field of Hellenic Studies designed to re-discover the humanism of the Hellenic Greeks,” the CHS hosts a number of innovative digital projects (such as the Homer Multitext) and is also responsible for a number of different online publications.

198 http://chs.harvard.edu/wa/pageRtn=Publications&bdc=12&mn=0

199 A more recent report released by the APA has indicated that the association is planning to launch a major “Campaign for Classics in the 21st Century” that has been endorsed by the NEH through a $650,000 challenge grant that requires a 4-to-1 match. If successful, the APA plans to build a Digital Portal that will include access to the bibliographic resources of the APh, to the full text of articles, dissertations, and monographs, to images of original source materials (as well as interpretative guides), as well as to interactive maps, images, audio recordings, video productions of ancient dramas, documentaries, syllabi, etc. The portal will also be designed to collaborate with other digital classics projects and provide both research and teaching support (APA 2010).

Stuart Dunn has also discussed how electronic publication in archaeology has a number of both benefits and difficulties. The deposit of archaeological data into digital repositories or virtual research environments (VREs), while still preserving copyright and intellectual property, is a difficult but ultimately worthwhile goal, he contends. Nonetheless, the potential of linking published articles, for example, to the data they reference also raises questions of data accuracy, controlled access, security, and transparency:

Where a discussion in a published article focuses on a particular set of primary data, there is a clear logic to deploying VRE tools, where available, to make that data available alongside the discussion. However, in such situations it is incumbent upon the VRE to ensure that those data are trustworthy, or, if they are not (or might not be), to provide transparent documentation about the process(es) of analysis and manipulation via which they have come to support the published discussion. … the term “research” in Virtual Research Environment implies that the outputs meet “conventional” standards of peer review and evaluation (Dunn 2009).

Although developing new models of peer review and authentication for digital publication will not be easy, Dunn rightly concludes that this does not mean that such models are not worth working toward.

Data Creation, Data Sharing, and Digital Dissemination

Archaeological research provides a wealth of data of greatly different types, as the CSHE report summarizes:

Archaeological research is somewhat exceptional among its humanistic neighbors in its reliance on time- and location-specific data, abundant use of images, and dependence on complex interdisciplinary teams of scholars and specialists, who work on both site excavation and complex lab-based data analysis. Teams produce a plethora of data types in archaeology, including three-dimensional artifacts, maps, sketches, moving and still images, flora and faunal assemblages, geological samples, virtual reconstructions, and field notes. (Harley et al. 2010, 30-31)

These greatly varying kinds of data make the development of any standards for data recording, sharing, and preservation a considerable undertaking.

In terms of data sharing, the CSHE report suggested that most archaeological scholars shared ideas through informal networks, e-mail, and small meetings, but tended to keep all data and work in progress to themselves until formal publication. A variety of factors influenced these decisions, including various stakeholder interests, fear of data being “poached,” the sensitivity of some archaeological sites, and the “messiness of the data.” On the other hand, papyrologists tended to work together, a factor that is discussed in greater detail later. While some scholars were familiar with working-papers sites in classics such as the Princeton/Stanford Working Papers in Classics (PSWPC) 200 and the Classics Research Network (CRN), 201 there was not a great deal of interest in whether such a working-papers site should be created for archaeology. One scholar interviewed by Harley et al. succinctly described the problem as being “that archaeology has a culture of ownership of ideas as property, rather than the culture of a gift economy” (Harley et al. 2010, 81). While archaeologists most often collaborated on data collection, they rarely coauthored articles, a criticism that is also often made of classicists.

Nonetheless, the CSHE report also illustrated that for a growing number of archaeologists the idea of data sharing as one means of data preservation is gaining importance. Since archaeological sites are typically destroyed during an excavation, scholars highlighted the need to be very meticulous in recording all the necessary information about data and also revealed that a great deal of “dark data” from excavations never makes it to final publication. In other words, the published record and the data from an excavation are the only surviving record of an archaeological site in many cases. Meckseper and Warwick also underscore this fact in their exploration of XML as a means of publishing archaeological excavation reports. “Archaeology is a destructive process: the physical remains in the ground are destroyed through their excavation and lifting of material,” Meckseper and Warwick confirmed, adding that “the written record and publication have therefore always been seen as synonymous with the preservation of the archaeological record” (Meckseper and Warwick 2003). Shen et al. (2008) agreed that for this reason, the digital recording of data at both the planning and excavation stages is extremely important, for, as they noted, “unlike many other applications of information systems, it simply is not possible to go back and re-check at a later date.”

While many scholars interviewed by the CSHE believed in the need to share more data sets, they also lamented the complete lack of standards for sharing and preserving data. The best-known data model for archaeology is Archaeological Markup Language (ArchaeoML), an XML schema for archaeological data (Schloen 2001) that serves as the basis of the XML database of the OCHRE (Online Cultural Heritage Research Environment) project. 202 Based at the University of Chicago, OCHRE is an Internet database system that has been designed to manage cultural heritage information. According to the website, “it is intended for researchers engaged in artifactual and textual studies of various kinds. It is especially suitable (1) for organizing and publishing the results of archaeological excavations and surveys and (2) for preparing and disseminating philological text editions and dictionaries.” OCHRE implements a core ontology for cultural heritage information and uses a “global schema” to which local schemas of various projects can be mapped to facilitate data integration. 203 A number of projects, including the Chicago Hittite Dictionary 204 and the PFAP, 205 are using OCHRE to present their research.

The CSHE report also drew attention to the fact that data preservation in archaeology is becoming increasingly problematic because of the need to preserve both analog and digital data, particularly large amounts of legacy data. 206 Many scholars who were interviewed wanted more institutional support for storing, migrating, and backing up their data. A related problem is that while archaeological projects funded with public money in the United Kingdom are required to make their data publicly available, publishing mandates differ greatly among U.S. funders.

200 http://www.princeton.edu/~pswpc/index.html

201 http://www.ssrn.com/crn/index.html

202 http://ochre.lib.uchicago.edu/index.htm

203 For a good overview of this system and how it compares to the CIDOC-CRM and to tDAR of Digital Antiquity, see http://ochre.lib.uchicago.edu/index_files/Page794.htm

204 http://ochre.lib.uchicago.edu/eCHD/

205 http://ochre.lib.uchicago.edu/PFA_Online

206 An entire issue of Internet Archaeology in 2008 was dedicated to the issue of dealing with legacy data (http://intarch.ac.uk/journal/issue24/index.html).

Open Context, 207 an “open access data publication service for archaeology” created by the Alexandria Archive Institute, is attempting to address some of these issues of data sharing and openly accessible publication. In an article that describes Open Context, Kansa et al. (2007) explicate why data sharing and dissemination are particularly complicated for archaeology:

Among the primary technical and conceptual issues in sharing field data is the question of how to codify our documentation. Archaeologists generally lack consensus on standards of recording and tend to make their own customized databases to suit the needs of their individual research agendas, theoretical perspectives, and time and budgetary constraints. … Because of this variability, databases need extensive documentation for others to decipher their contents (Kansa et al. 2007).

Consequently, the authors argue that merely making archaeological data sets available for download will not solve this basic problem and that a better solution is to “serve archaeological databases in dynamic, online websites, thus making content easy to browse and explore.” Open Context seeks to make the publishing and dissemination of cultural heritage collections both easier and more affordable. The basic architecture of Open Context is a flexible database that allows researchers to publish structured data, textual narratives, and media on the web using only open-source technologies. 208 Open Context supports “publishing, exploring, searching, and analyzing multiple museum collections and field research datasets.”

Open Context uses only a subset of the “OCHRE data structure (ArchaeoML),” because, as Kansa et al. state, while OCHRE “provides sophisticated data-management tools targeted for active research projects” the goal of Open Context is “to support streamlined, web-based access and community organization of diverse cultural heritage content” (Kansa et al. 2007). The project ultimately decided to use ArchaeoML 209 because of its flexibility:

Overly rigid standards may inhibit innovation in research design and poorly accommodate “legacy” datasets. … The flexibility of ArchaeoML enables Open Context to deliver content from many different research projects and collections. A web-based publishing tool called “Penelope” enables individual contributors to upload their own data tables and media files and submit them for review and publication in Open Context. This tool enables web publication of research while ensuring that a project’s original recording system and terminology are retained (Kansa et al. 2007).

The ability of standards such as ArchaeoML to provide some basic level of interoperability while also supporting the inclusion of legacy structures is thus an essential feature for any cyberinfrastructure for archaeology.

Another important issue that Open Context seeks to address is the challenge of open access and copyright. All Open Context contributors retain copyright to their own content, but Kansa et al. (2007) state that they are encouraged to publish their data elsewhere to better support dissemination and digital preservation. The authors contend that current copyright laws make digital preservation difficult because permission to copy any data must be granted explicitly by the copyright holder, even if data are copied simply for the purpose of backup; thus, they require all contributors to use copyright licenses that grant permission to reproduce content. Kansa et al. (2007) also insisted that “copyright will typically apply to most archaeological field data” to assuage the concern of most archaeologists that their field data will be stolen if they place them online before formally publishing them. Others have challenged this point, asserting that any field data published online will enter the public domain immediately (Cahill and Passamano 2007). While this legal debate is beyond the scope of this report, Open Context also supports other features such as time-stamping the accession of new collections, clearly identifying authorship, and providing permanent citable URLs in an effort to encourage proper citation and reuse of data. The creators of Open Context hope that the development of this system ultimately will serve as one step in increasing open access in the field of archaeology.

207 http://opencontext.org/

208 Another project that seeks to provide an integrated open-source “Archaeological Information System” is the Open Archaeology Software Suite (https://launchpad.net/openarchaeology).

209 Details on how ArchaeoML has been implemented in the Open Context system, as well as on their approach to copyright and open access, can be found in Kansa et al. (2010).

Although collaboration and data sharing are often not the norm, the field of research archaeology in particular requires a great deal of collaboration because of the large number of specialized fields it involves. The CSHE report also noted increasing collaboration between domain specialists in archaeology and technical experts. At the same time, many archaeologists argued that more domain specialists needed to become technical experts as well, to design tools for the field that would actually be used:

Indeed, some scholars who specialize in such areas as “virtual heritage” consider themselves to be “methodologists of archaeological research” or “technological ambassadors,” rather than particular experts in a specific period or culture. Moreover, some scholars with dual archaeological and technical expertise may turn to parallel career paths, and may play an important, often central, role in creating the infrastructure for successful scholarship (Harley et al. 2010, 109).

The growing need to master both domain and technical expertise is a theme that is seen throughout the overviews of all the digital classical disciplines.

Data Integration, Digital Repositories, and Cyberinfrastructure for Archaeology

Arguably one of the best-known repositories for archaeological data is maintained by the Archaeology Data Service (ADS), 210 an organization based at the University of York in the United Kingdom. The ADS provides digital-archiving services for archaeology projects within the United Kingdom, offers a searchable catalog of projects and their data (labeled ArchSearch), and promotes best practices for digitization, preservation, and database management in the larger field of archaeology. 211 Once part of the now-defunct Arts & Humanities Data Services (AHDS), 212 the ADS receives funding from the Arts and Humanities Research Council (AHRC) but also has a charging policy with set fees for storage and dissemination. 213 The ADS acknowledged the importance of digital preservation from the time it was established in 1996 (Richards 1997), and a later article by William Kilbride (Kilbride 2005) offered some insights from 10 years of preserving archaeological data. “Experience at the ADS shows that preservation works most easily when creators of digital resources anticipate a future audience and when inheritors of digital resources actively curate them,” Kilbride emphasized, adding that “planning for re-use is a simple slogan, but in the digital age that planning has to start at the point of data creation, not at the end of a project as happens with conventional archives” (Kilbride 2005). Kilbride highlighted as essential the need to plan, from the beginning of a project, both for the preservation of digital data and for their reuse, and he explained that in the first 10 years of its existence the ADS also developed best-practice guidelines (e.g., for the creation of data, the deposit of data) that are continuously refined and well documented, built a rights-management framework, and created initial long-term preservation cost models. Kilbride also noted that more than 10 years of experience had shown ADS staff how various types of digital resources were used in different ways.

210 http://ads.ahds.ac.uk/

211 In fact, the ADS is working with the Digital Antiquity project (also discussed in this section) to define best practices in the digital preservation of archaeological data that will help maximize its potential reuse (Mitcham, Niven, and Richards 2010).

212 http://ahds.ac.uk/

213 http://www.ahrc.ac.uk/Pages/default.aspx

The challenge of preserving archaeological data, both newly created digital data and analog data in a myriad of formats, as briefly touched upon in the last section, has long been considered a significant problem facing the field, one that raises a diverse set of infrastructure challenges. The Center for the Study of Architecture (CSA) 214 based at Bryn Mawr University led an initiative to develop an Archaeological Data Archive Project (ADAP) in 1993 in an effort to raise awareness regarding data preservation and to encourage scholars to deposit their own digital materials into a shared data archive, but a lack of funds limited the success of the initiative. 215 As outlined by McManamon and Kintigh (2010), legacy data include lengthy technical reports, grey literature, and digital data stored on computer cards, magnetic tapes, and floppy disks found in museums, archives, and faculty offices. Most of these digital data, the authors noted, have been preserved only in the sense that the physical media on which they are stored are being actively preserved by archaeological repositories, although this provides no means of access to the data. 216

The challenges of complicated data, diverse standards, the difficulties of meaningful archaeological data integration, and the problems of long-term preservation were also outlined in a discussion of the cyberinfrastructure needs of archaeology by Snow et al. (2006). They listed three particular types of data that are almost impossible to access simultaneously because of the lack of cyberinfrastructure for archaeology: databases using different standards for both recording and managing data on different technical platforms; a large volume of “grey literature”; and images, maps, and photographs found in museum catalogs and both published and unpublished archaeological reports. The authors proposed an initial cyberinfrastructure architecture for archaeology that included existing digital library middleware (such as Fedora) 217 and content-management tools, document and image-searching technologies, geographic information systems (GIS), visualization tools, and the use of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) 218 because it provides an “application-independent interoperability framework for metadata harvesting” by repositories. Snow et al. (2006) reported that such an infrastructure could be built almost entirely from open-source components.
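To make the idea of metadata harvesting more concrete, the following Python sketch shows how a harvester might build an OAI-PMH ListRecords request and read record identifiers and titles from a response. The ListRecords verb, the metadataPrefix and set arguments, and the oai_dc metadata format come from the OAI-PMH specification; the repository address, the set name, the sample record, and the helper function names are assumptions made purely for illustration and do not describe the architecture proposed by Snow et al. (2006).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional: restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"


def extract_titles(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [
        (
            rec.findtext("oai:header/oai:identifier", namespaces=OAI_NS),
            rec.findtext(".//dc:title", namespaces=OAI_NS),
        )
        for rec in root.iterfind(".//oai:record", OAI_NS)
    ]


# Hypothetical response, trimmed to the elements this sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:site-001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Trench 4 context sheets</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical endpoint and set name.
print(list_records_url("https://repository.example.org/oai", set_spec="excavations"))
print(extract_titles(SAMPLE_RESPONSE))
```

Because the request is an ordinary HTTP query and the response is plain XML, a harvester of this kind needs no knowledge of the software running behind the repository, which is the sense in which the protocol is application-independent.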

These authors also identified several critical problems that needed to be addressed, including the lack of any type of standard protocols for recording the greatly varying types of archaeological data and the absence of any standardized tools for access. Nonetheless, the authors suggested that rather than trying to force the adoption of one data model, users should employ semantic mapping tools to map different terminologies or vocabularies in use, while also working within the archaeological community to establish at least minimal shared standards for description. Finally, to ensure sustainability, Snow et al. argued that “data collections should be distributed and sharable” and that “digital libraries and associated services should be made available to researchers and organizations to store their own data and mirror data of others.”

214 http://www.csanet.org/

215 For further discussion of the ADAP and the viability of digital archives in archaeology, see Eiteljorg (2005) and Xia (2006).

216 The authors cite Childs and Kagan (2008) as the source for this statement.

217 Fedora is an advanced digital repository platform (http://www.fedora-commons.org/) that is available as open source.

218 http://www.openarchives.org/pmh/

Slightly more recent research by Pettersen et al. (2008) has reached similar conclusions. This article reported on attempts to create an integrated data grid for two archaeological projects in Australia, and stated that:

A continuing problem in the archaeological and cultural heritage industries is a lack of coordinated digital resources and tools to access, analyze and visualize archaeological data for research and publication. A related problem is the absence of persistent archives that focus on the long-term preservation of these data. As a result professionals and researchers are either unaware of the existence of data sets, or aware of them but unable to access them for a particular project (Pettersen et al. 2008).

One potential benefit of a coordinated cyberinfrastructure or integrated digital archive for more archaeological projects, as indicated here, is that it would allow more researchers not only to find and possibly reuse data but also to use their own tools (such as visualizations) with those data. The architecture ultimately chosen by this project was the Storage Resource Broker (SRB) 219 developed by the San Diego Supercomputer Center. The biggest challenge they found in using the SRB was the lack of an easy-to-use interface. Their project also encountered various challenges of data capture in the field and problems with data logging, and, like many other researchers, they criticized the fact that “there is no standardized methodology in archaeology for recording data in a digital format.” While Pettersen et al. (2008) are exploring the use of ArchaeoML for data integration and portability, they also submitted that the largest challenge was to create a user-friendly way for archaeologists to interact with the data grid.

As was illustrated above, Open Context makes use of the XML standard ArchaeoML to integrate disparate archaeological collections, and Kilbride (2005) has also suggested that XML could be used as a standard for the digital preservation of archaeological data. He noted, however, that there was relatively little XML activity among archaeologists, largely because of the diverse nature of the community, which includes archaeological field workers, find specialists, museum curators, and many others, all of whom often have different ways of describing the same information. Another significant issue Kilbride listed was that to support the uptake of XML by archaeologists, more XML-based tools needed to move out of the development stage and into production, and that consolidation, rather than elaboration, was needed. “Working on the assumption that different types of archaeology will gravitate towards subtly different flavours of XML, then perhaps the most important technical part of this work,” Kilbride concluded, “will be an open and systematic declaration of the semantics of various types of XML and appropriate mappings between them, including explicit indications of terms in a given schema that cannot be mapped directly to terms in another” (Kilbride 2005).
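Kilbride's point about declaring which terms cannot be mapped can be illustrated with a small Python sketch. It applies a crosswalk from one project's field names to a shared vocabulary and reports, rather than silently dropping, any field that has no declared equivalent. The field names, the shared terms, and the apply_crosswalk function are all hypothetical; they are not drawn from ArchaeoML or from any schema discussed by Kilbride (2005).

```python
from typing import Dict, Tuple

# Hypothetical crosswalk: local field name -> term in a shared vocabulary.
CROSSWALK: Dict[str, str] = {
    "locus": "context_id",
    "find_type": "object_type",
    "date_excavated": "recorded_on",
}


def apply_crosswalk(
    record: Dict[str, str], crosswalk: Dict[str, str]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split a record into mapped fields and fields with no declared mapping."""
    mapped: Dict[str, str] = {}
    unmapped: Dict[str, str] = {}
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is None:
            # Keep the original term so the gap in the mapping stays visible.
            unmapped[field] = value
        else:
            mapped[target] = value
    return mapped, unmapped


# Hypothetical record from a project database.
record = {
    "locus": "L-204",
    "find_type": "ceramic sherd",
    "munsell_colour": "7.5YR 5/4",
}

mapped, unmapped = apply_crosswalk(record, CROSSWALK)
print("Mapped:", mapped)
print("No direct equivalent:", unmapped)
```

Returning the unmapped fields alongside the mapped ones is a deliberate choice in this sketch: it keeps a project's original terminology available and makes the limits of the mapping explicit, in the spirit of the declaration Kilbride calls for.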

A variety of research by the ETANA-DL 220 has also explored the difficulties of integrating<br />

archaeological collections, particularly how to map different schemas together. This digital library is<br />

part of the larger project ETANA (Electronic Tools and Ancient Near Eastern Archives), 221 which also<br />

provides access to ABZU and includes a collection of core texts in the field of Ancient Near East<br />

studies. The ETANA-DL used the “5S” (streams, structures, spaces, scenarios, and societies)<br />

219 http://www.sdsc.edu/srb/index.php/Main_Page<br />

220 http://digbase.etana.org:8080/etana/servlet/Start<br />

221 http://www.etana.org/



framework to integrate several archaeological digital libraries (Shen et al. 2008). They developed a<br />

domain metamodel for archaeology in terms of the 5S model and focused particularly on the<br />

challenges of digital library integration. The architecture of ETANA-DL consists of a “centralized<br />

catalog and partially decentralized union repository.” To create the centralized union catalog they used<br />

mapping and harvesting services. ETANA-DL also continues to provide all the services offered by the<br />

individual digital libraries they integrated through what they term union services:<br />

Union services are new implementations of all the services supported by member DLs to be<br />

integrated. They apply to the union catalog and the union repository that are integrated from<br />

member DLs. The union services do not communicate with member DLs directly and thus do<br />

not rely on member DLs to provide services (Shen et al. 2008).<br />

The authors stress the importance of providing integrated user services over an integrated digital<br />

library, while still developing an architecture that allows the individual libraries to retain their<br />

autonomy.<br />

In agreement with Kansa et al. (2007), the creators of ETANA-DL also warned against attempting to<br />

create one universal schema for archaeology:<br />

Migration or export of archeological data from one system to another is a monumental task that<br />

is aggravated by peculiar data formats and database schemas. Furthermore, archeological data<br />

classification depends on a number of vaguely defined qualitative characteristics, which are<br />

open to personal interpretation. Different branches of archeology have special methods of<br />

classification; progress in digs and new types of excavated finds make it impossible to foresee<br />

an ultimate global schema for the description of all excavation data. … Accordingly, an<br />

“incremental” approach is desired for global schema enrichment (Shen et al. 2008).<br />

Instead of using ArchaeoML or a subset, as with Open Context, the creators of ETANA-DL have<br />

incrementally created a global schema and support data integration between different digital libraries<br />

through the use of “an interactive software tool for database-to-XML generation, schema mapping, and<br />

global archive generation” (Vemuri et al. 2006). The three major components of the ETANA-ADD<br />

tool are a database-to-XML converter, a schema mapper, and an OAI-XML data provider tool. The first<br />

component, “DB2XML,” converts data from custom databases into XML collections. “The end user can<br />

open tables corresponding to an artifact, and call for an SQL join operation on them. Each record of the<br />

result represents an XML record in the collection, and the structure of the dataset determines the local<br />

XML schema,” Vemuri et al. (2006) explained, adding that “based on this principle, the component<br />

generates a local XML collection and its XML schema.” After a local XML collection is generated,<br />

end users interact with a tool called Schema Mapper, which leads a user through mapping the local<br />

XML schema that has been generated for their database into the global XML schema used by<br />

ETANA-DL. If a particular artifact type or other item is not available, the global XML schema is extended to<br />

include it. The final component is an OAI XML data provider, which exposes the new XML<br />

collection for OAI harvesting. Thus, the ETANA-DL created a system that allowed for an<br />

almost-lossless conversion of individual databases into their own global schema and created an<br />

integrated archaeological digital library from three individual ones that can now be searched<br />

seamlessly.
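The first two ETANA-ADD steps described above (a join whose result rows become XML records, followed by a mapping of the local schema into an incrementally extended global schema) can be illustrated with a minimal Python sketch. This is not the ETANA-ADD code: the table names, column names, element names, and the functions `db_to_xml` and `map_to_global` are all hypothetical, and the "schema" is reduced to a list of element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def db_to_xml(conn, join_sql, record_tag="record"):
    """Run a join; each result row becomes one XML record.

    The column names of the result stand in for the local XML schema.
    """
    cur = conn.execute(join_sql)
    columns = [d[0] for d in cur.description]
    root = ET.Element("collection")
    for row in cur:
        rec = ET.SubElement(root, record_tag)
        for name, value in zip(columns, row):
            ET.SubElement(rec, name).text = "" if value is None else str(value)
    return root, columns


def map_to_global(local_root, mapping, global_schema):
    """Rename local elements to global ones.

    A local element with no global counterpart is kept under its own name,
    and that name is appended to the global schema (incremental enrichment).
    """
    out = ET.Element("collection")
    for rec in local_root:
        new_rec = ET.SubElement(out, rec.tag)
        for field in rec:
            target = mapping.get(field.tag, field.tag)
            if target not in global_schema:
                global_schema.append(target)
            ET.SubElement(new_rec, target).text = field.text
    return out


# Hypothetical excavation database with two related tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locus (locus_id INTEGER PRIMARY KEY, site TEXT);
    CREATE TABLE bone (bone_id INTEGER PRIMARY KEY, locus_id INTEGER, animal TEXT);
    INSERT INTO locus VALUES (1, 'Site A');
    INSERT INTO bone VALUES (10, 1, 'sheep');
    INSERT INTO bone VALUES (11, 1, 'goat');
""")

local_xml, local_schema = db_to_xml(
    conn,
    "SELECT b.bone_id AS bone_id, b.animal AS animal, l.site AS site "
    "FROM bone b JOIN locus l ON b.locus_id = l.locus_id ORDER BY b.bone_id",
)

global_schema = ["identifier", "site"]   # assumed starting point
mapping = {"bone_id": "identifier"}      # 'animal' has no global term yet
global_xml = map_to_global(local_xml, mapping, global_schema)

print(ET.tostring(global_xml, encoding="unicode"))
print(global_schema)
```

In this sketch the unmapped `animal` element is what triggers the extension of the global schema, which mirrors the incremental approach the authors describe; a real tool would also need to handle type information and conflicting definitions.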



Another research project that has explored integrating diverse data sets in archaeology is the STAR<br />

(Semantic Technologies for Archaeology Resources) Project 222 based at the University of Glamorgan.<br />

This AHRC-funded project worked in collaboration with English Heritage (EH) 223 to develop<br />

semantic technologies that could be used in the domain of digital archaeology. They sought to<br />

demonstrate the utility of cross-searching archaeological data that were expressed as RDF and that<br />

conformed to a single ontological scheme (Binding et al. 2008). For this ontological scheme, Binding<br />

et al. (2008) explained that they had created a “modular RDF extension” of CRM-EH, 224 an<br />

archaeological extension of the CIDOC-CRM that had been produced by EH but had previously only<br />

existed on paper. “The initial work on the CRM-EH was prompted by a need to model the<br />

archaeological processes and concepts in use by the (EH) archaeological teams,” Binding et al. (2008)<br />

explained, “to inform future systems design and to aid in the potential integration of archaeological<br />

information in interoperable web based research initiatives.” It was hoped that the design of a common<br />

ontology would support greater “cross domain searching” and more “semantic depth” for<br />

archaeological queries.<br />

The STAR project mapped five archaeological databases, each with its own unique schema, to the<br />

CRM-EH ontology. The initial mapping between database columns and RDF entities was undertaken<br />

manually with the assistance of domain experts for one of these databases; this mapping was then used<br />

to extrapolate the mappings for the other databases. After the data from these five databases were<br />

mapped to the CRM-EH, a complicated data-extraction process that involved the creation of unique<br />

identifiers, the modeling of events through the creation of intermediate “virtual” entities, and the<br />

modeling of “data instance values” was undertaken. A mapping and extraction utility was built to<br />

allow users to query the archaeological data and then save their query as an XML file for later reuse<br />

and the results of their query as tabular data in an RDF format. To help users work with the extracted<br />

archaeological data that were stored as RDF, the STAR project initially built a prototype<br />

“search/browse application” where the extracted data were stored in a MySQL 225 RDF triplestore 226<br />

and users could query it using a CRM-based web service. This RDF data store not only holds the<br />

CRM-EH ontology and the “amalgamated data” from the separate archaeological databases but also<br />

includes a number of important domain-specific thesauri and glossaries that are represented in SKOS<br />

(Binding 2010). 227<br />
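The extraction steps named above (minting unique identifiers, modeling an event as an intermediate “virtual” entity, and attaching data instance values) can be sketched as a function that turns one database row into subject-predicate-object triples. Everything here is an assumption made for illustration: the `http://example.org/star-sketch/` namespace, the class and property names, and the row layout are hypothetical and are not CRM-EH terms or the STAR utility's actual output.

```python
from urllib.parse import quote

BASE = "http://example.org/star-sketch/"   # hypothetical namespace, not CRM-EH
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Literal(str):
    """Marks a plain data value, as opposed to a URI."""


def mint_uri(kind, key):
    """Create a unique identifier from an entity kind and a database key."""
    return f"{BASE}{kind}/{quote(str(key), safe='')}"


def row_to_triples(row):
    """Map one row of a (hypothetical) finds table to triples."""
    find = mint_uri("find", row["find_id"])
    context = mint_uri("context", row["context_no"])
    # The event has no column of its own: it is a "virtual" entity that
    # links the find to the context in which it was deposited.
    event = mint_uri("deposition", row["find_id"])
    return [
        (find, RDF_TYPE, BASE + "Find"),
        (find, BASE + "material", Literal(row["material"])),
        (event, RDF_TYPE, BASE + "DepositionEvent"),
        (event, BASE + "deposited", find),
        (event, BASE + "tookPlaceIn", context),
    ]


def to_ntriples(triples):
    """Serialize triples in an N-Triples-like form (simplified escaping)."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = (o.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


row = {"find_id": "F12", "context_no": "C7", "material": "pottery"}
triples = row_to_triples(row)
print(to_ntriples(triples))
```

Because every database is mapped to the same set of classes and properties, triples produced from different sources can be loaded into one store and queried together, which is the point of mapping to a single ontology.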

Moving beyond this initial prototype, the STAR project website offers access to a full research<br />

demonstrator 228 that provides a SPARQL 229 -based semantic search not only of the extracted databases<br />

but also of an archaeological grey literature collection made available to them by the ADS. 230 They<br />

used NLP information-extraction techniques 231 to identify key concepts in the grey literature that were<br />

222 http://hypermedia.research.glam.ac.uk/kos/star/<br />

223 http://www.english-heritage.org.uk/<br />

224 http://hypermedia.research.glam.ac.uk/resources/crm/<br />

225 MySQL is a “relational database management system that runs as a server providing multi-user access to a number of databases”<br />

(http://en.wikipedia.org/wiki/MySQL). The source code for MySQL has been made available under a GNU General Public License, and MySQL is a popular choice<br />

of “database for use in web applications.”<br />

226 A triplestore is a “purpose-built database for the storage and retrieval of Resource Description Framework (RDF) metadata”<br />

(http://en.wikipedia.org/wiki/Triplestore#). Triplestores are designed for the storage and retrieval of statements known as triples, “in the form of subject-predicate-object,”<br />

as are used in the creation of RDF statements.<br />

227 SKOS stands for “Simple Knowledge Organization System” and provides an RDF model for encoding reference tools such as thesauri, taxonomies, and<br />

classification systems. SKOS is under active development as part of the W3C’s Semantic Web activity (http://www.w3.org/2004/02/skos/).<br />

228 http://hypermedia.research.glam.ac.uk/resources/star-demonstrator/<br />

229 http://www.w3.org/TR/rdf-sparql-query/<br />

230 The grey literature collection that the STAR project used was an “extract of the OASIS corpus” that is maintained by the ADS. The OASIS (Online<br />

AccesS to the Index of archaeological investigationS) (http://oasis.ac.uk/) corpus seeks to “provide an online index to the mass of archaeological grey<br />

literature that has been produced as a result of the advent of large-scale developer funded fieldwork.”<br />

231 The STAR project made use of GATE (http://gate.ac.uk/)



then mapped to the CRM-EH so that both the extracted data sets and the textual documents could be<br />

searched by the same semantic metadata terms. 232 This project has thus successfully demonstrated one<br />

approach to reintegrating the textual and material records within the domain of archaeology, a goal<br />

also sought by the Archaeotools project discussed below.<br />

While the STAR project concluded in 2010, this research led to another currently funded project<br />

entitled STELLAR (Semantic Technologies Enhancing Links and Linked data for Archaeological<br />

Resources). 233 The STELLAR project is working with the ADS to “enhance the discoverability,<br />

accessibility, impact and sustainability of ADS datasets and STAR project outcomes (services and data<br />

resources)” by focusing on semantic interoperability in terms of state-of-the-art data-integration<br />

techniques. This project will develop a series of guidelines for the mapping and extraction of diverse<br />

archaeological resources into an RDF-XML representation of the CIDOC CRM-EH ontology as well as<br />

for the publishing of these resources as Linked Data, create a mapping tool to assist nonspecialist users<br />

in the mapping process, and publish all the extracted archaeological data as Linked Data on the web.<br />

Seeking to address the various issues outlined above, several recent initiatives have sought to provide<br />

the beginnings of a cyberinfrastructure for providing both advanced and meaningful integrated access<br />

to and the preservation of archaeological data. The Mellon Foundation–funded Archaeoinformatics 234<br />

was “established as a collaborative organization to design, seek funding for, and direct a set of<br />

cyberinfrastructure initiatives for archaeology.” This project ran from 2007 to 2008. According to the<br />

project summary, the initiative sought to provide preservation for both archaeological data and<br />

metadata and to build a digital archive for data that would provide access to both scholars and the<br />

general public.<br />

While a number of scholars interviewed by the CSHE had great hopes for projects such as<br />

Archaeoinformatics, most still believed that “one of the biggest obstacles is the question of standards<br />

for integrating idiosyncratic data sets on a large scale, as well as the difficulties of securing buy-in<br />

from other stakeholders of archaeological data” (Harley et al. 2010, 91). Similarly, Stuart Dunn in his<br />

overview of developing a specific VRE in archaeology commented that projects were often good at<br />

creating infrastructure, but only at the project level. “Because archaeological fieldwork is by definition<br />

regional or site-specific, excavation directors generally focus the majority of their efforts at any one<br />

time on relatively small-scale data gathering activities,” Dunn declared, adding, “this produces bodies<br />

of data that might be conceptually comparable, but are not standardized or consistent” (Dunn 2009).<br />

The “successor” project to Archaeoinformatics, which is named Digital Antiquity, 235 has thus sought<br />

to address the challenges of diverse and noninteroperable standards, integrating idiosyncratic data,<br />

project-specific infrastructures, and limited stakeholder participation. Digital Antiquity is a<br />

“collaborative organization devoted to enhancing preservation and access to digital records of<br />

archaeological investigation.” With funding from the Mellon Foundation and the National Science<br />

Foundation (NSF), this project has built an “on-line digital repository that is able to provide<br />

preservation, discovery, and access for data and documents produced by archaeological projects.”<br />

Named tDAR (the Digital Archaeological Record 236 ), this repository already includes more than 500<br />

232 A separate “semantic indices portal,” named Andronikos, has also been developed that provides access to the entire OASIS corpus.<br />

(http://andronikos.kyklos.co.uk/)<br />

233 http://hypermedia.research.glam.ac.uk/kos/STELLAR/<br />

234 http://archaeoinformatics.org/<br />

235 http://www.digitalantiquity.org/<br />

236 http://www.tdar.org/



documents and 60 data sets 237 and plans to encompass digital data from both ongoing research and<br />

legacy archaeological projects, with a focus on American archaeology.<br />

The two major goals of Digital Antiquity are to provide greater and more sophisticated access to<br />

archaeological reports and data and to provide a preservation repository for those data (McManamon,<br />

Kintigh, and Brin 2010). Because of the sheer volume of archaeological data and reports that are<br />

generated every year, McManamon and Kintigh (2010) noted that many archaeologists, even those<br />

working in the same field, are often unaware of important results that have already been published,<br />

particularly because of the difficulties in both accessing and sharing data. As with the Open Context<br />

project, Digital Antiquity hopes that both the availability of tDAR and its simplicity of use will<br />

encourage archaeologists to deposit their data and will also support broader analysis and synthesis of<br />

archaeological data by interested researchers.<br />

All information resources that are deposited into tDAR are documented by detailed metadata<br />

(administrative, descriptive, technical, spatial, temporal, keywords, etc.) that are provided by the user,<br />

and such metadata can be associated with either a project or a specific digital resource (e.g., a<br />

spreadsheet or database) (McManamon and Kintigh 2010). It is hoped that this detailed level of<br />

metadata will both take advantage of contributors’ knowledge of their own data and assist users in<br />

discovering relevant resources for their own research. “Descriptive metadata are tailored to the nature<br />

of archaeological data,” McManamon, Kintigh, and Brin (2010) noted, adding that “this metadata both<br />

enables effective resource discovery during browse and search by users and provides the detailed<br />

semantic information needed to permit sensible scientific reuse of the data.”<br />

Rather than attempting complete integration of all data sets or designing a universal data model for<br />

archaeology, Elliott (2008) reported that the “semantic demands” of user queries are reconciled with<br />

the semantic content of available data sets:<br />

tDAR uses a novel strategy of query-driven, ad-hoc data integration in which, given a query,<br />

the cybertools will identify relevant data sources and perform interactive, on-the-fly metadata<br />

matching to align key portions of the data while reasoning with potentially incomplete and<br />

inconsistent information (Elliott 2008).<br />
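The idea of query-driven, ad-hoc integration can be made concrete with a small Python sketch. The source does not describe how tDAR performs its matching, so what follows is only one possible reading: each data set carries contributor-supplied metadata that ties its columns to shared concepts, a query names the concepts it needs, and only the relevant data sets are aligned at query time. The data sets, concept names, and the `integrate` function are hypothetical.

```python
# Each data set keeps its own column names; the metadata maps a column to a
# shared concept. Nothing is merged until a query asks for it.
DATASETS = {
    "site_a_fauna": {
        "columns": {"Taxon": "species", "NISP": "count"},
        "rows": [{"Taxon": "sheep", "NISP": 12}],
    },
    "site_b_bones": {
        "columns": {"animal": "species", "phase": "period"},
        "rows": [
            {"animal": "goat", "phase": "Iron Age"},
            {"animal": "cattle"},                 # incomplete record
        ],
    },
    "site_c_lithics": {
        "columns": {"tool": "tool_type"},
        "rows": [{"tool": "scraper"}],
    },
}


def integrate(query_concepts, datasets):
    """Align only the data sets relevant to the query, at query time."""
    results = []
    for name, ds in datasets.items():
        concept_to_col = {concept: col for col, concept in ds["columns"].items()}
        # A data set is relevant if it covers at least one requested concept.
        if not any(c in concept_to_col for c in query_concepts):
            continue
        for row in ds["rows"]:
            aligned = {"source": name}
            for concept in query_concepts:
                col = concept_to_col.get(concept)
                # Missing columns or values become None instead of
                # disqualifying the record.
                aligned[concept] = row.get(col) if col else None
            results.append(aligned)
    return results


for record in integrate(["species", "period"], DATASETS):
    print(record)
```

The design choice illustrated here is that no global schema is built in advance: the lithics data set is simply ignored for a query about species and period, and gaps in the relevant data sets are carried through as missing values rather than resolved up front.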

Currently tDAR allows users to search an initial data archive and to register and upload resources<br />

(including databases, text files, and images). While anyone may register to use tDAR, only approved<br />

users may upload and add information resources. Information resources can be public or private,<br />

making it possible to support different levels of access control and embargoing. The tDAR repository<br />

also requires detailed information about authorship when contributing resources, and the registration<br />

agreement requires users to adhere to a set of conditions regarding proper citation and credit<br />

(McManamon, Kintigh, and Brin 2010). In addition, all individual repository data sets and documents<br />

will soon have persistent URLs that will provide “permanent, citable, web addresses,” and whenever<br />

content is revised all earlier content is automatically versioned (McManamon and Kintigh 2010).<br />

tDAR currently supports the upload of a variety of data formats, <str<strong>on</strong>g>in</str<strong>on</strong>g>clud<str<strong>on</strong>g>in</str<strong>on</strong>g>g text files <str<strong>on</strong>g>in</str<strong>on</strong>g> ASCII or PDF,<br />

JPEG, <strong>and</strong> TIFF images, <strong>and</strong> databases can be <str<strong>on</strong>g>in</str<strong>on</strong>g>gested as Access, Excel, or CSV files. All uploaded<br />

databases are c<strong>on</strong>verted to a st<strong>and</strong>ard relati<strong>on</strong>al database format that will serve as the l<strong>on</strong>g-term format<br />

for preservati<strong>on</strong> <strong>and</strong> updat<str<strong>on</strong>g>in</str<strong>on</strong>g>g (Elliott 2008). While all orig<str<strong>on</strong>g>in</str<strong>on</strong>g>al formats are ma<str<strong>on</strong>g>in</str<strong>on</strong>g>ta<str<strong>on</strong>g>in</str<strong>on</strong>g>ed at the bit level,<br />

237 tDAR has already been used as the data archive for an article published in Internet Archaeology<br />

(http://intarch.ac.uk/journal/issue28/holmberg_toc.html), and the data can be found at http://core.tdar.org/project/4243.



tDAR also stores all resources in a preservation format to support stability through long-term<br />

migration. tDAR also provides a range of tools to help archaeologists both map and import their data<br />

sets even if they have limited technical skills. These tools include Web forms that guide contributors<br />

through a process of metadata entry and the uploading of files (McManamon, Kintigh, and Brin 2010).<br />

Future development is planned that will expand the range of data and document formats that can be<br />

accepted, including GIS, CAD, LiDAR, 3-D scans, and remote-sensing data (McManamon and<br />

Kintigh 2010), but this inclusion is awaiting the completion of “best-practices” guidelines 238 that are<br />

being developed in collaboration with the ADS. 239<br />

To ensure the long-term preservation of the resources contributed to tDAR, the Digital Antiquity<br />

project has established a number of important procedures, including checking file integrity and<br />

correcting deterioration, and migrating data file formats to new file standards as software and hardware<br />

develop over time (McManamon, Kintigh, and Brin 2010). The project’s stated goal is to meet the criteria for<br />

trusted digital repositories established by the Online Computer Library Center (OCLC) and the Center<br />

for Research Libraries (OCLC-CRL 2007). In addition, the development and testing of the tDAR<br />

prototype involved archaeologists, computer scientists, and librarians, and regular enhancements to<br />

both repository functions and the user interface are planned. “The tDAR software and hardware<br />

choices have been made with flexibility, scaling, and longevity as the primary concerns,”<br />

McManamon, Kintigh, and Brin explained; “in addition, tDAR has chosen to use Open Source tools to<br />

the greatest extent possible.” 240<br />
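The file-integrity checking mentioned above is commonly implemented as a "fixity" check: a checksum is recorded for each file at deposit and recomputed later to detect silent change. The sketch below illustrates that general technique only. It is not drawn from the tDAR codebase, and the file name, manifest layout, and function names are assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(manifest: dict[str, str], root: Path) -> list[str]:
    """Return relative paths that are missing or whose digest no longer matches."""
    damaged = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(relative_path)
    return damaged


# Demonstration in a throwaway directory (the file name is invented).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    deposit = root / "contexts.csv"
    deposit.write_text("context,phase\n1001,3\n")

    # Record the checksum at deposit time.
    manifest = {"contexts.csv": sha256_of(deposit)}
    intact = verify_fixity(manifest, root)

    # Simulate later corruption or an unrecorded edit.
    deposit.write_text("context,phase\n1001,4\n")
    changed = verify_fixity(manifest, root)

print(intact)   # files flagged before the change
print(changed)  # files flagged after the change
```

A repository would typically run such a check on a schedule and restore any flagged file from a verified copy; that repair step is omitted here.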

While technical planning and the use of open-source solutions are important parts of the long-term<br />

preservation of data, the Digital Antiquity project has also taken important steps toward organizational<br />

and financial sustainability. The Digital Antiquity organization that currently manages tDAR directly<br />

grew out of a multi-institutional effort 241 that submitted the successful proposal to the Mellon<br />

Foundation. After initial funding 242 ends, Digital Antiquity plans to use a data-curation model, where<br />

archaeologists or other interested parties will pay fees for the deposit and maintenance of their data,<br />

which will then be made freely available over the Internet (with desired levels of controlled access<br />

where required). In addition, the organization will transition from an entity supported by Arizona State<br />

University, Mellon, and the NSF to either an independent nonprofit or “a unit of an established nonprofit<br />

with compatible goals that can manage Digital Antiquity’s services and data assets in the long<br />

term” (McManamon and Kintigh 2010). Digital Antiquity staff currently includes a full-time executive<br />

director and a number of technical and support staff. It is governed by a 12-member board of directors<br />

drawn from the public and private sectors.<br />

Designing Digital Infrastructures for the Research Methods of Archaeology<br />

The basic stages in archaeological research have been described as discovery, identification and<br />

attribution, cross-referencing, interpretation, and publication (Dunn 2009). In addition, most<br />

archaeological research begins with site excavation, which produces massive amounts of data. Thus,<br />

research challenges in archaeology typically include organizing excavation data after the fact and<br />

applying sophisticated means of data analysis, including spatial analysis, 3-D modeling of sites, and artifact<br />

imaging. Many scholars interviewed in the CSHE report also argued for a greater need to link the<br />

238 http://ads.ahds.ac.uk/project/goodguides/g2gp.htm<br />

239 See Mitcham, Niven, and Richards (2010).<br />

240 Further technical detail is provided in McManamon, Kintigh, and Brin (2010), and the entire codebase has been licensed under an Apache 2.0 License.<br />

241 These organizations include the University of Arkansas, Arizona State University, Pennsylvania State University, the ADS, and Washington State<br />

University.<br />

242 The Mellon Foundation has funded Digital Antiquity for a four- to five-year startup period.



material record that they document so meticulously with the growing textual record, including text<br />

collections in digital libraries and printed editions found in mass-digitization projects.<br />

Efforts to reintegrate the material and textual records of archaeology were recently explored by the<br />

Archaeotools project 243 (Jeffrey et al. 2009a, Jeffrey et al. 2009b). Archaeotools was a major e-Science<br />

infrastructure project for archaeology in the United Kingdom that sought to create a single faceted<br />

browser interface that would integrate access to the millions of structured database records regarding<br />

archaeological sites and monuments found in the ADS with “information extracted from semistructured<br />

grey literature reports, and unstructured antiquarian journal accounts.” Archaeotools<br />

explored both the use of information-extraction techniques with arts and humanities data sets and the<br />

automatic creation of metadata for those archaeological reports that had no manually created metadata.<br />

Jeffrey et al. (2009b) observed that archaeology has an extensive printed record going back to the<br />

nineteenth century and including monographs, journal articles, special society publications, and a vast<br />

body of grey literature. One unique challenge of much of the antiquarian literature, they noted, was the<br />

use of nonstandard historical place names that made it impossible to automatically integrate this<br />

information with modern GIS and mapping technologies. Their project was informed by the results of<br />

the Armadillo project, 244 a historical text-mining project that used information extraction to identify<br />

names of historical persons and places in the Old Bailey Proceedings 245 and then mapped them to a<br />

defined ontology.<br />

The Archaeotools project ultimately created a faceted classification and geospatial browser for the<br />

ADS database, with the main facets for browsing falling into the categories of “What,” “Where,”<br />

“When,” and “Media.” All facets were populated using existing thesauri that were marked up in<br />

XML and then integrated using SKOS. Selected fields were then extracted from the ADS database in<br />

MIDAS XML 246 format, converted to RDF/XML, and then mapped onto the thesauri ontology that<br />

was previously created. The project also created an extendable NLP system that automatically<br />

extracted metadata from unpublished archaeological reports and legacy historical publications by<br />

using a combination of knowledge engineering (KE) and automatic training (AT). 247 The final task was<br />

to use the geoXwalk 248 service to recast “historical place names and locations as national grid<br />

references.”<br />
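Faceted browsing of the kind described above can be illustrated with a small sketch. Each record carries one value per facet, selecting facet values narrows the result set, and the counts of the remaining values show the reader where to go next. The facet names follow the four categories named in the text; the records, their values, and the function names are invented for illustration and do not reflect the Archaeotools implementation or its SKOS-based thesauri.

```python
from collections import Counter

# Hypothetical records; real ADS records are far richer than this.
RECORDS = [
    {"id": "r1", "what": "villa", "where": "Hampshire", "when": "Roman", "media": "report"},
    {"id": "r2", "what": "barrow", "where": "Wiltshire", "when": "Bronze Age", "media": "image"},
    {"id": "r3", "what": "villa", "where": "Kent", "when": "Roman", "media": "image"},
]


def filter_records(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        record
        for record in records
        if all(record.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(record[facet] for record in records)


# Narrow by one facet, then by two.
roman = filter_records(RECORDS, when="Roman")
roman_images = filter_records(RECORDS, when="Roman", media="image")

# Counts offered to the reader for further narrowing of the Roman records.
where_counts = facet_counts(roman, "where")
```

In a production system the facet values would come from controlled vocabularies rather than free strings, which is the role the thesauri play in the project described above.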

Despite the growth of projects such as Archaeotools, one scholar interviewed by the CSHE concluded<br />

that he had yet to see any revolutionary uses of technology within archaeology. “What I see still is<br />

mainly people being able to do much more of what they always were able to do, and do it faster, and in<br />

some cases better, with the tools,” this scholar observed; “I don’t see yet that the technology is<br />

fundamentally changing the nature of what people are doing. …” (Harley et al. 2010, 120). This<br />

argument, i.e., that scholars aren’t doing qualitatively new work but are simply answering old<br />

questions more efficiently with new tools, is seen often in criticism of digital classics projects.<br />

The CSHE report concluded that to support more archaeological scholars interested in doing digital<br />

scholarship, both training and technical support would be required. Such support, the authors added, would<br />

need to reflect the varying capabilities of scholars as well as their limited funds:<br />

243 http://ads.ahds.ac.uk/project/archaeotools/<br />

244 http://www.hrionline.ac.uk/armadillo/objectives.html<br />

245 http://www.oldbaileyonline.org/<br />

246 http://www.heritage-standards.org.uk/midas/docs/<br />

247 Further details on the results of Archaeotools research with machine learning, automatic annotation, and the evaluation of domain experts’ annotation<br />

of archaeological literature can be found in Zhang et al. (2010).<br />

248 http://edina.ac.uk/projects/geoxwalk/geoparser.html



Many look to their institutions to provide the needed support and resources for digital<br />

scholarship, but are unable to pay for the services of local technical staff. Digital humanities<br />

facilities at some institutions support innovative scholars, but these institutions may be too<br />

advanced for the needs of many of the scholars we interviewed and, consequently, have limited<br />

uptake by faculty. Some scholars, however, observed that it is easier to get technical help from<br />

their institutions if the projects might produce transferable tools and technologies (Harley et al.<br />

2010, 25).<br />

The ability to provide both basic and advanced levels of technical assistance is thus required. In<br />

addition, building tools that can be repurposed was suggested as one way of garnering greater<br />

institutional support. One project that has explored the issues of implementing new technology and<br />

digital methods for archaeological field research is the Silchester Roman Town 249 project, a British<br />

research excavation of the Roman town of Silchester that covers its history from before the Roman<br />

conquest until its abandonment in the fifth century AD.<br />

The project made extensive use of a specialized database called the Integrated Archaeological<br />

Database (IADB), 250 which was developed in the 1980s and is now available as a web-based<br />

application that makes use of Ajax, MySQL, and PHP (Fulford et al. 2010). “Crucial to the<br />

interpretation of the archaeological record,” Fulford et al. reported, “is the IADB’s capacity to build<br />

the hierarchical relationships (archaeological matrix) which mirror the stratigraphic sequence and<br />

enable the capture of composite, spatial plans of the individual context record to demonstrate the<br />

changing character of occupation over time” (Fulford et al. 2010). Archaeological data can be viewed<br />

as individual records, as 2-D matrices, or as groups of objects.<br />
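One common way to model a stratigraphic matrix in software is as a directed acyclic graph in which each context points to the contexts lying directly beneath it. A topological ordering of that graph then yields one possible deposition sequence, and a cycle signals contradictory field records. The sketch below illustrates that general idea with Python's standard graphlib module (Python 3.9 or later). It is not a description of how the IADB stores or computes its matrix, and the context numbers and names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical context numbers. Each key lies directly above
# (is later than) the contexts in its set.
LIES_ABOVE = {
    1004: {1002, 1003},
    1003: {1001},
    1002: {1001},
    1001: set(),
}


def deposition_order(lies_above):
    """Return contexts from earliest to latest, or raise ValueError on a cycle."""
    try:
        # static_order() yields each context only after everything beneath it.
        return list(TopologicalSorter(lies_above).static_order())
    except CycleError as err:
        # The second element of args holds the contexts that form the cycle.
        raise ValueError(f"inconsistent stratigraphy: {err.args[1]}") from err


order = deposition_order(LIES_ABOVE)
print(order)  # earliest context first; 1002 and 1003 may appear in either order
```

Contexts with no recorded relationship to each other, such as 1002 and 1003 here, have no fixed relative position, which is why a matrix is drawn in two dimensions and not as a single list.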

One major challenge faced during field research is site recording; Fulford et al. observed that the<br />

double handling of data was particularly problematic. To deal with this problem, the Silchester Roman<br />

Town project first collaborated with the OGHAM (On-Line Group Historical and Archaeological<br />

Matrix) project, which was funded by the Joint Information Systems Committee (JISC), 251 and<br />

through that collaboration introduced the use of PDAs and rugged tablet computers for field recording. The most<br />

significant insight of this first project was that direct network access was “invaluable,” particularly in<br />

terms of communication and data management. JISC continued funding this work through the VERA:<br />

Virtual Environment for Research in Archaeology 252 project, and the initial collaboration was<br />

extended to include information and computer scientists. Baker et al. (2008) noted that the VERA<br />

project sought “to investigate how archaeologists use Information Technology (IT) in the context of a<br />

field excavation, and also for post-excavation Analysis.” The project also introduced new tools and<br />

technology to assist in “the archaeological processes of recording, manipulating and analysing data.”<br />

Baker et al. underscored that one of the most important parts of the archaeological process is recording<br />

“contexts,” which they define as the “smallest identifiable unit into which the archaeological record<br />

can be divided and are usually the result of a physical action” (Baker et al. 2008). As contexts are<br />

identified, each is given a unique number in a site register, and typically the information is recorded on<br />

a paper “context card” that will track everything from sketches to data. Context cards are filed in an<br />

area folder and eventually entered manually into a database. This process, however, is not without its<br />

problems:<br />

249 http://www.silchester.rdg.ac.uk/<br />

250 http://www.iadb.org.uk/<br />

251 http://www.jisc.ac.uk/<br />

252 http://vera.rdg.ac.uk/



The recorded contexts provide the material to populate the research environment, they are<br />

stored in the Integrated Archaeological Data Base (IADB), which is an online database system<br />

for managing recording, analysis, archiving and online publication of archaeological finds,<br />

contexts and plans. In the past the entry of data on to the IADB has been undertaken manually.<br />

There are around 1000 contexts recorded each season, which means that manual input of the<br />

data and information is very time consuming (Baker et al. 2008).<br />

One of the challenges of the VERA project, therefore, was to find a way to make the process of<br />

recording contexts and entering them into the database more efficient. As their ideal, Baker et al. cite Gary<br />

Lock’s goal of archaeological computing, where “the information flows seamlessly from excavation,<br />

through post-excavation to publication and archive” (Lock 2003, as cited in Baker et al. 2008).<br />

The VERA team asked archaeologists to complete diaries while in the field, conducted one-to-one<br />

interviews and a workshop, and implemented user testing with the IADB. In the diary study of 2007,<br />

they asked archaeologists about their experience using digital technology during the excavation<br />

process. They met with a fair amount of resistance both to keeping diaries and to using new technology,<br />

and many participants noted the unreliability of wi-fi in the field. Another important insight from the<br />

interviews was that data quality was of the highest importance, and some archaeologists observed that<br />

the direct entry of contexts into the database often led to lower-quality data. In the first excavation,<br />

field workers were entering data directly into the IADB without the traditional quality-control layer,<br />

but in the second field season, paper recording of contexts was reintroduced, with the “small finds<br />

supervisor” collating context reports and entering them herself. The diary study thus “illustrated the<br />

importance of maintaining existing mechanisms for checking and controlling data” (Baker et al. 2008).<br />

N<strong>on</strong>etheless, the use of digital pens was rated very highly <strong>and</strong> did speed enter<str<strong>on</strong>g>in</str<strong>on</strong>g>g of some c<strong>on</strong>texts <str<strong>on</strong>g>in</str<strong>on</strong>g>to<br />

the IADB. 253 At the same time, the pens were not able to digitally capture all data (43 percent) from<br />

the excavati<strong>on</strong> seas<strong>on</strong> directly <str<strong>on</strong>g>in</str<strong>on</strong>g>to the IADB <strong>on</strong> the first pass (Fulford et al. 2010, 25). This study thus<br />

also dem<strong>on</strong>strated the importance of a will<str<strong>on</strong>g>in</str<strong>on</strong>g>gness to both try <strong>and</strong> possibly ab<strong>and</strong><strong>on</strong> new technologies<br />

<strong>and</strong> to test real discipl<str<strong>on</strong>g>in</str<strong>on</strong>g>ary workflows <str<strong>on</strong>g>in</str<strong>on</strong>g> the development of the VRE.<br />

Interviews and user testing of the IADB provided the VERA team with other important information,
including the need to make the interface more intuitive and for the database design team to warn users
before implementing broad system changes. Three themes that ran through all the research conducted
by Baker et al. were the need for data quality, transparency of data trails, and “ease of use of
technologies.”

Fulford et al. (2010) ultimately concluded that both VERA and OGHAM had enhanced the work of the
Silchester Roman Town project:

The OGHAM and VERA projects have unquestionably strengthened and improved the flow of
data, both field and finds records, from the trench to the database, where they can be
immediately accessed by the research team. The greater the speed by which these data have
become available, the faster the research manipulation of those data can be undertaken, and the
faster the consequent presentation of the interpreted field record to the wider research team.
The challenge is now to determine whether the same speed can be achieved with the research
team of specialist analysts (Fulford et al. 2010, 26).

253 For greater detail on these studies, see Warwick et al. (2009).



Although the multidisciplinary project team could both get to their data faster and manipulate their
data in new ways, Fulford et al. reported that it remained to be seen whether specialists would begin to
publish their results any faster or if they would publish them electronically. Nonetheless, remote access
to the IADB was found to be especially important for specialists, as it allowed them to “become more
integrated with the context of their material” and enabled new levels of both independent and
collaborative work between specialties.

Additionally, while Fulford et al. stated that the IADB formed the “heart of the VRE,” they also
envisioned a VRE for archaeology that could support both the digital capture and manipulation of finds
from the field and more sophisticated levels of postexcavation analysis. The Silchester project hoped to
publish for both a specialist and public audience and, while acknowledging the new opportunities of
electronic publication, stated that they were simultaneously publishing their results in print as well.
“Despite the potential of web-based publication, lack of confidence in the medium- or the longer-term
sustainability of the web-based resource has meant a continued and significant reliance on traditional
printed media,” Fulford et al. explained, adding that “this is as much true of Silchester as it is of
archaeology in general” (Fulford et al. 2010, 28). At the same time, the Silchester Roman Town project
has created a constantly evolving project website and a printed monograph, and has published an
article in Internet Archaeology 254 that referenced data they had archived in the ADS. 255 The creation
of a website brought with it the concurrent challenges of accessibility and sustainability. Nonetheless,
this project demonstrated how digital publishing offered new opportunities while not necessarily
precluding print publishing.

The archiving of the Silchester data with the ADS and the publication of an article in Internet
Archaeology that linked to these data occurred under the auspices of the LEAP (Linking Electronic
Archives and Publications) project 256 and demonstrated not only the importance of the long-term
preservation of data but the potential of electronic publication for linking archaeological research to the
actual data on which it is based. 257 Despite the importance of linking data to their published
interpretations, the Digital Antiquity project has also stressed that much work remains to be done to
encourage more archaeologists to utilize digital archives. “To achieve this potential,” McManamon and
Kintigh stated, “we must transform archaeological practice so that the digital archiving of data and the
metadata necessary to make it meaningful become a standard part of all archaeological project
workflows” (McManamon and Kintigh 2010). 258

While the IADB presented a number of opportunities, it also raised three major technical challenges:
security on the web, interoperability with other databases, and the potential of 3-D visualizations or
reconstructions for both academic and public users of the data. On this third point, Fulford et al. stated
that “integral to this is the need to link the evidence used to build the reconstruction with data stored in
the IADB.” The need to illustrate to users that all visualizations of Silchester on the website are based
on human interpretation of available data was also cited as essential. The reality of human
interpretation needs to be carefully considered in any VRE design for archaeology according to Stuart
Dunn:

In the discussion of VREs, and models of data curation and distribution which are based on
central and/or institutional storage and dissemination of data, it is easy to forget the interpretive
implications of handling archaeological information digitally… this is [SIC] also pertains to the
broader arts and humanities VRE agenda. The act of publishing a database of archaeological
information implicitly disguises the fact that creating the database in the first place is an
interpretive process (Dunn 2009).

254 http://intarch.ac.uk/journal/issue21/silchester_index.html
255 http://ads.ahds.ac.uk/catalogue/archive/silchester_ahrc_2007/
256 http://ads.ahds.ac.uk/project/leap/
257 The tDAR has also been used as a data archive for an article in Internet Archaeology.
258 As an initial step in this direction, the Digital Antiquity project has started a grants program to encourage deposit of data into tDAR.

Dunn’s warning is an important reminder that the design of any infrastructure in the humanities must
take into account the interpretative nature of most humanities scholarship. He further detailed how
archaeological workflows are “idiosyncratic, partly informal, and extremely difficult to define,” all
factors that make them hard to translate into a digital infrastructure (Dunn 2009).

Some recent research using topic modeling and an archaeological database has illustrated how
subjective the human interpretations of archaeological data can be. David Mimno
(2009) used topic modeling and a database of objects discovered in houses from Pompeii 259 to
examine the validity of the typological classifications that were initially assigned to these objects. This
database contains 6,000 artifact records for finds in 30 architecturally similar houses in Pompeii, and
each artifact is labeled with one of 240 typological categories and the room in which it was found.
Because of the large amount of data available, Mimno argued that the use of statistical data-mining
tools could help provide some new insights into these data:

In this paper we apply one such tool, statistical topic modeling ... in which rooms are modeled
as having mixtures of functions, and functions are modeled as distributions over a “vocabulary”
of object types. The purpose of this study is not to show that topic modeling is the best tool for
archeological investigation, but that it is an appropriate tool that can provide a complement to
human analysis. To this aim, we attempt to provide a perspective on several issues raised by
Allison, that is, if not unbiased, then at least mathematically concrete in its biases (Mimno
2009).
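The structure Mimno describes, in which rooms are mixtures of functions and functions are distributions over object types, can be made concrete with a small sketch. The code below is not Mimno's implementation: it assumes a generic latent Dirichlet allocation model fitted by collapsed Gibbs sampling, and the function name `fit_room_topics`, the parameter values, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def fit_room_topics(rooms, n_topics=2, alpha=0.5, beta=0.1, sweeps=200, seed=0):
    """Fit a tiny LDA-style topic model with collapsed Gibbs sampling.

    rooms: list of rooms, each a list of integer object-type ids.
    Returns (room_mixtures, function_distributions).
    """
    rng = random.Random(seed)
    vocab = sorted({obj for room in rooms for obj in room})
    n_types = len(vocab)

    # Randomly assign every object occurrence to a latent "function" (topic).
    assignments = [[rng.randrange(n_topics) for _ in room] for room in rooms]
    room_counts = [[0] * n_topics for _ in rooms]        # room x topic counts
    type_counts = [Counter() for _ in range(n_topics)]   # topic -> type -> count
    topic_totals = [0] * n_topics
    for r, room in enumerate(rooms):
        for i, obj in enumerate(room):
            k = assignments[r][i]
            room_counts[r][k] += 1
            type_counts[k][obj] += 1
            topic_totals[k] += 1

    for _ in range(sweeps):
        for r, room in enumerate(rooms):
            for i, obj in enumerate(room):
                # Remove the current assignment, then resample it.
                k = assignments[r][i]
                room_counts[r][k] -= 1
                type_counts[k][obj] -= 1
                topic_totals[k] -= 1
                weights = [
                    (room_counts[r][t] + alpha)
                    * (type_counts[t][obj] + beta)
                    / (topic_totals[t] + n_types * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[r][i] = k
                room_counts[r][k] += 1
                type_counts[k][obj] += 1
                topic_totals[k] += 1

    # Smoothed estimates: each room's mixture over functions, and each
    # function's distribution over the "vocabulary" of object types.
    room_mixtures = [
        [(c + alpha) / (len(room) + n_topics * alpha) for c in counts]
        for counts, room in zip(room_counts, rooms)
    ]
    function_distributions = [
        {obj: (type_counts[t][obj] + beta) / (topic_totals[t] + n_types * beta)
         for obj in vocab}
        for t in range(n_topics)
    ]
    return room_mixtures, function_distributions


# Hypothetical toy data: four rooms, object types already reduced to integers.
toy_rooms = [[0, 1, 1, 2], [0, 1, 2, 2], [3, 4, 4, 5], [3, 3, 4, 5]]
mixtures, functions = fit_room_topics(toy_rooms)
```

Each entry of `mixtures` is one room's estimated proportion of each latent function, and each entry of `functions` gives the probability of every object type under that function. On real data the number of functions and the smoothing values would need to be chosen and justified by the analyst.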

In common archaeological practice, Mimno explained, artifacts that are excavated are removed to
secure storage, and while their location is carefully noted in modern digs, artifacts in storage are
typically analyzed “in comparison to typologically similar objects rather than within their original
context.” Consequently, Mimno reasoned that determining the function of many artifacts had been
driven by “arbitrary tradition” and the perception of individual researchers in terms of what an artifact
resembles. Two classes of artifacts, in fact, the casseruola (casserole dish) and forma di pasticceria
(pastry mold), were named based on similarities to nineteenth-century household objects, and the
creator of the Pompeii database (Penelope Allison) contended that modern archaeologists often made
unvalidated assumptions about objects based on their modern names (Allison 2001).

For these reasons, Mimno decided to use topic modeling to reduce this bias and explored the function
of artifact types using only object co-occurrence data and no typology information. All object
descriptions were reduced to integers, and a statistical topic model was then used to detect “clusters of
object cooccurrence” that might indicate functions. While Mimno admitted that this system still relied
on experts having accurately classified physical objects into appropriate categories in the first place, no
other archaeological assumptions were made by the training model. The basic assumption was that if
two objects had similar patterns of use they should have a high probability of co-occurrence in
one or more “topics.” Initial analysis of a topic model for the casseruola and forma di pasticceria
showed them to have little connection to other food-preparation objects and thus supported
Allison’s claim that the modern names for these items are incorrect. This work illustrates how
computer science can make it possible for scholars to reanalyze large amounts of existing legacy
archaeological data to make new arguments about those data.

259 http://www.stoa.org/projects/ph/home
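The preprocessing described above, in which object descriptions are reduced to integers and only co-occurrence within rooms is retained, can be sketched as follows. This is an illustrative simplification rather than Mimno's code: it counts raw pairwise co-occurrence instead of fitting a topic model, the helper names `encode_types` and `cooccurrence` are hypothetical, and the finds lists are invented, not drawn from the Pompeii database.

```python
from collections import Counter
from itertools import combinations


def encode_types(rooms):
    """Map each object-type label to an integer id, in order of first appearance."""
    ids = {}
    encoded = []
    for room in rooms:
        # setdefault assigns the next free id only to labels not seen before.
        encoded.append([ids.setdefault(label, len(ids)) for label in room])
    return encoded, ids


def cooccurrence(encoded_rooms):
    """Count, for each unordered pair of type ids, the rooms that contain both."""
    pairs = Counter()
    for room in encoded_rooms:
        for a, b in combinations(sorted(set(room)), 2):
            pairs[(a, b)] += 1
    return pairs


# Invented finds lists for illustration only.
rooms = [
    ["casseruola", "lamp", "strigil"],
    ["casseruola", "strigil"],
    ["cooking pot", "brazier", "lamp"],
    ["cooking pot", "brazier"],
]
encoded, type_ids = encode_types(rooms)
pair_counts = cooccurrence(encoded)
```

Once the labels are replaced by integers, nothing in the counts depends on what an object is called, which is the point of the exercise: any grouping that emerges reflects where objects were found together rather than the associations carried by their modern names.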

Visualization and 3-D Reconstructions of Archaeological Sites

The nature of 3-D modeling and digital reconstruction has made this particular area of archaeological
research one of the most collaborative and groundbreaking. As Harley et al. (2010) have noted, these
new approaches have both benefits and challenges:

3D modeling—based on the laser scanning of archaeological sites, the photogrammetric
analysis of excavation photographs, or other virtual modeling techniques—provides unique
opportunities to virtually represent archaeological sites. These virtual models do not yet
provide a facile publishing platform, but they may allow scholars to run experiments or test
claims made in the scholarly literature. Although 3D modeling is a new dimension for
archaeological research and dissemination, it may not be suitable for all applications (such as
poorly preserved archaeological sites). In addition, some scholars observed that the focus on
3D modeling as a new technology may come at the expense of close attention to physical and
cultural research (Harley et al. 2010, 115).

As with other new technologies, scholars were often worried that a focus on technology would replace
exploration of more traditional archaeological questions. Nonetheless, the number of archaeological
websites making use of 3-D modeling and virtual reconstruction is growing continuously. 260 This
section looks at several of the larger projects and the growing body of research literature in this area.

Alyson Gill recently provided an overview (Gill 2009) of the use of digital modeling in both
archaeology and humanities applications. She reported that Paul Reilly coined the term virtual
archaeology in 1991, and since that time his initial concept of 3-D modeling of ancient sites has
expanded greatly. Instead of “virtual archaeology,” Koller et al. (2009) suggest a more expansive
definition of “virtual heritage” as a:

… relatively new branch of knowledge that utilizes information technology to capture or
represent the data studied by archaeologists and historians of art and architecture. These data
include three-dimensional objects such as pottery, furniture, works of art, buildings, and even
entire villages, cities, and cultural landscapes (Koller et al. 2009).

The first major book published on this subject, according to Gill (2009) and to Koller et al. (2009), was
Virtual Archaeology, by M. Forte and A. Siliotti (Forte and Siliotti 1997). Koller et al. indicated that
this book illustrated how early models typically served illustration purposes only and that early
publications focused on methodologies used to create such models. In addition, commercial companies
created almost all the models in this book. Since that time, Koller et al. reported, things have changed
greatly; the price of 3-D modeling software and data-capture technology has dropped drastically and
the skill sets required to work with these tools are growing among scholars. Gill (2009) has identified
four major trends in current projects: collaborative virtual environments; online applications used for
teaching and learning; reconstruction of large-scale historical spaces; and digital preservation of
cultural heritage sites.

260 Some interesting sites not covered here include the Skenographia Project (http://www.kvl.cch.kcl.ac.uk/wall_paintings/introduction/default.htm), the
Digital Pompeii Project (http://pompeii.uark.edu/Digital_Pompeii/Welcome.html), the Portus Project (http://www.portusproject.org/aims/index.html), and
Parthenon 360 (1.) QTVR (http://www.dkv.columbia.edu/vmc/acropolis/ - 1_1)



One of the largest and best-known websites described by Gill is the Digital Karnak Project, 261 created
under the direction of two scholars at the University of California, Los Angeles. The Temple of Karnak
in Egypt existed for more than 3,000 years, and this website has created a number of ways for users to
explore its history. A 3-D virtual reality model of the temple was created that allows users to view how
the temple was constructed and modified over time; this is accompanied by original videos, maps, and
thematic essays written by Egyptologists. A simplified version of the model of the temple was also
made available on Google Earth. There are four ways to enter the website: (1) using a Timemap of the
site that enables users to choose a time period and view features that were created, modified, and
destroyed; (2) choosing one of a series of thematic topics with essays and videos; (3) browsing the
archive by chronology, type, feature, or topic, which takes the user to reconstruction model
renderings, descriptions in the object catalog, videos, and a large number of photographs; and (4) using
Google Earth to view the model. This website demonstrates how many of these technologies are being
put to use to create sophisticated teaching resources.

The city of Rome has also been the subject of a number of virtual reconstruction projects, with the<br />

Digital Roman Forum exploring one particular monument, 262 while the Plan de Rome, 263 the Stanford<br />

Digital Forma Urbis Romae 264 project, and the particularly well-known Rome Reborn 265 focus on the<br />

city as a whole. The Digital Roman Forum provides access to a digital model of the Roman Forum as it<br />

appeared in late antiquity and was created by the University of California, Los Angeles, Cultural<br />

Virtual Reality Lab (CVRLab). 266 Users can use TimeMap to view different features (e.g., the Basilica<br />

Aemilia, the Curia Iulia) on the model, and clicking on a feature brings up both a virtual model and<br />

a current photograph of that feature, each of which can have its point of view adjusted. The digital<br />

reconstructions can be searched by keyword or browsed by the primary sources that describe them, as<br />

well as by function or type. One facet of this website that is particularly noteworthy is that it seeks to<br />

integrate the textual sources (such as the histories of Livy and Tacitus) and the secondary scholarly<br />

research that were used in making some modeling decisions. Each feature also includes a full<br />

description 267 with an introduction, history, reconstruction issues including sources and levels of<br />

certainty, a bibliography, a series of QuickTime object and panorama movies, and still images. This<br />

website illustrates the complicated nature of creating reconstructions, including the amount of work<br />

involved, the number of sources used, and the uncertain nature of many visualization decisions.<br />

The Stanford Digital Forma Urbis Romae project provides digital access to the remains of the “Forma Urbis<br />

Romae,” a large marble plan of the city that was carved in the third century AD. This website includes<br />

digital photographs, 3-D models of the plan, and a database that includes details on all of the<br />

fragments. The Plan de Rome website provides access to a virtual 3-D model of the “Plan of Rome,” a<br />

large plaster model of the city that was created by architect Paul Bigot, and offers an extraordinary level of<br />

detail on the city.<br />

The most ambitious of all of these projects, Rome Reborn, is an international effort that seeks to create<br />

3-D models that illustrate the urban development of Rome from the late Bronze Age (1000 BC) to the<br />

early Middle Ages. The project staff has decided to focus initially on 320 AD because at this time<br />

Rome had reached its peak population, many major churches were being built, and few new buildings<br />

261 http://dlib.etc.ucla.edu/projects/Karnak/<br />

262 http://dlib.etc.ucla.edu/projects/Forum<br />

263 http://www.unicaen.fr/services/cireve/rome/index.phplangue=en<br />

264 http://formaurbis.stanford.edu/docs/FURdb.html<br />

265 http://www.romereborn.virginia.edu/<br />

266 http://www.cvrlab.org/<br />

267 http://dlib.etc.ucla.edu/projects/Forum/reconstructions/CuriaIulia_1



were created after this time. A number of partners are involved in the effort, including the Institute for<br />

Advanced Technology in the Humanities (IATH) of the University of Virginia and the CVRLab.<br />

Concerning the goals of the project, the website notes the following:<br />

The primary purpose of this phase of the project was to spatialize and present information and<br />

theories about how the city looked at this moment in time, which was more or less the height of<br />

its development as the capital of the Roman Empire. A secondary, but important, goal was to<br />

create the cyberinfrastructure whereby the model could be updated, corrected, and augmented.<br />

A large number of reconstruction stills can be viewed at the website. In November 2008 a version of<br />

Rome Reborn 1.0 was published on the Internet through Google Earth. 268<br />

Guidi et al. (2006) have reported on one of the most significant efforts in creating Rome Reborn,<br />

namely, the digitizing of the Plastico di Roma Antica, a physical model of Rome that is owned by the<br />

Museum of Roman Civilization and was designed by Italo Gismondi. The Plastico is a huge physical<br />

model of imperial Rome with a high level of intricate detail, and creating a digital model of it required<br />

the development of a number of advanced imaging techniques and algorithms. Ultimately, the<br />

digitized Plastico was used as the basis for a hybrid model of late antique Rome that was also based on<br />

new, born-digital models created for specific buildings and monuments in the city. The sheer size of<br />

their project required using such a model, for as the authors noted:<br />

Modeling of an ancient building may start from the historical documentation, archeological<br />

studies undertaken in the past and sometimes from a new survey of the area. These data are<br />

then combined in the creation of a digital three-dimensional (3D) synthesis that represents a<br />

reasonable hypothesis of how the artifact once appeared. The construction of an entire city can<br />

proceed by repeating this method as long as needed, but the process would of course be<br />

extremely time-consuming, assuming it would be at all possible since sometime (as in the case<br />

discussed in this paper) all the archaeological data that would be needed are not known (Guidi<br />

et al. 2006).<br />

One interesting idea suggested by Guidi et al. was that the Plastico is but one physical model of a city.<br />

Hundreds of such models have been developed since the Renaissance, and the methodologies the authors<br />

used could easily be transferred to the development of digital models of other cities.<br />

While smaller in scale than Rome Reborn, one of the longest-running virtual reconstruction projects is<br />

the Pompey Project, 269 which has developed an extensive website that includes a history of the theatre<br />

of Pompey, an overview of classical theatre, details on historic and modern excavations at this site with<br />

extensive images, and a series of 3-D visualizations of the Pompey theatre. Beacham and Denard<br />

(2003) provide both a practical and a theoretical overview of creating digital reconstructions of the<br />

theatre of Pompey, and also examine some of the issues such reconstructions create for historical<br />

study. One of the greatest advantages of virtual-modeling technology, they found, was its ability to<br />

integrate “architectural, archaeological, pictorial and textual evidence” to create new 3-dimensional<br />

“virtual performance spaces.” 270<br />

The use of 3-D modeling, Beacham and Denard observed, allowed them to manipulate huge data sets<br />

of different types of information, and this in particular supported better hypotheses in terms of<br />

268 http://earth.google.com/rome/<br />

269 http://www.pompey.cch.kcl.ac.uk/<br />

270 Other recent work has gone even further and has tried to repopulate ancient theatre reconstructions with human avatars (Ciechomski et al. 2004).



“calculating and documenting degrees of probability in architectural reconstructions.” The authors<br />

stressed that the data used in such models must be carefully evaluated and coordinated. At the same<br />

time, virtual models can both be updated more quickly than traditional models or drawings when new<br />

information becomes available and represent alternative hypotheses. The authors argued that the<br />

creation of a website thus supports a more sophisticated form of publication that allows for rapid<br />

dissemination of scholarly information that can be continuously updated.<br />

The ability to represent multiple hypotheses and to provide different reconstructions, the authors also<br />

concluded, supports the “liberation” of readers, so they can “interpret and exploit the comprehensive<br />

data according to their own needs, agendas and contexts.” The nature of this work is also inherently<br />

interdisciplinary and involves scholars in multiple disciplines as well as various technicians. Beacham<br />

and Denard argued that digital reconstructions are a new form of scholarship:<br />

The very fact that this work is driven by the aim of creating a three-dimensional reconstruction<br />

of the theatre has, itself, far-reaching implications. The extrapolation of a complete, three-dimensional<br />

form from fragmentary evidence, assorted comparanda and documentary evidence<br />

is quite different in character to the more frequently encountered project of only documenting<br />

the existing remains of a structure (Beacham and Denard 2003).<br />

The authors also warned, however, that digital reconstructions must avoid the lure of the “positivist<br />

paradigm”; in other words, digital models should never be presented as “reality.” All reconstructions<br />

must be considered as hypotheses with different levels of probability, and this must be made very clear<br />

to the user; otherwise, the utility of these models as teaching and scholarly communication tools is<br />

dubious at best. 271<br />

The utility of reconstructions and models in teaching has been explored extensively by the Ashes2Art<br />

project, 272 a collaboration between Coastal Carolina University in South Carolina and Arkansas State<br />

University in Jonesboro, in which students create 3-D computer models of ancient monuments based on<br />

excavation reports, build educational and fly-through videos, take on-site photographs of architectural<br />

details, write essays, create lesson plans, and ultimately document all of their work online with<br />

primary- and secondary-source bibliographies (Flaten 2009). The Ashes2Art collaboration provides an<br />

innovative example of undergraduate research, faculty-student collaboration, and the development of<br />

an online resource for both specialists and the general public.<br />

While the first iteration of the course had students working in all the different areas, the instructors<br />

soon realized this was overly ambitious for a one-semester course. Subsequently, the students were<br />

grouped by area of interest (e.g., developing models, designing or updating the Web platform, writing<br />

essays and preparing bibliographies, preparing teaching materials, creating videos). All the groups<br />

depended on each other for the final product. At the end of the semester, a panel of external scholars<br />

reviewed all the models. Although the development of models was the end goal of the Ashes2Art<br />

project, the course also addressed larger issues regarding digital models and the reconstruction of<br />

archaeological artifacts. As summarized by Flaten:<br />

The opportunity to visualize complex dimensional data has never been greater, but digital<br />

reconstructions and models are not without their critics. Questions of accuracy, methodology,<br />

271 The need to visualize and represent uncertainty or levels of probability in virtual models of ancient architecture and landscapes has received an<br />

extensive amount of discussion within the discipline of archaeology and is not a new topic to the field. See, for example, Miller and Richards (1994), Ryan<br />

(1996), Strothotte et al. (1999), and Zuk et al. (2005).<br />

272 http://www.coastal.edu/ashes2art/projects.html



transparency, accessibility, availability, and objective peer review are legitimate concerns<br />

(Flaten 2009).<br />

Like Beacham and Denard (2003), Flaten emphasized the importance of publishing levels of<br />

certainty regarding the data used to create a reconstruction, making all the data that were utilized<br />

explicit to the user and informing the user that multiple interpretations are possible. Flaten commented<br />

that his students were creating “perception models” rather than “structural models.” Students used a<br />

variety of data, including published excavation reports, journal articles, and photography, to create<br />

general structural models. When there were conflicting accounts in the data and decisions had to be<br />

made, both the evidence and the decision made were recorded along with other metadata for the model.<br />

Flaten reiterated that the ability to update digital reconstructions as new information becomes available<br />

is one of the greatest strengths of the digital approach. Another important insight gained from this<br />

process, Flaten observed, was that students “discover that uncertainty is a crucial component of<br />

knowledge, that precision does not imply accuracy, and that questions are more important than definite<br />

answers.” Through their work on Ashes2Art, students learned how scholarly<br />

arguments are constructed and that the creation of new knowledge is always an ongoing conversation<br />

rather than a finished product.<br />
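
The practice Flaten describes, recording both the conflicting evidence and the decision taken as<br />

metadata for the model, can be sketched as a small data structure. The Python sketch below is<br />

illustrative only: the class names, the three-level certainty scale, and the sample sources are<br />

assumptions made for this example and are not drawn from the Ashes2Art project.<br />

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Certainty(Enum):
    """Hypothetical three-level scale; the source prescribes no particular scale."""
    CONJECTURAL = 1
    INFERRED = 2
    ATTESTED = 3


@dataclass
class Evidence:
    citation: str  # e.g., an excavation report, journal article, or photograph
    claim: str     # what that source says about the feature


@dataclass
class ModelingDecision:
    feature: str
    evidence: List[Evidence]
    chosen: str
    rationale: str
    certainty: Certainty

    def has_conflict(self) -> bool:
        """Return True when the recorded sources make differing claims."""
        return len({e.claim for e in self.evidence}) > 1


def least_certain_first(decisions: List[ModelingDecision]) -> List[ModelingDecision]:
    """Order decisions so the most speculative ones are listed first."""
    return sorted(decisions, key=lambda d: d.certainty.value)


# Invented sample data, for illustration only.
roof = ModelingDecision(
    feature="roof material",
    evidence=[
        Evidence("Excavation report (hypothetical)", "terracotta tiles"),
        Evidence("Journal article (hypothetical)", "marble tiles"),
    ],
    chosen="terracotta tiles",
    rationale="Tile fragments were reported on site.",
    certainty=Certainty.INFERRED,
)
plan = ModelingDecision(
    feature="ground plan",
    evidence=[Evidence("Excavation report (hypothetical)", "rectangular")],
    chosen="rectangular",
    rationale="Foundations survive.",
    certainty=Certainty.ATTESTED,
)

for d in least_certain_first([plan, roof]):
    print(d.feature, d.certainty.name, "conflict" if d.has_conflict() else "agreed")
```

Keeping the evidence alongside the chosen interpretation means a later reviewer can see which sources<br />

disagreed and revise the decision when new information becomes available.<br />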

As this section has demonstrated, several significant archaeological projects are exploring the use of<br />

3-D models and digital reconstruction. Nonetheless, the preservation of these projects and the provision of<br />

long-term access to them have received little attention. Koller et al. (2009) have proposed creating open<br />

repositories of authenticated 3-D models, patterned on traditional scholarly<br />

journals. 273 Such repositories must include mechanisms for peer review, preservation, publication,<br />

updating, and dissemination. In addition to the lack of a digital archive, the authors criticized the fact<br />

that there is no central finding tool to even discover whether a site or monument has been digitally<br />

modeled, making it difficult to repurpose or to learn from other scholars’ work. This state of affairs led<br />

them to the following conclusion:<br />

A long-term objective, then, should be the creation of centralized, open repositories of<br />

scientifically authenticated virtual environments of cultural heritage sites. By scientifically<br />

authenticated, we mean that such archives should accession only 3D models that are clearly<br />

identified with authors with appropriate professional qualifications, and whose underlying<br />

design documents and metadata are published along with the model. Uncertainties in the 3D<br />

data and hypotheses in the reconstructions must be clearly documented and communicated to<br />

users (Koller et al. 2009).<br />

Visualizing uncertainty in digital models and presenting the results to users is also a<br />

significant technical challenge, the authors note, and one that has been the subject of little if any<br />

research. The development of such repositories faces a number of research challenges,<br />

including digital rights management for models, uncertainty visualization in 3-D reconstructions,<br />

version control for models (e.g., different scholars may generate different versions of the same model, and<br />

models change over time), effective metadata creation, digital preservation, interoperability, searching<br />

across 3-D models, the use of computational analysis tools with such a repository, and last, but by no<br />

means least, the development of organizational structures to support them.<br />
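
Two of the requirements discussed above, the accession criteria quoted from Koller et al. (2009) and<br />

version control for models, can be made concrete with a short sketch. The Python code below is a<br />

hypothetical illustration, not a description of the SAVE repository: all class, field, and function names<br />

are assumptions, and the check covers only whether required materials are present, since judging<br />

professional qualifications remains a matter for human peer review.<br />

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ModelVersion:
    version_id: str
    author: str
    parent_id: Optional[str] = None  # None marks the first version


@dataclass
class ModelSubmission:
    site: str
    authors: List[str] = field(default_factory=list)
    design_documents: List[str] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)
    uncertainty_notes: List[str] = field(default_factory=list)
    versions: List[ModelVersion] = field(default_factory=list)


def accession_problems(sub: ModelSubmission) -> List[str]:
    """List the required materials that are missing from a submission."""
    problems = []
    if not sub.authors:
        problems.append("no identified authors")
    if not sub.design_documents:
        problems.append("no design documents")
    if not sub.metadata:
        problems.append("no metadata")
    if not sub.uncertainty_notes:
        problems.append("no documented uncertainties")
    return problems


def lineage(sub: ModelSubmission, version_id: str) -> List[str]:
    """Follow parent links from a version back to the first version."""
    by_id = {v.version_id: v for v in sub.versions}
    chain: List[str] = []
    current = by_id.get(version_id)
    # The membership test guards against accidental cycles in parent links.
    while current is not None and current.version_id not in chain:
        chain.append(current.version_id)
        current = by_id.get(current.parent_id) if current.parent_id else None
    return chain


# Invented sample data: two scholars branch different versions from v1.
submission = ModelSubmission(
    site="Example temple (hypothetical)",
    authors=["A. Scholar"],
    metadata={"period": "late antiquity"},
    uncertainty_notes=["Roof height is conjectural."],
    versions=[
        ModelVersion("v1", "A. Scholar"),
        ModelVersion("v2a", "A. Scholar", parent_id="v1"),
        ModelVersion("v2b", "B. Scholar", parent_id="v1"),
    ],
)
print(accession_problems(submission))
print(lineage(submission, "v2b"))
```

Storing a parent link for each version is one simple way to represent the case in which different<br />

scholars produce competing versions of the same model; a production repository would likely need a<br />

richer scheme.<br />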

273 The creation of such a repository is part of their larger “SAVE: Serving and Archiving Virtual Environments” project<br />

(http://vwhl.clas.virginia.edu/save.html), which when complete “will be the world’s first on-line, peer-reviewed journal in which scholars can publish 3D<br />

digital models of the world’s cultural heritage (CH) sites and monuments.” On a larger scale, the nonprofit CyArk High Definition Heritage Network<br />

(http://archive.cyark.org/) is working “to digitally preserve cultural heritage sites through collecting, archiving and providing open access to data created<br />

by laser scanning, digital modeling, and other state-of-the-art technologies.”



Classical Art and Architecture<br />

The diverse world of classical art and architecture is well represented online, and this section briefly<br />

surveys a number of specific digital projects. One impressive website, the “Ancient Theatre Archive: A<br />

Virtual Reality Tour of Greek and Roman Theatre Architecture,” 274 was created by Thomas G.<br />

Hines of Whitman College and is an excellent resource that can be used to study ancient Greek and<br />

Roman theatres. This website provides both a list and a graphical map overview for navigating through<br />

images of classical theatres. Each theatre page includes an extensive history, a time line, and a virtual<br />

tour that includes panorama images and QuickTime movies. A recent addition is a table of “Greek and<br />

Roman Theatre Specification” that includes extensive details on each theatre, including its ancient<br />

name, modern name, location, date, width, capacity, renovation dates, and summary. While this table<br />

can be sorted by type of data, it would also be useful to hyperlink the theatres to their<br />

descriptive pages. This website is an excellent educational resource, but it does not seem,<br />

unfortunately, that any of the extensive historical data compiled, images, or video data can be<br />

downloaded in any kind of standard format or reused.<br />

Many websites are dedicated to the architecture of individual buildings or cities. For example, Trajan’s Column, 275 hosted by McMaster University, is dedicated to the exploration of the column of Trajan as a sculptural monument. This website includes introductory essays, a database of images, and useful indexes for the website. Although more than 10 years old, this website stands up well as an educational resource and, even more important, provides technical details on its creation and all the source code used in its creation. 276

Another useful tool for browsing across a number of classical art objects is the Perseus Art & Archaeology Browser, 277 which allows the user to browse the digital library’s image collection, including coins, vases, sculptures, sites, gems, and buildings. The descriptions and images have been produced in collaboration with a large number of museums, institutions, and scholars. In one interesting related project, 3-D models were developed from the photographs of vases in this collection and were used to build a “3D Vase Museum” that users could browse (Shiaw et al. 2004). Although the entire collection of art objects can be searched, the major form of access to the Art & Archaeology collection is provided through a browsing interface, where the user must pick an artifact type (such as a coin), a property of that artifact type (such as material), and a value for that property (such as bronze); this then leads to a list of images. 278 Each catalog entry includes descriptive information, photographer, credits, and the source of the object and photograph. In addition, each image in the Art & Archaeology Browser has a stable URL for linking. All of the source code used to create this browsing environment is available for download as part of the Perseus Hopper. 279
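The three-step browsing path described above (artifact type, property, value) corresponds to the query string of the example address in note 278. As a minimal sketch, assuming the parameter names `object`, `field`, and `value` seen in that address still apply (they are read off the footnote, not taken from any documented interface), such a link could be assembled as follows. The helper name `browse_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base address of the browser as given in note 277.
BROWSER_BASE = "http://www.perseus.tufts.edu/hopper/artifactBrowser"


def browse_url(artifact_type: str, field: str, value: str) -> str:
    """Build a browsing link from an artifact type, a property, and a value.

    The parameter names mirror the example in note 278 and are an
    assumption about how the site interprets them.
    """
    query = urlencode({"object": artifact_type, "field": field, "value": value})
    return f"{BROWSER_BASE}?{query}"


# Bronze coins: type "Coin", property "Material", value "Bronze".
url = browse_url("Coin", "Material", "Bronze")
print(url)
```

Because the three choices are carried entirely in the address, a link built this way can be cited or bookmarked in the same way as the stable image URLs mentioned above.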

The largest research effort in making classical art available online is CLAROS (Classical Art Research Online Services), 280 a major international interdisciplinary research initiative that launched its first public research interface in May of 2011. Some further technical details on this project are discussed later in this paper, but this section will consider some of the larger research questions addressed by CLAROS and described in Kurtz et al. (2009). This project is led by Oxford University and hosted by the Oxford e-Research Center, and its current “data web” integrates the collections of Arachne

274 http://www.whitman.edu/theatre/theatretour/home.htm

275 http://cheiron.mcmaster.ca/~trajan/

276 http://cheiron.mcmaster.ca/~trajan/tech.html

277 http://www.perseus.tufts.edu/hopper/artifactBrowser

278 http://www.perseus.tufts.edu/hopper/artifactBrowser?object=Coin&field=Material&value=Bronze

279 http://www.perseus.tufts.edu/hopper/opensource/download

280 http://explore.clarosnet.org/XDB/ASP/clarosHome/



(Research Archive for Ancient Sculpture-Cologne), 281 the Beazley Research Archive, 282 the Lexicon Iconographicum Mythologicae Classicae (LIMC Basel 283 and LIMC Paris 284 ), the German Archaeological Institute (DAI), 285 the LGPN (Lexicon of Greek Personal Names), 286 along with several other partner projects.

All these founding members have extensive data sets on antiquity in varying formats, with more than 2 million records collectively. To begin with, Arachne is the “central object database” of the DAI and the Archaeological Institute of the University of Cologne. While registration is required, this database provides free access to hundreds of thousands of records on archaeological objects and their attributes, including a large number of digitized photographs (e.g., Roman and Greek antiquities, museum artifacts, archaeological drawings) both from traditional publications that have been digitized and from current archaeological digs. Arachne also provides access to a monument browser and has recently launched the iDAi bookbrowser that “integrates documents in the object structure of Arachne, providing direct links between ‘real world’ objects and their textual descriptions.” 287 This new tool addresses the frequently stated desire of many digital classicists to better link the material and textual records.

The Beazley Research Archive online provides access to more than 20 databases of different types of objects from classical antiquity (e.g., pottery, gems, sculpture, antiquaria, and inscriptions). Databases can be searched individually or the entire archive can be searched at once through a general keyword search. A variety of tools are also available for use in searching this database, including an illustrated dictionary and a series of timelines. Individual records for objects include extensive metadata such as an object description, a full publication record, and images of the object. Finally, both the LIMC Basel and the LIMC Paris provide access to records and images of religious and mythological iconography drawn from more than 2,000 museums and collections.

CLAROS is using the CIDOC-CRM 288 ontology to integrate these collections and is including the LGPN to place classical “art in its ancient cultural context” and provide “a natural bridge to the large and well developed epidoc community” (Kurtz et al. 2009). While data integration presents a number of difficulties, the authors emphasize that:

A guiding principle of CLAROS is that no partner should need to change the format of his data to join. Each of the founder members uses different databases and front end programs for entering, querying and displaying results through their own websites. Data have been exported from each partner into a common CIDOC CRM format (Kurtz et al. 2009).

Consequently, this approach presents the challenge of providing a “data web” that supports searching across multiple different collections while still permitting individual organizations to maintain their own databases with their own unique standards.
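The export principle quoted above can be pictured as one small mapping function per partner, with the partner's own records left untouched. The following is only an illustrative sketch: the field names on both sides are invented, and the flat "shared" record is a loose stand-in for a CIDOC-CRM export, not the actual CLAROS mapping.

```python
# Hypothetical rows as two partners might store them locally.
# These field names are invented and do not reflect the real schemas.
arachne_row = {"kurzbeschreibung": "Marble portrait head", "aufbewahrungsort": "Cologne"}
beazley_row = {"Description": "Attic red-figure krater", "Collection": "Oxford"}


def from_arachne(row: dict) -> dict:
    """Map an Arachne-style row to the shared record shape (assumed)."""
    return {"label": row["kurzbeschreibung"],
            "kept_at": row["aufbewahrungsort"],
            "source": "Arachne"}


def from_beazley(row: dict) -> dict:
    """Map a Beazley-style row to the shared record shape (assumed)."""
    return {"label": row["Description"],
            "kept_at": row["Collection"],
            "source": "Beazley"}


# One exporter per partner; adding a collection means adding one entry here.
EXPORTERS = {"Arachne": from_arachne, "Beazley": from_beazley}


def export_all(rows_by_partner: dict) -> list:
    """Convert every partner's rows into the shared shape without modifying them."""
    shared = []
    for partner, rows in rows_by_partner.items():
        exporter = EXPORTERS[partner]
        shared.extend(exporter(row) for row in rows)
    return shared


shared_records = export_all({"Arachne": [arachne_row], "Beazley": [beazley_row]})
for record in shared_records:
    print(record)
```

The design point is that the mapping lives at the boundary: each organization keeps its own database and standards, and only the exported copy follows the common format.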

281 http://www.arachne.uni-koeln.de/

282 http://www.beazley.ox.ac.uk/

283 http://www.limcnet.org/Home/tabid/77/Default.aspx

284 http://www.mae.u-paris10.fr/limc-france/

285 http://www.dainst.org/

286 http://www.lgpn.ox.ac.uk/

287 http://www.arachne.uni-koeln.de/drupal/?q=en/node/188

288 CIDOC-CRM is a conceptual reference model that “provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation” (http://www.cidoc-crm.org/) and was designed to promote and support the use of a “common and extensible semantic framework” by various cultural heritage organizations including museums, libraries and archives. For more on the CIDOC-CRM and its potential for supporting semantic interoperability between various digital resources and systems, see Doerr and Iorizzo (2008).



The CLAROS data web represents each resource as a SPARQL endpoint that is queried using the SPARQL RDF query language and returns data as RDF. 289 The two main problems of this approach, as reported by Kurtz et al. (2009), are semantic integration (alignment of different data schemas or, ideally, mapping to a single schema or ontology) and co-reference resolution (ensuring that references to the same object or entity in different databases under different names are recognized as such). Like EAGLE for epigraphy and APIS for papyrology, CLAROS is supporting the federated searching of multiple classical collections. It is their hope that the use of CIDOC-CRM, RDF and SPARQL will allow them to integrate additional classical art collections by mapping their unique schemas to the core CIDOC-CRM ontology of CLAROS and adding all necessary entries in the co-reference service.
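Co-reference resolution, as described above, amounts to recording that identifiers from different collections denote the same entity and then grouping search hits accordingly. The sketch below shows one common way to do this (a union-find structure); it is an assumption for illustration, not the mechanism CLAROS itself uses, and all identifiers are invented.

```python
class CoReferenceService:
    """Tracks which identifiers from different collections denote the same entity."""

    def __init__(self):
        self._parent = {}

    def canonical(self, identifier: str) -> str:
        """Return the representative identifier for an entity (union-find lookup)."""
        self._parent.setdefault(identifier, identifier)
        while self._parent[identifier] != identifier:
            # Path halving keeps later lookups short.
            self._parent[identifier] = self._parent[self._parent[identifier]]
            identifier = self._parent[identifier]
        return identifier

    def declare_same(self, a: str, b: str) -> None:
        """Record that two identifiers refer to the same object or entity."""
        root_a, root_b = self.canonical(a), self.canonical(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_entity(self, a: str, b: str) -> bool:
        return self.canonical(a) == self.canonical(b)


def group_hits(service: CoReferenceService, hits: list) -> list:
    """Group federated search hits so each entity appears once."""
    groups = {}
    for hit in hits:
        groups.setdefault(service.canonical(hit), []).append(hit)
    return list(groups.values())


# Invented identifiers standing in for records held by three partners.
service = CoReferenceService()
service.declare_same("beazley:1001", "arachne:obj-17")
service.declare_same("arachne:obj-17", "limc:4455")

hits = ["beazley:1001", "limc:4455", "arachne:obj-17", "dai:900"]
for group in group_hits(service, hits):
    print(group)
```

Adding a new collection under this scheme requires only new `declare_same` entries, which parallels the "necessary entries in the co-reference service" mentioned above.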

The CLAROS project has many goals, including making digital facsimiles of images and reference works available to the general public, providing scholars with access to “datasets of intellectually coherent material easily and swiftly through one multilingual search facility,” enabling museums to access records about their own collections and those of other museums, and permitting both the public and educational institutions in particular to engage with “high art” in new ways. One particular new way of engaging users that they have developed is a system 290 that allows users to upload an image of their own (either directly from their computer or by selecting a web address) and query for new or related images; this process is currently based only on image recognition and not textual descriptors. Another intriguing vision of this project is one in which members of the public could take images of classical art around the world and upload them to CLAROS clouds for image recognition, identification, and documentation by experts.

The current CLAROS Explorer allows users to either browse or search by five major facets: category, place, period, text, or contributing data collection. In addition, two other collection browsing options include viewing a timeline of records in the database or browsing a map of places to choose records. The entire CLAROS collection can be searched by keyword using a “General Search,” and a number of search terms can be combined by choosing values from each list. Search results can be displayed as a textual list, a series of images, or on a timeline (or results can be viewed as a combination of these options, e.g., a list with images). Clicking on an individual result in a search list then takes the user to the record for that individual item in its home collection (e.g., the Beazley Research Archive).
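Combining values from several facet lists with a keyword, as the Explorer is described as doing, can be sketched as a simple filter. The record fields below follow the five facets named above; the sample records, the function name `facet_search`, and the matching rules (exact match per facet, case-insensitive substring for the keyword) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    category: str
    place: str
    period: str
    text: str
    collection: str


def facet_search(records: List[Record], keyword: Optional[str] = None, **facets: str) -> List[Record]:
    """Return records matching every chosen facet value and, if given, the keyword."""
    results = []
    for record in records:
        # Every selected facet must match exactly.
        if any(getattr(record, name) != value for name, value in facets.items()):
            continue
        # The keyword is matched against the descriptive text only.
        if keyword and keyword.lower() not in record.text.lower():
            continue
        results.append(record)
    return results


# Invented sample records.
records = [
    Record("Pottery", "Athens", "Classical", "Red-figure krater with Dionysos", "Beazley Research Archive"),
    Record("Sculpture", "Rome", "Imperial", "Marble portrait of Trajan", "Arachne"),
    Record("Pottery", "Corinth", "Archaic", "Black-figure aryballos", "Beazley Research Archive"),
]

pottery = facet_search(records, category="Pottery")
kraters = facet_search(records, keyword="krater", collection="Beazley Research Archive")
print(len(pottery), len(kraters))
```

Each result would then link back to the item's record in its home collection rather than to a copy held centrally.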

Classical Geography

An overview of the potential of digital technologies for the study of geography within the larger context of the digital humanities, with a specific focus on classical geography and archaeology, 291 has recently been provided by Stuart Dunn (Dunn 2010). Dunn noted that digital technologies are supporting what he labeled “neogeography,” a “discipline” that is collaborative and includes new sets of tools and methods, but also presents its own challenges:

The ‘grand challenge’ for collaborative digital geography therefore, with its vast user base, and its capacity for generating new data from across the specialist and non-specialist communities, is to establish how its various methods can be used to understand better the construction of the spatial artefact, rather than simply to represent it (Dunn 2010, 56).

289 http://www.w3.org/TR/rdf-sparql-query/

290 http://explore.clarosnet.org/XDB/ASP/clarosExplorerImage.asp

291 A number of public domain reference works have been digitized and are useful for the study of classical geography, such as the Topographical Dictionary of Ancient Rome (http://www.lib.uchicago.edu/cgi-bin/eos/eos_title.pl?callnum=DG16.P72), which has been put online by the University of Chicago, and the Tabula Peutingeriana, a medieval copy of a Roman map of the empire, http://www.hsaugsburg.de/~harsch/Chronologia/Lspost03/Tabula/tab_intr.html



This challenge of not just digitizing or representing traditional objects of study online but of finding new ways to use these methods to conduct innovative research, create new knowledge, and answer new questions is a recurrent theme throughout the disciplines of digital classics. This section focuses on several prominent digital projects that address these challenges within the realm of classical geography.

The Ancient World Mapping Center

The Ancient World Mapping Center (AWMC), 292 an interdisciplinary center at the University of North Carolina-Chapel Hill, is perhaps the preeminent organization in this field of study. According to its website, the AWMC “promotes cartography, 293 historical geography and geographic information science as essential disciplines within the field of ancient studies through innovative and collaborative research, teaching, and community outreach activities.” This website includes a number of resources for researchers, but the majority of the website is composed of short research articles written by AWMC staff regarding topics such as new publications available or websites of interest. These articles can be found by browsing the table of contents for the website or the topical index or by searching. There is also an index of place names used in articles on the website. One other useful resource is a selection of free maps of the classical world that can be downloaded for teaching.

The Pleiades Project

The Pleiades Project, 294 once solely based at the AWMC but now a joint project of the AWMC, the Institute for the Study of the Ancient World (ISAW), 295 and the Stoa Consortium, 296 is one of the largest digital resources in classical geography. The Pleiades website allows scholars, students, and enthusiasts to share and use historical geographical information about the classical world. A major goal of Pleiades is to create an authoritative digital gazetteer of the ancient world that is continuously updated and can be used to support other digital projects and publications through the use of “open, standards based interfaces” (Elliott and Gillies 2009a).

From its beginning, Pleiades was intended to be collaborative: to join the project and contribute information, a user simply needs to have an e-mail address and to accept a contributor agreement. This agreement leaves all intellectual property rights with contributors but also grants to Pleiades a “CC Attribution Share-Alike License.” Registered users can suggest updates to geographic names, add bibliographic references, and contribute to descriptive essays. All contributions are vetted, after which these suggestions “become a permanent, author-attributed part of future publications and data services” (Elliott and Gillies 2009b). Thus, the Pleiades project provides a light level of “peer review” to all user-contributed data.

The content within the Pleiades gazetteer “combines ‘pure’ data components (e.g., geospatial coordinates) with the products of analysis (e.g., toponymic variants with indicia of completeness, degree of reconstruction and level of scholarly confidence therein) and textual argument” (Elliott and Gillies 2009a). In addition, Pleiades includes content from the Classical Atlas project, an extensive

292 http://www.unc.edu/awmc/

293 One excellent interactive resource for studying the cartographic history “of the relationships between hydrological and hydraulic systems and their impact on the urban development of Rome, Italy” is Aquae Urbis Romae (http://www3.iath.virginia.edu/waters/)

294 http://pleiades.stoa.org/

295 ISAW (http://www.nyu.edu/isaw/) is based at New York University and is a “center for advanced scholarly research and graduate education, intended to cultivate comparative and connective investigations of the ancient world from the western Mediterranean to China, open to the integration of every category of evidence and relevant method of analysis.” It will feature a variety of doctoral and postdoctoral programs to support groundbreaking and interdisciplinary scholarship.

296 http://www.stoa.org/


international collaboration that led to the publication of the Barrington Atlas of the Greek and Roman World. In fact, the creators of Pleiades see the website as a permanent way to update information in the Barrington online. The creators of Pleiades consider their publication model as in some ways close to both an academic journal and an encyclopedia:

Instead of a thematic organization and primary subdivision into individually authored articles, Pleiades pushes discrete authoring and editing down to the fine level of structured reports on individual places and names, their relationships with each other and the scholarly rationale behind their content. In a real sense then Pleiades is also like an encyclopedic reference work, but with the built-in assumption of on-going revision and iterative publishing of versions (Elliott and Gillies 2009b).

Rather than using toponyms or coordinates as the primary organizing theme of the website, they have used the concept of place “as a bundle of associations between attested names and measured (or estimated) locations (including areas)” (Elliott and Gillies 2009b). These bundles are called “features,” and they can be positioned in time and have scholarly confidences registered to them. The ability to indicate levels of confidence in historical or uncertain data is an important part of many digital classics projects.
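The idea of a feature as a bundle of names and locations, positioned in time and carrying confidence levels, can be sketched as a small data structure. This is only an illustration under stated assumptions, not the Pleiades data model: the class names, the field names, the confidence vocabulary, the `example.org` URI, and the coordinates and date range are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Name:
    attested: str                    # the name as attested in a source
    confidence: str = "confident"    # hypothetical confidence vocabulary


@dataclass
class Location:
    lon_lat: Tuple[float, float]     # (longitude, latitude) in decimal degrees
    measured: bool = True            # False would mean an estimated location
    confidence: str = "confident"


@dataclass
class Feature:
    uri: str                                        # stable identifier for the bundle
    names: List[Name] = field(default_factory=list)
    locations: List[Location] = field(default_factory=list)
    time_span: Optional[Tuple[int, int]] = None     # years; negative means BCE

    def attested_in(self, year: int) -> bool:
        """True if the feature's time span covers the given year."""
        if self.time_span is None:
            return False
        start, end = self.time_span
        return start <= year <= end


# Illustrative values only; not taken from any gazetteer.
ithaca = Feature(
    uri="https://example.org/places/ithaca",
    names=[Name("Ithaca"), Name("Ithake", confidence="less-certain")],
    locations=[Location((20.7, 38.4), measured=False, confidence="uncertain")],
    time_span=(-750, 300),
)
```

Keeping confidence on each name and each location, rather than on the feature as a whole, lets a record be certain about a name while remaining uncertain about where the place was.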

As the sheer amount of content is far more than individual project participants can actively edit and maintain, the Pleiades project has “pushed out” this responsibility to interested members of the classics community and beyond, through the collaboration model described above. Another important feature of Pleiades is that it uses only open-source software such as OpenLayers, 297 Plone, 298 and zgeo. 299

In addition, the Pleiades project promotes the use of its gazetteer as an “authority list” for Greek and Roman geographic names and their associated locations. All Pleiades content has stable URLs for its discrete elements; this allows other digital resources to “refer unambiguously to the places and spaces mentioned in ancient texts, the subjects of modern scholarly works, the minting locations of coins, and the findspots of inscriptions, papyri, and the like” (Elliott and Gillies 2009a). The difficulties that historical place names within “legacy literature” present for named entity disambiguation, geoparsing, and automatic mapping techniques within the field of archaeology were previously documented by the Archaeotools project, and Elliott and Gillies (2009b) also report that such place names present similar challenges for classical geography. 300 They detailed how many historical books found within Google Books have information pages that include Google Maps populated with place names extracted from the text; classical place names such as Ithaca, however, are often mistakenly assigned to modern places such as the city in New York. While there are algorithms that attempt to deal with many of these issues, they also argue that:

This circumstance highlights a class of research and publication work of critical importance for humanists and geographers over the next decade: the creation of open, structured, web-facing geo-historical reference works that can be used for a variety of purposes, including the training of geo-parsing tools and the population of geographic indexes (Elliott and Gillies 2009b).

297 OpenLayers is an open-source JavaScript library that can be used to display map data in most Web browsers (http://openlayers.org/).

298 Plone is an open-source content management system (http://plone.org/).

299 http://plone.org/products/zgeo.wfs

300 The use of computational methods and customized knowledge sources for historical named-entity disambiguation has an extensive literature that is beyond the scope of this paper. For some useful approaches, see Smith (2002), Tobin et al. (2008), and Byrne (2007).



Part of the research of the Pleiades project, therefore, has been to determine how best to turn digital resources such as their gazetteer into repurposeable knowledge bases. 301 Elliott and Gillies (2009b) predict that increasingly those who hold geographic data and wish to make it freely available online will provide access to their data through a variety of web services.

Despite the project's desire to make all Pleiades content available to be remixed and mashed up, these efforts have met with some obstacles:

In our web services, we employ proxies for our content (KML 302 and GeoRSS 303 -enhanced Atom 304 feeds) so that users can visualize and exploit it in a variety of automated ways. In this way, we provide a computationally actionable bridge between a nuanced, scholarly publication and the geographic discovery and exploitation tools now emerging on the web. But for us, these formats are lossy: they cannot represent our data model in a structured way that preserves all nuance and detail and permits ready parsing and exploitation by software agents. Indeed, we have been unable to identify a standard XML-based data format that simply and losslessly supports the full expression of the Pleiades data model (Elliott and Gillies 2009b).
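What “lossy” means in practice can be illustrated with a deliberately naive serializer that writes a record to a bare KML Placemark and reports which fields it had nowhere to put. This is a sketch under stated assumptions, not how Pleiades generates its feeds: the record layout and the function name `to_placemark` are hypothetical, and the example says nothing about what a fuller KML encoding could carry.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def to_placemark(record):
    """Serialize a record to a minimal KML Placemark.

    Returns (xml_text, dropped_keys), where dropped_keys lists the record
    fields that this simple name-plus-point encoding leaves out.
    """
    ET.register_namespace("", KML_NS)
    placemark = ET.Element(f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = record["name"]
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    lon, lat = record["lon_lat"]
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    kept = {"name", "lon_lat"}
    dropped = sorted(key for key in record if key not in kept)
    return ET.tostring(placemark, encoding="unicode"), dropped


# Hypothetical record with illustrative values.
record = {
    "name": "Ithaca",
    "lon_lat": (20.7, 38.4),
    "confidence": "uncertain",
    "time_span": (-750, 300),
}
xml_text, dropped = to_placemark(record)
```

Here the confidence level and time span never reach the output, which is the kind of nuance a downstream consumer of the feed would not see.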

To provide a lossless export, they plan to produce file sets composed of ESRI shape files with attribute tables in CSV, a solution that, despite the proprietary nature of the ShapeFile format, does allow them to download time-stamped files into the institutional repository at New York University. Although they would prefer to use only open formats, Elliott and Gillies argued that the ShapeFile format is used around the world and can be decoded by open-source software, a fact that gives it a “high likelihood of translation into new formats in the context of long-term preservation repositories.” The experience of Pleiades illustrates the challenges of wanting to create open-access resources while having to deal with the limits of open formats and long-term preservation needs.

Nonetheless, the open-access nature of Pleiades and the ability to link to individual places within it make it a natural source to integrate with other digital classics projects in numismatics, epigraphy, and papyrology, or any digital resource that makes extensive use of historical place names within the ancient world. Indeed, Pleiades is actively working with other projects through the Concordia initiative to integrate its content with other digital projects and “develop standards-based mechanisms for cross-project geographic search.”

The HESTIA Project

While smaller in scale than Pleiades, the HESTIA (Herodotus Encoded Space-Text-Imaging-Archive) 305 project provides an interesting look at how digital technology and spatial analysis 306 can be used to answer a specific research question in classical geography. HESTIA seeks to examine the different ways in which the history of Herodotus refers to space and time. 307 Their major research questions include studying his “representation of space in its cultural context,” exploring whether different peoples represented in his history conceive of space differently, and testing the thesis “that the ancient Greek world centered on the Mediterranean and was comprised of a series of networks” (Barker 2010).

301 The gazetteer data created by the Pleiades project has served as one key component of the Digital Atlas of Roman and Medieval Civilization (http://medievalmap.harvard.edu/icb/icb.dokeyword=k40248&pageid=icb.page188865), which “offers a series of maps and geodatabases bearing on multiple aspects of Roman and medieval civilization in the broadest terms” and makes extensive use of the Barrington Atlas of the Greek and Roman World. The recently announced “Google Ancient Places” project is also utilizing the data made available by Pleiades (http://googleancientplaces.wordpress.com/2010/12/08/filling-in-some-gaps-in-gap/).

302 KML, formerly known as “keyhole markup language,” was created by Google and is maintained as an open standard by the Open Geospatial Consortium (http://www.opengeospatial.org/standards/kml/). KML is an “XML language focused on geographic visualization, including annotation of maps and images” and is used by a number of open-source mapping projects (http://code.google.com/apis/kml/documentation/mapsSupport.html).

303 GeoRSS (http://www.georss.org/Main_Page) is a “lightweight, community driven way to extend existing feeds with geographic information.”

304 According to Wikipedia (http://en.wikipedia.org/wiki/Atom_(standard)), the name Atom applies to two related standards: the Atom Syndication Format is an “XML language used for web feed” (http://www.ietf.org/rfc/rfc4287.txt), while the Atom Publishing Protocol is an “HTTP-based protocol for creating and updating web resources.”

305 http://www.open.ac.uk/Arts/hestia/index.html

306 The HESTIA project hosted a workshop in July 2010 entitled “New Worlds Out of Old Texts: Interrogating New Techniques for the Spatial Analysis of Ancient Narratives” that brought together numerous projects that are using spatial-analysis techniques in various classical disciplines (http://www.artshumanities.net/event/new_worlds_out_old_texts_interrogating_new_techniques_spatial_analysis_ancient_narratives).

Barker et al. (2010) have provided an extensive overview of the design and initial findings of HESTIA, including methodological considerations for other projects that seek to make use of the state of the art in GIS, relational databases, and other computational tools to explore questions not just in classical geography but also in the humanities in general. The authors also demonstrate how many traditional questions regarding the text of Herodotus (e.g., what is the relative importance of bodies of water, particularly rivers, in organizing the physical and cultural space?) can be investigated in new ways using digital technologies. The authors also identified a number of themes that HESTIA would pursue in depth regarding the thinking about space in Herodotus's Histories:

…namely, the types of networks present and their interpretation, the influence of human agency and focalisation, the idea of space as something experienced and lived in, and the role of the medium in the representation of space—and to emphasise that close textual reading underpins our use of ICT throughout (Barker et al. 2010).

One major point reiterated by Barker et al. was that a close textual reading of Herodotus was the first consideration before making any technological decisions.

The methodology for creating HESTIA involved four stages: (1) utilizing the digital markup of a Herodotus text obtained from Perseus; (2) compiling a spatial database from this text; (3) producing basic GIS, GoogleEarth, and Timeline maps with this database; and (4) creating and analyzing automated network maps. One particularly interesting feature of the HESTIA project is that it repurposed the TEI-XML versions of Herodotus available from the PDL, and in particular the place names tagged, along with coordinates and identifiers from the Getty Thesaurus of Geographic Names (TGN) 308 in the English file. Nonetheless, this process of reuse was not seamless, and HESTIA needed to perform some procedural conversions. Specifically, they converted the TEI P4 file from Perseus to TEI P5; the Greek text was transformed from Beta code to Unicode using a transcoder tool developed by Hugh Cayless.
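The first stage, pulling tagged place names out of an XML-encoded text, might look roughly like the following. This is a minimal sketch and not the HESTIA pipeline: the sample fragment, the `placeName` element, the `key` attribute, and the placeholder identifiers are assumptions made for illustration and are not the actual Perseus markup or real TGN identifiers.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; element names, attributes, and IDs are placeholders.
SAMPLE = """<div n="1.1">
  <p>Sample text mentioning
    <placeName key="tgn:0000001">Halicarnassus</placeName> and
    <placeName key="tgn:0000002">Asia</placeName>.</p>
</div>"""


def extract_places(xml_text):
    """Return (section, identifier, name) for every tagged place in a section."""
    root = ET.fromstring(xml_text)
    section = root.get("n")
    return [
        (section, element.get("key"), element.text)
        for element in root.iter("placeName")
    ]


places = extract_places(SAMPLE)
```

Each tuple pairs a place with the section in which it occurs, which is the raw material a spatial database of references would need.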

The HESTIA project also decided to use only the English version of the Histories to probe spatial data, since the Greek text would have required toponyms to be tagged by hand. Nonetheless, since they still wanted to make use of the Greek text, they assigned unique identifiers to each section of the text in Greek and English so that associations could still be made. In addition, the project needed to perform some data cleaning of the geographic markup in the Perseus TEI XML file, including removing duplicate entries and correcting coordinates, entity categorizations, and references to false locations. This work by HESTIA illustrates that while the creation of open-access texts by digital classics projects supports reuse, this reuse is not without its own computational challenges and costs.

307 Exploration of how time and space were conceived of in the ancient world is also the focus of the German research project TOPOI, “The Formation and Transformation of Space and Knowledge in Ancient Civilizations” (http://www.topoi.org/index.php). Recent details on one TOPOI project can be found in Pappelau and Belton (2009).

308 http://www.getty.edu/research/conducting_research/vocabularies/tgn/



After correcting and standardizing the place-name tagging in the English Histories of Herodotus, the HESTIA project extracted this information and compiled a spatial database stored in PostgreSQL 309 (which has a PostGIS extension). This database can be queried to produce different automated results that can then be visualized through maps. The generation of this database posed a number of problems, however, including questions as to what if any connections Herodotus might have been drawing between places, the quality of the English translation, and various syntactic issues of language representation. Nonetheless, the final database has a simple structure and contains only three tables: sections (which stores information about the section of Herodotus text); locations (which stores unique locations); and references (which ties the locations and sections together by providing unique IDs for all references to spatial locations within Herodotus). Whereas Perseus had used only a single level of categorization, HESTIA introduced a broader level of categorization of “geotype” and “subtype,” a process that also defines places as settlements, territories, or physical features.
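A three-table layout of this kind can be sketched as follows. The sketch uses Python's built-in SQLite module as a stand-in for PostgreSQL/PostGIS, and every column name is an assumption; the source specifies only the three tables and the “geotype” and “subtype” categories. The linking table is named `refs` here because `references` is a reserved word in SQL.

```python
import sqlite3

# Hypothetical schema; only the three-table structure comes from the text.
SCHEMA = """
CREATE TABLE sections (
    id      INTEGER PRIMARY KEY,
    book    INTEGER NOT NULL,
    chapter INTEGER NOT NULL
);
CREATE TABLE locations (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE,
    geotype TEXT,   -- e.g. settlement, territory, physical feature
    subtype TEXT,
    lon     REAL,
    lat     REAL
);
CREATE TABLE refs (
    id          INTEGER PRIMARY KEY,
    section_id  INTEGER NOT NULL REFERENCES sections(id),
    location_id INTEGER NOT NULL REFERENCES locations(id)
);
"""


def build_db():
    """Create an empty in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


conn = build_db()
```

Because each row in `refs` has its own ID, repeated mentions of the same place in the same section remain distinct records instead of collapsing into one.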

The HESTIA project chose QGIS, 310 an open-source GIS tool that connects easily to PostGIS, as the application for querying the database and generating maps. As with the choice of PostgreSQL, Barker et al. were concerned with choosing applications that would support long-term data preservation and analysis. Using these tools allowed SQL queries to be generated that could perform various functions with related maps, including producing a gazetteer of sites, listing the total number of references to a location (such as by book of Herodotus), and generating a network based on co-reference of locations (e.g., within a specific book). To provide greater public access to these data, the HESTIA project decided to expose the PostGIS data as KML so that it could be read by various mapping applications such as GoogleEarth. The project accomplished this by using GeoServer, an “Open Source server that serves spatial data in a variety of web-friendly formats simultaneously” including KML and SVG. The creation of this “Herodotus geodata” allows users to construct their own mashups of the visual and textual data created by the HESTIA project.
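Two of the queries described above, reference counts per book and a co-reference network, can be sketched in SQL. The sketch rests on several assumptions: it uses SQLite rather than PostGIS, it flattens the data into a single hypothetical `refs` table for brevity, the rows are invented toy data, and it treats two places as co-referenced when they occur in the same chapter, which is only one possible granularity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE refs (book INTEGER, chapter INTEGER, place TEXT)")
# Invented toy rows, not data from Herodotus.
conn.executemany(
    "INSERT INTO refs VALUES (?, ?, ?)",
    [
        (1, 1, "Athens"), (1, 1, "Sparta"),
        (1, 2, "Athens"), (1, 2, "Sardis"),
        (2, 1, "Athens"), (2, 1, "Sparta"),
    ],
)

# Total number of references to each place, broken down by book.
counts = conn.execute(
    """SELECT place, book, COUNT(*) FROM refs
       GROUP BY place, book ORDER BY place, book"""
).fetchall()

# Network edges: pairs of places mentioned in the same chapter, weighted by
# the number of chapters in which they co-occur.
edges = conn.execute(
    """WITH mentions AS (SELECT DISTINCT book, chapter, place FROM refs)
       SELECT a.place, b.place, COUNT(*)
       FROM mentions a
       JOIN mentions b
         ON a.book = b.book AND a.chapter = b.chapter AND a.place < b.place
       GROUP BY a.place, b.place
       ORDER BY a.place, b.place"""
).fetchall()
```

The `a.place < b.place` condition keeps each pair once and excludes self-pairs, so the result can be read directly as an undirected weighted edge list.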

To visualize spatial changes in the narrative more successfully, the project made use of TimeMap.js, 311 an open-source project that uses several technologies to “allow data plotted on GoogleMaps to appear and disappear as a timeline is moved.” The project hired the developer of TimeMap, Nick Rabinowitz, to create a “Herodotus Narrative Timeline” that “allows users to visualise locations referred to in the text by scrolling along a 'timeline' representing each chapter.” 312 The development of this timeline, however, also required the creation of one feature to better integrate textual and visual data:

When places are first mentioned, they appear flush to the right-hand side of the ‘timeline’ bar and fully coloured in on the map. As one moves through the narrative, however, they move to the left of the ‘timeline’ bar accordingly and become ever fainter on the map, until, in both cases, they drop out altogether. In doing this, we have tried to reproduce more accurately the reading experience: in the mind; some chapters later, this place might no longer hold the attention so greatly, but its memory lingers on (captured in the Timeline Map by the faded icons), until it disappears altogether. By re-visualising the data in this format, we hope not only to assist in the reading experience of Herodotus but also to raise new research questions that would not have been apparent before the advent of such technology (Barker et al. 2010).

309 http://www.postgresql.org/about/

310 http://www.qgis.org/

311 http://code.google.com/p/timemap/

312 http://www.nickrabinowitz.com/projects/timemap/herodotus/basic.html



The development of this timeline and reading tools to support its use demonstrates how digital technologies offer new ways of reading a text.
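The fading behaviour described in the passage quoted above can be sketched as a function of how many chapters have passed since a place was mentioned. This is a minimal illustration in Python, not TimeMap.js code: the linear fade, the function name `marker_opacity`, and the `fade_span` parameter are all assumptions made for the example.

```python
def marker_opacity(current_chapter: int, mentioned_chapter: int, fade_span: int = 5) -> float:
    """Return a map-marker opacity between 0.0 and 1.0.

    Hypothetical model: a place is fully opaque in the chapter that
    mentions it, fades linearly over `fade_span` chapters, and then
    drops out altogether.
    """
    age = current_chapter - mentioned_chapter
    if age < 0 or age >= fade_span:
        # Not yet mentioned, or already dropped off the timeline.
        return 0.0
    return 1.0 - age / fade_span


# A place mentioned in chapter 3, viewed while scrolling through chapters 2 to 8.
opacities = [marker_opacity(chapter, 3) for chapter in range(2, 9)]
print(opacities)
```

Under these assumptions a place is invisible before its first mention, fully coloured when mentioned, and progressively fainter until it disappears, which mirrors the "memory lingers on" effect the project describes.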

HESTIA has also produced a number of automatic network maps to analyze how the narrative of Herodotus organized space and relations between places. Barker et al. cautioned, however, that as accurate as such maps may appear in GoogleEarth, they can never truly be objective, for “the form chosen to represent space inevitably affects its understanding and is in itself framed by certain epistemological assumptions” (Barker et al. 2010). This warning cogently echoes the fears articulated earlier by scholars who created digital reconstructions of archaeological monuments (Beacham and Denard 2003), that users of such visualizations and maps must always be cognizant of the theoretical and interpretative arguments inherent in such constructions. Nonetheless, the HESTIA project created network maps as a way to better analyze connections made between places in the narrative of Herodotus, and they focused on topological connections (links created by human agents) rather than topographic ones (two-dimensional maps). As they put it, “In other words, we are interested in capturing and evaluating the mental image (or, better, images) of the world contained within Herodotus’ narrative, not any supposed objective representation” (Barker et al. 2010).
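One simple way to picture a topological network of this kind is to link two places whenever they are mentioned in the same chapter and to count how often that happens. The sketch below is only an illustration under that assumption; it is not the HESTIA method, and the chapter labels and place lists are invented toy data rather than anything drawn from the project database.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(chapters):
    """Count how often each pair of places shares a chapter.

    `chapters` maps a chapter label to the list of places mentioned in it.
    Returns a Counter keyed by alphabetically ordered (place, place) pairs.
    """
    edges = Counter()
    for places in chapters.values():
        # Deduplicate and sort so each pair is counted once per chapter.
        for first, second in combinations(sorted(set(places)), 2):
            edges[(first, second)] += 1
    return edges


# Toy input, invented for illustration only.
toy_chapters = {
    "chapter_a": ["Argos", "Phoenicia", "Egypt"],
    "chapter_b": ["Argos", "Egypt"],
    "chapter_c": ["Sparta"],
}
network = cooccurrence_network(toy_chapters)
for pair, weight in network.most_common():
    print(pair, weight)
```

The edge weights say nothing about physical distance; they record only how the narrative connects places, which is the distinction between topological and topographic connections drawn above.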

The creators of HESTIA explained that the creation of automated network maps was in large part designed to increase the impact of the project or to bring Herodotus to new readers on the web. At the same time, the queries and the maps they generated are intended to prompt new questions and analysis rather than provide definitive answers, or, as Barker et al. (2010) emphasized, they “should be regarded as complementing rather than replacing close textual analysis.” In fact, the automated network maps faced a variety of problems with spatial data inherited from the Perseus English text as well as the far greater issue that many place names in the English translation, and thus in the database, are not found in the Greek text. As Barker et al. (2010) explained, one fundamental difference between Greek and English is that often what was conceptualized as a “named geographical concept” in English was represented in Greek as the people who lived in that place. For future work, 313 Barker et al. (2010) posited that they would need to further nuance their quantitative approach with qualitative approaches that mine the textual narrative. In sum, this project illustrates not only how the digital objects of other projects can be repurposed in new ways but also how a traditional research question in classics can be reconceived in a digital environment.

One of the principal researchers of the HESTIA Project has published other work on applying technologies such as GIS and network analysis to research questions in classical geography (Isaksen 2008). Leif Isaksen’s 314 research into the Roman Baetica integrates the archaeological and documentary record (Antonine Itineraries, Ravenna Cosmography) and demonstrates the potential of using new technologies to revisit arguments about transportation networks and the Roman Empire.

313 In July 2010, Google announced a number of digital humanities grant awards, one of which was made to Elton Barker of the HESTIA project and to Eric Kansa of Open Context for the “Google Ancient Places (GAP)” project. According to the project blog (http://googleancientplaces.wordpress.com/2010/10/13/taking-a-gap-year/), GAP plans to mine the Google Books corpus for locations and place names associated with the ancient world and then provide a variety of ways for scholars to visualize the results (e.g., in GoogleEarth or with Google Maps). The GAP project is also participating in the PELAGIOS (Pelagios: Enable Linked Ancient Geodata In Open Systems) project (http://pelagiosproject.blogspot.com/p/about.html) along with Pleiades, the Perseus Digital Library, Open Context, and several other prominent projects to introduce Linked Open Data into “online resources that refer to places in the Ancient World.”

314 Leif Isaksen (2010) also recently presented initial work in using digital methods to analyze the Geographia of Claudius Ptolemy.



Epigraphy

Overview: Epigraphy Databases, Digital Epigraphy, and EpiDoc

Epigraphy has been defined as the study of “inscriptions or epigraphs engraved into durable materials (e.g., stone)” (Bauer et al. 2008). This digitally advanced discipline 315 is well represented online by numerous projects as well as by a relatively mature encoding standard, EpiDoc. 316 According to the Corpus Inscriptionum Latinarum (CIL) project, “Inscriptions, as direct evidence from the ancient world, are among the most important sources for investigating Roman history and everyday life in all their aspects.” 317 Bodard (2008) has offered further explanation of the importance of inscriptions for classical scholarship:

Inscriptions, ancient texts inscribed on stone or other durable materials, are an important source of access to various ancient societies, and particularly the worlds of ancient Greece and Rome. These texts survive in large numbers, and are widely used by historians as one of the primary sources of direct evidence on the history, language, rituals, and practices of the ancient world. Words inscribed on stone, a skilful and expensive process, may tend to be élite texts … (Bodard 2008).

Bodard stated that in addition to official documents there are many other types of inscriptions, such as gravestones and curse tablets, all of which give insight into the everyday life of ordinary people.

Cayless et al. (2009) gave an overview of the state of the art in digital epigraphy 318 and the future of epigraphy as a discipline. They stated that while most epigraphic publications still appeared only in print, by 2017 this situation would have changed. The discipline of epigraphy grew greatly during the eighteenth and nineteenth centuries, Cayless et al. observed, as a standard gentleman's education in both Latin and Greek and travel in the eastern Mediterranean both became more common. Many inscriptions were transcribed by nonclassical scholars, but a scientific approach to transcription gradually developed, as did standards for publication, albeit in a rather haphazard manner. In the early 1930s, a set of publishing protocols called the Leiden conventions (Van Groningen 1932) was agreed upon, conventions that have been discussed and updated ever since, according to Cayless et al. (2009). The Leiden conventions have been described as “a type of semantic encoding, which consists of various brackets, underdots and other markings relating to missing or broken characters, uncertainty, additions and corrections made by the editor of an ancient text” (Roued 2009). Despite the creation of these conventions, Roued (2009) noted that editions published before 1931 used varying conventions, and even after the creation of Leiden, not all parts of the conventions were applied evenly.

One major issue with standard print publication in epigraphy, Cayless et al. observed, was that it “tended to emphasize the role of epigraphy within archaeology and history, and to distance it from the study of text and language.” Bodard (2008) has also emphasized this unique feature of inscriptions:

The texts themselves are an awkward category, neither poetry, history, or philosophy, nor even in the same category as literature preserved by the direct manuscript tradition, but documentary texts with very little beauty or elegance of language. The objects on which the texts are inscribed, the stelae, statues, wall panels, tablets, and grave monuments, are studied by archaeologists and art historians for whom the written texts are little more than a footnote, if not an inconvenience. This fact has tended to keep inscriptions in an academic limbo, not quite literary text and not quite archaeological object (Bodard 2008).

315 For one look at an earlier approach to “digital epigraphy” and its advantages for Egyptology, see Manuelian (1998).

316 http://epidoc.sourceforge.net/

317 http://cil.bbaw.de/cil_en/dateien/forschung.html

318 For an overview of state-of-the-art digital research methods specifically for Latin epigraphy, see Feraudi-Gruénais (2010).

In fact, Bodard claimed that electronic publication supports an entire reappraisal of inscriptions, and that text encoding and subject-based markup, in particular, increase the ability to deal with inscriptions as both texts and archaeological objects.

To delineate the future of digital epigraphy, Cayless et al. (2009) referred to John Unsworth’s list of scholarly primitives (Unsworth 2000), “discovery, annotation, comparing, referring, sampling, illustrating, and representing,” and used it as a framework for analyzing how well epigraphy databases addressed these needs. They argued that epigraphy databases have been highly successful in supporting the task of discovery, and that providing the ability to search across texts has been one of the major goals behind most digital epigraphy projects. In addition, any project published online can also be searched at least by Google. Indeed, as the survey of projects below illustrates, most digital epigraphy projects are database driven. Cayless et al. noted, however, that the approach taken by most epigraphy projects fails to address the other scholarly primitives, and that a different type of digital representation is thus necessary.

To support this assertion, Cayless et al. briefly reviewed EDH, EAGLE, and several other digital epigraphy projects, and suggested that they represented standard approaches to digitally representing inscriptions and related data. They pointed out that massive digitization projects such as Google Books and the Internet Archive have also scanned a number of public domain editions of inscriptions (though Google Books sometimes restricts access to some of these texts without explanation, particularly to users outside of the United States). The standard approach of most databases, as described by Cayless et al., directly transfers a Leiden-encoded inscription to digital form with only some adjustments. In contrast, they advocate the use of EpiDoc, a TEI XML standard created by Tom Elliott, for encoding inscriptions. Although EpiDoc was conceived as a common data interchange format, Cayless et al. reported that through a number of projects and workshops:

… EpiDoc has grown and matured. Its scope has expanded beyond (though not abandoned) the original vision for a common interchange format. EpiDoc now aims also to be a mechanism for the creation of complete digital epigraphic editions and corpora. We will argue that EpiDoc represents a better digital abstraction of the Leiden conventions than is achievable by a simple mapping Leiden’s syntax for printed publication into digital form. A full EpiDoc document may contain, in addition to the text itself, information about the history of the inscription, a description of the text and its support, commentary, findspot and current locations, links to photographs, translations, etc. (Cayless et al. 2009).

As a result, they argue that EpiDoc can support the creation of more sophisticated digital editions and digital corpora of inscriptions. In addition, the EpiDoc project has created tools to convert Leiden-formatted inscriptions automatically into EpiDoc XML versions.
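To make the idea of converting Leiden notation into EpiDoc XML concrete, the following Python sketch handles just two Leiden features: an abbreviation expanded in parentheses and text restored in square brackets. It is a simplified illustration, not the EpiDoc project's conversion tooling. The function name `leiden_to_epidoc` is hypothetical, and the element names (`expan`, `abbr`, `ex`, `supplied`) reflect the EpiDoc guidelines as commonly documented and should be checked against the current guidelines before reuse.

```python
import re

# Abbreviation with an editorial expansion, e.g. Imp(erator).
ABBREVIATION = re.compile(r"(\w+)\((\w+)\)")
# Text lost on the stone and restored by the editor, e.g. [Aug]ustus.
SUPPLIED = re.compile(r"\[([^\[\]]+)\]")


def leiden_to_epidoc(text: str) -> str:
    """Convert two Leiden features into EpiDoc-style XML markup.

    Deliberately incomplete: underdots, gaps, erasures, and the other
    Leiden sigla are not handled in this sketch.
    """
    text = ABBREVIATION.sub(r"<expan><abbr>\1</abbr><ex>\2</ex></expan>", text)
    text = SUPPLIED.sub(r'<supplied reason="lost">\1</supplied>', text)
    return text


sample = "Imp(erator) Caes(ar) [Aug]ustus"
converted = leiden_to_epidoc(sample)
print(converted)
```

Even this small mapping shows why the XML form is easier to query: the editor's expansion and restoration become named elements rather than punctuation embedded in the text.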

The Leiden conventions specify how inscription features besides the text should be represented in print and provide standard symbols that can be used to “convey the state of the original document and the editor’s interpretation of that document” (Cayless et al. 2009). Directly mapping Leiden print syntax to a digital form, however, presented a number of issues that were covered in detail by the authors. Cayless et al. also noted that digitally representing the typographic features of Leiden was only a first step, because epigraphic texts should also be “fully queryable and manipulable” in a digital environment:

By the term “queryable”, we do not simply mean that the text may be scanned for particular patterns of characters; we mean that features of the text indicated by Leiden should be able to be investigated also. So, for example, a corpus of inscriptions should be able to be queried for the full list of abbreviations used within it, or for the number of occurrences of a word in its full form, neither abbreviated nor supplemented. One can imagine many uses for a search engine able to do these kinds of queries on text (Cayless et al. 2009).

The ability to do searches that “leverage the structures” embedded within Leiden, according to Cayless et al. (2009), first requires marked-up inscription text that could then be parsed and converted into data structures that could be used to support the operations listed above. Such parsing begins with lexical analysis, which produces a token stream; a parser then turns that stream into parse trees that can be acted upon and queried in different ways. The authors granted that while EpiDoc is only one “possible serialization of the Leiden data structure,” it does have the added advantage of having many tools already available to work with it.
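The pipeline described here (lexical analysis, then parsing, then queries over the resulting structure) can be sketched in a few lines of Python. The example below is an assumption-laden toy, not the authors' implementation: it recognizes only parentheses (expansions) and square brackets (supplied text), builds a shallow two-level structure of words and segments rather than a full parse tree, and all names (`tokenize`, `parse`, `Word`, `abbreviations`, `count_full_form`) are hypothetical.

```python
import re
from dataclasses import dataclass, field

TOKEN_RE = re.compile(
    r"(?P<LBRACK>\[)|(?P<RBRACK>\])|(?P<LPAREN>\()|(?P<RPAREN>\))"
    r"|(?P<TEXT>[^\[\]()\s]+)|(?P<SPACE>\s+)"
)


def tokenize(text):
    """Lexical analysis: yield (token_type, value) pairs."""
    for match in TOKEN_RE.finditer(text):
        yield match.lastgroup, match.group()


@dataclass
class Word:
    # Each segment is a (kind, text) pair; kind is "plain", "supplied" or "expansion".
    segments: list = field(default_factory=list)

    @property
    def text(self):
        return "".join(value for _, value in self.segments)

    @property
    def is_abbreviated(self):
        return any(kind == "expansion" for kind, _ in self.segments)

    @property
    def is_supplied(self):
        return any(kind == "supplied" for kind, _ in self.segments)


def parse(text):
    """Parsing: group the token stream into words made of typed segments."""
    words, current, kind = [], Word(), "plain"
    for token, value in tokenize(text):
        if token == "SPACE":
            if current.segments:
                words.append(current)
            current = Word()
        elif token == "LBRACK":
            kind = "supplied"
        elif token == "LPAREN":
            kind = "expansion"
        elif token in ("RBRACK", "RPAREN"):
            kind = "plain"
        else:
            current.segments.append((kind, value))
    if current.segments:
        words.append(current)
    return words


def abbreviations(words):
    """List abbreviations as written, without the editorial expansions."""
    return ["".join(v for k, v in w.segments if k != "expansion")
            for w in words if w.is_abbreviated]


def count_full_form(words, target):
    """Count occurrences written in full, neither abbreviated nor supplied."""
    return sum(1 for w in words
               if w.text == target and not w.is_abbreviated and not w.is_supplied)


words = parse("Imp(erator) Caesar [Caesar] Caes(ar)")
print(abbreviations(words))
print(count_full_form(words, "Caesar"))
```

The two query functions correspond to the two examples in the quotation: the list of abbreviations in a corpus, and the number of times a word appears in its full form.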

Cayless et al. stated that, rather than making use of standards such as EpiDoc, the databases that supported most online epigraphy projects typically included various metadata fields and a large text field with the Leiden inscription directly transcribed without any markup or encoding (a fact supported by the survey in this review). The convenience of such a database setup is that it permits various fielded and full-text searches, it is easy to connect with web-based front ends for forms, data can be easily extracted using Structured Query Language (SQL), and data can also be easily added to these systems. This makes it easy to insert new inscriptions as they are discovered. Nonetheless, this standard database approach has two major flaws, according to Cayless et al.: (1) in terms of digital preservation, each “digital corpus” or database does not have distributed copies as a print corpus does; and (2) these databases lack the ability to “customize queries” and thus “see how result sets are being constructed.”
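The database layout described above, a few metadata fields plus one large text field holding the Leiden transcription, can be mocked up with Python's built-in `sqlite3` module. The schema, column names, and the three rows are invented for illustration and do not come from any real epigraphy database. The sketch shows both the convenience of the approach (a one-line full-text query) and its limit: a plain string match cannot tell attested text from restored text and misses abbreviated forms.

```python
import sqlite3

# In-memory database with a hypothetical minimal schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inscriptions ("
    "id INTEGER PRIMARY KEY, findspot TEXT, leiden_text TEXT)"
)
conn.executemany(
    "INSERT INTO inscriptions (findspot, leiden_text) VALUES (?, ?)",
    [
        ("Site A", "Imp(erator) Caesar"),    # word attested in full
        ("Site B", "Imp(erator) [Caesar]"),  # word restored by the editor
        ("Site C", "Imp(erator) Caes(ar)"),  # word abbreviated on the stone
    ],
)

# A typical full-text search over the raw Leiden transcription.
rows = conn.execute(
    "SELECT id FROM inscriptions WHERE leiden_text LIKE ? ORDER BY id",
    ("%Caesar%",),
).fetchall()
matching_ids = [row[0] for row in rows]
print(matching_ids)
conn.close()
```

In this toy corpus the search returns the attested and the restored occurrence together and skips the abbreviated one, so answering the kinds of questions Cayless et al. raise would require structured markup rather than a single text field.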

Another significant issue is that the way databases or their interfaces are designed can greatly influence the types of questions that can be asked. Making arguments similar to those of Dunn (2009) and Bodard and Garcés (2009), Cayless et al. argued that technical decisions such as the creation of a database are also “editorial and scholarly decisions” and that access to raw data is required to give users the ability to both examine and correct those decisions. Long-term digital repositories for inscriptions thus have at least two major requirements: (1) the ability to export part or all of the data in standard formats; and (2) persistent identifiers (such as digital object identifiers [DOIs]) at the level of a digital object so that they can be used to cite these objects independently of the location from which they were retrieved. As Cayless et al. explain, in a future where published digital inscriptions may be stored in various locations, the ability to cite items using persistent identifiers will be very important. They see EpiDoc as a key component of such a future digital infrastructure for epigraphy, because it could serve not only as an interchange format but also as a means of storing, distributing, and preserving epigraphic data in a digital format.
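The two repository requirements named above, export of part or all of the data and citation by persistent identifier regardless of location, can be sketched as follows. This is a toy model under stated assumptions: the class names, the `pid:example/...` identifiers, and the example URLs are hypothetical, and JSON stands in for a standard interchange format such as EpiDoc XML.

```python
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class InscriptionRecord:
    pid: str        # persistent identifier (hypothetical value)
    text: str
    findspot: str

class Repository:
    """A toy store that maps persistent identifiers to records."""

    def __init__(self, location):
        self.location = location   # where this copy happens to live
        self._records = {}

    def add(self, record):
        self._records[record.pid] = record

    def resolve(self, pid):
        """Look a record up by its identifier, not by its location."""
        return self._records[pid]

    def export(self, pids=None):
        """Export part (the given pids) or all of the data as JSON."""
        if pids is None:
            selected = list(self._records.values())
        else:
            selected = [self._records[p] for p in pids]
        return json.dumps([asdict(r) for r in selected], sort_keys=True)

    @classmethod
    def from_export(cls, location, payload):
        """Build a second copy of the data at another location."""
        repo = cls(location)
        for item in json.loads(payload):
            repo.add(InscriptionRecord(**item))
        return repo

primary = Repository("https://example.org/a")
primary.add(InscriptionRecord("pid:example/0001", "dis manibus", "Roma"))
primary.add(InscriptionRecord("pid:example/0002", "votum solvit", "Ostia"))

# A distributed copy: the same identifier resolves to the same object.
mirror = Repository.from_export("https://example.net/b", primary.export())
same = mirror.resolve("pid:example/0001") == primary.resolve("pid:example/0001")
```

Because the citation uses the identifier rather than the repository address, it remains valid whichever copy a reader consults.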

All of these arguments lead the authors to one fundamental conclusion about epigraphy, namely, that inscriptions are texts in complex environments, not just physical objects:

This fact argues for treating them from the start as complex digital packages with their own deep structure, history, and associated data (such as images), rather than as simple elements in a standardized collection of data. Rather than engineering applications on top of a data structure that does not correspond well to the nature of the source material, we would do better to construct ways of closely representing the physical and intellectual aspects of the source in digital form, and then find ways to build upon that foundation (Cayless et al. 2009).

As indicated above, much of the earlier research that focused on inscriptions has investigated their archaeological context. The arguments made by Cayless et al. emphasize the need for inscriptions to be considered not just as simple data elements but also as complex digital objects with both a text inscription and an archaeological context. For epigraphy databases to support the growing field of digital epigraphy, Cayless et al. concluded that a mass of epigraphical data would need to be made available and that better tools would also be needed to gather, analyze, and publish those data.

Despite Cayless et al.’s strong arguments in favor of greater adoption of EpiDoc, Bauer et al. (2008) have countered that EpiDoc has some specific limitations, particularly in regard to the development of philological critical editions. According to the project website, Hypereidoc 319 is an “XML based framework supporting distributed, multi-layered, version-controlled processing of epigraphical, papyrological or similar texts in a modern critical edition.” Bauer and colleagues argued that EpiDoc has limitations in terms of its expressive power and how individual results can be combined to form a cooperative product. They maintained that their proposed Hypereidoc framework provided an “XML schema definition for a set of annotation-based layers connected by an extensive reference system, validating and building tools, and an editor on-line visualizing the base text and the annotations” (Bauer et al. 2008). Their framework has been successfully tested by philologists working on the Hypereides palimpsest. 320

The creation of digital transcriptions of epigraphic or papyrological texts, according to Bauer et al., requires a model that supports multiple levels of annotation to a base text:

Annotations may mark missing, unreadable, ambiguous, or superfluous parts of text. They should also quote information about the reason of the scholar’s decision e.g. other document sources, well-accepted historical facts or advances in technology. Annotations also provide meta-information about the author of the individual critical notes and expose the supposed meaning according to the given scholar. It is of a primary importance that no information should be lost during the transcription process (Bauer et al. 2008).

Although they noted that the Leiden conventions are the most accepted set of rules and that EpiDoc did meet some of their needs, they also argued that digital critical editions would require a base text layer that always remained untouched. They developed a text model for annotation, VITAM (Virtual Text-Document Annotation Model), which contains “virtual text-documents as data items and annotation sequences and virtual text documents’ merging as operations.” Their multilayered XML schema model is based on TEI and EpiDoc and defines a “Base Text Layer” that stores just the original text and its physical structure, an “Ordering and Indexing Layer” that defines the order of pages and their place in codices, and one or more “Annotation Layers” that store attached philological metadata. This model, they argue, better supports the creation of collaborative critical editions:

Philologists can define their own Annotation Layers which may refer to only the Base Text Layer or one or more Annotation Layers. They can add notes and annotations to the original text and to previous annotations, they can make reflections on earlier work or create a new interpretation. We have designed a schema to handle these references and to support the distributed and collaborative work with using more Annotation Layers in one edition (Bauer et al. 2008).

319 http://hypereidoc.elte.hu/

320 The Hypereides palimpsest is part of the larger Archimedes palimpsest, and for some of this work on Hypereides, see Tchernetska et al. (2007). The XML transcription of Hypereides can also be downloaded at http://hypereidoc.elte.hu/hypereides/downloads/hypereides-full.xml
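The layered model that Bauer et al. describe, with an untouched base text and annotation layers that may refer to the base text or to earlier annotations, can be approximated in a few lines of Python. This is a rough sketch only: the class names, the `layer#index` reference strings, and the use of plain character offsets are assumptions made for illustration and are not taken from the Hypereidoc schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(frozen=True)
class BaseTextLayer:
    text: str   # frozen: the transcribed base text is never modified

@dataclass
class Annotation:
    author: str
    start: int                         # character offsets into the base text
    end: int                           # exclusive
    note: str
    replies_to: Optional[str] = None   # "layer#index" of an earlier note

@dataclass
class AnnotationLayer:
    name: str
    annotations: List[Annotation] = field(default_factory=list)

    def add(self, annotation):
        """Store an annotation and return a reference other layers can cite."""
        self.annotations.append(annotation)
        return f"{self.name}#{len(self.annotations) - 1}"

@dataclass
class Edition:
    base: BaseTextLayer
    layers: List[AnnotationLayer] = field(default_factory=list)

    def notes_on(self, position):
        """All notes, from every layer, that cover a base-text position."""
        return [a.note for layer in self.layers for a in layer.annotations
                if a.start <= position < a.end]

base = BaseTextLayer("ABCDEFGHIJ")
first = AnnotationLayer("scholar_a")
ref = first.add(Annotation("A", 2, 5, "letters uncertain"))
second = AnnotationLayer("scholar_b")
second.add(Annotation("B", 2, 5, "disagree: reading is secure", replies_to=ref))
edition = Edition(base, [first, second])
```

Keeping each scholar's notes in a separate layer means that two conflicting interpretations can coexist in one edition without either altering the base text.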

At the same time, the authors noted that to make exact references to any point in the text, they needed to be able to describe the physical structure of the text. The base text layer was stored as a basic TEI document, and as the palimpsest provided an existing physical structure with “codices, quires, leaves, sides, columns and lines,” these were used as the primary structure for their reference system in order to define exact references to specific document parts. Such references were needed to support “philological processing,” text annotation, and mapping between images and transcription. The authors did not discuss whether they considered using Canonical Text Services (CTS) references.

Bauer et al. argued that TEI P4 and EpiDoc were less useful than the Hypereidoc model in the creation of philological annotations because they required that such annotations be stored in the form of XML tags inserted into a document, meaning that annotations could be embedded by philologists only if the tags were balanced (because of the need for a well-formed XML document). Their proposed solution was to develop a reference system based on the physical structure of the document. “This enables the handling of any overlapping annotation,” Bauer et al. stated. “With this reference system missing word and sentence boundaries can easily be described, even if interpreted differently by various philologists. Punctuations missing from the document can also easily be coded.”
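The difficulty with inline tags can be made concrete with a small sketch. Two annotations that cross each other cannot both be written as balanced XML elements, but they pose no problem when each is stored as a reference into an unmodified text. The example below uses plain character offsets as a simplification (Hypereidoc bases its references on the physical structure of the document), and the sample text, names, and labels are invented.

```python
from dataclasses import dataclass

BASE_TEXT = "the quick brown fox"

@dataclass(frozen=True)
class Span:
    start: int
    end: int      # exclusive
    label: str

def overlaps_without_nesting(a, b):
    """True if two spans cross, so inline XML tags could not be balanced."""
    crosses = a.start < b.start < a.end < b.end
    crosses_reversed = b.start < a.start < b.end < a.end
    return crosses or crosses_reversed

def text_of(span, base=BASE_TEXT):
    """Return the stretch of base text that a span refers to."""
    return base[span.start:span.end]

# Two philologists mark different, crossing phrase boundaries.
reading_a = Span(4, 15, "phrase per philologist A")
reading_b = Span(10, 19, "phrase per philologist B")
```

Here `reading_a` covers "quick brown" and `reading_b` covers "brown fox"; both can be recorded side by side because neither is embedded in the text itself.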

The unique nature of the palimpsest, however, with secondary text written over other original texts, led them to define the text of Archimedes and Hypereides as the “undertext” and the new texts that were written over them as the “overtext.” The page numbering of the “overtext” was used as the base for their reference system, and they defined their “ordering and indexing layer” independently from the “base text layer” and stored these data in an external XML file, noting that philologists would not necessarily agree on a page order and might want to use their own “ordering and indexing layer.” While the “base text layer’s” physical structure was based on the overtext, only pages were identified with overtext leaf and side while columns and lines were marked using the undertext so that the lines of Hypereides text could be specifically identified.

The Hypereidoc reference system supported three types of references: absolute references that point at a character position in the base text; internal relative references that point to a character position in the text “inserted by a previous annotation in the same annotation layer”; and external relative references that point to a character position in the text “inserted by an annotation in a previous Annotation Layer.” Several types of annotation are supported, including embedded and overlapping annotations. They then developed a customized system that made use of the XML Pointer Language (XPointer), 321 which allows one “to point to an arbitrary point in an XML document.” While TEI P5 has developed specifications for the use of XPointer, 322 Bauer et al. criticized these guidelines for thinking about XPointer only as a pointer to an arbitrary tag rather than to an arbitrary position in a text, a feature that did not support the type of overlapping annotation that they needed. Nonetheless, wanting to maintain maximum compatibility with TEI P5, they made use of the and tags and published an additional “flat file format” of their publications that does not make use of XPointer.
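The three kinds of reference can be pictured with a short sketch in which each reference resolves to a single character. The data layout is a deliberate simplification and an assumption of this example: each layer is reduced to the list of text fragments that its annotations inserted, and none of the names below come from the Hypereidoc schema or from XPointer.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Reference:
    kind: str                          # "absolute", "internal", or "external"
    offset: int                        # character position in the target text
    annotation: Optional[int] = None   # index of the inserting annotation
    layer: Optional[str] = None        # earlier layer (external only)

def resolve(ref, base_text, layers, current_layer):
    """Return the character that a reference points at.

    `layers` maps a layer name to the list of text fragments inserted by
    that layer's annotations (a simplification made for this sketch).
    """
    if ref.kind == "absolute":
        return base_text[ref.offset]
    if ref.kind == "internal":
        return layers[current_layer][ref.annotation][ref.offset]
    if ref.kind == "external":
        return layers[ref.layer][ref.annotation][ref.offset]
    raise ValueError(f"unknown reference kind: {ref.kind}")

base_text = "ABCDE"
layers = {"layer1": ["xyz"], "layer2": ["pq"]}

absolute = resolve(Reference("absolute", 1), base_text, layers, "layer2")
internal = resolve(Reference("internal", 0, annotation=0),
                   base_text, layers, "layer2")
external = resolve(Reference("external", 2, annotation=0, layer="layer1"),
                   base_text, layers, "layer2")
```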

321 http://www.w3.org/TR/xptr-framework/

322 http://www.tei-c.org/release/doc/tei-p5-doc/en/html/SA.html#SATS



Bauer et al. argued that another important feature supported by their system is effective version control. The base text layer is read-only and all annotation layers are modeled as separate sequences. In practice, they use a web server that handles all service requests in a RESTful manner. 323 The “virtual text documents” are considered to be resources that can have the following version control operations: list (which shows the base text layer); create (which adds a new and time-stamped annotation); and show (which gets the appropriate version of the file). To support the creation of digital critical editions, Bauer et al. designed their own XML editor 324 that manages their custom XML schema and supports working with both the layered XML file created using the Hypereidoc schema and the flat file format used to create compatible TEI P5 documents. In sum, Bauer et al. illustrated the importance of developing schemas and tools that support the representation of multiple scholarly arguments in the creation of digital critical editions of texts, whether epigraphic or papyrological.
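The three version control operations named above (list, create, and show) can be imitated with an in-memory object. This sketch makes several assumptions: it omits the web server and HTTP layer entirely, it treats a version as the number of annotations recorded so far, and its class and method names are hypothetical rather than taken from the Hypereidoc service.

```python
from datetime import datetime, timezone

class VirtualTextDocument:
    """In-memory stand-in for one versioned resource (names are assumptions)."""

    def __init__(self, base_text):
        self._base_text = base_text   # read-only after creation
        self._annotations = []        # append-only list of (timestamp, note)

    def list(self):
        """Show the base text layer."""
        return self._base_text

    def create(self, note, when=None):
        """Append a time-stamped annotation and return the new version number."""
        stamp = when or datetime.now(timezone.utc)
        self._annotations.append((stamp, note))
        return len(self._annotations)

    def show(self, version=None):
        """Return the base text plus the annotations present at a version."""
        if version is None:
            version = len(self._annotations)
        notes = [note for _, note in self._annotations[:version]]
        return {"base": self._base_text, "annotations": notes}

doc = VirtualTextDocument("ABCDE")
v1 = doc.create("letter 2 unreadable")
v2 = doc.create("letter 2 restored from parallel text")
```

Because annotations are only ever appended, any earlier state of the document can be reconstructed by asking `show` for a lower version number.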

Online Epigraphy Databases

The sheer breadth of material available online and the active nature of this discipline are illustrated by the community-maintained blog Current Epigraphy, 325 which reports news and events in Greek and Latin epigraphy and publishes workshop and conference announcements, notices of new discoveries, and publications, and also provides descriptive links to digital epigraphy projects. Digital epigraphy projects vary greatly: some include a small number of inscriptions from a particular area, 326 while others include selected inscriptions in only Greek 327 or Latin 328 (typically with a chronological, geographic, or thematic focus), and some small projects focus on a particular type of inscription. 329 This subsection provides an overview of some of the larger projects 330 and research that explores the major challenges facing this field in the digital world.

One of the largest Latin inscription projects available online is the Corpus Inscriptionum Latinarum (CIL), 331 a website that provides descriptive information regarding the CIL publication series and limited digital access to some of the inscriptions. Theodor Mommsen formed the CIL in 1853, with the purpose of collecting and publishing all Latin inscriptions in an organized and scientific manner, and the publication of new volumes and the reissue of edited volumes continue. 332 The CIL contains Latin inscriptions from the entire former Roman Empire, and publications were arranged by region and inscription type. Electronic access to a “collection of squeezes, photographs and bibliographical references maintained by the CIL research centre, sorted by inscription-number” is provided through the “Archivum Corporis Electronicum.” 333 The database can be searched only by CIL volume and inscription number, but records for individual inscriptions can contain digital images of the inscriptions and squeezes as well as a selected bibliography. A variety of other resources are available from this website, including a glossary of stone types used for inscriptions and a concordance to the CIL.

323 REST stands for “REpresentational State Transfer,” and RESTful web services have “a key design idiom that embraces a stateless client-server architecture in which the web services are viewed as resources and can be identified by their URLs. Web service clients that want to use these resources access a particular representation by transferring application content using a small globally defined set of remote methods that describe the action to be performed on the resource. REST is an analytical description of the existing web architecture, and thus the interplay between the style and the underlying HTTP protocol appears seamless.” http://java.sun.com/developer/technicalArticles/WebServices/restful/

324 http://hypereidoc.elte.hu/i=editor/index

325 http://www.currentepigraphy.org/

326 Such as the Cyprus Inscriptions Database (http://paspserver.class.utexas.edu/cyprus/)

327 See, for example, “Poinikastas: Epigraphic Sources for Early Greek Writing,” http://poinikastas.csad.ox.ac.uk/

328 For example, see “Imagines Italicae” (http://icls.sas.ac.uk/imaginesit/)

329 See, for example, “Curse Tablets from Roman Britain,” http://curses.csad.ox.ac.uk/index.shtml

330 For a growing list of projects, see http://delicious.com/AlisonBabeu/clir-review+epigraphy

331 http://cil.bbaw.de/cil_en/dateien/forschung.html

332 Currently there are 17 volumes in 70 parts (holding 180,000 inscriptions) with 13 supplementary volumes that include illustrations and indexes; see http://cil.bbaw.de/cil_en/dateien/cil_baende.html. Websites have also been created for some individual volumes, such as a new edition of Corpus Inscriptionum Latinarum, vol. II: Inscriptiones Hispaniae Latinae (http://www2.uah.es/imagines_cilii/)

333 http://cil.bbaw.de/cil_en/dateien/hilfsmittel.html

Another significant database of Latin inscriptions is the Epigraphik Datenbank Clauss-Slaby (EDCS), 334 which, according to the website, “records almost all Latin inscriptions.” As of April 2010, the EDCS included more than 539,000 sets of data for 381,170 inscriptions from over 900 publications covering more than 19,000 places with over 32,000 pictures. Inscription texts are typically presented without abbreviations and as completely as possible, with only a few special characters to indicate missing texts. A full list of the publications included in this database is provided, and the EDCS provides coverage of inscriptions in the CIL and many other major corpora. Users can search for inscriptions by text (of the Latin transcription), publication name, and province or place of inscription (or a combination of these). A full list of relevant publications is given for an inscription search, and each inscription record contains abbreviated publication information, the province and place of the inscription, and a Latin transcription. Links are occasionally provided to images of these inscriptions in other databases.

The largest database of Greek inscriptions online appears to be the PHI-funded “Searchable Greek<br />

Inscriptions,” 335 a project managed by the Greek Epigraphy Project at Cornell University with<br />

significant support from Ohio State University. The website describes the collection as a “scholarly<br />

work in progress” and it is frequently updated. The Greek inscriptions are organized by 15 regions<br />

(Attica, Asia Minor, North Africa, etc.). After selecting a region, the user chooses from a variety of<br />

options, such as main corpora, regional and site corpora, miscellaneous collections, miscellaneous<br />

books, or journals. After choosing one of these options, the user is presented with a browsable list of<br />

inscription numbers; choosing one of these numbers then allows the user to view the entire Greek<br />

inscription. The whole collection or specific regions can be searched in Greek or Latin. In addition,<br />

there is a concordance feature where a user can type a search pattern and the keyboard emulates an<br />

Ibycus keyboard. The user can then launch a concordance search for that term.<br />

One of the largest epigraphy resources for both Greek and Latin is EAGLE (Electronic Archive of<br />

Greek and Latin Epigraphy), 336 a federated database that searches across four epigraphical archives:<br />

Epigraphische Datenbank Heidelberg (EDH), 337 Epigraphic Database Roma (EDR), 338 Epigraphic<br />

Database Bari (EDB), 339 and Hispania Epigraphica (HE). 340 Each of these databases contains<br />

inscription texts, metadata, and, in some cases, images for Greek and Latin inscriptions. While the<br />

EDB contains inscriptions from Rome only, EDR includes Latin inscriptions from Rome and greater<br />

Italy, and the HE contains Latin inscriptions and images from Spain. The EDH is a far larger database<br />

that contains both Latin and Greek inscriptions. EAGLE provides federated searching of these four<br />

databases, and clicking on search results takes the user to the individual epigraphy databases. The<br />

website and searching are currently available only in Italian, but there are plans to create English,<br />

German, French, and Spanish interfaces.<br />
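The federated-search pattern described for EAGLE can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not EAGLE's actual implementation: the function name `federated_search`, the in-memory `ARCHIVES` dictionary, and every record identifier and text in it are invented placeholders standing in for the four live archives.

```python
def federated_search(term, archives):
    """Search every archive for term and tag each hit with its source archive."""
    needle = term.lower()
    hits = []
    for archive_name, records in archives.items():
        for record_id, text in records.items():
            if needle in text.lower():
                # Keeping the archive name lets a front end send the user on
                # to the individual database that holds the record.
                hits.append({"archive": archive_name, "id": record_id, "text": text})
    return hits


# Invented stand-ins for the four archives; ids and texts are placeholders.
ARCHIVES = {
    "EDH": {"edh-1": "placeholder text alpha", "edh-2": "placeholder text beta"},
    "EDR": {"edr-1": "placeholder text alpha"},
    "EDB": {"edb-1": "placeholder text gamma"},
    "HE": {"he-1": "placeholder text beta"},
}

for hit in federated_search("alpha", ARCHIVES):
    print(hit["archive"], hit["id"])
```

Each hit carries the name of the archive it came from, which mirrors the behavior described above: the federated layer only locates records, and the full record is viewed in the originating database.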

The EDH database, which can be accessed through EAGLE, is a longstanding project in its own right<br />

that seeks to integrate Latin inscriptions from all parts of the Roman Empire. Since 2004, the EDH has<br />

334 http://www.manfredclauss.de/gb/index.html<br />

335 http://epigraphy.packhum.org/inscriptions/main<br />

336 http://www.eagle-eagle.it/, and for more on the early development of EAGLE, see Pasqualis dell’Antonio (2005).<br />

337 http://www.uni-heidelberg.de/institute/sonst/adw/edh/index.html.en<br />

338 http://www.edr-edr.it/<br />

339 http://www.edb.uniba.it/<br />

340 http://www.eda-bea.es/pub/search_select.php?newlang=en



also entered Greek inscriptions from this same chronological time span. The EDH consists of three<br />

databases: the Epigraphic Text Database, the Epigraphic Bibliography (EBH), and the Photographic<br />

Database. While the Epigraphic Text Database contains more than 56,000 inscriptions, including many<br />

that were published outside of the standard major editions, the EBH contains 12,000 bibliographic<br />

records concerning monographs, journal articles, and other sources of secondary literature regarding<br />

inscriptions in the EDH, and the Photographic Database includes more than 11,000 photographs of<br />

inscriptions from various countries. While all three of these individual databases can be searched<br />

separately, photos and bibliographic information are often found within inscription records. The<br />

records for inscriptions in the EDH are very detailed and typically include a unique EDH identifier,<br />

images, and historical and physical information, as well as transcriptions. The EDH is also<br />

participating in two major data-integration projects, EAGLE and the Concordia Initiative.<br />

ConcEyst 341 (short for Das Eichstätter Konkordanzprogramm zur griechischen und lateinischen<br />

Epigraphik) is another large database of Greek and Latin inscriptions (from the Roman province of<br />

Pontus-Bithynia) available for download online. This database is continuously updated, and both it and<br />

a search interface in concordance format can be downloaded.<br />

While a large number of epigraphic resources online are dedicated to Greek and Latin, there are also<br />

major epigraphic resources for inscriptions in other languages. The Bibliotheca Alexandrina has<br />

created a “Digital Library of Inscriptions and Calligraphies” 342 that is creating a comprehensive digital<br />

collection of inscriptions in Ancient Egyptian (Hieroglyphic, Hieratic, Demotic, and Coptic scripts),<br />

Arabic, Turkish, Persian, and Greek. This collection also includes inscriptions bearing the Thamodic,<br />

Musnad, and Nabatean scripts. This whole collection of inscriptions can be searched and browsed by<br />

language. For each inscription record, a digital image of the object, building, or monument from which<br />

the inscription came is provided, along with a description and historical information regarding the<br />

object, the text of the inscription on the object, a transliteration, and often an English translation. This<br />

website’s presentation of both the archaeological object on which the inscription is found and a<br />

full-text transliteration helps emphasize the unique nature of inscriptions as both texts and<br />

archaeological objects with a context.<br />

Another major epigraphical project with a significant concentration outside of Greek and Latin is the<br />

“Inscriptions of Israel/Palestine,” 343 which seeks to “collect and make accessible over the Web all of<br />

the previously published inscriptions (and their English translations) of Israel/Palestine from the<br />

Persian period through the Islamic conquest (ca. 500 BCE–640 CE).” According to the website, there<br />

are about 10,000 such inscriptions, written primarily in Hebrew, Aramaic, Greek, and Latin, by Jews,<br />

Christians, and pagans. These inscriptions are quite varied and have never been collected or published<br />

in a systematic fashion. This project is a collaborative effort supported by Brown University and a<br />

variety of partners, and their goal is to gather all these inscriptions together and publish them online as<br />

a scholarly resource. Each inscription will be converted into a tagged XML text that is compatible with<br />

EpiDoc, but this project’s markup and DTD have some differences:<br />

Rather than treating the text as the primary object (with the goal of moving it relatively easily<br />

to publication), our mark-up treats the inscribed object as primary. The metadata, which is<br />

specified with great detail to allow for database-like searching, is all put into the Header. The<br />

Header contains information such as type of object; date range; locations (present, find, and<br />

341 http://www.ku-eichstaett.de/Fakultaeten/GGF/fachgebiete/Geschichte/Alte Geschichte/Projekte/conceyst/<br />

342 http://www.bibalex.org/calligraphycenter/InscriptionsLibrary/Presentation/index.aspx?Lang=en<br />

343 http://www.stg.brown.edu/projects/Inscriptions/



original); type of inscription; language, etc.). Individual “div” sections contain the diplomatic<br />

and edited version of the texts, in their original languages, and an English translation. The<br />

source of each is always acknowledged. The DTD already contains a scheme for richly<br />

marking-up the contents of the texts themselves. We have recently decided only to mark textual<br />

and editorial features; content such as names, places, occupations, etc., will be added (perhaps<br />

in part through automated processes) at a later stage. Almost all of our tags follow, to the extent<br />

possible, the accepted TEI and EpiDoc usages.<br />
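The object-first layout described in this passage, with all searchable metadata in a header and the diplomatic text, edited text, and translation in separate “div” sections, can be illustrated with a small Python sketch. It is a rough illustration only: the element names (`record`, `header`, `body`, `objectType`, and so on) are assumptions made for this example and are not taken from the project's DTD, TEI, or EpiDoc, and the sample values are invented.

```python
import xml.etree.ElementTree as ET


def build_record(metadata, diplomatic, edited, translation):
    """Build one inscription record: metadata in the header, texts in divs."""
    root = ET.Element("record")
    header = ET.SubElement(root, "header")
    for name, value in metadata.items():
        ET.SubElement(header, name).text = value
    body = ET.SubElement(root, "body")
    sections = (("diplomatic", diplomatic), ("edited", edited), ("translation", translation))
    for div_type, text in sections:
        div = ET.SubElement(body, "div", {"type": div_type})
        div.text = text
    return root


def matches(record, **criteria):
    """Return True if every named header field has exactly the given value."""
    header = record.find("header")
    return all(header.findtext(name) == value for name, value in criteria.items())


# Invented sample values; field names are hypothetical, not the project's own.
sample = build_record(
    {
        "objectType": "ossuary",
        "dateRange": "placeholder date range",
        "findLocation": "placeholder site",
        "inscriptionType": "funerary",
        "language": "Aramaic",
    },
    diplomatic="[diplomatic transcription placeholder]",
    edited="[edited text placeholder]",
    translation="[English translation placeholder]",
)

print(matches(sample, objectType="ossuary", language="Aramaic"))
print(ET.tostring(sample, encoding="unicode"))
```

Because every descriptive field lives in the header, a query such as `matches(sample, language="Aramaic")` never has to touch the text sections, which is the “database-like searching” the quoted passage refers to.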

The project creators note that they would like to support a higher level of encoding, but limited staffing<br />

and funding options have made this impossible. A database of about 1,000 inscriptions is available,<br />

and users can search the inscription metadata, the text of the inscription and English translation, or<br />

both at the same time, using a variety of options.<br />

Another major resource is InscriptiFact, 344 an image database of inscriptions and artifacts that has been<br />

created as part of the West Semitic Research Project (WSRP) 345 at the University of Southern California. This<br />

database provides access to high-resolution images of inscriptions (papyri, incised inscriptions on<br />

stone and clay, cuneiform tablets, stamp seals, etc.) from the Near Eastern and Mediterranean worlds.<br />

The archive contains more than 250,000 images and is updated regularly. For access to the database,<br />

an application form must be faxed. Hunt et al. (2005) have described the creation of this database and<br />

the standards used in detail. InscriptiFact made use of many historical photographs that were often the<br />

only source of information for many inscriptions, but also used a number of advanced photographic<br />

techniques to better capture images of inscriptions on all types of objects and monuments.<br />

One major challenge was that fragments of different inscriptions or collections of fragments were often<br />

scattered among various museums, libraries, and archaeological collections. 346 To address this issue,<br />

Hunt et al. advised that a fairly complicated cataloging and data model needed to be developed:<br />

Data in InscriptiFact are organized around the concept of a text, rather than a digital object or a<br />

collection containing texts. A “text” in this context is a virtual object in that a given text may<br />

not physically exist at any one place in its entirety. That is, since text fragments are often found<br />

in scattered locations in various collections, InscriptiFact brings together images of a given text<br />

regardless of the location of individual parts of that text in institutions around the world (Hunt<br />

et al. 2005).<br />

The authors argued that neither FRBR nor unqualified Dublin Core 347 readily represented the type of<br />

metadata required by scholars of ancient texts. They noted that for such scholars, “a text is an<br />

intellectual concept” and scholarly cataloging must be created for all of its manifestations and include<br />

information about the physical objects that contain the inscription, the “intellectual work of the<br />

inscription itself,” as well as photographic images and digital images. Hunt et al. suggested that the<br />

FRBR concept of the work could not be used for ancient inscriptions since scholars have only the<br />

inscribed physical object (manifestation) and certain texts may have been subdivided into many<br />

physical fragments. “It is the job of archaeologists, linguists, epigraphists, philologists, and other<br />

specialists to try to reconstruct the original text,” Hunt et al. (2005) maintained; “that is, figure out<br />

what pieces fit together, how the text is organized, when it was inscribed, and what, in fact, the<br />

344 http://www.inscriptifact.com/<br />

345 http://www.usc.edu/dept/LAS/wsrp/<br />

346 This is similar to the problem communicated by Ebeling (2007) regarding the need to create “composite” texts for Sumerian cuneiform tablets that were<br />

physically fragmented.<br />

347 http://dublincore.org/



intellectual content of the inscription might be.” They concluded that it was not useful to separate the<br />

intellectual work of the text from the physical object or objects upon which it was inscribed.<br />

Within InscriptiFact, the intellectual work has been defined as the inscription within the context of the<br />

physical object where it was inscribed. Since inscriptions do not have expressions as defined by FRBR,<br />

they used the Dublin Core element relation to map relationships between the textual content of an<br />

inscription and instances of that archaeological/physical context. While the basic objects that are<br />

delivered to users are digital images and would seem to correspond to FRBR items, Hunt et al. insisted<br />

that images of complicated objects (e.g., a plate of fragments) that can include multiple texts illustrate<br />

that with inscriptions there can be a “many to many” relationship between works and items. Their final<br />

approach was to separate cataloging for the text (the inscription or work) from the images (the digital<br />

objects or items) and to extend qualified Dublin Core to “include an additional qualifier to denote<br />

manifestation.”<br />
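The “many to many” relationship between texts and images can be made concrete with a small Python sketch. It is a simplified illustration under stated assumptions rather than InscriptiFact's actual schema: the class names, the `relation` field (which only loosely plays the role of the Dublin Core relation element), and all identifiers and institution names are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class TextRecord:
    """Catalog entry for a text (the inscription, or 'work')."""
    text_id: str
    description: str


@dataclass
class ImageRecord:
    """Catalog entry for one digital image (the 'item')."""
    image_id: str
    institution: str
    # Ids of every text visible in this image; one image may show several.
    relation: list = field(default_factory=list)


class Catalog:
    """Keeps text and image cataloging separate and links them by id."""

    def __init__(self):
        self.texts = {}
        self.images = {}

    def add_text(self, record):
        self.texts[record.text_id] = record

    def add_image(self, record):
        self.images[record.image_id] = record

    def images_of(self, text_id):
        """All images of a text, wherever its fragments are held."""
        return [img for img in self.images.values() if text_id in img.relation]

    def texts_in(self, image_id):
        """All texts shown in one image, e.g. a plate of fragments."""
        return [self.texts[t] for t in self.images[image_id].relation]

    def institutions_holding(self, text_id):
        return sorted({img.institution for img in self.images_of(text_id)})


# Invented records: text-A is split across two institutions, and img-2 is a
# plate that shows fragments of two different texts.
catalog = Catalog()
catalog.add_text(TextRecord("text-A", "hypothetical fragmentary text A"))
catalog.add_text(TextRecord("text-B", "hypothetical fragmentary text B"))
catalog.add_image(ImageRecord("img-1", "Museum X", ["text-A"]))
catalog.add_image(ImageRecord("img-2", "Library Y", ["text-A", "text-B"]))

print(catalog.institutions_holding("text-A"))
```

In this sketch the text is the “virtual object” of the quoted passage: it has no single physical location, and its images are gathered by following the links from each image record.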

In addition to federated databases and digital collections of inscriptions, there are a number of reference tools now available online that have been created to assist scholars in finding inscriptions. The CLAROS: Concordance of Greek Inscriptions 348 database provides access to a computerized concordance of editions of ancient Greek inscriptions that have been published in the past 100 years. The fifth edition, issued in 2006, includes more than “450,000 equivalences coming from more than 750 collections.” This concordance provides limited links from the results of bibliographic searches to electronic versions of inscriptions in the Inscriptions of Aphrodisias, the Greek Epigraphy Project of the PHI, and some texts from Egypt published in the Duke Data Bank of Documentary Papyri. A full list of collections included in the concordance is provided, as well as a list of abbreviations used for classical and epigraphical publications. The sheer breadth of this database illustrates how many Greek inscriptions have been published in multiple editions and the consequent challenges of integrating links to electronic versions of these inscriptions in other databases.
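The idea of a concordance of “equivalences” can be illustrated with a minimal sketch. It assumes that each equivalence is simply a pair of edition references denoting the same inscription; the function name and the sample references (`Corpus A 12` and so on) are hypothetical placeholders and are not drawn from CLAROS, whose internal data model is not described here.

```python
from collections import defaultdict


def build_concordance(equivalences):
    """Group edition references into sets that denote the same inscription.

    `equivalences` is an iterable of (reference, reference) pairs. The result
    maps every reference to the frozenset of all references equivalent to it.
    """
    parent = {}

    def find(ref):
        # Union-find lookup with path halving.
        parent.setdefault(ref, ref)
        while parent[ref] != ref:
            parent[ref] = parent[parent[ref]]
            ref = parent[ref]
        return ref

    for left, right in equivalences:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups = defaultdict(set)
    for ref in list(parent):
        groups[find(ref)].add(ref)
    return {ref: frozenset(members)
            for members in groups.values() for ref in members}


# Hypothetical references, for illustration only.
pairs = [
    ("Corpus A 12", "Corpus B 7"),
    ("Corpus B 7", "Journal C 3.45"),
    ("Corpus A 13", "Corpus D 1"),
]
concordance = build_concordance(pairs)
print(sorted(concordance["Journal C 3.45"]))
```

Because equivalence is transitive, a lookup on any one edition returns every other known edition of the same inscription, which is the kind of cross-edition linking the paragraph above describes as challenging at scale.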

EpiDoc-Based Digital Epigraphy Projects<br />

The Inscriptions of Aphrodisias (ALA 2004) 349 website provides access to the electronic second edition of “Aphrodisias in Late Antiquity: The Late Roman and Byzantine Inscriptions” by Charlotte Roueché of King’s College London. This second edition has been expanded and revised from the version published by the Society for the Promotion of Roman Studies in 1989. Charlotte Roueché (2009) has explained the process of creating this website in detail and, as noted by Cayless et al. (2009) earlier, pointed out that inscriptions have two identities, both as a text and as an “archaeological object with a context.” Despite this identity as a text, Roueché remarked that inscribed texts had often been omitted from the literary canon. As an example, she noted that there were two verse inscriptions from Aphrodisias on the same block, but only one was quoted in the Greek Anthology and thus ended up in the TLG; the other never entered the literary tradition.

While ALA (2004) includes about 2,000 inscriptions, Roueché stressed that the sheer scale of inscriptions almost necessitates electronic publication. She reported that she was introduced to EpiDoc by her colleagues Tom Elliott and Charles Crowther and thus felt that one important question to consider was how to bring together domain specialists with the technical experts that can help them. 350

348 http://www.dge.filol.csic.es/claros/cnc/2cnc.htm

349 http://insaph.kcl.ac.uk/ala2004/index.html

350 For a brief look at some earlier work done to bring together epigraphers and technical specialists in order to help design better tools for use with EpiDoc, see Cayless (2003).



To begin this process, Roueché and others started the EpiDoc Aphrodisias Project (EPAPP) 351 in 2002 to develop tools for presenting Greek and Latin inscriptions on the Internet using EpiDoc. The project held two workshops in the United States and United Kingdom to get input from interested experts, and the initial outcome of this project was ALA 2004. After securing more grant funding, the project expanded ALA 2004 and also published IAph 2007 (the whole corpus is now referenced as InsAph). As ALA 2004 is considered to be the second edition of her book, Roueché stated that the website that has been produced is stable and all inscriptions are citable. Furthermore, Roueché believed that simply creating ALA 2004 was important to demonstrate what was possible for an electronic publication of inscriptions. Although she obtained an ISBN for the website, she had trouble getting librarians at her institution to create a catalog record for it, thus reinforcing the idea that the website was not a real publication, a problem also reported by archaeologists interviewed by Harley et al. (2010).

Even more daunting than this difficulty were the challenges of data integration among different epigraphy projects. As demonstrated through even the brief survey conducted by this review, there are numerous epigraphy projects online, and Roueché reported that more pioneering work in digital epigraphy has involved Latin inscriptions. Nonetheless, as listed above, one of the major Greek inscriptions projects is PHI Greek Inscriptions, which also contains all the inscriptions from Aphrodisias published through 1993. In the future, Roueché hopes to embed PHI Greek identification numbers in the XML of inscriptions in ALA 2004 so that the PHI website can automatically receive updated information from ALA 2004.
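What embedding an external identifier in an inscription’s XML might look like can be sketched as follows. This is only an illustration under stated assumptions: the sample document is a simplified, invented TEI-like header, the choice of an `idno` element with `type="PHI"` is one plausible encoding rather than the project’s actual practice, and the identifier `PH000000` is a placeholder.

```python
import xml.etree.ElementTree as ET

# Simplified, invented header; real ALA 2004 files will differ.
SAMPLE = (
    "<TEI.2><teiHeader><fileDesc>"
    "<titleStmt><title>Sample inscription</title></titleStmt>"
    '<publicationStmt><idno type="filename">sample001</idno></publicationStmt>'
    "</fileDesc></teiHeader><text><body/></text></TEI.2>"
)


def add_phi_id(xml_text: str, phi_id: str) -> str:
    """Record a PHI identifier in the header, updating it if already present."""
    root = ET.fromstring(xml_text)
    pub = root.find(".//publicationStmt")
    if pub is None:
        raise ValueError("document has no publicationStmt element")
    idno = pub.find("idno[@type='PHI']")
    if idno is None:
        # Assumed encoding: one <idno type="PHI"> per inscription.
        idno = ET.SubElement(pub, "idno", {"type": "PHI"})
    idno.text = phi_id
    return ET.tostring(root, encoding="unicode")


updated = add_phi_id(SAMPLE, "PH000000")  # placeholder identifier
print(updated)
```

With an identifier stored in a predictable place, an external site could in principle match its own records against the published XML and pull in corrections, which is the kind of automatic updating Roueché envisaged.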

Fundamental to the problem of data integration, Roueché asserted, is convincing more epigraphists to take up the EpiDoc standard. One means of doing this, she concluded, was to demonstrate that EpiDoc was not a radical shift but rather an extension of how epigraphists have always worked:

The aim is to get epigraphers to perceive that EpiDoc encoding simply represents an extension of the approach which produced the Leiden conventions. As often in humanities computing, it is important to demonstrate that the intellectual activities and processes in what appear to be separate fields are in fact closely related (Roueché 2009, 165).

Another problem, according to Roueché, was that many humanities scholars wanted simply to define a problem and then have technicians solve all of the challenges in creating a digital solution, a situation she rightly concluded was not viable. Key to solving this difficulty is the development of a common language, for she noted that both epigraphists and computer scientists have their own acronyms (CIL, XML). The most critical task, Roueché insisted, is to demonstrate the added scholarly value of electronic publication to epigraphists. Among the many benefits of electronic publication, perhaps the most significant Roueché listed was its ability to better accommodate the interdisciplinary nature of inscriptions as both literary texts and archaeological objects. The new ability to both disseminate and integrate inscriptions not only with other collections of inscriptions but also with papyri, manuscripts, or seals, Roueché hoped, would help “break down what have been essentially false distinctions between texts which all originate from the same cultural milieu, but are recorded on different media” (Roueché 2009, 167). As illustrated throughout the earlier discussion of archaeology, the ability to reintegrate the textual and material records in a digital environment is a critical issue that must be addressed. In addition, Roueché suggested that it is far easier to update a digital corpus of inscriptions than it is to modify a printed volume and that digital editions also make it simpler for scholars to collaborate. These new forms of collaboration, however, raise questions regarding data ownership and authorship credit. Indeed, many of the greatest challenges may be social rather than technical, Roueché concluded.

351 http://www.epapp.kcl.ac.uk/

The second major part of the InsAph corpus is IAph 2007, 352 which provides access to the first edition of an online corpus of inscriptions from Aphrodisias that were recorded up to 1994. All of the editions, translations, and commentary have been provided by Joyce Reynolds, Charlotte Roueché, and Gabriel Bodard. In addition, the inscriptions have been marked up using EpiDoc, and individual XML files for inscriptions or the entire XML repository of inscriptions can be downloaded from the site along with a DTD. 353 Slightly more than 1,500 inscriptions (one-third of which have not been previously published) are available through the database, and the whole collection can be searched by free text (Greek, Latin, and English in transcriptions and editions) or by category (such as date, object text, and text type). Browsing access to the inscriptions is also provided through different “Tables of Contents,” including locations, date, text categories, monument types, decorative features, and texts new to the edition. Interestingly, this database also provides a number of indexes to the inscriptions, including Greek words, Latin words, personal and place names, and several other characters and features. There is also a “Plan of Aphrodisias” 354 that allows users to choose a section of the city and then provides them with a list of inscriptions in that area. Each inscription record includes extensive information, multiple images, the Greek or Latin text, a diplomatic transcription, an EpiDoc XML file, and an English translation. In addition, each inscription has a citable and permanent URL. 355

Two recent articles by Gabriel Bodard (Bodard 2008, Bodard 2006) have looked at some of the issues regarding the benefits and difficulties of creating electronic publications of inscriptions such as IAph 2007. Bodard listed six features for analysis in terms of the opportunities of electronic publication: accessibility, scale, media, hypertext, updates, and iterative research and transparency. The digital publications of the inscriptions of Aphrodisias were the first major ones to adopt EpiDoc, Bodard explained, and the use of this standard based on a subset of the TEI guaranteed maximum compatibility with many other digital humanities projects. One important thing to note, Bodard declared, was that:

An essential concept behind EpiDoc is the understanding that this form of semantic markup is not meant to replace traditional epigraphic transcription based on the Leiden conventions. The XML may (and almost inevitably will) encode more information than the range of brackets and sigla used in Leiden, but there will always be a one-to-one equivalence between Leiden codes and markup features in the EpiDoc guidelines (Bodard 2008).

As did Cayless et al. (2009) and Roueché (2009), Bodard argued that encoding inscriptions in EpiDoc is a natural extension of the work that epigraphists already do.
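The correspondence between Leiden sigla and EpiDoc markup can be made concrete with a small sketch. It covers only four sigla, the element and attribute names follow this editor’s reading of the current EpiDoc Guidelines and may not match the encoding actually used in ALA 2004 or IAph 2007, and the sample line is invented for illustration. The function name `leiden_to_epidoc` is likewise hypothetical.

```python
import re
from xml.sax.saxutils import escape

# Each rule pairs one Leiden convention with one markup feature.
# The gap rule must run before the supplied rule, since both use brackets.
RULES = [
    # a(bc): abbreviation expanded by the editor
    (re.compile(r"(\w+)\((\w+)\)"),
     r"<expan><abbr>\1</abbr><ex>\2</ex></expan>"),
    # [...]: lost characters, one dot per character
    (re.compile(r"\[(\.+)\]"),
     lambda m: '<gap reason="lost" quantity="%d" unit="character"/>'
     % len(m.group(1))),
    # [abc]: lost text restored by the editor
    (re.compile(r"\[([^\[\]]+)\]"),
     r'<supplied reason="lost">\1</supplied>'),
    # ⟦abc⟧: text erased in antiquity
    (re.compile(r"⟦([^⟧]+)⟧"),
     r'<del rend="erasure">\1</del>'),
]


def leiden_to_epidoc(line: str) -> str:
    """Convert a small subset of Leiden sigla to EpiDoc-style elements."""
    result = escape(line)  # protect &, <, > before adding markup
    for pattern, replacement in RULES:
        result = pattern.sub(replacement, result)
    return result


# Invented sample line, not a real inscription.
encoded = leiden_to_epidoc("Imp(erator) [Caes]ar [...] ⟦Domitiano⟧")
print(encoded)
```

Each rule in the table stands for one siglum and one element, which is the one-to-one equivalence Bodard describes; the XML can then carry further attributes (such as the reason for a gap) that the printed brackets cannot express.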

To return to the six features listed by Bodard, he argued that the most obvious benefit of electronic publication was accessibility, and that publishing inscriptions online both serves as scholarly outreach and fosters interdisciplinarity. Second, as was also argued by Roueché, Bodard maintained that scale was one of the most significant opportunities for electronic publishing. He noted that unlike the first printed edition, ALA 2004 was able to include multiple photographs of each inscription, thus helping better set the inscriptions in their archaeological context. The scale of digital publishing also allowed for the inclusion of far more interpretative material and for the explanation and expansion of text that had once needed to be abbreviated by epigraphists and papyrologists for print publication:

… once the restrictions of a page limit are removed this abbreviated text can be expanded, conventions can be glossed, descriptions of comments can be repeated where they are relevant, and cross-references can be made more self-explanatory. This is not necessarily to reject generations of scholarship and academic jargon, which is familiar to practitioners of our disciplines and serves a useful function of communication in addition to space-saving. Rather, by expanding, explaining, and illustrating these conventional sigla and abbreviations we are enhancing our scholarly publication by making it more accessible to outsiders (Bodard 2008).

Making inscriptions more accessible to a general audience is perhaps one of the greatest benefits of publishing inscriptions online.

352 http://insaph.kcl.ac.uk/iaph2007/index.html

353 http://insaph.kcl.ac.uk/iaph2007/inscriptions/xml-repo.html

354 http://insaph.kcl.ac.uk/iaph2007/reference/plan.html

355 http://insaph.kcl.ac.uk/iaph2007/iAph040113.html - edition

The third feature listed by Bodard was media, and in addition to the far larger number of photographs, Bodard suggested that digital reconstruction of monuments with inscriptions could prove very useful. In addition, increased levels of geographical access can now be provided through digitizing maps and plans that can then be hyperlinked to editions of inscriptions. One potential avenue for exploration Bodard outlined would be the use of a global mapping API 356 such as that provided by Google Maps 357 “to plot not only findspots and ancient and modern locations of finds, but also places and other geographical entities named or implied in the texts themselves” (Bodard 2008). Elliott and Gillies (2009a) and Jeffrey et al. (2009a) have also suggested the utility of using modern mapping technology such as GIS and historical named-entity recognition techniques to provide better access to historical materials in archaeology and classical geography.

Perhaps the most significant impact of electronic publication on inscriptions, Bodard argued, is through his fourth feature, hypertext, with its myriad possibilities of supporting sophisticated linking and dynamic referencing. Internal hyperlinks within a publication, as Bodard noted, made it possible to create stronger links between data, narrative commentary, and other supporting materials. New ways of navigating the material become available as a user can go from a narrative commentary directly to an inscription, or from an inscription of interest directly to commentary. External hyperlinking to other projects (such as other inscription collections and secondary reference works) also offers powerful possibilities. Another important aspect of hyperlinking, Bodard underscored, was the potential of dynamic linking or “live hypertext sharing.” As both IAph 2007 and ALA 2004 provide downloadable EpiDoc XML files for the individual inscriptions and the entire corpus and also provide “transparent and predictable URLs for dynamic linking” (Bodard 2008), other projects can easily link to individual inscriptions or download the entire corpus for reuse. As far as Bodard knew, no other projects had yet reused any of the available EpiDoc files.
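How “transparent and predictable URLs” support linking from other projects can be sketched briefly. The pattern below is inferred solely from the single IAph 2007 address cited in this review (`http://insaph.kcl.ac.uk/iaph2007/iAph040113.html`); the assumption that every identifier is `iAph` followed by six digits, and both function names, are hypothetical.

```python
import re

BASE_URL = "http://insaph.kcl.ac.uk/iaph2007/"
# Assumed identifier shape, generalized from one cited example.
ID_PATTERN = re.compile(r"iAph\d{6}")
URL_PATTERN = re.compile(re.escape(BASE_URL) + r"(iAph\d{6})\.html(?:#.*)?")


def inscription_url(inscription_id: str) -> str:
    """Build the page address for an inscription from its identifier."""
    if not ID_PATTERN.fullmatch(inscription_id):
        raise ValueError(f"unexpected identifier: {inscription_id!r}")
    return f"{BASE_URL}{inscription_id}.html"


def inscription_id_from_url(url: str) -> str:
    """Recover the identifier from a page address, ignoring any fragment."""
    match = URL_PATTERN.fullmatch(url)
    if match is None:
        raise ValueError(f"not a recognized inscription URL: {url!r}")
    return match.group(1)


url = inscription_url("iAph040113")
print(url)
print(inscription_id_from_url(url + "#edition"))
```

Because the address can be derived from the identifier alone, an external database that records only the identifier can generate links without consulting a lookup service.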

The fifth feature listed by Bodard, the ability to easily update data, was also briefly discussed by<br />

Roueché. Although the possibility of <str<strong>on</strong>g>in</str<strong>on</strong>g>stantaneous updat<str<strong>on</strong>g>in</str<strong>on</strong>g>g can be very useful, Bodard proposed that<br />

some electr<strong>on</strong>ic publicati<strong>on</strong>s may need to be kept “static” not <strong>on</strong>ly because of the burden <strong>on</strong> the author<br />

but also because of the need for a stable <strong>and</strong> citable publicati<strong>on</strong> “that has to <str<strong>on</strong>g>in</str<strong>on</strong>g>teract with, <strong>and</strong> be<br />

356 An API, short for “Application Programming Interface” is a “set of routines, protocols, and tools for building software applications”<br />
(http://webopedia.com/TERM/A/API.html). According to Webopedia, a “good API makes it easier to develop a program by providing all the building<br />
blocks.” Programmers can then use these “building blocks” to more easily write applications that are consistent with a particular operating environment or<br />
software program.<br />
357 http://code.google.com/apis/maps/index.html



reviewed within, the world of peer reviewed, cited, traceable, and replicable scholarship.”<br />
Consequently, all versions (with their URLs) of an inscription must be maintained even if changes or<br />
corrections are made, in case a scholar has cited an earlier version of an inscription. For these reasons,<br />
as well as the challenges of project-based funding, the authors decided to create ALA 2004 and IAph<br />
2007 not as “living databases” but as more traditional “one-off publications.”<br />

The final feature analyzed by Bodard, the ability of electronic publication to support iterative research<br />
and transparency, was also previously discussed by this author and Garcés (2009) in terms of digital<br />
editions. The availability of source files and code makes research more replicable by other scholars<br />
because it provides access to primary source data; allows other scholars to examine the digital<br />
processes, markup, and techniques used to create the collection; and allows them to use their own<br />
algorithms and tools to create new digital editions. Such transparency is essential to all scholarly<br />
research, and Bodard concludes that:<br />

Even more central to the research process, however, is the fact that a true digital project is not<br />
merely the result of traditional classical research that is at the last minute converted to<br />
electronic form and made available online. Rather the XML files (in the case of Inscriptions of<br />
Aphrodisias and other EpiDoc projects, other data models for other types of project) that lie<br />
behind the publication, are the direct result of, and primary tools for, the academic research<br />
itself. These files contain the marked-up data, Greek or Latin texts, descriptions, editorial<br />
commentary and argumentation, references and metadata, all in machine-readable and<br />
actionable form. It is this single, structured collection of source data which is taken by the<br />
machine process and run through a series of XSLT stylesheets which generate, in turn, the web<br />
presentations of individual or groups of inscriptions, the contextual tables of contents, indices,<br />
concordances, prosopographical and onomastic tables, and so forth (Bodard 2008).<br />

Thus, the electronic files and source code that have been created for this publication are every bit as<br />
important as, if not more important than, the website created from them.<br />
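The workflow Bodard describes, in which one structured source generates several derived views, can be sketched in a few lines. This is only an illustration under stated assumptions: the projects used XSLT stylesheets, whereas Python stands in here; the sample records, the `inscription` wrapper element, and the `key` attribute are invented and heavily simplified (real EpiDoc files are namespaced TEI XML with far richer markup).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented, heavily simplified sample records (not real EpiDoc markup).
SAMPLE = """
<corpus>
  <inscription id="demo-001">
    <text>dedication by <persName key="Claudia">Claudia</persName>
      and <persName key="Zenon">Zenon</persName></text>
  </inscription>
  <inscription id="demo-002">
    <text>epitaph of <persName key="Zenon">Zenonos</persName></text>
  </inscription>
</corpus>
"""


def build_name_index(xml_source):
    """Map each normalized personal name to the inscriptions that mention it."""
    root = ET.fromstring(xml_source)
    index = defaultdict(list)
    for inscription in root.iter("inscription"):
        for pers in inscription.iter("persName"):
            # The (hypothetical) key attribute holds the normalized form,
            # so inflected spellings such as "Zenonos" file under "Zenon".
            index[pers.get("key")].append(inscription.get("id"))
    return dict(index)


def render_index(index):
    """Render the index as plain text, one line per name, sorted by name."""
    return "\n".join(
        f"{name}: {', '.join(ids)}" for name, ids in sorted(index.items())
    )


name_index = build_name_index(SAMPLE)
print(render_index(name_index))
```

Because the index is computed from the marked-up texts rather than compiled by hand, regenerating it after a correction to the source is a matter of rerunning the script, which is the property the passage above emphasizes.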

One project that is closely related both to ALA 2004 and to IAph 2007 is the Inscriptions of Roman<br />
Tripolitania (IRT 2009). 358 IRT 2009 is the enhanced electronic reissue of a publication that first<br />
appeared in 1952. Created by Gabriel Bodard and Charlotte Roueché, it is hosted by King’s College<br />
London. Electronic publication has allowed for the inclusion of the full photographic record of the<br />
original print publication and the linking of the inscriptions to maps and gazetteers. As with IAph<br />
2007, individual EpiDoc XML files and an entire XML repository of inscriptions can be<br />
downloaded. 359 There are a variety of ways to access the inscriptions. The chapters of the print<br />
publication have been made available online as text files with hyperlinks to the relevant inscriptions. In<br />
addition, as with IAph 2007, various tables of contents (locations, dates, text categories, monument<br />
types) and indexes (Latin words, Greek words, fragments of text, personal names, and other features)<br />
provide alternative means of access to the inscriptions. The website notes that indexes to this edition<br />
were generated from the texts themselves and thus differ from those of the print edition. Each<br />
inscription record is similar to those in IAph 2007, with one key difference: many of the records in IRT<br />
have “findspots” that have been linked to maps. 360 A map can be used to browse the collection of<br />
inscriptions with links provided to individual inscription records. Each inscription also has a citable<br />
and permanent URL.<br />

358 http://irt.kcl.ac.uk/irt2009/index.html<br />
359 http://irt.kcl.ac.uk/irt2009/inscr/xmlrepo.html<br />
360 For example, the map for the findspot “Sirtica,” http://irt.kcl.ac.uk/irt2009/maps/index.html?ll=31.2,16.5&z=10



Another related project that has recently begun is the Inscriptions of Roman Cyrenaica (IRCyr). 361<br />
This website provides access to inscriptions gathered by Joyce Reynolds of Newnham College,<br />
Cambridge, between 1948 and the present. The project draws on the experience gained in publishing<br />
ALA 2004 and IAph 2007, and its creators plan to present the documents online in a similar fashion to these<br />
websites and to link all inscriptions to an online map of Roman Cyrenaica that is being prepared by the<br />
Pleiades project. No inscriptions database is currently available at this website. IAph 2007, IRT 2009,<br />
and IRCyr are also participating in the Concordia project.<br />

Finally, another project that makes partial use of EpiDoc is the U.S. Epigraphy project, which is<br />
dedicated to collecting and digitizing Greek and Latin inscriptions but is focused on those preserved in<br />
the United States of America. 362 The project was founded at Rutgers University in 1995 and has been<br />
based at Brown University since 2003, where the present website was developed with help from the<br />
Scholarly Technology Group. Every inscription that has been cataloged by this project has been<br />
assigned a unique identifier or U.S. epigraphy number. The database of almost 2,500 Greek and Latin<br />
inscriptions can be browsed by publication or collection and searched for by language, place of origin,<br />
date, type of inscription, type of object, and material (among many other metadata categories) as well<br />
as by bibliographic information. According to the website, a “growing digital edition of the collection<br />
currently registers some 400 transcriptions of Latin texts encoded according to EpiDoc conventions<br />
and provides some 1,000 photographs and images of the inscriptions in our corpus.” This makes the<br />
U.S. Epigraphy project one of the first major projects to begin the encoding of its texts in EpiDoc.<br />

The Challenges of Linking Digital Epigraphy and Digital Classics Projects<br />

As the preceding overview of inscription projects demonstrated, there are records of many of the same<br />
inscriptions in various databases, and many databases have used their own technological<br />
implementations to provide access to collections online. The sheer scale of many such projects and the<br />
growing number of inscriptions available online require computational solutions.<br />

Recently Leif Isaksen has proposed the development of an “augmented reality mobile application”<br />
(such as for the iPhone) to support the “crowdsourcing” of epigraphy (Isaksen 2009). In theory, such<br />
an application could allow tourists or archaeologists to submit spatially located images of inscriptions<br />
to a central inscription database that could also include a website where corrections and translations of<br />
inscriptions could be proposed based on multiple images of inscriptions. Creating a central database<br />
would also support research work in advanced-imaging techniques for various cultural heritage<br />
projects.<br />

While a number of epigraphy websites make use of the EpiDoc standard, recent research by Álvarez et<br />
al. (2010) has noted that the use of EpiDoc alone is not enough to provide access to “open linked data”<br />
in that EpiDoc does not provide a way to encode “computational semantics” as it relies instead on<br />
“structured metadata with text fields.” The authors contended that the encoding of computational<br />
semantics for epigraphic data would provide an added value in that epigraphical information could<br />
then be reused by various Semantic Web applications. Similarly to Cayless et al. (2009), Álvarez et al.<br />
outlined a number of problems with using relational databases to represent epigraphical information,<br />
including the fact that because inscriptions are both texts and archaeological objects, a database of inscriptions must include<br />
both a full text of the inscription and a detailed description of its decoration and the place where it was<br />
discovered. In addition, inscriptions were written in numerous languages with various scripts, but there<br />

361 http://ircyr.kcl.ac.uk/<br />
362 http://usepigraphy.brown.edu/



are no established standards for “mixing scripts in a regular searching pattern” (Álvarez et al. 2010).<br />
One of the largest problems with current databases, however, according to these authors, is that<br />
although relational databases allow linking of data in different tables to indicate relationships, key<br />
entities in different inscriptions, such as personal and place names, are typically not normalized. As a<br />
solution to this problem and others, Álvarez et al. proposed the creation of an ontological schema<br />
based on EpiDoc. “Developing an ontological schema that allows for more normalized information<br />
structure,” Álvarez et al. argued, “has the added benefit of preparing epigraphic data to be shared on the<br />
Web via linked data, which opens new possibilities to relating information currently dispersed in<br />
several databases.”<br />

After mapping the EpiDoc schema to an “ontological representation expressed in the W3C OWL 363<br />
language,” Álvarez et al. provided an example of how a sample inscription encoded in EpiDoc XML<br />
would appear as represented by their ontology. One important advantage of an ontological<br />
representation for inscriptions that Álvarez et al. listed was the possibility of mapping the discrete<br />
information units found within inscriptions to other knowledge organization systems (e.g., mapping<br />
terms for civilization eras to the Getty Art & Architecture Thesaurus 364). Noting that EpiDoc also<br />
made use of some controlled vocabularies, such as one that describes monuments and objects that bear<br />
texts, 365 Álvarez et al. translated a number of these vocabularies into separate ontology modules that<br />
can also be reused separately. Another advantage of using an ontology for inscriptions with both a<br />
series of properties and inference rules is that far more precise searching of inscriptions can be<br />
conducted, such as far more complex named-entity searching (e.g., all parts of the “tria nomina” [the<br />
praenomen, nomen, and cognomen], as well as filiation, are encoded as separate data types associated<br />
with an inscription).<br />
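The gain in search precision from encoding each part of a name separately can be illustrated with a small sketch. It does not reproduce the OWL ontology of Álvarez et al.; the `PersonalName` class, its field names, and the sample records are assumptions introduced purely for illustration, and plain Python objects stand in for ontology properties.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonalName:
    """One name occurrence, with each component stored as its own field."""
    inscription_id: str
    praenomen: Optional[str]
    nomen: Optional[str]
    cognomen: Optional[str]
    filiation: Optional[str] = None


# Invented sample records; the inscription identifiers are hypothetical.
NAMES = [
    PersonalName("demo-001", "Marcus", "Tullius", "Cicero", "Marci filius"),
    PersonalName("demo-002", "Quintus", "Tullius", "Cicero", "Marci filius"),
    PersonalName("demo-003", "Gaius", "Iulius", "Caesar", "Gai filius"),
    PersonalName("demo-004", None, "Tullia", None),
]


def find_names(names, **criteria):
    """Return the names whose fields equal every given criterion exactly."""
    return [
        n for n in names
        if all(getattr(n, field) == value for field, value in criteria.items())
    ]


def free_text_search(names, fragment):
    """Naive substring search over the flattened name, for comparison."""
    hits = []
    for n in names:
        parts = (n.praenomen, n.nomen, n.cognomen, n.filiation)
        flattened = " ".join(p for p in parts if p)
        if fragment in flattened:
            hits.append(n)
    return hits


structured = [n.inscription_id for n in find_names(NAMES, nomen="Tullius")]
fuzzy = [n.inscription_id for n in free_text_search(NAMES, "Tulli")]
print(structured)
print(fuzzy)
```

The structured query returns only the two records whose nomen is exactly "Tullius", while the substring search also picks up "Tullia"; combining criteria (for example, `nomen="Tullius", praenomen="Quintus"`) narrows the result further, which free-text fields cannot do reliably.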

The OWL representation designed by Álvarez et al. avoided the use of free text strings whenever<br />
possible and instead treated all information as either properties or classes. All ontology elements in<br />
this representation are identified by the URI of the element (class, property, or data) that is referenced<br />
when connecting entries. They designed their implementation in this manner to support the exportation<br />
of epigraphic linked data for use by other applications, or essentially to create machine-actionable data:<br />

This idea brings a new dimension to the sharing of epigraphic information, allowing for<br />
software agents to consume RDF information for specific purposes, complementing interfaces<br />
oriented to use by humans. Each of their elements described in the previous section will be<br />
referenced by a unique address, a URI that enables an unambiguous, consistent, and permanent<br />
identification for inscriptions and all their associated information items. In our approach,<br />
information is already stored in OWL-RDF so that the main additional requirements are having<br />
a consistent URI design for the information items and deploying the providing services<br />
(Álvarez et al. 2010).<br />

One important use of URIs, Álvarez et al. noted, is that they could be used to link records for the same<br />
inscription found in different databases. While most epigraphical catalogs assign objects various codes<br />
depending on the organization system, and often also use standard reference identifiers (e.g., CIL #),<br />
Álvarez et al. pointed out that by using a linked-data approach, the suffix of a URI could be changed<br />
depending on the reference number, and the use of RDF triples and the predicate rdfs:seeAlso could be<br />

363 The OWL web ontology language has been created by the W3C to support further development of the Semantic Web, and the current recommendation<br />
for OWL 2 can be found at http://www.w3.org/TR/owl2-overview/<br />
364 http://www.getty.edu/research/tools/vocabularies/aat/index.html<br />
365 For example, the “Eagle/EpiDoc Object Type Vocabulary” can be found at http://edh-www.adw.uni-heidelberg.de/EDH/inschrift/012116



used to point between alternate descriptions of an inscription in different databases. An<br />
ontology and a linked-data approach could thus support the cross-linking of existing<br />
databases.<br />
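The cross-linking idea can be made concrete with a few RDF-style triples. The sketch below is not the implementation of Álvarez et al.: plain Python tuples stand in for an RDF store, every `example.org` URI is invented, and treating rdfs:seeAlso links as navigable in both directions is an assumption made here for convenience (the predicate itself only states that the target may provide additional information about the subject).

```python
from collections import defaultdict, deque

# Full URI of the rdfs:seeAlso predicate from the RDF Schema vocabulary.
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

# Invented (subject, predicate, object) triples for three imaginary databases.
TRIPLES = [
    ("http://example.org/db-a/inscription/0001", RDFS_SEE_ALSO,
     "http://example.org/db-b/record/77"),
    ("http://example.org/db-b/record/77", RDFS_SEE_ALSO,
     "http://example.org/db-c/item/x9"),
    ("http://example.org/db-a/inscription/0002", RDFS_SEE_ALSO,
     "http://example.org/db-c/item/y3"),
]


def related_records(triples, start):
    """Collect every record reachable from `start` through seeAlso links."""
    neighbours = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == RDFS_SEE_ALSO:
            # Assumption: links are followed in both directions.
            neighbours[subject].add(obj)
            neighbours[obj].add(subject)
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first walk over the link graph
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen - {start})


alternates = related_records(TRIPLES, "http://example.org/db-c/item/x9")
print(alternates)
```

Starting from the record in the third database, the walk recovers the descriptions of the same inscription held by the other two, even though no single triple connects the first and third databases directly.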

Linking the various epigraphy databases with other types of digital classics resources,<br />
such as papyrological databases, is also a growing challenge for which various solutions have been<br />
explored by projects such as the recently completed LaQuAT (Linking and Querying of Ancient<br />
Texts). 366 LaQuAT was a collaboration between the Centre for e-Research, King’s College,<br />
London, 367 and EPCC 368 at the University of Edinburgh. Two recent articles by Bodard et al. (2009)<br />
and Jackson et al. (2009) have described the goals of and technological challenges faced by this<br />
project. The LaQuAT project used OGSA-DAI, 369 open-source distributed data-management<br />
software, to create a demonstrator that provided uniform access to different epigraphic and<br />
papyrological resources. Basically the LaQuAT project sought to build a proof of concept that<br />
explored the possibilities of “creating virtual data centres for the coordinated sharing of such<br />
resources” and examined how “distributed data resources can be meaningfully federated and queried.”<br />
From a preliminary analysis of digital classics resources, Jackson et al. reasoned that a data-integration<br />
project would need to deal with the various complexities of annotated corpora, material in relational<br />
databases, and large numbers of XML files. Such research is of growing importance because of the<br />
large number of individual and isolated digital resources that have been created. “In the fields of<br />
archaeology and classics alone,” Bodard et al. (2009) explained, “there are numerous datasets, often<br />
small and isolated, that would be of great utility if the information they contained could be integrated.”<br />

The researchers found that four major issues needed to be addressed in terms of potential integration:<br />
(1) the formats of resources were very diverse; (2) resources were often not very accessible (e.g.,<br />
stored on an individual department’s or a scholar’s computers), and even data published on websites<br />
were typically not available for reuse; (3) resources were available to be used only in isolation (e.g.,<br />
single inscription databases); and (4) resources were owned by different individuals and communities<br />
with varying rights schemes. The LaQuAT project thus wanted to explore whether bridges could be<br />
built between data silos in order to support federated searching at the least, and it therefore brought<br />
together experts in distributed data management and digital humanities.<br />

The original plan of LaQuAT was to link three projects: Project Volterra 370 (an Access database of
Roman legal texts and metadata); Heidelberger Gesamtverzeichnis der griechischen Papyrusurkunden
Ägyptens (HGV) 371 (a “database of papyrological metadata in relational and TEI-XML format” that
includes information on 55,000 papyri and is stored in FileMaker Pro); and the Inscriptions of
Aphrodisias (IAph). These collections span about 500 years of the Roman Empire and overlap in terms
of places and people. While all the data sets are freely available and both the IAph and HGV
collections have been published as EpiDoc XML that can be downloaded under a CC Attribution
License, it was the master databases of both the HGV and Volterra that were needed for this project,
and these had to be specifically requested (Bodard et al. 2009). Despite their initial desire to support
cross-database searching of all three projects, they found that the challenges of integrating the
relational databases were so complicated that they focused solely on Volterra and HGV in this project.
One question they still wished to explore was whether information in HGV could be used to reduce

366 http://www.kcl.ac.uk/iss/cerch/projects/completed/laquat.html
367 http://www.kcl.ac.uk/iss/cerch
368 http://www.epcc.ed.ac.uk/
369 http://www.ogsadai.org.uk/
370 http://www.ucl.ac.uk/history2/volterra/
371 HGV is also federated as part of Trismegistos, http://aquila.papy.uni-heidelberg.de/gvzFM.html



uncertainty regarding dates in the legal texts in Volterra. To integrate HGV and Volterra, they created
annotations databases for each project or “randomly-generated values associated with each record in
the original databases” so they could “demonstrate cross-database joins and third-party annotations”
(Jackson et al. 2009).

The project used OGSA-DAI 372 for data integration because it was considered a de facto standard by
many other e-science projects for integrating heterogeneous databases, it was open source, and it was
compatible with many relational databases, XML, and other file-based resources. OGSA-DAI also
supported the exposure of data resources onto grids (Bodard et al. 2009). Most important, in terms of
data integration:

… OGSA-DAI can abstract the underlying databases using SQL views and provide an
integrated interface onto them using distributed querying. This fulfils the essential requirement
of the project to leave the underlying data resources untouched as far as possible (Jackson et al.
2009).

One goal of LaQuAT was to support federated searching of a “virtual database” so that
the underlying databases would not have to undergo major changes for inclusion in such a resource.
“The ability to link up such diverse data resources, in a way that respects the original data resources
and the communities responsible for them,” Bodard et al. (2009) asserted, “is a pressing need among
humanities researchers.”
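The idea of abstracting separately maintained databases behind SQL views, so that they can be queried as one “virtual database” without being altered, can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than OGSA-DAI, and every table name, column name, and record below is hypothetical, not taken from HGV or Volterra.

```python
import sqlite3


def build_virtual_database():
    """Attach two toy source databases and expose them through one view."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates a separate in-memory database that stands in for
    # an independently maintained resource.
    conn.execute("ATTACH DATABASE ':memory:' AS hgv")
    conn.execute("ATTACH DATABASE ':memory:' AS volterra")

    # Hypothetical schemas: similar information stored under different names.
    conn.execute("CREATE TABLE hgv.papyri (id TEXT, ort TEXT, jahr INTEGER)")
    conn.execute(
        "CREATE TABLE volterra.laws (law_id TEXT, place TEXT, year_issued INTEGER)"
    )
    conn.executemany(
        "INSERT INTO hgv.papyri VALUES (?, ?, ?)",
        [("hgv-1", "Oxyrhynchos", 212), ("hgv-2", "Karanis", 301)],
    )
    conn.executemany(
        "INSERT INTO volterra.laws VALUES (?, ?, ?)",
        [("law-1", "Roma", 212), ("law-2", "Constantinopolis", 438)],
    )

    # A TEMP view may reference tables in any attached database. It maps the
    # differing column names onto one shared vocabulary and leaves the
    # source tables untouched.
    conn.execute(
        """
        CREATE TEMP VIEW documents AS
        SELECT 'hgv' AS source, id AS doc_id, ort AS place, jahr AS year
          FROM hgv.papyri
        UNION ALL
        SELECT 'volterra', law_id, place, year_issued
          FROM volterra.laws
        """
    )
    return conn


def documents_between(conn, start_year, end_year):
    """Query the unified view as if it were a single table."""
    cursor = conn.execute(
        "SELECT source, doc_id, place, year FROM documents "
        "WHERE year BETWEEN ? AND ? ORDER BY year, source",
        (start_year, end_year),
    )
    return cursor.fetchall()
```

Calling `documents_between(build_virtual_database(), 200, 250)` returns the matching rows from both toy sources in one result set. The design choice mirrors the requirement quoted above: all harmonization lives in the view, so the original resources and their owners' schemas stay as they are.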

A number of issues complicated data integration, however, including data consistency and some
specific features of OGSA-DAI. To begin with, some of the original data in the HGV database had
been “contaminated by control characters,” a factor that had serious implications for the OGSA-DAI
system since it provided access to databases via web services, which are based on the exchange of
XML documents. Since the use of control characters within an XML document results in an invalid
XML file that cannot be parsed, they had to extend the system’s “relational data to XML conversion
classes to filter out such control characters and replace these with spaces.” The Volterra database also
presented its own unique challenges, particularly in terms of database design, since not all tables had
the same columns and some columns with the same information had different names. A second major
challenge was the lack of suitable database drivers, so the data from both Volterra and HGV were
ported into MySQL to be able to interact with OGSA-DAI. Other issues included needing to adapt the
way OGSA-DAI exposed metadata and having to alter the way the system used SQL views
because of the large size of the HGV database. In the end, the project could use only a subset of the
HGV database to ensure that query time would be reasonable. Despite these and other challenges, the
project was able to develop a demonstrator that provided integrated access to both HGV and
Volterra. 373
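The control-character problem described above is easy to reproduce. The sketch below is not the project's conversion code; it is a minimal illustration, assuming a database row arrives as a Python dictionary, of how characters that XML 1.0 forbids might be replaced with spaces before a row is serialized. The function names and the sample record are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# C0 control characters that XML 1.0 does not allow in a document
# (tab, line feed, and carriage return are permitted and left alone).
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def clean_for_xml(value):
    """Replace each forbidden control character with a space."""
    return _ILLEGAL_XML_CHARS.sub(" ", value)


def row_to_xml(row):
    """Serialize one database row (a dict) as a small XML record."""
    record = ET.Element("record")
    for column, value in row.items():
        field = ET.SubElement(record, "field", name=column)
        field.text = clean_for_xml(str(value))
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    # A vertical tab (\x0b) hidden in the data would make the XML unparseable.
    dirty_row = {"title": "Brief\x0ban Apollonios"}
    xml_text = row_to_xml(dirty_row)
    # The cleaned record parses without error.
    print(ET.fromstring(xml_text).find("field").text)
```

Filtering at the conversion step, rather than editing the source database, is consistent with the project's aim of leaving the underlying resources untouched.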

The LaQuAT project had originally assumed that one of the most useful outcomes of integrating the
two databases would be where data overlapped (such as in terms of personal and place names), but
they found instead that clear-cut overlaps were fairly easy to identify. A far more interesting question,
they proposed, was to try to automatically recognize “the co-existence of homonymous persons or

372 While the technical details of this software are beyond the scope of this paper, Jackson et al. explain that “OGSA-DAI executes workflows which can
be viewed as scripts which specify what data is to be accessed and what is to be done to it. Workflows consist of activities, which are well-defined
functional units which perform some data-related operation e.g. query a database, transform data to XML, deliver data via FTP. A client submits a
workflow to an OGSA-DAI server via an OGSA-DAI web service. The server parses, compiles and executes the workflow.”

373 For more on the infrastructure proof of concept design, see Jackson et al. (2009). This demonstrator can be viewed at
http://domain001.vidar.ngs.manchester.ac.uk:8080/laquat/laquatDemo.jsp



names in texts dated to within some small number of years of one another, for example” (Jackson et al.
2009). Historical named-entity disambiguation thus presented both a major opportunity and a
challenge to data integration. Another significant barrier to querying multiple databases was the
problem of semantic ambiguity:

To run queries across multiple databases, a researcher would already need a significant degree
of understanding about what each database contained and also which tables and columns
contained data that was semantically equivalent and could therefore be compared or tested for
equality. Any such infrastructure would have to provide a far greater degree of support for
making the databases seem as if they are indeed part of one virtual database, for example by
normalizing dates (Jackson et al. 2009).

In addition to semantic ambiguity in terms of how data were described or stored, Jackson et al. pointed
out that once one starts trying to automatically link humanities databases, the fuzzy and interpretative
nature of much of these data becomes quite problematic. Other, more specific challenges included
knowing when to join columns, variant names for historical entities, various ways of representing
dates, the precision and uncertainty of dates, and errors in databases that cannot easily be changed.
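The question of recognizing homonymous persons in texts dated within a few years of one another can be made concrete with a small sketch. It is not LaQuAT's method: it assumes, purely for illustration, that dates have already been normalized to an earliest and latest year, that names match after trivial case and whitespace normalization, and that a five-year window is a sensible default. The record structure and sample data are invented.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Record:
    source: str    # which database the record comes from
    name: str      # personal name as recorded
    earliest: int  # earliest possible year (negative for BCE)
    latest: int    # latest possible year


def normalize_name(name):
    """Lower-case and collapse whitespace; real variants need far more."""
    return " ".join(name.lower().split())


def years_apart(a, b):
    """Gap in years between two date ranges; 0 if the ranges overlap."""
    return max(0, max(a.earliest, b.earliest) - min(a.latest, b.latest))


def candidate_matches(records, max_gap=5):
    """Pairs from different sources with the same name and close dates."""
    pairs = []
    for a, b in combinations(records, 2):
        if a.source == b.source:
            continue
        if normalize_name(a.name) != normalize_name(b.name):
            continue
        if years_apart(a, b) <= max_gap:
            pairs.append((a, b))
    return pairs
```

Representing each date as a range is one way to carry the precision and uncertainty mentioned above into the comparison. The pairs returned are only candidates: deciding whether two records name the same person remains an interpretative judgment.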

One major conclusion reached by the LaQuAT project was that more virtual data centers needed to be
created that could integrate several data sources. For this reason, they were actively participating in the
DARIAH project, hoping that the solutions LaQuAT had developed would:

… have a lifespan beyond the initial project and will provide a framework into which other
researchers will be able to attach resources of interest, thus building up a critical mass of
related material whose utility as a research tool will be significantly greater than that of the sum
of its parts. We see this project as providing an opportunity to start building a more extensive
e-infrastructure for advanced research in the (digital) humanities (Bodard et al. 2009).

As part of this work, they hoped to convince scholars in different countries to abandon a silo mentality
and help build a large mass of open material. In terms of future research, they argued that far more
work was needed on the issue of cross-database linking in the humanities, especially the linking
of relational and XML databases, which their project was unable to investigate further. Nonetheless,
the recently announced SPQR (Supporting Productive Queries for Research) 374 project plans to carry
on the work of LaQuAT, in particular the “integration of heterogeneous datasets in the humanities,”
especially in the area of classical antiquity. It will explore the use of linked data and will be based
on the Europeana Data Model (EDM). 375 Further details on this project can be found later in this
paper.

The scholarly importance of linking the study of inscriptions to other sources of archaeological or
other material, particularly to help provide a greater context for individual inscriptions, has also been
emphasized by Charlotte Tupman (Tupman 2010). In her discussion of funerary inscriptions found on
monuments, Tupman noted that different categories of funerary evidence (e.g., pottery, bone
fragments) typically need to be assembled for a fuller understanding of an inscription 376 and that there is
no easy way to present the varied archaeological evidence, the funerary text, and images of the
monument it was found on in a way that is easily comprehensible to scholars. As funerary texts were
rarely published with other related material evidence, Tupman observed that typically these

374 http://spqr.cerch.kcl.ac.uk/m=201008
375 http://www.europeanaconnect.eu/news.phparea=News&pag=48
376 Kris Lockyear has also made similar arguments about the importance of integrating numismatic evidence with other archaeological evidence.


inscriptions have not been thought of as archaeological material but as “the preserve of historians and
literary scholars” since they are considered as “texts rather than artifacts,” a point also made previously
(Roueché 2009, Bodard et al. 2009).

Tupman argued that it would be highly desirable not only to link funerary inscriptions to images of the
monument on which they were found but also to then link these monuments to other objects found in
the same archaeological context. While Tupman granted that it made some sense that inscriptions,
pottery catalogs, and bone analyses be published separately (as they are separate disciplines), she
contended that all data that were published should at least be linked to, and ideally be combined with,
other data:

Specialists, therefore need to work to make their material available to others in a way that
permits their various forms of data to be combined meaningfully. This will be most effective if
undertaken collaboratively, so that shared aims and standards can be established. This does not
imply that there should be any diminution of expert knowledge or information in any of these
fields for the sake of making it easier for others to digest; to do so would entirely miss the point
of the exercise. Rather, we should be seeking ways of linking these different types of
information in a rational and useful manner that not only increases our own understanding of
the data, but also enhances the way in which computers can process that data (Tupman 2010,
77).

Tupman thus encouraged the creation and use of collaborative standards and urged that data be made both human
readable and machine actionable. While she suggested that perhaps Semantic Web technologies might
be useful in this regard, Tupman also proposed that the use of XML, and in particular EpiDoc, to
support the digital publishing of inscription data would be highly beneficial toward achieving these
aims. After reiterating Gabriel Bodard’s (Bodard 2008) six transformational qualities of digital
publishing, Tupman listed a number of advantages of using XML, including incorporating marked-up
texts into databases, interlinking marked-up inscriptions with other types of XML files, the ability of
researchers to add their own markup, and the possibility of using XSLT to produce different displays
of the same source file (e.g., for a beginning student vs. an advanced scholar). Tupman concluded that
providing inscription data as EpiDoc XML files not only lessened editorial ambiguity (e.g., by
supporting the encoding of variant readings) but also could allow researchers to use digital tools to sort
large amounts of data and thus ask their own questions of the materials.
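The point about producing different displays from one marked-up source can be illustrated briefly. The sketch below deliberately swaps XSLT for Python's standard-library ElementTree, since the standard library includes no XSLT processor. The fragment is invented, omits the TEI namespace for brevity, and uses only a few EpiDoc-style elements (`expan`, `abbr`, `ex`, `supplied`); the two display modes and the function name are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Invented fragment: two abbreviations with editorial expansions and one
# word restored by the editor where the stone is lost.
SOURCE = (
    "<ab>"
    "<expan><abbr>D</abbr><ex>is</ex></expan> "
    "<expan><abbr>M</abbr><ex>anibus</ex></expan> "
    '<supplied reason="lost">sacrum</supplied>'
    "</ab>"
)


def render(element, mode):
    """Render an element as text for a 'scholar' or a 'student' display."""
    parts = [element.text or ""]
    for child in element:
        parts.append(render(child, mode))
        parts.append(child.tail or "")
    inner = "".join(parts)
    if mode == "scholar":
        # Editorial interventions stay visible through bracket conventions.
        if element.tag == "supplied":
            return "[" + inner + "]"
        if element.tag == "ex":
            return "(" + inner + ")"
    # The student display shows a plain reading text.
    return inner


if __name__ == "__main__":
    root = ET.fromstring(SOURCE)
    print(render(root, "scholar"))
    print(render(root, "student"))
```

Because the editorial decisions are recorded in the markup rather than in the display, the same file can serve both audiences, and another researcher could add a third rendering without touching the source.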

Advanced Imaging Technologies for Epigraphy

While the above sections examined some of the difficulties in providing sophisticated access to
inscriptions as digitized texts and of linking between collections, other research has focused on the
challenges of advanced imaging for inscriptions as archaeological objects. This section examines
several state-of-the-art approaches. 377

The eSAD project has recently developed a number of image-processing algorithms for studying
ancient documents that have also been made available to scholars through a portal (Tarte et al. 2009).
Using images of wooden stylus tablets from Vindolanda as their testbed, they developed algorithms
that helped rebalance illumination and remove wood grain. This project then extended the data model
and interface of the previously developed Virtual Research Environment for the Study of Documents

377 The focus of this section has been research that sought to provide better access to images of inscriptions (e.g., to enhance access to inscription text)
rather than on virtual reconstruction of the monuments or other objects on which they are found. For some recent work in this area, see Remondino et al.
(2009).


and Manuscripts (VRE-SDM) 378 so that it could call upon these algorithms using web services that
make use of the U.K. National Grid Service. This work served as a proof of concept for the viability of
the VRE-SDM and supported the development of a portal that hid the technology from scholars, used
the grid and web services as a means of providing access to powerful image-processing algorithms, and
allowed the eSAD project to disseminate its results to both classicists and image-processing researchers. The
VRE-SDM also supported scholars who wanted to collaborate by providing them with a virtual
environment where work could be shared.

In addition to image-processing algorithms for digital images of inscriptions, other research has
developed advanced 3-D techniques to capture better images of squeezes taken of inscriptions.
Barmpoutis et al. (2009) have asserted that conventional analysis of ancient inscriptions has typically
been based on observation and manual analysis by epigraphists, who examine the lettering and attempt
to classify inscriptions geographically and chronologically. In one method that has been traditionally
used, researchers “use a special type of moisturized paper (squeeze) which they push on the inscribed
surface using a brush specially adapted for the purpose. When the letters are shaped on the squeezed
paper, the archaeologists let it dry, creating that way an impression of the inscription” (Barmpoutis et
al. 2009). The authors reported that many collections of squeezes have been created 379 (including some
for inscriptions that have now been destroyed) yet the use of these collections has been limited because
of a variety of factors, including the expense of travel and the difficulties of preservation.
Consequently, Barmpoutis et al. sought to develop methods that could store and preserve squeezes and
make them more accessible to a larger number of scholars. 380

The authors developed a framework that uses “3D reconstruction of inscriptions” and “statistical<br />

analysis of their reconstructed surfaces.” They used a regular image scanner to scan squeezes from two<br />

different lighting directions; these images were then used in a “shape from shading technique in order<br />

to reconstruct in high resolution the original 3D surface.” Barmpoutis et al. argued that the major<br />

contributions of their research were threefold: (1) they had developed the first framework for<br />

converting and storing squeezes in 3-D; (2) their research demonstrated how squeezes could be studied<br />

more effectively using different visualizations and how such results could be more easily shared and<br />

distributed; and (3) automated analysis of the squeezes produced results that would likely have been<br />

impossible to obtain with traditional techniques. Their framework was applied to five Ancient<br />

Greek–inscribed fragments from Epidauros in southern Greece, and they conducted experiments in surface<br />

recognition and statistical analysis. The ability to use different visualizations and 3-D data, they also<br />

proposed, would support collaborative work and preservation:<br />

Rendering the inscriptions with different virtual illuminations and viewing angles makes the<br />

use of a squeeze more effective and allows the archaeologists to share digital copies of the<br />

squeezes without losing any information. Thus by using our proposed framework the digital<br />

libraries of scanned squeezes (regular 2D images) which are commonly used by archaeology<br />

scholars can easily be replaced by databases of 3D squeezes, without the need of any additional<br />

equipment (Barmpoutis et al. 2009).<br />

In addition, the use of statistical analysis (such as automatically creating height-maps of the average<br />

letters) both replicated the results of individual scholars and helped epigraphists to significantly speed<br />

378 http://bvreh.humanities.ox.ac.uk/VRE-SDM.html<br />

379 For an interesting digital collection of squeezes, see Ohio State University’s Center for Palaeographical Studies, “Greek and Latin Inscriptions: Digital<br />

Squeezes” http://drc.ohiolink.edu/handle/2374.OX/106<br />

380 One of the principal authors of this paper has recently been awarded a digital humanities start-up grant to develop a “Digital Epigraphy” toolbox<br />

(https://securegrants.neh.gov/PublicQuery/main.aspxf=1&gn=HD-51214-11).



the process of analyzing individual letters and the variability of lettering schemes and of evaluating<br />

their results.<br />
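The idea of an "average letter" height-map can be illustrated with a minimal sketch. It assumes that every instance of a letter has already been segmented and aligned onto a grid of the same size; the sample values and function names are hypothetical, and this is not the implementation used by Barmpoutis et al.

```python
from statistics import mean


def average_height_map(height_maps):
    """Return the cell-by-cell mean of equally sized 2-D height maps."""
    if not height_maps:
        raise ValueError("at least one height map is required")
    rows, cols = len(height_maps[0]), len(height_maps[0][0])
    for hm in height_maps:
        if len(hm) != rows or any(len(row) != cols for row in hm):
            raise ValueError("all height maps must have the same shape")
    return [
        [mean(hm[r][c] for hm in height_maps) for c in range(cols)]
        for r in range(rows)
    ]


def deviation_from_average(height_map, average):
    """Mean absolute difference between one letter and the average letter."""
    diffs = [
        abs(h - a)
        for row_h, row_a in zip(height_map, average)
        for h, a in zip(row_h, row_a)
    ]
    return mean(diffs)


# Three hypothetical 2x3 height maps of the same letter (arbitrary depth units).
letters = [
    [[0.0, 1.0, 0.0], [1.0, 2.0, 1.0]],
    [[0.0, 3.0, 0.0], [1.0, 2.0, 1.0]],
    [[0.0, 2.0, 0.0], [1.0, 2.0, 1.0]],
]
average = average_height_map(letters)
variability = [deviation_from_average(hm, average) for hm in letters]
```

The per-letter deviations give a simple numeric handle on the variability of lettering, which is the kind of quantity an epigraphist could compare across fragments.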

Other advanced research in the imaging of inscriptions has investigated the automatic classification of<br />

Greek inscriptions according to the writer who carved them (Panagopoulos et al. 2008). One of the<br />

biggest challenges in studying inscriptions carved on stone, Panagopoulos et al. noted, is that they are<br />

unsigned and undated, and have often been broken into various fragments. At the same<br />

time, they proposed that identifying a writer could be a crucial part of dating an inscription and setting<br />

it in historical context. The major goals of their work were to objectively assign inscriptions to writers,<br />

to assist in writer identification where archaeological information and analysis had yielded no results,<br />

and to help resolve archaeological disputes regarding the dating of events. In sum, they reported that<br />

they hoped to “achieve writer identification employing only mathematical processing and pattern<br />

recognition methods applied to the letters carved in each inscription” (Panagopoulos et al. 2008).<br />

One archaeologist worked with several computer scientists to evaluate the final methodology described<br />

in the paper. They obtained images of 24 inscriptions, segmented the images, and extracted the<br />

contours of individual letters. Using mathematical processing, they computed “platonic” prototypes for<br />

each alphabet symbol in each inscription. All inscriptions were then “compared pairwise by employing<br />

these ideal representations and the individual letter realizations.” Panagopoulos et al. then used several<br />

statistical techniques to reject the “hypothesis that two inscriptions are carved by the same writer” and<br />

computed maximum-likelihood considerations in order to definitively attribute inscriptions in their<br />

collections to their individual writers. To evaluate their framework, they used it to automatically<br />

attribute 24 inscriptions from Athens; the system attributed these inscriptions to six different<br />

identified “hands” and matched the expert opinions of epigraphists. A particular strength of their<br />

process, the authors concluded, was that it required no training data, but they also hypothesized that a<br />

greater mass of inscription data on which to test their system would greatly improve its<br />

accuracy rate.<br />
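The general shape of this pipeline (per-inscription letter prototypes, pairwise comparison, grouping into hands) can be sketched as follows. This is a deliberately simplified illustration, not the method of Panagopoulos et al.: each letter realization is assumed to be already reduced to a short numeric feature vector, the prototype is a plain mean, and a fixed distance threshold (a hypothetical parameter) stands in for their statistical hypothesis tests and maximum-likelihood attribution.

```python
from itertools import combinations
from math import dist
from statistics import mean


def prototype(realizations):
    """Mean feature vector of all realizations of one letter."""
    return [mean(values) for values in zip(*realizations)]


def inscription_prototypes(letters):
    """Map each letter symbol to its prototype within one inscription."""
    return {symbol: prototype(vectors) for symbol, vectors in letters.items()}


def inscription_distance(protos_a, protos_b):
    """Average prototype distance over the letters both inscriptions share."""
    shared = protos_a.keys() & protos_b.keys()
    if not shared:
        return float("inf")
    return mean(dist(protos_a[s], protos_b[s]) for s in shared)


def group_by_hand(inscriptions, threshold):
    """Single-linkage grouping: link inscriptions closer than the threshold."""
    protos = {name: inscription_prototypes(l) for name, l in inscriptions.items()}
    parent = {name: name for name in inscriptions}

    def find(name):
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(inscriptions, 2):
        if inscription_distance(protos[a], protos[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in inscriptions:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical feature vectors for two letters in three inscriptions.
inscriptions = {
    "I1": {"A": [[1.0, 1.0], [1.2, 0.8]], "E": [[3.0, 3.0]]},
    "I2": {"A": [[1.1, 0.9]], "E": [[3.1, 2.9]]},
    "I3": {"A": [[5.0, 5.0]], "E": [[7.0, 8.0]]},
}
hands = group_by_hand(inscriptions, threshold=0.5)
```

Like the approach described above, the sketch needs no training data: groups emerge from the pairwise comparisons alone, so adding more inscriptions only adds more comparisons.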

Manuscript Studies<br />

Any discussion of manuscripts quickly leads to an examination of many challenges found across<br />

classical disciplines, challenges that include the creation of digital editions, the complications of<br />

palaeographic studies, and how to design a digital collection of manuscripts that supports researchers<br />

considering codicological, 381 historical, or philological questions. Manuscripts are among the most<br />

complicated and heavily used artifacts across disciplines. The data richness of manuscripts, according to<br />

Choudhury and Stinson in their analysis of the commonalities between creating an infrastructure for a<br />

manuscript digital library and for a massive dataset in physics, makes them an intricate but important<br />

source for humanities data:<br />

Manuscripts, so evidently data-rich in the era in which they were created, today retain their<br />

former value and meaning while they inspire a new generation of humanists to create new sets<br />

of data. This includes the metadata needed to encode, organize, and understand the texts,<br />

annotations, and the visual art embodied in the manuscripts. Not only does this demonstrate the<br />

parallel need for data curation and preservation in the humanities and the sciences (for at the<br />

381 Codicology has been defined as “the study of the physical structure of books, which, when used in conjunction with palaeography, reveals a great deal<br />

about the date, place of origin, and subsequent history of a particular codex. The term was first used in conjunction with listing texts in catalogue form, but<br />

later in the 20th century came to be associated primarily with the structural aspects of manuscript production, which had been studied in a coherent fashion<br />

since the late 19th century.” Timothy Hunter "Codicology." The Oxford Companion to Western Art. Ed. Hugh Brigstocke. Oxford University Press, 2001.<br />

Oxford Reference Online. Oxford University Press. Tufts University. 27 April 2010<br />

http://www.oxfordreference.com/views/ENTRY.htmlsubview=Main&entry=t118.e581



level of storage infrastructure, a byte is a byte and a terabyte a terabyte) but it underscores the<br />

fact that there is an increasing convergence of what it is that is analyzed by humanities scholars<br />

and scientists: data (Choudhury and Stinson 2007).<br />

The authors noted that manuscripts represented the richest sets of data for their day because they<br />

integrated texts and images and included user annotations, as well as vast numbers of intertextual<br />

allusions and references.<br />

While not specifically a “discipline” of classics, manuscript studies informs the work of many classical<br />

disciplines. The majority of classical texts (whether studied for philological analysis or as a source of<br />

ancient history) that form the basis for the modern critical editions upon which many scholars rely<br />

are based on medieval manuscripts. In addition, as was seen in the section on digital editions and<br />

textual criticism, access to images of manuscripts and transcriptions is an essential component of<br />

cyberinfrastructure for classics. This subsection addresses some of the challenges of creating digital<br />

libraries of manuscripts and examines individual research projects in detail. Some of the projects<br />

discussed have received fuller treatment in other disciplinary sections of the paper.<br />

Digital Libraries of Manuscripts<br />

The past 20 years have seen voluminous growth in the number of digital images and transcriptions of<br />

manuscripts that have become available online. As indicated by the Catalogue of Digitized<br />

Manuscripts, there are both large collections of digital manuscripts at single institutions 382 and many<br />

individual manuscripts that have been digitized by individual libraries, museums, or cultural<br />

organizations. 383<br />

One of the largest exemplary collections is “Medieval Illuminated Manuscripts,” 384 a website provided<br />

by the Koninklijke Bibliotheek and the Museum Meermanno-Westreenianum (Netherlands). This<br />

website serves as an extensive database of research information about illuminated medieval<br />

manuscripts. 385 More than 10,000 digital images of decorations taken from manuscripts are provided,<br />

and they may be browsed by subject matter (in English, German, or French) using the ICONCLASS<br />

classification system, which was created for the classification of art and iconography. For example, a<br />

user may choose “Classical Mythology and Ancient History” and then choose “Classical History,” which<br />

then takes her or him to a final selection of options, such as “female persons from classical history.”<br />

Selecting one of these options takes the user to a list of manuscript images (with high-resolution and<br />

zoomable images available), where picking an individual image also provides a full manuscript<br />

description and a bibliography of the manuscript. A searchable database is also provided.<br />
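Subject browsing of this kind relies on a hierarchical notation in which each narrower subject extends the code of its parent. The sketch below is a minimal illustration of that idea only: the notation codes, the "male persons" label, and the image identifiers are illustrative assumptions, not data taken from the website or from the ICONCLASS schedules.

```python
# Hypothetical subject table: each child code extends its parent's code.
SUBJECTS = {
    "9": "Classical Mythology and Ancient History",
    "98": "Classical History",
    "98B": "male persons from classical history",
    "98C": "female persons from classical history",
}

# Hypothetical image records: (image identifier, subject code).
IMAGES = [("img-001", "98C"), ("img-002", "98B"), ("img-003", "98C")]


def children(notation):
    """Return the codes exactly one level below the given code."""
    return sorted(
        code
        for code in SUBJECTS
        if len(code) == len(notation) + 1 and code.startswith(notation)
    )


def images_under(notation):
    """Return every image classified at or below the given code."""
    return [image_id for image_id, code in IMAGES if code.startswith(notation)]


# Walk the path described in the text: top level, then "Classical History",
# then the final selection of options.
second_level = children("9")
final_options = children("98")
selected_images = images_under("98C")
```

Because a broader code is a prefix of every narrower one, selecting a high-level subject naturally gathers all images classified beneath it.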

Two projects, the Digital Scriptorium 386 and Manuscriptorium, 387 have been created to bring together<br />

large numbers of digital manuscripts online, or essentially to create virtual libraries of digital<br />

manuscripts. 388 Each project has taken a different approach to this common problem.<br />

382 For example, see “Digital Medieval Manuscripts” at Houghton Library, Harvard University,<br />

http://hcl.harvard.edu/libraries/houghton/collections/early_manuscripts/<br />

383 http://manuscripts.cmrs.ucla.edu/languages_list.php<br />

384 http://www.kb.nl/manuscripts/<br />

385 Some recent research has explored innovative approaches to supporting more effective scholarly use of illuminated manuscripts through the<br />

development of user annotation tools and a linking taxonomy (Agosti et al. 2005).<br />

386 http://www.scriptorium.columbia.edu/<br />

387 http://beta.manuscriptorium.com/<br />

388 Other approaches have also explored developing large data grids or digital infrastructures for manuscripts; see Calanducci et al. (2009).



The Digital Scriptorium provides access to an online image database of medieval and Renaissance<br />

manuscripts from almost 30 libraries and includes records for 5,300 manuscripts and 24,300 images.<br />

This collection can be browsed by location, shelfmark, author, title, scribe, artist, and language<br />

(including 58 Greek manuscripts). Each manuscript record includes an extensive bibliographic and<br />

physical description, links to individual manuscript page images (thumbnail, small, medium,<br />

large), 389 and links to the fully digitized manuscript at its home institution (where available). Several<br />

types of searching are available, including a basic search, a shelfmark search, and an advanced search<br />

where a user can enter multiple keywords (to search the fields shelfmark, author, title, docket,<br />

language, provenance, binding, caption) with limits by date, decoration, country of origin, and current<br />

location.<br />
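The advanced search described above combines free-text keywords over a fixed set of fields with separate limits. A minimal sketch of that logic follows; the records, the record keys, and the function signature are hypothetical and do not reflect the Digital Scriptorium's actual schema or software.

```python
# Fields that keywords are matched against, as listed in the text.
KEYWORD_FIELDS = (
    "shelfmark", "author", "title", "docket",
    "language", "provenance", "binding", "caption",
)

# Hypothetical manuscript records for illustration only.
RECORDS = [
    {"shelfmark": "MS Example 1", "author": "Author A",
     "title": "Treatise on Geometry", "language": "Greek", "year": 1450,
     "decorated": False, "country": "Italy", "location": "Library A"},
    {"shelfmark": "MS Example 2", "author": "Author B",
     "title": "Book of Hours", "language": "Latin", "year": 1390,
     "decorated": True, "country": "France", "location": "Library B"},
    {"shelfmark": "MS Example 3", "author": "Author A",
     "title": "Commentary on Geometry", "language": "Latin", "year": 1520,
     "decorated": False, "country": "Italy", "location": "Library A"},
]


def advanced_search(records, keywords=(), date_range=None,
                    decorated=None, country=None, location=None):
    """Return records matching every keyword and every limit supplied."""
    results = []
    for rec in records:
        # Concatenate the searchable fields into one lowercase string.
        haystack = " ".join(str(rec.get(f, "")) for f in KEYWORD_FIELDS).lower()
        if not all(k.lower() in haystack for k in keywords):
            continue
        if date_range is not None and not (date_range[0] <= rec["year"] <= date_range[1]):
            continue
        if decorated is not None and rec.get("decorated") != decorated:
            continue
        if country is not None and rec.get("country") != country:
            continue
        if location is not None and rec.get("location") != location:
            continue
        results.append(rec)
    return results


hits = advanced_search(RECORDS, keywords=["author a", "geometry"],
                       date_range=(1400, 1500))
```

Keeping the limits separate from the keyword match mirrors the interface described: keywords narrow by content, while date, decoration, origin, and location narrow by catalogued attributes.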

An overview of the Digital Scriptorium (DS) and its future has been provided by Consuelo Dutschke<br />

(Dutschke 2008). She articulated how the creation of the DS had made the work of text editors in<br />

assembling a body of evidence much simpler, and that libraries that had chosen to participate had also<br />

made the job of future editors far easier, for the DS provides a single point of access to the indexed<br />

holdings of multiple libraries. She also observed that many libraries that had chosen to participate in<br />

the DS had consequently made a much greater effort to identify their own collections. Even more<br />

important, however, the DS can help editors gain a more complete understanding of the context of the<br />

manuscripts with which they work:<br />

DS also serves the cause of the editor in allowing him a first glimpse of the world that a given<br />

manuscript occupies: the other texts with which it circulates; the miniatures, if any, which<br />

always imply interpretation; the level of expense that went into its production; early and late<br />

owners with their notes and their bindings, each bringing a historical glimpse of that<br />

manuscript's value – both semantic and financial – to the whole. Leonard Boyle reminds us that<br />

no text exists without its physical means of transmission. … and DS significantly aids the<br />

editor in building an understanding of the physical and intellectual environment of the chosen<br />

text (Dutschke 2008).<br />

Dutschke asserted that editors’ understanding would also grow as they examined other<br />

manuscripts of the same text or even other manuscripts of different texts but of a similar place and date<br />

of origin. The DS provides access to only some images of manuscripts (an average of six images per<br />

codex) as the costs of full digitization were prohibitive in many cases. Nonetheless, it serves as an<br />

important discovery tool for widely scattered collections, Dutschke maintained, since for most<br />

researchers it simply matters whether a library has the particular text, author, scribe, or artist that they<br />

are researching.<br />

The DS began in 1997 and first established standards for bibliographic data collection and<br />

photographic capture of manuscripts, standards that are iteratively updated. The existence of such<br />

standards, along with documentation, has made it easier for potential collaborators to determine<br />

whether they wish to join the DS. This documentation and these standards have proved a crucial<br />

component of technical sustainability, according to Dutschke. The other critical element of<br />

sustainability, she pointed out, is financial, and the DS is currently taking steps to ensure the survival<br />

of its digital program for the future. Part of any financial sustainability plan, Dutschke explained,<br />

was a concrete specification of what is required to keep an organization running as well as to keep<br />

down future costs. Some key elements she listed included documentation, technological transparency,<br />

389 For example, a large image of a manuscript page of Hero of Alex<strong>and</strong>ria’s Geometrica, http://www.columbia.edu/cgi-b<str<strong>on</strong>g>in</str<strong>on</strong>g>/dloobj=ds.Columbia-<br />

NY.NNC-RBML.6869&size=large


120<br />

simplicity, <strong>and</strong> sensible file nam<str<strong>on</strong>g>in</str<strong>on</strong>g>g. As Dutschke expla<str<strong>on</strong>g>in</str<strong>on</strong>g>ed, “There is an unfortunate tendency to want<br />

to make the file name carry verbal mean<str<strong>on</strong>g>in</str<strong>on</strong>g>g, to allow humans to underst<strong>and</strong> how it refers to the reallife<br />

object” (Dutschke 2008). She argued that this was unnecessary; because the catalog<str<strong>on</strong>g>in</str<strong>on</strong>g>g for files<br />

occurs elsewhere <str<strong>on</strong>g>in</str<strong>on</strong>g> tables of the database, such <str<strong>on</strong>g>in</str<strong>on</strong>g>formati<strong>on</strong> need not be encoded <str<strong>on</strong>g>in</str<strong>on</strong>g> the file name.<br />

Simple <strong>and</strong> transparent file names, she <str<strong>on</strong>g>in</str<strong>on</strong>g>sisted, helped limit future costs of hav<str<strong>on</strong>g>in</str<strong>on</strong>g>g to update <str<strong>on</strong>g>in</str<strong>on</strong>g>valid<br />

semantic values or fix<str<strong>on</strong>g>in</str<strong>on</strong>g>g typ<str<strong>on</strong>g>in</str<strong>on</strong>g>g errors.<br />

While the database used for data entry and collection is currently Microsoft Access, Dutschke pointed out that on a regular basis every DS partner exports its own collection-specific information into XML and then forwards those XML data to the central DS organization. “It is on the XML-encoded data that technology experts write the applications that make the data useful to scholars,” Dutschke reported, “via meshing the data from multiple partners, searching it, retrieving it, displaying it.” In addition, because XML is both nonproprietary and platform independent, they have chosen to use it for “data transport, long-term storage and manipulation.” Cayless et al. (2009) have also argued for the use of XML as a long-term preservation format for digitally encoded epigraphic data. Dutschke identified two other long-term preservation challenges: mass storage and the security of the files.
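The export step described above, in which each partner moves its records out of a desktop database and into XML for transport and storage, can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory SQLite table stands in for the Microsoft Access database, and the table, column, and element names (`manuscripts`, `file_id`, `<collection>`, `<record>`, and so on) are hypothetical, not the DS schema. The opaque `file_id` values also reflect Dutschke's point that descriptive information belongs in database tables rather than in file names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory stand-in for a partner's data-entry database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE manuscripts ("
        "file_id TEXT PRIMARY KEY, shelfmark TEXT, title TEXT)"
    )
    # Hypothetical rows: the file identifier is opaque, and the meaning
    # lives in the other columns.
    conn.executemany(
        "INSERT INTO manuscripts VALUES (?, ?, ?)",
        [
            ("000001", "MS Example 1", "Geometrica"),
            ("000002", "MS Example 2", "Book of Hours"),
        ],
    )
    conn.commit()
    return conn


def export_records_to_xml(conn: sqlite3.Connection, partner: str) -> str:
    """Serialize every row of the (hypothetical) manuscripts table as XML."""
    root = ET.Element("collection", {"partner": partner})
    cursor = conn.execute(
        "SELECT file_id, shelfmark, title FROM manuscripts ORDER BY file_id"
    )
    for file_id, shelfmark, title in cursor:
        record = ET.SubElement(root, "record", {"id": file_id})
        ET.SubElement(record, "shelfmark").text = shelfmark
        ET.SubElement(record, "title").text = title
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(export_records_to_xml(build_demo_database(), "Example Library"))
```

Because the output is plain XML, a central organization could merge files from many partners without needing the software that produced them, which is the platform independence the paragraph above emphasizes.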

While technical interoperability and financial sustainability were two key components of the long-term preservation of any digital project, Dutschke ultimately concluded that the most important questions were political; in other words, were the DS partners committed to its long-term survival, and did the larger user community value it? To stabilize the DS consortium, the DS has created a governing body that has developed policies for the daily management of decision making. It also conducted a user survey to which 200 people responded, 43 of whom agreed to be interviewed in detail; the major conclusion of this survey was the unrelenting demand for more content. This led Dutschke to offer the important insight that digital projects need to understand both their current and future user demands, ultimately positing that “the will to sustainability lies not only within the project and its creators/partners; it also lies with its users.”

One last critical issue raised by Dutschke addressed the needs of the DS and the digital humanities as a whole to develop a greater understanding of the costs and needs of cyberinfrastructure:

It’s not simply that digital projects cost money; all human endeavour falls into that category. It's that digital projects remain so new to us that we, as a nation and even as a world-wide community of scholars working in the humanities, haven't fully understood the costs nor factored them out across appropriate bodies. The steps DS has taken towards a more reliable and efficient technology, and the steps it has not taken reflect growth and uncertainty in the field overall. DS and the digital world as a community still lack a cyberinfrastructure not simply in terms of hardware or software, but even more importantly as a shared and recognized expertise and mode of operation (Dutschke 2008).

Throughout this review, the uncertainty regarding the way forward for a digital infrastructure at both the individual and the cross-disciplinary levels was frequently discussed. While technological issues and financial concerns were often raised, Dutschke also broached the important question of the lack of shared expertise and business models in terms of understanding how best to move forward toward building a humanities cyberinfrastructure.

The second major virtual library of manuscripts is the Manuscriptorium, which, according to its website, is seeking to create “a virtual research environment providing access to all existing digital documents in the sphere of historic book resources (manuscripts, incunabula, early printed books, maps, charters and other types of documents).” 390 Manuscriptorium provides access to more than 5 million digital images from dozens of European as well as several Asian libraries and museums. Extensive multilingual access is provided to this collection (including English, French, German, and Spanish). In addition to multilingual searches, a translation tool provided by Systran can be used to translate manuscript descriptions from one language to another. This rich multilingual environment includes both modern and ancient languages, and this has led to complicated transcription issues since the collection includes Old Slavonic, Greek, Arabic, Persian, and various Indian languages.

The Manuscriptorium collection can be searched by document identification or document origin, and provides both easy and advanced search interfaces. The former allows a user to search for documents by location, keyword, time frame, responsible person or associated name, or, for those documents with a digital facsimile, full-text transcription, edition, or transliteration. The advanced search offers multiple keyword entry in the fields (shelf-mark, text anywhere, country, settlement, and library). A user search brings up a list of relevant manuscripts, and each manuscript description includes full bibliographic information, a physical description, and a link to a full digital facsimile (when available). Opening a digital facsimile launches a separate image viewer for examining the images of the facsimile that are available.

Knoll et al. (2009) have provided some further explanation of the history, technical design, and goals of the Manuscriptorium project. It began as the Czech Manuscriptorium Digital Library in 2002 and through the ENRICH project 391 was expanded to provide seamless access to data about and digital images of manuscripts from numerous European institutions. Manuscriptorium supports harvesting via OAI for existing digital collections of manuscripts and has also created tools to allow participating organizations that simply wish to create their digital library as part of the larger Manuscriptorium to create compliant data. The development of Manuscriptorium has involved the creation of technical and legal agreements that have also evolved over time. The early standard used in manuscript description was the MASTER DTD 392 and participating in the project required that partners provide detailed technical descriptions about their manuscript images as well as a “framework for mapping the document structure with references to images.”

Under the auspices of the ENRICH program, the Manuscriptorium project realized that a new, more robust DTD would be necessary. This was a complicated process as they sought to harvest data created from two very different approaches to manuscript description, that of the library community (MARC) and that of researchers and text encoders (TEI):

In the so-called catalogue or bibliographic segment, there are certain description granularity problems when converting metadata from MASTER (TEI P.4) to MARC, while vice versa is not problematic. On the other hand, TEI offers much more flexibility and analytical depth even in the description segment and, furthermore, being a part of a complex document format, it also provides space for structural mapping (Knoll et al. 2009).

Since two major goals of the ENRICH project were to support data interchange and sharing as well as data storage, Oxford University Computer Services led the development of a new TEI P5-compliant DTD and schema. 393 It was decided that TEI not only supported all the requirements of manuscript description but also provided a common format for structural mapping and would be able to accommodate all incoming levels of manuscript granularity. All existing documents in the digital library thus had to be migrated from the original masterx.dtd to the new one created, and all documents that are harvested or added directly must conform to it as well.

390 http://beta.manuscriptorium.com/

391 The recently concluded ENRICH project (funded under the European Union eContent + programme) provided funding to expand Manuscriptorium to serve as a digital library platform “to create seamless access to distributed information about manuscripts and rare old printed books.” http://enrich.manuscriptorium.com/

392 http://digit.nkp.cz/MMSB/1.1/msnkaip.xsd

The basic image-access format used within Manuscriptorium is JPEG, but it also supports GIF and PNG. Nonetheless, one major challenge in data integration, Knoll et al. reported, was that many individual libraries chose to provide access to images of their manuscripts through multipage image files such as PDF or DjVu rather than through XML-based structural mapping. Since this kind of access did not support the manipulation of manuscript pages as individual digital objects, all participating libraries were required to convert such files into individual JPEG images for each manuscript page. As Knoll et al. explained:

The goal of Manuscriptorium is to use its own interface for representation of any document from any partner digital library or repository. Thus, the central database must contain not only descriptions of such documents—manuscripts or rare old prints—but also their structural maps with references (individual URLs) to concrete image files. As the users wishes to consult the concrete files, they are called into the uniform Manuscriptorium viewer from anywhere they are so that the user enjoys seamless access without navigating to remote digital libraries or presentations (Knoll et al. 2009).

The ability to provide a seamless point of access to distributed collections of digital objects (in effect, to create a virtual user experience) was also cited as important by the CLAROS, LaQuAT, and TextGrid projects. In the end, an ideal partner for Manuscriptorium, as described by Knoll et al., is one whose collection can be harvested by OAI and where each manuscript profile contains both a descriptive record and a structural map with links to any images. While harvesting and transforming descriptive records was fairly simple, Knoll et al. reported that harvesting structural mappings was far more problematic. Two tools have also been created to allow content providers to easily create the structured digital documents required by Manuscriptorium. The first tool, M-Tool, supports both the manual entry and creation of new metadata so it can be used either to create new manuscript records within Manuscriptorium or to edit existing ones for import. The second tool, M-Can, was created for uploading and evaluating manuscript records (Marek 2009).
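To make the "ideal partner" profile more concrete, the sketch below shows how a harvester might build an OAI-PMH `ListRecords` request and read a response in which each record carries both a description and a structural map of page images. The `ListRecords` verb, the `metadataPrefix` parameter, and the OAI-PMH namespace come from the OAI-PMH protocol. Everything else is an assumption for illustration: the base URL, the `example_ms` prefix, and the `<manuscript>`, `<structMap>`, and `<page>` payload elements are hypothetical and do not reproduce the ENRICH schema. The response is a hard-coded sample, so no network access is needed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample response; the <manuscript> payload is hypothetical.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:ms-0001</identifier></header>
      <metadata>
        <manuscript xmlns="">
          <title>Example Psalter</title>
          <structMap>
            <page label="1r" href="http://example.org/images/ms-0001/0001.jpg"/>
            <page label="1v" href="http://example.org/images/ms-0001/0002.jpg"/>
          </structMap>
        </manuscript>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def list_records_url(base_url: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH ListRecords request URL for a repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(response_xml: str) -> dict:
    """Map each record identifier to its title and ordered page-image links."""
    root = ET.fromstring(response_xml)
    records = {}
    for record in root.iterfind(".//oai:record", OAI_NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", namespaces=OAI_NS
        )
        metadata = record.find("oai:metadata", OAI_NS)
        title = metadata.findtext("manuscript/title")
        # The structural map: one (page label, image URL) pair per page.
        pages = [
            (page.get("label"), page.get("href"))
            for page in metadata.iterfind("manuscript/structMap/page")
        ]
        records[identifier] = {"title": title, "pages": pages}
    return records


if __name__ == "__main__":
    print(list_records_url("http://example.org/oai", "example_ms"))
    print(parse_records(SAMPLE_RESPONSE))
```

Keeping one image URL per page, rather than a single multipage file, is what lets a central viewer treat each manuscript page as an individual digital object.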

One of the most innovative features of Manuscriptorium is that it supports the creation of personal digital libraries. Users who register (as well as content providers) can “build their own virtual libraries from the aggregated content” and thus organize content into static personal collections or dynamic collections (based on a query, so that the collection updates automatically according to the user's query terms). Even more important, users can create “virtual documents” that can be shared with other users. These documents can be created through the use of the M-Tool application. These virtual documents are particularly interesting because they can be composed of parts of different physical documents (individual page images from different manuscripts can be saved with notes from the user), and in one example they give, “interesting illuminations from manuscripts of a certain period” could be selected and “bound” into a new virtual document. 394 Images from other external URLs can be inserted into these virtual documents. This level of personalization is extremely useful and rarely found among digital projects. Moreover, the ability to create virtual manuscripts that contain page images of interest from various manuscripts will likely support interesting new research.

393 http://tei.oucs.ox.ac.uk/ENRICH/ODD/RomaResults/enrich.dtd and http://tei.oucs.ox.ac.uk/ENRICH/ODD/RomaResults/enrich.xsd

394 For more on the creation of personal collections and virtual documents, see http://beta.manuscriptorium.com/apps/main/docs/mns_20_pdlib_help_eng.pdf

As indicated by the projects surveyed here, a wealth of digital manuscript resources is available online. The next section looks at some of the challenges of working with complicated individual manuscripts.

Digital Challenges of Individual Manuscripts and Manuscript Collections

This review has already briefly explored some of the challenges of manuscript digitization in terms of advanced document-recognition and research projects such as the work of EDUCE with the HMT project and research conducted on the Archimedes Palimpsest.

This section examines another perspective: the challenges of creating metadata (e.g., linking transcriptions, translations, and images) to manage highly complicated individual digital manuscripts such as the Codex Sinaiticus 395 and the Archimedes Palimpsest, and to manage digital collections of the multiple manuscripts of a single work, as in the Roman de La Rose Digital Library (RRDL).

The Codex Sinaiticus is one major project that illustrates some of the challenges of creating a digital library of an individual manuscript, albeit one that exists in fragments in various collections. This manuscript was handwritten more than 1,600 years ago and contains a copy of the Christian Bible in Greek, including the oldest complete copy of the New Testament. This text, which has been heavily corrected over the centuries, is of critical importance not just for Biblical studies; as the oldest “substantial book to survive antiquity,” it is also an important source of study for the “history of the book.” The original codex was distributed in unequal portions between London, Leipzig, Sinai, and St. Petersburg, and an international collaboration reunited the manuscript in digital form and has made it available online. A recent article by Dogan and Scharsky (2008) has described the technical and metadata processes involved in creating this online digital edition of the codex. They stated that creation of the website involved physical description of the manuscript, translation of selected parts into different languages such as German and English, the creation of a Greek transcription, and the digital imaging of the entire codex.

On the website, the user can choose to view the manuscript transcription either by “semantic layout” (view by Biblical verse) or by manuscript layout (view by page). An image of the codex is presented with the Greek transcription and, when available, a parallel translation of the verse in various languages. The entire codex can also be searched (including the transcription or the translation), and a Greek “keyboard” is available to search in Greek. Dogan and Scharsky have reported that multispectral images of the manuscript were also taken to “enable erased or hidden text to be discovered as well as codicological and palaeographical characteristics of the Codex to be fully analysed.” Another major challenge they noted was that almost every page includes “corrections, re-corrections and insertions, many of considerable textual significance.” A final goal of the project is to make available a fully searchable electronic transcription of both the main text and corrections. The project developed a specific schema based on the TEI to create a transcription that reflected both the Biblical structure (book, chapter, verse) and the physical structure (quire, folio, page, column) of the manuscript. Development of the website has also involved creating a specialized linkage system between image, transcription, and translation.

395 http://www.codexsinaiticus.org/en/
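
The dual structure described above (one text, addressed both by book, chapter, and verse and by quire, folio, page, and column) can be pictured as a single set of text units indexed under two independent reference systems. The following Python sketch is only an illustration under stated assumptions: the names `BiblicalRef`, `PhysicalRef`, `Unit`, and `DualIndex` and all sample values are hypothetical and do not reproduce the project's TEI-based schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import NamedTuple


class BiblicalRef(NamedTuple):
    """Logical address of a passage (hypothetical model)."""
    book: str
    chapter: int
    verse: int


class PhysicalRef(NamedTuple):
    """Physical address of a passage (hypothetical model)."""
    quire: int
    folio: int
    page: str    # "r" (recto) or "v" (verso)
    column: int


@dataclass(frozen=True)
class Unit:
    """One stretch of transcribed text, addressed in both hierarchies."""
    biblical: BiblicalRef
    physical: PhysicalRef
    text: str


class DualIndex:
    """Looks up the same units by either reference system."""

    def __init__(self, units):
        self._by_verse = defaultdict(list)
        self._by_column = defaultdict(list)
        for unit in units:
            self._by_verse[unit.biblical].append(unit)
            self._by_column[unit.physical].append(unit)

    def columns_for(self, ref):
        """Physical locations holding a verse (a verse may span columns)."""
        return [u.physical for u in self._by_verse.get(ref, [])]

    def verses_in(self, ref):
        """Verses that appear, wholly or partly, in one column."""
        return [u.biblical for u in self._by_column.get(ref, [])]


# Invented sample data: one verse that runs across a column break.
verse_1 = BiblicalRef("ExampleBook", 1, 1)
verse_2 = BiblicalRef("ExampleBook", 1, 2)
col_1 = PhysicalRef(quire=1, folio=1, page="r", column=1)
col_2 = PhysicalRef(quire=1, folio=1, page="r", column=2)

index = DualIndex([
    Unit(verse_1, col_1, "first part of verse 1"),
    Unit(verse_1, col_2, "rest of verse 1"),
    Unit(verse_2, col_2, "verse 2"),
])
```

Splitting a verse into units at each column break is one way to let the two hierarchies overlap without forcing either to nest inside the other; an actual TEI encoding would more likely express this with milestone-style elements.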



While the advanced document-recognition technology used with the Archimedes Palimpsest has been
discussed previously, the metadata and linking strategies developed to connect manuscript metadata,
images, and transcriptions merit further discussion. Two recent articles by Doug Emery and Michael B.
Toth (Emery and Toth 2009, Toth and Emery 2008) have described this process in detail. The creation of
the Archimedes Palimpsest Digital product, which released one terabyte of integrated image and
transcription data, required the spatial linking of registered images for each leaf “to diplomatic
transcriptions that scholars initially created in various nonstandard formats, with associated
standardized metadata” (Emery and Toth 2009). The transcription encoding built on previous work
conducted by the HMT project, and Emery and Toth noted that standardized metadata were critical for
three purposes: “(1) access to and integration of images for digital processing and enhancement,
(2) management of transcriptions from those images, and (3) linkage of the images with the
transcriptions.”

The authors also described how the great disciplinary variety of scholars working on the palimpsest,
from students of Ancient Greek to those exploring the history of science, necessitated the ability to
capture data from a range of scholars in a standard digital format. This necessity led to a
“Transcription Integration Plan” that incorporated Unicode, Dublin Core, and the TEI. They explained
that they chose Dublin Core as their major integration standard for digital images and transcriptions
because it would allow for “hosting and integration of this data set and other cultural works across
service providers, libraries and cultural institutions” (Toth and Emery 2008). While they used the
“Identification,” “Data Type,” and “Data Content” elements from the Dublin Core element set, they also
needed to extend this standard with elements such as “Spatial Data Reference” drawn from the Federal
Geographic Data Committee Content Standard for Digital Geospatial Metadata.

Emery and Toth (2009) argued that one of the guiding principles behind both their choice of common
standards and their emphasis on integrating data and metadata was the need to create a digital archive
for both today and the distant future. The data set they created thus also follows the principles of
the Open Archival Information System (OAIS). 396 In their data set, every image bears all relevant
metadata in its header, and each image file or folio directory serves as a self-contained preservation
unit that includes all the images of a given folio side, XMP metadata files, checksum data, and the
spatially mapped TEI-XML transcriptions. In addition, the project developed its own Archimedes
Palimpsest Metadata Standard that “provides a metadata structure specifically geared to relating all
images of a folio side in a single multi- or hyper-spectral data ‘cube’” (Emery and Toth 2009). Because
each image has its own embedded metadata, the images can either stand alone or be related to other
members of the same cube. Finally, more than 140 of the 180 folio sides include a transcription, and
the lines in these transcriptions are mapped to rectangular regions in the folio images using the TEI
element. This mapping serves two useful purposes: it allows the digital transcriptions to provide
“machine-readable content” and allows easy movement between the transcription and the image.
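
The mapping of transcription lines to rectangular image regions can be illustrated with a small lookup in both directions: from a line to its rectangle, and from a point on the image back to the line. This is a minimal Python sketch under stated assumptions; the names `Region`, `MappedLine`, `region_for_line`, and `line_at_point` and the pixel coordinates are invented for illustration and do not reproduce the project's TEI markup or data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle in image pixel coordinates (hypothetical)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class MappedLine:
    """A transcribed line together with the image region it occupies."""
    line_id: str
    text: str
    region: Region


def region_for_line(lines: Sequence[MappedLine], line_id: str) -> Optional[Region]:
    """Transcription -> image: where on the folio image is this line?"""
    for line in lines:
        if line.line_id == line_id:
            return line.region
    return None


def line_at_point(lines: Sequence[MappedLine], x: int, y: int) -> Optional[MappedLine]:
    """Image -> transcription: which line lies under this pixel?"""
    for line in lines:
        if line.region.contains(x, y):
            return line
    return None


# Invented sample data for one folio side.
folio_lines = [
    MappedLine("l1", "first transcribed line", Region(100, 50, 900, 110)),
    MappedLine("l2", "second transcribed line", Region(100, 120, 900, 180)),
]
```

A linear scan is enough for the few dozen lines on a folio side; a viewer handling many regions per image might prefer a spatial index.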

In addition to the challenges presented by individual manuscripts, other digital projects have explored
the challenges of managing multiple manuscripts of the same text. The Roman de la Rose 397 Digital
Library (RRDL), a joint project of the Sheridan Libraries of Johns Hopkins University and the
Bibliothèque Nationale de France (BnF), seeks ultimately to provide access to digital surrogates of all
of the manuscripts (more than 300) containing the Roman de la Rose poem. The creation of this digital
library was supported by the Mellon Foundation, and by the end of 2009 the website provided access to
digital surrogates of roughly 130 manuscripts through either a French or an English interface. The
website includes a list of all extant manuscripts as well as a collection spreadsheet that can be
sorted by various columns, for example alphabetically or by number of illustrations. Clicking on any
individual manuscript name links to a full codicological description 398 that also includes a link to
the digitized manuscript. Individual manuscripts can be read page by page in a special viewer with a
variety of other viewing options, such as viewing in full screen or zooming in on individual pages. 399
For several manuscripts, a transcription can also be viewed on screen at the same time as some
individual manuscript pages. Individual and citable URLs are provided for the codicological
descriptions 400 and for two-page views of the manuscripts within the special viewing application. 401

396 For more on this ISO standard, see http://public.ccsds.org/publications/archive/650x0b1.pdf
397 http://romandelarose.org/ - home

In addition to choosing manuscripts from the collection spreadsheet, users can select individual
manuscripts by browsing the collection by repository, common name, current location, date, origin,
type, number of illustrations or folios, and availability of transcription. Once a manuscript has been
selected, a user can choose to examine the codicological description, to view it in the “page turner”
application described above, or to simply browse the images in a special viewer. Each manuscript also
includes a full bibliography. Both a basic keyword search and an advanced search feature are available,
and the advanced search allows for multiple keyword searching within various fields (lines of verse,
rubric, illustration title, narrative sections, etc.).

With such a large number of digital surrogates available, one of the most significant opportunities
presented by the RRDL is the possibility of “cross manuscript comparative study.” To facilitate this,
the creators of this collection found it necessary to create a new text organizational structure called
narrative sections, as explained on the website: 402

Citation practice for the Roman de la Rose and most medieval texts have traditionally
referenced the currently accepted critical editions. This scholarly protocol inhibits the
cross-manuscript comparative study that the RRDL promotes. Since the number of lines for the work
varies from one manuscript to another, depending on interpolations or excisions, the narrative
mapping of the Roman de la Rose divides the text into reading segments instead of lines. This
means that comparable passages across different manuscript can be readily locatable, while
number of lines for each section facilitates tracking variations in section length from one
exemplar to another. The narrative-mapping protocol borrows from that used for classical texts,
where one cites not a page number or a given edition or translation but a segment of the text.

The narrative mapping was largely generated algorithmically but should be accurate to within one or
two columns of text. By selecting a narrative section, users can be taken to a list of image sections
for the different manuscripts that contain that section so they can compare the individual manuscript
sections themselves. The need to create a new canonical text structure independent of particular
scholarly editions or conventions to facilitate the citation, navigation, and use of manuscripts in a
digital environment was an issue also articulated by the creators of the CTS in terms of classical
texts, indicating that there are many similar digital challenges to be resolved across disciplines.

398 Manuscript descriptions have been encoded in TEI P5 (Stinson 2009).
399 http://romandelarose.org/#read;Douce195.156v.tif
400 http://romandelarose.org/#book;SeldenSupra57
401 http://romandelarose.org/#read;SeldenSupra57.013r.tif
402 http://romandelarose.org/#sections
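
The narrative sections described above for the RRDL work as an edition-independent citation scheme: a section identifier resolves to a different physical location, and possibly a different number of lines, in each manuscript. The Python sketch below shows one way such a mapping could be modeled. Everything in it is an assumption made for illustration: the names `SectionSpan`, `locate`, and `length_variation`, the section identifier, the manuscript labels, and the folio and line values are all invented and are not taken from the RRDL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionSpan:
    """Where one narrative section falls in one manuscript (hypothetical)."""
    manuscript: str
    start: str       # folio, side, and column where the section begins
    end: str         # folio, side, and column where the section ends
    line_count: int  # number of verse lines this manuscript gives the section


def locate(section_map, section_id):
    """List every manuscript location for a section, sorted by manuscript."""
    return sorted(section_map.get(section_id, []), key=lambda s: s.manuscript)


def length_variation(section_map, section_id):
    """Extra lines each manuscript has relative to the shortest witness."""
    counts = {s.manuscript: s.line_count for s in section_map.get(section_id, [])}
    if not counts:
        return {}
    shortest = min(counts.values())
    return {ms: n - shortest for ms, n in counts.items()}


# Invented sample data: one section as it appears in two manuscripts.
SECTION_MAP = {
    "S1": [
        SectionSpan("MS-B", start="003v.a", end="004r.b", line_count=132),
        SectionSpan("MS-A", start="002r.a", end="002v.b", line_count=128),
    ],
}
```

Recording a line count per span mirrors the point in the quoted passage that section lengths are tracked so that interpolations and excisions become visible from one exemplar to another.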



Another significant manuscript project is Parker on the Web, 403 a multiyear project of Corpus Christi
College, Stanford University Libraries, and Cambridge University Library, to create high-resolution
digital images of almost all the manuscripts in the Parker Library. This project has built an
“interactive web application” to allow users to examine manuscripts within the “context of supporting
descriptive material and bibliography.” There are more than 550 manuscripts described on this site, and
almost all of them were numbered and cataloged by M. R. James in his 1912 publication (James 1912). The
online collection also includes some volumes the library received after the publication of the James
catalog. Limited free access to the collection is provided; full access is available by subscription.

Stinson (2009) explored the digitization of these two projects and their consequent effects upon
manuscript studies and codicology in particular. He noted that in the RRDL, all digital surrogates were
connected to codicological descriptions since many important features of manuscripts as physical
books can be lost when represented in digital form. In addition, since no comprehensive catalog or
reference work existed that contained either descriptions or a full list of all the Rose manuscripts,
the project team wrote many of these descriptions themselves. In contrast, Parker on the Web was able
to create marked-up descriptions of the entries for manuscripts in the M. R. James catalog. This very
process, however, led them to some important conclusions:

Yet in marking up both sets of descriptions—one custom made for the web, the other a
digitized version of a printed reference work—for inclusion in digital libraries, and in designing
and implementing interfaces for accessing XML-encoded descriptions and the surrogates to
which they are linked, it has become apparent that in digital form the relationship of
codicological descriptions to the books they describe has, like the relationships of critical
editions to the texts they document and represent, undergone fundamental change (Stinson 2009).

The digital environment has changed codicological descriptions in three major ways, according to
Stinson: (1) new purposes and uses have been discovered for these descriptions, particularly in terms
of their specific and technical language; (2) the relationship between a codicological description and
codex has moved from a one-to-one to a one-to-many relationship between “codices, descriptions,
metadata and digital images”; and (3) where once books were used to study other books, digital tools
are now being used to represent and analyze books. These insights also emphasize the larger
realization that when printed reference works are digitized—particularly when the knowledge they
contain is marked up in a meaningful way—they can take on whole new roles in the digital world. 404

Stinson described how codicological descriptions were typically created by experts using a formalized
vocabulary to summarize dates, origins, owners, and contents of books, among other items, and how
these descriptions were used either by visitors to a library who wanted to use a manuscript or by
scholars studying the manuscript remotely. In digital libraries, however, Stinson argued that digital
images of codices serve as the “machine readable forms of the original artifacts” and that “XML
encoded codicological descriptions are the secondary information used to describe, analyze and
interpret these artifacts.” While codicological descriptions are still needed for the dissemination of
specialized knowledge (such as for the palaeographical and literary histories of individual
manuscripts), Stinson argued that their purposes of physical description and of providing information
to remote scholars have evolved in a digital environment. Physical description can still be important
since digital repositories often misname files rather than making mistakes in foliation and pagination,
and “a break in a digital codex might as easily be the result of a lost file as a lost leaf in the
physical book it represents.” Even more important, the extensive descriptive information once intended
to aid remote scholars now provides new means for “sorting, classifying and comparing collections of
manuscripts.” Although 17,000-word transcriptions, Stinson admits, cannot easily be put into a
relational database, specific information can be extracted from them:

… the precision and specificity of the language of codicological descriptions, developed to
convey a substantial amount of information in a small space (a necessity in print reference
works if one wishes to avoid prohibitive cost and unwieldy volumes) now facilitates databases
that provide highly flexible, searchable, and sortable relationships between the original artifacts
(Stinson 2009).

403 http://parkerweb.stanford.edu/parker/actions/page.doforward=home
404 For further consideration of how digitized historical reference works can be used in new ways, see
Crane and Jones (2006) and Gelernter and Lesk (2008).

In fact, as described above, the RRDL provides access to a complete database created from much of
the codicological information, and it can be viewed online, downloaded as a spreadsheet, or used to
search or sort this information “across the entire corpus of manuscript descriptions.” 405
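
As a rough illustration of how XML-encoded descriptions can feed a sortable table of this kind, the Python sketch below pulls a few fields out of TEI-style manuscript descriptions and sorts the resulting rows. It is a sketch under stated assumptions, not the RRDL's implementation. The element names (`msDesc`, `msIdentifier`, `repository`, `idno`, `origDate`, `decoNote`) exist in TEI P5, but the sample records are invented, and counting `decoNote` elements as "illustrations" is a simplifying assumption; `summarize` and `build_table` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def summarize(ms_desc_xml):
    """Reduce one manuscript description to a flat, sortable row."""
    root = ET.fromstring(ms_desc_xml)
    orig_date = root.find(".//tei:origDate", TEI_NS)
    not_before = orig_date.get("notBefore") if orig_date is not None else None
    return {
        "repository": root.findtext(
            "tei:msIdentifier/tei:repository", default="", namespaces=TEI_NS),
        "shelfmark": root.findtext(
            "tei:msIdentifier/tei:idno", default="", namespaces=TEI_NS),
        "earliest_year": int(not_before) if not_before else None,
        # Assumption: one decoNote per illustration.
        "illustrations": len(root.findall(".//tei:decoNote", TEI_NS)),
    }


def build_table(descriptions, sort_key):
    """Build rows from all descriptions, sorted by one column (None last)."""
    rows = [summarize(xml_text) for xml_text in descriptions]
    return sorted(rows, key=lambda row: (row[sort_key] is None, row[sort_key]))


# Invented sample records; they describe no real manuscript.
DESCRIPTIONS = [
    '<msDesc xmlns="http://www.tei-c.org/ns/1.0">'
    '<msIdentifier><repository>Library B</repository><idno>MS 2</idno></msIdentifier>'
    '<physDesc><decoDesc><decoNote/><decoNote/><decoNote/></decoDesc></physDesc>'
    '<history><origin><origDate notBefore="1400"/></origin></history>'
    '</msDesc>',
    '<msDesc xmlns="http://www.tei-c.org/ns/1.0">'
    '<msIdentifier><repository>Library A</repository><idno>MS 1</idno></msIdentifier>'
    '<physDesc><decoDesc><decoNote/></decoDesc></physDesc>'
    '<history><origin><origDate notBefore="1350"/></origin></history>'
    '</msDesc>',
]

table = build_table(DESCRIPTIONS, "illustrations")
```

Calling `build_table(DESCRIPTIONS, "repository")` or `build_table(DESCRIPTIONS, "earliest_year")` re-sorts the same rows by another column, which is the kind of flexibility the passage attributes to descriptions marked up with defined data categories.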

The second major change Stinson listed was how the new many-to-many relationship between a
codicological description, its codex, and the images that constitute the digital surrogate has created a
series of complex relationships that must be represented in a digital library environment. Codicological
descriptions in a digital environment can be hyperlinked not just to digital images of the codex itself
but to digitized items listed in its bibliography, biographies of illustrators, and, indeed, to any
related scholarly information that is available online. These codicological descriptions then not only
continue to serve as guides to the printed codices and their digital surrogates but also, because they
have been marked up in XML with defined data categories, can be used to create databases and in
combination serve “as a large searchable ‘meta-manuscript’ that contains combined data from numerous
physical codices and thousands of digital images” (Stinson 2009).

The final insight offered by Stinson underscored how the digital environment has expanded the potential
of printed reference works that were once used solely to study other books, for these reference works
can now be turned into digital tools that provide much more sophisticated opportunities for analysis.
He asserted that printed codicological descriptions, such as those found in the James catalog, suffer
from the same challenges as many printed critical editions; in particular, they include a large number
of “abbreviated and coded forms” known only to experts. 406 Such abbreviated forms were used as
space-saving devices that are no longer necessary in a digital environment, and Stinson thus insists
that much of the data embedded in codicological descriptions “lies latent” until it is “unleashed” by
digitization. At the same time, he argued that the digitization of manuscripts and their codicological
descriptions offered a new opportunity to move beyond simple digital incunabula: 407

The rubrication, historiated initials, and foliated borders of incunables remind us that in the<br />

early days of print the concept of what a book should be was dominated by the manuscript<br />

codex. During recent centuries, the opposite is true; descriptions of manuscript books bear<br />

witness to the dominance of printing in forming our collective notion of what a book should be.<br />

… As we seek to liberate our codicological descriptions from the constraints of “being<br />

405 http://romandelarose.org/#data<br />

406 Bodard (2008) and Roueché (2009) have observed similar opportunities of expanding specialist abbreviations in the digital editions of inscriptions, and<br />

Rydberg-Cox (2009) has described the challenges of digitizing abbreviated texts found within incunabula.<br />

407 The need to move beyond “digital incunabula” has been articulated in Crane et al. (2006).



compelled to operate in a bookish format,” we should also bear in mind the opportunity to<br />

correct the assumption that such books operate—and should be described—in parallel with<br />

printed books. Both our tools and our mindsets need to be liberated from print if we are to<br />

achieve accurate representations of artifacts that were produced before the advent of printing<br />

(Stinson 2009).<br />

The need to go beyond traditional printed models and use the digital environment both to more<br />

accurately represent cultural objects and artifacts and to “unleash” their latent semantic potential,<br />

whether they are primary texts, archaeological monuments in ruins, inscriptions on stone, or medieval<br />

manuscripts, is a theme that has recurred throughout this review.<br />

Digital Manuscripts, Infrastructure, and Automatic Linking Technologies<br />

As illustrated by the previous sections, any digital infrastructure designed for manuscripts will need to<br />

address the complicated nature of manuscripts as both physical and digital objects, support a range of<br />

scholarly uses, and provide effective access to all of the data created in their digitization (e.g., digital<br />

images, diplomatic transcriptions, TEI-XML editions, scholarly annotations). One challenge is that<br />

while there are many images of digital manuscripts available online, many are viewable only through<br />

special image-viewers and thus often do not have stable URLs that can be cited. Furthermore, many of<br />

these digitized manuscripts do not have individual URLs for each page; consequently, linking to a<br />

specific page, let alone an individual line or word, is impossible. Similarly, it is often difficult to<br />

determine whether a digitized manuscript has a transcription available, and even if one has been<br />

created, it is often even more difficult for a user to gain access to it.<br />

Two projects that are seeking to create some of the necessary infrastructure for manuscripts that will<br />

address some of these issues are the Interedition project and the Virtual Manuscript Room (VMR). The<br />

Interedition project is focused on developing a “supranational networked infrastructure for digital<br />

scholarly editing and analysis.” As the creation of digital editions, particularly in classical and<br />

medieval studies, typically involves the use of multiple manuscripts, their draft architecture addresses<br />

the need to represent multiple manuscripts and link to both individual manuscript images and<br />

transcriptions. 408<br />

While Interedition is focused on the larger infrastructure required for digital editions, the VMR, 409<br />

which is currently in its first phase, is concentrating on providing advanced access to an important<br />

collection of manuscripts. They have provided access to fully digitized manuscripts from the Mingana<br />

Collection of Middle Eastern Manuscripts at the University of Birmingham. Each digitized manuscript<br />

includes high-resolution images of each page and descriptions from both the printed catalog and the<br />

special collections department that holds them. In their next phase, they will add more content,<br />

including 50,000 digital manuscript images, 500 manuscript descriptions, and 1,000 transcription<br />

pages. Even more important, however, the next phase of the VMR’s work will involve the<br />

development of a framework for digital manuscripts that will:<br />

…bring together digital resources related to manuscript materials (digital images, descriptions<br />

and other metadata, transcripts) in an environment which will permit libraries to add images,<br />

scholars to add and edit metadata and transcripts online, and users to access material<br />

(http://vmr.bham.ac.uk/about/).<br />

408 http://www.interedition.eu/wiki/index.php/WG2:Architecture<br />

409 http://vmr.bham.ac.uk/



As part of this phase, the VMR at the University of Birmingham 410 also plans to join with a parallel<br />

VMR being built at the University of Münster in Germany and provide seamless access to both<br />

collections. Four features distinguish the VMR from previous manuscript digitization projects: (1) it<br />

is designed around granular metadata, so instead of simply presenting metadata records for whole<br />

manuscripts, records are provided for each page image, for the transcription of the text on that page,<br />

and for specifying what text is on that page; (2) “the metadata states the exact resource type associated<br />

with the URL specified in each record” (e.g., if a text file is in XML and what schema has been used);<br />

(3) all VMR materials will be stored in Birmingham’s institutional repository and be accessible<br />

through the library online public access catalog (OPAC); and (4) the VMR will support full reuse of its<br />

materials, not just access to them.<br />
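
The granular, typed records described in this paragraph might be modelled roughly as follows. This is a minimal sketch and not the VMR's actual schema: the class `ResourceRecord`, its field names, the resource-type labels, and the example.org URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ResourceRecord:
    """One metadata record per resource (hypothetical field names)."""
    manuscript_id: str
    page: str                     # folio or page label, e.g. "1r"
    resource_type: str            # "image", "transcription", or "text-reference"
    url: str                      # stable, citable URL for this one resource
    media_format: str             # e.g. "image/jpeg" or "application/xml"
    schema: Optional[str] = None  # for XML files: which schema was used


def records_for_page(records: List[ResourceRecord],
                     manuscript_id: str, page: str) -> List[ResourceRecord]:
    """Return every record that describes a single page of a single manuscript."""
    return [r for r in records
            if r.manuscript_id == manuscript_id and r.page == page]


# Illustrative data only; identifiers and URLs are invented.
records = [
    ResourceRecord("ms-001", "1r", "image",
                   "http://example.org/ms-001/1r.jpg", "image/jpeg"),
    ResourceRecord("ms-001", "1r", "transcription",
                   "http://example.org/ms-001/1r.xml", "application/xml",
                   schema="TEI P5"),
    ResourceRecord("ms-001", "1v", "image",
                   "http://example.org/ms-001/1v.jpg", "image/jpeg"),
]

page_1r = records_for_page(records, "ms-001", "1r")
```

Because each record carries its own URL and resource type, a client can link directly to one page image or one page transcription without going through a whole-manuscript record.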

This fourth feature is perhaps the most distinctive, for as seen in the survey of projects in this section, the<br />

focus of much manuscript-digitization work has often been on supporting the discovery of digital<br />

manuscripts for use online rather than on the ability of scholars to get access to the raw digital<br />

materials and use them in their own projects. The VMR plans to provide access to all the metadata they<br />

create through a syndicated RSS feed so that users can create their own interfaces to VMR data. In<br />

addition, they plan to allow other users to add material to the VMR by creating a “metadata record for<br />

the resource following VMR protocols” and then adding it to the RSS feed of any VMR project. The<br />

importance of supporting new collaboration models that allow many individuals to contribute related<br />

digital manuscript resources has also been discussed in Robinson (2009, 2010).<br />

While the amount of metadata about manuscripts, as well as digital images and transcriptions of<br />

manuscripts, that has become available online has increased, there are still few easy ways to link<br />

between them if they exist in different collections. A related problem is the limited ability to at least<br />

partially automate the linking of manuscript images with their transcriptions, even if both are known to<br />

exist. Arianna Ciula has argued that the work of palaeographers would greatly benefit from descriptive<br />

encoding or technology that supported more sophisticated linking between images and texts,<br />

particularly “the possibility to export the association between descriptions of specific palaeographical<br />

properties and the coordinates within a manuscript image in a standard format such as the encoding<br />

proposed by the TEI facsimile module or SVG” (Ciula 2009).<br />
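
To make the idea of associating text with image coordinates concrete, the following sketch builds a small fragment in the style of the TEI P5 facsimile module, where rectangular `zone` elements carry pixel coordinates and a text element points at a zone through its `facs` attribute. It is an illustration under stated assumptions, not output from any project discussed here: the helper names, the file name, and the coordinates are invented, and the TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the attribute "xml:id".
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def build_facsimile(image_url, zones):
    """Build <facsimile><surface><graphic/><zone/>...</surface></facsimile>.

    `zones` maps a zone identifier to (ulx, uly, lrx, lry): the upper-left
    and lower-right corners of a rectangle on the page image.
    """
    facsimile = ET.Element("facsimile")
    surface = ET.SubElement(facsimile, "surface")
    ET.SubElement(surface, "graphic", {"url": image_url})
    for zone_id, (ulx, uly, lrx, lry) in zones.items():
        ET.SubElement(surface, "zone", {
            XML_ID: zone_id,
            "ulx": str(ulx), "uly": str(uly),
            "lrx": str(lrx), "lry": str(lry),
        })
    return facsimile


def word_with_facs(text, zone_id):
    """Return a <w> element whose facs attribute points at the given zone."""
    word = ET.Element("w", {"facs": "#" + zone_id})
    word.text = text
    return word


facsimile = build_facsimile("page1.jpg", {"z1": (10, 20, 110, 40)})
word = word_with_facs("arma", "z1")
print(ET.tostring(facsimile, encoding="unicode"))
print(ET.tostring(word, encoding="unicode"))
```

The pointer from the word to the zone is what lets a viewer highlight the region of the image that corresponds to a word in the transcription, and the reverse.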

Recently Hugh Cayless has developed a series of tools and techniques to assist in this process that have<br />

been grouped under the name img2XML 411 and have been described in detail in Cayless (2008, 2009).<br />

As has been previously discussed by Monella (2008) and Boschetti (2009), Cayless noted that<br />

manuscript transcriptions are typically published in one of two formats, either as a critical edition<br />

where the editors’ comments are included as an integral part of the text or as a diplomatic transcription<br />

that tries to “faithfully reproduce the text” (Cayless 2009). While TEI allows the production of both<br />

types of transcriptions from the same marked-up text, Cayless argued that the next important step is to<br />

automatically link such transcriptions to their page images. While many systems link manuscript<br />

images and transcriptions on the page level, 412 the work of Cayless sought to support more granular<br />

linking, such as at the level of individual lines or even words.<br />

410 The VMR at Birmingham has been funded by JISC and is being created by the Institute of Textual Scholarship and Electronic Editing (ITSEE),<br />

http://www.itsee.bham.ac.uk/<br />

411 http://github.com/hcayless/img2xml<br />

412 One such system is EPPT (discussed earlier in this paper), and another tool listed by Cayless is the Image Markup Tool (IMT),<br />

http://www.tapor.uvic.ca/%7Emholmes/image_markup/index.php, which allows a user to annotate “rectangular sections of an image” by using a drawing<br />

tool with which they can first “draw shape overlays on an image”; these overlays can then be linked to “text annotations entered by the user”<br />

(Cayless 2009).



Cayless thus developed a method for generating a “Scalable Vector Graphics (SVG) 413 representation<br />

of the text in an image of a manuscript” (Cayless 2009). This work was inspired by experiments<br />

conducted using the OpenLayers 414 JavaScript library by Tom Elliott and Sean Gillies to trace the text<br />

on a sample inscription, 415 and Cayless sought to create a “toolchain” that used only open-source<br />

software. To begin, Cayless converted JPEG images of manuscripts into a bitmap format using<br />

ImageMagick; 416 he then used an open-source tool called Potrace 417 to convert the bitmap to SVG.<br />

The SVG conversion process required some manual intervention, and an SVG editor called Inkscape<br />

was used to clean up the resulting SVG files. The resulting SVG documents were analyzed using a<br />

Python script that attempted to “detect lines in the image and organize paths within those lines into<br />

groups within the document” (Cayless 2008).<br />
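
The first two steps of this toolchain (JPEG to bitmap, bitmap to SVG) can be sketched as a pair of command lines assembled in Python. This is a hedged reconstruction and not Cayless's actual script: the function names, the PBM intermediate format, and the 50% threshold are assumptions. The sketch uses ImageMagick's legacy `convert` command with its `-threshold` option (ImageMagick 7 installs the program as `magick`) and Potrace's `--svg` and `--output` options.

```python
import subprocess
from pathlib import Path


def toolchain_commands(jpeg_path, threshold_percent=50):
    """Return the two command lines: JPEG -> bitmap, then bitmap -> SVG.

    The threshold is the black/white cutoff applied when the page image is
    reduced to a one-bit bitmap; 50 is only a placeholder default.
    """
    jpeg = Path(jpeg_path)
    bitmap = jpeg.with_suffix(".pbm")  # Potrace reads PBM bitmaps
    svg = jpeg.with_suffix(".svg")
    to_bitmap = ["convert", str(jpeg),
                 "-threshold", f"{threshold_percent}%", str(bitmap)]
    to_svg = ["potrace", "--svg", "--output", str(svg), str(bitmap)]
    return [to_bitmap, to_svg]


def run_toolchain(jpeg_path, threshold_percent=50):
    """Run both steps; requires ImageMagick and Potrace to be installed."""
    for command in toolchain_commands(jpeg_path, threshold_percent):
        subprocess.run(command, check=True)


commands = toolchain_commands("page.jpg", threshold_percent=60)
for command in commands:
    print(" ".join(command))
```

Keeping command construction separate from execution makes it easy to inspect or adjust the cutoff for each page before running anything, which matters because a suitable threshold varies with the condition of the manuscript image.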

After the text image within a larger manuscript page image had been converted into SVG paths, these<br />

paths could be grouped within the document to mark the words therein, and these groups could then be<br />

linked using various methods to tokenized versions of the transcriptions (Cayless 2009). Cayless then<br />

used the OpenLayers library to simultaneously display the linked manuscript image and TEI<br />

transcription, for importantly, OpenLayers “allows the insertion of a single image as a base layer<br />

(though it supports tiled images as well), so it is quite simple to insert a page image into it” (Cayless<br />

2008). This initial system also required the addition of several functions to the OpenLayers library,<br />

particularly the ability to support “paths and groups of paths.” Ultimately, Cayless reported that:<br />

The experiments outlined above prove that it is feasible to go from a page image with a TEI-based<br />

transcription to an online display in which the image can be panned and zoomed, and the<br />

text on the page can be linked to the transcription (and vice-versa). The steps in the process that<br />

have not yet been fully automated are the selection of a black/white cutoff for the page image,<br />

the decision of what percentage of vertical overlap to use in recognizing that two paths are<br />

members of the same line, and the need for line beginning (&lt;lb/&gt;) tags to be inserted into the<br />

TEI transcription (Cayless 2008).<br />
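
The "percentage of vertical overlap" decision mentioned in this passage can be illustrated with a small grouping routine. This is a sketch of the general technique only and should not be read as Cayless's script: each traced path is reduced to a hypothetical `(top, bottom)` pair of y-coordinates, and the 0.5 default cutoff is an arbitrary placeholder for the value that, according to the passage, still had to be chosen by hand.

```python
def vertical_overlap(a, b):
    """Fraction of the shorter box's height that boxes a and b share.

    Each box is a (top, bottom) pair of y-coordinates with top < bottom.
    """
    shared = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    shorter = min(a[1] - a[0], b[1] - b[0])
    return shared / shorter if shorter > 0 else 0.0


def group_into_lines(boxes, min_overlap=0.5):
    """Group path bounding boxes into text lines by vertical overlap.

    Boxes are visited from top to bottom. A box joins the first existing line
    whose vertical extent it overlaps by at least `min_overlap`; otherwise it
    starts a new line.
    """
    lines = []
    for box in sorted(boxes):
        for line in lines:
            if vertical_overlap(line["extent"], box) >= min_overlap:
                line["members"].append(box)
                # Widen the line's extent to cover the newly added box.
                line["extent"] = (min(line["extent"][0], box[0]),
                                  max(line["extent"][1], box[1]))
                break
        else:
            lines.append({"extent": box, "members": [box]})
    return [line["members"] for line in lines]


# Four invented path boxes that fall into two text lines.
boxes = [(30, 40), (0, 10), (2, 11), (31, 42)]
lines = group_into_lines(boxes)
print(lines)
```

A low cutoff risks merging adjacent lines when ascenders and descenders intrude on each other, while a high cutoff can split one line into several, which suggests why a single automatic value is hard to fix across manuscripts.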

While automatic analysis of the SVG output has supported the detection of lines of text in page<br />

images, work continues on the automatic detection of words or other features in the image.<br />

Cayless concluded that this research raised two issues. To begin with, further research would need to<br />

consider what structures (beyond lines) could be detected in an SVG document and how they could be<br />

linked to transcriptions. Second, TEI transcriptions often define document structure in a “semantic”<br />

rather than physical way, and even though line, word, and letter segments can be marked in TEI, they<br />

often are not. This makes it difficult, if not impossible, to automate the linking process. Cayless<br />

proposed that a standard would need to be developed for this type of linking.<br />

Other experiments in automatic linking of images and transcriptions have been conducted by the TILE<br />

project. 418 This project seeks to build a “web-based image markup tool” 419 and is based on the existing<br />

code of the Ajax XML (AXE) image encoder. 420 It will be interoperable with both the EPPT and the<br />

413 SVG is “a language for describing two-dimensional graphics and graphical applications in XML.” http://www.w3.org/Graphics/SVG/<br />

414 http://trac.openlayers.org/wiki/Release/2.6/Notes<br />

415 http://sgillies.net/blog/691/digitizing-ancient-inscriptions-with-openlayers<br />

416 http://www.imagemagick.org/<br />

417 http://potrace.sourceforge.net/<br />

418 This project’s approach to digital editions was discussed earlier in this paper.<br />

419 An initial release of TILE 0.9 is now available for download at (http://mith.umd.edu/tile/) including extensive step-by-step documentation<br />

http://mith.info/tile/documentation/ and a forum for users. This initial version includes an image markup tool, importing and exporting tools, and a semiautomated<br />

line recognizer. There is also a TILE sandbox (http://mith.umd.edu/tile/sandbox/), a “MITH-hosted version of TILE allowing users to try the<br />

tool before installing their own copy.”<br />

420 http://mith.info/AXE/



IMT and be “capable of producing TEI-compliant XML for linking image to text.” Like Cayless,<br />

the TILE project wants to support linking beyond the page level, such as the ability “to<br />

link from a word in the edited text to its location in the image” or to “click an interesting area in the<br />

image to read an annotation” (Porter et al. 2009). While Porter et al. recognized that a number of<br />

tools 421 allowed users to edit or display images within the larger context of creating digital editions,<br />

they found that none of these tools contained all the functionality they desired.<br />

Of all of the tools they mention, Porter et al. stated that only the IMT outputs complete and valid TEI<br />

P5 XML, but it runs only on Windows machines. While TILE will interoperate with the “constrained<br />

IMT TEI format,” it will also provide output in a variety of formats. A recent blog entry by Dorothy<br />

Porter listed these formats as including “any flavour” of TEI, METS 422 files, and output that is not in<br />

XML. “One result of this flexibility is that, again unlike the IMT, TILE will not be ‘plug and play,’<br />

and processing of the output will be the responsibility of projects using the software,” Porter<br />

acknowledged. “This will require a bit of work on the part of users. On the other hand, as a modular set<br />

of tools, TILE will be able to be incorporated into other digital editing software suites that would<br />

otherwise have to design their own text-image linking functionality or go without” (Porter 2010).<br />

AXE enabled the collaborative tagging of TEI texts, the association of XML with “time stamps in<br />

video or audio files,” and the marking of image regions that could then be linked to external metadata,<br />

and TILE will extend these functionalities. One significant issue with AXE was that while it did allow<br />

users to annotate image regions and store those coordinates in a database, it did not provide any data<br />

analysis tools for this information. The most significant way in which TILE will extend AXE, then, is<br />

that it will support:<br />

Semi-automated creation of links between transcriptions and images of the materials from<br />

which the transcriptions were made. Using a form of optical character recognition, our software<br />

will recognize words in a page image and link them to a preexisting textual transcription<br />

(Porter et al. 2009).<br />

As with the research of Cayless, the principal goal of this work is to be able to support the linking of<br />

manuscript transcriptions and images at the individual word level. Some other intended functionalities<br />

include image annotation with controlled vocabularies, the creation of editorial annotations, 423 and the<br />

creation of links between “different, non-contiguous areas of primary source images” such as captions<br />

and illustrations or “analogous texts across different manuscripts.”<br />
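
The semi-automated, word-level linking described above can be pictured with a minimal sketch. The Python below is purely illustrative and is not TILE's implementation: the OcrWord class, the link_words function, and the pixel coordinates are assumptions introduced here, and a standard sequence alignment (Python's difflib) stands in for whatever matching TILE actually uses. It aligns words recognized in a page image with a preexisting transcription and records, for each matched word, the image region it occupies.<br />

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class OcrWord:
    """A word recognized in a page image, with its bounding box in pixels (hypothetical model)."""
    text: str
    x: int
    y: int
    width: int
    height: int


def link_words(transcription: list[str], ocr_words: list[OcrWord]) -> dict[int, OcrWord]:
    """Map transcription word positions to the image regions of matching OCR words.

    Only words that align exactly (ignoring case) are linked. Everything else
    is left unlinked so that an editor can resolve it by hand, which is the
    "semi" in semi-automated.
    """
    expected = [word.lower() for word in transcription]
    recognized = [word.text.lower() for word in ocr_words]
    matcher = SequenceMatcher(a=expected, b=recognized, autojunk=False)

    links: dict[int, OcrWord] = {}
    for block in matcher.get_matching_blocks():
        # Each block is a run of `size` identical words starting at
        # position block.a in the transcription and block.b in the OCR output.
        for offset in range(block.size):
            links[block.a + offset] = ocr_words[block.b + offset]
    return links


if __name__ == "__main__":
    # Invented sample data: the OCR misreads the second word.
    transcription = ["arma", "virumque", "cano"]
    ocr_words = [
        OcrWord("Arma", 10, 5, 40, 12),
        OcrWord("uirumque", 55, 5, 70, 12),
        OcrWord("cano", 130, 5, 38, 12),
    ]
    for position, region in sorted(link_words(transcription, ocr_words).items()):
        print(transcription[position], "->", (region.x, region.y, region.width, region.height))
```

In this sketch the misread word simply stays unlinked. A real tool would presumably need fuzzier matching and some way to serialize the resulting links (for instance as TEI), neither of which is attempted here.<br />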

Numismatics<br />

Numismatics has been defined as “the collection and study of money (and coins in particular).” 424 It is<br />

one of the most popular classics topics in terms of academic, commercial, and enthusiast sites<br />

online. 425 In fact, according to Sebastian Heath (Heath 2010), any discussion of numismatics online<br />

421 Among this list were Juxta (http://www.nines.org/tools/juxta.html), developed by the NINES project, which is typically used to compare two<br />

documents but connects images and text only at the page level, and the Versioning Machine (http://v-machine.org/), a tool with some of the same<br />

basic functionality as Juxta, but again one that supports the linking of texts and images only at the page level.<br />

422 METS stands for “Metadata Encoding & Transmission Standard.” The standard has been created by the Library of Congress “for encoding descriptive,<br />

administrative, and structural metadata regarding objects within a digital library” (http://www.loc.gov/standards/mets/). METS is expressed using XML<br />

and has been used by many digital library projects.<br />

423 Other research has also explored the creation of annotation technologies for digital manuscript collections and the ability to share them; see, for<br />

example, Doumat et al. (2008), who examined storing user annotations in a collaborative workspace so that they could be used in a recommender system<br />

for other manuscript users.<br />

424 http://wordnetweb.princeton.edu/perl/webwn?s=numismatics<br />

425 For an example of an excellent website created by an enthusiast, see http://www.snible.org/coins/, and in particular the “Digital Historia Numorum: A<br />

Manual of Greek Numismatics” (http://www.snible.org/coins/hn/), a typed-in version of the 1911 edition of the “Historia Numorum” by Barclay Head. Ed



must consider commercial and enthusiast websites as they often provide the most information online<br />

regarding coins, and he also asserted that some commercial enterprises were more open with their data<br />

than academic ones. In addition, Heath’s brief informal survey of the findability of academic<br />

numismatics sites using Google illustrated that “commercial and personal sources dominate the<br />

discipline of ancient numismatics as presented by Google” (Heath 2010, 41). Nonetheless, this<br />

subsection focuses on nonprofit organizations and academic digital projects in numismatics and<br />

outlines some issues that will need to be addressed to create a digital infrastructure for this discipline.<br />

Numismatics Databases<br />

One of the largest organizations dedicated to the field of numismatics is the American Numismatic<br />

Society (ANS). 426 The ANS has perhaps the largest numismatics database available online (over<br />

600,000 items) and provides access to a searchable database of “coins, paper money, tokens,<br />

‘primitive’ money, medals and decorations” from all over the world. In April 2011, the ANS released a<br />

new version of their database entitled “MANTIS: A Numismatic Technologies Integration Service.” 427<br />

This database includes extensive images of coins from the ancient world (Hellenistic Greece and the<br />

Roman Republican Period in particular), and includes a useful feature where searching/browsing will<br />

only retrieve records/objects that have images. MANTIS can be either browsed or searched, and for a<br />

quick start a user can select a coin image from one of the ANS departments including Greek, Roman,<br />

Byzantine, Islamic, East Asian, Medieval, etc. While the entire database can be browsed at once, a<br />

variety of browsing facets can be chosen to create filtered results sets including artist, authority,<br />

category, century, deity, denomination, dynasty, findspot, material, object type, person, etc. While a<br />

“quick search” of the entire database can be conducted, an advanced searching option is also available.<br />

The user can combine search terms by choosing a variety of search fields including keyword, century,<br />

color, denomination, geographical location, legend, reference, subject and type. The Greek, Roman<br />

and U.S. collections can be searched through a map interface. Many numismatic object records include<br />

one or more digital images and descriptive information such as object type, material, weight,<br />

denomination, region, and person illustrated on the coin, and also provide stable URLs for linking. 428 In<br />

addition, each object record can be shared using a variety of social bookmarking sites and includes<br />

downloadable metadata in the form of KML, Atom, RDF and XML. 429<br />
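
The stable URL pattern can be illustrated with a small sketch. It assumes only what the example record cited in the notes shows: an object's address is the collection base URL followed by its accession number, and appending “.xml” yields the XML metadata. The function names are hypothetical, and the addresses of the KML, Atom, and RDF serializations are not given in the source, so they are deliberately not guessed here.<br />

```python
# Base address taken from the example record cited in the notes.
COLLECTION_BASE = "http://numismatics.org/collection/"


def record_url(accession_number: str) -> str:
    """Return the stable, citable URL for one object record."""
    number = accession_number.strip()
    if not number:
        raise ValueError("an accession number is required")
    return COLLECTION_BASE + number


def xml_url(accession_number: str) -> str:
    """Return the URL of the record's XML metadata (the only suffix the source attests)."""
    return record_url(accession_number) + ".xml"


if __name__ == "__main__":
    print(record_url("1944.100.45344"))
    print(xml_url("1944.100.45344"))
```

Because the identifier is the only variable part of the address, another project can cite or harvest a record knowing nothing more than its accession number, which is what makes such URLs useful for linking.<br />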

Another significant online collection with a focus on Roman coins is Roman Provincial Coinage<br />

Online (RPC Online). 430 This website was created by the University of Oxford and is funded by the<br />

AHRC. While the current project is confined to coins from the Antonine period (AD 138–192), one<br />

goal of the RPC series project is to produce a “standard typology of the provincial coinage of the<br />

Roman Empire from its beginning in 44 B.C. to its end in AD 296/7” and a model for putting more<br />

collections online. The current database is based on 10 collections and includes information on “13,729<br />

coin types based on 46,725 specimens (9,061 of which have images).”<br />

RPC Online provides both a quick search of the whole collection and three specialized types of<br />

searches. The first specialized type is an identification search (“to identify a coin or find a standard<br />

reference”) where the user can search by city, obverse or reverse design, reverse inscription (includes a<br />

Snible created the HTML edition with help from some volunteers and also scanned the photographs in this collection. Another large website created by an<br />

individual enthusiast is “Magna Graecia Coins,” http://www.magnagraecia.nl/coins/index.html.<br />

426 http://numismatics.org/<br />

427 http://numismatics.org/search/<br />

428 See for example: http://numismatics.org/collection/1944.100.45344<br />

429 The XML for the above example can be found at: http://numismatics.org/collection/1944.100.45344.xml<br />

430 http://rpc.ashmus.ox.ac.uk/



Greek keyboard), metal, and diameter. The user can also select from a list of cities or features on the<br />

obverse or reverse design of the coin. The second specialized type is an iconographic search that<br />

examines the type of imagery used on coins; the user must first choose a design group (animals,<br />

architecture, deities, games, heroes, imperial family, object), then choose from a list of relevant<br />

options, and finally choose to search all provinces or an individual one (e.g., animals, then chimera, then<br />

Achaea). 431 The third type of specialized search is an advanced search that includes mint location,<br />

date, magistrate, design and inscriptions, physical characteristics, and coin reference. Rather than<br />

records of individual coins, this database contains records for individual coin types, with both<br />

thumbnail and high-resolution images and a full description. Geographic access is also provided to the<br />

collection through a Flash map 432 where users can either browse a map and choose a city or pick from<br />

a list of cities; selecting a city will take them to a list of matching coin types. This database<br />

provides numerous means of accessing its collections and is unique in that it supports searching of<br />

Greek inscriptions found on coins.<br />

The Sylloge Nummorum Graecorum (SNG) 433 is one of the larger Greek numismatics databases<br />

available online. This website is a British Academy Research Project and has the major purpose of<br />

publishing “illustrated catalogs of Greek coins in public and private collections in the British Isles.”<br />

SNG has retained a traditional broad definition of Greek “to include the coins produced by all ancient<br />

civilisations of the Mediterranean and neighbouring regions except Rome, though it does include the<br />

Roman Provincial series often known as 'Greek Imperials'.” While the SNG had traditionally focused<br />

on print publication, it has increasingly utilized electronic publication and has developed a relational<br />

database for this website that includes 25,000 coins from the SNG volumes. This database can be<br />

searched by a variety of fields, including collection, state, mint, material, ruler, period, denomination,<br />

hoard, and obverse or reverse coin description, among others. Records for individual coins include<br />

thumbnail images and full descriptive information. 434<br />

Another large online collection is the Numismatische Bilddatenbank Eichstätt, 435 which provides<br />

various means of access to a virtual library of coins from several German universities and museums,<br />

but particularly from the Catholic University of Eichstätt-Ingolstadt. The interface is available<br />

only in German, but the database includes more than 5,600 objects and offers various ways of<br />

accessing the coins. A user can conduct a full-text search of all the database fields, browse a list of all<br />

words found in the coin legends, or choose a controlled keyword from the thesaurus. In addition, a<br />

number of indexes to the collection can be browsed, including personal names, mint locations, dates,<br />

and collection. There are also a variety of ways to browse the entire collection, including a short list<br />

with images, a standard list of images and descriptions, a picture gallery, or a simple list of pictures.<br />

Individual coin records include high-resolution images, the name of the person on the coin, the date of<br />

the coin, and the collection it is from.<br />

Universities and colleges often hold small numismatic collections as well. 436 The University of<br />

Virginia Art Museum has recently digitized a collection of nearly 600 Greek and Roman coins. 437<br />

Interestingly, this entire collection was described using Encoded Archival Description (EAD), 438 a<br />

431 http://rpc.ashmus.ox.ac.uk/search/icono/?provinces=sel&province-1=Achaea&stype=icono&design_group=4&design-0=114&step=3&next=Finish<br />

432 http://rpc.ashmus.ox.ac.uk/maps/flash/<br />

433 http://www.sylloge-nummorum-graecorum.org/<br />

434 http://www.s110120695.websitehome.co.uk/SNG/sng_reply2a.php?verb=SNGuk_0300_3363<br />

435 http://www.ifaust.de/nbe/<br />

436 For example, see the Princeton University Numismatic Collection (http://www.princeton.edu/~rbsc/department/numismatics/)<br />

437 http://coins.lib.virginia.edu/<br />

438 http://www.loc.gov/ead/



standard developed for the description of archival finding aids. The project extended EAD with<br />

several specific adaptations for coins, for example to describe physical attributes such as iconography.<br />

According to the website, this project seems to be the first that has applied EAD to numismatics. They<br />

found EAD to be useful because it allowed them not only to describe the physical attributes of each<br />

coin but also to encode “administrative history, essays, and index terms” in XML and thus create<br />

sophisticated metadata for searching and browsing. Further technical and metadata details of their<br />

implementation have been explained in Gruber (2009). Gruber explained that one important step in<br />

creating EAD descriptions of the objects was that subject specialists normalized coin legends and<br />

personal and place names to support standardized searching of proper Latin names and abbreviations<br />

(e.g., a search for coins “minted in Greek Byzantium or Roman Constantinople” could thus be<br />

accomplished by searching for Istanbul). This encoding of personal names, geographic names, and<br />

deities, among others, was essential for “establishing authority lists for faceted browsing and<br />

normalization” and thus made for more sophisticated textual searching. Gruber also reported that they<br />

used Apache Solr, 439 an “open source search index based on the Lucene Java library” that includes a<br />

number of useful features such as hit highlighting and faceting.<br />
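
The role that normalized names play in searching and faceting can be shown with a brief sketch. It is a plain Python stand-in rather than the project's EAD encoding or its Solr configuration: the authority list, the record fields, and the function names are all assumptions made for illustration, and the sample records are invented. Variant place names are mapped to one authorized heading, so that a single search term retrieves coins recorded under any variant and facet counts are grouped under one label.<br />

```python
from collections import Counter

# Hypothetical authority list: each variant place name (lowercased) points
# to the single normalized heading used for searching and faceting.
PLACE_AUTHORITY = {
    "byzantium": "Istanbul",
    "constantinople": "Istanbul",
    "istanbul": "Istanbul",
}


def normalize_place(name: str) -> str:
    """Return the authorized heading for a place name, or the name itself if unlisted."""
    cleaned = name.strip()
    return PLACE_AUTHORITY.get(cleaned.lower(), cleaned)


def search_by_mint(records: list[dict], query: str) -> list[dict]:
    """Find records whose mint matches the query after both are normalized."""
    target = normalize_place(query)
    return [record for record in records if normalize_place(record["mint"]) == target]


def mint_facet(records: list[dict]) -> Counter:
    """Count records per normalized mint, as a faceted browsing interface would display."""
    return Counter(normalize_place(record["mint"]) for record in records)


if __name__ == "__main__":
    # Invented sample records, for illustration only.
    coins = [
        {"id": "coin-1", "mint": "Byzantium"},
        {"id": "coin-2", "mint": "Constantinople"},
        {"id": "coin-3", "mint": "Rome"},
    ]
    print([coin["id"] for coin in search_by_mint(coins, "Istanbul")])
    print(mint_facet(coins))
```

Doing the normalization once, when records are described, means the search and browse layers only ever compare authorized headings. This is presumably why the specialists' normalization work is described as a precondition for faceted browsing.<br />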

The database for the University of Virginia Art Museum Numismatic Collection can be searched or<br />

browsed. A basic search includes the ability to search multiple terms in a variety of fields (keyword,<br />

century, collection, deity, denomination, geographical location, iconography, etc.); an advanced search<br />

makes use of Lucene query syntax. This collection also offers a faceted browsing interface where coins<br />

can be browsed by different categories (city, collection, deity, material, name, etc.). Records for each<br />

coin include multiple images (including high-resolution ones), descriptive information, archival<br />

information, a bibliography, and a list of index terms such as personal names and subjects that have<br />

been linked to entries in Wikipedia and other digital classics resources (Livius.org, Theoi.com).<br />

Individual records have permanent URLs that are easy to cite, and a social bookmarking feature is<br />

available for each record. 440 One useful feature is that the entire record can be printed; a unique<br />

feature of this collection is the ability to compare two coins.<br />

Several smaller individual numismatic collections have been created as parts of museum exhibitions or<br />

educational resources. For example, “Bearers of Meaning: The Ottilia Buerger Collection of Ancient<br />

and Byzantine Coins” 441 includes a series of thematic essays and a catalog of almost 150 coins. Each<br />

catalog entry includes images, descriptive information, provenance, bibliography, and a descriptive<br />

entry. 442 While the “Bearers of Meaning” website was designed to accompany an educational exhibit,<br />

other museums have provided online access to their entire numismatic databases. The Bruce Brace<br />

Coin Collection 443 at McMaster University Museum of Art includes 272 Roman coins, a series of<br />

thematic tours, and a time line. The collection can be searched only by a general search box (with no<br />

limits). Records for individual coins include images, time line year, condition, location, provenance, a<br />

description and additional notes, and references to numismatic catalogs. One unique feature of this<br />

collection is the ability to zoom in on individual images in great detail using a tool called Zoomify.<br />

Another online exhibit is the “Coinage of Ephesus” at Macquarie University in Sydney. 444 This<br />

website includes a lengthy essay about the collection that is divided into chapters with links to the<br />

individual coins and an interactive gallery of coins (that requires Flash to view). Clicking on a coin<br />

439 http://lucene.apache.org/solr/<br />

440 For example, http://coins.lib.virginia.edu/display-uvaid=n1990_18_3<br />

441 http://www.lawrence.edu/dept/art/buerger/<br />

442 For example, see http://www.lawrence.edu/dept/art/buerger/catalogue/033.html<br />

443 http://tapor1.mcmaster.ca/~coins/index.php<br />

444 http://learn.mq.edu.au/webct/RelativeResourceManager/15043963001/Public Files/index.htm


brings up images of the coin along with basic descriptive information. There is no way, however, to<br />

search the collection of coins. Finally, another useful resource is the “Virtual Catalog of Roman Coins”<br />

(VCRC), 445 a website maintained by Robert W. Cape, Jr., associate professor of classics, Austin<br />

College, and “devoted to helping students and teachers learn more about ancient Roman coins.” This<br />

website contains coin images and descriptions from the early Roman Republic through the end of the<br />

fourth century AD. The VCRC can be searched either by a general keyword search or by coin issuer,<br />

obverse or reverse description, inscription, or contributor. Since the VCRC was designed as an<br />

educational resource, it also includes a list of student projects and teaching resources.<br />

Some numismatic research databases that once had individual websites have been archived by the<br />

ADS. One example is “Analysis of Roman Silver Coins: Augustus to Nero (27 B.C. – AD 69),” 446 a<br />

project conducted in 2005 by Matthew Ponting and Kevin Butcher at the University of Liverpool. The<br />

research database includes numismatic descriptions of coins and pictures, and can be queried by<br />

denomination, mint, emperor, hoard, or donor. This website illustrates the importance of access to<br />

digital preservation services for individual faculty research projects once they are completed.<br />

Numismatic Data Integration and Digital Publication<br />

As this overview indicates, there are numerous numismatic databases. Many have collections that<br />

overlap in time period and geography, but all provide varying levels of access through different types<br />

of database interfaces and use different schemas, often with a large number of fields or elements that<br />

describe the same data items differently from one database to the next.<br />

Some of the challenges of integrating such collections have been explored by D’Andrea and<br />

Niccolucci (2008). These authors examined data-harmonization efforts using the CIDOC-CRM<br />

ontology and described initial efforts to map three different numismatics databases to the<br />

CIDOC-CRM and to develop a “general numismatic reference model.” Similarly, the CLAROS project has<br />

used CIDOC-CRM to provide federated searching across various classical art databases.<br />

This lack of a common standard schema for numismatic databases is not surprising, as there is a similar<br />

lack of standards for both the cataloging and analysis of coins within printed publications in this field,<br />

according to a recent article by Kris Lockyear (Lockyear 2007). In his overview of the recording and<br />

analysis of Roman coins in Britain, Lockyear also criticized the English Heritage guidelines that had<br />

recently been released for describing coins. One of the major problems, Lockyear suggested, was the<br />

identification of coins, for without a “well-preserved genuine coin,” even extensive patience and use of<br />

the 10 volumes of Roman Imperial Coinage did not necessarily provide a scholar with a distinct<br />

catalog number along with a date range, place of manufacture, and denomination. Lockyear noted,<br />

however, that detailed analysis of sites required including not just well-identified coins but all the coins<br />

found, including those that were poorly preserved. To address this issue, a series of “coin-issue<br />

periods” was created so that coins could at least be assigned to a period and summary listings could be<br />

created of the coins at a site. There are currently two such schemes used in Britain, and conversion<br />

between them typically requires a full catalog of the coins found, Lockyear reported, but unfortunately,<br />

many printed publications include only partial catalogs of site finds.<br />
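
The idea of assigning every coin, however poorly preserved, to an issue period and then producing a summary listing for a site can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the period labels and date boundaries in `PERIODS` are invented placeholders and do not reproduce either of the two schemes used in Britain, and the field names and sample finds are likewise hypothetical.

```python
from collections import Counter

# Hypothetical period scheme: (label, first year AD, last year AD).
# These boundaries are placeholders, not a published scheme.
PERIODS = [("P1", 1, 100), ("P2", 101, 200), ("P3", 201, 300)]


def assign_period(date_from, date_to):
    """Return the period that contains the coin's whole date range, or None."""
    for label, start, end in PERIODS:
        if start <= date_from and date_to <= end:
            return label
    return None  # range straddles two periods or falls outside the scheme


def summary_list(coins):
    """Count coins per period; coins that cannot be assigned are counted, not dropped."""
    counts = Counter(
        assign_period(c["date_from"], c["date_to"]) or "unassigned" for c in coins
    )
    return dict(counts)


# Invented sample finds: a broad date range is often enough to place a coin in a period.
SITE_FINDS = [
    {"find": "SF1", "date_from": 10, "date_to": 50},
    {"find": "SF2", "date_from": 120, "date_to": 130},
    {"find": "SF3", "date_from": 90, "date_to": 110},
    {"find": "SF4", "date_from": 150, "date_to": 199},
]

print(summary_list(SITE_FINDS))
```

The sketch also hints at why conversion between schemes needs a full catalog: once only the per-period counts are published, the underlying date ranges required to reassign coins to a different set of boundaries are no longer available.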

Lockyear explained that the English Heritage guidelines suggested three levels of cataloging: a full<br />

catalog with detailed information; a shortened version of the full catalog; or a “spreadsheet” that is typically a<br />

summary of data by coin periods. The minimum data to be included was coin identification according<br />

445 http://vcrc.austincollege.edu/<br />

446 http://ads.ahds.ac.uk/catalogue/archive/coins_lt_2005/index.cfmCFID=3825887&CFTOKEN=59064527


to standard catalogs, identification code, site code, and small find number. Lockyear stated that the<br />

English Heritage guidelines regarding the weight and diameter of coins were problematic; in particular, he<br />

disliked the instructions on how to record coin legends. While three database schemas were provided<br />

by the guidelines, Lockyear considered them to be poorly designed and claimed that they did not take<br />

full advantage of relational database capabilities. This is unfortunate, Lockyear noted, because a good<br />

database design could make it easy to produce catalogs that conformed to almost any format as long as<br />

the necessary data had been entered. Good database design would also reduce duplication of effort at<br />

data entry (e.g., a table of legends could be used so a legend did not have to be entered for each coin<br />

that exhibited it). In addition, Lockyear argued that since “fuzzy data” are being recorded for many<br />

coins, the ability to indicate levels of certainty for some fields would be very important. Another area<br />

for improvement, Lockyear proposed, was in the categories of analysis used in numismatics, and he<br />

emphasized the need to move beyond simply analyzing when coins were produced to examining “coin-use<br />

periods.”<br />
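
Two of the design points attributed to Lockyear above, a shared table of legends and an explicit level of certainty for "fuzzy" fields, can be illustrated with a small relational sketch. The example below uses Python's built-in `sqlite3` module purely for convenience; the table names, column names, certainty vocabulary, and sample rows are assumptions made for illustration and are not taken from Lockyear's article or from the English Heritage schemas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Each distinct legend is stored once and referenced by many coins.
    CREATE TABLE legend (
        legend_id INTEGER PRIMARY KEY,
        text      TEXT NOT NULL UNIQUE
    );
    CREATE TABLE coin (
        coin_id          INTEGER PRIMARY KEY,
        site_code        TEXT NOT NULL,
        small_find_no    TEXT NOT NULL,
        legend_id        INTEGER REFERENCES legend(legend_id),
        -- Hypothetical certainty vocabulary for a "fuzzy" reading.
        legend_certainty TEXT CHECK (
            legend_certainty IN ('certain', 'probable', 'uncertain')
        )
    );
    """
)


def legend_id_for(text):
    """Insert the legend if it is new, then return its id either way."""
    conn.execute("INSERT OR IGNORE INTO legend (text) VALUES (?)", (text,))
    row = conn.execute(
        "SELECT legend_id FROM legend WHERE text = ?", (text,)
    ).fetchone()
    return row[0]


def add_coin(site_code, small_find_no, legend_text, certainty):
    """Record a coin that points at a shared legend row."""
    conn.execute(
        "INSERT INTO coin (site_code, small_find_no, legend_id, legend_certainty) "
        "VALUES (?, ?, ?, ?)",
        (site_code, small_find_no, legend_id_for(legend_text), certainty),
    )


# Invented sample data: two coins share one legend, so it is entered only once.
add_coin("SITE1", "SF001", "CONSTANTINVS AVG", "certain")
add_coin("SITE1", "SF002", "CONSTANTINVS AVG", "probable")
add_coin("SITE1", "SF003", "VRBS ROMA", "uncertain")

legend_count = conn.execute("SELECT COUNT(*) FROM legend").fetchone()[0]
coin_count = conn.execute("SELECT COUNT(*) FROM coin").fetchone()[0]
print(legend_count, coin_count)
```

Because the certainty is stored alongside the reference rather than folded into the legend text, a catalog generated from such tables could include or omit uncertain readings as the output format requires.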

The greatest benefit of a well-designed database, Lockyear reasoned, would be the ability to<br />

automatically generate summary lists of coins. Additionally, he proposed linking such databases to<br />

GIS packages in order to better enable “intra and inter-site analyses,” a possibility that he was<br />

surprised had not been explored by more numismatists. Lockyear believed three points were essential<br />

to move numismatics forward: (1) all coins from excavations should be identified to the extent<br />

possible; (2) a standard database schema should be created that could be used by specialists in the field<br />

and easily archived with the ADS, and any such schema should not be dependent on a particular piece<br />

of software; and (3) the analysis of coins should be integrated with stratigraphic and ceramic data. 447<br />

To help bring about such change, he proposed making available a “user friendly and flexible database<br />

application.” An even better solution, he suggested, would be to develop a web-based system that<br />

made use of MySQL or another open-source system, for this would support the use of common<br />

database tables and make data universally available. Lockyear hoped that new finds would be entered<br />

into such a system, that archaeologists would be able to download all or any part of these data and<br />

analyze them using their favorite tools, and that legacy numismatic data such as site lists and hoards<br />

would begin to be input. 448 Lockyear concluded by reiterating that the analysis of coinage data needs<br />

to be integrated with other strands of archaeological information:<br />

The future lies, I believe, in the integration of stratigraphic, coinage and other evidence such as<br />

ceramic data. This is hardly a revolutionary idea, being commonplace in other contexts, but one<br />

which we must now pursue with some vigour (Lockyear 2007).<br />

As has been seen in other disciplines such as epigraphy, scholars are increasingly looking to the<br />

potential of databases and the digital world to reintegrate data sources that have often been arbitrarily<br />

divided by disciplinary structures, to provide a more holistic approach to studying the ancient world.<br />

Perhaps the most significant work regarding the challenges of numismatic data integration is being<br />

conducted by the Digital Coins Network, 449 which is promoting “the effective use of information<br />

technology in the collection, exchange, and publication of numismatic data.” This network identifies<br />

447 One useful source for the study of Roman ceramics is “Roman Amphorae: A Digital Resource”<br />

(http://ads.ahds.ac.uk/catalogue/archive/amphora_ahrb_2005/info_intro.cfmCFID=3775528&CFTOKEN=68253066), which is available in the ADS<br />

archive. A directory of online ceramics and pottery resources (particularly from the Roman period) can be found at Potsherd<br />

(http://www.potsherd.uklinux.net/).<br />

448 Lockyear also mentioned the possibility of integrating the data from one extensive project, the “Portable Antiquities Scheme,” “a voluntary scheme to<br />

record archaeological objects found by members of the public in England and Wales.” All these finds are available through a central database that includes<br />

thousands of coins. http://www.finds.org.uk/<br />

449 http://www.digitalcoins.org/


existing standards and promotes the development of new standards for the numismatic community.<br />

The current major project of the network is the refinement and extension of the Numismatic Database<br />

Standard (NUDS), 450 and they are “working to define a standardized set of fields to describe<br />

numismatic objects within the context of a column-oriented database.” The network recognizes that<br />

there are already thousands of records of numismatic objects in traditional relational databases, so it<br />

plans to create a set of shared fields 451 that will promote exchange of data and is developing a NUDS<br />

testbed.<br />

In addition to data integration, the need to support more sophisticated digital publications in<br />

numismatics will be another challenge for any developed infrastructure. The ANS has recently<br />

announced a digital publications project 452 and, according to their announcement, they are “developing<br />

an infrastructure for the digital publication of numismatic catalogs, exhibitions, articles, and other<br />

materials.” The system will take advantage of standards and schemas that are already available so text<br />

will be encoded using XML (with TEI DTDs and schemas where appropriate). They have a number of<br />

experimental digital publications, including a preliminary HTML map of mints, 453 a growing catalog<br />

of “Numismatic Literature” that continues their annual listing of numismatic titles, and a number of<br />

HTML catalogs of exhibits and traditional publications.<br />

The most advanced of the ANS digital publications, however, is Nomisma.org, 454 a joint effort by the<br />

ANS, Yale University Art Gallery, and the Paris-Sorbonne University to “provide stable digital<br />

representations of numismatic concepts and entities,” whether generic concepts such as a “coin hoard” or<br />

actual hoards listed in published collections such as the Inventory of Greek Coin Hoards (IGCH).<br />

Nomisma.org provides stable URIs 455 for the resources it includes and, in the spirit of “linked data,” 456<br />

defines and presents the information in both human- and machine-readable forms. All resources are<br />

represented in XML, and they also plan to use XHTML with embedded RDFa. The hope is that<br />

creators of other digital collections will use these stable URIs to build a web of linked data “that<br />

enables faster acquisition and analysis of well-structured numismatic data.” For a test case,<br />

Nomisma.org is developing a digital version of the IGCH, and all of the 2,387 hoards have been given<br />

stable URIs. The IGCH was chosen because the hoards identified within it have unique identifiers that<br />

are well known to the community and because hoards are typically conceived as lists of links to other<br />

numismatic entities (the mints of coins and findspots). This presents the opportunity of defining<br />

encoding conventions for these entities and then turning the information into explicit hyperlinks.<br />
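
The notion of a hoard as a stable URI plus a list of links to other entities can be made concrete with a short Python sketch. Only the hoard URI pattern is grounded in the source, and it is inferred from the single example cited for Nomisma.org (`http://nomisma.org/id/igch0262`); the four-digit zero padding, the `example.org` namespace, the function names, and the findspot and mint identifiers are all assumptions introduced for illustration and do not describe Nomisma.org's actual data model.

```python
NOMISMA_ID_BASE = "http://nomisma.org/id/"
# Placeholder namespace for entities whose real identifiers are not given in the text.
EXAMPLE_BASE = "http://example.org/id/"


def igch_uri(number):
    """Build a hoard URI on the pattern of the one cited example (assumed padding)."""
    return f"{NOMISMA_ID_BASE}igch{number:04d}"


def hoard_links(number, findspot, mints):
    """Represent a hoard as its URI plus (relation, target URI) links."""
    links = [("findspot", EXAMPLE_BASE + findspot)]
    # De-duplicate and sort mints so the link list is stable across runs.
    links += [("mint", EXAMPLE_BASE + m) for m in sorted(set(mints))]
    return {"uri": igch_uri(number), "links": links}


# Hypothetical hoard contents, for illustration only.
record = hoard_links(262, "findspot-a", ["mint-b", "mint-a", "mint-b"])
print(record["uri"])
for relation, target in record["links"]:
    print(relation, target)
```

Once every entity in such a record has its own resolvable identifier, the implicit references in a printed inventory become explicit hyperlinks that other collections can point to.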

There was surprisingly limited research regarding digitization strategies for coins and the development<br />

of digital collections in numismatics. One area of growing research, however, is that of automatic coin<br />

recognition. Kampel and Zaharieva (2008) have recently described one state-of-the-art approach:<br />

Fundamental part of a numismatists work is the identification and classification of coins<br />

according to standard reference books. The recognition of ancient coins is a highly complex<br />

task that requires years of experience in the entire field of numismatics. To date, no optical<br />

recognition system for ancient coins has been investigated successfully. In this paper, we<br />

450 http://digitalcoins.org/index.php/NUDS:Fields<br />

451 Initial work on the NUDS:Exchange, “an xml schema designed to facilitate the exchange of numismatic information,” can be found at<br />

http://digitalcoins.org/index.php/NUDS:Exchange_Format<br />

452 http://www.numismatics.org/DigitalPublications/DigitalPublications<br />

453 http://numismatics.org/xml/geography.html<br />

454 http://nomisma.org/<br />

455 For example, http://nomisma.org/id/igch0262<br />

456 For more on publishing linked data on the web, see Bizer et al. (2007).



present an extension and combination of local image descriptors relevant for ancient coin<br />

recognition (Kampel and Zaharieva 2008).<br />

They identified two major processes that must first be differentiated: coin identification, where a<br />

unique identifier is assigned to a specific coin; and coin classification, where a coin is assigned to a<br />

predefined type. The authors argued that automatic coin identification was an easier task because of<br />

the way in which ancient coins were created. For example, the manufacturing process tended to give coins<br />

unique shapes (hammering procedures, coin breakages, etc.). The same features that assist in individual<br />

coin identification complicate automatic classification, however, for as they noted, “the almost<br />

arbitrary shape of an ancient coin narrows the amount of appropriate segmentation algorithms”<br />

(Kampel and Zaharieva 2008). Additionally, algorithms that performed well on image collections of<br />

modern coins did not fare well on medieval ones.<br />

Ultimately, Kampel and Zaharieva (2008) decided to use “texture sensitive point detectors” and<br />

conducted initial experiments to determine what local feature descriptors would work best for<br />

identifying a given set of interest points in ancient coins. After acquiring a set of images of 350 coin<br />

types from the Fitzwilliam Museum in Cambridge, they built a coin-recognition workflow. Since they<br />

were using images of individual coins, they did not need to automatically detect and segment coins<br />

found in the images and instead focused on the feature-extraction process. That process involved two<br />

steps: the use of local feature algorithms to extract local image descriptors that could be used in<br />

individual coin identification; and the extraction of features that could be used “to reduce the<br />

number of required feature comparisons” by reducing the number of coins that needed to be retrieved from the<br />

database. After an initial preselection step, they performed descriptor matching by “identifying the first<br />

two nearest neighbors in terms of Euclidean distances.” The final step was verification. An algorithm<br />

called SIFT (Scale-Invariant Feature Transform) provided the best results in terms of discriminant<br />

feature identification, but its biggest drawback was computational time. For future experiments, they<br />

plan to expand their evaluation to a larger set of coin images. As more collections of coins become<br />

available for online experimentation, the accuracy and viability of such approaches will likely increase.<br />
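The descriptor-matching step described above can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' implementation: descriptors are plain tuples of numbers instead of real SIFT vectors, and the acceptance rule (keep a match only when the nearest neighbor is clearly closer than the second nearest, with a hypothetical `ratio` of 0.8) is a commonly used ratio test that is assumed here; the report states only that the two nearest neighbors were identified by Euclidean distance.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def two_nearest(query, candidates):
    """Return up to two (distance, index) pairs for the closest candidates."""
    ranked = sorted((euclidean(query, c), i) for i, c in enumerate(candidates))
    return ranked[:2]


def match_descriptors(query_descs, db_descs, ratio=0.8):
    """Match each query descriptor to a database descriptor.

    Assumed acceptance rule: the best distance must be smaller than
    `ratio` times the second-best distance, so ambiguous matches are dropped.
    """
    matches = []
    for qi, q in enumerate(query_descs):
        nearest = two_nearest(q, db_descs)
        if len(nearest) < 2:
            continue  # not enough candidates to judge ambiguity
        (d1, i1), (d2, _) = nearest
        if d1 < ratio * d2:
            matches.append((qi, i1))
    return matches


# Toy 2-D descriptors; real SIFT descriptors have 128 dimensions.
query = [(0.0, 0.0), (5.0, 5.0)]
database = [(0.1, 0.0), (4.0, 0.0), (5.0, 5.11), (5.1, 5.0)]
print(match_descriptors(query, database))
```

In this toy data the first query point has one clearly closest candidate, while the second has two almost equally close candidates and is therefore left unmatched. The exhaustive comparison of every query descriptor against every database descriptor also hints at why computational time was the main drawback reported, and why a preselection step that shrinks the candidate set is useful.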

Palaeography<br />

The discipline of palaeography has received some brief examination in other sections, such as in the<br />

creation of a palaeographic knowledge base for Cuneiform and in terms of automatic document<br />

recognition for Latin manuscripts. 457 A fairly comprehensive definition has been offered by Moalla et<br />

al. (2006):<br />

The paleography is a complementary discipline of the philology. … The paleography studies<br />

the layout of old manuscripts and their evolutions whereas the classic philology studies the<br />

content of the texts, the languages and their evolutions. The goals of the palaeographic science<br />

are mainly the study of the correct decoding of the old writings and the study of the history of<br />

the transmission of the ancient texts. The palaeography is also the study of the writing style,<br />

independently from the author personal writing style, which can help to date and/or to<br />

transcribe ancient manuscripts (Moalla et al. 2006).<br />

The study of palaeography is thus closely tied to work with many other disciplines, and as Ciula (2009)<br />

explains, “Palaeography cannot proceed without sharing methods, tools and outcomes with codisciplines<br />

such as epigraphy, codicology, philology, textual criticism—to name but a few.” As this<br />

457 Some other computer science research in palaeography has focused on the development of automatic handwriting recognition for medieval English<br />

documents (Bulacu and Schomaker 2007) and for eighteenth- and nineteenth-century French manuscripts (Eglin et al. 2006).



comment illustrates, while there are many specialized disciplines within classics, they also share many<br />

research methods. This section therefore explores some recent state-of-the-art work in digital<br />

palaeography.<br />

In a recent paper, Peter Stokes (Stokes 2009) has provided an overview of the issues faced in<br />

attempting to create a discipline of digital palaeography. He observed that traditional palaeographic<br />

studies have their own methodological issues, in particular a lack of established terminology (e.g., for<br />

handwriting), a factor that has made those few digital resources that have been created difficult to use<br />

and “almost impossible to interconnect.” This has not only frustrated scholarly communication but also<br />

made creating “databases of scripts” almost impossible. Nonetheless, extensive digital corpora are now<br />

available, and Stokes argued that such corpora could not be analyzed by traditional methods because<br />

they can include “hundreds of scribal hands with potentially thousands or tens of thousands of<br />

features.” The creation of new databases and the use of data mining, Stokes asserted, would be<br />

necessary to work with such large bodies of material. Nonetheless, he acknowledged that digital<br />

methods had still received little acceptance from scholars in the discipline:<br />

However, promising as these seem, they have received almost no acceptance and relatively<br />

little interest from ‘traditional’ palaeographers. This is partly because the technology is not yet<br />

mature, and perhaps also because the attempts to date have generally involved small projects<br />

without the sustained funding or larger interdisciplinary groups that digital humanities often<br />

require (Stokes 2009).<br />

In addition to the challenges of relatively new and untested technology, limited funding, and small<br />

projects, Stokes noted that the use of digital methods is problematic because it requires an<br />

understanding of many fields, such as computer graphics and probability theory, skill areas that most<br />

traditional palaeographers cannot be expected to have.<br />

One potential solution, Stokes believed, was to develop software that presented results in an intelligible<br />

manner to palaeographers. Consequently, Stokes is working on a software platform for image<br />

enhancement called the “Framework for Image Analysis,” a “modular and extendible software in Java<br />

for the analysis of scribal hands.” This software allows users to load images of handwriting and run<br />

various automated processes to analyze and generate metrics for scribal hands. The system includes a<br />

module to enhance images before they are processed that can also be run as a stand-alone application<br />

to try to recover damaged text from manuscripts. One useful feature of this system is that users can<br />

compare various metrics and distances generated by different processes (implemented as plug-ins) on<br />

different pieces of writing, implement their own algorithms, and export the results of these processes.<br />

As Stokes noted, this type of system “allows people to compare different techniques in a common<br />

framework, producing libraries of scribal hands and plug-ins as a common and documented basis for<br />

palaeographical study” (Stokes 2009). The ability to create results that could be either “reproducible or<br />

at least verifiable” was also important, although Stokes believed that issues of documentation and<br />

reproducibility were manageable in that software could be designed to record all actions that are<br />

performed and save them in a standard format.<br />
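The combination of plug-ins and a recorded history of actions can be illustrated with a small Python sketch. It does not reproduce the Framework for Image Analysis, which is described as Java software whose interfaces are not given in this report; the class `AnalysisSession`, the plug-in `ink_density`, and the choice of JSON as the “standard format” are all assumptions made for the sake of the example.

```python
import json
from typing import Callable, Dict, List

# A plug-in is assumed to be any function from a binary image to one number.
Image = List[List[int]]
Metric = Callable[[Image], float]


class AnalysisSession:
    """Runs metric plug-ins on images and records every action performed."""

    def __init__(self) -> None:
        self.plugins: Dict[str, Metric] = {}
        self.log: List[dict] = []

    def register(self, name: str, plugin: Metric) -> None:
        """Add a plug-in under a name and note the registration in the log."""
        self.plugins[name] = plugin
        self.log.append({"action": "register", "plugin": name})

    def run(self, name: str, image_id: str, image: Image) -> float:
        """Apply one plug-in to one image and log the call with its result."""
        value = self.plugins[name](image)
        self.log.append(
            {"action": "run", "plugin": name, "image": image_id, "result": value}
        )
        return value

    def export_log(self) -> str:
        """Serialize the action history so that a run can be checked later."""
        return json.dumps(self.log, indent=2)


def ink_density(image: Image) -> float:
    """Hypothetical metric: fraction of ink pixels (1 = ink, 0 = background)."""
    total = sum(len(row) for row in image)
    return sum(sum(row) for row in image) / total if total else 0.0


session = AnalysisSession()
session.register("ink_density", ink_density)
session.run("ink_density", "leaf-1", [[1, 0], [0, 0]])
print(session.export_log())
```

Because every registration and run is appended to the log, a second scholar could in principle replay the same sequence of steps, which is the sense of “reproducible or at least verifiable” results discussed above.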

Thus, Stokes highlighted the need for common frameworks for analysis, the use of standards, and<br />

reproducible results to build the foundations for digital palaeography. One other valuable point he<br />

made was that designers of digital humanities applications needed to consider not just what algorithms<br />

to implement but also how to present those results in an intelligible manner to non-computer scientists:<br />

“Indeed, it is an important question how the results of complex algorithms can best be presented to



scholars in the humanities,” Stokes concluded, “and it may well be that the plug-ins should allow both<br />

‘computer-friendly’ and ‘human-friendly’ output, with the latter including graphical or even interactive<br />

displays” (Stokes 2009).<br />

Recent work by Arianna Ciula (Ciula 2009) has also explored the methodological issues involved in<br />

using digital technology to support palaeographical analysis of medieval handwriting. She maintained<br />

that digital methods would assist palaeographers only if the complex nature of the cultural artifacts<br />

they studied was also considered. In addition, she argued that the identification of “critical processes<br />

within the palaeographic method” was essential before any tools were developed. Digital tools needed<br />

to make the steps of scholarly analysis more explicit, Ciula insisted, including “analyses, comparisons,<br />

and classifications.” Since palaeography is closely related to many other classical disciplines, Ciula<br />

also argued that a more integrated digital environment of tools and resources was necessary:<br />

Therefore, independently from its more or less limited scope, the more any digital tool or<br />

resource—being it a digital facsimile of a manuscript, an application to segment letter forms, a<br />

digital edition, or an electronic publication of other kind—can be integrated within an<br />

environment where complementary material is also accessible, the more it becomes<br />

exponentially useful to the palaeographer (Ciula 2009).<br />

Palaeographers in particular need more visual representations of manuscripts and open-access<br />

comprehensive collections.<br />

In her own work, Ciula developed a computing application called “System for Palaeographic<br />

Inspection” (SPI) for work she conducted as a graduate student. Ciula scanned the leaves of several<br />

codices and developed a basic system that included image preprocessing, insertion of images into a<br />

relational database, segmentation of handwriting in images into relevant letters and ligatures, and<br />

automatic generation of letter models. She created extensive documentation regarding her choice of<br />

digitization criteria, the refinement and evaluation of segmentation processes, and the tuning of the<br />

parameters for generating letter models. For this she made extensive use of the large body of literature<br />

on manuscript digitization and OCR development, but she underscored that the development of this<br />

system required extensive domain knowledge as well:<br />

On the other hand, the interpretative phase based on the analysis of the letter models and their<br />

automatic clustering has required insights into a much more established tradition of doing<br />

palaeography. The comparison of types of letterforms—which is the main objective of<br />

analytical palaeography—has not effectively been supported so far by any tool. Therefore, the<br />

major challenge was represented by the attempt to integrate and support the palaeographical<br />

method within a digital humanities (as defined by McCarty …) research approach (Ciula 2009).<br />

One of the greatest challenges faced by many digital classics practitioners was the need for both<br />

extensive disciplinary expertise and technical knowledge.<br />
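The stages of the workflow described for the SPI (preprocessing, storage in a relational database, and generation of letter models) can be sketched roughly as follows. This is not Ciula's implementation, whose internals are not described here. Several simplifications are assumed: preprocessing is reduced to thresholding, segmentation is skipped by supplying glyphs that are already cropped to the same size, and a “letter model” is taken to be the pixel-wise average of all stored examples of a letter. The table layout and function names are hypothetical.

```python
import json
import sqlite3


def open_store():
    """Create an in-memory relational store for segmented glyphs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE glyphs (leaf TEXT, letter TEXT, bitmap TEXT)")
    return conn


def binarize(gray, threshold=128):
    """Preprocessing stand-in: dark pixels become ink (1), light ones 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def store_glyph(conn, leaf, letter, bitmap):
    """Insert one segmented glyph, serialized as JSON text."""
    conn.execute(
        "INSERT INTO glyphs (leaf, letter, bitmap) VALUES (?, ?, ?)",
        (leaf, letter, json.dumps(bitmap)),
    )


def letter_model(conn, letter):
    """Average all stored bitmaps of one letter, pixel by pixel.

    Assumes every bitmap of that letter has the same dimensions.
    Returns None when no example of the letter has been stored.
    """
    rows = conn.execute(
        "SELECT bitmap FROM glyphs WHERE letter = ?", (letter,)
    ).fetchall()
    bitmaps = [json.loads(row[0]) for row in rows]
    if not bitmaps:
        return None
    height, width = len(bitmaps[0]), len(bitmaps[0][0])
    return [
        [sum(b[y][x] for b in bitmaps) / len(bitmaps) for x in range(width)]
        for y in range(height)
    ]


conn = open_store()
store_glyph(conn, "f1r", "a", binarize([[0, 255], [0, 255]]))
store_glyph(conn, "f1v", "a", binarize([[0, 0], [255, 255]]))
print(letter_model(conn, "a"))
```

Even in this toy form, the choices that require palaeographical judgment are visible: the threshold, the decision about what counts as one letter or ligature, and how examples are grouped all shape the resulting model.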

The tool she developed had a number of technical limitations, Ciula granted, and she commented that<br />

various scholarly stages of interpretation, such as letter segmentation and model generation, were<br />

assisted by the tool but not “comprehensively and systematically supported by the tool itself.” The<br />

most powerful function of the SPI was its ability to “compute graphical features”; this assisted<br />

palaeographic analysis by making variations between characters more perceptible to human vision.<br />

Ciula nonetheless emphasized that her tool was meant to assist scholars, not replace them, and this<br />

raised the question of how well digital tools could ever model scholarly expertise. “How much of the



palaeographical expertise can the tool or its modules incorporate,” Ciula asked, adding that “if the use<br />

of the tool itself contributes to define, refine and enrich the underlying method, to what extent can this<br />

process be fed back into the tool and make it more sophisticated” (Ciula 2009). The iterative process<br />

of modeling scholarly expertise in a computational manner and creating a tool that can both utilize this<br />

initial knowledge and feed new knowledge back into the system has also been explored by the eSAD<br />

project in their development of an interpretation support system for papyrologists (Tarte 2011, Olsen et<br />

al. 2009).<br />

In sum, Ciula hoped that the functionality of the SPI could be turned into a palaeographical module that<br />

could be utilized as a web service as part of a larger infrastructure that might be used in the “creation and<br />

annotation of digital editions.” Other necessities Ciula proposed for a larger digital research<br />

infrastructure included documentation and transparency, the use of standards, systems that<br />

were extensible and interoperable, and more funding for interdisciplinary research.<br />

Papyrology<br />

As described by Bauer et al. (2008), papyrology “focuses on the study of ancient literature,<br />

correspondence, legal archives, etc. as preserved in papyri.” Both individual papyri collections and<br />

massive integrated databases of papyri can be found online. In fact, digital papyrology is considered to<br />

be a relatively “mature” digital discipline. 458 “Repositories for certain primary sources, such as papyri,<br />

are already playing an important role in ensuring access to digital surrogates of artifacts,” Harley et al.<br />

(2010) observed, adding that “in the study of written evidence, these databases of annotated primary<br />

sources could also play an important role as digital critical editions.” As noted in the discussion of<br />

inscriptions, the scholarly editing and interpretation that often go into the production of<br />

online images and transcriptions of papyri should be considered akin to the creation of a critical<br />

scholarly edition.<br />

The discipline of papyrology began to take shape in the late 1880s and 1890s, according to an article<br />

by Ann E. Hanson, and the importance of these sources for the study of the ancient world was quickly<br />

recognized:<br />

A direct and immediate contact with the ancient Mediterranean was being established, as texts,<br />

unexamined since antiquity, were being made available to the modern world. To be sure, this<br />

writing paper of the ancients had been used not only for elegant rolls of Greek literature, but<br />

also for quite everyday purposes in a variety of languages, accounts, letters, petitions,<br />

medicinal recipes. Still, it was the copies of Greek literature which had not survived in the<br />

manuscript traditions that were particularly prized in the early days, for these were the more<br />

accessible to scholars trained in the authors of the canon (Hanson 2001).<br />

Papyri thus offered access not just to works of lost literature but also to the documents of daily life. In<br />

the late nineteenth and early twentieth centuries, many European and American libraries began to<br />

create collections of papyri, such as those from Tebtunis, 459 Oxyrhynchus, 460 and Herculaneum, 461<br />

collections that are increasingly becoming available online. Despite some competition for collections,<br />

Hanson noted there was a considerable amount of collegiality between scholars and collectors that was<br />

458 For example, the 26th International Conference on Papyrology has three separate sessions on “Digital Technology and Tools of the Trade”<br />

(http://www.stoa.org/?p=1177).<br />

459 http://tebtunis.berkeley.edu/<br />

460 http://www.papyrology.ox.ac.uk/POxy<br />

461 http://www.herculaneum.ox.ac.uk/papyri.html



described as the amicitia papyrologorum, and this collaborative nature of papyrological research<br />

continues today.<br />

This section examines a number of digital papyri projects online 462 and looks at several significant<br />

research projects seeking to develop new technologies for the analysis of papyri and to create digital<br />

research infrastructures for papyrology.<br />

Digital Papyri Projects<br />

The largest papyri projects to be found online are those that serve as aggregators, union catalogs, or<br />

portals to other papyri collections. The Advanced Papyrological Information System (APIS) 463 is one<br />

of the oldest and largest papyri databases online and, according to its website, “is a collections-based<br />

repository hosting information about and images of papyrological materials (e.g., papyri, ostraca, wood<br />

tablets, etc) located in collections around the world.” This repository contains physical descriptions,<br />

extensive bibliographic information, digital images, and English translations for many of the texts. In<br />

some cases, links are provided to the original-language texts. As of March 2010, the APIS “union<br />

catalog” included 28,677 records and 18,670 images from more than 20 collections of papyri, with the<br />

largest being Columbia, Duke, New York University, Princeton, the University of California, Berkeley, the<br />

University of Michigan, and Yale. 464<br />

The APIS includes both published and unpublished material and is currently hosted by New York<br />

University. The collection can be searched by keyword across the whole collection or within an<br />

individual papyri collection. Individual documents can also be searched for by publication number,<br />

collection number, or APIS number. The collection can also be browsed by<br />

subject word, documentary or literary type, writing material, and language (Arabic, Aramaic, Coptic,<br />

Demotic, Greek, Latin, Hebrew, Hieratic [Egyptian], Hieroglyphic, Italian, Middle Persian, Parthian,<br />

and Syriac). An advanced search offers even more options. Each individual papyrus record includes an<br />

APIS identifier, title, language, physical description, notes, and digital images where available. Some<br />

records also include a link back to the original papyri collection database for fuller information that<br />

may be provided there. The APIS is involved in the larger digital classics research project Integrating<br />

Digital Papyrology (IDP).<br />

The potential for projects such as the APIS received early recognition from papyrologists such as Ann<br />

Hanson. “This wealth of electronically searchable materials means that more possibilities can be<br />

explored at every phase in the process of preparing a papyrus for publication,” Hanson asserted, “from<br />

finding parallels to assist reading to the contextualization of a papyrus' message back into the<br />

circumstances that seemed to have occasioned its writing” (Hanson 2001). She also praised the fact<br />

that the APIS was greatly expanding access to both papyri and papyrological information, particularly<br />

through its links to translations, and stated that making digital resources available to an audience<br />

beyond the academic world was a goal for which all projects should strive.<br />

462 This section focuses on some of the larger projects; there are also numerous other interesting projects, including individual university collections that have been<br />

digitized, such as Harvard’s “Digital Papyri at the Houghton Library” (http://hcl.harvard.edu/libraries/houghton/collections/papyrus/index.html) or the<br />

Rylands Papyri digitized by the University of Manchester<br />

(http://www.library.manchester.ac.uk/eresources/imagecollections/university/papyrus/#d.en.98702). There are also websites that have been dedicated to<br />

individual papyrus fragments of particular interest, such as “Edwin Smith’s Surgical Papyrus” (http://archive.nlm.nih.gov/proj/ttp/flash/smith/smith.html),<br />

and research projects that are currently focused on the papyri of an individual author, such as Philodemus<br />

(http://www.classics.ucla.edu/index.php/philodemus).<br />

463 http://www.columbia.edu/cu/lweb/projects/digital/apis/index.html<br />

464 Some of these collections include large online databases that provide separate access to their collection, such as the University of Michigan (PMich- http://www.lib.umich.edu/papyrus-collection)<br />

and Berkeley (Tebtunis- http://tebtunis.berkeley.edu/).



One of the oldest and largest electronic collections of papyri online, with federated access also<br />

provided by the APIS, is the Duke Data Bank of Documentary Papyri (DDbDP). 465 This project was<br />

founded in 1983 as a collaborative effort between Duke University and the Packard Humanities<br />

Institute and was first made available as a CD-ROM (Sosin 2010). The DDbDP includes the full text of<br />

thousands of Greek and Latin nonliterary papyri and serves as the textual counterpart to the HGV (a<br />

collection of papyrological metadata) and the APIS, which as described above contains both metadata<br />

and images. Initial online access to this collection was provided through the Perseus Digital Library, 466<br />

and Perseus continues to provide a browsable list of texts where the full text of the papyri is available<br />

for viewing online. 467 While these texts are searchable through the various advanced searching<br />

features of Perseus, a newer papyrological search engine also provides access to this collection and is<br />

available at papyri.info, 468 a site that is part of the larger IDP project. The DDbDP is also a project<br />

partner in both the IDP and the Concordia projects.<br />

Like the APIS, the Papyrus Portal Project 469 seeks to provide its users with a federated search of all<br />

“digitized and electronically catalogued papyrus collections in Germany” and to provide a “unified<br />

presentation of the search results with the most important information on the particular papyrus.” The<br />

Papyrus Portal provides integrated access to the digitized holdings of 10 German papyri collections,<br />

the largest of which is the Papyrus Project Halle-Jena-Leipzig. 470 Funding for this project was<br />

provided by the Deutsche Forschungsgemeinschaft (DFG), and the website was created by the<br />

University of Leipzig using the open-source software “MyCoRe.” 471 The Papyrus Portal database has<br />

both an English and a German interface and can be searched by inventory number, title, language<br />

(Arabic, Aramaic, Demotic, Gothic, Greek, Hebrew, Hieratic, Hieroglyphic, Coptic, Latin, Syriac),<br />

text type (documentary, literary, unidentifiable, paraliterary), material, location, date, content, and free<br />

text. Individual records for each papyrus also include links to their original database so the user can go<br />

directly to detailed data without having to search individual databases again.<br />

The most significant collection included in the Papyrus Portal Project is the Papyrus- und Ostrakaprojekt<br />

Halle-Jena-Leipzig, a collaborative effort of three universities to digitize and provide access to their<br />

papyri collections (both published and unpublished). This growing and well-documented database<br />

provides various means of access to the papyri collection in both English and German. The<br />

“general” search includes keyword searching in title, inventory or publication number, text type,<br />

collection, place, and district with various date limit options, while the full-text search offers both a Greek<br />

and an Arabic keyboard for easier text retrieval. The “complex” search involves sophisticated<br />

searching of detailed metadata fields for written objects, texts, or documents. Two ways of browsing<br />

the collection of papyri and ostraca are available: the user can browse an index of written objects,<br />

fragments, or documents alphabetically by papyrus title or use a faceted browsing interface, again for<br />

written objects, texts, or documents. Once a user chooses a type of object such as texts, he or she can<br />

then choose from a series of metadata categories to create a very specific list of documents (e.g.,<br />

Written objects – Material – Wood). The record for each object 472 includes thumbnail images (larger<br />

images can be viewed using a special viewer that loads in a separate browser window), title, collection,<br />

publication number, writing material, size, format, type of text, script, language, date, place, and a link<br />

465 For a full history of the DDbDP, see http://idp.atlantides.org/trac/idp/wiki/DDBDP; and for an earlier overview, see Oates (1993).<br />

466 http://www.perseus.tufts.edu/hopper/collection?collection=Perseus:collection:DDBDP<br />

467 One unique feature of access to the DDbDP through Perseus is that Greek terms are cross-referenced to appropriate lexicon entries in the LSJ.<br />

468 http://www.papyri.info/ddbdp/<br />

469 http://www.papyrusportal.de/content/below/start.xml?lang=en<br />

470 http://papyri.uni-leipzig.de/<br />

471 http://www.mycore.de/<br />

472 For example, see http://papyri.uni-leipzig.de/receive/HalPapyri_schrift_00001210



to a full text transcription when available. Both the metadata record for each papyrus and its text have<br />

static URLs for easy citing and linking.<br />
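
The faceted browsing described above (for example, Written objects – Material – Wood) can be illustrated with a minimal Python sketch. The `WrittenObject` record type, its field names, and the sample entries are hypothetical and are not drawn from the Halle-Jena-Leipzig database; the sketch only assumes that each record carries a few metadata values and shows how choosing successive values narrows a result list.<br />

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WrittenObject:
    """Hypothetical metadata record for one written object."""
    title: str
    material: str
    language: str


# Invented sample data, for illustration only.
SAMPLE = [
    WrittenObject("Sample tablet A", "Wood", "Greek"),
    WrittenObject("Sample roll B", "Papyrus", "Greek"),
    WrittenObject("Sample ostracon C", "Pottery", "Coptic"),
    WrittenObject("Sample tablet D", "Wood", "Coptic"),
]


def facet_counts(records, field):
    """Count how many records fall under each value of one metadata field."""
    counts = {}
    for record in records:
        value = getattr(record, field)
        counts[value] = counts.get(value, 0) + 1
    return counts


def narrow(records, **facets):
    """Keep only the records whose fields equal every selected facet value."""
    return [
        record for record in records
        if all(getattr(record, name) == value for name, value in facets.items())
    ]


# Choosing "Material: Wood" and then "Language: Coptic" narrows the list twice.
wood_objects = narrow(SAMPLE, material="Wood")
wood_coptic = narrow(wood_objects, language="Coptic")
```

Each additional facet passed to `narrow` shrinks the list further, which is the behavior a faceted interface presents to the user, while `facet_counts` supplies the per-value totals such interfaces typically display next to each choice.<br />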

Another papyrus portal that has recently become available is DVCTVS, 473 a project that aims to<br />

ultimately serve as a national papyri portal for Spain. The creation of this project involves four<br />

organizations: the Universitat Pompeu Fabra, the Consejo Superior de Investigaciones Científicas, the<br />

Abadia de Montserrat, and the Companyia de Jesús in Catalonia. Three collections will initially be<br />

made available through DVCTVS: the Abadia de Montserrat Collection (consisting of 1,500 papyri<br />

from Egypt from the Ptolemaic period until the tenth century AD and including literary and<br />

documentary texts written in Greek, Coptic, Latin, Arabic, and Demotic); the Palau-Ribes collection<br />

(currently 100 published papyri from Egypt, dating between the eighth century BC and the tenth century<br />

AD, with approximately 2,000 texts in total, written in various languages including Greek, Latin,<br />

Coptic, Demotic, Hebrew, Arabic, and Syriac); and the Fundacion Pastor Collection (a collection of<br />

about 400 papyri from the same time period as the other two collections). Currently papyri are being<br />

cataloged, digital images are being added, and previously printed texts are being published as<br />

TEI-XML files. A digital catalog is available that includes multiple keyword searching (using Boolean<br />

operators) across a large number of fields (e.g., alphabet, associated MSS, author, book, date, date of<br />

finding, edition, findspot, language, published title). A Greek keyboard can be used to search the<br />

digital catalog. Each papyrus record includes extensive metadata and sometimes includes digital<br />

images and a Greek text transcription. The website is available in Spanish, Catalan, and English.<br />
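
Multiple keyword searching with Boolean operators across catalog fields, as described above, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the DVCTVS implementation: the function names, the record layout, and the sample entries are hypothetical, and only the field names `author`, `language`, and `findspot` echo fields listed in the text.<br />

```python
# Invented catalog entries, for illustration only.
CATALOG = [
    {"id": "rec-1", "author": "Author X", "language": "Greek", "findspot": "Findspot A"},
    {"id": "rec-2", "author": "", "language": "Coptic", "findspot": "Findspot A"},
    {"id": "rec-3", "author": "Author Y", "language": "Greek", "findspot": "Findspot B"},
]


def field_contains(record, field, term):
    """Case-insensitive substring match on one field; a missing field never matches."""
    return term.lower() in record.get(field, "").lower()


def boolean_search(records, clauses, operator="AND"):
    """Combine (field, term) clauses with a single Boolean operator."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    combine = all if operator == "AND" else any
    return [
        record for record in records
        if combine(field_contains(record, field, term) for field, term in clauses)
    ]


clauses = [("language", "greek"), ("findspot", "findspot a")]
both = boolean_search(CATALOG, clauses, "AND")    # must satisfy every clause
either = boolean_search(CATALOG, clauses, "OR")   # may satisfy any clause
```

With AND the two clauses intersect, and with OR they are unioned; a fuller query language would also need NOT and nested grouping, which this sketch leaves out.<br />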

One of the largest papyri “portals” and metadata aggregators, albeit with a concentration on<br />

collections from Ancient Egypt, is Trismegistos. 474 This project serves as an “interdisciplinary portal<br />

of papyrological and epigraphical resources dealing with Egypt and the Nile valley between roughly<br />

800 B.C. and AD 800,” and the majority of these resources are based at the Katholieke Universiteit<br />

Leuven. The core component of this portal is the Trismegistos Texts Database, 475 which supports<br />

federated searching across the metadata (currently 113,940 records) of a series of papyrological and<br />

epigraphic databases of related projects. The first group of partner projects update their information<br />

directly in the FileMaker database that underlies the online XML version of Trismegistos Texts; this group<br />

includes Hieroglyphic Hieratic Papyri (HHP), 476 Demotic and Abnormal Hieratic Texts (DAHT), 477<br />

Aramaic Texts from Egypt (ATE), 478 TM Magic, 479 and the Leuven Database of Ancient Books<br />

(LDAB). 480 Each of these databases has a project website as part of Trismegistos so their collections<br />

can be searched separately. Metadata also come from four other major projects that maintain separate<br />

databases and update their data in Trismegistos yearly: the HGV, 481 the Arabic<br />

Papyrology Database (APD), 482 the Brussels Coptic Database (BCD), 483 and the Catalogue of<br />

Paraliterary Papyri (CPP). 484 Each papyrus is given an individual Trismegistos number (TM Number)<br />

so that records for it can be easily found across all the databases. 485 The Trismegistos portal also<br />

473 http://dvctvs.upf.edu/lang/en/index.php<br />

474 http://www.trismegistos.org/<br />

475 http://www.trismegistos.org/tm/index.php<br />

476 http://www.trismegistos.org/hhp/index.php<br />

477 http://www.trismegistos.org/daht/index.php<br />

478 http://www.trismegistos.org/ate/index.php<br />

479 http://www.trismegistos.org/magic/index.php<br />

480 http://www.trismegistos.org/ldab/index.php. Further discussion of the LDAB can be found here.<br />

481 http://aquila.papy.uni-heidelberg.de/gvzFM.html<br />

482 http://orientw.uzh.ch/apd/project.jsp<br />

483 http://dev.ulb.ac.be/philo/bad/copte/base.php?page=accueil.php<br />

484 http://cpp.arts.kuleuven.be//<br />

485 Further discussion of the federated search approach and the use of a unique Trismegistos identifier can be found in Bagnall (2010); these topics are also<br />

discussed later in this paper.



provides access to a variety of other major resources, 486 and links to Trismegistos texts have recently<br />

become available through the Papyrological Navigator.<br />
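
The role of the TM Number as a shared key across partner databases can be sketched as follows. The record tuples, local identifiers, and numbers below are invented for illustration and do not correspond to real Trismegistos entries; the sketch assumes only that each partner record carries the TM number of the papyrus it describes.<br />

```python
from collections import defaultdict

# Hypothetical records: (source database, TM number, local identifier).
RECORDS = [
    ("HGV", 1001, "hgv-a"),
    ("DDbDP", 1001, "ddb-x"),
    ("APIS", 1001, "apis-7"),
    ("LDAB", 2002, "ldab-3"),
    ("HGV", 1001, "hgv-b"),  # a second HGV record for the same papyrus
]


def index_by_tm(records):
    """Group every partner record under the TM number it shares with the others."""
    index = defaultdict(list)
    for source, tm_number, local_id in records:
        index[tm_number].append((source, local_id))
    return dict(index)


def sources_for(index, tm_number):
    """List the databases that hold at least one record for a given papyrus."""
    return sorted({source for source, _ in index.get(tm_number, [])})


tm_index = index_by_tm(RECORDS)
unique_papyri = len(tm_index)   # number of distinct papyri
total_records = len(RECORDS)    # number of metadata records describing them
```

Because several records can point to the same number, the count of records in an aggregator can exceed the count of distinct papyri, and a lookup by TM Number returns every partner's record for one papyrus in a single step.<br />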

Of the four databases that are federated through Trismegistos, the HGV is the largest, and it is a<br />

participating partner in the APIS and the IDP projects as well as LaQuAT. First funded in 1988, this<br />

database includes extensive metadata 487 on almost all Greek and Latin documentary papyri and<br />

ostraca from Egypt and nearby areas that have appeared in more than 500 print publications. 488 The<br />

metadata in the HGV describe the papyri found in many other papyri collections, including many of<br />

those found within the APIS and in particular the DDbDP. The current database includes 56,100<br />

records, though not every record represents a unique papyrus: because many individual<br />

papyri have been published in separate collections, there are often several metadata records<br />

for the same papyrus. The database is based on FileMaker, and for users unfamiliar with this<br />

commercial database, searching the HGV requires referring to the specific database documentation.<br />

The user can also browse an alphabetical list of texts, and the record for each papyrus includes details<br />

of its publication and links to the full text when available in both the Perseus and papyri.info<br />

implementations of the DDbDP.<br />

Another database that is federated through Trismegistos but also maintains a separate website<br />

is the Arabic Papyrology Database (APD), a project created by the International Society for<br />

Arabic Papyrology. The APD allows users to search for Arabic documents on papyrus, parchment, and<br />

paper from the seventh through the sixteenth centuries AD. The website notes that although there are<br />

more than 150,000 Arabic documents conserved on papyrus and paper, only a small number of them<br />

have been published and extensively studied. 489 The APD provides access to about 850 (out of 2,000)<br />

Arabic texts and is the first electronic compilation of Arabic papyri. Both simple and advanced<br />

searching options are available, and the APD supports lemmatized searching of the papyri text and a<br />

full search of the metadata. The collection of papyri can also be browsed by name, metadata, or<br />

references. Each papyrus record includes full publication metadata, the full Arabic text (including<br />

variant readings and apparatus), a transcription, and relevant lexicon entries for words.<br />

The third database federated through Trismegistos with a significant separate web presence is the Brussels Coptic Database (BCD), a database of Coptic documentary texts that was started in 2000 and is currently maintained by Alain Delattre of the Centre de Papyrologie et d'Épigraphie Grecque of the Université Libre de Bruxelles. The BCD used the HGV as its initial model and now provides access to about 6,700 Coptic texts that have been previously published. The search interface supports fielded keyword searching in French in the following fields: sigla, inventory number, material, origin, date, dialect, content, bibliography, varia, and text ID. Although the website indicates plans to provide access to the full text of documents, currently most of the database is limited to metadata regarding the papyri.

486 These include a database of collections of papyrological and epigraphic texts (http://www.trismegistos.org/coll/index.php) created by the Leuven Homepage of Papyrus Collections and the project Multilingualism and Multiculturalism in Graeco-Roman Egypt, a list of papyri archives in Graeco-Roman Egypt (http://www.trismegistos.org/arch/index.php), the Prosopographia Ptolemaica (http://ldab.arts.kuleuven.be/prosptol/index.html), and a database of place names (http://www.trismegistos.org/geo/index.php).

487 Another significant database of papyrological metadata is the Mertens-Pack 3 database (http://www2.ulg.ac.be/facphl/services/cedopal/pages/mp3anglais.htm), which provides a catalog of information and bibliographic details on approximately 6000 Greek and Latin literary papyri.

488 http://www.rzuser.uni-heidelberg.de/~gv0/Liste_der_Publikationen.html (though a number of journal publications have not been covered).

489 A full list of published texts from which papyri have been taken can be found here at http://orientw.uzh.ch/apd/requisits3a.jsp



The final major separate resource federated through Trismegistos is the “Catalogue of Paraliterary Papyri (CPP),” 490 a research project sponsored by the Onderzoeksraad K.U. Leuven and directed by Marc Huys. This electronic catalog of paraliterary papyri “contains descriptions of Greek papyri and other written materials which, because of their paraliterary character, cannot be found in the standard electronic corpora of literary and documentary papyri, the Thesaurus Linguae Graecae (TLG) and the Duke Data Bank of Documentary Papyri (DDBDP),” making it very difficult for all but the specialist to find them. These paraliterary papyri have typically been published in various editions, and the CPP has sought to create a unified collection of these materials. The CPP includes digital versions of full-text editions of the paraliterary fragments (in both beta code and Unicode). All papyri have been encoded in TEI-XML but are presented online in HTML.

One of the more distinctive papyrology projects is the Vindolanda Tablets Online, 491 a digital collection that provides access to the online edition of the Vindolanda writing tablets that were “excavated from the Roman fort at Vindolanda in northern England.” This website includes searchable editions of volumes 1 and 2 of the tablets, an introduction, the archaeological and historical context, and a reference guide. The website received funding through the Mellon Foundation as part of the “Script, Image, and the Culture of Writing in the Ancient World” program and was created through the collaboration of the Centre for the Study of Ancient Documents (CSAD) 492 and the Academic Computing Development Team 493 at Oxford University. The collection of Vindolanda tablets can be browsed or searched, and the user can search for Latin text specifically or do a general text search within the tablet description or English translation (when available). The tablets database can also be browsed by different categories, including “highlights,” tablet number, subject, category, type, people, places, military terms, and archaeological context. The record for each tablet includes an image that can be viewed in great detail through a special viewer and also provides a transcription (with notes and commentary by line). Another particularly useful feature is the “print friendly tablet display” that provides a printable page of all the tablet information. 494 Each tablet has been encoded as a separate EpiDoc XML file, and the custom EpiDoc DTD, the Vindolanda XSL style sheet, and the entire corpus of inscriptions can be downloaded. 495

As has been illustrated above, representations of an individual papyrus can often be found in various databases, and the amount of information available (metadata, images, transcriptions, translations, etc.) can vary significantly between them. 496 The numerous challenges of integrating such collections are explored in depth in the next section.

Integrating Digital Collections of Papyri and Digital Infrastructure

Papyrology is such an important discipline within the field of classics that a number of the major digital classics research projects, including Concordia, eAQUA, eSAD, IDP, and LaQuAT, have a papyrology

490 http://cpp.arts.kuleuven.be/index.php

491 http://vindolanda.csad.ox.ac.uk/

492 http://www.csad.ox.ac.uk/

493 http://www.oucs.ox.ac.uk/acdt/

494 For example, see http://vindolanda.csad.ox.ac.uk/4DLink2/4DACTION/WebRequestQuerysearchTerm=128&searchField=printFriendly&searchType=number&printImage=yes&printCommentary=yes&printNotes=yes&printLatin=yes&printEnglish=yes

495 http://vindolanda.csad.ox.ac.uk/tablets/TVdigital-1.shtml

496 For example, records for the papyri in one important collection such as the Oxyrhynchus papyri (POxy) can be found in APIS, HGV, and Trismegistos, and the full text of many of the documents can be found in the DDbDP. POxy also has a descriptive website (http://www.papyrology.ox.ac.uk/POxy/) and an online database (http://163.1.169.40/cgi-bin/librarysite=localhost&a=p&p=about&c=POxy&ct=0&l=en&w=utf-8). This database allows the papyri to be searched as well as browsed by papyrus number, author, title, genre, date, or volume number. Individual papyrus records include a link to a papyrus image that is password protected.



component. While Concordia and LaQuAT seek to integrate papyri collections with other digital classical resources, such as epigraphical databases, into larger “virtual” collections that can be simultaneously searched, eAQUA and eSAD are developing technologies to assist papyrologists in the interpretation of their ancient texts.

Focused exclusively on papyri collections, the IDP project (Sosin et al. 2007, Sosin et al. 2008), 497 which is a joint effort of the DDbDP (the oldest digital resource in papyrology), the HGV, and the APIS, is working to create a single interface to these three collections, a project that has largely been realized through the creation of the Papyrological Navigator (PN). 498 Active research on improving the PN is ongoing, as illustrated by a recent blog post by Hugh Cayless (Cayless 2010c). One particular component of the PN that he has recently improved is a service that provides “lookup of identifiers” of papyri in one collection and “correlates them with related records in other collections.” While this service was originally built on a Lucene-based numbers server, Cayless is working to replace it with an RDF triplestore. One particular challenge is that of data integration and the difficulties of modeling the relationships between the same items in different databases. The complicated nature of these relationships includes several dimensions, such as different levels of hierarchy in database structures and various FRBR-type relationships (e.g., the ancient document is the work, but it has various expressions in different printed editions, including translations, and each of those editions has various manifestations, such as HTML or EpiDoc transcriptions). In addition, while the papyrological items and their metadata in different databases can sometimes have a 1:1 relationship (as is usually the case between the DDbDP and the HGV), there can also be overlap (such as between the APIS and the other two databases). Each database also has complicated internal relationships; for example, although the HGV utilizes the idea of a “principal edition” and chooses a single canonical publication of a papyrus, it also includes other earlier publications of the same papyrus in its metadata. The DDbDP follows the same basic idea but creates a new record that links to stub records for the older editions of each papyrus.
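The identifier-correlation service described above can be pictured with a small sketch. This is not the PN's code or data model: the identifiers, the predicate names (`relatedTo`, `replaces`, `principalEdition`, `earlierEdition`), and the `TripleStore` class are all hypothetical, and a plain Python list stands in for an RDF triplestore. It only illustrates how subject-predicate-object statements allow a lookup in one collection to be correlated with related records in the others.

```python
# Hypothetical identifiers and predicate names, invented for illustration only.
TRIPLES = [
    ("ddbdp:example.1", "relatedTo", "hgv:1001"),      # 1:1 link, text <-> metadata
    ("hgv:1001", "relatedTo", "apis:collection.42"),   # APIS record overlapping both
    ("ddbdp:example.1", "replaces", "ddbdp:stub.7"),   # stub for an older edition
    ("hgv:1001", "principalEdition", "pub:newer-edition"),
    ("hgv:1001", "earlierEdition", "pub:older-edition"),
]


class TripleStore:
    """A tiny in-memory stand-in for an RDF triplestore."""

    def __init__(self, triples):
        self.triples = list(triples)

    def match(self, s=None, p=None, o=None):
        """Return all triples matching the pattern; None acts as a wildcard."""
        return [
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ]

    def correlated(self, identifier, predicate="relatedTo"):
        """Follow `predicate` in both directions and collect every record
        reachable from `identifier`, excluding the identifier itself."""
        seen = {identifier}
        queue = [identifier]
        while queue:
            current = queue.pop()
            neighbours = [o for _, _, o in self.match(s=current, p=predicate)]
            neighbours += [s for s, _, _ in self.match(p=predicate, o=current)]
            for n in neighbours:
                if n not in seen:
                    seen.add(n)
                    queue.append(n)
        seen.discard(identifier)
        return sorted(seen)


store = TripleStore(TRIPLES)
print(store.correlated("apis:collection.42"))
```

Starting from the hypothetical APIS record, the lookup reaches both the HGV and the DDbDP records, while the stub for the older edition stays out of the result because it is linked by a different predicate.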

To better represent the complexity of these relationships, Cayless graphed them in Mulgara 499 (a scalable RDF database that is based on Java), so that he could use SPARQL queries to fetch data and then map these to easily retrievable and citable URLs that follow a standard pattern. Results from SPARQL queries will also be made available in Notation3 500 and JSON formats, so that the interfaces to the data available through the PN are both human-readable and machine-usable. Cayless also reported that he was looking into using the DC TERMS vocabulary as well as other relevant ontologies such as the FRBR vocabulary. 501 Ultimately, Cayless hoped to link the bibliography in individual papyrus records to Zotero 502 and to ancient place names in Pleiades. “It all works well with my design philosophy for papyri.info,” Cayless concluded, “which is that it should consist of data (in the form of EpiDoc source files and representations of those files), retrievable via sensible URLs, with modular services surrounding the data to make it discoverable and usable.”
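As a rough illustration of mapping query results to citable URLs and offering them in more than one serialization, the sketch below turns a list of result triples into N3-style statements and into JSON. The base URL, the URL pattern, the predicate namespace, and the sample identifiers are placeholders chosen for the example; they do not reproduce the actual papyri.info URL scheme or the vocabularies Cayless was considering.

```python
import json

# Placeholder host and namespace: assumptions for this sketch, not the real scheme.
BASE_URL = "http://example.org"
TERMS = "http://example.org/terms/"


def citable_url(identifier):
    """Map a 'collection:item' identifier to BASE_URL/collection/item."""
    collection, _, item = identifier.partition(":")
    return f"{BASE_URL}/{collection}/{item}"


def to_n3(triples):
    """Serialize triples as one statement per line: <subject> <predicate> <object> ."""
    return "\n".join(
        f"<{citable_url(s)}> <{TERMS}{p}> <{citable_url(o)}> ."
        for s, p, o in triples
    )


def to_json(triples):
    """Serialize the same triples as a JSON array of objects."""
    rows = [
        {"subject": citable_url(s), "predicate": TERMS + p, "object": citable_url(o)}
        for s, p, o in triples
    ]
    return json.dumps(rows, indent=2)


# Hypothetical query results, as (subject, predicate, object) tuples.
results = [("ddbdp:example.1", "relatedTo", "hgv:1001")]
print(to_n3(results))
print(to_json(results))
```

Keeping the URL pattern in a single function means every serialization cites a record the same way, which is the property the "sensible URLs" design philosophy quoted above depends on.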

A recent article by Roger Bagnall has offered an in-depth discussion of the IDP project. As he explained, the goals of the IDP have changed in two specific ways since it was first conceptualized in 1992:

497 http://idp.atlantides.org/trac/idp/wiki/

498 http://www.papyri.info

499 http://www.mulgara.org/

500 Notation3 or N3 is a “shorthand non-XML serialization of Resource Description Framework models, designed with human-readability in mind.” http://en.wikipedia.org/wiki/Notation3

501 http://vocab.org/frbr/core.html

502 http://www.zotero.org/



One is toward openness; the other is toward dynamism. These are linked. We no longer see IDP as representing at any given moment a synthesis of fixed data sources directed by a central management; rather, we see it as a constantly changing set of fully open data sources governed by the scholarly community and maintained by all active scholars who care to participate. One might go so far as to say that we see this nexus of papyrological resources as ceasing to be “projects” and turning instead into a community (Bagnall 2010).

The IDP, like many other digital classics and humanities projects, is shifting away from the idea of creating static, project-based, and centrally controlled digital silos and toward dynamic, community-maintained resources where both the data and participation are open to all scholars who wish to take part. Bagnall argued that this shift included reconceptualizing what it means to be an editor and that the distinction that was once made between editing texts and creating textual banks should be abandoned. As part of this new level of openness, the IDP plans to expose both its data and the code that was used to create the system. This means that if other scholars wish to create a new interface to the data or reuse it in various ways, they may do so. The IDP also hopes to include more sources of data, and Bagnall lists at least two projects that are reusing its code. The data in the IDP use the EpiDoc encoding standard, which, although created for inscriptions, has been increasingly used for recording papyri and coins. As EpiDoc uses standard TEI elements, new types of search interfaces can be created that “will interrogate a range of ancient sources of different types.” In fact, the Concordia project has begun to create a prototype for this kind of integration between papyrology and epigraphy and also connects the documents to the Pleiades database.

In this second phase of the IDP, the project has created an online editing system named “Son of Suda On Line” (SoSOL) 503 that will allow authorized participants 504 to enter texts into the DDbDP and metadata into the HGV and APIS. The name was inspired by Ross Scaife’s project Suda On Line (SOL), a collaborative site that supports group translation of the Byzantine encyclopedia the Suda. 505 This system supports the creation of editions that become publicly visible only when the editor chooses to release them. Where previously texts that were published in printed editions had to be retyped into a database, the IDP supports a new form of dynamic publication that is controlled by the individual editors and a larger editorial board. In a manner similar to that of the Pleiades model, any user can contribute variant readings, corrections, new texts, translations, or metadata, although all suggestions must be approved by the editorial board. The editing system records every step of this process, from proposal through vetting to a final status as accepted or rejected, and a prose justification must be given at each step. Accepted proposals can be kept as limited access if the creator desires. One strength of this model is that rejected proposals are not deleted forever but are instead retained in the digital record, in case new data or better arguments appear to support them. Additionally, all accepted proposals are attributed to their contributor so that proper scholarly credit can be given. As explained by Joshua Sosin:

Permanent transparency is the guiding principle behind SoSOL. The system keeps track of everything. When you log in and submit a text, SoSOL records it; when you submit a text or propose an emendation SoSOL will not let you submit until you have written a message explaining what you propose. Similarly, SoSOL will not allow Editors to vote on a text without explaining why they vote the way they do. For every single text SoSOL keeps a permanent and

503 An overview of this software and full technical details can be found at http://idp.atlantides.org/trac/idp/wiki/SoSOL/Overview/

504 Access to the prototype system is available at http://halsted.vis.uky.edu/protosite/.

505 Full details on SoSOL, as well as a walkthrough of the software, can be found in Sosin (2010).



comprehensive record of every single change. Users can see this, forever. The discipline of<br />

transparency and permanence has the virtue of requiring all of us to live up to the high<br />

standards of our field’s motto, and make that motto meaningful: amicitia papyrologorum.<br />

Collegiality is, in effect, a technical requirement of SoSOL. It also means that all proposals<br />

must be offered and scrutinized with utmost seriousness, since our comments are visible to all,<br />

forever. And, that under SoSOL accurate scholarly attribution is very easy to enforce (Sosin<br />

2010).<br />
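The two rules Sosin describes (no submission or vote without an explanation, and a permanent record of every change) can be sketched in a few lines. This is only an illustration of the idea, not SoSOL's actual implementation; the class names, field names, and sample message are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class LogEntry:
    """One immutable record of an editorial action (hypothetical structure)."""
    author: str
    action: str      # e.g. "emendation" or "vote"
    message: str     # the required explanation
    timestamp: datetime


@dataclass
class TextHistory:
    """Append-only history for a single text."""
    _entries: list = field(default_factory=list)

    def record(self, author, action, message):
        # Refuse any action that arrives without an explanation.
        if not message or not message.strip():
            raise ValueError("an explanatory message is required")
        entry = LogEntry(author, action, message.strip(),
                         datetime.now(timezone.utc))
        self._entries.append(entry)
        return entry

    @property
    def entries(self):
        # Expose a read-only copy so that past entries cannot be edited.
        return tuple(self._entries)
```

Because entries are frozen and the history only grows, attribution falls out of the data model: every change carries its author and justification.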

Despite assurances that proper credit would be given, Bagnall noted that many contributors were<br />

worried about the visibility of their work being diminished as their data became “absorbed” in a larger<br />

system. To address this issue, Bagnall reported that the Trismegistos project (which provides access to a<br />

number of databases created both internally and externally such as the HGV) adds a unique item<br />

identifier to the records for an individual item in each database. This allows the federated system to<br />

find all hits for individual items while still keeping databases entirely separate; the user has to move<br />

between databases to look at the relevant information. Bagnall thus explains how “highly distinct<br />

branding is central to their approach.” While Trismegistos has built-in links to both the DDbDP and<br />

APIS, users must go to the individual databases from Trismegistos. In addition, no data are exposed for<br />

web services. Nonetheless, conversations are apparently under way to try to more closely integrate<br />

Trismegistos with the PN, and Bagnall acknowledged that originally the HGV maintained a similarly<br />

more closed model before deciding to let the new online editing environment operate on their metadata<br />

as well. 506 Thus at the end of IDP, both HGV and DDbDP will issue archival XML data under CC<br />

licenses.<br />
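The shared-identifier approach described above can be illustrated with a small sketch. It assumes only what the text states: each database keeps its own records, and every record carries a common item identifier. The database names, record fields, and identifiers below are invented for the example and do not reflect the actual Trismegistos data model.

```python
from collections import defaultdict

# Hypothetical records: each database keeps its own local IDs and fields,
# but every record also carries a shared item identifier ("item_id").
DATABASES = {
    "texts_db": [
        {"local_id": "t-17", "item_id": 1001, "title": "Letter (transcription)"},
        {"local_id": "t-42", "item_id": 1002, "title": "Receipt (transcription)"},
    ],
    "metadata_db": [
        {"local_id": "m-3", "item_id": 1001, "date": "2nd century"},
    ],
    "images_db": [
        {"local_id": "i-9", "item_id": 1001, "image": "letter_recto.jpg"},
        {"local_id": "i-10", "item_id": 1003, "image": "fragment.jpg"},
    ],
}


def build_index(databases):
    """Map each shared item identifier to (database name, local ID) pairs."""
    index = defaultdict(list)
    for db_name, records in databases.items():
        for record in records:
            index[record["item_id"]].append((db_name, record["local_id"]))
    return dict(index)


def find_hits(index, item_id):
    """Return where an item appears; the records stay in their own databases."""
    return index.get(item_id, [])
```

The index holds only pointers, so a user who finds a hit must still visit each database to see its record, which is the trade-off the paragraph above describes.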

While Bagnall considered concerns about branding and attribution to be legitimate because of issues of<br />

funding, credit, and tenure, he thought that the risks of creating closed collections were more onerous:<br />

But keeping data in silos accessible only through one’s own interface has risks too, and in my<br />

view they are greater—the risk that search engines will ignore you and you will therefore reach<br />

a much smaller audience. Our purpose in existing is education; the more we shut out potential<br />

users who will come at the world through Google or similar engines, the fewer people we will<br />

educate. That to me is an unacceptable cost of preserving the high relief of the branded silo.<br />

Moreover, these resources will never reach their full value to users without extensive<br />

interlinkage, interoperation and openness to remixing by users (Bagnall 2010).<br />

Beyond branding, the other major concern of most scholars was inadequate quality control. Bagnall<br />

convincingly argued, however, that the editorial structure of Pleiades and IDP in some ways offered<br />

stronger quality control measures in that incorrect or inappropriate information could be removed far<br />

more quickly than from a printed text, and that the open system allows the community to alert the<br />

editors to mistakes they may have missed. “These systems are not weaker on quality control,” Bagnall<br />

offered, “but stronger, inasmuch as they leverage both traditional peer review and newer community-based<br />

‘crowd-sourcing models’.” New peer-review models such as those developed by this project are<br />

essential in any digital infrastructure that hopes to gain buy-in from a large number of scholars. 507<br />

Another major point of contention for many scholars Bagnall listed, and one not that easily addressed,<br />

is the issue of personal control. Many scholars are possessive of projects that they have created, and<br />

506 Links to Trismegistos texts have recently become available through the Papyrological Navigator website.<br />

507 The importance of new peer-review models that make use of the Internet to solicit a broader range of opinions on scholarly material was the subject of<br />

a recent New York Times article (Cohen 2010).



this idea of personal ownership of objects and data was also strongly illustrated in archaeology (Harley<br />

et al. 2010). While personal investment has its merits, Bagnall also proposed that “control is the enemy<br />

of sustainability; it reduces other people’s incentive to invest in something.” Regarding the experience<br />

of IDP, Bagnall worried that too much discussion centered upon revenue rather than expense, and<br />

strongly doubted that there was any “viable earned-income option for papyrology.” While the IDP<br />

briefly considered direct subscription charges and also pondered creating an endowment, they<br />

ultimately abandoned both ideas.<br />

Although they considered it likely that they could raise some money, the IDP was uncertain of how<br />

much money would be needed and of what they wanted to “fund in perpetuity.” They increasingly<br />

realized that if neither the APIS nor the DDbDP were “defensible silos,” then neither was the<br />

discipline of papyrology. Far more essential than preserving individual projects, Bagnall reasoned, was<br />

developing a shared set of data structures and tools to exploit various types of ancient evidence. As<br />

previously noted by Roueché in her overview of digital epigraphy, many of the disciplinary divisions<br />

so entrenched today are in many ways, as Bagnall eloquently expresses, “arbitrary divisions of a<br />

seamless spectrum of written expression” that include numerous sources. Sustainability for the IDP,<br />

Bagnall proposed, “will come in the first instance from sharing in an organizational and technological<br />

infrastructure maintained to serve a much wider range of resources for the ancient world (and perhaps<br />

not necessarily limited to antiquity, either)” (Bagnall 2010). Technological infrastructure will be one<br />

major cost; content creation and maintenance are two others. The way forward for papyrology, Bagnall<br />

concluded, was to go beyond its limits as a discipline and in particular its separateness. Indeed, as<br />

illustrated by this review, many scholars have commented on the fact that the digital environment has<br />

rather unexpectedly provided new opportunities to transcend disciplinary boundaries and to promote a<br />

more integrated view of the ancient world.<br />

EpiDoc, Digital Papyrology, and Reusing Digital Resources<br />

As illustrated above, EpiDoc is being used by the IDP project to integrate several papyrology projects.<br />

The use of this standard has also allowed researchers to explore new questions using the Vindolanda<br />

tablets. Recent work by eSAD 508 (eScience and Ancient Documents), a collaborative project between<br />

the e-Research Centre, the Centre for the Study of Ancient Documents, and Engineering Science at<br />

the University of Oxford, has examined the various ways in which the highly granular encoding of the<br />

Vindolanda tablets can “be used to create a reusable word and character corpus for a networked e-Science<br />

system and other e-Science applications” (Roued 2009). The eSAD project has two major<br />

goals: (1) to develop e-Science tools that aid in interpreting damaged texts; and (2) to develop new<br />

image-analysis algorithms that can be used with digitized images of ancient texts. In terms of<br />

Vindolanda, Roued investigated how the encoded EpiDoc XML of the tablets could be used to create a<br />

knowledge base of Latin words for an Interpretation Support System (ISS) that would assist users in<br />

reading other ancient documents.<br />

The Vindolanda project had decided to use EpiDoc to support at least a minimal level of semantic<br />

encoding for the tablets. Roued noted that even with standard conventions such as Leiden, not all the<br />

conventions were applied evenly, as some scholars used “underdots” to indicate partially preserved<br />

characters while others used them to mark doubtful characters. The use of EpiDoc consequently<br />

addressed these types of issues with Leiden encoding as it was commonly practiced:<br />

508 http://esad.classics.ox.ac.uk



This example illustrates the primary advantage of encoding the editions in XML. If editors<br />

wish to differ between uncertain characters and broken characters they can encode them with<br />

different tags. They can then transform both tags into under-dots if they still wish to present<br />

both instances as such or they can decide to visualize one instance, underlined and the other<br />

under-dotted to distinguish between them (Roued 2009).<br />

Thus, EpiDoc allows different scholarly opinions to be encoded in the same XML file since content<br />

markup (EpiDoc XML) and presentation (separate XSLT sheets) are separated. Roued also supported<br />

the argument of Roueché (2009) that EpiDoc encoding is not a “substantial conceptual leap” from<br />

Leiden encoding.<br />
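The separation of content markup from presentation can be made concrete with a small sketch. Python functions stand in here for the XSLT stylesheets the projects actually used, and the tag names `uncertain` and `broken` are placeholders chosen for readability, not actual EpiDoc elements. The point is only that one encoded source can yield either the traditional all-underdot display or a display that distinguishes the two cases.

```python
import xml.etree.ElementTree as ET

# Illustrative markup only: <uncertain> and <broken> are placeholder tag
# names, not actual EpiDoc elements.
SOURCE = "<line>ab<uncertain>c</uncertain>d<broken>ef</broken>g</line>"

UNDERDOT = "\u0323"   # Unicode combining dot below
UNDERLINE = "\u0332"  # Unicode combining low line


def decorate(text, mark):
    """Append a combining mark to every character of text."""
    return "".join(ch + mark for ch in text)


def render(xml_source, styles):
    """Render a line, styling each tagged span with the mark for its tag."""
    root = ET.fromstring(xml_source)
    parts = [root.text or ""]
    for child in root:
        parts.append(decorate(child.text or "", styles[child.tag]))
        parts.append(child.tail or "")
    return "".join(parts)


# Presentation 1: traditional output, both categories shown as underdots.
LEIDEN_STYLE = {"uncertain": UNDERDOT, "broken": UNDERDOT}
# Presentation 2: the two categories are visually distinguished.
DISTINCT_STYLE = {"uncertain": UNDERDOT, "broken": UNDERLINE}
```

Calling `render(SOURCE, LEIDEN_STYLE)` and `render(SOURCE, DISTINCT_STYLE)` produces two different displays from the same encoded line, without touching the encoding itself.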

While the first two Vindolanda tablet publications were encoded using EpiDoc, Roued observed that<br />

the level of encoding was not very granular and the website was not well set up to exploit the<br />

encoding. She also noted that the level of encoding a project chooses typically depends both on the<br />

technology chosen and the anticipated future use of the data. For the next series of Vindolanda tablets,<br />

Roued explained that the project decided to pursue an even more granular level of encoding, including<br />

words and terms in the transcription. This has supported an interactive search functionality and added<br />

greater value to the encoding as a knowledge base. To begin with, the project encoded the tablets in<br />

greater detail regarding Leiden:<br />

Encoding instances of uncertainty, added characters and abbreviations enables us to extract<br />

these instances from their respective texts and analyze them. We can, for example, count how<br />

many characters in the text or texts are deemed to be uncertain. Similarly, we can look at the<br />

type of characters that are most likely to be supplied. These illustrate the many new<br />

possibilities for analyzing the reading of ancient document (Roued 2009).<br />
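The kind of extraction Roued describes might look like the following sketch. The element names `unclear` and `supplied` follow TEI usage, but the fragment itself is invented and, for brevity, omits the namespace and surrounding structure that a real EpiDoc file would have.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample; a real EpiDoc file would be namespaced and far richer.
SAMPLE = (
    "<ab>"
    "<w>sal<unclear>u</unclear>tem</w> "
    "<w><supplied>fr</supplied>atri</w> "
    "<w>s<unclear>uo</unclear></w>"
    "</ab>"
)


def count_uncertain(xml_source):
    """Count the characters that editors marked as uncertain."""
    root = ET.fromstring(xml_source)
    return sum(len("".join(el.itertext())) for el in root.iter("unclear"))


def supplied_characters(xml_source):
    """Tally which characters editors supplied, and how often."""
    root = ET.fromstring(xml_source)
    counts = Counter()
    for el in root.iter("supplied"):
        counts.update("".join(el.itertext()))
    return counts
```

Run over a whole corpus rather than one line, the same two functions would give the counts of uncertain characters and the profile of supplied characters mentioned in the quotation.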

In addition to more extensive encoding of the texts in EpiDoc, the eSAD project decided to perform a<br />

certain amount of manual “contextual encoding” of words, people, place names, dates, and military<br />

terms, or basically of all the items found in the indexes. For words, the index contained a list of<br />

lemmas with references to places in the text where corresponding words occurred; encoding these data<br />

allowed them to extract information such as the number of times a lemma occurred in the text. During<br />

the encoding of the indexes, the project discovered numerous errors that needed to be corrected. All of<br />

this encoding has been performed to support new advanced searching features with a new launch of the<br />

website as Vindolanda Tablets Online 2.0 in 2010. In particular, they have developed an interactive<br />

search feature using AJAX, 509 LiveSearch, JavaScript, and PHP 510 that gives the user feedback while<br />

typing in a search term. In the case of Vindolanda, it will give users a list of all words, terms, names,<br />

and dates that contain their search pattern.<br />

The XML document created for each inscription text contains all of its relevant bibliographic<br />

information and textual encoding, and Roued explained that this necessitated developing methods that<br />

could extract only the relevant information, depending upon the need. The project thus decided to build<br />

RESTful web services using the ZEND framework 511 and PHP. The Vindolanda web services receive<br />

URLs with certain parameters and return answers as XML. This allows other projects to utilize these<br />

encoded XML files, and, in particular, the knowledge base of Latin words. This web service is being<br />

used in their related project that seeks to develop an ISS for readers of ancient documents. The<br />

509 AJAX, short for “Asynchronous JavaScript and XML,” is a technique “for creating fast and dynamic web pages”<br />

http://www.w3schools.com/ajax/ajax_intro.asp<br />

510 PHP stands for “PHP: Hypertext Preprocessor” and is a server-side scripting language, http://www.w3schools.com/php/php_intro.asp<br />

511 http://framework.zend.com/



prototype 512 includes a word search that “takes the partially interpreted characters of a word and<br />

attaches them to the web service URL as a pattern, thus receiving suggestions for the word using the<br />

Vindolanda tablets as a knowledge base” (Roued 2009).<br />
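A rough sketch of this pattern-based lookup follows. It does not reproduce the Vindolanda web service: the endpoint URL, the `pattern` parameter name, the use of `?` for an unread character, and the word list are all assumptions made for illustration. The first function shows how a partial reading could be attached to a service URL; the second mimics locally the kind of matching such a service might perform against its knowledge base.

```python
import re
from urllib.parse import urlencode

# Hypothetical endpoint and parameter name, for illustration only.
BASE_URL = "http://example.org/knowledge-base/words"

# Invented stand-in for a knowledge base of attested word forms.
KNOWN_WORDS = ["frater", "fratri", "salutem", "soror", "suo"]


def build_request_url(pattern):
    """Attach a partial reading to the service URL as a query parameter."""
    return f"{BASE_URL}?{urlencode({'pattern': pattern})}"


def suggest(pattern, words=KNOWN_WORDS):
    """Return known words matching a partial reading.

    Assumed convention: '?' stands for exactly one unread character.
    """
    regex = "".join("." if ch == "?" else re.escape(ch) for ch in pattern)
    return [word for word in words if re.fullmatch(regex, word)]
```

For instance, a reader who can make out only the first, second, and fourth letters of a six-letter word could call `suggest("fr?t??")` and receive the candidate forms from the word list that fit those letters.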

The research work by eSAD with the Vindolanda tablets demonstrates how the use of standard<br />

encoding such as EpiDoc can support new research such as the development of knowledge bases from<br />

encoded texts. It also illustrates the potential of providing access to such knowledge resources for other<br />

digital humanities projects through web services.<br />

Collaborative Workspaces, Image Analysis, and Reading Support Systems<br />

The unique nature of working with papyri has made papyrology a fairly collaborative discipline, in<br />

contrast to some of the other subdisciplines of classics. As noted earlier, Hanson described the amicitia<br />

papyrologorum of the nineteenth and twentieth centuries, and this trend, it appears, continues today.<br />

“Individual specialists, particularly in the study of cultural artifacts or documentary remains, work with<br />

collections of artifacts, texts, artworks, and architecture that may span several excavation sites,” Harley<br />

et al. (2010) explained, adding that “as a result, scholars can be highly collaborative” … “in how they<br />

locate and work with these materials in order to extract as much information and detail as possible.”<br />

In their review of archaeology, Harley et al. also observed that some scholars desired web-based<br />

workspaces that could be shared when working on documentary remains. The authors quoted one<br />

scholar at length:<br />

It would be nice to be able to have a more convenient way of looking at images all at the same<br />

time and manipulating them. We can now do that with text pretty easily, but let’s say a few of<br />

us are working on an edition of a papyrus and we want to discuss some particular feature in a<br />

high-resolution image of it. The only thing we can really do very easily at that point is to all<br />

look at the same webpage or to pass the image around by email and give verbal cues to<br />

navigate to a particular point (Harley et al. 2010, 111).<br />

Some initial work toward providing such an environment has been conducted by the VRE-SDM<br />

project 513 and this work largely continues under the eSAD project. Tarte et al. (2009) have observed<br />

that greater accessibility to legible images and a collaborative working environment are both important<br />

components of any potential VRE:<br />

For Classical historians, adding to the legibility challenge, access to the document is often<br />

limited. Collaborative work on the documents is one factor that facilitates their deciphering,<br />

transcription and interpretation. The Virtual Research Environment for the Study of Documents<br />

and Manuscripts pilot software (VRE-SDM) … was developed to promote non-colocated work<br />

between documentary scholars, by providing them with a web-based interface allowing them to<br />

visualize and annotate documents in a digitized form, share annotations, exchange opinions and<br />

access external knowledge bases (Tarte et al. 2009).<br />

The development of this prototype and the in-depth examination of the working methodologies of<br />

scholars who work with such damaged texts through a video-based ethnographic study have been<br />

described in Bowman et al. (2010) and de la Flor et al. (2010a). 514 De la Flor et al. reported that the<br />

512 This prototype is discussed <str<strong>on</strong>g>in</str<strong>on</strong>g> the next secti<strong>on</strong> of this paper.<br />

513 http://bvreh.humanities.ox.ac.uk/VRE-SDM.html<br />

514 One publicati<strong>on</strong> regard<str<strong>on</strong>g>in</str<strong>on</strong>g>g this work (de la Flor et al. 2010b) <strong>on</strong>ly recently came to the attenti<strong>on</strong> of the author <strong>and</strong> could not be more fully <str<strong>on</strong>g>in</str<strong>on</strong>g>tegrated <str<strong>on</strong>g>in</str<strong>on</strong>g><br />

this report.


153<br />

use of image process<str<strong>on</strong>g>in</str<strong>on</strong>g>g techniques is not particularly new <str<strong>on</strong>g>in</str<strong>on</strong>g> either epigraphy 515 or papyrology, but<br />

few technologies have been based <strong>on</strong> detailed studies of the actual work<str<strong>on</strong>g>in</str<strong>on</strong>g>g practices of classicists with<br />

digital images. While the project was developed with papyrologists <strong>and</strong> epigraphers <str<strong>on</strong>g>in</str<strong>on</strong>g> m<str<strong>on</strong>g>in</str<strong>on</strong>g>d, Bowman<br />

et al. (2010) hoped that the VRE-SDM might prove useful to any scholars who worked with<br />

manuscripts <strong>and</strong> thus they attempted to develop a tool that could be generalized for various discipl<str<strong>on</strong>g>in</str<strong>on</strong>g>es.<br />

In this particular case, however, de la Flor et al. (2010a) had videotaped the collaborative work<br />

sessi<strong>on</strong>s of expert classicists who were work<str<strong>on</strong>g>in</str<strong>on</strong>g>g with the Tolsum Tablet, a wooden tablet dat<str<strong>on</strong>g>in</str<strong>on</strong>g>g from<br />

the first century AD. “The aim of the film<str<strong>on</strong>g>in</str<strong>on</strong>g>g,” Bowman et al. (2010) expla<str<strong>on</strong>g>in</str<strong>on</strong>g>ed, “was to discover <strong>and</strong><br />

document the <str<strong>on</strong>g>in</str<strong>on</strong>g>herent practices, tools <strong>and</strong> processes used to decipher ancient texts <strong>and</strong> to establish<br />

ways <str<strong>on</strong>g>in</str<strong>on</strong>g> which a VRE might emulate, support <strong>and</strong> advance these practices” (Bowman et al. 2010, 95).<br />

The VRE-SDM project wanted both to c<strong>on</strong>struct <strong>and</strong> test their <str<strong>on</strong>g>in</str<strong>on</strong>g>terface, so they filmed four meet<str<strong>on</strong>g>in</str<strong>on</strong>g>gs<br />

between three specialists. The scholars worked with the VRE-SDM prototype <strong>and</strong> were able to display<br />

images of the tablet <strong>on</strong> a large screen.<br />

By watching how the scholars examined the tablet and progressed in their interpretation of the text, de la Flor et al. observed a number of significant processes at work, including (1) how the scholars identified shapes and letters in order to figure out words and phrases; (2) how the determination of shapes and letters was an iterative process that could involve major reinterpretation of earlier scholarly hypotheses regarding words; and (3) how scholars drew on their background knowledge in various languages, ancient history, and palaeographic expertise to analyze not just individual letters or words but the text as a whole. In this experiment, the ability to enhance multiple digital images of the text and to work collaboratively led the scholars to reinterpret several letters, and this consequently led them to reinterpret the word bovem as dquem. They thus concluded that the Tolsum Tablet was not about the sale of an ox but may have instead been an example of an early loan note. 516

The VRE-SDM prototype as it currently exists is controlled by either mouse or keyboard and, according to de la Flor et al. (2010a), provides a collaborative workspace where classicists can select high-resolution digital images, manipulate them in different ways, use different algorithms to analyze them, and then view them along with images, texts, and annotations. An annotation feature allows classicists to comment on letters, words, and phrases and to enter translations of them. In addition, to enable users to select and annotate image regions, the VRE-SDM extended the Annotea vocabulary. 517 Using a portlet, users can create, save, and share annotations, and Bowman et al. (2010) reported that they were hoping to build a system toward shared reading that, along with the use of a standard format such as EpiDoc XML, would allow users to create digital editions that could be “supported by an audit trail of readings.” To support integration with other projects, all annotations and metadata are represented as RDF and stored in a Jena 518 triplestore.
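To make the idea of an image-region annotation expressed as RDF more concrete, the following is a minimal sketch in Python. It is not the VRE-SDM implementation: the two namespaces, the property names, and the rectangular-region model are all hypothetical placeholders (they are not the real Annotea vocabulary or the project's extension of it), and the output is plain N-Triples text rather than data loaded into a Jena triplestore.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical namespaces: placeholders only, NOT the real Annotea
# vocabulary or the VRE-SDM extension of it.
ANNO = "http://example.org/annotation#"
REGION = "http://example.org/image-region#"

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class RegionAnnotation:
    """A scholar's comment on a rectangular region of a digitized image."""
    annotation_uri: str
    image_uri: str
    x: int
    y: int
    width: int
    height: int
    author: str
    body: str


def literal(value) -> str:
    """Format a value as a plain (untyped) N-Triples literal."""
    text = str(value)
    text = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{text}"'


def to_triples(a: RegionAnnotation) -> List[Triple]:
    """Describe one annotation as subject-predicate-object triples."""
    subject = f"<{a.annotation_uri}>"
    return [
        (subject, f"<{ANNO}annotates>", f"<{a.image_uri}>"),
        (subject, f"<{ANNO}author>", literal(a.author)),
        (subject, f"<{ANNO}body>", literal(a.body)),
        (subject, f"<{REGION}x>", literal(a.x)),
        (subject, f"<{REGION}y>", literal(a.y)),
        (subject, f"<{REGION}width>", literal(a.width)),
        (subject, f"<{REGION}height>", literal(a.height)),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples one per line in N-Triples syntax."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Illustrative data only; the URIs and the comment are invented.
example = RegionAnnotation(
    annotation_uri="http://example.org/anno/1",
    image_uri="http://example.org/img/tablet.jpg",
    x=10, y=20, width=30, height=40,
    author="scholar A",
    body='tentative reading: "b"',
)
print(to_ntriples(to_triples(example)))
```

Because each statement is an independent triple, annotations produced by different tools can be merged into one store without agreeing on a shared file layout, which is one plausible reason RDF suits the integration goal described above.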

While their current funding has allowed for only a pilot implementation, the VRE-SDM project is considering developing some new functionalities, including the creation of “hypothesis folders” where researchers could track translations proposed by colleagues for different texts (de la Flor et al. 2010a). Such a feature would allow scholars to associate specific images of a manuscript with translations or other assertions made about parts of that manuscript. They also seek to extend the current annotation tool with an ability to annotate parts of images so that scholars could store the reasons they used to propose translations of letters, words, or phrases. This need to annotate images at the level of individual words has also been reported by Cayless (2008) and Porter et al. (2009). One drawback to the prototype was that the magnification of images made it difficult to “point at a mark that differs considerably in scale from the original” (de la Flor et al. 2010a). Nonetheless, analysis of classicists’ actual use of their prototype confirmed their hypotheses that scholars need to be able to select different versions of the same image and to “browse, search, and compare images.”

515 The use of advanced image-processing technologies in epigraphy has been discussed earlier in this paper.

516 For more on the reinterpretation of the text, see Bowman et al. (2009) and Tarte (2011).

517 http://www.w3.org/2001/Annotea/. As seen throughout this paper, many types of annotations (editorial commentary, image annotations, linguistic annotations) are used by digital classics projects. The importance of being able to share different types of annotations both within and across disciplines has led to the creation of the Open Annotation Collaboration (http://www.openannotation.org/), which has released an “alpha” data model (http://www.openannotation.org/spec/alpha3/) and developed use cases for sharing annotations.

518 Jena is a “Java framework for building Semantic Web applications” (http://jena.sourceforge.net/) and provides a “programmatic environment” for working with RDF, RDFS, OWL and SPARQL.

In addition to allowing classicists to perform traditional tasks more efficiently, de la Flor et al. proposed that porting the model of their VRE-SDM to a larger infrastructure might support, on a far larger scale, the kind of reanalysis of ancient documents seen in their case study:

The VRE might be able to provide technological support when such re-interpretations are made. For example, by systematically annotating texts with tentative or firmer analyses of readings it may be possible to provide a way of tracking the consequences of re-interpretation for other similar texts. Currently, classicists draw on their expertise to consider a text, shifting back and forth from analyses of letters to analyses of words, lines of text and eventually to the tablet as a whole. Paleographers tend to use their own drawings of letter forms, developed through their own research. An e-Infrastructure might not only be able to distribute these resources between scholars, but it might also provide the means to communicate, explain and defend justifications, assertions and claims about a text (de la Flor et al. 2010a).

The ability to distribute specialized knowledge resources among scholars and to record and support varying scholarly interpretations of a text is an important component of developing a larger cyberinfrastructure for classics.

The eSAD project has complemented this work on image-analysis systems within VREs in its efforts to develop an interpretation support system (ISS) for papyrologists, epigraphers, and palaeographers that will assist them as they decipher ancient documents. Like de la Flor et al. (2010a), Olsen et al. (2009), Roued-Cunliffe (2010), and Tarte (2011) have examined the work of papyrologists in detail, particularly the processes used in creating interpretations of texts, in order to model these processes with digital methods. They plan to create a system that can aid the analysis of ancient documents by tracking how these documents are interpreted and read. “Such a system will facilitate the process of transcribing texts,” argued Olsen et al., “by providing a framework in which experts can record, track, and trace their progress when interpreting documentary material.” At the same time, Tarte (2011) insisted, “the aim of the ISS that is being developed is not to automate the interpretation process, but rather to facilitate the digital recording and tracking of the unravelling of that process.” In other words, eSAD does not conceive of creating an intelligent system that will automate the work of scholars but instead is designing a tool that will assist scholars as they read ancient documents and help them perform difficult tasks. “In this case,” Roued-Cunliffe explained, “these tasks would mainly be capturing complicated reasoning processes, searching huge datasets, accessing other scholars’ knowledge, and enabling co-operation between scholars working on a single document” (Roued-Cunliffe 2010). This tool will thus model the tacit knowledge and working processes of papyrologists as well as learn from their behavior in order to expedite their daily work and make suggestions in future work.



Although the application named DUGA that has been created by the eSAD project is based on decision support system (DSS) technology such as that used by doctors and engineers, the project team ultimately decided it was an interpretation support system they were creating since “experts transcribing ancient documents do not make decisions based on evidence but instead create interpretations of the texts based on their perception” (Olsen et al. 2009). At the same time, one of the key research goals was to explore issues of technology transfer, or to see if the ideas involved in creating a DSS could be transferred to the work of classical scholars (Roued-Cunliffe 2010).

One key idea behind the ISS is that an interpretation is made up of a network of “percepts” that range from low-level (determining a character was created by an incised stroke) to high-level (determining that several characters make up a word). While this network of percepts is implicit in the working process of papyrologists, the eSAD project plans to make it explicit in a “human-readable format through a web-based browser application” (Olsen et al. 2009). In the application, the most elementary percepts will be image regions that contain “graphemes”; these images will then be divided into cells “where each cell is expected to contain what is perceived as a character or a space” (Olsen et al. 2009). The division of the image is considered to be a tessellation, and documents can be tessellated in different ways. The basic idea is that individual interpretations can be represented as “networks of substantiated percepts” that will then be made explicit through an ontology. “The ontology aims to make the rationale behind the network of percepts visible,” Olsen et al. explained, “and thus expose both: (a) some of the cognitive processes involved in damaged texts interpretation; and (b) a set of arguments supporting the tentative interpretation” (Olsen et al. 2009). The final ISS will use this ontology (which will be formatted in EpiDoc) as a framework to assist scholars in creating transcriptions of texts.
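The notions of a tessellation and a network of substantiated percepts can be sketched as simple data structures. The Python below is an illustration under stated assumptions, not the eSAD ontology or the DUGA data model: the class and function names, the level labels, and the sample claims are all invented, and a line of text is simplified to a one-dimensional strip that is cut into cells at chosen x positions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Percept:
    """One node in the network: a claim plus the percepts that substantiate it."""
    pid: str
    level: str   # illustrative labels such as "stroke", "character", "word"
    claim: str
    supported_by: List[str] = field(default_factory=list)


def tessellate(width: int, boundaries: List[int]) -> List[Tuple[int, int]]:
    """Split a strip covering [0, width) into adjacent cells at the given x positions."""
    edges = [0] + sorted(boundaries) + [width]
    return [(edges[i], edges[i + 1]) for i in range(len(edges) - 1)]


def rationale(network: Dict[str, Percept], pid: str) -> List[str]:
    """List every percept that directly or indirectly supports `pid` (depth-first)."""
    seen: List[str] = []

    def visit(current: str) -> None:
        for child in network[current].supported_by:
            if child not in seen:
                seen.append(child)
                visit(child)

    visit(pid)
    return seen


# The same strip can be tessellated in more than one way.
three_cells = tessellate(100, [30, 60])
two_cells = tessellate(100, [50])

# Invented sample network: strokes substantiate characters, characters a word.
network = {
    p.pid: p
    for p in [
        Percept("s1", "stroke", "incised stroke, long descender"),
        Percept("s2", "stroke", "incised stroke, short diagonal"),
        Percept("s3", "stroke", "incised stroke, curved"),
        Percept("c1", "character", "cell 1 holds one character", ["s1", "s2"]),
        Percept("c2", "character", "cell 2 holds one character", ["s3"]),
        Percept("w1", "word", "cells 1-2 form one word", ["c1", "c2"]),
    ]
}
print(rationale(network, "w1"))
```

Walking the `supported_by` links from a high-level percept back to the strokes is one simple way to expose the "set of arguments supporting the tentative interpretation" that the ontology is meant to make visible.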

Another step in the modeling process for the ISS was creating image-capture and -processing algorithms that could embody perceptual processes of papyrologists. As papyrologists often do not have access to the original objects, they frequently work with digital photographs, and Tarte (2011) acknowledged that digitizing wooden stylus tablets as “text bearing objects” was not an easy feat. Upon observing papyrologists, the project team concluded that both manipulations of the images and prior knowledge played important roles in the perception of characters and words. The tablets were visualized using polynomial texture maps, and several algorithms were used to detect the text within the images. The algorithms created for minimizing background interference (flattening the grain of the wood) have also been utilized in the VRE-SDM. One of the most complicated (and still ongoing) tasks was developing algorithms to extract the “strokelets” that form characters (including broken ones), as this is the feature “on which the human visual system locks” (Tarte 2011). The final major algorithm developed was a “stroke-completion algorithm” that was created to facilitate both automatic and scholarly identification of characters. The ISS to be developed will eventually propose potential character readings (utilizing a knowledge base of “digitally identified list of possible readings”) but will never force a user to choose one (Tarte 2011).
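The principle that the system proposes readings but never imposes one can be illustrated with a short sketch. Everything here is an assumption made for illustration: the knowledge base is modeled as a plain word-to-frequency dictionary, the counts are invented, each cell is reduced to a set of letters the scholar considers possible, and the names `propose_readings` and `CellGroup` are hypothetical rather than part of the ISS.

```python
from typing import Dict, List, Optional, Set


def propose_readings(cells: List[Set[str]], word_frequencies: Dict[str, int]) -> List[str]:
    """Return known words compatible with the per-cell candidate letters, most frequent first."""
    matches = [
        word
        for word in word_frequencies
        if len(word) == len(cells)
        and all(letter in options for letter, options in zip(word, cells))
    ]
    # Sort by descending frequency, then alphabetically so the order is stable.
    return sorted(matches, key=lambda w: (-word_frequencies[w], w))


class CellGroup:
    """Keeps the system's suggestions separate from the scholar's own choice."""

    def __init__(self, cells: List[Set[str]], word_frequencies: Dict[str, int]) -> None:
        self.suggestions = propose_readings(cells, word_frequencies)
        self.chosen: Optional[str] = None  # stays None until the scholar decides

    def choose(self, reading: str) -> None:
        # Any reading is accepted, including one the system did not suggest.
        self.chosen = reading


# Toy word list with invented counts, for illustration only.
toy_frequencies = {"novem": 5, "bonum": 4, "bovem": 3, "bos": 2}

# Five cells; each set holds the letters considered possible for that cell.
uncertain_cells = [{"b", "n"}, {"o"}, {"v", "n"}, {"e", "u"}, {"m"}]

group = CellGroup(uncertain_cells, toy_frequencies)
print(group.suggestions)
```

Storing suggestions and the chosen reading in separate fields keeps the tool advisory: ranking affects only the order in which candidates are shown, not what the scholar may record.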

Many of the insights for both the algorithm development process and the format of the ISS built on earlier work by Melissa Terras that modeled how papyrologists read documents (Terras 2005). This model identified various levels of reading conducted by papyrologists (identifying features [strokelets], characters, series of characters, morpheme, grammatical level, meaning of word, meaning of groups of words, meaning of document), but Terras's use of knowledge-elicitation techniques, such as “think aloud” protocols, revealed that “interpretation as a meaning-building process” did not invariably begin at the feature level and then successively build to higher levels of reading. Instead, as Tarte explained, the creation of interpretations jumped between levels of reading, and interpretations at any given level might influence those at another. Roued-Cunliffe articulated this point further:



The conclusion drawn from this experience was mainly that reading ancient documents is not a process of transcribing the document letter-by-letter and line-by-line. Instead it is a cyclic process of identifying visual features and building up evidence for and against continually developing hypotheses about characters, words and phrases. This is then checked against other information in an ongoing process until the editors are happy with the final interpretation (Roued-Cunliffe 2010).

Of particular interest for the development of their ISS, then, was determining “how and why the jumps between reading levels occur, and to what extent vision, expertise and interpretation are intertwined” (Tarte 2011).
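The cyclic process described above, in which evidence accumulates for and against competing hypotheses at different reading levels, might be recorded along the following lines. This is a deliberately crude sketch and not the project's model: the names are hypothetical, the sample notes are invented, and a simple count of supporting minus opposing items stands in for scholarly judgment, which no tally can replace.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    note: str
    supports: bool  # True = evidence for, False = evidence against


@dataclass
class Hypothesis:
    level: str      # illustrative labels such as "character", "word", "phrase"
    reading: str
    evidence: List[Evidence] = field(default_factory=list)

    def add(self, note: str, supports: bool) -> None:
        """Record one more observation without discarding earlier ones."""
        self.evidence.append(Evidence(note, supports))

    def balance(self) -> int:
        """Supporting items minus opposing items (a stand-in for judgment)."""
        return sum(1 if e.supports else -1 for e in self.evidence)


def preferred(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Pick the hypothesis with the best balance; earlier entries win ties."""
    return max(hypotheses, key=lambda h: h.balance())


# Invented example: two competing word-level readings, A and B.
reading_a = Hypothesis("word", "reading A")
reading_b = Hypothesis("word", "reading B")

reading_a.add("first stroke resembles the expected letter form", True)
reading_a.add("context of the line does not fit", False)
reading_b.add("enhanced image shows an extra stroke", True)
reading_b.add("fits the expected formula of the document type", True)

current_best = preferred([reading_a, reading_b])
print(current_best.reading, current_best.balance())
```

Because evidence is only ever appended, the full record of how a case was built survives each revision, so a later change at the character level can be checked against every word-level hypothesis that relied on it.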

Further insight into this process came from an examination of the transcript of three papyrologists attempting to figure out a complicated letter form. Two major approaches were identified: a “kinaesthetic/paleographical approach,” where the scholar draws characters or traces over them with a finger to try to reconstruct the movements of a scribe; and a “philological/cruciverbalistic” approach, where the scholar looks at the question as a puzzle-solving task and often relies upon characters he or she is certain of to make decisions and test various hypotheses (Tarte 2011). Recognizing that the two approaches were not mutually exclusive, Tarte concluded that the ISS would need to be able to support both. Through analyzing this transcript she identified several forms of scholarly expertise and how they triggered jumps between reading levels, including visual skill (from experience), “scholarly content expectations” (based on prior knowledge), aspect-shifting (“ways of looking” vs. “ways of seeing”), and global-local oscillations. Translating the work processes that lead to scholarly interpretations into digital methods, however, has not been a simple task, Tarte stated. “One difficulty in building a case for an interpretation is that it is all about reconstructing a meaning for which there is no accessible ground truth,” Tarte reported; “the objective towards which we are tending is to facilitate the digital recording of how such a case is made” (Tarte 2011). The project has thus sought to unravel the process of making decisions and to “mind map” various percepts (through the creation of a schematic of percepts) that lead to the creation of interpretations.

Additional technical details on design choices for the ISS that would support the cruciverbalistic/philological approach have been given by Roued-Cunliffe (2010). She explained that since DUGA needs to record not just final scholarly decisions but the evidence used to create them, she has explored the idea of “using a set of knowledge bases, such as word lists and frequencies from relevant corpora” to suggest interpretations of words and letters as a scholar reads a document or as evidence to support a particular interpretation. Consequently, DUGA includes a “word-search facility” that is connected to a “knowledge base Web Service” called APPELLO that has been created from the EpiDoc XML files of the Vindolanda tablets. The project hopes to further develop the knowledge base function of this web service to support any textual corpus that uses EpiDoc. Roued-Cunliffe noted that classicists often use similar documents to make choices about the document they are currently analyzing, so a knowledge base drawn from related documents seemed a far better choice than a large rule base for classicists, such as the rules created in a DSS for doctors.
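The word-search idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not APPELLO's actual interface: the function names `build_knowledge_base` and `suggest`, the `?` placeholder for an unread character, and the toy word list are all hypothetical. The sketch shows how a partially read word, with some characters certain and others unknown, could be matched against word frequencies from a related corpus.

```python
import re
from collections import Counter


def build_knowledge_base(words):
    """Count how often each word form occurs in a reference corpus (hypothetical helper)."""
    return Counter(word.lower() for word in words)


def suggest(pattern, knowledge_base, unknown="?"):
    """Return (word, frequency) candidates for a partially read word.

    Characters the scholar is certain of are given literally; each unread
    character is given as `unknown`. Most frequent candidates come first.
    """
    regex = re.compile(
        "".join("." if ch == unknown else re.escape(ch) for ch in pattern.lower())
    )
    matches = [(w, n) for w, n in knowledge_base.items() if regex.fullmatch(w)]
    return sorted(matches, key=lambda item: (-item[1], item[0]))


# Toy word list standing in for lemmas drawn from encoded documents (assumption).
kb = build_knowledge_base(
    ["frater", "fratri", "salutem", "salutem", "Salutem", "saluere"]
)
for word, frequency in suggest("sal?t?m", kb):
    print(word, frequency)
```

The candidates are offered as suggestions only; in keeping with the design described here, the scholar decides how much weight each one deserves.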

The work of Roued-Cunliffe differed slightly from the earlier work of Melissa Terras in that the former identified “stages,” rather than levels, of reading, and placed certain identification tasks (such as the identification of letters) on a different level. Her work still relied on the basic idea that interpretations consist of “networks of percepts” from low to high level, and that these percepts are used to make scholarly decisions in an iterative fashion. Roued-Cunliffe also wanted to make sure the model used for DUGA reflected the actual working practices of classicists, and did not simply rely on quantitative measures:

Classical scholars do not traditionally justify their interpretations for example by claiming to be 85% sure of a character or word. Therefore, there would be no point in trying to quantify their perceptions by expressing a percentage of certainty for a given percept. Instead this research is working on a model of evidence for (+) and against (-) each percept. The network of percepts would furthermore enable each percept to act as evidence for or against other percepts (Roued-Cunliffe 2010).

Thus, Roued-Cunliffe wanted to make sure that DUGA could capture scholarly expertise and add reasons used by scholars to make decisions as “pieces of evidence” under the heading “Scholarly Judgments.” The current DUGA prototype stores ongoing interpretations as XML documents but does not yet store all the pieces of evidence for and against each percept.
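The evidence model quoted above, in which each percept carries evidence for (+) and against (-) it and one percept can serve as evidence for another, might be represented roughly as follows. This is a minimal sketch under assumptions: the class names `Percept` and `Evidence`, their fields, and the sample readings are hypothetical and are not taken from the DUGA prototype. No percentage of certainty is computed, in line with the quoted rationale.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Evidence:
    """One piece of evidence attached to a percept (hypothetical structure)."""
    description: str
    supports: bool  # True = evidence for (+), False = evidence against (-)
    source: Optional["Percept"] = None  # set when another percept is the evidence


@dataclass
class Percept:
    """A reading at any level: a stroke, a character, a word, a phrase."""
    label: str
    evidence: List[Evidence] = field(default_factory=list)

    def add(self, description, supports, source=None):
        self.evidence.append(Evidence(description, supports, source))

    def summary(self):
        """List the evidence with +/- signs; deliberately no numeric score."""
        lines = [self.label]
        for item in self.evidence:
            sign = "+" if item.supports else "-"
            origin = ""
            if item.source is not None:
                origin = f" [from percept: {item.source.label}]"
            lines.append(f"  ({sign}) {item.description}{origin}")
        return "\n".join(lines)


# Invented example readings, for illustration only.
character = Percept("character 3 is 'a'")
character.add("stroke shape resembles other a forms", True)
character.add("surface damage obscures the second stroke", False)

word = Percept("word is 'frater'")
word.add("reading of character 3", True, source=character)

print(character.summary())
print(word.summary())
```

Linking evidence to a source percept is what turns a flat list of judgments into the network the passage describes.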

The DUGA prototype is divided into a set of views, a “transcript view” and a “box view” that visualizes each character and word with boxes, which are populated by two different XSLT transformations of an XML document. Since users will view images of documents and need to make annotations on words or characters, Roued-Cunliffe is exploring integrating the image-annotation tool AXE, which is currently being enhanced by the TILE project (Porter et al. 2009), or an annotation viewer created by the BVREH project (Bowman et al. 2010). The use of one of these annotation tools would allow users to draw “character-, word- or line-boxes anywhere on the image at any time,” and these annotations would then be turned into XML and could have scholarly arguments attached to them. Scholars could also move back and forth between the box view and the annotation view as they needed to add or review characters or words. The ability to “support a circular interpretation process,” Roued-Cunliffe argued, was an essential design feature for DUGA. As more scholars use such a system to annotate texts, the evidence used for and against different percepts, such as problematic character identifications, could be stored and then presented to new scholars as they annotate the same text. In addition, the eSAD project is working on character-recognition software that will also make recommendations. “The word search and character recognition results should not be seen as conclusive evidence,” Roued-Cunliffe concluded, “but as suggestions that may either confirm the scholars’ current percept or inspire a new one. It is entirely up to the scholars to decide how they value each piece of evidence” (Roued-Cunliffe 2010).

Roued-Cunliffe commented that, to extend their model, more knowledge bases would need to become available, such as a knowledge base of Greek words for a scholar working on inscriptions. The general idea would be to allow scholars to choose from a number of knowledge bases, depending on the text with which they were working. Their current web service, APPELLO, makes use of the highly granular encoding of the EpiDoc XML files of the Vindolanda project, particularly the lemmas that were encoded. In the future, they hope to adapt APPELLO so that it can interact with other classical language data sets available online, and Roued-Cunliffe hopes that more projects will make their data sets available in a format such as XML. Other planned work is to turn the prototype into a working application, where the biggest issues will be designing an interface, determining how to store the networks of percepts and the evidence for and against them, and adding annotation software.

Philology

Moalla et al. (2006) have defined philology as a research field that studies “ancient languages, their grammars, the history and the phonetics of the words in order to educate and understand ancient texts” and that “is mainly based on the content of texts and concerns handwriting texts as well as printed documents.” Crane, Seales, and Terras (2009) similarly defined philology as the “production of shared primary and secondary sources about linguistic sources” and distinguished classical philology as a discipline that “focuses upon Greek and Latin, as these languages have been produced from antiquity through the present.”

As these definitions illustrate, the study of philology concerns all texts, whether they are ancient manuscripts or printed editions from the nineteenth century. The needs of philologists are closely tied to the development of digital editions and digital corpora, and various research surveyed throughout this review has explored different facets of philological research. For example, the LDAB helps philologists find the oldest preserved copies of individual texts, and portals such as KIRKE and Propylaeum have created selected lists of digital philological resources. TextGrid is dedicated to creating a specialist text-editing environment, and philologists are one of its intended user groups (Dimitriadis et al. 2006, Gietz et al. 2006). Various computational tools, such as morphological analyzers, lexicons, and treebanks, have been developed to assist philologists of Sanskrit (Huet 2004, Hellwig 2010), Latin (Bamman and Crane 2006, Bamman and Crane 2009), and Greek (Bamman, Mambrini, and Crane 2009, Dik and Whaling 2009). Other tools, such as DUGA (Roued-Cunliffe 2010), Hypereidoc (Bauer et al. 2008), and OCHRE, have been created to help philologists create digital critical editions. Other work has explored cyberinfrastructure for digital philology and digital classics (Crane, Seales and Terras 2009). This section looks at several research projects that hope to support a new type of “digital philology.”

One of the greatest obstacles to “digital philology,” according to some researchers, is that digital corpora such as the TLG and the PHI databank of Latin texts simply choose a single edition as their canonical version of a text and provide no access to the apparatus criticus (Boschetti 2009, Ruhleder 1995):

Such approach to the ancient text, just about acceptable for literary and linguistic purposes, is unfeasible for philological studies. In fact, the philologist needs to identify manuscript variants and scholars’ conjectures, in order to evaluate which is the most probable textual reading, accepting or rejecting the hypotheses of the previous editors. Furthermore, he or she needs to examine the commentaries, articles and monographs concerning specific parts of the text. Thus, the extension in breadth of the aforementioned collections needs to be integrated by the extension in depth, according to the paradigms of a new generation of digital libraries (Boschetti 2009).

The Digital Aeschylus project described by Boschetti (2009) seeks to remedy these problems by creating a digital library that includes images of multiple manuscripts of Aeschylus, manually created transcriptions of the most relevant manuscripts and printed editions, OCR of recent editions, an extensive bibliography of secondary sources, and information extraction tools to be used on the digitized documents. The aim is a comprehensive digital library for Aeschylus that will support philologists in the development of critical editions.

Tools for Electronic Philology: BAMBI and Aristarchus

One of the earliest projects that explored the computational needs of philologists was the BAMBI (“Better Access to Manuscripts and Browsing of Images”) project, which developed a “hypermedia workstation” to assist scholars in reading manuscripts, writing annotations, and navigating between words in a transcription and images in digitized manuscripts (Bozzi and Calabretto 1997). The project was aimed at two types of users: general users of libraries who wished to examine manuscripts, and “professional students of texts” or philologists, whom the authors defined as “critical editors of classical or medieval works that are hand-written on material supports of various types (paper, papyrus, stone)” (Bozzi and Calabretto 1997). The authors thus developed a “philological workstation” that included four major features: (1) the ability to look up digital images in an archive; (2) the transcription, annotation, and indexing of images; (3) the viewing of transcribed versions of texts and the creation of an “Index Locorum”; and (4) the automatic matching of words found in transcriptions, the “Index Locorum,” and annotations with the relevant portion of the source-document image that contains the word. This last feature, while desired by many other digital edition and manuscript projects, is still an area of unresolved and active research (Cayless 2008, Cayless 2009, Porter et al. 2009).

In an overview of their philological workstation, Bozzi and Calabretto listed the functions that it supported. To begin with, the workstation allowed users to search manuscript collections, to create transcriptions of digital images of manuscripts, and to export them as RTF or SGML. One important feature was the indexing of transcriptions, which philologists could use to generate an “Index Verborum” and an “Index Locorum” for each script in the manuscript (e.g., Greek and Latin). The “Index Verborum” contained all the words appearing in the transcription and the words that were corrected by the user (using the text variant function), while the “Index Locorum” displayed “the positions in which each word occurs in the manuscript.” In addition, annotations could be created on manuscript transcriptions, and all annotations contained two distinct fields, one for free comments and the critical apparatus, and one for variants, synonyms, and the correction of syntax. The BAMBI workstation also supported automatic column and line recognition and, even more important, the automatic creation of a word-image concordance (if a transcription for a manuscript was available) that matched each word of the text with the appropriate portion of the image. The concordance was built automatically, and this module provided a simultaneous view of the transcription and the image so the user could check its accuracy. It also allowed the user to query the manuscript collection by selecting a word either in the transcription or on the image. The BAMBI prototype made use of HyTime (an extension of SGML) to model works on ancient manuscripts, in particular because it allowed “specification of links between text and part of image (part of an object).”
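The two indexes described above are simple to derive once a transcription exists. The following sketch is an illustration only and makes several assumptions: the function name `build_indexes` is hypothetical, a transcription is modeled as a plain list of text lines, words are split on whitespace, and a position is recorded as a (line, word) pair. BAMBI's own data model, which relied on HyTime links into image regions, is not reproduced here.

```python
from collections import defaultdict


def build_indexes(transcription):
    """Build an Index Verborum and an Index Locorum from transcribed lines.

    transcription: list of strings, one per manuscript line (assumption).
    Returns (index_verborum, index_locorum) where index_verborum is the
    sorted list of distinct word forms and index_locorum maps each word
    form to the (line number, word number) positions where it occurs.
    """
    index_locorum = defaultdict(list)
    for line_no, line in enumerate(transcription, start=1):
        for word_no, word in enumerate(line.split(), start=1):
            index_locorum[word.lower()].append((line_no, word_no))
    index_verborum = sorted(index_locorum)
    return index_verborum, dict(index_locorum)


# Toy two-line transcription, invented for illustration.
verborum, locorum = build_indexes(["arma virumque cano", "cano arma"])
for word in verborum:
    print(word, locorum[word])
```

A word-image concordance of the kind BAMBI produced would extend each position with the coordinates of the matching image region, which is the part that remains an open research problem according to the works cited above.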

While the fuller technical details of this workstation are somewhat outdated as of this writing, the unanswered issues identified by the BAMBI project are still largely relevant for digital philology. Bozzi and Calabretto noted that the following requirements needed to be met: better standards-based tools for the description of manuscripts; more sophisticated image-processing routines (although they called for the enhancement of microfilm images rather than the images of manuscripts themselves); “a comprehensive solution for the management of text variants”; “tools based on image processing facilities and linguistic (statistical) facilities for the electronic restoration of missing text elements”; new models for collaborative work (though work today has moved beyond client-server models based on the web); and a survey of the technical and legal issues involved in creating “widespread, multisource services offering digital versions of library materials and the tools for their use” (Bozzi and Calabretto 1997). As has been seen in this review, the challenges of manuscript description, advanced image processing, the management of text variants, the creation of sophisticated digital tools, collaborative workspaces, and comprehensive open-source digital libraries remain topics of concern.

Other research in digital philology has been conducted by the Aristarchus project, 519 and an article by Franco Montanari (Montanari 2004) has provided an overview of the electronic tools for classical philology available at the website. Montanari suggested that two types of digital tools had been created for philology and indeed for the humanities in general: (1) general electronic tools that were transformed to fit more specific needs; and (2) new tools that were created to meet unique demands. According to Montanari, an increasing familiarity with digital tools will be required of all philologists: “The ‘new’ classical scholar and teacher is supposed to be at home with this kind of tools,” Montanari asserted, adding that “textual, bibliographical, and lexicographical databanks represent three of the most relevant electronic tools available thanks to the progress of digital technology.”

519 http://www.aristarchus.unige.it/index_inglese.php

The Aristarchus project, named after Aristarchus of Samothrace, was created by the University of Genoa and includes a number of tools for philologists studying the Greek and Roman world. The first tool, the “Lessico dei Grammatici Graeci Antichi” (LGGA) 520 or “Lexicon of Ancient Greek Grammarians,” is a lexicon of ancient Greek scholars and philologists and offers an online database that can be used to study the “history of ancient philology, grammar and scholarship.” In addition, this website provides access to a second “lexicon,” the “Catalogus Philologorum Classicorum” (CPhCL), 521 “an encyclopaedic lexicon that collects the biographies and the bibliographies of modern classical scholars.” The largest database is “Poorly Attested Words in Greek (PAWAG),” 522 which gathers together Ancient Greek words that have only been rarely attested and is described by Montanari as a “half way house between a dictionary in the strict sense and an encyclopedic lexicon.” Two specialist websites have been created by Aristarchus: MEDIACLASSIC, 523 a “web site for didactics of the ancient Greek and Latin languages”; and “Scholia Minora in Homerum,” 524 a site that provides an “up-to-date listing, descriptions, editions and digital images of the so-called Scholia Minora to the Iliad and Odyssey on papyrus.” Images of papyri can be viewed after registration. Finally, the Aristarchus website hosts the Centro Italiano dell’Année Philologique (CIAPh), 525 the Italian editorial office of the international Année Philologique.

Infrastructure for Digital Philology: The Teuchos Project<br />

While the BAMBI and the Aristarchus projects explored the use of digital tools, both projects defined philology in a fairly traditional manner, i.e., in terms of the type of work that would be performed. A more expansive definition of philology was offered by Crane et al. (2009b): “Philology is thus not just about text; it is about the world that produced our surviving textual sources and about the tangible impact that these texts have had upon the worlds that read them.” To pursue a new level of ePhilology, the authors argued that a new digital infrastructure needed to be developed that brought together all relevant primary and secondary sources that are currently scattered in various specialized digital libraries and that provided background knowledge personalized to the needs of individual scholars. In addition, new digital editions and commentaries need to abandon the limited assumptions of print publications (e.g., simply scanning a printed book rather than creating a true digital edition), Crane et al. (2009b) reasoned:

We now face the challenge of rebuilding our infrastructure in a digital form. Much of the intellectual capital that we accumulated in the twentieth century is inaccessible, either because its print format does not lend itself to conversion into a machine-actionable form or because commercial entities own the rights and the content is not available under the open-licensing regimes necessary for eScience in general and ePhilology in particular (Crane et al. 2009b).

520 http://www.aristarchus.unige.it/lgga/index.php
521 http://www.aristarchus.unige.it/cphcl/index.php
522 http://www.aristarchus.unige.it/pawag/index.php
523 http://www.loescher.it/mediaclassica/
524 http://www.aristarchus.unige.it/scholia/
525 http://www.aristarchus.unige.it/ciaph/index.php

Thus, the lack of machine-actionable contents and restrictive copyright regimes frustrate a move to ePhilology. In addition, a cyberinfrastructure for ePhilology, the authors argued, required at least three types of access: (1) “access to digital representations of the human record” such as page images of manuscripts and printed books; (2) “access to labeled information about the human record” such as named entity annotations; and (3) “access to automatically generated knowledge” or the processes of various algorithms.

Creating such a new digital infrastructure for philological research is one of the larger goals of Teuchos, 526 a project of the University of Hamburg in partnership with the Aristotle Archive at the Free University of Berlin. Teuchos is building a research infrastructure for classical philology, with an initial focus on the textual transmission of Aristotle. Work will focus on the digitizing, encoding, and description of manuscripts; developing an XML encoding for manuscript watermarks; and creating a web-based environment for philological work that includes a Fedora repository, the management of heterogeneous data, and support for a multilingual editing environment. Two recent articles (Deckers et al. 2009 and Vertan 2009) have explored different aspects of the Teuchos project.

Deckers et al. (2009) offered a detailed explanation of the data encoding and representation of “manuscripts as textual witnesses and watermarks” with a focus on the former and an extensive overview of the Teuchos platform.

In its final form Teuchos is to provide a web based research environment suited for manuscript and textual studies, offering tools for capturing, exchange and collaborative editing of primary philogical (sic.) data. The data shall be made accessible to the scholarly community as primary or raw data in order to be reusable as source material for various individual or collaborative research projects. This objective entails an open access policy using creative commons licenses regarding the content generated and published by means of the platform (esp. digital images of manuscripts may have to be handled restrictively dependant upon the holding institutions’ policies) (Deckers et al. 2009).

The Teuchos project is consequently developing an open-source platform that can be used for collaborative editing of manuscripts and creation of philological data, and the data that are created will be made available under a CC license, although the authors noted that providing access to images of manuscripts will depend on the respective policies of their owning institutions. 527 One distinctive feature of the Teuchos platform is that it will support the integration of heterogeneous data and the participation of different user groups.

The creators of Teuchos (Deckers et al. 2009) outlined a number of potential use cases that informed their design choices, including (1) the provision of extensive data that facilitate the use of digitized manuscripts such as the markup of both the structural information and intellectual content of manuscripts (this would include transcriptions that indicate variant readings); (2) access to digital manuscript images that are accompanied by at least partial transcriptions so that material not only becomes more rapidly available and citable but also can be the “basis for further editorial work”; (3) a collaborative environment for researchers; (4) a constantly evolving collection of manuscript descriptions that provides scholarly information on “codicology, manuscript history and textual transmission”; (5) a flexible data model that can accommodate the integration of “manuscript descriptions” of varying semantic depth and length; and (6) linking to important existing online resources such as library catalogs, specialist bibliographies, and digital texts. Deckers et al. (2009) reported that they particularly wanted to create a tool that provides scholars in the fields of Greek codicology and palaeography with the ability to publish digital research materials.

526 http://beta.teuchos.uni-hamburg.de/
527 Copyright, creative commons licensing, and the use of digitized manuscript images have recently been explored by Cayless (2010a).

The Teuchos platform is built on a Fedora repository. Three types of users can interact with this repository through a web application: 528 systems administrators; registered users, who may contribute resources; and public users, who can view only publicly released materials. The Teuchos Fedora repository includes several types of complicated digital objects, all of which have been designed to try to cover all potential categories of text transmission. Manuscript watermark tracings are stored as digital images and information about them is stored in a custom XML format created by the project. A “textual transmission” group has two subgroups, each of which is then subdivided: the first group provides information related to individual manuscripts and the second includes information related to individual works. Within the manuscript group, individual data objects include digital page images (of complete or partial manuscripts) that are aggregated for each manuscript, codicological descriptions that reference page images when available, and varying levels of transcription data. In terms of works, this subgroup encompasses a wide range of materials referring “to a source text with its entire set of manuscripts rather than to one particular witness” and includes full critical editions, translations, and commentaries (Deckers et al. 2009). The three other major categories of digital object that are created are biographical dictionaries, bibliographical data, and published research papers. Because of the heterogeneous nature of these data, only the manuscript descriptions and transcriptions could be encoded according to TEI P5 XML.

Because the creators of Teuchos hope to provide scholars with advanced searching and editing functionality, they have developed a data model for both the physical and intellectual content of manuscripts in their platform. While not all of the descriptive material in Teuchos includes digital images of manuscripts, all the digital images that are included have accompanying descriptive and authority metadata. All manuscripts with digital images also have a corresponding reference document that makes use of the TEI <facsimile> element and a list of <surface> elements with unique identifiers in the form of xml:id attributes and unambiguous labels for pages using the “n” attribute. The <surface> elements are listed in the physical order of the manuscript, and missing pages are represented with empty <surface> elements.
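The reference document described above can be illustrated with a minimal Python sketch. It assumes that the TEI elements in question are `<facsimile>`, `<surface>`, and `<graphic>`; the identifier pattern (`surface-0001`), the image paths, and the function name `build_facsimile` are hypothetical and are not taken from Deckers et al. (2009). The TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def build_facsimile(pages):
    """Build a <facsimile> from (label, image_url_or_None) pairs.

    Pages are given in the physical order of the manuscript. A page
    without an image (e.g., a missing page) becomes an empty <surface>.
    """
    facsimile = ET.Element("facsimile")
    for index, (label, image_url) in enumerate(pages, start=1):
        surface = ET.SubElement(facsimile, "surface")
        surface.set(XML_ID, f"surface-{index:04d}")  # unique identifier
        surface.set("n", label)                      # unambiguous page label
        if image_url is not None:
            ET.SubElement(surface, "graphic").set("url", image_url)
    return facsimile


# Hypothetical manuscript: folio 2r has no image available.
pages = [
    ("1r", "images/0001.tif"),
    ("1v", "images/0002.tif"),
    ("2r", None),
    ("2v", "images/0004.tif"),
]
facsimile = build_facsimile(pages)
print(ET.tostring(facsimile, encoding="unicode"))
```

Because every page keeps its slot in the list, a transcription can point at a page by identifier even when no image of that page exists.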

To facilitate user access to individual page images, Teuchos provides at least a minimal transcription for each manuscript (e.g., it may simply contain page-break information and no textual transcription) that contains structural information that “can be used to offer alternate representations and improved navigation for browsing, and to give a clearer indication of the part of the text to which an image viewed pertains” (Deckers et al. 2009). These data are then encoded within TEI <text> elements. While <pb/> elements with “corresp” attributes that point to unique page identifiers are used to reference digital images of individual manuscript pages, the <fw> element is used to separately encode foliation or pagination information. This separate encoding is important, Deckers et al. reported, because it “permits recording whether numbers provided by the transcriber are actually present on the page or not” and also supports “recording more than one such reference system,” a particularly important issue, since many manuscripts can have multiple foliation systems.
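As an illustration of this separation between page references and written folio numbers, the following Python sketch reads a minimal transcription. It assumes that page breaks are TEI `<pb/>` elements whose “corresp” attribute points at a `<surface>` identifier and that foliation is recorded in `<fw>` elements; the sample document, the `type` values on `<fw>`, and the function name `page_index` are hypothetical.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical minimal transcription: page breaks only, plus two
# foliation systems written on the first page and none on the second.
SAMPLE = """
<TEI>
  <facsimile>
    <surface xml:id="s1" n="1r"/>
    <surface xml:id="s2" n="1v"/>
  </facsimile>
  <text>
    <body>
      <pb corresp="#s1"/>
      <fw type="foliation-modern">1</fw>
      <fw type="foliation-original">α</fw>
      <pb corresp="#s2"/>
    </body>
  </text>
</TEI>
"""


def page_index(root):
    """Return one record per <pb/>, with the numbers written on that page."""
    labels = {s.get(XML_ID): s.get("n") for s in root.iter("surface")}
    pages = []
    current = None
    for element in root.find("text").iter():
        if element.tag == "pb":
            target = element.get("corresp", "").lstrip("#")
            current = {"surface": target, "label": labels.get(target), "numbers": []}
            pages.append(current)
        elif element.tag == "fw" and current is not None:
            current["numbers"].append((element.get("type"), element.text))
    return pages


pages = page_index(ET.fromstring(SAMPLE.strip()))
for page in pages:
    print(page["label"], page["numbers"])
```

The page label always comes from the reference document, while the list of written numbers may be empty or hold several entries, which mirrors the two cases Deckers et al. single out.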

528 A beta version of this application is available at http://beta.teuchos.uni-hamburg.de/TeuchosWebUI/teuchos-web-ui

A comprehensive set of markup structures has also been created to represent the intellectual content of manuscripts. Deckers et al. (2009) observed that two complementary issues are involved in relating structural information to a transcription: first, the need to relate the text of a manuscript transcription to the structure found in a particular edition of a work; and second, the need to encode “any structure evident” in the actual manuscript witness that is being transcribed. These issues become even more complicated, Deckers et al. argued, when combining the transcriptions of multiple manuscript witnesses of a work:

To be able to retain per-witness structural information in a joined document, we therefore propose to encode all structural information using empty elements, i.e. <milestone/>s. When such a joined document is edited further to become a new edition of a work in its own right, the editor(s) may (and in most cases probably will) of course decide to create a hierarchical structure taking into account the structure of the various witnesses, but this should be a later step. To avoid confusion, we should state that we do not intend to provide dynamically generated editions. While the semi-automatic joining of transcriptions is a first step towards creating a digital critical edition, the further steps require substantial scholarly intervention (Deckers et al. 2009).

The approach chosen by Teuchos illustrates the difficulty inherent in trying to create dynamically generated editions, particularly for works that have potentially dozens of manuscript witnesses. In addition, structural information from an older existing edition is often used as the organizational structure for a new edition (e.g., using Stephanus edition page numbers for an OCT edition of Plato). Deckers et al. thus proposed a hierarchical system that used <milestone/>s with a “special value of ‘external’ for the unit attribute” that made it clear an external reference system was being indicated and a specially created value of “canonical” for the type attribute. Different edition and numbering schemes were referred to using an “ed” attribute, the hierarchical level of the reference used a subtype attribute, and the actual reference used an “n” attribute. Defining this level of encoding granularity is important, for it means that multiple canonical reference systems from different editions, or even multiple numbering schemes from one edition, can be encoded for one manuscript text. This system is also used to encode the content structure of individual manuscript witnesses, but a value of “internal” is used instead of “external” for the unit attribute, with type values of “present” for numbers actually found within the text and “implied” for numbers that are no longer found in the witness. Finally, Deckers et al. suggested that the “ed” attribute could be used to indicate manuscript sigla as well as edition names.
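The attribute scheme just described can be made concrete with a short Python sketch. It assumes the empty elements are TEI `<milestone/>` elements and uses the attribute values named above (“external”/“internal” for unit; “canonical”, “present”, and “implied” for type). The sample text, the edition name “Bekker”, the siglum “A”, the subtype values, and the function name `references` are hypothetical illustrations, not data from the Teuchos project.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcription fragment carrying two reference systems:
# an external canonical one (an edition) and the witness's own structure.
SAMPLE = """
<ab>
  <milestone unit="external" type="canonical" ed="Bekker" subtype="page" n="1a"/>
  <milestone unit="external" type="canonical" ed="Bekker" subtype="line" n="1"/>
  <milestone unit="internal" type="present" ed="A" subtype="chapter" n="1"/>
  text of the witness
  <milestone unit="internal" type="implied" ed="A" subtype="chapter" n="2"/>
  more text of the witness
</ab>
"""


def references(root, unit=None, ed=None):
    """List (ed, subtype, n, type) for milestones matching the filters."""
    found = []
    for milestone in root.iter("milestone"):
        if unit is not None and milestone.get("unit") != unit:
            continue
        if ed is not None and milestone.get("ed") != ed:
            continue
        found.append(
            (
                milestone.get("ed"),
                milestone.get("subtype"),
                milestone.get("n"),
                milestone.get("type"),
            )
        )
    return found


root = ET.fromstring(SAMPLE.strip())
print(references(root, unit="external"))  # the edition's reference system
print(references(root, ed="A"))           # structure of the witness itself
```

Because the milestones are empty and flat, several reference systems can coexist in one text without competing for the element hierarchy, which is the property the authors rely on when joining witnesses.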

While future work for the Teuchos project involves the creation of detailed codicological manuscript descriptions and transcriptions of individual manuscript texts, Deckers et al. explained that they first focused on structural encoding, since they considered “this an important step in providing fuller access to digitised manuscripts for textual scholars, and a necessary prerequisite for cumulative and shared scholarly work on the primary text sources in a distributed digital environment.” As do the Interedition and the Virtual Manuscript Room, the Teuchos project wants to create a distributed environment that will let many scholars contribute their expertise and share their editing with others. A recent presentation by Vertan (2009) has offered some further technical details on the infrastructure that is being built, and stated that Teuchos is working with CLARIN as one of their collaborative research projects, “MLT-Cphil-Multilingual Language Technology for Classical Philology Research.” Vertan described Teuchos as a “Knowledge Web-Based eResearch Environment” in which knowledge work is supported through knowledge organization, semiautomated information extraction, the management of multilingual data, and “intelligent retrieval of heterogeneous materials”; where the use of the Web will allow for comprehensive data access (different levels of users), interoperability (TEI P5), persistency (URIs and persistent identifiers), user modeling, and the creation of a shared workspace; and where the “eResearch environment” provides access to different material types, sophisticated data modeling, encoding, and visualization and extensive linking between different digital projects (Vertan 2009).

As part of its collaborative environment, Teuchos also plans to create a shared workspace that includes a forum and to support various commenting and versioning features for different materials. Perhaps the greatest challenge listed by Vertan, however, was the need to manage both multilingual and heterogeneous data that included different data types such as semistructured data found in XML and TEI files, high-resolution TIFF images, graphics (for watermarks), and research materials stored as PDF or Word documents. The semistructured documents also had varying levels of semantic depth, and there were different types of multilinguality within one document (e.g., Greek, Latin, and German in one manuscript), across documents (there are five official languages for the project: German, French, English, Italian, and Spanish), and within terminologies. The Teuchos project aims to allow navigation across different data collections and is implementing various Semantic Web solutions. These include “semantic descriptions of stored objects” using RDF triples, the development of ontologies for each type of data collection, the mapping of “multilingual lexical entries” onto this ontology, and support for ontological searching.
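
The combination of triples, an ontology, and multilingual lexical entries can be illustrated with a minimal sketch. It is not drawn from the Teuchos implementation: every identifier (`ms:001`, `onto:Manuscript`, `onto:hasLanguage`), the sample terms, and the helper functions are assumptions made for illustration, and plain Python tuples stand in for a real RDF store.

```python
# Hypothetical identifiers throughout; none of these names come from Teuchos.

# Semantic descriptions of stored objects as (subject, predicate, object) triples.
triples = [
    ("ms:001", "rdf:type", "onto:Manuscript"),
    ("ms:001", "onto:hasLanguage", "lang:grc"),
    ("ms:002", "rdf:type", "onto:Manuscript"),
    ("ms:002", "onto:hasLanguage", "lang:lat"),
    ("wm:017", "rdf:type", "onto:Watermark"),
]

# Multilingual lexical entries, each mapped onto one ontology concept.
lexicon = {
    ("de", "Handschrift"): "onto:Manuscript",
    ("en", "manuscript"): "onto:Manuscript",
    ("fr", "manuscrit"): "onto:Manuscript",
    ("it", "manoscritto"): "onto:Manuscript",
    ("es", "manuscrito"): "onto:Manuscript",
    ("de", "Wasserzeichen"): "onto:Watermark",
    ("en", "watermark"): "onto:Watermark",
}


def objects_of_type(concept):
    """Return the identifiers of all objects typed with the given concept."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == concept)


def search(lang, term):
    """Resolve a term in any supported language to a concept, then look up objects."""
    concept = lexicon.get((lang, term))
    if concept is None:
        return []
    return objects_of_type(concept)
```

Because the search goes through the concept rather than the surface word, `search("de", "Handschrift")` and `search("en", "manuscript")` return the same objects, which is the practical benefit of mapping lexical entries onto an ontology.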

Managing all of these data involves the creation of complicated digital objects in Fedora that have seven data streams: bibliographic details (stored in DC), semantic descriptions of objects and relationships with other objects (RDF), codicological descriptions (XML), linguistic information (XML), layout details (XML), transcriptions (text file), and image data (TIFF files). The current Teuchos implementation not only makes use of Fedora but also uses AJAX for the client-server application and image viewer. Vertan concluded that the Teuchos platform could illustrate the potential of Semantic Web technologies for real humanities problems as well as demonstrate the importance of developing solutions for multilingual content. In addition, Vertan asserted that “multilingual problems are increased because of the lack of training data and computational linguistics tools for old languages, especially ancient Greek.” Similar criticisms were offered by Henriette Roued, who noted the lack of Ancient Greek knowledge bases for use with the DUGA prototype.

Prosopography

The discipline of prosopography is “the study of individuals” and, in terms of ancient history, “is a method which uses onomastic evidence” or the study of personal names “to establish (i) regional origins of individuals and (ii) family connections.” 529 Many sources can be used for prosopography, including narrative texts, administrative records, letters, and inscriptions, among others. The study of prosopography has thus been closely linked to epigraphy in particular. This section looks at several recent articles and research projects that have investigated the use of digital techniques in creating prosopographical databases.

529 "prosopography" Oxford Dictionary of the Classical World. Ed. John Roberts. Oxford University Press, 2007. Oxford Reference Online. Oxford University Press. Tufts University. 4 May 2010.



Issues in the Creation of Prosopographical Databases

Although prosopography is a well-established discipline, there are fewer digital resources in it than in many of the other fields of digital classics, an issue discussed extensively in a recent study by Ralph W. Mathisen (Mathisen 2007), who provided an overview of existing prosopographical databases (PDBs) and the challenges involved in creating them. 530 In the 1970s, Mathisen decided to create a database based on the first volume of the Prosopography of the Later Roman Empire (PLRE), but had to temporarily abandon this work because of the limitations of FORTRAN and mainframe computers. By the 1980s, Mathisen believed that the development of PDBs was becoming increasingly possible, and in one of his grant proposals at the time listed a number of major advantages of databases, including convenience, speed, accuracy, diversity and multiplicity of access, ease of revision and reporting, expandability, portability, and, perhaps most important, potential compatibility with other biographical and prosopographical databases. His current work involves the development of a database he has named the “Biographical Database of Late Antiquity” (BDLA). 531 Despite the potential of PDBs, Mathisen reported that of the 20 prosopographical database projects identified in his earlier research (Mathisen 1988), by 2007 only one had been completed, one had been absorbed by a later project, two were still in progress, and the other 16 could no longer be found.

A variety of issues have caused this situation, according to Mathisen, including questions regarding accessibility and hardware and software problems, but the greatest challenges have arisen from methodological considerations within the discipline. To begin with, Mathisen noted that some prosopographers (and indeed other humanists) argued that databases imposed “too much structure on the information from primary sources.” Any useful historical database, Mathisen suggested, must structure data from primary sources in two ways: first, it must identify all “categories of recurrent information” (e.g., sex, religion); and second, it must identify “appropriate, recurrent values for these fields.” Mathisen also pointed out that “historians always structure their data, whether they are creating a PDB or not.” While creating a database may be an interpretative act of scholarship in terms of archaeology, as earlier argued by Dunn (2009), using a historical database also involves interpretation, as explained by Mathisen:

PDB structure and coding are not prescriptive; it only provides a starting point for research. The computer can only do so much. Human intervention is always needed, not only in the course of the creation of a PDB, but especially in the use of a PDB. This includes not only verifying the validity and appropriateness of the data returned, but also judiciously analysing that data. Even when I read an entry in the hard-copy of PLRE, I still check the primary source myself. Users of PDBs should do the same (Mathisen 2007).

As this statement illustrates, Mathisen considered scholarly consultation of the original primary sources to be extremely important. He also proposed that, to be truly useful, PDBs should include at least some access to the primary sources they used whenever possible. “Indeed, the most effective modern PDBs bring the original source documents along with them,” Mathisen argued, “either by a pointer to a separate source database or by including the source text within the record, thus ensuring that no source information is ever lost in the creation of a PDB” (Mathisen 2007).

530 In this overview, Mathisen also notes the somewhat limited research coverage of PDBs in the past 10 years, with much of the significant scholarship published in the 1980s, such as Bulst (1989) and Mathisen (1988), and in the 1990s (Goudriaan et al. 1995).

531 As of this writing (September 2010), there does not appear to be any website for the BDLA, which, according to Mathisen (2007), “plans to incorporate all the persons attested as living in the Mediterranean and western Asian worlds between AD 250 and 750” and contains more than 27,000 individuals and includes more than 70 searchable fields.



Another major methodological issue in the development of PDBs, according to Mathisen, is that they are about “individual people,” and that these people must have unique identities within a database. Yet the identification of individual people within primary sources is no easy task, and even if two sources cite a person with the same name, it can be difficult to determine whether it is the same person. Additionally, a single individual may go by different names. “Sorting out who’s who,” Mathisen noted, “either by using a computer algorithm or by human eye-balling, continues to be one of the major problems, if not the major problem, facing the creators of PDBs” (Mathisen 2007). As has been seen throughout this review, the challenges of historical named-entity disambiguation have also been highlighted in terms of historical place names in archaeology (Jeffrey et al. 2009a, Jeffrey et al. 2009b) and classical geography (Elliott and Gillies 2009b), and both personal and place name disambiguation complicated data integration between papyrological and epigraphical databases in the LaQuAT project (Jackson et al. 2009).

While hierarchical structures were first explored for PDBs, Mathisen reported that the relational model was generally agreed to be the best structural model for such databases. Several important rules for relational PDBs that Mathisen listed included the need to store data in tabular format, the creation of a unique identifier for each primary data record (within PDBs this is typically a person’s name combined with a number, e.g., Alexander-6), and the ability to retrieve data in different logical combinations based on field values. While many PDBs, Mathisen observed, were often “structured based on a single table” that attempted to include all the important information about an individual, such a simple structure limited the types of questions that could be asked of such a database.
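
The rules Mathisen lists can be made concrete with a small sketch using Python's built-in `sqlite3` module. The schema, the table and column names, and every record below are invented for illustration and do not describe any actual PDB; only the identifier pattern (a name plus a number, as in Alexander-6) comes from the text above.

```python
import sqlite3

# In-memory database; the schema and all rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    person_id TEXT PRIMARY KEY,   -- name plus number, e.g. 'Alexander-6'
    name      TEXT NOT NULL,
    sex       TEXT
);
CREATE TABLE office (
    person_id TEXT REFERENCES person(person_id),
    title     TEXT NOT NULL,
    place     TEXT
);
CREATE TABLE attestation (
    person_id TEXT REFERENCES person(person_id),
    source    TEXT NOT NULL       -- pointer back to the primary source
);
""")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", [
    ("Alexander-6", "Alexander", "M"),
    ("Alexander-7", "Alexander", "M"),
    ("Maria-2", "Maria", "F"),
])
conn.executemany("INSERT INTO office VALUES (?, ?, ?)", [
    ("Alexander-6", "bishop", "Antioch"),
    ("Alexander-6", "presbyter", "Antioch"),
    ("Alexander-7", "bishop", "Alexandria"),
])
conn.executemany("INSERT INTO attestation VALUES (?, ?)", [
    ("Alexander-6", "hypothetical source A"),
    ("Alexander-7", "hypothetical source B"),
    ("Maria-2", "hypothetical source C"),
])


def people_with_office(title, place):
    """Retrieve persons by a logical combination of field values."""
    rows = conn.execute(
        """SELECT p.person_id FROM person p
           JOIN office o ON o.person_id = p.person_id
           WHERE o.title = ? AND o.place = ?
           ORDER BY p.person_id""",
        (title, place),
    )
    return [r[0] for r in rows]
```

Keeping offices and attestations in their own tables lets one person hold any number of offices or source citations, which a single flat table can only accommodate by duplicating rows or adding repeated columns. This is one way to read the limitation of the single-table design that Mathisen describes.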

The final major methodological issue Mathisen considered was standardization. While the early period of PDB creation saw a number of efforts at developing a “standardized format for entering and storing prosopographical material,” Mathisen doubted that any real standardization would ever occur. Indeed, he argued that since the “data reduction” methods of any prosopographical database were often designed based on the primary source material at hand and how it would be used, attempting to design an all-purpose method would be inefficient. While Mathisen proposed that the use of a relational database structure in itself should make it relatively easy to transfer data between databases, the LaQuAT project found this to be far from the case (Jackson et al. 2009).

Despite these various methodological issues, a number of prosopographical database projects have been created, as seen in the next sections. Mathisen posited that there were two general types of PDBs: 532 (1) a restricted or limited database that typically incorporates individuals from only one “discrete primary or secondary source”; and (2) “inclusive” or open-ended databases that usually include all of the people who lived at a particular time or place and contain material from many heterogeneous sources. All the databases considered in the following sections, with the exception of Prosopographia Imperii Romani, are open-ended databases. Such databases are far more difficult to design, according to Mathisen, since “designers must anticipate both what kinds of information users might want to access and what kinds of information will be provided by the sources from which the database will be constructed.” In addition, such databases are typically never completed, because new sources continue to be unearthed or mined for prosopographical data. “The greatest future promise of PDBs lies in the construction of more sophisticated and comprehensive databases,” Mathisen concluded, “including a broad range of persons, constructed from a multiplicity of sources and permitting searching on a multiplicity of fields” (Mathisen 2007).

532 Mathisen lists a third, special case: limited databases that take the form of open-ended databases but “are constructed from existing hard-copy prosopographical catalogue” or card-files, where the limit is imposed not by source-material but by editorial decisions on whom to include. Mathisen also described a number of “biographical catalogues” like the “De Imperatoribus Romanis” (DIR) (http://www.roman-emperors.org/).



Network Analysis and Digital Prosopography

In 2009, a new digital research project, the Berkeley Prosopography Service (BPS), 533 received funding from the Office of Digital Humanities (ODH) of the NEH to create “an open source digital toolkit that extracts prosopographic data from TEI encoded text and generates interactive visual representations of social networks.” 534 This project is led by Niek Veldhuis, along with Laurie Pearce and Patrick Schmitz, and they are using NLP and social network analysis (SNA) techniques to extract personal names and family relationships of people mentioned in texts and to then assemble a social network of people based on described activities. 535 The initial tool will be applied to a corpus 536 of approximately 700 cuneiform tablets from the CDLI that record sales and lease transactions among a small group of Mesopotamians from Uruk (southern Iraq) between 331 and 346 BC. After the Uruk text corpus 537 has been converted into TEI-XML, prosopographic data will be automatically extracted from the TEI files, SNA techniques will be used to create various networks, and users will then be able to visualize the results in various ways:

A probabilistic engine will collate all the person-references in the corpus, along with some basic world knowledge, like the typical length of adult activity, and will then associate the names to individual persons, and finally will relate the people to one another by the kind of activities they engaged in. The resulting graph model can be used to produce a variety of reports and visualization tools, including simple name lists and family trees, as well as interactive models. By integrating graph visualization tools, the project will provide interactive tools that let researchers explore the network of associations and activities. They can focus on an individual, on a given type of activity (e.g., real-estate sales), or explore other aspects of the model. This should enable the researchers to answer many complex questions more easily, and with a visual response (Schmitz 2009).
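
The pipeline Schmitz describes (collate person-references, assign them to individuals, then relate individuals by activity) can be sketched in a deliberately simplified form. This is not the BPS engine: a fixed cutoff stands in for its probabilistic reasoning, and the cutoff value, the record layout, the function names, and all sample records are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

MAX_ACTIVE_YEARS = 40  # assumed span of adult activity; a placeholder value

# Invented records: (name, year, document, activity).
# Years count upward on an arbitrary scale.
references = [
    ("Anu-uballit", 10, "T1", "sale"),
    ("Nidintu-Anu", 10, "T1", "sale"),
    ("Anu-uballit", 25, "T2", "lease"),
    ("Labashi", 25, "T2", "lease"),
    ("Anu-uballit", 90, "T3", "sale"),  # too late to be the same individual
    ("Labashi", 90, "T3", "sale"),
]


def assign_persons(refs):
    """Give each reference a person id such as 'Anu-uballit-1'.

    A name seen again within MAX_ACTIVE_YEARS of that person's first
    reference is treated as the same individual; otherwise a new one starts.
    """
    first_seen = {}             # person id -> year of first reference
    counter = defaultdict(int)  # name -> number of individuals so far
    current = {}                # name -> most recent person id for that name
    assigned = []
    for name, year, doc, activity in sorted(refs, key=lambda r: r[1]):
        pid = current.get(name)
        if pid is None or year - first_seen[pid] > MAX_ACTIVE_YEARS:
            counter[name] += 1
            pid = f"{name}-{counter[name]}"
            first_seen[pid] = year
            current[name] = pid
        assigned.append((pid, doc, activity))
    return assigned


def build_graph(assigned):
    """Link persons who appear in the same document, labelled by activity."""
    by_doc = defaultdict(list)
    for pid, doc, activity in assigned:
        by_doc[doc].append((pid, activity))
    edges = set()
    for members in by_doc.values():
        for (a, act), (b, _) in combinations(sorted(members), 2):
            edges.add((a, b, act))
    return edges


graph = build_graph(assign_persons(references))
sales = sorted(edge for edge in graph if edge[2] == "sale")
```

Filtering the edge set by activity, as in the last line, corresponds to focusing on "a given type of activity" in the passage above. A real system would weigh competing identifications against each other rather than apply a hard threshold.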

The BPS will provide researchers with individual workspaces and will be the first independent tool to be incorporated into the CDLI. During the initial grant period, beta testing will be conducted with other corpora to test the extent to which this tool can be scaled and generalized for use in other prosopographical projects. The BPS is also participating in Project Bamboo as one of its demonstrators.

Other work using network analysis 538 as a means of exploring classical prosopography has been discussed by Graham and Ruffini (2007). They noted that rapid developments in computer technology, particularly graph theory, provided network analysts, and consequently prosopographers, with new tools for answering complex questions involving the degrees of separation between network members or how densely or loosely connected networks might be:

Such questions hold a natural interest for prosopographers, who can then begin to look for certain characteristics—class, office, occupation, gender—and identify patterns of connectivity that they might have otherwise missed when confronted with a mass of data too large for normal synthetic approaches. And yet, network analysis has been slow to take root among ancient historians. Network analytical research on the Greco-Roman world has focused on questions of religious history and topography. Nonetheless, the epigraphic and papyrological evidence beg a network analytical approach to the prosopographical data available from these sources (Graham and Ruffini 2007, 325-326).

533 http://code.google.com/p/berkeley-prosopography-services/

534 http://www.neh.gov/ODH/Default.aspxtabid=111&id=159

535 http://inews.berkeley.edu/articles/Spring2009/BPS

536 The demonstrator corpus “Hellenistic Babylonia: Texts, Image and Names (HBTIN)” can be viewed at http://oracc.museum.upenn.edu/hbtin/index.html

537 http://cdl.museum.upenn.edu/hbtin/

538 One of the presentations at the Digital Classicist/ICS Work in Progress Seminar in the summer of 2010 also examined the use of network analysis in prosopography. See Timothy Hill, “After Prosopography Data Modelling, Models of History, and New Directions for a Scholarly Genre,” http://www.digitalclassicist.org/wip/wip2010-03th.html
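
The two questions Graham and Ruffini mention, degrees of separation and how densely a network is connected, correspond to standard graph measures. The sketch below computes both on a toy network. The node labels and ties are placeholders, not data from any prosopography, and breadth-first search is simply one common way to find shortest paths in an unweighted graph.

```python
from collections import deque

# Undirected network as an adjacency mapping; the labels are placeholders.
network = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}


def degrees_of_separation(graph, start, goal):
    """Length of the shortest path between two members, or None if unconnected."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def density(graph):
    """Share of possible ties that are actually present: 2m / (n(n-1))."""
    n = len(graph)
    if n < 2:
        return 0.0
    m = sum(len(neighbours) for neighbours in graph.values()) / 2
    return 2 * m / (n * (n - 1))
```

In this toy network, C and E are four steps apart (C, A, B, D, E), and four of the ten possible ties are present, giving a density of 0.4. A value near 1 would indicate a tightly knit group, and a value near 0 a loosely connected one.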

To demonstrate the value of network analysis for prosopography, the authors described their own dissertation work. One major requirement they identified for demonstrating the potential of network analysis for ancient prosopography was the availability of “focused data-sets,” unlike many of the massive multivolume prosopographies such as the PLRE.

As an example of such a data set, Graham and Ruffini described a set of data regarding individuals connected with the brick industry of imperial Rome. These data were largely obtained from bricks that were stamped with estate and workshop names, and together the data set included the names of at least 1,300 individuals, largely from the second century AD. As individuals involved in the brick industry came from varying levels of society, the name data from the bricks have been used in various types of historical research. Several major published catalogs of stamped bricks have been created, and Graham created an Access database for one of them (CIL XV.1) that could be used for both archaeological and prosopographical analysis. Numerous programs can then be used to build and analyze networks from these data, Graham and Ruffini suggested:

In general, one simply lists the name in question and all the other names with which it co-occurs. The programme then stitches the network together from these data. Many statistics of use to prosopographers can then be determined, but sometimes simply visualizing the network itself can provide a ‘eureka’ moment. Some networks will have a number of ‘hubs’ and everyone else is connected like a ‘spoke’; other networks will look more like a chain with interlocking circles of individuals. This is profoundly important (Graham and Ruffini 2007, 328).

Graham consequently used network analysis to explore “small world” networks within Rome and the effect of purges and proscriptions on this network.
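The co-occurrence procedure that Graham and Ruffini describe can be illustrated with a minimal Python sketch. It is not their software or their data: the `stamps` list, the person labels, and the function names `build_network` and `degree_ranking` are all hypothetical, chosen only to show how lists of co-occurring names might be stitched into a network in which a "hub" stands out by its number of connections.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: each inner list holds the names that co-occur
# on one stamped brick (invented labels, not data from CIL XV.1).
stamps = [
    ["Estate owner A", "Brickmaker B"],
    ["Estate owner A", "Brickmaker C"],
    ["Estate owner A", "Brickmaker D"],
    ["Brickmaker B", "Brickmaker C"],
]


def build_network(cooccurrences):
    """Return an undirected graph as {name: set of co-occurring names}."""
    graph = defaultdict(set)
    for names in cooccurrences:
        # Link every pair of distinct names found on the same stamp.
        for a, b in combinations(sorted(set(names)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree_ranking(graph):
    """List (name, degree) pairs, highest degree first, ties by name."""
    return sorted(
        ((name, len(neighbours)) for name, neighbours in graph.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


network = build_network(stamps)
ranking = degree_ranking(network)
```

In this toy input, "Estate owner A" is linked to three other names and so heads the ranking, which is the hub-and-spoke pattern the quotation mentions. Real brick-stamp data would also need name disambiguation before any such counting.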

Another potential use of network analysis for prosopographical research listed by Graham and Ruffini was for “exploring the interactions between various cliques and clusters within a social network” on the level of individual villages, such as those described in documentary archives of papyri that survive for a number of villages. They noted that the large number of papyrological databases such as APIS and DDbDP provide a wealth of material that can be mined for prosopographical analysis or, as they call it, “a prosopographical growth industry with enormous potential.” The dissertation work of Ruffini used network analysis with documentary papyri from the Aphrodito archive to explore the prominence of individuals other than the heavily studied Dioskoros and his family. Ruffini suggested that network analysis provides a number of “centrality measures,” such as “closeness centrality” and “betweenness centrality,” that can be used to “identify the most central figures in the archive, measures whose quantitative nature hopefully removes the biases introduced by our own scholarly curiosity and prejudice.” Using these two measures, he identified three other prominent individuals, a result that surprised him because none of them is mentioned in modern scholarship on Aphrodito. A final potential use of network analysis for prosopography illustrated by Graham and Ruffini was the analysis of occupational groups and the social connectivity between them.
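To make the two centrality measures concrete, the following Python sketch computes them for a small invented network. It is only an illustration under stated assumptions: the `graph` dictionary and its labels are hypothetical (not the Aphrodito data), closeness is taken as the number of other reachable nodes divided by the sum of shortest-path distances to them, and betweenness is computed with Brandes' algorithm for unweighted, undirected graphs, which may differ from whatever software Ruffini used.

```python
from collections import deque

# Hypothetical toy network: a tight group (A, B, C) attached to a
# chain (D, E) only through "C". Labels are invented.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}


def bfs_distances(graph, start):
    """Shortest-path length from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def closeness(graph, node):
    """Reachable nodes divided by total distance to them (0.0 if isolated)."""
    dist = bfs_distances(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def betweenness(graph):
    """Brandes' algorithm: shortest paths passing through each node."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: b / 2 for v, b in score.items()}


scores = betweenness(graph)
```

Here "C" lies on every shortest path between the group and the chain, so it receives the highest betweenness even though it is not the only well-connected node. That is the kind of structural prominence such measures are meant to surface independently of a scholar's prior interest in a figure.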



While Graham and Ruffini acknowledged that most of their analysis is still fairly speculative, they also convincingly argued that the unique nature of their results derived from network analysis of ancient evidence suggests that there are many interesting avenues of future work.

Relational Databases and Modeling Prosopography

Perhaps the most extensive prosopographical database online is the Prosopography of the Byzantine World (PBW). 539 This website, formerly known as the Prosopography of the Byzantine Empire (PBE), provides access to a database that attempts to include details on every individual mentioned in both Byzantine textual and seal sources 540 between 641 and 1261 AD. The database of the PBW is large and complex and, as described by the website, is composed of thousands of “factoids”:

Its core is made up of nearly 60,000 factoids (small pieces of information classified under different categories), each of which is linked to an owner and (generally) at least one other secondary person by a hypertext link. More than a third of the factoids are of the narrative type, and these are organised into narrative units by further links. There are 2,774 such units. The units are in turn linked to dates and reigns, and some of them to larger events and problems. There are, in addition, around 7,500 seals, with links to matrices which number 5,000. Each seal is linked to a museum or private collection and at least one edition, and each set of matrices to an owner, certain or hypothetical, in the core of the database. 541

As of April 2010, approximately 10,000 named individuals were included in the PBW. A variety of searching and browsing options are available for this database. The entire prosopography can be browsed alphabetically, and clicking on an individual name brings up a record for that person that can include varying levels of detail, depending on the number and specificity of attestations in the sources. For example, a record for “Kissiane 101” includes a Greek representation of the name, place of residence, and a kinship link to her husband, whereas the record for her husband, “Nikephoros 148,” also includes a textual description, four kinship relations (all of which are hyperlinked to the records for these individuals), and a list of possessions. Every “factoid” in each individual record includes the source where this attestation was found. For historically significant individuals such as emperors, even more extensive sets of factoids are available. For example, the record for “Michael 7” includes 307 narrative factoids, seven education factoids, and three alternative names, among other extensive details. The PBW offers an extensive level of detail by often including the full text of the various “factoids” from primary sources in each individual record. One feature that is unfortunately not available but would likely be very useful is the ability to link to individual person records within the PBW with permanent URLs.

While the entire PBW can be browsed alphabetically by individual names, a user can choose to browse the individuals found within individual sources (rather than all), such as the Alexiad by Anna Comnena or the Epitome by Joannes Zonaras. The user can also browse lists of individuals classified by the factoids their records contain (only one factoid may be chosen at a time), including narrative, authorship, description, dignity/office, education, kinship, language skill, occupation, possession, or religion. In addition to these browsing options, the PBW database search allows the user to keyword search within all factoids, within individual factoids, or within a combination of factoid categories using Boolean operators.

539 http://www.pbw.kcl.ac.uk/content/index.html

540 A full list of the primary sources used and their abbreviations is provided at http://www.pbw.kcl.ac.uk/content/index.html; a list of the editions used for the text of the seals can be found at http://www.pbw.kcl.ac.uk/content/reference/sealedit.html

541 http://www.pbw.kcl.ac.uk/content/reference/full.html



Bradley and Short (2005) have offered some insights into the creation of highly structured databases such as the PBW from sources used in the study of prosopography. 542 As illustrated above, the data in the PBW are drawn from a large number of primary sources, and while Bradley and Short acknowledged that many traditional humanities computing projects might have sought to first create digital editions of these primary sources, they believed that the prosopographical nature of their project required a different solution:

This is because a digital prosopographical project does not aim to produce a textual edition. If it is to be true to its name, it must create instead, a new secondary source. Like a classic prosopography such as the Prosopography of the Later Roman Empire…a digital prosopography must act as a kind of visible record of the analysis of the sources produced by the scholars as they try to sort out who’s who from a close analysis of the extant source materials (Bradley and Short 2005).

In traditional printed prosopographies, this activity typically results in a biographical article that summarizes what can be concluded about the life of an individual from different sources and presents a scholar's interpretative arguments in support of his or her conclusions. A distinguishing feature of the PBW, then, as a “new-style” digital prosopography is that its final publication is a “highly structured database,” not a series of articles.

As Bradley and Short explain and as seen above, all evidence data within the PBW have been recorded as a series of factoids, or assertions made by a member of the project that a “source ‘S’ at location ‘L’ states something (‘F’) about Person ‘P’” (Bradley and Short 2005). According to Bradley and Short, a factoid is not a definitive statement of fact about a person, and a collection of factoids should not be considered as a scholarly overview of a person. Instead, factoids simply record assertions “made by a source at a particular spot about a person.” Since factoids may contradict each other (e.g., make different assertions about an individual’s ethnicity), all factoids about a person are included in the database. The database also includes a place where prosopographers can record their own assertions about why they have interpreted a text in a certain way. This methodology makes it easier to display the uncertainty inherent in determining “facts” about an individual from complicated primary sources and illustrates that factoids are also “acts of interpretation by the researcher that gathers them.” “The ironic flavour of the name ‘Factoid’ is not accidental,” Bradley and Short submitted, “It reflects the historian’s worry when a tiny extract is taken out of the context of a larger text and the historical period in which it was written and presented as a ‘fact’” (Bradley and Short 2005). Nonetheless, one difficulty with the factoid approach was how to establish what types of factoids should be collected, and historical events proved to be the most challenging kind of data to transform into factoids. 543
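The factoid pattern described by Bradley and Short, in which a source ‘S’ at location ‘L’ states something ‘F’ about person ‘P’, can be sketched as a small Python data structure. This is an illustrative assumption, not the PBW's actual schema: the `Factoid` class, its field names, the `category` and `note` fields, and every sample value are hypothetical. The sketch shows only how contradictory assertions can be kept side by side rather than resolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factoid:
    """One recorded assertion; all field names are hypothetical."""
    person_id: str   # 'P': identifier of the person the assertion concerns
    source: str      # 'S': the primary source making the assertion
    location: str    # 'L': where in the source the assertion occurs
    category: str    # type of factoid, e.g. "ethnicity" or "occupation"
    assertion: str   # 'F': what the source states
    note: str = ""   # researcher's remark on how the passage was read


# Invented sample records; none of these values come from the PBW.
factoids = [
    Factoid("Person 1", "Chronicle X", "3.14", "ethnicity", "Armenian"),
    Factoid("Person 1", "Letter Y", "fol. 2", "ethnicity", "Greek",
            note="Reading depends on an ambiguous adjective."),
    Factoid("Person 1", "Chronicle X", "3.15", "occupation", "notary"),
]


def conflicting_assertions(records, person_id, category):
    """Return every distinct assertion recorded, without picking a winner."""
    return sorted(
        {f.assertion for f in records
         if f.person_id == person_id and f.category == category}
    )


ethnicities = conflicting_assertions(factoids, "Person 1", "ethnicity")
```

Because each record carries its own source and location, a user who sees two ethnicities for the same person can trace each back to the passage that asserts it, which matches the authors' point that factoids are recorded assertions rather than settled facts.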

Since factoids link different kinds of structured information and there were thousands of factoids (60,000 or so, according to the website), a relational model was chosen to help users make sense of all of the data. The relational model also offers many new facets for access, as most printed prosopographies offer only two to three indexes to the articles they contain. Bradley and Short contrasted their process of creating a database with the “text-oriented modelling” of projects such as the Old Bailey Online. 544 The Old Bailey Online provides access to a searchable online edition of the historical printed proceedings of the Old Bailey, and like most prosopographical projects is based on narrative texts that include references to people, places, and things. While person names are marked up in the XML text of the Old Bailey Online, Bradley and Short remarked that there was no effort “to structure the names into persons themselves.” This is in direct contrast to their relational approach with the PASE, PBW, and CCEd:

Our three projects, on the other hand, are explicitly prosopographical by nature, and the identification of persons is the central task of the researchers, as it must be in any prosopography. They must have a way to separate the people with the same recorded name into separate categories, and to group together references to a single person regardless of the spelling of his/her name. … It is exactly because prosopographical projects are involved in the creation of a model of their material that is perhaps not explicitly provided in the texts they work with that a purely textual approach is in the end not sufficient in and of itself. Instead, it is exactly this kind of structuring which makes our projects particularly suitable for the relational database model (Bradley and Short 2005).

542 This article also offers some details on the creation of two related database projects, the “Prosopography of Anglo-Saxon England (PASE)” (http://www.pase.ac.uk/pase/apps/index.jsp) and the “Clergy of the Church of England Database” (CCEd) (http://www.theclergydatabase.org.uk/index.html).

543 The computational modeling of historical events can be very complicated and was also described by Robertson (2009) in his discussion of HEML.

544 http://www.oldbaileyonline.org/

In addition, the databases for all three of these projects contain not only “structured data in the form of factoids” but also structures that are spread over several tables and represent other important objects in the database including “persons, geographic locations, and possessions.”
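The relational structuring of names into persons can be illustrated with a short Python sketch using the standard-library `sqlite3` module. The table and column names (`person`, `attestation`, `recorded_name`), the helper functions, and all sample rows are hypothetical and are not drawn from the actual PBW, PASE, or CCEd schemas. The sketch assumes only the general idea stated in the text: persons live in their own table, and each recorded name in a source points to exactly one person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    id    INTEGER PRIMARY KEY,
    label TEXT NOT NULL              -- e.g. name plus a running number
);
CREATE TABLE attestation (
    id            INTEGER PRIMARY KEY,
    person_id     INTEGER NOT NULL REFERENCES person(id),
    recorded_name TEXT NOT NULL,     -- spelling as found in the source
    source        TEXT NOT NULL
);
""")

# Invented sample data: two different people share one recorded name,
# and one person is attested under two spellings.
conn.executemany(
    "INSERT INTO person (id, label) VALUES (?, ?)",
    [(1, "Ioannes 1"), (2, "Ioannes 2")],
)
conn.executemany(
    "INSERT INTO attestation (person_id, recorded_name, source) VALUES (?, ?, ?)",
    [(1, "Ioannes", "Source A"),
     (1, "Joannes", "Source B"),
     (2, "Ioannes", "Source C")],
)


def persons_for_name(connection, recorded_name):
    """All distinct persons attested under one recorded spelling."""
    rows = connection.execute(
        """SELECT DISTINCT p.label
           FROM person AS p
           JOIN attestation AS a ON a.person_id = p.id
           WHERE a.recorded_name = ?
           ORDER BY p.label""",
        (recorded_name,),
    )
    return [label for (label,) in rows]


def spellings_for_person(connection, person_id):
    """All distinct spellings under which one person is attested."""
    rows = connection.execute(
        """SELECT DISTINCT recorded_name
           FROM attestation
           WHERE person_id = ?
           ORDER BY recorded_name""",
        (person_id,),
    )
    return [name for (name,) in rows]
```

Querying by recorded name returns both people called "Ioannes", while querying by person identifier gathers the variant spellings of a single individual. A name tagged only in running text supports neither lookup without further interpretation, which is the contrast the authors draw with text-oriented markup.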

Bradley and Short also addressed a point raised earlier by Mathisen regarding the limitations of historical databases and the interpretation and categorization of data. Like Mathisen, they argued that all work with prosopographical sources, whether writing an article or creating a database, involved a fair amount of scholarly interpretation and categorization. Rather than attempting to create an “appropriate” model of their sources, Bradley and Short argued they were trying to create a model of how prosopographers work with those sources:

For, of course, our database is not designed to model the texts upon which prosopography is based with all their subtle and ambiguous meanings. The database, instead, models the task of the prosopographer in interpreting them i.e. it is not a model of an historical text, but a model of prosopography itself (Bradley and Short 2005).

Modeling how scholars within a discipline conduct their work and how they work with their sources is an important component in the design not just of historical databases but also of larger digital infrastructures that will need to support multidisciplinary work.

Mathisen has described the approach of the PBW as a comb<str<strong>on</strong>g>in</str<strong>on</strong>g>ati<strong>on</strong> of a “multi-file relati<strong>on</strong>al model”<br />

with a “decentralized biography model” (Mathisen 2007) or where instead of having an individual<br />

record with dedicated fields created for each individual, each person is assigned a unique ID<br />

key that is then associated with the information bites, or “factoids,” described above in various other<br />

databases. “Biographies” are thus created for individuals by assembling all the relevant factoids for an<br />

individual. Mathisen offered a few caveats in terms of the methodology chosen for the PBW, namely,<br />

that the complexity of the data structure would make it hard for anyone without expert computer skills<br />

to implement such a solution and that the “multiplicity of sub-databases and lack of core biographies”<br />

would make it difficult to export this material or integrate it with another PDB without specialized<br />

programming (Mathisen 2007). He also feared that the lack of “base-level” person entries might mean<br />

that important information for individuals could be omitted when different factoids were combined and<br />

could also make it difficult to determine when occurrences of the same name represent the same or<br />

different individuals. Despite these concerns, Bradley and Short proposed that by not providing their<br />

users with an “easy-to-read” final article about each individual and instead presenting a collection of



“apparently disconnected and sometimes contradictory factoids,” they are in fact bringing the user<br />

closer to doing actual prosopographical work. They argued that the series of factoids could be read as a<br />

“proto-narrative” and could serve to remind users of the interpretative and fuzzy nature of the data that<br />

they are getting from the database. Bradley and Short also asserted that the PBW seeks to provide<br />

focused access to the primary sources themselves and that users of the PBW should also consult these<br />

sources to form some of their own interpretations. Thus the importance of access to primary sources is<br />

illustrated again in the study of prosopography.<br />
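
The factoid approach described above can be illustrated with a minimal sketch. It assumes a hypothetical `Factoid` shape (the field names, person IDs, and sample data are invented for illustration and are not taken from the PBW); the point is only that no core biography record exists, and a “biography” is whatever results from grouping factoids by person ID.<br />

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Factoid:
    """One assertion that a source makes about a person (hypothetical shape)."""
    person_id: int   # unique ID key assigned to the person
    kind: str        # e.g. "office", "kinship", "location"
    statement: str   # what the source asserts
    source: str      # citation of the primary source


def assemble_biographies(factoids):
    """Group factoids by person ID; each grouped list acts as a 'biography'."""
    biographies = defaultdict(list)
    for factoid in factoids:
        biographies[factoid.person_id].append(factoid)
    return dict(biographies)


# Invented sample data. Possibly contradictory factoids are kept side by
# side rather than reconciled into a single narrative.
factoids = [
    Factoid(101, "office", "strategos", "Source A"),
    Factoid(102, "location", "Constantinople", "Source B"),
    Factoid(101, "office", "doux", "Source C"),
]

biographies = assemble_biographies(factoids)
for f in biographies[101]:
    print(f"{f.kind}: {f.statement} ({f.source})")
```

Because every factoid keeps its own source citation, a reader of the assembled list can always return to the primary source, which is the behavior Bradley and Short argued for.<br />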

Other Prosopographical Databases<br />

A major prosopographical resource for ancient Greece is Website Attica, 545 an online database<br />

designed to complement and extend a series of published volumes entitled Persons of Ancient Athens<br />

(PAA). Additions and corrections to the published volumes are included in the online<br />

database. More than 10,000 Athenian names are included in the database, and a wide variety of<br />

search features is available. Individual names must be entered in capital letters in Greek<br />

transliteration. As the website explains, possible searches “range from selecting every person in a<br />

particular deme or of a specified profession to more sophisticated searches,” such as finding “all<br />

Athenians who lived between specified years and/or are related to a certain person and/or are attested<br />

in a class of document.” The record for each individual includes an identifier and identified name and<br />

may also include status, place (a field that contains the “demotic or ethnic of a person”), phyle, link<br />

(kin relationship), kin name, activity, date, and a comment field where any additional information<br />

about a person that does not fall into one of the above categories can be found. A separate bibliographic<br />

reference search of the database is also available.<br />

One online resource for Roman prosopography is the Prosopographia Imperii Romani (PIR), 546 a<br />

website that is maintained by the Berlin-Brandenburg Academy and provides an online index to the<br />

person entries found in the printed volumes of the Prosopographia Imperii Romani. The first edition of<br />

this series was published in three parts between 1897 and 1898, and a second edition was published in<br />

seven parts with multiple fascicules beginning in 1933 and concluding in 2006. The individuals<br />

covered in the PIR are drawn mainly from the upper levels of society (emperors, senators, knights, and<br />

their family members) of the Roman Empire between 31 BC and the end of the reign of Diocletian<br />

(AD 305). The source material used in creating both the printed volumes and the database is wide-ranging<br />

and includes literature (Ovid, 547 Virgil, Plutarch, Horace, Pausanias), administrative and historical<br />

records, as well as inscriptions, papyri, and coins. Access to the PIR entries is provided through a<br />

searchable “keyword list” that has been created for the website, and each entry contains a unique<br />

identifier, a person’s name, and a reference to the printed PIR volumes or other standard reference<br />

works.<br />

A variety of ongoing research into the “onomastics and prosopography of the ‘later’ periods of<br />

Egyptian history on the basis of the Greek, Latin, Egyptian and other texts” is being conducted by<br />

researchers using the various texts contained within the Trismegistos portal. 548 The basic methodology<br />

involves the collection of anthroponyms and toponyms mentioned in the texts, and when there is no<br />

electronic corpus available, these names are entered manually. Work with Greek papyri, however, has<br />

been greatly enhanced because of the existence of the XML-encoded corpus of the DDbDP, which has<br />

545 http://www.chass.utoronto.ca/attica/<br />

546 http://www.bbaw.de/bbaw/Forschung/Forschungsprojekte/pir/de/Startseite<br />

547 One interesting project created an onomasticon exclusively for the Metamorphoses of Ovid: http://staff.cch.kcl.ac.uk/~wmccarty/analyticalonomasticon/<br />

548 For a list of the projects, see http://www.artshumanities.net/event/digital_classicistics_work_progress_seminar_onomastics_name_extraction_graeco_egyptian_papyri_



been made freely available to them. Since the DDbDP is in Unicode and all proper names in it are already<br />

capitalized, the semiautomated extraction of names from it was greatly simplified. The names<br />

extracted from the DDbDP were added to the list of personal names already available from the<br />

Prosopographia Ptolemaica, and the full corpus currently includes “25723 Greek nominative name<br />

variants” that have been grouped into 16,571 names. Links from this merged corpus to the DDbDP will<br />

be made using a database of 207,070 “declined forms of these name variants.” Ultimately, all of the<br />

recognized name forms will be stored in a relational database of name references that will then be able<br />

to serve as a prosopography. All name references will be linked to the appropriate texts in the<br />

Trismegistos texts database.<br />
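
The extraction step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Trismegistos implementation: the function names are hypothetical, the three-entry lookup table stands in for the much larger database of declined forms, and the sample phrases are invented. The sketch relies only on the two properties named in the text, namely that the corpus is in Unicode and that only proper names are capitalized.<br />

```python
import re
import unicodedata


def nfc(text):
    """Normalize to NFC so precomposed and decomposed Greek compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical lookup table: declined form -> nominative name variant.
DECLINED_TO_VARIANT = {
    nfc(declined): nfc(variant)
    for declined, variant in {
        "Πτολεμαίου": "Πτολεμαῖος",
        "Πτολεμαῖος": "Πτολεμαῖος",
        "Κλεοπάτρας": "Κλεοπάτρα",
    }.items()
}


def extract_capitalized(text):
    """Return every token that starts with an uppercase letter."""
    tokens = re.findall(r"\w+", nfc(text))
    return [token for token in tokens if token[0].isupper()]


def resolve(text):
    """Pair each candidate name with its nominative variant, or None.

    A None result marks a form that still needs manual review, which is
    why the process is semiautomated rather than fully automatic.
    """
    return [(tok, DECLINED_TO_VARIANT.get(tok)) for tok in extract_capitalized(text)]


sample = "ἔτους Πτολεμαίου καὶ Κλεοπάτρας"
for form, variant in resolve(sample):
    print(form, "->", variant)
```

Normalizing both the text and the table keys to the same Unicode form matters for polytonic Greek, where the same accented letter can be encoded in more than one way.<br />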

The Prosopographia Ptolemaica (PP) 549 is a longstanding research project from the Department of<br />

Ancient History at the University of Leuven. Started as a “list of all inhabitants of Egypt between 300<br />

and 30 B.C., from Greek, Egyptian and Latin sources,” the project has recently been extended to<br />

include the Roman and Byzantine periods. This resource has been integrated into the larger<br />

Trismegistos portal and includes close links to the HGV and the DDbDP but also maintains a separate<br />

database interface. This database can be searched by Latin name transcription, ethnic group, residence,<br />

PP number, or date, and each individual person record can include a PP number (if available), a Latin<br />

transcription of the name, sex, place of residence, ethnic group, assumed dates, and a reference to the<br />

text in which the person is mentioned (e.g., papyri, inscriptions), along with a link to bibliographic<br />

information on this text in the Trismegistos database.<br />

Another website that provides access to prosopographical data from Egypt is “DIME Online:<br />

Prosopographie zu Soknopaiu Nesos.” 550 DIME contains references to written records (Demotic and<br />

Greek) of people who lived in the Soknopaiu Nesos area of Al Fayyūm from the seventh century BC to<br />

the fifth century AD. The entire database can be searched, and each identified individual has a<br />

descriptive record that includes basic personal and kinship information, possessions, and any relevant<br />

bibliography. While searching the database does not require registration, users who do register can<br />

add information to the database.<br />

A related onomastic, if not entirely prosopographical, project for ancient Greece is the Lexicon of<br />

Greek Personal Names (LGPN). 551 This project was established in 1972 as a research project of the<br />

British Academy under the direction of Peter Marshall Fraser, and in 1996 it became a part of Oxford<br />

University, where it is a member of the group of Oxford Classics Research Projects. The purpose of the<br />

LGPN is to:<br />

… collect and publish with documentation all known ancient Greek personal names (including<br />

non-Greek names recorded in Greek, and Greek names in Latin), drawn from all available<br />

sources (literature, inscriptions, graffiti, papyri, coins, vases and other artefacts), within the<br />

period from the earliest Greek written records down to, approximately, the sixth century<br />

A.D. 552<br />

This lexicon does not include mythological names, Mycenaean names, or later Byzantine names. Five<br />

volumes have been published, and several more are forthcoming. Individual volumes include all the<br />

Greek names from a particular geographic area (e.g., LGPN I: Aegean Islands, Cyprus, Cyrenaica).<br />

Each volume can be downloaded as a series of four PDF files, with an introduction, a bibliography of<br />

549 http://ldab.arts.kuleuven.be/prosptol/index.html<br />

550 http://www.dime-online.de/index.phpPHPSESSID=hc6fl0v16ls14vesuptn7uqnc3<br />

551 http://www.lgpn.ox.ac.uk/<br />

552 http://www.lgpn.ox.ac.uk/project/index.html



sources used and their abbreviations, and a foreword and reverse index of the Greek names in that<br />

volume. All of the LGPN data (250,000 published records) are stored in a relational database, and each<br />

record typically includes a normalized primary name form, the sex of the individual named, place and date<br />

of attestation (dates can vary widely), and the bibliographical reference or references indicating where the<br />

name was found. 553 This website also includes a useful introduction to Greek names, including their<br />

history, formation, and meanings, and an image archive that includes tombstones, vases, inscriptions,<br />

and other sources that have been used for names. A searchable database called the “LGPN Online” can<br />

be used to search the more than 35,000 names published in LGPN I-IV and the revised LGPN II. Work<br />

is also under way to develop a TEI-XML schema for the LGPN and to convert the entire database into<br />

TEI-XML for long-term preservation and interoperability.<br />

A recent presentation by Matthews and Rahtz (2008) provided extensive details on the future plans<br />

of the LGPN regarding TEI-XML, how the resource has already been used in various types of classical<br />

research, 554 and how it may be used in future research. As these authors described, the LGPN has lived<br />

through various generations of humanities computing since it originated in the 1970s. The most<br />

important part of this long history was the development of a database in the 1980s that was “structured<br />

to reflect and provide access to all the research components of an LGPN record, which in the books are<br />

subsumed under name-headings” (Matthews and Rahtz 2008). While this database has been important<br />

in enforcing some format consistency and was used to generate the printed volumes, Matthews and<br />

Rahtz argued that its research potential has yet to be fully exploited. 555 The last decade of LGPN<br />

development has pursued the following goals: the serialization of the relational database into<br />

XML; the support of online searching using an XML database; and the creation of a new data model<br />

that will emphasize collaboration.<br />

The LGPN plans to convert its electronic lexicon into a system entirely based on<br />

TEI-XML. This work is being undertaken not only to create an IT infrastructure that will support the<br />

preservation and maintenance of the LGPN data but also to enable these data to play a larger role in an<br />

e-research environment and to allow the LGPN to play a “central role in determining standards for<br />

encoding names in documents” through TEI-XML and thus achieve greater interoperability with digital<br />

resources worldwide (Matthews and Rahtz 2008). This “XML phase” of the LGPN work has led to the<br />

definition of a customized TEI-XML schema that will be used to preserve an archival form of the<br />

lexicon data in a digital repository. This work coincided with and thus influenced the TEI’s recent<br />

revision of its module relating to names and dates. 556 The new module models “persons, places and<br />

organizations as first class objects,” so the LGPN schema is a fully “conformant pure subset of the<br />

TEI” (Matthews and Rahtz 2008). The LGPN has also usefully defined five potential levels of data<br />

interchange: character interchange, character encoding, standardized structural markup, standardized<br />

semantic markup, and information linking.<br />
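
To make the idea of a person as a “first class object” more concrete, the following sketch serializes one record as a TEI-style `person` element. It is only an approximation: the element names are borrowed from the TEI names and dates module, but the record, its identifier, the choice of fields, and the omission of the TEI namespace are assumptions made for illustration, and the output has not been validated against the LGPN's customized schema.<br />

```python
import xml.etree.ElementTree as ET

# The xml:id attribute lives in the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def person_to_tei(record):
    """Build a TEI-style <person> element from a flat record (a sketch only)."""
    person = ET.Element("person", {XML_ID: record["id"]})
    ET.SubElement(person, "persName").text = record["name"]
    ET.SubElement(person, "sex", {"value": record["sex"]})
    # Date range of attestation, with the place of attestation nested inside.
    floruit = ET.SubElement(
        person,
        "floruit",
        {"notBefore": record["not_before"], "notAfter": record["not_after"]},
    )
    ET.SubElement(floruit, "placeName").text = record["place"]
    ET.SubElement(person, "bibl").text = record["reference"]
    return person


# Invented example record; the values are placeholders, not LGPN data.
record = {
    "id": "example-1",
    "name": "Ἀριστοτέλης",
    "sex": "M",
    "not_before": "-0400",
    "not_after": "-0300",
    "place": "Athens",
    "reference": "Hypothetical reference",
}

xml_text = ET.tostring(person_to_tei(record), encoding="unicode")
print(xml_text)
```

Because the person carries its own identifier, other documents can point at the record directly instead of repeating the name string, which is what makes linking across projects feasible.<br />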

The LGPN has created an experimental database 557 that contains an XML version of the LGPN in a<br />

single XML database and uses XQuery to join name, place, and person data to support new forms of<br />

sophisticated searching. The project is currently delivering search results as HTML, TEI-XML, KML<br />

(for use in Google Maps and Google Earth), Atom feeds (for use in RSS readers), and JSON (for use in<br />

553 http://www.lgpn.ox.ac.uk/online/documents/TEIXML<br />

554 The LGPN has hosted two international conferences regarding its use, and the results have been published<br />

(http://www.lgpn.ox.ac.uk/publications/index.html) in two books. The topics covered included linguistics, religious history, and demographic studies,<br />

among many others.<br />

555 For a fuller description of this database and the conversion of the printed slips, see http://www.lgpn.ox.ac.uk/online/computerization/.<br />

556 http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ND.html<br />

557 The experimental database can be accessed here: http://clas-lgpn2.classics.ox.ac.uk/



Simile Timeline 558 and Exhibit 559). Even more important, they are providing consistent “cool URLs”<br />

so that these data can be linked to and widely reused in other applications. Only a limited number of<br />

the 3,000 attested place names in the LGPN currently have KML downloads since this format requires<br />

latitude and longitude for locations, but the LGPN is working with Pleiades (as part of the Concordia<br />

initiative) to find these places in the Barrington Atlas and use the geolocation information found<br />

within this atlas. All these formats, Matthews and Rahtz explained, are created by a simple series of<br />

XSL transformations on the TEI-XML file. Through the use of consistent standards, therefore, the<br />

LGPN was able to demonstrate the potential of linking its data with many other digital classics<br />

projects.<br />
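
The single-source, multiple-format pattern described above can be sketched as follows. The LGPN uses XSL transformations for this; the Python functions below are a stand-in chosen only for illustration, and the function names, the place records, and the approximate coordinates are all assumptions. The sketch also shows why a place without latitude and longitude can be delivered as JSON but not as KML.<br />

```python
import json
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def q(tag):
    """Qualify a tag name with the KML namespace."""
    return f"{{{KML_NS}}}{tag}"


def to_json(place):
    """JSON needs no coordinates, so every place record can be serialized."""
    return json.dumps(place, ensure_ascii=False, sort_keys=True)


def to_kml(place):
    """Return a KML document, or None when the place has no coordinates."""
    if place.get("lat") is None or place.get("lon") is None:
        return None
    kml = ET.Element(q("kml"))
    placemark = ET.SubElement(kml, q("Placemark"))
    ET.SubElement(placemark, q("name")).text = place["name"]
    point = ET.SubElement(placemark, q("Point"))
    # KML lists longitude first, then latitude.
    ET.SubElement(point, q("coordinates")).text = f"{place['lon']},{place['lat']}"
    return ET.tostring(kml, encoding="unicode")


# Invented records; the coordinates for Athens are approximate.
athens = {"name": "Athens", "lat": 37.97, "lon": 23.72}
unlocated = {"name": "Unlocated place", "lat": None, "lon": None}

for place in (athens, unlocated):
    print(to_json(place))
    print(to_kml(place) or "no KML: coordinates missing")
```

Keeping each output format as a separate function over the same record mirrors the design choice reported by Matthews and Rahtz: once the source data follow a consistent standard, adding a new delivery format is a small, isolated transformation.<br />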

As this overview of prosopographical and onomastic resources illustrates, there are a number of<br />

prosopographical resources online, although far fewer than for other classical disciplines. It appears<br />

that none of the projects reviewed other than the LGPN has plans to provide XML versions of its<br />

data, or indeed to provide any access to its data at all other than through the individual websites.<br />

However, the information found within these databases, particularly the lists of personal names and<br />

their variants, could be extremely useful as training data in the development of named-entity<br />

disambiguation algorithms for historical texts. Similarly, as many of the resources used in the creation<br />

of these databases (e.g., published collections of documentary texts, papyri, inscriptions) are in the<br />

public domain and may have been published online (e.g., in Google Books or the Internet Archive),<br />

individual name records could likely be linked to a variety of online sources of their attestations.<br />

THE USE AND USERS OF RESOURCES IN DIGITAL CLASSICS AND THE DIGITAL HUMANITIES

While this review originally intended to include a survey of studies that examined how scholars made use of specific digital classics projects and how well these projects met their needs, no such overview studies were located. 560 A number of digital classics resources included extensive bibliographies 561 that listed research that had made use of the analog sources (e.g., printed collections of inscriptions or papyri, published editions of classical texts), but none seemed to include studies that specifically examined whether or how the digital resources were being used for scholarship. 562 There are many studies that investigate the information-seeking habits of humanities scholars, including how they find electronic resources or search on the Web, but many of these studies have focused on “traditional” electronic resources such as databases subscribed to by libraries or the use of general search engines such as Google. 563 One notable exception that focused specifically on humanist use of primary

558 The SIMILE Timeline is an open-source “widget” that can be used to create interactive timelines (http://www.simile-widgets.org/timeline/), and LGPN made particular use of TimeMap (http://code.google.com/p/timemap/), a JavaScript library that was created to “help use Google Maps with a SIMILE timeline.”

559 Exhibit (http://www.simile-widgets.org/exhibit/) is an open-source publishing framework created by the SIMILE project that can be used to easily “create web pages with advanced text search and filtering functionalities, with interactive maps, timelines, and other visualizations.”

560 While some research has been conducted into the use of the Perseus Digital Library, none has been conducted in the last 10 years or in terms of the current website (Perseus 4.0). For earlier research, see, for example, Marchionini and Crane (1994).

561 See, for example, the bibliography of publications related to Projet Volterra, http://www.ucl.ac.uk/history2/volterra/bibliog.htm

562 For some collections such as the APIS, relevant bibliography of how an individual papyrus has been used or published is integrated into records for the papyri.

563 A synthesis of major findings from more than 12 recent studies in this area (including faculty, researchers, graduate students, and undergraduates from various disciplines) has been created by Connaway and Dickey (2010), while Palmer et al. (2009) have offered an analysis of studies that have focused more exclusively on the scholarly practices and information use of faculty online during the past 20 years. See also the literature review found in Toms and O’Brien (2008). For two sample recent examinations of how humanists search for information on the Web and work with library electronic resources, see Amin et al. (2008) and Buchanan et al. (2005).



resources, including digital facsimiles, is Audenaert and Furuta (2010), which will be examined in further detail below.

As the discipline of classics falls under the larger umbrella of the humanities disciplines and this review has primarily focused on open-access digital resources in classics, a number of studies that investigated the use of freely available digital resources in the humanities have been chosen for further examination to see what general insights might be determined. For this reason, this review has examined various studies that have explored the citation of electronic resources in classics (Dalbello et al. 2006), the behaviors of digital humanists with e-texts (Toms and O’Brien 2008), humanist use of primary source materials including digital facsimiles (Audenaert and Furuta 2010), the scholarly creators of digital humanities resources (Warwick et al. 2008b), and the “traditional” scholarly use of digital humanities resources (Harley et al. 2006b, Brown and Greengrass 2010, Meyer et al. 2009, Warwick et al. 2008a).

Citation of Digital Classics Resources

One method of exploring how digital resources are being used within a discipline is to determine how many citations to different digital resources can be found within “conventional” publications. Pursuing the traditional task of bibliometrics to examine how and when digital resources are cited in scholarly publications is a growing area of research, and a general methodological approach is described as part of the JISC-funded “Toolkit for the Impact of Digitised Scholarly Resources (TIDSR).” 564 In their efforts to determine how often five digital projects had been cited, the project team searched for citations to these resources using Google Scholar, 565 Scopus, 566 and ISI Web of Knowledge. 567 One major issue with bibliometrics, they reported, “with regard to non-traditional scholarly outputs is that citation habits in many fields favour citing the original paper version of a document, even if the only version consulted was electronic.” 568 In fact, they found that of those scholars who published papers as a result of their work with digital materials in the five projects, more than one-third cited only the physical item that was represented in the collection and made no reference to the digital project; almost half cited the original publication but also included the URL; and fewer than one in five cited only the digital version. Thus they cautioned against simply relying on bibliometrics to analyze the actual scholarly impact of a digital project. As Project Director Eric Meyer explained:

This means that relying on finding citations to one's digitised resource based on looking for URL's within journal citations is almost certainly going to yield an artificially low number because of the uses that don't cite it at all, and because of inconsistencies in how the URLs are cited. Nevertheless, doing regular searches for citations to a collection's material is an important way to establish the impact it is having on the scholarly community. 569

Thus, while one way to measure the impact of a resource in digital classics is to perform a citation analysis looking for citations to project URLs or references to digital projects in article text using various tools such as Google Scholar, the actual amount of use of different digital projects may be considerably higher than can be easily determined. Meyer’s point also illustrates the importance of maintaining stable URLs to encourage the citation of resources and their subsequent discovery in bibliometric analyses.
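To make the limits of URL-based citation counting concrete, the following is a minimal Python sketch rather than the TIDSR team's actual procedure. It assumes a purely hypothetical project ("Example Papyri Database" at papyri.example.org) and a handful of invented citation strings. It normalizes cited URLs so that inconsistent forms of the same address compare equal, then sorts each citation into one of three bins: cites the project URL, names the project without a URL, or not detectable. The third bin is where print-only citations of digitally consulted material would end up, which is why such counts tend to understate real use.

```python
import re
from collections import Counter

# Hypothetical project details, chosen only for illustration.
PROJECT_NAME = "Example Papyri Database"
PROJECT_HOSTS = ("papyri.example.org",)

# Rough URL matcher: stops at whitespace and common closing punctuation.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)


def normalize_url(url):
    """Reduce a cited URL to a bare lowercase host/path so variants compare equal."""
    url = url.strip().rstrip(".,;")
    url = re.sub(r"^https?://", "", url, flags=re.IGNORECASE)
    url = re.sub(r"^www\.", "", url, flags=re.IGNORECASE)
    return url.rstrip("/").lower()


def classify_citation(citation):
    """Return 'url', 'name_only', or 'not_detected' for one citation string."""
    urls = [normalize_url(u) for u in URL_PATTERN.findall(citation)]
    if any(u.startswith(host) for u in urls for host in PROJECT_HOSTS):
        return "url"
    if PROJECT_NAME.lower() in citation.lower():
        return "name_only"
    # A print-only citation of an item consulted online lands here,
    # so this bin hides an unknown amount of real use.
    return "not_detected"


def tally_citations(citations):
    """Count how many citations fall into each detection category."""
    return Counter(classify_citation(c) for c in citations)


# Invented sample citations, one per typical citing habit.
SAMPLE_CITATIONS = [
    "Smith, J. 2005. Greek Papyri, vol. 2. Oxford.",
    "Smith, J. 2005. Greek Papyri, vol. 2. Oxford. "
    "Available at http://www.papyri.example.org/item/12.",
    "Example Papyri Database, record 12, HTTP://papyri.example.org/item/12/",
    "See the Example Papyri Database for the image.",
]
```

Calling `tally_citations(SAMPLE_CITATIONS)` counts two citations as URL matches, one as a name-only mention, and one as undetected, even though all four could plausibly reflect use of the same digital record.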

564 http://microsites.oii.ox.ac.uk/tidsr/kb/53/bibliometrics-enhancing-ability-track-projects-scholarly-impacts

565 http://scholar.google.com/

566 http://www.scopus.com/

567 http://www.isiknowledge.com/

568 http://microsites.oii.ox.ac.uk/tidsr/kb/53/bibliometrics-enhancing-ability-track-projects-scholarly-impacts

569 Eric T. Meyer, “What Is Bibliometrics and Scientometrics,” http://microsites.oii.ox.ac.uk/tidsr/kb/48/what-bibliometrics-and-scientometrics



Dozens of digital classics resources are mentioned in this paper, and a full bibliometric analysis of even one of the projects would be far beyond the scope of this review. Nonetheless, sample searches within Google Scholar for two projects, the PP and the APIS, illustrated that these resources are indeed cited within the larger scholarly literature, even though these citations are not always easy to find. For example, the PP has been used to explore spatial relationships and estimate the population size of settlements (Mueller and Lee 2004), as one source of data for an onomastic study of Hebrew and Jewish-Aramaic names (Honigman 2004), and as a source of data for a population study of Hellenistic Egypt (Clarysse and Thompson 2006). Digital images and translations of individual papyrus fragments within the APIS have also been cited in different publications, including a discussion of a comic fragment (Barrenechea 2006) and a study of Greek hepatoscopy (Collins 2008). Interestingly, while all three of the citations to the Prosopographia Ptolemaica either listed the database by name in the article or footnotes or included URLs to the collection, neither of the references to the APIS included URLs to the individual papyrus records they used. 570

A specific citation study for electronic resources in classics by Dalbello et al. (2006) examined the number and types of citations to electronic resources made by classicists in three important journals (Classical Journal, Mnemosyne, and Classical Antiquity). 571 As the authors considered classics to be a field known for digital innovation, they expected to find many references to digital scholarly resources, but instead found that references were typically made to educational sites, such as in articles that discussed learning about the discipline, in reports of practice that discussed the recent history of the discipline, and in articles that analyzed the potential use of technology for research. “More rarely are the associations to electronic resources included for knowledge building in a traditional scholarly fashion,” Dalbello et al. observed, “such as – integrated in the literature review, supporting the main argument, etc.” Within classical journals, the types of websites cited included community of practice sites, university sites, digital library sites, encyclopedias, online dictionaries, electronic libraries, and electronic journals. Thus Dalbello et al. remarked that while digital resources were discussed as teaching tools or in state-of-the-art reviews, they were not being used to create new knowledge or as research tools. Despite the presence and use of digital resources, Dalbello et al. concluded that most classicists still perceived publication as paper-based:

Our findings indicate that the structuring of literature in these fields is largely still perceived as paper-based. … Documentary cultures resulting from digitization of resources supporting traditional research and digital preservation as well as multiple document formats for scholarly journals (electronic, paper) present a new research environment for the humanities disciplines that is not as yet fully integrated in the canonical knowledge base. These citation practices point to the still invisible nature of the electronic document that is now ubiquitous in supporting the actual research practice (Dalbello et al. 2006).

The lasting influence of the print paradigm has also been explored in terms of “digital incunabula,” or how the conventions and limitations of print publication have shaped the development of digital projects (Crane et al. 2006).

The Research Habits of Digital Humanists

Another approach to determining the actual use of digital classics resources is to survey the users who work with them. Since no studies that specifically examined the scholarly use of digital classics

570 But the record for the papyrus utilized in Barrenechea (2006) does list his publication in its bibliography (http://wwwapp.cc.columbia.edu/ldpd/apis/itemmode=item&key=columbia.apis.p1550).

571 This research also explored electronic citations made in English literature journals.



materials could be found, a recent comprehensive study that investigated in detail the research habits of digital humanists with e-texts was used to gain insight into larger research and use patterns of digital humanities scholars and how these might reflect the behaviors and needs of those scholars who use digital classics resources.

A study by Toms and O’Brien (2008) focused on self-identified e-humanists and how they utilized information and communication technology (ICT), with the aim of informing the design of an “e-humanist’s workbench.” Their research results were based on a small sample of digital humanists who responded to a web-based survey, and Toms and O’Brien noted that they planned to expand their research with interviews and observations of scholars at work. As part of their research they examined dozens of articles and over 40 years of studies on scholarly information behavior and the “traditional” research habits of humanists (e.g., with printed library materials or databases). One fundamental conclusion they reached was that “the accumulated research depicts the humanist as a solitary scholar who values primary materials and secondary materials—namely books—and engages in browsing behaviour more than searching” (Toms and O’Brien 2008). They also observed that the “information seeking strategy of choice” was linking rather than searching, or a combination of browsing different materials and chaining, that is, using one relevant article to find other articles (e.g., through the footnotes or by who has cited the article in hand). Their overview of the literature also illustrated that when humanists began research they cared more about depth than relevance and that this facilitated the development of ideas, the ability to make connections, and the creation of an initial knowledge base to which later knowledge could be related.

One problem that Toms and O’Brien found with the theoretical studies of information behavior 572 that they considered was that the studies typically excluded how information is actually used by humanists after it is found and thus are of limited use in the development of actual systems. “Despite the profound impact of technology on this scholarly community,” Toms and O’Brien remarked, “little is known about how computers have affected humanists’ work flow, unless it is to say that scholars adopt technologies when they augment established research practices” (Toms and O’Brien 2008).

Nonetheless, earlier work conducted by one of the study’s authors (Toms and Flora 2005) had identified a concrete set of needs for e-humanists that included five key components: (1) access to primary sources; (2) presentation of text; (3) text analysis and relevant tools; (4) access to secondary sources; and (5) tools for communication and collaboration. In addition, Toms and O’Brien noted that there is little joint publication in the humanities. Having thus reviewed both the work of information scientists and humanists, they elaborated on three common themes: (1) humanities scholarship utilizes a diverse set of primary and secondary sources and while text is the primary resource, a variety of digital media are also used; (2) digital humanists use a variety of tools and techniques when working with encoded texts; and (3) humanists were typically solitary researchers (e.g., they saw little evidence of joint publication) but did communicate with other scholars.

Seeking to confirm or expand these findings, Toms and O’Brien conducted a survey of self-identified e-humanists to learn more about the current use of electronic texts by e-humanists, the research environment of e-humanists, and the types of research performed by all humanists. Survey participants were recruited through listservs such as HUMANIST, 573 and all results were obtained from a questionnaire that asked about general information, teaching and research interests, and the use of ICT. Responses of the 169 survey participants were analyzed using several different software tools.

572 As an overview of all of these studies, Toms and O’Brien provided a table that summarized the various stages of the information-seeking process<br />
identified by different researchers.<br />

573 http://www.digitalhumanities.org/humanist/



The ratio of male-to-female respondents was about 3:2, and almost two-thirds of respondents had PhD<br />
degrees. Most were from Canada, followed by the United States, Europe, and Australia.<br />
Approximately 25 percent to 30 percent had never performed any type of text analysis. In terms of<br />
areas of specialization, more than 40 percent reported that they worked in literature, but “classics,<br />
history and religion” was reported by 12 percent of participants. While the “dominant language” that<br />
most scholars reported working in was English, a dozen other languages were identified; interestingly,<br />
more scholars reported working primarily with Latin (17 percent) than with Italian (7 percent) or<br />
Spanish (9 percent). While more than half of respondents worked primarily with materials from the<br />
twentieth century, approximately 13 percent worked in ancient and “post-classical history.”<br />

As teaching and research are often intertwined in the humanities, Toms and O’Brien also asked<br />
respondents how they used electronic texts and ICT in the classroom. About 60 percent of their<br />
respondents taught courses in the humanities, but their use of technology in the classroom<br />
overwhelmingly involved the general use of course websites, online instructional systems, and<br />
textbook courseware. While in some cases students were required to create e-texts (39 percent), encode<br />
texts using markup such as HTML or XML (72 percent), and use text-analysis tools (33 percent),<br />
Toms and O’Brien reported that “a significant number of respondents (42 per cent) have not required<br />
students to use any of these.” Thus although many e-humanists used text-analysis and digital<br />
technology in their own research, there was an apparent disconnect for many in terms of their use of<br />
technology in the classroom.<br />

Another set of questions explored the research themes of e-humanists. The most prevalent theme listed<br />
was the semantic or thematic examination of the text/s of one or more authors (37 percent); this was<br />
followed by the creation or use of specific electronic editions (20 percent) and the creation of<br />
specialized catalogs or bibliographies using databases that already exist (13 percent). Only 13 percent<br />
reported their main research theme as conducting “computational text analyses” or “developing<br />
techniques for analysis.”<br />

Questions regarding the use of electronic texts and text-analysis tools made up a large proportion of the<br />
survey, and 86 percent of respondents reported using e-texts. At the same time, only 34 percent had<br />
used “publicly available” text-analysis tools and 61 percent had never scanned or encoded texts. In<br />
terms of markup, more than half of the respondents stated that they preferred no markup in their e-texts<br />
and almost 25 percent had no knowledge of TEI. Electronic texts were typically selected according to<br />
levels of access, special features, and cost, and in general scholars wanted texts to be legally accessible<br />
and be available in a stable form and from reliable publishers or institutions. Similarly, more than 75<br />
percent wanted e-texts to be peer-reviewed (with 67 percent preferring texts from established editions),<br />
while 79 percent wanted e-texts to be accompanied by documentation. Surprisingly, less than half of<br />
those surveyed (48 percent) required page images to be available. Only 62 percent of respondents had<br />
used text-analysis tools; those who had not used them reported various reasons including expense, low<br />
priority, usability issues and technical incompatibility.<br />

When participants were asked about their “wish list” for text-analysis tools, two of the most significant<br />
desires listed were for institutions “to maintain tools for the study and publication of e-texts” (41<br />
percent) and for fellow researchers to share tools they had created (46 percent). While 66 percent of<br />
participants had either created or contributed to the creation of text-analysis tools, most respondents<br />
were unaware of currently available text-analysis tools. Respondents who were aware of these tools<br />
typically considered most of them not to be very useful. Desired text-analysis techniques<br />
were quite varied, but the two most frequently desired capabilities were the ability to compare two



or more documents (69 percent) and to view a text concordance (61 percent). In sum, a large number<br />
of e-humanists desired to have some type of institutional infrastructure for their work but displayed a<br />
lack of knowledge about what types of resources were available.<br />
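
For readers unfamiliar with the two capabilities respondents most often requested, the following is a minimal Python sketch of a keyword-in-context concordance and a word-level comparison of two versions of a passage. It is an illustration only and is not drawn from the Toms and O’Brien study; the function names `kwic` and `compare` and the sample sentences are hypothetical.

```python
import difflib
import re


def kwic(text, keyword, width=2):
    """Keyword-in-context: show `width` words on each side of every hit."""
    words = re.findall(r"\w+", text.lower())
    lines = []
    for i, word in enumerate(words):
        if word == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


def compare(version_a, version_b):
    """Word-level differences between two versions of the same passage."""
    a, b = version_a.split(), version_b.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Invented sample text, used only to exercise the two functions.
concordance = kwic("The king sent ships. The ships reached the island.", "ships")
differences = compare(
    "the king sent ships to the island",
    "the king sent envoys to the island",
)

for line in concordance:
    print(line)
print(differences)
```

Production tools would also need tokenization suited to the language of the text, lemmatization for inflected languages such as Latin and Greek, and alignment across more than two witnesses, none of which this sketch attempts.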

Another series of questions gauged participants’ access to primary and secondary sources. Over 90<br />
percent of respondents rated search engines as highly useful for finding e-texts and analysis tools, and<br />
over 78 percent wanted to be able to view lists of available e-texts. Survey respondents also wanted a<br />
reasonably high level of structure for their e-texts: 71 percent wanted to be able to restrict their search<br />
terms by chapter, 53 percent wanted to restrict them by a character in a play or novel, and 48 percent<br />
wanted to search on the level of the individual paragraph. These results are interesting since many<br />
participants also reported that they preferred no markup in their texts, and searching at these levels of<br />
granularity requires at least basic structural markup (e.g., chapters, pages, paragraphs) and, in the case<br />
of novel characters, semantic markup (e.g., TEI).<br />
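
The dependence of granular search on markup can be illustrated with a short Python sketch. It assumes a simplified, TEI-like fragment in which chapters are `div` elements, paragraphs are `p` elements, and speeches carry a `who` attribute on `sp`; the sample text and the `search` function are hypothetical, and real TEI documents use an XML namespace and a richer schema that this sketch omits.

```python
import xml.etree.ElementTree as ET

# Invented sample: chapters and speeches are only findable because they are tagged.
SAMPLE = """
<text>
  <div type="chapter" n="1">
    <p>The fleet left the harbor at dawn.</p>
    <p>No one spoke of the war.</p>
  </div>
  <div type="chapter" n="2">
    <p>The war reached the city in winter.</p>
    <sp who="#messenger"><p>The harbor has fallen.</p></sp>
    <sp who="#queen"><p>Then the war is lost.</p></sp>
  </div>
</text>
""".strip()


def search(root, term, chapter=None, speaker=None):
    """Return (chapter, paragraph) pairs containing `term`, optionally restricted."""
    hits = []
    for div in root.iter("div"):
        if div.get("type") != "chapter":
            continue
        if chapter is not None and div.get("n") != str(chapter):
            continue
        if speaker is None:
            paragraphs = list(div.iter("p"))
        else:
            # Only paragraphs inside speeches attributed to the given speaker.
            paragraphs = [
                p for sp in div.iter("sp") if sp.get("who") == speaker
                for p in sp.iter("p")
            ]
        for p in paragraphs:
            content = "".join(p.itertext())
            if term.lower() in content.lower():
                hits.append((div.get("n"), content))
    return hits


root = ET.fromstring(SAMPLE)
all_hits = search(root, "war")
chapter_hits = search(root, "war", chapter=2)
speaker_hits = search(root, "war", speaker="#queen")
print(all_hits, chapter_hits, speaker_hits, sep="\n")
```

Without the `div` and `sp` elements, the same text would support only the unrestricted search; the chapter and speaker filters have nothing to operate on in an unmarked file.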

The final series of questions involved scholarly communication and collaboration, and the large<br />
majority of answers seemed to confirm the picture of humanists, even self-identified e-humanists, as<br />
solitary researchers. 574 As Toms and O’Brien reported, almost half of respondents worked alone. In<br />
addition, a majority had not conducted research with colleagues (55 percent) or graduate students (64<br />
percent). An even larger number of researchers (87 percent) reported that they did not tend to discuss<br />
their work before it was formally submitted: less than 40 percent shared ideas at early stages of<br />
research and more than half had not consulted colleagues at all. While their research had confirmed the<br />
picture of the humanist as a solitary scholar, the authors proposed that this was perhaps due more to the<br />
nature of work in the humanities than to personal qualities:<br />

This does not, however, mean that humanists are not collegial; it may be more fitting to say that<br />
humanists communicate with each other rather than collaborate, since collaboration implies<br />
working together—building—and the humanists’ work is all about deconstructing ideas and<br />
dissecting texts (Toms and O’Brien 2008).<br />

To facilitate greater collaboration in the future, Toms and O’Brien suggested that an e-humanist<br />
workbench should provide a variety of communication and collaboration tools.<br />

Toms and O’Brien concluded that this encapsulated view of digital humanists at work illustrated that<br />
“clearly, humanities research is intricate and diverse.” They were surprised both by the relatively low<br />
level of technology use within the classroom and by the fact that for many respondents the use of<br />
technology simply involved delivering reading materials from a course website. Another notable<br />
finding, according to Toms and O’Brien, was that search engines were used as often as library catalogs<br />
to locate both primary and secondary sources, a practice that marked a significant change from many<br />
of the earlier studies they had found. Library tools were typically used for well-defined topics, and<br />
browsing remained a preferred method for finding information. As a result of these findings, they<br />
decided that an e-humanities workbench should include a web search capability as well as links to<br />
catalogs, finding aids, and archives. A scholar should also be able to personalize the workbench with<br />
his or her own list of relevant websites and digital libraries.<br />

574 Similar conclusions were reached by Palmer et al. in their overview of online scholarly information behavior across disciplines: “Thus, humanities<br />
scholars and other researchers deeply engaged in interpreting source material rely heavily on browsing, collecting, rereading and notetaking. They tend to<br />
compile a wide variety of sources and work with them by assembling, organizing, reading, analyzing and writing. In interacting with colleagues, they<br />
typically consult rather than collaborate, with the notion of the lone scholar persisting in certain fields” (Palmer et al. 2009, 37).



The authors also offered a number of conclusions regarding workbench support for e-texts and text<br />
analysis tools. To begin with, they stated that any workbench should support the downloading, storing,<br />
and organizing of e-texts as well as the ability to encode them in different markup languages. “With<br />
the multiple forms of mark-up and the multiple expectations about the multiple mark-ups,” Toms and<br />
O’Brien observed, “it is clear that ‘multiple views’ of a text are needed for e-humanists” (Toms and<br />
O’Brien 2008). In addition, as both availability and access to texts are critical to the work of<br />
humanists, Toms and O’Brien argued that part of the problem is difficulty not just in identifying what<br />
texts have been digitized but also in gaining access to these texts. Since UNIX tools had been reported<br />
as among the most useful, Toms and O’Brien reasoned that the workbench would need to support both<br />
the awareness and use of already-existing text analysis tools through technical and peer support and<br />
examples of what the tools can do. “Access to text-analysis tools is imperative,” Toms and O’Brien<br />
acknowledged, “but more importantly, the development of new tools is badly needed by this<br />
community. While TEI and TEI-lite standardized the mark-up work for humanists, similar standards<br />
need to be developed to serve as the basis for text analysis tools, so that text with various forms of<br />
mark-up are interchangeable with different types of tools” (Toms and O’Brien 2008). The authors thus<br />
make the important point that standards are needed not just for text markup but also for the<br />
development of compatible text-analysis tools.<br />

The results of the Toms and O’Brien study illustrate some important themes to be considered when<br />
designing a digital infrastructure for classics, namely, the need to provide a common<br />
framework/research environment where scholars can both find and use existing text-analysis tools and<br />
e-texts, to support granular scanning and browsing tools so that scholars can interact with a text, and to<br />
include sophisticated text-analysis and annotation tools for use with texts in a variety of markup<br />
languages. Such an environment should also include communication and awareness technologies so<br />
scholars can communicate regarding projects and resources and share tools and research<br />
methodologies.<br />

Humanist Use of Source Materials: Digital Library Design Implications<br />

A recent study by Neal Audenaert and Richard Furuta has offered in-depth insights into how humanists<br />
use original source materials (both analog and digital facsimiles) and has made a number of<br />
recommendations for the design of humanities digital libraries. While Audenaert and Furuta conceded<br />
that a large number of digital humanities resource collections already exist, they argued that the sole<br />
purpose of most of these collections was to disseminate materials. “Digital libraries, however, hold the<br />
potential to move beyond merely disseminating resources,” Audenaert and Furuta explained, “toward<br />
creating environments that support the analysis required to understand them. To achieve this, we must<br />
first develop a better understanding of how humanities scholars (and others) use source documents in<br />
their research” (Audenaert and Furuta 2010). To create a digital library that could provide an<br />
environment for research as well as disseminate resources, Audenaert and Furuta undertook a user<br />
study as part of a larger research program to design a “creativity support environment (CSE) to aid in-depth<br />
analysis and study of paper-based documents.”<br />

The authors argued that there is an urgent need for this type of research because the “interdisciplinary<br />
digital humanities field” has been largely dominated by the humanities community, where scholars<br />
have developed resources that simply meet their own individual research needs or that have embodied<br />
“theoretical” definitions of what digital scholarship should be. They concluded that these issues make<br />
such resources and methodologies of limited use for informing large-scale systems design for the<br />
broader humanities community:



This work tends to focus on describing the objects of study from within the framework of a<br />
specific theory, rather than the more traditional human-centered systems approach of analyzing<br />
the goals of specific user communities and the tasks they use to achieve those goals. The<br />
resulting tools may do an excellent job of supporting the humanities scholars’ needs for “thick<br />
description” but often result in work practices that are intimidating to many scholars (for<br />
example, the expectation that scholars will manually encode documents using XML) or that<br />
emphasize topics such as authorship attribution that are far from the mainstream of humanities<br />
research (Audenaert and Furuta 2010).<br />

In addition, Audenaert and Furuta reiterated an earlier point made by Toms and O’Brien regarding user<br />
studies from library and information science, namely, that such studies typically focus on information<br />
retrieval rather than information use. Thus their work sought to characterize how scholars actually used<br />
materials that they found. It examined in detail how access to materials supported scholars’ research<br />
questions and considered what types of insights scholars gained from original materials and what if<br />
any use they might make of a creativity support environment (CSE). In a series of semistructured<br />
interviews, Audenaert and Furuta asked eight scholars why and how they worked with original<br />
sources, using a series of open-ended questions that focused on three research themes: (1) Why do<br />
scholars put in the time to work with original materials? (2) What information are they looking for?<br />
and (3) How and when do they use computers, or do they use them at all? The scholars interviewed<br />
worked in a variety of humanities fields and included two scientists who focused on the use of<br />
historical documents.<br />

The authors learned that scholars use original source materials for a variety of reasons, including the<br />
fact that many original sources and transcriptions are now easily available. They also used original<br />
source materials to form “holistic impressions,” to gain a sense of a text as a physical object, to<br />
examine objects/sources in “nuanced detail,” to alleviate their concerns with the accuracy or<br />
authenticity of a transcription or edition (many scholars did not want to trust the work of others and<br />
didn’t always trust their own work without notes), and for the “aesthetics” of working with original<br />
documents. Even though in many cases transcriptions were considered adequate, there were still many<br />
times when scholars insisted that access to originals (digital or analog) would be essential:<br />

While editors will try to identify and describe relevant details in their published editions, the<br />
level of detail required, the specificity required by different lines of research, and the need for<br />
visual inspection makes it impractical to describe all of this information in secondary sources.<br />
Consequently, many lines of inquiry require access to source material (either directly or<br />
through digital surrogates) even when high-quality editions are readily available (Audenaert<br />
and Furuta 2010).<br />

This point echoes the earlier discussion of Bodard and Garcés (2009), who argued that open-source<br />
critical editions should provide access to all their source materials, so that scholars can form their own<br />
conclusions and ask new questions.<br />

Findings for their second major research topic, which examined what scholars were looking for in original documents, illustrated four major themes. Scholars were interested in textual transmission, surveying all the evidence and documents on a topic, identifying all the agents who contributed to both the creation and transmission of a text (e.g., author, audience, editors, illustrators, publishers, scribe), and documenting the full social, political, and economic context of a text. Interestingly, the study of text transmission, a critical task of much classical scholarship, was the most common goal of all the scholars surveyed. Another critical point made by Audenaert and Furuta was that since many scholars wanted to “survey all documentary evidence” related to a text or particular topic (a task that they noted had been made more “tractable” by modern editions), cultural heritage digital library designers should consider a “systematic survey of a collection of source documents” when creating digital libraries.

The third major research question explored how scholars used computers and whether they would be willing to use digital study and research tools in their work. The CSE that Audenaert and Furuta contemplated creating would include support for both “information seeking and externalizing formative ideas.” While information seeking has received a great deal of attention, Audenaert and Furuta noted that “externalizing knowledge” had received less attention. 575 As the humanities research process often involves intimate experience with both the secondary literature and disciplinary methods of a field, they stated that such knowledge is both “implicit and voluminous,” and individuals are often unwilling to formally express such knowledge (e.g., through an ontology). At the same time, their study showed that participants kept both detailed and systematic notes and that they usually kept them electronically. They defined the scholars’ process of research as “incremental formalism,” in which scholars were focused on a specific final product such as a monograph or scholarly article. Audenaert and Furuta thus asserted, as have many other studies cited in this report (Bowman et al. 2010, Porter et al. 2009), that scholars would benefit from both note-taking and annotation support and from a comprehensive digital environment that supported all the steps of the research process from formal notes to a final publishable manuscript.

One major conclusion Audenaert and Furuta drew from this research was that while digital facsimiles were not “adequate for all research tasks,” they nonetheless played a critical role in “mediating” access. In general, they noted that scholars were most concerned with the editorial contributions they made to a digital project, and the major form of computational support desired was tools that could help them prepare and disseminate their work to the larger scholarly community. While Audenaert and Furuta acknowledged that scholars were not particularly reflective or “critically oriented” to their own work practices, they still believed that computational support for all levels of the research process should be provided. “To the contrary, we would suggest that the clear (and relatively easy to achieve) benefits of applying technology to support the dissemination of scholarship, coupled with the comfortable familiarity of existing disciplinary methods,” Audenaert and Furuta articulated, “has led the digital humanities community to overlook opportunities to critically assess how new technology might be developed to support the formative stages of scholarship” (Audenaert and Furuta 2010).

Audenaert and Furuta argued that scholars’ work with original source materials forms part of a “complex ecosystem of inquiry” that involves understanding a text and its full context of creation, transmission, and use. They defined this process through the SCAD model, which consists of five components: primary objects that are studied; the multiple sources of a document (e.g., previous drafts or copies); context (historical, literary, political, etc.); actors; and “derived forms,” or the related sources for which an original text may in turn serve as the source (e.g., text reuse and repurposing of content). The main goal of the SCAD model, the authors explained, was to serve “analytical goals” and to be a tool that could guide designers developing tools for scholars rather than a formal conceptual model for “representing information in humanities digital libraries.” Somewhat in contrast to Benardou et al. (2010a), Audenaert and Furuta were uncertain as to whether scholars would be willing to use tools that formally represented structures and relationships between information.

575 Some preliminary work on externalizing the methods of the humanities research process into an actual ontology for use in system design has been conducted by Benardou et al. (2009) for the DARIAH project and is discussed later in this paper.



A related point offered by Audenaert and Furuta was that cultural heritage digital libraries and repositories need to reconceptualize their potential roles and move beyond serving primarily as final repositories for scholarship, to serve as resources that can support research that is in process. Another important insight was that since many humanities digitization projects can take years, digital libraries need to be designed as “evolving resources” that support the “entire life cycle of a research project,” from the digitization of materials to ongoing research using those resources to the publication and preservation of long-term scholarly works.

The authors concluded their paper with five major implications for cultural heritage digital libraries. First, the research environment that supports scholarly work is as important as the metadata and searching functionalities. The design and maintenance of such environments, they granted, will require a high level of “ongoing technical investment” that is rarely found in the humanities community. Second, humanities digital libraries will be highly focused and might include only thousands, or even hundreds, of documents. At the same time, the materials within these collections will have “complex internal structure” and require a large amount of “related contextual material and editorial notes,” a feature that is already displayed in many growing digital classics collections. Third, to become sites that support digital scholarship, humanities digital libraries will need to be created as “bootstrapping tools for their own construction,” and support for this will need to be factored in during the design process. Fourth, since projects in the humanities have long life cycles, both in terms of development and reuse, digital libraries will need to be “developed as an ongoing process with changing audience and needs.” Finally, designing and maintaining such complex libraries requires high levels of investment.

Creators of Digital Humanities Resources: Factors for Successful Use

Research reported by the LAIRAH (Log Analysis of Internet Resources in the Arts and Humanities) 576 project has recently confirmed that there have been “no systematic, evidence-based studies of the use and non-use of digital humanities resources” (Warwick et al. 2008b). To determine how digital resources were being used or not used, the LAIRAH project utilized log-analysis techniques 577 to identify 21 popular and well-used digital humanities resources within the United Kingdom and conducted in-depth interviews with their creators to see if they could identify common factors that predisposed these resources for use.

576 http://www.ucl.ac.uk/slais/LAIRAH

577 The LAIRAH project made use of the server logs of the AHDS and the Humbul portal that was merged into Intute (http://www.intute.ac.uk/), a free online directory of academic resources that have been reviewed by subject specialists at seven universities.

Warwick et al. (2008b) synthesized research that had been conducted into scholarly use of digital humanities resources and online information behavior and listed a number of important insights: (1) many scholars were enthusiastic about digital humanities resources but in general preferred “generic information resources” to specialist research sources; (2) humanists needed a wide range of information resources and types but their work typically involved reinterpreting “ideas rather than creating or discovering new data or facts”; (3) humanists would use technology only if it fit well with their existing research methods and saved them time and effort; (4) humanists preferred not to have to take specialized training in order to use a resource; (5) while humanities researchers had “sophisticated information skills and mental models of their physical information environment,” they often had difficulty applying these skills in a digital environment; (6) humanities scholars were concerned with the accuracy of the materials they used; (7) scholars wanted information about the analog resource that had been digitized; and (8) scholars expected “high quality content,” and anything that complicated their use of a resource, be it a challenging interface or confusing data, would stop them from using it. These findings, Warwick et al. (2008b) proposed, should be carefully considered by the creators of digital resources:

Thus it is incumbent on producers of digital resources not only to understand the working practices of the scholars for whom they design, but to produce a resource that is attractive, usable and easy to understand. However, perhaps surprisingly, there appears to be no research that assesses how well digital humanities resources are performing in these respects (Warwick et al. 2008b).

Thus the need to understand the working practices of the scholars for whom a resource is being designed is as important as creating an attractive and usable resource.

While none of the 21 digital humanities projects chosen for analysis was within the discipline of classics, the results of the LAIRAH interviews provide some useful information on what makes a digital resource successful in the long term. Warwick et al. (2008b) explored the documentation (if any) on each website and conducted a semistructured interview with a project representative that covered the creation and history of a resource, funding, technical standards, dissemination, and user testing. Not surprisingly, they found that the institutional context and “research culture of particular disciplines” greatly affected the production and use of digital resources. One major issue was limited institutional recognition and prestige for scholars who did digital humanities work; another was uncertainty among their colleagues as to how to value digital scholarship. Another critical issue for the success of digital humanities projects was adequate technical support and staffing. While most principal investigators (PIs) were relatively happy about the level of support they received (typically from local IT staff or expert colleagues), those that reported contact with a digital humanities center received an even higher level of expert advice. Staffing issues were paramount, as research assistants required both subject knowledge and a good grasp of digital techniques. The grant-funded nature of most projects also made it hard for research assistants to obtain adequate technical training or for PIs to retain them beyond individual projects.

The most important factor that led to resources that were well used, however, was active dissemination of project results. All the projects whose PIs were interviewed spent considerable time disseminating information about their resources at conferences and workshops. Warwick et al. (2008b) noted that this type of “marketing” was a very new area of activity for many academics. A related if not unexpected finding was that the most well-used resources tended to be long-lived. This was not necessarily an indicator of successfully meeting user needs. “The persistent use of older digital resources, even when newer, perhaps better ones become available,” Warwick et al. put forward, “may be explained by a commercial phenomenon known as ‘switching costs’” (Warwick et al. 2008b). In other words, users often remain loyal to a particular resource because the effort involved in switching to a new tool is too great.

Another area explored by Warwick et al. (2008b) was the amount of user contact in which successful projects were engaged. They found that few projects had “undertaken any type of user testing” or maintained any formal contact with their users. In addition, most projects had little if any understanding either of how their resources were used, or how often. All projects, however, were interested in how their projects were being used and had made some efforts in this area. The most common method, according to Warwick et al., was the idea of “designer as user,” in which most PIs assumed that their subject knowledge meant that they understood the needs of users and thus could “infer user requirements from their own behaviour.” While Warwick et al. (2008b) granted that some user needs might be discovered in this manner, the only real way to discover user needs, they contended, is to ask or study the users themselves. In addition, they reported that some projects “discovered that their audience consisted of a much more diverse group of users than the academic subject experts they had expected” (Warwick et al. 2008b). A related problem was the lack of nonexpert documentation at many projects. In the end, only two projects had conducted any type of direct user testing. As with dissemination and marketing, Warwick et al. (2008b) commented that user testing is not a traditional skill of humanities scholars.

The last major issue determining the success of a digital humanities resource was sustainability. At the time their research was conducted (2007–2008), the AHDS still existed and Warwick et al. stated that many projects were either archived there or backed up on an institutional repository. Despite this archiving, Warwick et al. (2008b) concluded that this older model of final deposit was inadequate since many resources were almost never updated and typically the data were not independent of the interface. In many cases this resulted in digital projects that, despite having large amounts of money invested in their creation, were fundamentally unusable after a few years. This was a problem for which they had few answers and, as they concluded, “Sustainability remains an intractable problem given the current models of funding and archiving digital resources.”

Their final recommendations for the long-term usability of digital resources were for projects to create better documentation; develop a clear idea of their users, and consult and stay in contact with them; develop effective technical management and support; actively disseminate results; and, for sustainability, “maintain and actively update their interface, content and functionality of the resource.” All these recommendations are relevant to the development of any lasting infrastructure for digital classics resources as well.

“Traditional” Academic Use of Digital Humanities Resources

Several studies have recently addressed how “traditional” (i.e., not self-identified e-humanists or digital humanists) academics and students have made use of digital resources. This section examines the larger findings of this research. 578

The CSHE Study

One of the largest studies to approach the question of educator use of digital resources in the social sciences and humanities was conducted by the CSHE between 2004 and 2006 (Harley et al. 2006b). This study pursued three parallel research tracks: (1) conducting a literature review and discussions with stakeholders to map out the types of digital resources available and where the user fit within this universe; (2) holding discussions (focus groups) with and conducting surveys of faculty at three types of institutions in California, as well as with the users of various listservs, regarding how and why they used or did not use digital resources; and (3) creating a methodology for how user-study results might be shared more usefully by interviewing site owners, resource creators, and use researchers. At the same time, Harley et al. argued that the differences between individual disciplines, as well as the varying types of users, needed to be carefully considered. “The humanities and social sciences are not a monolith, nor are user types,” Harley et al. (2006b) explained; “we contend that a disaggregation of users by discipline and institution type allow us to better understand the existing variation in user and non-user behavior.” The authors also insisted that there had likely been no “coordinated conversation about user research” across the many types of digital resources available because of the immense variety of such resources found online.

578 A major new study released in April 2011 by the Research Information Network analyzes humanities scholars' use of two major digital resources, the Old Bailey Online (http://www.oldbaileyonline.org) and the Digital Image Archive of Medieval Music (http://www.diamm.ac.uk), and also offers case studies of digital resource use by specific humanities departments (Bulger et al. 2011).

As they began to study the types of digital resources available in order to create a typology of resource types (e.g., data sets, videos, maps, electronic journals, course materials), they quickly discovered that the number of resources available was ever growing and that digital resources were being created in different environments by many types of developers. In addition, they noted that users often defined resources much more granularly than did their creators. The project also defined three major roles for analysis among website “owners”: resource aggregators, developers of tools, and content creators and owners.

The major part of this study consisted of speaking with and surveying faculty in different disciplines 579 and at several kinds of institutions to find out why they did or did not use digital resources as part of their teaching. Harley et al. (2006b) found that “personal teaching style” and philosophy greatly influenced resource use and that there were a large number of user types, from nonusers (a diverse group in itself) to novice users to advanced users. Images and visual materials were listed as the most frequently used resources, but news websites, video, and online reference sources were also used quite heavily. Faculty used Google as their primary means of finding resources; the second most frequently used source was their own “collections” of digital resources.

The reasons for use and nonuse of digital resources were quite diverse, according to Harley et al. (2006b). Major reasons faculty used digital resources were to improve student learning, to integrate primary sources into their teaching, to include materials in teaching that would otherwise be unavailable, and to integrate their research interests into a course. The preeminent reason for nonuse was that digital resources did not support their approach to teaching. Additional reasons included lack of time, resources that were difficult to use and, notably, the inability to “find, manage, maintain, and reuse them in new contexts.” The importance of personal digital collections was illustrated again, and Harley et al. (2006b) asserted that “many faculty want the ability to build their own collections, which are often composed of a variety of materials, including those that are copyright protected.” Thus faculty demonstrated a desire not just for resources that were easier to find and use but also for ones that were “open” in the sense that they could at least be reused in new contexts.

Interviews with 13 providers of “generic” online educational resources (OERs) 580 and two other stakeholders, covering what types of user research they had engaged in and what they knew about their users, revealed that there were no common metrics for measuring use or defining groups of users, but that most projects assumed faculty were their main user group. Their findings largely confirmed those of Warwick et al. (2008b), namely, that little if any comprehensive or systematic research had occurred:

The interview analyses suggested that there were no common terms, metrics, methods, or values for defining use or users among the targeted projects. Yet digital resource providers shared the desire to measure how and for what purpose materials were being used once accessed; few providers, if any, however, had concrete plans for undertaking this measurement in a systematic way (Harley et al. 2006b).

579 According to the full report (Harley et al. 2006a), 30 faculty from classics participated in this study: 11 faculty members (2.4 percent of the total) answered the H-Net survey, and 19 faculty (2.3 percent of the total) from California universities participated in the focus and discussion groups (pp. 4-15). The only other major finding of this report in regard to classicists was that they tended to use fewer digital resources in their teaching than did scholars in many other disciplines (pp. 4-56).

580 Resource providers included MIT OpenCourseWare (OCW) (http://ocw.mit.edu/), JSTOR, and the National Science Digital Library (http://nsdl.org/).

Several resource providers were exploring various ways of engaging with and building a user community as one potential solution to long-term sustainability, a major theme of the interviews. “Our research revealed that community building is important to digital resource providers,” Harley et al. (2006b) reported, “and many were exploring tools to enable the development or support of user ‘communities.’ Some suggested that community contributions might hold a key to sustainability challenges.”

After these interviews were conducted, a two-day workshop was held with 16 experts to discuss OERs and how exploring user behavior might be linked to larger policy or planning issues. Four broad topics were covered by this meeting: (1) defining a common framework to codify different “categories of content, users, uses, and user studies”; (2) the practicality and expense of different types of user studies and methods (e.g., what types of common questions to ask, what level of research [formal, informal] must be conducted); (3) questions of user demand and long-term sustainability (curricular, technical/infrastructural, organizational, and financial); and (4) the larger research questions that would need to be addressed. The topic of sustainability brought up the largest number of complicated issues.

One question that elicited particularly diverse responses in terms of sustainability was whether OER sites should “adapt their content or services to unintended users.” “To some participants, unintended use is an opportunity for creative reuse,” Harley et al. (2006b) stated, “while many believed that an OER site should not or could not change course to serve an unintended audience.” This question was tightly linked with the mission of different OERs and their financial models. In terms of technical/infrastructural sustainability, many participants proposed that OERs, particularly open-access ones, need a “common place where they can be reliably housed, organized, searched, and preserved,” and that “centralized OER repositories” might serve as one answer. Various models for how such an OER repository might be developed were discussed, and a number of participants agreed that federated searching across different repositories would be a “user-friendly” start.

While Harley et al. (2006b) offered a number of conclusions regarding their research findings, a particularly significant one was the great desire of faculty to “build their own re-aggregated resources,” that is, to be able to blend materials from their own personal digital collections with other digital resources they had found online. The limitations of classroom technologies, the vast array of complicated and typically noninteroperable tools that were available for “collecting, developing, managing, and actually using resources,” and the inability to integrate many resources with standard learning management systems were cited as significant challenges. Future digital tool developers, Harley et al. concluded, would need to address a number of issues, including the difficulty or “impossibility” of reusing objects that are bundled or “locked” into static or proprietary resources, complex digital rights issues, uneven interface design and “aesthetics,” and growing user demands for resource “granularity” (e.g., being able to find and reuse one image, text, etc., within a larger digital resource).

The LAIRAH Project

While the study conducted by Harley et al. largely focused on how faculty used digital resources in their teaching, other research by the LAIRAH project analyzed academic use of digital resources through the use of “quantitative Deep Log Analysis techniques” and qualitative user workshops. One core goal of their research was to obtain detailed user opinions regarding digital resources and what factors inhibited their use (Warwick et al. 2008a). For their analysis they used logs from the AHDS servers, the Humbul Humanities Hub (now Intute), and Artifact. Unfortunately, they were not able to use “individual logs from the servers of digital humanities projects” because of time constraints. In addition to using log data, they mounted a questionnaire and held a workshop with users regarding neglected resources to see whether they could determine why resources were not being used. One significant difficulty that they encountered was extracting log data, even when using the logs of large, government-funded repositories. Nonetheless, the log data from the AHDS central site did show which links were followed on the site, and it was thus possible to generate a list of pages that visitors actually used. Resources about warfare were quite popular, as were census data and family history. One insight they offered was that resources that were not particularly well named were seldom used, and they advised digital project creators to use simple titles and good descriptions that made it clear what a resource was about.
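The kind of page-level tally described above, a list of pages that visitors actually requested, can be illustrated with a minimal sketch. This is not the LAIRAH project's Deep Log Analysis method, which the source does not detail; it assumes server logs in the widely used Common Log Format, and the function name, sample log lines, and paths are all hypothetical.

```python
import re
from collections import Counter

# Matches the request path and status code of a Common Log Format line.
# Assumption: logs follow this format; the actual AHDS logs may have differed.
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def count_page_views(lines):
    """Tally successful (2xx) requests per path, skipping errors and malformed lines."""
    views = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status").startswith("2"):
            views[match.group("path")] += 1
    return views

# Hypothetical sample lines, invented for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2006:13:55:36 +0000] "GET /collections/census HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2006:13:56:01 +0000] "GET /collections/census HTTP/1.0" 200 2326',
    '192.0.2.3 - - [10/Oct/2006:13:57:12 +0000] "GET /collections/warfare HTTP/1.0" 200 5120',
    '192.0.2.4 - - [10/Oct/2006:13:58:45 +0000] "GET /missing HTTP/1.0" 404 210',
]

if __name__ == "__main__":
    # List pages from most to least requested.
    for path, hits in count_page_views(SAMPLE_LOG).most_common():
        print(path, hits)
```

A tally like this shows only which pages were requested, not why; that limitation is one reason the log analysis was paired with a questionnaire and a workshop.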

Since Warwick et al. (2008a) wanted to get a broad range of answers in terms of digital resources, they did not offer a definition of resources in their questionnaire and instead asked participants to list their three favorite resources. They learned that most of the users they surveyed considered digital resources “not to be specialist research resources for humanities scholarship, but generic information resources.” The most popular resource listed was the university library website; this was followed by Google. Many resources were simply classified as “other,” and most were “information resources or gateways, archives and subject portals” as well as subject-based digital libraries. This finding sharply contradicts the belief of many digital project creators that the specialist research tools they create for faculty will be heavily used, as Warwick et al. explain:

It therefore appears that most of our users regard digital resources primarily as a way to access information, which in the analogue world might be compared to the library or archive, rather than specialist research resources which we might compare to a monograph or a literary text for primary study. It is significant that most resources fall into the ‘other’ category, which suggests that there is a very wide range of resources being used, and very little agreement as to which are most useful (Warwick et al. 2008a).

Dalbello et al. (2006) observed similar results: classicists who cited electronic resources never used them as research or primary resources, or at least did not admit to doing so.

The last component of the user research conducted by Warwick et al. (2008a) was a workshop about neglected digital resources. They found it was very difficult to recruit participants, and the final group was largely composed of historians, archaeologists, graduate students, and individuals who worked in humanities computing. They found that many of their participants, particularly those from a more traditional humanities background, were generally unwilling to “commit themselves” regarding the quality and usefulness of resources, especially if these resources were outside of their discipline. Warwick et al. reasoned that this reluctance was perhaps because most “specialist digital humanities research resources” were very unfamiliar to most humanities academics. In general, participants were fairly critical of the resources they were asked to evaluate, and there was no “universal enthusiasm” regarding “content, interface and ease of use.”

Warwick et al. (2008a) offered a number of general recommendations as a result of this research and strongly argued that publicly funded digital research projects should have to maintain log data and make it available for at least three years. They reiterated that clear and understandable naming and describing of projects was very important for ensuring maximum impact. Additionally, since general information sources were largely preferred over specialist research sources, they stated that funding might most usefully be given to projects that create large collections of information resources for reference. Humanities scholars demanded content and interfaces of the highest possible quality (whether a resource was commercial or free), and the authors suggested that the creators of specialist digital resources spend more time on interface design and user testing of those designs. While their findings did illustrate that academics want to both find and use digital resources easily, they also articulated that “the kind of scholar who is likely to know they need such a resource and persist until they find it is the kind of early adopter who is already using specialist digital resources” (Warwick et al. 2008a). They thus concluded with a call for the producers of digital resources to focus on drawing in more traditional humanities users.

The RePAH Project

More focused research in terms of academic digital resource use has been conducted by the recently concluded RePAH (Research Portals for the Arts and Humanities) project, which carried out a domain-wide survey of how arts and humanities researchers might use an online research portal in their work.

A recent article by Brown and Greengrass (2010) has presented an overview of the RePAH project’s<br />

findings. To set the context for RePAH, Brown and Greengrass outlined a brief history of e-humanities<br />

research, funding, and infrastructure within the United Kingdom over the past 10 years. Brown and<br />

Greengrass stated that a major change in strategy had occurred, namely, that an original emphasis on<br />

investing in access to resources had shifted to concerns about how these resources were used, and<br />

finally to questions about the “skill levels and attitudes towards use of ICT in arts and humanities<br />

research.” At the same time, the authors noted that the arts and humanities involve a large number of<br />

disciplines with many different research traditions, and that “it can not be assumed that innovations in<br />

one discipline necessarily meet the requirements of others” (Brown and Greengrass 2010).<br />

A research portal, the authors explained, focused on “federating distributed sites of information,” and<br />

the RePAH project sought to explore how researchers across disciplines might use such a tool. While<br />

Warwick et al. (2008b) earlier reported that there had been no systematic examinations of the scholarly<br />

use and nonuse of digital resources, Brown and Greengrass confirmed a similar lack of studies<br />

involving the general ICT use of arts and humanities scholars, as first noted by Toms and O’Brien:<br />

Hitherto there has been no sector-wide comparative study to ascertain how researchers are<br />

using ICT and what they perceive their future needs to be. Consequently what is needed in<br />

terms of an ICT infrastructure to support Arts and Humanities research is not well understood.<br />

Are there, for example, significant differences in the ways in which researchers from different<br />

disciplines use ICT in their research? Are some domains more technically advanced than<br />

others? How widespread is ICT based research across the sector? Can a single portal concept<br />

meet the needs of the whole community? (Brown and Greengrass 2010).<br />

Such questions are often difficult to answer, Brown and Greengrass acknowledged, and the fact that<br />

responsibility for funding this kind of infrastructure is typically split across multiple agencies makes<br />

cross-disciplinary research even more problematic.<br />

In an attempt to answer these questions, the RePAH project broadly defined their users as the “arts and<br />

humanities research community” and specifically wanted to ascertain information about these users’<br />

“information discovery strategies,” their Internet usage, their awareness and opinions regarding various<br />

online services such as repositories and portals, and their future expectations. They adopted a<br />

multipronged approach that included an online questionnaire (with almost 150 respondents), focus<br />



groups, analysis of log files (the same ones used by the LAIRAH project for their research), Delphi 581<br />

forecasting, and user trials. 582 In the final step of their process, RePAH used the results of the user<br />

trials, in which participants were presented with a range of possible portal features and a number of<br />

demonstrators, to “cross-check” the earlier results of the focus groups, questionnaires, and Delphi<br />

exercise.<br />

A number of major findings emerged from this research that illustrate larger themes that need<br />

to be considered when designing an infrastructure for any discipline within the arts and humanities,<br />

such as classics. Brown and Greengrass labeled one major theme “pull vs. push.” While 60 percent of the respondents<br />

to the online questionnaire considered digital resources to be essential to their work, they also saw the<br />

web as a source of data to be used rather than “as a repository into which they could push their own<br />

data.” While the collection and analysis of information was important to almost half of respondents,<br />

data storage and archiving were not given a high priority.<br />

A second major finding that emerged from the focus group discussions was that arts and humanities<br />

scholars considered the web to have three major benefits: speed and efficiency, timeliness of resources,<br />

and new ways of working. When asked about the shortcomings of the web, respondents gave more<br />

diverse responses. While most focus group participants reported being satisfied with the digital<br />

resources they had available, they also overwhelmingly wanted greater online access to the subject<br />

literature of their field, particularly journals. Participants were also concerned about the large number<br />

of low-quality search results they often obtained on the web, and many wanted “tools for aggregating<br />

data for searching and analysis and better quality control and ranking of results” (Brown and<br />

Greengrass 2010). Despite wanting better quality control, participants were suspicious about who<br />

would undertake the quality assurance and wanted to have an unmediated role in the process. Other<br />

major frustrations with the web included the lack of interoperability between digital libraries,<br />

increasing access restrictions, and intellectual property rights.<br />

The RePAH questionnaire had also focused on the types of resources that scholars used. Brown and<br />

Greengrass reported that a wide range of resources was used and there was little commonality between<br />

disciplines. Generic sites such as library websites were the most commonly cited, with Google<br />

(including Google Scholar, images, etc.) and JSTOR as the next two most frequently cited resources.<br />

Interestingly, Brown and Greengrass also commented that in certain disciplines, such as classics and<br />

ancient history, Google was listed as the “central tool for acquiring digital information,” perhaps<br />

because of the relatively large number of digital resources in classics. In general, however, the largest<br />

category of resource cited was “other.”<br />

While user trials elicited some diverse responses to different potential portal features, Brown and<br />

Greengrass nevertheless stressed how “the overarching message that came out of the user trials was<br />

they wanted simple tools that required little or no input of time or personal engagement.” Participants<br />

highly valued “resource discovery” 583 and filtering tools that provided greater control over web-based<br />

resources but also wanted tools that were highly customizable. The most important online resources<br />

were journal articles and other bibliographical resources. Workflow-management tools such as<br />

sophisticated bookmarking features and automated copyright management were also highly desired<br />

features. In addition, while participants wanted automatic information-harvesting tools to be used<br />

against digital content to which they wanted access, the use of these tools against their own “content”<br />

was considered “problematic.” Most collaborative tools, such as social bookmarking, collaborative<br />

annotation of digital resources, shared document editing, and “contributing to the authentication of<br />

digital content,” fell in the middle range of desired features. Finally, advanced communication tools<br />

(e.g., real-time chat and video conferencing) were not considered to be highly valuable, and most<br />

participants were satisfied with existing systems, such as e-mail.<br />

581 Brown and Greengrass explain that Delphi is “a structured process for collecting and distilling knowledge from a group of experts by means<br />

of a series of questionnaires interspersed with controlled opinion feedback” and it was used to filter ideas from the focus group by asking experts to rank<br />

ideas in terms of their importance to their future research.<br />

582 According to Brown and Greengrass, “user trials are a technique for gaining user responses to design ideas, working from mock-ups or simulations.”<br />

583 This finding is somewhat ironic as the major resource discovery service in the United Kingdom, Intute, has recently announced that its JISC funding<br />

will end in July 2011 (http://www.intute.ac.uk/blog/2010/04/12/intute-plans-for-the-future-2010-and-beyond/).<br />

Although Brown and Greengrass believed that the RePAH project had illustrated that digital resources<br />

were quite important for arts and humanities researchers, their impact on personal archiving and<br />

publishing practices was still very limited:<br />

… despite its impact on research, ICT has not fed through to the habits and procedures for<br />

personal digital data archiving, and has not yet had a substantial impact on the means of<br />

scholarly communication in the arts and humanities. In short, it has not yet profoundly<br />

influenced the way in which arts and humanities publication is conceived (Brown and<br />

Greengrass 2010).<br />

Similar results were observed by Dalbello et al. (2006), and Brown and Greengrass also confirmed the<br />

finding of Toms and O’Brien of the solitary scholar working on a journal article for a printed<br />

publication. Additionally, although many scholars wanted greater access to, and quality assurance of,<br />

resources, many distrusted any portal features that either automatically harvested their content (such as<br />

CVs on a research profile page in a portal) or monitored their activity (e.g., observing the electronic<br />

resources they selected from a portal page), even if such features enhanced the performance of the<br />

system for the whole community. This hesitance had a number of causes, according to<br />

Brown and Greengrass, including the “individualistic nature of the community” and personal-privacy<br />

fears. Brown and Greengrass emphasized, however, that such attitudes, along with limited technical<br />

understanding, were important for portal builders to consider in any future design:<br />

… the concerns raised here suggest a lack of awareness about the extent to which actions are<br />

already monitored and recorded. When this is coupled with the strongly expressed preference<br />

for simple tools that require little or no learning and their expressions of frustration at the lack<br />

of sophistication of search engines (a frustration that was often a function of their lack of<br />

familiarity, or perhaps understanding, of Boolean search parameters permitted in Google’s<br />

advanced search facilities), a picture emerges of researchers with relatively limited technical<br />

skills. Our focus group participants reported levels of formal initiation or training in the digital<br />

resources that they used varying from little to none. The implication here is clearly that future<br />

portal developments should assume only a very basic level of ICT competence (Brown and<br />

Greengrass 2010).<br />

The final issue raised by Brown and Greengrass was the importance of access, a theme that ran<br />

through all of their results, and this access was primarily to online journals. While arts and humanities<br />

researchers do desire more sophisticated research infrastructures, Brown and Greengrass concluded,<br />

they mostly want open access to content with simple search engines that can nonetheless guarantee<br />

high-quality, relevant results. Portal designers should also assume low levels of IT competence and<br />

provide basic interfaces that can be customized so that users can add links or feeds to their commonly<br />

used sources. Brown and Greengrass also maintained that arts and humanities researchers in general felt no need for<br />



tools that support collaborative working, online archiving, or electronic publishing, and were unwilling<br />

to support tracking systems even if they could support “powerful quality assurance systems.”<br />

The TIDSR Study<br />

As has been illustrated by Brown and Greengrass (2010) and Warwick et al. (2008b), little<br />

comprehensive research has systematically explored the impact and actual usage of digital humanities<br />

resources. Meyer et al. (2009) report on a study that examined the use of five<br />

digitization projects within the United Kingdom using a variety of measures to obtain a more nuanced<br />

picture not just of these projects but of “digitized material in general.” This JISC-funded research was<br />

undertaken to promote standards and knowledge sharing among projects in terms of how to measure<br />

the usage and impact of their resources. One major outcome of their work was the creation of a “Toolkit<br />

for the Impact of Digitised Scholarly Resources” (TIDSR). 584 While Meyer et al. acknowledged much<br />

was learned from both the LAIRAH and CSHE studies, they suggested that both these studies were<br />

missing one part of the larger picture:<br />

One major missing part, however, is any concrete way for collection managers, developers, and<br />

funding bodies to attempt to understand and collect data for measuring impact from the onset of<br />

a project and throughout the life-cycle of a digitisation effort. The toolkit is an attempt to fill<br />

this gap (Meyer et al. 2009).<br />

For this study they chose five digital projects 585 and used a variety of methods, including both quantitative<br />

measures (webometrics, bibliometrics, log file analysis) and qualitative methods (content analysis,<br />

focus groups, and interviews). 586 Several important themes emerged from the interviews they<br />

conducted with project creators, including the importance of close contact with users when developing<br />

a project, the fact that many digitized projects had little or almost no contact with the “custodians of<br />

the original content” that had been digitized, and, interestingly, a “discrepancy between intended usage<br />

and perceived success” (i.e., many project creators discovered that the uses to which their collections<br />

had been put were very different from those they had envisaged). Interviews with users were used to<br />

gauge the levels of project impact, and the authors noted several trends, including that the quality of<br />

some undergraduate dissertation work seemed to improve through contact with primary sources, that<br />

some new types of research were being presented at conferences (e.g., increasingly quantitative<br />

research was found in many conference papers for the fields served by the relevant digital humanities<br />

resources), and that some new types of research were being attempted. At the same time, quantitative<br />

research projects with digital data had some problems, such as those reported by one researcher who<br />

stated that keyword searching was still very unreliable for digital newspaper data where the text had<br />

been created by OCR. While there were still far too many false negatives and positives, the researchers<br />

concluded that searching the digitized newspapers was far superior to using microfilm.<br />

584 http://microsites.oii.ox.ac.uk/tidsr/<br />

585 The five resources were Histpop (http://www.histpop.org), 19th Century British Newspapers<br />

(http://www.bl.uk/reshelp/findhelprestype/news/newspdigproj/database/paperdigit.html), Archival Sound Records at the British Library<br />

(http://sounds.bl.uk), BOPCRIS (http://www.bl.uk/reshelp/findhelprestype/news/newspdigproj/database/paperdigit.html), and Medical Journals Backfiles<br />

(http://library.wellcome.ac.uk/backfiles).<br />

586 The research presented in Meyer et al. (2009) focuses on the results of the qualitative measures. The full report can be viewed at<br />

http://microsites.oii.ox.ac.uk/tidsr/system/files/TIDSR_FinalReport_20July2009.pdf<br />

In addition to the interviews, Meyer et al. held focus groups with two groups of students. These<br />

revealed that the students were generally enthusiastic about digital resources and that undergraduate<br />

students used digitized collections on a regular basis, both ones recommended by their tutors and ones<br />

that they sought out independently. Somewhat different results were reported by postgraduate and<br />

postdoctoral students, who were also using resources that had been recommended but were uncertain<br />

about where the best recommendations would come from and were far more skeptical about the quality<br />

of resources that they discovered outside of library catalogs and finding aids. 587 One behavior reported<br />

by both groups of students was a general unwillingness to cite digital material 588 that they had used:<br />

Both groups were unlikely to cite the digital material if there was a paper or analogue citation<br />

available, although for different reasons. The undergraduates were concerned that they would<br />

be perceived as having not completed ‘proper’ research unless they cited the analogue<br />

resources, whereas the postgraduates and postdoctoral researchers were more concerned about<br />

giving stable citations that future researchers would be able to trace (Meyer et al. 2009).<br />

The other major reasons students gave for not using digital resources were a lack of trustworthiness in the material, the uncertain persistence of a digital resource, and other general vetting concerns. Many students also stressed the importance of branding by trusted institutions to promote use of a digital resource. The need for stable citations to electronic resources was also illustrated by Bodard (2008) in his discussion of the creation of the Inscriptions of Aphrodisias website.

To better determine the actual impact and use of a digital resource, Meyer et al. had a number of suggestions. To begin with, they argued that digital projects should measure impact from the beginning of the project, ideally before the website has even been designed. While they believed that impact should be measured regularly, they also advised that projects should not get “bogged down” by planning overly detailed studies. In addition, they proposed that sustainability strategies should be built in from the beginning. They also recommended that projects should make efforts to secure follow-up funding to measure impact and actively promote their projects through blogs, publications, conference reports, etc., as well as make sure that they are included in trusted gateways (such as library information portals). In the long run, in terms of measuring impact, they urged that all projects consider multiple sources of evidence, examine them from different perspectives, and use a variety of metrics. On a practical note, they pointed out that projects that do not maintain stable and easy-to-cite URLs make it difficult for scholars to reference them in their publications. Lastly, they recommended reaching out to the next generation of scholars. “There are important generational shifts taking place: younger researchers are developing research habits that will become mainstream as they replace their elders,” Meyer et al. concluded, while also emphasizing that “these so-called digital natives are a natural constituency for digital collections, so ensure that your resources are available to them, usable by them, and promoted to them” (Meyer et al. 2009).

587 This behavior is in contrast to that of the digital humanist researchers observed in Toms and O’Brien, who relied largely on Google or other search engines to find resources of interest, but supports the findings of Warwick et al. (2008a) and Brown and Greengrass (2010) that academics and students valued resource-discovery tools that helped them identify reliable digital resources. In contrast, a recent survey of more than 3,000 faculty by the Ithaka group indicated that faculty were increasingly using discovery tools other than “library specific starting points,” and that only 30 percent of humanities faculty still started their search for digital materials using a library discovery tool (Schonfeld and Housewright 2010).

588 Similar citation behavior was reported by Sukovic (2009), where literary scholars and historians were often unwilling to cite digital resources (and thus often cited the analog source even when they had only used the digital version) for various reasons, including fear that their colleagues did not approve of using such resources and that the referencing of digital resources did not fit within the academic practice of their discipline. She concluded “the multifarious nature of scholars’ use of e-texts, revealed in the study, was not reflected in citation practices.”



OVERVIEW OF DIGITAL CLASSICS CYBERINFRASTRUCTURE

Requirements of Cyberinfrastructure for Classics

A number of recent research studies have explored some of the potential needs of a cyberinfrastructure for classics, including the development of working paper repositories, the creation of new collaborative models for scholarship and teaching, the requirement of open data and collections, and the large variety of services that will be necessary.

Open-Access Repositories of Secondary Scholarship

Two recent studies have focused on the potential of open-access repositories for classical studies (Ober et al. 2007, Pritchard 2008). Ober et al. discussed the creation of the open-access working papers repository, the Princeton Stanford Working Papers in Classics (PSWPC), 589 and examined the potential benefits of electronic publishing and the relationship of “working papers” to traditional publishing. The PSWPC is a web-based repository that is open to the faculty and graduate students of Stanford and Princeton, and the papers are not formally peer reviewed. Nonetheless, many contributors have put up preprints or working papers that eventually were formally published. The creation of this repository has raised a number of issues regarding long-term access and preservation, which might be better guaranteed by a commercial archive. The authors define three processes as the traditional roles of scholarly publishing (making research public, certification, and archiving) and propose that certification, or peer review, is the most important role of traditional publishing.

While the authors acknowledge that the only assurance of value of the working papers in the PSWPC is the academic standing of the two classics departments at Princeton and Stanford that host the repository, they point out that a large amount of traditional publisher peer review is relatively undemanding. 590 They suggest that a distinction needs to be made between “preprint/working paper” archiving and postprint archiving, or the archiving of a paper that has already been formally published. One disappointment they noted was that neither the APA nor the AIA had yet created large working paper repositories for the entire discipline. Ober et al. offer a number of recommendations for humanities scholars in working toward open access. The first is to promote pre- and postprint archiving to the largest extent possible; the second is to get the larger professional organizations involved. The authors also suggest that academic authors fight harder to retain their copyrights, and that all institutions in higher education should “move to greater flexibility in considering what counts as ‘publication’ in the new electronic media” (Ober et al. 2007).

A recent paper by David Pritchard provided an external look at the PSWPC and explored the large issues of open access, cyberinfrastructure, and classics. The PSWPC had been far more successful than anticipated, reporting almost 2,000 downloads a week in September 2007. Pritchard suggested that the PSWPC fulfilled two important scholarly tasks: (1) making a far greater wealth of classical scholarship available to a wider audience; and (2) helping authors solicit feedback and find a greater audience for their work. He proposed that there were four reasons for the success of the PSWPC. First, it allowed specialists to share research. The second reason was the already “entrenched use of

589 http://www.princeton.edu/~pswpc/index.html

590 A recent blog post by Kent Anderson at “The Scholarly Kitchen” has provided an interesting look at the varying processes and quality of peer review by different commercial publishers (http://scholarlykitchen.sspnet.org/2010/03/30/improving-peer-review-lets-provide-an-ingredients-list-for-ourreaders/). In addition, a recent article by Bankier and Perciali (2008) has suggested that the creation of peer-reviewed open-access journals may help revitalize digital repositories and provide a natural publishing outlet for universities.



computers by ancient historians and classicists,” a phenomenon he noted surprised many nonclassicists. The third was the demand for open-access research in general, and the fourth was the high quality of the papers. Pritchard did suggest, however, that the PSWPC might improve by including better metadata and making the repository harvestable through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Like Ober et al., Pritchard recommended that authors should seek to archive their pre- or postprints, but his views on cyberinfrastructure were limited to the creation of institutional repositories or to more departmental collections of working papers.
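As a rough illustration of what making a repository "harvestable" through OAI-PMH involves, the sketch below builds a ListRecords request and reads titles out of a response. The verb, the `metadataPrefix` parameter, the `oai_dc` format, and the XML namespaces come from the OAI-PMH 2.0 specification; the base URL, the record identifier, and the sample response are invented for illustration and do not describe the PSWPC or any real endpoint.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and by Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional selective harvesting
    return base_url + "?" + urlencode(params)


def extract_titles(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iter("{%s}record" % OAI_NS):
        identifier = record.findtext(
            "{%s}header/{%s}identifier" % (OAI_NS, OAI_NS)
        )
        title = record.findtext(".//{%s}title" % DC_NS)
        pairs.append((identifier, title))
    return pairs


# Hypothetical response, hand-written for this example.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:paper-1</identifier>
        <datestamp>2007-09-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical working paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would also need to fetch the URL over HTTP and follow resumption tokens for large result sets; both are omitted here to keep the sketch short.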

Open Access, Collaboration, Reuse, and Digital Classics

In addition to open-access repositories of scholarly publications, 591 the need for greater openness in terms of data, collections, methodologies and tools, and the new models of collaborative scholarship that such openness might support have received growing attention. As this review has illustrated, a number of projects, such as the archaeological projects found in Open Context, the Perseus Digital Library, Pleiades, and the Inscriptions of Aphrodisias, have made their texts and source code openly available. Similarly, many authors (Crane, Seales, and Terras 2009, Bagnall 2010, Bodard and Garcés 2009, Bodard 2009, Robinson 2009) have called for a new level of open access that not only provides access to scholarship and collections but also provides and promotes openness, in terms of actively supporting the reuse of source code, data, and texts.

Reuse is not always an easy task, however, as evidenced by both the HESTIA and LaQuAT projects. Furthermore, documenting reuse is also often difficult, as Gabriel Bodard explained in his introduction to a 2009 Digital Humanities panel on the reuse of open-source materials in ancient studies, for “there is often relatively little evidentiary support in the form of openly published datasets that have been independently tested or re-used by other projects” (Bodard 2009). This panel included the LaQuAT project, the Homer Multitext, and Pleiades, and Bodard listed several important insights, including the need for open licensing in addition to making materials available online, the conflict between electronic publication as “resource creation” vs. “self-contained research output,” and the need to convince scholars of the advantages of publishing source code and methodologies and not just polished conclusions. Bodard complicated the idea of reuse by arguing that digital projects need to consider how to support more sophisticated reuse strategies, including the possibility that materials will be reused in unexpected ways, and determining not just how to enable improved access or a better interface to a collection but also how to allow the creation of new interpretations or aggregations of data.

Neel Smith has echoed this point and explained that the HMT project has chosen to use open-data formats and well-defined standards and has released all software under a GNU public license 592 not only to ensure sustainability and digital preservation but also to promote as much reuse of the material as possible. “From the outset, the Homer Multitext Project has been shaped by a sense of our generations’ responsibility, as we transform the Iliadic tradition into yet another medium, to perpetuate as completely as we can the tradition we have received,” Smith articulated; “we need to ensure that as we focus on the new possibilities of digital media we do not inadvertently restrict what future scholars and lovers of the Iliad can do with our digital material” (Smith 2010, 122). In terms of

591 While the largest number of open-access publications of classical scholarship are typically reviews, working papers and journal articles, several scholars have made copies of books available, including Gregory Crane, Thucydides and the Ancient Simplicity: The Limits of Political Realism, http://ark.cdlib.org/ark:/13030/ft767nb497/, and Gregory Nagy, Pindar’s Homer: The Lyric Possession of an Epic Past, http://www.press.jhu.edu/books/nagy/PH.html

592 GNU stands for “GNU's Not Unix” (http://www.gnu.org/gnu/gnu-history.html) and a GNU general public license is often called the GNU GPL.



freedom and reusability, Smith recasts Richard Stallman’s four kinds of freedom for free software 593 (the freedom to run, study, redistribute, and improve and release) in terms of the Homer Multitext. The freedom to run includes the ability to read a text or view an image; freedom to study includes the ability to see how resources are encoded; freedom to redistribute involves the ability to share and redistribute the digital objects; and freedom to improve and release includes the ability to edit and resample or redistribute texts and images.

In addition, the abstract object model of the HMT project has been translated into an application architecture that will ensure that the “functionality of Multitext applications can persist as easily as the data in our simple archival storage formats” (Smith 2010, 132). This has led the project to adopt four basic architectural principles: (1) to support reuse of code, application programming interfaces (APIs) were used for “distinct components of the system”; (2) “independent decoupled components” were used whenever possible; (3) all these components have been exposed to the Internet; and (4) all software has been released under a GNU license. As Smith succinctly concludes, “Taken together, these principles lead us to an architecture built on a suite of self-contained network services with explicit APIs, implemented in free software” (Smith 2010, 132).
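The first two principles (explicit APIs and independent, decoupled components) can be sketched in a few lines of code. The example below is only a hypothetical illustration of the general pattern and is not drawn from the HMT codebase: every class name, method name, reference string, and URL is an assumption made for this sketch. Principles (3) and (4), exposure over the Internet and licensing, are not shown.

```python
from typing import Dict, Protocol


class TextService(Protocol):
    """Explicit API for any component that can supply a passage of text."""

    def passage(self, reference: str) -> str: ...


class ImageService(Protocol):
    """Explicit API for any component that can locate a page image."""

    def image_url(self, reference: str) -> str: ...


class InMemoryTextService:
    """Hypothetical text component backed by a simple dictionary."""

    def __init__(self, passages: Dict[str, str]) -> None:
        self._passages = passages

    def passage(self, reference: str) -> str:
        return self._passages[reference]


class StaticImageService:
    """Hypothetical image component that maps references onto URLs."""

    def __init__(self, base_url: str) -> None:
        self._base_url = base_url.rstrip("/")

    def image_url(self, reference: str) -> str:
        return f"{self._base_url}/{reference}"


def facing_view(texts: TextService, images: ImageService, reference: str) -> dict:
    """Combine two independent services; it relies only on their APIs."""
    return {
        "reference": reference,
        "text": texts.passage(reference),
        "image": images.image_url(reference),
    }


if __name__ == "__main__":
    texts = InMemoryTextService({"1.1": "placeholder text for line 1.1"})
    images = StaticImageService("http://example.org/images/")
    print(facing_view(texts, images, "1.1"))
```

Because `facing_view` depends only on the two small interfaces, either component could be replaced (for instance, by a client for a network service) without changing the code that combines them, which is the kind of reuse the principles are meant to support.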

For many of the earliest digital classics projects, however, the primary concern was open access, a revolutionary move in itself at the time, and this was primarily defined as providing free access to scholarship on the web. One of the earliest projects to follow this model was the Bryn Mawr Classical Review (BMCR). 594 According to its website, BMCR “publishes timely reviews of current scholarly work in the field of classical studies (including archaeology).” BMCR began as a listserv in 1990, the first gopher site became available in 1992, and the current website emerged from a partnership with the Stoa Consortium. The entire archive of BMCR reviews (from 1990 onward) is available on this website and can be browsed by year, reviewer, or author. A large number of reviewers participate in the BMCR, and all reviews have stable URLs so that they can be cited and linked to easily. Since August 2008, the BMCR has also offered a blog 595 that publishes citation details and a link to reviews on the BMCR website as individual blog entries, so that users can subscribe to the blog and get updates of BMCR content through a blog reader program as well as leave comments on reviews. In addition, the BMCR provides a daily e-mail digest as another way of pushing out its content.

A larger undertaking that also focused on creating a collaborative community and new digital publishing opportunities is the “Stoa Consortium for Electronic Publication in the Humanities,” 596 often simply referred to as Stoa. This consortium was founded in 1997 by the late Ross Scaife and, according to its website, exists to serve a number of purposes: “dissemination of news and announcements, mainly via the gateway blog; discussion of best practices via discussion groups and white papers; and publication of experimental on-line projects, many of them subject to scholarly peer review.” The Stoa consortium states that “open access to networked scholarship” is one of its strongest principles. This is well illustrated by the large number of hosted projects at Stoa, including Ancient City of Athens (a photographic archive of archaeological and architectural remains of ancient Athens intended for students and teachers), 597 Ancient Journeys (an online festschrift), 598 Confessions 599 of Augustine (an online reprint of the text with a commentary by James J. O’Donnell),

593 http://www.gnu.org/philosophy/free-sw.html

594 http://bmcr.brynmawr.edu/

595 http://www.bmcreview.org/

596 http://www.stoa.org/

597 http://www.stoa.org/athens/

598 http://www.stoa.org/lane/

599 http://www.stoa.org/hippo/



Demos (a growing digital encyclopedia about Athenian democracy extensively cross-referenced with Perseus), 600 Diotima (an interdisciplinary resource on women and gender in the ancient world), 601 Metis (a repository of QuickTime movies of Greek archaeological sites), 602 and Suda Online (a collaborative online version of the Suda). 603 The website notes that many projects at the Stoa are closely linked with materials and tools from Perseus and that the Stoa is closely affiliated with the Digital Classicist website and community.

The Digital Classicist 604 website has been established as a “decentralised and international community of scholars and students interested in the application of innovative digital methods and technologies to research on the ancient world.” While this site is not officially hosted by any institution, it nonetheless serves as a web-based hub for communication and collaboration among digital classicists. Every summer the Digital Classicist hosts a series of seminars 605 at the Institute of Classical Studies in London, where practitioners present cutting-edge research on the use of computational methods in the study of antiquity. The largest component of the Digital Classicist website, however, is the wiki, which was created by Gabriel Bodard and other practitioners who were interested in the “application of the digital humanities to the study of the ancient world” (Mahony 2006). This site aimed from the beginning to bring scholars together to support collaborative working and thus formed partnerships with the Stoa Consortium, the CHS, and the Digital Medievalist blog. 606

The Digital Classicist wiki 607 was thus created as a central location to link the diverse scholarship in the various areas of ancient studies and, even more important, sought to “fill an important gap in the existing scholarly documentation by creating concise, reliable and critical guidance on crucial technical issues for scholars who may only be interested in a basic introduction to such issues with links to further resources if they wish” (Mahony 2006). The website also meets two other important needs of digital classicists, who, according to Mahony and Bodard, require a space for both building communities and working collaboratively:

The most striking and successful aspect of Digital Classics is its sense of community and collaboration. Digital Classicists do not work in isolation; they develop projects in tandem with colleagues in other humanities disciplines or with experts in technical fields…They do not publish expensive monographs destined to be checked out of libraries once every few years; they collect data, conduct research, develop tools and resources, and importantly make them available electronically, often under free and open license such as Creative Commons, for reference and re-use by scholars, students, and non-specialists alike (Mahony and Bodard 2010, Introduction, 2).

Anyone can join the Digital Classicist wiki by simply applying to one of the four editors for an account. One major component of the wiki is a directory of more than 90 digital classics projects organized alphabetically. Project descriptions vary in length, and not all descriptions are linked to active websites. Additionally, a FAQ lists 45 articles on best practices in digital classics and covers diverse topics, from “Concording Greek and Latin Texts” to “Sanskrit, typing and

600 http://www.stoa.org/projects/demos/home, and for more on Demos, see Crane et al. (2006).

601 http://www.stoa.org/diotima/

602 http://www.stoa.org/metis/

603 http://www.stoa.org/sol/

604 http://www.digitalclassicist.org/

605 http://www.digitalclassicist.org/wip/index.html

606 http://www.digitalmedievalist.org/

607 http://wiki.digitalclassicist.org/Main_Page



display.” “These guides to practice derive from the research experience of the practitioners involved,” Mahony (2011) explained, “and so should be considered research outputs in themselves.”

The website also includes a list of 33 tools, from “advanced imaging techniques” to “TimeMap,” and a brief list of selected electronic resources. This wiki provides an excellent means of entry for scholars first exploring the potential application of digital technology in their area of interest and provides many collaborative working opportunities. Although Mahony (2011) granted that collaborative work, such as joint publication and analysis, was not inherently new, he argued that tools such as the Digital Classicist Wiki enabled a new type of collaboration in that online material can be “searched, analysed and edited all in a very short time by a number of editors regardless of their physical location” and thus supported dramatic increases in scholarly productivity. He also emphasized that this new collaborative process is helping shift “academic culture” away from isolated scholars to a new model where no individual has total control or ownership of the research process. “This paper does not argue for the extinction of the lone scholar,” Mahony concluded, “but instead for a scholarly environment where both scenarios are recognized and valued” (Mahony 2011). 608

Although collaboration is a frequently lauded virtue of many digital projects such as Stoa and the Digital Classicist, one scholar quoted by Harley et al. (2010) stated rather bluntly that the level of collaboration in classics could vary depending on the discipline:

I would say collaboration is still relatively rare in the literary side of the classics. Not that many people will coauthor articles on Socrates. … This may be different for projects with more technical components, like archaeology, papyrology, or epigraphy…In those areas, there are a lot of projects that require collaboration…I would say that those particular fields—epigraphy, which is reading rock inscriptions, and papyrology, working with bits of papyri—are enormously collaborative…I also think classics, on the whole, has not done too badly in embracing other fields. … or at least certain practitioners of classics have gone out there and hooked up with colleagues in various disciplines and brought things back that have continued to expand the field or expand the range of things we can do (Harley et al. 2010, 102).

Indeed, this spirit of collaboration among papyrologists was called upon by Joshua Sosin in his recent talk at the Congress of the International Association of Papyrologists: “We must collaborate. We must share the workload. We must use common technical standards. We must do our work in the full sunlight of the web, and not in the black box of anonymity. We must leverage the strength of our community’s distinguishing spirit of collegiality” (Sosin 2010).

Harley et al. noted that while many scholars often worked independently in the close study of documentary remains, they frequently worked together in the creation of scholarly editions, exhibitions, and digital projects. On the other hand, the desire for collaborative work, even with documentary remains, has been illustrated by the VRE-SDM.

One of the largest and oldest truly collaborative digital classics projects is the Suda On Line (SOL), an online edition of the Suda, a “massive 10th century Byzantine Greek historical encyclopedia of the ancient Mediterranean world, derived from the scholia to critical editions of canonical works and from compilations by yet earlier authors.” 609 The purpose of SOL is to create a keyword-searchable and freely available XML-encoded

608 This argument echoes earlier conclusions by Toms and O’Brien (2008) that humanists need to work in greater collaboration.

609 Reference works seem to lend themselves to collaboration; for examples, consider “DIR: De Imperatoribus Romanis: An Online Encyclopedia of Roman Rulers and Their Families” (http://www.roman-emperors.org/), a collaborative encyclopedia that includes Roman and Byzantine biographies prepared by scholars and actively updated and linked to other classics sites, and Vicipaedia, a Latin Wikipedia (http://la.wikipedia.org/wiki/Pagina_prima).



database of this encyclopedia, complete with translations, annotations, and bibliography as well as automatically generated links to other electronic resources. More than 170 scholars from 18 countries have contributed to this project, and 25,000 of the 30,000 entries have been translated. As explained by Anne Mahoney (2009), this collaborative translation project has made the Suda text available to nonspecialists, and the on-line edition is far easier to use than the print edition. “As a collaboration,” Mahoney declared, “SOL demonstrates open peer review and the feasibility of a large, but closely focused, humanities project” (Mahoney 2009).

In her brief history of the SOL, Mahoney reported that it was one of the first collaborative encyclopedias and predated Wikipedia by several years. Many of the original encyclopedia entries in this unique reference work were filled with incorrect information, so each digital entry contains explanatory commentary and references. The SOL also serves as an important source of both fragmentary texts and text variants. “Its authors had access to some texts that are no longer extant, so there is material in the Suda that cannot be found anywhere else,” Mahoney noted; “they also had different editions of some of the texts we still read, so quotations in the Suda may reflect variants that are not preserved in our textual tradition” (Mahoney 2009).

The SOL was implemented online as a semistructured text, and the translation and editing of the encyclopedia are still ongoing. Prospective translators have to register and then ask to be assigned specific entries. While many translators work on this project, only a subset are designated editors who have the authority to change translations. All editors have significant ability in Ancient Greek, and many are college and university professors. The primary responsibilities of editors are to augment bibliographies, add commentaries, and verify that the translations of SOL entries are correct. The editorial mechanisms of SOL also serve, according to Mahoney, as a “type of peer review process.” Each entry credits not only its original translator but also the editors who have worked on it. This process allows the recognition of all scholars involved and serves as a clear contrast, Mahoney notes, to the blind reviewing found in many classics journals. The most critical point of this process, Mahoney asserted, is that it demonstrates the ongoing nature of scholarship:

Perhaps more important, SOL shows how scholarship progresses. A translation or commentary published in a book appears final and finished; readers are not given any clues about how it came into being. SOL's translations and commentaries show the process of successive refinements, demonstrating that first drafts are almost never perfect, and that even senior scholars' work can benefit from editorial attention (Mahoney 2009).

Interestingly, Arne Flaten (2009) made a similar argument about the Ashes2Art project: creating digital architectural models that represented uncertainty and various scholarly interpretations illustrated to students the ongoing nature of scholarly arguments.

Mahoney also pointed out that the SOL demonstrates how the digital environment often provides a far more natural way to exploit the knowledge found within a complicated reference work. While the SOL is not a completely new work, it is not simply a digital reproduction of the printed one. The environment of the web makes it possible to better illustrate the “commentary nature of the Suda,” as Mahoney details, because quotations can be identified and labeled, explicit references to primary source texts can be hyperlinked, and bibliographies can be expanded to include relevant modern scholarship. At the same time, translators can add links to any online resources they find useful, including ones far beyond the traditional bounds of classical scholarship. Mahoney concluded that the most important accomplishment of SOL was that this material was now available to a far wider



audience. Expanding the opportunities for collaboration beyond scholars to the interested public was considered important by a variety of projects.

While this section has largely focused on access in terms of digital scholarship and content that is freely available, another key component of access is the ability to find such materials in the first place. The nature of open-access digital collections in classics and the challenges of cataloging and collecting them have been addressed by Chuck Jones, director of the library at ISAW. 610 As his charge at ISAW is to “develop a library of the scholarly resources required to support a research and teaching program covering the ancient world from the Pillars of Hercules to the Pacific and from the emergence of civilized life until Late Antiquity,” he quickly realized that such a collection would have to be both physical and digital and that the digital component of the ISAW library would ultimately include resources developed both locally and elsewhere (Jones 2010). The ISAW is also seeking to develop a project it is calling the “Ancient World Digital Library” to integrate points of access and discovery to materials within tools scholars already use. 611

As chief editor of the Abzu 612 bibliography (started in 1994 and now part of ETANA), Jones described the changing nature of his cataloging work, from cataloging almost anything he could find to conscious collection making. Whereas once he also focused on including access to commercially licensed materials, he found that research library finding tools covered this area well. At the same time, he realized that “it was equally evident that the research library community was not yet coming to grips with providing suitable access to born-digital and open access digital publication which is freely distributed, requiring neither purchase nor license” (Jones 2010). So as the work of Abzu continued, Jones decided to create the blog “the Ancient World Online” 613 as a means of providing even faster access to new open-access publications on the Ancient World. While Jones had originally blogged solely under the larger Ancient World Bloggers Group, 614 he found that the sheer volume of resources available online necessitated the development of his own blog specifically dedicated to open-access sources about the Ancient World. For example, Jones maintains an alphabetical list 615 of open-access journals in Ancient Studies that currently has more than 600 titles. 616 This list also demonstrates that the idea of providing open access to scholarship is steadily gaining acceptance within the classical community.

The history and results of one of these open-access journals (Frankfurter elektronische Rundschau zur Altertumskunde [FeRA]) have been analyzed in detail in a recent Archaeolog blog post by Stefan Krmnicek and Peter Probst (Krmnicek and Probst 2010). 617 They explained that FeRA was created in 2006 and was intended as an online forum for young scholars in archaeology from all over the world to publish their work. FeRA is published three times a year and has included 36 contributions in German, English, and Italian. In analyzing their log files, they noted that only about 14 percent of their visits originated from academic networks, and while they acknowledged that many academics might utilize commercial ISPs to access FeRA, they believed these results suggested that “a fairly large group of people interested in the very specialized field of classical studies exists outside

610 http://www.nyu.edu/isaw/

611 In March 2011, an initial book viewing application for the Ancient World Digital Library was announced on the Ancient World Online blog (http://dlib.nyu.edu/awdl/).

612 http://www.etana.org/abzu/

613 http://ancientworldonline.blogspot.com/

614 http://ancientworldbloggers.blogspot.com/

615 http://ancientworldonline.blogspot.com/2009/10/alphabetical-list-of-open-access.html

616 This list also illustrates the importance of providing such a collection service: various keyword searches of the Directory of Open Access Journals (DOAJ) (http://www.doaj.org/) in November 2010 (ancient [6 journals], antiquity [9 journals], classics [6], classical [14]) turned up only 24 unique classics journals, including three of the most prominent, Didaskalia (http://www.didaskalia.net/journal.html), Electronic Antiquity (http://scholar.lib.vt.edu/ejournals/ElAnt/), and Leeds International Classical Studies (http://www.leeds.ac.uk/classics/lics/).

617 http://traumwerk.stanford.edu/archaeolog/2010/05/open_access_classical_studies.html



academia.” At the same time, they revealed that the number of manuscripts submitted by young
scholars had been far lower than expected and that the emphasis had shifted from articles to reviews.
They hypothesized that scholars who were not yet established in their fields were reluctant to publish
outside traditional print media, a supposition confirmed by the research of Harley et al. (2010). Thus
the challenges of traditional peer review and scholarly promotion meant that fewer younger scholars
were fully profiting from opportunities in digital publishing (e.g., reaching a greater audience, higher
research impact).

Finally, there are a number of prominent blogs that explore scholarship on the ancient world. As listed
above, the Ancient World Bloggers Group is a metablog with many bloggers and serves as “a place for
posts and discussion about blogging the Ancient World.” Two other prominent classical blogs are
Antiquist 618 and RogueClassicism. 619 While a full survey of blogs is beyond the scope of this review,
Tom Elliott has put together several feed aggregators 620 that bring together a large number of blogs,
including “Maia Atlantis: Ancient World Bloggers,” which brings content together from bloggers at
the Ancient World Bloggers Group and the eClassics community on Ning, 621 and “Electra Atlantis:
Approaches to Antiquity,” which brings together content from ancient world blogs that also frequently
examine issues of digital scholarship and technology. These aggregators are excellent tools for keeping
current on the classical blogosphere.

Undergraduate Research, Teaching, and E-Learning

While the previous section discussed new forms of openness and collaboration among scholars, the
field of digital classics has also presented new opportunities for collaboration with students through
undergraduate research. In addition, the large number of digital classics resources online, as well as the
number of websites designed for independent learning, present new possibilities for teaching. This
section looks at some recent efforts in these areas.

There are a number of useful e-learning resources online both for traditional students and for
independent learners of classical languages as well as for those studying the ancient world. 622 One of
the oldest resources is Textkit, 623 a website that provides a number of free online resources for the
learning of Greek and Latin. Its core collections include a number of public domain grammar
books as well as Greek and Latin e-texts. Textkit also has an extensive forum where registered users
can participate in discussions of various topics about learning Latin and Greek.

Another longstanding project is VRoma, 624 an online learning community of teachers and students that
is dedicated to creating online resources for “teaching about the Latin language and ancient Roman
culture.” This project was initially funded in 1997 through a “Teaching with Technology” grant from
the NEH. It maintains an active website that has two main components: an online learning environment
(MOO) and a collection of Internet resources. This MOO simulates an online “place” or “a virtual

618 http://www.antiquist.org/blog/page_id=2

619 http://rogueclassicism.com/

620 http://planet.atlantides.org/

621 http://eclassics.ning.com/

622 Many thematic resources have been developed for the study of particular aspects of classics online. For example, resources for the study of ancient medicine include “Medicina Antiqua” (http://www.ucl.ac.uk/~ucgajpd/medicina%20antiqua/index.html), a selected classical text repository and online resource directory created by the Wellcome Trust Centre for the History of Medicine at University College London, and Asclepion (http://www.indiana.edu/~ancmed/intro.HTM), “a World Wide Web page devoted to the study of ancient medicine” that was created at Indiana University Bloomington.

623 http://www.textkit.com

624 http://www.vroma.org/



learning environment that is built upon a spatial and cultural metaphor of ancient Rome.” To explore
this virtual space, users can log in as guests or apply for a VRoma character and password.

Another online learning environment is Silver Muse, 625 a resource created by the Classics Department
of the University of Texas-Austin that seeks “to provide a Web-based system to teach and promote
research in Latin epic poetry of the early empire” and includes authors such as Ovid and Lucan. The
Silver Muse system provides a hypertextual reading environment of the text of the poets, with linked
reading guides, commentaries, and essays. The reader can access the full text of a number of works and
click on any word to get both a translation and an example sentence.

The Alpheios Project, 626 which makes software freely available for reading and learning languages,
has recently released several tools for reading Greek and Latin online. 627 These tools are Firefox
extensions that add specific functions to the browser and are usable with any HTML and
Unicode-compliant text. After downloading the Alpheios toolbar, the user must choose either Greek or
Latin and can then use several important features, including the ability to look up a word, either by
clicking on it or by entering it in the toolbar, and to listen to how a word is pronounced. 628 A personal
vocabulary tool stores the words a user has looked up. The Alpheios website also provides access to a
number of “Alpheios Enhanced Texts,” and when reading these texts the toolbar has an additional
feature that allows the user to access diagrams of each sentence in the form of a dependency tree. Users
can also create and save their own dependency tree diagrams of sentences.
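
The sentence diagrams described above are dependency trees, in which every word points to the word that governs it. As a rough illustration only (a minimal Python sketch, not Alpheios code; the `Token` class, the `render` function, the relation labels, and the sample Latin sentence are all assumptions introduced here), such a tree can be stored as a flat list of words with head pointers and printed as an indented outline:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Token:
    """One word of a sentence in a dependency tree (hypothetical layout)."""
    id: int        # 1-based position of the word in the sentence
    form: str      # the word as it appears in the text
    head: int      # id of the governing word; 0 marks the root
    relation: str  # syntactic label (illustrative values only)


def children_by_head(tokens: List[Token]) -> Dict[int, List[Token]]:
    """Group tokens under the id of the word that governs them."""
    tree: Dict[int, List[Token]] = {}
    for token in tokens:
        tree.setdefault(token.head, []).append(token)
    return tree


def render(tokens: List[Token], head: int = 0, depth: int = 0,
           tree: Optional[Dict[int, List[Token]]] = None) -> List[str]:
    """Return the tree as indented lines, starting from the root."""
    if tree is None:
        tree = children_by_head(tokens)
    lines: List[str] = []
    for token in tree.get(head, []):
        lines.append("  " * depth + f"{token.form} ({token.relation})")
        lines.extend(render(tokens, token.id, depth + 1, tree))
    return lines


# "puella rosam amat" ("the girl loves the rose"): the verb is the root,
# and the subject and object both depend on it.
SENTENCE = [
    Token(1, "puella", 3, "SBJ"),
    Token(2, "rosam", 3, "OBJ"),
    Token(3, "amat", 0, "PRED"),
]

if __name__ == "__main__":
    print("\n".join(render(SENTENCE)))
```

A flat list with head pointers is used here because it keeps each word in its textual order while still allowing the tree to be rebuilt on demand.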

This is just a sample of the wealth of material online for both the formal and informal student, and
opportunities for collaboration are particularly evident in Textkit and VRoma. The possibility of
meaningful participation by students in more formal classical teaching and scholarship is a more
difficult proposition, but one that Blackwell and Martin (2009) believed could be addressed by new
models of undergraduate research. 629 While the potential of undergraduate research was also
considered earlier in the discussion of the Ashes2Art project (Arne 2009) and through the creation of
scholarly treebanks (Bamman, Mambrini and Crane 2009), Blackwell and Martin examined the
potential of several digital classics projects, particularly the HMT, to provide students with new
research and publication opportunities. The traditional task of teaching students to read scholarship and
produce essays with primary and secondary source citations, Blackwell and Martin argued, needs to be
revamped for the digital world. One way to engage undergraduates in scholarship, they suggested, was
to have them create online publications that would be read by more than just their teacher and that
made extensive use of actual primary sources (rather than relying solely on secondary sources that
reference them).

A related challenge of this proposal, however, is the need for far more access to both primary and
secondary materials online so students can both make use of and link to them. While the authors

625 http://uts.cc.utexas.edu/~silver/

626 http://alpheios.net/content/alpheios-texts

627 The Alpheios project has made extensive use of various resources of the Perseus Digital Library, including both the Ancient Greek and Latin treebanks. The code for their tools can be downloaded from http://sourceforge.net/projects/alpheios/.

628 This feature utilizes the open-source tool eSpeak Speech Synthesizer (http://espeak.sourceforge.net/). Another online learning resource that includes audio samples of Greek and Latin is the Classical Language Instruction Project at Princeton University (http://www.princeton.edu/~clip/). This website contains samples of scholars reading Greek and Latin prose and poetry in order to help students get acquainted with the sounds of Greek and Latin and to practice their reading skills. The authors include Homer, Plato, Pindar, Virgil, and Seneca. Another unique audio resource is “Ancient Greek Music” (http://www.oeaw.ac.at/kal/agm/index.htm), a website that contains recordings of “all published fragments of Ancient Greek music which consist of more than a few scattered notes.”

629 One model of undergraduate research has been developed by Sunoikisis (http://www.sunoikisis.org), a national consortium of classics programs that was founded in 1999 and runs an annual undergraduate research symposium where students present their papers on research regarding Greece, Rome, and the Classical Tradition (http://www.sunoikisis.org/blog/students/symposia/).



granted that their views of the potential for undergraduate scholarship once “all the sources are online”
might be somewhat idealistic, they still held high hopes:

The very effort of examining primary sources and thinking about their possible meanings
would bring home the reality that scholarship is always research, in the sense of finding,
identifying, interpreting, and presenting evidence. Students could operate as scholars, whether
through the process of verifying the plausibility of the presentation of evidence by others, or by
presenting arguments and interpretations that are in one way or another original, in all the
various senses of that word. … When all the sources are online, then we as teachers of Classics
can more effectively engage our undergraduate students as collaborators in research, whether in
the collection of, for example, themed primary source collections, or in the interpretation of the
countless issues in Classics and ancient history that still await effective investigation based on
careful analysis of well-chosen and clearly defined data sets rather than impressionistic
assertions (Blackwell and Martin 2009).

One salutary effect of having all primary sources online, Blackwell and Martin suggested, would be
that more scholars might feel obligated to be far more meticulous about their own standards of primary
source citation. As an example, they mentioned the confusing scholarly practice of citing quotes of
fragmentary authors by the standard reference system of a particular edition of collected fragments
without also citing the primary text from which the fragmentary quotes were originally drawn, a
practice that makes it difficult for students to decipher these references. Another important method for
undergraduates to contribute to classical scholarship, Blackwell and Martin offered, was through the
creation of lists and indexes. They noted that as more resources became available online through
open-access publication and as more software tools were able to aggregate data from wide-ranging sources,
the creation of lists and indexes would become far more important.

The most significant opportunity, however, had come through the HMT project, where, starting in
2006, a grant secured by Casey Dué of the University of Houston paid for undergraduate
research assistants at that university, the College of the Holy Cross, and Furman University to
begin working on the project. This group of undergraduates, called the HMT fellows, was given the
task of creating XML transcripts and translating specific texts of five Byzantine and medieval
manuscripts of the Iliad. One important research question being considered by the HMT editors was
how the editions of Aristarchus differed from the medieval editions and how a drift in the language
might indicate “the notion of an ongoing tradition of multiformity.” 630 While traditional critical
editions of Homer typically obscure these differences, the HMT editors hoped that the work of the
HMT fellows in creating XML transcripts would “highlight a problem in the history of the Homeric
text, thus contributing a point of conversation and analysis to the ongoing study of the Iliad” (Blackwell
and Martin 2009). Blackwell and Martin submitted that this new collaborative model of research,
which produced “electronic texts (not required to be ‘printable’) in transcription (rather than
collation),” not only allowed both students and professors in a distributed geographic environment to
work with high-quality images of “primary texts, the papyri, the Byzantine and medieval manuscripts”
but also supported a new type of scholarship that addressed the limitations of traditional critical
editions of Homer.

630 The issue of multiformity, Homeric tradition, and digital editions has been discussed earlier in this paper.



Blackwell and Martin concluded that the integration of information technology and new models of
faculty-student collaboration into conventional classical teaching would be necessary not just to
reinvigorate the discipline but also to keep it relevant:

Because technology has lowered the economic barriers to academic publishing—a reality that
too few publishing Classicists have fully understood—it is easy to guide student-writers into
becoming student-authors. We who teach Classics can add to our pedagogy the technological
tools of the information economy, thus arming ourselves against charges of impracticality and
at the same time possibly attracting students whose interests lie outside the Classics. And as
digital libraries begin to inter-operate, they breathe new life into largely disregarded scholarly
genres and invent entirely new ones—geographic information systems, computational
linguistics, and so forth (Blackwell and Martin 2009).

Nonetheless, it seems that such calls are being only partially heard, even for much less “radical” innovation. Recent research by Dimitrios Vlachopoulos has investigated the perceptions of academic staff who teach classical languages (Greek and Latin) regarding the use of “online activities” in their teaching (Vlachopoulos 2009). In the first phase of this research, 33 instructors in Greece, Spain, and the United States were asked to complete a three-part survey. The first part asked them about their “digital profile” and their general level of information and ICT understanding. The second part asked them to evaluate the potential of ICT in classics and whether or not they or their students had the knowledge to actively utilize such technology, and the third part asked instructors to outline the most significant challenges in using ICT for online course delivery. In the second phase of this research, interviews were conducted with about half of the participants. Vlachopoulos emphasized that most of the participants were worried about the future of their departments and the amount of funding they received from their universities. “It was a common belief that new strategies need to be designed,” Vlachopoulos reported, “in order to attract more students every year and to offer them more job opportunities.”

The analysis of the survey and interview results led Vlachopoulos to classify the instructors into three groups: conservatives, who were completely closed to the use of innovative ICT in the classroom; mainstream, who, even if they stated they were in favor of major changes in teaching, were “risk averters” and faced significant problems in deploying ICT; and early adopters, who were open to the innovative use of ICT in their classrooms. Despite the fears stated above about needing new methods to attract students, 46 percent of the group fell into the mainstream, with 30 percent classified as conservatives and only 24 percent identified as early adopters. 631 Vlachopoulos stated that while early adopters wanted to create new roles in the classroom, explored new teaching methods with technology, and reported a high level of willingness to pursue experimentation, mainstream faculty wanted “proven applications of recognized value” before they deployed them in their classroom and also needed significant technical support for almost all ICT applications. Explaining this group classification further, Vlachopoulos detailed that:

Only 15% of the instructors can be identified as early adopters concerning their skills in using ICT for learning activities. These individuals have studied computer science for personal use and use ICT every day in their personal life and almost every class they give. The majority of the instructors (55%) belong to the mainstream category since they haven’t studied computer science and use ICT occasionally at home. In their classes they often use simple ICT applications, such as PowerPoint presentations, email and internet (Vlachopoulos 2009).

631 This appears to confirm the earlier findings of the CSHE study (Harley et al. 2006b) that classicists use digital resources in the classroom less frequently than do scholars in other disciplines.

The strongest support was for combining ICT with traditional teaching methods, which 70 percent of the instructors believed was possible. To encourage the greater use of ICT within classical teaching, Vlachopoulos suggested that the designers of innovative projects would need to come up with strategies to attract more mainstream faculty but also cautioned that administrators would have to consider the greatly idiosyncratic nature of teaching within classics before deploying new teaching methods using ICT. As only five of the interviewees considered themselves technologically self-sufficient, Vlachopoulos surmised that universities would need to provide a large amount of technical support in order to successfully deploy ICT in the classroom. As a final thought, he noted that one of the most important points for encouraging more mainstream faculty to adopt innovative uses of ICT in teaching would be to convince them of its efficiency.

Vlachopoulos also reported that he was not able to “find any department of Classics that applies a complete online language course in its curriculum.” While some universities that were open to the use of information technology had designed online activities such as exercises, quizzes and surveys, there was “no complete course delivery with periodic and stable interaction between the members of a virtual community/classroom” (Vlachopoulos 2009).

An earlier JISC-funded survey by the Higher Education Academy, History, Classics and Archaeology Subject Center (HCA) 632 pursued similar research and examined the use of e-resources in teaching and learning in the disciplines of history, classics and archaeology in the United Kingdom (MacMahon 2006). This survey made use of an online questionnaire, semistructured interviews, and focus groups. The five most used e-resources were e-mail, the websites of respondents’ home institutions, PowerPoint, e-journals, and other institutions’ websites. Interestingly, the survey found that there was a significant difference between the e-resources that were the most frequently used and resources that respondents reported they were most likely to use, including software tools, e-books, digital archives, and virtual learning environments. One primary concern of faculty was the accessibility of the online learning materials. Faculty also often felt that an e-format was not always the best way of delivering what they considered to be essential learning materials for their teaching. Other areas of concern were digital rights issues, student competence to use electronic resources both in terms of IT skills and disciplinary knowledge, and low levels of institutional support for using such resources. Nonetheless, the study authors reported that the responses to the questionnaire had convinced them that e-resources had made a significant impact on teaching practices within the surveyed disciplines. The two most frequently reported changes in teaching practice were changes in the learning materials and in the teaching methods used to deliver them. Surveyed faculty also reported a number of positive and negative impacts of using e-resources on student learning. Access to a wider range of source materials was highly cited as a positive development, particularly since it enabled students to conduct research at an earlier stage in their education with both visual and textual materials, and faculty hoped that this would encourage independent learning. Faculty also noted that electronic resources permitted materials to be customized for the needs of different learning styles and accessed off campus and at any time. On the other hand, some faculty feared that the rote use of e-resources would actually deter independent learning by focusing on “training” rather than education, that students would be discouraged from reading, and that students used the Internet excessively and were not discerning in their use of many questionable websites. The primary theme of these concerns was that e-resources should not replace face-to-face teaching. The creators of the survey thus concluded that “blended learning,” in which e-resources formed part of their pedagogy, best characterized the approach of faculty in classics, archaeology, and history to using e-resources.

632 http://www.heacademy.ac.uk/hca

Other researchers have argued that not all classroom applications of ICT 633 need to be innovative or cutting edge to be useful. A recent article by Richard Ashdowne (2009) has examined the development of Galactica (“Greek and Latin Accidence Consolidation Training, Internet-Centered Assessment”), a tool that was designed to support the University of Oxford’s Classics faculty language-consolidation classes. According to Ashdowne, each year about 155 students begin classes at Oxford (e.g. classes in archaeology or ancient history) that require them to have a knowledge of Greek, Latin, or both. The level of linguistic experience among these students varies greatly. Since intensive language classes, along with frequent testing, are required for all these students, the department determined that some form of online evaluation in the key area of accidence testing would be highly desirable. “Moreover, most students now arrive with basic computing skills,” Ashdowne noted, “and it is the Faculty’s stated view that it is an important part of degree-level education in Classics that students should develop relevant skills in using Classics-related electronic resources.”

Consequently, Galactica was developed to replace paper-based tests, and students were expected to log into this Internet-based system once a week for each language they were studying and to take the relevant tests. Ashdowne stated that it was hoped that Galactica would provide classroom instructors with more time to focus on teaching and assist students in developing the ability to manipulate polytonic Greek on a computer. Nonetheless, the online tests themselves, he noted, largely had the same purpose as the paper tests:

… in sharing the aims of the paper-based system, Galactica illustrates how, although new technology can be used in new ways or for new ends, its application does not have to be pedagogically revolutionary. As they begin to develop, e-learning and e-assessment applications may often seem to focus on novelty (in the best sense) and innovation, creating educational tools to allow what would have been impossible or impractical before; but inasmuch as technology per se remains new in education, even traditional methods may be reinterpreted and implemented in a new way, as here. Classics is one field in which remembering the value of what has gone before is part of its intellectual core, and where rediscovering that value may itself be novel (Ashdowne 2009).

Ashdowne thus illustrated the important role that technology can play not only in helping classicists develop radical new teaching methodologies but also in helping them perform traditional teaching tasks such as evaluation in a far more efficient way. The Galactica system was based on TOIA (Technologies for Online Interoperable Assessment) 634 and required full Unicode compatibility, the ability to ask “grouped multiple-choice questions,” and classroom management and result reporting. A variety of technical issues were encountered, including the fact that TOIA was compatible only with Internet Explorer for PCs. Another challenge was the lack of any recognized framework for “evaluating the pedagogical success of a system of this kind” (Ashdowne 2009). Nonetheless, both student and instructor feedback on the system from the limited trials had been very positive. Ashdowne also declared that the minimal financial cost involved in developing Galactica illustrated that “new technology can be used cost-effectively for very traditional purposes as well as for radically new ones.” The main benefit of Galactica, he concluded, would come if it helped free up class time in a cost-effective way. The efficiency of technology was thus recognized by both Ashdowne and Vlachopoulos as an important means of convincing traditional scholars to adopt a new tool.

633 A full exploration of the use of ICT within classical teaching is beyond the scope of this review. For the development of one individual application for Latin, see Mallon (2006); for a general overview, see McManus and Rubino (2003).

634 http://www.toia.ac.uk/

One ambitious effort in the United Kingdom to encourage classicists not just to utilize digital resources within the classroom but also to actively participate in their design has been described by OKell et al. (2010). Between 2006 and 2008, the HCA 635 and the Centre for Excellence in Teaching and Learning for Reusable Learning Objects (RLO-CETL) 636 collaborated to create a reusable learning object. Their project “digitally modeled the seminar (as a typical instance of humanities pedagogy) in a generic form inside a software package” and created the Generative Learning Object (GLO) Maker software 637 that could be used by faculty in their teaching. As OKell et al. explained, the RLO-CETL participated in this process because they wanted to “elicit pedagogical patterns” from various disciplines and then digitally model these patterns in ways that could be utilized by teaching practitioners. The RLO-CETL particularly wanted to ensure that the design process was “practitioner led” and that practitioners’ domain expertise was recognized. The HCA participated in this collaboration out of a desire to engage with the e-learning community and to create e-resources that would be appropriate for their disciplinary community. This collaboration illustrates the importance of domain specialists and technologists working together as well as the need to recognize domain expertise in the design of disciplinarily appropriate learning objects.

One key issue that the project wished to address was the need for students to engage in more critical learning. OKell et al. cited surveys in which university students expressed frustration at not being taught how to read texts and at lectures not giving them the “right answer.” The project thus decided to focus on creating a learning object that supported students in learning to look at evidence and varying interpretations of that evidence, and to then make a critical argument of their own. The HCA brought a number of insights to this work from a JISC-funded scoping survey they had conducted (MacMahon 2006) to determine the use of e-resources in teaching and learning in the U.K. in history, classics, and archaeology. This survey illustrated that those faculty who participated supported “the creation of a community model” both to share their content 638 and to structure the pedagogy of the e-learning materials they used. Participants thought that their teaching would benefit from sharing e-learning resources with colleagues, and they wanted customizable e-resources for particular content and learning objectives. At the same time, they did not want to require outside help or to have to acquire new skills to be able to use e-learning resources. A similar reluctance to learn new skills in order to use digital resources was reported by Brown and Greengrass (2010) and Warwick et al. (2008a).

A key research question of the project was whether the learning technology approaches that were
used for scientific disciplines could also be used in the humanities. The HCA held a workshop with a
number of academics where they reached the conclusion that the best approach would be to create a

635 http://www.heacademy.ac.uk/hca
636 http://www.rlo-cetl.ac.uk/
637 http://www.glomaker.org/
638 Interestingly, even though respondents overwhelmingly supported the sharing of e-resources, only 42 percent indicated that they were actually sharing e-resources with colleagues either within or outside of their home institutions. The major reasons for this were a general lack of knowledge as to what types of e-resources were being used by their colleagues, a belief that learning materials should be “closely tailored” to particular learning objectives or course content, worries about ownership of materials, and lack of incentives to share. Personal contacts by far led to the most sharing of resources. Although there was some support for the creation of a repository or website to collect and make such e-resources searchable, there were great concerns about the sustainability of such a repository.



learning object that focused on an artifact and integrated interpretations of that artifact from different
disciplinary perspectives (those of a classicist, an archaeologist, and a historian). “The workshop
participants had identified what humanities disciplines aim to do and the means by which they do it,”
OKell et al. explained. “This was achieved in a context where educational technologists keen to create
the next generation of e-Learning resources could identify this aim and determine whether it could be
modelled electronically” (OKell et al. 2010, 158).

Thus, this project sought to address the challenges of digitally modeling the pedagogical approaches of
a particular discipline by having disciplinary practitioners define a set of tasks and then having
educational technologists see if they could successfully model them. In this case, the “powerful
pedagogical pattern” that they modeled as a Generative Learning Object was that of “evaluating
Multiple Interpretations” (eMI). JISC funded the development of proof-of-concept software, 639 and
domain experts were involved for the entire process. The Altar of Pergamum was chosen as the
artifact; a three-step process of storyboarding and refining ideas, mockup and digital design, and final
implementation and testing was then undertaken. The participating academics were asked to define
questions that they wanted their students to be able to answer, and this resulted in three general types
of questions: Origin, Purpose, and Meaning.

Their attempt “to storyboard the learning process” faced a number of challenges because the scholars
wanted to support both linear navigation (step-by-step from origin to meaning for each discipline) and
branching navigation (e.g., comparing different disciplinary perspectives on the artifact’s origin or
meaning) through the module, but were uncertain whether this could be designed. While the original
storyboard presented by scholars involved having students move sequentially through one discipline at
a time in order to avoid confusion, the learning technologists suggested an alternative where students
could compare multiple interpretations of each microtheme (e.g., “origin”). This design choice was
enthusiastically agreed upon and was consequently labeled “Access Views.” In addition, as knowledge
acquisition was a major goal of eMI, the module included various forms of multiple-choice questions
to assess student learning.
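
The two navigation modes can be pictured as two ways of reading the same grid of interpretations. The following is a minimal Python sketch, not the eMI software itself: the discipline and question-type names come from the text above, while the data layout, the function names, and the placeholder strings are assumptions made purely for illustration.

```python
# Disciplines and question types as named in the text.
DISCIPLINES = ["classicist", "archaeologist", "historian"]
MICROTHEMES = ["origin", "purpose", "meaning"]

# Hypothetical content store: one interpretation per (discipline, microtheme).
# The strings are placeholders, not material from the eMI module.
interpretations = {
    (discipline, theme): f"{discipline} on {theme}"
    for discipline in DISCIPLINES
    for theme in MICROTHEMES
}


def linear_view(discipline):
    """Linear navigation: one discipline, stepping from origin to meaning."""
    return [interpretations[(discipline, theme)] for theme in MICROTHEMES]


def comparison_view(theme):
    """Branching navigation: one microtheme, compared across all disciplines."""
    return [interpretations[(discipline, theme)] for discipline in DISCIPLINES]
```

Under these assumptions, `linear_view("historian")` walks a single row of the grid, while `comparison_view("origin")` walks a single column. Both views read the same underlying content, which is one way to see why offering the two routes together need not duplicate material.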

A number of disciplinary audiences received the eMI module positively, and OKell et al. concluded
that by computationally modeling a specific pedagogical process the eMI framework could be easily
repurposed by other groups designing digital learning objects. They also recognized, however, that
there are limits to designing for reusability. “Some parts of the process can be noted and replicated to
ensure useful outcomes,” OKell et al. acknowledged, “but, overall, success when designing for reuse is
dependent on the working relationship between the disciplinary practitioners driving the process and
the learning technologists supporting them” (OKell et al. 2010, 167). The eMI project thus illustrated the
importance of a good working relationship between information technologists and domain specialists
for the long-term reusability of a digital object.

Looking Backward: State of Digital Classics in 2005

In 2005, the now-defunct AHDS conducted a subject extension feasibility study to survey recent and
current digital resource creation in areas not served by the AHDS, including classics, philosophy, and
theology, to see what level of service these disciplines might require from the AHDS. The study report
noted that both classics and ancient history were “relatively digitally mature and in need of advanced
services.” The report’s author, Reto Speck, conducted a number of interviews with subject specialists
in the field and also surveyed a number of digital projects. He noted that the digital projects in classics

639 http://www.heacademy.ac.uk/hca/themes/e-learning/emi_glo



were exceptionally diverse, as were the types of resources being digitized, including catalogs and
bibliographies, prosopographical databases, manuscript images, papyri, inscriptions, artifacts, textual
resources, line drawings, CAD and VR models of architectural structures, and spatial data sets. This
wide variety of resources, Speck noted, reflected the multidisciplinary nature of classics. He also
pointed out that many scholars who were interviewed “suggested that ICT in general, and hypertext
and hypermedia technology in particular, are beneficial to CAH 640 research, since it enables the
integration of textual, archaeology and historical sources and approaches into one research project.”
This ability of the digital medium to reintegrate the textual and material record and present a more
sophisticated approach to exploring the ancient world was valued by many digital classics projects.

Speck also found that the sophistication of computational methods used in projects varied greatly:

For a large proportion of projects the digital component is clearly subsidiary to the wider
research question and the computational methods employed are straight-forward; however, a
significant minority of projects employs and devises advanced computational methods
including multi-spectral imaging techniques, advanced 3-d modelling methods, and the
development of generic and re-usable mark up schemes (Speck 2005).

A similar level of varying computational complexity was found in this project’s survey of digital
resources in classics. While some projects focused on using digital tools to better explore traditional
questions, others were developing state-of-the-art tools to explore new questions.

Interestingly, Speck argued that attempting to develop a single subject center to meet the needs of
classics and ancient history would likely fail to address both the interdisciplinary nature of the field
and the fact that most of the services requested of the AHDS were quite specific and advanced.

Nonetheless, the report did offer six recommendations for supporting the needs of digital classics
research: (1) “the development and promotion of generic methods and standards” such as TEI and
EpiDoc; (2) the “integration and linkage of existing resources and cross-searching”; (3) the
development of VREs; (4) the “sharing of expertise, outcomes and methodologies and linking of
projects”; (5) the need for national and international funding; and (6) “information on encoding and
display of non-Latin script.” Of all these recommendations, many scholars stated that the main
challenge would be “in linking disparate collections of different data types to enable powerful
cross-searching.” In fact, a variety of projects have evolved to address just these issues, such as
Concordia, LaQuAT, and Interedition, all of which will be discussed in greater detail in the next section.

Looking Forward: Classics Cyberinfrastructure, Themes, and Requirements in 2010

While the AHDS study of 2005 took a fairly broad approach to defining the needs of digital classics
projects, three recent articles in a special issue of the Digital Humanities Quarterly (DHQ) that was
dedicated to the theme “Changing the Center of Gravity: Transforming Classical Studies Through
Cyberinfrastructure” have taken an even more expansive approach to the question of developing a
cyberinfrastructure for digital classics, classics, and the humanities as a whole. While Crane, Seales,
and Terras (2009) looked at the cyberinfrastructure requirements for classical philology as a means of
exploring larger issues of digital classics, Crane et al. (2009a) summarized the challenges facing
classical studies in the million-book libraries being created by mass-digitization projects, and
Blackwell and Crane (2009) offered a conclusion to this special issue and an overview of its larger

640 This acronym stands for Classics and Ancient History.



themes. Each of these articles and the requirements it presents for a cyberinfrastructure for classics are
considered here.

While the theme of the advanced nature of computing in classics has been documented throughout this
research, Crane, Seales, and Terras (2009) suggest that this very level of “advancement” may have
unexpected consequences:

The early use of digital tools in classics may, paradoxically, work against the creative
exploration of the digital world now taking shape. Classicists grew accustomed to treating their
digital tools as adjuncts to an established print world. Publication—the core practice by which
classicists establish their careers and their reputations—remains fundamentally conservative
(Crane, Seales and Terras 2009).

They consequently recommended that philologists, and indeed all classicists, move away from creating
specialized software and start creating specialized knowledge sources; they envision a new digital
infrastructure that supports the rethinking of all the traditional reference sources of classical studies. 641
The greatest barriers to be faced in creating this new infrastructure are social rather than technical, as
indicated by the fact that no traditional elements of the scholarly infrastructure, including
commentaries, editions, grammars and lexicons, have truly been adapted to the digital world by being
made machine actionable. Other problems include the fact that most scholarship is still single authored,
the TLG provides digital texts without any critical commentary, and most major new critical editions
have copyrights that remain with their publisher, thus leading to an overreliance on the TLG.

Nonetheless, Crane, Seales, and Terras maintain that a cyberinfrastructure for philology and classics is
slowly emerging and builds upon three earlier “stages of digital classics: incunabular projects, which
retain the assumptions of print culture, knowledge bases produced by small, centralized projects, and
digital communities, which allow many contributors to collaborate with minimal technical expertise.”
The TLG and the Bryn Mawr Classical Review are listed as digital incunabula, the PDL is suggested
as a knowledge base, and the Stoa Consortium is presented as a model digital community. More
important, the authors contend that these three classes of projects reflect three separate sources of energy:
“industrialized processes of mass digitization and of general algorithms, the specialized production of
domain specific, machine actionable knowledge, and the generalized ability for many different
individuals to contribute.” The authors posit that when these three sources interact, they provide a new
digital environment that makes possible ePhilology, eClassics, and cyberinfrastructure. Yet at the same
time, they note that our current infrastructure is not yet at this stage:

The infrastructure of 2008 forces researchers in classics and in the humanities to develop
autonomous, largely isolated, resources. We cannot apply any analysis to data that is not
accessible. We need, at the least, to be able gather the data that is available today and, second,
to ensure that we can retrieve the same data in 2050 or 2110 that we retrieve in 2010. … We
need digital libraries that may be physically distributed in different parts of the world but that
act as a single unit. … (Crane, Seales, and Terras 2009).

This quote illustrates the continuing challenges of limited access to primary sources and secondary
scholarship, sustainable digital preservation, and creating an integrated user searching experience

641 The importance of digital reference works in an integrated research environment has also been recognized by de la Flor et al. (2010a) in their discussion of developing the VRE-SDM: “Moreover, classicists frequently reference other material such as prior translations, dictionaries of Roman names and historical documents, whilst examining a manuscript. It would therefore be useful to be able to juxtapose the texts and notes they are working on with other paper and electronic materials, including being able to view partial transcriptions of the text alongside an image.”



across virtual collections of data. The importance of an integrated infrastructure for research by
classicists has also been recognized by the VRE-SDM project:

The aim of the VRE-SDM project has been to construct a pilot of an integrated environment in
which data (documents), tools and scholarly instrumenta could be available to the scholar as a
complete and coherent resource. Scholars who edit ancient documents are always dealing with
damaged or degraded texts and ideally require access to the originals, or the best possible
facsimiles of the originals, in order to decipher and verify readings, and also to a wide range of
scholarly aids and reference works (dictionaries, name-lists, editions of comparable texts, and
so on) which are essential for interpretation of their texts (Bowman et al. 2010, 90).

As Bowman et al. explained, an integrated research environment or cyberinfrastructure will require access not only to primary sources but also to digital tools and to a wide range of preexisting reference tools/works that will need to be adapted for the digital environment. They also noted that many of the necessary collections have already been created or digitized but are unfortunately scattered across the websites of various museums and libraries.

To create a more integrated classical cyberinfrastructure, Crane, Seales, and Terras propose a minimum list of necessities, including libraries or repositories that can provide sustainable preservation, “sophisticated citation and reference linking services,” new forms of electronic publication, new models of collaboration, and a digital infrastructure that is portable across languages (Greek, Latin, Chinese, Arabic, etc.). They conclude with three strategies to begin building this infrastructure: (1) optimizing machine translation for the field of classics; (2) converting as much information as possible into machine-actionable data; and (3) using canonical literary texts that have already been marked up to serve as databases of linguistic annotations.

Crane et al. (2009a) provide an overview of the opportunities and challenges faced in moving from “small, carefully edited and curated digital collections to very large, industrially produced collections” with a focus on the role of classical collections and knowledge sources. The authors stress the need to create a classical apographeme online as an analogy to the genome, or the need to represent online:

… the complete record of all Greek and Latin textual knowledge preserved from antiquity, ultimately including every inscription, papyrus, graffito, manuscript, printed edition and any writing bearing medium. This apographeme constitutes a superset of the capabilities and data that we inherit from print culture but it is a qualitatively different intellectual space (Crane et al. 2009a).

This argument focuses on the need to represent all Greek and Latin sources online in an integrated environment, whether inscribed on stone or printed in a book. Matching these new online collections with advanced OCR and other applications, Crane et al. (2009a) explain, is currently supporting a number of important new services, including the creation of multitexts, chronologically deeper corpora, and new “textual forms of bibliographic research.” In this new world, the authors argue, all classicists are also acting as corpus linguists.

A large part of this paper is dedicated to outlining the services required for humanities users in massive digital collections, including access to physical images of sources, transcriptional data, basic page layout information, semantic markup within a text, dynamically generated knowledge, and, finally, “linguistically labeled, machine actionable knowledge.” The importance of access to “machine-actionable knowledge” and the need for creators of digital classics resources to create data and sources that help build this knowledge base are pre-eminent themes of this paper. But this process is twofold as Crane et al. (2009a) explicate: while scholars need to create data that can be used by automatic processes, they also need to be able to build off of data created by these processes.

The authors thus call for the creation of “fourth-generation collections” that will support a cyberinfrastructure in classics. Such collections will have a number of features. They will (1) include images of all source writing including papyri, inscriptions, manuscripts, and printed editions; (2) “manage the legacy structure of books”; (3) integrate XML transcriptions as they become available with image data so that “all digital editions are, at the least, reborn digital”; (4) contain “machine actionable reference works” that are embedded in growing digital collections that automatically update themselves; (5) learn from their own data and collections; (6) learn from their users, or contain automated systems that can learn from the annotations of their users; (7) adapt themselves to their readers either through watching their actions (personalization) or through user choice (customization); and (8) support “deep computation” with as many services as possible that can be applied to their content. As one of their final thoughts, the authors reiterate that a cyberinfrastructure for classics should include images of writing from all types of sources. “In a library grounded on images of writing,” Crane et al. (2009a) suggest, “There is no fundamental reason not to integrate, at the base level, images of writing from all surfaces.” 642 In fact, the difficulties of this integration of writing from the printed and material records will likely be one of the greatest technical challenges in developing a cyberinfrastructure for classics.

The conclusion of the special DHQ issue by Blackwell and Crane (2009) offered a summary of the issues raised throughout and returned to the concepts of ePhilology, eClassics, and cyberinfrastructure. Any cyberinfrastructure for classics, they argued, must include open-access data, comprehensive collections, software, “curated knowledge sources” and “advanced, domain optimized services.” The authors put forward that any cyberinfrastructure for the humanities can easily begin with classics not only because it is one of the most digitally mature fields but for a variety of other reasons as well. First, classical studies provides a cultural heritage that is truly international. Second, although most of the DHQ articles in this special issue focused on the textual record, there is a vast body of untapped data about the ancient world in archaeology:

The study of the Greco-Roman world demands new international practices with which to produce and share information. The next great advances in our understanding of the ancient world will come from mining and visualizing the full record, textual as well as material, that survives from or talks about every corner of the ancient world (Blackwell and Crane 2009).

Such a record can be built only through international collaboration. Third, the textual corpus of classics may be finite, but it has had an immense impact on human life. Fourth, “Greco-Roman antiquity demands a general architecture for many historical languages” so that technical development in supporting these languages can help lead to advances in supporting languages such as Sumerian and Coptic. Fifth, most contemporary scholarship is multilingual, and classics is one of the most fundamentally multilingual communities in the academy. 643 Sixth, knowledge and understanding of the extent of the Greco-Roman world could help lead to new involvement with areas such as the Middle East in terms of this shared heritage. Seventh, “classical scholarship begins the continuous

642 This argument was also seen throughout this review; see in particular Roueché (2009) and Bagnall (2010).

643 The challenges of developing a digital collection infrastructure that can accommodate a multilingual collection (Latin, Greek, Arabic, and Italian) of both classical and medieval texts in the history of science have been examined by the Archimedes Digital Library (http://archimedes.fas.harvard.edu/); also see Schoepflin (2003).



tradition of European literature and continues through the present.” This is important, the authors note, for:

An infrastructure that provides advanced services for primary and secondary sources on classical Greek and Latin includes inscriptions, papyri, medieval manuscripts, early modern printed books, and mature editions and reference works of the 19th and twentieth centuries. Even if we restrict ourselves to textual sources, those textual sources provide heterogeneous data about the ancient world. If we include the material record, then we need to manage videos and sound about the ancient world as well (Blackwell and Crane 2009).

Classics is such a broad discipline that the various infrastructure challenges it raises will also be important for the development of any larger cyberinfrastructure for the humanities. The final reason Blackwell and Crane give for letting classics help define the development of a broader cyberinfrastructure is that classicists have devoted at least a generation to developing tools and services and now “need a more robust environment and are ready to convert project-based efforts into a shared, permanent infrastructure” (Blackwell and Crane 2009).

To move from project-based efforts to a shared digital infrastructure, the authors list numerous specialized services developed by individual digital classics projects that will need to be supported, including canonical text services, OCR and page layout, morphological analysis, syntactic analysis, word sense discovery, named entity analysis, metrical analysis, translation support, cross-language information retrieval (CLIR), citation identification, quotation identification, translation identification, text alignment, version analysis, and markup projection. In addition to these services, two types of texts are required to support ePhilology in particular: (1) multitexts, or “methods to track multiple versions of a text across time” (these methodologies allow for the creation of “true digital editions” that include all images of their source materials, various versioned and reconstructed editions, and multiple apparatus critici that are machine-actionable); and (2) parallel texts, or texts that exist both in an original-language version and in multiple translations (methodologies to work with these texts extend the idea of a multitext across languages and multiple versions). Other collections required to support ePhilology include wordnets, treebanks, linguistic annotations, and machine-actionable indexes and commentaries.

Blackwell and Crane (2009) end their piece with thoughts on what is needed for true digital publication in cyberinfrastructure and the announcement of the Scaife Digital Library (SDL). The authors convincingly assert that “just because information is on-line does not mean that that information has exploited the full potential of the digital medium” (Blackwell and Crane 2009). Classical materials, they argue, need to be available in interoperable formats and with open licenses (e.g., almost all of the TEI-XML texts in the PDL have been available for download under a CC license since 2006). Similarly, the Center for Hellenic Studies (CHS) announced a plan in 2008 to create a digital library of new, TEI-compliant XML editions “for the first thousand years of Greek.”

In order for an item placed online to be useful in a digital world, Blackwell and Crane propose that it must meet four conditions of digital scholarly publication: (1) the content must be of interest to persons other than its creators; (2) it must have a format that can be preserved and used for a long period of time; (3) it needs at least one long-term home; and (4) it must be able to circulate freely. All objects that will be placed in the SDL must meet these requirements, and the authors also state that the SDL will not provide services to its end users, but rather provide access to digital objects that can be repurposed. In their conclusion, Blackwell and Crane outline three issues to be faced or perhaps accepted. First, in this new world, “all classicists are digital classicists,” or at least they must become so, for their scholarship to retain meaning; second, classicists will need to work with scholars who have more advanced understanding of technology; and third, new institutions, or a new hybrid library-publisher that can help classicists create and maintain their objects/services, are necessary.

These articles illustrate a number of important issues to be considered for a classics cyberinfrastructure, or indeed for a digital repository or federated series of repositories, to meet the needs of digital classicists. These requirements include open data and collections (open not only in terms of access but also openly licensed, with all the data available), curated knowledge sources and machine-actionable reference works, general and domain-specialized services, collaboration both within the discipline of classics and with other disciplines, and an infrastructure that will support a reasonable level of domain customization while still being flexible enough to provide general storage and high-speed access to computational processes. Similarly, Mahony and Bodard (2010) have offered a list of requirements, including “Digital infrastructure, Open Access publication, re-use of freely licensed data, and Semantic Web technologies” in order for Classics to fully engage with an “increasingly digital academic environment” (Mahony and Bodard 2010, 5). The next section outlines a number of projects that have taken some initial steps toward building parts of this infrastructure.

Classics Cyberinfrastructure Projects

While several national and international cyberinfrastructure projects are discussed in the next section of this report, a number of smaller projects have focused on providing greater integration of major digital classics resources or greater infrastructure for classics, subdisciplines of classics, or medieval studies. Some of these projects have been discussed above. Some projects have been completed, and others are ongoing.

APIS—Advanced Papyrological Information System

This project has been discussed in the Papyrology section.

CLAROS—Classical Art Research Online Services

The CLAROS 644 project has actively researched how to best support the “virtual integration of digital assets on classical art” including pottery, gems, sculpture, iconography, and antiquaria. CLAROS is using “Semantic Web data integration technologies and state-of-the art image recognition algorithms” and seeks to bring classical art “to anyone, anytime, anywhere.” 645 Its major partner institutions include the Beazley Archive at Oxford, the German Archaeological Institute (DAI) in Berlin, the Lexicon of Greek Personal Names (LGPN), the Lexicon Iconographicum Mythologiae Classicae (LIMC Basel and LIMC Paris), and the Research Archive for Ancient Sculpture Cologne (Arachne) as well as a number of other research archives. In May of 2011, the project released both its first public search interface to the integrated collection of these institutions, the CLAROS Explorer 646, and a CLAROS data service 647 that provides a RESTful interface to the data including “metadata about archaeology and art in machine-readable formats such as RDF, JSON and KML.” The project has created a wiki 648 that includes descriptions of the RDF/XML CIDOC-CRM format and CLAROS entity description templates for Objects, Places, Periods, and People. During 2010, CLAROS also ran

644 http://explore.clarosnet.org/XDB/ASP/clarosHome/index.html<br />

645 For a discussion of CLAROS and its potential for expanding access to classical art, see Kurtz (2009).<br />

646 http://explore.clarosnet.org/XDB/ASP/clarosExplorer.asp?help=true<br />

647 http://data.clarosnet.org/<br />

648 http://www.clarosnet.org/wiki/index.php?title=Main_Page



the related MILARQ project, 649 which had the end goal of supporting more efficient information<br />

retrieval from the integrated database. This was accomplished by enhancing Jena, “an existing, widely<br />

used, open source Semantic Web data management platform” and through the creation of “multiple<br />

indexes over the underlying RDF triple store, Jena TDB, and other optimizations relating to filter<br />

performance.” The newly released project website also includes a detailed step-by-step technical<br />

overview of how the CLAROS database was created 650 and the guiding principles used in its design.<br />
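The idea of keeping “multiple indexes” over a triple store can be illustrated with a small sketch. The Python below is not MILARQ or Jena TDB code; it is a toy in-memory store, using hypothetical `ex:` identifiers, that keeps three orderings of every triple so that a query with any two known terms becomes a direct lookup instead of a scan over all triples.<br />

```python
from collections import defaultdict


class TinyTripleStore:
    """Toy triple store with three permutation indexes (SPO, POS, OSP).

    Illustration only: real stores such as Jena TDB use on-disk
    structures, but the principle of indexing each triple under
    several orderings is the same.
    """

    def __init__(self):
        self.spo = defaultdict(lambda: defaultdict(set))
        self.pos = defaultdict(lambda: defaultdict(set))
        self.osp = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        # Every triple is written three times, once per index.
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)
        self.osp[o][s].add(p)

    def objects(self, s, p):
        """Objects for a known subject and predicate (SPO index)."""
        return set(self.spo.get(s, {}).get(p, set()))

    def subjects(self, p, o):
        """Subjects for a known predicate and object (POS index)."""
        return set(self.pos.get(p, {}).get(o, set()))

    def predicates(self, s, o):
        """Predicates linking a known subject and object (OSP index)."""
        return set(self.osp.get(o, {}).get(s, set()))


if __name__ == "__main__":
    store = TinyTripleStore()
    # Hypothetical example data, not taken from CLAROS.
    store.add("ex:vase1", "ex:type", "ex:Vase")
    store.add("ex:vase2", "ex:type", "ex:Vase")
    store.add("ex:vase1", "ex:foundAt", "ex:Athens")
    print(sorted(store.subjects("ex:type", "ex:Vase")))
```

The trade-off is the usual one: each insertion costs three writes and roughly three times the memory, in exchange for faster retrieval.<br />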

Concordia<br />

The Concordia initiative 651 was established by the Center for Computing in the Humanities at King's<br />

College, London, and the ISAW at New York University. It is “a transatlantic collaboration” that will<br />

support “dissemination of key epigraphical, papyrological and geographic resources for Greek and<br />

Roman culture in North Africa, and piloting of reusable, standard techniques for web-based<br />

cyberinfrastructure.” 652 A number of major projects are participating in this effort, including the Duke<br />

Data Bank of Documentary Papyri, Epigraphische Datenbank Heidelberg (EDH), Inscriptions of<br />

Aphrodisias (2007), Inscriptions of Roman Cyrenaica, Inscriptions of Roman Tripolitania, and<br />

Pleiades. Designed as a demonstration project, Concordia will unite these digital collections of<br />

inscriptions and papyri (which include 50,000 papyrological and 3,000 epigraphic texts) with the<br />

geographic data set of Pleiades. Some newly digitized content will also be included, such as 950<br />

epigraphic texts. Concordia will use basic web architecture and standard formats (XHTML,<br />

EpiDoc/TEI XML, and Atom+GeoRSS). Its main goal is to provide users with one textual search<br />

across these collections as well as “dynamic mapping and geographical correlation for arbitrary<br />

collections of humanities content, hosted anywhere on the web.”<br />

This project is set to conclude in 2010 and has created a project wiki that tracks deliverables, workshop<br />

“results, and other general information.” 653 A number of software tools have been created, including<br />

epidoc2atom (a set of XSLT sheets for “creating web feeds from EpiDoc conformant XML documents”),<br />

the Concordia Matchtool, a “framework for defining and executing rulesets to effect matching of<br />

records in two datasets,” and Concordia Harvester, “software for crawling and indexing<br />

Atom+GeoRSS feeds.” Important deliverables that the Concordia project also plans to create include<br />

Atom+GeoRSS web feeds for all papyri and inscription collections and the ConcordiaThesaurus, “a<br />

controlled vocabulary for expressing classes of relationships (or even assertions) between web-based<br />

resources in the context of Atom+GeoRSS feeds.”<br />
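To make the Atom+GeoRSS format more concrete, the following sketch reads a feed and extracts a latitude and longitude for each entry that carries a `georss:point` element. It is not the Concordia Harvester; the feed content, entry titles, identifiers, and coordinates are invented for illustration, and only the two namespace URIs are the standard Atom and GeoRSS ones.<br />

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for Atom and GeoRSS-Simple.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "georss": "http://www.georss.org/georss",
}

# Hypothetical feed: titles, ids, and coordinates are made up.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <title>Hypothetical inscription feed</title>
  <entry>
    <title>Sample inscription A</title>
    <id>urn:example:inscription-a</id>
    <georss:point>32.8 21.9</georss:point>
  </entry>
  <entry>
    <title>Sample inscription B (no location)</title>
    <id>urn:example:inscription-b</id>
  </entry>
</feed>"""


def read_points(feed_xml):
    """Return (title, latitude, longitude) for each located entry."""
    root = ET.fromstring(feed_xml)
    located = []
    for entry in root.findall("atom:entry", NS):
        title = entry.findtext("atom:title", default="", namespaces=NS)
        point = entry.findtext("georss:point", default=None, namespaces=NS)
        if point is None:
            continue  # entries without a location are skipped
        # GeoRSS-Simple points are "latitude longitude", space separated.
        lat, lon = (float(value) for value in point.split())
        located.append((title, lat, lon))
    return located


if __name__ == "__main__":
    for title, lat, lon in read_points(SAMPLE_FEED):
        print(title, lat, lon)
```

Because the location travels inside an ordinary web feed, a harvester of this kind needs no knowledge of the database behind each collection, which is the point of the “basic web architecture” approach described above.<br />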

Digital Antiquity<br />

This project has been described in greater detail in the Archaeology subsection.<br />

Digital Classicist<br />

This project has been discussed in greater detail in the section on Open Access.<br />

eAQUA<br />

eAQUA 654 is a major German project that seeks to use NLP techniques such as text mining to generate<br />

“structured knowledge” from ancient texts and to provide this knowledge to classicists through a<br />

649 http://code.google.com/p/vreri/wiki/MILARQ<br />

650 http://explore.clarosnet.org/XDB/ASP/clarosHome/technicalIntro.html<br />

651 http://concordia.atlantides.org/<br />

652 http://www.atlantides.org/trac/concordia/wiki/ProjectOverview<br />

653 http://www.atlantides.org/trac/concordia/wiki<br />

654 http://www.eaqua.net/en/index.php



portal. Researchers in classics and computer science are working together on six subprojects (Büchler<br />

et al. 2008):<br />

1. Atthidographers—This subproject will use text-mining methods to search through digital<br />

Greek corpora to try to discover previously unidentified citations to and quotations of this group<br />

of annalistic and fragmentary Greek historians.<br />

2. Reception of Plato’s texts in the ancient world—A combination of visualization and text-mining<br />

techniques will be used to discover and graph quotations and citations of Plato in ancient texts<br />

(Büchler and Geßner 2009).<br />

3. The meter of Plautus—This subproject will use NLP techniques to perform metrical analysis<br />

on the texts of the Latin poet Plautus (Deufert et al. 2010).<br />

4. Knowledge map of the Early Modern Period—This subproject extends the work of MATEO,<br />

CAMENA, and Termini, 655 a collection of Latin books from the early modern period together with<br />

tools to analyze them, and will explore new research using co-occurrence analysis and text<br />

mining to track lexical changes over time from the ancient to the modern world as well as to create<br />

semantic views of the corpora.<br />

5. Epigraphical work—Extraction of templates for inscriptions.<br />

6. Papyrology—This subproject will use text-mining techniques to provide text completion for<br />

distributed fragmentary collections.<br />
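Several of the subprojects above depend on finding passages that one text shares with another. As a rough illustration of that general idea, and not of eAQUA's actual algorithms, the sketch below compares two texts by their overlapping word trigrams. The function names are hypothetical, the sample sentences are English placeholders, and a real system for Greek or Latin would also need normalization, lemmatization, and tolerance for paraphrase.<br />

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in a text (lowercased, whitespace split)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_ngrams(source, candidate, n=3):
    """N-grams that occur in both texts: candidate quotation evidence."""
    return word_ngrams(source, n) & word_ngrams(candidate, n)


def reuse_score(source, candidate, n=3):
    """Fraction of the source's n-grams that reappear in the candidate."""
    source_grams = word_ngrams(source, n)
    if not source_grams:
        return 0.0
    return len(source_grams & word_ngrams(candidate, n)) / len(source_grams)


if __name__ == "__main__":
    # English placeholder sentences standing in for ancient texts.
    source = "the unexamined life is not worth living"
    quoting = "he said that the unexamined life is not worth living for a man"
    unrelated = "sing goddess the wrath of achilles"
    print(reuse_score(source, quoting), reuse_score(source, unrelated))
```

A score near 1 suggests that the candidate reproduces the source passage almost word for word, while a score of 0 means that no trigram is shared.<br />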

The eAQUA project sponsored a full-day workshop at the Digital Humanities 2010 conference on text<br />

mining in the humanities. 656<br />

eSAD—e-Science and Ancient Documents<br />

eSAD, 657 or the “Image, Text, Interpretation: e-Science, Technology and Documents” project, is using<br />

computing technologies to aid classicists and other scholars in reading ancient documents. This four-year<br />

project has been undertaken by the University of Oxford with input from University College<br />

London and will conclude in 2011. eSAD has two major research projects: (1) creating tools to aid in<br />

the reading of damaged texts such as stylus tablets at Vindolanda; and (2) discovering how an<br />

Interpretation Support System (ISS) “can be used in the day-to-day reading of ancient documents and<br />

keep track of how the documents are interpreted and read.” The project has published extensively on<br />

its work (de la Flor et al. 2010a, Olsen et al. 2009, Roued 2009, Roued-Cunliffe 2010,<br />

Tarte et al. 2009, Tarte 2011). Further discussion of these articles can be found in the Papyrology<br />

section.<br />

Integrating Digital Papyrology and Papyri.info<br />

Integrating Digital Papyrology (IDP) 658 was conceived in 2004–05, when the Duke Data Bank of<br />

Documentary Papyri (DDbDP) and the Heidelberger Gesamtverzeichnis der griechischen<br />

Papyrusurkunden Ägyptens (HGV) began “mapping their two largely overlapping data-sets—Greek<br />

texts and descriptive metadata, respectively—to each other.” In 2007, the Mellon Foundation provided<br />

655 http://www.uni-mannheim.de/mateo/camenahtdocs/camena.html<br />

656 http://dh2010.cch.kcl.ac.uk/academic-programme/pre-conference-workshops/workshop-2.html<br />

657 http://esad.classics.ox.ac.uk/<br />

658 http://idp.atlantides.org/trac/idp/wiki/


initial funding (Sosin et al. 2007) to migrate DDbDP from SGML to EpiDoc and from betacode to<br />

Unicode Greek, to merge mapped DDbDP texts and HGV metadata in a single XML stream, and then<br />

to map these texts to their APIS records, including metadata and images. The project team also wished to create an<br />

enhanced papyrological navigator (PN) to support searching of this newly merged and mapped data<br />

set. In October 2008, the Mellon Foundation funded IDP-2, a new, two-year project (Sosin et al. 2008),<br />

to “(1) improve operability of the PN search interface on the merged and mapped data from the<br />

DDBDP, HGV, and APIS, (2) facilitate third-party use of the data and tools, (3) and create a version<br />

controlled, transparent and fully audited, multi-author, web-based, real-time, tagless, editing<br />

environment, which—in tandem with a new editorial infrastructure—will allow the entire community<br />

of papyrologists to take control of the process of populating these communal assets with data.” The<br />

goal of the IDP is to create an editorial infrastructure where papyrologists can make contributions to<br />

this integrated knowledge source. This editing environment, SoSOL, is now available online. The<br />

project wiki provides extensive software descriptions and downloadable code. 659<br />
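The migration from betacode to Unicode Greek mentioned above can be pictured with a deliberately minimal sketch. This is not the IDP conversion code. It assumes lowercase Beta Code with diacritics typed after the letter, ignores capitals (the `*` prefix) and editorial sigla, and relies on Unicode NFC normalization to fuse each base letter with its combining marks. The function and table names are hypothetical.<br />

```python
import unicodedata

# Beta Code letters mapped to lowercase Greek base letters.
LETTERS = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ", "h": "η",
    "q": "θ", "i": "ι", "k": "κ", "l": "λ", "m": "μ", "n": "ν", "c": "ξ",
    "o": "ο", "p": "π", "r": "ρ", "s": "σ", "t": "τ", "u": "υ", "f": "φ",
    "x": "χ", "y": "ψ", "w": "ω",
}

# Beta Code diacritic signs mapped to Unicode combining characters.
DIACRITICS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute accent
    "\\": "\u0300",  # grave accent
    "=": "\u0342",   # circumflex (perispomeni)
    "|": "\u0345",   # iota subscript
    "+": "\u0308",   # diaeresis
}


def beta_to_unicode(beta):
    """Convert simple lowercase Beta Code to precomposed Unicode Greek."""
    pieces = []
    for ch in beta.lower():
        pieces.append(LETTERS.get(ch) or DIACRITICS.get(ch) or ch)
    # NFC composes base letters with the combining marks that follow them.
    chars = list(unicodedata.normalize("NFC", "".join(pieces)))
    # A sigma at the end of a word takes its final form.
    for i, ch in enumerate(chars):
        at_word_end = i + 1 == len(chars) or not chars[i + 1].isalpha()
        if ch == "σ" and at_word_end:
            chars[i] = "ς"
    return "".join(chars)


if __name__ == "__main__":
    print(beta_to_unicode("mh=nin"), beta_to_unicode("lo/gos"))
```

Under these assumptions `mh=nin` becomes μῆνιν and `lo/gos` becomes λόγος. A production converter has to handle many more cases, which is part of why such migrations are funded as projects in their own right.<br />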

The related papyri.info 660 website provides two major features: a list of links to papyrological<br />

resources and “a customized search engine (called the Papyrological Navigator [PN]) capable of<br />

retrieving information from multiple related sites.” The Papyrological Navigator currently retrieves<br />

and displays information from the APIS, DDbDP, and HGV. Links are also provided to Trismegistos<br />

texts, and the unique Trismegistos papyri numbers (TM numbers) have proved very important in the<br />

design of the PN (Sosin 2010). The goal of this project is to demonstrate “that a system can be<br />

designed to provide an integrated display of a variety of scholarly data sources relevant to the study of<br />

ancient texts.” This prototype uses portlet technology and a higher-resolution image display platform,<br />

and “moves beyond the creation of centralized ‘union databases,’ such as APIS, to leverage and<br />

integrate content created and hosted elsewhere in the scholarly world.” A major research effort of this<br />

project is investigating the scalability of its approach, and the team hopes to design a system that will<br />

include and integrate data sources beyond the initial ones in this project. A portlet platform was also<br />

chosen to support “personalization and profiling” so scholars can use it efficiently in their research. A<br />

sample record 661 demonstrates the potential of this research by including the metadata for an<br />

individual papyrus (P.Oxy 4 744) from the APIS and HGV, with the full DDbDP transcription (with<br />

downloadable EpiDoc XML), an English translation (when available), and an image that can be<br />

examined in detail. More technical documentation can be found at the IDP website. 662<br />

Interedition: An “Interoperable Supranational Infrastructure for Digital Editions”<br />

The Interedition Project 663 has the major goal of promoting “interoperability of the tools and<br />

methodology” used in the field of digital scholarly editing. As the project website notes, many scholars<br />

have already created “amazing computer tools” and the goal of Interedition is to facilitate contact<br />

between scholars and to encourage creators of such tools to make their functionality open and available<br />

to others.<br />

This project is funded as a European Union COST Action from 2008 to 2012, and it will hold a series of<br />

meetings between researchers in the fields of digital literary research and IT to explore “the topic of a<br />

shared supranational networked infrastructure for digital scholarly editing and analysis.” At the end of<br />

659 http://idp.atlantides.org/trac/idp/wiki/IDPSoftware<br />

660 http://www.papyri.info/<br />

661 http://www.papyri.info/apis/toronto.apis.17/<br />

662 http://idp.atlantides.org/trac/idp/wiki/PapyrologicalNavigator<br />

663 http://www.interedition.eu/



this project, a road map will be delivered for the implementation of such an infrastructure. 664 The project will<br />

also release a number of “proof-of-concept web services to demonstrate the viability of the ideas and<br />

concepts put forward by Interedition as a networked research platform.” The Interedition project<br />

wiki 665 provides details about previous workshops and includes a list of four workgroups that<br />

have been created to work on the European dimension, prototyping, strategic IT recommendations, and<br />

a road map. A draft architecture has also been proposed, 666 and there is a separate software<br />

development site for this project. 667<br />

LaQuAT—Linking and Querying of Ancient Texts<br />

The LaQuAT 668 project was a collaboration between the Center for e-Research at King’s College<br />

London and the EPCC at the University of Edinburgh. The project explored the use of the OGSA-DAI<br />

data-management software, which is used to support “the exposure of data resources, such as relational or<br />

XML databases, on to grids” in the fields of epigraphy and papyrology. A small case study of<br />

integrating three digital classics resources, the HGV (also participating in the IDP project), Project<br />

Volterra, and the Inscriptions of Aphrodisias, was conducted, and a demonstrator that searched across<br />

the three databases was created. The demonstrator is currently maintained by King’s College, but the<br />

ultimate plan is to make the infrastructure developed for this project part of DARIAH. As the data<br />

formats for all three databases were different, this project illustrated both the limitations and potential<br />

of linking data sets in the humanities. “More generally,” Jackson et al. (2009) stated, “it was realised<br />

that once one starts joining databases, the fuzzy, uncertain, interpretative and inconsistent nature of the<br />

data that is generated by and used in humanities research leads to issues about the meaning of what is<br />

facilitated by linking these databases.” One important conclusion of the LaQuAT<br />

project was the need for virtual data centers that can integrate several resources while also allowing<br />

individual resources to maintain their unique formats.<br />

PELAGIOS: Enable Ancient Linked Geodata in Open Systems 669<br />

The PELAGIOS project’s major goal is to introduce “Linked Open Data” principles into digital<br />

resources that make reference to places in the ancient world in order to support more advanced forms<br />

of visualization and discovery using these resources. This project includes a large number of<br />

significant digital classics partners as both service and content providers, including Google Ancient<br />

Places, Pleiades, the PDL, Arachne, SPQR, OpenContext, Nomisma.org, and CLAROS. The project’s<br />

blog/website contains extensive details on the project’s research, including the results of a workshop<br />

held in March 2011, a full outline of the PELAGIOS project plan, and detailed technical descriptions<br />

of individual research tasks as they progress, such as how to efficiently allow users to tag historical<br />

maps using Pleiades references 670 through the modification of an already existing “end-user toolkit for<br />

manual annotation and semantic tagging of multimedia content.” 671<br />
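The practical effect of tagging resources with shared place identifiers can be sketched in a few lines. The example below is not PELAGIOS code and does not use its actual annotation format. It assumes that each partner publishes simple annotations pairing one of its own resource URLs with a Pleiades place URI. The `example.org` URLs are invented, and the Pleiades identifiers are included for illustration and should be checked against Pleiades before reuse.<br />

```python
from collections import defaultdict

PLEIADES = "https://pleiades.stoa.org/places/"

# Hypothetical annotations from three imaginary partner collections.
ANNOTATIONS = [
    {"target": "http://example.org/texts/passage-12", "place": PLEIADES + "579885"},
    {"target": "http://example.org/coins/item-7", "place": PLEIADES + "423025"},
    {"target": "http://example.org/vases/object-3", "place": PLEIADES + "579885"},
]


def resources_by_place(annotations):
    """Group resource URLs from any collection under the place URI they cite."""
    index = defaultdict(list)
    for annotation in annotations:
        index[annotation["place"]].append(annotation["target"])
    return dict(index)


if __name__ == "__main__":
    for place, targets in resources_by_place(ANNOTATIONS).items():
        print(place, len(targets))
```

Because every collection points at the same URI for a given place, a text passage and a vase record can be brought together without either database knowing about the other.<br />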

664 http://w3.cost.esf.org/index.php?id=233&action_number=IS0704<br />

665 http://www.interedition.eu/wiki/index.php/Main_Page<br />

666 http://www.interedition.eu/wiki/index.php/WG2:Architecture<br />

667 http://arts-itsee.bham.ac.uk/trac/interedition/<br />

668 http://www.kcl.ac.uk/iss/cerch/projects/completed/laquat.html<br />

669 http://pelagios-project.blogspot.com/p/about.html<br />

670 http://pelagios-project.blogspot.com/2011_04_01_archive.html<br />

671 The PELAGIOS project is currently experiment<str<strong>on</strong>g>in</str<strong>on</strong>g>g with YUMA (http://dme.ait.ac.at/annotati<strong>on</strong>).



SPQR: Supporting Productive Queries for Research

The SPQR 672 project is driven by the research outcomes of the LaQuAT project, which determined that a relational database model might be too “inflexible” for integrating heterogeneous and “fuzzy” data sets in the humanities. Because of the limitations of the relational database model, the SPQR project is investigating the use of linked data and Semantic Web approaches, with the Europeana Data Model (EDM) as its central ontology. The classical data sets that will be used in this experimental project include both the projects involved in LaQuAT (HGV, Project Volterra, Inscriptions of Aphrodisias) and new projects, such as the Inscriptions of Roman Tripolitania, the Pleiades Project, the LGPN, and the ANS’s database of coins and papyri.info. The main goal of the SPQR project is to “investigate the potential of a Linked Data approach for linking and integrating datasets related to classical antiquity.” The project has many other research objectives, 673 including determining how to represent the humanities data found in relational databases so that they can be exposed as linked data; how to best transform and expose the targeted data sets; how to provide “integrated views” of “multiple heterogeneous datasets” so that researchers can browse and query them in different ways; and, perhaps most important, how to provide a useful interface or means of access to the integrated datasets so that researchers can follow “paths though the data from one dataset to another via common attributes (e.g., names, places or dates).” The SPQR project architecture plans to make use of REST-based services, RDF, SPARQL, SOLR, and a triplestore. In addition, to ensure a greater level of interoperability with the “wider information space,” they will use standards such as EDM and OAI-ORE, and existing domain-specific vocabularies and ontologies.
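The idea of following paths from one dataset to another via common attributes can be illustrated with a minimal sketch. It is not SPQR's implementation: the triples, the dataset prefixes, and the predicate names below are all invented for illustration, and plain Python tuples stand in for the RDF store and SPARQL queries the project planned to use.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples. Every identifier here is
# invented for illustration and does not come from any real dataset.
TRIPLES = [
    ("inscriptions:i1", "foundAt", "places:aphrodisias"),
    ("inscriptions:i1", "mentionsName", "names:zoilos"),
    ("papyri:p7", "mentionsName", "names:zoilos"),
    ("coins:c3", "mintedAt", "places:aphrodisias"),
    ("papyri:p9", "mentionsName", "names:apollonios"),
]


def index_by_object(triples):
    """Map each shared value (a name, place, or date) to the records that point at it."""
    index = defaultdict(set)
    for subject, _predicate, obj in triples:
        index[obj].add(subject)
    return index


def linked_records(record, triples):
    """Return records in other datasets that share an attribute value with `record`."""
    index = index_by_object(triples)
    dataset = record.split(":")[0]
    links = {}
    for subject, _predicate, obj in triples:
        if subject != record:
            continue
        # Keep only records whose dataset prefix differs from the starting record's.
        others = {s for s in index[obj] if s.split(":")[0] != dataset}
        if others:
            links[obj] = sorted(others)
    return links


print(linked_records("inscriptions:i1", TRIPLES))
```

In this toy data, the inscription is connected to a coin through a shared place and to a papyrus through a shared name, which is the kind of cross-dataset path the project's interface was meant to expose.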

BUILDING A HUMANITIES CYBERINFRASTRUCTURE

Defining Digital Humanities, Cyberinfrastructure, and the Future

Building a cyberinfrastructure for the humanities involves thinking about both digital humanities research and its current state as a discipline. In a recent article, Christine Borgman has outlined many of the challenges faced by the digital humanities community as it attempts to come together as a larger discipline and struggles to plan for a shared cyberinfrastructure:

The digital humanities are at a critical moment in the transition from a specialty area to a full-fledged community with a common set of methods, sources of evidence, and infrastructure – all of which are necessary for achieving academic recognition. … Digital collections are proliferating, but most remain difficult to use, and digital scholarship remains a backwater in most humanities departments with respect to hiring, promotion, and teaching practices. Only the scholars themselves are in a position to move the field forward. Experiences of the sciences in their initiatives for cyberinfrastructure and eScience offer valuable lessons (Borgman 2009).

Borgman maintained that in order for the digital humanities to be successful, scholars would need to begin moving more actively to build the necessary infrastructure and to promote their own interests. Her article sought to serve as a call to action for scholars in the digital humanities with a focus on six factors that will affect the future of digital scholarship in the humanities: publication practices, data, research methods, collaboration, incentives, and learning. Each of these themes is discussed later in this section.

672 http://spqr.cerch.kcl.ac.uk/

673 http://spqr.cerch.kcl.ac.uk/page_id=8



Borgman defined the digital humanities as “a new set of practices, using new sets of technologies, to address research problems of the discipline.” The digital humanities as a new set of practices and technologies also requires a particular type of infrastructure, the requirements of which were first laid out concretely in 2006 by a report commissioned by the American Council of Learned Societies (ACLS):

Cyberinfrastructure is more than the tangible network and the means of storage in digitized form. It is not only the discipline-specific software application and the project-specific data collections: it is also the more intangible layer of expertise and best practices, standards and tools, collections and collaborative environments that can be broadly shared across communities of inquiry (ACLS 2006).

The ACLS report also offered a list of characteristics that would be required of a humanities infrastructure: it will operate as a public good; be both sustainable and interoperable; encourage collaborative work; and support experimentation. The definition and characteristics outlined above offer a number of major points that this report considers in greater detail, particularly that infrastructure includes not only data collections and software for individual disciplines but also best practices, standards, collections, and collaborations that can be shared across disciplines. The importance of sharing as key to the definition of infrastructure has also been expressed by Geoffrey Rockwell, who posited, “Anything that is needed to connect more than one person, project, or entity is infrastructure. Anything used exclusively by a project is not” (Rockwell 2010). In his own discussion of cyberinfrastructure, Rockwell seconded the point of the ACLS report that it should be broadly useful to the public, but also concluded that cyberinfrastructure needs to be “well understood enough” so that it can be broadly useful, be able to foster economic or research activity, be “funded by the public for the public,” be invisible so that its use becomes reliable and expected, and be maintained by a long-term organization. At the same time, Rockwell made a distinction between humanities research and research infrastructure or cyberinfrastructure, and emphasized the importance of supporting both. “Research, by contrast is not expected to be useful, necessarily, and certainly isn’t expected to be useful to a public,” Rockwell concluded, stating that “research is about that which we don’t understand, while infrastructure really shouldn’t be experimental” (Rockwell 2010). Rockwell’s larger point was that in defining their cyberinfrastructure, humanists should remember that a “turn to infrastructure” involves political and sociological decisions and a possible redefinition of what is considered as “legitimate” research.

Other major points made by the ACLS report were that “extensive and reusable digital collections” were at the core of any cyberinfrastructure and that scholars should be engaged in the development of these collections, for, as the commission noted, almost all successful tool building is dependent on the existence of and access to digital collections. The concept of services, content, and tools as infrastructure was seen repeatedly not only throughout this report but also in the discussions by TextGrid and other humanities cyberinfrastructure research projects, and is explored further in the next section.

Open Content, Services, and Tools as Infrastructure

In 2007, a joint JISC-NSF report emphasized that content and the tools necessary to exploit it are two essential components of infrastructure. “In the cyber age, collections of digital content and the software to interpret them have become the foundation for discovery,” Arms and Larsen (2007) insisted, noting that “they have entered the realm of infrastructure.” In their overview of the role of virtual research environments in scholarly communication, Voss and Procter (2009) offered a similar conclusion. “The concept of ‘content as infrastructure’ emphasises the increasing importance of collections of research data as a reusable infrastructure,” Voss and Procter explained, “that builds on top of the physical research computing infrastructure and traditional infrastructures such as scientific instruments or libraries” (Voss and Procter 2009).

At the same time, the authors of the JISC-NSF report stated that more cultural heritage and educational organizations needed to work together to produce and share their content, concluding that “the arduous goal of open access in the humanities can only be achieved when public institutions no longer invest in endeavors with proprietary output.” Since content that is openly licensed or freely available for reuse is such a fundamental part of infrastructure, the ACLS offered a similar warning, suggesting that more universities needed to work to create, digitize and preserve their own collections either locally or consortially, rather than renting access to materials. The Association of Research Libraries (ARL) has recently made a similar call for large-scale government funding to create a “universal, open library or digital data commons” (ARL 2009a).

The creators of TextGrid have reached similar conclusions regarding the primacy of both content and services in what they have labeled eHumanities ecosystems as an alternative to the term cyberinfrastructure. “However, at least for eHumanities ecosystems a model in which services reign supreme and content is exclusively seen as the matter on which the services operate is not satisfactory,” Ludwig and Küster articulate, “for eHumanities ecosystems need models in which both content and services are first-class citizens” (Ludwig and Küster 2008).

To address the need to provide services on an infrastructural level for digital humanities research, the HiTHeR (e-Humanities High Throughput Computing) 674 project sought to embed a self-organizing text-mining application/agent as a RESTful web service in an “e-Humanities ecosystem.” Blanke, Hedges, and Palmer (2009) provided an overview of this project that sought to explore what “digital services and value-creating activities” will particularly serve e-Humanities research. Although the particular tool described sought to create an “automatic chain of readings” for the Nineteenth Century Serials Edition Project (NCSE), 675 the larger question considered was how such an agent could be integrated into a large humanities cyberinfrastructure. A “resourceful web service” approach was used to avoid creating yet another isolated tool or website solution. One of the greatest challenges for tool developers in the digital humanities, the authors thus declared, was determining how to create tools that were appropriate for the traditional research scholars may wish to pursue while still allowing innovative work. 676

The text-mining service offered by the HiTHeR project created a “semantic view,” or an automatically generated browsing interface to the NCSE text collection:

HiTHeR will offer an interface to primary resources by automatically generating a chain of related documents for reading. Users of HiTHeR are able to upload collections and retrieve lists of reference documents in their collections together with the N most similar documents to this reference document (Blanke, Hedges, and Palmer 2009).

674 http://hither.cerch.kcl.ac.uk/

675 http://www.ncse.ac.uk/

676 In a discussion of his annotation tool Pliny, John Bradley made similar claims and reiterated a point he had made earlier (Bradley 2005), namely, “that tool builders in the digital humanities would have better success persuading their non-digital colleagues that computers could have a significant positive benefit on their research if the tools they built fit better into how humanities scholarship is generally done, rather than if they developed new tools that were premised upon a radically different way to do things” (Bradley 2008).



Since HiTHeR also aimed to provide a comprehensive research platform, its developers chose to offer users several text-mining algorithms for creating this “chain of readings.” In addition, users could upload their own documents rather than being limited to the NCSE collections.
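Retrieving the N documents most similar to a reference document can be sketched with a simple bag-of-words cosine similarity. This is only one possible similarity measure, chosen here for brevity; the source does not say which algorithms HiTHeR actually offered, and the function names and the tiny sample collection are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical miniature collection standing in for an uploaded corpus.
DOCS = {
    "a": "reform bill debated in parliament",
    "b": "parliament debated the reform bill again",
    "c": "railway shares rise in london",
    "d": "poetry review",
}


def term_vector(text):
    """Count lowercase word occurrences in a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def most_similar(reference_id, documents, n=2):
    """Return the ids of the n documents most similar to the reference document."""
    vectors = {doc_id: term_vector(text) for doc_id, text in documents.items()}
    ref = vectors[reference_id]
    scored = [
        (cosine(ref, vec), doc_id)
        for doc_id, vec in vectors.items()
        if doc_id != reference_id
    ]
    # Highest score first; ties are broken by document id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _score, doc_id in scored[:n]]


print(most_similar("a", DOCS, n=2))
```

Repeating the lookup from each retrieved document in turn would yield a rough "chain of readings" in the sense described above, although a production service would need a far more scalable index than this pairwise comparison.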

The HiTHeR project quickly discovered, however, that standard computing environments did not provide the level of processing power necessary to run these algorithms. To resolve this problem, they built an infrastructure based on high-throughput computing (HTC) that uses many computational resources to accomplish a single computational task. They made use of the Condor toolkit, which let them rely on two types of computers at King’s College London: underutilized desktop computers and dedicated servers. The authors thus assert that HiTHeR “illustrates how e-Humanities centres can be served by implementing their own local research infrastructure, which they can relatively easily build using existing resources like standard desktop networks” (Blanke, Hedges, and Palmer 2009).

Another insight offered by the HiTHeR research group was that for most applications in the humanities, “large computing power will only be needed to prepare data sets for human analysis” (Blanke, Hedges, and Palmer 2009). They suggested that for much humanities research, a user would simply need to call on heavy processing power to analyze a data set once, and would want to spend the rest of his or her time accessing and analyzing the results; in other words, most humanists would need a “create once-read many resources” application environment. This led them to ultimately deploy HiTHeR as a RESTful web service where humanities scholars could call upon a variety of text-mining algorithms and then receive the results in a variety of formats (XHTML, Atom, etc.).
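The “create once-read many” pattern can be illustrated with a small sketch in which an expensive analysis runs a single time per collection and its stored result is then rendered in whatever format a reader requests. The class and function names are hypothetical, the analysis step is a trivial stand-in for a text-mining job, and only XHTML-style and plain-text renderings are shown (Atom is omitted for brevity); none of this reflects HiTHeR's actual code.

```python
from html import escape


class ResultStore:
    """Run an expensive analysis once per collection, then serve the cached result."""

    def __init__(self, analyze):
        self._analyze = analyze  # stand-in for a heavy text-mining job
        self._cache = {}
        self.runs = 0  # how many times the heavy job actually executed

    def results(self, collection_id, texts):
        if collection_id not in self._cache:
            self.runs += 1
            self._cache[collection_id] = self._analyze(texts)
        return self._cache[collection_id]


def as_xhtml(items):
    """Render a list of result strings as an XHTML-style unordered list."""
    rows = "".join(f"<li>{escape(item)}</li>" for item in items)
    return f"<ul>{rows}</ul>"


def as_plain(items):
    """Render the same results as newline-separated plain text."""
    return "\n".join(items)


store = ResultStore(analyze=sorted)
first = store.results("demo", ["b", "a"])
second = store.results("demo", ["b", "a"])  # served from the cache, no second run
print(store.runs, as_xhtml(first), as_plain(second), sep="\n")
```

Separating the stored result from its renderings is what allows one costly computation to back many inexpensive reads in different formats.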

The importance of services, or digital tools more specifically, as infrastructure has also been discussed by Geoffrey Rockwell, who provided an overview of the development of infrastructure for textual analysis that included the creation of a portal for textual research called TAPoR and the development of a set of reference tools, TAPoRware. 677 The intent was that this portal could be used to discover and use tools that had been registered by their creators as web services that were running in various locations. The portal was to provide scholars easy access to already-existing tools and to support the registration, creation, and publishing of new services. Currently the portal is being reinvented, Rockwell reported, since many scholars did not find it easy to use. He also suggested that web services are often not as reliable as they should be, and that most users require both simplicity and reliability. “My point here is that the model was to keep tool development as research but make the research tools easy to discover and use through portal-like infrastructure,” Rockwell explained, adding that “a further paradigm was that tools could be embedded in online texts as small viral badges, thereby hiding the portal and foregrounding the visible text, an experiment we are just embarking on” (Rockwell 2010). While Rockwell stressed that digital tools were an important part of the portal infrastructure that at times needed to be “invisible” to make the content primary, he also argued that tool development is an important part of the humanities research process in itself.

Research libraries and digital repositories, as potential key components of cyberinfrastructure for the<br />

humanities, will also need to address the complexities of providing access to both content and services<br />

as part of a larger networked infrastructure, according to a recent Association of Research Libraries<br />

(ARL) report on digital repository services for research libraries:<br />

… managing unique content, not just traditional special collections but entirely new kinds of<br />

works and locally-created content, will be an important emphasis for collection and<br />

management. As users exercise new capabilities and require new services, library services will<br />

677 http://portal.tapor.ca and http://taporware.mcmaster.ca



become less “localized” within the library and within campus systems and expand into the<br />

general network environment. Library services increasingly mean machine-machine<br />

interactions and will be embeddable in a variety of non-library environments (ARL 2009b).<br />

The services provided by digital repositories within research libraries will thus need to move beyond<br />

the individual library to encompass services required in a larger network environment, and new content<br />

of all kinds will be required to support the research needs of users. This report made the important<br />

point that many of the “services” required will exist to support machine-to-machine communication<br />

through the use of standards and protocols.<br />

In addition to services and content, digital tools, as outlined by Rockwell above, are a key component<br />

of infrastructure. As explained by Nguyen and Shilton (2008) in their survey of existing digital tools,<br />

digital tools are typically distinct from the other services and resources created by digital humanities<br />

centers. 678 They defined tools as “software developed for the creation, interpretation, or sharing and<br />

communication of digital humanities resources and collections.” Nguyen and Shilton evaluated the<br />

findability and usability of digital tools that were provided by digital humanities centers and created a<br />

typology that further defined tools according to their objectives (“access and exploration of resources”;<br />

“insight and interpretation,” or finding larger patterns and interpreting them; support for the creation of new<br />

digital resources; and “community and communication”), technological origins, and associated<br />

resources. In this particular study they excluded tools that were developed outside the digital humanities<br />

community or that had been developed to function with only a single digital resource or collection. To<br />

further manage the scope of their research they limited the concept of findability to the ability of a user<br />

to find a tool on a digital humanities center website.<br />

Nguyen and Shilton granted that a larger research study determining how easy it is for users to find<br />

digital tools using existing search engines and metadata structures would be very useful. In fact, the<br />

difficulty scholars have in finding relevant digital tools was recognized by the report of a 2009<br />

workshop (Cohen et al. 2009) sponsored by the NSF, the NEH, and the Institute of Museum and<br />

Library Services (IMLS) that investigated what would be required to create an infrastructure for digital<br />

tools that could then support “data-driven scholarship.” 679 Nguyen and Shilton developed an<br />

evaluation framework to assess the strength of each tool in terms of its ease of identification, “feature,<br />

display, and access,” the clarity of documentation or description, and ease of operation. The<br />

effectiveness or technical performance of tools was not evaluated. Of the 39 tools evaluated, only<br />

seven received high marks, and among the highest-scoring tools were Zotero 680 and<br />

Omeka, 681 both created by the Center for History and New Media 682 at George Mason University, and<br />

both of which have extensive documentation, technical support, and devoted user communities. One<br />

feature of the highest-rated tools was their choice of words for “feature and display” that distinguished<br />

them as actual tools, and all tools fared better on variables that measured “ease of access” than on<br />

those that measured “clarity of use.” Nguyen and Shilton offered seven useful recommendations in<br />

terms of best practices for future digital tool designers: highlight tools more prominently on websites;<br />

offer a specific description of the tool’s purpose and intended audience; make previews available (e.g.,<br />

screenshots, tutorials, demos); provide technical support (FAQs, e-mail address); clearly state the<br />

678 The research conducted by Lilly Nguyen and Katie Shilton, “Tools for Humanists,” was part of a larger research study of digital humanities centers by<br />

Diane Zorich (2008).<br />

679 An earlier workshop on the need to define the digital tools that humanists used and how to make them interoperable was held at the University of<br />

Virginia in 2005 (Frischer et al. 2005).<br />

680 http://www.zotero.org/<br />

681 http://omeka.org/<br />

682 http://chnm.gmu.edu/



technical requirements for use or download; provide easy-to-use instructions on how to download a<br />

tool or interact with it (e.g., if the tool is embedded in a Web browser); and, perhaps most important, plan<br />

for the sustainability of a tool.<br />

Further research by Shilton (2009) explored the sustainability of digital tools in terms of the<br />

institutional support that existed to sustain the 39 tools identified in the previous study. Shilton<br />

proposed two new metrics: “longevity of support,” measured by the date a tool was established or other<br />

versioning information, and “support for tool,” defined as the level of technical support (e.g., number<br />

of updates, release timelines, open standards, long-term funding) provided for a tool. While<br />

acknowledging that infrastructure is a very broad term, Shilton explained that her report focused on the<br />

“aspects of infrastructure” that could be evaluated by examining tools’ public websites. Further<br />

research, she argued, should consider the more “subtle and intangible” aspects of infrastructure, such<br />

as “human capital, dedication and institutional context.” Another important dimension that she argued<br />

should be explored was the utility of a tool to humanities research.<br />

Using the above metrics, Shilton analyzed those of the 39 tools initially identified in the earlier<br />

report that still existed, and found that most of the tools that were rated highly in the<br />

first project also scored highly in terms of sustainability. “The findings suggest that accessibility of<br />

tools and the quality of their supporting infrastructure,” Shilton observed, “are, in fact, correlated. A<br />

successful combination of accessibility, longevity and support add to the value of a tool for<br />

researchers” (Shilton 2009). While some older tools had evolved into new ones, other tools had simply<br />

been “abandoned” because of loss of “interest, time or funding.” Shilton offered a number of best<br />

practices in terms of sustainability, including website design that makes tools easy to find and indicates<br />

that they are supported, and professionalism, or viewing tools not just as one-time programming<br />

projects but as “products to support rigorous and long-term scholarship” that require both stewardship<br />

and ongoing institutional support. In agreement with Cohen et al. (2009), Shilton noted that developing<br />

a strong user community is a critical component of encouraging tool accessibility and sustainability.<br />

Shilton concluded that a seamless and invisible cyberinfrastructure of digital tools for humanists was<br />

still in its infancy. Nonetheless, she proposed that “imagining the components of a curated<br />

infrastructure is an important next step for digital humanities research.”<br />

The workshop report by Cohen et al. (2009) listed a number of components that would be required for<br />

a curated infrastructure for digital tools and delineated the problems with creating, promoting, and<br />

preserving digital tools that such an infrastructure would need to address. To begin with, this report<br />

outlined the myriad problems faced by the creators of digital tools, including the conceptualization of<br />

the tool (e.g., what type of application should be built), the ambiguous notions regarding what<br />

constitutes a tool, the failure of tool registries to gain builder participation, and the challenges of<br />

categorizing tools within taxonomies so that they can be found. Even after successful tool<br />

conceptualization, they stated, digital tool design projects still face issues with staff recruitment,<br />

community participation, and project management. In agreement with Shilton, Cohen et al. reported<br />

that much tool building did not meet acceptable levels of professionalism, which include effective planning,<br />

version control of code, communication among staff, and plans for long-term support.<br />

Another important problem Cohen et al. described was that even after a tool has been developed to the<br />

point where it can be distributed, there is a need to “attract, retain and serve users.” The issues outlined<br />

by Nguyen and Shilton regarding the findability, accessibility, and lack of transparency and<br />

documentation for tools were reiterated as barriers to building successful user communities. “In short,<br />

if concerns about the creation and production of tools has to do with the supply of new digital



methods,” Cohen et al. explained, “more has to be done on the other side of the equation: the demand<br />

for these digital methods and tools. User bases must be cultivated and are unlikely to appear naturally,<br />

and few projects do the necessary branding, marketing, and dissemination of their tool in the way that<br />

commercial software efforts do” (Cohen et al. 2009).<br />

A variety of solutions were proposed to address these and other problems, but the discussion of most<br />

attendees, according to Cohen et al. (2009), centered on cyberinfrastructure, albeit with various<br />

labels, including repository, registry, consortium, and “invisible college.” The major theme that<br />

emerged from these discussions was the need for an “economy of scale and the focusing of attention.”<br />

A list of useful features for any infrastructure to be developed was also presented, including a code<br />

depository, development-management tools (team management, wikis, bug tracking), an outreach<br />

function to explain tools and methods, a discovery function, documentation support (including<br />

encouraging standardization), the running of contests or exchanges, discipline-specific code<br />

“cookbooks” and reviews, support to develop and conduct training seminars, and resources to lobby<br />

for tool development and open access to content. Discussion of these features and a “draft strawman”<br />

request for proposals for a tool infrastructure that were circulated at this meeting raised a number of<br />

important issues, Cohen et al. stated, the most important of which was audience, or who exactly would<br />

use this cyberinfrastructure: end users, developers, or both? After considering the various models, the<br />

idea of the “invisible college” that focused on communities rather than “static resources” was agreed<br />

upon as a more viable solution than a repository. It was suggested that an invisible college approach<br />

might foster symposia, expert seminars, and peer review of digital tools, as well as a system of<br />

incentives and rewards for “membership” in the college.<br />

Behind all of these discussions, Cohen et al. pointed out, was the recognition that all tool building and<br />

tool use must be deeply embedded within “scholarly communities of practice,” and that such<br />

communities need to be promoted and be international in scope. The group envisioned a dynamic site<br />

similar to SourceForge that would provide (1) a “tool development environment”; (2) a “curated tools<br />

repository” that would provide peer-review mechanisms as well as support discovery of tools; and (3)<br />

functionality that supported both community building and marketing. Such a site, they concluded,<br />

might be linked to the Bamboo Project, but they also acknowledged that their vision was a complex<br />

undertaking and thus proposed several themes for which to seek funding: the promotion of sustainable<br />

and interoperable tools, the creation of and support for infrastructure, and increasing rewards for the<br />

development of digital tools.<br />

To promote sustainable and interoperable tools, they proposed funding programs to train digital<br />

humanities trainers, providing a grant opportunity for collaborative work that embedded already<br />

successful digital tools within a significant digital humanities collection, and funding grant<br />

opportunities that would make two or more significant already-existing tools interoperable. To<br />

promote the creation of infrastructure, they proposed securing grant funding to create a “shared tools<br />

development infrastructure” that should include developer tools, programming “cookbooks,” and other<br />

relevant resources. Such an infrastructure, they concluded, should not be “owned” by any individual or<br />

small group of universities but might instead be hosted by centerNET or the Alliance of Digital<br />

Humanities Organizations (ADHO). Funding for such an infrastructure should also include a salary for<br />

an experienced “project management evangelist.” In addition, they advocated funding a “curated tools<br />

repository” or a form of digital tools review site or journal, as a means both of storing at least one copy<br />

of all tools submitted for publication and providing peer-review mechanisms for evaluating those tools.<br />

Such a repository could provide discovery and recommender services, but would also require a general



editor and a strong board of respected digital humanists. Both these suggestions illustrate the<br />

importance of staffing as a key component of any infrastructure.<br />

New Evaluation and Incentive Models for Digital Scholarship and Publishing<br />

While the ACLS report argued that young scholars would need to be offered more “formal venues and<br />

opportunities for training and encouragement” (ACLS 2006) in order to successfully pursue new<br />

digital scholarship, the CSHE report on scholarly communication practices found no evidence<br />

suggesting that technologically sophisticated graduate students, postdoctoral scholars, or assistant<br />

professors were eagerly pursuing digital publishing opportunities in place of traditional venues. “In<br />

fact, as arguably the most vulnerable populations in the scholarly community,” Harley et al. observed,<br />

“one would expect them to hew to the norms of their chosen discipline, and they do” (Harley et al.<br />

2010, 9). Senior scholars who already had tenure were undertaking most of the greatest digital<br />

innovation, they discovered. The irony remained, however, that young scholars were often expected by<br />

their senior colleagues to transform professions that did not yet recognize digital scholarship.<br />

A recent article by Stephen Nichols has also criticized this lack of support for both producing and<br />

evaluating digital scholarship in the humanities:<br />

While attitudes more favourable to the needs of digital humanities projects are slowly evolving,<br />

we have yet to see a general acceptance of new approaches. Indeed, even where digital projects<br />

have been embraced, evidence suggests that attitudes from traditional or analogue scholarship<br />

continue to influence the way projects are evaluated, a practice that younger, untenured<br />

colleagues often find intimidating. At least as far as the demands of humanities credentialing<br />

are concerned, the dominion of the typewriter has yet to give way to that of the computer,<br />

metaphorically speaking (Nichols 2009).<br />

This criticism was confirmed by the research of the LAIRAH project (Warwick et al. 2008b), which<br />

showed that many scholars creating digital humanities projects had received little support or<br />

recognition from their institutions. Further evidence of this trend emerged from a workshop at<br />

the 2010 Modern Language Association (MLA) Conference on assessing digital scholarship in<br />

nontraditional formats, as detailed by Edmond and Schreibman (2010). When the first case study at the<br />

MLA workshop presented a digital edition of a little-known poet complete with extensive scholarly<br />

apparatus, the first comment from a workshop participant was that the creation of such an edition was<br />

not scholarship but service. This attitude was reflected throughout the CSHE case study of archaeology<br />

as well, and many archaeologists suggested that much digital work was a form of service, not<br />

scholarship. “The creation of a scholarly edition was a service activity, not a scholarly one regardless<br />

of the medium of presentation,” Edmond and Schreibman reported, adding that “the workshop<br />

facilitators immediately realized that the battle lines were far from fixed and that having a work in<br />

digital form only served to reinforce certain existing prejudices rather than allow for a widened<br />

scholarly horizon” (Edmond and Schreibman 2010).<br />

One reason that Edmond and Schreibman (2010) suggested for the lack of “trust” in digital resources<br />

as real scholarship is that such resources are still perceived by many scholars as ephemeral and<br />

transient. They proposed that the development of sustainable infrastructure for digital scholarship<br />

might help alleviate some of this distrust. In addition, Edmond and Schreibman pointed out that even if<br />

the framework for print-based scholarship was so “embedded in the academic and institutional systems<br />

that support it as to be nearly invisible,” it was still an infrastructure in itself that many scholars had<br />

long since realized was beginning to break:


One might have presumed that our non-digital colleagues might have looked to digital<br />

publication as a way out of the current difficulties; as a way of building new institutional<br />

structures to support in the first instance traditional research activities while exploring new<br />

models made possible by digital formats. But rather, the opposite has happened. There has<br />

arisen instead a bunker mentality clinging to the high old ways as assiduously as the British<br />

clung to the Raj (Edmond and Schreibman 2010).<br />

Despite the many challenges facing traditional scholarly publishing, Edmond and Schreibman sadly<br />

acknowledged that many scholars still did not see the digital publication of scholarship as offering<br />

any solution to the “print crisis.”<br />

Borgman (2009) also affirmed that neither journal nor book publishing in the humanities has<br />

rapidly embraced the digital world, for a variety of reasons, including a distrust of online dissemination<br />

and an unwillingness to try new technologies. This needs to change, Borgman argued, because the<br />

“love affair with print” endangers both traditional and digital humanities scholarship. As print-only<br />

publication continues to decrease, those who rely on it as the sole outlet for their scholarship, Borgman<br />

concluded, will be talking to an ever-smaller audience. She also proposed that digital publishing<br />

offered a number of advantages over print, including the ability to incorporate dynamic multimedia or<br />

hypermedia, the possibility of reaching larger audiences, a far shorter time to publication, possibly<br />

heightened levels of citation, and easier access to digital materials. 683<br />

In addition, Borgman stated that one key benefit of digital publishing for the humanities is that it<br />

“offers different ways of expressing ideas and presenting evidence for those ideas” (Borgman 2009).<br />

The ability of digital scholarship not only to be linked to the primary source data on which it is based<br />

but also to demonstrate different levels of scholarly certainty or to highlight the interpretative nature of<br />

humanities scholarship was a factor that many digital classicists lauded as well 684 and was considered<br />

to be an essential component of any humanities infrastructure. Indeed, a number of archaeologists<br />

interviewed in the CSHE report argued that the major reason they would not consider websites as<br />

scholarly productions for tenure reviews was that few if any websites made a formal argument or<br />

offered an interpretative analysis of the evidence they provided (Harley et al. 2010). Nonetheless,<br />

developing infrastructure that not only supports but also reflects the interpretative nature of the data it<br />

contains is a critical challenge. “The nexus between data gathering (or digitization) and interpretation,”<br />

Stuart Dunn has argued, “is the crucial issue that librarians and technical developers are faced with<br />

when planning, or otherwise engaging with, the deployment of a VRE in archaeology, or indeed in the<br />

humanities more generally” (Dunn 2009).<br />

A related issue addressed by Borgman is how to resolve several major disincentives she had identified<br />

as likely to prevent traditional humanities scholars from embracing open data and digital scholarship<br />

(Borgman 2009). Borgman stated that many humanists have various reasons for not wishing to share<br />

their data or the products of their research. These reasons include the fact that there is often far more<br />

reward for publishing papers than for releasing data, that documenting one’s data and<br />

sources for others is far more challenging than doing so just for oneself, that not sharing data and<br />

sources can at times offer a competitive advantage to establish a priority of claims, and that many<br />

scholars view data as their own intellectual property. The CSHE report described a similar “culture of<br />

ownership” among archaeologists, who were often reluctant to share data for fear of being “scooped.”<br />

683 Gabriel Bodard (2008) offered a similar list of the advantages of digital publishing for classical scholarship that was discussed earlier in this paper.<br />

684 Such as with Bodard and Garces (2009) in terms of digital editions, with Barker et al. (2010) for visualizations of historical narratives, and with<br />

Beacham and Denard (2003), Flaten (2009), and Koll et al. (2009) for 3-D archaeological models and reconstruction.


Borgman argued, however, that each of these disincentives against sharing has potential<br />

solutions. The reward structure for publishing rather than sharing is the most universal disincentive,<br />

Borgman granted, but she also argued that this environment is beginning to shift. In terms of data-documentation<br />

challenges, Borgman proposed new partnerships between humanities scholars and<br />

information professionals. For the other disincentives, Borgman recommended short- and sometimes<br />

long-term embargoes of data and publications that could protect scholars’ rights for a time while<br />

ensuring others will eventually have access to the data. Borgman acknowledged that many scholars<br />

would also like to prevent access to their sources of data until after they have published. Nonetheless,<br />

she concluded that:<br />

As data sources such as manuscripts and out-of-print books are digitized and made publicly<br />

available, individual scholars will be less able to hoard their sources. This effect of digitization<br />

on humanities scholarship has been little explored, but could be profound. Open access to<br />

sources promotes participation and collaboration, while the privacy rules of libraries and<br />

archives ensure that the identity of individuals using specific sources is not revealed (Borgman<br />

2009).<br />

Borgman ultimately proposed that any infrastructure developed for the humanities should “err toward<br />

openness” in order to advance the field more quickly.<br />

While the future of digital scholarship, even within the smaller realm of digital classics, is beyond the<br />

scope of this report, the challenges of gaining acceptance for digital classics projects and<br />

demonstrating how they in many ways can enhance traditional scholarship, as well as support new<br />

scholarship, were well illustrated by the overview of digital classics projects earlier in this report.<br />

Despite the scholarly distrust of many digital publications highlighted by Edmond and Schreibman<br />

(2010) and Borgman (2009), peer review and the vetting of data were important components of many<br />

digital projects such as SAVE, Suda On Line, Pleiades, and IDP.<br />

Challenges of Humanities Data and Digital Infrastructure<br />

“Central to the notion of cyberinfrastructure and eScience is that ‘data’ have become essential<br />

scholarly objects,” Borgman observed, “to be captured, mined, used, and reused” (Borgman 2009).<br />

Various types of data exist in both humanities and scientific research, and Borgman listed several kinds<br />

of them, including observational data (surveys), computational data (from models or simulations),<br />

experimental data (laboratory work), and records (government, business, archival). While it is this last<br />

form of data that is used most frequently by humanists, Borgman suggested, a fuller understanding of<br />

the nature of humanities data is a significant research challenge facing the designers of any<br />

cyberinfrastructure. Despite the belief that data in the sciences are very easy to define, Borgman cited<br />

research from her own experience in environmental science where there were many “differing views of<br />

data on concepts as basic as temperature” (Borgman 2009).<br />

Borgman criticized the fact that there have been no significant “social studies of humanities” that<br />

would help better define the nature of data in humanities research:<br />

Lacking an external perspective, humanities scholars need to be particularly attentive to<br />

unstated assumptions about their data, sources of evidence, and epistemology. We are only<br />

beginning to understand what constitute data in the humanities, let alone how data differ from<br />

scholar to scholar and from author to reader. As Allen Renear remarked, “in the humanities,<br />

one person’s data is another’s theory” (Borgman 2009).


This lack of deeper understanding about the nature of humanities data raises complicated questions<br />

regarding what types of data are produced, how data should be captured, and how data should be<br />

curated for reuse. Borgman also drew attention to Clifford Lynch’s dichotomy of data as raw material<br />

vs. interpretation (Lynch 2002), pointing out that it brings up two relevant issues for the digital<br />

humanities. First, raw material is far more likely to be curated than scholars’ interpretations of<br />

materials, and while it may be the nature of humanities research to constantly reinterpret sources,<br />

“what is new is the necessity of making explicit decisions about what survives for migration to new<br />

systems and formats.” Second, humanities scholars usually have little control over the intellectual<br />

property rights of the sources they use (e.g., images of manuscripts, cuneiform tablets), a factor that<br />

can make data sharing very complicated in the humanities.<br />

Another interesting comparison between the data practices of those working in the digital humanities<br />

and those working in the sciences was offered by Edmond and Schreibman (2010):<br />

If we apply a science paradigm, a digital humanities scholar could be compared to an<br />

experimental physicist, as someone who designs processes and instruments to find the answers<br />

to their research questions. But the most striking difference between the experimental humanist<br />

and the experimental physicist lies in the fate of these processes and instruments after the<br />

article on the findings they enabled has been written: they are transcended, perhaps licensed to<br />

another for further use, perhaps simply discarded. Why are we so different about our electronic<br />

data? Would it be enough for humanistic scholars as well to draw their conclusions and let it go<br />

either to be developed by someone else or to mildew? Or is there something inherently<br />

different in the nature of our data, that we should be so attached to its survival? For example,<br />

we expect to receive credit for scholarly editions—why should we not receive it for digital<br />

scholarly editions? Are the data collections created by humanists inherently more accessible<br />

and open than an experimental physicist’s algorithm or shade-tree spectroscope? (Edmond and<br />

Schreibman 2010)<br />

In addition to the unique nature of humanities data, Edmond and Schreibman agreed with Borgman<br />

that there were many challenges to data reuse even beyond the frequently cited problem of intellectual<br />

property rights. They stated that once a scholar’s colleagues and friends had looked at a digital project,<br />

the actual exploration and reuse of digital project materials was typically very low. Edmond and<br />

Schreibman speculated that the organization of a digital data set or collection might appear to be too<br />

“powerful an act of editorialism” to many scholars for them to believe more original investigation<br />

could be conducted with the same materials. They also suggested, however, that the lack of<br />

infrastructure to communicate about digital works may be greatly hindering their reuse. “Stripped of<br />

publishers’ lists, of their marketing channels and peer review and quality control systems,” they<br />

wondered, “are we failing the next generation of scholars by creating too many resources in the wild?”<br />

(Edmond and Schreibman 2010).<br />

The lack of formal dissemination and communication channels to promote digital resources is<br />

particularly problematic because of the sheer amount of potentially relevant (as well as irrelevant) data<br />

and tools that are available online. While the challenges of this data deluge are often discussed,<br />

particularly in terms of e-science, Stuart Dunn has suggested that for the digital humanities a more apt<br />

term might be the “complexity deluge”:<br />

Driven by increased availability of relatively cheap digitization technologies and the<br />

development of software tools that support both existing research tasks and wholly new ones,


the digital arts and humanities are now facing what might be termed a complexity deluge. This<br />

can be defined as the presence of a range of opportunities arising from the rate of technological<br />

change and the availability of e-infrastructure, that the mainstream academic community is not<br />

yet equipped to address its research questions with (Dunn 2009).<br />

Dunn worried that this lack of readiness on the part of academia meant that technology, rather than<br />

“research questions,” would drive the development of infrastructure agendas.<br />

The complex nature of humanities data and the challenges of building a cyberinfrastructure for these<br />

data have also been explored by Blanke et al. (2008, 2009), particularly in terms of the semantically<br />

and structurally diverse data sets that are involved and the highly contextual and qualitative nature of<br />

much of the data. “The integration of data items into arts and humanities research is non-trivial, as<br />

complicated semantics underlie the archives of human reports,” they explained, noting how<br />

“humanities data may be highly contextual, its interpretation depending on relationships to other<br />

resources and collections, which are not necessarily digital” (Blanke et al. 2009). This semantic,<br />

structural, and contextual complexity led to a number of computational problems, including the lack of<br />

formats or interfaces to make data/systems interoperable (Blanke et al. 2008). The difficulty of<br />

designing for the “fuzzy” and inconsistent nature of data in the humanities was also acknowledged by<br />

OKell et al. (2010) in their overview of creating a reusable digital learning object, and they<br />

consequently labeled the humanities as an “ill structured knowledge domain.”<br />

Similar challenges in humanities data integration were reported by the LaQuAT project when they<br />

evaluated the results of integrating Projet Volterra and the HGV (Jackson et al. 2009), and they<br />

acknowledged that there were still many limits to the automatic integration of humanities data sets.<br />

They proposed building systems that made use of human annotations to help link diverse data sets<br />

in a meaningful way:<br />

Our investigations led us to conclude that there is a need among at least some humanities<br />

researchers for tools supporting collaborative processes that involve access to and use of<br />

complex, diverse and geographically distributed data resources, including both automated<br />

processing and human manipulation, in environments where research groups (in the form of<br />

"virtual organisations"), research data and research outputs may all cross institutional<br />

boundaries and be subject to different, autonomous management regimes (Hedges 2009).<br />

The need for infrastructural solutions that deal not just with the complexities of humanities data but<br />

also with the fact that such data are often geographically distributed and can belong to different<br />

organizations with different data-management practices is further explored in the next section.<br />

“General” Humanities Infrastructures, Domain-Specific Needs, and the Needs of<br />

Humanists<br />

To return to the idea of general requirements for a humanities cyberinfrastructure, the ACLS report<br />

concluded that at a bare minimum such an infrastructure will have to be a public good, sustainable,<br />

collaborative, interoperable, and supportive of experimentation (ACLS 2006). These ideas have been<br />

supported in a variety of other recent research on how to build a general humanities<br />

cyberinfrastructure. Franciska de Jong, in a recent address to the European Chapter of the Association<br />

for Computational Linguistics (EACL), delineated similar requirements for an infrastructure that<br />

would support new “knowledge-driven workflows.” This list included the “coordination of coherent<br />

platforms (both local and international)” to support the interaction of communities and the exchange of



expertise, tools, experience, and guidelines; “infrastructural facilities” to support both researchers and<br />

NLP tool developers (citing CLARIN as a good example); open-access sources and standards;<br />

metadata schemata; best practices, and exchanges, protocols and tools; and service centers that would<br />

be able to support heavy computational processing (de Jong 2009). Her requirements go beyond those<br />

of the ACLS by also specifying several other important features: the need for a number of pilot<br />

projects between NLP researchers and humanists to test specific features of the infrastructure; flexible<br />

user interfaces that meet a variety of scholarly needs; and realistic evaluation frameworks that assess<br />

how well user needs are being met by all the components of the infrastructure.<br />

Questions of general infrastructure were also considered at a 2007 international workshop hosted by<br />

JISC and the NSF. This workshop produced a report that explored how to build an infrastructure that<br />

would support cyberscholarship across the disciplines. The report emphasized a number of necessary<br />

conditions for infrastructure, including new methods for data capture, management and preservation of<br />

digital content, coordination at the national and international levels, interdisciplinary research, and,<br />

most important, digital content that is truly “open,” or, in other words, available for computational<br />

processing and reuse. The authors of this report caution, however, that creators of cyberinfrastructure<br />

will need to understand that a single approach will not work for all disciplines while at the same time<br />

resisting the assumption that there are no standardized services to be offered across disciplines (Arms<br />

and Larsen 2007). A similar warning was given by the CSHE report on scholarly communication.<br />

“Although robust infrastructures are needed locally and beyond,” Harley et al. concluded, “the sheer<br />

diversity of scholars’ needs across the disciplines and the rapid evolution of the technologies<br />

themselves means that one-size-fits-all solutions will almost always fall short” (Harley et al. 2010).<br />

Specific advice in terms of designing VREs or infrastructures that can be widely adopted across<br />

disciplines has been given by Voss and Procter (2009). “Creating an integrated e-research experience<br />

fundamentally relies on the creation of communities of service providers, tool builders and researchers<br />

working together to develop specific support for research tasks,” Voss and Procter argued, “as well as<br />

the creation of a technical and organisational platform for integrating these tools into an overall<br />

research process.” While they argued that interdisciplinary approaches must be investigated, they also<br />

stated that any infrastructure must address the fact that social or organizational/disciplinary behaviors<br />

and technological issues are closely related.<br />

Creating an infrastructure that is both general enough to encourage wide-scale adoption and use and<br />

that can meet the needs of different disciplines at the same time is a complicated undertaking. While<br />

lauding the ACLS report in general, Stuart Dunn also warned that:<br />

… the “not only discipline-specific” aspect of cyber infrastructure expresses both its strongest<br />

appeal and its main drawback: while generating new knowledge by working across and beyond<br />

established intellectual disciplines is at the heart of “digital scholarship”, the lack of a<br />

disciplinary focus with which scholars can identify is another reason why the term VRE has not<br />

established itself (Dunn 2009).<br />

While, as Dunn observes, interdisciplinary research is at the “heart” of digital scholarship, the lack of a<br />

disciplinary focus for infrastructures can make it hard for researchers to identify them as useful for<br />

their needs and has thus limited the uptake of potential tools such as virtual research environments.<br />

Tobias Blanke has also explored how e-Science tools and methodologies (including virtual research<br />

environments) may or may not be able to transform “digital humanities” into “humanities e-Science”<br />

(Blanke 2010). One of the most successful tasks of digital humanities, Blanke noted, is using



sophisticated text encoding with markup such as that of the TEI to support text exchange between<br />

individual scholars or projects, but he also cautioned that it remained to be seen whether TEI could be<br />

similarly useful for text exchange between computational agents. Blanke uses this example to illustrate<br />

how technologies that have been used in the digital humanities will not always work<br />

for e-Science processes. One of the greatest challenges, Blanke asserted, will be to “use the experience<br />

gained in Digital Humanities to build integrated research infrastructures for humanities” (Blanke<br />

2010). Building this integrated research infrastructure starts with examining in detail the research<br />

workflows of individual humanities disciplines and their specific needs, he suggests.<br />

The challenge of moving beyond the small, ad hoc projects commonly found in the digital humanities<br />

to more systematic research that can deliver specific pieces of a larger arts and humanities<br />

infrastructure is a frequently cited problem. A related issue is how to develop a digital infrastructure<br />

that respects the individual questions and research needs of specific disciplines while also working<br />

toward more general-purpose solutions. These questions were addressed by Blanke et al. (2008), who<br />

reported on a number of grass-roots initiatives within the U.K. and Germany that allowed them to form<br />

successful partnerships with science to address common problems and to adopt new viewpoints on old<br />

questions in humanities research. While the authors argued that large-scale infrastructure in<br />

e-Humanities would be useful “mainly in the provision of data and computational resources,” they also<br />

insisted that research should neither avoid creating local solutions if necessary nor try making<br />

universal claims. They ultimately put forward the idea of “lean grids” and claimed that “a generic<br />

solution covering all research domains is likely to fail.” The authors define lean grids as “approaches<br />

which incorporate the idea behind grids to share resources while at the same time not relying on<br />

building heavy infrastructures” (Blanke et al. 2008).<br />

While Blanke et al. (2008) did see some utility in projects such as DARIAH, they also countered that<br />

the actual number of humanities and arts researchers who need grid computing is fairly small and<br />

stated furthermore that much of this research is conducted at smaller institutions that would lack the<br />

technical support necessary to use the grid even if it were available. The strongest reason they give<br />

against solely developing computational solutions for humanities computing that rely on grid<br />

technology, however, is that:<br />

… most of digital research in the arts and humanities is done in an interactive manner as a way<br />

of humans requesting resources, annotating data or running small supporting tools. The model<br />

of ‘running jobs’ on the grid is alien to such research practices. At the moment at least, large<br />

and in particular grid-based infrastructures do not support interactive behaviour well (Blanke et<br />

al. 2008).<br />

The insight that too exclusive a focus on grid computing might be counterproductive to most<br />

“conventional” types of digital humanities research is important to consider when designing a potential<br />

infrastructure for humanists.<br />

Understanding the actual research methods and processes of humanists is thus both an important and a<br />

necessary step in building any kind of resources, tools, or infrastructure that will meet their needs. Any<br />

infrastructure that is developed without an understanding of the specific research questions and types<br />

of tools used by humanists in their daily research is unlikely to be successful. Indeed, the LaQuAT<br />

project emphasized this point in terms of linking up humanities databases:



All linking-up of research databases needs to be based on a detailed understanding of what<br />

researchers could do or would want to do with such linked databases. In the future, one would<br />

need to investigate more realistic user scenarios and complex queries. More generally, there is a<br />

need to study the workflows currently used by researchers and understand how an<br />

infrastructure would contribute to or alter these (Jackson et al. 2009).<br />

The risks of developing tools or services that are too discipline-specific must also be considered, as<br />

illustrated by a recent discussion of e-science, the humanities, and digital classics:<br />

We need disciplinary centers: classicists, for example, have their own specialized needs that<br />

involve the languages on which they focus. At the same time, we cannot have a flat<br />

organization, with each discipline managing its own infrastructure. A relatively large<br />

humanities discipline such as classics might be able to support its own unique systems, but that<br />

would only condemn us to an underfunded infrastructure that we could not sustain over time<br />

(Crane, Babeu, and Bamman 2007).<br />

These authors argue that while there will always be a need for specific disciplinary centers and<br />

projects, digital humanities developers must be careful not to only develop isolated tools or systems<br />

that cannot “plug in” to a larger system or infrastructure.<br />

Nonetheless, Geoffrey Rockwell has argued that highly specific tool development has a larger place<br />

within cyberinfrastructure for the humanities, since many digital tools are reinvented as sources in the<br />

humanities are continuously reinterpreted:<br />

Tools are not used to extract meaning according to objective principles. In the humanities we<br />

reinvent ways of making meaning within traditions. We are in the maintenance by reinvention<br />

and reinterpretation business and we don’t want our methods and tools to become invisible as<br />

they are part of the research. To shift tool development from researchers to infrastructure<br />

providers is to direct the attention of humanities research away and to surrender some of the<br />

research independence we value. To shift the boundary that defines what is legitimate research<br />

and what isn’t is something humanists should care passionately about and resist where it<br />

constrains inquiry (Rockwell 2010).<br />

While Rockwell understood the frustration of funders with the “plodding iterative ways of the<br />

humanities,” he concluded that, rather than suspending the development of new digital tools,<br />

humanists (digital or otherwise) would need to do a better job of “explaining the value of<br />

interpretation.”<br />

Beyond the viability of common digital tools, Brown and Greengrass (2010) have argued that a single<br />

monolithic repository, VRE, or portal structure that is not designed to support customization is likely to<br />

meet with failure:<br />

The breadth of resources required to service the needs of such a heterogeneous community is<br />

unlikely to be encompassed by any single repository, or even a small cluster of major<br />

repositories. An access portal therefore needs to be ‘customisable’ to create links to and feeds<br />

from valued and commonly used sources (Brown and Greengrass 2010).<br />

As with infrastructure for digital classics, while there are certainly common problems to be solved and<br />

tools to be developed that will work across humanities disciplines, there will still likely always be<br />

some need to customize tools and resources for different disciplines. Indeed, a recently released report


235<br />

for the ARL regard<str<strong>on</strong>g>in</str<strong>on</strong>g>g the development of services for digital repositories c<strong>on</strong>cluded that research<br />

libraries would need to attend to the “dem<strong>and</strong> side” or the specific needs of different discipl<str<strong>on</strong>g>in</str<strong>on</strong>g>ary user<br />

groups, for “digital repositories are as much about users as they are about c<strong>on</strong>tent, so the development<br />

of high-value repository services requires underst<strong>and</strong><str<strong>on</strong>g>in</str<strong>on</strong>g>g user needs <strong>and</strong> capabilities” (ARL 2009b).<br />

The report went even further, urg<str<strong>on</strong>g>in</str<strong>on</strong>g>g that “rather than develop<str<strong>on</strong>g>in</str<strong>on</strong>g>g technologies <strong>and</strong> hop<str<strong>on</strong>g>in</str<strong>on</strong>g>g they will be<br />

usefully applied, libraries need more data, <strong>and</strong> discipl<str<strong>on</strong>g>in</str<strong>on</strong>g>e-specific data, <strong>on</strong> how a wide range of service<br />

c<strong>on</strong>sumers—<str<strong>on</strong>g>in</str<strong>on</strong>g>stituti<strong>on</strong>s, libraries, scholars, <strong>and</strong> researchers—value services <strong>and</strong> want to use<br />

c<strong>on</strong>tent.” 685 Thus, underst<strong>and</strong><str<strong>on</strong>g>in</str<strong>on</strong>g>g the specific needs, research methods, <strong>and</strong> <str<strong>on</strong>g>in</str<strong>on</strong>g>formati<strong>on</strong> habits of the<br />

different humanities discipl<str<strong>on</strong>g>in</str<strong>on</strong>g>es will be an essential part of design<str<strong>on</strong>g>in</str<strong>on</strong>g>g any larger humanities<br />

cyber<str<strong>on</strong>g>in</str<strong>on</strong>g>frastructure.<br />

This type of user-modeling work is currently being conducted by the DARIAH project, as explained by Benardou et al. (2010a). As part of the preparation stage of DARIAH, a “conceptual model for scholarly research activity” is being created that is based on cultural-historical activity theory and is being expressed using the CIDOC-CRM. A two-pronged research program is being conducted by the Digital Curation Unit-IMIS of the Athena Research Centre that includes (1) an “empirical study of scholarly work” that will be based on the transcription and conceptual encoding of interviews with scholars (23 European arts and humanities researchers) who can be considered “mainstream” users of digital resources, and (2) the creation of a “scholarly research activity model” that is based on an event-centric approach and that will be used to formalize the results of the empirical study into an actual systems model. After reviewing the extensive body of literature regarding scholarly information behavior and identifying a number of common processes across the studies (e.g., citation chaining, browsing, gathering, reading, searching, verifying), Benardou et al. proposed that one issue in terms of their own work was that all of these earlier studies present models that “view information behavior primarily as process; consequently, the world of information objects, data and documents, remains in them as a rule implicit” (Benardou et al. 2010a).<br />

Benardou et al. identified a number of other limitations with the current literature regarding scholarly research activity, including that it (1) focused predominantly on the practice of “information seeking” rather than the whole “lifecycle of scholarly information use,” a point also made earlier by Toms and O’Brien (2008); (2) concentrated primarily on the use of scholarly objects such as research publications from a library perspective and only “implicitly” on the use of primary evidence and secondary archives (a major research component of many humanities disciplines); (3) privileged the information-seeking process over “object modeling”; (4) delineated a broad number of research activities and processes (e.g., scholarly primitives) but never attempted to formally define these entities or the relationships between them in a model of the research process; and (5) typically never went beyond “explanatory schematizations.” For these reasons, Benardou et al. stated that their work objective would be to create a formal schematic of the research process, or basically to:<br />

… establish a conceptually sound, pertinent with regard to actual scholarly practice, and elegant model of scholarly research activity, encompassing both “object” (structure) and “process/practice” (functional) perspectives, and amenable to operationalisation as a tool for: • structuring and analysing the outcomes of evidence-based research on scholarly practice and requirements and • producing clear and pertinent information requirements, and specifications of architecture, tools and services for scholarly research in a digital environment (Benardou et al. 2010a).<br />

685 Research into data curation and digital repositories by Martinez-Uribe and Macdonald (2009), where the authors interviewed life science researchers regarding their research practices and their likelihood of using a shared and open data repository, also stressed the importance of investigating actual user requirements for a digital repository and emphasized that disciplinary differences existed in terms of a desire for open data or digital archiving.<br />

As a basis for their model, Benardou et al. used the cultural-historical activity theory of Leont’ev, whose key concept is activity, or the purposeful interaction of a subject with the world. Using the activity theory framework, they defined scholarly research as a “purposeful process” that is carried out by actors (whether individuals or groups) using specific methods. Research processes were broken into simpler tasks, each of which could then be operationalized into specific procedures. Benardou et al. (2010a) cautioned, however, that these research processes and their corresponding procedures should be considered to have a normative character and “convey what is believed by a community of practitioners to be good practice at any given time.”<br />

The CIDOC-CRM ontology was chosen to formalize their model of the research process, and three key entities were defined: physical objects (i.e., objects that are found, stored, and analyzed), conceptual objects (e.g., concepts that have been created, logical propositions), and information objects (e.g., conceptual objects that have “corresponding physical information carriers”). In sum, Benardou et al. explained that “the information objects are the contents of digital repositories; the physical objects are the original domain material; and the conceptual objects are the content of scientific theories.” Their current research, however, focused exclusively on the relationship between information and conceptual objects. Another major entity that was defined was Research Activity, and this was used as the basic “construct for representing research processes.” This entity is typically associated with other entities, such as Procedure, which is in turn related to the Methods that are employed and the Tools or Services it requires. A special Proposition entity was also created that represents all hypotheses that are formulated or arguments that are made.<br />
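The entities described above can be pictured as a small set of linked record types. The following is only a minimal sketch in Python, not the authors' formalization: the class and field names are hypothetical and are not CIDOC-CRM identifiers, treating Proposition as a kind of conceptual object is an assumption drawn from the description of conceptual objects as including logical propositions, and the example data at the end is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhysicalObject:
    """Original domain material that is found, stored, and analyzed."""
    label: str


@dataclass
class ConceptualObject:
    """A created concept or logical proposition."""
    label: str


@dataclass
class InformationObject(ConceptualObject):
    """A conceptual object that has corresponding physical information carriers."""
    carriers: List[PhysicalObject] = field(default_factory=list)


@dataclass
class Proposition(ConceptualObject):
    """A hypothesis that is formulated or an argument that is made (assumed subtype)."""


@dataclass
class Procedure:
    """A procedure, related to the methods employed and the tools or services required."""
    name: str
    methods: List[str] = field(default_factory=list)
    tools_or_services: List[str] = field(default_factory=list)


@dataclass
class ResearchActivity:
    """Basic construct for representing a research process."""
    goal: str
    actors: List[str]
    procedures: List[Procedure] = field(default_factory=list)
    uses: List[InformationObject] = field(default_factory=list)
    produces: List[Proposition] = field(default_factory=list)

    def required_services(self) -> List[str]:
        """List each tool or service needed by the procedures, without duplicates."""
        seen: List[str] = []
        for procedure in self.procedures:
            for service in procedure.tools_or_services:
                if service not in seen:
                    seen.append(service)
        return seen


# Invented example data, for illustration only.
tablet = PhysicalObject("wooden writing tablet")
text = InformationObject("text inscribed on the tablet", carriers=[tablet])
reading = Proposition("proposed reading of line 3")
activity = ResearchActivity(
    goal="establish the text of the tablet",
    actors=["papyrologist"],
    procedures=[
        Procedure("enhance image", ["image processing"], ["image enhancement service"]),
        Procedure("transcribe", ["palaeographic analysis"],
                  ["image enhancement service", "transcription editor"]),
    ],
    uses=[text],
    produces=[reading],
)
print(activity.required_services())
```

Deriving the list of required services from the procedures, rather than storing it separately, mirrors the role that services play in the model as mediators between methods, procedures, and information repositories.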

As discussed earlier in this report, services are a key component of cyberinfrastructure, and indeed Benardou et al. reported that services formed an essential part of their ontological model of scholarly research. “Services thus become an important mediator between methods, procedures and information repositories,” they explained. “From a functional perspective, affordances of digital scholarship are embodied in services available. From a teleological and methodological perspective, services evolve to better meet requirements” (Benardou et al. 2010a). The authors concluded that both their own empirical study and those in their extensive literature review have provided the “necessary substantiation on primitives” that, along with an “elaboration of research goals,” enables the development of a research model that is specific enough to develop appropriate digital services. The next stages of their research will be to operationalize this model and to tag all of the scholarly interview transcripts in terms of this model to validate its soundness. 686<br />

While the majority of research considered in this review stressed that developers should conduct needs assessment before designing tools and should carry out user testing of prototypes or final products, Cohen et al. (2009) have pointed out that such testing might not always yield the desired results. They suggested that it remained an open question whether “researchers or content communities can accurately assess what they need ahead of time, or whether they are biased toward repeating modes and interfaces they have already seen but which will be less effective at digital scholarship than more innovative software.” The ways in which scholars’ current understanding of technology and comfort levels with certain types of interfaces may predispose them against new methodologies are worth considering in the design of any infrastructure.<br />

686 The initial results of this work interviewing scholars and tagging transcripts have recently been published (Benardou et al. 2010b).



Virtual Research Environments in the Humanities: A Way to Address Domain-Specific Needs<br />

A variety of research has emphasized the development of virtual research environments, or VREs, for the humanities as one potentially useful building block for larger cyberinfrastructure. 687 Blanke (2010) promoted the idea of a humanities VRE that “would bring together several Digital Humanities applications into an integrated infrastructure to support the complete life cycle of humanities research.” One useful definition of VREs has been offered by Michael Fraser:<br />

Virtual research environments (VREs), as one hopes the name suggests, comprise digital infrastructure and services which enable research to take place. The idea of a VRE, which in this context includes cyberinfrastructure and e-infrastructure, arises from and remains intrinsically linked with, the development of e-science. The VRE helps to broaden the popular definition of e-science from grid-based distributed computing for scientists with huge amounts of data to the development of online tools, content, and middleware within a coherent framework for all disciplines and all types of research (Fraser 2005).<br />

Fraser suggested looking at VREs as one component of a digital infrastructure, rather than as standalone software, into which the user could plug tools and resources. In fact, he argued that in some ways the terms VRE and cyberinfrastructure could almost be synonymous, with the one difference being that “the VRE presents a holistic view of the context in which research takes place whereas e-infrastructure focuses on the core, shared services over which the VRE is expected to operate” (Fraser 2005). Fraser also stated that VREs are intended in general to be both collaborative and multidisciplinary.<br />

On the other hand, Voss and Procter have recently criticized a number of VRE projects because of their overly specific or overly generic architectures and an overall lack of interoperability. “VREs that have been built to date tend to be either specific configurations for particular research projects or systems serving very generic functions,” Voss and Procter concluded, adding that “the technologies used to build VREs also differ widely, leading to significant fragmentation and lack of interoperability” (Voss and Procter 2009). To promote greater interoperability, Voss and Procter suggested identifying common features that would be required from a generic infrastructure across disciplines by exploring the research life cycle. They identified the following functionalities as common to research environments for almost all disciplines: authenticate, communicate and collaborate, transfer data, configure a resource, invoke computation, reuse data, give credit, archive output, publish outputs (formally and informally), discover resources, monitor resources, maintain awareness, ensure data provenance, and provide authorization. In addition, Voss and Procter identified a number of research challenges that would need to be further investigated before the successful deployment of VREs. These challenges included understanding the factors that influence the adoption of VREs, learning how they are used differently in various disciplines, and considering their implications for scholarly communications.<br />
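The list of common functionalities identified by Voss and Procter can serve as a simple checklist against which a given VRE is compared. The sketch below, in Python, is an illustration only: the enumeration names, the helper function `missing_functions`, and the example VRE are hypothetical and do not come from Voss and Procter (2009), who give only the list of functionalities.

```python
from enum import Enum
from typing import Iterable, Set


class VreFunction(Enum):
    """Functionalities reported as common to research environments in most disciplines."""
    AUTHENTICATE = "authenticate"
    COMMUNICATE_AND_COLLABORATE = "communicate and collaborate"
    TRANSFER_DATA = "transfer data"
    CONFIGURE_RESOURCE = "configure a resource"
    INVOKE_COMPUTATION = "invoke computation"
    REUSE_DATA = "reuse data"
    GIVE_CREDIT = "give credit"
    ARCHIVE_OUTPUT = "archive output"
    PUBLISH_OUTPUTS = "publish outputs"
    DISCOVER_RESOURCES = "discover resources"
    MONITOR_RESOURCES = "monitor resources"
    MAINTAIN_AWARENESS = "maintain awareness"
    ENSURE_DATA_PROVENANCE = "ensure data provenance"
    PROVIDE_AUTHORIZATION = "provide authorization"


def missing_functions(offered: Iterable[VreFunction]) -> Set[VreFunction]:
    """Return the common functionalities that a VRE does not yet offer."""
    return set(VreFunction) - set(offered)


# Hypothetical project-specific VRE that covers only a few of the functions.
project_vre = [
    VreFunction.AUTHENTICATE,
    VreFunction.TRANSFER_DATA,
    VreFunction.INVOKE_COMPUTATION,
]
gaps = missing_functions(project_vre)
print(sorted(function.value for function in gaps))
```

A checklist of this kind makes the gap between a project-specific configuration and a generic infrastructure explicit, which is the comparison Voss and Procter propose when exploring the research life cycle.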

The project of integrating eSAD with the VRE-SDM (VRE for the Study of Documents and Manuscripts) provides a useful example of a possible VRE for classics. eSAD had independently developed a number of image-processing algorithms for scholars working with ancient documents, and these were consequently “offered as functionalities wrapped in one or several web-services and presented to the user in a portlet in the VRE-SDM application” (Wallom et al. 2009). The authors also reported that eSAD was developing a knowledge base that could be implemented in the same portlet or another one. Before these algorithms were implemented in the VRE-SDM, they were difficult to access and required more processing power than many single systems possessed. Although developing an interface to connect the portal with the National Grid Service (NGS) ran into some complications, Wallom et al. concluded that this project had been a success, for they had managed to create a “build and installation configuration toolkit” that could be used by the eSAD researchers to distribute the algorithms they had created onto the NGS.<br />

687 JISC has recently released an extensive study that explores the role of VREs internationally in supporting collaborative research both within and across disciplines (Carusi and Reimer 2010).<br />

The VRE-SDM project grew out of the “Building a VRE for the Humanities” (BVREH) project 688 at Oxford University that was completed in September 2006. This project surveyed the use of digital technologies through a scoping study at Oxford and sought to create a set of priorities for developing infrastructure for humanities research. The researchers conducted a series of semistructured interviews with humanities professors in order to create a set of scenarios describing typical researchers (Pybus and Kirkham 2009). The survey revealed a number of insights and demonstrated that:<br />

… the overall priorities of most interviewees concerned central hosting and curation of the digital components of research projects; and potential for a VRE to facilitate communications. The latter included the dissemination of results from projects, event notification, registers of research interests, collaboration beyond institutional boundaries, and the promotion of humanities research at Oxford more generally. The cross-searching of distributed databases, project management tools, and directory services for hardware, software and other types of electronic resources were also noted as important requirements (Fraser 2005).<br />

The researchers found that focusing on humanities professors’ current research and asking them to describe a “day in the life” of their work was a very effective interviewing approach that not only gave professors a sense of ownership in the process but also allowed documented user needs, rather than technology, to determine what demonstrators they developed.<br />

After the survey was completed, the BVREH team used these common priorities as the basis of a workshop for which they developed four demonstrators as standards-compliant portlets. One of these demonstrators was the “Virtual Workspace for the Study of Ancient Documents (VWSAW),” which, as stated above, evolved into the VRE-SDM. The research of the BVREH project that built on this preliminary survey illustrated how conducting an analysis of the humanities research environment allowed them to map “existing research tools to specific components of the research life cycle” (Fraser 2005). Although technical development was not an end goal of this project, the survey still allowed them to make informed decisions regarding future infrastructure choices. While the BVREH team reported that iterative development was very important, their most important recommendation was that VRE developers attend meetings with their intended users so they could understand both the type of research their users conduct and the kind of materials they use (Bowman et al. 2010).

The experience of the BVREH project confirms the advice given by Voss and Procter (2009) on building a VRE: infrastructure designers need to consider the research methods of their intended users as well as their larger social and organizational context. “As the name virtual research environment implies,” Voss and Procter explain, “the aim is not to build single, monolithic systems but rather socio-technical configurations of different tools that can be assembled to suit the researchers’ needs without much effort, working within organisational, community and wider societal context” (Voss and Procter 2009). In other words, creators of infrastructure should not seek to build a universal monolithic system that can meet all possible needs but instead design extendable configurations of both general and domain-specific resources and tools that users can adapt to meet their own research needs across the disciplines. While technological challenges remain, the sociological dimensions of infrastructure decisions also need to be considered.

688 http://bvreh.humanities.ox.ac.uk/

Despite the success of the VRE-SDM project, Stuart Dunn has cautioned that implementing VREs in the arts and humanities will always be more complicated than in the sciences because of the “fuzzy” nature of research practices in the humanities (Dunn 2009). He argued that Google Earth, 689 although a commonly used tool and considered by some to be a sample VRE, should instead be looked at as a component of a VRE. A successful VRE in the humanities, Dunn stated, will have to meet a number of requirements, including support for authentication so that scholars can record their methodology or publish how they researched their conclusions or created visualizations. In addition, users must have control, or at the very least knowledge, of how or whether their data are “preserved, accessed and stored.” Finally, a VRE for the humanities must have a clearly defined research purpose. Dunn ultimately proposed that Pleiades, rather than Google Earth, be considered as a sample humanities VRE, since the former “has a definable presence, it allows only authenticated users to contribute to its knowledge base, and there is a quantifiable and coherent research purpose” (Dunn 2009).

Another topic that Dunn brought up was a concern documented throughout this research, namely, that digital publication is an act of interpretative scholarship that needs to make its data, methodology, and decisions transparent and to create stable and citable results that can be verified and, ideally, tested and reused. Another overriding theme illustrated by Dunn was the importance of designing technology that can support traditional or existing research practices as well as enable groundbreaking work:

The successful development and deployment of a VRE in the humanities or arts is contingent on recognizing that workflows are not scientific objects in their own right. Workflows in these disciplines are highly individual, often informal, and cannot be easily shared or reproduced. The focus of VREs in the arts and humanities should therefore be on supporting existing research practice, rather than seeking to revolutionize it (Dunn 2009).

The conclusion that VREs, or indeed any cyberinfrastructure, cannot be used for innovative research until designers better understand how they can support standard research recurs throughout this review. “Until analytical tools and services are more sophisticated, robust, transparent, and easy to use for the motivated humanities researcher,” Borgman asserted, “it will be difficult to attract a broad base of interest within the humanities community” (Borgman 2009). As Borgman convincingly argues, if the digital tools and collections already available cannot be easily used, making an argument for the greater uptake of digital humanities research will be very difficult indeed.

New Models of Scholarly Collaboration

More collaboration among humanities scholars, both within the humanities and with other disciplines, is called for repeatedly throughout the literature on digital classics and on humanities cyberinfrastructure. In their discussion of planning for cyberinfrastructure, Green and Roy (2008) concluded that unfortunately, “much of the daily activity of the humanities and social sciences is rooted in the assumption that research and publication form essentially an individual rather than a collaborative activity.” While they were certain that the future of liberal arts scholarship would be in greater collaboration, they were uncertain whether collaboration would be brought about by the creation of semantic tools that would make it easier to find and work with partners or whether it would be forced by the need to organize increasingly huge amounts of data.

689 http://earth.google.com/

The CSHE report on scholarly communication also argued that greater collaboration will be important in cyberscholarship, but stressed as well the difficulties new collaborative models face, both disciplinary and financial:

Collaborations around interdisciplinary grand challenge questions are especially complex, creating new demands for funding streams, administrative homes, sharing of resources, institutional recognition of individual scholars’ contributions, and the need for participants to learn the “languages” of the multiple contributing disciplines (Harley et al. 2010, 16).

Because of these obstacles, as well as a general resistance to change, the authors noted that there is still little joint authorship in the humanities. At the same time, Choudhury and Stinson hoped that the sheer scale of large digitization projects such as the Roman de la Rose would inspire humanists to pursue new collaborative forms of “data driven scholarship.” Similar arguments were made by Stephen Nichols, who reflected that collaborative efforts were required to engage in the new types of scholarship that could now be conducted because of ever-increasing amounts of data. “The typical digital project cannot be pursued, much less completed by the proverbial ‘solitary scholar’ familiar to us from the analogue research model,” Nichols insisted, adding that “because of the way data is acquired and then scaled, digital research rests on a basis of collaboration at many levels” (Nichols 2009). Nichols listed three levels of collaboration that would become increasingly necessary: (1) partnerships between scholars and IT professionals; (2) new, “dynamic” interactions between scholars both within the same discipline and across disciplines; and (3) collaboration between various IT professionals developing websites for scholars.

Despite many calls for collaborative scholarship, Borgman echoed the criticism, made in the CSHE report and also found in Crane, Seales, and Terras (2009), of the continuing individualistic nature of much humanities scholarship. “While the digital humanities are increasingly collaborative,” Borgman argued, “elsewhere in the humanities the image of the ‘lone scholar’ spending months or years alone in dusty archives, followed years later by the completion of a dissertation or monograph, still obtains” (Borgman 2009). She agreed with an earlier insight of Amy Friedlander (Friedlander 2009) that in order to survive, the digital humanities must move beyond large numbers of uncoordinated “boutique” projects into larger collaborative projects that can not only attract more funding but also ideally create larger sustainable platforms on which more research can be built. The need to link projects and researchers across disciplines and institutions has also been stressed by the LaQuAT project:

It is necessary to link up not only data, but also services and researchers—in the plural. Research in the humanities need no longer be an activity carried out by a single scholar, but rather by collaborating researchers interacting within an extended network of data resources, digital repositories and libraries, tools and services, and other researchers, a shared environment that facilitates and sustains collaborative scholarly processes (Hedges 2009).

In addition to linking data, services, and researchers, Borgman cited two other important issues for the success of digital humanities projects: (1) the need to move from a focus on audience to a focus on participation, be it by students, scholars, or the public; and (2) the need to pursue collaborative relationships not only across the disciplines of the humanities but with computer scientists as well. Crane, Babeu, and Bamman (2007) have echoed this second point, stressing that humanists lack the resources and the expertise to go it alone in terms of developing infrastructure:



Unlike their colleagues in the sciences, however, humanists have relatively few resources with which to develop this new infrastructure. They must therefore systematically cultivate alliances with better-funded disciplines, learning how to build on emerging infrastructure from other disciplines and, where possible, contributing to the design of a cyberinfrastructure that serves all of academia, including the humanities (Crane, Babeu, and Bamman 2007).

Scholars in the humanities thus need to learn to build relationships with colleagues in the sciences and to repurpose as many tools as possible for their own needs. Michael Fraser made similar points in his recent piece on VREs:

For the most part, it is expected that computer science will act in partnership with other disciplines to lay the foundations, integrating methods and knowledge from the relevant subject areas. Humanities scholars, for example, cannot necessarily be expected to apply tools and processes (initially developed for the e-science community) effectively to their own subjects. Better to articulate the challenges and methods and sit down with the computer scientists. This is not an alien idea for many in the humanities - there is a long history of such partnerships (Fraser 2005).

The point that humanists must both outline their needs to computer scientists and make use of their tools has also been made by Choudhury and Stinson (2007):

One of the imperatives for the humanities community is to define its own needs on a continuous basis and from that to create the specifications for and build many of its own tools. … At the same time, it will be worthwhile to discover whether new cyberinfrastructure-related tools, services, and systems from one discipline can support scientists, engineers, social scientists, and humanists in others (Choudhury and Stinson 2007).

Thus, partnerships between the humanities and computer science not only are necessary but also offer opportunities for truly interdisciplinary work. More research also needs to be conducted into how easily tools can be repurposed across disciplines.

For their part, scholars in computer science are also beginning to push harder for closer connections with the humanities. In an invited talk at the EACL, Franciska de Jong called for rebuilding old liaisons between the natural language processing community and humanists:

A crucial condition for the revival of the common playground for NLP and the humanities is the availability of representatives of communities that could use the outcome, either in the development of services to their users or as end users (de Jong 2009).

Toms and O’Brien have also recognized the need for greater collaboration between self-described e-humanists and the larger computer science community, noting that the lack of communication between these two groups had led to a limited awareness among e-humanists as to what tools were already available. “Perhaps due to the general and likely conditioned practice of not collaborating,” Toms and O’Brien suggested, “they have not sought advice or collaborated with individuals from a host of other disciplines who have created tools and technologies that could support the humanities, e.g., information retrieval, natural language processing and linguistics to name a few” (Toms and O’Brien 2008). Both communities will thus have to begin making inroads to start building a common infrastructure.



Sustainable Preservation and Curation Infrastructures for Digital Humanities

Creating a cyberinfrastructure for the humanities involves not just creating new collaborative scholarly spaces for accessing distributed content and services but also ensuring the long-term preservation, curation, and sustainability of that content. Although digital preservation has received a great deal of attention in the library community, 690 the specific challenges of preserving complicated digital humanities projects have received less attention. In her recent overview of this subject, Linda Cantara noted that most scholarship in this area seemed to assume that preservation was the responsibility of the creator of a digital project. At the same time, she observed that most humanities scholars seemed to believe that institutional or digital repositories being created by research libraries would handle all the challenges of creating the metadata necessary for both preserving and maintaining the usability of digital content (Cantara 2006). The interim report of the “Blue Ribbon Task Force on Sustainable Digital Preservation and Access” also found that there was no agreement among stakeholders as to who should be preserving digital content or who should pay for it (NSF 2008).

This disconnect between content creators and preservers has begun to be addressed only in the past few years. In 2009, a workshop entitled “Curriculum Development in Digital Humanities and Archival Studies” was held at the Archival Education and Research Institute and enabled digital humanists and archival researchers to meet and “collectively outline future directions for digital humanities research” (Buchanan 2010). Among the issues discussed were the challenges of appraising collections of digital objects and designing navigable digital libraries that were accessible to nonexperts, the importance of supporting collaboration, the continuing need for descriptive metadata to discover items on a granular level, the roles digital humanists have to play in constructing digital archives, and defining the skills needed by archivists and digital humanists. “It is crucial that the digital humanities not only refine its extant disciplinary foci, but also begin to think generally and reflexively about its own sustainability, and that of its source data,” Buchanan concluded, adding that “as the digital humanities community continues growing in the direction of data collection and curation for born-digital (and not only paper-to-digital, or “digitized”) materials, the field must begin to plan for regular surveys and monitoring of these valuable collections” (Buchanan 2010).

One of the largest-scale digital preservation research projects, which ended in early 2010, was the European Union–funded Planets (Preservation and Long-Term Access Through Networked Services). 691 According to its website, the Planets Project, which began in 2006, was created to “deliver a sustainable framework to enable long-term preservation of digital content, increasing Europe's ability to ensure access in perpetuity to its digital information.” This work included a number of activities such as providing preservation planning services, developing methodologies and tools for the characterization of digital objects, 692 creating “preservation actions” tools to “transform and emulate obsolete digital assets,” building an interoperability framework to integrate diverse tools and services across a distributed network, providing a testbed to evaluate different preservation protocols, tests and services, and overseeing a dissemination program to promote vendor takeup and user training.

690 See Ball (2010) for an overview of the tools that have been developed to assist institutional repositories in digital preservation. For an overview of digital preservation issues for libraries, see McGovern (2007).

691 http://www.planets-project.eu/

692 This research has involved determining not only the significant properties of digital objects that must be maintained to ensure long-term meaningful access but also how these objects are used by different stakeholders and in what types of environments in order to determine appropriate preservation strategies (Knight and Pennock 2008).



The Planets Project developed a Preservation Planning Tool named Plato, 693 published a large number of white papers and research publications, and released the initial Planets testbed. 694

A smaller-scale digital preservation organization that previously existed in the United Kingdom was the AHDS, which was funded between 1996 and 2008 with the purpose of providing digital preservation and distributed access to digital humanities projects created in the United Kingdom. Funding for the AHDS ended in 2008, and only the Archaeology Data Service survived. A great deal of the data, particularly the data describing digital projects and ICT methodologies, became part of the arts-humanities.net hub. Nonetheless, even when the AHDS still existed, Warwick et al. (2008b) argued that its ingestion processes for “completed” digital humanities projects, as well as those of many institutional repositories, were not sufficient for true long-term access to these resources:

The de facto solution is that individual institutions have become responsible for the electronic resources produced by their staff. However, although they may be willing to archive a static version of a resource in a repository and provide web server space, it is far more difficult for them to provide resources for active updating, since few institutional repositories have the expertise or personnel to ensure that the functionality of the resource is maintained. As a result, it seems likely that the slow decay of once functional digital resources will become more rather than less prevalent in future, at least in the case of the UK-based digital resources (Warwick et al. 2008b).

The authors also noted that older models of one-time deposit of digital data are of limited utility, since data are rarely independent of their interface, and the inability to update a resource once it has been deposited means that systems based on this model quickly grow outdated and often unusable.

The inability of libraries and traditional repositories to maintain digital resources that are constantly increasing in complexity, as well as the digital tools that are needed to use them, has also been described by Geoffrey Rockwell:

The more sophisticated digital works we create, the more there is that has to be maintained and maintained at much greater cost than just shelving a book and occasionally rebinding it. Centers and institutes get to the point that they can't do anything new because maintaining what they have done is consuming all their resources. One way to solve that problem is to convince libraries to take your digital editions, but many of us don’t have libraries with the cyberinfrastructure. Another way to deal with this is to define certain tools as cyberinfrastructure so that they are understood as things that need ongoing support by organizations funded over the long-term. If the scale is right we might even have an economy of scale so that we could all pay for a common organization to maintain the commonwealth of infrastructure … (Rockwell 2010).

Rockwell proposed that the Bamboo project might be an important first step in this direction since it is attempting to determine the common elements required for a digital research infrastructure across disciplines and then plans to develop a consortium to build and sustain these elements in a distributed manner.

693 http://www.ifs.tuwien.ac.at/dp/plato

694 https://testbed.planets-project.eu/testbed//. This testbed requires users to log in but allows them to experiment with a number of different emulation and preservation services that can be tested against real data, including external services that were not created by Planets, such as the popular JHOVE (JSTOR/Harvard Object Validation Environment) tool (http://hul.harvard.edu/jhove/).



A recent ARL report that explored the potential role of digital repository services in regard to research libraries also made a number of recommendations to address issues surrounding the systematic collection and preservation of digital projects and their long-term curation. 695 This report observed that digital repositories have begun to develop quite rapidly and are quickly becoming a “key element of research cyberinfrastructure.” In particular, the report proposed that research libraries should develop strategies to reach out to researchers and scholars who have collected or created content that may have grown beyond their ability to manage: “Where these collections are of high value, local processes are needed to migrate early digital collections into an institutionally-managed service environment” (ARL 2009b). Research by Palmer et al. (2009) also identified the cataloging, collection, and curation of digital materials, particularly the personal digital collections 696 of individual scholars, as important strategic services to be provided by research libraries in the future.

The ARL report identified a number of other key issues that research libraries would need to address and services they would need to provide to develop digital repositories that functioned as part of a larger cyberinfrastructure. At a minimum, its authors proposed that digital repositories would need to offer the following services: long-term digital preservation, ongoing migration of content, access management, dissemination of research, metadata and format management, various types of discovery tools, digital publishing, and data mining or other forms of text analysis. Along with these essential services, the report urged that research libraries should begin to develop services “around new content and old content in new forms.”

While many repositories for research libraries were originally conceived as storage services for static PDFs of formally published faculty research output, the ARL report argued that research libraries must plan for the fact that their institutions produce “large and ever-growing quantities of data, images, multimedia works, learning objects, and digital records” as well as recognize that “mass digitization has launched a new scale of digital content collecting.” This report also emphasized how digital repositories will need to be able to integrate the diverse digital content that already exists and that will continue to grow far outside of library-managed environments:

Research practices will increasingly take advantage of strategies predicated on the availability of large amounts of widely accessible, rather than isolated and sparse, data. Many primary source materials supporting humanistic investigations—large corpora of texts, collections of images, and collections of cultural materials—will be complemented by newly available and discoverable materials from disparate sources outside of library collections. To draw on content from these diverse sources, researchers will integrate use of library services and resources with funder-supported resources, commercially provided resources, and services and resources provided by other entities within the academy. Consequently, librarians will have much less control of the user experience than they currently do and will adopt more strategies that rely on collaboration with users. For instance, in areas such as curation and preservation of data, librarians will be regularly curating with, not just for, researchers (ARL 2009b).

695 The ARL has also recently released a new report that more specifically examines the role of digital curation in terms of research libraries’ digital preservation activities, with a brief section on the implications of digital curation for the digital humanities (Walters and Skinner 2011).

696 Palmer et al. have described these personal digital collections and their importance in great detail: “In the humanities, personal collections are the equivalent of finely curated special collections that have been expertly selected and controlled for quality and application. Rereading and note taking are core functions with these collections. There is considerable potential for sharing and reuse of these collections, but the provenance and context of the materials from the scholar’s research perspective is a large part of the value that would need to be retained and represented” (Palmer et al. 2009, 44). The CSHE study of faculty use of digital resources in teaching also noted the importance of personal digital collections (Harley et al. 2006a, Harley et al. 2006b).



The ARL report recognized that the massive amount of humanities materials that have been digitized within open-access projects and commercially licensed sources will be combined by researchers in new ways, and that the research library will need to develop new strategies to work with both its users and other academic units to support digital scholarship as well as preservation and curation of digital data. Abby Smith has made similar arguments in her overview of how the research library will need to evolve in the twenty-first century to meet the needs of humanities scholars in particular:

The accelerated development of digital humanities is an even more significant trend for research libraries, if only because humanists have been their primary clientele. Beyond the increasing use of quantitative research methods in the humanities, there is a growing demand by humanists to access and manipulate resources in digital form. With the primacy of “data-driven humanities,” certain humanities disciplines will eventually grow their own domain-specific information specialists. While perhaps trained as librarians or archivists, such specialists will work embedded in a department or disciplinary research center (Smith 2008).

Many scholars and librarians have also stressed the need for research libraries to expand their preservation mission to include complicated digital objects and not just scholarly research publications. 697 Michael Fraser highlighted how important it is that digital repositories be able to preserve content more complicated than PDFs in the long-term infrastructure of VREs:

Indeed, preserving the 'project', comprising data, publications, workflows and the 'grey' material of reports, notebooks and other forms of more nebulous communications is important in a research environment where much of the material is born and raised digital. The development of today's research by tomorrow's scholars depends on it (Fraser 2005).

Similarly, in their work creating a digital library of the Roman de la Rose, Choudhury and Stinson described the new dual responsibility of libraries to provide both physical and digital preservation:

For while the curation of physical codices will remain an essential role for libraries, the collection and curation of digital objects will assume greater importance for libraries of the future, and the infrastructure, budgetary priorities, and strategic plans of library organizations would do well to account for this sooner rather than later (Choudhury and Stinson 2007).

A variety of research has also proposed that new organizational structures beyond the traditional research library or institutional repository (IR) may be needed to support such digital preservation and curation activities. 698 Sayeed Choudhury has recently described how Johns Hopkins University created its IR as “a ‘gateway’ to the underlying digital archive that will support data curation as part of an evolving cyberinfrastructure featuring open, modular components” (Choudhury 2008). As Abby Smith had argued, they found that new roles were developing along with this new infrastructure, including those of “data humanists” or “data scientists.”

In a recent article in Digital Humanities Quarterly, W. A. Kretzschmar has also proposed that the only way to effectively support long-term and large-scale humanities computing projects may be to find a “stable institutional setting” for them (Kretzschmar 2009). Crane, Babeu, and Bamman (2007) have similarly argued that humanists may need to “develop new organizational structures to develop and maintain the services” on which they increasingly depend. The need to consider what types of organizations and funding will be required to maintain cyberinfrastructure in the humanities has also been explored by Geoffrey Rockwell. “When we look closely at civic infrastructure, we see that the physical infrastructure and service infrastructure are dependent on organizations for maintenance and operation,” Rockwell explained; “in fact, if it is important that infrastructure last and be open, then the organization that maintains it is more important than the item itself” (Rockwell 2010). One major problem he listed was that funding bodies typically like to build new infrastructure rather than budget for or fund its ongoing maintenance. He concluded that a realistic conception of infrastructure should include not just hardware components and “softer services” but also professionals who will be needed to operate and maintain it.

697 See Sennyey (2009) for a thorough overview of how academic libraries need to redefine their mission to meet the needs of digital scholarship.

698 For example, the Digital Curation Centre (http://www.dcc.ac.uk/) in the United Kingdom was launched in 2004 to provide advice and tools for the curation of research data. Its website contains an extensive resources directory as well as a catalog of digital curation tools.

Green and Roy (2008) also maintained that new types of arrangements and institutions would be necessary to support and preserve digital scholarship. They provided an overview of several models, including privatization, the creation of open-source models, and the creation of new types of “transinstitutional associations” such as HASTAC 699 that could reduce the risks that individual institutions need to take and start working towards the building of “discipline-based communities of practices.” Whatever solution is chosen, they conclude:

Although each of these models—privatization, open source, pay-to-have-a-say open source, and members-only or emergent transinstitutional associations—has its place in this emerging landscape, the key shift in thinking must be away from what can be done locally on an individual campus and toward how the campus can be connected to other campuses and how it can contribute to the refining of these new ways of doing scholarship and teaching (Green and Roy 2008).

To bring about this shift, they list a number of important tasks that must be undertaken: (1) a strategic investment in cyberinfrastructure that will include a move from “collection development to content curation”; (2) the development and fostering of open-access policies; (3) the promotion of “cooperation between the public and private sectors”; (4) the cultivation of leadership on all levels; (5) the encouragement of digital scholarship by creating national centers that support its growth; (6) the development and maintenance of “open standards and tools”; and finally (7) the creation of digital collections that are both extensive and reusable.

Questions of preservation and sustainability will likely remain difficult ones for the foreseeable future. One useful listing of the components of sustainability, as stated by Don Waters and recalled by Roger Bagnall, is “a product people want, a functional governance structure, and a workable financial model” (Bagnall 2010). Similarly, the ARL report stated that the multi-institutional repository HathiTrust 700 might find long-term success because it has already confronted the issue of “balancing governing and funding” (ARL 2009b). In fact, securing long-term funding, particularly for staff and basic technical infrastructure, is perhaps the critical issue that remains without many tractable solutions, as recognized by Edmond and Schreibman (2010):

How do we ensure that the interfaces, web applications, or digital objects are maintained for future use by the scholarly community? We should be devoting more resources to the development of sustainable digital scholarship rather than accepting the fragility of the structures we create due to short-term or soft funding; resources that once created are situated outside the traditional funding and institutional structures in the humanities. Moreover we need to find long-term funding strategies for supporting the personnel and resources needed for these projects despite the fact that they are more typical of a science lab than a humanities project: programmers, servers, web developers, metadata specialists, to name but the most obvious (Edmond and Schreibman 2010).

699 http://www.hastac.org/

700 The HathiTrust (http://www.hathitrust.org/) was created by the 13 universities of the Committee on Institutional Cooperation and the University of California to establish a shared digital repository for preservation and access. It includes all of the books digitized for these universities by Google Books. A number of other research libraries have joined HathiTrust, and currently the repository provides access to over six million volumes. In addition to providing preservation for and access to digitized volumes, the HathiTrust has also recently joined in a partnership with Indiana University and the University of Illinois to create the HathiTrust Research Center (HTRC), which will develop tools and cyberinfrastructure “to enable advanced computational access to the growing digital record of human knowledge” (http://newsinfo.iu.edu/news/page/normal/18245.html).

In their work with the Digital Humanities Observatory, Edmond and Schreibman stated that their organization was currently funded through 2011, with no clear business model or sustainability plan. Moving from “core funding” to piecemeal funding secured through grants, they noted, frequently “diverts staff from ‘core activities’ that the infrastructure was designed to carry out.” The authors also criticized the fact that almost exclusively project-based funding had encouraged the creation of digital silos rather than integrated resources, a trend that projects such as Bamboo, CLARIN, and DARIAH are seeking to address. Nonetheless, Edmond and Schreibman were not certain, as Peter Robinson has commented before, that such large infrastructure projects were necessarily going to be successful. “Generations of big projects, Europe’s DARIAH and Project Bamboo not excepted,” they reasoned, “seem to struggle with the notion that the right tools will turn the scholarly Sauls to Pauls, and bring them in their droves into the digital fold. Others put forward the notion that generational change will bring us along regardless of our efforts for or against changes in modes of scholarly communication” (Edmond and Schreibman 2010). But as was illustrated by the CSHE report (Harley et al. 2010), long-term changes in acceptance and sustainability of digital scholarship will likely require far more than a simple changing of the guard.

Although many humanities scholars may feel that infrastructure and technical questions are the preserve of librarians and technologists, Neel Smith has convincingly argued that questions of openness, digital infrastructure, and sustainability must be at the forefront of any humanist discussion of digital scholarship:

Humanists can with some justification feel that the dizzying pace of development in information technology leaves them little time to reflect on its application to their area of expertise. And, after all, why should they concern themselves? As long as digital scholarship ‘just works’ for their purposes, isn’t that enough? Here, as with software, the problem is that digital scholarship never ‘just’ works. The Homer Multitext project has focused on the choice of licences, and the design of data models, archival storage formats, and an architecture for network services because those decisions determine what forms our scholarly discourse can assume in a digital environment as definitively as code determines what a piece of software can accomplish (Smith 2010, 136–137).

The Homer Multitext Project chose to use open-data formats and free licenses so that their material could be both preserved and reused, two key components of sustainability. Similar arguments regarding the relationship between the ability to reuse materials and long-term sustainability have been made by Hugh Cayless (Cayless 2010b). Cayless has recently proposed that studying how ancient texts have survived may provide us with some idea of how digital objects may be preserved. As he explains, texts have typically been transmitted in four ways: accident, reuse through incorporation into other entities, republication, or replication and durability of material (e.g., stone inscriptions). Through an overview of manuscript transmission and textual criticism, Cayless detailed the varying fortunes of Virgil, Sappho, and the Res Gestae across the centuries. He then touched upon a theme illustrated throughout this review, namely, how textual criticism needs to move beyond attempts to perfectly reconstruct the “original” text of an author by correcting the “errors” of scribes found in manuscripts (Bolter 1991, Dué and Ebbott 2009). “It is clear after centuries of studying the processes by which manuscripts are transmitted,” Cayless argued, “that precise, mechanical copying was not typically the intent of those making new editions of classical works” (Cayless 2010b, 144). Cayless stated that the methods of textual criticism would need to be adapted even further in terms of digital copies and their derivative formats.

Another pressing issue identified by Cayless was that current digital rights management schemes rarely work well with digital preservation goals, for successful preservation requires the ability to distribute and migrate copies. 701 Cayless observed that Creative Commons licenses were one kind of license that dealt with varying levels of reuse or mashups. While noting that one frequently stated concern about sharing a work or permitting its reuse was that doing so would reduce an author’s ability to profit from his or her own work, he also reported that some authors had stated that they had made more money by making at least part of their work freely available online. Cayless concluded that since making more of a work freely available increases the likelihood of it being quoted and reused, it also enhances its chances of being preserved:

It seems therefore reasonable to argue that we have returned to a situation somewhat like the one that existed in the ancient world and, furthermore, that perhaps some of the processes that governed the survival of ancient works might pertain to digital media. As in ancient times, a work released into the electronic environment may be copied, quoted, reused, or resold without the originator’s having much control over what happens to it. There are legal frameworks for controlling what happens to copies of a work, but in practice they may be hard to apply or may not be worth the trouble. Some works may be licensed in such a way that there are no legal barriers to such treatment. What we have seen from the limited survey of ancient works above is that copying often provides the most promising avenue for long-term survival (Cayless 2010b, 147).

While copying does tend to reflect the motivations of the current culture and is not without its complications, Cayless also pointed out that without copying, none of the texts of Sappho would have survived.

Another tendency in digital preservation that Cayless felt was misguided was the focus on preserving the “user experience,” which he felt was typically defined as the appearance of the text on the page. “Modern printing methods are completely unsuited to representing the appearance of ancient texts,” Cayless insisted, adding that “it wouldn’t be possible to print a scroll on a modern laser printer without destroying its form. But there is absolutely no guarantee that the current standard form will be the dominant one in a hundred years” (Cayless 2010b, 147). This concentration on the “user experience,” Cayless proposed, had resulted in an emphasis on technology rather than content. For the long-term preservation of content, Cayless suggested the use of text-based markup technologies such as XML rather than document formats such as PDF:

Text-based markup technologies, on the other hand, such as XML, allow for the presentation of documents to be abstracted out to a separate set of instructions. Instead of the document being embedded in the format, the format is applied to the document. In other words, the content becomes primary again, and the appearance secondary. This type of focus is very much in keeping with the ways in which ancient documents have reached us: none of their copyists would have argued that the text’s appearance was as important as its content. The appearance will have changed every time the text was copied (Cayless 2010b, 148).

701 This point was also made earlier by Kansa et al. (2007), and Cayless cites the LOCKSS (Lots of Copies Keeps Stuff Safe) program based at Stanford University (http://lockss.stanford.edu/lockss/Home), which includes an “open source, peer-to-peer, decentralized digital preservation infrastructure” as one model to be considered.
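The idea that presentation can be “abstracted out to a separate set of instructions” can be illustrated with a minimal sketch. The Python below is illustrative only: the element names (`text`, `title`, `line`), the placeholder content, and the two rendering functions are assumptions made for this example and are not drawn from Cayless or from any particular markup standard. The content is stored once as XML, and two different presentations are applied to it without altering the content itself.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical miniature document; element names and content are placeholders.
SOURCE = """<text>
  <title>Sample title</title>
  <line n="1">First line of content</line>
  <line n="2">Second line of content</line>
</text>"""


def render_plain(root):
    """One presentation: the title followed by numbered plain-text lines."""
    out = [root.findtext("title")]
    out += [f"{ln.get('n')}. {ln.text}" for ln in root.iter("line")]
    return "\n".join(out)


def render_html(root):
    """A second presentation of the same content: an HTML ordered list."""
    items = "".join(f"<li>{html.escape(ln.text)}</li>" for ln in root.iter("line"))
    title = html.escape(root.findtext("title"))
    return f"<h1>{title}</h1><ol>{items}</ol>"


root = ET.fromstring(SOURCE)
print(render_plain(root))
print(render_html(root))
```

Either rendering function could be replaced or discarded in the future while `SOURCE` stays unchanged, which is the sense in which the content is primary and the appearance secondary.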

Thus a renewed focus on intellectual content rather than physical appearance is not only important from a digital preservation perspective, Cayless argued, but also more in keeping with how ancient texts were transmitted. In addition, Cayless stated that many ancient texts were transmitted with their commentaries, and that digital texts will need to have their own modern versions of commentaries, such as notes and annotations, preserved as well. While both PDF and Microsoft Word have inflexible annotation models, Cayless noted that XML allows for easier, more flexible text annotation.
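As a rough illustration of flexible annotation in XML, the sketch below keeps commentary in a separate XML document whose notes point at line numbers in the text. This is one possible arrangement, assumed here for illustration rather than described by Cayless; the element and attribute names (`notes`, `note`, `target`) and all content are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical text and a separate, hypothetical set of notes that refer to it.
TEXT = '<text><line n="1">First line</line><line n="2">Second line</line></text>'
NOTES = '<notes><note target="2">Placeholder comment on line 2</note></notes>'


def attach_notes(text_xml, notes_xml):
    """Pair each line of the text with any notes whose target matches its number."""
    text = ET.fromstring(text_xml)
    notes = ET.fromstring(notes_xml)
    by_target = {}
    for note in notes.iter("note"):
        by_target.setdefault(note.get("target"), []).append(note.text)
    return [
        (ln.get("n"), ln.text, by_target.get(ln.get("n"), []))
        for ln in text.iter("line")
    ]


for number, content, comments in attach_notes(TEXT, NOTES):
    print(number, content, comments)
```

Because the notes live outside the text, several independent sets of commentary could refer to the same lines, and each could be preserved or copied along with the text.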

Finally, Cayless listed five interesting pieces of advice for digital archivists that offer food for thought for any long-term infrastructure planning for the humanities. First, as the future view or use of any work cannot easily be predicted, due care must be taken for preserving a large variety of digital resources. At the same time, Cayless reiterated how “long-term survival may best be ensured by releasing copies from our control” (Cayless 2010b, 149). Second, as works have varying cycles of interest, long-term preservation must account for cycles of disinterest that could threaten the survival of a work. Cayless thus advised digital archivists to promote the use of their whole collection, including their lesser-known items. 702 Third, “self-sustaining communities of interest” may prove the most important factor in the long-term survival of works, so digital archivists should seek to help connect and facilitate communication between interested users and promote the growth of communities. Fourth, while an entire original object may not survive, the intellectual content might still be preserved (e.g., fragmentary texts or derivative works), so Cayless suggested that digital archivists should perhaps worry less about maintaining the integrity of digital objects outside of their curatorial control. Fifth, the more copies of a work that exist, the more likely it will be to survive, so digital archivists should extend efforts to obtain rights to reproduce digital resources without limitations.

Levels of Interoperability and Infrastructure

A key issue for any long-term preservation infrastructure will be the interoperability of its components, such as different digital repository platforms; diverse types of widely distributed content and heterogeneous data; and individual digital humanities applications, services, and tools. The ARL report identified the pressing need for research libraries to engage with a “larger networked environment” and stressed that digital repositories could no longer be created as isolated collections or silos and instead needed to be designed “in ways that allow them to participate in higher-level, cross-repository services” (ARL 2009b).

In addition, the report by the ARL asserted that by 2015 much technology that was once managed locally would be managed in a distributed and virtualized infrastructure such as through “cloud computing,” 703 either through collaboration within and among institutions or through contracting to

702 Similar arguments were made by Audenaert and Furuta (2010), who revealed that the audience for an original source is often the most widely forgotten actor and stated that “consequently, audiences have a significant, if indirect, hand in a work by determining what is accepted, what is copied, how it is packaged and which works survive.”

703 According to Webopedia (http://www.webopedia.com/TERM/c/cloud_computing.html), cloud computing is similar to grid computing, and it “relies on sharing computing resources rather than having local servers or personal devices to handle applications.” Cloud computing applies supercomputing power to “perform tens of trillions of computations per second, in consumer-oriented applications” and accomplishes this by networking “large groups of servers,



commercial providers. At the same time, the report noted that library standards for interoperation would be increasingly overshadowed by more general network standards. As large data sets and other massive amounts of content become available, “research will grow more reliant on the production and use of large collections of data or primary source information organized into a plethora of repositories operating at national, disciplinary, and institutional levels” (ARL 2009b). Thus for researchers of the near future, it will not necessarily be important where a particular digital object or collection lives, but it will be essential that different repositories are able to interact with each other. As the ARL report concludes, “In this environment, interoperation between repositories and service technologies will be a pressing priority” (ARL 2009b). Gregory Crane has also argued this point recently, concluding that humanists will need networks of repositories, and that the “greatest need is for networked repositories that can integrate collections and services distributed across the network in the short term and can then maintain these collections and services over decades” (Crane 2008).

The need for interoperable repository infrastructures has been addressed by Aschenbrenner et al. (2008), who have argued that for digital repositories to be successful they must become a natural and integrated part of users' daily work environments. 704 This requires shared standards, description schemas, and infrastructure, a task that, they argued, is beyond individual institutions and that will not be accomplished through the creation of “local repository islands”:

Once it is possible to compose repository components through standard interfaces, repository managers can investigate which tasks they should take on themselves and which tasks to outsource to suitable service providers. This creates a much-needed competitive market where services can be plugged in with greater ease and for less cost. … However, for those new perspectives to manifest, the members of the repository community need to move closer together and develop a collaborative agenda (Aschenbrenner et al. 2008).

They also noted that many useful opportunities already existed for promoting repository interoperability, including scalable and on-demand generic storage infrastructures, large-scale file-level processing for content analysis, more useful integration of application environments, and a variety of lightweight preservation services such as “linking to community-wide format registries and migration services.”

Romary and Armbruster (2009) have made similar arguments (with a focus on research publications) and concluded that digital repositories (thematic, geographical, or institutional) will be successful and see major uptake only if they are organized as large, central repositories (often organized by discipline, such as Arxiv.org 705 ) that can support faster dissemination, better services, and more effective preservation and digital curation. Centralized digital repositories, they also argued, will need to become part of the larger scholarly infrastructure that is developing:

The specific vision that we have advocated in this paper goes into the direction of providing scientists with digital scholarly workbenches which, through a better coordination of technical infrastructures and adapted editorial support will provide both the quality and flexibility that is

usually those with low-cost consumer PC technology, with specialized connections to spread data-processing chores across them. This shared IT infrastructure contains large pools of systems that are linked together.” Virtualization technologies are frequently used to implement cloud computing. In terms of academic projects, Fedorazon has recently explored the use of cloud computing to support digital repositories using Amazon Web Services and Fedora (Flanders 2009). A recent study by Constance Malpas of OCLC has also provided an important look at the future of cloud computing in academic libraries (Malpas 2011).

704 Recent research by Catherine Marshall has offered initial analysis into scholarly writing and archiving practices through interviews with scientists in order to determine how to make the practice of scholarly archiving a more integral part of the research process (Marshall 2008).

705 http://arxiv.org/



required for efficient scientific work. Even if we have focused here on the issue of publication repositories, which, for many reasons, lie currently at the centre of most debates, it is important to consider that this perspective is just one element within a larger set of digital scholarly services that have to be managed in a coordinated way (Romary and Armbruster 2009).

Although Romary and Armbruster have suggested that the creation of large-scale centralized digital repositories may be the best solution, the DRIVER 706 project has instead developed an infrastructure that supports federated access to over 249 individual digital repositories across Europe. Their initial research (Weenink et al. 2008, Feijen et al. 2007) identified a number of issues that needed to be addressed to create such an infrastructure, including intellectual property rights, data curation, and long-term preservation. The DRIVER project guidelines mandated a standard way for repository data to be exposed but also provided technology to harvest “content from multiple repositories and manage its transformation into a common and uniform ‘shared information space’” (Feijen et al. 2007). This shared information space provides a variety of services, including (1) “services needed to maintain it” so that data stores, indexes, and aggregators are distributed on computers owned by various organizations; (2) the ability to add services as necessary; (3) a cleaning and enhancement service that standardizes content that is harvested into DRIVER records; and (4) a search (SRW/CQL 707 ) and OAI-Publisher service that allows all DRIVER records to be used by external applications. Consequently, any repository that wishes to participate can register within the DRIVER infrastructure and have its content “extracted, ‘cleaned’, and aggregated within an information space for integrated use” (Feijen et al. 2007). Ultimately, the DRIVER project focused on a centralized infrastructure with an extendable service model:

Since the focus of DRIVER has been on developing infrastructure, it has not aimed to provide a pre-defined set of services. The infrastructure includes open, defined interfaces which allow any service providers working at a local, national or subject-based level, to build services on top. They will be able to reuse the data infrastructure (the Information Space) and the software infrastructure to build or enhance their systems. Services can therefore be developed according to the needs of users (Feijen et al. 2007).

The DRIVER project illustrates both the need for and the viability of a common infrastructure for digital preservation and data storage, one that still allows different projects to develop innovative services on top of it.
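As a rough illustration of the SRU/CQL style of search interface that DRIVER exposes to external applications, the following Python sketch builds a searchRetrieve request URL. It is not taken from DRIVER itself: the endpoint address, the function name `build_sru_url`, and the sample query are assumptions made purely for illustration, and the parameter names follow the SRU conventions as they are commonly documented.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, used only for illustration; not a real DRIVER address.
BASE_URL = "http://repository.example.org/sru"


def build_sru_url(base_url, cql_query, max_records=10, start_record=1):
    """Return an SRU searchRetrieve URL that carries a CQL query as GET parameters."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,  # the CQL expression, percent-encoded by urlencode
        "startRecord": str(start_record),
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)


# A CQL query combines index/relation/term clauses with boolean operators.
cql = 'dc.title = "digital classics" and dc.creator = "Crane"'
url = build_sru_url(BASE_URL, cql)
print(url)

# Decoding the query string again recovers the original CQL expression.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

Because the whole request is an ordinary URL, any tool that can issue an HTTP GET could in principle query such a service, which is the kind of low barrier to reuse that the shared information space is meant to offer.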

One innovative approach to supporting even more sophisticated levels of repository interoperability has been introduced by Tarrant et al. (2009). Their work used the Object Reuse and Exchange (ORE) framework 708 that was developed by the OAI to support the “description and exchange of aggregations of Web resources” and was conducted as part of the JISC-funded Preserv 2 project, 709 which sought to find a way to replicate entire institutional repositories (IRs) across any repository platform. As the OAI-ORE specification both includes approaches for describing digital objects and “facilitates access and ingest of these

706 http://www.driver-repository.eu/

707 SRW stands for “Search & Retrieve Web Service,” and according to OCLC the development of the SRW standard is part of a larger international effort to “develop a standard web-based text-searching interface” (http://www.oclc.org/research/activities/srw/default.htm); it has been built using “common web development tools” including WSDL, SOAP, HTTP and XML. A related standard is SRU, which stands for “Search/Retrieve via URL,” a “URL-based alternative to SRW” where “messages are sent via HTTP using the GET method” and SRW-SOAP components are mapped to HTTP parameters. The Library of Congress actively maintains the SRW/SRU standard (http://www.loc.gov/standards/sru/). CQL stands for “contextual query language” (http://www.loc.gov/standards/sru/specs/cql.html), and it has been developed as a both human-writable and machine-readable “formal language for representing queries to information retrieval systems such as web indexes, bibliographic catalogs and museum collection information.” It is used by SRU as its standard query syntax.

708 http://www.openarchives.org/ore/; for more on the development of OAI-ORE, see Van de Sompel and Lagoze (2007).

709 http://www.preserv.org.uk/



representations beyond the borders of hosting repositories,” Tarrant et al. decided to see if it could be used to support a new level of cross-repository services. OAI-ORE was originally developed with the idea of creating descriptions of aggregations of digital objects (e.g., individual PDFs that are chapters of a book) and the relationships between them that could be used by any digital repository platform. The OAI-ORE specification uses the concepts of Aggregations and Aggregated Resources, where an Aggregation represents a set of Aggregated Resources, with each resource and the Aggregation itself being represented by URIs. Tarrant et al. explained that in a sample OAI-ORE implementation the highest-level Aggregation could be the repository itself; it could contain various Aggregated Resources (e.g., research publications), each of which in turn could contain its own Aggregated Resources (e.g., an associated dataset, an image, etc.). Resource Maps are used to describe individual Aggregations (and each can describe only one Aggregation), and each Resource Map must have a unique URI but can also make use of various namespaces and metadata schemas. OAI-ORE models can be represented either in RDF/XML or the Atom syndication format.
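The Aggregation and Resource Map concepts described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used by Tarrant et al.: the example.org URIs and the function name `resource_map_xml` are hypothetical, and the sketch leaves out the descriptive metadata about the Resource Map itself that a complete Resource Map would also carry. It relies only on the standard library to serialize the description as RDF/XML.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ORE_NS = "http://www.openarchives.org/ore/terms/"

# Use readable prefixes in the serialized XML.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("ore", ORE_NS)


def _describe(parent, subject_uri, type_uri):
    """Add an rdf:Description for subject_uri with the given rdf:type."""
    node = ET.SubElement(
        parent, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject_uri}
    )
    ET.SubElement(node, f"{{{RDF_NS}}}type", {f"{{{RDF_NS}}}resource": type_uri})
    return node


def resource_map_xml(rem_uri, aggregation_uri, resource_uris):
    """Serialize one Resource Map that describes exactly one Aggregation."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")

    # The Resource Map has its own URI and points at the Aggregation it describes.
    rem = _describe(root, rem_uri, ORE_NS + "ResourceMap")
    ET.SubElement(rem, f"{{{ORE_NS}}}describes", {f"{{{RDF_NS}}}resource": aggregation_uri})

    # The Aggregation lists each Aggregated Resource by URI.
    agg = _describe(root, aggregation_uri, ORE_NS + "Aggregation")
    for uri in resource_uris:
        ET.SubElement(agg, f"{{{ORE_NS}}}aggregates", {f"{{{RDF_NS}}}resource": uri})

    return ET.tostring(root, encoding="unicode")


# Hypothetical example: a book Aggregation whose chapters are individual PDFs.
print(
    resource_map_xml(
        "http://example.org/rem/book1.rdf",
        "http://example.org/aggregation/book1",
        ["http://example.org/book1/ch1.pdf", "http://example.org/book1/ch2.pdf"],
    )
)
```

Nesting, as in the repository-level example given by Tarrant et al., would follow the same pattern: an Aggregated Resource's URI could itself identify another Aggregation with its own Resource Map.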

The solution Tarrant et al. (2009) implemented made use of OAI-ORE with various extensions (e.g., writing export and import plug-ins, creating an individual application to create resource maps from digital objects stored in Fedora using RDFLib) 710 to replicate all digital objects in two different repositories, including their metadata and object history data. It enabled them to execute a lossless transfer of all digital objects from a Fedora archive to an EPrints archive and vice versa. 711 By representing repository content through OAI-ORE Resource Maps, they proposed that many different digital repository platforms could then be used to provide access to the same content, which would enable an important “transformation from repositories-as-software to a services-based conception.” OAI-ORE also specifies a number of different import and export interfaces that support the exchange and reuse of digital objects, and Tarrant et al. (2009) believed that this feature could greatly aid in digital preservation by enabling simpler migration of objects between platforms:

OAI-ORE provides another tool to help repository managers tackle the problem of long-term preservation, providing a simple model and protocol for expressing objects so they can be exchanged and re-used. In future we hope to see OAI-ORE being used at the lowest level within a repository, the storage layer. Binding objects in this manner would allow the construction of a layered repository where the core is the storage and binding and all other software and services sit on top of this layer (Tarrant et al. 2009).

The use of OAI-ORE illustrates one data model that might be used by different digital classics or digital humanities repositories that wish to support not just the reuse of their objects in various digital applications but also their replication across different digital repository platforms.

Johns Hopkins University used OAI-ORE to represent various data aggregations in an astronomical data case study. Their data model included Resource Maps that represented Aggregations that contained multiple digital objects as well as objects beyond the individual Aggregation that were stored in a large number of repositories. Choudhury argued that individual repositories could never be expected to include this level of data. “At some fundamental level, OAI-ORE acknowledges the realization that repositories are not an end,” Choudhury remarked, “but rather a means to participate in

710 A Python library for working with RDF, available at http://rdflib.net/.

711 Their approach also won the 2008 Common Repositories Interface Group (CRIG) challenge.



a distributed network of content and services” (Choudhury 2008). Choudhury concluded that for institutional and consequently digital repositories to be successful they would need to define themselves as one part of the larger cyberinfrastructure, not as the infrastructure.

An additional challenge for creating an interoperable infrastructure is that many individual digital humanities applications or tools are very specialized and do not support even limited interoperability. One means of addressing this problem, according to Stephen Nichols, would be to design more digital projects to be “tool-agnostic”:

Rather than creating tools specifically for a given set of material, one can make platforms tool-agnostic: meaning simply that the site is designed to accommodate varied content. The capacity of a site to host multiple projects invites collaboration among scholarly groups who would otherwise each be putting up its own separate site. This in turn will promote scholarly communication and collaboration … in short, true interoperability. Technically such a model is not difficult to achieve; the problem lies elsewhere: in convincing scholars and IT professionals to think imaginatively and proactively by creating an ‘ecumenical’ platform for their original content, i.e. one that is general in its extent and application (Nichols 2009).

Cohen et al. (2009) made similar criticisms of the limited amount of tool interoperability, noting that despite the existence of various standards and methods, most tools have been built as “one-off, standalone web applications or pieces of software.” They also listed the inability to import and export between tools and the difficulty of connecting different tools that perform the same task, citing as an example the vast proliferation of annotation software over the last five years (e.g., Co-Annotea, Pliny, Zotero). Compounding the problem, most digital tools work only with limited content collections, often because digital collections lack any way of communicating with tools, such as an API.

In terms of creating larger VREs for the humanities, Voss and Procter have similarly argued that any successful VRE will need to include both generic and repurposable components that can be used widely. They believed that most research is indeed heading in that direction: “A wide range of commoditised components and systems are available,” Voss and Procter explained, “and efforts are underway to develop interoperability frameworks to foster flexible integration to form seamless collaborative work environments” (Voss and Procter 2009).

The VRE-SDM has followed this approach by making use of open-source tools wherever possible (e.g., for annotation and for document viewing) and of an existing VRE tool, the uPortal framework 712 (Bowman et al. 2010). The project has used this framework to ensure interoperability with other VREs and virtual learning environments (VLEs), for it allows the developers to reuse portlets from other projects and also makes their own components easier to reuse. This framework can be customized by users, who can compile their own interfaces using portlets that offer the tools and services they want. The developers are also using other, newer standards (including Google gadget/OpenSocial) 713 to support the integration of various components. “This means that in the long-term the VRE will be able to provide tools to researchers across the humanities,” Bowman et al. (2010) proposed; “some, such as the viewing and annotation tools, will be relevant to the broadest range of scholars, while other more specialist tools can be added by individual users or groups as and when needed” (Bowman et al. 2010, 96). As an example, they proposed that the VRE-SDM could deploy both its own tools and those of eSAD to explore collections such as Vindolanda and the LGPN. By using standards and tools that can be reused by various digital humanities projects, the VRE-SDM hoped to encourage other projects “to present their own tools and services for reuse within the environment.” Their model of creating an architecture that is both customizable and extendable has been followed by other projects such as TextGrid and DARIAH.

712 http://www.jasig.org/uportal

713 http://code.google.com/apis/opensocial

The varying challenges of interoperability (content, metadata, software, hardware, services, tools) are being addressed by all the major digital humanities cyberinfrastructure projects, including Bamboo, CLARIN, DARIAH, and TextGrid. All of these projects are discussed in further detail below, but a brief overview of their varying approaches to interoperability is given here. The Bamboo project plans to develop a services platform that will host and deliver shared services for arts and humanities research, and these services will run on the “cloud.” The project also plans to adopt common standards and reuse other services and technology whenever possible (Kainz 2009). CLARIN is using Grid and Semantic Web technologies to ensure a full integration of services and resources and semantic interoperability, respectively. Its ultimate goal is to create a “virtual, distributed research infrastructure” through a federation of trusted digital archives that will provide resources and tools (typically through web services) and provide users with a secure log-on (Váradi et al. 2008). Similarly, DARIAH is exploring the use of Fedora and the iRODS data grid technology to create a distributed research infrastructure that is secure, customizable, and extendable (Blanke 2010).

A recent article by Aschenbrenner et al. (2010) has further examined how the DARIAH infrastructure will support the federation of various digital archives, an important task since research questions within the humanities often require materials that are stored in different locations. To support robust interaction between different agents in an open-repository environment, they have broken down “interactions for repository federation” into three layers: physical, logical, and conceptual. They have also identified six attributes of interoperability that must be addressed by any federated system: (1) digital object encoding (e.g., “byte serialization” for characters); (2) digital object syntax (“the strings and statements that can be used to express semantics,” e.g., XML grammars); (3) semantics for digital objects (“the meaning of terms and statements in a certain context,” e.g., metadata formats, controlled vocabularies, or ontologies); (4) protocols (how different intellectual entities relate to one another within an information system, e.g., OAI-ORE, the METS standard 714); (5) patterns (“identifies recurring design problems in information systems and present a well-proven generic approach for its solution, consisting of the constituent components, their responsibilities and relationships,” e.g., harvesting through OAI-PMH); and (6) architectures (“specifies the overall structure, capabilities of and interactions between system components to achieve an overall goal”) (Aschenbrenner et al. 2010).
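The harvesting pattern named under attribute (5) can be made more concrete with a short sketch. The Python below is only an illustration of OAI-PMH harvesting in general, not code from DARIAH or from Aschenbrenner et al. It assumes a repository that exposes the standard `ListRecords` verb with the `oai_dc` metadata prefix; the function names and the `fetch` callable (any function that takes a URL and returns the response body as text) are hypothetical and introduced here for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a repository's OAI-PMH base URL."""
    if resumption_token:
        # A resumptionToken is an exclusive argument: it accompanies only the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, resumption token or None) for one response page."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An absent or empty resumptionToken element marks the last page.
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


def harvest(base_url, fetch):
    """Collect record identifiers, following resumption tokens page by page.

    `fetch` is a hypothetical callable (URL -> response text) supplied by the
    caller, so the sketch does not depend on any particular HTTP library.
    """
    identifiers = []
    token = None
    while True:
        page = fetch(list_records_url(base_url, resumption_token=token))
        ids, token = parse_page(page)
        identifiers.extend(ids)
        if token is None:
            return identifiers
```

Passing the transport in as `fetch` keeps the harvesting loop separate from network access, which is one way to express the "constituent components, their responsibilities and relationships" that the pattern attribute describes.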

In their approach to interoperability, or, as they frame it more specifically, to promoting larger, federated eHumanities systems, the creators of TextGrid have suggested the creation of federated semantic service registries, that is, registries that describe the services and resources that individual digital humanities projects or systems provide so that they can be discovered and potentially reused by both users and other systems. Aschenbrenner et al. (2009) posited that “standardized descriptions of services and other resources will be a precondition for building shared, federated registries” and will have the added benefit of “enabling a central query interface.” Such a registry, they proposed, would require a domain ontology that they have preliminarily developed. Aschenbrenner et al. (2009) also observed that in the past few years what they defined as “eHumanities Digital Ecosystems” have sprung up in great numbers. “The big challenge ahead,” they concluded, “is now to see how these subsystems can begin to merge into one larger eHumanities DE while still maintaining their individual characters and strengths” (Aschenbrenner et al. 2009). Two prerequisites for such successful interoperability, they argued, are “loosely coupled services” and the “visibility of resources.” While they proposed a reference ontology for both services and documents in eHumanities, they stressed that any infrastructure design also must take user needs into account and ideally have users involved from the very beginning. “Novel infrastructure that is imposed on the user will fail,” Aschenbrenner et al. predicted; “TextGrid has domain experts as core partners in the team, and these experts are shaping issues such as standards and community-building” (Aschenbrenner et al. 2009).

714 For a useful overview of how METS and OAI-ORE differ and how they might be mapped in order to support greater levels of interoperability, see McDonough (2009).
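The idea of federated registries with standardized service descriptions and a central query interface can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and does not reproduce TextGrid's registry or its domain ontology: every class, field, and value below (`ServiceDescription`, `Registry`, `FederatedRegistry`, the `task` and `formats` fields, the example project names) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    """A standardized description shared by all participating projects (hypothetical)."""
    name: str
    provider: str                     # project that offers the service
    task: str                         # term from a shared vocabulary, e.g. "annotation"
    endpoint: str                     # address where the service can be reached
    formats: frozenset = frozenset()  # data formats the service accepts


class Registry:
    """A registry maintained independently by one project."""

    def __init__(self, project):
        self.project = project
        self._services = []

    def register(self, service):
        self._services.append(service)

    def find(self, task=None, fmt=None):
        """Return services matching the given task and/or accepted format."""
        return [
            s for s in self._services
            if (task is None or s.task == task)
            and (fmt is None or fmt in s.formats)
        ]


class FederatedRegistry:
    """A central query interface over several loosely coupled registries."""

    def __init__(self, registries):
        self.registries = list(registries)

    def find(self, task=None, fmt=None):
        # Each registry answers for itself; the federation only merges results.
        results = []
        for registry in self.registries:
            results.extend(registry.find(task, fmt))
        return results
```

Because every project describes its services with the same fields, a single call such as `FederatedRegistry([a, b]).find(task="annotation")` can search all member registries at once, while each project keeps control of its own registry.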

TextGrid thus made use of both domain experts and computer scientists in defining standards for its project, and Aschenbrenner et al. reported that TextGrid has used nothing but open standards to promote the fullest amount of interoperability. They also took into account the three interoperability layers identified by the European Interoperability Framework: “technical, semantic, and organizational” (Aschenbrenner et al. 2009). Earlier research by the TextGrid group had highlighted the challenges that both syntactic and semantic differences in humanities data sets pose for achieving meaningful data integration. “In the humanities, the major obstacle to data interoperability is syntactic and semantic heterogeneity,” Dimitriadis et al. stated, adding that “roughly speaking, it is the differences in terminology that make it so difficult to cross the boundaries and create a joint domain of language resources that can be utilized seamlessly” (Dimitriadis et al. 2006). Similar research by Shen et al. (2008) had reported that two major types of interoperability challenges for digital libraries were syntactic and semantic, with syntactic interoperability operating at the level of applications and semantic interoperability being the “knowledge-level interoperability” that allows digital libraries to be integrated and includes “the ability to bridge semantic conflicts arising from differences in implicit meanings, perspectives, and assumptions, thus creating a semantically compatible information environment.”

Although the development and use of standards to promote interoperability were called for by many projects such as TextGrid, other research, including that by the LaQuAT project, has pointed out that standards have their limits as well:

While there are a variety of standardisation activities with the aim of increasing interoperability between digital resources and enabling them to be used in combination, standardisation alone is unlikely to solve all problems related to linking up data. Humanists still have to deal with legacy data in diverse and often obsolete formats, and even when standards are used the sheer variety of data and research means that there is a great deal of flexibility in how the standards are applied. Moreover, standards are generally developed within particular disciplines or domains, whereas research is often inter-disciplinary, making use of varied materials, and incorporating data conforming to different standards. There will inevitably be diversity of representation when information is gathered together from different domains and for different purposes, and consequently there will always be a need to integrate this diversity (Hedges 2009).

Hedges argued that the realities of legacy data in the humanities, the differing application of standards, and the domain specificity of many standards necessitate the design of infrastructure solutions that can integrate diverse data. He suggested that research by the grid community on the “integration of structured information” and turning data repositories into “virtualized data resources” on a grid may allow digital repositories to hide the “heterogeneity of digital objects” from their users, rather than trying to force all data into one standard. Whatever solutions are pursued, this section has indicated that the question of interoperability exists on many levels and will present significant challenges for any infrastructure design.

The Future of Digital Humanities and Digital Scholarship

Christine Borgman started and ended her report on the future of digital humanities with five questions the community will need to consider to move forward as a discipline: “What are data? What are the infrastructure requirements? Where are the social studies of digital humanities? What is the humanities laboratory of the twenty-first century? and What is the value proposition for digital humanities in an era of declining budgets?” (Borgman 2009).

Similar questions in regard to the development of a research infrastructure for the humanities were explored at a two-day workshop held in October 2010 that was funded by the European Science Foundation (ESF). The topic was “Research Communities and Research Infrastructures in the Humanities.” 715 The ESF had established a Standing Committee for the Humanities (SCH) in 2009, and the SCH Working Group on Research Infrastructures organized the workshop to “gather different research communities’ perspectives on scholarly-driven design and use of research infrastructures in the humanities.” Scholars from all over Europe were invited to present on their research, and the workshop was divided into five major themes: (1) the development of humanities research communities and their varying levels of adoption of digital infrastructure; (2) the reuse and repurposing of data; (3) the challenges for infrastructure in needing to deal with both textual and nontextual material; (4) how digital technologies both challenge traditional disciplinary boundaries and recognition patterns and create new interdisciplinary resources; and (5) the difficulties of integrating already existing resources. The outcomes of this workshop will be used to guide the SCH Working Group’s creation of a formal ESF policy publication on humanities research infrastructure.

The CSHE report similarly outlined a number of topics that deserve further attention in order to promote digital scholarship and new forms of scholarly communication. It recommended that tenure-and-promotion practices become more nuanced and be accompanied by a reexamination of the processes of peer review. For scholarly communication to evolve, the authors concluded, business models also need to be developed that can sustain high-quality and affordable journals and monographs. Additionally, they suggested that more sophisticated models of electronic publication are needed that can “accommodate arguments of varied length, rich media, and embedded links to data” (Harley et al. 2010). Finally, they addressed the importance of “support for managing and preserving new research methods and products including components of natural language processing, visualization, complex distributed databases, and GIS, among many others” (Harley et al. 2010).

The future of the digital humanities was also addressed by a THATCamp 716 held in Paris, France, in May 2010. 717 This group issued a “Digital Humanities Manifesto” that included a definition of digital humanities and general guidelines for ensuring a successful future for the digital humanities as a whole. The guidelines called for open access to data and metadata that must be both technically well documented and interoperable; for greater and more open dissemination of research data, code, methods, and findings; for integration of digital humanities education within the larger social science and humanities curriculum (including formal degree programs in the digital humanities with

715 http://www.esf.org/research-areas/humanities/strategic-activities/research-<str<strong>on</strong>g>in</str<strong>on</strong>g>frastructures-<str<strong>on</strong>g>in</str<strong>on</strong>g>-the-humanities.html<br />

716 THATCamp is an acr<strong>on</strong>ym for “The Humanities <strong>and</strong> Technology Camp” (http://thatcamp.org/), a project that was started by the Center for History <strong>and</strong><br />

New Media (CHNM) at George Mas<strong>on</strong> University. THATCamps have been held <str<strong>on</strong>g>in</str<strong>on</strong>g> various cities <strong>and</strong> have been described as “unc<strong>on</strong>ferences” …“where<br />

humanists <strong>and</strong> technologists meet to work together for the comm<strong>on</strong> good.” An unc<strong>on</strong>ference is <strong>on</strong>e that is generally organized day-by-day by its<br />

participants accord<str<strong>on</strong>g>in</str<strong>on</strong>g>g to their <str<strong>on</strong>g>in</str<strong>on</strong>g>terests <strong>and</strong> is typically small, <str<strong>on</strong>g>in</str<strong>on</strong>g>formal, short, <str<strong>on</strong>g>in</str<strong>on</strong>g>expensive, <str<strong>on</strong>g>in</str<strong>on</strong>g>formal, <strong>and</strong> n<strong>on</strong>hierarchical, accord<str<strong>on</strong>g>in</str<strong>on</strong>g>g to the project website.<br />

717 http://www.digitalhumanities.cnrs.fr/wikis/tcp/<str<strong>on</strong>g>in</str<strong>on</strong>g>dex.phptitle=Anglais


257<br />

c<strong>on</strong>current promoti<strong>on</strong> <strong>and</strong> career opportunities); for the def<str<strong>on</strong>g>in</str<strong>on</strong>g>iti<strong>on</strong> <strong>and</strong> development of best practices<br />

that meet real discipl<str<strong>on</strong>g>in</str<strong>on</strong>g>ary needs; <strong>and</strong> for the iterative creati<strong>on</strong> of a scalable digital <str<strong>on</strong>g>in</str<strong>on</strong>g>frastructure that is<br />

based <strong>on</strong> real needs as identified by various discipl<str<strong>on</strong>g>in</str<strong>on</strong>g>ary communities.<br />

The final section of this report provides an overview of some large humanities cyberinfrastructure<br />
projects and how they are beginning to meet some of these needs.<br />

OVERVIEW OF LARGE CYBERINFRASTRUCTURE PROJECTS<br />

The past five years have seen the creation of a large number of international and national<br />
cyberinfrastructure or e-science/e-research/e-humanities projects, creating a fragmented environment<br />
that has made it challenging to determine what type of infrastructure has already been built or may yet<br />
be created. Consequently, a number of the most important projects (arts-humanities.net, 718 the<br />
Alliance of Digital Humanities Organizations [ADHO], 719 CLARIN, 720 centerNET, 721 DARIAH, 722 the<br />
Network of Expert Centres [NoC] in Great Britain and Ireland, 723 Project Bamboo, 724 and TextGrid 725)<br />
formed a coalition named CHAIN 726 in October 2009. CHAIN, or the Coalition of Humanities and<br />
Arts Infrastructures and Networks, plans to act as “a forum for areas of shared interest to its<br />
participants,” including advocating for strengthening digital infrastructure for the humanities;<br />
developing business models; promoting interoperability for resources, tools, and services; promoting<br />
best practices and technical standards; developing a “shared service infrastructure”; and widening the<br />
geographical scope of the current coalition. The various organizations had previously organized a<br />
panel at Digital Humanities 2009 to explore these same issues (Wynne et al. 2009) and met again at the<br />
2010 Digital Humanities Conference.<br />

This section provides an overview of each of these organizations as well as several other important<br />
national humanities infrastructure projects (i.e., Digital Humanities Observatory, DRIVER, TextVRE,<br />
SEASR) and the work they have done.<br />

Alliance of Digital Humanities Organizations<br />

The Alliance of Digital Humanities Organizations (ADHO) is an umbrella organization “whose goals<br />
are to promote and support digital research and teaching across arts and humanities disciplines.” 727<br />
The organization was set up to more closely coordinate the activities of the Association for Literary<br />
and Linguistic Computing (founded in 1973), the Association for Computers and the Humanities (founded<br />
in 1978), and the Society for Digital Humanities/Société pour l'étude des médias interactifs (founded<br />
in 1986). The ADHO is administered by a steering committee, and membership can be obtained<br />
through subscription to Literary & Linguistic Computing. The ADHO is responsible for overseeing a<br />
number of digital humanities publications, 728 including peer-reviewed journals such as Literary &<br />
Linguistic Computing, Digital Studies, Digital Humanities Quarterly, Computing in the Humanities<br />
Working Papers, and Text Technology; and a number of book series, such as Blackwell’s Companion<br />
to Digital Humanities and Digital Literary Studies. The ADHO website provides a list of community<br />
resources and hosts the Humanist discussion group archives. ADHO also oversees the annual Digital<br />
Humanities Conference.<br />

718 http://www.arts-humanities.net/<br />

719 http://www.digitalhumanities.org/<br />

720 http://www.clarin.eu/<br />

721 http://www.digitalhumanities.org/centernet/<br />

722 http://www.dariah.eu/<br />

723 http://www.arts-humanities.net/noc/<br />

724 http://projectbamboo.org/<br />

725 http://www.textgrid.de/en.html<br />

726 http://www.dariah.eu/index.phpoption=com_content&view=article&id=107:chain-dariah-participates-in-an-international-coalition-of-arts-andhumanities-infrastructure-initiatives&catid=3:dariah<br />

727 http://digitalhumanities.org/about<br />

728 http://digitalhumanities.org/publications

arts-humanities.net<br />

The arts-humanities.net website provides an “online hub for research and teaching in the digital arts<br />
and humanities” that “enables members to locate information, promote their research and discuss<br />
ideas.” This hub was developed by the Centre for e-Research (CeRch) at King's College London<br />
(KCL) and is coordinated by Torsten Reimer. It incorporates several projects, including the ICT Guides<br />
database of projects and methods, which was developed by the AHDS, and the original<br />
arts-humanities.net developed by Reimer for the AHRC ICT Methods Network. Initial funding was<br />
provided by the AHRC, and the project is now supported by JISC. A number of projects and groups,<br />
including the Network of Expert Centres, the Digital Humanities Observatory, the Oxford e-Research<br />
Centre, and the Arts & Humanities e-Science Support Centre (AHeSSC), contribute to this hub.<br />

This hub has more than 1,400 registered users, who are able to create a blog, participate in discussion<br />
forums, and tag various tools and projects. The website can be browsed by subject discipline; for<br />
example, browsing by “Classics and Ancient History” 729 brings the user to a list of resources in the<br />
subject, including recently added digital projects, events, additions to the research bibliography, calls<br />
for papers, and blog entries, among others. Arts-humanities.net also contains several major<br />
components, including a catalog of digital resources, a directory of digital tools used in projects, a<br />
computational methods taxonomy, a research bibliography and a library of case studies and briefing<br />
papers, an events calendar and list of calls for papers, a community forum that includes a discussion<br />
forum, a list of members’ blogs and user groups, and an actively updated list of job postings.<br />

The “Projects” section, which serves as a “catalogue of digital scholarship,” is one of the major<br />
resources on this website and provides several hundred detailed records on digital arts and humanities<br />
projects. 730 While the focus is on U.K. projects, the details provided are extensive, including the types<br />
of digital resources created, the technical methods used, the data formats created, the types of tools<br />
used to create the resource, and several subject headings. The project catalog can be searched or<br />
browsed alphabetically or by method, discipline, or content created. The actively updated digital tools<br />
section 731 can be searched by keyword or be browsed alphabetically (as well as by license, life cycle<br />
stage, platform, subject tags [user created], and supported specifications). Another major component of<br />
the website is the ICT methods taxonomy, 732 which includes seven broad method categories such as<br />
communication and collaboration or data analysis. Each method includes a list of submethods that<br />
leads to a description and a full list of projects using that method.<br />

The arts-humanities.net hub fulfills a number of important requirements in terms of developing<br />
humanities infrastructure as outlined above. It is collaborative, supports the creation of communities of<br />
interest, and has created a central place to describe tools and methods to support best practices.<br />

729 http://www.arts-humanities.net/disciplines/classics_ancient_history<br />

730 http://www.arts-humanities.net/project<br />

731 http://www.arts-humanities.net/tools<br />

732 http://www.arts-humanities.net/ictguides/methods



centerNET<br />

centerNET is “an international network of digital humanities centers formed for cooperative and<br />
collaborative action that will benefit digital humanities and allied fields in general, and centers as<br />
humanities cyberinfrastructure in particular.” 733 This network grew out of a meeting hosted by the<br />
NEH and the University of Maryland, College Park, in 2007 and was created in response to the ACLS<br />
Report on Cyberinfrastructure in the Humanities (ACLS 2006). The largest component of this website<br />
is an international directory 734 of over 200 digital humanities organizations that can be viewed<br />
alphabetically. Each organization entry includes a short description and a link to the organization’s<br />
website. To be included in the directory, an organization simply has to request to join. As of July 2010,<br />
a new beta website 735 for centerNET was announced that includes a Google Map of all of the<br />
registered digital humanities organizations, an aggregated web feed from all registered centers’<br />
websites or blogs, an updated directory of centers that can be searched or limited by geographic<br />
category, and a resource list that includes a link to a frequently updated “digital research tools<br />
wiki.” 736<br />

CLARIN<br />

CLARIN (Common Language Resources and Technology Infrastructure) is a pan-European project<br />
that is working to “establish an integrated and interoperable research infrastructure of language<br />
resources and its technology.” It “aims at lifting the current fragmentation, offering a stable, persistent,<br />
accessible and extendable infrastructure and therefore enabling eHumanities.” 737 The integrated<br />
environment of services and resources will be based on grid technologies, use Semantic Web<br />
technologies to ensure semantic interoperability, and be extendable so that new resources and services<br />
can be added. CLARIN plans to help linguists improve their models and tools, aid humanities scholars<br />
in learning to access and use language technology, and “lower thresholds to multilingual and<br />
multicultural content.” CLARIN ultimately plans to build a “virtual distributed research infrastructure”<br />
through a “federation of trusted archive centers that will provide resources and tools through web<br />
services with a single sign-on” (Váradi et al. 2008).<br />

CLARIN also seeks to address the issue of heterogeneous language resources in a fragmented<br />
environment and to connect resources and tools that already exist with scholars in the humanities:<br />

The benefits of computer-enhanced language processing will become available only when a<br />
critical mass of coordinated effort is invested in building an enabling infrastructure, which will<br />
make the existing tools and resources readily accessible across a wide span of domains and<br />
provide the relevant training and advice (Váradi et al. 2008).<br />

The CLARIN project especially wishes to turn the large number of existing and dispersed technologies<br />
and sources into “accessible and stable services” that users can share and repurpose. Other key themes<br />
they are dealing with are “persistent identifiers, component metadata, concept registries, and support<br />
for virtual collections” (Dallas and Doorn 2009).<br />

As CLARIN is a very large project with many deliverables, its organization is divided into seven work<br />
packages: management and coordination; technical infrastructure; humanities overview; language<br />
resources and technology overview; information gathering and dissemination; intellectual property<br />
rights and business models; and construction and exploitation agreements. CLARIN is still in its<br />
preparatory stage and envisions two later phases, a construction phase and an exploitation phase. This<br />
preparatory phase has a number of objectives, including organizing the funding and governance in 22<br />
countries and thoroughly exploring the technical dimension, for, as Váradi et al. admit, “a language<br />
resources and technology infrastructure is a novel concept.” CLARIN is fully investigating the user<br />
dimension and is undertaking an analysis of how language technology is currently used in the<br />
humanities to make sure that all developed technical specifications meet the actual needs of humanities<br />
users. This scoping study and research includes undertaking a number of typical humanities research<br />
projects to help validate developed prototypes. 738 In addition, they plan to conduct outreach to less<br />
technologically advanced sections of the humanities and social sciences to promote the use of language<br />
resources and technology (Váradi et al. 2008). CLARIN is also seeking to bring together the<br />
humanities and language technology communities, and it plans to collaborate with DARIAH in this<br />
area and others. 739<br />

733 http://digitalhumanities.org/centernet/<br />

734 http://digitalhumanities.org/centernet/page_id=4<br />

735 http://digitalhumanities.org/centernet_new/<br />

736 http://digitalresearchtools.pbworks.com/<br />

737 http://www.clarin.eu/external/index.phppage=about-clarin&sub=0

One example of a humanities case study was reported by Villegas and Parra (2009), who explored the<br />
scenario of a social historian wishing to conduct a search of multiple newspaper archives. They found<br />
that providing access to primary source data that were “highly distributed and stored in different<br />
applications with different formats” was very difficult and that humanities researchers required the<br />
“integration and interoperability of distributed and heterogeneous research data.” Villegas and Parra<br />
included a detailed analysis of the complicated steps required to create a final environment where the<br />
user could actually analyze the data. They also provided some insights for further CLARIN research<br />
and ongoing case studies; namely, that (1) humanists need to be made better aware of existing<br />
linguistics resources and tools; (2) users need integrated access to data with more automated processes<br />
to simplify laborious data-gathering and -integration tasks; (3) the use of standards and protocols<br />
would help make data integration easier; (4) NLP tools require textual data in order to perform<br />
automated analysis but many data providers do not provide access to their data in a textual format; and<br />
(5) the use of web services with standardized interfaces and strongly typed XML messages could help<br />
guarantee interoperability of resources and tools. The authors admit that this last desideratum will<br />
require a great deal of consensus among service providers.<br />
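As a rough illustration of insights (2), (3), and (5), the sketch below shows how two archives that return differently structured data might be mapped onto one common record type before searching. It is a minimal sketch under stated assumptions: the archive payloads, the field names, the Article type, and the adapter and search functions are all invented for illustration and do not describe any CLARIN service or the workflow studied by Villegas and Parra.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Article:
    """Hypothetical common record shape shared by every archive adapter."""
    archive: str
    title: str
    date: str  # ISO 8601 date string, so plain string sorting is chronological
    text: str


# Invented sample payloads standing in for two archives with different formats.
ARCHIVE_A_XML = """
<results>
  <item><headline>Dock strike ends</headline><published>1889-09-16</published>
        <body>The dock labourers returned to work.</body></item>
  <item><headline>Harvest report</headline><published>1889-09-17</published>
        <body>Wheat yields were poor.</body></item>
</results>
"""

ARCHIVE_B_JSON = json.dumps([
    {"name": "Strike spreads to mills", "day": "1889-10-02",
     "fulltext": "Mill hands joined the strike."},
])


def from_archive_a(payload: str) -> List[Article]:
    """Adapter for the (assumed) XML archive: map its elements onto Article."""
    root = ET.fromstring(payload)
    return [
        Article("A", item.findtext("headline", ""),
                item.findtext("published", ""), item.findtext("body", ""))
        for item in root.iter("item")
    ]


def from_archive_b(payload: str) -> List[Article]:
    """Adapter for the (assumed) JSON archive: map its keys onto Article."""
    return [Article("B", r["name"], r["day"], r["fulltext"])
            for r in json.loads(payload)]


def search(articles: List[Article], keyword: str) -> List[Article]:
    """Case-insensitive keyword search over titles and text, sorted by date."""
    needle = keyword.lower()
    hits = [a for a in articles
            if needle in a.title.lower() or needle in a.text.lower()]
    return sorted(hits, key=lambda a: a.date)


corpus = from_archive_a(ARCHIVE_A_XML) + from_archive_b(ARCHIVE_B_JSON)
strike_hits = search(corpus, "strike")
for hit in strike_hits:
    print(hit.date, hit.archive, hit.title)
```

Each additional archive in this sketch needs its own hand-written adapter, which is the laborious integration work the case study describes. Agreed standards and typed message formats would reduce the number of adapters a researcher has to write.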

In addition to studying how humanities users might use language tools and resources, CLARIN plans<br />
to include language resources for all European languages in participating countries, and it has defined<br />
BLARK, or a Basic Language Resource Kit, that will be required for each well-documented<br />
language. BLARKs must consist of two types of lexicons, one “form based” and one “lexical<br />
semantic,” or essentially a treebank and an automatically annotated larger corpus. As part of this work,<br />
CLARIN has recently made a number of services available on its website under its “Virtual Language<br />
Observatory.” 740 Included among these services are massive language-resource and language-tool<br />
inventories that can be searched or browsed by faceted metadata.<br />
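Faceted browsing of this kind can be sketched in a few lines. The example below is only an illustration: the inventory records, the field names, and the helper functions facet_counts and filter_by are assumptions and are not taken from the Virtual Language Observatory itself.

```python
from collections import Counter

# Invented inventory records; the metadata fields are assumptions.
RESOURCES = [
    {"name": "Latin treebank", "language": "Latin", "type": "corpus"},
    {"name": "Greek lexicon", "language": "Greek", "type": "lexicon"},
    {"name": "Latin lexicon", "language": "Latin", "type": "lexicon"},
    {"name": "Greek tagger", "language": "Greek", "type": "tool"},
]


def facet_counts(records, facet):
    """Count how many records carry each value of one metadata facet."""
    return Counter(r[facet] for r in records)


def filter_by(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [r for r in records
            if all(r.get(key) == value for key, value in selected.items())]


# Browse: show the available languages, pick one, then see the remaining types.
language_counts = facet_counts(RESOURCES, "language")
latin = filter_by(RESOURCES, language="Latin")
latin_type_counts = facet_counts(latin, "type")
print(dict(language_counts), dict(latin_type_counts))
```

Recomputing the counts after each selection is what lets a user narrow a large inventory step by step without knowing the metadata vocabulary in advance.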

738 As various publications and deliverables are completed for the various work packages, all reports can be downloaded from the CLARIN website<br />
(http://www.clarin.eu/deliverables).<br />

739 In July of 2011, both CLARIN and DARIAH signed a “letter of intent” with EGI (European Grid Infrastructure (http://www.egi.eu)), “which has the<br />
express intention of ensuring that technology developed by the two ESFRI projects and EGI are compatible and provides the best service to their users.”<br />
Both CLARIN and DARIAH are funded by ESFRI (European Strategy Forum on Research Infrastructures), and this agreement with the EGI has the end<br />
goal of ensuring that all three projects “develop common tools and technologies while exploring further opportunities for collaboration.”<br />
(http://www.dariah.eu/index.phpoption=com_content&view=article&id=151:egi-signed-letter-of-intent-with-dariah-andclarin&catid=3:dariah&Itemid=197)<br />

740 http://www.clarin.eu/vlo/ and for more details on this resource, see Uytvanck et al. (2010).



Finally, the CLARIN website provides access to newsletters, scientific publications, and extensive readable documentation 741 on the technological decisions of CLARIN. Documentation is available regarding the development of their concept registry, component metadata, persistent identifiers, long-term preservation, standards and best practices, text encoding, virtual collections, service-oriented architecture, and web services interoperability.

DARIAH Project

DARIAH, 742 which stands for “Digital Research Infrastructure for the Arts and Humanities,” has been funded to “conceptualize and afterwards build a virtual bridge between different humanities and cultural heritage data resources in Europe” (Blanke 2010). The DARIAH project commenced in September 2009 and seeks to improve access to the hidden resources of archives, libraries, and museums across Europe. It completed its preparatory phase in February 2011 and is currently moving into the “transition phase,” where it will “submit an application to the European Commission to establish a European Research Infrastructure Consortium (ERIC).” 743 If this application is successful, it will help establish a legal framework that enables the long-term sustainability of DARIAH, and the project will begin a “construction phase” in January 2012. DARIAH has also made a number of documents available as it prepares to move into its construction phase, including a technical report 744 on lessons learned during the preparatory phase and a list of strategic policy documents regarding data and data management 745 that will guide DARIAH activities in the next phase (for example, data creation, collection ingest, management and preservation, and trusted digital repository compliance).

DARIAH is a consortium of 14 partners from 10 countries. 746 As with CLARIN, the work of DARIAH was divided into work packages: Project Management, Dissemination, Strategic Work, Financial Work, Governance and Logistical Work, Legal Work, Technical Reference Architecture, and Conceptual Modeling.

DARIAH has explored using the Fedora digital repository and the iRODS data grid technology developed by the San Diego Supercomputer Center to design a humanities research infrastructure, and the current plan is that any infrastructure that is designed will be built on top of the “existing EGEE gLite infrastructure.” 747 Recently, Tobias Blanke offered a succinct overview of DARIAH’s basic architecture:

DARIAH will integrate pan-European humanities research data collections by using advanced grid and digital repository technologies. Formally managed digital repositories for research data can provide an effective means of managing the complexity encountered in the humanities in general, and will take on a central and pivotal role in the research lifecycle. The DARIAH infrastructure will be fundamentally distributed and will provide arts and humanities publishers and researchers alike with a secure and customizable environment within which to collaborate effectively and purposefully (Blanke 2010).

741 http://www.clarin.eu/external/index.php?page=publications&sub=3

742 http://www.dariah.eu/

743 http://ec.europa.eu/research/infrastructures/index_en.cfm?pg=eric

744 http://dariah.eu/index.php?option=com_docman&task=doc_download&gid=477&Itemid=200

745 http://dariah.eu/index.php?option=com_docman&task=cat_view&gid=92&Itemid=200

746 http://www.dariah.eu/index.php?option=com_docman&task=doc_download&gid=301&Itemid=200

747 http://technical.eu-egee.org/index.php?id=149



According to Blanke, DARIAH thus seeks to build a secure, distributed, and customizable infrastructure. Like TextGrid, DARIAH intends to create a flexible, loosely coupled architecture so that other communities can add their own services on top of it.

Aschenbrenner et al. (2010) have also stated that the major strategy behind the DARIAH repository infrastructure is that individual repositories should remain independent and evolve over time (thus remaining open), but at the same time all contents and tools provided through DARIAH should appear to researchers as if they were using a single platform (thus a closely knit infrastructure). Various institutions and researchers have been invited to contribute their content and tools to the DARIAH infrastructure, and Aschenbrenner et al. reported that they are committed to supporting reasonable levels of semantic diversity and interoperability:

Linking diversity is at the core of Dariah’s philosophy. Disciplines in the humanities differ greatly with regard to their resources—their data, tools, and methodologies. Moreover, innovation is sometimes associated with introducing variations into their data, tools, or methodologies, thereby reinforcing heterogeneity even within a single discipline. Through linking this diversity Dariah aims to build bridges, to enable researchers from different disciplines or cultural backgrounds to collaborate on the same material in a common research environment, and to share their diverse perspectives and methodologies (Aschenbrenner et al. 2010).

DARIAH seeks both to learn from and be interoperable with other repository federation initiatives such as DRIVER and Europeana 748 and consequently has decided not to enforce rich metadata guidelines. The project is still determining how to deal with diversity not only among research data in the humanities but also among the different agents that will participate in the DARIAH environment (e.g., collaboration platforms, service registries, private and public archives, content or service providers, applications and data created by individual scholars). Aschenbrenner et al. (2010) also emphasized that interoperability is not the same as uniformity; any interoperability scheme used for DARIAH may differ significantly from internal metadata schemes used at individual partner sites. Currently DARIAH is working on “enabling an evolutionary metadata approach” in which scholars can start with minimal annotation that other scholars or automated processes can extend over time.
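
The evolutionary approach can be illustrated with a minimal Python sketch. The class names, field names, and identifier below are hypothetical and are not taken from DARIAH; the sketch only assumes that a record begins with very little metadata and that later contributions, whether from scholars or automated tools, are appended with their source recorded rather than overwriting earlier values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    key: str
    value: str
    source: str  # hypothetical provenance label: who or what supplied the value


@dataclass
class Record:
    identifier: str
    annotations: List[Annotation] = field(default_factory=list)

    def annotate(self, key: str, value: str, source: str) -> None:
        """Append an annotation unless the same key/value pair already exists."""
        if not any(a.key == key and a.value == value for a in self.annotations):
            self.annotations.append(Annotation(key, value, source))

    def values(self, key: str) -> List[str]:
        """Return every value recorded for a key, in the order it was added."""
        return [a.value for a in self.annotations if a.key == key]


# A depositor starts with a single field; others extend the record later.
record = Record("example:ms-001")
record.annotate("title", "Untitled manuscript", source="depositor")
record.annotate("language", "la", source="automatic-language-detector")
record.annotate("title", "Untitled manuscript", source="second-scholar")  # duplicate, ignored
```

Because nothing is required beyond an identifier, a repository with sparse internal metadata can still participate, and richer description can accumulate without a schema migration.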

To support the federation of a wide variety of repositories and other agents, DARIAH has created a prototype federation architecture 749 for “exposing repository events” that is based on a “hybrid push/poll notification pattern” using Atom and supports “CRUD” 750 events:

Since the Dariah environment will prospectively consist of a continuously growing number of agents that may depend on each other’s state, a notification pattern is a suitable architectural building block. The Atom feeds containing the events of Dariah repositories can be consumed by decentralised agents, decentralised meaning that they only use the data available from the Atom feeds and do not plug into any proprietary repository interfaces. Such decentralised agents could hence be built by other initiatives consuming the event messages from various sources without the repositories being aware of them (Aschenbrenner et al. 2010).
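
As a rough illustration of the polling side of such a pattern, the Python sketch below shows a decentralised agent that reads only an Atom feed and keeps a local index in step with the events it finds. How the DARIAH prototype actually encoded events is not described here, so the convention used (one `category` element per entry whose `term` is `create`, `update`, or `delete`), the sample feed, and all identifiers and URLs are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical event feed; the category terms are an assumed convention.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example repository events</title>
  <id>urn:example:repo-a:events</id>
  <updated>2010-06-01T12:02:00Z</updated>
  <entry>
    <id>urn:example:event:1</id>
    <title>object-1</title>
    <updated>2010-06-01T12:00:00Z</updated>
    <category term="create"/>
    <link rel="alternate" href="http://repo-a.example.org/objects/1"/>
  </entry>
  <entry>
    <id>urn:example:event:2</id>
    <title>object-2</title>
    <updated>2010-06-01T12:01:00Z</updated>
    <category term="create"/>
    <link rel="alternate" href="http://repo-a.example.org/objects/2"/>
  </entry>
  <entry>
    <id>urn:example:event:3</id>
    <title>object-1</title>
    <updated>2010-06-01T12:02:00Z</updated>
    <category term="delete"/>
    <link rel="alternate" href="http://repo-a.example.org/objects/1"/>
  </entry>
</feed>"""


def apply_events(feed_xml, index):
    """Replay the feed's events, oldest first, against a local index."""
    root = ET.fromstring(feed_xml)
    entries = sorted(root.findall(ATOM + "entry"),
                     key=lambda e: e.findtext(ATOM + "updated"))
    for entry in entries:
        object_id = entry.findtext(ATOM + "title")
        event = entry.find(ATOM + "category").get("term")
        location = entry.find(ATOM + "link").get("href")
        if event in ("create", "update"):
            index[object_id] = location
        elif event == "delete":
            index.pop(object_id, None)
    return index


index = apply_events(SAMPLE_FEED, {})
```

The agent never calls a repository-specific interface; it needs only the feed, which is what allows third parties to build such consumers without the repositories being aware of them.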

748 http://www.europeana.eu/

749 In addition to using the Atom syndication format (http://www.ietf.org/rfc/rfc4287.txt), this prototype made use of the Metadata Object Description Schema (MODS) (http://www.loc.gov/standards/mods/) for individual digital object descriptions and OAI-ORE Resource Maps for the creation of aggregated objects.

750 CRUD: The creation, update, or deletion of an object in a digital repository.



DARIAH tested this infrastructure in an experiment that linked TextGrid, an iRODS test server, and a Fedora test server into a single federation, replicated digital objects across the different repositories, and created an index of all the TEI/XML objects in the federation. Aschenbrenner et al. (2010) concluded that the use of Atom will not only “ensure coherence among decentralised agents” but also, as a lightweight protocol that is “deeply embedded into the web environment of HTTP-based, ReSTful Services,” serve as a gateway to a number of existing tools and improve the scalability of DARIAH as a whole. Some remaining challenges to be addressed by this infrastructure include user and rights management and the need for persistent identifiers for digital objects. 751
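
To make the indexing step concrete, here is a small Python sketch of how an agent might build a federation-wide index of TEI/XML objects. It is not the experiment's code: the repositories are modeled as plain dictionaries, the object identifiers are invented, and an object is treated as TEI only if its root element is `TEI` in the TEI P5 namespace, which is a simplifying assumption.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def is_tei(content):
    """Return True when the content parses as XML with a TEI root element."""
    try:
        root = ET.fromstring(content)
    except ET.ParseError:
        return False
    return root.tag == "{%s}TEI" % TEI_NS


def build_tei_index(federation):
    """Map each TEI object identifier to the repositories holding a copy."""
    index = {}
    for repository, objects in federation.items():
        for object_id, content in objects.items():
            if is_tei(content):
                index.setdefault(object_id, []).append(repository)
    return index


TEI_DOC = '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body/></text></TEI>'

# Hypothetical federation: object "obj-1" has been replicated to two repositories.
federation = {
    "textgrid": {"obj-1": TEI_DOC},
    "fedora-test": {"obj-1": TEI_DOC, "obj-2": "<html><body/></html>"},
    "irods-test": {"obj-3": "plain text, not XML"},
}

tei_index = build_tei_index(federation)
```

Keying the index by object identifier rather than by repository makes replicas visible as multiple locations for one object, which also hints at why persistent identifiers matter for this kind of federation.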

One of the major projects of this first stage of DARIAH was to build two demonstrators that showed the feasibility of its technical architecture. As the website notes, however, they were also an opportunity for two “associated communities to position themselves within the DARIAH infrastructure,” 752 namely the digital archaeology and textual encoding communities. The first “community demonstrator,” ARENA 2, 753 migrated a “legacy application of the European archaeology community into a more sustainable service-oriented architecture (SOA).” The original ARENA (Archaeological Records of Europe-Networked Access) project was finished in 2004 and had served as a “traditional metadata search portal service” based on Z39.50 754 and OAI Harvesting. The newly released demonstrator makes use of DARIAH web services and exposes the various participating archaeological databases as autonomous services. The second demonstrator, the TEI demonstrator, was designed to “demonstrate the practical benefits of using TEI for the representation of digital resources of all kinds, but primarily of original source collections within the arts and humanities.” The demonstrator can be used to upload and publish TEI documents into a repository, among other functionalities, and makes use of a software platform called eSciDoc 755 that was developed by the Max Planck Digital Library. The end goal of this demonstrator was thus to make it easier for humanities researchers both to share their TEI texts with others and to compare personal encoding practices with those of the larger TEI community. 756

Another important factor considered by DARIAH is the frequently distributed nature of humanities data; for example, one digital archive may have transcriptions of a manuscript while another has digital images of this manuscript. Thus, DARIAH plans to build a data architecture that will “cover the easy exchange of file type data, the ability to create relationships between files in remote locations and flexible caching mechanism to deal with the exchange of large single data items like digitization images” (Blanke 2010). Since humanities data also need to be preserved for long periods of time to support archiving and reuse, DARIAH plans to incorporate existing archived research data. In his final overview of the project, Blanke proposed that:

DARIAH is one way to build a research infrastructure for the humanities. It uses grid technologies together with digital library technologies to deliver services to support the information needs of humanities researchers. It integrates many services useful for humanities research and will focus less on automation of processing but on providing an infrastructure to

751 The discussion over how to both design and implement persistent and unique identifiers has a vast body of literature. For some recent work, see Tonkin (2008), Campbell (2007), and Hilse and Kothe (2006).

752 http://dariah.eu/index.php?option=com_content&view=article&id=129&Itemid=113

753 http://www.dariah.eu/index.php?option=com_content&view=article&id=30&Itemid=34. The demonstrator can also be accessed at http://muninn.york.ac.uk/arena2/

754 Z39.50 is an ISO standard maintained by the Library of Congress that “specifies a client/server-based protocol for searching and retrieving information from remote databases” (http://www.loc.gov/z3950/agency/) and has been the predominant standard used in integrated library systems.

755 http://www.escidoc.org/

756 The TEI demonstrator can be accessed at (http://vm20.mpdl.mpg.de:8080/tei_demonstrator/).



support the main activity of humanities researchers, the attempt to establish the meaning of textual and other human created resources (Blanke 2010).

Blanke also reported that DARIAH is being built on existing national infrastructures and will consequently be “embedded” among service providers that are already in place to ensure the best possible chances of success.

Digital Humanities Observatory

The Digital Humanities Observatory (DHO) 757 is a digital humanities “collaboratory working with Humanities Serving Irish Society (HSIS), national, European, and international partners to further e-scholarship.” Founded in 2008, DHO was created to support digital humanities research in Ireland and help manage the creation and preservation of digital resources. The DHO will focus on three main issues in the next few years: “encouraging collaboration; providing for the management, access, and preservation of project data; and promulgating shared standards and technology for project development” (Schreibman et al. 2009). The DHO has pointed out that the expectations of digital humanities centers are rapidly changing and that plans for long-term viability need to be created from the very beginning of development.

Schreibman et al. noted that the creation of the DHO is in line with other growing initiatives such as Project Bamboo and DARIAH, where digital humanities projects are moving away from “digital silos” to an approach where scholarly resources will be “linked, interoperable, reusable, and preserved.” The DHO is creating three “distinct but integrated infrastructures”: (1) a portal that will serve as the public face of the DHO and is based on the Drupal 758 content management system; (2) DRAPIER (Digital Research and Practices in Ireland), which is also based on Drupal and will serve as a framework for public discovery of digital projects in Ireland; and (3) an “access and preservation repository based on Fedora.” Some resources created by HSIS partners of the DHO will reside in their Fedora repository while others will be federated in Fedora instances managed by DHO partners.

DRIVER

DRIVER (Digital Repository Infrastructure Vision for European Research) 759 is a “multi-phase effort whose vision and primary objective is to establish a cohesive, pan-European infrastructure of digital repositories, offering sophisticated functionality services to both researchers and the general public.” At the end of its first stage in November 2007, the DRIVER project provided access to a testbed system that produced a search portal with open-access content from over 70 repositories. The DRIVER efforts initially concentrated on the infrastructure aspect and developed “clearly defined interfaces to the content network, which allow any qualified service-provider to build services on top of it.” 760 As of March 2010, the DRIVER search portal offered access to more than 2.5 million publications from 249 repositories in 39 countries. In its current stage, DRIVER II seeks to expand its geographic coverage, support more advanced end user functionality for searching complex digital objects, and provide access to a greater variety of open-access materials.

757 http://dho.ie/
758 http://drupal.org/
759 http://www.driver-repository.eu/
760 http://www.driver-repository.eu/Driver-About/About-DRIVER.html



NoC-Network of Expert Centres

The Network of Expert Centres 761 is a collaboration of “centres with expertise in digital arts and humanities research and scholarship, including practice-led research.” The areas of research expertise include “data creation, curation, preservation, management (including rights and legal issues), access and dissemination, and methodologies of data use and re-use.” Membership in NoC is open to centers in Great Britain and Ireland that have a formal institutional status, recognized expertise in digital arts and humanities, a history of persistence, and an institutional or interinstitutional focus of activity. Currently participating organizations include the Archaeology Data Service (ADS), the Centre for Computing in the Humanities (CCH), the Centre for Data Digitization and Analysis (CDDA), the Digital Design Studio (DDS), the History Data Service (HDS), the Humanities Advanced Technology & Information Institute (HATII), the Humanities Research Institute (HRI), the Oxford Text Archive (OTA), the UCL Centre for Digital Humanities, and the Visual Arts Data Service (VADS).

The purpose of this network is to enable all the members to pursue a series of collective aims and objectives in the support of arts and humanities research and scholarship, for much of which arts-humanities.net will provide a central hub. These objectives include (1) promoting the broad use of ICT; (2) providing leadership in the use of digital methods and resources; (3) developing and exchanging expertise, standards, and best practices; (4) identifying and serving the needs of the research community; and (5) conducting dialogue with stakeholders. NoC has a steering committee of six voting members drawn from the different centers; the committee’s role is to propose initiatives and activities, to provide reports to the membership, to convene regular meetings of the full network, and to convene workgroups.

Project Bamboo

According to its website, Project Bamboo 762 “is a multi-institutional, interdisciplinary, and interorganizational effort that brings together researchers in arts and humanities, computer scientists, information scientists, librarians, and campus information technologists” to answer one major question: “How can we advance arts and humanities research through the development of shared technology services?” Designed as a “community driven cyberinfrastructure initiative” (Kainz 2009), Project Bamboo initially received funding from the Mellon Foundation as a planning project, and this phase concluded in September 2010.

Project Bamboo is currently in its first 18-month phase of technology development, which will conclude in spring 2012. The “Bamboo Technology Project” (BTP) has ten partner institutions that have collectively committed $1.7 million to this project, along with $1.3 million provided by the Mellon Foundation. This first phase will involve the development of three deliverables: a research environment for scholars in the humanities, a shared and foundational infrastructure that will allow both librarians and technologists to support humanities scholarship across institutional boundaries, and a technology blueprint for the next phase of the BTP that will focus on the “evolution of shared applications for the curation and exploration of widely distributed content collections.” BTP’s technology development is formally organized around four major areas of work, grouped under two headings: Research Applications (divided into “Research Environments” and “Corpora Space Redesign Process”) and Shared Infrastructure (divided into “Scholarly Web Services on Services Platform” and “Collections Interoperability”). 763 The major goal of the BTP is thus to establish a shared infrastructure for humanities research by developing web services, interoperable collections, and partnerships that meet the needs of both researchers and institutions.

761 http://www.arts-humanities.net/noc
762 http://www.projectbamboo.org/about/. The Bamboo Project also maintains a blog (http://www.projectbamboo.org/blog/) and an active Technology Wiki (https://wiki.projectbamboo.org/display/BTECH/Technology+Wiki+-+Home) to track progress on various areas of work.

The BTP builds on the Bamboo Planning Project that concluded in 2010. As part of that project, six workshops were held between the fall of 2008 and the summer of 2010, with over 600 individuals participating from a number of organizations and institutions. 764 Workshop One, entitled “The Planning Process & Understanding Arts and Humanities Scholarship,” involved four separate workshops at different locations where scholars in the arts and humanities held dialogues with information technologists to understand the scholarship practices of these disciplines and their future directions. One major goal of these workshops was to develop a “high-level list of scholarly practices related to arts and humanities,” and this list, as well as all workshop notes, documents, and other materials, was placed online in the project planning wiki. 765

Workshop Two built on the results of the first series of workshops and examined possible future directions for Project Bamboo, including advocacy and leadership; education and training; institutional partnerships and support; scholarly networking; standards and best practices; “tools & content partners”; and a shared service framework. At the end of this workshop, seven working groups 766 were formed around these themes, with the addition of an eighth working group entitled “Stories,” eventually renamed “Scholarly Narratives,” which was chartered to “collect narratives and/or illustrative examples on behalf of all Bamboo working groups that express particular aspects of scholarship, scholarly workflow, research, and/or teaching that are or could be facilitated by technology.” Each working group had a separate wiki page that allowed group members to collaborate, post documents, and provide other information. Working groups were also charged with creating demonstrators 767 “to support the discussion and analysis activities of the working groups, and to reflect and test the results of that discussion and analysis.” 768 Initial progress of the working groups was reported at Workshop Three, and a “straw consortial model” for the project was introduced. The fourth workshop involved the discussion of a draft program document, and the fifth workshop finalized plans for the implementation proposal to the Mellon Foundation that was successfully funded.

The amount of data generated from these workshops and found on the project planning wiki was quite extensive, and the Bamboo Project synthesized these data into four reports that are available on the website, 769 including the planning project’s final report, a scholarly practices report that includes a “distillation of the themes of scholarly practice that emerged in Workshop 1,” a scholarly narratives report that describes scholars’ perceptions of their work and how it may change with the use of technology, and a final demonstrator report. In addition, the project has created a “Themes Database” 770 that contains quotes from participants in the Bamboo Planning Project’s Workshop 1 series that have been “analyzed and organized into themes of scholarly practice.” Thirteen major themes have been identified, including annotating/documenting, “citation, credit, peer review,” collaborating, contextualizing, gathering/foraging, and managing data.

763 http://www.projectbamboo.org/about/areas-of-work/
764 Details from these workshops can be found at the now archived Project Planning Wiki (https://wiki.projectbamboo.org/display/BPUB/Bamboo+Planning+Wiki+(Archive)).
765 https://wiki.projectbamboo.org/display/BPUB/Workshop+1+Summary
766 https://wiki.projectbamboo.org/display/BPUB/Working+Groups
767 The original planned list of demonstrators can be found on the Project Planning Wiki (https://wiki.projectbamboo.org/display/BPUB/Demonstrators+List). The final report on the demonstrators that were actually developed can be found here (http://www.projectbamboo.org/wp-content/uploads/Project-Bamboo-Demonstrator-Report.pdf).
768 https://wiki.projectbamboo.org/display/BPUB/About+Demonstrators
769 http://www.projectbamboo.org/about/
770 http://themes.projectbamboo.org/

SEASR

SEASR, or the Software Environment for the Advancement of Scholarly Research, has been funded by the Mellon Foundation as a “transformational cyberinfrastructure technology” and seeks to support two major functions: (1) to enable scholars to individually and collaboratively pursue computationally advanced digital research in a robust virtual work environment; and (2) to support digital humanities developers with a robust programming environment where they can both rapidly and efficiently design applications that can be shared.

SEASR provides a visual programming environment named Meandre 771 that allows users to develop applications, labeled “flows,” that can then be deployed on an already-existing robust hardware infrastructure. According to the project website, Meandre is a “semantic enabled web-driven, dataflow execution environment.” It provides “the machinery for assembling and executing data flows - software applications consisting of software components that process data,” as well as “publishing capabilities for flows and components, enabling users to assemble a repository of components for reuse and sharing.” In other words, digital humanities developers can use Meandre to quickly develop and share software applications to support individual scholarship and research collaboration, and they can reuse applications that have been developed by others, as SEASR maintains an expanding repository of different components and applications.
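
The idea of a “flow” assembled from reusable components can be illustrated with a minimal sketch. The Python below is purely illustrative and does not use or reproduce Meandre's actual interfaces: the names `ComponentRepository`, `Flow`, `tokenize`, and `count_words` are hypothetical, chosen only to show components being published to a shared repository and then chained so that each component's output becomes the next component's input.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical illustration only: none of these names belong to Meandre.
Component = Callable[[object], object]


@dataclass
class ComponentRepository:
    """A shared registry so a component is published once and reused by many flows."""
    _components: Dict[str, Component] = field(default_factory=dict)

    def publish(self, name: str, component: Component) -> None:
        self._components[name] = component

    def get(self, name: str) -> Component:
        return self._components[name]


@dataclass
class Flow:
    """An ordered chain of component names; each output feeds the next input."""
    steps: List[str]

    def run(self, repo: ComponentRepository, data: object) -> object:
        for name in self.steps:
            data = repo.get(name)(data)
        return data


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def count_words(tokens: List[str]) -> Dict[str, int]:
    """Count how often each token occurs."""
    counts: Dict[str, int] = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    return counts


repo = ComponentRepository()
repo.publish("tokenize", tokenize)
repo.publish("count_words", count_words)

# Assemble a flow from published components and execute it on sample text.
word_count_flow = Flow(steps=["tokenize", "count_words"])
result = word_count_flow.run(repo, "arma virumque cano arma")
```

Because the components live in the repository rather than inside any one flow, a second flow could reuse `tokenize` without redefining it, which is the kind of sharing the paragraph above describes.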

The second major function of SEASR is to provide a virtual work environment where digital humanities scholars can share data and research and a variety of data and text-mining tools, including frequent pattern mining, clustering, text summarization, information extraction, and named-entity recognition. This work environment allows scholars to access digital materials that are stored in a variety of formats, experiment with different algorithms, and use supercomputing power to provide new visualizations and discover new relationships between data.

SEASR uses both a service-oriented architecture (SOA) and semantic web computing 772 to address four key research needs: (1) to transform semi- or unstructured data (including natural language texts) into structured data; (2) to improve automatic knowledge discovery through analytics; (3) to support collaborative scholarship through a VRE; and (4) to promote open-source development and community involvement through sharing user applications developed through Meandre in a community repository. A number of digital humanities projects have used SEASR, including the Networked Environment for Music Analysis (NEMA) 773 and the MONK (Metadata Offer New Knowledge) project. 774

TextGrid

TextGrid began work in 2006 and has evolved into a joint project of 10 partners with funding through 2012. The project is working to create an infrastructure for a VRE in the humanities that consists of two key components: (1) a TextGrid repository that will serve as a “long-term archive for research data in the humanities, embedded in a grid infrastructure” and will “ensure long-term availability and access to its research data as well as interoperability”; and (2) a “TextGrid Laboratory” that will serve as the point of entry to the VRE and provide access to both existing and new tools. 775 TextGrid is soliciting continuous feedback on the TextGrid Laboratory in order to improve it, add new tools, and address interface issues. A beta version of the TextGrid Laboratory can be downloaded. 776

771 http://seasr.org/meandre/documentation/
772 http://seasr.org/documentation/overview/
773 http://www.music-ir.org/q=node/12
774 http://monkproject.org/. For more on their use of SEASR in text mining and named-entity recognition, see Vuillemot et al. (2009).

The TextGrid project has published extensively on its work, and that literature is briefly reviewed here. TextGrid’s initial audience was philologists, and the project's early work established a “community grid for the collaborative editing, annotation, analysis, and publication of specialist texts” (Aschenbrenner et al. 2009). Initial research conducted by TextGrid largely focused on this aspect of the project (Dimitriadis et al. 2006, Gietz et al. 2006), particularly the development of philological editions and services for scholars. More recent research by the TextGrid group has presented detailed information on the technical architecture of the project and how it relates to the larger world of eHumanities and cyberinfrastructure (Aschenbrenner et al. 2009, Ludwig and Küster 2008, Zielinski et al. 2009).

The creators of TextGrid maintain that in the humanities, the key resources are data, “knowledge about data,” and services, and that the challenge is to process and connect these resources (Aschenbrenner et al. 2009). The intellectual content and the community that uses it are at the core of TextGrid. While the majority of content within TextGrid will be textual, image resources have also been provided by a number of German institutions, and these two resources will be merged into a “virtual library” using the Globus toolkit grid infrastructure. This will allow TextGrid to provide seamless searching over federated archives while still allowing the data to remain distributed and supporting the addition of new organizations (Ludwig and Küster 2008). The authors also discussed the difficulties of creating digital content that will need to be used and preserved for a time that will outlast any system design:

Furthermore, the typical project duration of eHumanities projects and in particular that for the elaboration of critical editions, academic dictionaries and large linguistic corpora is often long—many years at least, often decades, sometimes even centuries. The time span during which those resources remain pertinent for then current research can be much longer still. In this it by far surpasses the average lifetime of any particular generation of software technology (Ludwig and Küster 2008).

They thus pointed out the importance of content stability and of designing content that can be ported<br />

into new systems as time passes, since digital resources created will likely be used for far longer than<br />

the individual services designed to use them.<br />

TextGrid’s developers recommended creating an open-service environment with robust and general<br />

services that can ultimately be used to “form the basis for value added services and, eventually, domain<br />

specific services and tailored applications” (Aschenbrenner et al. 2009). The creators of TextGrid thus<br />

argued that to be successful, a digital humanities project must first create an open infrastructure with<br />

fairly generic services while at the same time promoting community creation of specialized<br />

applications and workflows that can motivate community participation and greater uptake. In fact,<br />

active community building has been one of TextGrid’s most dynamic tasks during both project design<br />

and development, and the project team has designed specific documentation and communication for its three<br />

intended user groups (i.e., visitors/users, data providers, and tool developers).<br />

775 http://www.textgrid.de/en/ueber-textgrid.html<br />

776 http://www.textgrid.de/en/beta.html<br />

TextGrid has designed a service oriented architecture in multiple layers: (1) the application<br />

environment or TextGrid access point that is Eclipse based and geared towards philologists; (2)<br />

services, or “building blocks of specialized functionality,” including functionalities such as<br />

tokenization, lemmatization, and collation that are “wrapped into individual services to be re-used by<br />

other services or plugged into an application environment”; (3) the TextGrid middleware; and (4)<br />

stable archives (Aschenbrenner et al. 2009). They have also developed a semantic service registry for<br />

TextGrid. Zielinski et al. have offered a concise summary of their approach:<br />

The TextGrid infrastructure is a multilayered system created with the motivation to hide the<br />

complex grid infrastructure … from the scholars and to make it possible to integrate external<br />

services with TextGrid tools. Basically in this service oriented architecture (SOA), there are<br />

three layers: the user interface, a services layer with tools for textual analysis and text<br />

processing, and the TextGrid middleware, which itself includes multiple layers (Zielinski et al.<br />

2009).<br />

Nonetheless, the TextGrid project faced a variety of data interoperability challenges in using<br />

the TEI as its basic form of markup, since partner projects used the TEI at varying levels of<br />

sophistication. While the developers did not want to sacrifice the depth of semantic encoding the TEI offered,<br />

they needed to define a minimum “abstraction level” necessary to promote larger interoperability of<br />

computational processes in TextGrid (Blanke et al. 2008). As a solution, TextGrid developed a “core”<br />

encoding approach:<br />

… which follows a simple principle: one can always go from a higher semantic degree to a<br />

lower semantic degree; and in possession of a suitable transformation script, this mapping can<br />

be done automatically. TextGrid encourages all its participating projects to describe their data<br />

in an XML-based markup that is suitable for their specific research questions. At the same time<br />

projects can register a mapping from their specific, semantically deep data to the respective<br />

TextGrid-wide core encoding that is a reasonably expressive TEI-subset (Blanke et al. 2008).<br />

TextGrid’s solution thus attempts to respect the sophisticated encoding of local practices while<br />

maintaining a basic level of interoperability. This illustrates the difficulties of supporting cross-corpora<br />

searching even within one project. All content that is created with the help of TextGridLab or comes<br />

from external resources is saved unchanged to the TextGrid repository. Metadata are extracted and<br />

normalized before being stored in central metadata storage, and a full-text index is extracted from the<br />

raw data repository and updated with all changes (Ludwig and Küster 2008).<br />
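
The core-encoding principle quoted above, in which a project registers a mapping from its own semantically deep markup down to a shared TEI subset, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the element names in `PROJECT_TO_CORE`, the function `to_core`, and the choice of target elements are hypothetical and are not taken from TextGrid's actual core encoding, and TEI namespaces are omitted for brevity. The sketch only shows the direction of the transformation, from a higher to a lower semantic degree: some project-specific elements are renamed to simpler core elements, and others are flattened so that their text survives while the finer-grained markup is dropped.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping registered by one project (not TextGrid's real schema).
# Keys are project-specific element names; values are elements of an assumed
# "core" subset. None means: keep the text content, drop the markup.
PROJECT_TO_CORE = {
    "speech": "sp",
    "speakerName": "speaker",
    "verseLine": "l",
    "metricalFoot": None,
}


def _append_text(parent, text):
    """Append text at the current end of parent's content."""
    if not text:
        return
    if len(parent):
        last = parent[-1]
        last.tail = (last.tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def _copy_into(src, dst, mapping):
    """Copy src's text and children into dst, applying the mapping."""
    _append_text(dst, src.text)
    for child in src:
        # Elements without a registered mapping pass through unchanged.
        target = mapping.get(child.tag, child.tag)
        if target is None:
            # Flatten: keep the child's content, discard its tag.
            _copy_into(child, dst, mapping)
        else:
            new_child = ET.SubElement(dst, target, dict(child.attrib))
            _copy_into(child, new_child, mapping)
        _append_text(dst, child.tail)


def to_core(root, mapping):
    """Return a new tree in which deep markup is reduced to the core subset."""
    tag = mapping.get(root.tag, root.tag) or root.tag
    new_root = ET.Element(tag, dict(root.attrib))
    _copy_into(root, new_root, mapping)
    return new_root
```

Applied to a fragment such as `<speech><speakerName>Chorus</speakerName><verseLine n="1"><metricalFoot>Arma</metricalFoot> virumque</verseLine></speech>`, `to_core` yields `<sp><speaker>Chorus</speaker><l n="1">Arma virumque</l></sp>`. The reverse direction cannot be recovered from the core form alone, which is consistent with saving the original content unchanged in the repository.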

The TextGridLab tool (an Eclipse-based GUI) is intended to help users create TEI resources that can<br />

live within the data grid. Although TEI documents form a large part of the resources in TextGrid, it can<br />

handle heterogeneous data formats (plain text, TEI/XML, images). TextGrid also provides a number of<br />

basic services (tokenizers, lemmatizers, sorting tools, streaming editors, collation tools) that can be<br />

used against its objects while also letting users create their own services within a Web services<br />

framework:<br />

Web Service frameworks are available for many programming languages—so if a person or<br />

institution wishes to make his/her text processing tool available to the TextGrid community and<br />

the workflow engine, the first step is to implement a Web Service wrapper for the tool and<br />

deploy it on a public server (or one of TextGrid’s). The next steps are to apply for registration<br />

in the TextGrid service registry and to provide a client plug-in for the Eclipse GUI so that the<br />

tool is accessible for humans (GUI) and machines (service registry) alike (Zielinski et al. 2009).
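
The first step described in the passage above, wrapping an existing text-processing tool as a Web service, might look like the following minimal Python sketch. It is not TextGrid's actual service interface: the regular-expression tokenizer standing in for a real tool, the JSON response format, the names `tokenize`, `handle_body`, `TokenizerHandler`, and `serve`, and the port number are all assumptions chosen for illustration. Registration in the service registry and the Eclipse client plug-in are not shown.

```python
import json
import re
from http.server import BaseHTTPRequestHandler, HTTPServer


def tokenize(text):
    """Stand-in for an existing tool: split text into word tokens."""
    return re.findall(r"\w+", text)


def handle_body(body):
    """Service logic: UTF-8 text in, JSON list of tokens out (as bytes)."""
    text = body.decode("utf-8")
    result = {"tokens": tokenize(text)}
    return json.dumps(result, ensure_ascii=False).encode("utf-8")


class TokenizerHandler(BaseHTTPRequestHandler):
    """HTTP wrapper: POST plain text, receive the tokens as JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = handle_body(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host="localhost", port=8080):
    """Run the wrapper until interrupted (host and port are illustrative)."""
    HTTPServer((host, port), TokenizerHandler).serve_forever()
```

Calling `serve()` starts the wrapper locally; a client can then POST plain text to it and receive the token list as JSON. Keeping the service logic in `handle_body`, apart from the HTTP plumbing, lets the same function be reused or tested without a running server.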



This architecture makes it easy to extend the TextGrid framework to work with other digital<br />

humanities applications. Further details on how users can search the TextGrid collections can be found<br />

in Zielinski et al. (2009).<br />

TextVRE<br />

TextVRE 777 is a recently initiated project by the Center for e-Research (CeRch) and the Center for<br />

Computing in the Humanities at King’s College London, the University of Sheffield Natural Language<br />

Processing Group, and the State and University Library at Göttingen. According to their project<br />

website:<br />

The overall aim of TEXTvre is to support the complete lifecycle of research in e-Humanities<br />

textual studies. The project provides researchers with advanced services to process and analyse<br />

research texts that are held in formally managed, metadata-rich institutional repositories. The<br />

access and analysis of textual research data will be supported by annotation and retrieval<br />

technology and will provide services for every step in the digital research life cycle.<br />

TEXTvre will build on the results of the TextGrid project, but will be adapted to U.K. needs and<br />

bring together the major organizations in “e-Humanities textual studies.” The project plans to embed<br />

itself within the daily workflow of researchers at King’s College and to be interoperable with<br />

institutional repository and data-management structures. This project is at its beginning stage but<br />

already offers a useful list of tools for potential inclusion in the “virtual research environment” it will<br />

develop. 778<br />

777 http://textvre.cerch.kcl.ac.uk/<br />

778 http://textvre.cerch.kcl.ac.uk/page_id=9



REFERENCES<br />

[ACLS 2006]. American Council of Learned Societies. Our Cultural Commonwealth: The Final<br />

Report of the American Council of Learned Societies Commission on Cyberinfrastructure for the<br />

Humanities & Social Sciences. ACLS, (2006). http://www.acls.org/cyberinfrastructure/<br />

[Agosti et al. 2005]. Agosti, Maristella, Nicola Ferro, and Nicola Orio. “Annotating Illuminated<br />

Manuscripts: an Effective Tool for Research and Education.” JCDL '05: Proceedings of the 5th<br />

ACM/IEEE-CS Joint Conference on Digital Libraries. New York, NY: ACM (2005): 121-130.<br />

http://dx.doi.org/10.1145/1065385.1065412<br />

[Allison 2001]. Allison, Penelope. Pompeian Households: An Analysis of Material Culture. Cotsen<br />

Institute of Archaeology, 2001.<br />

[Álvarez et al. 2010]. Álvarez, Fernando-Luis, Elena García-Barriocanal, and Joaquín-L<br />

Gómez-Pantoja. “Sharing Epigraphic Information as Linked Data.” Metadata and Semantic Research. Volume<br />

108 of Communications in Computer and Information Science, Berlin, Heidelberg: Springer, (2010):<br />

222-234. http://dx.doi.org/10.1007/978-3-642-16552-8_21. Open-access copy available at:<br />

http://www.zotero.org/paregorios/items/KP5GA2E6<br />

[Amin et al. 2008]. Amin, Alia, Jacco van Ossenbruggen, Lynda Hardman, and Annelies van Nispen.<br />

“Understanding Cultural Heritage Experts’ Information Seeking Needs.” JCDL '08: Proceedings of the<br />

8th ACM/IEEE-CS Joint Conference on Digital Libraries. New York, NY: ACM (2008): 39-47.<br />

http://dx.doi.org/10.1145/1378889.1378897<br />

[APA/AIA 2007]. APA/AIA Task Force on Electronic Publications: Final Report. American<br />

Philological Association/Archaeological Institute of America, Philadelphia, PA; Boston, MA,<br />

(March 2007). http://socrates.berkeley.edu/~pinax/pdfs/TaskForceFinalReport.pdf<br />

[APA 2010]. American Philological Association. “From Gatekeeper to Gateway: the Campaign for<br />

Classics in the 21st Century.” 2010. http://www.apaclassics.org/campaign/Full_Case.pdf<br />

[ARL 2009a]. Association of Research Libraries. “Establish a Universal, Open Library or Digital Data<br />

Commons.” Association of Research Libraries, (January 2009).<br />

http://www.arl.org/bm~doc/ibopenlibpsa2.pdf<br />

[ARL 2009b]. Association of Research Libraries. The Research Library's Role in Digital Repository<br />

Services: Final Report of the ARL Digital Repository Issues Task Force. Association of Research<br />

Libraries, (January 2009). http://www.arl.org/bm~doc/repository-services-report.pdf<br />

[Arms and Larsen 2007]. Arms, William Y., and Ronald L. Larsen. The Future of Scholarly<br />

Communication: Building the Infrastructure for Cyberscholarship. National Science Foundation,<br />

(September 2007). http://www.sis.pitt.edu/~repwkshop/SIS-NSFReport2.pdf



[Aschenbrenner et al. 2008]. Aschenbrenner, Andreas, Tobias Blanke, David Flanders, Mark Hedges,<br />

and Ben O'Steen. “The Future of Repositories? Patterns for (Cross-)Repository Architectures.” D-Lib<br />

Magazine, 14 (November 2008).<br />

http://www.dlib.org/dlib/november08/aschenbrenner/11aschenbrenner.html<br />

[Aschenbrenner et al. 2009]. Aschenbrenner, Andreas, Marc W. Küster, Christoph Ludwig, and<br />

Thorsten Vitt. “Open eHumanities Digital Ecosystems and the Role of Resource Registries.” DEST<br />

'09: 3rd IEEE International Conference Digital Ecosystems and Technologies, (June 2009): 745-750.<br />

http://dx.doi.org/10.1109/DEST.2009.5276672<br />

[Aschenbrenner et al. 2010]. Aschenbrenner, Andreas, Tobias Blanke, and Marc W. Küster. “Towards<br />

an Open Repository Environment.” Journal of Digital Information, 11 (March 2010).<br />

http://journals.tdl.org/jodi/article/view/758<br />

[Ashdowne 2009]. Ashdowne, Richard. “Accidence and Acronyms: Deploying Electronic Assessment<br />

in Support of Classical Language Teaching in a University Context.” Arts and Humanities in Higher<br />

Education, 8 (June 2009): 201-216. http://dx.doi.org/10.1177/1474022209102685<br />

[Audenaert and Furuta 2010]. Audenaert, Neal, and Richard Furuta. “What Humanists Want: How<br />

Scholars Use Source Materials.” JCDL '10: Proceedings of the 10th Annual Joint Conference on<br />

Digital Libraries. New York, NY: ACM, (2010): 283-292.<br />

http://dx.doi.org/10.1145/1816123.1816166<br />

[Bagnall 1980]. Bagnall, Roger. Research Tools for the Classics: the Report of the American<br />

Philological Association's Ad Hoc Committee on Basic Research Tools. Chico, CA: Scholars Press,<br />

1980.<br />

[Bagnall 2010]. Bagnall, Roger. “Integrating Digital Papyrology.” Paper presented at Online<br />

Humanities Scholarship: The Shape of Things to Come, (March 2010).<br />

http://hdl.handle.net/2451/29592<br />

[Baker et al. 2008]. Baker, Mark, Claire Fisher, Emma O’Riordan, Matthew Grove, Michael Fulford,<br />

Claire Warwick, Melissa Terras, Amanda Clarke, and Mike Rains. “VERA: A Virtual Environment for<br />

Research in Archaeology.” Fourth International Conference on e-Social Science, University of<br />

Manchester, (June 2008). http://www.ncess.net/events/conference/programme/fri/3abaker.pdf<br />

[Ball 2010]. Ball, Alex. Preservation and Curation in Institutional Repositories (Version 1.3). Digital<br />

Curation Centre, UKOLN, University of Bath, (March 2010).<br />

http://dcc.ac.uk/sites/default/files/documents/reports/irpc-report-v1.3.pdf<br />

[Bamman and Crane 2006]. Bamman, David, and Gregory Crane. “The Design and Use of a Latin<br />

Dependency Treebank.” TLT 2006: Proceedings of the Fifth International Treebanks and Linguistic<br />

Theories Conference, (2006): 67-78. http://hdl.handle.net/10427/42684



[Bamman and Crane 2007]. Bamman, David, and Gregory Crane. “The Latin Dependency Treebank in<br />

a Cultural Heritage Digital Library.” Proceedings of the Workshop on Language Technology for<br />

Cultural Heritage Data (LaTech 2007), (2007): 33-40. http://acl.ldc.upenn.edu/W/W07/W07-0905.pdf<br />

[Bamman and Crane 2008]. Bamman, David, and Gregory Crane. “Building a Dynamic Lexicon from<br />

a Digital Library.” JCDL '08: Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital<br />

Libraries. New York, NY: ACM, (2008): 11-20. Preprint available at:<br />

http://hdl.handle.net/10427/42686<br />

[Bamman and Crane 2009]. Bamman, David, and Gregory Crane. “Computational Linguistics and<br />

Classical Lexicography.” Digital Humanities Quarterly, 3 (January 2009).<br />

http://www.digitalhumanities.org/dhq/vol/3/1/000033.html<br />

[Bamman, Mambrini, and Crane 2009]. Bamman, David, Francesco Mambrini, and Gregory Crane.<br />

“An Ownership Model of Annotation: The Ancient Greek Dependency Treebank.” TLT 2009: Eighth<br />

International Workshop on Treebanks and Linguistic Theories, Milan, Italy, (November 2009).<br />

Preprint available at: http://www.perseus.tufts.edu/publications/tlt8.pdf<br />

[Bamman, Passarotti, and Crane 2008]. Bamman, David, Marco Passarotti, and Gregory Crane. “A<br />

Case Study in Treebank Collaboration and Comparison: Accusativus cum Infinitivo and Subordination<br />

in Latin.” Prague Bulletin of Mathematical Linguistics, 90 (December 2008): 109-122.<br />

http://ufal.mff.cuni.cz/pbml/90/art-bamman-et-al.pdf<br />

[Bankier and Perciali 2008]. Bankier, Jean-Gabriel, and Irene Perciali. “The Institutional Repository<br />

Rediscovered: What Can a University Do for Open Access Publishing.” Serials Review, 34 (March<br />

2008): 21-26. Also available at: http://works.bepress.com/jean_gabriel_bankier/1<br />

[Barker 2010]. Barker, Elton. “Repurposing Perseus: the Herodotus Encoded Space-Text-Image<br />

Archive (HESTIA).” Presentation given at NEH-DFG Workshop, Medford, MA, (January 2010).<br />

http://www.linguistik.hu-berlin.de/institut/professuren/korpuslinguistik/events-en/nehdfg/pdf/hestiapresentation<br />

[Barker et al. 2010]. Barker, Elton, Stefan Bouzarovski, Chris Pelling, and Leif Isaksen. “Mapping an<br />

Ancient Historian in a Digital Age: the Herodotus Encoded Space-Text-Image Archive (HESTIA).”<br />

Leeds International Classical Studies, 9 (March 2010).<br />

http://www.leeds.ac.uk/classics/lics/2010/201001.pdf<br />

[Barmpoutis et al. 2009]. Barmpoutis, Angelos, Eleni Bozia, and Robert S. Wagman. “A Novel<br />

Framework for 3D Reconstruction and Analysis of Ancient Inscriptions.” Machine Vision and<br />

Applications, 21 (2009): 989-998. http://dx.doi.org/10.1007/s00138-009-0198-7<br />

[Barrenechea 2006]. Barrenechea, Francisco. “A Fragment of Old Comedy: P. Columbia inv. 430.”<br />

Zeitschrift für Papyrologie und Epigraphik, 158 (2006): 49-54. http://www.jstor.org/stable/20191149<br />

[Bauer et al. 2008]. Bauer, Péter, Zsolt Hernáth, Zoltán Horváth, Gyula Mayer, Zsolt Parragi, Zoltán<br />

Porkoláb, and Zsolt Sztupák. “HypereiDoc—An XML Based Framework Supporting Cooperative Text<br />

Editions.” Advances in Databases and Information Systems (Lecture Notes in Computer Science,<br />

Volume 5207), (2008): 14-29. http://dx.doi.org/10.1007/978-3-540-85713-6_3<br />

[Baumann and Seales 2009]. Baumann, Ryan, and Brent W. Seales. “Robust Registration of<br />

Manuscript Images.” JCDL '09: Proceedings of the 2009 Joint International Conference on Digital<br />

Libraries. New York, NY: ACM, (2009): 263-266. http://dx.doi.org/10.1145/1555400.1555443<br />

[Beacham and Denard 2003]. Beacham, Richard, and Hugh Denard. “The Pompey Project: Digital<br />

Research and Virtual Reconstruction of Rome’s First Theatre.” Computers and the Humanities, 37<br />

(February 2003): 129-139. http://dx.doi.org/10.1023/A:1021859830043<br />

[Benardou et al. 2010a]. Benardou, Agiatis, Panos Constantopoulos, Costis Dallas, and Dimitris<br />

Gavrilis. “A Conceptual Model for Scholarly Research Activity.” iConference 2010, (February 2010).<br />

http://hdl.handle.net/2142/14945<br />

[Benardou et al. 2010b]. Benardou, Agiatis, Panos Constantopoulos, Costis Dallas, and Dimitris<br />

Gavrilis. “Understanding the Information Requirements of Arts and Humanities Scholarship.”<br />

International Journal of Digital Curation, 5 (July 2010).<br />

http://ijdc.net/index.php/ijdc/article/view/144/0<br />

[Berti et al. 2009]. Berti, Monica, Matteo Romanello, Alison Babeu, and Gregory Crane. “Collecting<br />

Fragmentary Authors in a Digital Library.” JCDL '09: Proceedings of the 2009 Joint International<br />

Conference on Digital Libraries. New York, NY: ACM, (2009): 259-262. Preprint available at:<br />

http://hdl.handle.net/10427/70401<br />

[Bilane et al. 2008]. Bilane, Petra, Stéphane Bres, and Hubert Emptoz. “Local Orientation Extraction<br />

for Wordspotting in Syriac Manuscripts.” Image and Signal Processing (Lecture Notes in Computer<br />

Science, Volume 5099), (2008): 481-489. http://dx.doi.org/10.1007/978-3-540-69905-7_55<br />

[Binding 2010]. Binding, Ceri. “Implementing Archaeological Time Periods Using CIDOC CRM and SKOS.” The Semantic Web: Research and Applications, LNCS 6088, Berlin, Heidelberg: Springer, (2010): 273-287. Preprint available at: http://hypermedia.research.glam.ac.uk/media/files/documents/2010-06-09/ESWC2010_binding_paper.pdf

[Binding et al. 2008]. Binding, Ceri, Keith May, and Douglas Tudhope. “Semantic Interoperability in Archaeological Datasets: Data Mapping and Extraction via the CIDOC CRM.” Research and Advanced Technology for Digital Libraries, (2008): 280-290. http://hypermedia.research.glam.ac.uk/media/files/documents/2008-07-05/binding_ECDL2008.pdf

[Bizer et al. 2007]. Bizer, Chris, Richard Cyganiak, and Tom Heath. “How to Publish Linked Data on the Web.” http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/LinkedDataTutorial/

[Blackwell and Crane 2009]. Blackwell, Christopher, and Gregory Crane. “Conclusion: Cyberinfrastructure, the Scaife Digital Library and Classics in a Digital Age.” Digital Humanities Quarterly, 3 (January 2009). http://www.digitalhumanities.org/dhq/vol/3/1/000035.html



[Blackwell and Martin 2009]. Blackwell, Chris, and Thomas R. Martin. “Technology, Collaboration, and Undergraduate Research.” Digital Humanities Quarterly, 3 (January 2009). http://www.digitalhumanities.org/dhq/vol/3/1/000024.html

[Blackwell and Smith 2009]. Blackwell, Christopher, and David Neel Smith. “Homer Multitext - Nine Year Update.” Digital Humanities 2009 Conference Abstracts, (June 2009): 6-8. http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf

[Blanke 2010]. Blanke, Tobias. “From Tools and Services to e-Infrastructure for the Arts and Humanities.” Production Grids in Asia, Part II (2010): 117-127. http://dx.doi.org/10.1007/978-1-4419-0046-3_10

[Blanke et al. 2008]. Blanke, Tobias, Andreas Aschenbrenner, Marc Küster, and C. Ludwig. “No Claims for Universal Solutions - Possible Lessons from Current e-Humanities Practices in Germany and the UK.” E-SCIENCE '08: Proceedings of the IEEE e-Humanities Workshop, (November 2008). http://www.clarin.eu/system/files/ClaimsUniversal-eHum2008.pdf

[Blanke et al. 2009]. Blanke, Tobias, Mark Hedges, and Stuart Dunn. “Arts and Humanities E-Science—Current Practices and Future Challenges.” Future Generation Computer Systems, 25 (April 2009): 474-480. http://dx.doi.org/10.1016/j.future.2008.10.004

[Blanke, Hedges, and Palmer 2009]. Blanke, Tobias, Mark Hedges, and Richard Palmer. “Restful Services for the e-Humanities — Web Services that Work for the e-Humanities Ecosystem.” 3rd IEEE International Conference on Digital Ecosystems Technologies (DEST 2009), (June 2009): 637-642. http://dx.doi.org/10.1109/DEST.2009.5276740

[Bodard 2006]. Bodard, Gabriel. “Inscriptions of Aphrodisias: Paradigm of an Electronic Publication.” CLiP 2006: Literatures, Languages and Cultural Heritage in a Digital World, (July 2006). http://www.cch.kcl.ac.uk/clip2006/redist/abstracts_pdf/paper33.pdf

[Bodard 2008]. Bodard, Gabriel. “The Inscriptions of Aphrodisias as Electronic Publication: A User's Perspective and a Proposed Paradigm.” Digital Medievalist, 4 (2008). http://www.digitalmedievalist.org/journal/4/bodard/

[Bodard 2009]. Bodard, Gabriel. “Digital Classicist: Re-use of Open Source and Open Access Publications in Ancient Studies.” Digital Humanities 2009 Conference Abstracts, (June 2009): 2. http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf

[Bodard and Garcés 2009]. Bodard, Gabriel, and Juan Garcés. “Open Source Critical Editions: A Rationale.” In Text Editing, Print and the Digital World. Burlington, VT: Ashgate Publishing, 2009, pp. 83-98.

[Bodard and Mahony 2010]. Bodard, Gabriel, and Simon Mahony, eds. Digital Research in the Study of Classical Antiquity. (Digital Research in the Arts and Humanities Series). Burlington, VT: Ashgate Publishing, 2010.



[Bodard et al. 2009]. Bodard, Gabriel, Tobias Blanke, and Mark Hedges. “Linking and Querying Ancient Texts: a Case Study with Three Epigraphic/Papyrological Datasets.” Digital Humanities 2009 Conference Abstracts, (June 2009): 2-4. http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf

[Bolter 1991]. Bolter, Jay D. “The Computer, Hypertext, and Classical Studies.” The American Journal of Philology, 112 (1991): 541-545. http://dx.doi.org/10.2307/294933

[Borgman 2009]. Borgman, Christine L. “The Digital Future is Now: A Call to Action for the Humanities.” (2009). http://works.bepress.com/cgi/viewcontent.cgi?article=1232&context=borgman

[Boschetti 2007]. Boschetti, Federico. “Methods to Extend Greek and Latin Corpora with Variants and Conjectures: Mapping Critical Apparatuses onto Reference Text.” CL 2007: Proceedings of the Corpus Linguistics Conference, University of Birmingham, UK, (July 2007). http://ucrel.lancs.ac.uk/publications/CL2007/paper/150_Paper.pdf

[Boschetti 2009]. Boschetti, Federico. “Digital Aeschylus - Breadth and Depth Issues in Digital Libraries.” Workshop on Advanced Technologies for Digital Libraries 2009 (AT4DL 2009), (September 2009): 5-8. http://www.unibz.it/en/public/universitypress/publications/all/Documents/9788860460301.pdf

[Boschetti et al. 2009]. Boschetti, Federico, Matteo Romanello, Alison Babeu, David Bamman, and Gregory Crane. “Improving OCR Accuracy for Classical Critical Editions.” Proceedings of the 13th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2009), Corfu, Greece, (2009): 156-167. Preprint available at: http://hdl.handle.net/10427/70402

[Bowman et al. 2009]. Bowman, Alan K., R. S. O. Tomlin, and Klaas Worp. “Emptio Bovis Frisica: The ‘Frisian Ox Sale’ Reconsidered.” Journal of Roman Studies, 99 (2009): 156-170.

[Bowman et al. 2010]. Bowman, Alan K., Charles V. Crowther, Ruth Kirkham, and John Pybus. “A Virtual Research Environment for the Study of Documents and Manuscripts.” In Digital Research in the Study of Classical Antiquity (eds. Gabriel Bodard and Simon Mahony). Burlington, VT: Ashgate Publishing, 2010, pp. 87-103.

[Bozzi and Calabretto 1997]. Bozzi, Andrea, and Sylvie Calabretto. “The Digital Library and Computational Philology: The BAMBI Project.” Research and Advanced Technology for Digital Libraries, (Lecture Notes in Computer Science, Volume 1324), (1997): 269-285. http://dx.doi.org/10.1007/BFb0026733

[Bradley 2005]. Bradley, John. “What You (Fore)see is What You Get: Thinking about Usage Paradigms for Computer Assisted Text Analysis.” Text Technology, 14 (2005): 1-19. Also available online: http://texttechnology.mcmaster.ca/pdf/vol14_2/bradley14-2.pdf

[Bradley 2008]. Bradley, John. “Pliny: A Model for Digital Support of Scholarship.” Journal of Digital Information, 9 (May 2008). http://journals.tdl.org/jodi/article/view/209/198



[Bradley and Short 2005]. Bradley, John, and Harold Short. “Texts into Databases: The Evolving Field of New-style Prosopography.” Literary & Linguistic Computing, 20 (January 2005): 3-24. http://dx.doi.org/10.1093/llc/fqi022

[Breuel 2009]. Breuel, Thomas. “Applying the OCRopus OCR System to Scholarly Sanskrit Literature.” Sanskrit Computational Linguistics (Lecture Notes in Computer Science, Volume 5402), (2009): 391-402. http://dx.doi.org/10.1007/978-3-642-00155-0_21

[Brown and Greengrass 2010]. Brown, Stephen, and Mark Greengrass. “Research Portals in the Arts and Humanities.” Literary & Linguistic Computing, 25 (April 2010): 1-21. http://dx.doi.org/10.1093/llc/fqp032

[Brunner 1987]. Brunner, Theodore F. “Data Banks for the Humanities: Learning from Thesaurus Linguae Graecae.” Scholarly Communication, 7 (1987): 1, 6-9.

[Brunner 1991]. Brunner, Theodore F. “The Thesaurus Linguae Graecae: Classics and the Computer.” Library Hi-Tech, 9 (1991): 61-67.

[Brunner 1993]. Brunner, Theodore F. “Classics and the Computer: The History of a Relationship.” In Accessing Antiquity: The Computerization of Classical Studies (ed. Jon Solomon). London and Tucson: University of Arizona Press, 1993, pp. 10-33.

[Buchanan et al. 2005]. Buchanan, George, Sally Jo Cunningham, Ann Blandford, Jon Rimmer, and Claire Warwick. “Information Seeking by Humanities Scholars.” ECDL '05: Proceedings of the 9th European Conference on Research and Advanced Technology for Digital Libraries. Berlin-Heidelberg, Springer, (2005): 218-229. Preprint available at http://eprints.ucl.ac.uk/5155/1/5155.pdf

[Buchanan 2010]. Buchanan, Sarah. “Accessioning the Digital Humanities: Report from the 1st Archival Education and Research Institute.” Digital Humanities Quarterly, 4 (August 2010). http://www.digitalhumanities.org/dhq/vol/4/1/000084/000084.html

[Büchler et al. 2008]. Büchler, Marco, Gerhard Heyer, and Sabine Grunder. “eAQUA - Bringing Modern Text Mining Approaches to Two Thousand Years Old Ancient Texts.” e-Humanities – an emerging discipline: Workshop in the 4th IEEE International Conference on e-Science, (December 2008). http://www.clarin.eu/system/files/2008-09-05-IEEE2008-eAQUA-project.pdf

[Büchler and Geßner 2009]. Büchler, Marco, and Annette Geßner. “Citation Detection and Textual Reuse on Ancient Greek Texts.” DHCS 2009-Chicago Colloquium on Digital Humanities and Computer Science, Chicago, (November 2009). http://lingcog.iit.edu/%7Eargamon/DHCS09-Abstracts/Buechler-Gessner.pdf

[Bulacu and Schomaker 2007]. Bulacu, Marius, and Lambert Schomaker. “Automatic Handwriting Identification on Medieval Documents.” ICIAP 2007: 14th International Conference on Image Analysis and Processing, (2007): 279-284. http://www.ai.rug.nl/~bulacu/iciap2007-bulacuschomaker.pdf



[Bulger et al. 2011]. Bulger, Monica, Eric T. Meyer, Grace de la Flor, Melissa Terras, Sally Wyatt, Marina Jirotka, Katherine Eccles, and Christine Madsen. Reinventing Research? Information Practices in the Humanities. Research Information Network. April 2011. http://www.rin.ac.uk/system/files/attachments/Humanities_Case_Studies_for_screen.pdf

[Bulst 1989]. Bulst, Neithard. “Prosopography and the Computer: Problems and Possibilities.” In History and Computing II (ed. Peter Denley). Manchester, UK: Manchester University Press, 1989, pp. 12–18. Postprint available at http://repositories.ub.uni-bielefeld.de/biprints/volltexte/2010/4053

[Byrne 2007]. Byrne, Kate. “Nested Named Entity Recognition in Historical Archive Text.” ICSC 2007: International Conference on Semantic Computing, (2007): 589-596.

[Cahill and Passamano 2007]. Cahill, Jane, and James A. Passamano. “Full Disclosure Matters.” Near Eastern Archaeology, 70 (December 2007): 194-196. http://www.alexandriaarchive.org/publications/KansaKansaSchultz_NEADec07.pdf

[Calanducci et al. 2009]. Calanducci, Antonio, Jorge Sevilla, Roberto Barbera, Giuseppe Andronico, Monica Saso, Alessandro De Filippo, Stefania Iannizzotto, Domenico Vicinanza, and Francesco De Mattia. “Cultural Heritage Digital Libraries on Data Grids.” Proceedings of the 13th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2009), Corfu, Greece, (2009): 469-472. http://dx.doi.org/10.1007/978-3-642-04346-8_61

[Campbell 2007]. Campbell, Douglas. “Identifying the Identifiers.” 2007 Proceedings of the International Conference on Dublin Core and Metadata Applications, (2007): 74-84. http://www.dcmipubs.org/ojs/index.php/pubs/article/view/34/16

[Candela et al. 2007]. Candela, Leonardo, Donatella Castelli, Pasquale Pagano, Costantino Thanos, Yannis Ioannidis, Georgia Koutrika, Seamus Ross, Hans Jörg Schek, and Heiko Schuldt. “Setting the Foundations of Digital Libraries: The DELOS Manifesto.” D-Lib Magazine, 13 (2007). http://www.dlib.org/dlib/march07/castelli/03castelli.html

[Cantara 2006]. Cantara, Linda. “Long-Term Preservation of Digital Humanities Scholarship.” OCLC Systems & Services, 22 (2006): 38-42.

[Carroll et al. 2007]. Carroll, James L., Robbie Haertel, Peter McClanahan, Eric Ringger, and Kevin Seppi. “Modeling the Annotation Process for Ancient Corpus Creation.” Proceedings of the International Conference of Electronic Corpora of Ancient Languages (ECAL), (2007), pp. 25-42. http://james.jlcarroll.net/publications/12%20Modeling%20the%20Annotation%20Process%20for%20Ancient%20Corpus%20Creation%20ver%202.pdf

[Carusi and Reimer 2010]. Carusi, Annamaria, and Torsten Reimer. “Virtual Research Environment Collaborative Landscape Study: A JISC Funded Project.” Joint Information Systems Committee, (January 2010). http://www.jisc.ac.uk/media/documents/publications/vrelandscapereport.pdf

[Carver 2007]. Carver, Martin. “Archaeology Journals, Academics and Open Access.” European Journal of Archaeology, 10 (August 2007): 135-148.



[Casadio and Lambek 2005]. Casadio, Claudia, and Jim Lambek. “A Computational Algebraic Approach to Latin Grammar.” Research on Language & Computation, 3 (April 2005): 45-60. http://dx.doi.org/10.1007/s11168-005-1286-0

[Cayless 2003]. Cayless, Hugh. “Developing a Toolkit for Digital Epigraphy.” ALLC/ACH 2003, (2003): 155-157. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.5363&rep=rep1&type=pdf

[Cayless 2008]. Cayless, Hugh A. “Linking Page Images to Transcriptions with SVG.” Balisage: The Markup Conference 2008, (August 2008): 12-15. http://www.balisage.net/Proceedings/vol1/html/Cayless01/BalisageVol1-Cayless01.html

[Cayless 2009]. Cayless, Hugh A. “Image as Markup: Adding Semantics to Manuscript Images.” Digital Humanities 2009 Conference Abstracts, (June 2009): 83-84. http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf

[Cayless et al. 2009]. Cayless, Hugh, Charlotte Roueché, Tom Elliott, and Gabriel Bodard. “Epigraphy in 2017.” Digital Humanities Quarterly, 3 (January 2009). http://www.digitalhumanities.org/dhq/vol/3/1/000030.html

[Cayless 2010a]. Cayless, Hugh. “Digitized Manuscripts and Open Licensing.” Digital Proceedings of the Lawrence J. Schoenberg Symposium on Manuscript Studies in the Digital Age, 2 (1), Article 7, (2010). http://repository.upenn.edu/ljsproceedings/vol2/iss1/7

[Cayless 2010b]. Cayless, Hugh A. “Ktêma es aiei: Digital Permanence from an Ancient Perspective.” In Digital Research in the Study of Classical Antiquity (eds. Gabriel Bodard and Simon Mahony). Burlington, VT: Ashgate Publishing, 2010, pp. 139-150.

[Cayless 2010c]. Cayless, Hugh A. “Making a New Numbers Server for Papyri.info.” Scriptio Continua, (March 2, 2010). http://philomousos.blogspot.com/2010/03/making-new-numbers-serverfor.html

[Childs and Kagan 2008]. Childs, Terry S., and Seth Kagan. “A Decade of Study into Repository Fees for Archaeological Collections.” Studies in Archeology and Ethnography, Number 6. Archeology Program, National Park Service, Washington, DC, (2008). http://www.nps.gov/archeology/PUBS/studies/STUDY06A.htm

[Choudhury 2008]. Choudhury, Sayeed G. “Case Study in Data Curation at Johns Hopkins University.” Library Trends, 57 (2008): 211-220. http://muse.jhu.edu/journals/library_trends/v057/57.2.choudhury.html



[Choudhury and Stinson 2007]. Choudhury, Sayeed G., and Timothy L. Stinson. “The Virtual Observatory and the Roman de la Rose: Unexpected Relationships and the Collaborative Imperative.” Academic Commons, (December 2007). http://www.academiccommons.org/commons/essay/VO-and-roman-de-la-rose-collaborativeimperative

[Choudhury et al. 2006]. Choudhury, Sayeed G., Tim DiLauro, Robert Ferguson, Michael Droettboom, and Ichiro Fujinaga. “Document Recognition for a Million Books.” D-Lib Magazine, 12 (2006). http://www.dlib.org/dlib/march06/choudhury/03choudhury.html

[Ciechomski et al. 2004]. Ciechomski, Pablo de Heras, Branislav Ulicny, Rachel Cetre, and Daniel Thalmann. “A Case Study of a Virtual Audience in a Reconstruction of an Ancient Roman Odeon in Aphrodisias.” VAST 2004: The 5th International Symposium on Virtual Reality, Archaeology and Cultural Heritage, (2004). http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90.3403&rep=rep1&type=pdf

[Ciula 2009]. Ciula, Arianna. “The Palaeographical Method Under the Light of a Digital Approach.” Kodikologie und Paläographie im digitalen Zeitalter-Codicology and Palaeography in the Digital Age. Norderstedt: Books on Demand, 2009, pp. 219-235. Also available online at: http://kups.ub.uni-koeln.de/volltexte/2009/2971/

[Clarysse and Thompson 2006]. Clarysse, Willy, and Dorothy J. Thompson. Counting the People in Hellenistic Egypt, Volume 2: Historical Studies (Cambridge Classical Studies). Cambridge: Cambridge University Press, 2006.

[Clocksin 2003]. Clocksin, William F. “Towards Automatic Transcription of Syriac Handwriting.” Proceedings of the 12th Image Analysis and Processing Conference, (2003): 664-669. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1234126

[Cohen et al. 2004]. Cohen, Jonathan, Donald Duncan, Dean Snyder, Jerrold Cooper, Subodh Kumar,<br />

Daniel Hahn, Yuan Chen, Budirijanto Purnomo, and John Graettinger. “iClay: Digitizing Cuneiform.”<br />

VAST 2004: Proceedings of the Fifth International Symposium on Virtual Reality, Archaeology, and<br />

Cultural Heritage, (2004): 135-143. Open access copy available at:<br />

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.73.87&rep=rep1&type=pdf<br />

[Cohen et al. 2009]. Cohen, Dan, Neil Fraistat, Matthew Kirschenbaum, and Tom Scheinfeldt. Tools<br />

for Data-Driven Scholarship: Past, Present and Future. A Report on the Workshop of 22-24 October<br />

2008. Turf Valley Resort, Ellicott City, Maryland. March 2009, Center for History and New Media,<br />

George Mason University and Maryland Institute for Technology and the Humanities.<br />

http://mith.umd.edu/tools/final-report.html<br />

[Cohen 2010]. Cohen, Patricia. “Scholars Test Web Alternative to Peer Review.” New York Times,<br />

August 23, 2010, Published online at: http://www.nytimes.com/2010/08/24/arts/24peer.html?_r=1<br />

[Collins 2008]. Collins, Derek. “Mapping the Entrails: The Practice of Greek Hepatoscopy.”<br />

American Journal of Philology, 129 (2008): 319-345.



[Connaway and Dickey 2010]. Connaway, Lynn S., and Timothy J. Dickey. The Digital Information<br />

Seeker: Report of Findings from Selected OCLC, RIN and JISC User Behaviour Projects. Higher<br />

Education Funding Council, (March 2010).<br />

http://www.jisc.ac.uk/media/documents/publications/reports/2010/digitalinformationseekerreport.pdf<br />

[Crane 1991]. Crane, Gregory. “Generating and Parsing Classical Greek.” Literary & Linguistic<br />

Computing, 6 (January 1991): 243-245. http://llc.oxfordjournals.org/cgi/content/abstract/6/4/243<br />

[Crane 1998]. Crane, Gregory. “New Technologies for Reading: The Lexicon and the Digital Library.”<br />

The Classical World, 91 (1998): 471-501. http://www.jstor.org/stable/4352154<br />

[Crane 2004]. Crane, Gregory. “Classics and the Computer: an End of the History.” In A Companion<br />

to Digital Humanities. Oxford: Blackwell Publishers, 2004, pp. 46-55. Available online at:<br />

http://www.digitalhumanities.org/companion/view?docId=blackwell/9781405103213/9781405103213.xml&chunk.id=ss1-2-4&toc.depth=1&toc.id=ss1-2-4&brand=9781405103213_brand<br />

[Crane 2005]. Crane, Gregory. “In a Digital World, No Book is an Island: Designing Electronic<br />

Primary Sources and Reference Works for the Humanities.” In Creation, Deployment and Use of<br />

Digital Information, Mahwah, New Jersey: Lawrence Erlbaum Associates, 2005, pp. 11-26.<br />

[Crane 2008]. Crane, Gregory. “Repositories, Cyberinfrastructure, and the Humanities.” EDUCAUSE<br />

Review, 6 (November 2008).<br />

http://www.educause.edu/EDUCAUSE+Review/EDUCAUSEReviewMagazineVolume43/RepositoriesCyberinfrastructur/163269<br />

[Crane and Jones 2006]. Crane, Gregory, and Alison Jones. “Text, Information, Knowledge and the<br />

Evolving Record of Humanity.” D-Lib Magazine, 12 (March 2006).<br />

http://www.dlib.org/dlib/march06/jones/03jones.html<br />

[Crane, Babeu, and Bamman 2007]. Crane, Gregory, Alison Babeu, and David Bamman. “eScience<br />

and the Humanities.” International Journal on Digital Libraries, 7 (October 2007): 117-122. Preprint<br />

available at: http://hdl.handle.net/10427/42690<br />

[Crane, Seales, and Terras 2009]. Crane, Gregory, Brent Seales, and Melissa Terras.<br />

“Cyberinfrastructure for Classical Philology.” Digital Humanities Quarterly, 3 (January 2009).<br />

http://www.digitalhumanities.org/dhq/vol/3/1/000023.html#<br />

[Crane et al. 2006]. Crane, Gregory, David Bamman, Lisa Cerrato, Alison Jones, David Mimno,<br />

Adrian Packel, David Sculley, and Gabriel Weaver. “Beyond Digital Incunabula: Modeling the Next<br />

Generation of Digital Libraries.” Proceedings of the 10th European Conference on Digital Libraries<br />

(ECDL 2006). September 2006, 353-366. Preprint available at: http://hdl.handle.net/10427/36131<br />



[Crane et al. 2009a]. Crane, Gregory, Alison Babeu, David Bamman, Thomas Breuel, Lisa Cerrato,<br />

Daniel Deckers, Anke Lüdeling, David Mimno, Rashmi Singhal, David A. Smith, and Amir Zeldes.<br />

“Classics in the Million Book Library.” Digital Humanities Quarterly, 3 (2009).<br />

http://www.digitalhumanities.org/dhq/vol/003/1/000034.html<br />

[Crane et al. 2009b]. Crane, Gregory, Alison Babeu, David Bamman, Lisa Cerrato, and Rashmi<br />

Singhal. “Tools for Thinking: ePhilology and Cyberinfrastructure.” In Working Together or Apart:<br />

Promoting the Next Generation of Digital Scholarship: Report of a Workshop Cosponsored by the<br />

Council on Library and Information Resources and the National Endowment for the Humanities.<br />

Council on Library and Information Resources, Publication Number 145, (March 2009): 16-26.<br />

http://www.clir.org/activities/digitalscholar2/crane11_11.pdf<br />

[Csernel and Patte 2009]. Csernel, Marc, and François Patte. “Critical Edition of Sanskrit Texts.”<br />

Sanskrit Computational Linguistics, (Lecture Notes in Computer Science, Volume 5402), Springer,<br />

(2009): 358-379. http://dx.doi.org/10.1007/978-3-642-00155-0_19<br />

[Dalbello et al. 2006]. Dalbello, Marija, Irene Lopatovska, Patricia Mahony, and Nomi Ron.<br />

“Electronic Texts and the Citation System of Scholarly Journals in the Humanities: Case Studies of<br />

Citation Practices in the Fields of Classical Studies and English Literature.” Proceedings of Libraries<br />

in the Digital Age (LIDA), (2006).<br />

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.83.7025&rep=rep1&type=pdf<br />

[Dallas and Doorn 2009]. Dallas, Costis, and Peter Doorn. “Report on the Workshop on Digital<br />

Curation in the Human Sciences at ECDL 2009: Corfu, 30 September - 1 October 2009.” D-Lib<br />

Magazine, 15 (November 2009). http://www.dlib.org/dlib/november09/dallas/11dallas.html<br />

[D’Andrea and Niccolucci 2008]. D’Andrea, Andrea, and Franco Niccolucci. “Mapping, Embedding<br />

and Extending: Pathways to Semantic Interoperability The Case of Numismatic Collections.” Fifth<br />

European Semantic Web Conference Workshop: SIEDL 2008-Semantic Interoperability in the<br />

European Digital Library, (2008): 63-76.<br />

http://image.ntua.gr/swamm2006/SIEDLproceedings.pdf#page=69<br />

[Deckers et al. 2009]. Deckers, Daniel, Lutz Koll, and Cristina Vertan. “Representation and Encoding<br />

of Heterogeneous Data in a Web Based Research Environment for Manuscript and Textual Studies.”<br />

Kodikologie und Paläographie im digitalen Zeitalter-Codicology and Palaeography in the Digital Age.<br />

Norderstedt: Books on Demand, 2009. Also available online at:<br />

http://kups.ub.uni-koeln.de/volltexte/2009/2962/<br />

[de Jong 2009]. de Jong, Franciska. “Invited Talk: NLP and the Humanities: The Revival of an Old<br />

Liaison.” Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009),<br />

Athens, Greece, (2009): 10-15. http://www.aclweb.org/anthology-new/E/E09/E09-1002.pdf<br />



[de la Flor et al. 2010a]. Flor, Grace de la, Paul Luff, Marina Jirotka, John Pybus, Ruth Kirkham, and<br />

Annamaria Carusi. “The Case of the Disappearing Ox: Seeing Through Digital Images to an Analysis<br />

of Ancient Texts.” CHI '10: Proceedings of the 28th International Conference on Human Factors in<br />

Computing Systems. New York, NY: ACM, (2010): 473-482.<br />

http://dx.doi.org/10.1145/1753326.1753397. Open access copy available at:<br />

https://www.sugarsync.com/pf/D259396_79_12126677.<br />

[de la Flor et al. 2010b]. Flor, Grace de la, Marina Jirotka, Paul Luff, John Pybus, and Ruth Kirkham.<br />

“Transforming Scholarly Practice: Embedding Technological Interventions to Support the<br />

Collaborative Analysis of Ancient Texts.” Computer Supported Cooperative Work (CSCW), (April<br />

2010): 1-26. http://dx.doi.org/10.1007/s10606-010-9111-1. Open-access copy available at:<br />

http://ora.ouls.ox.ac.uk/objects/uuid%3A1e7ac7c3-f0e3-4fe8-847e-f97645a3f7c6<br />

[Dekhtyar et al. 2005]. Dekhtyar, Alex, Ionut E. Iacob, Jerzy W. Jaromczyk, Kevin Kiernan, Neil<br />

Moore, and Dorothy Carr Porter. “Support for XML Markup of Image-Based Electronic Editions.”<br />

International Journal on Digital Libraries, 6 (2006): 55-69. Preprint available at:<br />

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.103.2471&rep=rep1&type=pdf<br />

[Deufert et al. 2010]. Deufert, Marcus, Judith Blumenstein, Andreas Trebesius, Stefan Beyer, and<br />

Marco Büchler. “Objective Detection of Plautus’ Rules by Computer Support.” Digital Humanities<br />

2010 Conference Abstracts, (July 2010): 126-128.<br />

http://dh2010.cch.kcl.ac.uk/academic-programme/abstracts/papers/pdf/book-final.pdf<br />

[Dik and Whaling 2009]. Dik, Helma, and Richard Whaling. “Implementing Greek Morphology.”<br />

Digital Humanities 2009 Conference Abstracts, (June 2009): 338-339.<br />

http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf<br />

[Dimitriadis et al. 2006]. Dimitriadis, Alexis, Marc Kemps-Snijders, Peter Wittenburg, M. Everaert,<br />

and S. Levinson. “Towards a Linguist's Workbench Supporting eScience Methods.” E-SCIENCE '06:<br />

Proceedings of the Second IEEE International Conference on e-Science and Grid Computing. IEEE<br />

Computer Society, (2006). http://www.lat-mpi.eu/papers/papers-2006/escience-sketch-final2.pdf/view<br />

[Doerr and Iorizzo 2008]. Doerr, Martin, and Dolores Iorizzo. “The Dream of a Global Knowledge<br />

Network—A New Approach.” Journal on Computing and Cultural Heritage, 1 (June 2008): 1-23.<br />

http://dx.doi.org/10.1145/1367080.1367085<br />

[Dogan and Scharsky 2008]. Dogan, Zeki, and Alfred Scharsky. “Virtual Unification of the Earliest<br />

Christian Bible: Digitisation, Transcription, Translation and Physical Description of the Codex<br />

Sinaiticus.” ECDL 2008: Proceedings of the 12th European Conference on Research and Advanced<br />

Technology for Digital Libraries, (2008): 221-226. http://dx.doi.org/10.1007/978-3-540-87599-4_22<br />

[Doumat et al. 2008]. Doumat, Reim, Elöd E. Zsigmond, Jean M. Pinon, and Emese Csiszar. “Online<br />

Ancient Documents: Armarius.” DocEng '08: Proceeding of the Eighth ACM Symposium on Document<br />

Engineering. New York, NY: ACM, (2008): 127-130. http://dx.doi.org/10.1145/1410140.1410167<br />



[Dué and Ebbott 2009]. Dué, Casey, and Mary Ebbott. “Digital Criticism: Editorial Standards for the<br />

Homer Multitext.” Digital Humanities Quarterly, 3 (January 2009).<br />

http://www.digitalhumanities.org/dhq/vol/3/1/000029.html#<br />

[Dunn 2009]. Dunn, Stuart. “Dealing with the Complexity Deluge: VREs in the Arts and Humanities.”<br />

Library Hi Tech, 27 (2009): 205-216. http://dx.doi.org/10.1108/07378830910968164<br />

[Dunn 2010]. Dunn, Stuart. “Space as an Artefact: A Perspective on ‘Neogeography’ from the Digital<br />

Humanities.” In Digital Research in the Study of Classical Antiquity (eds. Gabriel Bodard and Simon<br />

Mahony). Burlington, VT: Ashgate Publishing, 2010, pp. 53-69.<br />

[Dutschke 2008]. Dutschke, Consuelo W. “Digital Scriptorium: Ten Years Young, and Working on<br />

Survival.” Storicamente, 4, (2008).<br />

http://www.storicamente.org/02_tecnostoria/filologia_digitale/dutschke.html#_ftn1<br />

[Ebeling 2007]. Ebeling, Jarle. “The Electronic Text Corpus of Sumerian Literature.” Corpora, 2<br />

(2007): 111-120. http://www.eupjournals.com/doi/abs/10.3366/cor.2007.2.1.111<br />

[Eder 2007]. Eder, Maciej. “How Rhythmical is Hexameter: A Statistical Approach to Ancient Epic<br />

Poetry.” Digital Humanities 2007 Conference Abstracts, (May 2007).<br />

http://www.digitalhumanities.org/dh2007/abstracts/xhtml.xq?id=137<br />

[Edmond and Schreibman 2010]. Edmond, Jennifer, and Susan Schreibman. “European Elephants in<br />

the Room (Are They the Ones With the Bigger or Smaller Ears?)” Online Humanities Scholarship:<br />

The Shape of Things to Come, Rice University Press, (March 2010).<br />

http://cnx.org/content/m34307/1.2/<br />

[Edmunds 1995]. Edmunds, Lowell. “Review: Computers and the Classics: The Third Phase.” Arion,<br />

3, no. 2/3, (1995): 317-350. http://www.jstor.org/stable/20163591<br />

[Edwards et al. 2004]. Edwards, Jaety, Yee W. Teh, David Forsyth, Roger Bock, and Michael Maire.<br />

“Making Latin Manuscripts Searchable Using gHMMs.” NIPS, (2004).<br />

http://books.nips.cc/papers/files/nips17/NIPS2004_0550.pdf<br />

[Eglin et al. 2006]. Eglin, Véronique, Frank Lebourgeois, S. Bres, Hubert Emptoz, Yann Leydier,<br />

Ikram Moalla, and F. Drira. “Computer Assistance for Digital Libraries: Contributions to Middle-Ages<br />

and Authors' Manuscripts Exploitation and Enrichment.” DIAL 2006: Second International<br />

Conference on Document Image Analysis for Libraries, (2006). http://dx.doi.org/10.1109/DIAL.2006.9<br />

[Eiteljorg 2004]. Eiteljorg, II, Harrison. “Computing for Archaeologists.” In A Companion to Digital<br />

Humanities (eds. Susan Schreibman, Ray Siemens, and John Unsworth). Oxford: Blackwell, 2004.<br />

http://www.digitalhumanities.org/companion/view?docId=blackwell/9781405103213/9781405103213.xml&chunk.id=ss1-2-2&toc.depth=1&toc.id=ss1-2-2&brand=9781405103213_brand<br />



[Eiteljorg 2005]. Eiteljorg, Harrison. “Archiving Archaeological Data - Is There a Viable Business<br />

Model for a Repository?” CSA Newsletter, Vol XVII, No. 3, Winter (2005).<br />

http://csanet.org/newsletter/winter05/nlw0501.html<br />

[Eiteljorg and Limp 2008]. Eiteljorg, II, Harrison, and W. Frederick Limp. Archaeological Computing.<br />

2nd ed. Bryn Mawr, PA: Center for the Study of Architecture, December 2008.<br />

http://archcomp.csanet.org/<br />

[Elliott 2008]. Elliott, Michelle. “Introducing the ‘Digital Archaeological Record.’” (2008).<br />

http://www.tdar.org/confluence/download/attachments/131075/tDAR_Instructions.pdf?version=1&modificationDate=1218787124032<br />

[Elliott and Gillies 2009a]. Elliott, Tom, and Sean Gillies. “Data and Code for Ancient Geography:<br />

Shared Effort Across Projects and Disciplines.” Digital Humanities 2009 Conference Abstracts, (June<br />

2009): 4-6.<br />

http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf<br />

[Elliott and Gillies 2009b]. Elliott, Tom, and Sean Gillies. “Digital Geography and Classics.” Digital<br />

Humanities Quarterly, 3 (January 2009). http://www.digitalhumanities.org/dhq/vol/3/1/000031.html<br />

[Emery and Toth 2009]. Emery, Doug, and Michael B. Toth. “Integrating Images and Text with<br />

Common Data and Metadata Standards in the Archimedes Palimpsest.” Digital Humanities 2009<br />

Conference Abstracts, (June 2009): 281-283.<br />

http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf<br />

[Ernst-Gerlach and Crane 2008]. Ernst-Gerlach, Andrea, and Gregory Crane. “Identifying Quotations<br />

in Reference Works and Primary Materials.” Proceedings of the 12th European Conference on<br />

Research and Advanced Technology for Digital Libraries, (2008): 78-87.<br />

http://dx.doi.org/10.1007/978-3-540-87599-4_9<br />

[Feijen et al. 2007]. Feijen, Martin, Wolfram Horstmann, Paolo Manghi, Mary Robinson, and<br />

Rosemary Russell. “DRIVER: Building the Network for Accessing Digital Repositories across<br />

Europe.” Ariadne, 53 (October 2007). http://www.ariadne.ac.uk/issue53/feijen-et-al/<br />

[Feraudi-Gruénais 2010]. Feraudi-Gruénais, Francisca, ed. Latin on Stone: Epigraphic Research and<br />

Electronic Archives (Roman Studies: Interdisciplinary Approaches). Lanham, MD: Lexington Books,<br />

2010. http://www.worldcat.org/oclc/526091486<br />

[Finkel and Stump 2009]. Finkel, Raphael, and Gregory Stump. “What Your Teacher Told You is<br />

True: Latin Verbs Have Four Principal Parts.” Digital Humanities Quarterly, 3 (January 2009).<br />

http://www.digitalhumanities.org/dhq/vol/3/1/000032.html<br />

[Flanders 2009]. Flanders, David F. Fedorazon: Final Report. Joint Information Systems Committee,<br />

(November 2009). http://ie-repository.jisc.ac.uk/426/



[Flaten 2009]. Flaten, Arne R. “The Ashes2Art Project: Digital Models of Fourth-Century BCE<br />

Delphi, Greece.” Visual Resources: An International Journal of Documentation, 25 (December 2009):<br />

345-362. http://dx.doi.org/10.1080/01973760903331783<br />

[Forstall and Scheirer 2009]. Forstall, Christopher W., and W. J. Scheirer. “Features from Frequency:<br />

Authorship and Stylistic Analysis Using Repetitive Sound.” DHCS 2009-Chicago Colloquium on<br />

Digital Humanities and Computer Science. November 2009.<br />

http://lingcog.iit.edu/%7Eargamon/DHCS09-Abstracts/Forstall.pdf<br />

[Forte and Siliotti 1997]. Forte, Maurizio, and Alberto Siliotti. Virtual Archaeology: Re-Creating<br />

Ancient Worlds. New York, NY: H.N. Abrams, 1997.<br />

[Fraser 2005]. Fraser, Michael. “Virtual Research Environments: Overview and Activity.” Ariadne, 44<br />

(July 2005). http://www.ariadne.ac.uk/issue44/fraser/<br />

[Fraser 2008]. Fraser, Bruce L. “Beyond Definition: Organising Semantic Information in Bilingual<br />

Dictionaries.” International Journal of Lexicography, 21 (March 2008): 69-93.<br />

http://dx.doi.org/10.1093/ijl/ecn002. Preprint available at:<br />

http://www.dspace.cam.ac.uk/bitstream/1810/223842/1/IJL PrePrint.pdf<br />

[Friedlander 2009]. Friedlander, Amy. “Asking Questions and Building a Research Agenda for Digital<br />

Scholarship.” In Working Together or Apart: Promoting the Next Generation of Digital Scholarship:<br />

Report of a Workshop Cosponsored by the Council on Library and Information Resources and the<br />

National Endowment for the Humanities. Council on Library and Information Resources, Publication<br />

Number 145 (March 2009): 1-15. http://www.clir.org/pubs/reports/pub145/pub145.pdf<br />

[Frischer et al. 2005]. Frischer, Bernie, John Unsworth, Arienne Dwyer, Anita Jones, Lew Lancaster,<br />

Geoffrey Rockwell, and Roy Rosenzweig. Summit on Digital Tools for the Humanities: Report on<br />

Summit Accomplishments. The University of Virginia, September 28, 2005.<br />

http://www.iath.virginia.edu/dtsummit/SummitText.pdf<br />

[Fulford et al. 2010]. Fulford, Michael G., Emma J. O’Riordan, Amanda Clarke, and Michael Rains.<br />

“Silchester Roman Town: Developing Virtual Research Practice 1997-2008.” In Digital Research in<br />

the Study of Classical Antiquity (eds. Gabriel Bodard and Simon Mahony). Burlington, VT: Ashgate<br />

Publishing, 2010, pp. 16-34.<br />

[Fusi 2008]. Fusi, Daniele. “An Expert System for the Classical Languages: Metrical Analysis
Components.” (2008). http://fusisoft.it/Doc/ActaVenezia.pdf

[Gelernter and Lesk 2008]. Gelernter, Judith, and Michael E. Lesk. “Traditional Resources Help
Interpret Texts.” BooksOnline '08: Proceeding of the 2008 ACM Workshop on Research Advances in
Large Digital Book Repositories. New York, NY: ACM, (2008): 17-20.
http://dx.doi.org/10.1145/1458412.1458418

[Gietz et al. 2006]. Gietz, Peter, Andreas Aschenbrenner, Stefan Budenbender, Fotis Jannidis, Marc W.
Küster, Christoph Ludwig, Wolfgang Pempe, Thorsten Vitt, Werner Wegstein, and Andrea Zielinski.
“TextGrid and eHumanities.” E-SCIENCE '06: Proceedings of the Second IEEE International
Conference on e-Science and Grid Computing. Washington, DC: IEEE Computer Society, (2006).
http://dx.doi.org/10.1109/E-SCIENCE.2006.133

[Gill 2009]. Gill, Alyson A. “Digitizing the Past: Charting New Courses in the Modeling of Virtual
Landscapes.” Visual Resources: An International Journal of Documentation, 25 (2009): 313-332.
http://dx.doi.org/10.1080/01973760903331809

[Goudriaan et al. 1995]. Goudriaan, Koen, Kees Mandemakers, Jogchum Reitsma, and Peter Stabel,
eds. Prosopography and Computer: Contributions of Mediaevalists and Modernists on the Use of
Computer in Historical Research. Leuven, 1995.

[Graham and Ruffini 2007]. Graham, Shawn, and Giovanni Ruffini. “Network Analysis and
Greco-Roman Prosopography.” In Prosopography Approaches and Applications: A Handbook. Ed. K. S. B.
Keats-Rohan. Oxford: Unit for Prosopographical Research, Linacre College, University of Oxford,
2007, pp. 325-336.

[Green and Roy 2008]. Green, David, and Michael Roy. “Things to Do While Waiting for the Future
to Happen: Building Cyberinfrastructure for the Liberal Arts.” EDUCAUSE Review, 43 (July 2008).
http://connect.educause.edu/display/46969

[Gruber 2009]. Gruber, Ethan. “Encoded Archival Description for Numismatic Collections.”
Presentation made at Computer Applications and Quantitative Methods in Archaeology, (March 2009).
http://coins.lib.virginia.edu/documentation/caa2009.pdf

[Guidi et al. 2006]. Guidi, Gabriele, Bernard Frischer, Michele Russo, Alessandro Spinetti, Luca
Carosso, and Laura Micoli. “Three-Dimensional Acquisition of Large and Detailed Cultural Heritage
Objects.” Machine Vision and Applications, 17 (December 2006): 349-360.
http://dx.doi.org/10.1007/s00138-006-0029-z

[Haertel et al. 2010]. Haertel, Robbie A., Peter McClanahan, and Eric K. Ringger. “Automatic
Diacritization for Low-Resource Languages Using a Hybrid Word and Consonant CMM.” Human
Language Technologies: The 2010 Annual Conference of the North American Chapter of the
Association for Computational Linguistics. Morristown, NJ: Association for Computational
Linguistics, (2010): 519-527. http://www.aclweb.org/anthology/N/N10/N10-1076.pdf

[Hahn et al. 2006]. Hahn, Daniel V., Donald D. Duncan, Kevin C. Baldwin, Jonathon D. Cohen, and
Budirijanto Purnomo. “Digital Hammurabi: Design and Development of a 3D Scanner for Cuneiform
Tablets.” Proceedings of SPIE, Vol 6056: Three-Dimensional Image Capture and Applications VII,
(January 2006). http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.73.1965

[Hanson 2001]. Hanson, Ann E. “Papyrology: Minding Other People's Business.” Transactions of the
American Philological Association, 131 (2001): 297-313. http://www.jstor.org/stable/20140974



[Hardwick 2000]. Hardwick, Lorna. “Electrifying the Canon: The Impact of Computing on Classical
Studies.” Computers and the Humanities, 34 (August 2000): 279-295.
http://dx.doi.org/10.1023/A:1002089109613

[Harley et al. 2006a]. Harley, Diane, Jonathan Henke, Shannon Lawrence, Ian Miller, Irene Perciali,
David Nasatir, Charis Kaskiris, and Cara Bautista. Use and Users of Digital Resources: A Focus on
Undergraduate Education in the Humanities and Social Sciences. Center for Studies in Higher
Education, (April 5, 2006). http://www.escholarship.org/uc/item/8c43w24h

[Harley et al. 2006b]. Harley, Diane, Jonathan Henke, and Shannon Lawrence. “Why Study Users? An
Environmental Scan of Use and Users of Digital Resources in Humanities and Social Sciences
Undergraduate Education.” Research & Occasional Paper Series: CSHE.15.06. Center for Studies in
Higher Education, (September 2006). http://www.escholarship.org/uc/item/61g3s91k

[Harley et al. 2010]. Harley, Diane, Sophia K. Acord, Sarah Earl-Novell, Shannon Lawrence, and
C. Judson King. eScholarship: Assessing the Future Landscape of Scholarly Communication.
University of California, Berkeley: Center for Studies in Higher Education, (January 2010).
http://escholarship.org/uc/cshe_fsc

[Heath 2010]. Heath, Sebastian. “Diversity and Reuse of Digital Resources for Ancient Mediterranean
Material Culture.” In Digital Research in the Study of Classical Antiquity (eds. Gabriel Bodard and
Simon Mahony). Burlington, VT: Ashgate Publishing Company, 2010, pp. 35-52. Online copy
available at: http://sebastianheath.com/files/HeathS2010-DigitalResearch.pdf

[Hedges 2009]. Hedges, Mark. “Grid-Enabling Humanities Datasets.” Digital Humanities Quarterly, 3
(4), (2009). http://www.digitalhumanities.org/dhq/vol/3/4/000078/000078.html#

[Hellwig 2007]. Hellwig, Oliver. “SanskritTagger: A Stochastic Lexical and POS Tagger for Sanskrit.”
Sanskrit Computational Linguistics, (Lecture Notes in Computer Science, Volume 5402), Berlin,
Heidelberg: Springer (2009): 266-277. http://dx.doi.org/10.1007/978-3-642-00155-0_11

[Hellwig 2010]. Hellwig, Oliver. “Etymological Trends in the Sanskrit Vocabulary.” Literary &
Linguistic Computing, 25 (1) (October 2009): 105-118. http://dx.doi.org/10.1093/llc/fqp034

[Hillen 2007]. Hillen, Michael, (translated by Kathleen M. Coleman). “Finishing the TLL in the
Digital Age: Opportunities, Challenges, Risks.” Transactions of the American Philological
Association, 137 (2007): 491-495.

[Hilse and Kothe 2006]. Hilse, Hans Werner, and Jochen Kothe. Implementing Persistent Identifiers.
Research and Development Department of the Goettingen State and University Library, (November
2006). http://www.knaw.nl/ecpa/publ/pdf/2732.pdf

[Honigman 2004]. Honigman, Sylvie. “Abraham in Egypt: Hebrew and Jewish-Aramaic Names in
Egypt and Judaea in Hellenistic and Early Roman Times.” Zeitschrift für Papyrologie und Epigraphik,
146 (2004): 279-297.



[Huet 2004]. Huet, Gérard. “Design of a Lexical Database for Sanskrit.” ElectricDict '04: Proceedings
of the Workshop on Enhancing and Using Electronic Dictionaries. Morristown, NJ: Association for
Computational Linguistics, (2004), 8-14. http://portal.acm.org/citation.cfm?id=1610042.1610045

[Hunt et al. 2005]. Hunt, Leta, Marilyn Lundberg, and Bruce Zuckerman. “InscriptiFact: A Virtual
Archive of Ancient Inscriptions from the Near East.” International Journal on Digital Libraries, 5
(May 2005): 153-166. Preprint available at: http://www.inscriptifact.com/news/JDL.pdf

[IFLA 1998]. International Federation of Library Associations (IFLA). Functional Requirements for
Bibliographic Records: Final Report, Volume 19 of UBCIM Publications-New Series. München:
K. G. Saur. http://www.ifla.org/VII/s13/frbr/frbr.pdf

[Isaksen 2008]. Isaksen, Leif. “The Application Of Network Analysis To Ancient Transport
Geography: A Case Study Of Roman Baetica.” Digital Medievalist, (2008).
http://www.digitalmedievalist.org/journal/4/isaksen/

[Isaksen 2009]. Isaksen, Leif. “Augmenting Epigraphy.” Paper presented at “Object Artefact Script
Workshop.” (October 8-9 2009). http://wiki.esi.ac.uk/Object_Artefact_Script_Isaksen

[Isaksen 2010]. Isaksen, Leif. “Reading Between the Lines: Unearthing Structure in Ptolemy's
Geography.” Digital Classicist/ICS Work in Progress Seminar, (June 2010).
http://www.digitalclassicist.org/wip/wip2010-01li.pdf

[Jackson et al. 2009]. Jackson, Mike, Mario Antonioletti, Alastair Hume, Tobias Blanke, Gabriel
Bodard, Mark Hedges, and Shrija Rajbhandari. “Building Bridges Between Islands of Data - An
Investigation into Distributed Data Management in the Humanities.” e-Science '09: Fifth IEEE
International Conference on e-Science, (December 2009): 33-39.
http://dx.doi.org/10.1109/e-Science.2009.13

[James 1912]. James, Montague R. A Descriptive Catalogue of The Manuscripts in the Library of
Corpus Christi College Cambridge. 2 vols. Cambridge, UK: Cambridge University Press, 1912.

[Jaworski 2008]. Jaworski, Wojciech. “Contents Modelling of Neo-Sumerian Ur III Economic Text
Corpus.” COLING '08: Proceedings of the 22nd International Conference on Computational
Linguistics. Morristown, NJ: Association for Computational Linguistics, (2008): 369-376.
http://www.aclweb.org/anthology/C08-1047

[Jeffrey et al. 2009a]. Jeffrey, Stuart, Julian Richards, Fabio Ciravegna, Stewart Waller, Sam
Chapman, and Ziqui Zhang. “The Archaeotools Project: Faceted Classification And Natural Language
Processing in an Archaeological Context.” Philosophical Transactions of the Royal Society, A 367
(June 2009): 2507-2519. http://rsta.royalsocietypublishing.org/content/367/1897/2507?rss=1.abstract

[Jeffrey et al. 2009b]. Jeffrey, Stuart, Julian Richards, Fabio Ciravegna, Stewart Waller, Sam
Chapman, and Ziqui Zhang. “Integrating Archaeological Literature Into Resource Discovery Interfaces
Using Natural Language Processing And Name Authority Services.” 5th IEEE International
Conference on E-Science Workshops, IEEE, (December 2009): 184-187.
http://dx.doi.org/10.1109/ESCIW.2009.5407967

[Jones 2010]. Jones, Charles E. “Going AWOL (ancientworldonline.blogspot.com/): Thoughts on
Developing a Tool for the Organization and Discovery of Open Access Scholarly Resources for the
Study of the Ancient World.” CSA Newsletter, XXII (January 2010).
http://csanet.org/newsletter/winter10/nlw1001.html

[Kainz 2009]. Kainz, Chad J. “Bamboo: Another Part of the Jigsaw.” Presentation at Digital
Humanities, (June 2009). http://projectbamboo.org/files/presentations/0906_Bamboo_Jigsaw.pdf

[Kampel and Zaharieva 2008]. Kampel, Martin, and Maia Zaharieva. “Recognizing Ancient Coins
Based on Local Features.” Advances in Visual Computing (Lecture Notes in Computer Science, Vol.
5358), (2008): 11-22. http://dx.doi.org/10.1007/978-3-540-89639-5_2

[Kansa et al. 2007]. Kansa, Sarah W., Eric C. Kansa, and Jason M. Schultz. “An Open Context for
Near Eastern Archaeology.” Near Eastern Archaeology, 70 (December 2007): 188-194.
http://www.alexandriaarchive.org/publications/KansaKansaSchultz_NEADec07.pdf

[Kansa et al. 2010]. Kansa, Eric, Sarah Kansa, Margie Burton, and Cindy Stankowski. “Googling the
Grey: Open Data, Web Services, and Semantics.” Archaeologies, 6 (August 2010): 301-326.
http://dx.doi.org/10.1007/s11759-010-9146-4

[Kilbride 2005]. Kilbride, William. “Past, Present and Future: XML, Archaeology and Digital
Preservation.” CSA Newsletter, Vol. XVII, No. 3, Winter (2005).
http://www.csanet.org/newsletter/winter05/nlw0502.html

[Kiraz 1994]. Kiraz, George A. “Automatic Concordance Generation of Syriac Texts.” In
Symposium Syriacum, no. 6, 1992, (ed. R. Lavenant). Cambridge, 1994, pp. 461-475.

[Kiraz 2000]. Kiraz, George A. “Multitiered Nonlinear Morphology Using Multitape Finite Automata:
A Case Study on Syriac and Arabic.” Computational Linguistics, 26 (March 2000): 77-105.

[Knight and Pennock 2008]. Knight, Gareth, and Maureen Pennock. “Data Without Meaning:
Establishing the Significant Properties of Digital Research.” iPRES 2008: The Fifth International
Conference on Preservation of Digital Objects, (September 2008).
http://www.bl.uk/ipres2008/presentations_day1/16_Knight.pdf

[Knoll et al. 2009]. Knoll, Adolf, Tomáš Psohlavec, Stanislav Psohlavec, and Zdeněk Uhlíř. “Creation
of an International Digital Library of Manuscripts: Seamless Access to Data from Heterogeneous
Resources (ENRICH Project).” ELPUB 2009: 13th International Conference on Electronic
Publishing: Rethinking Electronic Publishing: Innovation in Communication Paradigms and
Technologies, (June 2009): 335-347.
http://conferences.aepic.it/index.php/elpub/elpub2009/paper/view/71/30



[Koller et al. 2009]. Koller, David, Bernard Frischer, and Greg Humphreys. “Research Challenges for
Digital Archives of 3D Cultural Heritage Models.” Journal on Computing and Cultural Heritage, 2
(3), (2009): 1-17. http://dx.doi.org/10.1145/1658346.1658347

[Kretzschmar 2009]. Kretzschmar, William A. “Large-Scale Humanities Computing Projects: Snakes
Eating Tails, or Every End is a New Beginning.” Digital Humanities Quarterly, 3 (June 2009).
http://www.digitalhumanities.org/dhq/vol/3/2/000038.html

[Krmnicek and Probst 2010]. Krmnicek, Stefan, and Peter Probst. “Open Access, Classical Studies and
Publication by Postgraduate Researchers.” Archaeolog, (May 22, 2010).
http://traumwerk.stanford.edu/archaeolog/2010/05/open_access_classical_studies.html

[Kumar et al. 2003]. Kumar, Subodh, Dean Snyder, Donald Duncan, Jonathan Cohen, and Jerry
Cooper. “Digital Preservation of Ancient Cuneiform Tablets Using 3D-Scanning.” Proceedings of the
Fourth International Conference on 3-D Digital Imaging and Modeling (3DIM’03), (October 2003):
326-333.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.58.9706&rep=rep1&type=pdf

[Kurtz et al. 2009]. Kurtz, Donna, Greg Parker, David Shotton, Graham Klyne, Florian Schroff,
Andrew Zisserman, and Yorick Wilks. “CLAROS - Bringing Classical Art to a Global Public.” Fifth
IEEE International Conference on e-Science '09, (December 2009): 20-27.
http://www.clarosnet.org/PID1023719.pdf

[Latousek 2001]. Latousek, Rob. “Fifty Years of Classical Computing: A Progress Report.” CALICO
Journal, 18 (2001): 211-222. https://calico.org/html/article_479.pdf

[Lee 2007]. Lee, John. “A Computational Model of Text Reuse in Ancient Literary Texts.”
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, Prague,
Czech Republic: Association for Computational Linguistics, (2007): 472-479.
http://acl.ldc.upenn.edu/P/P07/P07-1060.pdf

[Lee 2008]. Lee, John. “A Nearest-Neighbor Approach to the Automatic Analysis of Ancient Greek
Morphology.” CoNLL '08: Proceedings of the Twelfth Conference on Computational Natural
Language Learning. Morristown, NJ: Association for Computational Linguistics, (2008): 127-134.
http://www.aclweb.org/anthology/W/W08/W08-2117.pdf

[Leydier et al. 2007]. Leydier, Yann, Frank LeBourgeois, and Hubert Emptoz. “Text Search for
Medieval Manuscript Images.” Pattern Recognition, 40 (December 2007): 3552-3567.
http://dx.doi.org/10.1016/j.patcog.2007.04.024

[Leydier et al. 2009]. Leydier, Yann, Asma Ouji, Frank LeBourgeois, and Hubert Emptoz. “Towards
An Omnilingual Word Retrieval System For Ancient Manuscripts.” Pattern Recognition, 42,
(September 2009): 2089-2105. http://dx.doi.org/10.1016/j.patcog.2009.01.026

[Lock 2003]. Lock, Gary. Using Computers In Archaeology: Towards Virtual Pasts. London, UK:
Routledge, 2003.



[Lockyear 2007]. Lockyear, Kris. “Where Do We Go From Here? Recording and Analysing Roman
Coins from Archaeological Excavations.” Britannia, (November 2007): 211-224.
http://www.jstor.org/stable/30030574

[Lüdeling and Zeldes 2007]. Lüdeling, Anke, and Amir Zeldes. “Three Views on Corpora: Corpus
Linguistics, Literary Computing, and Computational Linguistics.” Jahrbuch für Computerphilologie, 9
(2007): 149-178. http://computerphilologie.tu-darmstadt.de/jg07/luedzeldes.html

[Ludwig and Küster 2008]. Ludwig, Christoph, and Marc Wilhelm Küster. “Digital Ecosystems of
eHumanities Resources and Services.” DEST 2008: 2nd IEEE International Conference on Digital
Ecosystems and Technologies, (2008): 476-481. http://dx.doi.org/10.1109/DEST.2008.4635178

[Lynch 2002]. Lynch, Clifford. “Digital Collections, Digital Libraries and the Digitization of Cultural
Heritage Information.” First Monday, 7 (5), (May 2002).
http://www.firstmonday.org/issues/issue7_5/lynch/

[MacMahon 2006]. MacMahon, Cary. “Using and Sharing Online Resources in History, Classics and
Archaeology.” The Higher Education Academy--Subject Centre for History, Classics and
Archaeology, (2006).
http://www.heacademy.ac.uk/assets/hca/documents/UsingandSharingOnlineResourcesHCA.pdf

[Mahoney 2009]. Mahoney, Anne. “Tachypaedia Byzantina: The Suda On Line as Collaborative
Encyclopedia.” Digital Humanities Quarterly, 3 (January 2009).
http://www.digitalhumanities.org/dhq/vol/3/1/000025.html#

[Mahony 2006]. Mahony, Simon. “New Tools for Collaborative Research: the Example of the Digital
Classicist Wiki.” CLiP 2006: Literatures, Languages and Cultural Heritage in a Digital World,
(2006). http://www.cch.kcl.ac.uk/clip2006/content/abstracts/paper31.html

[Mahony 2011]. Mahony, Simon. “Research Communities and Open Collaboration: the Example of
the Digital Classicist Wiki.” Digital Medievalist, 6.
http://www.digitalmedievalist.org/journal/6/mahony/

[Mahony and Bodard 2010]. Mahony, Simon, and Gabriel Bodard. “Introduction.” In Digital Research
in the Study of Classical Antiquity (eds. G. Bodard and S. Mahony). Burlington, VT: Ashgate
Publishing, 2010, pp. 1-14.

[Mallon 2006]. Mallon, Adrian. “eLingua Latina: Designing a Classical-Language E-Learning<br />

Resource.” Computer Assisted Language Learning, 19 (2006): 373-387.<br />

[Malpas 2011]. Malpas, Constance. Cloud-Sourcing Research Collections: Managing Print in the<br />

Mass-Digitized Library Environment. OCLC Research.<br />

http://www.oclc.org/research/publications/library/2011/2011-01.pdf



[Manuelian 1998]. Manuelian, Peter Der. “Digital Epigraphy: An Approach to Streamlining<br />

Egyptological Epigraphic Method.” Journal of the American Research Center in Egypt, 35 (1998):<br />

97-113. http://www.jstor.org/stable/40000464<br />

[Marchionini and Crane 1994]. Marchionini, Gary, and Gregory Crane. “Evaluating Hypermedia and<br />

Learning: Methods and Results from the Perseus Project.” ACM Transactions on Information Systems<br />

(TOIS), 12 (January 1994): 5-34. http://dx.doi.org/10.1145/174608.174609<br />

[Marek 2009]. Marek, Jindřich. “Creating Document Management System for the Digital Library of<br />

Manuscripts: M-Tool and M-Can for Manuscriptorium.” Presentation given at TEI 2009 Members'<br />

Meeting, (November 2009).<br />

http://www.lib.umich.edu/spo/teimeeting09/files/TEI_MM_2009_MSS_SIG_Marek.pdf<br />

[Marinai 2009]. Marinai, Simone. “Text Retrieval from Early Printed Books.” AND '09: Proceedings<br />

of The Third Workshop on Analytics for Noisy Unstructured Text Data. New York, NY: ACM, (2009):<br />

33-40. http://dx.doi.org/10.1145/1568296.1568304<br />

[Marshall 2008]. Marshall, Catherine C. “From Writing and Analysis to the Repository: Taking the<br />

Scholars' Perspective on Scholarly Archiving.” JCDL '08: Proceedings of the 8th ACM/IEEE-CS Joint<br />

Conference on Digital Libraries. New York, NY: ACM, (2008): 251-260.<br />

http://dx.doi.org/10.1145/1378889.1378930. Preprint available at:<br />

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.141.352&rep=rep1&type=pdf<br />

[Martinez-Uribe and Macdonald 2009]. Martinez-Uribe, Luis, and Stuart Macdonald. “User<br />

Engagement in Research Data Curation.” Proceedings of the 13th European Conference on Research<br />

and Advanced Technology for Digital Libraries (ECDL 2009), Corfu, Greece, Springer, (2009):<br />

309-314. http://dx.doi.org/10.1007/978-3-642-04346-8_30.<br />

Preprint available at: http://edina.ac.uk/presentations_publications/Martinez_Macdonald_ECDL09.pdf<br />

[Mathisen 1988]. Mathisen, Ralph W. “Medieval Prosopography and Computers: Theoretical and<br />

Methodological Considerations.” Medieval Prosopography, 9 (1988): 73–128.<br />

[Mathisen 2007]. Mathisen, Ralph W. “Where are all the PDBs?: The Creation of Prosopographical<br />

Databases for the Ancient and Medieval Worlds.” Prosopography Approaches and Applications: A<br />

Handbook. Ed. K. S. B. Keats-Rohan. Oxford: Unit for Prosopographical Research, Linacre College,<br />

University of Oxford, 2007, pp. 95-126. Also available online at<br />

http://prosopography.modhist.ox.ac.uk/images/04%20Mathisen%20pdf.pdf<br />

[Matthews and Rahtz 2008]. Matthews, Elaine, and Sebastian Rahtz. “The Lexicon of Greek Personal<br />

Names and Classical Web Services.” Digital Classicist/ICS Work in Progress Seminar, (June 2008).<br />

http://www.digitalclassicist.org/wip/wip2008-01emsr.pdf<br />

[McClanahan et al. 2010]. McClanahan, Peter, George Busby, Robbie Haertel, Kristian Heal, Deryle<br />

Lonsdale, Kevin Seppi, and Eric Ringger. “A Probabilistic Morphological Analyzer for Syriac.”<br />

Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing.



Cambridge, MA: Association for Computational Linguistics, (2010): 810-820.<br />

http://www.aclweb.org/anthology-new/D/D10/D10-1079.pdf<br />

[McDonough 1959]. McDonough, James. “Computers and Classics.” Classical World, 53 (2),<br />

(November 1959): 44-50. http://www.jstor.org/stable/4344244<br />

[McDonough 2009]. McDonough, Jerome P. “Aligning METS with the OAI-ORE Data Model.” JCDL<br />

'09: Proceedings of the 9th ACM/IEEE-CS Joint Conference on Digital Libraries. New York, NY:<br />

ACM, (2009): 323-330. Preprint available at http://hdl.handle.net/2142/10744<br />

[McGillivray and Passarotti 2009]. McGillivray, Barbara, and Marco Passarotti. “The Development of<br />

the Index Thomisticus Treebank Valency Lexicon.” LaTeCH-SHELT&R '09: Proceedings of the EACL<br />

2009 Workshop on Language Technology and Resources for Cultural Heritage, Social Sciences,<br />

Humanities, and Education. Morristown, NJ: Association for Computational Linguistics, (2009):<br />

43-50. http://portal.acm.org/citation.cfm?id=1642049.1642055. Preprint available in proceedings at:<br />

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.164.7767&rep=rep1&type=pdf#page=52<br />

[McGovern 2007]. McGovern, Nancy Y. “A Digital Decade: Where Have We Been and Where Are<br />

We Going in Digital Preservation?” RLG DigiNews, 11 (April 2007).<br />

http://hdl.handle.net/2027.42/60441<br />

[McManamon and Kintigh 2010]. McManamon, Francis P., and Keith W. Kintigh. “Digital Antiquity:<br />

Transforming Archaeological Data into Knowledge.” The SAA Archaeological Record, (March 2010):<br />

37-40. http://www.tdar.org/images/SAA-McM-Kintigh.pdf<br />

[McManamon, Kintigh, and Brin 2010]. McManamon, Francis P., Keith W. Kintigh, and Adam Brin.<br />

“Digital Antiquity and the Digital Archaeological Record (tDAR): Broadening Access and Ensuring<br />

Long-Term Preservation for Digital Archaeological Data.” CSA Newsletter, Vol. XXIII, no. 2,<br />

(September 2010). http://csanet.org/newsletter/fall10/nlf1002.html<br />

[McManus and Rubino 2003]. McManus, Barbara F., and Carl A. Rubino. “Classics and Internet<br />

Technology.” The American Journal of Philology, 124 (2003): 601-608.<br />

http://www.jstor.org/stable/1561793?cookieSet=1<br />

[Meckseper and Warwick 2003]. Meckseper, Christiane, and Claire Warwick. “The Publication of<br />

Archaeological Excavation Reports Using XML.” Literary & Linguistic Computing, 18 (April 2003):<br />

63-75. http://dx.doi.org/10.1093/llc/18.1.63<br />

[Meyer et al. 2009]. Meyer, Eric T., Kathryn Eccles, and Christine Madsen. “Digitisation as<br />

e-Research Infrastructure: Access to Materials and Research Capabilities in the Humanities.” Oxford<br />

Internet Institute, Oxford University, (2009).<br />

http://www.ncess.ac.uk/resources/content/papers/Meyer(2).pdf<br />

[Miller and Richards 1994]. Miller, P., and J. D. Richards. “The Good, the Bad and the Downright<br />

Misleading: Archaeological Adoption of Computer Visualization.” CAA 94: Computer Applications<br />

and Quantitative Methods in Archaeology, (1994): 19-22.



[Mimno 2009]. Mimno, David. “Reconstructing Pompeian Households.” Applications of Topic Models<br />

Workshop-NIPS 2009, (2009). http://www.cs.umass.edu/~mimno/papers/pompeii.pdf<br />

[Mitcham, Niven, and Richards 2010]. Mitcham, Jenny, Kieron Niven, and Julian Richards.<br />

“Archiving Archaeology: Introducing the Guides to Good Practice.” Proceedings of the 7th<br />

International Conference on Preservation of Digital Objects. iPRES 2010. Vienna, Austria, September<br />

19-24, (September 2010).<br />

http://www.ifs.tuwien.ac.at/dp/ipres2010/papers/mitcham-54.pdf<br />

[Moalla et al. 2006]. Moalla, Ikram, Frank Lebourgeois, Hubert Emptoz, and Adel Alimi.<br />

“Contribution to the Discrimination of the Medieval Manuscript Texts: Application in the<br />

Palaeography.” Document Analysis Systems VII, (2006): 25-37. http://dx.doi.org/10.1007/11669487_3<br />

[Monella 2008]. Monella, Paolo. “Towards a Digital Model to Edit the Different Paratextuality Levels<br />

within a Textual Tradition.” Digital Medievalist, (March 2008).<br />

http://www.digitalmedievalist.org/journal/4/monella/<br />

[Montanari 2004]. Montanari, Franco. “Electronic Tools for Classical Philology: The Aristarchus<br />

Project On Line.” Zbornik Matice srpske za klasične studije, (2004): 155-160.<br />

http://scindeks-clanci.nb.rs/data/pdf/1450-6998/2004/1450-69980406155M.pdf<br />

[Mueller and Lee 2004]. Mueller, Katja, and William Lee. “From Mess to Matrix and Beyond:<br />

Estimating the Size of Settlements in the Ptolemaic Fayum/Egypt.” Journal of Archaeological Science,<br />

32 (January 2005): 59-67. http://dx.doi.org/10.1016/j.jas.2004.06.007<br />

[Nagy 2010]. Nagy, Gregory. “Homer Multitext project.” Online Humanities Scholarship: The Shape<br />

of Things to Come, Rice University Press, (March 2010).<br />

http://rup.rice.edu/cnx_content/shape/m34314.html<br />

[Neal 1990]. Neal, Gordon. “What Has Athens to Do with Silicon Valley? Questions about the Role of<br />

Computers in Literary Study.” Computers & Education, 15 (1990): 111-117.<br />

http://dx.doi.org/10.1016/0360-1315(90)90136-U<br />

[Nguyen and Shilton 2008]. Nguyen, Lilly, and Katie Shilton. “Tools for Humanists Project Final<br />

Report (Appendix F).” In Zorich, Diane. A Survey of Digital Humanities Centers in the United States.<br />

Council on Library and Information Resources, Publication Number 143, (2008).<br />

http://www.clir.org/pubs/reports/pub143/appendf.html<br />

[Nichols 2009]. Nichols, Stephen. “Time to Change Our Thinking: Dismantling the Silo Model of<br />

Digital Scholarship.” Ariadne, 58 (January 2009). http://www.ariadne.ac.uk/issue58/nichols/<br />

[NSF 2008]. National Science Foundation. Sustaining the Digital Investment: Issues and Challenges of<br />

Economically Sustainable Digital Preservation. Interim Report of the Blue Ribbon Task Force on<br />

Sustainable Digital Preservation and Access, (December 2008).<br />

http://brtf.sdsc.edu/biblio/BRTF_Interim_Report.pdf



[Ntzios et al. 2007]. Ntzios, Kosta, Basilios Gatos, Ioannis Pratikakis, T. Konidaris, and Stavros<br />

Perantonis. “An Old Greek Handwritten OCR System Based on an Efficient Segmentation-Free<br />

Approach.” International Journal on Document Analysis and Recognition, 9 (April 2007): 179-192.<br />

Preprint available at:<br />

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.81.7236&rep=rep1&type=pdf<br />

[Oates 1993]. Oates, John. “The Duke Databank of Documentary Papyri.” In Accessing Antiquity: The<br />

Computerization of Classical Studies (ed. Jon Solomon). Tucson: University of Arizona, (1993), pp.<br />

62-72.<br />

[Ober et al. 2007]. Ober, Josiah, Walter Scheidel, Brent D. Shaw, and Donna Sanclemente. “Toward<br />

Open Access in Ancient Studies: The Princeton-Stanford Working Papers in Classics.” Hesperia, 76<br />

(2007): 229-242. Preprint available at http://www.princeton.edu/~pswpc/pdfs/ober/020702.pdf<br />

[OCLC-CRL 2007]. Online Computer Library Center, Inc., and Center for Research Libraries.<br />

Trustworthy Repositories Audit & Certification (TRAC) Criteria and Checklist. Chicago: Center for<br />

Research Libraries; Dublin, OH: OCLC Online Computer Library Center, Inc., 2007.<br />

http://catalog.crl.edu/record=b2212602~S1<br />

[OKell et al. 2010]. OKell, Eleanor, Dejan Ljubojevic, and Cary MacMahon. “Creating a Generative<br />

Learning Object (GLO): Working in an ‘Ill-Structured’ Environment and Getting Students to Think.”<br />

In Digital Research in the Study of Classical Antiquity (eds. Gabriel Bodard and Simon Mahony).<br />

Burlington, VT: Ashgate Publishing, 2010, pp. 151-170.<br />

[Olsen et al. 2009]. Olsen, Henriette Roued, Segolene M. Tarte, Melissa Terras, J. M. Brady, and Alan<br />

K. Bowman. “Towards an Interpretation Support System for Reading Ancient Documents.” Digital<br />

Humanities Abstracts 2009, (June 2009): 237-239.<br />

http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf<br />

[Packard 1973]. Packard, David W. “Computer Assisted Morphological Instruction of Ancient Greek.”<br />

Computational and Mathematical Linguistics: Proceedings of the International Conference on<br />

Computational Linguistics, 2, (1973): 343-356.<br />

[Palmer et al. 2009]. Palmer, Carole L., Lauren C. Teffeau, and Carrie M. Pirmann. Scholarly<br />

Information Practices in the Online Environment: Themes from the Literature and Implications for<br />

Library Service Development. OCLC Research, (January 2009).<br />

http://www.oclc.org/programs/publications/reports/2009-02.pdf<br />

[Panagopoulos et al. 2008]. Panagopoulos, Michail, Constantin Papaodysseus,<br />

Panayiotis Rousopoulos, Dimitra Dafi, and Stephen Tracy. “Automatic Writer Identification of<br />

Ancient Greek Inscriptions.” IEEE Transactions on Pattern Analysis and Machine Intelligence, 31<br />

(August 2008): 1404-1414. http://dx.doi.org/10.1109/TPAMI.2008.201



[Pantelia 2000]. Pantelia, Maria. “‘Noûs, Into Chaos’: The Creation Of The Thesaurus Of The Greek<br />

Language.” International Journal of Lexicography, 13 (March 2000): 1-11.<br />

http://dx.doi.org/10.1093/ijl/13.1.1<br />

[Papakitsos 2011]. Papakitsos, Evangelos C. “Computerized Scansion of Ancient Greek Hexameter.”<br />

Literary and Linguistic Computing, 26 (1): 57-69. http://dx.doi.org/10.1093/llc/fqq015<br />

[Pappelau and Belton 2009]. Pappelau, Christine, and Graham Belton. “Roman Spolia in 3D: High<br />

Resolution Leica Laserscanner Meets Ancient Building Structures.” Digital Classicist-Works in<br />

Progress Seminar, (July 2009). http://www.digitalclassicist.org/wip/wip2009-07cp.pdf<br />

[Pasqualis dell Antonio 2005]. Pasqualis dell Antonio, Simonetta. “From the Roman Eagle to EAGLE:<br />

Harvesting the Web for Ancient Epigraphy.” Humanities, Computers and Cultural Heritage:<br />

Proceedings of the XVI International Conference of the Association for History and Computing,<br />

(2005): 224-228.<br />

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.103.2193&rep=rep1&type=pdf<br />

[Pettersen et al. 2008]. Pettersen, Oystein, Nicole Bordes, Sean Ulm, David Gwynne, Terry Simmich,<br />

and Bernard Pailthorpe. “Grid Services for E-Archaeology.” AusGrid '08: Proceedings of the Sixth<br />

Australasian Workshop on Grid Computing and E-Research. Darlinghurst, Australia: Australian<br />

Computer Society, Inc., (2008): 17-25. http://portal.acm.org/citation.cfm?id=1386844<br />

[Ploeger et al. 2009]. Ploeger, Lieke, Yola Park, Jeanna N.R. Gaviria, Clemens Neudecker, Fedor<br />

Bochow, and Michael Day. “IMPACT Conference: Optical Character Recognition in Mass<br />

Digitisation.” Ariadne, 59 (June 2009). http://www.ariadne.ac.uk/issue59/impact-2009-rpt/<br />

[Porter 2010]. Porter, Dorothy. “How Does TILE Relate to TEI?” TILE Blog, (March 13, 2010).<br />

http://mith.info/tile/2010/03/13/how-does-tile-relate-to-tei/<br />

[Porter et al. 2006]. Porter, Dorothy, William Du Casse, Jerzy W. Jaromczyk, Neal Moore, Ross<br />

Scaife, and Jack Mitchell. “Creating CTS Collections.” Digital Humanities 2006, (2006): 269-274.<br />

http://www.csdl.tamu.edu/~furuta/courses/06c_689dh/dh06readings/DH06-269-274.pdf<br />

[Porter et al. 2009]. Porter, Dorothy, Doug Reside, and John Walsh. “Text-Image Linking<br />

Environment (TILE).” Digital Humanities Conference Abstracts 2009, (June 2009): 388-390.<br />

http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf<br />

[Pritchard 2008]. Pritchard, David. “Working Papers, Open Access, and Cyber-Infrastructure in<br />

Classical Studies.” Literary & Linguistic Computing, 23 (June 2008): 149-162. Preprint available at:<br />

http://ses.library.usyd.edu.au/handle/2123/2226<br />

[Pybus and Kirkham 2009]. Pybus, John, and Ruth Kirkham. “Experiences of User Involvement in the<br />

Construction of a Virtual Research Environment for the Humanities.” 2009 5th IEEE International<br />

Conference on E-Science Workshops, IEEE, (2009): 135-137.<br />

http://dx.doi.org/10.1109/ESCIW.2009.5407961



[Reddy and Crane 2006]. Reddy, Sravana, and Gregory Crane. “A Document Recognition System for<br />

Early Modern Latin.” DHCS 2006-Chicago Colloquium on Digital Humanities and Computer Science,<br />

(November 2006). http://hdl.handle.net/10427/57011<br />

[Remondino et al. 2009]. Remondino, Fabio, Stefano Girardi, Alessandro Rizzi, and Lorenzo Gonzo.<br />

“3D Modeling of Complex and Detailed Cultural Heritage Using Multi-Resolution Data.” Journal of<br />

Computing and Cultural Heritage, 2 (2009): 1-20. http://dx.doi.org/10.1145/1551676.1551678<br />

[Reside 2010]. Reside, Doug. “A Four Layer Model for Image-Based Editions.” TILE Blog, (February<br />

2010). Part One: http://mith.info/tile/2010/02/03/a-four-layer-model-for-image-based-editions/ and<br />

Part Two: http://mith.info/tile/2010/02/<br />

[Richards 1997]. Richards, Julian D. “Preservation and Re-use of Digital Data: the Role of the<br />

Archaeology Data Service.” Antiquity, 71 (1997): 1057-1059.<br />

[Richards 1998]. Richards, Julian D. “Recent Trends in Computer Applications in Archaeology.”<br />

Journal of Archaeological Research, 6 (December 1998): 331-382.<br />

[Robertson 2009]. Robertson, Bruce. “Exploring Historical RDF with Heml.” Digital Humanities<br />

Quarterly, 3 (2009). http://www.digitalhumanities.org/dhq/vol/003/1/000026.html<br />

[Robinson 2009]. Robinson, Peter. “Towards a Scholarly Editing System for the Next Decades.”<br />

Sanskrit Computational Linguistics (Lecture Notes in Computer Science, Volume 5402), Berlin,<br />

Heidelberg: Springer, (2009): 346-357. http://dx.doi.org/10.1007/978-3-642-00155-0_18<br />

[Robinson 2010]. Robinson, Peter. “Editing Without Walls.” Literature Compass, 7 (2010): 57-61. http://dx.doi.org/10.1111/j.1741-4113.2009.00676.x

[Rockwell 2010]. Rockwell, Geoffrey. “As Transparent as Infrastructure: On the Research of Cyberinfrastructure in the Humanities.” Online Humanities Scholarship: The Shape of Things to Come, Houston, Texas: Rice University Press, (March 2010). http://cnx.org/content/m34315/latest/

[Romanello 2008]. Romanello, Matteo. “A Semantic Linking Framework to Provide Critical Value-Added Services for E-Journals on Classics.” ELPUB2008. Open Scholarship: Authority, Community, and Sustainability in the Age of Web 2.0 - Proceedings of the 12th International Conference on Electronic Publishing, (June 2008): 401-414. http://elpub.scix.net/cgi-bin/works/Show?401_elpub2008

[Romanello et al. 2009a]. Romanello, Matteo, Federico Boschetti, and Gregory Crane. “Citations in the Digital Library of Classics: Extracting Canonical References By Using Conditional Random Fields.” NLPIR4DL '09: Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries. Morristown, NJ: Association for Computational Linguistics, (2009): 80-87. http://aye.comp.nus.edu.sg/nlpir4dl/NLPIR4DL10.pdf

[Romanello et al. 2009b]. Romanello, Matteo, Monica Berti, Federico Boschetti, Alison Babeu, and Gregory Crane. “Rethinking Critical Editions of Fragmentary Texts By Ontologies.” ELPUB2009. Proceedings of 13th International Conference on Electronic Publishing: Rethinking Electronic Publishing: Innovation in Communication Paradigms and Technologies, (2009): 155-174. Preprint available at: http://www.perseus.tufts.edu/publications/elpub2009.pdf

[Romanello et al. 2009c]. Romanello, Matteo, Monica Berti, Alison Babeu, and Gregory Crane. “When Printed Hypertexts Go Digital: Information Extraction from the Parsing of Indices.” HT '09: Proceedings of the 20th ACM Conference on Hypertext and Hypermedia. New York, NY: ACM, (2009): 357-358. Preprint available at: http://www.perseus.tufts.edu/publications/ht159-romanello.pdf

[Romary and Armbruster 2009]. Romary, Laurent, and Chris Armbruster. “Beyond Institutional Repositories.” Social Science Research Network Working Paper Series, (June 2009). http://papers.ssrn.com/sol3/Delivery.cfm/SSRN_ID1425919_code434782.pdf?abstractid=1425692&mirid=1

[Roueché 2009]. Roueché, Charlotte. “Digitizing Inscribed Texts.” In Text Editing, Print and the Digital World. Burlington, VT: Ashgate Publishing, 2009, pp. 159-169.

[Roued 2009]. Roued, Henriette. “Textual Analysis Using XML: Understanding Ancient Textual Corpora.” 5th IEEE Conference on e-Science 2009, (December 2009). http://esad.classics.ox.ac.uk/index.php?option=com_docman&task=doc_download&gid=30&Itemid=78

[Roued-Cunliffe 2010]. Roued-Cunliffe, Henriette. “Towards a Decision Support System For Reading Ancient Documents.” Literary and Linguistic Computing, 25 (4) (2010): 365-379. http://llc.oxfordjournals.org/content/25/4/365.short?rss=1

[Ruddy 2009]. Ruddy, David. “Linking Resources in the Humanities: Using OpenURL to Cite Canonical Works.” DLF Spring 2009 Forum, (May 2009). http://www.diglib.org/forums/spring2009/presentations/Ruddy.pdf

[Rudman 1998]. Rudman, Joseph. “Non-Traditional Authorship Attribution Studies in the Historia Augusta: Some Caveats.” Literary & Linguistic Computing, 13 (September 1998): 151-157. http://dx.doi.org/10.1093/llc/13.3.151

[Ruhleder 1995]. Ruhleder, Karen. “Reconstructing Artifacts, Reconstructing Work: From Textual Edition to On-Line Databank.” Science, Technology, & Human Values, 20 (1995): 39-64. http://www.jstor.org/stable/689880

[Ryan 1996]. Ryan, N. “Computer Based Visualisation of the Past: Technical ‘Realism’ and Historical Credibility.” In: T. Higgins, P. Main, and J. Lang (eds.) Imaging the Past: Electronic Imaging and Computer Graphics in Museums and Archaeology. Occasional Papers (114). The British Museum, London, pp. 95-108.

[Rydberg-Cox 2002]. Rydberg-Cox, Jeffrey A. “Mining Data from an Electronic Greek Lexicon.” The Classical Journal, 98 (2002): 183-188. http://www.jstor.org/stable/3298020



[Rydberg-Cox 2009]. Rydberg-Cox, Jeffrey A. “Digitizing Latin Incunabula: Challenges, Methods, and Possibilities.” Digital Humanities Quarterly, 3 (January 2009). http://www.digitalhumanities.org/dhq/vol/3/1/000027.html#

[Salerno et al. 2007]. Salerno, Emanuele, Anna Tonazzini, and Luigi Bedini. “Digital Image Analysis to Enhance Underwritten Text in the Archimedes Palimpsest.” International Journal on Document Analysis and Recognition, 9 (April 2007): 79-87. http://dx.doi.org/10.1007/s10032-006-0028-7

[Sankar et al. 2006]. Sankar, K., Vamshi Ambati, Lakshmi Pratha, and C. V. Jawahar. “Digitizing a Million Books: Challenges for Document Analysis.” Document Analysis Systems VII (Lecture Notes in Computer Science, Volume 3872), (2006): 425-436. http://cvit.iiit.ac.in/papers/pramod06Digitizing.pdf

[Sayeed and Szpakowicz 2004]. Sayeed, Asad B., and Stan Szpakowicz. “Developing a Minimalist Parser for Free Word Order Languages with Discontinuous Constituency.” Advances in Natural Language Processing (Lecture Notes in Computer Science, Volume 3230), (2004): 115-126.

[Schibel and Rydberg-Cox 2006]. Schibel, Wolfgang, and Jeffrey A. Rydberg-Cox. “Early Modern Culture in a Comprehensive Digital Library.” D-Lib Magazine, 12 (March 2006). http://www.dlib.org/dlib/march06/schibel/03schibel.html

[Schilit and Kolak 2008]. Schilit, Bill N., and Okan Kolak. “Exploring a Digital Library Through Key Ideas.” JCDL '08: Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries. New York, NY: ACM, 2008, 177-186. http://schilit.googlepages.com/fp035-schilit.pdf

[Schloen 2001]. Schloen, J. David. “Archaeological Data Models and Web Publication Using XML.” Computers and the Humanities, 35 (May 2001): 123-152. http://dx.doi.org/10.1023/A:1002471112790

[Schmidt 2010]. Schmidt, Desmond. “The Inadequacy of Embedded Markup for Cultural Heritage Texts.” Literary and Linguistic Computing, 25 (April 2010): 337-356. http://dx.doi.org/10.1093/llc/fqq007

[Schmidt and Colomb 2009]. Schmidt, Desmond, and Robert Colomb. “A Data Structure for Representing Multi-Version Texts Online.” International Journal of Human Computer Studies, 67 (June 2009): 497-514.

[Schmitz 2009]. Schmitz, Patrick. “Using Natural Language Processing and Social Network Analysis to Study Ancient Babylonian Society.” UC Berkeley iNews, (March 1, 2009). http://inews.berkeley.edu/articles/Spring2009/BPS

[Schoepflin 2003]. Schoepflin, Urs. “The Archimedes Project: Realizing the Vision of an Open Digital Research Library for the Study of Long-Term Developments in the History of Mechanics.” RCDL 2003: Proceedings of the 5th National Russian Research Conference “Digital Libraries: Advanced Methods and Technologies, Digital Collections,” (2003): 124-129. http://edoc.mpg.de/get.epl?fid=15799&did=169534&ver=0



[Schonfeld and Housewright 2010]. Schonfeld, Roger C., and Ross Housewright. Faculty Survey 2009: Key Strategic Insights for Libraries, Publishers, and Societies. Ithaka, (April 2010). http://www.ithaka.org/ithaka-s-r/research/faculty-surveys-2000-2009/Faculty%20Study%202009.pdf

[Schreibman 2009]. Schreibman, Susan. “An E-Framework for Scholarly Editions.” Digital Humanities 2009 Conference Abstracts, (June 2009): 18-19. http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf

[Schreibman et al. 2009]. Schreibman, Susan, Jennifer Edmond, Dot Porter, Shawn Day, and Dan Gourley. “The Digital Humanities Observatory: Building a National Collaboratory.” Digital Humanities 2009 Conference Abstracts, (June 2009): 40-43. http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf

[Sennyey 2009]. Sennyey, Pongracz, Lyman Ross, and Caroline Mills. “Exploring the Future of Academic Libraries: A Definitional Approach.” The Journal of Academic Librarianship, 35 (May 2009): 252-259. http://dx.doi.org/10.1016/j.acalib.2009.03.003

[Shen et al. 2008]. Shen, Rao, Naga Srinivas Vemuri, Weiguo Fan, and Edward A. Fox. “Integration of Complex Archeology Digital Libraries: An ETANA-DL Experience.” Information Systems, 33 (November 2008): 699-723. http://dx.doi.org/10.1016/j.is.2008.02.006

[Shiaw et al. 2004]. Shiaw, Horn Y., Robert J. K. Jacob, and Gregory R. Crane. “The 3D Vase Museum: a New Approach to Context in a Digital Library.” JCDL '04: Proceedings of the 4th ACM/IEEE-CS Joint Conference on Digital Libraries. New York, NY: ACM, (2004): 125-134. http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=0AAF0CD5C3D796C85A4E7EC96EA8FDCD?doi=10.1.1.58.760&rep=rep1&type=pdf

[Shilton 2009]. Shilton, Katie. Supporting Digital Tools for Humanists: Investigating Tool Infrastructure. Final Report. Council on Library and Information Resources, (May 2009). http://www.clir.org/pubs/archives/ShiltonToolsfinal.pdf

[Siemens 2009]. Siemens, Lynne. “‘It's a Team if You Use “Reply All”’: An Exploration of Research Teams in Digital Humanities Environments.” Literary and Linguistic Computing, 24 (April 2009): 225-233. http://dx.doi.org/10.1093/llc/fqp009

[Smith 2002]. Smith, David A. “Detecting Events with Date and Place Information in Unstructured Text.” JCDL '02: Proceedings of the 2nd ACM/IEEE-CS Joint Conference on Digital Libraries. New York, NY: ACM, (2002): 191-196. Preprint available at: http://hdl.handle.net/10427/57018

[Smith 2008]. Smith, Abby. “The Research Library in the 21st Century: Collecting, Preserving, and Making Accessible Resources for Scholarship.” In No Brief Candle: Reconceiving Research Libraries for the 21st Century. Council on Library and Information Resources, Publication Number 142, (August 2008): pp. 13-20. http://www.clir.org/pubs/reports/pub142/pub142.pdf

[Smith 2009]. Smith, D. Neel. “Citation in Classical Studies.” Digital Humanities Quarterly, 3 (2009). http://www.digitalhumanities.org/dhq/vol/003/1/000028.html#



[Smith 2010]. Smith, D. Neel. “Digital Infrastructure and the Homer Multitext Project.” In Digital Research in the Study of Classical Antiquity (eds. Gabriel Bodard and Simon Mahony). Burlington, VT: Ashgate Publishing, 2010, pp. 121-137.

[Snow et al. 2006]. Snow, Dean R., Mark Gahegan, Lee C. Giles, Kenneth G. Hirth, George R. Milner, Prasenjit Mitra, and James Z. Wang. “Cybertools and Archaeology.” Science, 311 (February 2006): 958-959. http://dx.doi.org/10.1126/science.1121556

[Solomon 1993]. Solomon, Jon (ed.). Accessing Antiquity: the Computerization of Classical Studies. Tucson, Arizona: University of Arizona Press, 1993.

[Sosin et al. 2007]. Sosin, Joshua, et al. “Integrating Digital Papyrology.” Successful bid to Mellon Foundation. http://www.duke.edu/~jds15/DDbDP-APIS-HGV_propRedacted.pdf

[Sosin et al. 2008]. Sosin, Joshua, et al. “Integrating Digital Papyrology 2.” Successful bid to Mellon Foundation. http://www.duke.edu/~jds15/IDP2-FinalProposalRedacted.pdf

[Sosin 2010]. Sosin, Joshua. “Digital Papyrology.” Congress of the International Association of Papyrologists, (August 2010), Geneva, Switzerland. http://www.stoa.org/archives/1263

[Speck 2005]. Speck, Reto. The AHDS and Digital Resource Creation in Classics, Ancient History, Philosophy, Religious Studies and Theology. Arts and Humanities Data Service, (November 2005). http://ahds.ac.uk/about/projects/documents/subject_extension_report_v1.pdf

[Sporleder 2010]. Sporleder, Caroline. “Natural Language Processing for Cultural Heritage Domains.” Language and Linguistics Compass, 4 (2010): 750-768.

[Stewart, Crane, and Babeu. 2007]. Stewart, Gordon, Gregory Crane, and Alison Babeu. “A New Generation of Textual Corpora: Mining Corpora from Very Large Collections.” JCDL '07: Proceedings of the 7th ACM/IEEE-CS Joint Conference on Digital Libraries. New York, NY: ACM, (2007): 356-365. Preprint available at: http://hdl.handle.net/10427/14853

[Stinson 2009]. Stinson, Timothy. “Codicological Descriptions in the Digital Age.” Kodikologie und Paläographie im digitalen Zeitalter-Codicology and Palaeography in the Digital Age. Norderstedt: Books on Demand, 2009, pp. 35-51. Also available online at: http://kups.ub.uni-koeln.de/volltexte/2009/2959/

[Stokes 2009]. Stokes, Peter A. “Computer-Aided Palaeography, Present and Future.” Digital Humanities 2009 Conference Abstracts, (June 2009): 266-268. http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf

[Strothotte et al. 1999]. Strothotte, Thomas, Maic Masuch, and Tobias Isenberg. “Visualizing Knowledge about Virtual Reconstructions of Ancient Architecture.” Proceedings of Computer Graphics International, (1999): 36-43. http://dx.doi.org/10.1109/CGI.1999.777901



[Tablan et al 2006]. Tablan, Valentin, Wim Peters, Diana Maynard, and Hamish Cunningham. “Creating Tools for Morphological Analysis of Sumerian.” Proceedings of LREC 2006, (2006). http://gate.ac.uk/sale/lrec2006/etcsl/etcsl-paper.pdf

[Tambouratzis 2008]. Tambouratzis, George. “Using an Ant Colony Metaheuristic to Optimize Automatic Word Segmentation for Ancient Greek.” IEEE Transactions on Evolutionary Computation, 13 (2009): 742-753. http://dx.doi.org/10.1109/TEVC.2009.2014363

[Tarrant et al. 2009]. Tarrant, David, Ben O'Steen, Tim Brody, Steve Hitchcock, Neil Jefferies, and Leslie Carr. “Using OAI-ORE to Transform Digital Repositories into Interoperable Storage and Services Applications.” The Code4Lib Journal, (March 2009). http://journal.code4lib.org/articles/1062

[Tarte 2011]. Tarte, Segolene M. “Papyrological Investigations: Transferring Perception and Interpretation into the Digital World.” Literary and Linguistic Computing, 26 (2) (2011). http://llc.oxfordjournals.org/content/early/2011/04/27/llc.fqr010.full

[Tarte et al. 2009]. Tarte, Segolene M., David Wallom, Pin Hu, Kang Tang, and Tiejun Ma. “An Image Processing Portal and Web-Service for the Study of Ancient Documents.” 5th IEEE Conference on e-Science 2009, (December 2009): 14-19. http://esad.classics.ox.ac.uk/index.php?option=com_docman&task=doc_download&gid=20&Itemid=78

[Tchernetska et al. 2007]. Tchernetska, Natalie, E. Handley, C. Austin, and L. Horváth. “New Readings in the Fragment of Hyperides’ Against Timandros from the Archimedes Palimpsest.” Zeitschrift für Papyrologie und Epigraphik, 162 (2007): 1-4.

[Terras 2005]. Terras, Melissa. “Reading the Readers: Modelling Complex Humanities Processes to Build Cognitive Systems.” Literary and Linguistic Computing, 20 (March 2005): 41-59. http://dx.doi.org/10.1093/llc/fqh042

[Terras 2010]. Terras, Melissa. “The Digital Classicist: Disciplinary Focus and Interdisciplinary Vision.” In Digital Research in the Study of Classical Antiquity (eds. Gabriel Bodard and Simon Mahony). Burlington, VT: Ashgate Publishing, 2010, pp. 171-189.

[Thiruvathukal 2009]. Thiruvathukal, George K., Steven E. Jones, and Peter Shillingsburg. “The e-Carrel: An Environment for Collaborative Textual Scholarship.” DHCS 2009, (October 2009). http://lingcog.iit.edu/~argamon/DHCS09-Abstracts/Thiruvathukal.pdf

[Tobin et al. 2008]. Tobin, Richard, Claire Grover, Sharon Givon, and Julian Ball. “Named Entity Recognition for Digitised Historical Texts.” Proceedings of the Sixth International Language Resources and Evaluation Conference (LREC'08), (2008). http://www.ltg.ed.ac.uk/np/publications/ltg/papers/bopcris-lrec.pdf

[Toms and Flora 2005]. Toms, Elaine G., and N. Flora. “From Physical to Digital Humanities Library – Designing the Humanities Scholar’s Workbench.” In Mind Technologies: Humanities Computing and the Canadian Academic Community (eds. R. Siemens and D. Moorman). Calgary, Canada: University of Calgary Press, 2005.

[Toms and O’Brien 2008]. Toms, Elaine G., and Heather L. O'Brien. “Understanding the Information and Communication Technology Needs of the E-Humanist.” Journal of Documentation, 64 (2008): 102-130. http://dx.doi.org/10.1108/00220410810844178

[Tonkin 2008]. Tonkin, Emma. “Persistent Identifiers: Considering the Options.” Ariadne, 56 (July 2008). http://www.ariadne.ac.uk/issue56/tonkin/

[Toth and Emery 2008]. Toth, Michael, and Doug Emery. “Applying DCMI Elements to Digital Images and Text in the Archimedes Palimpsest Program.” DC-2008–Berlin Proceedings of the International Conference on Dublin Core and Metadata Applications, (2008): 163-168. http://dcpapers.dublincore.org/ojs/pubs/article/view/929

[Toufexis 2010]. Toufexis, Notis. “One Era's Nonsense, Another's Norm: Diachronic Study of Greek and the Computer.” In Digital Research in the Study of Classical Antiquity (eds. Gabriel Bodard and Simon Mahony). Burlington, VT: Ashgate Publishing, 2010, pp. 105-118. Postprint available at: http://www.toufexis.info/wp-content/uploads/2009/07/DigitalResearch_Toufexis_2010.pdf

[Tse and Bigun 2007]. Tse, Elizabeth, and Josef Bigun. “A Base-Line Character Recognition for Syriac-Aramaic.” ISIC (2007): IEEE International Conference on Systems, Man and Cybernetics, (October 2007): 1048-1055. http://dx.doi.org/10.1109/ICSMC.2007.4414012. Preprint available at: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.161.4671&rep=rep1&type=pdf

[Tupman 2010]. Tupman, Charlotte. “Contextual Epigraphy and XML: Digital Publication and its Application to the Study of Inscribed Funerary Monuments.” In Digital Research in the Study of Classical Antiquity (eds. Gabriel Bodard and Simon Mahony). Burlington, VT: Ashgate Publishing, 2010, pp. 73-86.

[Unsworth 2000]. Unsworth, John. “Scholarly Primitives: What Methods do Humanities Researchers Have in Common, and How Might our Tools Reflect This?” (2000). http://www.iath.virginia.edu/~jmu2m/Kings.5-00/primitives.html

[Uytvanck et al. 2010]. Uytvanck, Dieter V., Claus Zinn, Daan Broeder, Peter Wittenburg, and Mariano Gardellini. “Virtual Language Observatory: The Portal to the Language Resources and Technology Universe.” Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC'10). European Language Resources Association (ELRA), (May 2010). http://www.lrec-conf.org/proceedings/lrec2010/pdf/273_Paper.pdf

[Van Groningen 1932]. Van Groningen, B. A. “Projet d'unification des Systemes de Signes Critiques.” Chronique d'Egypte, 7 (13-14), (1932): 262-269. http://brepols.metapress.com/content/92w111270m365748/



[Van de Sompel and Lagoze 2007]. Van de Sompel, Herbert, and Carl Lagoze. “Interoperability for the Discovery, Use, and Re-Use of Units of Scholarly Communication.” CTWatch Quarterly, 3 (August 2007). http://www.ctwatch.org/quarterly/articles/2007/08/interoperability-for-the-discovery-use-and-re-useof-units-of-scholarly-communication/

[van Peursen 2009]. Van Peursen, W. Th. “How to Establish a Verbal Paradigm on the Basis of Ancient Syriac Manuscripts.” Proceedings of the EACL 2009 Workshop on Computational Approaches to Semitic Languages. Morristown, NJ: Association for Computational Linguistics, (2009): 1-9. http://www.aclweb.org/anthology/W/W09/W09-0801.pdf

[Váradi et al. 2008]. Váradi, Tamás, Steven Krauwer, Peter Wittenburg, Martin Wynne, and Kimmo Koskenniemi. “CLARIN: Common Language Resources and Technology Infrastructure.” Proceedings of the Sixth International Language Resources and Evaluation (LREC'08). European Language Resources Association (ELRA), (May 2008). http://www.lrec-conf.org/proceedings/lrec2008/summaries/317.html

[Vemuri et al. 2006]. Vemuri, Naga S., Rao Shen, Sameer Tupe, Weiguo Fan, and Edward A. Fox. “ETANA-ADD: An Interactive Tool For Integrating Archaeological DL Collections.” JCDL '06: Proceedings of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries. New York, NY: ACM Press, (2006): 161-162. Preprint available at: http://www.computer.org/comp/proceedings/jcdl/2006/2840/00/28400161.pdf

[Vertan 2009]. Vertan, Cristina. “A Knowledge Web-Based eResearch Environment for Classical Philology.” Digital Classicist-Works in Progress Seminar, (July 2009). http://www.digitalclassicist.org/wip/wip2009-06cv.pdf

[Villegas and Parra 2009]. Villegas, Marta, and Carla Parra. “Integrating Full-Text Search and Linguistic Analyses on Disperse Data for Humanities and Social Sciences Research Projects.” Fifth IEEE International Conference on E-Science, (December 2009): 28-32. http://dx.doi.org/10.1109/e-Science.2009.12

[Vlachopoulos 2009]. Vlachopoulos, Dimitrios. “Introducing Online Teaching in Humanities: A Case Study About the Acceptance of Online Activities by the Academic Staff of Classical Languages.” DIGITHUM, 11 (May 2009). http://digithum.uoc.edu/ojs/index.php/digithum/article/view/n11_vlachopoulos

[Voss and Procter 2009]. Voss, Alexander, and Rob Procter. “Virtual Research Environments in Scholarly Work and Communications.” Library Hi Tech, 27 (2009): 174-190. http://dx.doi.org/10.1108/07378830910968146

[Vuillemot 2009]. Vuillemot, Romain, Tanya Clement, Catherine Plaisant, and Amit Kumar. “What’s Being Said Near “Martha”? Exploring Name Entities in Literary Text Collections.” VAST 2009: IEEE Symposium on Visual Analytics Science and Technology, (2009): 107-114. http://dx.doi.org/10.1109/VAST.2009.5333248



[Wallom et al. 2009]. Wallom, David, Segolene Tarte, Tiejun Ma, Pin Hu, and Kang Tang. “Integrating eSAD (The Image, Text, Interpretation: e-Science, Technology and Documents) and VRE-SDM (VRE for the Study of Documents and Manuscripts) projects.” AHM 2009, (2009). http://esad.classics.ox.ac.uk/index.php?option=com_docman&task=doc_download&gid=23&Itemid=78

[Walters and Skinner 2011]. Walters, Tyler, and Katherine Skinner. New Roles for New Times: Digital Curation for Preservation. Report prepared for the Association of Research Libraries, March 2011. http://www.arl.org/bm~doc/nrnt_digital_curation17mar11.pdf

[Warwick et al. 2008a]. Warwick, Claire, Melissa Terras, Paul Huntington, and Nikoleta Pappa. “If You Build It Will They Come? The LAIRAH Study: Quantifying the Use of Online Resources in the Arts and Humanities Through Statistical Analysis of User Log Data.” Literary & Linguistic Computing, 23 (April 2008): 85-102. Open access copy available at: http://eprints.ucl.ac.uk/13808/

[Warwick et al. 2008b]. Warwick, Claire, Isabel Galina, Melissa Terras, Paul Huntington, and Nikoleta Pappa. “The Master Builders: LAIRAH Research on Good Practice in the Construction of Digital Humanities Projects.” Literary & Linguistic Computing, 23 (September 2008): 383-396. Open access copy available at: http://discovery.ucl.ac.uk/13810/

[Warwick et al. 2009]. Warwick, Claire, Claire Fisher, Melissa Terras, Mark Baker, Amanda Clarke, Mike Fulford, Matt Grove, Emma O’Riordan, and Mike Rains. “iTrench: A Study of User Reactions to the Use of Information Technology in Field Archaeology.” Literary & Linguistic Computing, 24 (June 2009): 211-223. Open access copy available at: http://discovery.ucl.ac.uk/13916/

[Weenink et al. 2008]. Weenink, Kasja, Leo Waaijers, and Karen van Godtsenhoven. A DRIVER's Guide to European Repositories: Five Studies of Important Digital Repository Related Issues and Good Practices. Amsterdam: Amsterdam University Press, (2008). http://dare.uva.nl/document/93898

[Wynne et al. 2009]. Wynne, Martin, Steven Krauwer, Sheila Anderson, Chad Kainz, and Neil Fraistat. “Supporting the Digital Humanities: Putting the Jigsaw Together.” Digital Humanities Conference Abstracts 2009, (June 2009): 49. http://www.mith2.umd.edu/dh09/wp-content/uploads/dh09_conferencepreceedings_final.pdf

[Xia 2006]. Xia, Jingfeng. “Electronic Publishing in Archaeology.” Journal of Scholarly Publishing, 37 (4), (July 2006): 280-287. http://www.metapress.com/content/0p71155260765284

[Zhang et al. 2010]. Zhang, Ziqi, Sam Chapman, and Fabio Ciravegna. “A Methodology towards Effective and Efficient Manual Document Annotation: Addressing Annotator Discrepancy and Annotation Quality.” Knowledge Engineering and Management by the Masses, LNCS Volume 6317, (2010): 301-315. Preprint available at: http://www.dcs.shef.ac.uk/~fabio/paperi/EKAW_2010_ver7.pdf

[Zielinski et al. 2009]. Zielinski, Andrea, Wolfgang Pempe, Peter Gietz, Martin Haase, Stefan Funk, and Christian Simon. “TEI Documents in the Grid.” Literary & Linguistic Computing, 24 (September 2009): 267-279.



[Zorich 2008]. Zorich, Diane M. A Survey of Digital Humanities Centers in the United States. Council on Library and Information Resources, Publication Number 143, (April 2008). http://www.clir.org/pubs/abstract/pub143abst.html

[Zuk et al. 2005]. Zuk, T., S. Carpendale, and W. D. Glanzman. “Visualizing Temporal Uncertainty in 3D Virtual Reconstructions.” VAST (2005): The Sixth International Symposium on Virtual Reality, Archaeology and Cultural Heritage, (2005): 99-106. http://innovis.cpsc.ucalgary.ca/innovis/uploads/Publications/Publications/Zuk_2005_VisualizingTemporalUncertainty.pdf
