26.12.2014 Views

Rome Wasn't Digitized in a Day - Council on Library and Information ...




This example illustrates the primary advantage of encoding the editions in XML. If editors wish to differ between uncertain characters and broken characters they can encode them with different tags. They can then transform both tags into under-dots if they still wish to present both instances as such or they can decide to visualize one instance, underlined and the other under-dotted to distinguish between them (Roued 2009).
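The separation Roued describes can be sketched in a few lines. The project itself used XSLT stylesheets for presentation; the Python below is only a stand-in for that step, and the element names `uncertain` and `broken` are hypothetical, chosen for illustration rather than taken from the project's EpiDoc files. The same encoded text is rendered twice, once with both kinds of character under-dotted and once with the two kinds distinguished.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these two tag names are illustrative only,
# not the project's actual EpiDoc encoding.
SAMPLE = "<ab>ab<uncertain>c</uncertain>d<broken>e</broken>f</ab>"

DOT_BELOW = "\u0323"  # combining dot below (under-dot)
LOW_LINE = "\u0332"   # combining low line (underline)


def render(xml_text, marks):
    """Flatten the edition to plain text, decorating each tagged character.

    `marks` maps a tag name to the combining character appended to every
    character inside that tag; untagged text is copied unchanged.
    """
    root = ET.fromstring(xml_text)
    out = [root.text or ""]
    for child in root:
        mark = marks.get(child.tag, "")
        out.append("".join(ch + mark for ch in (child.text or "")))
        out.append(child.tail or "")
    return "".join(out)


# Presentation 1: both kinds of character shown with under-dots.
same = render(SAMPLE, {"uncertain": DOT_BELOW, "broken": DOT_BELOW})

# Presentation 2: uncertain characters under-dotted, broken ones underlined.
distinct = render(SAMPLE, {"uncertain": DOT_BELOW, "broken": LOW_LINE})
```

Only the mapping passed to `render` changes between the two presentations; the encoded text is untouched, which is the point of the quoted passage.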

Thus, EpiDoc allows different scholarly opinions to be encoded in the same XML file, because the content markup (EpiDoc XML) is kept separate from its presentation (separate XSLT sheets). Roued also supported the argument of Roueché (2009) that EpiDoc encoding is not a “substantial conceptual leap” from Leiden encoding.

While the first two Vindolanda tablet publications were encoded using EpiDoc, Roued observed that the level of encoding was not very granular and that the website was not well set up to exploit the encoding. She also noted that the level of encoding a project chooses typically depends both on the technology chosen and on the anticipated future use of the data. For the next series of Vindolanda tablets, Roued explained, the project decided to pursue a more granular level of encoding that includes words and terms in the transcription. This has supported interactive search functionality and added greater value to the encoding as a knowledge base. To begin with, the project encoded the tablets in greater detail with respect to the Leiden conventions:

Encoding instances of uncertainty, added characters and abbreviations enables us to extract these instances from their respective texts and analyze them. We can, for example, count how many characters in the text or texts are deemed to be uncertain. Similarly, we can look at the type of characters that are most likely to be supplied. These illustrate the many new possibilities for analyzing the reading of ancient document (Roued 2009).
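The kind of analysis described in this passage might look like the following sketch. It assumes a simplified, namespace-free fragment in which uncertain characters are wrapped in `unclear` and editorially supplied characters in `supplied`; the sample text and the helper name `count_tagged` are invented for illustration and do not come from the Vindolanda files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample fragment; real EpiDoc files are namespaced and far richer.
SAMPLE = (
    "<ab>ab<unclear>c</unclear>d<supplied>ef</supplied>"
    "g<unclear>ce</unclear></ab>"
)


def count_tagged(xml_text, tag):
    """Count, per character, how often it occurs inside elements named `tag`."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for element in root.iter(tag):
        # itertext() also picks up text nested in child elements.
        counts.update("".join(element.itertext()))
    return counts


uncertain = count_tagged(SAMPLE, "unclear")
supplied = count_tagged(SAMPLE, "supplied")

# How many characters in the text are deemed uncertain?
total_uncertain = sum(uncertain.values())

# Which character is most often marked as uncertain?
most_uncertain = uncertain.most_common(1)
```

Run over a whole corpus rather than one fragment, the same counters would give the per-character tallies the quotation mentions.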

In addition to more extensive encoding of the texts in EpiDoc, the eSAD project decided to perform a certain amount of manual “contextual encoding” of words, people, place names, dates, and military terms, that is, of essentially all the items found in the indexes. For words, the index contained a list of lemmas with references to the places in the text where the corresponding words occurred; encoding these data allowed the project to extract information such as the number of times a lemma occurred in the text. During the encoding of the indexes, the project discovered numerous errors that needed to be corrected. All of this encoding was performed to support new advanced searching features for the relaunch of the website as Vindolanda Tablets Online 2.0 in 2010. In particular, the project has developed an interactive search feature using AJAX,[509] LiveSearch, JavaScript, and PHP[510] that gives the user feedback while typing in a search term. In the case of Vindolanda, it gives users a list of all words, terms, names, and dates that contain their search pattern.
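A minimal sketch of the two ideas in this paragraph, counting lemma occurrences and suggesting terms that contain a search pattern, is given below. The project's own implementation used AJAX, JavaScript, and PHP; this Python version is an illustration only. It assumes words are marked up as `w` elements carrying a `lemma` attribute, and the sample words and the function names `lemma_counts` and `suggest` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample: four words, three distinct lemmas.
SAMPLE = (
    '<ab><w lemma="frater">frater</w> <w lemma="frater">fratri</w> '
    '<w lemma="salus">salutem</w> <w lemma="frumentum">frumenti</w></ab>'
)


def lemma_counts(xml_text):
    """Return how many times each lemma occurs in the encoded text."""
    root = ET.fromstring(xml_text)
    return Counter(w.get("lemma") for w in root.iter("w") if w.get("lemma"))


def suggest(pattern, terms):
    """Return, in alphabetical order, every term that contains `pattern`."""
    needle = pattern.lower()
    return sorted(term for term in terms if needle in term.lower())


counts = lemma_counts(SAMPLE)

# What a user would see after typing "fr" into the search box.
matches = suggest("fr", counts)
```

In a live-search setting, `suggest` would be called on each keystroke and the matching terms sent back to the browser; here it simply filters the lemma list in memory.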

The XML document created for each inscription text contains all of its relevant bibliographic information and textual encoding, and Roued explained that this made it necessary to develop methods that could extract only the information relevant to a given need. The project thus decided to build RESTful web services using the Zend Framework[511] and PHP. The Vindolanda web services receive URLs with certain parameters and return answers as XML. This allows other projects to make use of the encoded XML files and, in particular, of the knowledge base of Latin words. This web service is being used in a related project that seeks to develop an ISS for readers of ancient documents. The

509 AJAX is short for “Asynchronous JavaScript and XML” and is a technique “for creating fast and dynamic web pages”: http://www.w3schools.com/ajax/ajax_intro.asp

510 PHP stands for “PHP: Hypertext Preprocessor” and is a server-side scripting language: http://www.w3schools.com/php/php_intro.asp

511 http://framework.zend.com/
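The request and response pattern of the web services described above (a URL with parameters in, an XML answer out) can be sketched as follows. The actual services were built with PHP and the Zend Framework; this Python function is an assumption-laden illustration, and the URL, the `lemma` parameter, the `result` element, and the tiny in-memory word list are all hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse, parse_qs

# Hypothetical knowledge base: lemma -> number of occurrences.
LEMMAS = {"frater": 2, "salus": 1}


def handle(url):
    """Answer a request URL with an XML document listing matching lemmas."""
    query = parse_qs(urlparse(url).query)
    wanted = query.get("lemma", [""])[0]
    root = ET.Element("result", {"query": wanted})
    for lemma, count in sorted(LEMMAS.items()):
        # An empty or missing parameter matches nothing.
        if wanted and wanted in lemma:
            ET.SubElement(root, "lemma", {"count": str(count)}).text = lemma
    return ET.tostring(root, encoding="unicode")


# example.org is a placeholder host, not the project's endpoint.
response = handle("http://example.org/words?lemma=fra")
```

Because the answer is plain XML, any other project can parse it with its own tools, which is what makes the encoded files reusable beyond the website.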
