13.07.2015 Views

A New Approach for Knowledge Management and ... - Wseas


WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
Giulio Concas, Filippo Eros Pani, Maria Ilaria Lunesu

5 Case study

The ASAS project (http://asas.flosslab.it) aims to create an IR with an annotated spoken language electronic corpus that could become a platform for the preservation, study, communication and appreciation of oral traditions of the Sardinian language, especially improvised poetry.

5.1 Annotations through PRAAT

The electronic corpus was annotated by linguists and musicologists through the PRAAT software [19], which, besides performing spoken language analysis, allows for multilevel segmentation and linguistic annotation of audio files. The software has a graphical interface that shows waveforms and the voice spectrum, followed by the annotation levels; it makes the annotators' work easier and makes visible the acoustic phenomena that an accurate spectrum analysis can reveal. Linguists and musicologists working on the Sardinian Linguistic Sound Archive chose a list of possible annotation levels (syllable, tone, morpheme, syntagm, accents, etc.) useful for both linguistic and musical analysis of audio recordings.

5.2 Metainformation Associated to Audio Recordings

Musicologists and linguists, in addition to the annotations, wanted to complete every audio recording by describing it with a set of information chosen among the most relevant features of the recordings. This information can be used to manage recordings in the archive: by describing them, it allows for selection and organization, facilitating efficient retrieval and usage.
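As a rough illustration of how descriptive information supports selection and organization, the sketch below attaches a few fields to each recording and then filters and groups on them. The `Recording` class, the `select` and `group_by` helpers, the three fields shown and the sample values are all assumptions made for this example; they are not the project's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    """Descriptive information for one audio recording (hypothetical fields)."""
    title: str
    author: str
    singing_type: str


def select(recordings, **criteria):
    """Return the recordings whose fields equal every given criterion."""
    return [
        rec for rec in recordings
        if all(getattr(rec, name) == value for name, value in criteria.items())
    ]


def group_by(recordings, field_name):
    """Organize recordings into lists keyed by the value of one field."""
    groups = {}
    for rec in recordings:
        groups.setdefault(getattr(rec, field_name), []).append(rec)
    return groups


# Invented sample data, for illustration only.
archive = [
    Recording("Recording A", "Author 1", "improvised poetry"),
    Recording("Recording B", "Author 2", "improvised poetry"),
    Recording("Recording C", "Author 1", "accompanied song"),
]

by_author = select(archive, author="Author 1")   # selection
by_type = group_by(archive, "singing_type")      # organization
```

Here `by_author` holds the two recordings attributed to "Author 1", and `by_type` maps each singing type to its recordings. A real archive would delegate both operations to the repository's search index rather than to in-memory lists.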
Metainformation ranges from items closely related to cataloguing, such as author, title, object, recording date, etc., to more technical information such as the different singing types, speech types, accompaniment or instruments. Linguists and musicologists selected 38 items of metainformation associated with audio recordings: title, author, object, description, format, etc.

5.3 Formalization of Semantic Characteristics: Top-Down Approach

After designing the conceptual model of the knowledge domain, a top-down or deductive approach can be used for formalizing the semantic characteristics of texts. Through a continuous dialogue with the scholars, audio recordings were analysed for their essential and basic properties, needed to organize and retrieve texts in the corpus. Twelve general metadata were found: title, author, publisher, object, contributor, date, place, occasion, document accessibility, language, description and format. Those metadata outline the necessary information to describe spoken texts in the corpus, conveying in particular the singing or speech type, the occasion in which the audio was recorded, and the linguistic variety it belongs to.

The top-down approach proceeds to further specialize the metadata. More specific, or qualified, metadata are represented by adding a qualifier to the name of the more general metadata, using the common syntax metadata.qualifier.

Lastly, "relational" metadata are defined as well, in order to express a relation among two or more different objects belonging to the corpus. An inclusion relation must be specified to describe the membership of one or more objects in the same recording set, for example different songs in a singing contest.

5.4 Formalization of Linguistic Annotations: Bottom-Up Approach

The formalization of annotations in a metadata schema can be achieved using a bottom-up or inductive reasoning, as explained in the previous section. The structure of annotations is analysed with the PRAAT software. Annotations are organized with a precise structure: each annotation is made of either a time interval and a text label, or an instant and a marker with its text. All annotations in the same linguistic category are collected in the same tier (or annotation level), which can be considered as the category they belong to, giving its name to the corresponding metadata. In this way, a repeatable metadata is found in each annotation level of the TextGrid (the text file where PRAAT stores all tiers with their own segmentations and annotations), and the annotations in a level can be represented as multiple occurrences of that metadata.

5.5 Choosing a Metadata Schema for KMS Entering

Depending on the interoperability needs that must be met, importing the metadata schema that was just created into the knowledge management system

E-ISSN: 2224-3402 143 Issue 5, Volume 10, May 2013
