Topics in Language Resources for Translation ... - ymerleksi - home

64 Belinda Maia

[Figure 1: diagram. Labels shown: Text files; Pre-processing; Editing of meta-information; Searchable corpora; Encoding; CQP; XML; Corpora searches (terminology, semantic relations, etc.); Production of new resources; Use of existing resources in the database; Corpógrafo information system; Meta-information + resulting resources + auxiliary resources of the Corpógrafo; Meta-information on files; Meta-information on corpora; Terminology databases; Lexical resources and search patterns.]

Figure 1. Structure of the Corpógrafo from the user’s point of view in 2006.

and password. Each researcher or group is provided with a private space on the dedicated server on which to carry out their work. They create, analyse and experiment with their own corpora and databases and try out new ideas, but everything they do with these tools is saved on this private space. The administrator and the teacher or supervisor may use the student or researcher’s username and password to enter that space, also over the Web, and provide help and advice when necessary. Otherwise, each project functions autonomously.

On acquiring a space on the server via free registration for a username and password, the user is presented with an empty ‘space’ in which to work, together with instructions for use.
The Gestor (File Manager) allows one to:
– Import texts in various formats and upload them to the Corpógrafo;
– Register the metadata of the texts, i.e., document, authorship, source, domain and text type (this allows proper credit to be given for any information extracted, and serves as some protection from copyright problems);
– Preprocess texts by the removal of unwanted material and text correction;
– Automatically divide the text into sentence length units;
– Combine and re-combine texts to form corpora for specialised research (e.g., one can combine all the domain specific texts in one language to extract ter-
