24.07.2013 Views

Onto.PT: Towards the Automatic Construction of a Lexical Ontology ...

Preface

About six years ago, almost by accident, I ended up engaging in an academic research career. It all started with my Master’s dissertation, the final, and probably the most important, stage of my Master’s degree. At the time, I was not planning to dedicate more than one year of my life to research. But even one year later, when I started working as a researcher for Linguateca, it was far from my thoughts that I would soon enroll in a PhD.

Briefly, the main goal of my Master’s work was to generate, given a rhythmic sequence, matching lyrics in Portuguese. My intention was always to work with my mother tongue, not only because I felt that the results would be more understandable and funnier for the people surrounding me, but also because I used to write a few Portuguese lyrics for my former band. I was thus very interested in investigating how far an automatic lyricist could go.

However, working with Portuguese proved to be a challenging task. From the beginning of the work, we noticed that there was a lack of language resources for Portuguese, and that it was not easy to find the few existing ones. For instance, at that time, we could not find a public comprehensive lexicon providing words and information on their morphology and possible inflections, not to mention a semantics-oriented lexicon. From then on, I wanted to contribute something useful that would hopefully help address the aforementioned shortage of resources.

More or less at the same time, I had my first contact with Linguateca, a distributed language resource centre for Portuguese, responsible not only for cataloguing existing resources, but also for developing and providing free access to them.

I was very lucky that, before the end of my Master’s, Linguateca opened a position that I applied for. The main goal of this position was to develop PAPEL, a lexical-semantic resource for Portuguese, automatically extracted from a dictionary. After my Master’s, I was hired for that precise task. While working for Linguateca, I came into closer contact with other researchers working on the computational processing of Portuguese. I started to gain some experience in natural language processing (NLP), especially in semantic information extraction, and I became passionate about research in this area. So much so that, today, I do not see myself doing something completely unrelated.

The work with Linguateca was very important for my training as a researcher in NLP. It was so enriching that I felt that, with what I had learned, I could do, and learn, more. And there is so much to do to contribute to the development of Portuguese NLP that I wanted to continue my work, which I did after embarking on my PhD. This thesis presents the result of a four-year PhD where, starting with what we learned with PAPEL, we created a larger resource, Onto.PT, by exploiting other sources, and we developed a model for organising this resource in an alternative way, which might better suit concept-oriented NLP.
