11.07.2015 Views

Université de Montréal - Thesis in digital form


Figure 24. Bottom-up approach

4.3.1. Extraction of candidate terms

Candidate terms were extracted by means of a tool called TermoStat (Drouin 2003), a term extractor that computes the "specificities" of words occurring in a given specialized corpus by comparing their frequency in that corpus with their frequency in a general-language corpus (or reference corpus). Basically, the higher the specificity of a word, the more likely it is to be a term of the subject field. Conversely, a word with a low specificity coefficient is likely to belong to the general language. TermoStat can perform extractions based on the form of terms (single- or multi-word terms) and based on the part of speech of terms (nouns, verbs, adjectives and adverbs). This term extractor was chosen for two main reasons. Firstly, contrary to other term extractors, TermoStat can extract verbs, the type of units on which this research focuses. Secondly, TermoStat has been used in other terminographic projects with good results (L'Homme 2008; Le Serrec et al. 2009).
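The idea of comparing a word's frequency in a specialized corpus with its frequency in a reference corpus can be illustrated with a small sketch. It does not reproduce TermoStat's specificity measure: the score used here is a smoothed log-ratio of relative frequencies, chosen only because it is simple. The function names, the smoothing constant, the threshold and the toy corpora are all assumptions made for the illustration.

```python
import math
from collections import Counter


def specificity_scores(specialized_tokens, reference_tokens, smoothing=1.0):
    """Score each word of the specialized corpus against a reference corpus.

    Illustrative stand-in only: a smoothed log-ratio of relative
    frequencies, NOT the specificity measure computed by TermoStat.
    """
    spec_counts = Counter(specialized_tokens)
    ref_counts = Counter(reference_tokens)
    spec_total = len(specialized_tokens)
    ref_total = len(reference_tokens)
    # Additive smoothing over the joint vocabulary avoids division by zero
    # for words that never occur in the reference corpus.
    vocab_size = len(set(spec_counts) | set(ref_counts))

    scores = {}
    for word, count in spec_counts.items():
        spec_rel = (count + smoothing) / (spec_total + smoothing * vocab_size)
        ref_rel = (ref_counts[word] + smoothing) / (ref_total + smoothing * vocab_size)
        scores[word] = math.log(spec_rel / ref_rel)
    return scores


def rank_candidates(scores, threshold=0.5):
    """Return (word, score) pairs above the threshold, highest score first."""
    kept = [(word, score) for word, score in scores.items() if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy corpora (hypothetical data, already tokenized).
specialized = "the compiler parses the source and the linker links the objects".split()
reference = "the cat sat on the mat and the dog saw the cat".split()

scores = specificity_scores(specialized, reference)
candidates = rank_candidates(scores)
```

In this toy run, words such as "compiler" that are absent from the reference corpus score above the threshold, while "the", which is about equally frequent in both corpora, scores near zero and is filtered out. A real extraction would also filter by part of speech (verbs, in the case of this research) before ranking.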
