25.10.2012 Views

Laurie Bauer - WordPress.com — Get a Free Blog Here




THE DATA OF LINGUISTICS

• There is variability between sources as to style, etc., although this can be exploited.
• The texts display an unknown amount of editorial interference and standardisation.
• Newspaper sources are typically anonymous, which makes it hard to use them for sociolinguistic enquiry.

Dictionaries and word-lists

Dictionaries look like a linguist’s heaven: they are full of words, each word is provided with one or more meanings, some of them provide illustrations of the use of the words (and thus of the syntactic patterns in which they occur), and more and more they are available in electronic format, which makes them easier to search.

However, care is required with dictionary data. First, all words are treated the same, so that fiacre and fiancé have similar entries, despite the fact that one is much more common than the other. Second, dictionaries give no direct information on word frequency, and very little on the ways in which words are used – their collocations and typical grammatical patterning. Sometimes dictionaries aimed at non-native speakers are more useful than dictionaries for native speakers in this regard. Third, dictionaries do not necessarily make it simple to take a random sample of words, although they appear to do that. The problem is that a word like combust may have an entry consisting of just a few lines, while a word like come may have an entry which spills over several columns or pages. While there are more words like combust, more room is given to words like come, and any simple counting or sampling procedure (such as taking the first new word on every fifth page) is likely to end up with a biased sample. Fourth, dictionaries inevitably involve compromises between academic integrity and commercial feasibility, and there is a certain random element in what happens to be included in them. Having said that, The Oxford English Dictionary, particularly in its on-line incarnation, is an invaluable tool for anyone dealing with the history of English or the vocabulary of English.
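The sampling problem raised in the third point can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the headwords and entry lengths are invented, the helper names (start_lines, first_new_headword, selection_counts) are hypothetical, and a page break is assumed to be equally likely to fall on any line. It counts how often each headword would be picked by a rule of the form 'take the first new word on the page', and suggests one way in which such a rule can be biased.

```python
from itertools import accumulate

# Hypothetical toy dictionary: (headword, entry length in lines).
# The lengths are invented for illustration, not taken from a real dictionary.
ENTRIES = [
    ("combust", 2),
    ("combustible", 3),
    ("combustion", 3),
    ("come", 40),
    ("comeback", 2),
    ("comedian", 2),
    ("comedy", 4),
    ("comely", 2),
]


def start_lines(entries):
    """Return the 0-based line number at which each entry starts."""
    lengths = [length for _, length in entries]
    return [0] + list(accumulate(lengths))[:-1]


def first_new_headword(entries, page_start):
    """Return the first headword that begins at or after line page_start."""
    for (word, _), start in zip(entries, start_lines(entries)):
        if start >= page_start:
            return word
    return None  # the page starts inside the final entry


def selection_counts(entries):
    """Count how often each headword is chosen when a page break is
    assumed to be equally likely to fall on any line of the text."""
    total_lines = sum(length for _, length in entries)
    counts = {word: 0 for word, _ in entries}
    for page_start in range(total_lines):
        word = first_new_headword(entries, page_start)
        if word is not None:
            counts[word] += 1
    return counts


counts = selection_counts(ENTRIES)
for word, n in counts.items():
    print(f"{word:12} {n}")
```

In this toy list comeback is chosen in 40 of the 57 cases, simply because it follows the long entry for come, whereas a genuinely random sample of headwords would give each of the eight words the same chance. Numbering the headwords and drawing from that list at random (for instance with random.sample from the Python standard library) would avoid this particular bias, though not the other limitations of dictionary data.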

It should be recalled that as well as ordinary monolingual and translating dictionaries, there are dictionaries of special vocabularies, dialect dictionaries, dictionaries of pronunciations, dictionaries of synonyms and antonyms, dictionaries of etymology, dictionaries of Indo-European roots, and a host of other works which provide fascinating reading and a wealth of valuable information.

Possible benefits of this type of data include:

• It provides easy access to large amounts of data.
• The existence of competing dictionaries provides simple checks on the accuracy of the available data.
