02.07.2013 Views

Vernacular Geography - Host Ireland


<strong>Vernacular</strong> <strong>Geography</strong><br />

Chris Jones and Florian Twaroch*<br />

Cardiff University<br />

Some results based on work by R. Purves (Univ. Zurich), P. Clough (Univ. Sheffield) and<br />

H. Joho (Univ. Glasgow)<br />

*Funded by Ordnance Survey and EC TRIPOD Project<br />

1


<strong>Vernacular</strong> Place Names<br />

• Informal Names<br />

– Different from Administrative names<br />

– Or Administrative names used informally<br />

• The Mid West (USA), South of France, Midlands,<br />

North of England, Lake District, Wye Valley..<br />

• The West End (London), Cardiff Bay,<br />

The Gorbals (Glasgow)..<br />

2


Place Names for Information Retrieval<br />

• Specifying location on internet<br />

– e.g. timetables, routing instructions, yellow pages, maps,<br />

+ “local” web search<br />

• Gazetteers used to recognise places and<br />

convert to quantitative footprint (point, box, polygon..)<br />

• Most vernacular place names not recorded in<br />

conventional gazetteers<br />

– often no precise boundary<br />

• Need to acquire knowledge of vernacular names<br />

3


Sources of place name knowledge<br />

• Gazetteers (incomplete / inadequate)<br />

• Published maps with place annotation<br />

– Difficult to derive footprint of vague places<br />

– For named natural feature, can use annotation point in<br />

association with digital terrain models to derive extent<br />

• People<br />

– Interviews / questionnaires – traditional methods inefficient,<br />

but web questionnaires offer great potential<br />

• Web documents<br />

– Many documents mention places, addresses, postcodes<br />

• Social Web and Volunteered Geographic Information<br />

– create map data and annotate media with place names<br />

– Wikimapia, OpenStreetMap, Flickr photos..<br />

4


Spatial modelling of places on the Web<br />

• Documents that refer to<br />

vernacular places may also<br />

refer to more precise places<br />

inside them.<br />

set of points<br />

• Use density surface<br />

modelling methods to map<br />

the frequency of occurrence<br />

of co-located places<br />
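A minimal sketch of one way to build such a density surface, assuming a square grid and a simple 3x3 box filter for smoothing. The slides do not state which kernel or cell size was used, so `count_per_cell`, `smooth` and the sample coordinates are illustrative only.

```python
from collections import Counter


def count_per_cell(points, cell_size):
    """Count points per square grid cell; keys are (col, row) indices."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts


def smooth(counts):
    """Spread each cell's count evenly over itself and its 8 neighbours."""
    surface = Counter()
    for (cx, cy), n in counts.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                surface[(cx + dx, cy + dy)] += n / 9.0
    return surface


# Made-up coordinates of places co-located with a target place.
points = [(0.2, 0.3), (0.8, 0.9), (1.5, 0.5), (5.1, 5.2)]
surface = smooth(count_per_cell(points, cell_size=1.0))
```

Smoothing is what makes fractional values such as 0.5 or 0.25 points per cell meaningful as thresholds; the total weight of the surface still equals the number of input points.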

5


Summary of Web Query Method<br />

• Submit web search engine queries referring to a<br />

target place<br />

• Parse resulting highest ranking web pages for<br />

occurrence of place names<br />

• Geocode (“ground”) place names with coordinates<br />

• Create geometric model (surface model) and<br />

extract approximate boundary.<br />
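The last step, extracting an approximate boundary from a surface model, can be sketched as thresholding followed by edge detection on the grid. This is an assumption about how the step might be done, not the authors' stated procedure; `threshold_region`, `boundary_cells` and the toy surface are hypothetical.

```python
def threshold_region(surface, threshold):
    """Return the set of cells whose density meets the threshold."""
    return {cell for cell, value in surface.items() if value >= threshold}


def boundary_cells(region):
    """Return region cells with at least one 4-neighbour outside the region."""
    edge = set()
    for cx, cy in region:
        neighbours = [(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)]
        if any(n not in region for n in neighbours):
            edge.add((cx, cy))
    return edge


# Toy surface: a dense 3x3 block plus one weak outlier cell.
surface = {(x, y): 1.0 for x in range(3) for y in range(3)}
surface[(5, 5)] = 0.1
region = threshold_region(surface, threshold=0.5)
edge = boundary_cells(region)
```

The outlier cell falls below the threshold and is dropped, which is how isolated, wrongly geocoded places can be kept out of the extracted region.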

6


Formulating appropriate Web queries<br />

• Region only, e.g. “Rocky Mountains”<br />

– Retrieves all documents mentioning the name<br />

• Region + Concept, e.g. “Hotels in Cotswolds”<br />

– Tends to retrieve directory pages listing places<br />

associated with the target place<br />

• Region and lexical pattern (trigger phrase), e.g.<br />

“Midwest towns such as xxxx”;<br />

“xxxx in the South of France” (spatial preposition)<br />

• Region + Concept produces highest numbers of<br />

co-associated places in top ranking documents.<br />
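The three query styles above can be sketched as small string builders. The function names are hypothetical, and the exact quoting a search engine expects is an assumption.

```python
def region_only(region):
    """Region-only query: the place name as an exact phrase."""
    return f'"{region}"'


def region_plus_concept(region, concept="hotels"):
    """Region + concept query, e.g. 'hotels in Cotswolds'."""
    return f"{concept} in {region}"


def trigger_phrases(region):
    """Lexical patterns whose completions are likely to be place names."""
    return [f'"{region} towns such as"', f'"in the {region}"']


queries = [
    region_only("Rocky Mountains"),
    region_plus_concept("Cotswolds"),
    *trigger_phrases("South of France"),
]
```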

7


Geo-parsing<br />

• Use Named Entity Recognition (NER) methods<br />

and gazetteers to identify names<br />

• Distinguish between geographical and non-geographical<br />

uses – many place names occur in<br />

organisation names and in people’s names.<br />

• Use rules / patterns to identify these cases, e.g.<br />

indicates person’s<br />

name.<br />

– Machine learning methods often used<br />

8


Geo-Parsing : true & false references<br />

Some types of false<br />

geographic reference<br />

• Personal names<br />

Smedes York<br />

• Business name<br />

Dorchester Hotel,<br />

York Properties..<br />

• Street names<br />

Oxford Street,<br />

London Road…<br />

• Common words that are<br />

also places<br />

urban, institute, land,<br />

battle, derby, over, well,<br />

……<br />

Highlighted words are present in Getty<br />

Thesaurus of Geographic Names<br />
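A rule-based filter for some of these false references might look like the sketch below. The word lists are tiny hypothetical stand-ins for a real gazetteer and name lists, and the rules are illustrative rather than the ones used in the work described.

```python
# Hypothetical, deliberately tiny lookup lists.
GAZETTEER = {"York", "Oxford", "London", "Dorchester", "Derby"}
STREET_WORDS = {"Street", "Road"}
BUSINESS_WORDS = {"Hotel", "Properties"}
GIVEN_NAMES = {"Smedes"}


def geographic_references(text):
    """Return gazetteer matches that pass the simple false-reference rules."""
    tokens = text.replace(",", " ").replace(".", " ").split()
    found = []
    for i, tok in enumerate(tokens):
        # Case-sensitive match, so lowercase common words ("derby") are ignored.
        if tok not in GAZETTEER:
            continue
        prev_tok = tokens[i - 1] if i > 0 else ""
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else ""
        if next_tok in STREET_WORDS or next_tok in BUSINESS_WORDS:
            continue  # street or business name
        if prev_tok in GIVEN_NAMES:
            continue  # part of a personal name
        found.append(tok)
    return found


text = "Smedes York stayed at the Dorchester Hotel on Oxford Street, then drove to York."
places = geographic_references(text)
```

Only the final "York" survives here. Hand-written rules like these are brittle, which is one reason machine learning methods are often preferred.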

9


Geocoding<br />

• Need to assign coordinate to name<br />

• But many places have the same name<br />

• search for co-occurrence of parent or<br />

neighbouring places that establish<br />

uniqueness<br />
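One simple way to use parent co-occurrence is sketched below: an ambiguous name is resolved only if exactly one candidate's parent place also appears in the document. The candidate table is hypothetical, its coordinates are approximate and for illustration only, and a real system would also consider neighbouring places.

```python
# Hypothetical gazetteer entries; coordinates are approximate.
CANDIDATES = {
    "Newport": [
        {"parent": "Wales", "lat": 51.59, "lon": -3.00},
        {"parent": "Isle of Wight", "lat": 50.70, "lon": -1.29},
    ],
}


def disambiguate(name, document_text, candidates=CANDIDATES):
    """Pick the candidate whose parent is mentioned in the text, else None."""
    options = candidates.get(name, [])
    if len(options) == 1:
        return options[0]
    matches = [c for c in options if c["parent"] in document_text]
    return matches[0] if len(matches) == 1 else None


resolved = disambiguate("Newport", "Hotels in Newport, Wales")
unresolved = disambiguate("Newport", "Hotels in Newport")
```

Returning `None` when the evidence is ambiguous avoids committing to a wrong coordinate, at the cost of discarding some mentions.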

10


Experiments with Web query method<br />

• Queries of the form “hotels in [target place]”<br />

submitted to Google via its API<br />

• Initially used target places that are precise –<br />

English counties - and compared results with<br />

known exact boundary<br />

• Then used several target vague places and<br />

evaluated qualitatively<br />
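The slides do not say how results were compared with the known boundaries. As one possible measure, assuming both the estimated region and the true county are rasterised to the same grid, cell-level precision and recall can be computed as in this hypothetical sketch.

```python
def overlap_scores(estimated, actual):
    """Return (precision, recall) for two sets of grid cells."""
    common = estimated & actual
    precision = len(common) / len(estimated) if estimated else 0.0
    recall = len(common) / len(actual) if actual else 0.0
    return precision, recall


# Made-up cell sets for illustration.
estimated = {(0, 0), (0, 1), (1, 0), (2, 2)}
actual = {(0, 0), (0, 1), (1, 0), (1, 1)}
precision, recall = overlap_scores(estimated, actual)
```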

11


Devon (county)<br />

Distribution of associated places<br />

Density surface<br />

Density surface at three threshold levels (1, 0.5, 0.25 points per cell)<br />

Thresholded boundary compared with actual boundary<br />

Note: some places wrongly geocoded<br />

12


4 precise places + thresholded<br />

boundaries<br />

13


Other vague places : Mid Wales<br />

Mid Wales<br />

approximated with<br />

thresholds of 0.5,<br />

0.25 and 0.12<br />

14


Vague place: Cotswolds<br />

Cotswolds (large region<br />

in centre) with thresholds<br />

of 0.5, 0.25 and 0.125<br />

points per grid cell.<br />

15


Vague place : Highlands of Scotland<br />

a b c<br />

a) Density of unique places<br />

b) Density of number of occurrences of the name of each place<br />

c) Density of number of documents that mention each place<br />

Main peak corresponds to Inverness - main town in Highlands<br />

2nd peak in b) is due to mis-geocoding of “Cameron”<br />
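The three surfaces a), b) and c) differ only in how each place is weighted. A minimal sketch of the three weightings, assuming the parsed results are available as (document id, place name) pairs; the function name and sample mentions are made up.

```python
from collections import Counter


def place_weights(mentions):
    """Return per-place weights: unique (always 1), occurrences, documents."""
    occurrences = Counter(place for _, place in mentions)
    # set() removes repeated mentions of a place within the same document.
    documents = Counter(place for _, place in set(mentions))
    unique = {place: 1 for place in occurrences}
    return unique, occurrences, documents


# Made-up mentions for illustration.
mentions = [
    ("d1", "Inverness"),
    ("d1", "Inverness"),
    ("d2", "Inverness"),
    ("d2", "Fort William"),
]
unique, occurrences, documents = place_weights(mentions)
```

A single mis-geocoded name that is repeated many times inflates weighting b) much more than a) or c), which is consistent with the spurious second peak noted above.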

16


Vague place : Mittelland (Switzerland)<br />

Human<br />

interpretations of<br />

the extent<br />

+ is the “core”<br />

Density surface of<br />

web mining results<br />

17


Web Screen Scraping Methods<br />

• Gumtree – Classified Ads and Community Site<br />

– British cities<br />

– ads for items to sell/buy/share, plus neighbourhood names<br />

georeferenced by UK postcodes<br />

• Google Maps Local Businesses<br />

– United Kingdom<br />

– business addresses, georeferenced by UK postcodes<br />

• Google Maps User Created Contents<br />

– United Kingdom<br />

– users can enter point, line and polygon features to<br />

describe local places using an interface to Google Maps<br />
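Once ads or business listings have been scraped, the neighbourhood name and postcode pairs can be aggregated to show where each name is actually used. The sketch below assumes the scraping has already produced such pairs; the function name and the sample postcodes are invented for illustration.

```python
from collections import Counter, defaultdict


def postcode_districts_by_name(ads):
    """Map each neighbourhood name to a count of postcode districts."""
    usage = defaultdict(Counter)
    for name, postcode in ads:
        # The outward code (before the space) identifies the district.
        district = postcode.split()[0].upper()
        usage[name.strip().title()][district] += 1
    return usage


# Invented (name, postcode) pairs, not real scraped data.
ads = [
    ("Roath", "CF24 1AA"),
    ("roath", "cf24 2BB"),
    ("Roath", "CF23 5CC"),
    ("Cathays", "CF24 4DD"),
]
usage = postcode_districts_by_name(ads)
```

With full postcodes geocoded to points, the same pairs could feed a density surface per neighbourhood name instead of a district count.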

18


Ads in Gumtree for Cardiff<br />

Neighbourhood<br />

name<br />

+<br />

Postcode<br />

19


<strong>Vernacular</strong> use of Cardiff community<br />

(parish) names from Gumtree and Google<br />

Usage of Plasnewydd coincides with boundary<br />

Usage of Roath does not coincide with boundary and<br />

subsumes Plasnewydd<br />

Map labels: Plasnewydd, Roath, Cathays, Splott<br />

Boundaries based on Ordnance Survey Data<br />

© Crown Copyright 2008<br />

20


Future work<br />

• Web query method for regions<br />

– different sorts of queries to cover places with few<br />

settlements<br />

• Screen scraping methods for neighbourhoods<br />

– Evaluation of quality of different sources<br />

• Improve methods for thresholding surfaces<br />

• Use multiple types of data, e.g. population, landcover..<br />

• Discover vernacular names on Web<br />

• Web questionnaires to elicit people’s understanding of<br />

the interpretation of vernacular names<br />

21
