Vernacular Geography - Host Ireland
<strong>Vernacular</strong> <strong>Geography</strong><br />
Chris Jones and Florian Twaroch*<br />
Cardiff University<br />
Some results based on work by R. Purves (Univ. Zurich), P. Clough (Univ. Sheffield) and<br />
H. Joho (Univ. Glasgow)<br />
*Funded by Ordnance Survey and EC TRIPOD Project<br />
1
<strong>Vernacular</strong> Place Names<br />
• Informal Names<br />
– Different from Administrative names<br />
– Or Administrative names used informally<br />
• The Mid West (USA), South of France, Midlands,<br />
North of England, Lake District, Wye Valley..<br />
• The West End (London), Cardiff Bay,<br />
The Gorbals (Glasgow)..<br />
2
Place Names for Information Retrieval<br />
• Specifying location on internet<br />
– e.g. timetables, routing instructions, yellow pages, maps,<br />
+ “local” web search<br />
• Gazetteers used to recognise places and<br />
convert them to a quantitative footprint (point, box, polygon..)<br />
• Most vernacular place names not recorded in<br />
conventional gazetteers<br />
– often no precise boundary<br />
• Need to acquire knowledge of vernacular names<br />
3
Sources of place name knowledge<br />
• Gazetteers (incomplete / inadequate)<br />
• Published maps with place annotation<br />
– Difficult to derive footprint of vague places<br />
– For named natural feature, can use annotation point in<br />
association with digital terrain models to derive extent<br />
• People<br />
– Interviews / questionnaires – traditional methods inefficient,<br />
but web questionnaires offer great potential<br />
• Web documents<br />
– Many documents mention places, addresses, postcodes<br />
• Social Web and Volunteered Geographic Information<br />
– Create map data and annotate media with place names<br />
– Wikimapia, OpenStreetMap, Flickr photos...<br />
4
Spatial modelling of places on the Web<br />
• Documents that refer to<br />
vernacular places may also<br />
refer to more precise places<br />
inside them.<br />
(figure label: set of points)<br />
• Use density surface<br />
modelling methods to map<br />
the frequency of occurrence<br />
of co-located places<br />
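As a minimal sketch of how a density surface could be built from a set of geocoded points: the slides do not name the modelling method, so the Gaussian kernel used here is an assumption, and `density_surface`, `cell_size` and `bandwidth` are hypothetical names. Values are scaled to "points per cell" so that they can be compared with thresholds of that kind.

```python
import math


def density_surface(points, x_min, y_min, cell_size, n_cols, n_rows, bandwidth):
    """Return an n_rows x n_cols grid of kernel density values (points per cell).

    Assumption: a Gaussian kernel; the slides do not specify the kernel.
    """
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    # 2D Gaussian normalisation times the cell area, so that each point
    # contributes a total of roughly one "point" spread over the grid.
    norm = cell_size ** 2 / (2.0 * math.pi * bandwidth ** 2)
    for row in range(n_rows):
        cy = y_min + (row + 0.5) * cell_size  # cell centre, y
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size  # cell centre, x
            total = 0.0
            for px, py in points:
                d2 = (px - cx) ** 2 + (py - cy) ** 2
                total += math.exp(-d2 / (2.0 * bandwidth ** 2))
            grid[row][col] = norm * total
    return grid


# Illustrative data only: one point in the middle of a 10 x 10 grid.
surface = density_surface([(5.0, 5.0)], 0.0, 0.0, 1.0, 10, 10, bandwidth=1.0)
```

A larger `bandwidth` smooths the surface more; the right value would depend on the spacing of the places being mapped.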
5
Summary of Web Query Method<br />
• Submit web search engine queries referring to a<br />
target place<br />
• Parse resulting highest ranking web pages for<br />
occurrence of place names<br />
• Geocode (“ground”) place names with coordinates<br />
• Create geometric model (surface model) and<br />
extract approximate boundary.<br />
6
Formulating appropriate Web queries<br />
• Region only, e.g. “Rocky Mountains”<br />
– Retrieves all documents mentioning the name<br />
• Region + Concept, e.g. “Hotels in Cotswolds”<br />
– Tends to retrieve directory pages listing places<br />
associated with the target place<br />
• Region and lexical pattern (trigger phrase), e.g.<br />
“Midwest towns such as xxxx”;<br />
“xxxx in the South of France” (spatial preposition)<br />
• Region + Concept produces highest numbers of<br />
co-associated places in top ranking documents.<br />
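The query types listed above could be generated along these lines. This is only a sketch: `build_queries` and its parameter names are hypothetical, and the quoting style is an assumption about how a search engine would be asked for an exact phrase.

```python
def build_queries(region, concept="hotels", feature_noun="towns"):
    """Build one query string per query type for a target region.

    All names here are illustrative; the slides give only example phrases.
    """
    return {
        # Region only: retrieves all documents mentioning the name.
        "region_only": f'"{region}"',
        # Region + concept: tends to retrieve directory-style pages.
        "region_concept": f'"{concept} in {region}"',
        # Trigger phrase: place names are expected to follow the phrase.
        "trigger_phrase": f'"{region} {feature_noun} such as"',
        # Spatial preposition: place names are expected to precede it.
        "spatial_preposition": f'"in {region}"',
    }


queries = build_queries("Cotswolds")
```

For a name that takes an article, the caller would pass it in, for example `build_queries("the South of France")`.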
7
Geo-parsing<br />
• Use Named Entity Recognition (NER) methods<br />
and gazetteers to identify names<br />
• Distinguish between geographical and non-geographical<br />
uses – many place names occur in<br />
organisation names and in people’s names.<br />
• Use rules / patterns to identify these cases, e.g.<br />
indicates person’s<br />
name.<br />
– Machine learning methods often used<br />
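A rule-based filter of this kind might look like the sketch below. The specific rules (a title before the name, a street or business word after it) and the word lists are assumptions chosen for illustration; they are not the rules used in the work described.

```python
# Hypothetical cue words; a real system would need much longer lists.
PERSON_TITLES = {"mr", "mrs", "ms", "dr", "prof"}
STREET_SUFFIXES = {"street", "road", "avenue", "lane"}
ORG_SUFFIXES = {"hotel", "properties", "ltd", "university"}


def classify_mention(tokens, index):
    """Label the gazetteer match at tokens[index] using its neighbours."""
    prev_tok = tokens[index - 1].lower().rstrip(".") if index > 0 else ""
    next_tok = (
        tokens[index + 1].lower().rstrip(".,") if index + 1 < len(tokens) else ""
    )
    if prev_tok in PERSON_TITLES:
        return "person"
    if next_tok in STREET_SUFFIXES:
        return "street"
    if next_tok in ORG_SUFFIXES:
        return "organisation"
    return "place"


label = classify_mention("We met Mr York yesterday".split(), 3)
```

Simple context rules miss cases with no cue word, such as a surname preceded only by a first name, which is one reason machine learning methods are often preferred.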
8
Geo-Parsing : true & false references<br />
Some types of false<br />
geographic reference<br />
• Personal names<br />
Smedes York<br />
• Business name<br />
Dorchester Hotel,<br />
York Properties..<br />
• Street names<br />
Oxford Street,<br />
London Road…<br />
• Common words that are<br />
also places<br />
urban, institute, land,<br />
battle, derby, over, well,<br />
……<br />
Highlighted words are present in Getty<br />
Thesaurus of Geographic Names<br />
9
Geocoding<br />
• Need to assign coordinate to name<br />
• But many places have the same name<br />
• Search for co-occurrence of parent or<br />
neighbouring places that establish<br />
uniqueness<br />
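The co-occurrence idea can be sketched as follows: each gazetteer candidate for an ambiguous name carries a set of related (parent or neighbouring) place names, and the candidate whose related names appear most often in the same document wins. The data structure, the function name `disambiguate` and the approximate coordinates are all illustrative assumptions.

```python
def disambiguate(candidates, context_names):
    """Pick the candidate whose related places co-occur in the document.

    candidates: list of dicts with 'coords' and 'related' (iterable of names).
    Returns None when no related place is mentioned, i.e. no evidence.
    """
    context = {name.lower() for name in context_names}
    best, best_score = None, 0
    for cand in candidates:
        score = len({r.lower() for r in cand["related"]} & context)
        if score > best_score:
            best, best_score = cand, score
    return best


# Hypothetical gazetteer entries for "Newport" (approximate lat/lon).
newport = [
    {"coords": (51.58, -3.00), "related": {"Wales", "Cardiff", "Gwent"}},
    {"coords": (50.70, -1.29), "related": {"Isle of Wight", "Cowes", "Ryde"}},
]
chosen = disambiguate(newport, ["Newport", "Cowes", "Ryde"])
```

Ties and documents with no supporting names are left unresolved here; a fuller approach would need a fallback, which the slides do not describe.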
10
Experiments with Web query method<br />
• Queries of form “hotels in ”<br />
submitted to Google with API<br />
• Initially used target places that are precise –<br />
English counties - and compared results with<br />
known exact boundary<br />
• Then used several target vague places and<br />
evaluated qualitatively<br />
11
Devon (county)<br />
Distribution of associated places<br />
Density surface<br />
Density surface at three threshold levels (1, 0.5, 0.25 points per cell)<br />
Thresholded boundary compared with actual boundary<br />
Note: some places wrongly geocoded<br />
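Thresholding a density surface to obtain an approximate boundary could be sketched as below. The grid values are invented for illustration, and treating a cell as a boundary cell when one of its four neighbours falls below the threshold is an assumed definition, not one taken from the slides.

```python
def threshold_region(grid, threshold):
    """Boolean mask of cells whose density reaches the threshold."""
    return [[value >= threshold for value in row] for row in grid]


def boundary_cells(mask):
    """Cells inside the mask with a 4-neighbour outside it (or off the grid)."""
    n_rows, n_cols = len(mask), len(mask[0])
    cells = []
    for r in range(n_rows):
        for c in range(n_cols):
            if not mask[r][c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                off_grid = not (0 <= rr < n_rows and 0 <= cc < n_cols)
                if off_grid or not mask[rr][cc]:
                    cells.append((r, c))
                    break
    return cells


# Invented density values in points per cell.
grid = [
    [0.1, 0.2, 0.2, 0.2, 0.1],
    [0.2, 0.6, 0.7, 0.6, 0.2],
    [0.2, 0.7, 1.2, 0.7, 0.2],
    [0.2, 0.6, 0.7, 0.6, 0.2],
    [0.1, 0.2, 0.2, 0.2, 0.1],
]
outline = boundary_cells(threshold_region(grid, 0.5))
```

Lowering the threshold grows the region outward, which is why the same surface yields nested boundaries at 1, 0.5 and 0.25 points per cell.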
12
4 precise places + thresholded<br />
boundaries<br />
13
Other vague places : Mid Wales<br />
Mid Wales approximated with thresholds of 0.5, 0.25 and 0.12<br />
14
Vague place: Cotswolds<br />
Cotswolds (large region in centre) with thresholds of 0.5, 0.25 and 0.125 points per grid cell.<br />
15
Vague place : Highlands of Scotland<br />
a) Density of unique places<br />
b) Density of number of occurrences of the name of each place<br />
c) Density of number of documents that mention each place<br />
Main peak corresponds to Inverness - main town in Highlands<br />
2nd peak in b) is due to mis-geocoding of “Cameron”<br />
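The three weightings a), b) and c) differ only in how much weight each place carries before the density surface is computed. A minimal sketch, assuming each retrieved document has already been reduced to a list of geo-parsed place mentions (the function name and sample data are hypothetical):

```python
from collections import Counter


def place_weights(documents):
    """Return the three per-place weightings described as a), b) and c).

    documents: list of lists of place names, one inner list per document.
    """
    occurrences = Counter()  # b) every mention counts
    doc_counts = Counter()   # c) at most one count per document
    for mentions in documents:
        occurrences.update(mentions)
        doc_counts.update(set(mentions))
    unique = {place: 1 for place in occurrences}  # a) each place counts once
    return unique, dict(occurrences), dict(doc_counts)


# Invented example: two documents.
docs = [["Inverness", "Inverness", "Aviemore"], ["Inverness"]]
unique, occurrences, doc_counts = place_weights(docs)
```

Weighting b) is the most sensitive to a single wrongly geocoded name that is repeated many times within one page.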
16
Vague place: Mittelland (Switzerland)<br />
Human interpretations of the extent (+ is the “core”)<br />
Density surface of web mining results<br />
17
Web Screen Scraping Methods<br />
• Gumtree – Classified Ads and Community Site<br />
– British cities<br />
– ads for items to sell/buy/share, plus neighbourhood names<br />
georeferenced by UK postcodes<br />
• Google Maps Local Businesses<br />
– United Kingdom<br />
– business addresses, georeferenced by UK postcodes<br />
• Google Maps User Created Contents<br />
– United Kingdom<br />
– users can enter point, line and polygon features to<br />
describe local places using an interface to Google Maps<br />
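Georeferencing scraped text by postcode first requires finding the postcodes. The sketch below uses a simplified pattern for the general shape of a UK postcode; it does not validate against real postcode areas, and the sample advert text is invented.

```python
import re

# Simplified UK postcode shape: outward code (e.g. CF24) + inward code (e.g. 3AA).
# Assumption: this loose pattern is good enough for a first pass over scraped text.
POSTCODE_RE = re.compile(
    r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b", re.IGNORECASE
)


def extract_postcodes(text):
    """Return postcodes found in text, normalised to 'OUTWARD INWARD'."""
    return [
        f"{m.group(1).upper()} {m.group(2).upper()}"
        for m in POSTCODE_RE.finditer(text)
    ]


found = extract_postcodes("Room to let in Roath, Cardiff cf24 3aa, near the park")
```

Each extracted postcode would then be looked up in a postcode-to-coordinate table, giving a point that can be associated with the neighbourhood name used in the same advert.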
18
Ads in Gumtree for Cardiff<br />
Neighbourhood name + postcode<br />
19
<strong>Vernacular</strong> use of Cardiff community<br />
(parish) names from Gumtree and Google<br />
Usage of Plasnewydd coincides with boundary<br />
Usage of Roath does not coincide with boundary and<br />
subsumes Plasnewydd<br />
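One simple way to quantify whether usage of a name "coincides with" an official boundary is the share of georeferenced mentions that fall inside the boundary polygon. The slides do not state how the comparison was made, so this measure is an assumption; the polygon and points below are invented, and the point-in-polygon test is the standard ray casting method.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting test; polygon is a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def share_inside(points, polygon):
    """Fraction of points that lie inside the polygon (0.0 if no points)."""
    if not points:
        return 0.0
    hits = sum(point_in_polygon(x, y, polygon) for x, y in points)
    return hits / len(points)


# Invented square "community boundary" and four georeferenced mentions.
boundary = [(0, 0), (4, 0), (4, 4), (0, 4)]
mentions = [(1, 1), (2, 3), (5, 5), (6, 1)]
share = share_inside(mentions, boundary)
```

A share close to 1 would suggest usage that matches the boundary, as reported for Plasnewydd; a low share would suggest a name used well beyond its official extent, as reported for Roath.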
(Map labels: Plasnewydd, Roath, Cathays, Splott)<br />
Boundaries based on Ordnance Survey Data<br />
© Crown Copyright 2008<br />
20
Future work<br />
• Web query method for regions<br />
– different sorts of queries to cover places with few<br />
settlements<br />
• Screen scraping methods for neighbourhoods<br />
– Evaluation of quality of different sources<br />
• Improve methods for thresholding surfaces<br />
• Use multiple types of data, e.g. population, landcover..<br />
• Discover vernacular names on Web<br />
• Web questionnaires to elicit people’s interpretation of<br />
vernacular names<br />
21