
Automated Aerial Image Analysis using Ordnance Survey Vector Data

Brian Sexton 

Master of Science 

NUI Galway 

Department of Information Technology 

August 2010 

Dr. James Duggan 

Dr. Sam Redfern

Certificate of Authorship 


Contents 

Certificate of Authorship
Contents
List of Tables
List of Figures
Abstract
1 Project Outline
1.1 Project Overview
1.2 General Introduction and Background
2 Stepping through the Algorithm
2.1 Initial Inputs
2.2 Area Extraction
2.3 Spectral Value Comparison
2.4 Confirmation
3 Sampling for the Baseline Image Key
3.1 Roads
3.2 Water
3.3 Marsh
3.4 Coniferous Forestry
3.5 Mixed Forestry
3.6 Track
3.7 Shade
3.8 Roof Areas
3.9 Pasture
3.10 Rough Pasture
4 Testing
4.1 Pasture Test
4.2 Rough Pasture Test
4.3 Marsh Test
4.4 Bog Test
4.5 Conclusion
5 Literature Review
5.1 Spectral and image considerations for the thesis
5.2 Vector and polygon based studies of aerial photography
6 References

List of Tables 

Table 1: Road sample values
Table 2: Road test sample value 1
Table 5: Water sample values
Table 6: Water test sample values
Table 7: Marsh sample values
Table 8: Marsh test sample values
Table 9: Coniferous forestry sample values
Table 10: Coniferous forestry test sample values
Table 11: Mixed forestry sample values
Table 12: Mixed forestry test sample values
Table 13: Track sample values
Table 14: Track test sample values
Table 15: Shade sample values
Table 16: Shade test sample value 1
Table 19: Roof pixel sample values
Table 20: Roof test sample value 1
Table 23: Pasture sample values
Table 24: Pasture test sample values
Table 25: Rough pasture sample values
Table 26: Rough pasture test sample values

List of Figures 

Figure 1: Aerial view of sample area
Figure 2: Road area and surrounding detail
Figure 3: Road area and vector data
Figure 4: Typical Water Area Image
Figure 5: Water Area Image Modification
Figure 6: Sample area as a mosaic of polygons
Figure 7: Typical Marsh Area Image
Figure 8: Typical Mixed Forestry Area Image
Figure 9: Typical Track Area Image
Figure 10: Typical Shade Area Image
Figure 11: Histogram for Shade and Pasture
Figure 12: Typical Roof Value Area Image
Figure 13: Distribution of Buildings/Roofs in the Sample
Figure 14: Blue colour band pixel count for study area
Figure 15: Typical Pasture Area Image
Figure 16: Typical Rough Pasture Area Image
Figure 17: Creating the ASCII file
Figure 18: Aerial view of pasture test 1
Figure 19: Red colour band for pasture test 1
Figure 20: Green colour band for pasture test 1
Figure 27: Vector data for pasture test 4
Figure 31: Vector data for rough pasture test 1
Figure 32: Aerial view of rough pasture test 1
Figure 33: Red colour band for rough pasture test 1
Figure 34: Green colour band for rough pasture test 1
Figure 41: Vector data for rough pasture test 4
Figure 45: Vector data for marsh test 1
Figure 46: Aerial view of marsh test 1
Figure 47: Red colour band for marsh test 1
Figure 48: Green colour band marsh test 1
Figure 49: Blue colour band for marsh test 1
Figure 50: Aerial view of marsh test 2
Figure 52: Green colour band marsh test 2
Figure 60: Vector data for bog test 1
Figure 61: Aerial view for bog test 1
Figure 62: Red colour band for bog test 1
Figure 63: Green colour band for bog test 1
Figure 64: Blue colour band for bog test 1
Figure 65: Vector data for bog test 2

Abstract 

This study sets out an algorithm for the automatic analysis of controlled (flattened) aerial photography using Ordnance Survey vector data. It uses the vector data to clip the aerial image into a set of small area polygons, which are then analyzed for their spectral properties and classified according to the result. The study tests sections of aerial photography from a sample area in County Galway for specific spectral properties. The aim was to identify the type of ground cover, which was achieved using an image key of spectral properties developed during the study; this is called training the image key. A testing section shows that it is possible to derive information about the land use type of these areas from the range of values returned by a pixel count of spectral properties within a small area polygon.

The study uses several open source software frameworks to complete the experiment, most notably the MATLAB-based Mirone application, but can be extended to any software capable of handling irregular polygons in a projection system. The body of the study is set out in three chapters: the first details the process, the second the sampling for unique values, and the third the testing which took place. Chapters 3 and 4 are subdivided into sections describing the research on specific land use types and their spectral signatures.

Chopping an aerial image into a mosaic of (relatively) homogeneous values, e.g. pasture, forestry, marsh etc., increases the accuracy of automated analysis. This is the first time that a spectral analysis has been attempted using Ordnance Survey Ireland small area polygons to clip the image. It is of interest to researchers, planners, developers etc. looking to simplify and automate this type of search over a large region.

Note: Contact brian.sexton@osi.ie for a set of sample files for training the image key.

1 Project Outline 

1.1 Project Overview 

The following study is an attempt to devise an automatic method of analyzing aerial photography based on vector data. It presents an algorithm and calibration data for someone seeking to complete a search of aerial imagery based on a spectral signature. The premise on which the work was undertaken was that, given the small area polygons and coding data present in Ordnance Survey data, it should be possible to cut a controlled aerial photograph into a mosaic of sections and automatically identify the type of ground cover.

The goal of the study was to identify a series of steps with which this could be completed. These steps are intended as a template either for a standalone application which could run searches over large geographical areas, or for achieving the result over smaller areas using existing open source software libraries. This document is aimed at people seeking to develop a generic tool for completing a spectral analysis of aerial photography, or at anyone looking to execute a search of the Irish landscape for data which exhibits a distinct spectral value (crop disease, impermeable surface area, flooding etc.). The study differs from other methods of automatic image processing in that it takes existing analysis (in the form of Ordnance Survey vector data) and uses it to convert the spectral data into manageable sections. This summary presents a chronological overview of the work completed and outlines the process used.

The basis of this study is the clipping of raster data for spectral analysis, and the intention is to prove that this makes the process of image analysis easier. Traditional approaches can refine the analysis itself to reveal more about the region of interest than is attempted here. For example, a lidar (aerial laser imaging) survey could provide researchers with data relating to the height of a tree canopy or the depth of peat in a bog. This study, while it does identify specific sets of values relating to land use types, focuses on identifying an easily replicated process for automatically obtaining data from aerial imagery.

The process was designed so that it can be coded into a standalone application and has been tested using open source software. In this way the study is aimed at simplifying what can be an expensive and time consuming task into a series of steps that someone without a high level of training in either mapping or computer science could run.

One of the difficulties in attempting to find data from imagery is identifying target areas within a region of interest. This is compounded by the nature of the values returned by an aerial image: clusters of pixels with similar values are often not bounded by clean borders and often display a gradual gradient of values where they merge with another cluster. In other words, to automatically determine the true values on the ground a program needs to know the extent of the set of data that the clusters sit in. One analogy might be tables in a relational database: by dividing the total set of pixels for a region of interest into the discrete parcels of land reflected in the photograph, the program has a database of tables to query for specific values. This study takes the vector data and uses it to create a mosaic of separate pixel groups for analysis. This in itself, however, still leaves a huge body of data to be analyzed. To further improve the analysis, the parcels within the mosaic are classified according to known values taken from the vector coding, and these known values are then used to train an image key, which in turn allows the remaining parcels to be analyzed. It is this clipping process which is at the core of this study. It provides a means of accessing the raster data which readers can easily replicate and automate for their own purposes.

Vector data has been used to target and control aerial image analysis in previous studies with a degree of success. These studies often involve additional user input to refine the region of interest so that automated techniques such as multivariate analysis of variance can be applied to the image. This process of refining the region requires a level of technical expertise which could make a spectral analysis of aerial imagery too time consuming for many users. An example of one of these processes is contained in the 2007 assessment of impermeable surface area by Yuyu Zhou and Y.Q. Wang (Zhou & Wang, 2007). The authors determined that segmentation would be the most important part of the study and applied an algorithm of multiple-agent segmentation and classification. This involved importing transportation data to create buffers along the major roads at varying distances from the centre to divide the imagery. This process allowed the authors to determine the nature of ground cover with a high degree of accuracy, as revealed by random point sampling. A description of similar studies and methods is contained in the literature review at the end of this paper. The preparation needed to determine the appropriate extent of image segments in such studies can be difficult to automate. The previously mentioned study (Zhou & Wang, 2007), for example, depended on knowledge of the availability and accuracy of transport data and of how to use it to segment the photography. An alternative is to use vector data of a known accuracy for the image segmentation. This approach depends on the availability of such vector data, which confines the focus of this study to an Irish context.

One of the benefits of applying an algorithm which automatically segments an aerial image into small area parcels is that it creates a platform on which further image analysis can take place. This study takes a set of ASCII coordinates from the vector data and clips the imagery. These parcels are then classified according to their land use type. A user could then use these classifications to target specific sets for interpretation. For example, the type of growth present in marsh areas could be determined by taking the set of polygons returned for marsh and identifying the percentage of the pixel count corresponding to the expected value for that growth.
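
As a rough illustration of that last step, the share of a parcel's pixel count falling within an expected range could be computed as below. This is only a sketch: the function name `percent_in_range`, the sample pixel values and the range limits are all invented for the example and are not taken from the study.

```python
def percent_in_range(values, low, high):
    """Percentage of pixel values that fall within [low, high]."""
    if not values:
        return 0.0
    hits = sum(1 for v in values if low <= v <= high)
    return 100.0 * hits / len(values)


# Invented green-band values for the pixels inside one marsh polygon.
marsh_green = [92, 101, 110, 118, 140, 155, 97, 104]

# Hypothetical expected range for the growth of interest.
share = percent_in_range(marsh_green, 95, 120)
print(f"{share:.1f}% of pixels fall in the expected range")
```

In practice the range limits would come from the trained image key rather than being typed in by hand.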

There are two prerequisite data sets for running this type of analysis, and a section from both of these sets (just west of Oughterard, Co. Galway) was used for this study. The first requirement is digital vector data from the Ordnance Survey and the second is colour (RGB) photography stored in GeoTiff format and projected using Irish Transverse Mercator (to match the vector data). It is possible to automatically re-project imagery supplied in other projections using GDAL_transform (GDAL, n.d.), but this was not used in this study. The software requirements are:

• Something which can manipulate and interrogate vector data files.
• Something capable of handling irregular polygons within a coordinate system.
• Software capable of analyzing pixel values.

In this study a commercial package called Radius vision was used to export the coordinate set for each small area polygon. The processing of the image polygons was completed using Mirone, the MATLAB-based framework tool developed at the University of the Algarve by Joaquim Luis (Mirone, 2009). The histogram values for the segmented image sections were obtained using PCI Geomatics' Geomatica package. For each polygon tested in the study the following steps were taken, which are the basis for the proposed algorithm:

• Extract the point data which surrounds the polygon(s) within the region of interest.
• Import the point data into a software procedure to clip the aerial imagery and save the segmented image in GeoTiff format.
• Run a histogram analysis for the image segment.
• Run comparison procedures to classify the segment.
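
The clipping and histogram steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Radius, Mirone and Geomatica workflow used in the study: the image is a small in-memory grid of (R, G, B) tuples rather than a GeoTiff, pixel coordinates stand in for Irish Transverse Mercator coordinates, and the function names (`point_in_polygon`, `clip_pixels`, `band_histogram`) are hypothetical.

```python
from collections import Counter


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside polygon [(x0, y0), ...]."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def clip_pixels(image, polygon):
    """Return the (r, g, b) pixels whose centres fall inside the polygon."""
    clipped = []
    for row, line in enumerate(image):
        for col, pixel in enumerate(line):
            if point_in_polygon(col + 0.5, row + 0.5, polygon):
                clipped.append(pixel)
    return clipped


def band_histogram(pixels, band):
    """Count pixels per intensity value for one band (0=R, 1=G, 2=B)."""
    return Counter(p[band] for p in pixels)


# Invented 4x4 image: uniform background with one brighter pixel.
image = [[(10, 20, 30)] * 4 for _ in range(4)]
image[1][1] = (200, 180, 90)

# Invented square parcel covering the four central pixels.
polygon = [(1, 1), (3, 1), (3, 3), (1, 3)]

pixels = clip_pixels(image, polygon)
red_hist = band_histogram(pixels, 0)
print(len(pixels), dict(red_hist))
```

The resulting histogram is what a comparison procedure would then check against the image key.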

The point data mentioned above refers to controlled data points which indicate fixed x and y positions on the ground and can be used to analyze the imagery. Attached to those points are vectors and coding which, for much of the image, indicate the type of land use present, e.g. forestry, water, buildings, etc. The comparison procedures mentioned above refer to values obtained during a sampling process conducted in the early part of this study. This sampling first took sections from known polygons and recorded the spectral values for these samples in order to calibrate an image key. Samples for parcels of types not coded into the vector data were then taken and the percentage variance between the two sets was recorded.

The sampling for the study was completed on ten separate types of land use. Five of these types were identified from the vector data, while the remaining types were identified using the techniques developed in this study. Separate sections of the image were extracted for the analysis, ranging from three to ten for each area type. The samples were uniform, clear examples of each type of terrain, free of any biasing factors such as shade or overhanging vegetation, so as to obtain clean baseline data. The samples were extracted as GeoTiff images and analyzed for their spectral qualities. A full description and tables for each sample are contained in chapter 3, which is divided into sections according to land use type for easy reference.

In general terms the results were what might be expected: large bodies of water (e.g. river, lake) produced a clearly identifiable signature, while mixed forestry and rough pasture areas had a higher level of standard deviation than more uniform cover. The areas sampled fell under the categories of roofs, roads, water, marsh, rough pasture, coniferous forestry, mixed forestry, pasture, track and shade. Although shade is not a distinct area, values for shade (manually identified from the imagery and clipped for analysis) were used to calibrate the image key so that shade could be recognised when found in polygons identified by the vector data. In a similar way, the values taken as representative of the spectral qualities of roadways, for example, did not include the overhanging tree cover which is present in polygons extracted based on the ground-revised vector data.

The aim of this part of the study was to identify a series of proportional values which could be used to indicate the presence of a land use type in an unknown polygon. For example, the mean red and green pixel values for the known areas of water were identified as 30% and 45% of those of pasture, something which an automatic search could use to flag an area being used as pasture. This sampling was not intended to be comprehensive in terms of creating a key for use in every possible automated image search, but was undertaken to prove the potential for automated image processing based on segmenting the images using small area polygons.
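
A proportional check of this kind might look like the sketch below. The 30% and 45% ratios are the ones quoted above; everything else (the function names, the pixel values and the tolerance of 0.05) is an assumption made for illustration.

```python
def band_means(pixels):
    """Mean value of each band for a list of (r, g, b) pixels."""
    n = len(pixels)
    return tuple(sum(p[b] for p in pixels) / n for b in range(3))


def matches_key(ratios, expected, tolerance=0.05):
    """True if every listed band ratio is within tolerance of its target.

    expected maps a band index (0=R, 1=G, 2=B) to the target ratio.
    """
    return all(abs(ratios[band] - target) <= tolerance
               for band, target in expected.items())


# Ratios from the sampling: water red is 30% and green 45% of pasture.
WATER_TO_PASTURE = {0: 0.30, 1: 0.45}

# Invented pixels: a water parcel known from the vector coding,
# and a candidate parcel of unknown type.
water = [(30, 55, 40), (33, 57, 44)]
candidate = [(100, 120, 60), (110, 130, 70)]

ratios = tuple(w / c for w, c in zip(band_means(water), band_means(candidate)))
is_pasture = matches_key(ratios, WATER_TO_PASTURE)
print(ratios, is_pasture)
```

Using a known, vector-coded type such as water as the control means the comparison is relative, so it should be less sensitive to overall brightness differences between images than a check on absolute pixel values would be.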

One surprising result from this sampling was in the values returned for roof polygons. These polygons contained two identifiable ranges of values associated with the pitch of the roof, where the angle of the light created shade on one side; this might facilitate a process to determine the angle of the pitch. At the beginning of the study I had believed that these roof values would provide enough of a control to calibrate most of the image. However, the variation in the ranges of values from shade to light on the angled surfaces made roof values an unreliable source of control values for the study. Of the known-value samples, the most useful in terms of providing a consistent control on which to base comparative procedures were water, roads and coniferous forestry. Of the unknown types (those not automatically identified from the vector data), pasture and bog had the most distinct sets of values. The next phase of the study involved testing the algorithm against these identified spectral values to see if the irregular polygons (with internal distorting factors) matched the range expected from the sampling.

The testing process followed the outline for the algorithm. Polygons were extracted from the vector data as sets of coordinate points saved in an ASCII file. These points were then used to create a clipping path to cut the relevant section from the aerial image, and the section was saved in GeoTiff format. This file was then analyzed for its spectral content and the resulting range of pixel values was compared to that expected for the land use type.
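
Reading such an ASCII coordinate file into a clipping path could be sketched as follows. The file layout is an assumption: one easting and northing pair per line, separated by whitespace or a comma, with `#` marking comments. The actual export format is not described at this point in the text, and the coordinates and function names below are invented.

```python
import io


def read_polygon(handle):
    """Parse 'x y' or 'x,y' lines into a list of (x, y) vertices."""
    points = []
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        x_text, y_text = line.replace(",", " ").split()[:2]
        points.append((float(x_text), float(y_text)))
    # Drop a closing vertex that simply repeats the first one.
    if len(points) > 1 and points[0] == points[-1]:
        points.pop()
    return points


def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for a list of vertices."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


# Invented ITM-style eastings and northings for one small area polygon.
sample = io.StringIO("""# easting northing
512000.0 742000.0
512050.0 742000.0
512050.0 742030.0
512000.0 742030.0
512000.0 742000.0
""")

polygon = read_polygon(sample)
bbox = bounding_box(polygon)
print(polygon, bbox)
```

The bounding box is a cheap first cut: only the pixels inside it need to be tested against the polygon itself.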

The testing focused on sets of known polygon types for four typical areas (not coded in the vector data): pasture, marsh, bog and rough pasture. It should be noted that this testing section of the study represented an execution of the algorithm, but the level of automation can be improved when the vector data is made available in GML format. With GML, coordinate sets for multiple polygons can be extracted in one file; this format is expected within the next two years.

The areas analyzed were polygons containing marsh, bog, pasture and rough pasture. Of these, pasture and bog produced the most distinctive spectral traits and matched expected values, which would allow a comparative procedure to classify them automatically. Both the marsh and rough pasture sets of samples contained high levels of deviation from the mean pixel value across the red and green colour bands, with a similar range of values. However, these can be distinguished by a trough between the values corresponding to shade and vegetation, which is present in all of the red colour band values for rough pasture. A full description of testing can be found in chapter 4 of this study.
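
One simple way to test a red-band histogram for such a trough is sketched below. It is an assumption about how the check might be automated, not the procedure used in the study: the function name `has_trough`, the `depth` threshold and the bin counts are invented, and a real histogram would probably need smoothing first so that noise is not mistaken for peaks.

```python
def has_trough(counts, depth=0.5, min_peak=1):
    """True if two neighbouring peaks are separated by a clear dip.

    counts is a list of pixel counts per histogram bin. A dip qualifies
    when the lowest count between two adjacent local peaks is below
    depth times the smaller of the two peaks.
    """
    peaks = [i for i in range(1, len(counts) - 1)
             if counts[i] >= min_peak
             and counts[i] > counts[i - 1]
             and counts[i] >= counts[i + 1]]
    for a, b in zip(peaks, peaks[1:]):
        valley = min(counts[a:b + 1])
        if valley < depth * min(counts[a], counts[b]):
            return True
    return False


# Invented, coarsely binned red-band histograms.
rough_pasture_like = [0, 2, 8, 15, 9, 3, 1, 4, 10, 14, 6, 1]  # shade peak, dip, vegetation peak
marsh_like = [0, 1, 4, 9, 14, 16, 13, 8, 3, 1, 0, 0]          # single broad peak

print(has_trough(rough_pasture_like), has_trough(marsh_like))
```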

The results of this study point to the value of accessing the spectral values contained in aerial imagery through Ordnance Survey vector data. In almost every land use tested the polygons returned a consistent pixel count for the type. It is important to qualify these results by noting that the values are based on an analysis of the red, green and blue colour bands (so restricted to colour imagery) and that the process relies on the vector data. It does, however, present a relatively simple means of completing an analysis of aerial imagery. This in turn opens up the possibility of coding a standalone application for analyzing and comparing polygons within a region of interest. The procedure takes the form of a series of loops designed to eliminate known values. Once the known values, followed by the derived values, have been eliminated, the user is left with a relatively small set of polygons to examine and can apply a key which has been further trained for the specific study. The thesis layout begins with a description of the background to the study followed by three chapters.
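
The series of elimination loops described above might be structured as in the following sketch. It is illustrative only: each polygon's spectral signature is reduced to a single mean red value, and the identifiers, key values and matching tolerance are all hypothetical.

```python
def classify_polygons(polygons, vector_codes, image_key, matches):
    """Label polygons in two passes and return (labels, remaining).

    polygons:     {polygon_id: signature}
    vector_codes: {polygon_id: land_use} for types coded in the vector data
    image_key:    {land_use: expected signature} from the trained key
    matches:      function(signature, expected) -> bool
    """
    labels = {}
    remaining = dict(polygons)

    # Loop 1: eliminate polygons whose type is known from the vector coding.
    for pid in list(remaining):
        if pid in vector_codes:
            labels[pid] = vector_codes[pid]
            del remaining[pid]

    # Loop 2: eliminate polygons that match a derived value in the image key.
    for pid, signature in list(remaining.items()):
        for land_use, expected in image_key.items():
            if matches(signature, expected):
                labels[pid] = land_use
                del remaining[pid]
                break

    return labels, remaining


# Invented data: signatures are mean red values per polygon.
polygons = {"p1": 40.0, "p2": 120.0, "p3": 95.0, "p4": 200.0}
vector_codes = {"p1": "water"}
image_key = {"pasture": 118.0, "bog": 97.0}


def within_five(signature, expected):
    """Hypothetical match rule: within 5 units of the expected value."""
    return abs(signature - expected) <= 5.0


labels, remaining = classify_polygons(polygons, vector_codes, image_key, within_five)
print(labels, remaining)
```

Whatever is left in `remaining` is the small set of polygons that the user would examine with a key trained further for the specific study.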


1.2 General Introduction and Background 

An overview of the work of this study can be found in the executive summary; this section is intended to provide background information and explain some of the terms used. The study was written with the intention of making it easy for readers to access the part of the study relevant to them and then make use of any techniques identified. For example, if your intention is to identify the percentage of bog in your region of interest, read the sampling section on bog, followed by the testing, and then read how to apply the algorithm (chapter 2).

I think it might be helpful if I first explained my interest in and motivation for this work. For the past decade I have been involved in the photogrammetric capture of the vector data used in this study and know that this type of surveying is difficult and can be extremely tedious, but I believe it does present a template for automatic capture of additional data from aerial imagery. A brief description of the nature of this surveying can be found at the end of this section. I believe that it should be possible, with a robust key of spectral data for known polygons (parcels of surface area enclosed by controlled boundaries such as walls, fences, roads etc.), to automatically search the data for specific values. In other words, someone with little knowledge of mapping or software could select a region of interest and search for a particular value, starting from either a selection from the imagery or coordinates imported from a portable GPS device. This could take the form of a standalone application or be done through various freely available software packages. This study focuses on the use of open source software but suggests areas where a specialized tool could be developed. In general terms, a search of aerial imagery requires specialized tools and knowledge to access the information contained in the data (such as the spread of crop disease, level of impermeable surface area etc.) and can be a time consuming process. This study is an attempt to automate that kind of search using small area polygons, an approach which is unique.

The study itself involves a mixture of computer science and mapping. As more and more of the surface of the earth becomes digitally captured and analyzed, these two fields will by necessity start to merge. This study looks at one small aspect of mapping and how automated software could be used to increase the amount of information available to a user. The premise of the study is that, given enough previously captured and accurate data, it is possible to automatically read the landscape. In short, I hope to take point and line data, slice sections from aerial photography, and run a spectral analysis. The focus of the study is on the methodology, so that a means for completing automated updating is identified. I should point out that by updating I am referring to providing information relating to the percentages of ground cover within a small area polygon. The task of physically capturing new structures, roads and height values (even when considering lidar) is probably something that will always require a human eye to interpret the data to some degree, for example to judge whether a structure is temporary or whether road works are underway.

It might be useful at this point to introduce some more background to this study. The island of Ireland was fully digitally mapped in 2005, and recently a new database for this data has been introduced which allows small area polygons to retain unique identifiers linked to the surrounding geometry and features. The mapping is on an update cycle, but for the most part it can be assumed that these polygons will remain constant (with the percentage change even lower following the building boom of the last decade). This opens up the opportunity for someone to visit the sections of surface area represented by the area polygons over successive runs of aerial photography and extract land use change data.

I mentioned earlier that the focus of the study is on establishing a methodology; 

this was because the motivation for assessing land use change would vary 

according to the user. An example of this might be someone considering the 

potential for flooding within an area. This person might want to take a look at the 

water courses and new housing developments to determine the amount of 

impermeable surface area (paving, patios etc.) over the course of their study. To 

physically do this, either using a photogrammetry tool such as SOCET SET or 

field GPS, would be an onerous task. The aim of this thesis is to provide a method 

for automatically doing this. It might seem an obvious point, but the more 

information available to a process when commencing an examination of an area, 

the higher the chances of successful data being returned. This is where this study 

differs from previous attempts at automated data capture. The process being 

suggested takes a large amount of previously captured and verified data to aid the 

algorithm. By this I mean most sections of the aerial imagery are extracted based 

on definite boundaries such as walls, streams, buildings etc. Internal polygons 

within the target area such as water, buildings, roads, forestry are also identified 

and used to aid the search. In this way the study is entirely dependent on 

previously surveyed data. This is something that has not been attempted before 

with Irish data. I did not find any similar study from overseas over the course of 

my research. I hope to show, using sample software, that this is both possible 

and practical to implement. 

The software used in the study comes from several open source projects, and also 

from one commercial vector data manipulation package (Radius). These are all 

packages which could be considered to be generic tools. It is important to note that 

this is not in reference to their capabilities or any slight on the people who develop 

them but in that the functions being accessed are common to several similar 

software packages. For example, the ASCII coordinate files created using Radius 

could equally have been achieved using ArcView or Microstation (among others). 

The intention was to keep the algorithm as flexible as possible so that users could 

adapt it to their available resources. 
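
As an illustration of how such an ASCII coordinate file might be consumed, the sketch below reads polygon vertices into memory. The layout assumed here (one vertex per line as "polygon id, easting, northing", with "#" marking a comment) is hypothetical; the actual export format depends on the package and settings used.

```python
import csv
from collections import defaultdict


def read_polygon_coordinates(lines):
    """Group vertices by polygon identifier.

    `lines` is any iterable of text lines (an open file works). The
    three-column layout is an assumption made for this example.
    """
    polygons = defaultdict(list)
    for row in csv.reader(lines):
        # Skip blank lines, comment lines and rows that are too short.
        if not row or row[0].lstrip().startswith("#") or len(row) < 3:
            continue
        polygon_id = row[0].strip()
        easting = float(row[1])
        northing = float(row[2])
        polygons[polygon_id].append((easting, northing))
    return dict(polygons)
```

Because the function only needs an iterable of lines, the same routine could be pointed at a file exported from any of the packages mentioned above, provided the columns are arranged in the assumed order.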

Some of the primary software tools used in this study come from GDAL 

(the Geospatial Data Abstraction Library). In particular, its facility for reading and writing raster 

geospatial data formats is used to manipulate the GeoTIFF files containing the aerial 

imagery used in the study. GDAL is a project of the 

Open Source Geospatial Foundation (OSGeo), a non-profit, non-governmental 

organization set up to support the development of open source geospatial software. 

The foundation also supports projects such as GeoTools, GRASS GIS, Mapbender, 

MapGuide Open Source and MapServer, among others. In this study the GDAL library is 

accessed using another open source software library known as OpenEV. This 

allows the GDAL library to be presented within an application for displaying and 

analyzing the data. As with GDAL, it is implemented in C but has the potential for 

manipulation with Python. In this study the processes will be run on a Windows 

platform and used as a means of accessing the raster data from the GeoTIFF 

imagery. It is necessary to access GDAL in order to open the GeoTIFF using the 

appropriate ITM coordinates for the point set defining the search area. This 

choice of access to the raster data should not suggest that GDAL and OpenEV are 

unique. Similar software exists that could also have been used in the study. 

Another example is the ImageTool open source software library. This was 

developed in the 1990s in the Department of Dental Diagnostic Science at the 

University of Texas, and written using C++. GDAL and OpenEV were chosen in 

preference because the data returned could be more easily modified to striate 

histogram and statistical data with these libraries due to the larger body of work 

contained within them. One of the most important software considerations was 

flexibility (and extensibility), as ideally the users of any suggested methodology 

would modify the process to suit their particular study. As was mentioned above, 

the preferred option was to build a top-down processing tool tailored to the 

methodology which would accept methods from other libraries to create plug-in or 

additional functions. For this study OpenEV is substituted for that purpose. 

Outside of the basic software, two other core components exist. These are the data 

relating to the specific polygons being extracted and the key to represent the 

colour values being studied. As with most aerial image analysis studies, defining 

and validating this key forms the major part of the work involved. The process is 

aided by the availability of data which could be classed as controlled, that is the 

knowledge of areas of roof and water, which can be used to reference the other 

values in terms of their deviation from these known values. 

As mentioned at the start of this general introduction, a commercial geographic 

software package was used to extract the coordinates from the vector data (a vital 

first step in the algorithm). This particular software was not chosen for any 

specific capabilities other than the fact that I already had a licence and wanted to 

focus on proving the premise of the study (as opposed to executing the function 

over a specially tailored package). There are a number of commercially available 

image processing packages which have relevance to this study. One of these, 

SOCET SET, could potentially assist the study. This software is a 

photogrammetry package developed by BAE Systems for working on aerial 

photography. It allows the user to capture three dimensional data points from 

overlaid aerial imagery. This is due to a system of triangulation based on the 

position of the cameras when the photograph was taken. As much of the data used 

to extract the sections of aerial photography being analyzed in this study was 

captured using this process, it is possible that the analysis could be completed at 

this point in data capture. I am referring here to map production, and the stage at 

which line and point data are taken from remote imagery. At this stage in map 

production it is possible that an add-on process linked to the photogrammetric 

software would allow the user to run an analysis of the polygon at the moment it is 

fully captured and coded, which in turn would mean that the area marker and 

associated polylines could then be given added data of value to the end user 

(spectral content of the polygon/ percentage land cover/ rough pasture/ 

impermeable surface area etc.). I decided against investigating this further for two 

reasons. Firstly it would have involved a difficult and time consuming 

collaboration with the software provider. Secondly, and more importantly, this 

country has been fully digitally mapped and photogrammetric work is now 

confined to update only. This means that the application of any algorithm at the 

photogrammetric/ data capture stage would be confined to small, mostly urban 

areas. 

It is difficult to discuss the manipulation of spatial data without reference to the 

ArcGIS package of software created by ESRI. This is widely used in both 

commercial and educational entities for interpreting spatial data. In terms of this 

study the ArcView package within ArcGIS could have been utilized for image 

processing once the input files were converted to shapefile format. The limitations 

imposed by having to obtain a licence for the software (outside of trial/ 

educational versions) precluded the use of this package. This is not to indicate that 

the software would not have been a useful tool for manipulating the imagery in the 

study, only that it was not practical at the time of the study. The value of this study 

is in allowing new data to be obtained and added to that originally obtained from 

an analysis of aerial imagery. While the study identifies one means of doing this, 

in terms of software and operating platform (i.e. executing the process using parts 

of the GDAL library and ASCII input files), there are potentially numerous other 

software processes that could apply. The algorithm, however, is intended to 

remain as independent of software considerations as possible; being constrained 

only by the quality of the remote imagery and the accuracy of the captured data 

points and associated coding. 

Looking briefly at some of the other commercially available desktop GIS software, 

it is intended that the methodologies suggested could be applied to these products. 

However, due to the limitations involved in both learning to use the software and 

licensing issues this application of the study was not explored. These products 

include AutoDesk, Microstation, the ESRI ArcView product mentioned in the 

previous paragraph, IDRISI and MapInfo among others, such as the 1Spatial 

Radius platform used to edit the geometric input data being used in this study. All 

of the above products are useful in the case of updating and editing, that is to say – 

dealing with change. This study looks at read only data and could be described as 

a way of interpreting already captured data. One result of this is that the functions 

required to store and update change polygons and data values are not needed in 

the proposed algorithm. The ability to connect the statistical data to the unique 

identifier for the polygon should be enough to allow it to be input as an attribute 

by the spatial database management system. Outside of analysis the GIS software 

requirements relate only to coordinate transformation. As a result, while storage 

(of the statistics) remains a consideration, the necessity for creating, editing or 

updating (moving points etc.) does not form part of the requirements for the study. 

Even though a specific analysis tool (in terms of a standalone executable) is not 

presented in this study it is possible to create one and add to the existing body of 

open source work. OpenEV, for example, allows for the addition of newly created 

functions written in Python. This programming language allows a user to 

interface with GIS applications written in C and has the potential to be a flexible 

means of accessing C libraries (e.g., accessing GDAL from OpenEV). It has been 

used to build different software libraries such as GeoDjango, Thuban, OpenEV, 

pyTerra, and AVPython. The language itself is not used in this thesis as it added 

another layer to the process but provides a possible means of packaging an 

extended experiment. One of the advantages of using Python as a programming 

language in preparing a GIS application is that the assignment of a variable does 

not have to indicate whether it is declaring a string, number, list etc. The variables, 

however, are case sensitive and, following the ESRI naming convention, use a combination of lower and 

upper case, beginning with lower case. The acronym for the variable is at the 

beginning of the name, while the descriptive part follows, beginning with an upper 

case (e.g. htElev). In addition to standard modules such as math and string, Python also has 

several geoprocessing modules available. One of these, arcgisscripting, accesses all the ArcToolbox 

tools. It should be noted that the geoprocessing object being 

called is accessed differently, depending on the version of ArcGIS being used. 

Another package called gdal (which accesses the geospatial data abstraction library 

being utilized in this thesis) allows for the manipulation of this library. In this 

Python module the language connects to the original GDAL implementation 

(written in C and C++) using the SWIG interface compiler. An 

example of Python in use in GIS is its application alongside ArcGIS; ArcGIS 

was built using hundreds of ArcObjects such as “featureClass”, “symbol”, “field” 

etc., each of which has properties and methods accessed by Python using dot 

notation. An example of this notation is the assignment of a variable name, tr = 

arcgisscripting.create(). 

Another possible method for implementing the process suggested in this study is 

through the .NET platform. In particular VB.NET provides a programming 

language that can be utilized to access GeoMedia software, which is a .NET 

oriented group of geographic software packages provided by Intergraph. This 

software allows the user to interact with ESRI shapefiles and also with spatial 

databases created using Oracle Spatial. It also allows for developed tools to tie in 

with graphical editing platforms such as Autocad and Microstation which would 

allow for interpretation of both images and associated geospatial data. This would 

also mean (given appropriate licence) that the developer could access specific 

ancillary products for database management (for databases based in Oracle) and 

others ranging from map production to 3D modeling. This programming language 

(VB.NET) was not used in this thesis because of the potential licensing issues 

which may have been involved. 

This thesis suggests procedures which lend themselves to C as they involve 

repeated loops of steps, from the pixel analysis to the classification of the 

polygons identified by the user through the region of interest, and this analysis 

is processor heavy. In 

general terms, of the programming languages used in GIS development (and aerial 

image analysis), the C programming language is the most widely used to interpret 

geographic information. Many analysis programmes such as MITAB and Shape 

Library use C as a means of accessing geographical data. One of the advantages of 

using C is that processor heavy functions such as the analysis of pixels in order to 

categorize them into shades and variations from a mean can be best achieved in 

this language. This is probably most evident from the fact that many of the open 

source programming projects, such as ImageTool or GDAL, have been written 

in C or C++. This thesis makes use of a small aspect of 

these libraries and as such in turn uses the C programming language. This is not to 

suggest that the default programming language for this type of study should 

necessarily be C, but that in order to make use of the available body of knowledge, 

previous studies can probably be best extended using C. It is important to note that 

the main problems encountered when analyzing aerial data/ imagery are those of 

co-ordinating the imagery so that it can be referenced and analyzed properly. The 

three main coordinate systems used for this are geographic, projected and pixel. 

This thesis uses a mixture of both pixel and geographic. The fact that it is 

necessary to use both for the relatively simple cutting and analysis of image 

segments demonstrates the importance to GIS programming of being able to 

transform coordinates. Geographic coordinates refer to latitude and longitude, 

while projected coordinates refer to a flat two dimensional coordinate structure. 

The C programming language (through the available libraries and its high level 

nature) provides an accurate means to execute these coordinate transformations. 

Note: Other options are available, such as the modules in .NET, and coordinate 

transformation is something which can be achieved across all programming 

languages once the correct math functions are accessed. 
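
The conversion between map coordinates and pixel coordinates mentioned above can be sketched in a few lines. The example assumes the six-value affine geotransform convention used by GDAL (origin, pixel size and rotation terms) and works on map coordinates such as ITM eastings and northings; it does not convert latitude and longitude, which would need a projection library. The function names and the numeric values in the usage note are illustrative only.

```python
import math


def pixel_to_world(geotransform, col, row):
    """Map a (col, row) pixel position to map coordinates.

    `geotransform` is (origin_x, pixel_width, rot_x,
                       origin_y, rot_y, pixel_height),
    with pixel_height negative for a north-up image.
    """
    origin_x, pixel_w, rot_x, origin_y, rot_y, pixel_h = geotransform
    x = origin_x + col * pixel_w + row * rot_x
    y = origin_y + col * rot_y + row * pixel_h
    return x, y


def world_to_pixel(geotransform, x, y):
    """Invert the affine transform to find the pixel containing (x, y)."""
    origin_x, pixel_w, rot_x, origin_y, rot_y, pixel_h = geotransform
    det = pixel_w * pixel_h - rot_x * rot_y
    if det == 0:
        raise ValueError("geotransform is not invertible")
    dx = x - origin_x
    dy = y - origin_y
    col = (pixel_h * dx - rot_x * dy) / det
    row = (-rot_y * dx + pixel_w * dy) / det
    # Floor so that any point inside a pixel maps to that pixel's index.
    return math.floor(col), math.floor(row)
```

For a hypothetical north-up tile with its top-left corner at (700000, 725000) and 0.25 m pixels, the geotransform would be (700000.0, 0.25, 0.0, 725000.0, 0.0, -0.25), and a vertex 10 m east and 10 m south of the corner would fall in column 40, row 40.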

Another factor which needed to be considered while reviewing programming 

languages for this thesis was the limitations that would occur due to the 

programming experience of the author. Ideally, a processing algorithm, written in 

C with a Visual Basic front end, which could also tie into OSI metadata, would be 

the preferred solution. This could then be extended to allow a user to zoom in on a 

map window, identify a subject area, review the available photography and target 

a selected study area. This is possible using existing systems but falls outside of 

what could reasonably be achieved with the available resources for the study. The 

main purpose of the study is to determine whether the methodology being 

suggested is applicable and whether useful data can be returned from this type of 

study. The degree of success indicates the feasibility of tailoring an application. 

The benefits of further developing the already well researched methods of analyzing 

imagery, in terms of aggregating the pixels and deriving statistical data from a 

selected tile of aerial photography, would be limited. This is because there is 

already a vast body of knowledge dealing with the subject available. In particular I 

am referring to work such as ImageTool, which, as noted earlier, was written in C++ and 

developed at the University of Texas, and is open source. It (along with several other open 

source image processing projects) effectively interprets imagery in terms of its 

spectral content. As I suggested earlier, one of the most important aspects of 

viewing and studying the surface of the earth remotely is the way it is projected. 

That is to project it into a format which can give valuable information to the user 

and this is where a large part of this study will focus. The study takes a captured 

and referenced coordinate grouping (set of data points) and uses these to analyze 

sections of the earth. These coordinate groupings are definite points along fixed 

boundaries which form physical barriers in terms of walls, streams, buildings and 

roads. These could probably be better explained in terms of a bull in a field. In 

general terms, any polygon which would prevent the bull from escaping forms a 

parcel, which is then analyzed. This means that the study areas are bounded by a 

series of fixed vectors/ polylines which are unlikely to deviate over time, allowing 

the suggested algorithm to be run over successive years of data capture. With a 

standard mean and control key for spectral values it should be possible to gain an 

insight into changes in land use in the specific semi-urban areas looked at in the 

study. 

The study areas could possibly be extended to rural areas over time. The reason 

why the study does not extend to these areas is that it would have to account for a 

much larger study area and less well defined boundaries. This would probably 

only be effectively done using tried and tested values derived from more fixed (i.e. 

no fuzzy data or hazy boundaries, where pixel values grade into one another and physical features such as 

man-made fences and walls are not present) polygons. 

The start of this study contains a glossary of terms, which are probably familiar to 

most readers, and the next three chapters will refer to some terms which are 

specific to this type of analysis. The first term is aerial imagery which is a 

reference to all the spectral data obtained during the study. When described as 

aerial imagery or raster polygons the reference is to an aerial image corrected to 

allow for distortions such as slopes so that it corresponds to the vector mapping. 

The second term which is repeated is the vector data, which refers to Ordnance 

Survey data which was captured through a mix of photogrammetry and field 

surveying. Although most readers are probably familiar with the .tiff file format it 

is probably worth noting that the input photography files for the study are in the 

GeoTIFF format. This format complies with the TIFF 6.0 standard and gives the 

input data the flexibility to be accessed in a wide range of programs, which 

allowed the imagery to be viewed outside the study software as the work was 

undertaken. The key metadata components of this file format for this study are the 

georeferencing coordinates which allow the sections being analyzed to be 

accessed. This format is also recognized by the GDAL library being used in the 

study. The projection used in all the files used in the study is ITM. This is not vital 

to the success of the algorithm but making use of an additional projection requires 

the inclusion of a transform function whenever the datasets intersect. The 

following three chapters form a chronological record of the study, starting with the 

suggested algorithm (Chapter 2), followed by the sampling section necessary for 

the basis of the procedure (Chapter 3) and finishing with a test on known polygons 

for specific search values (Chapter 4). 

2 Stepping through the Algorithm 

This thesis introduces a method for analyzing aerial imagery that can be translated 

into a procedure and run automatically. The operation is specific to two types of 

data: 

• Digital vector files from the Ordnance Survey. 

• Controlled aerial photography stored as GeoTIFF files. 

Both of these data sources are projected using ITM projection and are referenced 

during the study. The premise of the study is that it is possible to automatically 

capture additional information about area polygons from aerial photography using 

previously captured vector polygons as a guide. It is an attempt to fill in the blanks 

in terms of polygon attributes not included in the photogrammetry which led to the 

vector data. The focus is not primarily to obtain an accurate list of all polygons 

from the sample data but instead to identify a verifiable method for automatically 

doing so. As the focus is on the identification of methods, the process that is 

outlined can be extended to apply to searches for specific spectral qualities. In 

other words, someone searching for a particular crop type might employ the 

algorithm here, but add a target set of data specific to their work. In short, what 

follows is an attempt to take the two data sets mentioned above (photography and 

vector data), combine them and return a new set of information derived from 

both. The process does not merge the data sources but uses the vector data (a large 

portion of which was derived from the photography) as a reference to cut 

segments from the imagery and treat these segments as smaller manageable pixel 

collections for analysis. This process is helped by the fact that the content of many 

of these polygons is known and has been coded to the vector data. 
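
The cutting step can be illustrated with a small sketch that collects the pixels whose centres fall inside a polygon. It assumes the polygon vertices have already been converted to pixel coordinates and that the image is held as a list of rows of (red, green, blue) tuples. Both are simplifications made for this example, and the ray-casting test shown is one common choice rather than the method used by the study software.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is the point (x, y) inside the closed polygon?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def extract_polygon_pixels(image, polygon):
    """Return the RGB tuples of all pixels whose centre lies in the polygon.

    `image` is a list of rows; `polygon` is a list of (col, row) vertices.
    """
    cols = [p[0] for p in polygon]
    rows = [p[1] for p in polygon]
    # Restrict the scan to the polygon's bounding box, clipped to the image.
    min_col = max(int(min(cols)), 0)
    max_col = min(int(max(cols)) + 1, len(image[0]))
    min_row = max(int(min(rows)), 0)
    max_row = min(int(max(rows)) + 1, len(image))
    pixels = []
    for row in range(min_row, max_row):
        for col in range(min_col, max_col):
            if point_in_polygon(col + 0.5, row + 0.5, polygon):
                pixels.append(image[row][col])
    return pixels
```

Scanning only the bounding box keeps each polygon's pixel collection small, which is the property the method relies on when statistics are later calculated per polygon.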

What was completed in the sampling part of the thesis was an attempt to identify 

specific spectral qualities that can be applied to these known polygons, and then 

used to reference the unknown areas. This had a reasonable level of success with 

some polygon types making a more useful reference than others. A description of 

these can be found in the sampling section of this study. Automated aerial image 

analysis generally focuses on attempting to determine the values of the imagery 

from scratch. For an example of this type of work see Thomas Knudsen's 2005 

study on aerial image analysis. What is unique about this study is that it attempts 

to use previously captured data as a basis for further image interpretation. This is 

something which, from the research into the data and contact with the Ordnance 

Survey, has not been attempted before for Irish digital spatial data. All of the 

difficult remote sensing is complete (via the vector mapping) before this analysis 

begins, and control points, physical boundaries and closed polygons have all been 

identified. This study presents a method for developing software to extend the 

work completed, and assist users to identify specific traits in what would 

otherwise be an impossibly large store of imagery for the human eye to analyze 

(without using a team of trained analysts). The algorithm proposed here is 

essentially a way of looking for spectral values in small area polygons and 

comparing them to known values. 

The process outlined is for people intending to scan aerial imagery of Ireland for 

specific spectral properties. It is intended as an additional facility for users of 

aerial photography. At present it is possible for people to conduct specific research 

using the photography and a GIS tool. This algorithm is intended to make the 

process accessible to users who do not have the time or access to the resources or 

software licences to conduct this type of research. It can be used to identify 

specific land use types which are outside those currently captured by Ordnance Survey 

digital mapping and serve as an add-on tool for anyone using that type of data. The 

purpose in compiling the data and researching software for the study was to 

outline a method and as such the application of the method is dependent on the 

user. In the case of this study a lot of emphasis was placed on the identification of 

pasture. This is because it is the major form of land use in rural and peri-urban 

areas and its correct identification helps in limiting the search for other types of 

cover to a relatively narrow number of polygons. In a similar way someone could 

take the same steps, identifying unique statistical properties of the pixel count in 

the types of area being studied, and add them to the algorithm. This would involve 

adding the additional search to the cycle of flagged polygons at the end of the 

third part of the search execution. 

The process can also be coded into a standalone application or as an extension of 

existing software for a specific use (such as searching for crop disease). One 

example might be with the Python-based raster viewer OpenEV, where a user is 

analyzing aerial photography using the package. It is possible to extend the 

functionality of this to analyze statistical data using the GDAL library. A user 

concerned with a specific set of spectral values, or wanting to confine the research 

to a specific area polygon type within the image, could make use of the methods 

set out here to set the statistical function to return target data only (as opposed to a 

general application of the histogram function). In broad terms this study is for 

users of aerial raster imagery and the results of the sampling are based on samples 

of Irish data. It may be possible to execute similar studies for different regions but 

the small well defined polygon types with clear consistent (over large periods of 

time) boundaries are a vital part of the analysis. This is probably a result of 

relatively small property divisions and rigorous maintenance of the boundaries 

over hundreds of years and may be unique to Ireland. In short the study is a look 

at a possible coded routine to analyze the Irish landscape using all the available 

data. 

As mentioned in the previous paragraph this study is intended for users of aerial 

photography. The pre-requisites to this are that it is controlled and has the 

projection embedded in the file, and the users have access to Ordnance Survey 

vector data. Outside of these conditions the algorithm is intended for users who do 

not have a strong background in information technology as much as those who 

have a good knowledge of code and could easily convert the proposed steps into 

routines. The open source software described in the study has a familiar user 

interface to any GIS package (standard toolbars/ zoom/ measurement etc.) and it is 

possible for someone to run the algorithm without having to alter any of the steps. 

Ideally, however, the steps would be converted into an add-on to an existing piece of 

software that is being used (ArcView, Microstation etc.) so that the user can 

quickly run through large amounts of data. In this way the routine is designed for 

anyone who is interested in targeting specific properties of Irish topography that 

can be defined in terms of their spectral values. These areas range from forestry 

and agriculture to urban planning. The limitations of the study are in the quality of 

the imagery and it was shown that some potential applications of spectral analysis 

would not return accurate results. A study of sediment levels in drains or canals, 

for example, could not be developed using the methods outlined here because of 

the difficulty in getting a large enough sample to train the image key. 

The method outlined can be used against small area polygons, so can be applied to 

land use types across most of the country, with the exception of full urban areas 

and remote mountain areas (where the small land divisions are not found). It 

should be noted that the target areas described refer to peri-urban data. This is 

because fully urban areas are covered by large scale 1:1000 mapping and spectral 

analysis would not improve on the available data (outside of highly specialized 

heat radiation studies etc., which are not the intended use for this method). The 

method could also be used by someone seeking to trace patterns in land use over 

recent decades. Such a study would be confined to RGB photography, as a 

comparative analysis of the properties of the pixel counts by colour band is a 

requirement for the process. Once the proposed key is calibrated for the particular 

run of photography, the algorithm could be set to run for a specified region of 

interest across the period when this type of photography was available. 
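
One simple way to calibrate a key for a new run of photography is to shift every class mean by the difference observed in a known reference class such as water. The sketch below assumes a hypothetical key structure (a dictionary of class names mapped to per-band "mean" and "std" tuples) and a uniform offset. The real relationship between runs may not be a plain offset, so this should be read as an illustration of the idea only.

```python
def recalibrate_key(image_key, reference_class, observed_mean):
    """Shift all class means by the offset seen in a reference class.

    `image_key` maps class name -> {"mean": (r, g, b), "std": (r, g, b)}.
    `observed_mean` is the (r, g, b) mean measured for the reference
    class (for example water) in the new run of photography.
    """
    base = image_key[reference_class]["mean"]
    offset = tuple(obs - old for obs, old in zip(observed_mean, base))
    new_key = {}
    for name, entry in image_key.items():
        # Keep shifted values inside the valid 8-bit range.
        shifted = tuple(
            min(255.0, max(0.0, mean + delta))
            for mean, delta in zip(entry["mean"], offset)
        )
        new_key[name] = {"mean": shifted, "std": entry["std"]}
    return new_key
```

The original key is left untouched and a new one is returned, so a separate key can be kept for each run of photography in the region of interest.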

This differs from previous studies of automatic aerial photography analysis in two 

ways. Firstly, the focus is specific to Irish Ordnance Survey data and concentrates 

on making use of the codes and known values that can be extracted from this. 

Secondly, the study uses small area polygons to target the spectral analysis of the 

imagery to relatively small sample areas. This is to reduce the difficulty posed by 

variations in pixel values found in large samples. In this the process is 

unusual, as it takes the small area polygons as a guide and cuts the matching areas 

from the raster aerial imagery. This allows for automatic decisions to be made 

regarding the level of standard deviation present in the sample. In many ways this 

simplifies the process of image interpretation because most of the difficult image 

control work is completed and the software can then focus on variations specific 

to the ground cover being studied. While this has limitations on the extension of 

the work to other (general small scale) datasets it does outline a method for 

automating the search of imagery. Over the course of the last two years of this 

Masters study I have learned about how users can interrogate and manipulate data 

in large databases. 

Aerial imagery is a form of database. Once it is structured into tables it can be 

interrogated for spectral properties just like vector data in a spatial database. By 

using the boundary data points from the vector data the imagery is converted to 

manageable sections and properties can be determined and logged. This sub 

division of the image into a mosaic of areas, starting with known polygons, 

moving to polygons which can be easily classified (strong variation from the other 

known values with a low level of standard deviation such as cut pasture) and 

flagging any whose values fall outside the image key means the job of analyzing 

the image is made easier. This sub-dividing of raster imagery is something which 

has not been attempted with Irish Ordnance Survey data and aerial photography (to the 

best of my knowledge: a search of research papers found nothing comparable, and similar 

work had not been undertaken within the Ordnance Survey). The focus of the study 

is on proving that this method is practical, and can be applied to a variety of area 

types. The methods suggested by this study are unique to the area divisions and 

available vector data, and present the steps necessary to train an image key to look 

for specific properties in the Irish landscape. 
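
The order of work described above (known polygons first, then polygons that classify easily, then flagged polygons) can be sketched as a simple triage routine. The record layout, the class means and the two thresholds max_std and tolerance are all hypothetical values chosen for the example. They are not the values derived in the sampling chapter.

```python
def triage_polygons(polygons, image_key, max_std=12.0, tolerance=15.0):
    """Split polygons into known, classified and flagged groups.

    Each polygon is a dict with "id", an optional "code" (the class
    already recorded in the vector data), and per-band "mean" and "std"
    tuples. `image_key` maps class name -> expected (r, g, b) mean.
    """
    known, classified, flagged = {}, {}, []
    for poly in polygons:
        # 1. Polygons already coded in the vector data need no analysis.
        if poly.get("code") is not None:
            known[poly["id"]] = poly["code"]
            continue
        # 2. Find the key class whose mean is closest in every band.
        best_name, best_dist = None, None
        for name, key_mean in image_key.items():
            dist = max(abs(a - b) for a, b in zip(poly["mean"], key_mean))
            if best_dist is None or dist < best_dist:
                best_name, best_dist = name, dist
        uniform = max(poly["std"]) <= max_std
        if uniform and best_dist is not None and best_dist <= tolerance:
            classified[poly["id"]] = best_name
        else:
            # 3. Anything noisy or outside the key is left for review.
            flagged.append(poly["id"])
    return known, classified, flagged
```

Any additional search, such as one for a particular crop type, would then only need to visit the flagged list rather than the whole image.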

The process works by taking the point data from the polygons contained within 

vector data representing an area of an image. This point data is used to crop the area 

of the image that the polygon represents and to log pixel values for that area. This is 

repeated for every area in the region of interest. These are then compared to an 

image key and areas classified according to the presence of values specific to the 

key. One such key was developed during this thesis but could be re-calibrated for 

higher values. In other words, a higher mean for water bodies within a separate

run of photography would increase the corresponding key values by that amount.

Within the key are known values (water, forestry, roads etc.) and the proportional 

difference (in terms of the mean pixel count for values in the red, green and blue 

colour bands, and the levels of standard deviation) between these known values 

and search values (such as pasture) is measured against the histogram values for 

cropped polygons of unknown use and a category applied for matches. In other 

words the process steps through locating, cutting and analyzing small areas of the 

image to enhance the available data and search specific values across the whole 

image. 
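
The crop-and-log step described above can be sketched in a few lines. This is a
minimal Python illustration and not the software used in the study. It assumes the
image is held as rows of (red, green, blue) tuples and that the polygon vertices
have already been converted to pixel co-ordinates; the function names and the data
layout are hypothetical.

```python
from statistics import mean, pstdev

BANDS = ("red", "green", "blue")


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_band_stats(image, polygon):
    """Mean and standard deviation per colour band for pixels inside polygon.

    image:   rows of (r, g, b) tuples (assumed layout); each pixel is tested
             at its centre, i.e. (column + 0.5, row + 0.5).
    polygon: list of (x, y) vertices in pixel co-ordinates.
    """
    inside = [
        pixel
        for row_idx, row in enumerate(image)
        for col_idx, pixel in enumerate(row)
        if point_in_polygon(col_idx + 0.5, row_idx + 0.5, polygon)
    ]
    if not inside:
        raise ValueError("polygon covers no pixels")
    stats = {}
    for band_idx, band in enumerate(BANDS):
        values = [p[band_idx] for p in inside]
        stats[band] = {"mean": mean(values), "std": pstdev(values)}
    return stats
```

Repeating this call for every polygon in the region of interest yields the table of
values that is later compared against the image key.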

2.1 Initial Inputs 

The following four sections describe the work in terms of steps through the 

algorithm being proposed. This first section introduces the initial inputs 

required to define the region of interest to be analyzed: 

This study attempts to find an automatic method for image analysis using vector 

data as a reference, and in particular small area polygons and their associated 

coding. At the beginning of the proposed algorithm a user is required to input a 

region of interest for the process. This corresponds to the geographical area in 

which the user is interested. The most convenient way for someone to do this is to 

manually select the area from vector data or photography (or a combination of 

both displayed together) in a window on a PC. The result of this should

be a set of co-ordinates from which the study area can be extracted. 

The user also needs to input a sample target area for the study. This can be one of 

a set of values developed in this study or may take the form of a particular 

variation (such as a distinct type of crop etc.). In the second case a sample of the 

required value is needed. This can be obtained in the same way as the first part of 

the region of interest selection where, as mentioned above, the user manually 

selects the target area from a viewing window and the output is a set of co- 

ordinates. A second way this target data might be obtained would be from a co-

ordinate set input from a field survey completed using a mobile GPS device. In this

case the co-ordinates first need to be converted to the Irish Transverse Mercator 

framework, so as to allow the process to match them to the projection used in the 

photography and vector mapping. 
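
The conversion of field co-ordinates to Irish Transverse Mercator can be illustrated
as follows. This is a sketch only: it uses the published ITM parameters (GRS80
ellipsoid, origin 53.5°N 8°W, scale factor 0.99982, false origin 600000 m E,
750000 m N) with a standard truncated series for the Transverse Mercator projection,
and it assumes the GPS latitude and longitude can be treated as ETRS89 (a sub-metre
approximation). The function name is hypothetical; in practice a projection library
such as the one behind GDAL would normally be used instead.

```python
import math

# ITM (EPSG:2157) parameters: GRS80 ellipsoid, Transverse Mercator.
A = 6378137.0                       # semi-major axis (m)
F = 1 / 298.257222101               # flattening
B = A * (1 - F)                     # semi-minor axis (m)
E2 = (A * A - B * B) / (A * A)      # eccentricity squared
K0 = 0.99982                        # scale factor on the central meridian
LAT0, LON0 = math.radians(53.5), math.radians(-8.0)   # true origin
FE, FN = 600000.0, 750000.0         # false easting / northing (m)


def _meridional_arc(lat):
    """Scaled distance along the central meridian from LAT0 to lat."""
    n = (A - B) / (A + B)
    d, s = lat - LAT0, lat + LAT0
    return B * K0 * (
        (1 + n + 1.25 * n**2 + 1.25 * n**3) * d
        - (3 * n + 3 * n**2 + 2.625 * n**3) * math.sin(d) * math.cos(s)
        + (1.875 * n**2 + 1.875 * n**3) * math.sin(2 * d) * math.cos(2 * s)
        - (35 / 24) * n**3 * math.sin(3 * d) * math.cos(3 * s)
    )


def latlon_to_itm(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to ITM (easting, northing) in metres."""
    lat = math.radians(lat_deg)
    dlon = math.radians(lon_deg) - LON0
    sin_lat, cos_lat, tan_lat = math.sin(lat), math.cos(lat), math.tan(lat)
    nu = A * K0 / math.sqrt(1 - E2 * sin_lat**2)
    rho = A * K0 * (1 - E2) / (1 - E2 * sin_lat**2) ** 1.5
    eta2 = nu / rho - 1
    north = (
        FN + _meridional_arc(lat)
        + nu / 2 * sin_lat * cos_lat * dlon**2
        + nu / 24 * sin_lat * cos_lat**3 * (5 - tan_lat**2 + 9 * eta2) * dlon**4
        + nu / 720 * sin_lat * cos_lat**5
        * (61 - 58 * tan_lat**2 + tan_lat**4) * dlon**6
    )
    east = (
        FE + nu * cos_lat * dlon
        + nu / 6 * cos_lat**3 * (nu / rho - tan_lat**2) * dlon**3
        + nu / 120 * cos_lat**5
        * (5 - 18 * tan_lat**2 + tan_lat**4 + 14 * eta2 - 58 * tan_lat**2 * eta2)
        * dlon**5
    )
    return east, north
```

A point at the projection origin maps to the false origin, which gives a quick sanity
check before a full field survey is converted.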

The general flow of the first part of the algorithm being suggested is the user inputting

the region of interest and a required value from the image analysis, which are then 

converted to a format which can be compared against the data. In the case of the 

software used in this thesis this takes the form of a simple ASCII file containing a 

co-ordinate set, but a common format might also be the .shp file used by ESRI. 

The software required for this step in the proposed algorithm includes an 

application for viewing and analyzing raster data and capable of performing 

transformations on sets of co-ordinates. For this study four sets of libraries were 

used, packaged into open source applications known as OpenEV, Mirone and

GDAL. The vector data was clipped using an application which forms part of a 

geographical information system called Radius Vision. All the processes 

necessary for the first part of this algorithm can be performed using GDAL, with 

the exception of clipping to an irregular polygon, which is still under development 

(GDAL, 2010). There is a license requirement for the Radius software, which was 

used in this study for the step involving the user selecting the extent of the region 

of interest in the vector mapping. It should be noted that this can also be 

completed using any other vector mapping tools such as ArcView (which would 

create a .shp file). Another alternative is for the user to manually create an ASCII 

file of co-ordinates (with the convention of easting northing, separated by 

newline). This alternative can be frustrating for the user and the suggested process 

is to make use of software capable of designating the region of interest through a 

viewer. 
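
For readers who do create the ASCII file by hand, the convention described above
(one "easting northing" pair per line) can be read with a short routine. The sketch
below is illustrative only; the function names are hypothetical and the whitespace
separator between easting and northing is an assumption.

```python
def parse_coordinate_text(text):
    """Parse 'easting northing' pairs, one per line, into a closed ring."""
    ring = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected 'easting northing'")
        ring.append((float(parts[0]), float(parts[1])))
    if len(ring) < 3:
        raise ValueError("a region of interest needs at least three points")
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # repeat the first point to close the polygon
    return ring


def bounding_box(ring):
    """Return (min_easting, min_northing, max_easting, max_northing)."""
    eastings = [e for e, _ in ring]
    northings = [n for _, n in ring]
    return min(eastings), min(northings), max(eastings), max(northings)
```

The bounding box gives the rectangular extent needed to cut the matching section of
photography before the irregular polygon itself is applied.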

The data required for the aerial image analysis is Ordnance Survey ortho-rectified

colour aerial photography and matching digital mapping. The archive of aerial 

imagery goes back to the 1970s and the algorithm being suggested is designed to 

operate with any run of photography so users can discern dispersal patterns over 

time through successive photography dates. The process, however, makes use of 

the three colour bands present in colour photography and is limited to 

photography with the red, green and blue colour bands. The vector data used will 

take the form of 1:5000 or 1:2500 scale digital data and it is this data which forms 

the basis of the search process. The vector data has the region of interest divided 

into a mosaic of small area polygons the majority of which are coded according to 

their use or content. The aim of this thesis is primarily to automatically register 

additional data for those of unknown use type –and secondly to flag those of 

known use type with specified (spectral) anomalies from user requests. In order to 

be successful the data requires the coding, which may be useful to consider as a 

data hierarchy at this stage in the process. The following hierarchy is only for 

illustration. In practice each polygon will be analyzed according to its spectral 

content and placed in a unique set. Any relationships between those sets would be 

made post analysis by the user for the purposes of their particular survey. Having 

said that, the known polygons are:

Forestry –divided into categories of mixed, coniferous and deciduous. 

Water –divided into categories of stream, lake, river, drain, pond and reservoir. 

Road –divided into categories of motorway, national primary, national secondary, 

regional, third, fourth and track (also coded as footpath and forestry road). 

Buildings –variously coded as solid, dwelling and a variety of functions though for 

the purposes of the algorithm they will be treated as one data type (this is because 

they were found to be unreliable in terms of consistent spectral values and biased 

result sets from the spectral analysis). 

Two other aspects of this data, the presence of marsh and pasture symbols, can be

used to indicate known values for a polygon when found inside the bounding co- 

ordinates, though in the study these must be compared against the spectral key to 

ensure the symbol is representative of the entire polygon.

The output from this stage in the algorithm is two required and one optional data 

set. The first necessary return is an area of vector mapping, containing a mosaic of 

vector polygons divided by polylines representing real world physical boundaries 

between the areas and the ordnance survey coding related to this data. This was 

extracted using the extract_map function from the Radius GIS library, but this 

software is not a necessary requirement –the data can be exported to a different 

format and a similar procedure completed (using ArcView, Microstation, 

AutoCAD etc.), once the map projection and co-ordinate attributes associated with 

the vectors are retained. The study did not explore the possibility of creating a 

unique vector data cutting tool as it can be assumed anyone making use of this 

algorithm will have access to some level of mapping software (if not, similar open

source software can be obtained from Brazil’s National Institute for Space

Research under the SPRING project –see Appendix). The second necessary return

is a region of aerial photography matching the co-ordinate set outlined in the 

section cut from the vector data. This can be cut using the co-ordinate set outlined 

from the vector data. In this study the MATLAB based Mirone software 

developed in the University of Algarve for earth sciences was used to extract the 

study area from the photography and the input file was in ASCII format (although 

other formats, such as .shp could also be used). 

The third optional output from this stage in the process is a co-ordinate set for 

possible sample areas in the image for which a user is seeking to complete an

inventory. This could be selected from the image using the vector manipulation 

software described in the previous paragraph, or could be obtained from point data 

collected by the user in the field. If field data is used there are two requirements. 

Firstly that it makes a closed polygon so an area can be sampled for spectral 

values. Secondly, that the co-ordinates conform to ITM projection to match those 

of the imagery. Transformation of the co-ordinates can be achieved using 

gdalwarp (GDAL, nd.) which can then be used to extract the required pixel set for 

examination through software such as Mirone. If a specific value is required, then 

at least three samples are used to obtain a representative value for the image key. 

2.2 Area Extraction 

The following section introduces the second part of the algorithm, where it steps 

into a series of loops for cutting out known areas (via vector data) from the 

image: 

The second step in the process is to extract the known areas from the study area in 

the imagery. This involves creating a set of polygons conforming to the coded 

values and excluding them from the image search. This set is then either flagged 

for analysis further in the algorithm (should a target area search have been created 

by the user in the first step) or placed in a holding set for inclusion in the statistical 

data output at the end of the analysis. The output from this step should be a set of 

unknown polygons and their associated raster image sections, along with the sets 

of known polygons. 
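
The split into known and unknown polygons can be sketched as a single pass over the
vector data. This is an illustration under assumptions: each polygon is represented
as a dictionary with an "id" and a "code", and the code labels below are hypothetical
stand-ins for the Ordnance Survey coding rather than the actual values.

```python
# Hypothetical labels standing in for the coded polygon types.
KNOWN_CODES = {"forestry", "water", "road", "building"}


def partition_polygons(polygons):
    """Split polygons into sets of known type and a list of unknown use.

    Returns (known, unknown) where known maps each code to its polygons.
    """
    known, unknown = {}, []
    for poly in polygons:
        code = poly.get("code")
        if code in KNOWN_CODES:
            known.setdefault(code, []).append(poly)
        else:
            unknown.append(poly)  # uncoded or unrecognised: needs spectral analysis
    return known, unknown
```

The known sets can be held for the final statistical output (or flagged when a target
search was requested), while only the unknown list moves on to the spectral
comparison.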

For this study the software employed for this step in the process was Radius (to 

obtain co-ordinate data for cropping the image into area polygons) and Mirone (to 

crop the image). This thesis was written with a view to operating on a new spatial 

database being developed for ordnance survey data. This database will have the 

capacity to return sets of values in GML form, which would mean that the input 

sets for this step in the algorithm would be more easily obtained by extracting 

… using a text editor to create a master input 

set. For the purposes of this study, however, the input files were created in ASCII

format from the polygon co-ordinate sets outlined in Radius (by copying and 

pasting). These co-ordinate sets were then imported into the Mirone software and 

used to create closed polygons, which in turn allowed the target areas to be 

exported. It should be noted that this is a user intensive process and was used only 

to test the theory being proposed in this thesis. There are many ways in which this 

part of the image analysis process could be automated and the process time 

reduced, but the focus of this study is to prove that usable data can be obtained 

using the methods outlined, and these options were not expanded on.

The areas extracted were placed into sets according to their nature and the total 

area of the sets recorded (using the area property associated with the input vectors 

–note these can also be calculated at the raster extraction stage using the Mirone 

measurement function). For each run of imagery the image key needs to be

reset to match the spectral values present, and the area sets created at this stage 

can then be used for this calibration. The algorithm itself is concerned with the 

proportional difference between the pixel values in the polygons so new values 

can be applied for the image key using the methods outlined during the sampling 

section of this thesis. This means that, should a new set of photography be used,

the initial sets for this stage are analyzed to create new baseline data. In other

words, for each polygon of road, section of coniferous and mixed

forestry, river, lake and pond (though not reservoir, as it did not return reliable mean

pixel values during the sampling), the mean pixel value and standard

deviation by colour band are recorded and averaged by group. These averages in

turn are tested against the expected proportional relationships between them and once

verified are applied as image keys for that particular run of photography. In the 

case of this study this was completed using PCI Geomatics Geomatica software,

which returned statistical data for the clipped polygons using the analysis function 

against the red, green and blue colour bands. As with the step as a whole, this 

process could benefit from an application specific to the algorithm which would 

return these values for the purposes of creating an image key alone. 
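
An application of the kind suggested above would need little more than a grouping and
averaging step. The sketch below shows one possible shape for it; the input format (a
list of group names paired with per-band mean and standard deviation) and the
function name are assumptions made for illustration.

```python
from statistics import mean

BANDS = ("red", "green", "blue")


def build_image_key(samples):
    """Average per-polygon statistics by group to form an image key.

    samples: list of (group, {band: (mean, std)}) pairs, one per known polygon.
    Returns {group: {band: (average mean, average std)}}.
    """
    grouped = {}
    for group, stats in samples:
        grouped.setdefault(group, []).append(stats)
    key = {}
    for group, stat_list in grouped.items():
        key[group] = {
            band: (
                mean(s[band][0] for s in stat_list),  # average of the means
                mean(s[band][1] for s in stat_list),  # average of the std devs
            )
            for band in BANDS
        }
    return key
```

The resulting key would still need to be checked against the expected proportional
relationships before being applied to a run of photography.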

This stage of the algorithm does not require any user input (the type of 

photography is entered at the start of the analysis). The outputs are a series of sets 

of polygons and their associated raster image areas containing the image 

projection (in GeoTiff format). Once the input data for this stage is available in 

GML/ XML format then it should be possible to code a series of iterative loops to 

set up the sets and reduce the amount of remaining polygons required for spectral 

analysis. These improvements on the step being described would serve to speed up

the analysis and make the process neater for the user, but they were omitted so that

the study could first establish whether the process was worthwhile. The focus was

therefore on defining relational values and determining if the vector/raster analysis hybrid

model would reveal useful data for the user.

2.3 Spectral Value Comparison 

The third part of the algorithm consists of a series of procedures to assign 

known areas and areas with values that can be determined from the known sets: 

This part of the algorithm involves comparing the spectral values for the unknown 

polygon types (from step 2). The first part involves creating statistical histogram 

data for all of these polygons and comparing it to an expected value key for 

classification according to land use type. Areas which do not conform to known 

values are placed in a set for further analysis while those matching are categorized 

according to their values. The first part of this step involves verifying any 

polygons which were found to include descriptive symbols from the vector data 

(marsh, pasture -note: pasture in this case refers to known areas of rough pasture). 

Following from this an analysis according to spectral values was completed, and 

the set of neighbouring polygons (taken directly from the vector data set) 

examined to see if neighbouring areas might influence the result. For

example, if an area with a set of pixel values close to those expected for pasture

was identified but displayed a high level of standard deviation, this area was assigned

to the pasture set if three or more neighbouring polygons contained pasture (as the

deviation is probably caused by shade in the image); otherwise the area is

flagged for examination of the histogram results later in the process –to see if a

double spike in the red and green colour bands is present.
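
The neighbour rule in this example can be written as a small decision function. The
sketch below is illustrative: it assumes the spectral comparison has already decided
whether the mean values are close to pasture and whether the standard deviation is
high, and the function and label names are hypothetical.

```python
def resolve_pasture_candidate(mean_is_pasture_like, std_is_high,
                              neighbour_classes, min_pasture_neighbours=3):
    """Apply the neighbour rule to a polygon whose means resemble pasture.

    neighbour_classes: class labels of the polygons sharing a boundary,
    taken from the vector data set.
    """
    if not mean_is_pasture_like:
        return "not pasture"
    if not std_is_high:
        return "pasture"
    pasture_neighbours = sum(1 for c in neighbour_classes if c == "pasture")
    if pasture_neighbours >= min_pasture_neighbours:
        # High deviation is attributed to shade within a pasture area.
        return "pasture"
    # Otherwise defer to the later histogram check for a double spike.
    return "flag for histogram check"
```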

In order to complete this step the algorithm cycles through a number of relative 

values to determine the probable land area of the polygon being analyzed. For this 

section of the study the histogram values were exported from the Geomatica

software package as tables and graphs and compared manually. In order to

complete the same task on a larger scale this process would be coded into a

routine taking the statistical (image) data and image key as input and outputting 

the closest match. For example, the mean data by colour band would be compared

to the values for roads; if a 50% decrease in the red, 40% in the green and 50%

in the blue colour bands was detected, the polygon would then be compared to

water. If a 70% increase in red, 55% in green and 20% in the blue colour

bands was then detected, the standard deviation would be checked for its range outside

an expected value of 10, allowing the polygon to be coded as pasture. This routine

is not coded here but was executed using a set of comparative tables.
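
A routine of this kind, taking band means and a set of expected patterns and returning
the closest match, might look like the sketch below. It is an illustration rather than
the procedure executed in the study. The pasture pattern uses the decreases against
roads quoted above; the bog and mixed forestry patterns use the approximate decreases
reported in the road sampling of Section 3.1. The function names and the
sum-of-differences distance are assumptions.

```python
BANDS = ("red", "green", "blue")

# Expected percentage change in band means relative to the road key.
ROAD_PATTERNS = {
    "pasture": {"red": -50.0, "green": -40.0, "blue": -50.0},
    "bog": {"red": -30.0, "green": -30.0, "blue": -30.0},
    "mixed forestry": {"red": -55.0, "green": -45.0, "blue": -45.0},
}


def percent_change(sample, reference):
    """Per-band change of the sample mean relative to the reference, in percent."""
    return {b: 100.0 * (sample[b] - reference[b]) / reference[b] for b in BANDS}


def closest_match(sample, road, patterns=ROAD_PATTERNS):
    """Return the pattern name whose expected changes are nearest the observed ones."""
    change = percent_change(sample, road)

    def distance(name):
        return sum(abs(change[b] - patterns[name][b]) for b in BANDS)

    return min(patterns, key=distance)
```

Because the pasture and mixed forestry patterns are close to one another, a practical
routine would also compare standard deviation before accepting a match.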

At the beginning of this step the image consists of several sets of known polygon 

types, the clipped image polygons, associated vector codes and an image key. 

After completion of the step there are several more known polygon sets and a set 

of unknown areas which fell outside the ranges expected. This may be the result of 

the samples being biased by high levels of shade or the fact that they represent a 

transitional data type (bog to rough pasture etc.). These remaining polygons are 

further analyzed in the next step but this stage of the algorithm is used to classify 

as many known values as possible. These were obtained through a series of 

comparative steps as follows: 

The sampling during this study pointed to a number of interdependent 

relationships between the spectral values found in the polygons studied. The fact 

that the polygons are clearly defined (through the vector mapping) and that the 

content of many of the polygons in the image is known prior to analysis

means that the algorithm can focus on identifying a narrow range of additional 

area types. To achieve this it is necessary to loop through a series of criteria for 

four main land types; pasture, rough pasture, bog and marsh. The last two on this 

list will have identifying symbols present in most cases, which can be used to 

assist the automatic search. Once the four target areas have been identified (with 

pasture being the main land use in most semi-urban imagery) the remaining 

polygons form a small set of areas which are further analyzed in the next step of 

the process. 

The polygon is analyzed using the Geomatica analysis tool and the histogram

values exported for comparison to the known values. If the sample has a mean

value for the red colour band 40% lower than that of road polygons, the blue

colour band displays a similar 40% lower mean value than roads, and the

standard deviation is lower than a value of 15, then the sample is matched against water:

if the mean for the red colour band is close to three times that of water, and the

green value is close to twice that of water, the polygon is coded as pasture.

If the sample does not match the above criteria but has a mean pixel value for the

red and green colour bands close to half that of roads, displays a level of

standard deviation three times that of roads, and has red and green values

close to double those of the water polygon (taken from the image key),

then the polygon is coded as rough pasture.

If the sample has red values around 30% lower than mixed forestry, and 20% 

lower in green for the same known polygon type and the standard deviation 

remains within 10% of the mixed forestry then the polygon can be coded as marsh 

(usually a symbol under the level is present within the polygon in the vector

dataset, but not always). 

If the sample does not match those criteria outlined so far but has a low standard 

deviation across all three colour bands and contains a decrease in the mean value 

of over 30% for all colour bands when compared to the known road values then 

the sample is tested for area size; if it is above the maximum value for pasture then

it is coded as bog. 
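
The four comparative steps above can be arranged as a cascade of rules. The sketch
below is one possible reading of them and not the procedure used in the study, which
was carried out with comparative tables. Several choices are assumptions: "close to"
is taken as a 15% relative tolerance, "40% lower" is read as roughly 60% of the road
value, standard deviation is averaged across the bands where a single figure is
compared, and the low standard deviation limit for bog reuses the value of 15. All
names and the data layout are hypothetical.

```python
BANDS = ("red", "green", "blue")


def near(value, target, tol=0.15):
    """True if value lies within a relative tolerance of target (assumed 15%)."""
    return abs(value - target) <= tol * abs(target)


def avg_std(entry):
    """Average standard deviation over the three colour bands."""
    return sum(entry["std"][b] for b in BANDS) / len(BANDS)


def classify(sample, key, max_pasture_area, low_std=15.0):
    """Classify an unknown polygon against the road, water and forestry keys.

    sample: {"mean": {band: v}, "std": {band: v}, "area": m2}
    key:    {"road" | "water" | "mixed_forestry": {"mean": {...}, "std": {...}}}
    """
    mean, std = sample["mean"], sample["std"]
    road, water, forest = key["road"], key["water"], key["mixed_forestry"]

    # Pasture: about 40% below roads in red and blue, low deviation,
    # red near three times water and green near twice water.
    if (near(mean["red"], 0.6 * road["mean"]["red"])
            and near(mean["blue"], 0.6 * road["mean"]["blue"])
            and max(std.values()) < low_std
            and near(mean["red"], 3 * water["mean"]["red"])
            and near(mean["green"], 2 * water["mean"]["green"])):
        return "pasture"

    # Rough pasture: red and green near half of roads, deviation near three
    # times roads, red and green near double water.
    if (all(near(mean[b], 0.5 * road["mean"][b]) for b in ("red", "green"))
            and near(avg_std(sample), 3 * avg_std(road))
            and all(near(mean[b], 2 * water["mean"][b]) for b in ("red", "green"))):
        return "rough pasture"

    # Marsh: red about 30% and green about 20% below mixed forestry,
    # deviation within 10% of mixed forestry.
    if (near(mean["red"], 0.7 * forest["mean"]["red"])
            and near(mean["green"], 0.8 * forest["mean"]["green"])
            and near(avg_std(sample), avg_std(forest), tol=0.10)):
        return "marsh"

    # Bog: low deviation, every band over 30% below roads, area above the
    # maximum expected for pasture.
    if (max(std.values()) < low_std
            and all(mean[b] < 0.7 * road["mean"][b] for b in BANDS)
            and sample["area"] > max_pasture_area):
        return "bog"

    return "unknown"
```

Polygons returned as "unknown" correspond to the remainder that is passed to the next
step of the process.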

The remaining polygons after this step in the algorithm fall into two categories – 

those surrounding buildings or with mixed use and those with a high level of 

shade present. Further analysis is required to step through the remainder to

identify areas that have a homogenous pixel value but a higher level of standard

deviation due to levels of shade, and those of mixed use. The output from this part

of the process is six further area sets: pasture, rough pasture, marsh, bog,

areas with high standard deviation for further analysis, and areas containing

building polygons (automatically flagged through the vector data).

2.4 Confirmation 

The final stage of the algorithm involves the reduced data set being stepped 

through for manual confirmation (or compared against an additional set of 

values determined by the user): 

This part of the algorithm is concerned with tidying up some of the remaining data 

from the previous sweeps through the polygon. To begin with the polygon set 

classified as pasture is selected and analyzed for differences in the mean values of 

the red and green colour bands. Those with mean values above 190 on the 

converted greyscale in red, and 200 on the scale in green are classified as cut 

pasture (while initially this may not appear to be of direct value to the user, it 

could help with any subsequent analysis of pasture in particular). 

The next loop is designed to remove polygons containing homogenous pixel 

values whose standard deviation has been biased by a high proportion of shade in 

the sample. It involves checking the histogram for two peaks (one for shade and 

one for pasture) in the pixel count. If present, the polygons are assigned to the 

pasture polygon set. 
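
The two-peak check can be sketched as a simple search for well-separated local maxima
in a band histogram. This is an illustration only: the smoothing radius, the minimum
peak height and the minimum separation between peaks are assumed values that would
need tuning against real imagery, and the function names are hypothetical.

```python
def smooth(hist, radius=2):
    """Moving-average smoothing to suppress single-bin noise."""
    n = len(hist)
    out = []
    for i in range(n):
        window = hist[max(0, i - radius): min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def find_peaks(hist, min_share=0.1, min_separation=30):
    """Return the bin indices of well-separated local maxima.

    min_share:      a peak must reach this fraction of the tallest peak.
    min_separation: minimum distance in grey levels between accepted peaks.
    """
    smoothed = smooth(hist)
    floor = min_share * max(smoothed)
    candidates = [
        i for i in range(1, len(smoothed) - 1)
        if smoothed[i] >= floor
        and smoothed[i] > smoothed[i - 1]
        and smoothed[i] >= smoothed[i + 1]
    ]
    peaks = []
    # Accept the tallest candidates first, skipping any too close to one kept.
    for i in sorted(candidates, key=lambda idx: smoothed[idx], reverse=True):
        if all(abs(i - p) >= min_separation for p in peaks):
            peaks.append(i)
    return sorted(peaks)


def has_shade_and_pasture_peaks(hist):
    """True if the histogram shows exactly two distinct peaks."""
    return len(find_peaks(hist)) == 2
```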

The process was completed using the geomatica software to extract the statistical 

data from the polygons of raster imagery (extracted earlier in the process using a 

combination of ASCII data from the vector manipulation software and the Mirone 

clipping function). The remaining areas were cross checked with the polygons 

containing buildings other than those coded as dwellings. Those found not

containing a building polygon are retained for further analysis and logged 

according to adjacent polygon types (E.g. 123445.34 

232234.34 etc. –neighbouring road, building polygon, pasture –area 7658m2). 

This was completed manually for the study using the Radius software and GeoTiff 

referencing but would be best completed inside a routine for larger samples. These 

unknown polygons can then be visually referenced by a user and manually 

categorized (displayed according to an input co-ordinate set returned from this 

algorithm through software such as Mirone). The result of this sample image study 

was that only a small area of the original search area was processed at this stage in 

the study. 

3 Sampling for the Baseline Image Key 

Figure 1: Aerial view of sample area 

The main body of research in this study involved identifying areas which would 

make useful benchmarks for an automated image analysis to use as a search key; 

ten areas were selected for inclusion as they formed the most distinct sets which 

could be used. These ten area types sampled were: Roads (of all classes), Water

(Lake, River, Stream, Drain and Pond), Marsh, Coniferous forestry, Mixed 

Forestry, Track, Shade (to obtain reference values when a high level of standard 

deviation occurred), Building (roofs), Pasture and Rough pasture. Of these 

sample values four could not be determined from the vector data (Pasture, Marsh, 

Rough pasture and Bog) and were used to test the ability of the process to identify 

target values by their relationship to known values. The next section describes the 

findings for these sampling areas. 

3.1 Roads 

Figure 2: Road area and surrounding detail 

This part of the study looked at sample sections of road (tarred/ hard cover) to see 

the relationship between the mean spectral value in these areas and the image as a 

whole. In general terms areas of road appear lighter than other parts of an aerial 

image due to increased reflection along the surface and this was borne out by a 

mean greyscale pixel value of 30% above the image average across the three 

colour bands. 

The study involved using OpenEV (open source raster imaging tool based on the 

GDAL library) and PCI Geomatics Geomatica geospatial viewing application. 

The files were exported as GeoTiff files from the original image using the 

GDAL_export facility. In all, ten regions were sampled. These sample areas were

taken from within the road polygons (as opposed to sampling the entire polygon) 

in order to identify true baseline data for these features. Sampling the entire 

feature would also have meant including areas obscured by tree cover, and 

necessarily biased the results –the intention of this part of the study was to create a 

benchmark against which tolerances for deviation could be included. 

The ten sample areas were taken from a series of roads in the south east of the 

image and three sections of the national primary road running along the north of 

the image (the pixel representation in the example below is to illustrate the area 

being sampled but is at a lower resolution than was used in the study). 

Figure 3: Road area and vector data 

In general terms the sample areas had an equal distribution of values across the 

red, green and blue colour bands when compared to the image as a whole and it

was not possible to discern any unique variation on the proportion of pixels in 

each of these bands contained in the road polygons. The results, however, did not 

deviate to any great extent between the samples and the mean greyscale values in 

the samples remained consistently higher than those of the image as a whole. This 

was significant as the variation remained at around 30% higher for each band in 

each sample (34.4 on av. in red, 26.6 on av. in green and 36 on av. in blue). 

Road value sample    Mean pixel value
                     Red        Green      Blue
1                    215.5      243        194.7
2                    206.444    234.5      157
3                    202.3      228.8      183.9
4                    163.083    189.75     147.667
5                    171.417    196.083    153.333
6                    169.444    190.111    151.444
7                    167.111    190.444    155.889
8                    155.667    185.25     145.833
9                    148.417    172.25     143.833
10                   187.061    211.545    173.364

Table 1: Road sample values 

(The previous samples were compared to statistics from the image as a whole of 

Red: Grey Level Values: 6 – 255, Median: 115, Mean: 111.711, StdDev:27.2804; 

Green: Grey Level Values: 30 – 255, Median: 135, Mean: 136.542, 

StdDev:27.3776; Blue: Grey Level Values: 5 – 255, Median: 102, Mean: 102.636, 

StdDev:17.8523) 

The mean pixel values are a useful benchmark to base further analysis of the 

imagery on and the fact that each sample displayed a more or less uniform 

deviation from the image standard implies that there is some merit to applying the 

results to a key which identifies impermeable surface area. In general such areas 

in an urban area will contain similar properties to a road surface. Further iterations 

of this sampling will involve sampling shingle against concrete and tar (see the 

analysis of the spectral values contained in areas of track) in order to see if there is 

a measurable spectral variation between them; this will, however, involve a small 

amount of post processing to enhance the differences between them. 

The road network is a useful point of reference in automated image analysis and 

ways of analyzing imagery to capture road networks have been well studied (such 

as pattern analysis techniques explored by van der Werff & van der Meer, 2008). 

In this thesis the focus is not on capturing road data, something which has been 

completed and is on a continuous revision cycle, but on utilizing this data to make 

the image analysis process easier. Most study areas (and almost every part of the 

island of Ireland) will have at least some part of the road network present (at the 

risk of sounding pedantic this assumption is not verified here because it can be 

assumed as general knowledge and can be discerned from a cursory look at any 

online small scale representation of the network). This means that there are 

polygons of a specific unique spectral value available for referencing most studies, 

even when confined to a particular area or series of photographs. In order to get a 

clearer insight into how this can be applied three areas close to roads around the 

image were sampled and compared to sample values from their closest road

polygon. 

Road test sample 1                   Mean pixel value    Standard deviation
Red                                  205.889             8.18
Green                                228.167             11.289
Blue                                 178.778             12.73

Adjacent spectral values sample 1    Mean pixel value    Standard deviation
(pasture)
Red                                  91.539              6.573
Green                                138.706             7.504
Blue                                 92.519              10.84

Table 2: Road test sample value 1 

The first test sample took an area of road and a sample area from an area of 

pasture adjacent to the road. This can be assumed to be a recurring set of values 

that can be located in most aerial imagery that this study is considering. In the test 

the values for standard deviation from the mean across all three colour bands in 

both samples did not vary to any large extent and can be omitted as reference 

values for the search algorithm looking to match a set of values as pasture using 

the nearest road polygon as key. In contrast there was a large difference in the

mean values for all three colour bands: measured against the road polygon, the pasture

means were roughly 50% lower for the red, 40% lower for the green and 50% lower for the

blue colour band. This discernible difference allows a range surrounding this

relative difference to be included into the search and areas of pasture to be 

identified. Note: potential candidates for a pasture set of polygons are also cross 

checked against other relative values, outlined in later sections of the sampling

study in this thesis. As a result of this variance, once the known areas

are identified the road set can be compared against unknown polygons adjacent to 

it and, given similar levels of standard deviation and proportional mean by colour 

band outlined above, the unknown polygons can be placed in a pasture polygon 

set for further reference and confirmation as the analysis progresses. 
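
The adjacency-driven search described here can be sketched as follows. This is an
illustration under assumptions: the expected proportional drops relative to the road
mean are the approximate figures for the first test sample, the tolerances are
invented for the example, and the adjacency table (road polygon to neighbouring
unknown polygons) is assumed to come from the vector data. All names are
hypothetical.

```python
BANDS = ("red", "green", "blue")

# Approximate drop in mean relative to the adjacent road, by colour band.
EXPECTED_DROP = {"red": 0.50, "green": 0.40, "blue": 0.50}


def is_pasture_candidate(road, unknown, drop_tol=0.10, std_tol=5.0):
    """Compare an unknown polygon with its adjacent road polygon.

    road, unknown: {band: (mean, std)}.
    drop_tol and std_tol are illustrative tolerances, not values from the study.
    """
    for band in BANDS:
        road_mean, road_std = road[band]
        mean, std = unknown[band]
        drop = (road_mean - mean) / road_mean
        if abs(drop - EXPECTED_DROP[band]) > drop_tol:
            return False  # proportional mean outside the expected range
        if abs(std - road_std) > std_tol:
            return False  # standard deviation not similar to the road
    return True


def pasture_candidates(road_stats, unknown_stats, adjacency):
    """Collect unknown polygons adjacent to a road that resemble pasture.

    adjacency: road id -> ids of unknown polygons sharing a boundary.
    """
    found = set()
    for road_id, neighbours in adjacency.items():
        for poly_id in neighbours:
            if is_pasture_candidate(road_stats[road_id], unknown_stats[poly_id]):
                found.add(poly_id)
    return found
```

Using the values from the first test sample, the pasture polygon falls within these
tolerances of its adjacent road, with drops of roughly 56%, 39% and 48% in the red,
green and blue bands.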


Road test sample 2                   Mean pixel value    Standard deviation
Red                                  157.961             7.537
Green                                181.211             8.08
Blue                                 145.553             10.129

Adjacent test values sample 2        Mean pixel value    Standard deviation
(bog)
Red                                  111.479             7.086
Green                                125.162             8.764
Blue                                 98.326              12.152


This sample looked at an area of bog adjacent to a road for discernible pixel

variations between both. Although bog will be sampled further at another stage in

this thesis, it is worth noting that all bogs have access roads nearby and a spectral

comparison is a useful reference. The surface of the roads mentioned varies but, as

is outlined in the road section of this thesis, surface type has only a small effect on

the range of spectral values returned from the road. The standard deviation for all 

three colour bands was almost identical between both samples (road and bog) 

which could be expected due to the relative uniformity of surface cover (from a 

medium altitude aerial perspective). The mean pixel value for these colour bands, 

however, varied almost uniformly across the three bands with a 30% smaller value 

obtained for the red, green and blue bands in the bog sample. As with road and 

pasture this is a strong proportional variation for analysis purposes and allows bog 

to be established in an initial polygon set during processing. The set can then be 

compared against the other expected variances (water, forestry, pasture) and 

matched against polygon size (bog will almost always be the largest area polygon in

any sample –other large polygons such as lakes and forestry are coded and can be 

automatically placed in a set during analysis). 


Red 199.833 10.912 

Green 218.667 16.52 

Blue 163 165 

Adjacent test values sample 3 

(Mixed forestry) 


Red 88.365 28.944 

Green 120.627 29.751 

Blue 90.361 20.369 


The third set of samples for referencing adjacent data to the spectral values from 

road polygons was a section of mixed forestry close to a road (gravel track). The 

samples displayed a notable difference in the level of standard deviation for all 

three colour bands. This level of deviation is consistent with other samples of 

mixed forestry (and rough pasture) analyzed in this thesis; with roads displaying a 

deviation of approximately one third of the mixed forestry values for the red and 

green colour bands. The differences in mean pixel values for all three colour 

bands were also distinct: 55% lower for mixed forestry in the red, 45% lower in the 

green and 45% lower in the blue colour bands. This variation, together with the 

level of standard deviation, provides a useful quality assurance value set to test the 

accuracy of the derived values being estimated in the algorithm. Since the areas of 

mixed forestry and road are both present as polygons in the vector data set the 

pixel sets for each can be extracted and matched; any values falling outside a 

range close to the above expected variances would flag an issue with either the 

photography or vector data. 
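
The quality assurance check described above could be sketched as follows. This is a minimal illustration and not the implementation used in the study: the function name `flag_band_mismatch`, the tolerance of 0.10 and the expected ratios (derived from the "55% lower" and "45% lower" figures quoted above) are assumptions made for the example. Only the red and green bands are used, with the mean values taken from the sample 3 figures.

```python
# Expected forestry/road mean ratio per band, derived from the percentages
# quoted in the text (55% lower in red, 45% lower in green).
EXPECTED_RATIO = {"red": 0.45, "green": 0.55}


def flag_band_mismatch(road_means, forestry_means, tolerance=0.10):
    """Return the bands whose forestry/road mean ratio falls outside
    the expected ratio +/- tolerance (tolerance is an assumed value)."""
    flagged = []
    for band, road_mean in road_means.items():
        ratio = forestry_means[band] / road_mean
        if abs(ratio - EXPECTED_RATIO[band]) > tolerance:
            flagged.append(band)
    return flagged


# Mean pixel values from the road and mixed forestry samples above.
road = {"red": 199.833, "green": 218.667}
forestry = {"red": 88.365, "green": 120.627}

# An empty list means both bands sit within the expected variance.
mismatches = flag_band_mismatch(road, forestry)
```

A non-empty result would mark the polygon pair for inspection of either the photography or the vector data.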

3.2 Water 

Figure 4: Typical Water Area Image 

This section took a look at four water samples present in the sample image 

(comprising three sections of lake and one of river) to see if the spectral values 

could be used to control and calibrate image processing of land polygons. The 

results were good and indicated several unique properties for this cover that could 

be used to calibrate a key in relation to surfaces being studied. The percentage of 

the image covered by water was proportionally small but the lake section made up 

the biggest single polygon. Of the water polygons, the majority were drains and 

several streams (ranging from three meters to less than a meter in width); 

there was also a river and a lake present. The streams and drains were 

eliminated from the study. This was for two reasons: firstly, they have very small 

width (less than a meter in some cases) and were often obscured by overhanging 

vegetation; secondly, they are already captured and any spectral analysis would 

only be of use as comparative values to use against the rest of the image, which 

was not possible due to the vegetation. 

Variations between the samples were slight, suggesting that lower flown 

photography would be necessary to compile any useful information regarding 

sediment levels, but allowing good baseline figures to be derived from the values 

present. The pixel value was almost uniformly two thirds less than the image 

average for the red colour band, with a low standard deviation in all samples. 

There was also a value on the green colour band of just 50% (with little variation) of 

the image mean for all samples. Similarly the value returned for the blue colour 

band was 20% less than the image average. 

Water Sample   Band    Mean Pixel Value   Standard Deviation
1              Red     36.455             4.568
1              Green   69.214             7.254
1              Blue    81.614             13.217
2              Red     36.119             4.562
2              Green   69.524             6.944
2              Blue    83.718             12.256
3              Red     37.692             5.468
3              Green   70.758             7.711
3              Blue    83.386             13.056
4              Red     39.714             5.548
4              Green   73.083             7.07
4              Blue    82.797             16.235

Table 5: Water sample values 

It could also be said that the uniform nature of the results indicates that the relative 

depth of the water has little effect on the spectral value of the area for photography 

at that height, introducing the potential for water to be used as one of the main 

baseline properties in this type of image analysis. It can often be the case that 

certain areas contain large numbers of temporary ponds following heavy rain; this 

is particularly so in the 1:5000 scale rural mapping. Applying the above values 

against pixel histograms for these areas (typically bog or pasture) for photography 

runs taken following heavy rainfall could reveal useful data with regard to runoff 

and capacity across land areas. In terms of this study the values will form part of a 

key against which the histogram values for pixels across the colour bands can be 

applied in order to calibrate the key (set of values to identify land cover). 
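
As a worked illustration of how uniform the four water samples are, the per-band means from Table 5 can be averaged and their relative spread computed. This is only a sketch: the function name `water_baseline` and the use of the population standard deviation divided by the mean as a spread measure are choices made for the example, not part of the study.

```python
from statistics import mean, pstdev

# Mean pixel values per band for the four water samples in Table 5.
WATER_SAMPLES = [
    {"red": 36.455, "green": 69.214, "blue": 81.614},
    {"red": 36.119, "green": 69.524, "blue": 83.718},
    {"red": 37.692, "green": 70.758, "blue": 83.386},
    {"red": 39.714, "green": 73.083, "blue": 82.797},
]


def water_baseline(samples):
    """Average each band across samples and report the relative spread
    (standard deviation of the sample means divided by their average)."""
    baseline = {}
    for band in ("red", "green", "blue"):
        values = [sample[band] for sample in samples]
        centre = mean(values)
        baseline[band] = {"mean": centre, "spread": pstdev(values) / centre}
    return baseline


baseline = water_baseline(WATER_SAMPLES)
```

With the Table 5 figures the relative spread stays below 5% in every band, which is consistent with the observation that depth has little effect on the values returned.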

The purpose of this thesis is to identify an automated process for image analysis 

using vector data alongside a spectral analysis. The ability to extract water areas, 

in particular lakes or ponds with a large concentration of pixels of similar values, 

and to establish a baseline value against which to calibrate the image key, would allow the user 

to target specific areas across successive runs of photography. This could be 

completed automatically by extracting the areas with GDAL and returning the results of a 

histogram analysis. 
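
The histogram step could look something like the sketch below. It assumes the pixels of one colour band have already been extracted for a water polygon (for example with GDAL) and are available as a plain list of 8-bit integers; the extraction itself is not shown, and the function names and the pixel values are hypothetical.

```python
from math import sqrt


def band_histogram(pixels, bins=256):
    """Count how many pixels take each value 0..bins-1 (8-bit band assumed)."""
    counts = [0] * bins
    for value in pixels:
        counts[value] += 1
    return counts


def histogram_stats(counts):
    """Derive the mean pixel value and standard deviation from a histogram."""
    total = sum(counts)
    mean = sum(value * count for value, count in enumerate(counts)) / total
    variance = sum(
        count * (value - mean) ** 2 for value, count in enumerate(counts)
    ) / total
    return mean, sqrt(variance)


# Hypothetical red band pixels from a small water polygon.
red_pixels = [34, 36, 36, 38, 36]
red_mean, red_std = histogram_stats(band_histogram(red_pixels))
```

The mean and standard deviation returned here are the two quantities tabulated throughout this sampling study, so the same routine could serve every polygon type.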

In order to gain a visual impression of the location of the main water bodies in the 

sample image the green and blue colour bands were mapped into the red, sharpening 

the definition between the (relatively) monotone water and the remainder of the 

image. 

Figure 5: Water Area Image Modification 

The next part of the study involved taking separate sample areas from around the 

image and comparing them to the spectral values associated with water. The 

samples were taken at three separate sections around the image; they did not 

correspond to samples taken for other parts of this study (specific road, building, 

pasture areas etc.) so as to increase the variety of input data. The three areas 

consisted of forestry (coniferous plantation), pasture and track. The pasture 

sample also contained a high degree of shade, which was not rectified in the table 

to see if it could be possible to identify this type of area with shade included. 

Water values testing sample   Band    Mean Pixel Value   Standard Deviation
1 (forestry)                  Red     73.532             19.591
1 (forestry)                  Green   112.507            21.605
1 (forestry)                  Blue    96.772             16.042
2 (pasture)                   Red     115.153            12.691
2 (pasture)                   Green   167.608            15.487
2 (pasture)                   Blue    104.872            13.2
3 (track)                     Red     213.429            10.748
3 (track)                     Green   237.429            12.369
3 (track)                     Blue    192.821            14.636

Table 6: Water test sample values 

As might be expected the track (artificial surface) showed the greatest difference, 

with the red colour band value for water less than 20% of that found in track. The relative 

disparity between these two values (the red mean pixel value in areas of water and 

track/ road) could be used to calibrate an image key during automatic image 

analysis; and the percentages of other less distinct land cover derived by 

comparison. One of the main aims of this study is to see if it would be possible to 

analyze aerial imagery using an automatic process based on vector data. With 

known water polygons and road polygons present this can be achieved; however, 

as was mentioned above the body of water needs to be large enough to obtain an 

accurate baseline reading for the photography run. If only drains or streams are

present in the target area (or its immediate (~1km) surroundings) then extracting a 

set of baseline pixel values from water polygons would not benefit the analysis. It 

can therefore be concluded that water polygons provide a useful reference for 

image analysis, but in the context of using them to add value to ordnance survey 

small area polygons they need to be part of areas not less than five pixels in 

diameter (i.e. belong to classes of rivers, lakes or large ponds). 

When the control values for water (from first table) were compared to pasture the 

red and green colour bands showed values which could be used to calculate if 

pasture was present in a polygon. The mean pixel value for water in the red colour 

band was 30% of the value for pasture and only 45% of the value for the green 

colour band found in the pasture sample (which included a section of shade). The 

value for the blue colour band also had a disparity of just over 20% less than the 

pasture mean pixel value. This implies that an algorithm could be run on 

sections of Ordnance Survey data which took polygon co-ordinates from the vector 

polygons, calibrated a key from the water polygon, compared the red and 

green colour band values against the red and green colour band values of the 

neighbouring polygon and then confirmed the level of standard deviation (which 

was found to be low in an area of pasture, ~10 on the greyscale). If all of these checks 

agree, it is probable the area can be labelled as pasture. In itself this does not present much of a 

breakthrough but when added to the known polygon it helps complete the picture 

of a target area being analyzed. 
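
The pasture test described above can be sketched as a simple rule. The sketch is illustrative only: the function name `looks_like_pasture`, the ratio tolerance of 0.10 and the standard deviation ceiling of 16 are assumptions (the ceiling is set slightly above the values observed in the shaded pasture sample), while the 30% and 45% ratios and the sample figures come from the text and Tables 5 and 6.

```python
def looks_like_pasture(water, neighbour, max_std=16.0, tolerance=0.10):
    """Compare a neighbouring polygon with the calibrated water values.

    water, neighbour: {"red": {"mean": ..., "std": ...}, "green": {...}}
    Expected water/pasture ratios (0.30 red, 0.45 green) are from the text;
    max_std and tolerance are assumed values for this illustration.
    """
    red_ratio = water["red"]["mean"] / neighbour["red"]["mean"]
    green_ratio = water["green"]["mean"] / neighbour["green"]["mean"]
    ratios_ok = (
        abs(red_ratio - 0.30) <= tolerance
        and abs(green_ratio - 0.45) <= tolerance
    )
    std_ok = (
        neighbour["red"]["std"] <= max_std
        and neighbour["green"]["std"] <= max_std
    )
    return ratios_ok and std_ok


# Water sample 1 (Table 5) and the three test samples (Table 6).
water = {"red": {"mean": 36.455, "std": 4.568},
         "green": {"mean": 69.214, "std": 7.254}}
pasture = {"red": {"mean": 115.153, "std": 12.691},
           "green": {"mean": 167.608, "std": 15.487}}
forestry = {"red": {"mean": 73.532, "std": 19.591},
            "green": {"mean": 112.507, "std": 21.605}}
track = {"red": {"mean": 213.429, "std": 10.748},
         "green": {"mean": 237.429, "std": 12.369}}
```

Under these assumed thresholds only the pasture sample passes; the forestry and track samples are rejected on the red band ratio.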

At this point it might be better to think of the image as its vector representation. 

As the polygons which the vector data encloses are identified the areas can be 

filled. In this way the study is filling the blanks around known values. If the result 

was thought of as a mosaic of known area properties (type and nature of land 

cover) then applying the label pasture dramatically reduces the areas left to 

identify. 

Figure 6: Sample area as a mosaic of polygons 

The purpose of the study is to identify an automated software process to do this. 

The user would start with the requirement to identify the percentage of a certain 

property in the photography: throughout this thesis the example of impervious 

surface area is given but this could also be a fungal infection affecting crops, the 

spread of invasive plant species, the extent of flood damage etc. What the 

sampling of water polygons is doing is attempting to create a set of automated 

conditions which the software would initially retrieve to set a base for the 

algorithm. The user would then select an area from the vector data where the 

target values were present. This area would be in the format of a specially coded 

polygon composed of vector data; either using those in the ordnance survey data 

or appending lines to controlled data to fully enclose the target sample (and 

creating the necessary vector set). Once identified the target area could be 

calibrated against the value for water (among others) and areas not relevant 

eliminated. 

As every section of the image will be composed of an area polygon (taken from 

the vector data) which in general encloses a relatively small area, it should be 

possible to quickly process each section by clipping and cutting the sections, 

comparing the mean pixel values across the colour bands, and classifying the 

result. This type of process is linked to the photography and once the edge of a 

given run is reached the process needs to be restarted and the values re-calculated. 

As the extent of the photography is known the co-ordinates can be included into 

the algorithm and the user informed when the extent of the search has been 

reached. 

3.3 Marsh 

Figure 7: Typical Marsh Area Image 

This study looked at areas of marshy ground. The purpose was to try to identify 

if sections of waterlogged surface area had unique values which could be 

identified in a small area polygon. The areas used for the study were captured 

examples adjoining a lake; they had been field revised and identified specifically 

as marsh within an enclosed polygon. The boundary on one side was the edge of 

the lake, while the boundary on the other side was the border with pasture 

enclosed with a notional (mapping) line. The samples (three in total) can be 

assumed to be typical of marsh (due to the position within the target area) but 

there was a small amount of variation, around 5%, between results for the 

different colour bands. This was as expected and the values were very close to the 

image average. This suggests that areas of marsh would be difficult to detect using 

a mean pixel/ standard deviation analysis based on area polygons alone, and 

additional coding data taken from the original vector mapping is required to make 

an accurate prediction as to the probability of marshy ground being present. 

Marsh Sample   Band    Mean Pixel Value   Standard Deviation
1              Red     110.968            10.557
1              Green   135.143            12.497
1              Blue    102.401            13.700
2              Red     108.672            16.799
2              Green   132.305            15.717
2              Blue    93.646             14.9
3              Red     104.652            8.66
3              Green   123.725            10.028
3              Blue    94.018             13.032

Table 7: Marsh sample values 

One factor which might help to differentiate between areas of marsh and the 

overall mean is the fact that the standard deviation was less than half of the overall 

image value for the red and green colour bands in all three samples. This is due to the fact that, 

although there is variety in the spectral values for the vegetation present, the area 

is uniformly covered by vegetation. This difference could become useful when 

removing known features from a polygon; in other words taking water and the 

built environment from an area polygon, and analyzing to see if the spectral values 

displayed similar levels of deviation on the red and green colour bands. 

Marsh areas are generally indicated by a symbol which signifies the presence of 

this type of ground cover extending to the next logical boundary (or the notional 

mapping boundary mentioned above). The boundaries in the vector data do not 

contain a marsh level, so a method to identify the areas from the relatively 

narrow red colour band identified above could provide an automated way of 

determining the extent of marsh lands based on existing data. This is one of the 

areas of the study which does not lend itself to a software solution and relates to 

the question of “fuzzy data” and how to incorporate it into a digital environment. 

Without digressing onto a tangent outside the scope of this thesis it needs to be 

mentioned, in the context of this part of the study, that certain features of this 

planet will always have fluid boundaries. In this example the change from marsh 

to rough pasture is a gradual one, and does not correspond to a single vector. 

Various solutions such as an additional transitional polygon instead of a linear 

boundary are simply methods of belting the square peg of a gradual change into a 

relational database. Although the study is attempting to identify statistical data 

(percentages of types of land cover) that can be appended to the entry for a given 

area polygon in a spatial database, in this case a bitmap displaying concentrations 

of values corresponding to marsh might be more appropriate. 

When the values obtained from marsh were compared to three sample sections 

from other polygons in the study some unique proportional variations emerged. 

The purpose of the study is to iteratively reduce the quantity of unknown (in terms 

of land usage) polygons in the search area by appending values derived from the 

aerial photograph. The sample areas used in this test were not from any of the 

original samples used to obtain baseline spectral data for the land type they 

represent but were chosen to see if a reliable (or at least significant) proportional 

deviation could be observed. The three sections sampled were pasture (which had 

been recently cut), mixed forestry (chosen because of the variety of spectral values 

that this type of cover represents) and paving (taken from a yard surrounding 

buildings but similar to any of the road and track hard surface areas sampled 

elsewhere in this study). 

All three sample areas showed unique properties consistent with the sampling used 

for their respective baseline values but also useful in terms of obtaining a key for 

identifying polygons of marsh. As was mentioned above these will generally fall 

within a polygon composed of vector polylines but it can occasionally be the case 

where the marsh was not fully enclosed. It may be necessary to introduce a 

process that retains all the polygons containing marsh symbols but displaying 

spectral values outside those expected for that type of land cover for verification – 

this, however, was not the case for the samples used in the study. 

Marsh test sample     Band    Mean Pixel Value   Standard Deviation
1 (pasture)           Red     201.617            8.05
1 (pasture)           Green   209.713            9.569
1 (pasture)           Blue    136.645            12.082
2 (mixed forestry)    Red     69.22              34.492
2 (mixed forestry)    Green   103.352            33.620
2 (mixed forestry)    Blue    86.594             19.885
3 (paving)            Red     246.167            8.1
3 (paving)            Green   252.542            5.4
3 (paving)            Blue    206.125            8

Table 8: Marsh test sample values 

The first sample, taken from freshly cut pasture, produced the relatively high 

values for the red and green colour bands that were found in the pasture testing 

samples for that type of ground cover. In relation to the values for marsh they 

showed a high level of disparity; with the mean red colour band pixel value for 

marsh being half of the pasture sample, and the green value for marsh being 60% 

of the test sample. The disparity in the blue colour band was less but this aspect of 

the spectral values could be used to relate the disparities found as specific to 

pasture, so that an examination of neighbouring polygons could use a known marsh 

area (presence of symbol and expected spectral values) as a reference to set the 

relative differences and possibly reset the marsh values to within the values for the 

known polygon for that particular area. 

The last suggestion will not be included in this study but it is worth noting that an 

algorithm which could constantly recalibrate the relative values as it processed 

neighbouring polygons might produce better results than one dependent on a key 

set during the beginning of the processing. 

The second test sample looked at an area of mixed forestry for deviation (in terms 

of mean pixel values across the colour bands) from those found to be present in 

areas of marsh. The values had a relatively unique variation from marsh in that 

while both the red and green colour band mean pixel values were a lot lower (35% 

and 20% respectively) the blue colour band had a comparable level of standard 

deviation and a mean which was within 10% of marsh, although this could be 

attributed to the level of shade present in the forestry due to the tree canopy 

varying in height across the sample. As is pointed out elsewhere in this study, 

areas of mixed forestry did not give reliable enough data to calibrate other surface 

areas from, based on spectral values alone. In the case of these types of areas there 

is vector data coding present to uniquely identify the forestry; however, 

knowledge of an expected proportional difference between the (known) forestry 

and an area of marsh is a useful additional factor to include in the algorithm and 

might increase the accuracy of any search for these types of areas (or at least help 

to eliminate them from a search for other specific properties). 

The third test sample took an area of hard cover (paving/ track) from a yard 

between agricultural buildings. This type of cover is a part of this study which 

revealed the most distinct values and presents a valuable calibration tool for the 

algorithm. When compared to this third sample the mean pixel value (converted to 

greyscale) for the red colour band found in the marsh samples was only 43% of 

the hard cover, while similar disparity was found between the green and blue 

mean pixel values in marsh and the green and blue mean pixel values in the hard 

cover (with the marsh mean values at only 51% and 46% of the hard cover 

respectively). These types of areas are well coded in the vector data. Some areas 

of hard cover surrounding private dwellings and farm buildings may not be 

captured and the automated identification of these types of areas through aerial 

image processing is one of the aims of this thesis. It can be assumed, however, 

that for any given area (excepting rural mapping covering mountains, which this 

study is not addressing) the road polygons have been accurately captured and 

there will be several sample polygons to calibrate a hard cover value from. In 

terms of this part of the study, the relative proportional deviation of marsh values 

from both pasture and road/ paving allows a key for its identification to be 

developed. 
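
The proportional comparison of marsh against the test samples can be reproduced from Tables 7 and 8. The sketch below averages the three marsh samples per band before dividing by the reference value; that averaging step and the function name `marsh_ratio` are assumptions made for the illustration, since the text does not state exactly how the marsh figure was combined.

```python
from statistics import mean

# Mean pixel values per band for the three marsh samples (Table 7).
MARSH = {
    "red": [110.968, 108.672, 104.652],
    "green": [135.143, 132.305, 123.725],
    "blue": [102.401, 93.646, 94.018],
}

# Mean pixel values for two of the test samples (Table 8).
PASTURE = {"red": 201.617, "green": 209.713, "blue": 136.645}
PAVING = {"red": 246.167, "green": 252.542, "blue": 206.125}


def marsh_ratio(reference):
    """Ratio of the averaged marsh mean to a reference mean, per band."""
    return {band: mean(values) / reference[band]
            for band, values in MARSH.items()}


vs_paving = marsh_ratio(PAVING)
vs_pasture = marsh_ratio(PASTURE)
```

Averaged this way, marsh comes out at roughly 0.44, 0.52 and 0.47 of the paving values for red, green and blue, within about a percentage point of the figures quoted above, and at just over half of the pasture value in the red band.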

3.4 Coniferous Forestry 

This part of the study used unprocessed sections of coniferous (commercially 

planted) forestry to evaluate a deviation from the average for the image that might 

indicate this type of ground cover. It took seven samples from a total of five areas 

of this type of forestry in the image. These were then analyzed in 

terms of mean values through the red, green and blue colour bands to determine if 

there was any unique deviation from other features. As would be expected for this 

type of ground cover the values for red (and near infra red) were lower due to 

colour being absorbed by the foliage. This gives a useful indicator for this type of 

ground cover and provides a comparative value that the target polygons of the 

study can be compared against. It should be noted that there is also the potential to 

use a pattern recognition algorithm to accompany any specific search for this type 

of forestry as uniform rows are a feature of this type of surface cover. The 

statistics for the survey are in the table below; it should also be noted that the 

standard deviation remained relatively consistent for each sample area. 

Forest Sample   Band    Mean pixel value   Standard deviation
1               Red     96.1975            19.943
1               Green   125.339            18.6976
1               Blue    98.4551            18.6976
2               Red     87.0853            21.8368
2               Green   125.905            23.8361
2               Blue    103.376            17.2807
3               Red     97.484             20.715
3               Green   137.59             21.077
3               Blue    109.5              14.9791
4               Red     73.072             20.762
4               Green   112.015            22.976
4               Blue    96.784             16.405
5               Red     76.670             23.651
5               Green   111.181            24.107
5               Blue    94.424             16.962
6               Red     75.534             24.507
6               Green   113.943            27.153
6               Blue    97.851             17.693
7               Red     72.424             24.194
7               Green   111.688            27.628
7               Blue    96.596             18.1686

Table 9: Coniferous forestry sample values 

An indicator for this type of land cover (as with rough pasture) is the larger scale 

value for the blue colour band when compared to the red. This was not the case for 

the image as a whole where the red band produced a mean almost 10% higher than 

blue. Another indicator present in all the samples was the fact that there was an 

increase in disparity between the red and green colour bands: 18.2% for the image 

as a whole, compared to almost 30% across the samples. It is the red band which 

is the most valuable indicator of this type of ground cover, with a value over 25% 

lower than the image mean. While coniferous forestry will have been captured and 

indicated as a level in the OSI vector data, smaller areas of this type of ground 

cover will be typically present along the margins of small area polygons close to 

urban areas. In particular a polygon closed by what is called a peck in the vector 

data layers could be analyzed for the mean red colour pixel value converted to 

greyscale and compared to the image whole. If the variation is close to 25% lower, 

then it is probable that this type of tree cover is present. It should be noted 

that further spectral analysis (involving swapping of the colour bands) can present 

additional indicators, which will be applied later in the study. 
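
The two indicators described above (a red band mean roughly 25% below the image mean, and a blue mean larger than the red) can be combined into a simple test. This is a sketch under stated assumptions: the image red mean of 110.0 is a hypothetical figure chosen for the example, and the function name `probable_conifer` and the tolerance of 0.10 are likewise illustrative.

```python
def probable_conifer(polygon, image_red_mean, expected_drop=0.25,
                     tolerance=0.10):
    """Check the two coniferous indicators for a polygon.

    polygon: {"red": mean, "blue": mean} for the candidate polygon.
    expected_drop is the ~25% red reduction from the text; tolerance and
    image_red_mean are assumed values for this illustration.
    """
    drop = 1.0 - polygon["red"] / image_red_mean
    red_ok = abs(drop - expected_drop) <= tolerance
    blue_over_red = polygon["blue"] > polygon["red"]
    return red_ok and blue_over_red


IMAGE_RED_MEAN = 110.0  # hypothetical whole-image red mean

forest_sample_2 = {"red": 87.0853, "blue": 103.376}   # Table 9
pasture_sample = {"red": 122.813, "blue": 105.527}    # Table 10
```

With the assumed image mean, forest sample 2 satisfies both indicators while the pasture sample fails both.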

There are several implications of being able to identify coniferous vegetation in an 

urban area; for example, it indicates permeable surface area for planning/ flood modelling. In 

the context of this thesis it allows a section within the study area to be identified 

and adjoining areas to be measured against; for example, once the presence of 

coniferous vegetation is detected image processing could be applied to eliminate 

this from the result set for the target polygon, allowing another analysis to be 

run on the remaining surface area. 

Coniferous forestry test sample   Band    Mean pixel value   Standard deviation
1 (tree/ shade/ pasture mix)      Red     81.824             35.235
1 (tree/ shade/ pasture mix)      Green   118.246            34.475
1 (tree/ shade/ pasture mix)      Blue    92.287             17.189
2 (pasture)                       Red     122.813            9.725
2 (pasture)                       Green   174.274            12.991
2 (pasture)                       Blue    105.527            12.604
3 (bog)                           Red     115.672            9.270
3 (bog)                           Green   133.684            9.933
3 (bog)                           Blue    105.059            11.881

Table 10: Coniferous forestry test sample values 

Three sample areas were chosen to match the data from the coniferous sample 

areas against. The first of these does not conform to the vector polygons against 

which the proposed algorithm operates, but was chosen for its mix of ground 

cover so as to provide a worst possible combination against coniferous values. In 

other words the distinguishing features for coniferous (outside the vector coding, 

this sampling was only to test the values relative to samples around the image) of 

high levels of standard deviation in the red and green colour bands would not be a 

useful comparative feature as the sample contained a variety of ground cover. The 

mean and standard deviation alone did not provide strong indicators of the ground 

cover type but the sample did demonstrate the usefulness of histogram data. The 

pixel count for the red and green colour bands displayed two clear spikes when 

presented as a histogram, corresponding to the expected values for both pasture 

and shade. This presents the possibility of determining a relative proportional (to 

the polygon size) pixel count flag which would indicate the percentage of land 

type within an area of mixed use. As was mentioned in the introduction, the basis 

of this study is the referencing of areas within the aerial imagery by small vector

polygons of uniform ground type. Further analysis of these polygons can be done 

once initial categorizations have been made and the level of analysis increased. To 

accurately determine the correct pixel proportion to flag requires a larger sample 

than that used in this study, which could be obtained from the data sets of 

polygons requiring further analysis returned from prolonged use of the algorithm. 

In other words this is something which would be developed later in the image 

analysis cycle because the nature of the sample (deliberately crossing outside the 

search polygons) makes it unlikely to be a feature of these types of images. 
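
The proportional pixel count flag suggested above could be sketched as the share of a polygon's pixels that fall near each expected value in the histogram. Everything in the example is hypothetical: the histogram counts, the spike positions (40 for shade, 120 for pasture), the window half-width and the function name `proportion_near` are assumptions used only to show the calculation.

```python
def proportion_near(counts, centre, half_width):
    """Fraction of all pixels whose value lies within half_width of centre."""
    total = sum(counts)
    low = max(0, centre - half_width)
    high = min(len(counts) - 1, centre + half_width)
    return sum(counts[low:high + 1]) / total


# Hypothetical red band histogram for a mixed polygon: one spike for
# shade, one for pasture, and a few pixels in between.
counts = [0] * 256
counts[40] = 300    # shade
counts[80] = 100    # transitional pixels
counts[120] = 600   # pasture

shade_share = proportion_near(counts, centre=40, half_width=15)
pasture_share = proportion_near(counts, centre=120, half_width=15)
```

The window width that best separates the spikes would have to be determined from a larger set of polygons, as noted above.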

The second sampling area involved taking a section of pasture for comparison 

with the coniferous samples. The values differed with an increase of 

approximately 40% for the mean of the red and green colour bands with a 

standard deviation 50% reduced on those found in coniferous forestry. This data is 

another useful reference in the identification of pasture as coniferous areas are 

coded and outlined in the vector data so can be automatically fed into a reference 

table during image analysis. As mentioned throughout this part of the study, the 

correct identification (and elimination) of areas of pasture from the image analysis 

is essential for the success of the suggested algorithm. Using spectral analysis for 

aerial image analysis (and remote sensing in general) is a specialized field of 

knowledge and studies tend to focus on a particular study area (such as the 

analysis and classification undertaken by Coredo-Sancho and Adler in 2007). 

The focus of this thesis is to create a generic method for image analysis which 

makes use of captured vector data to filter the image, reduce the study area, and 

narrow the range of pixel variations that can be analyzed. In this way the study has 

focused on finding an algorithm that can be coded into an easy to use solution for 

this type of research. By necessity the current mapping methods are labour 

intensive and resources are not available to capture the type of secondary data that 

might be gained from automatic image analysis. In addition to this, specific 

research (such as an inventory of impermeable surface in a region) could require 

specialist skills and methods; the algorithm suggested here aims to allow a 

user to filter through the current data by including their required target area into 

the process. For this reason the sampling has been generic (in that the image as a 

whole was not analyzed but individual sections representative of a specific land 

cover type were selected). 

The third sampling area was a section of bog, which was chosen because it is 

often found close to coniferous forestry in the Irish landscape. The ability to use 

coniferous as a reference when analyzing the image for the presence of bog means 

for most studies there will be an adjacent source of analysis to base the search key 

on. The mean pixel value for the red colour band was similar to the mean pixel 

value for the red colour band in pasture but the green value was notably different 

(to the green colour band in pasture). The green colour band value for bog was 

just under 20% larger than the value found in the coniferous samples. This indicates 

that an area showing a 40% increase in the red colour band and a 20% increase in the 

green colour band mean values, coupled with a 50% reduction in the standard deviation, 

relative to the known coniferous figures has a high probability of being an area 

of bog. Once this variation is cross checked with other known values in the area 

(water, road and buildings/ roofs), and cross checked with the secondary derived 

values identified in the algorithm (pasture etc.) areas of bog can be automatically 

quantified. 
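
The bog rule can be sketched against the coniferous reference as follows. The coniferous reference values are the averages of the seven samples in Table 9, rounded to two decimals; using the average of all seven as the reference, the tolerance of 0.10 and the function name `probable_bog` are assumptions made for this illustration.

```python
# Coniferous reference: average of the seven Table 9 samples (rounded).
CONIFER = {"red": {"mean": 82.64, "std": 22.23},
           "green": {"mean": 119.67, "std": 23.64}}


def probable_bog(candidate, conifer=CONIFER, tolerance=0.10):
    """Apply the rule from the text: ~40% higher red mean, ~20% higher
    green mean and roughly half the standard deviation of coniferous
    forestry. The tolerance is an assumed value."""
    red_gain = candidate["red"]["mean"] / conifer["red"]["mean"] - 1.0
    green_gain = candidate["green"]["mean"] / conifer["green"]["mean"] - 1.0
    std_ratio = candidate["red"]["std"] / conifer["red"]["std"]
    return (
        abs(red_gain - 0.40) <= tolerance
        and abs(green_gain - 0.20) <= tolerance
        and std_ratio <= 0.5 + tolerance
    )


# Test samples 3 (bog) and 2 (pasture) from Table 10.
bog = {"red": {"mean": 115.672, "std": 9.270},
       "green": {"mean": 133.684, "std": 9.933}}
pasture = {"red": {"mean": 122.813, "std": 9.725},
           "green": {"mean": 174.274, "std": 12.991}}
```

Under these assumptions the bog sample passes and the pasture sample is rejected on the green band, which reflects the point made above that the green value is what separates bog from pasture.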

3.5 Mixed Forestry 

Figure 8: Typical Mixed Forestry Area Image 

This part of the study looks at the spectral values for areas of mixed forestry. It is 

not concerned with finding a set of attributes which would uniquely 

identify this type of land cover from an aerial photograph, but is intended to 

investigate if values corresponding to this type of cover could be separated from 

those of rough pasture. 

One reason for attempting to differentiate between areas of mixed forestry and 

rough pasture is the age of the surface cover. Mixed forestry generally includes 

sections of native woodland, which is slow growing and can be assumed to be an 

area capable of supporting wildlife (it is also less prone to change than rough 

pasture; due to the difficulty in obtaining permission to clear this type of 

woodland). Any study looking at the wildlife corridors across the country would 

benefit from an automatic method of distinguishing smaller linear sections of this 

type of ground cover (along hedges etc.) from other land use types. As is 

evidenced below by the similarity between these results and those from an 

analysis of rough pasture, this remains difficult to do. There is also not much scope 

for pattern recognition algorithms to be used in the detection of isolated sections 

of mixed forestry (outside those captured by conventional mapping) because of 

the seemingly random nature of the shade patterns. Note: An obvious solution is 

to fly the same areas at different times of the year and compare the red and near 

infrared signatures to identify the presence of natural deciduous species at the 

borders of area polygons but this would be prohibitively costly and beyond the 

budget of any potential environmental survey. A cheaper solution might become 

possible in future by identifying patterns in lidar data at polygon boundaries. The 

focus of this study, however, is on spectral values and, although unique in 

comparison to the image average, the sample returned values very close to those 

of rough pasture. 

Areas of this type of surface cover, comprising a mixture of coniferous and 

natural woodland, are already captured in Ireland and the study took a section of 

land from one of these (present in the study area) and compared the spectral 

values to those of the image as a whole. 

Mixed Forestry Sample Mean Pixel Value Standard Deviation 

Red 70.246 33.337 

Green 104.33 32.726 

Blue 85.974 19.626 

Table 11: Mixed forestry sample values 

As expected, the results differed from the values for the image as a whole in much

the same way as those for rough pasture. As with rough pasture, the standard deviation in

pixel values for the red and green colour bands was high, with a similarly large

difference in mean values for those bands (37% and 23% lower than the image mean respectively). This is

also an indication that areas of rough pasture can contain similar coverage to

mixed forestry, in that rough pasture is often overgrown and contains some tree

cover. Once areas without buildings and roads have been found whose variation

from the whole image matches that of the sample polygons above (and in

the rough pasture survey), it should be possible to apply the rough pasture attribute. 

The known mixed forestry polygon set (which is taken from the vector data 

coding) can then be subtracted from this to give a percentage of rough pasture for 

a target area. 
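
The percentage differences quoted above can be reproduced with a short calculation. The following is a minimal sketch, assuming that the whole-image band means reported in the roof sampling later in this chapter (111.711, 136.542 and 102.636) apply to the same image; the function name `relative_deviation` is illustrative and is not part of the software used in the study.

```python
# Whole-image band means, as quoted in the roof sampling section (assumed to
# describe the same image as the mixed forestry sample).
IMAGE_MEAN = {"red": 111.711, "green": 136.542, "blue": 102.636}

# Mixed forestry sample means from Table 11.
MIXED_FORESTRY = {"red": 70.246, "green": 104.33, "blue": 85.974}


def relative_deviation(sample, reference):
    """Return, per band, the percentage difference of sample from reference."""
    return {
        band: 100.0 * (sample[band] - reference[band]) / reference[band]
        for band in reference
    }


deviation = relative_deviation(MIXED_FORESTRY, IMAGE_MEAN)
for band, pct in deviation.items():
    print(f"{band}: {pct:+.1f}%")
```

On the Table 11 values this gives a red mean roughly 37% below the image mean and a green mean just under 24% below it, in line with the figures quoted in the text.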

Mixed Forestry comparative sample 1 (bog) Mean Pixel Value Standard Deviation

Red 110.761 6.429

Green 121.958 8.253

Blue 103.929 11.605

Mixed Forestry comparative sample 2 (pasture) Mean Pixel Value Standard Deviation

Red 128.801 10.469

Green 165.843 9.496

Blue 100.921 12.367

Mixed Forestry comparative sample 3 (cut pasture) Mean Pixel Value Standard Deviation

Red 220.074 7.623

Green 209.855 9.249

Blue 137.92 11.493

Table 12: Mixed forestry test sample values 

The mixed forestry part of the sample was compared against three sample areas to 

add to the proportional deviations in the algorithm. The aim is to achieve a high 

enough level of relative values in the image to establish the composition of every 

polygon. The polygon extraction is made difficult by the fact that cropped small 

area polygons have irregular shape –the analysis of which was done against a 

blank background –resulting in altered standard deviation values for these samples. 

In the case of the three sample areas for this type of forestry the samples were 

rectangular areas within the uniform types used for the comparison. 

The first sample type used for comparison was an area of bog. This was chosen as 

it often occurs close to areas of mixed forestry and rough pasture (which by nature 

of the terrain have not been turned over to pasture). The values contained in the 

sample were similar (relatively) to coniferous forestry, but differed in the red 

colour band with almost a 35% higher mean and a notable low level of standard 

deviation. This low level of standard deviation was found in all three of the colour 

bands, and could be used to differentiate between the two types of forestry

sampled in this study. In relation to this, a sliding scale of standard deviation for 

pixels in the red colour band between bog, coniferous forestry and mixed forestry 

is evident –from under 10 values for the area of bog, averaging at 20 for areas of 

coniferous forestry and over 30 for areas of mixed forestry. Matching these to the 

mean could give a useful relative indication for automatic identification of bog 

present in an area. At this point it is probably useful to comment on the nature of 

the vector data for areas of bog. These types of areas are generally bounded by 

polylines (though these are not coded). It could be the case that the same polygon

contains an area of bog and rough pasture (similar to mixed forestry); being able 

to estimate the relative values for this transitional analysis based on the above data 

is useful in such cases. If an area polygon bordering one or more areas containing 

pixel values contains values close to those of mixed forestry but with a low 

standard deviation it is probable that the polygon is describing this type of 

transitional area. 
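
The sliding scale of red-band standard deviation described above could be expressed as a simple lookup. This is only a sketch: the cut-offs of 15 and 25 are assumed midpoints between the approximate figures quoted (under 10, about 20 and over 30) and are not values derived in the study.

```python
def cover_from_red_std(red_std, bog_max=15.0, conifer_max=25.0):
    """Suggest a cover type from the red-band standard deviation alone.

    bog_max and conifer_max are assumed cut-offs, not measured values.
    """
    if red_std < bog_max:
        return "bog"
    if red_std < conifer_max:
        return "coniferous forestry"
    return "mixed forestry"


# Red-band standard deviations taken from Table 12 (bog) and Table 11
# (mixed forestry).
print(cover_from_red_std(6.429))
print(cover_from_red_std(33.337))
```

In practice such a check would be matched against the mean values as well, since standard deviation alone cannot separate bog from other uniform surfaces such as pasture.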

The second type was a pasture sample taken for comparison as it is the most likely 

neighbouring polygon to an area of mixed forestry and is common to most (rural/ 

semi urban) areas in the country. This sample of pasture is separate from others 

used in this thesis but returned similar values (as expected). The most notable 

difference was in the level of standard deviation present in the sample area, with 

particular respect to the red and green colour bands. The mean pixel values for 

these colour bands also showed a distinct relative difference (over 45% and over 

35% for the red and green bands respectively). This variation in values is a useful 

indication of which type of ground cover the pixel values belong to. In particular it

could serve as one of the primary steps in the algorithm. It is useful to begin the

analysis by eliminating known values from the search so that the target comparison 

list (ground cover types to pixel values) is smaller and has a higher chance of 

being successful. One further aspect that can be included in any automated 

analysis looking for areas of bog is the size of the polygons in the search. Most 

large polygons will be assigned a value based on the input vector coding. They 

will generally represent known parts of the image such as plots of forestry and 

water parcels. The remaining large polygons will (in the context of the Irish 

landscape being analyzed) most probably be areas of bog. This can further be 

refined by eliminating areas of flat rock as islands within the large polygons (these 

island areas are present in the vector data). In this way large polygons (with 

islands eliminated) matching the deviation from expected mixed forestry values 

outlined above can be assumed to be representative of bog. There is probably 

some scope for the use of pattern analysis to further refine this search. The 

uniform (low standard deviation) values displayed by the bog sample suggest that 

it may be possible, once the areas are identified in the automatic search algorithm 

suggested in this study, to examine these for rows and machinery tracks to 

automatically calculate the level of cutting taking place. 

The third sample type taken for comparative analysis with mixed forestry was an 

area of cut pasture. This was taken as a control to ensure the previous two samples 

matched expected values (in terms of relative percentages) found in the other 

survey areas of the image. The high mean pixel values (and low level of standard 

deviation) matched expected results in that both the red and green colour bands 

showed markedly higher values (see table above). This type of pasture will not be 

factored into a proportional value check against mixed forestry in the algorithm as 

it can be referenced against the two stable control value sets present in water and 

roofs. It should be noted that this particular area type is dependent on the time of 

year and weather conditions prior to the time the aerial imagery was flown and is 

something that must be included in a second or higher loop of the algorithm. 

3.6 Track 

Figure 9: Typical Track Area Image 

This part of the study looks at values for track –corresponding to unpaved or 

gravel access roads and takes six sample areas for a comparison of spectral values. 

The purpose of the study was to see if there was a way of distinguishing between 

the spectral values for these types of roads and paved/ tarred roads (NRA category

four upwards). Roads have unique spectral values in terms of an increase in mean

pixel value of close to 30% for the three colour bands. This in itself is not

particularly useful to an automated image analysis using OSI vector data as a baseline, as

the road network has been captured and is updated. However, if areas of paving or 

hard cover (similar to track) could be shown to have similar unique properties it 

might be possible to detect the presence of impermeable surface in recently 

developed suburban areas. I am conscious that the above explanation is

long-winded, so the following example might explain things a bit better. A recently 

developed suburban area is experiencing problems with flooding and runoff –at 

present the mapping captures the water courses, buildings, roads and property 

boundaries (and street furniture/ utility details etc.) but does not indicate the extent 

of paving and patios within the individual plots; a survey filtering pasture using its 

spectral signature and known buildings, tarred road and footpath using vector data 

and comparing the remainder to expected spectral values for hard cover could 

return this value. The results from the six samples are in the table below: 

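Track sample 1 Mean Pixel Value Standard Deviation

Red 178.5 14.02

Green 194 15.231

Blue 146.5 19

Track sample 2 Mean Pixel Value Standard Deviation

Red 164 23.013

Green 193.833 14.729

Blue 147.5 15.149

Track sample 3 Mean Pixel Value Standard Deviation

Red 196.75 10.511

Green 219.875 12.240

Blue 145 16.639

Track sample 4 Mean Pixel Value Standard Deviation

Red 193.8 193.8

Green 207.067 207.067

Blue 153.667 153.667

Track sample 5 Mean Pixel Value Standard Deviation

Red 190.833 15.014

Green 208.667 11.089

Blue 148.5 10.518

Track sample 6 Mean Pixel Value Standard Deviation

Red 190.5 2.738

Green 217.167 7.574

Blue 138 5.44

Table 13: Track sample values 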

While the above samples outline unique pixel signatures when compared to the 

entire image (40% above the mean for red, 35% for green and 30% for the blue 

colour bands) the difference between these samples and those returned for the 

standard road network is not significant. This does not necessarily make it difficult 

to distinguish between standard tarred road and other types of impermeable 

surface area that might have been introduced to the landscape. The road network 

is present in the vector dataset so subtracting this (along with the other known 

polygons such as water, buildings etc.) from the image means that polygons with a high 

number of pixels corresponding to these values would indicate that the surface 

contains an area of cover similar to track or road (hard impermeable cover). 
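
The subtraction described above leaves a set of unclassified pixels in each polygon, and the share of those pixels resembling track can then be counted. A minimal sketch of that count follows, assuming the pixels have already been extracted as (red, green, blue) tuples. The signature is taken from one track sample in Table 13, and the tolerance of 25 grey levels is a hypothetical value chosen purely for illustration.

```python
# Band means of one track sample from Table 13 (red, green, blue).
TRACK_SIGNATURE = (190.833, 208.667, 148.5)


def hard_cover_fraction(remaining_pixels, signature=TRACK_SIGNATURE, tolerance=25.0):
    """Return the share of pixels within `tolerance` of the signature in every band.

    remaining_pixels: (red, green, blue) tuples left in a polygon once the
    known vector features (roads, water, buildings) have been subtracted.
    tolerance is a hypothetical value, not one established by the study.
    """
    pixels = list(remaining_pixels)
    if not pixels:
        return 0.0
    matches = sum(
        1
        for pixel in pixels
        if all(abs(value - target) <= tolerance
               for value, target in zip(pixel, signature))
    )
    return matches / len(pixels)


# Two track-like pixels, one pasture-like pixel and one shade-like pixel.
sample = [(192, 210, 150), (185, 200, 140), (120, 170, 100), (60, 90, 80)]
fraction = hard_cover_fraction(sample)
print(f"{fraction:.0%} of remaining pixels resemble hard cover")
```

A polygon returning a high fraction would then be a candidate for hard impermeable cover, subject to the further checks discussed below.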

This unique value has potential to increase the accuracy of flood mapping and 

prediction but requires additional processing to ensure that the high values are the 

result of impermeable surface area. This could take place within an automated 

software process by swapping the colour bands to increase the difference between 

these areas and areas of vegetation. 

The purpose for sampling the areas of track was to see if there could be any means 

of determining if an area surrounding a private dwelling had been paved, or if 

there were any paved yards/ areas of hard cover present in other semi urban 

polygons. The test sampling took three areas to compare the values for track 

against; an unpaved dirt track, an area of compacted gravel yard and an area of 

paved yard. With the exception of the blue colour band the results were similar to 

those from cut pasture. These can be discerned from cut pasture by setting the 

search algorithm to look at the mean pixel value for the blue colour band, which 

was 30% less for both the paved yard and dirt track. The values returned for the 

yard of compacted dirt and gravel were similar to cut pasture but were within a 

small polygon containing several roofed buildings. The algorithm can therefore be 

set to accept values similar to cut pasture for small area polygons containing a 

number of roofed buildings as gravel/ dirt hard cover. In the particular case of this 

sample the results obtained are most likely due to pigment in the gravel biasing 

the sample. 

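Track test sample 1 (unpaved dirt track) Mean Pixel Value Standard Deviation

Red 218.917 11.378

Green 225.708 13.658

Blue 161.583 10.1

Track test sample 2 (compacted dirt/ gravel) Mean Pixel Value Standard Deviation

Red 178.315 8.713

Green 195.648 10.802

Blue 131.056 16.122

Track test sample 3 (paved yard) Mean Pixel Value Standard Deviation

Red 174.278 8.77

Green 200.722 6.257

Blue 171.778 7.075

Table 14: Track test sample values 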

Taking these values for a larger area would be a difficult task but the fact that the 

small area polygons derived from the vector mapping cut apart the image means 

that greater levels of information can be derived from the same set of values in 

polygons with different associated coding. The initial sampling displayed values 

for pasture in small area land parcels surrounding dwellings matching those of the 

mean outside cut pasture. From this it can be inferred that a small area polygon 

surrounding a dwelling which displays spectral values similar to cut pasture could 

potentially be gravelled and the algorithm would then run a specific analysis on 

the values for the blue colour band. For the third sample area, the paved yard, the 

values matched those expected for paved covering and again these values would 

indicate the high probability of hard cover (patio/ concrete etc.) when present in a 

small area polygon surrounding a dwelling. 

The identification of track has been the subject of a large amount of work which 

looked at software which might extract the network of roads

based on pattern recognition and the unique spectral values for this type of feature 

(Phynn et al, 2002). In this study all roads and tracks have been captured and the 

purpose of analyzing the spectral qualities of these is in an effort to train an 

algorithm to recognise the specific properties of hard ground/ impermeable 

surface area within small area polygons. The results revealed distinctive qualities 

thanks mostly to the high reflective quality of these types of surface for red and 

green colour bands. 

As is mentioned in other sections of this study, the initial part of the algorithm 

involves extracting the roads (and water, and forestry, known marsh etc.) to leave 

a smaller number of polygons for analysis. The next step would be the removal 

(classification) of areas with relatively unique spectral values such as all the 

pasture polygons. Following this, the urban polygons would be analyzed, having 

been identified by the presence of building polygons within them –these would 

then be classified according to the nature of the building (as this data is only 

available in some instances the first iteration of the loop would include all 

buildings). The nature of the spectral values would then be compared to the values 

sampled here, as the low level of standard deviation among the colour bands 

associated with them enables classification with a degree of certainty. 

This type of survey has particular benefit in flood mapping and can help the 

development of models factoring in runoff rates during times of high rainfall. The 

sample area used here is just outside an urban area and as such has a useful mix of 

all the possible land cover types –ranging from dirt track to paved yards to 

forestry to pasture to dwelling houses within small polygons. Specific searches of 

urban developments could expect to find more homogenous ranges within each 

polygon. The values taken from this sampling could serve as the baseline for one 

of these surveys. The user would then select known areas of the target land cover 

being analyzed (via a mobile GPS unit or selected from the vector mapping draped 

over the aerial photography). A combination of the standard expected values and 

the entered key values could then be used to calculate the percentages across a 

wide area. The fact that for each area analysis the pixel variations are confined to 

a small area polygon means that there is less chance of a gradual distortion biasing 

the results, as each separate polygon is calculated based on its own features (i.e. 

the values of neighbouring polygons and presence of buildings mentioned above). 

It should be noted that the samples used in the above section of the study covered 

small areas relative to those used for forestry, pasture, water etc. This is because 

paved or hard ground was only a small part of the sample area. It is unlikely, 

however, that a larger sample of hard ground would have revealed any different 

results; firstly because the sampling was done over a relatively wide geographical 

area and secondly because large expanses of paved areas are rare enough to be 

considered an anomaly in the Irish landscape, which in any case could be 

flagged by the search algorithm (by setting a maximum expected area for values 

matching hard cover). 

3.7 Shade 

Figure 10: Typical Shade Area Image 

The purpose of this part of the study is to see if there are any unique spectral 

qualities in areas of shade which would allow them to be eliminated from an 

examination of a given polygon. There are a number of ways to prevent shade 

from distorting the results of a spectral analysis of these types of polygon. The 

photography and vector data could be imported into a geographic information 

system capable of manipulating vector data. A number of control points could be 

taken matching the edge of areas of shade to vertices in the vector data. The vector 

dataset could be then transformed/ moved to match the control points and the 

offset calculated and eliminated from the original polygons so that a subsequent 

spectral analysis would focus on areas outside shade. While this would probably 

be the most accurate method it is difficult to automate as selection of control 

points requires human input (a random selection would skew results and there is 

too much variety in coding and polyline length and shapes to set rules). 

A second method might be to identify unique spectral values for shade and 

introduce a process to subtract them from the polygon pixel set results to leave 

only values which can be identified. This is what is being attempted here and the 

results below reveal a similar signature to that of water in the image. Since all 

water bodies have been captured it could be possible to subtract these lower pixel 

values from a polygon and analyze the remainder. 

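Shade Sample 1 Mean Pixel Value Standard Deviation

Red 33.9264 5.428

Green 71.9221 7.446

Blue 80.2165 14.322

Shade Sample 2 Mean Pixel Value Standard Deviation

Red 41 5.577

Green 81.361 8.077

Blue 78.381 11.5439

Shade Sample 3 Mean Pixel Value Standard Deviation

Red 42.813 6.943

Green 83.186 7.878

Blue 84.505 12.848

Shade Sample 4 Mean Pixel Value Standard Deviation

Red 45.333 15

Green 85.854 17.921

Blue 85.75 17.118

Table 15: Shade sample values 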

This is something which has been attempted in previous studies (Martin et al, 

1998), however, as with the thesis in general, I believe a hybrid method 

incorporating aspects of vector mapping and spectral analysis produces the best 

results. This is because in certain situations removing shade based on pixel values 

alone might bias the analysis (mixed forestry could produce a signature similar to 

marsh once lower pixel values are removed). A better method might be to 

manually take four or five samples per square kilometre and determine a mean 

percentage of shade based on areas of shade within those samples –this can be 

done relatively quickly using any vector manipulation software and the sample 

areas recorded for reference. Any further automated analysis would not eliminate 

pixels matching the shade signature above the sample percentages. 

Note: In future the addition of lidar data might allow shade to be calculated

from the height of the irregular tree canopy/ roof pitches etc.

and incorporated into the automated image analysis algorithm. This data is not 

currently available for all areas and has not been included in the study. It should 

be noted that its addition has the potential to improve the accuracy of this type of 

study. 

A couple of sample sections of the image were selected with varying degrees of 

shade present in the image. This was to determine if it is possible to introduce a 

step into the algorithm which would correct for areas of shade in otherwise 

uniform closed polygons. The first sample involved selecting an area of pasture 

which contained a large amount of shaded ground from some high trees along one 

of the border vector boundaries i.e., along the hedge. The area of shade 

corresponded to roughly half the area of the test polygon. It should be noted that a 

degree of shade will be present in all of the orthophotos to which this study 

applies; this is because aerial survey takes place in bright sunshine by necessity. 

The sample was analyzed for mean pixel values along the colour bands, and also 

the standard deviation displayed. 

Shade test sample 1 (pasture with approx. 50% shade) Mean Pixel Value Standard Deviation

Red 88.457 37.387

Green 135.151 40.01

Blue 100.616 20.386

Table 16: Shade test sample value 1 

The results for this were as might be expected, with an increase in the standard

deviation for the red and green colour bands reflecting the variety of tone 

present in the sample. This, however, does not give a complete representation of 

the data and a histogram (see below), reveals peaks for the shade value, and a peak 

for the pasture value in both the red and green colour bands. These two peaks 

indicate that the area is pasture, and that the high pixel count for the lower values 

in the red and green colour bands is a result of shade. 

Figure 11: Histogram for Shade and Pasture 

There are two factors for inclusion in the automated search algorithm that can be 

taken from this. Firstly, a reading of high levels of standard deviation from the

mean value for the red and green colour bands for a given polygon is a strong

indication that the analysis might be distorted by shade. Secondly, if a histogram of 

the pixel count for the red and green colour bands shows a peak at the mean for 

shade, and at the mean for pasture then the area is pasture (given that the other 

probable cause, water, has already been identified through vector mapping). 
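
The second factor, a peak at the shade mean together with a peak at the pasture mean, could be approximated without a full peak-finding routine by measuring how much of the histogram lies close to each expected mean. The sketch below makes that assumption. The window of 15 grey levels, the minimum share of 25% and the two centre values are hypothetical illustrations loosely based on the red-band means in the shade and pasture tables.

```python
def share_near(values, centre, window):
    """Fraction of pixel values lying within `window` of `centre`."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for v in values if abs(v - centre) <= window) / len(values)


def looks_like_shaded_pasture(band_values, shade_centre, pasture_centre,
                              window=15.0, min_share=0.25):
    """True when a sizeable share of pixels sits near both expected means.

    window and min_share are hypothetical tuning values.
    """
    values = list(band_values)
    return (share_near(values, shade_centre, window) >= min_share
            and share_near(values, pasture_centre, window) >= min_share)


# Red-band example: half the pixels near the shade mean, half near pasture.
red = [40] * 50 + [125] * 50
print(looks_like_shaded_pasture(red, shade_centre=41.0, pasture_centre=124.0))
```

The same test would be repeated for the green band, and a polygon would only be treated as shaded pasture if both bands agree.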

The algorithm could account for this by taking a sample of neighbouring polygons;

if these contain areas of pasture then the standard deviation is checked. If the

standard deviation is above 30 values on the greyscale then the polygon is flagged

and a histogram appended for further processing. The resulting set of these types

of polygons would be returned with the result set from the analysis so that the user 

could accept or reject the anomaly as the result of shade (in terms of large 

amounts of standard deviation for the red and green colour bands in a small area 

polygon). 
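
The flagging step described above might look something like the following sketch. It assumes a polygon is flagged when either the red or the green band exceeds the tolerance (the text does not say whether one or both bands must do so), uses the population standard deviation, and bins the appended histogram in steps of 16 grey levels; all three choices are assumptions made for illustration.

```python
from collections import Counter
from statistics import pstdev


def histogram(values, bin_width=16):
    """Count pixel values per bin, keyed by the lower edge of each bin."""
    return dict(Counter((int(v) // bin_width) * bin_width for v in values))


def flag_polygon(red, green, neighbours_are_pasture, tolerance=30.0):
    """Return a record for user review, or None when the polygon is not flagged."""
    red_sd, green_sd = pstdev(red), pstdev(green)
    if not neighbours_are_pasture or max(red_sd, green_sd) <= tolerance:
        return None
    return {
        "red_sd": red_sd,
        "green_sd": green_sd,
        "red_histogram": histogram(red),
        "green_histogram": histogram(green),
    }


# A polygon that is half shade and half pasture in both bands.
red = [40] * 50 + [130] * 50
green = [75] * 50 + [175] * 50
record = flag_polygon(red, green, neighbours_are_pasture=True)
print(record["red_sd"], record["red_histogram"])
```

The returned records form the examination set that the user accepts or rejects as shade.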

The second test sample for shade took an area of pasture with a smaller percentage 

of shade present for comparison with the areas of shade. The values for the sample 

matched those expected for pasture with the exception of a higher level of 

standard deviation in the red and green colour bands. Such polygons could again be

flagged once a level of standard deviation above that expected for pasture, and the

trend in bordering polygons mentioned above, were detected, and then included in the

flagged polygon set, where the histogram could be analyzed. 


Shade test sample 2 (pasture containing small areas of shade) Mean pixel value Standard deviation

Red 136.056 23.905

Green 179.235 25.065

Blue 114.285 14.684


Although the mean pixel values matched those expected for pasture across the

three colour bands, the high level of standard deviation suggests further 

examination could be necessary. A histogram representation shows the relatively 

higher pixel count across the values associated with shade but displays a clear 

peak for pasture. From the evidence of these samples it would seem safe to set the 

tolerance for standard deviation in both the red and green colour bands to a figure 

between 25 and 30; where any polygons exceeding this (even if they are bordering 

pasture polygons) would be flagged and exported to an examination set together 

with appended histograms. 

The final part of the testing took a sample from relatively clear pasture (little or no 

shade present) to see if the trend identified in the first two samples would continue. 

In other words if a large amount of shade resulted in a high pixel count and peak 

at the values for shade, and a smaller amount less so then the trend should show a 

simple peak for an area with little or no shade present. While this produced 

expected results, it was necessary to confirm the trend.


Shade test sample 3 (pasture containing little shade) Mean pixel value Standard deviation

Red 123.828 8.645

Green 177.68 9.8

Blue 105.261 12.69


The sampling for shade highlighted the need for any automatic aerial image 

algorithm to account for levels of standard deviation; and also the necessity to 

detect peaks of values within the colour bands. The two peaks for lower and 

higher values in the second test sample demonstrated the usefulness of 

categorizing the images according to spectral values and is a good example of how, 

once the imagery can be broken into its constituent area polygons using vector 

mapping, useful data can be derived. The properties of shade in the sample 

imagery were uniform, allowing the distortion caused by shade to the spectral 

values of the target polygons to be included into image processing. The results of 

this part of the study go some way to proving the central premise of this thesis; it 

is possible to automatically scan aerial imagery using good quality vector data. 

The process of eliminating known areas and flagging borderline values (such as 

the distortion in the first test sample) means that the land cover can be examined 

through a series of scans until all the surface area is accounted for. This is a 

process which could then be adapted for specific projects (flood mapping etc.). 

The software process outlined here would act as a template for these projects and 

allow the users to input specific search values themselves without relying on 

obtaining the data from previous studies. 

3.8 Roof Areas 

Figure 12: Typical Roof Value Area Image 

This study involved taking a sample of roof imagery from a test area and 

comparing the luminescence values to those of the entire image to determine if a 

distinguishing deviation in values existed for these features of the image. The 

initial part of the experiment involved taking a sample number of roof values from 

the orthophotography and converting them to a format suitable for individual 

examination. The study involved a section of rural landscape in south county 

Galway (Ordnance survey sheet no. 3012-c). 

The aim of this part of the thesis is to try to obtain one of a series of benchmark 

control values which can then be applied to the area polygons in determining 

relative deviation of pixel values. In other words this section of the study is an 

attempt to try to get a unique indicator for roof pixel values to form part of a key 

for interpreting statistical values from the polygon being processed. It involved the 

use of three image analysis software packages, but can be achieved using the 

GDAL library alone (using gdal_translate). 

The first step involved editing the image (aerial orthophoto corresponding to the c 

quadrant of OSI #3012-c) using OpenEV. This involved targeting the areas of 

interest from the imagery based on building polygons captured by OSI vector data. 

The vector data was overlaid on the image and the target areas exported using the 

GDAL export tool to create ten separate .tiff files containing roof values. 

The entire image was then loaded into the Geomatica software and statistical data

for the red, green and blue primary values (converted to greyscale) obtained. This

was done to obtain a mean value for each of these within the entire image, which 

contained a range of features including forestry, pasture and several water bodies. 

The image was not enhanced and no post processing was used in order to obtain 

standard baseline values to measure future iterations of the study against. 
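
The whole-image statistics described above amount to a mean and standard deviation for each colour band. A minimal sketch of that calculation is given below, assuming the pixels have already been read from the raster as (red, green, blue) tuples; reading the file itself is left out. The function name `band_statistics` and the use of the population standard deviation are assumptions, as the text does not state which form the software reported.

```python
from statistics import fmean, pstdev


def band_statistics(pixels):
    """Return {band: (mean, standard deviation)} for (red, green, blue) pixels."""
    bands = zip(*pixels)
    return {
        name: (fmean(values), pstdev(values))
        for name, values in zip(("red", "green", "blue"), bands)
    }


# Two illustrative pixels only; a real image would supply millions.
stats = band_statistics([(100, 150, 90), (120, 170, 110)])
for band, (mean, sd) in stats.items():
    print(f"{band}: mean {mean:.1f}, standard deviation {sd:.1f}")
```

The same function can be applied to the pixels cropped by any single polygon, which gives the sample values that are compared against the whole-image baseline.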

It should be noted that the sample area contained 76 buildings and this study 

concentrated on the south east of the image (which contained the largest number 

of buildings). For each building polygon there was a clear division between 

sections of shade and light due to the roof pitch. While there is little value in 

pursuing this unique aspect of the image features (as buildings are already being 

well captured for mapping purposes by traditional photogrammetry and field 

survey), this might be useful for automatic key generation. By this I mean that 

pattern recognition software could be applied to separate these two areas and the 

resultant variations analyzed to determine the roof pitch; this is, however, beyond 

the scope of this study. 

Below is a screenshot of the buildings present in the image (so as to give an 

impression of their dispersal in the study area); 

Figure 13: Distribution of Buildings/Roofs in the Sample 

The ten sample roof values were all taken from the concentration of buildings in 

the south east, and the mean values for the three primary values (when converted 

to greyscale) were as follows: 

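Roof Sample 1 Mean pixel value

Red 102.683

Green 130.923

Blue 120.673

Roof Sample 2 Mean pixel value

Red 146.152

Green 168.429

Blue 161.457

Roof Sample 3 Mean pixel value

Red 51.0667

Green 79.2

Blue 85.5176

Roof Sample 4 Mean pixel value

Red 145.414

Green 165.429

Blue 161.457

Roof Sample 5 Mean pixel value

Red 95.2596

Green 123.74

Blue 113.106

Roof Sample 6 Mean pixel value

Red 92.416

Green 122.633

Blue 118.177

Roof Sample 7 Mean pixel value

Red 220.818

Green 194

Blue 142.896

Roof Sample 8 Mean pixel value

Red 133.528

Green 124.556

Blue 116.528

Roof Sample 9 Mean pixel value

Red 119.315

Green 148.351

Blue 142.967

Roof Sample 10 Mean pixel value

Red 173.111

Green 187.506

Blue 149.494

Table 19: Roof pixel sample values 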

The mean greyscale pixel values for the image were 111.711 for the red channel, 

136.542 for the green and 102.636 for the blue. This meant an average deviation 

of 22% in the blue channel for areas of roof (average mean for the roof samples of 

130.776 compared to 102.636 for the entire image). In terms of automated image 

analysis the variation could help develop a control by training a key against the 

known building values. 
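
The size of the blue-channel deviation depends on which mean is taken as the base. The short calculation below uses only the two figures quoted above. The 22% figure appears to correspond to the difference expressed as a share of the roof mean, whereas the same difference is about 27% of the whole-image mean. The variable names are illustrative.

```python
ROOF_BLUE_MEAN = 130.776   # average blue mean of the roof samples, as quoted
IMAGE_BLUE_MEAN = 102.636  # blue mean of the entire image, as quoted

difference = ROOF_BLUE_MEAN - IMAGE_BLUE_MEAN

# The same difference expressed against each possible base.
relative_to_image = 100.0 * difference / IMAGE_BLUE_MEAN
relative_to_roof = 100.0 * difference / ROOF_BLUE_MEAN

print(f"difference: {difference:.2f} grey levels")
print(f"relative to image mean: {relative_to_image:.1f}%")
print(f"relative to roof mean: {relative_to_roof:.1f}%")
```

Whichever base is chosen, it needs to be applied consistently across all land cover types if the deviations are to be compared with one another in a key.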

Figure 14: Blue colour band pixel count for study area 

Histogram of mean blue channel pixel values for the c quadrant of ordnance 

survey sheet # 3012; building polygons in this area were found to vary by 20%, 

indicating that this variation has the potential to form part of a training algorithm 

used in the development of a key for automated image analysis. 

It should be noted that image quality and the degree of shade and light can vary 

according to the environmental variations present when the image was captured. 

This study is not concerned with determining an exact value for the capture of 

buildings but is instead looking for variations from the mean values in the entire 

image. The aim is to gather enough unique deviations (such as the variance found 

in the blue channel above) to narrow the remaining values into categories. 

An iterative run through a practical application might be: 

• Obtain the mean for unique known features (e.g. the blue channel in 

buildings). 

• Subtract these known features from the area polygon. 

• Obtain the mean values for the remaining pixels. 

• Compare these values relative to the known features and determine the 

most probable land coverage. 
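
The four steps above can be sketched for a single area polygon as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the pixels are assumed to be available as (red, green, blue) tuples, the comparison in the final step is assumed to be a nearest match on band ratios, and the ratios in `key` are hypothetical values invented for the example.

```python
from statistics import fmean


def band_means(pixels):
    """Mean of each colour band for a non-empty list of (red, green, blue) pixels."""
    return tuple(fmean(band) for band in zip(*pixels))


def most_probable_cover(polygon, known_indices, cover_ratios):
    """Follow the four steps above for one area polygon.

    polygon: list of (red, green, blue) pixels.
    known_indices: positions of pixels belonging to known features (e.g. roofs).
    cover_ratios: hypothetical key mapping a cover type to its expected band
        means, expressed as ratios of the known-feature means.
    """
    known = [p for i, p in enumerate(polygon) if i in known_indices]
    remainder = [p for i, p in enumerate(polygon) if i not in known_indices]
    if not known or not remainder:
        raise ValueError("need both known and unclassified pixels")
    known_mean = band_means(known)          # step 1: mean of known features
    remainder_mean = band_means(remainder)  # steps 2 and 3: subtract, then mean
    ratios = [r / k for r, k in zip(remainder_mean, known_mean)]  # step 4

    def distance(expected):
        return sum((a - b) ** 2 for a, b in zip(ratios, expected))

    return min(cover_ratios, key=lambda name: distance(cover_ratios[name]))


# Two roof pixels followed by two pasture-like pixels.
polygon = [(170, 200, 180), (172, 204, 178), (113, 157, 100), (112, 156, 99)]
key = {"pasture": (0.66, 0.77, 0.56), "cut pasture": (1.25, 1.05, 0.80)}
print(most_probable_cover(polygon, {0, 1}, key))
```

Each further iteration would extend the key with additional known features, so that the remaining unclassified pixels are compared against a growing set of relative values.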

Following this initial sampling of roof values, it emerged that the values were not 

consistent enough for reference to determine adjacent land use types (though the

fact that the polygons are coded vector polygons means they can be excluded from

pixel value searches of small area polygons). One aspect of the pixel values which

did emerge, and has potential for further analysis, was the clear division and pixel

count peaks for roofs with a pitch. Taken further, and matched to the set of

proportional values that this study sets out to establish, these values could

allow the ability to determine roof pitch to be included. This would

necessitate a separate study (over a large quantity of buildings across a wide

geographical area) and will not form part of this study, although the algorithm

outlined in the thesis would be extremely beneficial for such a study. 

The sampling for roof values in proportional reference to adjacent areas took three 

samples of separate land cover types and the road polygons adjacent to them. The 

three areas analyzed were: a section of pasture close to a farm dwelling, a section 

of mixed forest close to another dwelling (with several pitches in the roof) and a 

section of cut pasture close to a flat roofed shed. As mentioned above the 

sampling did not produce any useful data for the algorithm (except to eliminate 

the possibility of buildings being used for spectral reference). It did, however, 

outline the potential for a separate building analysis based on roof pitches. The 

following few paragraphs outline the results of the comparison. 


Roof test sample 1                       Mean pixel value    Standard deviation
Red                                      171.121             25.028
Green                                    202.818             22.936
Blue                                     178.773             17.709

Roof test adjacent sample 1 (pasture)
Red                                      113.004             7.169
Green                                    156.632             8.779
Blue                                     99.867              11.634

Table 20: Roof test sample value 1

The first sample took an area of pasture close to a dwelling, where the pasture gave the expected values for this type of land cover (outlined in the pasture sampling in this section of the thesis). The roof sample showed a relatively high level of standard deviation across the colour bands (although low by roof value standards) and the mean colour values returned, although high for a small area, could not be used to uniquely identify other features. The double peak (of shade and light) in the red and green colour bands indicated that the roof had a high pitch. One problem which makes using this type of surface difficult is the degree of reflection that can take place from the relatively smooth surface of roof covering, further distorting spectral values.


Roof test sample 2                              Mean pixel value    Standard deviation
Red                                             121.394             42.959
Green                                           144.111             41.557
Blue                                            128.182             32.159

Roof test adjacent sample 2 (mixed forestry)
Red                                             70.209              34.505
Green                                           104.58              33.996
Blue                                            86.887              20.602



The second sample took an area of mixed forestry and a dwelling adjacent to it; the dwelling had multiple pitches due to dormer windows. As with the first sample, the values of the area sampled adjacent to the roof were consistent with values calculated from samples of this land type (mixed forestry) elsewhere in this thesis. The roof sample itself had a high level of standard deviation (revealing a high level of reflection and shade) and a variation in mean colour band values from other roof samples consistent with the initial roof survey. From this it can be concluded that roof values are not a reliable indicator of forestry values and cannot be reliably used as a reference.


Roof test sample 3                           Mean pixel value    Standard deviation
Red                                          213.815             41.112
Green                                        223.148             34.272
Blue                                         208.148             42.432

Roof test adjacent sample 3 (cut pasture)
Red                                          205.074             7.903
Green                                        206.539             8.182
Blue                                         134.537             9.784


The third set of samples for determining the potential for using roof spectral 

values as a tool for validating adjacent area values used a section of cut pasture 

and a flat roofed shed. The values for the shed were high (with a correspondingly 

high level of standard deviation), indicating the level of reflection on the roof 

surface. By comparison the standard deviation across all three colour bands was 

low in the cut pasture, which is consistent with values identified elsewhere in this 

thesis. The compared mean values between the two samples were similar and this, 

together with the fact that buildings have been returning inconsistent spectral 

values in the study, suggests there is little to be gained by examining the 

relationship between the spectral values in roofs and adjacent polygons. This is 

something of a surprise, as at the outset of the research I had expected the 

relatively uniform nature of the shades and texture of roofing used in Ireland to 

provide an ideal reference for spectral analysis of aerial imagery. 


3.9 Pasture 

Figure 15: Typical Pasture Area Image 

This part of the study looks at spectral values for areas of pasture within a section of aerial photography, corresponding to 3 sq km of semi-rural County Galway. The aim is to establish baseline data for the relatively stable spectral values returned from this type of land cover. The polygons in question correspond to field areas bounded by walls or fences (and occasionally roads and water bodies). They can be described as stable in terms of their usage: they will almost always be turned over to one particular usage. In other words, they will not contain large areas of separate land cover within them (other than islands of forestry or water, which are already captured and can be eliminated from the image processing). The study looked at nine separate areas, two of which were freshly cut. The values returned were uniform, with the variation in the two freshly cut fields remaining consistent. Another feature which emerged was the low level of deviation for values in the red colour band, approximately one third of that returned from the image as a whole, across all the samples.

Pasture Sample 1     Mean Pixel Value    Standard Deviation
Red                  126.131             8.538
Green                173.848             9.45
Blue                 109.745             12.646

Pasture Sample 2
Red                  139.318             6.267
Green                182.748             8.016
Blue                 115.605             12.547

Pasture Sample 3
Red                  133.267             8.871
Green                173.4               10.926
Blue                 108.128             11.748

Pasture Sample 4
Red                  198.363             9.629
Green                207.759             10.881
Blue                 134.204             11.83

Pasture Sample 5
Red                  197.046             14.712
Green                207.816             13.067
Blue                 135.054             14.474

Pasture Sample 6
Red                  110.019             6.96
Green                163.79              8.215
Blue                 100.187             11.92

Pasture Sample 7
Red                  126.238             9.459
Green                168.753             9.56
Blue                 100.17              12.073

Pasture Sample 8
Red                  117.301             10.329
Green                169.254             11.042
Blue                 104.371             12.349

Pasture Sample 9
Red                  126.357             8.542
Green                183.48              9.611
Blue                 110.941             12.799

Table 23: Pasture sample values


From the above samples, samples 4 and 5 were of freshly cut fields, reflected in 

the high mean pixel value for the red colour band (almost 50% higher than the 

average for the image as a whole). This gives areas of freshly cut pasture a unique 

trait in a high red colour band mean and low standard deviation for a given 

polygon. This trait also distinguishes cut pasture as the red colour band mean 

values are over 35% higher than those of pasture in general. The mean green 

colour band pixel value for pasture is 25% more than that of the image as a whole; 

and when matched to a low standard deviation gives another strong indication of 

the type of land use. 
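The proportional comparison between cut pasture and pasture in general can be reproduced from the red band figures in Table 23. The short sketch below is illustrative only: the data are copied from the table, while `looks_like_cut_pasture` and its two thresholds (`min_uplift`, `max_std`) are hypothetical and would need calibrating against a larger sample.

```python
from statistics import mean

# Red band (mean, standard deviation) for pasture samples 1 to 9 in Table 23.
PASTURE_RED = [
    (126.131, 8.538), (139.318, 6.267), (133.267, 8.871),
    (198.363, 9.629), (197.046, 14.712), (110.019, 6.96),
    (126.238, 9.459), (117.301, 10.329), (126.357, 8.542),
]
CUT_SAMPLES = (4, 5)  # sample numbers identified in the text as freshly cut


def percent_higher(value, baseline):
    """Percentage by which value exceeds baseline."""
    return 100.0 * (value - baseline) / baseline


all_red_mean = mean(m for m, _ in PASTURE_RED)
cut_red_mean = mean(PASTURE_RED[n - 1][0] for n in CUT_SAMPLES)
cut_uplift = percent_higher(cut_red_mean, all_red_mean)


def looks_like_cut_pasture(red_mean, red_std, min_uplift=35.0, max_std=15.0):
    """Hypothetical rule: red mean well above pasture in general, low deviation."""
    return percent_higher(red_mean, all_red_mean) >= min_uplift and red_std <= max_std
```

Here the baseline is the mean over all nine samples, which is one reading of "pasture in general"; excluding the two cut fields from the baseline would give a larger uplift.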

The above data is useful for indicating the presence of earthworks and landscaped areas within a small polygon. With a key calibrated to identify polygons containing the above values, it should be possible to accurately calculate the percentage of land area given to pasture, and also to improve the accuracy of any process run on imagery (in conjunction with vector data) by reducing the number of polygons for analysis.

Pasture also plays an important part in the algorithm this thesis is attempting to map out. This is for two reasons. The first is that its uniform property (in terms of the standard deviation of the pixels from mean values and also the relative proportional difference between those mean values and the mean values for other land coverage types) allows it to form the basis of keys generated at the beginning of image processing to identify land use. The second reason is that in an Irish context it is by far the biggest form of land use, and any algorithm processing aerial photography (even in semi-urban areas) will have successfully identified the majority of the surface area by correctly flagging area polygons of pasture. Of the remaining areas, most can be identified by polyline coding from the vector data, leaving the study with a higher chance of success in obtaining useful information about the remaining areas. It is useful to divide the classes of pasture into the two categories identified in the samples above, that is, to create a separate category of cut pasture for the purposes of executing a comparative survey of polygons in the aerial photography. With this in mind, the following test samples were compared against the spectral values of both.


Pasture test sample 1 (track)                  Mean pixel value    Standard deviation
Red                                            213.952             18.444
Green                                          236.762             16.532
Blue                                           194.833             19

Pasture test sample 2 (coniferous forestry)
Red                                            72.566              19.204
Green                                          111.643             21.468
Blue                                           96.381              16.026

Pasture test sample 3 (mixed forestry)
Red                                            69.968              32.486
Green                                          104.287             32.054
Blue                                           86.217              19.492

Table 24: Pasture test sample values

The above samples were chosen from areas outside those already captured for the 

baseline survey of the spectral values for track, coniferous and mixed forestry. 

The track was chosen as hard cover gives unique high values and is of benefit in 

calibrating a proportional difference between target areas (such as pasture in this 

case) and its high values. As was noted above, pasture itself is also useful in 

calibrating a key for an algorithm due to its specific mean colour values in the red 

colour band and low level of standard deviation from those mean values, but is not 

coded in the original vector input data and can only be derived from the image 

processing. The area of mixed forestry was chosen for its high level of standard 

deviation and the fact that it has similar spectral values to rough pasture. Mixed 

forestry will be identified and present in most imagery (or the specific properties 

could be pre-set into a key for the automated analysis) so using those values 

would allow an automated processing technique to eliminate areas with this level 

of high standard deviation from the process. 


For the track/hard cover, the mean pixel value (converted to greyscale) for the red colour band was significantly higher than that of pasture. This was also the case for the mean of the green colour band, where the variation was upwards of 40% for the pasture which had not been cut. It should be noted, however, that the mean pixel value in the green colour band was higher for cut pasture and this variation brought it closer to the value of hard ground. The mean of the blue colour band in the test sample was consistently 40% higher than in all the samples of pasture (including the cut pasture samples). This differential (in conjunction with a low standard deviation and a mean 10% to 40% lower than hard ground in the red and green colour bands) could be used to determine the presence of pasture.

The two forestry samples showed a mean pixel value in the red colour band of approximately 50% less than pasture, with an increased standard deviation for both. In terms of distinguishing between coniferous and mixed forestry, the standard deviation was one third higher for the red and green colour bands in the mixed 

forestry. The mean value for the green colour band in both of the forestry types 

was 40% lower than the value of the mean in the pasture, giving another relative 

indicator to calibrate pasture values from. Small areas of forestry are found close 

to most semi urban areas and the fact that they are coded and measured means 

they are useful in training an algorithm to match land use to neighbouring 

polygons. 

The aim of this sampling is to try and achieve a number of probability factors that 

will allow an automated process to determine land use or coverage types based on 

the available data. With this in mind a broad variety of samples were taken and the 

relative proportions between them assessed. In the above example the pasture area 

was compared against other known area types. The increased amount of cross 

checking allows the elimination of areas which conform to a particular type. In 

this way the image is gradually broken down until every polygon is referenced. 

The value of this is not necessarily in the ability to classify field use (although this 

has some merit in that it adds value to available mapping) but in the identified 

process. This enables a piece of software to be developed which can be reused according to a user's requirements. The process would remain constant (known values eliminated until the user is left with a set of polygons not conforming to known values and matching a set of target values identified by the user). The 

system could accept a set of co-ordinates recorded on a mobile GPS device or 

alternatively allow the user to view ordnance survey vector mapping draped over 

the orthophoto and manually enter the location of sample target values. In this 

way the process would allow the user to quickly establish the initial conditions for 

an automated search of aerial photography and return the spatial extent of the 

values input. 
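The constant part of this process (eliminate polygons that conform to known values, then return those matching the user's target values) might be sketched as follows. The function names, the dictionary layout and the fractional `tolerance` are assumptions made for illustration; the study does not prescribe a particular matching rule.

```python
def matches(stats, key, tolerance):
    """True when every channel mean lies within tolerance (a fraction) of the key."""
    return all(abs(s - k) <= tolerance * k for s, k in zip(stats, key))


def search(polygons, known_keys, target, tolerance=0.1):
    """Return the ids of polygons that match the target values.

    polygons:   polygon id -> (red, green, blue) mean pixel values.
    known_keys: land cover name -> (red, green, blue) key values to eliminate.
    target:     (red, green, blue) values sampled by the user.
    """
    remaining = {
        pid: stats
        for pid, stats in polygons.items()
        if not any(matches(stats, key, tolerance) for key in known_keys.values())
    }
    return sorted(pid for pid, stats in remaining.items()
                  if matches(stats, target, tolerance))
```

In use, the target tuple would come from the sample locations the user enters (for example from a GPS device or by picking a location on the draped vector mapping), and the known keys from the baseline sampling in this chapter.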


3.10 Rough Pasture 

Figure 16: Typical Rough Pasture Area Image 

This part of the study focuses on areas of land cover known as rough pasture. They correspond to areas where scrub/mixed forestry or overgrown vegetation are present. In general terms the label could be used to refer to areas of bad pasture, but such areas could also be found close to an urban area where the land is not in use. Rough pasture is similar to mixed forestry but would be slightly lighter than forested areas (and returned spectral values accordingly, by app. 75 across all colour bands). The values for this type of coverage could be particularly useful in determining whether the target polygon was in use or not. The results from the sample areas returned a high level of deviation from the mean value in both the red and green colour bands. This would not be expected where the land was in use. If the polygon being examined was not part of an urban development (comprising a mixture of artificial and landscaped surface cover) and was in use, then a relatively low level of deviation from the mean value would result from human activity. For example, the results from a study of polygons used for pasture in this survey returned a deviation of almost one quarter of what was found here.


Rough Pasture Sample 1     Mean pixel value    Standard Deviation
Red                        81.61               31.475
Green                      120.334             31.922
Blue                       94.606              17.356

Rough Pasture Sample 2
Red                        73.432              34.752
Green                      114.44              36.594
Blue                       95.898              18.988

Rough Pasture Sample 3
Red                        69.694              35.687
Green                      111.165             37.813
Blue                       95.857              18.771

Rough Pasture Sample 4
Red                        66.43               27.096
Green                      105.61              28.348
Blue                       94.081              17.243

Table 25: Rough pasture sample values

As mentioned above, it could be possible to obtain similar values to rough pasture 

in a spectral analysis of mixed forestry or certain urban developments. This, 

however, does not present a problem for the potential image key as both those 

areas can be eliminated from an automatic survey due to the fact that both mixed 

forestry and areas containing buildings and dwellings can be identified by 

previously captured attributes. In terms of the spectral values for the image as a 

whole the red band was 35% lower, the green 17% lower and blue 7% lower 

across the samples. While this alone does not give a complete indication of rough 

pasture, when coupled with high levels of standard deviation in the red and green 

colour bands the probability of this type of land cover is high. When coupled with 

an additional vetting process (eliminating known land cover types), it should be 

possible to identify rough pasture using lower red and green colour band values 

with high standard deviation. 
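A rule of this kind could be expressed as a simple test on the red and green band statistics of a polygon that has already passed the vetting step. The sketch below is an assumption-laden illustration: the function name, the `min_std` threshold and the whole-image means passed in are hypothetical, and real thresholds would have to be calibrated per image.

```python
def looks_like_rough_pasture(red, green, image_red_mean, image_green_mean,
                             min_std=25.0):
    """Flag a polygon as probable rough pasture.

    red, green: (mean, standard deviation) pairs for the polygon.
    image_red_mean, image_green_mean: band means for the image as a whole.
    min_std: hypothetical lower bound for a 'high' standard deviation.
    """
    red_mean, red_std = red
    green_mean, green_std = green
    lower_means = red_mean < image_red_mean and green_mean < image_green_mean
    high_spread = red_std >= min_std and green_std >= min_std
    return lower_means and high_spread
```

Note that water also has low red and green means, so it is the high standard deviation, together with the prior elimination of known land cover types, that carries most of the weight in this rule.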


The identification of rough pasture using an automated process would also lend itself to the study of abandoned developments. It could reasonably be assumed that the presence of rough pasture within more than half the plots of land in a development indicates that it has been abandoned. This might allow for a quick survey of developments to determine the level of repair by feeding the associated polygon set into a process designed to identify the percentage of rough pasture. In particular, levels of pixel values in the red colour band similar to those above would indicate that the site has been abandoned. On an anecdotal level, the rate at which former construction sites can be reclaimed by vegetation has been impressive over the last few years of good growing weather. Although there are several other ways to identify incomplete dwellings, such as missing access roadways and ordnance survey field revision text, the presence of spectral values corresponding to rough pasture would indicate long term neglect.
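The "more than half the plots" assumption translates directly into a small check over the classified plots of a development. This is a minimal sketch; the function name and the label string are hypothetical, and the labels are assumed to come from an earlier classification step.

```python
def development_abandoned(plot_labels, label="rough pasture"):
    """True when more than half of the plots are classified as rough pasture."""
    rough = sum(1 for plot in plot_labels if plot == label)
    return rough > len(plot_labels) / 2
```

A development with exactly half its plots in rough pasture is not flagged, following the "more than half" wording above.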

Because of the relatively high level of standard deviation associated with the sampling for rough pasture, it is probably one of the more important parts of automated image processing to identify: once these areas are removed from the search, remaining areas displaying similar levels of deviation in the red and green colour bands can be analyzed more closely (possibly even with flagged histograms for users to identify). In order to recognise the values discovered relative to the rest of the image, three separate areas of cover were sampled.


Rough Pasture testing sample 1 (pasture)    Mean pixel value    Standard deviation
Red                                         131.413             10.813
Green                                       167.157             9.494
Blue                                        102.256             12.591

Rough Pasture testing sample 2 (water)
Red                                         35.191              4.374
Green                                       68.865              6.877
Blue                                        83.6                12.964

Rough Pasture testing sample 3 (road)
Red                                         218.867             7.405
Green                                       240.2               9.321
Blue                                        197.533             10.802

Table 26: Rough pasture test sample values

The first testing sample for comparison to rough pasture was pasture; this contrasts well with rough pasture and is a useful benchmark in the algorithm. As with the similar (in terms of pixel values) area of mixed forestry, rough pasture has high levels of standard deviation from a relatively low mean pixel value for the red colour band (app. 70 on the converted greyscale). This contrasts well with the mean red pixel value expected for pasture (almost double), something which was borne out in the pasture sample. The level of standard deviation in pasture is correspondingly low (app. one third of the value found in rough pasture for the red and green colour bands). The fact that pasture is often adjacent to rough pasture also makes this a very useful comparative measurement. It should also be noted that this is useful in terms of the vector spatial data, as the same boundary polyline will flag both areas and could be used to refine the algorithm. This is helpful to automated image interpretation as it allows a reduced set of values to be applied over the first iteration of the analysis. In this way an initial analysis by polygon can apply the spectral values to a smaller subset of possible neighbouring polygons and save time, opening up the possibility for the software cycle being suggested here to present an early estimate of the type of surface coverage the image is composed of.

One other factor that could be included in a search for rough pasture is the 

presence of symbols within polygons of this type in the vector mapping. Initial 

steps within the algorithm could extract all polygons and all indication symbols 

(the polylines surrounding known areas of rough pasture are not coded) and retain 

the set of polygons where they intersect. These could then be discounted from the 

automated image analysis. 
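The intersection of polygons and indication symbols can be illustrated with a standard ray-casting point-in-polygon test. This is a sketch under assumptions rather than the method used in the study: each symbol is assumed to be reduced to a single (x, y) insertion point, each polygon to a list of vertices, and the function names are hypothetical.

```python
def point_in_polygon(point, ring):
    """Ray-casting test; ring is a list of (x, y) vertices (closing vertex optional)."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Count edges that a horizontal ray from the point towards +x crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygons_with_symbols(polygons, symbols):
    """Ids of polygons containing at least one indication symbol point."""
    return {
        pid for pid, ring in polygons.items()
        if any(point_in_polygon(s, ring) for s in symbols)
    }
```

The returned set is the group of polygons that could then be discounted from the automated image analysis. Points lying exactly on a boundary are not handled specially in this sketch.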

The second sample area was an area of water. This was taken from a lake present on the plan because, as was indicated earlier in the study, only water polygons above the level of a stream (wider than 3m) return a pixel sample useful for reference. The values obtained matched those identified in the water area sampling in this thesis and contrasted with values for rough pasture (less than 50% of the red and 50% of the green colour band values of rough pasture). The level of standard deviation also varied greatly, with values almost six times higher in rough pasture. This makes water a good benchmark to rate the probability of rough pasture against, using the percentage increase on the values for the red and green colour bands as a reference.

The third value sampled was a section of road. This sample is unique to this part 

of the study, so as to avoid overlap with road samples taken elsewhere and ensure 

the values remain consistent for the feature type. Road was selected for sampling 

because, as with water, it contains relatively unique spectral properties with a low 

level of standard deviation. This allows it to be used as a stable contrast to the 

values found in rough pasture (which are similar to some types of forestry and 

contain a high level of standard deviation in all three colour bands). For the red 

colour band the road had an increased mean value of over 20% on those found in 

rough pasture, with a standard deviation of less than a third of the rough pasture 

value. The green colour band showed the highest difference returning a value of 

over twice that found in the rough pasture samples and, as with the red colour 

band, a level of standard deviation less than a third of rough pasture. The blue 


colour band also returned a distinct difference, with a mean value for road over 

twice that of rough pasture. 

The last two sample areas gave a clear indicator for use in the identification of 

rough pasture. When these are combined with the vector attributes (symbol search 

and nearest neighbour probability) the algorithm is equipped with a robust means 

of ensuring as many areas of this type of land cover are identified and removed 

from the search as possible. In other words, if a user (of the proposed image search 

process) was seeking to identify a crop type (or disease within that crop type etc.) 

they could easily set the process to discount rough pasture from the analysis. 


4 Testing 

This chapter describes a set of tests for known polygon types in the imagery being used in the study. Four land use types were selected (pasture, rough pasture, marsh and bog) and the polygons were clipped from the raster imagery for spectral analysis. The process followed the outline presented in Chapter 2, with the vector coordinates being identified from the ordnance survey data. This set of coordinates was then used to extract the corresponding section of image, which in turn was analyzed for its spectral values. The overall results were good in terms of there being unique proportional pixel count values present in the polygons (selected from across the image with typical biasing factors such as shade present). Of the four sample areas, bog and pasture had the least amount of standard deviation; it was possible to automatically classify pasture even with a high proportion of shade present in the image section. Marsh and rough pasture had similar levels of standard deviation but can be distinguished by a dip in values between the shade and vegetation range in rough pasture, which was not present in the marsh polygons. This chapter is divided into four sections; as with the previous chapter, the results are grouped according to a particular area of interest, starting with a look at the values from the known pasture polygons.
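The dip between the shade and vegetation ranges that separates rough pasture from marsh could be detected on a 256-bin histogram along the following lines. This is a hypothetical sketch: the two value ranges, the `max_ratio` threshold and the function name are assumptions, and the shade range is assumed to lie below the vegetation range.

```python
def has_dip(hist, shade_range, vegetation_range, max_ratio=0.5):
    """True when the histogram drops between the shade and vegetation peaks.

    hist: list of 256 pixel counts for one colour band.
    shade_range, vegetation_range: inclusive (low, high) pixel value ranges,
        with the shade range assumed to lie below the vegetation range.
    max_ratio: hypothetical threshold; the valley must fall to this fraction
        of the lower of the two peaks to count as a dip.
    """
    s_lo, s_hi = shade_range
    v_lo, v_hi = vegetation_range
    shade_peak = max(range(s_lo, s_hi + 1), key=lambda v: hist[v])
    veg_peak = max(range(v_lo, v_hi + 1), key=lambda v: hist[v])
    lower_peak = min(hist[shade_peak], hist[veg_peak])
    if lower_peak == 0:
        return False  # one of the ranges is empty, so there is nothing to compare
    valley = min(hist[shade_peak:veg_peak + 1])
    return valley <= max_ratio * lower_peak
```

A polygon with a high standard deviation and a dip of this kind would lean towards rough pasture, while a similar spread without the dip would lean towards marsh.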


4.1 Pasture Test 

Four areas of pasture were sampled across varying degrees of shade and 

unbounded (from vector data) internal ground cover outside pasture (exposed 

rock). The first of these was an area in the south west of the sample area and 

comprised a polygon surrounded by five fence vectors close to the road. The 

sample was chosen to see if the expected values for the mean and standard 

deviation across the colour bands would be reflected in a section of image with a 

relatively high degree of variety in terms of ground cover. 

The data was sampled using the Radius software where the polyline values were 

exported to an ASCII file. This process will be easier once GML format spatial 

data becomes available (not available at the time of writing but will be over the 

next few years, OSI 2010). This process has two requirements in order for the 

polygon to be correctly sampled: 

• Firstly the line orientation needs to be the same for all polylines which 

bound the sample polygon; by convention this is anticlockwise (left of the 

line direction falling on the inside of the area to be extracted). 

• Secondly the first and last co-ordinates must match, with no other 

duplicate values present in the file. 
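Both requirements can be checked programmatically once the bounding polylines have been merged into a single list of co-ordinates. The sketch below is illustrative and makes several assumptions: the ring is a list of (x, y) tuples in an easting/northing system (y increasing northwards), the shoelace formula is used to test orientation, and the function names are hypothetical.

```python
def signed_area(ring):
    """Shoelace formula; positive for an anticlockwise closed ring."""
    return 0.5 * sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )


def validate_ring(ring):
    """Return a list of problems found in a polygon ring (empty if valid)."""
    problems = []
    closed = len(ring) >= 4 and ring[0] == ring[-1]
    if not closed:
        problems.append("first and last co-ordinates do not match")
    # Ignore the closing vertex when looking for duplicates.
    body = ring[:-1] if closed else list(ring)
    if len(set(body)) != len(body):
        problems.append("duplicate co-ordinates present")
    # Test orientation on an explicitly closed copy of the ring.
    if signed_area(body + body[:1]) <= 0:
        problems.append("ring is not anticlockwise")
    return problems
```

A check like this could replace the manual text editor search for duplicate values, although it validates only the merged ring and not the orientation of each individual polyline before merging.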

These qualifiers can be validated at the time of extraction once GML data is available; for this study, however, a text editor was used to search for duplicate values.


Figure 17: Creating the ASCII file 

Once the ASCII file was created, it was used to extract the region of interest from 

the file using the Mirone cropping tool: 

Figure 18: Aerial view of pasture test 1 


This then presented a sample area which can be analyzed for pixel values; note the high degree of shade in the south east and exposed rock in the north west of the area. The next step in the process involved analyzing the histogram data for the target sample. This was completed in Geomatica, with the pixel count reduced to remove the values counted outside the cropped area. In other words, the irregular image was stored as a GeoTiff image which placed the area onto a blank background, resulting in a high pixel count (in the order of thousands) for values above the expected range, which were discounted by reducing the pixel count.
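Outside a desktop package, the same discounting of background pixels can be done by excluding the background value before the histogram and statistics are computed. The following is a minimal sketch, not the Geomatica procedure: it assumes the band is available as a flat list of 8-bit integers and that the blank background is stored as the value 255, which is an assumption about the export rather than something stated in the source.

```python
from statistics import mean, pstdev


def band_histogram(values, background=255):
    """256-bin pixel count for one colour band, ignoring background pixels."""
    hist = [0] * 256
    for v in values:
        if v != background:
            hist[v] += 1
    return hist


def band_stats(values, background=255):
    """(mean, standard deviation) of the band with background pixels removed."""
    kept = [v for v in values if v != background]
    return mean(kept), pstdev(kept)
```

One limitation of this approach is that any genuine pixel with the background value inside the polygon is discarded as well; a proper no-data mask would avoid that.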

Figure 19: Red colour band for pasture test 1 

The above histogram shows the values for the red colour band, with a clear peak 

for pasture values (see sampling section). The lesser spike of pixels of lower 

values is consistent with the type of shade expected in this type of area, while the 

slightly higher values returned by the exposed rock distorted the standard 

deviation slightly. The area's content, however, can be clearly seen in the spike for 

pasture, something which is clearer to see in the histogram for the green colour 

band: 


Figure 20: Green colour band for pasture test 1 

In the above sample, the values for both shade and pasture are evident in spikes in 

the pixel count. 

The second sample gave a clearer representation of the area type with a relatively 

monochrome sample taken from the south east of the imagery: 


The relatively low level of shade and lack of exposed earth allows the algorithm to 

classify the area as pasture, based on the values from both the red and green 

colour bands: 



The red colour band histogram (above) displayed a peak at the expected pasture value, while the small count for lower values indicates the shade present in the east of the area. Note: The high value at the far right is the count for blank 

pixels represented by null values outside the cropped section present in the 

GeoTiff image. These values were discounted by lowering the total pixel count to 

a range within which expected levels of values will fall. The values for the green 

colour band also showed that the sample could correctly be identified as pasture 

from the peak count values found in this band (see sampling section for an 

elaboration on how these values were arrived at). 



The third sample took a polygon to the centre of the plan, close to areas of bog, 

rough pasture and marsh. It had a high degree of shade and also contained a 

relatively high degree of variation in use (the darker band to the north, which 

contained trees; the dark band to the south is shade), although it is still an area of 

pasture. 


The histogram values for both the red and green colour bands still, however, 

pointed to a polygon with spectral values which fall within the expected range of 

pasture: 



The above red colour band sample shows the peak at the pasture band with the 

variance in terms of darker values representing the scrub/ bushy part of the sample. 

The values are slightly higher than the overall mean for pasture, but still fall inside 

what might be expected for an area of cut pasture. 



The peak for the green colour band, around 190 on the converted greyscale, means 

that the slightly higher peak for the red colour band could be accepted as variance 

within the pasture range. 
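The reasoning used in these tests (locate the histogram peak in each band and accept the polygon when the peaks fall inside the expected ranges) can be sketched as below. The function names and the idea of passing expected ranges as a dictionary are assumptions for illustration; the ranges themselves would come from the baseline sampling and are not fixed by the study. The histograms are assumed to have had background pixels removed already.

```python
def peak_value(hist):
    """Pixel value (0 to 255) with the highest count in a 256-bin histogram."""
    return max(range(len(hist)), key=lambda v: hist[v])


def peaks_within(hists, expected_ranges):
    """True when every band's peak lies inside its expected (low, high) range.

    hists:           band name -> 256-bin histogram.
    expected_ranges: band name -> inclusive (low, high) pixel value range.
    """
    return all(
        low <= peak_value(hists[band]) <= high
        for band, (low, high) in expected_ranges.items()
    )
```

Requiring agreement between the red and green bands mirrors the way the green peak is used above to accept a slightly high red peak as variance within the pasture range.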

The fourth sample that was used to test the expected ranges for the presence of 

pasture from automatically extracted polygons was an area to the south east of the 

image which was close to farm buildings and used as pasture, giving a relatively 

clean example of the land use type to test the algorithm against: 

Figure 27: Vector data for pasture test 4 

The above screen grab from the vector data (farm buildings are to the right) shows how the process being developed in this study uses known controlled coordinates to crop the imagery into a mosaic of tiles for processing (same area below):


The values for the red colour band were consistent with the expected values and, 

with the small level of distortion as a result of shade increasing the level of values 

counted in the lower part of the colour range, the polygon can easily be processed 

as an area of pasture: 


Similar results were returned for the green colour band (below), with the shade causing some distortion to the pixel count, although the overall values still indicate an area of pasture:



The above sampling demonstrates that it is possible to cut up an image based on controlled vector data and analyze each section with some degree of success. The aim of this sampling was not to show absolute values for each land type; it was only to prove the theory that, when given correct vector data representing small area polygons, valuable data can be further derived based on spectral values. A further advantage for anyone using the suggested algorithm is that for this type of land cover (peri-urban and rural but close to settlements, which covers most of the country) the controlled vector data has changed little over time, which means the process also lends itself to studies looking at change over time.


4.2 Rough Pasture Test 

Four areas of pasture were sampled across varying degrees of shade and 

unbounded (from vector data) internal ground cover outside pasture (exposed 

rock). The first of these was an area in the south west of the sample area and 

comprised a polygon surrounded by five fence vectors close to the road. The 

sample was chosen to see if the expected values for the mean and standard 

deviation across the colour bands would be reflected in a section of image with a 

relatively high degree of variety in terms of ground cover. 

The data was sampled using the Radius software where the polyline values were 

exported to an ASCII file. This process will be easier once GML format spatial 

data becomes available (not available at the time of writing but will be over the 

next few years, OSI 2010). This process has two requirements in order for the 

polygon to be correctly sampled: 

• Firstly the line orientation needs to be the same for all polylines which 

bound the sample polygon; by convention this is anticlockwise (left of the 

line direction falling on the inside of the area to be extracted). 

• Secondly the first and last co-ordinates must match, with no other 

duplicate values present in the file. 

These qualifiers can be validated at the time of extraction once GML data is available; for this study, however, a text editor was used to search for duplicate values.


Figure 31: Vector data for rough pasture test 1 

Once the ASCII file was created, it was used to extract the region of interest from 

the file using the Mirone cropping tool: 

Figure 32: Aerial view of rough pasture test 1 


This then presented a sample area which can be analyzed for pixel values –note 

the high degree of shade in the south east and exposed rock in the north west of 

the area. The next step in the process involved analyzing the histogram data for 

the target sample. This was completed in Geomatica, with the pixel count reduced 

to remove the values counted outside the image area. In other words the

irregular image was stored as a GeoTiff image which placed the area onto a blank 

background, resulting in a high pixel count for values above the expected range 

(in the order of thousands), which were discounted by reducing the pixel count. 
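
As an illustration of the histogram step, the sketch below builds a per-band pixel count and its mean and standard deviation while skipping the blank background. It is not the Geomatica procedure used in this study: it swaps the "reduce the pixel count" adjustment for an explicit test against a fill colour. The names `band_histogram` and `band_stats` are hypothetical, and the fill colour of pure white is an assumption about how the blank background is written.

```python
def band_histogram(pixels, band, fill=(255, 255, 255)):
    """Count values 0..255 in one band, ignoring pixels equal to the fill colour.

    pixels: iterable of (red, green, blue) tuples; band: 0, 1 or 2.
    The fill colour is an assumption; a genuine pixel of that colour is lost too.
    """
    counts = [0] * 256
    for pixel in pixels:
        if pixel == fill:
            continue  # blank background around the irregular polygon
        counts[pixel[band]] += 1
    return counts


def band_stats(counts):
    """Mean and (population) standard deviation of a 256-bin histogram."""
    n = sum(counts)
    if n == 0:
        raise ValueError("histogram is empty")
    mean = sum(value * c for value, c in enumerate(counts)) / n
    variance = sum(c * (value - mean) ** 2 for value, c in enumerate(counts)) / n
    return mean, variance ** 0.5
```

The returned mean and standard deviation can then be compared against the expected ranges in the image key for each colour band.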

Figure 33: Red colour band for rough pasture test 1 

The above histogram shows the values for the red colour band, with a clear peak 

for pasture values (see sampling section). The lesser spike of pixels of lower 

values is consistent with the type of shade expected in this type of area, while the 

slightly higher values returned by the exposed rock distorted the standard 

deviation slightly. The area's content, however, can be clearly seen in the spike for

pasture, something which is clearer to see in the histogram for the green colour 

band: 


Figure 34: Green colour band for rough pasture test 1 

In the above sample, the values for both shade and pasture are evident in spikes in 

the pixel count. 

The second sample gave a clearer representation of the area type with a relatively 

monochrome sample taken from the south east of the imagery: 


The relatively low level of shade and lack of exposed earth allows the algorithm to 

classify the area as pasture, based on the values from both the red and green 

colour bands: 



The red colour band histogram (above) displayed a peak at the expected pasture 

value, with the small count for lower values indicating the shade present in

the east of the area. Note: The high value at the far right is the count for blank

pixels represented by null values outside the cropped section present in the 

GeoTiff image. These values were discounted by lowering the total pixel count to 

a range within which expected levels of values will fall. The values for the green 

colour band also showed that the sample could correctly be identified as pasture 

from the peak count values found in this band (see sampling section for an 

elaboration on how these values were arrived at). 



The third sample took a polygon to the centre of the plan, close to areas of bog, 

rough pasture and marsh. It had a high degree of shade and also contained a 

relatively high degree of variation in use (the darker band to the north, which 

contained trees – the dark band to the south is shade), although it is still an area of

pasture. 


The histogram values for both the red and green colour bands still, however, 

pointed to a polygon with spectral values which fall within the expected range of

pasture: 



The above red colour band sample shows the peak at the pasture band with the 

variance in terms of darker values representing the scrub/bushy part of the sample.

The values are slightly higher than the overall mean for pasture, but still fall inside

what might be expected for an area of cut pasture.



The peak for the green colour band, around 190 on the converted greyscale, means 

that the slightly higher peak for the red colour band could be accepted as variance 

within the pasture range. 

The fourth sample that was used to test the expected ranges for the presence of 

pasture from automatically extracted polygons was an area to the south east of the 

image which was close to farm buildings and used as pasture, giving a relatively 

clean example of the land use type to test the algorithm against: 

Figure 41: Vector data for rough pasture test 4 

The above screenshot from the vector data (farm buildings are to the right) shows

how the process being developed in this study uses known controlled coordinates

to crop the imagery into a mosaic of tiles for processing (same area below):



The values for the red colour band were consistent with the expected values and, 

with the small level of distortion as a result of shade increasing the level of values 

counted in the lower part of the colour range, the polygon can easily be processed 

as an area of pasture: 


Similar results were returned for the green colour band (below), with the shade 

causing some distortion to the pixel count but the overall values indicate an area 

of pasture: 



The above sampling demonstrates that it is possible to cut up an image based on 

controlled vector data and analyze each section with some degree of success. The 

aim of this sampling was not to show absolute values for each land type; it was

only to prove the theory that, given correct vector data representing small

area polygons, valuable data can be further derived from spectral values. A

further advantage for anyone using the suggested algorithm is that for this type of

land cover (peri-urban and rural but close to settlements – which covers most of

the country) the controlled vector data has changed little over time and means the 

process also lends itself to studies looking at change over time. 


4.3 Marsh Test 

The testing for the marsh areas involved taking four known (but

geographically separate) marsh areas from the imagery as ASCII co-ordinate files and using

this data to extract the raster sections for spectral analysis. The sampling section 

of this study found marsh to have a relatively low level of standard deviation 

across the colour bands (when compared to the similar rough pasture type 

polygons). This might have been expected to change as there can be a relatively 

large degree of shade in these areas due to the presence of other vegetation – and to

some extent this was the case. The samples, however, despite a high degree of 

standard deviation, did display some unique properties in line with the sampling 

section which allow them to be automatically classified. In general terms polygons 

with the value ranges displayed below will belong to either rough pasture or marsh 

(see sampling section; this refers to Irish small area polygons only).

Rough pasture displays two peaks of values where the shade and vegetation

contrast; the marsh samples did not have this distinction, and the range of values

graduated towards a single peak (about 10 lower than rough pasture on the converted

greyscale).
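
The distinction between two peaks and a single graded peak can be expressed as a simple peak count on a smoothed histogram. The sketch below is one possible way to do this and is not taken from the study; the function names and the `window`, `min_share` and `min_gap` parameters are hypothetical and would need calibrating against the image key.

```python
def smooth(counts, window=5):
    """Moving average over the histogram to suppress single-bin noise."""
    half = window // 2
    out = []
    for i in range(len(counts)):
        lo, hi = max(0, i - half), min(len(counts), i + half + 1)
        out.append(sum(counts[lo:hi]) / (hi - lo))
    return out


def find_peaks(counts, window=5, min_share=0.2, min_gap=15):
    """Return greyscale values of local maxima in the smoothed histogram.

    A maximum is kept if it reaches min_share of the tallest smoothed bin and
    lies at least min_gap greyscale values from any stronger peak already kept.
    """
    s = smooth(counts, window)
    top = max(s)
    if top == 0:
        return []
    candidates = [
        i for i in range(1, len(s) - 1)
        if s[i] >= min_share * top and s[i] >= s[i - 1] and s[i] > s[i + 1]
    ]
    peaks = []
    for i in sorted(candidates, key=lambda i: s[i], reverse=True):
        if all(abs(i - p) >= min_gap for p in peaks):
            peaks.append(i)
    return sorted(peaks)


def shape_hint(counts):
    """Describe the histogram shape in the terms used in this section."""
    return "two peaks" if len(find_peaks(counts)) >= 2 else "single graded peak"
```

Under these assumptions a histogram with separate shade and vegetation spikes reports two peaks, while a gradual rise from shade values to one maximum reports a single graded peak.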


The first sample area came from a polygon of marsh north of a lake and included a 

large amount of vegetation. 

Figure 45: Vector data for marsh test 1 

The sampling for these was from separate areas across the imagery used in the 

study to try and achieve as consistent a picture as possible of this type of area. The 

co-ordinates of the area were extracted from the vector data (above) in ITM 

projection and used to clip the polygon from the raster imagery (below). As with 

all other samples in this test, the final results were adjusted by lowering the pixel

count so as to account for blank pixels created in the output GeoTiff of the

irregular clipped polygon: 


Figure 46: Aerial view of marsh test 1 

The results for the red colour band pixel count were similar to the spectral values 

for rough pasture but had a definite gradient between the areas of shade and vegetation,

as opposed to two peaks for growth consistent with high vegetation, which 

emerged as typical of the marsh samples taken during the testing of this algorithm 

(the sampling section had a lower level of standard deviation but the full polygon 

samples also included shade and other growth, as in the east of the sample above): 


Figure 47: Red colour band for marsh test 1 

As with the spectral values for the red colour band, the green colour band showed 

a gradual increase in pixel count values along the greyscale from those related to 

areas of shade to those falling within the expected range for marsh: 

Figure 48: Green colour band for marsh test 1

The above trend would emerge as consistent through the marsh value testing, with 

a consistent graph for blue (within the sampling range) with values evenly 


distributed either side of a peak between 90 and 100 on the converted greyscale. 

The fact that these values mirror rough pasture closely means that the marsh areas

can only be classified once rough pasture values have been fully identified 

(towards the end of the third step in the algorithm). 

Figure 49: Blue colour band for marsh test 1 

The second known marsh area selected (outlined below) to test the algorithm was 

taken from an area east of the first sample, which included some areas of tree 

cover (north east) and vegetation close to rough pasture (south): 



One of the advantages of using vector data which has already been controlled to 

cut the photography into a mosaic of polygons is the extent to which the set data 

points mirror detail that would be extremely difficult to identify using automatic 

methods (such as the outline of the drain, visible in the indent on the eastern edge 

of the above polygon). The values for this larger sample area, although it 

contained more tree cover than the first sample, remained consistent with all four

samples with a gradual incline from shade values to the expected values for marsh 

in the red colour band (following page): 



This gradual increase in values from shade to marsh (as per the original sampling 

for this study, see sampling section) was also present in the green colour band: 

Figure 52: Green colour band for marsh test 2

The blue colour band remained consistent both with the other three test polygons 

used in this study, and the sampling for marsh. This suggests that the blue colour 

band pixel count values (below) can be added as an additional point of reference, 


from which any variance would indicate a problem with the polygon analysis and flag

it for further analysis (step 4 in the algorithm): 


The third known marsh polygon used to test the expected values was taken from 

an area east of the image, with a high degree of shade and trees (and as a 

consequence overhanging growth making aerial analysis difficult). This was used 

to see if a relatively small area of marsh (in comparison to the other areas used in 

this section of the study) could return values close to what an automatic

analysis would expect, even with some distortion from neighbouring features:



The values for the red colour band for this smaller section of known marsh were 

not consistent with the other test polygons, showing only a small rise from shade values

to those expected for marsh. This suggests, to a greater extent than the larger 

samples, that it is difficult to automatically classify marsh based on spectral values 

alone. The classification of this type of polygon must, as a result, take place 

towards the end of the search – so that all known values, including those

automatically registered in the algorithm (such as pasture and bog) are first 

removed from the pool of areas being studied. It should be noted that this part of 

the study is not intended to be an exhaustive search for a means of automatically 

identifying marsh, but to show that it is possible when an aerial image is divided 

into discrete small area polygons. Other factors, for example the percentage of

rough pasture and water polygons in the same region of the image (taken from the

vector data), could potentially be used to increase accuracy in identifying this type of

ground cover. 



The values for the green colour band showed a similar level of distortion and, as 

with the red colour band values from this area and unlike the other three samples

in this section, the count did not peak neatly in the expected range for marsh. The 

blue colour band, however, was consistent with all other test polygons and 

sampled imagery. This suggests that for small areas (below 2 ha) marsh is difficult

to detect automatically and could only be assigned the attribute when all other 

values in the region of interest being analyzed have been assigned. 



The fourth known marsh polygon used in the test was from an area to the

south of the sample image (below). As with all others it formed an irregular 

polygon bordered by other non-marsh areas. There were also some trees and other 

vegetation present in the area, which was something present in all the marsh 

samples. This suggests that the type of ground cover being described is often 

something that occurs in transitional areas, and the difficulties associated with 

getting clear spectral values (in terms of low levels of standard deviation) are a 

result of this variation. The samples, however, were consistent with the other two 

large marsh test polygons, and the pixel count for the red and green colour bands 

peaked within the expected range. 



The values for the red colour band for the fourth test polygon showed a gradual 

incline from the values for shade to the expected range for marsh: 


The green colour band values were also consistent with this trend and, once shade 

is removed from the analysis, these could be used to identify marsh. The values 


for the blue colour band, as with the previous three test polygons, remained

consistent with an expected range for marsh (see sampling section): 


The values returned for the test of marsh polygons were not as distinct as the 

previous three polygon tests (pasture, rough pasture and bog). There were sets of 

values (such as a consistent blue range and gradual increase in count from shade 

to the expected marsh range) which can be used to classify this type of area, but 

the classification needs to be confined to the latter (step 3) part of an automatic 

search algorithm to increase the probability of the spectral values matching 

correctly. 
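
The ordering described here, with distinctive types claimed first and marsh left until late in the search, can be sketched as a first-match-wins loop. This is an illustration only: the function `classify_in_order`, the statistics stored per polygon and every threshold in `RULES` are hypothetical placeholders, apart from the blue peak of 90 to 100 and the four hectare size, which are mentioned in the text.

```python
def classify_in_order(polygons, rules):
    """polygons: {id: stats dict}; rules: ordered list of (label, predicate).

    Each rule only sees polygons that no earlier rule has claimed, so an
    ambiguous type such as marsh is tested against a reduced pool.
    """
    labels = {}
    remaining = dict(polygons)
    for label, predicate in rules:
        matched = [pid for pid, stats in remaining.items() if predicate(stats)]
        for pid in matched:
            labels[pid] = label
            del remaining[pid]
    for pid in remaining:
        labels[pid] = "flag for further analysis"
    return labels


# Placeholder rules for this sketch; the standard deviation limit and the
# peak counts are invented and would come from the sampled image key.
RULES = [
    ("bog", lambda s: s["area_ha"] > 4 and s["peaks"] == 1 and s["std"] < 12),
    ("rough pasture", lambda s: s["peaks"] == 2),
    ("marsh", lambda s: s["peaks"] == 1 and 90 <= s["blue_peak"] <= 100),
]
```

Because the bog rule runs first, a large uniform polygon that would also satisfy the marsh rule is never offered to it, which is the effect the ordering is intended to have.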


4.4 Bog Test 

The sampling for areas of bog took larger polygons which, with forestry and water 

bodies already eliminated from the search algorithm (with the vector data) could 

automatically be assumed to have a high probability of belonging to an area of 

bog. The first sample was taken from an area of bog bounded by roads on all sides 

in the northeast of the imagery being used in this study (see the general 

introduction for details). This polygon was chosen for sampling as it could be 

considered a good example of this type of land cover: 

Figure 60: Vector data for bog test 1 

The data points which form the polylines bounding the polygon were exported in 

ASCII format. As with all clip files for this algorithm the points are recorded in 


anticlockwise order with a space separating eastings and northings, a newline

separating coordinate pairs and the start/end point appearing twice. The

projection is, as with all the data in this study, Irish Transverse Mercator. This 

coordinate set then formed the boundaries for a clipped area of the raster image: 

Figure 61: Aerial view for bog test 1 
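
The clip file layout described above (one "easting northing" pair per line, separated by a space, with the start point repeated at the end) is simple enough to read and write directly. The following is a minimal sketch; the function names `parse_clip_text` and `format_clip_text` are hypothetical and the two decimal places used on output are an assumption.

```python
def parse_clip_text(text):
    """Read 'easting northing' lines into a list of (easting, northing) tuples."""
    points = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected 'easting northing'")
        points.append((float(parts[0]), float(parts[1])))
    return points


def format_clip_text(points):
    """Write points back out, one space-separated pair per line."""
    return "".join(f"{e:.2f} {n:.2f}\n" for e, n in points)
```

For example, with made-up ITM values:

```python
ring = parse_clip_text("530000.00 725000.00\n530100.00 725000.00\n"
                       "530100.00 725080.00\n530000.00 725000.00\n")
is_closed = ring[0] == ring[-1]   # start/end point appears twice
```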

As with pasture, areas of bog produced a uniform pixel count with a low level of 

standard deviation (see sampling section for more detail on pixel values). One 

aspect of this type of ground cover, however, which sets it aside from other 

searches in this study is the absence of shade (due to the absence of trees and thick 

vegetation). The small area of shade present in the south of the above sample is 

not enough to distort the mean values and proved to be typical for the imagery. As

with all the samples in this section of the study, the histogram values for the 

polygons have been adjusted to remove the distortion caused by blank pixels 

adjacent to the irregular shape created when it is exported to GeoTiff format. This 

was achieved by reducing the pixel count to eliminate values over 5000 (400 in 

the case of pasture and rough pasture, where sample areas are much smaller). The 

red colour band histogram (below) returned a clear spike, consistent with expected 

bog spectral values (see sampling section) and unique among the polygons in the 

imagery being studied (note the slight peak in lower values, giving a clear

indication of the presence, and proportion, of shade in the sample – something

which facilitates automatic attribute detection): 


Figure 62: Red colour band for bog test 1 
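
One reading of the pixel count adjustment described above is that any histogram bin whose count exceeds a cap is treated as blank background and discounted. The sketch below follows that reading; it is an interpretation and not the Geomatica operation itself. The caps of 5000 and 400 are the figures given in the text, while the names `BACKGROUND_CAP`, `discount_background` and `peak_value` are hypothetical.

```python
# Caps quoted in the text: 5000 for bog, 400 for the smaller pasture polygons.
BACKGROUND_CAP = {"bog": 5000, "pasture": 400, "rough pasture": 400}


def discount_background(counts, cap):
    """Zero every bin whose count exceeds the cap.

    Caveat: a genuine bin that happens to exceed the cap is discarded as well,
    so the cap has to sit above the tallest real peak for the polygon size.
    """
    return [0 if c > cap else c for c in counts]


def peak_value(counts):
    """Greyscale value (bin index) with the highest count."""
    return max(range(len(counts)), key=counts.__getitem__)
```

Applied to a bog polygon, the blank background bin drops out and `peak_value` returns the greyscale value of the land cover spike instead of the background.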

The green colour band values for the same polygon also revealed a clear pattern to 

the values, peaking at the expected range for an area of bog: 

Figure 63: Green colour band for bog test 1 

The blue colour band also displayed a range of values which peaked for expected 

pixel values for an area of bog; the result of which is that the area can be classified 

as bog. This has implications for the determination of the growth/decline of this


type of area over time. The focus of this study is to determine the value of 

cropping raster imagery based on small area polygons so as to automatically 

detect values in the image, so the specifics of the type of bog or other factors 

which can be determined from further processing have not been developed, save 

to flag the necessary properties (i.e. the spectral values set in the sampling section

and identified in this sample, and the area property of being larger than four

hectares – which eliminates all other probable land cover types).
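
The area property can be derived directly from the clip co-ordinates. Since Irish Transverse Mercator co-ordinates are in metres, the shoelace formula gives the area in square metres, and dividing by 10,000 gives hectares. A minimal sketch follows; the function names are hypothetical and the ring is assumed to be closed, with the start point repeated at the end.

```python
def area_hectares(ring):
    """Area of a closed ring of (easting, northing) points in hectares.

    Uses the shoelace formula; co-ordinates are assumed to be in metres.
    """
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0 / 10_000.0


def exceeds_bog_size(ring, min_hectares=4.0):
    """True when the polygon is larger than the four hectare flag in the text."""
    return area_hectares(ring) > min_hectares
```

A 300 m by 200 m rectangle, for instance, covers 6 ha and would pass the size test, leaving the spectral values to confirm or reject the classification.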

Figure 64: Blue colour band for bog test 1 

The second sample area of bog was extracted from an area bounded by river, 

rough pasture, drain and road and produced an irregular polygon which was then 

extracted from the imagery using Mirone software and saved as a GeoTiff file. 

This file was examined for the spectral values across the colour bands. As is clear 

from the crop area from the image, the properties contain the uniform values (and 

associated low level of standard deviation) which are useful for automatic

analysis and classification. 


Figure 65: Vector data for bog test 2 


As with the first sample, little or no shade is present in the area being analyzed: 


The red colour band produced a clear indication of an area of bog, consistent with 

expected values but with a small distortion due to the vegetation in the west of the 

image (evidenced by the very slight grade of values below the expected range): 



The histogram for the green colour band also displayed a proportion of spectral 

values which was consistent with an area of bog. It should be noted that the values 

are very similar to the benchmark values identified in the sampling section, but are 

from a real-world polygon – the sampling section used sections of the land type

from within known areas to set the benchmark. This part of the study is looking at 

typical polygons extracted using vector data. It is therefore significant that the 

results are so similar: 


As with the first sample, all three colour bands matched the expected range, with 

the blue colour band peaking for a range consistent with bog. This makes the 

proportional method (comparing the values to their proportional range to known 

areas such as water bodies and roads) possible, and suggests that during the 

second step of the algorithm the areas flagged as bog could be included into a 

known set of values to test areas with high standard deviation and multiple peaks 

of values against:



The third sample for the bog land area type took a large section of bog to the north 

west of the imagery, which was exported from the vector software as a set of

co-ordinates in ASCII format. It is worth noting that this format has been used

throughout this study as it allows for easy manipulation of the files and would 

make them compatible with a GML routine, but the sets of co-ordinates can also 

be contained in an ESRI shapefile (.shp) and this could then be used to cut the

sections with the software used here. The downside to this, however, is that the 

sets cannot be (as easily) fed into reports or user created routines. 



The results across all three colour bands were similar to the first two test samples 

and matched the expected values identified in the sampling section of this study, 

which indicates that for general areas of raised bog this algorithm gives an accurate

indicator of total area. This may not be the case for more remote sections of this 

land type, found on mountain ranges. The algorithm is dependent on closed vector

polygons, which are available for almost all of the country, with the exception of 

high mountains. In those cases it would be necessary to re-set the image key to 

search for values of exposed rock and determine the proportional difference 

between the expected bog values and those for exposed rock in much larger 

mountainous polygons – this is something which the algorithm can be adapted for

but is not attempted here as the focus is on matching generic land use types as 

much as possible (the larger mountain areas would contain a mix of marsh, bog, 

exposed rock and vegetation). 



As with all the bog samples this test area returned values within the expected 

range for the land use type and the pixel count for the colour bands fell within 

expected range for red (above) and green (below): 



This trend was continued for the blue colour band and indicates that there is a high 

degree of probability of bog once a low standard deviation and the value ranges 

are met. The graph for the blue colour band for the third test sample is below:


The final test sample for an area of bog was taken from the west of the image and 

contained some vegetation and shade which may have distorted the results. As

with all the samples of this type of land cover, the percentage of shade is very low,

allowing it to be a distinguishing feature for inclusion in an automatic search: 



The red colour band results showed a slight grade from values of shade into the 

expected range of the land type, which was a result of the small area to the south 

and west of the polygon, but the values still fell within the range expected: 



The green colour band pixel count also produced similar results, with the 

indications being that the polygon can (coupled with the relatively small level of 

standard deviation, large area size and low values of shade) be automatically 

recorded as an area of bog: 


As with all the bog samples and test areas in this study the blue colour band values 

remained within a small range with a low level of standard deviation (see 

sampling section). 



In general terms bog is a very useful land use type for inclusion into an automatic 

aerial image survey such as this one, as it can be quickly identified and logged 

early in a looping cycle through the spectral values of polygons. As mentioned 

above, this is dependent on controlled polylines bounding the area, something

which is available for most of the country but results may be distorted on remote 

high ground (identified by 1:5000 mapping and the presence of cropped rock). 

The purpose of this test was to determine the value of using the vector data to clip 

aerial imagery into a mosaic of polygons for spectral analysis. As mentioned in 

the sampling section of the study it is possible to include pattern analysis testing to 

determine the level of cutting taking place (drains are included in the vector data 

and could also be factored into such a search).


4.5 Conclusion 

Image segmentation is one of the most important parts of automatic analysis of 

aerial imagery (Zhou & Wang, 2007). A set of reference data is necessary to know 

where to divide image sections. This can be obtained from a survey input by the 

user or from spatial data specific to the area being studied (peat, forestry etc.). 

Ordnance Survey vector data provides a comprehensive set of reference points and

allows an aerial image to be cropped into small discrete area polygons. These 

polygons can also benefit from the previously captured coding which identifies 

many of them as a specific land type. The result of adding this data to an 

automatic search for specific spectral values is that the user can gain context from 

known neighbouring polygons and calibrate the specific search accordingly. This 

in turn means that the process of image analysis can be simplified by applying a 

generic technique for identifying polygons and refining it to search for a given 

value. 

This study looked at the value of cropping aerial imagery into a mosaic of known 

and unknown polygons. It attempts to automatically derive probable types for the 

unknown areas based on the known data and a sampled image key. The sampling 

and testing undertaken during the study indicated that it is possible to derive 

useful value from a spectral analysis based on a pixel count alone. This was 

because the vector data introduced into the process reduced the number of 

possible values that can be attributed to a pixel set –for example, as the extent of 

forestry is known, similar values returned from an unknown polygon must 

represent marsh or rough pasture while further analysis of the shape of the range 

can distinguish between the two.

The process is possible using open source software but could also be coded into a 

standalone application, e.g. using the GDAL library and a function to crop 

irregular polygons. The potential for automation will be supported by the release 

of the vector data in GML format, from which a large ASCII file of coordinate 

sets could be fed into the process. By removing the requirement for a user to 

control areas of the image through the use of the vector data, and by presenting the 


user with a pre-calibrated image key of expected spectral values, it is possible to

automatically classify aerial image sections. It should be noted that this study was 

confined to a specific type of vector data (Ordnance Survey) and the landscape of

small polygons with a single land use may not apply to all landscapes. However, 

the study proves that large scale vector data can be used to simplify aerial image 

processing. 
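
To indicate what a function to crop irregular polygons involves, the sketch below keeps only those pixels whose centres fall inside a closed ring, using an even-odd ray casting test. It is written in plain Python and does not use GDAL; the function names, the north-up grid layout and the `origin_e`, `origin_n` and `pixel_size` parameters are assumptions made for this illustration.

```python
def point_in_polygon(x, y, ring):
    """Even-odd rule: count how many ring edges a ray cast to the east crosses."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def crop_pixels(grid, origin_e, origin_n, pixel_size, ring):
    """Return the values of pixels whose centres lie inside the ring.

    grid: list of rows running north to south; (origin_e, origin_n) is the
    top-left corner of the grid in the same co-ordinate system as the ring.
    """
    kept = []
    for row, line in enumerate(grid):
        northing = origin_n - (row + 0.5) * pixel_size
        for col, value in enumerate(line):
            easting = origin_e + (col + 0.5) * pixel_size
            if point_in_polygon(easting, northing, ring):
                kept.append(value)
    return kept
```

Because only the pixels inside the polygon are returned, no blank background is produced and the histogram needs no later adjustment. Testing every pixel in this way would be slow on full imagery, so a library routine would be the practical choice for a standalone application.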


5 Literature Review 

The goal of this thesis is one that is in line with most work completed using 

remote sensing processing in that it is looking for traits in aerial imagery which 

can be used to derive useful information about the surface of the earth. One of the 

early studies of aerial image interpretation (Kittler's ‘Image processing for remote

sensing’ paper) described the process as “the interpretation of image segments that 

exhibit similar statistical properties” (J. Kittler, 1983). One addition that could be

made to that definition is that, for the majority of studies in this field, the

interpretation should be automatic, or as close to automatic as to make 

interpretation of a large volume of data viable. 

There is a vast body of work available which documents various methods for 

interpreting aerial and satellite imagery. In general terms there is always a focus 

for each study, e.g. identifying coffee plantations in Costa Rica (Corado-Sanches 

and Sader, 2005) and this influences the methodology. One result of this is that 

there is a large variety of methods employed. This literature review considers a

representative sample of these in terms of their focus, in other words treating work 

that uses patterns or shapes as one category, spectral deviation for agricultural/ 

forestry purposes as another and urban analysis as a further category. In terms of 

previous studies, the closest to what this thesis is attempting is the

body of work that has been completed on what has been termed ISAs 

(impermeable surface areas, or, more usefully, hard ground). The focus of these 

works is to identify the percentage of hard ground within urban areas which can 

then be used in modeling flood events. During the early part of the study I made 

use of the SWAP technique recommended by T. Knudsen in his 2005 analysis

of color in aerial imagery to identify grey areas in the test imagery (Figure 5: 

Water Area Image Modification). In the context of the data being analyzed by this 

thesis (Irish peri-urban land parcels) these grey areas within the image can be 

made to correspond to hard ground. 

Before considering the body of work underlying this thesis it is probably useful to 

answer two questions which the reader might ask: is this not just


photogrammetry? And why not use an established algorithm such as the

vegetation-impervious surface-soil sub-pixel analysis techniques published in the

2000 issue of Remote Sensing (Phinn et al)?

In answer to these questions this review will not be considering a history of 

photogrammetry other than a general outline of established (traditional) 

processing techniques. It will also not be describing some of the segmentation and

target area identification pre-processing methods used in the various studies. This 

work is often a major component of this type of analysis. The answer to the first 

question is that in general terms this study is photogrammetry, but it takes as a

starting point controlled photography and captured polygon data, so considering the

body of work underlying these techniques falls outside the scope of what this

thesis is attempting.

The answer to the second question is that this study differs from previous 

techniques in that it pre-supposes a large amount of information from the data

(features of the built environment, feature coding, water parcels, forestry parcels, 

roads by category, footpaths and buildings by category) so feature capture is not 

part of the study. A possible addition to the study would be a consideration of 

feature capture using pattern analysis. In particular the identification of

outbuildings adjoining existing dwellings would be useful. However, this is outside

the scope of this study. It can be assumed from the outset that most of the major

physical features present in the built environment are present in the data in vector 

format. This narrows the application of the technique to areas that are covered by 

large scale mapping but results in an automatic method for adding data to this 

mapping. One possible application is calculating the percentage of hard ground in 

a region of interest. 

In general terms the study can be seen as specific to urban areas which have been 

digitally mapped at a large scale (1:2500 or 1:1000 scales). Arbitrarily segmenting 

an image is a technique that has been used in previous studies (Ketting & 

Landgrebe, 1976) but this study differs in that the segments are specific small area 

polygons corresponding to property divisions and physical features. Results 


shown give a more detailed picture of these areas and in this way eliminate some 

of the problems encountered in previous studies. 

There are two broad areas within the body of work on processing of remotely 

captured spatial data which will be considered in the remainder of this review; 

these can be considered spectral analysis methods, and the methods associated 

with identifying polygons and the deviation of data from polygon patterns. 

Before discussing these it might be useful to briefly run through the data capture 

process as it stands in a traditional mapping environment (such as in OSI, the 

former OSNI and OSGB). This process has been neatly explained by Bingcai 

Zhang and Neal Olander in their paper to the 2000 ESRI user conference. They set 

out the process as the captured imagery being manipulated by a user in a software 

package (e.g. SOCET SET) to produce a shape file of vector data from the 

original hardcopy imagery. In this study the process of creating the data has 

already been completed (along with the prior image control as described by Zhang 

and Olander), so the thesis can be considered to be a method of re-visiting the 

imagery to add value to the captured vectors/features. It should be noted that the 

process developed by this thesis is not dependent on expensive packages (such as 

SOCET SET) and could be applied to lower budget environmental monitoring 

systems. Using low cost photogrammetric packages such as ShapeCapture there 

would be a trade-off in terms of consistency and accuracy (Aguilar et al, 2005). 

Josef Kittler’s 1983 Philosophical Times paper, as cited above, provides a useful 

introduction to the subject matter of this thesis. It was written as an attempt to 

summarize the various attempts that had been made towards automating the 

analysis of aerial imagery at that time. The text is useful not just as a background 

to the historical development of the field but also as an outline of the processes 

involved (with respect to multispectral image segmentation). The author looked at 

a wide number of previous studies (27 are cited) and summarized the work into a 

series of categories. The technology available to process imagery has evolved 

massively over the intervening quarter century but the basic techniques (in terms 

of pixel analysis) and motivations (in terms of data required) remain similar today. 

I have chosen this paper for the review as in some ways it sets the context for the 

work being undertaken, that is the “interpretation of image segments that exhibit 

similar statistical properties” (Kittler, P.323). Kittler divides image processing into 

six sections; (1) the sensor and (2) data collection, (4) image preprocessing, (5) 

segmentation and classification and (6) image interpretation. The work undertaken 

in this thesis relates to part five of this system, although it will differ from the 

work considered by Kittler’s paper in that some image interpretation will have 

taken place beforehand. 

Kittler has identified the first part (of the segmentation and classification step 

described above) as analyzing remotely sensed data in order to “identify 

homogeneous segments in the image” (Kittler, P.324). He introduced a method for 

pixel-by-pixel classification to achieve this. For this method he suggests 

identifying and classifying all the pixels on a pixel by pixel basis and then linking 

identical pixels to form connected segments. In many ways this is probably the 

holy grail of image interpretation as, if perfected, a machine could automatically 

identify change and update maps. Kittler also proposes that segments of pixels 

exhibiting similar properties could be used for this purpose. This thesis does not 

propose an identical method; instead it attempts to identify the proportion of 

pixels corresponding to hard ground in a small area polygon, subtract buildings, 

roads and water polygons and attach a value for impermeable surface to the area 

data. Kittler suggests that a Bayesian probability formula can be used to determine 

the class of a segment. He suggests using what he terms “ground truth data” 

(Kittler, P.325) for the probability function to assign pixels to a class and 

determine the composition of a segment. He cites the class homogeneity of land 

surface covers as the means to initially segment and classify the pixels. This work 

can be made difficult by weather conditions and instrumental scanning errors 

(Note: Kittler’s paper is from a time when GPS controlled measurements were not 

readily available and such errors are less of a factor today, though small 

inconsistencies can occur, particularly at high altitude). 
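
The proportion calculation outlined above (count the hard ground pixels in a small area polygon after subtracting buildings, roads and water) can be illustrated with a minimal sketch. It is not the thesis algorithm itself: the single-character pixel codes, the toy raster and the function name hard_ground_fraction are assumptions made purely for illustration, and the pixels are taken to have been classified already.

```python
# Hypothetical per-pixel codes; these values are assumptions for illustration.
HARD, SOFT, BUILDING, ROAD, WATER, OUTSIDE = "H", "S", "B", "R", "W", "."

# Pixels covered by building, road or water polygons are subtracted first.
EXCLUDED = {BUILDING, ROAD, WATER}


def hard_ground_fraction(polygon_pixels):
    """Return the share of hard-ground pixels among the pixels that remain
    in a small area polygon once excluded features are removed."""
    remaining = [p for row in polygon_pixels for p in row
                 if p != OUTSIDE and p not in EXCLUDED]
    if not remaining:
        return 0.0
    return sum(1 for p in remaining if p == HARD) / len(remaining)


# A toy raster clipped to one property polygon ("." lies outside the polygon).
parcel = [
    "..BBH",
    ".SBBH",
    "SSHHR",
    "SSWWR",
]
fraction = hard_ground_fraction(parcel)
print(f"{fraction:.0%} of the remaining parcel is hard ground")
```

In this toy parcel nine pixels remain after the subtraction and four of them are hard ground, so the value attached to the polygon would be four ninths.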

Kittler further suggests a method for partitioning the image into segments (initially 

into cells of 2*2 pixels as suggested by Ketting & Landgrebe, 1976). In this way 

he estimates that the larger size of land cover would allow an analysis to identify 

neighboring pixels with similar properties. He also suggests another method for 

identifying segments of the image which he terms “two dimensional spatial 

dependencies” (Kittler, P.330). This is the use of the four neighbours of a given 

pixel to identify the probability of them being part of the same group. 
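
The use of the four neighbours to link pixels into segments can be sketched as follows. This is a deterministic simplification and an assumption on my part: Kittler's scheme assigns probabilities, whereas the sketch simply joins neighbouring pixels that already carry the same class label. The class letters and the function name label_segments are hypothetical.

```python
from collections import deque


def label_segments(classes):
    """Group identically classified pixels into 4-connected segments.

    `classes` is a list of equal-length rows of class labels. Returns a grid
    of segment ids and the number of segments found.
    """
    rows, cols = len(classes), len(classes[0])
    segment = [[-1] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if segment[r][c] != -1:
                continue
            # Breadth-first flood fill starting from an unlabelled pixel.
            segment[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # The four neighbours: up, down, left, right.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and segment[ny][nx] == -1
                            and classes[ny][nx] == classes[y][x]):
                        segment[ny][nx] = count
                        queue.append((ny, nx))
            count += 1
    return segment, count


# Toy classified image: G = grass, H = hard ground, W = water (assumed codes).
classified = [
    "GGHH",
    "GHHH",
    "GGWW",
]
segments, n_segments = label_segments(classified)
```

Diagonal contact is deliberately ignored, which is what restricting the test to four neighbours (rather than eight) implies.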

In terms of this thesis Kittler’s paper suggests that an analysis of aerial imagery 

could potentially reveal large amounts of data about selected features (particularly 

if the segmentation has been completed already by vector mapping). Kittler’s 

paper does suggest that it is possible to identify homogenous areas on a pixel by 

pixel basis, which is the focus of this thesis. 

5.1 Spectral and image considerations for the thesis 

This body of work forms the basis for what will be the main argument of this 

thesis, that it is possible to automatically capture spatial data relating to 

impervious ground in Irish towns, using controlled photography and matching 

vector data. This requires processing which makes use of the spectral data 

contained in aerial imagery of the sample data, which in turn presents a number of 

separate problems. One of these is bidirectional reflectance, and while this is not 

expected to be a major consideration while developing the algorithm for the thesis 

it nevertheless warrants consideration. One method of calculating for this is to 

adjust the imagery based on either ground sampled data or imagery from a higher 

(possibly satellite) vantage, and is something that was considered by Sakari 

Tuominen and Anssi Pekkarinen in their 2004 study of forestry in southern 

Finland. The authors consider a method of improving the value of data being 

retrieved from aerial photography (in conjunction with satellite data) by reducing 

the presence of bidirectional reflectance. This is a problem with the way light is 

hitting the surface of the earth causing the spectral values of image pixels to 

depend on their location in the image. 

One approach would be to focus any study on the centre of the image, where 

bidirectional reflectance would not be as big an issue. However, the study was 

attempting to find a more effective method of correcting this using overlaying 

satellite images and a correcting algorithm. The reason they chose satellite 

imagery was that it is less affected by bidirectional reflectance and this 

benchmark would allow the authors to conduct local adjustment for the pixel 

values. The study covered 4500 hectares of boreal forest located in the 

municipality of Kuru in the south of Finland. 

The core of the study is a local radiometric correction method for reducing the 

effect of bidirectional reflectance. This problem was not a major issue in the thesis 

but the methods used by the authors (finding a larger scale benchmark image to 

reference study areas against) were considered. At the heart of the Finnish study the 

problem was of similar objects possessing different spectral characteristics in 

different parts of the image. This was a problem which was less relevant to the 

focus of this study (the authors are focused on forestry data). The authors 

conclude, not surprisingly, that the value of remote sensing is dependent on 

what is visible and what can be registered by the airborne sensor. 

As mentioned in the introduction to this chapter, the body of work which 

examines automatic capture of hard ground within urban areas is of particular 

relevance to this thesis. The 2007 study by Yuyu Zhou and Yu Wang of urban 

examples in Rhode Island is a good illustration of the type of factors that need to 

be considered in this type of survey. The study, which used true-colour digital 

orthophotographic data with a 1m spatial resolution (forming a controlled dataset 

in .tiff format with red, green and blue spectral bands present) segmented the 

imagery according to urban districts. The authors note that “successful image 

segmentation is the most important prerequisite in object-oriented classification” 

(P.644). It is hoped that by using previously captured and verified vector data this 

thesis will have met this prerequisite. 

The algorithm which was employed for this survey was broken down into four 

parts; segmentation, compensation for shadow effect, analysis of variance 

classification and post classification of the data. There is a large body of work that 

has been completed using automatic interpretation of aerial imagery; the focus of 

this work is usually towards a specific purpose, such as the 2007 analysis of coffee 

crops in Costa Rica outlined by S. Cordero-Sancho and S. Adler. The study is a 

useful example of some of the problems that can be encountered when attempting 

automatic image analysis. In the study the authors consider the problem of 

identifying coffee plantations from remotely sensed data of tropical forestry. The 

main focus of the study was to identify a means of separating the areas of coffee 

from similar data (in terms of wavelength and spectral values) representing 

tropical forest. This problem is made even more difficult due to the fact that the 

coffee plantations are often set under forestry (due to the shelter the cover 

provides for the crop). In addition the terrain is often mountainous so the authors 

had the additional problems of variety in terms of elevation, associated mist/ cloud 

cover and shade to overcome. While these problems were not a 

large factor in the low flown aerial photography of Irish suburban landscape used 

in this thesis, the methods the authors employ to deal with cloud and haze are 

relevant. It was advantageous to the thesis to be able to reduce the impact of any 

of the areas of shading that existed. 

The authors took rectified Landsat imagery of a large tract of land in central Costa 

Rica, the Central Valley surrounding San Jose. This imagery had been captured 

during the rainy season. They broke the study down into a series of steps, starting 

with classifying three different waveband combinations in the imagery. They then 

developed what they termed a “Coffee Environmental Stratification Model” 

(Cordero-Sancho & Ader, P.1581) before comparing this with supervised results 

(from known data). They then set out to identify which waveband combination 

best matched coffee crops. 

The identification of a control section of water which the authors used to reduce 

haze in their imagery helped with the development of the image key for this thesis. 

The method the authors used in the study was to find an area of deep water to 

identify a minimum reflectance value and subtract this from each of the non 

thermal bands. The next step the authors took was to remove clouds from the 

imagery; they did this by creating a binary mask using a classification of arbitrary 

clusters they had developed. This allowed them to recode areas which it identified 

as being contaminated by cloud or shadow. When this was applied the clusters 

were recoded to a zero value; they also digitized “isolated” (Cordero-Sancho & 

Ader, P.1582) clusters on a case by case basis. It is very beneficial to smooth areas 

of shade within a polygon. This might be done by estimating a percentage of 

shade that should be present in the polygon (based on time of day and vector code 

making up the boundary, e.g. fence/ forestry/ building/ water). By identifying a 

value that this shade should fall under it is possible to derive a corresponding 

relationship in the histogram and adjust the results of the study accordingly. 
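
The haze reduction and masking steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the band values, the deep water locations, the mask and the function names subtract_haze and apply_mask are all hypothetical.

```python
def subtract_haze(band, water_pixels):
    """Subtract the minimum value found over deep water from every pixel."""
    offset = min(band[r][c] for r, c in water_pixels)
    # Clamp at zero so that no pixel becomes negative after the subtraction.
    return [[max(value - offset, 0) for value in row] for row in band]


def apply_mask(band, mask):
    """Recode pixels flagged as cloud or shadow (mask value 1) to zero."""
    return [[0 if flagged else value for value, flagged in zip(row, mask_row)]
            for row, mask_row in zip(band, mask)]


# One toy band of an image (assumed digital numbers).
band = [
    [52, 60, 200],
    [48, 75, 190],
    [47, 80, 90],
]
deep_water = [(1, 0), (2, 0)]   # assumed (row, column) locations of deep water
cloud_mask = [                  # assumed binary mask, 1 = cloud or shadow
    [0, 0, 1],
    [0, 0, 1],
    [0, 0, 0],
]
corrected = apply_mask(subtract_haze(band, deep_water), cloud_mask)
```

In a multi-band image the same subtraction would be repeated for each non thermal band, each with its own minimum over the water sample.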

The final result of the authors’ work in Costa Rica was only “moderately 

successful” (Cordero-Sancho & Ader, P.1589). This would seem to have been 

largely due to the altitude at which the imagery they used was captured; by their 

own admission the results would probably be better had the imagery been low 

flown or of better resolution. The methods the authors employed in the study are 

useful examples of how inconsistencies in results obtained during the process of 

completing this thesis might be countered; in particular, potential methods for 

reducing the effect shade might have on altering the results could be applied. 

In this thesis, as with most automatic aerial imagery analysis, the classification of 

target areas in the photo (usually according to spectral values) is a vital part of the 

process. One solution is to develop a key to differentiate between features. The 

level of detail that can be obtained can be quite precise, but is dependent on the 

resolution of the imagery and the complexity in the patterns of distribution of the 

target. This is illustrated by Megan Lewis’s 1998 study of vegetation communities 

in an area of western New South Wales. In this study she attempted to develop a 

key to differentiate between vegetation types. The problem she was attempting to 

counter was that of identifying particular species. She noted that existing aerial 

analysis could detect the presence of vegetation and in an attempt to improve this 

process she divided these species into colour bands. She took sample plots of 

250sqm corresponding to 8*8 blocks of pixels in a relatively homogenous area 

and calibrated the relationship between field verified data and fifty of these blocks 

(using them as training areas for the study). The study used 12 colour bands which 

were allocated into nine classes and identified a link between these and vegetation 

classes. In the conclusion to the study the author noted that it was possible to 

portray sub-polygon variation using pixel-based imagery. 

In order to complete this study it was necessary to make the best use of the 

available imagery. This imagery can benefit from preprocessing in order to 

highlight the areas being captured. One method for achieving this might be to 

apply an algorithm to colour the data so that the target areas are easily captured. 

This is something which was identified by Thomas Knudsen in his 2005 study of 

pseudo natural colour aerial imagery for urban and suburban mapping (and in 

previous studies by the same author). In his study he suggests an algorithm for 

automatic urban and suburban aerial image interpretation. The paper uses test data 

from (pseudo) natural color images used in traditional photogrammetry (as 

opposed to airborne four channel imagers). His aim is to discriminate between 

vegetation and human made materials, which was also one of the aims of this 

thesis. The author cites the relative importance of separating vegetation (which he 

considers to be void of mapping objects) and human-made materials in respect to 

automated photogrammetric mapping. It is worth noting at this point that imagery 

captured in the near-infrared band is generally indicative of vegetation and 

Knudsen’s work is an attempt to identify this band using only aerial photography 

captured using red, green, blue three channel instruments. His work is very 

relevant to this thesis as over the course of his study he identifies a method of 

obtaining “excellent” (Knudsen, P.2691) reproduction of grey surfaces (which in 

an urban area correspond to paving and exposed rock). 

The author takes a look at three algorithms in terms of their effectiveness in 

discriminating between areas in scanned aerial photographs. The first is a pseudo 

natural color algorithm developed by the author in a previous study (Knudsen 

2002) where he managed to create a blue channel based on green, red and near 

infrared values and left the green and red values as captured. This allowed for 

good reproduction of red surfaces (which corresponded to roof surfaces in the 

Danish sample data) but suffers slightly from haze effect. 

The second algorithm the author considers is one which creates a blue channel 

between green and near infrared and a green channel from similar values and 

leaves the red as captured. This, similar to the first algorithm, gave good 

reproduction of red surfaces and of vegetation, but failed in reproducing clear grey 

surfaces. The third algorithm the author considers involved swapping the green 

data for blue, the near infrared for green and leaving the red as was. This allowed 

him to reproduce grey surfaces accurately (making them stand out in the 

photography) but was less useful for vegetation, leaving an “artificial looking 

hue” (Knudsen, P.2691). 
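
The third algorithm amounts to a channel reassignment, and a minimal sketch makes the effect on grey surfaces and vegetation easier to see. The reading used here (displayed blue taken from the green data, displayed green from the near infrared, red unchanged) is my interpretation of the description above, and the sample values and function names are assumptions.

```python
def third_algorithm(pixel):
    """Map a (green, red, near_infrared) sample to a display (R, G, B) triple.

    Green data is shown as blue, near infrared as green, red is unchanged.
    """
    green, red, nir = pixel
    return (red, nir, green)


def remap(image):
    """Apply the channel reassignment to every pixel of an image."""
    return [[third_algorithm(p) for p in row] for row in image]


# Hypothetical 8-bit samples given as (green, red, near_infrared).
vegetation = (60, 40, 200)    # strong near infrared response
paving = (120, 120, 120)      # equal response in all three bands
print(remap([[vegetation, paving]]))
```

A surface with equal values in all three input bands stays neutral grey after the swap, while the strong near infrared response of vegetation is pushed entirely into the displayed green channel, which is consistent with the artificial looking hue noted above.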

The final part of the paper sets out a method of modifying the first algorithm to 

improve its value in creating data for interpretation. The steps are to restore 

black/grey/white, vegetation-covered and red/yellow-reddish areas lost in 

preprocessing, to re-whiten very bright objects and to amplify the pixels (enhance 

the colour saturation). The author provides reference to a more detailed technical 

implementation (using information from previous papers he published) but these 

are less relevant to this thesis as the result would not be suitable for identifying 

hard ground. 

One area where there is considerable information to be gained is in the area of 

forestry, particularly in capturing the spread of disease or invasive species in a 

plantation. One such study was undertaken by M. Martin, S. Newman, J. Aber and 

R. Congalton in 1998. I have included it here as I think it is a good example of 

what appears to be a standard remotely sensed image analysis. In this study the 

authors set out to obtain remote data relating to tree species in an area called 

Prospect Hill in central Massachusetts. Their target data was species identified by 

11 forest cover types. To do this they used a maximum likelihood algorithm 

assigning all pixels in the aerial image to one of the 11 categories they were 

searching. The survey was validated using field data (taken from a database of 

species type). They note that at that time (late 1990s) spectral data had already 

been used to identify categories of forest cover. These prior surveys had been 

successful in discriminating between coniferous and deciduous cover (the authors 

cite the examples of Nelson et al., 1985, Shen et al., 1985 and Landthrop et al., 

1994). The primary goal was identification of species composition from the forest 

canopy and in this the authors had reasonable success: a random selection of 

pixels yielded an overall classification accuracy of 75% (Martin et al 1998). The 

study used photographic tiles of 10*10km with a high spectral resolution. The 

authors suggest further improvements in accuracy could be made by identifying 

the (deciduous) species with both their leaves on and off which would allow for a 

foliar biomass calculation to be made. 
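
A maximum likelihood classification followed by an accuracy check against field data can be sketched as below. This is a toy, single band version under stated assumptions: each class is modelled with a normal distribution, and the two class names, the training values and the check pixels are invented for illustration rather than taken from Martin et al.

```python
import math


def fit(training):
    """Estimate a mean and variance per class from labelled sample values."""
    stats = {}
    for label, values in training.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        stats[label] = (mean, var)
    return stats


def log_likelihood(value, mean, var):
    """Log of the normal density for a single band value."""
    return -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)


def classify(value, stats):
    """Assign the class under which the value is most likely."""
    return max(stats, key=lambda label: log_likelihood(value, *stats[label]))


def overall_accuracy(predicted, reference):
    """Share of pixels whose predicted class matches the field label."""
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


# Invented training samples for two cover types (assumed band values).
training = {
    "conifer": [30, 34, 32, 36],
    "deciduous": [60, 66, 58, 64],
}
stats = fit(training)

# Invented check pixels with the labels recorded in the field.
check_values = [31, 62, 50, 40]
field_labels = ["conifer", "deciduous", "deciduous", "deciduous"]
predicted = [classify(v, stats) for v in check_values]
accuracy = overall_accuracy(predicted, field_labels)
```

A real survey would use several bands and a covariance matrix per class; the single band case is kept here only so that the likelihood comparison is visible.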

In 2002 S. Phinn, M. Stanford, P. Scarth, A. Murray and P. Shyy attempted to 

apply a similar technique to that used by Megan Lewis in her 1998 study of 

vegetation communities, only with reference to vegetation and impervious hard 

ground in urban areas. This study has similar goals to this thesis, but was 

undertaken in an Australian urban context, and does not make use of existing 

vector mapping and polygons. In the study by Phinn et al the authors conducted a 

survey of the area surrounding the city of Brisbane, in an attempt to establish an 

image processing method which would yield information about the urban 

environment. In particular they focused on vegetation-impervious surface-soil, or 

hard ground. They identified 60 spectral zones which they aggregated based on 

their location; they were able to identify distinctive zones of hard ground based on 

their per-pixel classification. The data used was from the Landsat 5 Thematic 

Mapper and 1:5000 scale aerial photography. Their aims were to delineate land 

cover and use types and to identify areas of impervious and pervious surfaces. One of 

the most difficult aspects they encountered was classifying non-vegetation areas 

into exposed soil and developed surfaces. They extracted the information along 

transects radiating from Brisbane’s city centre. The study noted that water bodies 

and vegetation were “separate in all spectral bands” (Phinn et al, 2002). The 

maximum separation along the cleared hard ground was along bands 3, 4 and 7. 

The authors concluded that the increased resolution enabled more detailed 

assessment of the surface. 

One recurring theme within these studies of spectral analysis of aerial 

photography (and satellite imagery) was that the success of the study is dependent 

on three main factors: the resolution of the photography, the correct segmentation 

of the target areas and an accurate knowledge of the colour bands which apply to 

the study. To a lesser extent it is also important to introduce methods to correct for 

haze, cloud cover and shade. These last three considerations are not the focus of 

this study; the fact that they warrant the complete focus of previously published 

papers indicates that it would not be possible to completely eliminate their 

presence. By applying some of the aspects of masking (Cordero-Sancho & Adler) 

and adjusting for shade (Tuominen & Pekkarinen) it might be possible to reduce 

the influence of these in the outcome to within an acceptable error margin for high 

flown and satellite imagery. This was not necessary in this thesis due to the clarity 

of the photography. 

5.2 Vector and polygon based studies of aerial photography 

I have labeled this body of knowledge of aerial (and satellite) image processing as 

vector and polygon based as the trend that unites the studies is the fact that the 

authors sought a pattern or shape based method for extracting the information. As 

with spectral (and hybrid spectral and spatial) methods, the underlying cause for 

the studies can vary, from understanding Alaskan watercourses (van der Werff & 

van der Meer, 2008) to examining the built environment in Moscow (Dudarev, 

2009). I did not make use of a particular algorithm or technique from these studies, 

but have considered them in this review for their potential in offering a method for 

sub dividing small area polygons. It should be noted that there appears to be a 

point where pattern analysis is less beneficial, such as the 500 pixel minimum 

suggested by H. van der Werff and F. van der Meer. 

This thesis makes use of building polygons captured from the Irish peri-urban 

landscape. These served both as indicators as to the type of land use in the small 

area polygon surrounding them (and possibly a spectral control in terms of the 

roof tile value). In terms of pattern analysis any automated identification of 

modifications or new buildings should recognize the polygon outline as a building. 

This is a particular problem in peri-urban areas; in a rural context newly built 

slatted sheds etc. will conform to a standard outline but the shapes are more varied 

in urban areas. This is particularly so in the peri-urban Irish landscape where one 

off housing and a fashion for ugly looking (in terms of aerial analysis) extensions 

and sections of building jutting from a main structure mean that establishing a 

template pattern for dwellings would be difficult. Outside of considering other 

data (such as presence of tarred road etc.) an automated study relying on pattern 

identification would need a complicated signature algorithm. This was the basis 

for Roman Duradev’s 2009 study of building polygon signature point definition. In 

this he was considering buildings in the context of the city of Moscow, but intended 

the algorithm he created for use in a wider variety of data sets. The study tied in 

with the author’s work for a software development company (Enterra) which is 

involved in developing GIS software and the algorithm was also an attempt to find 

a solution to identifying building signatures within the software. The paper 

develops a workaround for identifying polygons which are consistent with 

buildings on a map. He describes this as an ordered point set. This could provide 

useful background information should spatial deviation pattern recognition ever 

become an option. It is not within the scope of the thesis to modify the algorithm 

but nevertheless Duradev’s study provides a possible alternative to the spectral 

deviation method of identifying additional data. 

The author notes that building signatures in many cases will not look “neat and 

beautiful” (Duradev, P.109). By this he is referring to the fact that the polygon 

will not conform to a standard shape which would be easily identified. It should be 

noted that the author acknowledges that a similar algorithm exists for the product 

of another software provider, ESRI (with Arc View software) and that the test of 

the algorithm was carried out on uncoordinated data; implying that any application 

of the algorithm in this thesis would involve a high level of modification for 

something that could be obtained from a desktop application. It is however, useful 

to consider the methodology for breaking down the problem (the author outlined 

an implementation algorithm before considering the process steps required). 

Duradev broke down the necessary implementation into five steps, starting with a 

search for a convex polygon inscribed within the shape being identified. He then 

suggested that if the polygon shape returned from this was bigger than the 

maximum (original shape) then the new polygon should be considered as the 

maximum, otherwise another convex polygon within the shape should be 

identified. This step is repeated until all the polygons have been searched at which 

point the centre of the polygon is searched and the result returned. 

Duradev then stated the mathematical steps that would be required to implement 

the steps he outlined, namely: take a start point (on the polygon) for the search; 

take the neighbor vertex in a set rotation; if this point matches the starting point, 

return the resulting polygon; if the point addition results in a positive vertex then it 

is added to the polygon, otherwise the neighbor vertex is selected again. This 

algorithm is accompanied by sample C code which can be used to test it. I believe 

it could be used if a method of spectral analysis of aerial photography could be 

developed to return sharp enough edge detail to identify the component points of 

these polygons. This seems unlikely in relation to the imagery (and processing 

techniques) that are currently available and the algorithm is probably of more use 

in a situation where the vector detail had already been manually captured (in 

which case the appropriate building code should also be present). 
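
The test for a “positive vertex” is not spelled out above. One common reading, assumed here, is the sign of the cross product of the two edges meeting at a vertex when the outline is walked in a set (counter-clockwise) rotation. The sketch below only labels vertices in this way; it is not Duradev's algorithm or his sample C code, and the L-shaped outline and function names are hypothetical.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def classify_vertices(polygon):
    """Label each vertex of a counter-clockwise polygon as convex or reflex."""
    n = len(polygon)
    labels = []
    for i in range(n):
        # Previous vertex, current vertex and next vertex in rotation order.
        prev_pt, here, next_pt = polygon[i - 1], polygon[i], polygon[(i + 1) % n]
        labels.append("convex" if cross(prev_pt, here, next_pt) > 0 else "reflex")
    return labels


# An L-shaped building outline (a main structure with a wing removed from one
# corner), listed counter-clockwise. Coordinates are invented.
outline = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
labels = classify_vertices(outline)
print(list(zip(outline, labels)))
```

Only the inner corner at (2, 2) is reported as reflex, which is the kind of vertex a search for a convex polygon within the shape would have to skip.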

Much of the other work involved in manipulation of polygons could be said to fall 

under the banner of graphic editing of GIS data (as could also be the case with 

Duradev’s algorithm). This type of work (physically manipulating and extracting 

specific polygons in vector format) would be of particular significance if pattern 

analysis was being used in this study. If particular patterns could be identified then 

algorithms for clipping and determining intersections between polygons, such as 

the one developed by Kui Liua et al in 2007 would be a central part of the process. 

This would then mean the study would take polygon edges as the basis for 

captured data and perform calculations to construct an output polygon. As these 

polygons have already been manually captured, this body of work is slightly less 

relevant. That is not to say that they would not be of central value to an automated 

image processing technique should it become viable. 

In terms of studies this concept has been the focus of a lot of effort, such as Pal & 

Foody’s 2009 feature selection study that showed that accuracy of classification 

declines with additional features when using support vector machines. The fact 

that an underlying verifiable automatic technique for identifying change in 

photography and converting it to accurate vector data has not been yielded from 

these studies indicates that it is probably something that will always be specific to 

the terrain being analyzed. This work is beyond the scope of this thesis so it 

focused on an aspect of polygon identification that could be applied to a more 

general spectral analysis. One previous attempt at this is H. van der Werff and F. 

van der Meer’s 2008 study of a shape based algorithm for identifying spectrally 

identical objects. In this study the authors took a look at the potential of shape 

signatures in aerial imagery in order to establish a means of identifying and 

classifying the object. They look at three broad methods for this; solely shape 

based analysis, solely spectral based analysis and a combined “spatial-spectral 

classification” (H. van der Werff & F. van der Meer, P.251). These studies are 

slightly different from the one being undertaken in this thesis in that the shape and 

classification of much of the data will already be known; however, the study is 

useful to this thesis in that it suggests the potential for a method of identifying new 

farm buildings based on a similar classification. The authors are seeking a method 

to enhance pixel-based spectral classifications (as will be used in this thesis) by 

adding spatial information. It is worth noting that the results of the study were not 

satisfactory in terms of automatically correctly identifying features. 

The first step the authors used to determine the shape of the areas being examined 

was to “seed” (H. van der Werff & F. van der Meer, P.252) the object. This 

involved beginning an object with a single pixel of a set value and increasing the 

size of the area until a spectral variance in a non-overlapping 3*3 pixel occurs. 

This part of the study continued until all the image pixels were segmented into 

objects. The authors noted that size was a factor at this point and objects of 500 

pixels or more were more successfully determined. The study itself was looking at 

parts of Alaska, and the objects being classified were water bodies; i.e. separating 

streams from ox-bow lakes, thaw waters from rivers and sediment rich water. This 

is a difficult task due to the relative random nature of these shapes when compared 

to a well defined linear pattern that can be observed in the Irish landscape. The 

authors conclude (in the case of water bodies) “(that) an object should consist of 

approximately 500 pixels at minimum to be able to use the absolute value of shape 

measurements” (H. van der Werff & F. van der Meer, P.257). 
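
The seeding step can be sketched as a simple region growing routine. The sketch swaps the authors' 3*3 variance test for a plain tolerance around the seed value, which is an assumption made to keep the example short; the image values, the tolerance and the function name grow_region are likewise hypothetical. The 500 pixel threshold is the figure quoted above.

```python
def grow_region(image, seed, tolerance):
    """Collect the 4-connected pixels whose value stays within `tolerance`
    of the seed pixel's value."""
    rows, cols = len(image), len(image[0])
    target = image[seed[0]][seed[1]]
    region = {seed}
    frontier = [seed]
    while frontier:
        y, x = frontier.pop()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < rows and 0 <= nx < cols
                    and (ny, nx) not in region
                    and abs(image[ny][nx] - target) <= tolerance):
                region.add((ny, nx))
                frontier.append((ny, nx))
    return region


MIN_PIXELS_FOR_SHAPE = 500  # minimum object size quoted from the study

# Toy single band image: low values stand for water, high values for land.
image = [
    [10, 11, 40, 41],
    [12, 10, 42, 40],
    [11, 13, 12, 39],
]
water = grow_region(image, seed=(0, 0), tolerance=3)
usable_for_shape = len(water) >= MIN_PIXELS_FOR_SHAPE
```

The toy object contains only seven pixels, so it falls far below the size at which the authors consider absolute shape measurements to be usable.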

The authors created a combined analysis method by classifying shapes according 

to three spectral bands from the imagery being used and comparing the results 

against the pixel based shape measurements. Using these results they were unable 

to distinguish between the water bodies being considered by the study and the 

authors suggest that further research is required to better combine the two (shape 

and spectral) classifications. In some ways this thesis is a continuation of this, in 

that it will be using a spectral analysis in combination with shape signatures (in 

the form of previously captured and coded vector data). The aim the authors had 

was to establish a means of measurement using an “unbiased software 

algorithm” (H. van der Werff & F. van der Meer, P.257); this would seem difficult 

to achieve in the case of a relatively chaotic Alaskan wilderness but might be 

better applied to peri-urban land parcels. 

Another example of a study which combines a number of different aspects of 

remote sensing to analyze aerial data is the 2007 random field model for urban 

area detection developed by Ping Zhong and Runsheng Wang. In this study the authors 

presented a method for interpreting remote images of urban environments that 

makes use of what they call “conditional random fields” (Zhong and Wang, 2007). 

The study is a response to the fact that although considerable research has been 

completed on land cover analysis, the algorithms generally adapt for only a 

narrow range of image resolutions and therefore only a few types of urban areas. 

They see previous attempts at urban analysis as being based on either gray-level- 

based spectral analysis or using texture descriptors. They further note that edge 

strength measures can be used to extract homogeneous regions. This is an 

interesting concept, and may have an application in the automatic capture of large 

utility features in rural areas, such as silage pits. 

The authors establish a discriminative method for identifying regions in the 

photography based on interactions with the neighboring regions. This allows them 

to utilize the conditional random fields in terms of context to identify areas. The 

authors broke this technique into the jobs of configuring the features, selecting 

classifications and classifier fusion. The proposed algorithm compares the fields 

against the data segments and places them in a classified segmented model; the 

authors compare their results against two previous algorithms, Stacked Feature 

Based (where a number of different feature types are concatenated into one model) 

and Straight Line Statistics (where areas of high incidence are used to identify 

urban areas). They observed a higher output rate against the first method (based 

on time on a 2.4GHz Pentium machine) and decreased accuracy in detecting 

smaller rural areas against the second (where straight line statistics were not 

effective against urban areas smaller than 400*400 pixels). The method the authors 

use, of allowing each component part of the search to train based on “its own 

aspects” (Zhong & Wang, 2007) appeared to give positive results against the 60 

training and 91 test images used, and was able to successfully identify blocks of 

16*16 pixels as urban or nonurban. The results of the study gave 85.3% accuracy 

in terms of correctly identifying blocks as urban (Zhong & Wang, P.3986). The 

overall methodology is probably best suited to a larger study area; however, it may 

be possible to apply the multiple conditional random fields model to a smaller 

scale with success. 

A further example of similar methodology being applied to aerial data on a large 

scale is the 2009 study of the Guangzhou urban area by Fenglei Fan, Yunpeng 

Wang, Maohui Qiu and Zhishi Wang. Although similar results to what the authors 

achieved in their much larger study would be an effective failure of this thesis, the 

study indicates that it is possible to determine a lot through automatic image 

analysis, even with the disadvantage of poor imagery, random settlement patterns 

and a large test area. In the study the authors set out to examine urban growth as 

experienced by the people of Guangzhou (a city of 7.5 million inhabitants in the 

southern Chinese Guangdong province). They were limited by available imagery – 

their study attempted to extract urban areas from a series of images dating back to 

the 1970s and some cycles were not available. They determined that fractal 

geometry was useful in studying the development of the city and that a “fractal 

dimension index is an effective index to evaluate urban form” (Fan et al, 2009). 

The study area covered 3,178 sq km and took five separate years as sample points in 

time to identify a pattern in the city’s development. 
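
As an illustration of what a fractal dimension index measures, the following is a minimal sketch of a box-counting estimate over a binary urban mask. The review does not state which estimator Fan et al. used, so box counting is an assumption chosen for its simplicity, and the function names and sample masks are hypothetical.

```python
import math


def box_count(cells, size):
    """Number of size-by-size boxes containing at least one occupied cell."""
    return len({(x // size, y // size) for x, y in cells})


def box_counting_dimension(cells, sizes=(1, 2, 4, 8)):
    """Least-squares slope of log N(s) against log(1/s)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(cells, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Hypothetical masks: a fully built-up 16*16 block and a single straight road.
filled_block = {(x, y) for x in range(16) for y in range(16)}
single_road = {(x, 0) for x in range(16)}

print(box_counting_dimension(filled_block))
print(box_counting_dimension(single_road))
```

A compact, filled settlement gives an estimate near 2 and a thin linear feature gives an estimate near 1, so intermediate values indicate how irregular or dispersed an urban form is.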

The data capture was completed using a maximum likelihood algorithm 

performed on the images. The algorithm used seven categories to classify the 

imagery: the target urban settlement, forestry, cropland, orchard, natural 

water, artificial water and bare land (vegetation free surface area outside the urban 

settlement). In order to verify the accuracy of this classification the authors took 

reference data captured from fieldwork and separate land use mapping and 

sampled the results of their study against each category in the reference. They 

achieved an accuracy in correct classification of over 80% using this method. The 

study completed segmentation of the imagery by using two transects, running 

from west to east, comprising nine blocks of 130 × 130 pixels and south-west to 

north-east, comprising ten blocks of the same quantity of pixels. 
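
The classification and verification steps described above can be sketched as follows. This is a simplified illustration rather than the implementation used by Fan et al.: it assumes each class is modelled by an independent Gaussian per band (a diagonal covariance), and the three classes, the band values and all function names are hypothetical.

```python
import math
from collections import defaultdict


def train(samples):
    """samples: list of (label, [band values]). Returns {label: (means, variances)}."""
    grouped = defaultdict(list)
    for label, bands in samples:
        grouped[label].append(bands)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small floor keeps the variance positive for perfectly uniform classes.
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-6)
                     for col, m in zip(zip(*rows), means)]
        model[label] = (means, variances)
    return model


def log_likelihood(bands, means, variances):
    """Log of the Gaussian density, summed over independent bands."""
    return sum(-0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
               for v, m, var in zip(bands, means, variances))


def classify(model, bands):
    """Assign the class whose model gives the pixel the highest likelihood."""
    return max(model, key=lambda label: log_likelihood(bands, *model[label]))


def overall_accuracy(model, reference):
    """Share of reference pixels whose predicted class equals the reference class."""
    correct = sum(1 for label, bands in reference if classify(model, bands) == label)
    return correct / len(reference)


# Hypothetical training pixels (three bands each) for three of the categories.
training = [
    ("urban", [120, 118, 115]), ("urban", [130, 125, 122]), ("urban", [110, 112, 108]),
    ("cropland", [60, 140, 50]), ("cropland", [70, 150, 55]), ("cropland", [65, 145, 60]),
    ("water", [20, 40, 90]), ("water", [25, 45, 100]), ("water", [15, 35, 95]),
]
# Hypothetical reference pixels standing in for fieldwork and land use mapping.
reference = [
    ("urban", [125, 120, 118]),
    ("cropland", [66, 148, 52]),
    ("water", [22, 45, 95]),
]

model = train(training)
print(overall_accuracy(model, reference))
```

Sampling the classified result against independent reference data, as in `overall_accuracy`, is what produces a figure comparable to the accuracy of over 80% reported by the authors.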

The Guangzhou study is useful in proving that a relatively high level of accuracy 

can be obtained when automatically capturing urban data over a large scale. This 

thesis improved on that accuracy by working with a smaller area, higher 

resolution imagery and additional indicators (vector and code data imported from 

large scale mapping). 

This thesis is fortunate not to have the variety in landscape patterns that previous 

studies have had to contend with, which meant that a high accuracy level was 

possible. In much of the available literature the studies are completed on a very 

large scale (as in the previous two papers) with very specific data in mind. They 

attempt to identify particular plant species or types of urban development. The 

methodology being used for this study may benefit from applying some of those 

techniques to a more stable sample. There are several advantages present in the 

area being targeted. The temperate nature of the Irish climate means that areas 

which are not developed will be covered by vegetation, so may fall into the near 

infrared category, while areas under development should display values consistent 

with earthworks or paving. At the outset of the study it was expected that most 

roofing would fall within a relatively small range of colors and could be used to 

calibrate the search. This was not the case; however, tarmac road data proved a 

useful replacement in terms of consistent spectral property throughout the image. 
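
The idea of calibrating a search from road pixels can be sketched as below. This is an illustrative assumption about how such a key might be built, not the algorithm implemented in this thesis: the tiny image, the road mask (standing in for pixels located under coded road vector data), the tolerance `k` and the function names are all hypothetical.

```python
import math


def build_key(image, mask):
    """Mean and standard deviation per band over the masked (known road) pixels."""
    pixels = [image[row][col] for row, col in mask]
    n = len(pixels)
    means = [sum(band) / n for band in zip(*pixels)]
    stds = [math.sqrt(sum((v - m) ** 2 for v in band) / n)
            for band, m in zip(zip(*pixels), means)]
    return means, stds


def matches_key(pixel, key, k=3.0, min_std=1.0):
    """True when every band lies within k standard deviations of the key mean."""
    means, stds = key
    return all(abs(v - m) <= k * max(s, min_std)
               for v, m, s in zip(pixel, means, stds))


# Hypothetical 3*3 RGB image: grey tarmac, green vegetation and one red roof.
image = [
    [(90, 92, 95), (88, 90, 94), (40, 120, 35)],
    [(92, 93, 96), (45, 125, 38), (42, 118, 33)],
    [(91, 91, 93), (89, 92, 95), (200, 60, 50)],
]
# Pixels assumed to lie under road polygons in the vector data.
road_mask = [(0, 0), (0, 1), (1, 0)]

key = build_key(image, road_mask)
candidates = [(r, c)
              for r, row in enumerate(image)
              for c, pixel in enumerate(row)
              if matches_key(pixel, key)]
print(candidates)
```

Because the key is derived from the image itself, it adapts to the lighting and radiometry of each photograph, which is the property that made road data a useful calibration source.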

The thesis looked at a very specific aspect of this body of knowledge and attempted 

to bridge the gap between automatic aerial data capture and traditional 

photogrammetric methods. It is noted that in most of the study areas the authors 

did not benefit from the availability of large scale coded vector data and the 

premise for the study was that if this is available then the accuracy of automatic 

capture can be increased. At the core of all of the literature mentioned in this 

review is the classification of imagery (with the exception of the point signature 

algorithm proposed by Dudarev). In the course of this review I encountered one 

study which posed one of the same questions that are considered in this thesis: can 

the use of geometric information increase classification accuracies in aerial image 

processing? This study (Bellens et al, 2008) proposed a method of morphological 

profiling to improve the data capture. The authors identified “substantial 

improvement” (Bellens et al, P.2803). The study points out that urban areas such 

as roads and car parks are so similar spectrally that they cannot be separated by a 

spectral analysis alone. They further divide spectral analysis into pixel-based or 

object-based. The object based methods group pixels together in a meaningful 

way, something which the authors identify as a difficult task. It is the intention of 

this thesis to use the former method. The authors identify a method for 

automatically obtaining structuring elements to help construct the segmentation 

(such as solid rectangular objects, roofs etc.), allowing a shape index to help 

extract man-made structures from the image. 

One observation that can be made from the available literature on automatic aerial 

image processing is that even with accurately segmented imagery (such as clearly 

divided vegetation and urban areas) a considerable amount of work is involved in 

training the algorithms to classify target areas. The creation of a standard key, 

which can be extended by the user, became one of the main focuses of this study. 

This thesis enables the user to reduce the workload by presenting a method for 

quickly calibrating an automated search. 

6 References 

Geospatial Data Abstraction Library (2010) GDAL utility programs. Retrieved on 

18th August 2010 from: http://www.gdal.org/gdal_utilities.html 

Universidade do Algarve (2010) MIRONE. Retrieved on 8th July 2010 from: 

http://w3.ualg.pt/~jluis/mirone/ 

PCI Geomatics (2010) Geomatica. Retrieved on 5th June 2010 from: 

http://www.pcigeomatics.com/index.php?option=com_content&view=article&id=5&Itemid=4 

OpenEV (2006) Geospatial Toolkit. Retrieved on 5th June 2010 from: 

http://openev.sourceforge.net/ 

Josef Kittler (1983) Image processing for remote sensing. 

Philosophical Transactions of the Royal Society of London A, 309, 323-335. 

Thomas Knudsen (2005) Pseudo natural colour aerial imagery for urban and 

suburban mapping. 

Int. Journal of Remote Sensing, Vol. 26, No.12, 2689-2698 

Roman Dudarev (2009) Plain Polygon Signature Point Definition Algorithm. 

Survey and Land Information Science 69, No2. 

S. Cordero-Sancho & S. A. Sader (2007) Spectral analysis and classification 

accuracy of coffee crops using Landsat and a topographic-environmental model 

International Journal of Remote Sensing Vol. 28, No. 7, 10 April 2007, 1577–1593 

H. van der Werff and F. van der Meer (2008) Shape-based classification of 

spectrally identical objects. 

ISPRS Journal of Photogrammetry & Remote Sensing 63, 251-258 

S. Phinn, M. Stanford, P. Scarth, A. Murray and P. Shyy (2002) Monitoring the 

composition of urban environments based on the vegetation–impervious surface– 

soil (VIS) model by sub pixel analysis techniques. 

International Journal of Remote Sensing, vol. 23, no. 20, 4131–4153 

Sakari Tuominen and Anssi Pekkarinen (2004) Local radiometric correction of 

digital aerial photographs for multi source forest inventory. 

Remote Sensing of Environment 89, 72–82 

Manuel A. Aguilar, Fernando J. Aguilar and Francisco Aguilera (2005) Mapping 

small areas using a low-cost close range Photogrammetric package with aerial 

photography. The Photogrammetric Record 20(112): 335–350 

Yuyu Zhou and Y.Q. Wang (2007) An Assessment of Impervious Surface Areas 

in Rhode Island. 

Northeastern Naturalist 14(4): 643-650 

Megan M. Lewis (1998) Numeric classification as an aid to spectral mapping of 

vegetation communities. Plant Ecology 136: 133–149 

Yong Kui Liu, Xiao Qiang Wang, Shu Zhe Bao, Matej Gombosi, Borut Zalik 

(2007) An algorithm for polygon clipping, and for determining polygon 

intersections and unions 

Computers & Geosciences 33 (2007) 589–598 

Rama Rao Nidamanuri and Bernd Zbell (2010) A method for selecting optimal 

spectral resolution and comparison metric for material mapping by spectral library 

search Progress in Physical Geography 34(1) 47–58 

Xiaoping Liu, Xia Li, and Xiaohu Zhang (2009) Determining Class Proportions 

Within a Pixel Using a New Mixed-Label Analysis Method 

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING 

Mahesh Pal and Giles M. Foody (2009) Feature Selection for Classification of 

Hyperspectral Data by SVM 

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING 

Bingcai Zhang and Neal Olander (2000) How to get GIS Data from Imagery. ESRI 

user conference 2000 proceedings. 

Retrieved on the 2 March 2010 from: 

proceedings.esri.com/library/userconf/proc00/professional/papers/pap427/p427.htm 

Ping Zhong and Runsheng Wang (2007) A Multiple Conditional Random Fields 

Ensemble Model for Urban Area Detection in Remote Sensing Optical Images. 

IEEE Transactions on geoscience and remote sensing, Vol. 45, No. 12 

Fenglei Fan, Yunpeng Wang, Maohui Qiu and Zhishi Wang. (2009) Evaluating 

the Temporal and Spatial Urban Expansion Patterns of Guangzhou from 1979 to 

2003 by Remote Sensing and GIS Methods. International Journal of Geographical 

Information Science, Vol. 23, No. 11, 1371–1388 

M. E. Martin, S. D. Newman, J. D. Aber, and R. G. Congalton (1998) 

Determining Forest Species Composition Using High Spectral Resolution Remote 

Sensing Data. Remote Sens. Environ. 65:249–254 (1998) 

Nelson, R. F., Latty, R. S., and Mott, G. (1985), Classifying 

northern forests using Thematic Mapper Simulator data. 

Photogramm. Eng. Remote Sens. 50:607–617. 

Shen, S. S., Badhwar, G. D., and Carnes, J. G. (1985), Separability of boreal forest 

species in the Lake Jennette area, 

Photogramm. Eng. Remote Sens. 51:1775–1783. 

Lathrop, R. G., Aber, J. D., Bognar, J. A., Ollinger, S. V., Casset, S., and Ellis, J. 

M. (1994), GIS development to support regional simulation modeling of 

northeastern (USA) forest Analysis (W. Michener, J. W. Brunt, and S. Stafford, 

Eds.), Skidmore, A. K. (1989), An expert system classifies eucalypt 

Taylor and Francis, London, pp. 431–451. 

Rik Bellens, Sidharta Gautama, Leyden Martinez-Fonte, Wilfried Philips, 

Jonathan Cheung-Wai Chan, and Frank Canters (2008) Improved Classification of 

VHR Images of Urban Areas Using Directional Morphological Profiles 

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 46, 

NO. 10. 
