
Self-organizing maps with multiple input-output option for modeling ...

W03022 SCHÜTZE ET AL.: SOMS FOR MODELING THE RICHARDS EQUATION

classifying flow regimes in horizontal air-water flow. Other typical applications of SOMs include the compression of remote sensing data [Kothari and Islam, 1999] and data clustering [Bowden et al., 2002]. In contrast to MLP, the SOM network architecture provides an insight into the underlying process and avoids problems regarding the identification and training of MLP [Hsu et al., 2002]. Most probably, the lack of an output function in the SOM architecture has until now restricted its popularity in water resources. Efforts are being made to overcome this limitation. For predicting classes of hydraulic conductivity, Rizzo and Dougherty [1994] generate discrete output values using the Counterpropagation network (CPN) [Hecht-Nielsen, 1987]. The CPN consists of a standard SOM and a so-called Grossberg layer.

[5] Recently, further combinations of SOM with various other techniques have been designed to approximate single continuous functions. Ranayake et al. [2002] combined self-organizing maps with a Multilayer Perceptron for a two-step estimation of the hydraulic conductivity and the dispersion coefficient. While the SOM was used to identify the subrange of the parameters, the MLP provided the final estimates. Hsu et al. [1997] modified a CPN in order to estimate precipitation via a Grossberg layer. At a later date, Hsu et al. [2002] successfully applied their self-organizing linear output (SOLO) mapping network to hydrologic rainfall-runoff forecasting. The SOLO network architecture combines a SOM and a linear mapping network of the same dimension. Schütze and Schmitz [2003] implement another type of hybrid ANN, the local linear map (LLM). They integrate the mapping functions directly into a single self-organizing map, thus ensuring a high reliability and accuracy in approximating the numerical solution of the Richards equation.
All the above discussed approaches enable a trained hybrid SOM to be used for solving single input-output problems, i.e., for reidentifying an output vector from a given input vector.

[6] This contribution aims to proceed a step further by developing a self-organizing map architecture with a multiple input-output option (SOM-MIO) which, for example, allows simulating soil water transport as well as solving different inverse problems within a single SOM-MIO. Moreover, we introduce a training procedure which ensures that the generated data fully portrays the modeling domain. Thus we analyze the aspects of generating optimal training sets by physically based models, following the suggestions of ASCE [2000b].

2. Methods

[7] Neural networks are composed of simple elements operating in parallel which are inspired by biological nervous systems. As in nature, the network function is largely determined by the connections between the elements. Commonly, the training of ANN is based on a comparison of their output y' and a known target y. Such network architectures use a supervised learning procedure with a multitude of corresponding input-output pairs. Most of the current neural network applications apply the Backpropagation algorithm to layered feedforward networks (i.e., MLP) for supervised learning. A specific form of this learning principle is also used by the radial basis function (RBF) networks.

[8] Another technique for learning a particular function is unsupervised training. Network architectures like the self-organizing maps use algorithms which fit an "elastic net" of nodes to a signal space (represented by a great number of sample vectors (x, y), i.e., input plus output vectors) in order to approximate its density function. To realize this, SOMs combine unsupervised with competitive training methods.
2.1. Self-Organizing Maps

[9] A self-organizing map can be adapted to almost arbitrary domains of definition. It allows an interpretation of its internal network structure and is able to approximate the graphs of any continuous function [Kohonen, 2001].

[10] The SOM network used in this investigation consists of l neurons organized on a regular grid. An (n + m)-dimensional weight vector m = (m_1 ... m_n, m_{n+1} ... m_{n+m}) is assigned to each neuron, where n = dim(x) and m = dim(y) denote the dimensions of the sample input and the sample output, respectively. Thus, contrary to MLP and RBF networks, the input signal of the SOM, x_SOM, always consists of both input and output vectors (x, y), which are specified in detail in section 3. The neurons are connected to adjacent neurons by a neighborhood relationship, which defines the topology or the structure of the SOM. In order to characterize the basic features we use a two-dimensional structure of the self-organizing map and a hexagonal grid for the neighborhood relationship N_i. Generally, the SOM is trained iteratively. Each iteration k involves an unsupervised training step using a new sample vector x_SOM. The weight vectors m_i are modified according to the following training procedure.

[11] 1. Begin training the SOM.

[12] 2. Initialize the SOM: choose random values for the initial weight vectors m_i.

[13] 3. Begin iteration k.

[14] 4. Iteration step 1: best matching unit (winner) search. At each iteration k one single sample vector x_SOM(k) is randomly chosen from the input data set, and its distance ε_i to the weight vectors of the SOM is calculated by

$$\varepsilon_i = \|x_{\mathrm{SOM}}(k) - m_i\| = \sum_{j=1}^{n+m} \left( x_{\mathrm{SOM}}^{j}(k) - m_i^{j} \right)^2 . \qquad (1)$$

The neuron whose weight vector m_i is closest to the input vector x_SOM(k) is the "winner," i.e., the best matching unit (BMU) c, represented by the weight vector m_c(k):

$$\|x_{\mathrm{SOM}}(k) - m_c(k)\| = \min_i \left\{ \|x_{\mathrm{SOM}}(k) - m_i\| \right\}, \qquad i = 1, 2, \ldots, l . \qquad (2)$$

[15] 5. End iteration step 1.

[16] 6. Iteration step 2: weight vectors update. After finding the best matching unit c, the weight vectors of the SOM are updated. Thus the BMU c moves closer to the input vector in the sample space. Figure 1 shows how the reference vector m_c(k) of the BMU and its neighbors move toward the sample vector x_SOM(k). Figures 1a and 1b correspond to the situation before and after updating, respectively. The rule for updating the weight vector of unit i is given by

$$m_i(k+1) = m_i(k) + \alpha_s(k) \, h_{ci}(k) \left[ x_{\mathrm{SOM}}(k) - m_i(k) \right] , \qquad (3)$$

Figure 1. Updating the best matching unit and its neighbors toward the sample vector x_SOM (marked x).

where k denotes the iteration step of a training procedure, α_s(k) is the learning rate at step k, and h_ci(k) is the so-called neighborhood function, which is valid for the actual BMU c. h_ci(k) is a nonincreasing function of k and of the distance d_ci of unit i from the best matching unit c. The Gaussian function is widely used to describe this relationship:

$$h_{ci}(k) = e^{-d_{ci}^2 / 2\sigma^2(k)} . \qquad (4)$$

Variable σ(k) is the neighborhood radius at iteration k and d_ci = ||r_c − r_i|| is the distance between map units c and i on the map grid. The neighborhood radius σ corresponds to the neighborhood relationship N_i.

[17] 7. End iteration step 2.

[18] 8. End iteration k.

[19] 9. End training of the SOM.

[20] Iteration steps 1 and 2 are repeated with the n_d sample vectors as often as necessary until convergence is achieved. Convergence implies that h_ci(k) → 0 for k → ∞ and thus depends on the function of the neighborhood radius σ(k). A common choice is an exponential decay described by Ritter et al. [1992]:

$$\sigma(k) = \sigma(0) \, e^{-k/k_{\max}} . \qquad (5)$$

The learning rate α_s should also vary with the increasing number of training steps, as indicated in equation (3). Kohonen [2001] recommended commencing with an initial value α_s(0) close to 1 and then decreasing it gradually with an increasing number of training steps k, e.g.,

$$\alpha_s(k) = \alpha_s(0) \, e^{-k/k_{\max}} . \qquad (6)$$

The cooperation between neighboring neurons, a unique feature of the SOM algorithm, ensures a rapid convergence and a high accuracy in approximating functional relationships. Even though the exponential decays described in equations (5) and (6) for the neighborhood radius σ and the learning rate α_s are purely heuristic solutions, they are adequate for a robust formation of the self-organizing map [Kohonen, 2001].
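The training loop of section 2.1 (equations (1) through (6)) is compact enough to sketch directly. The NumPy sketch below uses a rectangular map grid instead of the hexagonal grid described above, and the grid size, k_max, σ(0), and α_s(0) are illustrative choices, not values from the paper:

```python
import numpy as np

def train_som(samples, grid_w=6, grid_h=6, k_max=2000,
              alpha0=0.9, sigma0=3.0, seed=0):
    """Train a 2-D SOM on joint (input, output) sample vectors x_SOM."""
    rng = np.random.default_rng(seed)
    l = grid_w * grid_h
    dim = samples.shape[1]
    # Grid positions r_i of the neurons (rectangular here; the paper
    # uses a hexagonal neighborhood relationship N_i).
    coords = np.array([(i, j) for i in range(grid_h)
                       for j in range(grid_w)], dtype=float)
    # Step 2: random initialization of the weight vectors m_i.
    m = rng.uniform(samples.min(axis=0), samples.max(axis=0), (l, dim))
    for k in range(k_max):
        x = samples[rng.integers(len(samples))]       # random x_SOM(k)
        c = np.argmin(((x - m) ** 2).sum(axis=1))     # BMU search, eqs. (1)-(2)
        sigma = sigma0 * np.exp(-k / k_max)           # radius decay, eq. (5)
        alpha = alpha0 * np.exp(-k / k_max)           # learning rate, eq. (6)
        d2 = ((coords - coords[c]) ** 2).sum(axis=1)  # d_ci^2 on the map grid
        h = np.exp(-d2 / (2.0 * sigma ** 2))          # Gaussian neighborhood, eq. (4)
        m += alpha * h[:, None] * (x - m)             # weight update, eq. (3)
    return m
```

Because the SOM is trained on joint (x, y) sample vectors, the fitted weight vectors approximate the graph of the underlying function; for example, training on samples (x, x²) places the weight vectors near the parabola.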
2.2. New SOM-MIO

[21] Originally, the SOM was intended as a tool for solving classification problems, e.g., feature extraction or recognition of images and acoustic patterns. When using the SOM for these tasks, the most important step before application is the interpretation of its final internal structure, i.e., the labeling of the network units appertaining to a certain class. The result of an operation of a classic SOM then represents discrete information, e.g., a certain phoneme in speech recognition or a character in optical character recognition (OCR), represented by the BMU.

[22] In order to offer a wide range of applications in water resources, we now expand the SOM principle by introducing a new interpolation method for applying a trained SOM which generates multiple continuous output information. This leads to the new SOM-MIO architecture which, in accordance with the underlying problem, arranges the data vectors after training into two predefined parts during application. Rearranging the original data vectors allows switching between the different mapping functions provided by the SOM-MIO. For example, consider a sample vector x_SOM with three components (x_1, x_2, x_3). Three options for operating the SOM-MIO now exist: (1) (x = (x_1, x_2), y = x_3), (2) (x = (x_2, x_3), y = x_1), and (3) (x = (x_1, x_3), y = x_2), where y denotes the required output which is not available during application. Two matrices D_x and D_y, with

$$D = \mathrm{diag}\{d_i\}, \qquad d_i = \begin{cases} 0, & i = 1 \ldots n \\ 1, & i = n+1 \ldots n+m \end{cases} \qquad (7)$$

(written here for D_y; D_x selects the complementary components), must be defined a priori according to the chosen mapping function, in order to select input or output components of reference and data vectors.
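The selection matrices of equation (7) are plain 0/1 diagonal masks, so switching between the three operating options is only a matter of choosing which diagonal entries are set. A minimal sketch (the helper name option_matrices is ours, not from the paper):

```python
import numpy as np

def option_matrices(n_total, out_idx):
    """Return (D_x, D_y) selecting input/output components (eq. 7)."""
    d = np.zeros(n_total)
    d[list(out_idx)] = 1.0
    Dy = np.diag(d)              # picks the output components
    Dx = np.eye(n_total) - Dy    # complementary mask picks the inputs
    return Dx, Dy

# Option 1 for a 3-component sample vector: x = (x1, x2), y = x3.
Dx, Dy = option_matrices(3, [2])
m_i = np.array([0.3, 0.4, 0.7])  # an illustrative reference vector
input_part = m_i @ Dx            # -> components (x1, x2), x3 zeroed
output_part = m_i @ Dy           # -> component x3 only
```

Changing out_idx to [0] or [1] yields options 2 and 3 without touching the trained reference vectors, which is what allows the same SOM-MIO to serve both the forward and the inverse mapping.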
For example, operating the SOM-MIO with option 1 requires

$$D_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad D_y = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \qquad (8)$$

By choosing D_x and D_y, the same SOM-MIO can be used for different mapping tasks, i.e., either for the approximation of a numerical model or for solving the corresponding inverse problem. This is a unique feature of the SOM-MIO among all other ANN architectures.

2.3. Interpolation Using the Delaunay Triangulation

[23] The new architecture of the SOM-MIO allows a calculation of the required output value y using an adapted interpolation method. The strategy for determining a smooth continuous output information uses the Delaunay triangulation T_D, which is regularly applied to the generation of finite element meshes as well as to Digital Terrain Models (DTM). The interpolation method with triangulation (ITRI) provides multidimensional mapping using the same uniquely trained SOM. Our implementation of this method is based on a Delaunay triangulation using the Quickhull algorithm suggested by Barber et al. [1996]. For a given set of points the Delaunay triangulation maximizes the minimum angle over all possible triangulations, a property which is desirable for interpolation methods.

[24] For application, it is necessary to perform a unique transformation of the hexagonal SOM topology into a set of n_T triangles {t_j}:

$$\{t_j\}_{n_T} = T_D\left( \{m_i D_x\}_l \right). \qquad (9)$$
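As an illustration of the ITRI idea, the sketch below projects the reference vectors with D_x as in equation (9), builds the Delaunay triangulation with SciPy's Qhull-based implementation, and interpolates the output barycentrically inside the containing triangle. This is our own minimal re-implementation of the concept, not the authors' code; the paper's exact interpolation formula may differ.

```python
import numpy as np
from scipy.spatial import Delaunay

def itri_interpolate(weights, in_idx, out_idx, query):
    """Interpolate an output y from trained SOM-MIO reference vectors.

    in_idx/out_idx play the role of the selection matrices D_x and D_y.
    """
    pts = weights[:, in_idx]      # projected reference vectors m_i D_x
    vals = weights[:, out_idx]    # corresponding output components
    tri = Delaunay(pts)           # T_D via the Quickhull algorithm, eq. (9)
    s = int(tri.find_simplex(query))
    if s < 0:
        raise ValueError("query lies outside the triangulated domain")
    ndim = pts.shape[1]
    # Barycentric coordinates of the query inside the containing simplex.
    T = tri.transform[s, :ndim, :]
    r = tri.transform[s, ndim, :]
    b = T @ (query - r)
    bary = np.append(b, 1.0 - b.sum())
    return bary @ vals[tri.simplices[s]]
```

Operating the same weight set with in_idx = [1, 2] and out_idx = [0] solves the inverse problem (option 2) without retraining, which is the essential point of the SOM-MIO.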
