Monitoring Design W/ Support Vector Machines

WATER RESOURCES RESEARCH, VOL. 40, W11509, doi:10.1029/2004WR003304, 2004

Support vectors–based groundwater head observation networks design

Tirusew Asefa, Mariush W. Kemblowski, Gilberto Urroz, Mac McKee, and Abedalrazq Khalil

Department of Civil and Environmental Engineering and Utah Water Research Laboratory, Utah State University, Logan, Utah, USA

Received 26 April 2004; revised 30 August 2004; accepted 20 September 2004; published 25 November 2004.

[1] This study presents a methodology for designing long-term groundwater head monitoring networks in order to reduce spatial redundancy. A spatially redundant well does not change the potentiometric surface estimation error appreciably, if not sampled. This methodology, based on Support Vector Machines, makes use of a uniquely solvable quadratic optimization problem that minimizes the bound on generalized risk, rather than just the mean square error of differences between measured and "predicted" groundwater head values. The nature of the optimization problem results in sparse approximation of the function defining the potentiometric surface that was utilized to select the number and locations of long-term monitoring wells and guide future data collection efforts, which is a prerequisite in building and calibrating regional flow and transport models. The methodology is applied to the design of regional groundwater monitoring networks in the Water Resources Inventory Area (WRIA) 1, Whatcom County, northern Washington State, USA.

INDEX TERMS: 1829 Hydrology: Groundwater hydrology; 1848 Hydrology: Networks; 9820 General or Miscellaneous: Techniques applicable in three or more fields; KEYWORDS: Support Vector Machines, groundwater monitoring networks, statistical learning theory

Citation: Asefa, T., M. W. Kemblowski, G. Urroz, M. McKee, and A. Khalil (2004), Support vectors–based groundwater head observation networks design, Water Resour. Res., 40, W11509, doi:10.1029/2004WR003304.

Copyright 2004 by the American Geophysical Union. 0043-1397/04/2004WR003304$09.00

1. Background

[2] This article is concerned with the design of long-term groundwater head observation networks. Groundwater head observations are important calibration-constraining data. Under ideal conditions, physical models that are based on governing physical processes of groundwater flow do not need calibration. In reality, since model input parameters are subject to uncertainties and since they are observed locally and in sparse locations only, it is necessary to adjust these parameters so that the observed value of a dependent variable (e.g., groundwater head) matches the one simulated. Groundwater monitoring network design is defined as the selection of sampling points (spatial) and sampling frequency (temporal) to determine the physical, chemical, and biological characteristics of groundwater [Loaiciga et al., 1992].

[3] Broadly speaking, groundwater monitoring networks may be classified into two categories: (1) groundwater contaminant monitoring networks, and (2) groundwater head observation networks. On the basis of design objectives, the former, in turn, may be classified into three categories: initial groundwater contamination detection, characterization, and long-term monitoring networks. Initial groundwater contamination detection networks enable one to detect unexpected leaks before they reach a compliance boundary, which is usually located at some relatively short distance, say 100 m, from a landfill. Examples of such studies are Massmann and Freeze [1987a, 1987b], Meyer and Brill [1988], Morisawa and Inoue [1991], Meyer et al. [1994], Jardine et al. [1996], Storck et al. [1997], and Angulo and Tang [1999]. Contaminant characterization networks are concerned with characterizing the nature and extent of the pollutant once initial detection is made. Specifically, the design procedure provides a methodology for augmenting existing monitoring wells, if there are any, or for siting new wells. Examples of such studies are Hudak and Loaiciga [1992], Mahar and Datta [1997], Datta and Dhiman [1996], and Montas et al. [2000]. In long-term monitoring network design, the aim is, given an adequately characterized plume, the development of a cost-effective monitoring plan. The issues one looks at are selecting the subset of monitoring wells to be sampled for a given period and the frequency of monitoring those wells. Examples of such studies are Molina et al. [1996], Cameron and Hunter [2000], Nunes et al. [2004a], and Reed et al. [2000, 2001, 2003]. We refer interested readers to a recent publication of the American Society of Civil Engineers (ASCE) task committee on state-of-the-art in long-term monitoring network design [Minsker and Task Committee, 2003].

[4] On the basis of design objectives, groundwater head observation wells may be classified into two types: (1) characterization wells, where one tries to locate new observation wells; and (2) long-term monitoring wells, where one selects subsets from (many) existing wells to make frequent (monthly, quarterly) observations at those locations. Examples of such studies are Rouhani [1985], Gangopadhyay et al. [2001], and Nunes et al. [2004b].

[5] On the basis of the design approach, Loaiciga et al. [1992] classified all types of monitoring networks (both


ASEFA ET AL.: SVM IN GROUNDWATER HEAD MONITORING NETWORKS


groundwater head and contaminant monitoring) as hydrologic, which do not include advanced statistical methods, and statistical otherwise. The statistical approaches were further divided into simulation, variance-based, and probability-based. Differences between these methods came from differences in the objective function formulation.

[6] Variance-based methods, also known as variance reduction and redundancy reduction methods, assess the suitability of a given network by relying on the variance of estimation error obtained by kriging [Rouhani, 1985; Ben-Jemaa et al., 1994; Nunes et al., 2004a, 2004b]. A given monitoring network (number and locations) has associated uncertainty explained by the variance of estimation error; and if wells in the network are to be added, removed, or displaced, the associated network accuracy will change. These methods then systematically search through different combinations of monitoring well locations for the one that would result in minimum variance of estimation error.

[7] Ben-Jemaa et al. [1994] applied a branch-and-bound algorithm in designing monitoring networks for observing aquifer properties. The algorithm consists of searching for optimal monitoring nodes along preconstructed tree branches. If one has to select a monitoring network of $n$ nodes from a total of $N$ locations, there will be $\binom{N}{n}$ possible network layouts, each layout corresponding to one branch of the search tree. The limitation of this approach is that if $n \ll N$, the number of combinations becomes very large and the problem becomes a difficult combinatorial optimization problem. Improvements on such an approach were made using heuristic algorithms that iteratively look for a better solution by trial and error, rather than searching the entire state space [Wagner, 1995; Nunes et al., 2004a, 2004b]. As in all heuristic searches, these approaches may not guarantee that the final network corresponds to a global optimum. In addition, more than one network may be classified as optimal, making the final solution nonunique and requiring additional criteria to choose between these equally optimal networks. This technique also does not depend on actual values of measured variables, but on the relative distribution of the measuring locations.

[8] In this paper, we present a methodology that is based on Support Vector Machines (SVMs) for long-term groundwater head monitoring network design that globally optimizes an objective function to identify monitoring well locations based on their importance in explaining the potentiometric surface, without going through exhaustive trial-and-error searches on alternative monitoring well configurations.

Figure 1. Study area and 1990 phased groundwater head measurement locations. Cross sections are shown in Figure 2.
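To make the combinatorial argument of paragraph [7] concrete, the short Python sketch below counts the candidate layouts when n wells are chosen out of N. The function name and the example sizes are illustrative assumptions and are not taken from the study.

```python
import math


def count_network_layouts(total_wells: int, network_size: int) -> int:
    """Number of distinct ways to pick `network_size` wells out of `total_wells`."""
    return math.comb(total_wells, network_size)


# Hypothetical sizes, chosen only to show how quickly the search space grows.
for total_wells, network_size in [(20, 5), (100, 10), (300, 50)]:
    layouts = count_network_layouts(total_wells, network_size)
    order_of_magnitude = len(str(layouts)) - 1
    print(f"N={total_wells}, n={network_size}: roughly 10^{order_of_magnitude} layouts")
```

Even for a few hundred candidate wells the count is far beyond what an exhaustive search can visit, which is why the variance-based methods fall back on branch-and-bound pruning or heuristics.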

2. Case Study

[9] The study area, which lies in northwestern Washington State, USA, and includes a small portion inside Canada, is part of what is known as Water Resources Inventory Area #1 (hereafter WRIA 1) (Figure 1). It covers an area of 629 km². As part of a concerted effort in tackling water resources management problems in WRIA 1, Utah State University (USU), the United States Geological Survey (USGS), and the Public Utility District No. 1 (PUD) of Whatcom County undertook different tasks within three phases: Phase I, Organization; Phase II, Technical assessment; and Phase III, Plan development and implementation. Different water resources management issues currently being looked at include: (1) groundwater quantity/quality assessments; (2) surface water quantity/quality assessments; (3) instream flow and fish habitat requirements; and (4) database management and decision support systems that integrate these different activities and present an easy-to-use computer model for the decision makers. All these processes are interrelated. One of the central components of the system is groundwater flow and transport modeling, knowledge of which is a prerequisite for other processes. Effective groundwater flow and transport modeling, in turn, would require, among other things, groundwater head observations that should be collected in a timely fashion and used for model building and calibration. Therefore acquisition of these calibration-constraining data is the first step in flow modeling. In the past, our experience in the project area showed that budget and practical constraints (for example, arrangements with private well owners, and arrangements for transboundary measurements due to the fact that some well owners are within Canada) resulted in asynchronous groundwater head measurements. The USGS conducted one of the most complete surveys in 1990. Within six months (March to August 1990), observations were made covering the entire present study area. These inventoried wells are shown in Figure 1.

[10] Since it is not feasible to measure all these wells at all times, the management problem to be addressed by the present study is to identify subsets of these wells to be monitored simultaneously on a regular basis. Cost-effective acquisition of these data is then crucial in flow and transport modeling. Consequently, regional groundwater monitoring network design that identifies wells to be monitored on a regular basis while characterizing the potentiometric surface adequately is the subject of this study. In doing so, we present a novel approach to regional groundwater monitoring network design that uses a new learning methodology called Support Vector Machines (SVM) based on Statistical Learning Theory (SLT).

[11] Despite enjoying success in other fields [Schölkopf et al., 1999], there are few applications of SVM in hydrology. Dibike et al. [2001] applied SVM successfully in both remotely sensed image classification and regression (rainfall/runoff modeling) problems and reported a superior performance over the traditional artificial neural networks. Kaneviski et al. [2000] used SVM for mapping soil pollution by the Chernobyl radionuclide Sr-90 and concluded that the SVM was able to extract spatially structured information from the raw data. Liong and Sivapragasam [2000] also reported a superior SVM performance compared to artificial neural networks in forecasting flood stage. Asefa and Kemblowski [2002] used SVM to reproduce the behavior of a Monte Carlo–based groundwater flow and transport model that was, in turn, utilized in the design of initial groundwater contamination detection monitoring systems. Training and testing examples were derived using plumes generated from a random contaminant leak resulting from failure of a landfill cell and a random hydraulic conductivity field. Monitoring networks designed by the trained machine were nearly identical to those obtained by the physical models. Contaminant plume detection reliabilities provided by both methods were also close.

3. Methodology

3.1. Estimation Variance–Based Method

[12] Suppose one wants to predict the value of a random variable $Z$ at a location $x_0$, where no observation is made, from a space of functions $F$ (also named feature space), using observations in the vicinity, $x_1, x_2, \ldots, x_N$. The most common theory considers second-order stationarity. The kriging estimator of $Z$ at point $x_0$ is given as the best linear unbiased estimation (BLUE) expressed as a linear combination of

$$\hat{Z}(x_0) = \sum_{i=1}^{N} w_k^{(i)} Z\left(x^{(i)}\right), \tag{1}$$

where the $w_k^{(i)}$ are kriging weights; $\hat{Z}(x_0)$ is the kriging estimator; and $Z(x^{(i)})$ is the observation made at location $x_i$. The kriging weights are determined by requiring unbiasedness ($\sum_{i=1}^{N} w_k^{(i)} = 1$) and minimum estimation variance [Journel and Huijbregts, 1978].

[13] The groundwater monitoring network optimization problem is then posed as follows: for a given network size of $n$, find the best monitoring locations out of the total of $N$ that result in minimum mean estimation variance. This is done either through an exhaustive search, as in the branch-and-bound algorithm that guarantees a global optimum, or through a heuristic that gives a near-optimal solution, such as simulated annealing or a genetic algorithm. For applications of this approach see studies by Rouhani [1985], Ben-Jemaa et al. [1994], and Nunes et al. [2004a, 2004b].

3.2. Support Vector–Based Method

[14] The support vector methodology [Vapnik, 1995, 1998], based on Statistical Learning Theory (SLT) (see Appendix A for a detailed presentation of the SVM algorithm), estimates the value of $Z$ at an unsampled location $x_0$ (a vector of the measurement location coordinates x and y) by

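$$\hat{Z}(x) = \langle w_{sv}, x_0 \rangle + b, \tag{2}$$

where $\langle \cdot, \cdot \rangle$ indicates an inner or dot product between $x_0$ and $w_{sv}$; $w_{sv}$ is the support vector weight (basis function), and $b$ is the bias. For simplicity, we will just use $x$ rather than $x_0$. The weights and bias are found by minimizing a regularized $\varepsilon$-insensitive loss function. This loss function is depicted in Figure 2 and given below:

$$G = \left| Z(x) - \hat{Z}(x) \right|_{\varepsilon} =
\begin{cases}
0 & \text{if } \left| Z(x) - \hat{Z}(x) \right| \le \varepsilon, \\
\left| Z(x) - \hat{Z}(x) \right| - \varepsilon & \text{otherwise,}
\end{cases} \tag{3}$$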

where $Z(x)$ is the measured quantity (groundwater head in this case) and $\varepsilon$ represents the precision by which $\hat{Z}(x)$ is estimated.

[15] In Figure 2, each data point schematically represents measurements made at a monitoring well, and the $\xi$'s are slack variables that measure distances of these data points from the $\varepsilon$ tube. Data points that lie inside the $\varepsilon$ tube have a zero value of the loss function and do not have associated slack variables.

[16] In order to find $w_{sv}$ and $b$, if one only minimizes equation (3), it is an ill-posed problem in Tikhonov's sense [Tikhonov and Arsenin, 1977]. Therefore in practice one imposes a convex penalty term on some quantity related to the complexity of $Z$. Vapnik's [Vapnik, 1995, 1998] choice of the regularization term is given by $\tfrac{1}{2}\lVert w \rVert^2$.

Figure 2. The $\varepsilon$-insensitive loss function G.

[17] The optimization problem is then cast as follows: What is the "best" subset of long-term monitoring wells (number and locations) out of the existing $N$ wells that would result in the "best" estimation of the potentiometric surface for a prespecified error level (best is in the sense of the minimum regularized loss function given below)? This is mathematically expressed as follows: minimize

$$\frac{1}{2}\lVert w_{sv} \rVert^2 + C \sum_{i=1}^{N} \left( \xi_i + \xi_i^* \right), \tag{4a}$$

subject to

$$\begin{cases}
Z(x) - \langle w_{sv}, x \rangle - b \le \varepsilon + \xi_i, \\
\langle w_{sv}, x \rangle + b - Z(x) \le \varepsilon + \xi_i^*, \\
0 \le \xi_i, \xi_i^*,
\end{cases} \tag{4b}$$

to obtain

$$\hat{Z}(x) = \langle w_{sv}, x_0 \rangle + b. \tag{4c}$$
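As a minimal sketch of how the $\varepsilon$-insensitive loss of equation (3) enters the regularized objective of equation (4a), the Python functions below evaluate both for a linear estimator. The function names and the numbers in the usage lines are hypothetical, and each slack variable is simply set to its smallest feasible value for the given weights and bias, which is an assumption made here for illustration rather than a solver.

```python
def eps_insensitive_loss(z_obs, z_est, eps):
    """Equation (3): zero inside the eps tube, linear in the excess outside it."""
    return max(0.0, abs(z_obs - z_est) - eps)


def linear_estimate(w, b, x):
    """Equation (4c): inner product of weights and location, plus bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def regularized_objective(w, b, locations, heads, C, eps):
    """Equation (4a), with xi_i + xi_i* taken as the eps-insensitive loss at well i."""
    flatness = 0.5 * sum(wi * wi for wi in w)
    total_slack = sum(
        eps_insensitive_loss(z, linear_estimate(w, b, x), eps)
        for x, z in zip(locations, heads)
    )
    return flatness + C * total_slack


# Hypothetical wells (x, y) and heads; only the second well lies outside the tube.
wells = [(0.0, 0.0), (2.0, 0.0)]
heads = [0.0, 3.0]
print(regularized_objective(w=(1.0, 0.0), b=0.0, locations=wells, heads=heads, C=10.0, eps=0.5))
```

Wells whose residual is smaller than `eps` add nothing to the sum, which is the mechanism behind the sparse solution discussed in the text.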

This objective function minimizes the complexity of the SVM estimator (i.e., the estimator will tend to be flat) and penalizes monitoring points that lie outside the $\varepsilon$ tube. In other words, for any (absolute) error smaller than $\varepsilon$, $\xi_i = \xi_i^* = 0$, hence these data points do not enter the objective function. This means that not all groundwater head observations made at existing monitoring well locations will be used to estimate $\hat{Z}(x)$. The constant $C > 0$ determines the trade-off between the complexity of the function $Z$ and the amount up to which deviations larger than $\varepsilon$ are tolerated. A smaller value of $C$ means more weight is given to the regularizer, while higher and higher values of $C$ make the problem more and more unconstrained.

[18] In addition to algorithmic differences, the main difference between the SVM and kriging estimators is the fact that the SVM estimator uses a subset of the data (monitoring wells) from the total set (existing wells) based on their importance in defining the potentiometric surface.

3.3. Optimizing Long-Term Monitoring (LTM) Networks

[19] LTM networks are designed by selecting subsets from (many) existing wells to make frequent (monthly, quarterly) observations at those locations. This study addresses only the problem of spatial redundancy, assuming that future sampling plans will be evaluated as site conditions change. The reason for this restriction is that there is no consistent time series data at the project area that may be used for the purpose of analyzing temporal redundancy. Examples of some previous monitoring network design studies that are based on data collected at a snapshot in time are Rouhani [1985], Reed et al. [2001], and Reed and Minsker [2004].

[20] Usually the optimization problem given in equations (4a)–(4c) is solved in its dual form using Lagrange multipliers. In addition, the dual formulation lends itself to introducing nonlinearity in potentiometric surface estimation (shown below). Writing equations (4a)–(4c) in their dual form, differentiating with respect to the primal variables ($w$, $b$, $\xi_i$, $\xi_i^*$), and rearranging gives (see Appendix A for details) the following: maximize

$$W(\alpha^*, \alpha) = -\varepsilon \sum_{i=1}^{N} \left( \alpha_i + \alpha_i^* \right) + \sum_{i=1}^{N} Z_i \left( \alpha_i^* - \alpha_i \right) - \frac{1}{2} \sum_{i,j=1}^{N} \left( \alpha_i^* - \alpha_i \right) \left( \alpha_j^* - \alpha_j \right) k(x_i, x_j), \tag{5a}$$

subject to constraints

$$\sum_{i=1}^{N} \left( \alpha_i^* - \alpha_i \right) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C, \tag{5b}$$

to obtain

$$\hat{Z}(x) = \sum_{i=1}^{n} \left( \alpha_i^* - \alpha_i \right) k(x, x_i) + b, \tag{5c}$$

where $\alpha_i^*$ and $\alpha_i$ are Lagrange multipliers, $k(x, x_i)$ is a kernel that replaces dot products of input examples, $n$ is the number of selected long-term monitoring wells, and the $x_i$ are their locations. One observes that from the Kuhn-Tucker (KT) condition it follows that only for $|\hat{Z}(x) - Z(x)| \ge \varepsilon$ may the Lagrange multipliers be nonzero. In other words, for all samples inside the $\varepsilon$ tube (Figure 2) the $\alpha_i$, $\alpha_i^*$ vanish. The samples that have nonvanishing coefficients are called support vectors, hence the name Support Vector Machines.


Intuitively, one can imagine the support vectors as monitoring well locations that "support" the estimated potentiometric surface. Observe the difference between equation (5c) and equation (2). This is because, in differentiating the dual form, the SVM weights are shown to be equal to $w_{sv} = \sum_{i=1}^{n} (\alpha_i^* - \alpha_i)\, x_i$ (equation (A6)). Substituting this expression in equation (2) would result in $\hat{Z}(x) = \sum_{i=1}^{n} (\alpha_i^* - \alpha_i) \langle x, x_i \rangle + b$.

Figure 3. Conceptual representation of kernel transformation.
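The substitution just described can be checked numerically. The sketch below, with hypothetical coefficients and locations, builds $w_{sv}$ from the differences $\alpha_i^* - \alpha_i$ and confirms that the primal form of equation (2) and the dual, dot-product form give the same estimate. All names are illustrative assumptions.

```python
def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weights_from_dual(coeffs, locations):
    """w_sv = sum_i (alpha*_i - alpha_i) x_i, where coeffs[i] = alpha*_i - alpha_i."""
    dim = len(locations[0])
    return [sum(c * x[d] for c, x in zip(coeffs, locations)) for d in range(dim)]


def estimate_primal(w, b, x):
    """Equation (2): inner product with the weight vector, plus bias."""
    return dot(w, x) + b


def estimate_dual(coeffs, locations, b, x):
    """Same estimator written with dot products between x and the well locations."""
    return sum(c * dot(x, xi) for c, xi in zip(coeffs, locations)) + b


# Hypothetical dual coefficient differences, well locations, bias, and query point.
coeffs = [0.5, -0.2, 0.3]
locations = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
bias = 1.0
query = (2.0, 3.0)

w_sv = weights_from_dual(coeffs, locations)
print(estimate_primal(w_sv, bias, query), estimate_dual(coeffs, locations, bias, query))
```

Because the dual form touches the locations only through `dot`, that one function is the place where a kernel can later be swapped in.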

[21] The dot product $\langle x, x' \rangle$ is then substituted by a kernel: $\langle \Phi(x), \Phi(x') \rangle = K(x, x')$. This is the so-called "kernel trick" depicted in Figure 3, where a nonlinear transformation is achieved. This is possible because the SVM algorithm depends only on the dot product between monitoring well locations (see equations (A9a)–(A9c)).

[22] Kernels may be viewed as dot products of nonlinear transformation functions. The connection between Reproducing Kernel Hilbert Space and random processes is well documented [see, e.g., Wahba, 1990]. According to the Bayesian interpretation, the first term in equation (4a) is a stabilizer that is a prior on the regression function $Z$ in the Reproducing Kernel Hilbert Space (RKHS) induced by kernel $K$, and the data term is the noise model. If we assume that the data, $z_i$, are affected by an additive independent Gaussian noise process ($z_i = z(x_i) + \varepsilon_i$), then the squared norm, $\lVert Z \rVert^2$, can be thought of as the generalization of the expression $Z \Sigma^{-1} Z$ (also called the Mahalanobis distance from the mean $Z$) with covariance $\Sigma$ [Wahba, 1990; Poggio and Girosi, 1998a, 1998b]. The density, $P(Z)$, is then a multivariate Gaussian zero-mean function in the Hilbert space defined by the covariance function. The existence of such a well-defined family of random variables is guaranteed by the Kolmogorov consistency theorem [Wahba, 1990]. Therefore choosing kernel $K$ may be viewed as assuming a Gaussian prior on $Z$ with covariance equal to $K$ [Poggio and Girosi, 1998b]. This is also the link between SVMs and kriging theory, where the kernel is given by the covariance function: $K(x, x') = \mathrm{cov}(Z(x), Z(x')) = \Sigma$.

[23] The optimization problem given by equations (5a)–(5c) estimates the best function that defines the potentiometric surface as a function of support vector locations only. Measurements at other locations (those that lie inside the $\varepsilon$ tube) do not contribute to the function defining the potentiometric surface. Since support vectors define the potentiometric surface, future groundwater head observations at those locations will explain the nature of this surface better than measurements taken at other locations. Therefore support vector locations are assumed to be the best long-term monitoring well locations. In addition, the SVM algorithm directly gives the number of wells to be monitored.
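The way support vector locations fall out of the optimization can be illustrated with a small, self-contained Python sketch. It is not the solver used in the study. As a simplifying assumption it drops the bias $b$ (the heads are centered beforehand), which removes the equality constraint in (5b) and lets the dual be solved by plain coordinate descent on $\beta_i = \alpha_i^* - \alpha_i$. The well coordinates, head values, and parameter values are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """Radial basis kernel exp(-gamma^2 * ||x - y||^2)."""
    return math.exp(-(gamma ** 2) * sum((a - b) ** 2 for a, b in zip(x, y)))


def fit_bias_free_svr(locations, heads, gamma, C, eps, sweeps=200):
    """Coordinate descent on the dual in beta_i = alpha*_i - alpha_i, with b fixed at 0."""
    n = len(locations)
    K = [[rbf_kernel(locations[i], locations[j], gamma) for j in range(n)] for i in range(n)]
    beta = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Residual at well i when its own coefficient is left out.
            r = heads[i] - sum(K[i][j] * beta[j] for j in range(n) if j != i)
            # Soft-threshold by eps, then clip to the box [-C, C].
            step = math.copysign(max(abs(r) - eps, 0.0), r) / K[i][i]
            beta[i] = max(-C, min(C, step))
    return beta


def support_vector_wells(beta, tol=1e-9):
    """Indices of wells with a nonvanishing coefficient."""
    return [i for i, b in enumerate(beta) if abs(b) > tol]


def estimate_head(x, locations, beta, gamma):
    """Kernel expansion of equation (5c) with b = 0."""
    return sum(b * rbf_kernel(x, xi, gamma) for b, xi in zip(beta, locations))


# Hypothetical wells on a unit grid and centered heads; not data from the study.
wells = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0),
         (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 1.0)]
centered_heads = [-1.0, -0.8, 0.9, 1.2, -1.1, -0.7, 1.0, 1.3]
for eps in (0.05, 0.2, 0.5):
    beta = fit_bias_free_svr(wells, centered_heads, gamma=1.5, C=10.0, eps=eps)
    print(f"eps={eps}: wells kept = {support_vector_wells(beta)}")
```

Wells that are not returned by `support_vector_wells` have residuals inside the tube and carry a zero coefficient, so they can be dropped without changing the fitted surface. Widening `eps` is what trades accuracy for a smaller network.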

4. Application to a Case Study

[24] SVM-based regional groundwater monitoring network design may be summarized in two steps: (1) inventory of groundwater head observations and hydrogeological characterization of the different layers within which existing piezometers are located; and (2) SVM implementation.

4.1. Hydrogeological Characterization

[25] In the present study area, groundwater observation wells are distributed within different aquifer layers and one has to delineate these aquifers in order to select wells in each layer. At a regional scale, the study area is classified as part of what is known as the Puget Sound Lowland, which has been influenced in large part by the tectonic and glacial events during the Tertiary and Quaternary periods [Jones, 1999]. This part of the Puget Sound Lowland is named the Fraser-Whatcom Basin. Cox and Kahle [1999] identified two classes of aquifers (from top down): (1) the Sumas aquifer (Qsa) and (2) the Everson-Vashon aquifer (Qev). The latter may be further divided into the Everson-Vashon fine-grained confining unit (Qevf) and the Everson-Vashon coarse-grained layer (Qevc), a confined aquifer. The Qevc consists of discontinuous patches (lenses, pools). Therefore the hydrogeology of the present study area is a two-aquifer, three-layer system.

[26] Characterization data were obtained from Cox and Kahle [1999], the Whatcom County Health and Human Services Department (WCHHSD) well log database (2826 geographically referenced points), and the Department of Ecology's scanned well logs (6967 data points). These data were analyzed to select well logs that were subsequently used to delineate the identified hydrogeologic layers. Well log selection criteria, among other factors, include depth of completion and uniform areal coverage. Figure 4 shows


two cross sections (east-west and south-north) of the present study area. Locations of the cross sections are shown in Figure 1 (Figure 4).

Figure 4. Cross sections of (a) east-west and (b) south-north. The locations of the cross sections are shown in Figure 1.

[27] Because most of the water supply need in the project area is satisfied by the Sumas Aquifer, and this aquifer is practically disconnected from the underlying Qevc layer by a thick, low-permeability Qevf layer, the present study is concerned only with the Sumas Aquifer. In addition, most of the inventoried observation wells are also sited in this aquifer. We note that several localized previous hydrological investigations also considered the bottom of the Sumas Aquifer as an impermeable unit [Associated Earth Sciences, Inc., 1994, 1995; GeoEngineers Hydrogeologic Services, 1994; Water Resources Consulting, LLC, 1997].

4.2. SVM Implementation

[28] Equations (5a)–(5c) constitute a quadratic optimization problem that guarantees a global optimum and can be solved using any off-the-shelf quadratic optimization algorithm such as LOQO [Vanderbei, 1994]. We used the SVM optimization code developed by the Royal Holloway University of London and AT&T Speech and Image Processing Service Research Lab [Saunders et al., 1998]. The data required to solve equations (5a)–(5c) are the observed groundwater head, $Z(x)$, at monitoring locations $x$ (X and Y coordinates), and a kernel $k(x, x_i)$ that describes the (nonlinear) dependency between observation points. Table 1 shows the most commonly used SVM kernels. Here we used a radial basis kernel that is translation invariant and estimated its parameter using cross validation (see below). From Table 1, notice that use of the two-layer neural network kernel in SVM is not the same as use of the traditional Artificial Neural Network (ANN) [Govindaraju and Rao, 2000]. This important difference between ANNs and SVMs is explained below.

Table 1. Commonly Used Kernels

Kernel Type                   Expression
Simple dot product (a)        $K(x, x') = x \cdot x'$
Polynomial                    $K(x, x') = (x \cdot x' + 1)^d$; $d$ is user specified
Two-layer neural network      $K(x, x') = \tanh(b(x \cdot x') - c)$; $b$ and $c$ are user specified
Radial basis (b)              $K(x, x') = \exp(-\gamma^2 \lVert x - x' \rVert^2)$; $\gamma^2$ is user specified

(a) This kernel corresponds to a linear machine.
(b) This kernel is translation invariant. It can be written as a Gaussian covariance kernel with unit variance: $K(x, x') = \sigma^2 \exp(-h^2 / r^2)$, where $\sigma^2 = 1$, $r^2 = 1/\gamma^2$, and $h^2 = \lVert x - x' \rVert^2$.

[29] Although the transformation function (kernel) used by ANNs and by SVMs with the two-layer neural network kernel is similar, the loss function used by ANNs (based on least squares) does not result in a sparse solution [Girosi, 1998], as it does in the SVM. Therefore, because of the nature of the loss function employed, if ANNs were to be used to estimate the potentiometric surface, they would use all the measured data at monitoring well locations. Consequently, ANNs will not be able to directly select a subset of monitoring wells to be used as LTM networks as a function of different levels of potentiometric surface approximation. Lastly, most training methods in ANNs, such as the back propagation algorithm, may not guarantee a global optimum [Hastie et al., 2001, p. 359; Vapnik, 1998, p. 399].

[30] The SVM algorithm is used in two stages: (1) training/validation, and (2) design. The training/validation stage aims at finding the optimal kernel parameter and SVM parameter $C$ for a range of potentiometric surface approximations ($\varepsilon$) that will be used in the design stage. The design stage then uses the trained SVM to provide a long-term monitoring network as a function of groundwater head surface approximation. Each of these steps is explained below.
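For readers who want to experiment with the kernels listed in Table 1, a minimal Python sketch follows. The function names are illustrative, the minus signs in the neural network and radial basis expressions are assumed to be as reconstructed here, namely $\tanh(b(x \cdot x') - c)$ and $\exp(-\gamma^2 \lVert x - x' \rVert^2)$, and the sample points are hypothetical.

```python
import math


def dot(x, y):
    """Inner product of two equal-length coordinate vectors."""
    return sum(a * b for a, b in zip(x, y))


def simple_dot_kernel(x, y):
    """Linear machine: the plain dot product."""
    return dot(x, y)


def polynomial_kernel(x, y, d):
    """(x . y + 1)^d with user-specified degree d."""
    return (dot(x, y) + 1.0) ** d


def neural_network_kernel(x, y, b, c):
    """tanh(b (x . y) - c) with user-specified b and c."""
    return math.tanh(b * dot(x, y) - c)


def radial_basis_kernel(x, y, gamma):
    """exp(-gamma^2 ||x - y||^2); equals exp(-h^2 / r^2) with r^2 = 1 / gamma^2."""
    h2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-(gamma ** 2) * h2)


# Hypothetical well coordinates.
p, q = (1.0, 2.0), (3.0, 4.0)
shift = (10.0, -5.0)
p_shifted = tuple(a + s for a, s in zip(p, shift))
q_shifted = tuple(a + s for a, s in zip(q, shift))

print(simple_dot_kernel(p, q), polynomial_kernel(p, q, 2))
# Translation invariance: shifting both wells leaves the radial basis kernel unchanged.
print(radial_basis_kernel(p, q, 0.5), radial_basis_kernel(p_shifted, q_shifted, 0.5))
```

Only the radial basis kernel depends on the separation between wells rather than on their absolute coordinates, which is what allows it to be read as a covariance function of distance.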


[31] Three hundred and fifty well locations and groundwater head observations extracted from the Sumas Aquifer were used to estimate the SVM parameter, $C$, and the radial basis kernel parameter, $\gamma$. One way of conducting the training/validation is with a split sample approach. This approach divides the available data into two parts and uses one for training and the other for validation. Optimal SVM parameters are then selected based on performance (e.g., minimum root mean square error) on the validation set. We used a K-fold cross-validation technique. The K-fold cross-validation approach splits the available data into K more or less equal parts. K − 1 parts of the data are used to find the SVM estimator, $\hat{Z}(x)$, and the validation error of the fitted model is calculated while predicting the kth part of the data. The procedure then continues for k = 1, 2, ..., K, and the selection of parameters is based on minimum prediction error estimates over all K parts.

[32] Now the question is what value to use for K. Hastie et al. [2001] recommend the use of K = 5 or 10 based on the shape of a "learning curve." A learning curve is a plot of training error versus training size. For given SVM parameters ($\gamma$, $\varepsilon$, and $C$), different training errors are calculated by progressively estimating $\hat{Z}(x)$ for increasing training sizes, constituting a plot of the learning curve. For smaller training sizes, the learning curve has a steep slope, and it gradually flattens as the training size increases and changes in training error become small. At this point, the training error is said to be independent of the training size. Consequently, the training size used in each fold, (K − 1)N/K (that is, 4N/5 or 9N/10 of the N available data points), should correspond to the training size where the learning curve starts to be flat. We note that even though the actual value of the training error may differ for different combinations of SVM parameters, the shape of the learning curve remains more or less the same (i.e., the training size that corresponds to the flattened portion of the learning curve stays nearly the same).

[33] Figure 5 shows a representative learning curve in our case. The curve was made using $\varepsilon$ = 0.1, $C$ = 10, and $\gamma$ = 6. The value of the kernel parameter was derived from data. This was done by noting that the radial basis kernel, in fact, is a Gaussian covariance with unit variance (see Table 1), the relation being $r^2 = 1/\gamma^2$, where $r$ is the distance after which no spatial autocorrelation is evident. Figure 6 shows the experimental and Gaussian covariance that was used to estimate the value of $\gamma$. We would like to point out that this kernel parameter value obtained from the covariance fit is used to obtain the learning curve and we do not imply an assumption of an underlying Gaussian random field for the head distribution. One could also use an arbitrarily selected kernel parameter value and adjust its value during training.

Figure 5. A "learning curve." The broken line corresponds to fivefold cross validation (280 data points).

Figure 6. Experimental covariance along with Gaussian covariance fit.

[34] As shown in Figure 5, the learning curve is relatively flat after it reaches a training size of 250. The five- and tenfold training sizes correspond to sample sizes of 280 and 315, respectively, at which the performance is virtually the same as that of the complete set. Thus cross validation would not suffer from much bias. The case K = 5 will have almost the same performance as the case K = 10, but it will result in a smaller computational time and, therefore, was used to conduct the cross validation. If the five- or tenfold training size indicates a location where the learning curve has a considerable slope, from Figure 5 we observe that the true prediction error (where the curve flattens) will be underestimated [Hastie et al., 2001].

[35] Consequently, we conducted a fivefold cross validation for a range of potentiometric surface approximations ($\varepsilon$ = 0.01–0.5) and obtained optimal values of the SVM parameters of $C$ = 7 and $\gamma$ = 2. These values were then used in the design stage as explained in the next section.

4.3. Selecting Optimal Long-Term Monitoring Networks

[36] The design of a groundwater monitoring network is a multiobjective optimization problem [Knopman et al., 1991; Cieniawski et al., 1995; Wagner, 1995; Reed et al., 2001, 2003]. If one monitors all the available wells, the error associated with defining the potentiometric surface will be minimal, but this also means a higher cost of monitoring. Using a small number of monitoring wells would be less costly but will also have higher error in explaining the potentiometric surface.
Therefore our interest in this study lies in (1) finding how many wells would be required to define the groundwater flow field, (2) identifying the locations of those wells, and (3) providing a decision curve that shows trade-offs between the number of wells and the corresponding relative error in groundwater table elevation estimates.

[37] Using the optimal SVM parameters estimated in the previous section, we fit a potentiometric surface to all the observed data (groundwater head observations at monitoring wells and their corresponding locations, X and Y coordinates) for various levels of potentiometric surface approximation. At the end of the quadratic optimization procedure, the support vectors were extracted and geographically referenced, thus producing a set of long-term monitoring well locations. Different magnitudes of error in defining the potentiometric surface then result in different numbers and locations of monitoring wells. Therefore the relation between ε and the number and locations of monitoring wells can be used to decide the size of the network, as shown in Figure 7. For example, Figure 8 shows the locations of monitoring wells for four different error levels. Sixty-five monitoring wells would be required to maintain an error level of 5%, 23 wells for ε = 10%, and so on. Wells selected in networks of higher error level (for example, ε = 15%) were found to be progressively included in the set when ε is smaller, rendering consistency in the solution.

[38] It is interesting to observe that the selected monitoring well locations (Figure 8) are in the areas where the observed heads are most uncertain. Inspection of the equipotential lines shows that the support vector points approximately follow the groundwater watershed boundaries. If two or more monitoring locations are very close to each other, it is because the local differences between groundwater heads at those locations are large, therefore requiring more monitoring wells to explain the groundwater head variation in those areas. Figure 9 depicts the SVM prediction error surface for different sizes of monitoring networks. Recall that, from the definition of support vectors, at selected monitoring well locations the (absolute) prediction errors are equal to or greater than the prespecified error level. In other words, at those locations training points are on or outside the ε tube.
Nonmonitoring observation wells at other locations lie inside the ε tube and hence do not contribute toward the definition of the potentiometric surface. This confirms common intuition, as the SVM procedure puts observation wells at the most uncertain locations. The groundwater surface is then "supported" at those locations.

Figure 7. Network size versus potentiometric surface approximation.

Figure 8. SVM predicted groundwater head (m) surface and selected monitoring wells for different levels of a prespecified error level (ε, number of monitoring wells): (a) (5%, 65); (b) (10%, 23); (c) (15%, 11); and (d) (20%, 8).

[39] We also investigated the performance of the kernel parameter value (correlation length scale) estimated from the covariance fit, compared to the one obtained through fivefold cross validation, on the complete set of data. Figure 10 depicts this comparison.

[40] For small values of ε, the covariance fit value (smaller correlation scale or higher γ) gave better Root Mean Square Error (RMSE) results; as ε increases, the kernel parameter derived from the fivefold cross validation (selected based on overall best performance) gave better RMSE. This observation can be explained as follows. At lower values of ε, the estimated potentiometric surface will be close to the observed groundwater surface, requiring a highly localized kernel, and hence such a kernel is expected to produce a smaller RMSE. As ε increases, the estimated potentiometric surface is flatter, with support vectors far apart, and hence a kernel with a larger length scale would result in smaller RMSE values.

[41] Two types of measurement errors may be identified in the process: (1) piezometer dislocation (X, Y coordinates) and (2) groundwater head measurement errors. The former type is a onetime error (although piezometer locations could be updated through resurvey). Usually, groundwater observations are made from ground surface to water table and are converted to groundwater heads by subtracting these values from estimated ground surface elevations. When the variation in topography within the neighborhood of a piezometer head is large, the impact of dislocation error could be significant and may affect subsequent estimates in both groundwater network design and flow and transport modeling. In the present study, we extracted piezometer head information from a high-resolution (10 m) Digital Elevation Model (DEM) using GIS operations and assumed that the dislocation error is negligible.

Figure 9. Error surface and selected monitoring wells for different levels of a prespecified error level (ε, number of monitoring wells): (a) (5%, 65); (b) (10%, 23); (c) (15%, 11); and (d) (20%, 8).

[42] In order to investigate groundwater head measurement errors, we conducted experiments for different Noise to Signal Ratios (NSR) using Gaussian noise. NSR is defined as the ratio between the variance of the noise and the variance of the observed data. Table 2 shows comparisons between designed networks with and without Gaussian noise. At ε = 5%, the network size increased marginally, and this change grows with increasing NSR values, whereas at higher ε values (for example, ε = 10%) the change in network size remains steady with increasing NSR. As seen in the table, the network size changes are larger at lower values of ε, which is in agreement with our intuition and indicates that networks designed at higher ε levels are more tolerant of measurement corruption. For example, at ε = 15%, the NSR value has to be increased to 50% in order to cause changes in the designed network. Overall, we have found the support vector–based designed network to be robust.
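The selection step of this section (fit an ε-insensitive surface, then keep the support vectors as monitoring wells) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the bias term b is omitted, the dual is solved by simple coordinate descent rather than a quadratic programming solver, the kernel is taken to be exp(−γ‖x − x′‖²), and the well grid and heads are synthetic. The names `rbf`, `fit_beta`, and `select_wells` are hypothetical; only C = 7 and γ = 2 come from the text.

```python
import math

def rbf(a, b, gamma):
    """Gaussian kernel exp(-gamma * ||a - b||^2) between two (X, Y) points."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def fit_beta(points, heads, eps, C, gamma, sweeps=500):
    """Coordinate descent on the bias-free epsilon-insensitive dual.

    beta[i] stands for (alpha_i - alpha_i*); a well is a support vector
    exactly when beta[i] is nonzero.
    """
    n = len(points)
    K = [[rbf(points[i], points[j], gamma) for j in range(n)] for i in range(n)]
    beta = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Residual at well i without its own contribution.
            r = heads[i] - sum(K[i][j] * beta[j] for j in range(n) if j != i)
            # Soft-threshold by eps (zero inside the tube), then clip to [-C, C].
            shrunk = math.copysign(max(abs(r) - eps, 0.0), r) / K[i][i]
            beta[i] = max(-C, min(C, shrunk))
    return beta, K

def select_wells(points, heads, eps, C=7.0, gamma=2.0):
    """Indices of wells retained as long-term monitoring locations."""
    beta, _ = fit_beta(points, heads, eps, C, gamma)
    return [i for i, b in enumerate(beta) if b != 0.0]

# Hypothetical 5 x 5 grid of wells; heads are synthetic deviations from a
# reference level, so that dropping the bias term is harmless here.
points = [(float(x), float(y)) for x in range(5) for y in range(5)]
heads = [0.4 * math.sin(0.9 * x) * math.cos(0.7 * y) for x, y in points]

# Decision curve: network size for each prespecified error level.
curve = {eps: len(select_wells(points, heads, eps))
         for eps in (0.05, 0.10, 0.15, 0.20)}
for eps, size in curve.items():
    print(f"eps = {eps:.2f}: {size} wells")
```

Wells whose residual stays strictly inside the tube receive a zero coefficient and drop out of the network, which is the sparsity property the design relies on. A production design would more likely use an established SVR solver that also estimates b.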

Figure 10. Data-driven (γ = 6) and fivefold cross-validated (γ = 2) kernel performances.

5. Conclusions

[43] We have presented a regional groundwater network design procedure that uses a new machine learning methodology called Support Vector Machines (SVM), based on Statistical Learning Theory (SLT). The SLT procedure allows for an unbiased selection of monitoring points based on their importance in constructing the groundwater potentiometric surface without going through an exhaustive search of different monitoring network configurations. The approach consists of two parts: one related to the regularization of the solution (i.e., the estimated function will always tend to be flat, avoiding overfitting), and the second related to the goodness of fit, resulting in remarkable generalization capabilities. The current procedure evaluates the minimal information (number of monitoring wells) needed to design a regional groundwater monitoring network by selecting from (many) existing wells. The locations of existing wells are mapped to the potentiometric surface using a nonlinear kernel transformation chosen a priori. The unique ε-insensitive feature of SVMs was used to select (the number and locations of) monitoring wells. The ability of SVMs to construct potentiometric surface approximations using a very rich set of functions and to control the trade-off between accuracy of approximation and complexity of the approximating function was the key to the present design procedure. Different accuracies of groundwater head surface approximation then result in different sizes of networks that will be used to guide future data collection efforts. The nature of the optimization problem resulted in a sparse set of actually used observation locations. In accordance with our intuition, the SVM procedure placed monitoring wells at the most uncertain locations (for example, groundwater watershed boundaries). The procedure also retained the monitoring wells selected at a higher ε while progressively adding monitoring wells for lower error levels (higher number of wells), rendering consistency in the solutions.

[44] There are three important parameters to consider when using SVM for the cases presented in this paper. For a range of error levels (ε-insensitive potentiometric surface approximation), the complexity, C, and the kernel parameter, γ, were estimated using a fivefold cross validation approach. The Gaussian covariance kernel parameter was derived from observed data and used to constrain the search space, incorporating domain knowledge into the design process. The performance of this data-driven kernel parameter was also found to be fairly comparable to that of the one obtained from fivefold cross validation in terms of prediction error on the complete set.

[45] This study has looked at the problem of spatial redundancy only, assuming future sampling plans will change as site conditions change. This is because there is no consistent time series of groundwater head observations in the project area that would enable us to include the problem of temporal redundancy. The methodology presented can be extended to analyze temporal redundancy, for example, by following the same steps as for the spatial redundancy problem but at different time steps. The temporal-spatial long-term monitoring network may then be selected based on some criterion (for example, the frequency of occurrence of a given monitoring well over all the time steps [see Gangopadhyay et al., 2001; Nunes et al., 2004a, 2004b]). The method presented here could also be extended to include economic analysis in the design of monitoring networks, provided that a utility function for ε can be meaningfully formulated.
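As a rough illustration of the frequency-of-occurrence criterion mentioned above, the sketch below counts how often each well is selected across per-time-step designs and keeps the wells that recur. It is one possible reading of that criterion, not a method from this study; the well identifiers, the 0.5 threshold, and the function names are hypothetical.

```python
from collections import Counter

def selection_frequency(networks_by_time):
    """Fraction of time steps in which each well was selected."""
    counts = Counter(w for net in networks_by_time for w in set(net))
    steps = len(networks_by_time)
    return {well: counts[well] / steps for well in counts}

def persistent_wells(networks_by_time, min_fraction=0.5):
    """Wells selected in at least min_fraction of the time steps."""
    freq = selection_frequency(networks_by_time)
    return sorted(w for w, f in freq.items() if f >= min_fraction)

# Hypothetical spatial designs obtained at four time steps.
networks = [
    {"W01", "W07", "W12"},
    {"W01", "W12"},
    {"W01", "W03", "W07"},
    {"W01", "W12", "W20"},
]
print(persistent_wells(networks))
```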

Appendix A: Support Vector Machines Algorithm

[46] Suppose one wants to estimate a functional dependency, $\hat{Z}(x)$, between input points $\{x_1, x_2, \ldots, x_N\}$ taken from $x \in R^K$ and targets $\{z_1, z_2, \ldots, z_N\}$ with $z \in R$, drawn from a set of N independent and identically distributed (i.i.d.) observations. We seek a function $\hat{Z}(x)$ by minimizing the following regularized risk functional [Vapnik, 1995, 1998]:

minimize

$$\frac{1}{2}\|w_{sv}\|^2 + C\sum_{i=1}^{N}(\xi_i + \xi_i^*), \tag{A1a}$$

subject to

$$\begin{cases} z_i - \langle w_{sv}, x_i\rangle - b \le \varepsilon + \xi_i \\ \langle w_{sv}, x_i\rangle + b - z_i \le \varepsilon + \xi_i^* \\ 0 \le \xi_i, \xi_i^* \end{cases} \tag{A1b}$$

to obtain

$$\hat{Z}(x) = \langle w_{sv}, x\rangle + b, \tag{A1c}$$

where $w_{sv}$ is the vector of support vector weights (we will just use $w$ for simplicity) and $\xi_i$, $\xi_i^*$ are slack variables that determine the degree to which samples with error greater than $\varepsilon$ are penalized (Figure 2). $C > 0$ determines the trade-off between function complexity and closeness to the data.
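To make the roles of ε, the slack variables, and C concrete, the following sketch evaluates the objective (A1a) for a given linear function, using the smallest slacks that satisfy (A1b). It is a minimal illustration on made-up one-dimensional data; the function names and the sample values are assumptions, not part of the paper.

```python
def slacks(z, z_hat, eps):
    """Smallest (xi, xi_star) satisfying the constraints (A1b) for one sample."""
    xi = max(0.0, (z - z_hat) - eps)        # observation above the eps tube
    xi_star = max(0.0, (z_hat - z) - eps)   # observation below the eps tube
    return xi, xi_star

def regularized_risk(w, b, X, Z, eps, C):
    """Objective (A1a): 0.5 * ||w||^2 + C * sum of slacks."""
    norm_sq = sum(wj * wj for wj in w)
    total_slack = 0.0
    for x, z in zip(X, Z):
        z_hat = sum(wj * xj for wj, xj in zip(w, x)) + b
        xi, xi_star = slacks(z, z_hat, eps)
        total_slack += xi + xi_star
    return 0.5 * norm_sq + C * total_slack

# Hypothetical data: the first two samples sit on the tube, the third lies
# 1.5 units outside it and is the only one that is penalized.
X = [[-1.0], [1.0], [2.0]]
Z = [-1.0, 1.0, 3.0]
print(regularized_risk([0.5], 0.0, X, Z, eps=0.5, C=7.0))
```

Samples inside or on the tube contribute nothing to the penalty, so only the flatness term and the points outside the tube shape the solution.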

Table 2. Designed Network Comparisons Under Noise-Free and Corrupted Groundwater Head Observations

ε, %   NSR,a %   Network size (NZ)b   ΔNZ,c %   (+)/(−),d %
5      10        67                   (+)3.0    (+)10, (−)7.7
5      25        70                   (+)7.7    (+)10.8, (−)3.1
5      50        68                   (+)4.6    (+)13.8, (−)9.2
10     10        23                   0         (+/−)4.3
10     25        22                   (−)4.3    (+)8.7, (−)13
10     50        22                   (−)4.3    (+)8.7, (−)13
15     10        11                   0         0, 0
15     25        11                   0         0, 0
15     50        10                   (−)9.1    (+)18.2, (−)9.1

a Noise-to-signal ratio (%), defined as the ratio between noise variance and variance of the observed data.
b Size of network obtained using corrupted groundwater head observations.
c Percentage change in network size compared to noise-free network size.
d Percentage increase (+) or reduction (−) in wells compared to a noise-free network. (+/−) means that the numbers of wells added and removed are equal. This is the case when network size remains constant.
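The two quantities defined in the table footnotes can be sketched as follows: corrupting heads with Gaussian noise at a given noise-to-signal ratio, and the percentage change in network size relative to the noise-free design. This is an illustrative sketch only; the function names are hypothetical, and the use of the population variance for the "variance of the observed data" is an assumption. The noise-free sizes 65 and 11 are those reported for ε = 5% and ε = 15%.

```python
import random
import statistics

def corrupt_heads(heads, nsr_percent, seed=0):
    """Add zero-mean Gaussian noise whose variance is nsr_percent % of the
    (population) variance of the observed heads."""
    rng = random.Random(seed)
    noise_sd = (nsr_percent / 100.0 * statistics.pvariance(heads)) ** 0.5
    return [h + rng.gauss(0.0, noise_sd) for h in heads]

def network_size_change(noisy_size, noise_free_size):
    """Percentage change in network size relative to the noise-free design."""
    return 100.0 * (noisy_size - noise_free_size) / noise_free_size

# Network sizes taken from the table: 70 wells at eps = 5%, NSR = 25%
# (noise-free: 65) and 10 wells at eps = 15%, NSR = 50% (noise-free: 11).
print(round(network_size_change(70, 65), 1))   # 7.7
print(round(network_size_change(10, 11), 1))   # -9.1
```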


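[47] The dual form is obtained by using Lagrange multipliers. Equations (A1a)–(A1c) written in dual form are as follows:

$$
\begin{aligned}
G(w, \xi, \xi^*, \alpha, \alpha^*, \eta, \eta^*, b) ={}& \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}(\xi_i + \xi_i^*) \\
&- \sum_{i=1}^{N}\alpha_i\Big[\varepsilon + \xi_i - z_i + \sum_{j=1}^{K} w_j x_{ji} + b\Big] \\
&- \sum_{i=1}^{N}\alpha_i^*\Big[\varepsilon + \xi_i^* + z_i - \sum_{j=1}^{K} w_j x_{ji} - b\Big] \\
&- \sum_{i=1}^{N}\big[\eta_i \xi_i + \eta_i^* \xi_i^*\big],
\end{aligned} \tag{A2}
$$

where $\alpha^*$, $\alpha$, $\eta^*$, $\eta$ are Lagrange multipliers, $x_{ji}$ is the jth component of $x_i$, and $j = 1, \ldots, K$ indexes the input dimension. The saddle point condition states that the partial derivatives of G with respect to the primal variables ($w$, $b$, $\xi_i$, $\xi_i^*$) have to vanish for optimality, i.e.,

$$\frac{\partial G}{\partial b} = \sum_{i=1}^{N}(\alpha_i^* - \alpha_i) = 0, \tag{A3}$$

$$\frac{\partial G}{\partial w} = \sum_{j=1}^{K}\frac{\partial G}{\partial w_j}\vec{z}_j = \sum_{j=1}^{K} w_j \vec{z}_j + \sum_{i=1}^{N}(\alpha_i^* - \alpha_i)\sum_{j=1}^{K} x_{ji}\vec{z}_j = \{0\}, \tag{A4}$$

$$\frac{\partial G}{\partial w} = w + \sum_{i=1}^{N}(\alpha_i^* - \alpha_i)x_i = \{0\}, \tag{A5}$$

and thus

$$w = \sum_{i=1}^{N}(\alpha_i - \alpha_i^*)x_i, \tag{A6}$$

where $\vec{z}_j$ is a unit vector. Also,

$$\frac{\partial G}{\partial \xi_i} = C - \alpha_i - \eta_i = 0, \tag{A7}$$

$$\frac{\partial G}{\partial \xi_i^*} = C - \alpha_i^* - \eta_i^* = 0. \tag{A8}$$

Substituting equations (A3) to (A8) in equation (A2) results in the following quadratic optimization problem. Maximize the following functional with respect to the dual variables (the $\alpha$'s):

$$W(\alpha^*, \alpha) = -\varepsilon\sum_{i=1}^{N}(\alpha_i + \alpha_i^*) + \sum_{i=1}^{N} z_i(\alpha_i - \alpha_i^*) - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\langle x_i, x_j\rangle, \tag{A9a}$$

subject to constraints

$$\sum_{i=1}^{N}(\alpha_i^* - \alpha_i) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C, \tag{A9b}$$

to obtain

$$\hat{Z}(x) = \sum_{i=1}^{N}(\alpha_i - \alpha_i^*)\langle x, x_i\rangle + b. \tag{A9c}$$

Since the above expression depends only on inner products between input examples, kernel substitution (also called the kernel trick; see Figure 3) of $\langle x, x'\rangle \rightarrow \langle\Phi(x), \Phi(x')\rangle = k(x, x')$ would result in the SVM algorithm: maximize

$$W(\alpha^*, \alpha) = -\varepsilon\sum_{i=1}^{N}(\alpha_i + \alpha_i^*) + \sum_{i=1}^{N} z_i(\alpha_i - \alpha_i^*) - \frac{1}{2}\sum_{i,j=1}^{N}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\,k(x_i, x_j), \tag{A10a}$$

subject to constraints

$$\sum_{i=1}^{N}(\alpha_i^* - \alpha_i) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C, \tag{A10b}$$

to obtain

$$\hat{Z}(x) = \sum_{i=1}^{N}(\alpha_i - \alpha_i^*)\,k(x, x_i) + b. \tag{A10c}$$

The bias b of the function that we are seeking is found from the Kuhn-Tucker (KT) condition, which requires that for the optimal solution the product between dual variables and constraints vanish. Mathematically, this is expressed as

$$\alpha_i\Big(\varepsilon + \xi_i - z_i + \sum_{j=1}^{K} w_j x_{ji} + b\Big) = 0, \tag{A11}$$

$$\alpha_i^*\Big(\varepsilon + \xi_i^* + z_i - \sum_{j=1}^{K} w_j x_{ji} - b\Big) = 0, \tag{A12}$$

$$(\alpha_i - C)\,\xi_i = 0, \qquad (\alpha_i^* - C)\,\xi_i^* = 0. \tag{A13}$$

From the relations shown above, it follows that (1) only samples $(x_i, z_i)$ with corresponding $\alpha_i^*$ or $\alpha_i = C$ lie outside the $\varepsilon$ tube; (2) the dual variables are mutually exclusive ($\alpha_i^*\alpha_i = 0$), since nonzero values of both dual variables would require nonzero slack variables in both directions; and (3) for $\alpha_i^*$ and $\alpha_i \in (0, C)$ it follows that $\xi_i = \xi_i^* = 0$, i.e., the $z_i$ lie on the $\varepsilon$ tube. Since the second factor in (A11) and (A12) then has to vanish to satisfy the KT condition, this result allows the estimation of b. Even though a single $x_i$ would be enough to solve the problem, in practice one uses the average over all the support vectors that lie on the $\varepsilon$ tube for the purpose of assuring stability [Müller et al., 1999]. Thus the proper formulation for estimating b is

$$
b = \begin{cases}
\dfrac{1}{M}\displaystyle\sum_{m=1}^{M}\big(z_m - \langle w, x_m\rangle - \varepsilon\big), & \text{for } \alpha_m \in (0, C), \\[2ex]
\dfrac{1}{M}\displaystyle\sum_{m=1}^{M}\big(z_m - \langle w, x_m\rangle + \varepsilon\big), & \text{for } \alpha_m^* \in (0, C),
\end{cases} \tag{A14}
$$

where M is the number of sample points on the $\varepsilon$ tube.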
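A two-sample problem that can be solved by hand gives a quick check of equations (A6), (A9a), (A9c), and (A14). Assume a linear kernel, samples x = −1, +1 with z = −1, +1, ε = 0.5, and C = 7 (all hypothetical values). By symmetry the only nonzero multipliers are α₂ = α₁* = t, and (A9a) reduces to t(2 − 2ε) − 2t², which is maximized at t = 0.25. The sketch below, with hypothetical helper names, recovers w, b, and a prediction from these multipliers.

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def weights(alpha, alpha_star, X):
    """Equation (A6): w = sum_i (alpha_i - alpha_i*) x_i."""
    return [sum((a - s) * x[j] for a, s, x in zip(alpha, alpha_star, X))
            for j in range(len(X[0]))]

def dual_objective(alpha, alpha_star, X, Z, eps):
    """Equation (A9a) with the plain inner product as kernel."""
    beta = [a - s for a, s in zip(alpha, alpha_star)]
    quad = sum(bi * bj * dot(xi, xj)
               for bi, xi in zip(beta, X) for bj, xj in zip(beta, X))
    return (-eps * sum(a + s for a, s in zip(alpha, alpha_star))
            + sum(z * bt for z, bt in zip(Z, beta)) - 0.5 * quad)

def bias(alpha, alpha_star, X, Z, w, eps, C):
    """Equation (A14): average b over samples lying on the eps tube."""
    estimates = []
    for a, s, x, z in zip(alpha, alpha_star, X, Z):
        if 0.0 < a < C:
            estimates.append(z - dot(w, x) - eps)
        if 0.0 < s < C:
            estimates.append(z - dot(w, x) + eps)
    return sum(estimates) / len(estimates)

X = [[-1.0], [1.0]]
Z = [-1.0, 1.0]
eps, C = 0.5, 7.0
alpha, alpha_star = [0.0, 0.25], [0.25, 0.0]   # hand-derived dual optimum

w = weights(alpha, alpha_star, X)              # [0.5]
b = bias(alpha, alpha_star, X, Z, w, eps, C)   # 0.0
prediction = dot(w, [2.0]) + b                 # equation (A9c) at x = 2
print(w, b, prediction)
```

Both samples sit exactly on the ε tube (residuals of ±0.5), so both are support vectors with multipliers strictly between 0 and C, and each yields the same estimate of b.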


[48] Acknowledgments. We are grateful for the thoughtful review and suggested improvements by two anonymous reviewers that helped improve this manuscript. We especially thank the first reviewer for her/his substantive guidance in improving the manuscript.

References

Angulo, M., and W. H. Tang (1999), Optimal groundwater detection monitoring system design under uncertainty, J. Geotech. Geoenviron. Eng., 125, 510–517.
Asefa, T., and M. W. Kemblowski (2002), Support vector machines approximation of flow and transport models in initial groundwater contamination network design, Eos Trans. AGU, 83(47), Fall Meet. Suppl., Abstract H72D-0882.
Associated Earth Sciences, Inc. (1994), Wellhead protection plan for the city of Everson, Whatcom County, Washington, report, Kirkland, Wash.
Associated Earth Sciences, Inc. (1995), Wellhead protection program, Sumas, Washington, for city of Sumas, report, Kirkland, Wash.
Ben-Jemaa, F., M. A. Marino, and H. A. Loaiciga (1994), Multivariate geostatistical design of groundwater monitoring networks, J. Water Resour. Plann. Manage., 120, 505–522.
Cameron, K., and P. Hunter (2000), Optimization of LTM networks using GTS: Statistical approaches to spatial and temporal redundancy, report, Air Force Cent. for Environ. Excell., Brooks AFB, Tex.
Cieniawski, S. E., J. W. Eheart, and S. R. Ranjithan (1995), Using genetic algorithms to solve a multiobjective groundwater monitoring problem, Water Resour. Res., 31, 399–409.
Cox, S. E., and S. C. Kahle (1999), Hydrogeology, ground water quality, and sources of nitrate in lowland glacial aquifer of Whatcom County, Washington, and British Columbia, Canada, U.S. Geol. Surv. Water Resour. Invest. Rep., 98-4195.
Datta, B., and S. D. Dhiman (1996), Chance-constrained optimal monitoring network design for pollutants in groundwater, J. Water Resour. Plann. Manage., 122, 180–188.
Dibike, B. Y., S. Velickov, D. Solomatine, and B. M. Abbot (2001), Model induction with support vector machines: Introduction and applications, J. Comput. Civ. Eng., 15, 208–216.
Gangopadhyay, S., A. D. Gupta, and M. H. Nachabe (2001), Evaluation of groundwater monitoring network by principal component analysis, Ground Water, 39, 181–191.
GeoEngineers Hydrogeologic Services (1994), Wellhead protection study: Dodson’s IGA well, Whatcom County, Washington, U.S.A., report, Bellingham, Wash.
Girosi, F. (1998), An equivalence between sparse approximation and support vector machines, Neural Comput., 10, 1455–1480.
Govindaraju, R. S., and A. R. Rao (2000), Artificial Neural Network in Hydrology, 348 pp., Kluwer Acad., Norwell, Mass.
Hastie, T., R. Tibshirani, and J. Friedman (2001), The Elements of Statistical Learning: Data Mining, Inference and Prediction, Springer-Verlag, New York.
Hudak, P. F., and H. A. Loaiciga (1992), A location modeling approach for groundwater monitoring network augmentation, Water Resour. Res., 28, 643–649.
Jardine, K., L. Smith, and T. Clemo (1996), Monitoring networks in fractured rocks: A decision analysis approach, Ground Water, 34, 504–518.
Jones, M. A. (1999), Geologic framework for the Puget Sound aquifer system, Washington and British Columbia, U.S. Geol. Surv. Prof. Pap., 1424-C.
Journel, A., and C. Huijbregts (1978), Mining Geostatistics, Academic, San Diego, Calif.
Kaneviski, M., A. Pozdnukhov, S. Canu, and M. Maignan (2000), Advanced spatial data analysis and modeling with support vector machines, Int. J. Fuzzy Syst., 4, 606–615.
Knopman, D. S., C. I. Voss, and S. P. Garabedian (1991), Sampling design for groundwater solute transport: Tests of methods and analysis of Cape Cod tracer test data, Water Resour. Res., 27, 925–949.
Liong, S. Y., and C. Sivapragasam (2000), Flood stage forecasting with SVM, J. Am. Water Resour. Assoc., 38, 173–186.
Loaiciga, H. A., R. J. Charbeneau, L. G. Everett, G. E. Fogg, B. F. Hobbs, and S. Rouhani (1992), Review of groundwater quality monitoring network design, J. Hydrol. Eng., 118, 11–37.
Mahar, P. S., and B. Datta (1997), Optimal monitoring network and groundwater pollution source identification, J. Water Resour. Plann. Manage., 123, 199–207.
Massmann, J., and R. A. Freeze (1987a), Groundwater contamination from waste management sites: The interaction between risk-based engineering design and regulatory policy: 1. Methodology, Water Resour. Res., 23, 351–367.
Massmann, J., and R. A. Freeze (1987b), Groundwater contamination from waste management sites: The interaction between risk-based engineering design and regulatory policy: 2. Results, Water Resour. Res., 23, 368–380.
Meyer, P. D., and E. D. Brill Jr. (1988), A method for locating wells in a groundwater monitoring network under conditions of uncertainty, Water Resour. Res., 24, 1277–1282.
Meyer, P. D., A. J. Valocchi, and J. W. Eheart (1994), Monitoring network design to provide initial detection of groundwater contamination, Water Resour. Res., 30, 2647–2659.
Minsker, B., and Task Committee (2003), Long-term groundwater monitoring design: State of the art applications, report, Am. Soc. of Civ. Eng., Reston, Va.
Molina, G. R., J. J. Beauchamp, and T. Wright (1996), Determining an optimal sampling frequency for measuring bulk temporal changes in groundwater quality, Ground Water, 34, 579–587.
Montas, H. J., R. H. Mohtar, A. E. Hassan, and F. AlKhad (2000), Heuristic space-time design of the monitoring wells for contaminant plume characterization in stochastic flow fields, J. Contamin. Hydrol., 43, 271–301.
Morisawa, S., and Y. Inoue (1991), Optimum allocation of monitoring wells around a solid-waste landfill site using precursor indicators and fuzzy utility functions, J. Contamin. Hydrol., 7, 337–370.
Müller, K. R., A. Smola, G. Rätsch, B. Schölkopf, J. Kohlmorgen, and V. Vapnik (1999), Predicting time series with support vector machines, in Advances in Kernel Methods: Support Vector Learning, edited by B. Schölkopf, C. J. C. Burges, and A. J. Smola, pp. 243–254, MIT Press, Cambridge, Mass.
Nunes, L. M., E. Paralta, M. C. Cunha, and L. Ribeiro (2004a), Groundwater nitrate monitoring network optimization with missing data, Water Resour. Res., 40, W02406, doi:10.1029/2003WR002469.
Nunes, L. M., M. C. Cunha, and L. Ribeiro (2004b), Groundwater monitoring network optimization with redundancy reduction, J. Water Resour. Plann. Manage., 130, 33–43.
Poggio, T., and F. Girosi (1998a), A sparse representation for function approximation, Neural Comput., 10, 1445–1454.
Poggio, T., and F. Girosi (1998b), Notes on PCA, regularization, sparsity and support vector machines, AI Memo. 1632, CBCI Pap. 161, Mass. Inst. of Technol., Cambridge.
Reed, P., and B. S. Minsker (2004), Striking the balance: Long-term groundwater monitoring design for conflicting objectives, J. Water Resour. Plann. Manage., 130, 140–149.
Reed, P., B. Minsker, and A. J. Valocchi (2000), Cost-effective long-term groundwater monitoring design using a genetic algorithm and global mass interpolation, Water Resour. Res., 36, 3731–3741.
Reed, P., B. S. Minsker, and D. E. Goldberg (2001), A multiobjective approach to cost effective long-term groundwater monitoring using an Elitist Nondominated Sorted Genetic Algorithm with historical data, J. Hydroinformatics, 3, 71–90.
Reed, P., B. S. Minsker, and D. E. Goldberg (2003), Simplifying multiobjective optimization: An automated design methodology for the nondominated sorted genetic algorithm–II, Water Resour. Res., 39(7), 1196, doi:10.1029/2002WR001483.
Rouhani, S. (1985), Variance reduction analysis, Water Resour. Res., 21, 837–846.
Saunders, C., M. O. Stitson, J. Weston, L. Bottou, B. Schölkopf, and A. Smola (1998), Support vector machine reference manual, Tech. Rep. CSD-TR-98-03, Royal Holloway Univ. of London, London.
Schölkopf, B., J. C. Burges, and A. Smola (1999), Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, Mass.
Storck, P., J. W. Eheart, and A. J. Valocchi (1997), A method for the optimal location of monitoring wells for detection of groundwater contamination in three-dimensional heterogeneous aquifers, Water Resour. Res., 33, 2081–2088.
Tikhonov, A., and V. Arsenin (1977), Solution of Ill-Posed Problems, W. H. Winston, Washington, D. C.
Vanderbei, R. J. (1994), LOQO: An interior point code for quadratic programming, Rep. TRSOR-94-15, Stat. and Oper. Res. Princeton Univ., Princeton, N. J.
Vapnik, V. (1995), The Nature of Statistical Learning Theory, Springer-Verlag, New York.
Vapnik, V. (1998), Statistical Learning Theory, John Wiley, Hoboken, N. J.
Wagner, B. J. (1995), Sampling design methods for groundwater modeling under uncertainty, Water Resour. Res., 31, 2581–2591.
Wahba, G. (1990), Spline Models for Observation Data, Ser. Appl. Math., vol. 59, Soc. for Indust. and Appl. Math., Philadelphia, Pa.
Water Resources Consulting, LLC (1997), Wellhead protection program, report, Pole Road Water Assoc., Whatcom County, Wash.





















T. Asefa, M. W. Kemblowski, A. Khalil, M. McKee, and G. Urroz, Department of Civil and Environmental Engineering, Utah State University, Logan, UT 84322, USA. ([email protected]; [email protected]; akhalil@cc.usu.edu; [email protected]; [email protected])
