Application of Indicator and Ordinary Kriging for Modeling of Groundwater Chloride

Document Type : Research Paper

Authors

1 Associate Professor, Department of Water Engineering, Faculty of Water and Soil, University of Zabol, Iran

2 Assistant Professor, Department of Water Engineering, Faculty of Water and Soil, University of Zabol, Iran

Abstract

Introduction
The main water-quality problems for agriculture are salinity, reduced soil infiltration and specific ion toxicity. Specific ions from irrigation water accumulate in plants and reduce crop yields. One of the most common specific ion toxicities results from a high concentration of the chloride (Cl) ion. Knowledge of the spatial distribution of Cl concentration in groundwater is therefore needed for better management of groundwater resources. Because only a limited number of samples is usually available, an appropriate interpolation method is needed to estimate values between the sample points. Ordinary Kriging (OK) is a geostatistical estimation method that uses a semivariogram model to predict values at unsampled locations. However, it cannot properly predict the spatial distribution pattern of highly skewed data. Moreover, the OK estimation variance is not a good measure of local uncertainty because it depends only on the data configuration, not on the data values. Unlike OK, Indicator Kriging (IK) is a distribution-free approach that models the local uncertainty of the estimates by estimating a conditional cumulative distribution function (ccdf) at each point. Although many researchers have used this method to map and model the local uncertainty of environmental variables such as groundwater Cl and other groundwater quality parameters, it has rarely been employed for such purposes in Iran. The objective of this study is, therefore, to model the local uncertainty of groundwater chloride over the Kerman plain using IK. The performance of IK is compared with that of the traditionally used OK, applied to both raw and log-transformed data.
 
Materials & Methods
 
Study area and sample data
This study was performed in the Kerman plain, which is located in an arid to semiarid region. Its average elevation is 1755 m above sea level. Because surface water resources are lacking, groundwater is the main source of water for agriculture in this area. Given the importance of specific ion toxicity, groundwater samples were collected from 76 agricultural wells and their Cl concentrations were measured in the laboratory.
 
Geostatistical analysis
First, experimental (indicator) semivariograms are calculated to investigate the spatial variability of the (within-class) Cl data. A suitable theoretical model is then fitted to the experimental values for kriging of Cl. IK is then used to map groundwater Cl and to evaluate the uncertainty attached to the estimates. The results are compared with those obtained from OK and lognormal ordinary kriging (LOK). Probability maps of not exceeding two threshold values, 10 and 20 meq/L, are generated for Cl by IK. These two thresholds are selected according to the irrigation water quality guidelines proposed by Ayers and Westcot (1989). The geostatistical tools and methods used in this study are briefly described below.
 
Semivariogram
The semivariogram quantifies how the dissimilarity between observed values changes as the separation distance between sample points increases. In practice, the experimental semivariogram, γ*(h), is computed from pairs of values separated by a lag distance h as follows:
 
$$\gamma^*(h) = \frac{1}{2N} \sum_{i=1}^{N} \left[ z(x_i) - z(x_i + h) \right]^2 \qquad (1)$$
 
where N is the number of pairs of observations z(xi) and z(xi+h) separated by a distance h in a specific direction. Kriging requires semivariogram values for any given lag; therefore, a theoretical model is fitted to the experimental values and the parameters of this model are used.
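
The experimental semivariogram described above can be illustrated with a minimal Python sketch. It is an illustration only, not the GSLIB routine used in the study: it is omnidirectional (direction is ignored), the function name `experimental_semivariogram` and the argument `lag_edges` are hypothetical, and pairs are grouped into distance bins because irregularly spaced wells are rarely separated by exactly h.

```python
import numpy as np

def experimental_semivariogram(coords, values, lag_edges):
    """Omnidirectional experimental semivariogram (illustrative sketch).

    coords    : (n, 2) sample coordinates
    values    : (n,) measured values z(x_i)
    lag_edges : increasing bin edges for the separation distance h
    Returns the mean lag, the semivariance and the pair count of each
    non-empty bin.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)      # every pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    lags, gammas, counts = [], [], []
    for lo, hi in zip(lag_edges[:-1], lag_edges[1:]):
        mask = (dist > lo) & (dist <= hi)
        n_pairs = int(mask.sum())
        if n_pairs == 0:
            continue                               # skip empty lag classes
        lags.append(dist[mask].mean())
        gammas.append(sqdiff[mask].sum() / (2.0 * n_pairs))
        counts.append(n_pairs)
    return np.array(lags), np.array(gammas), np.array(counts)
```

The pair count per bin is returned because semivariance values based on very few pairs are unreliable and are often given less weight when a theoretical model is fitted.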
 
Ordinary Kriging
In OK, the value at an unsampled location is estimated as a linear weighted moving average of the values at sampled locations:
 
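$$z^*(x_0) = \sum_{i=1}^{n} \lambda_i \, z(x_i) \quad \text{with} \quad \sum_{i=1}^{n} \lambda_i = 1 \qquad (2)$$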
 
where z*(x0) is the estimated value of the variable of interest at the unsampled location x0, λi is the weight assigned to the known value of the variable at location xi, determined from a semivariogram model, and n is the number of neighboring observations. OK produces an estimation variance for every estimate, which can be used to construct a confidence interval for each estimate under the assumption of normally distributed errors. OK performed on log-transformed data is called Lognormal Ordinary Kriging (LOK). The LOK estimates have to be back-transformed to the original space at the end.
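
As a hedged illustration of how the OK weights and the estimation variance follow from a semivariogram model, the sketch below solves the ordinary kriging system for one target location. It is not the GSLIB implementation used in the study: the function names, the use of all samples as neighbors, and the semivariogram parameters in the usage example are assumptions made for the example.

```python
import numpy as np

def spherical(h, nugget, sill, rng):
    """Spherical semivariogram; 'sill' is the total sill, 'rng' the range."""
    h = np.asarray(h, dtype=float)
    s = np.minimum(h / rng, 1.0)                  # model is flat beyond the range
    gamma = nugget + (sill - nugget) * (1.5 * s - 0.5 * s ** 3)
    return np.where(h > 0, gamma, 0.0)            # gamma(0) = 0 by definition

def ordinary_kriging(coords, values, x0, gamma):
    """Estimate z*(x0) with all samples as neighbors.

    gamma : callable returning semivariogram values for an array of lags.
    Returns (estimate, estimation variance, weights).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # Kriging system: semivariogram matrix bordered by the unbiasedness
    # constraint (weights sum to one), with a Lagrange multiplier mu.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = gamma(np.linalg.norm(coords - np.asarray(x0, dtype=float), axis=1))
    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]
    estimate = weights @ values
    variance = weights @ b[:n] + mu
    return estimate, variance, weights

# Usage with hypothetical wells (coordinates in km) and hypothetical
# model parameters; none of these numbers come from the study.
model = lambda h: spherical(h, nugget=0.0, sill=1.0, rng=30.0)
wells = [[0, 0], [10, 0], [0, 10], [10, 10]]
cl = [1.0, 3.0, 2.0, 6.0]
est, var, w = ordinary_kriging(wells, cl, [5, 5], model)
```

Note that the right-hand side and the matrix contain only semivariogram values of distances, which is why the OK variance depends on the data configuration and not on the measured values.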
 
Indicator Kriging
Indicator Kriging (IK) is based on coding the random function Z(x) into a set of K indicator random functions I(x; zk) corresponding to different cutoffs zk:
 
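$$I(x; z_k) = \begin{cases} 1 & \text{if } Z(x) \le z_k \\ 0 & \text{otherwise} \end{cases} \qquad k = 1, \dots, K \qquad (3)$$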
 
After the observed data are transformed to a new set of indicator variables, an experimental semivariogram is calculated for the indicators at each cutoff zk. The conditional cumulative distribution function (ccdf) at each unsampled location, e.g. x0, is then obtained by the IK estimator:
 
$$I^*(x_0; z_k) = \sum_{i=1}^{n} \lambda_i \, I(x_i; z_k) \qquad (4)$$
 
where I*(x0; zk) is the estimated indicator transform at the unsampled location x0 and λi are the weights assigned to the indicator transform I at location xi. These discrete probabilities must be interpolated within each class and extrapolated beyond the minimum and maximum values to provide a continuous ccdf that covers the whole possible range of the variable. E-type estimates, which are comparable with OK estimates, can be computed by post-processing the IK-based ccdfs. Local uncertainty measures, e.g. the conditional variance and probability maps, are also obtained by post-processing the IK-based ccdfs. In this study, the kriging methods are performed using the software package GSLIB.
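
The post-processing step can be sketched as follows. This is a simplified illustration under stated assumptions, not the GSLIB post-processing program: the ccdf is assumed to be linear within each class and in both tails, the bounds `zmin` and `zmax` are hypothetical inputs chosen by the user, and all function names are invented for the example. Because IK estimates at successive cutoffs are obtained independently, they are first clipped to [0, 1] and made non-decreasing.

```python
import numpy as np

def correct_order_relations(ccdf):
    """Clip to [0, 1] and enforce a non-decreasing ccdf by averaging
    an upward and a downward monotone correction."""
    p = np.clip(np.asarray(ccdf, dtype=float), 0.0, 1.0)
    up = np.maximum.accumulate(p)
    down = np.minimum.accumulate(p[::-1])[::-1]
    return 0.5 * (up + down)

def _piecewise_ccdf(ccdf, thresholds, zmin, zmax):
    """Class edges and cumulative probabilities, including both tails."""
    p = correct_order_relations(ccdf)
    edges = np.concatenate(([zmin], thresholds, [zmax]))
    cum = np.concatenate(([0.0], p, [1.0]))
    return edges, cum

def etype_and_variance(ccdf, thresholds, zmin, zmax):
    """E-type estimate (ccdf mean) and conditional variance, assuming a
    uniform distribution of values inside each class."""
    edges, cum = _piecewise_ccdf(ccdf, thresholds, zmin, zmax)
    prob = np.diff(cum)                            # probability of each class
    lo, hi = edges[:-1], edges[1:]
    mean = np.sum(prob * 0.5 * (lo + hi))
    second = np.sum(prob * (lo ** 2 + lo * hi + hi ** 2) / 3.0)
    return mean, second - mean ** 2

def prob_not_exceeding(ccdf, thresholds, z, zmin, zmax):
    """Probability that the variable does not exceed z at this location."""
    edges, cum = _piecewise_ccdf(ccdf, thresholds, zmin, zmax)
    return float(np.interp(z, edges, cum))
```

Applying `prob_not_exceeding` at every grid node for a fixed critical value yields a probability map; the choice of tail model can noticeably affect the E-type estimate for strongly skewed data, so the linear tails used here should be treated as a placeholder.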
 
Evaluation of the results
Cross-validation with the comparison criteria Mean Error (ME), Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) is used to evaluate the performance of the methods. The most accurate method is the one with the smallest MAE and RMSE and with an ME close to zero.
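
For reference, the three criteria can be computed from paired observed and cross-validated values as in the short sketch below. The function name `cv_scores` is hypothetical, and the error is taken here as estimate minus observation, which is an assumed sign convention.

```python
import numpy as np

def cv_scores(observed, predicted):
    """Return ME, MAE and RMSE for cross-validation results."""
    err = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    return {
        "ME": err.mean(),                    # bias: close to zero is better
        "MAE": np.abs(err).mean(),           # average error magnitude
        "RMSE": np.sqrt((err ** 2).mean()),  # penalizes large errors more
    }
```

ME alone can be misleading because positive and negative errors cancel, which is why it is read together with MAE and RMSE.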
 
Results and Discussion
Statistical analysis shows that the Cl data distribution is strongly positively skewed. A logarithmic transform is used to bring the frequency distribution of the data closer to normal. Experimental semivariograms are calculated for the raw and log-transformed data. For IK, nine thresholds of 1.3, 2.4, 3.4, 4.4, 5, 7.4, 12.7, 18 and 24 meq/L, corresponding to 10, 20, 30, 40, 50, 60, 70, 80 and 90 percent of the Cl cumulative frequency distribution, are selected. The Cl observations are then coded according to these thresholds and the indicator semivariograms are computed.
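
The indicator coding at the nine thresholds listed above can be sketched in a few lines of Python. The thresholds are those quoted in the text; the sample concentrations and the function name `indicator_transform` are hypothetical and serve only to show the shape of the coded data.

```python
import numpy as np

# Thresholds (meq/L) quoted in the text above.
THRESHOLDS = np.array([1.3, 2.4, 3.4, 4.4, 5.0, 7.4, 12.7, 18.0, 24.0])

def indicator_transform(values, thresholds=THRESHOLDS):
    """Return an (n_samples, n_thresholds) array of 0/1 codes,
    equal to 1 where the value does not exceed the threshold."""
    values = np.asarray(values, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    return (values[:, None] <= thresholds[None, :]).astype(int)

# Hypothetical Cl concentrations (meq/L), for illustration only.
sample_cl = [0.9, 4.4, 15.0, 30.0]
codes = indicator_transform(sample_cl)
```

Each column of `codes` is one indicator variable, and an indicator semivariogram is computed from each column separately.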
The semivariogram analysis shows that the chloride values and their logarithms are strongly correlated in space, and that the best-fitting semivariogram model is spherical, with ranges of influence of 42 and 72 km, respectively. The spatial correlation is stronger for the log-transformed data. The Cl data have a moderate to strong within-class spatial correlation, and the indicator semivariograms mostly follow a spherical structure. Furthermore, the cross-validation results indicate that LOK, with smaller RMSE and MAE and a higher correlation coefficient (R), is more accurate than IK for estimating groundwater chloride.
Besides estimation, one of the main aims of this study is to model the local uncertainty of the Cl data over the study area. Both OK and LOK quantify the uncertainty attached to each Cl estimate through its estimation variance: where the estimation variance (or standard deviation) is smaller, the estimated Cl value is more certain. The results show that the OK and LOK estimation variances are related mainly to the sampling configuration, not to the actual data values. In contrast, the IK conditional variance shows some relation with the sample values in addition to the sampling locations.
In addition, the IK conditional variance represents the estimation error better than the OK (and LOK) variance. Moreover, the probability maps show that the probability of chloride exceeding the critical thresholds of 10 and 20 meq/L is higher in the northwest and west of the study area.
 
Conclusion
In this study, non-linear indicator kriging is used to model the local uncertainty attached to Cl concentration estimates, and ordinary (log) kriging is used to map the spatial distribution of Cl. The results show that ordinary log kriging, which is faster and mathematically simpler than indicator kriging, provides more accurate Cl estimates. The correlation between the indicator kriging conditional variance and the estimation error is stronger than the correlation between the ordinary (log) kriging variance and the estimation error. This means that the IK conditional variance is a more appropriate measure of the estimation error than the OK (LOK) variance. In addition, IK produces probability maps of not exceeding a critical threshold for Cl concentration. These maps can be useful in many decision-making processes, e.g. water resource management.

Keywords

Main Subjects