Comparison of linear and hybrid models in predicting the distribution of heavy metals using remote sensing and spatial analysis in East of Zanjan city

Document Type : Research Paper

Authors

1 Ph.D student of soil genesis, classification and evaluation, Department of soil science, University of Zanjan

2 Associate Professor of soil genesis, classification and evaluation, Department of soil science, University of Zanjan

3 Professor

Abstract

Introduction:
The accumulation of heavy metals in soil has been regarded as an important global environmental issue in recent decades, and considerable effort has been made to prevent its detrimental effects on ecosystem cycles. Conventional methods for assessing the spatial distribution of soil heavy metals require extensive soil sampling and laboratory analysis, which makes them time-consuming and costly. Furthermore, quick and reliable monitoring of heavy metal concentrations is crucial for near-real-time management of polluted regions. Remote sensing and satellite imagery have the potential to provide quick, non-destructive and low-cost tools for predicting and mapping the distribution of soil heavy metals. Recently, intelligent models such as artificial neural networks and genetic algorithms have shown good capabilities for modeling with satellite images and for applying these models to the surrounding environment. The aims of this study were to evaluate a linear model and a hybrid algorithm for predicting the spatial distribution of soil heavy metal concentrations and to study the factors responsible for that distribution, using remote sensing.

Materials and Methods:

The rich lead and zinc mines of the Angouran region in Zanjan province, which are unique in the Middle East, have led to a concentration of lead and zinc industries in the province. One of the main manufacturers is the Iranian national lead company, located 13 km east of Zanjan in the Dizajabad region. The major activities of this company are the processing and extraction of lead and zinc from soils and stone powder containing these elements. Soil samples (n = 300) were collected from the 0 to 5 cm depth on a 250 m grid in industrial and agricultural areas and on a 500 m grid in bare lands. The samples were air-dried and passed through a 2 mm sieve. Total (t) forms of zinc, lead, cadmium and copper were determined for each sample. At each sampling point, the mean digital number was calculated in MATLAB by averaging the image pixel values within a 30 m radius. To quantify the relationships between spectral parameter values and metal levels in the study area, stepwise linear regression and a back-propagation artificial neural network-genetic algorithm were applied. After descriptive statistics were computed and the data were normalized, the concentrations of Zn, Pb, Cd and Cu were modeled using stepwise multivariate linear regression and a neural network model combined with a genetic algorithm. The neural network-genetic algorithm modeling used feed-forward multilayer perceptron neural networks with a sigmoid transfer function. The input layer of the artificial neural network consisted of seven neurons carrying the satellite data, and the output layer consisted of one neuron representing the heavy metal concentration. Spatial autocorrelation analysis was used to identify heavy metal sources and to locate their hotspots. Autocorrelation analysis describes the spatial properties of a variable in a region and reflects the mean spatial difference between each cell and its neighboring cells.
To investigate spatial autocorrelation, the local Moran's I was employed to identify the presence of clusters. The models were validated using the root mean square error and the coefficient of determination (R2). The prediction maps were produced with the models that had the lowest root mean square error and the highest R2. Analysis of the satellite images and mapping of the heavy metals were conducted using ArcGIS software version 10.
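The pixel-averaging step described in the methods (mean digital number within a 30 m radius of each sampling point) can be illustrated with a minimal Python sketch. The study itself used MATLAB; the function name `mean_dn_within_radius`, the 30 m pixel size and the toy 5 x 5 band are assumptions made only for illustration, and the sampling point is assumed to have been converted to a row and column index already.

```python
import math

def mean_dn_within_radius(band, row, col, radius_m=30.0, pixel_size_m=30.0):
    """Mean digital number of pixels whose centres lie within radius_m of (row, col).

    band is a list of rows (a 2-D grid of digital numbers); pixel_size_m is an
    assumed ground resolution, not a value taken from the study.
    """
    reach = math.ceil(radius_m / pixel_size_m)  # half-width of the search window in pixels
    values = []
    for r in range(max(row - reach, 0), min(row + reach, len(band) - 1) + 1):
        for c in range(max(col - reach, 0), min(col + reach, len(band[0]) - 1) + 1):
            # Keep the pixel only if its centre falls inside the circular window.
            if math.hypot(r - row, c - col) * pixel_size_m <= radius_m:
                values.append(band[r][c])
    return sum(values) / len(values)

# Hypothetical 5 x 5 band whose digital numbers are 0..24, row by row.
band = [[float(5 * r + c) for c in range(5)] for r in range(5)]
centre_mean = mean_dn_within_radius(band, 2, 2)
corner_mean = mean_dn_within_radius(band, 0, 0)
```

With a 30 m pixel and a 30 m radius, the window holds the centre pixel and its four edge neighbours; diagonal neighbours lie about 42 m away and are excluded. At image edges the window is simply clipped.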

Results and Discussion:
Descriptive statistics were obtained from the chemical analysis of heavy metal concentrations in the soil of the study area. The results indicated that the average concentrations of Pb (t), Zn (t), Cd (t) and Cu (t) were 354.98, 501.10, 1.92 and 12.69 mg kg-1, respectively. According to the standards of the Department of Environment of Iran, the mean concentrations of lead and zinc fell in the risk class, whereas the mean concentrations of Cd and Cu fell in the no-risk class. For the training data, the root mean square errors of the neural network-genetic algorithm model for Pb, Zn, Cd and Cu were 0.07, 0.09, 0.17 and 0.17, respectively, and those of the multivariate stepwise linear regression model were 0.45, 0.32, 0.48 and 0.54, respectively. The test results showed similar trends. The coefficients of determination of the artificial neural network-genetic algorithm model for Pb, Zn, Cd and Cu were 0.88, 0.80, 0.75 and 0.45, and those of the multivariate stepwise linear regression model were 0.53, 0.43, 0.44 and 0.43, respectively. The training and test errors of the neural network-genetic algorithm hybrid model were lower than the corresponding values of the multivariate stepwise linear regression. Conversely, the coefficients of determination of the hybrid model were higher than those of the multivariate stepwise linear regression. These results indicated that the hybrid model predicted heavy metals, especially at high concentrations, better than the linear model. The neural network-genetic algorithm model estimated soil heavy metal contents from satellite imagery data with acceptable accuracy in this study and was used to produce the predicted distribution maps of heavy metals in the area.
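The two validation statistics compared above can be written out explicitly. The following Python sketch uses the common definitions of root mean square error and of the coefficient of determination as one minus the ratio of residual to total sum of squares; the abstract does not state which R2 formula was used, so this choice is an assumption, and the sample numbers are invented for illustration only.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(observed))

def r_squared(observed, predicted):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Invented normalized concentrations, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
example_rmse = rmse(observed, predicted)
example_r2 = r_squared(observed, predicted)
```

A lower root mean square error and a higher R2 on both training and test data are the criteria by which the hybrid model is judged better than the linear model here.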
According to the predicted maps, the high-risk areas for Pb, Cd, Zn and Cu covered 50.01, 2.00, 0.2 and 0.04 percent of the study area, respectively. The spatial autocorrelation results can be used for source identification and for finding the agents that disturb heavy metal levels. At the chosen confidence level, Moran values significantly greater than zero indicate a positive correlation among clusters of cells, and Moran values significantly smaller than zero indicate a negative correlation between adjacent cells. Values close to 1 indicate very little difference between cells, and values close to -1 indicate large spatial differences. The Moran index revealed strong hotspots of Pb and Cd around the industrial zones in the study area. The spatial pattern showed that Pb concentrations were affected by the prevailing wind, and the Zn distribution maps showed the concentration of this element around industrial installations and streams. This indicated that industrial activities strongly affect heavy metal distributions and might result in surface water and groundwater contamination.
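The sign interpretation of the local Moran statistic given above can be checked on a toy example. The sketch below computes a local Moran's I with row-standardized neighbour weights for values along a hypothetical transect; the function name, the neighbour lists and the data are assumptions for illustration, the study itself worked in ArcGIS, and the significance test that would normally accompany the statistic is omitted.

```python
def local_morans_i(values, neighbours):
    """Local Moran's I for each cell, using row-standardized neighbour weights.

    values: one measurement per cell; neighbours: for each cell, a list of the
    indices of its neighbouring cells (a hypothetical adjacency structure).
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]       # deviations from the global mean
    m2 = sum(d * d for d in z) / n       # second moment (population variance)
    result = []
    for i, nbrs in enumerate(neighbours):
        # Spatial lag: average deviation of the neighbouring cells.
        lag = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        result.append(z[i] * lag / m2)
    return result

# Four cells in a line; each cell neighbours the adjacent cells.
chain = [[1], [0, 2], [1, 3], [2]]
clustered = local_morans_i([10.0, 9.0, 1.0, 2.0], chain)    # similar values side by side
alternating = local_morans_i([1.0, 9.0, 1.0, 9.0], chain)   # high and low values interleaved
```

In the clustered transect the end cells, which sit next to similar values, receive positive statistics, while in the alternating transect every cell receives a negative statistic, which matches the reading of positive and negative Moran values in the text.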

Conclusion:
This study applied the artificial neural network-genetic algorithm and multivariate stepwise linear regression models to predict heavy metal distribution under different land uses from a Landsat image. For the studied heavy metals, the root mean square errors of the training data were lower for the artificial neural network-genetic algorithm model than for the linear model. The coefficients of determination of the hybrid model in training and testing were higher than those of the linear model. The results showed the success of the artificial neural network-genetic algorithm model in predicting the heavy metal distribution using remote sensing techniques. The hybrid model also showed a remarkable ability to estimate heavy metals at high concentrations. The concentrations of Cd, Cu, Pb and Zn decreased with increasing distance from the industrial installations. Moreover, the regional hotspots of Cu and Cd were similarly distributed and lay close to the factories. In addition, Pb concentrations were affected by wind direction, and streams affected Zn transport in the study area.

Keywords

Main Subjects