Application of Symbolic regression and Geographic Information System in Kheyroud watershed to provide spatial models that affect the surface of the landscape

Document Type : Research Paper


1 Department of Environment, University of Tehran, Iran

2 Faculty of Forestry and Environmental Management, University of New Brunswick, New Brunswick,


Hyrcania is a highly productive forest along the southern coast of the Caspian Sea (northern Iran). The forests are mostly uneven-aged hardwood mixtures dominated by oriental beech (Fagus orientalis). They often include Carpinus betulus, Alnus subcordata, Acer velutinum, and several other tree species and shrubs. The forests are mostly broadleaved, but Taxus baccata and Cupressus spp. do appear on some specialized sites. They are home to about 80 tree species and 50 shrub species. Hyrcanian forests have multiple ecological functions, including (i) the production of wood fiber and lumber, (ii) the protection of watersheds, including their water and soils, and (iii) the conservation of biodiversity.
Biodiversity has become a primary focal point in deliberations on sustainability worldwide, as a result of the rampant decline and degradation of natural environments caused by urbanization, unrestrained resource extraction, and wanton disregard for nature. Furthermore, global climate change increases the need to incorporate substantial knowledge of biodiversity and ecosystem functionality into contemporary forest management plans, which is not always easy to achieve. In this chapter, we develop a computational framework that relates measures of tree diversity (based on actual field surveys) to modelled physical (abiotic) variables. Here, we calculate tree diversity using the Shannon-Wiener index, an index commonly used to characterize species diversity in plant communities by accounting for both species abundance and evenness.
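As an illustration of the diversity measure named above, the following minimal sketch computes the Shannon-Wiener index, H' = -Σ p_i ln p_i, from a per-plot species tally. The function name, the use of stem counts as the abundance measure, the natural logarithm, and the example tally are assumptions made for illustration; the source does not specify them.

```python
import math

def shannon_index(species_tally):
    """Shannon-Wiener index H' = -sum(p_i * ln p_i) for one plot.

    species_tally maps species name -> abundance (here: stem count, an assumption).
    """
    counts = [n for n in species_tally.values() if n > 0]
    total = sum(counts)
    if total == 0:
        return 0.0  # empty plot: no diversity to report
    # p_i is the proportion of all stems that belong to species i
    return -sum((n / total) * math.log(n / total) for n in counts)

# Hypothetical tally for one 0.1-ha plot (not data from the study)
plot = {"Fagus orientalis": 12, "Carpinus betulus": 6, "Acer velutinum": 2}
h = shannon_index(plot)
print(round(h, 3))
```

A plot with a single species returns 0, and for a fixed number of species the index is largest when all species are equally abundant, which is how the index accounts for evenness as well as abundance.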
The plot network in the Gorazbon section is laid out on a rectangular grid (150 m × 200 m) and consists of 258 fixed-area circular plots of 0.1 ha each. Tree species richness was determined at each plot from basic tree species identification and tallying. Prominent tree species in the plots include Fagus orientalis, Carpinus betulus, Acer velutinum, Acer campestre, Alnus subcordata, Quercus castaneifolia, Parrotia persica, Tilia begonifolia, and Ulmus glabra. The total number of plots available for the current analysis was 202; many of the unused plots had missing site information, including GPS (global positioning system) coordinates, which prevented their geo-referencing.
Development of numerical surfaces
Fundamental to the spatial calculation of abiotic surfaces or their surrogates at mid-resolution is the digital terrain model (DTM) of the Gorazbon section. DTM-height data are derived from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) 30-m resolution Global Digital Elevation Model v. 2 (GDEM; last accessed in June 2014). Descriptions of the various abiotic and associated surfaces, including their proxies and their derivation, can be found in Table 1. Values of abiotic and proxy variables at forest-plot locations were summarised separately as averages of the values falling within each individual 0.1-ha plot (Fig. 1b).
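The plot-level summarising step can be sketched as follows. The source does not state how raster cells were assigned to a plot, so this sketch assumes that a cell belongs to a plot when its centre lies inside the circular plot boundary, and that the cell containing the plot centre is used when no cell centre falls inside. The function name, grid layout, and example values are hypothetical.

```python
import math

PLOT_AREA_M2 = 1000.0                              # 0.1 ha, as stated in the text
PLOT_RADIUS_M = math.sqrt(PLOT_AREA_M2 / math.pi)  # about 17.84 m

def plot_mean(grid, cell_size, x0, y0, px, py, radius=PLOT_RADIUS_M):
    """Mean of raster cells whose centres lie within `radius` of (px, py).

    grid: list of rows; row 0 is the southernmost row (assumption of this sketch).
    (x0, y0): south-west corner of the grid, in the same units as px, py.
    The plot centre is assumed to lie inside the grid extent.
    """
    values = []
    for r, row in enumerate(grid):
        cy = y0 + (r + 0.5) * cell_size
        for c, value in enumerate(row):
            cx = x0 + (c + 0.5) * cell_size
            if math.hypot(cx - px, cy - py) <= radius:
                values.append(value)
    if not values:
        # Coarse raster: no cell centre inside the plot, use the enclosing cell
        c = int((px - x0) // cell_size)
        r = int((py - y0) // cell_size)
        values.append(grid[r][c])
    return sum(values) / len(values)

# Hypothetical 4 x 4 surface on a 10-m grid; the corner cells lie outside the plot
surface = [
    [50.0, 2.0, 2.0, 50.0],
    [2.0, 2.0, 2.0, 2.0],
    [2.0, 2.0, 2.0, 2.0],
    [50.0, 2.0, 2.0, 50.0],
]
print(plot_mean(surface, 10.0, 0.0, 0.0, 20.0, 20.0))
```

With a 30-m surface, a plot of about 17.84 m radius often contains at most one cell centre, which is why the fallback to the enclosing cell is included.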
Relating plot-estimates of environmental variables to tree species richness
Symbolic regression, or symbolic function identification, is used to determine which of the independent site variables listed in Table 1 are particularly crucial in explaining spatial variability in tree species richness. Symbolic regression is a procedure founded on evolutionary computation that searches for algebraic equations while reducing the difference between target values and values calculated with the generated equations. Unlike conventional regression techniques, which determine the parameters of known equations, the approach needs no specific mathematical expression as a starting point. Rather, primary expressions are formed by randomly combining primitive base functions of input variables (linear or otherwise) with algebraic operators. Equations retained by the procedure are those that replicate the target output data better than others; undesirable solutions are rejected. The procedure stops whenever the desired accuracy in data replication has been reached. To balance the relative contribution of each plot-estimate of tree species richness in the development of a generalised expression of richness, richness values were weighted by the inverse of their occurrence (i.e., the number of times each value occurs) in the dataset. This was done to ensure that values that are not commonly observed (e.g., 7 species per 0.1 ha plot) contribute as much to the explanation of richness as values that are more frequently observed (e.g., 2-4 species per 0.1 ha plot).
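The procedure described above can be illustrated with a toy sketch. It is not the software or algorithm used in the study: it is a deliberately small, mutation-only evolutionary search (no crossover) over expression trees built from +, - and *, with an inverse-frequency weighted squared error as the fitness. All function names, the operator set, the population settings, and the example data are assumptions made for illustration.

```python
import math
import random
from collections import Counter

def inverse_frequency_weights(targets):
    """Weight each observation by 1 / (number of times its value occurs)."""
    counts = Counter(targets)
    return [1.0 / counts[t] for t in targets]

# Expression trees: ("var", index), ("const", value) or (operator, left, right)
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def random_tree(rng, n_vars, depth):
    """Randomly combine input variables and constants with algebraic operators."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("var", rng.randrange(n_vars))
        return ("const", rng.uniform(-2.0, 2.0))
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, n_vars, depth - 1), random_tree(rng, n_vars, depth - 1))

def evaluate(tree, row):
    if tree[0] == "var":
        return row[tree[1]]
    if tree[0] == "const":
        return tree[1]
    return OPS[tree[0]](evaluate(tree[1], row), evaluate(tree[2], row))

def mutate(rng, tree, n_vars):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if tree[0] in ("var", "const") or rng.random() < 0.3:
        return random_tree(rng, n_vars, 2)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(rng, left, n_vars), right)
    return (op, left, mutate(rng, right, n_vars))

def weighted_mse(tree, rows, targets, weights):
    total = 0.0
    for row, target, weight in zip(rows, targets, weights):
        diff = evaluate(tree, row) - target
        total += weight * diff * diff
    err = total / sum(weights)
    return err if math.isfinite(err) else math.inf  # reject overflowing candidates

def symbolic_search(rows, targets, weights, generations=100, pop_size=50,
                    tol=1e-9, seed=0):
    """Keep the better half each generation; stop once the error is within tol."""
    rng = random.Random(seed)
    n_vars = len(rows[0])
    fitness = lambda t: weighted_mse(t, rows, targets, weights)
    pop = [random_tree(rng, n_vars, 3) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        if fitness(pop[0]) <= tol:
            break
        survivors = pop[: pop_size // 2]  # poorer equations are rejected
        pop = survivors + [mutate(rng, rng.choice(survivors), n_vars) for _ in survivors]
    pop.sort(key=fitness)
    return pop[0]

# Hypothetical data: three made-up predictors per plot and a species count
rows = [[1.0, 2.0, 0.5], [2.0, 1.0, 1.5], [3.0, 3.0, 1.0], [0.5, 4.0, 2.0], [2.5, 0.5, 3.0]]
richness = [2, 2, 4, 3, 7]
weights = inverse_frequency_weights(richness)
best = symbolic_search(rows, richness, weights, generations=50)
print(best, weighted_mse(best, rows, richness, weights))
```

In this sketch the two plots with a richness of 2 each receive a weight of 0.5, so together they count as much as the single plot with a richness of 7. Because the better half of the population is always carried forward, the best weighted error never increases from one generation to the next.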
This research examines the possible ecological controls on tree diversity in an unmanaged region of the Hyrcanian forests (i.e., the Kheyrud experimental forest). Key to the study are computer-generated abiotic surfaces and associated plot estimates of (i) growing-season-cumulated cloud-free solar radiation, (ii) seasonal air temperature, (iii) topographic wetness index (TWI) in representing soil water distribution, and (iv) wind velocity generated from simulation of fluid-flow dynamics in complex terrain (Fig. 1).

Fig. 1. Model-generated abiotic surfaces of (a) growing-season cumulated cloud-free solar radiation (MJ m⁻²), (b) mean seasonal air temperature (°C), (c) topographic wetness index (TWI; unitless), and (d) wind velocity within the study area (m s⁻¹).
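For readers unfamiliar with the topographic wetness index, the sketch below uses its commonly cited form, TWI = ln(a / tan β), where a is the specific catchment area (upslope contributing area per unit contour width) and β is the local slope. The source does not give the exact formulation used to build the TWI surface, so the formula variant, the function name, and the minimum-slope guard are assumptions.

```python
import math

def topographic_wetness_index(specific_catchment_area, slope_rad, min_slope_rad=1e-3):
    """TWI = ln(a / tan(beta)) for a single grid cell.

    specific_catchment_area: upslope area per unit contour width (m), must be > 0.
    slope_rad: local slope in radians; clamped to min_slope_rad (an assumed guard)
    so that flat cells do not cause a division by zero.
    """
    if specific_catchment_area <= 0:
        raise ValueError("specific catchment area must be positive")
    slope = max(slope_rad, min_slope_rad)
    return math.log(specific_catchment_area / math.tan(slope))

# Hypothetical cells: the same contributing area on a gentle and on a steep slope
gentle = topographic_wetness_index(100.0, math.radians(2.0))
steep = topographic_wetness_index(100.0, math.radians(30.0))
print(gentle > steep)
```

Cells with a large contributing area and a gentle slope receive high index values, which is why the index serves as a proxy for soil water distribution.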
Plot-level estimates (Fig. 2) are used in the generation of a three-variable equation (eq. 1) of tree diversity by means of symbolic regression (Schmidt and Lipson, 2009):
where W is the wind velocity (m s⁻¹), S is the annually cumulated cloud-free solar radiation (MJ m⁻²), and T is the mean annual air temperature near the ground surface (°C). In the regression process, diversity values are weighted according to TWI.

Fig. 2. Spatial variation in plot-level estimates of tree species diversity. Circle size corresponds to the level of species diversity calculated with the Shannon-Wiener index: large circles coincide with high index values (high species diversity, with an upper value of 1.6), whereas small circles coincide with low values (low species diversity, with a lower value of 0.1).
The localised topographic wetness index (TWI) is shown to be unimportant in explaining spatial patterns in tree diversity. The approach shows that plot-level estimates of W, S, and T in combination can explain roughly 70% of the spatial variation in tree diversity in the validation data (Fig. 3).

Fig. 3. Plot-level estimates of the Shannon-Wiener index (blue circles) compared to modelled values (red line).
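A statement that a model explains a given share of the variation in validation data is commonly based on the coefficient of determination, R² = 1 - SS_res / SS_tot. The source does not name the statistic behind the reported figure, so the sketch below is only an assumed reading of it; the function name and the example values are hypothetical and are not results from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: share of variance in `observed` explained."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)               # total variation
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))   # unexplained part
    return 1.0 - ss_res / ss_tot

# Hypothetical validation plots: observed index values and modelled values
observed = [0.1, 0.6, 0.9, 1.2, 1.6]
predicted = [0.3, 0.5, 1.0, 1.0, 1.4]
print(r_squared(observed, predicted))
```

A value of 1 means the modelled values reproduce the observations exactly, and a value of 0 means the model does no better than predicting the mean of the observations for every plot.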
Concluding Remarks
The chapter presents the methodology, results, and discussion of the research.
Schmidt, M., and Lipson, H. 2009. Distilling free-form natural laws from experimental data. Science, 324(5923): 81-85.

