Home Articles Producing probability maps to assess risk of exceeding critical threshold value of...

Producing probability maps to assess risk of exceeding critical threshold value of soil EC using geostatistical approach


Suresh Tripathi
Technical Director, Geosun Pty Ltd
P O Box 207, Boronia, VIC, 3155, Australia
Email: [email protected]

Assumptions and Geostatistical Variogram Model:
Geostatistical models are handled with the help of a probabilistic framework which is traditionally presented in terms of continuous random processes. In this paper, I consider a point s in d-dimensional Euclidean space Rd and suppose that a random variable {z(s):sÎD} at spatial location s is a realization of a stochastic process {Z(s):sÎD}. Geostatistical approach requires the stationary assumptions as described below. The process is said to be stationary in the mean if the following condition met as

E[Z(s) – Z(s+h)] = 0 ———————— (1.1)
where h (lag) is a vector in Rd . If the covariance cov (Z(s), Z(s+h)) is finite and depends upon only h, then the function C(h), called the covariogram, can be defined by

C(h) = cov[Z(s), Z(s+ h)] ———————– (1.1)
If C(h) is a function of only ||h|| then C(h) is called isotropic. If the variance of the difference between Z(s) and Z(s+h) is finite, and depends only on h then the variogram 2g(h) can be defined as

2g(h) = var [Z(s)-Z(s+h)] ——————- (1.3)
The process {Z(s)} is said to be intrinsic stationary or, alternatively, to satisfy the intrinsic hypothesis if it satisfies equations (1.1) and (1.3). In this case the variogram can be written as the expected squared difference between random variables Z(s) and Z(s+h) as below.

Note that g(0) = 0, g³0 and g(-h) = g(h). If var [Z(s)] = C(0) (finite), and second-order stationarity is satisfied, then the variogram can be written in terms of the covariogram as

which implies that Z(s) is also intrinsically stationary. In this case the variogram is bounded and C(0) is known as the sill of the variogram. Indicator approach requires data to be converted into binary based on the certain threshold value as Zk below.

Experimental indicator variogram can be calculated from the data as given below.

where 2N(h) is total number of pairs.

Spherical varogram model was used to fit experimental indicator variogram values obtained from the data and is given below.

where C(0), Cs and a are sill, nugget variance and range (spatially correlated distance) respectively.

Geostatistical Model:
In the stationarity framework, the regionalized variable Z(s) can be modelled as the sum of a deterministic part , a mean function called the drift, and a zero-mean stochastic process d(s) called the residual random part. The mean function can be thought of as the large scale variation representing the variable’s global trend over D. The zero-mean stochastic process can be thought of as the small scale variation representing the spatial dependence after the trend is removed. This type of model is known as a stochastic model or a probabilistic model (Cressie 1991) and can be expressed as equation (1.6)

If m(s) varies slightly over a region, or the expected value of the drift is not constant but varies over the region D then the drift may possibly be expressed as a linear combination of suitable base functions and we can write

where s is a location index,bj-1 , j = 1,…,p+1 are called the unknown drift coefficients, and fj-1 , j=1,…,p+1 are known base functions at s. The model expressed in equation (1.7) is known as the universal kriging model in geostatistics. A special case of the universal kriging model is the ordinary kriging model, when m (s) is unknown but constant and the simple kriging model when the mean function is known.

Indicator Kriging
At any unsample location S0 , the ordinary kriging estimator for threshold value used to interpolate at unknown location is defined as below.

Where the weight are obtained by solving the indicator kriging system of (n+1) equations (Goovaerts, 1995).

whereY (Zk) is a Lagrange parameter.

Case Study:
Soil EC measurements were taken in units of decisiemens per meter (dS/m) to study which part of soil land area are contaminated based on the threshold value 250 dS/m. The study soil site was sampled on a 15 X 15 m grid spacing in the eastwards direction and more than 20 meters to 40 meters in the northwards direction at regular interval with total number of 197 observations. Sample EC configuration is provided as shown in the graph 1 below. A geostatistical analyst package part of ESRI ArcGIS software is used produce the graph. Although EC has no direct effect on crop growth or yield but EC could be related to other soil properties such as water holding capacity, topsoil depth, soil nutrients levels, salinity, and subsoil characteristics. From graph 1 it can be roughly seen that the pattern of the soil EC tends to increase in south-east and south-west directions.

Graph 1

The histogram is also produced to check the stationary assumptions of the geostatistical theory and can be seen below in graph 2. It shows that EC data are not normal distribution and contain some outliers and shows mixture of two populations.

Graph 2

A probability plot is also shown to conclude that data violate stationary assumptions probability plot shows that there are some outliers above 95%.


Graph 3

The indicator variogram values are estimated at least in two directions to detect any anisotropy in the data. Checking and detecting anisotropy is important process as it has larger or shorter spatial correlation (range) in some directions, and can be useful in selecting neighbouring data to improve the performance of indicator kriging estimates (interpolation techniques). Anisotropy can be seen in the graph 4 and 5 below where spatial correlation of the variogram model (one with range 370m and other with range of 200m) changes with directions. A spherical variogram model was fitted to experimental variogram values and 3 main parameters of spherical model called nugget effect, sill and range were estimated to provide overall spatial variation of EC. In graph 4 the nugget effect is very small compared to total spatial variation of the EC.

Graph 4

The nugget effect accounted for about one-fourth of the total spatial variance of the EC data in the south-west direction. This model includes a structure with range of 200m compared to the first variogram model with spatial structure range of 370m. The first variogram model shows better spatial continuity than second variogram model. The second variogram model shows some existence of trend after 90 meters.

Graph 5

Indicator kriging has been chosen appropriate interpolation techniques in this case study as data do not follow symmetric distribution and mean and variance as defined in the equation (1.1) and (1.3) over the region is not constant. In this situation, kriging indicator is a robust method to deal with skewed distribution and outliers and resulted improved spatial estimate at unknown locations where no measurement were taken. An important contribution of geostatistics is the assessment of uncertainty about unsample values. Geostatistical interpolation is controlled by above indicator variogram model and the sampling configuration (regular, irregular, trangle cells) of the EC data. Therefore, it is also possible to optimize sampling strategy through variogram modeling after one forth of sample information is collected that has resulted in minimizing interpolation error. The sample below threshold values were assigned to zero and samples above the threshold values were assigned to 1. The probability output values are also between 0 and 1. The output prediction can be interpreted as the probability of the variables belonging to class of 1 which are above the threshold value by producing soil map of probability of exceeding critical values such as regulatory threshold for soil quality. The probability of values higher than threshold value is mapped as shown in the graph 6. For decision making point of view, in fact, it is often to assess risk of soil zones rather than computing single spatial estimate at unknown location with associated indicator kriging variance. The soil with high probability values in the south-west direction has been identified and separate zone of high risk area and some remedial treatment should be recommended to provide some economic yields of agriculture crops and is shown in the graph 6. The formers would be advised to apply fertilizer in this area as the output probability prediction are high compared to critical threshold value (critical value). The delimiting zones have been identified in the south-west direction which require some remedial treatment action if they want to produce some economic yields of agriculture crops.

Graph 6

Conclusion:
The case study above shows that non-parametric geostatistical approach can be applied to soil properties for assessing high risk area of contamination which requires some immediate remedial treatment and an effective management plan providing an informed decision support to help farmers. Traditional techniques do not provide any measurement of the reliability of the estimates; thus no risk assessment can be made. Currently, in most of the countries a little attempt in my knowledge has been made to advice farmers getting help in quantitative estimate of the probability exceeding certain threshold value or draws a probability map to find out which soil land required appropriate fertilizers or action to delimit zones requires urgent attention to control contamination. Therefore, it is useful to communicate the merits of the application of indicative kriging to agriculture area. Critical concentrations of EC or other soil properties have been recognized if the estimates are less than threshold values then farmers are advised to act and an appropriate and effective management plan can be developed. The soil may be lacking in major plan nutrients, show deficiencies or excesses of trace of elements or be too salty or alkaline where Farmers need to apply fertilizers to control salinity and alkalinity based on the geostatistical results.

References:

  • Cressie, N. (1993), Statistics for spatial data, revised ed. John Wiley and Sons, New York. 900p.
  • Deutsch, C.V. and Jorrnel, A. G. (1998), GSLIB: Geostatistical software library and user’s guide. Oxford University Press, New York, 370p.
  • Goovaerts, P. and Journel, A. G. (1995) Integrating soil map information in modeling the spatial variation of continuous soil properties. European Jorurnal of Soil Sciences, 46, 397-414.