Geoscientists often face interpolation and estimation problems when analyzing sparse data from field observations. Geostatistics is an invaluable tool that can be used to characterize spatial or temporal phenomena. Classic statistics is generally devoted to the analysis and interpretation of uncertainties caused by limited sampling of a property under study.

Geostatistics, however, deviates from classic statistics in that it is not tied to a population distribution model that assumes, for example, that all samples of a population are normally distributed and independent of one another. Earth science data (e.g., rock properties, contaminant concentrations) often do not satisfy these assumptions, as they can be highly skewed and/or spatially correlated (i.e., data values from locations that are closer together tend to be more similar than data values from locations that are farther apart). The goal of geostatistics is to predict the possible spatial distribution of a property.

Such prediction often takes the form of a map or a series of maps. Two basic forms of prediction exist: estimation (Figure 1) and simulation (Figure 2).

Figure 1 : Estimation

In estimation, a single, statistically "best" estimate (map) of the spatial occurrence is produced. The estimation is based on both the sample data and a model (variogram) judged to most accurately represent the spatial correlation of the sample data. This single estimate or map is usually produced by the kriging technique.

Figure 2 : Simulation

In simulation, on the other hand, many equally likely maps (sometimes called "images") of the property distribution are produced, using the same model of spatial correlation as required for kriging. Differences between the alternative maps provide a measure of the uncertainty, an option not available with kriging estimation.

Geostatistics versus Simple Interpolation

In geostatistical estimation, we wish to estimate a property at an unsampled location, based on the spatial correlation characteristics of this property and its values at existing sampled locations. But why not just use simple interpolation? How is spatial correlation incorporated in the geostatistical approach? A simple example may illustrate this point more clearly (Figure 3): we know the permeability at n sampled locations, and we wish to estimate the permeability, z0, at an unsampled location. Using inverse distance, the unknown value can be evaluated as:
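In its usual form, and assuming the weights are normalized to sum to one, the inverse distance estimator can be written as follows. The symbol d_i, the distance from sampled location i to the unsampled location, is introduced here for illustration:

```latex
z_0 = \sum_{i=1}^{n} W_i \, z_i ,
\qquad
W_i = \frac{1/d_i}{\sum_{j=1}^{n} 1/d_j}
```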

We can see that the above relation is a linear estimator, i.e., z0 is a weighted sum of the n known values. Each weight (Wi), assigned to a known value zi, is determined by the distance from the known data point to the unknown data point. For n = 7, for example, the weights can be calculated easily, as shown in Figures 4 and 5.

Figure 4 : Estimation of the unknown permeability z0 based on a set of known values of permeability at n locations.
Figure 5 : Estimation of the unknown z0 given 7 known values. Numbers in parentheses are the weights assigned to the known values based on inverse distance.

Using this scheme, the weights assigned to points 1, 2, 4, and 6 are all equal to 0.2. However, from our understanding of the geology, we know that permeability within the elongated sand body should be more similar in the lateral direction. Thus, points 4 and 6 should be given higher weights than points 1 and 2. Inverse distance cannot do this, because its weights depend on distance alone.
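The limitation can be seen in a small sketch. The Python below is illustrative only: the coordinates and permeability values are hypothetical (the layout of Figure 5 is not reproduced), and the function names idw_weights and idw_estimate are made up for this example. It places four samples at the same distance from the target, two along the lateral direction and two across it, and shows that inverse distance gives them identical weights.

```python
import math

def idw_weights(known_xy, target_xy, power=1.0):
    """Return inverse distance weights, normalized to sum to one."""
    inverse = []
    for x, y in known_xy:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0.0:
            raise ValueError("target coincides with a sampled location")
        inverse.append(1.0 / d ** power)
    total = sum(inverse)
    return [v / total for v in inverse]

def idw_estimate(known_xy, known_z, target_xy, power=1.0):
    """Linear estimator: weighted sum of the known values."""
    weights = idw_weights(known_xy, target_xy, power)
    return sum(w * z for w, z in zip(weights, known_z))

# Hypothetical layout: two samples east-west of the target (lateral,
# inside the sand body) and two north-south of it, all at distance 1.
known_xy = [(-1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
known_z = [120.0, 110.0, 40.0, 30.0]  # hypothetical permeabilities
target = (0.0, 0.0)

weights = idw_weights(known_xy, target)
estimate = idw_estimate(known_xy, known_z, target)
print(weights)   # every weight is 0.25, whatever the direction
print(estimate)  # 75.0, the plain average of the four values
```

Because direction never enters the calculation, the lateral samples receive no more weight than the others, and the estimate is pulled toward values that geology suggests are less relevant.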

Thus, conventional interpolation methods (e.g., inverse distance, inverse distance squared) do not incorporate information on spatial correlation. Geostatistical estimation, on the other hand, considers both distance and spatial correlation. In general, geostatistical estimation consists of three steps: (1) examining the similarity between a set of sample (known) data points via an experimental variogram analysis; (2) fitting a permissible mathematical function to the experimental variogram; (3) conducting kriging interpolation based on this function.
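The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production workflow: the data lie on a hypothetical one-dimensional transect, the variogram model is a spherical model with zero nugget, the fit is a coarse grid search rather than a proper regression, and ordinary kriging is used. All function names and data values are invented for the example.

```python
from itertools import combinations

def experimental_variogram(xs, zs, lag_width, n_lags):
    """Step 1: mean of 0.5*(zi - zj)^2 over pairs binned by separation."""
    hsum, gsum, count = [0.0] * n_lags, [0.0] * n_lags, [0] * n_lags
    for i, j in combinations(range(len(xs)), 2):
        h = abs(xs[i] - xs[j])
        k = int(h / lag_width)
        if k < n_lags:
            hsum[k] += h
            gsum[k] += 0.5 * (zs[i] - zs[j]) ** 2
            count[k] += 1
    # (mean pair distance, semivariance) for each non-empty bin
    return [(hsum[k] / count[k], gsum[k] / count[k])
            for k in range(n_lags) if count[k] > 0]

def spherical(h, sill, rng):
    """Spherical variogram model with zero nugget (a permissible model)."""
    if h >= rng:
        return sill
    r = h / rng
    return sill * (1.5 * r - 0.5 * r ** 3)

def fit_spherical(points, sills, ranges):
    """Step 2: grid search for the (sill, range) with least squared misfit."""
    def misfit(params):
        return sum((spherical(h, *params) - g) ** 2 for h, g in points)
    return min(((s, a) for s in sills for a in ranges), key=misfit)

def solve(a, b):
    """Gaussian elimination with partial pivoting for a small system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(xs, zs, x0, sill, rng):
    """Step 3: solve the ordinary kriging system (weights sum to one)."""
    n = len(xs)
    a = [[spherical(abs(xs[i] - xs[j]), sill, rng) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])  # unbiasedness constraint row
    b = [spherical(abs(x - x0), sill, rng) for x in xs] + [1.0]
    weights = solve(a, b)[:n]    # last unknown is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, zs)), weights

# Hypothetical transect: positions and property values.
xs = [0.0, 1.0, 2.0, 3.0, 5.0, 6.0, 8.0, 9.0]
zs = [12.0, 14.0, 15.0, 19.0, 25.0, 24.0, 18.0, 16.0]

points = experimental_variogram(xs, zs, lag_width=1.0, n_lags=5)
sill, rng = fit_spherical(points,
                          sills=[5.0 * k for k in range(1, 11)],
                          ranges=[float(k) for k in range(1, 11)])
estimate, weights = ordinary_kriging(xs, zs, 4.0, sill, rng)
print(points, sill, rng, estimate, sum(weights))
```

Unlike the inverse distance weights, the kriging weights here come from the fitted variogram, so they reflect how quickly similarity decays with separation in the data themselves. Extending the sketch to two dimensions with a direction-dependent (anisotropic) variogram is what would let lateral neighbors in an elongated sand body receive more weight.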