Bivariate Interpolation for Irregular Data

DESCRIPTION:

Interpolates the value of the third variable onto an evenly spaced grid of the first two variables.

USAGE:

interp(x, y, z, xo=<<see below>>, yo=<<see below>>, ncp=0, extrap=F) 

REQUIRED ARGUMENTS:

x
vector of x-coordinates of data points. Missing values are not accepted.
y
vector of y-coordinates of data points. Missing values are not accepted.
z
vector of z-coordinates of data points. Missing values are not accepted.

x, y, and z must be the same length and may contain no fewer than four points. The points of x and y cannot be collinear, i.e, they cannot fall on the same line (two vectors x and y such that y = ax + b for some a, b will not be accepted). interp is meant for cases in which you have x, y values scattered over a plane and a z value for each. If, instead, you are trying to evaluate a mathematical function, or get a graphical interpretation of relationships that can be described by a polynomial, try outer().

OPTIONAL ARGUMENTS:

xo
vector of x-coordinates of output grid. The default is 40 points evenly spaced over the range of x. If extrapolation is not being used ( extrap=F, the default), xo should have a range that is close to or inside of the range of x for the results to be meaningful.
yo
vector of y-coordinates of output grid. The default is 40 points evenly spaced over the range of y. If extrapolation is not being used ( extrap=F, the default), yo should have a range that is close to or inside of the range of y for the results to be meaningful.
ncp
number of additional points to be used in computing partial derivatives at each data point. ncp must be either 0 (partial derivatives are not used), or at least 2 but smaller than the number of data points.
extrap
logical flag: should extrapolation be used outside of the convex hull determined by the data points?

VALUE:

list with 3 components:
x
vector of x-coordinates of output grid, the same as the input argument xo, if present. Otherwise, a vector 40 points evenly spaced over the range of the input x.
y
vector of y-coordinates of output grid, the same as the input argument yo, if present. Otherwise, a vector 40 points evenly spaced over the range of the input x.
z
matrix of fitted z-values. The value z[i,j] is computed at the x,y point x[i], y[j]. z has dimensions length(x) times length(y) ( length(xo) times length(yo)).

NOTE:

The resulting structure is suitable for input to the functions contour, persp and image. Check the requirements of these functions when choosing values for xo and yo.

DETAILS:

If ncp is zero, linear interpolation is used in the triangles bounded by data points. Cubic interpolation is done if partial derivatives are used. If extrap is FALSE, z-values for points outside the convex hull are returned as NA. No extrapolation can be performed if ncp is zero. Results are currently computed to single-precision accuracy only.

The interp function cannot handle duplicate (x,y) points (since it does not know what z value to assign at such points). If you have such data, you need to reduce each duplicate point to a single point - giving it, perhaps, the mean or median of the corresponding z values. Or, if your z data is relatively smooth, you can use jitter() on your x or y data. If interp fails for any reason, using jitter on the x or y values may let it succeed without significantly changing your plot.

The triangulation scheme used by interp works well if x and y have similar scales but will appear stretched if they have very different scales. The spreads of x and y must be within four orders of magnitude of each other for interp to work.

When the data consists of bdVectors, the data is aggregated before smoothing. The range of the "x" and "y" variables are divided into 100 pieces, yielding a grid of 10000 bins. The mean for "x", "y", and "z" is computed in each bin. interp() is then applied to the bin means. This will give values that differ somewhat from those when interp() is applied to the unaggregated data. The values are generally close enough to produce similar results when used in a plot, but the difference could be important when interp() is used for prediction or optimization.

REFERENCES:

Akima, H. (1978). A Method of Bivariate Interpolation and Smooth Surface Fitting for Irregularly Distributed Data Points. ACM Transactions on Mathematical Software, 4, 148-164.

SEE ALSO:

, , , , .

For univariate interpolation, see for linear interpolation, or for cubic spline interpolation.

EXAMPLES:

# an example with duplicate (x,y) points in the data 
airtemp.jit <- jitter(air$temperature) 
air.jit <- interp(air$radiation, airtemp.jit, air$ozone) # gives error 
#an example averaging the z data over duplicate (x,y) points 
f <- 
function(x = air$temperature, y = air$radiation, z = air$ozone, FUN = median) 
{ 
        xy <- paste(x, y, sep = ",") 
        i <- match(xy, xy) 
        z.smoothed <- unlist(lapply(split(z, i), FUN)) 
        ord <- !duplicated(xy) 
        data.frame(x = x[ord], y = y[ord], z = z.smoothed) 
} 
air.smooth<- f()   
air.grid <-  interp(air.smooth$x, air.smooth$y, air.smooth$z) 
oz <- interp(ozone.xy$x, ozone.xy$y, ozone.median) 
contour(oz)    #contour plot 
#example changing xo and yo to get very fine grid 
akima.smooth<- interp(akima.x, akima.y, akima.z, 
      xo= seq(0,25, length=100),  yo= seq(0,20, length=100)) 
persp(akima.smooth)