Preparing Data for Graphing

Some plot types require data to be in a particular format. The special data formats that S-PLUS requires are explained in the following sections.

The following table lists each plot type and the required columns.

Required columns for different plot types: ● required, ○ optional.

Plot Type             X               Y               Z         W
--------------------  --------------  --------------  --------  -------
Area                  ○               ●
Bar                   ○               ●
Bar w/ Error Bars     ○               ●               ○         ○
Bar 3D (Gridded)      ○               ○               ●
Box-Grouped           ○ (group)       ●
Box-Plot              ○               ●
Comment               ○               ●               ○ (text)
Contour-gridded       ○               ○               ●
Contour-irregular     ●               ●               ●
Curve Fitting
Exponential           ○               ●
Linear                ○               ●
Log 10                ○               ●
Log e                 ○               ●
Polynomial            ○               ●
Power                 ○               ●
Nonlinear             ●               ●
Error Bar Horiz       ● (multiple x)  ●               ○         ○
Error Bar Vertical    ●               ● (multiple y)  ○         ○
High-Low-Close        ●               ● (close)       ● (high)  ● (low)
Histogram             ●
Line 2D               ○               ●
Line 3D               ○               ○               ●
Loess 2D              ○               ●
Pie                   ●
Polar                 ● (radius)      ● (angle)
Projections 3D        ○               ●
Regression 2D         ○               ●
Regression 3D         ○               ●               ●
Scatter 2D            ○               ●
Scatter 3D            ○               ○               ●
Smith-Circle          ●               ●               ●
Smith-Impedance       ●               ●
Smith-Reflection      ●               ●
Spline 2D             ○               ●
Spline 3D (gridded)   ○               ○               ●
Surface-gridded       ○               ○               ●
Surface-irregular     ●               ●               ●
Vector (beg/end)      ●               ●               ●         ●
Vector (angle/mag)    ●               ●               ●         ●
Weighted LS (IP)      ○               ●

Specifying Multiple Data Columns

Multiple columns can be specified in a list (for example, Y1, Y2, Y3) or in a sequence (for example, Sample1:Sample5). Each column must have the same length. For example, multiple columns can be specified for the Y column of grouped or stacked bar charts and area charts, or for the Z column of contour or surface plots.

 

A data frame is an example of a multiple-column data set.

Levels    X    Sample1    Sample2    Sample3    Sample4    Sample5
------    -    -------    -------    -------    -------    -------
High      1    0.45       0.69       0.66       0.66       0.19
Medium    2    0.89       0.12       0.41       0.89       0.78
Low       3    0.42       0.61       0.37       0.29       0.44
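The list and sequence notations described above can be illustrated with a short sketch. It is written in Python purely for illustration and is not S-PLUS code. The function names expand_columns and same_length are hypothetical, and the sketch assumes that a sequence such as Sample1:Sample5 selects every column from the first name through the last name in the order the columns appear in the data set.

```python
def expand_columns(spec, column_names):
    """Expand a column specification such as "Y1, Y2" or "Sample1:Sample5".

    Assumption: a "first:last" sequence selects the contiguous run of
    columns from `first` through `last` in data-set order.
    """
    names = []
    for item in spec.split(","):
        item = item.strip()
        if ":" in item:
            first, last = (part.strip() for part in item.split(":", 1))
            start = column_names.index(first)  # raises ValueError if missing
            stop = column_names.index(last)
            if start > stop:
                raise ValueError("sequence runs backwards: " + item)
            names.extend(column_names[start:stop + 1])
        else:
            if item not in column_names:
                raise ValueError("unknown column: " + item)
            names.append(item)
    return names


def same_length(data, names):
    """Return True when every selected column has the same number of rows."""
    return len({len(data[name]) for name in names}) <= 1


# The sample data frame from the table above, stored column by column.
data = {
    "Levels": ["High", "Medium", "Low"],
    "X": [1, 2, 3],
    "Sample1": [0.45, 0.89, 0.42],
    "Sample2": [0.69, 0.12, 0.61],
    "Sample3": [0.66, 0.41, 0.37],
    "Sample4": [0.66, 0.89, 0.29],
    "Sample5": [0.19, 0.78, 0.44],
}
columns = list(data)

selected = expand_columns("Sample1:Sample5", columns)
print(selected)                     # the five Sample columns, in order
print(same_length(data, selected))  # True
```

Here the sequence expands to the five Sample columns, all of which have three rows, so they satisfy the equal-length requirement.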

Specifying Matrix Form Data

In a gridded surface or contour plot, the Z values represent the height of each intersection in a grid. If the Z data are specified in a series of columns in a data frame or in a matrix, the dimensions of the grid are defined by the number of rows (X grids) and columns (Y grids) of the Z data. For example, if your Z data consist of 4 columns each containing 5 Z values, the number of X grids is 5 and the number of Y grids is 4.
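The relationship between the shape of the Z data and the grid dimensions can be shown with a minimal sketch. It uses Python only as an illustration language, the function name grid_dimensions is hypothetical, and the Z values are made up.

```python
def grid_dimensions(z_rows):
    """Return (x_grids, y_grids) for Z data stored as a list of rows.

    Rows correspond to X grids and columns to Y grids, as described above.
    """
    n_rows = len(z_rows)
    n_cols = len(z_rows[0]) if z_rows else 0
    if any(len(row) != n_cols for row in z_rows):
        raise ValueError("every row of Z must have the same number of columns")
    return n_rows, n_cols


# Illustrative Z data: 4 columns, each containing 5 values (5 rows).
z = [
    [0.60, 0.59, 0.55, 0.52],
    [0.65, 0.62, 0.58, 0.54],
    [0.67, 0.65, 0.62, 0.57],
    [0.69, 0.66, 0.66, 0.61],
    [0.71, 0.67, 0.73, 0.70],
]

x_grids, y_grids = grid_dimensions(z)
print(x_grids, y_grids)  # prints: 5 4
```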

Specifying Stacked Form Data

In a gridded surface or contour plot, the Z values represent the height of each intersection in a grid. If the Z data are "stacked" in one long column, S-PLUS needs more information to determine the dimensions of the grid. You can supply this information by specifying either the X and Y data columns (if the numbers of X and Y Data Grids are set to Auto) or the number of data grids.

If you specify the X and Y data columns, S-PLUS uses the minimum and maximum data values to determine the position of the contour along the x- and y-axes. For example, if the X column has a minimum value of 2 and a maximum value of 9, the plot is drawn between 2 and 9 on the x-axis. If X and Y are not specified, or if the X and Y columns contain character data, S-PLUS assumes that the grid values in the X and Y directions are a series of integers. If character data are specified, they are used to label the X and Y tick marks.

If you specify the number of data grids, the number of X grids times the number of Y grids must be equal to the number of rows in the Z column. For example, if your Z column contains 875 values, it might represent a grid with 35 X values (the number of X grids) and 25 Y values (the number of Y grids).
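The arithmetic behind this rule can be checked with a small sketch. It is written in Python for illustration only; the function names grids_match and possible_grids are hypothetical, and the Z column is a placeholder of the right length rather than real data.

```python
def grids_match(z_values, x_grids, y_grids):
    """Return True when x_grids * y_grids equals the number of stacked Z values."""
    return x_grids * y_grids == len(z_values)


def possible_grids(n_values):
    """List every (x_grids, y_grids) pair whose product is n_values."""
    return [(nx, n_values // nx)
            for nx in range(1, n_values + 1)
            if n_values % nx == 0]


z_column = [0.0] * 875  # placeholder Z column with 875 rows

print(grids_match(z_column, 35, 25))   # True: 35 * 25 = 875
print(grids_match(z_column, 30, 25))   # False: 30 * 25 = 750
print((35, 25) in possible_grids(875)) # True
```

Because several pairs of grid counts multiply to the same total, the row count alone does not determine the grid; the number of X grids and Y grids still has to be stated.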

Short form stacked data

The X and Y data can be specified in a short or long form. In the short form, each X and Y value is listed only once. Both columns must have values that are in ascending order. The Z values do not correspond to the X and Y values in the same row. S-PLUS assumes that the X values vary faster in determining the X,Y coordinates that correspond to each Z value. For example, the table below represents a list containing vectors X, Y and Z.

 

Sample stacked data: short form.

X      Y      Z
---    ---    ----
1.2    10     0.60
2.2    20     0.67
3.2    30     0.67
4.2           0.69
5.2           0.71
              0.59
              0.62
              0.65
              0.66
              0.67
              0.55
              0.58
              0.62
              0.66
              0.73
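The way short form Z values pair up with X,Y coordinates can be sketched as follows. The code is Python used for illustration only, not S-PLUS code, and the function name short_to_long is hypothetical. It applies the rule stated above that the X values vary fastest, using the values from the short form table.

```python
def short_to_long(x_values, y_values, z_values):
    """Pair each stacked Z value with its (X, Y) coordinate.

    X varies fastest: the first len(x_values) Z values belong to the
    first Y value, the next len(x_values) to the second Y value, and so on.
    """
    if len(z_values) != len(x_values) * len(y_values):
        raise ValueError("Z must contain len(X) * len(Y) values")
    rows = []
    index = 0
    for y in y_values:       # Y varies slowest
        for x in x_values:   # X varies fastest
            rows.append((x, y, z_values[index]))
            index += 1
    return rows


x = [1.2, 2.2, 3.2, 4.2, 5.2]
y = [10, 20, 30]
z = [0.60, 0.67, 0.67, 0.69, 0.71,
     0.59, 0.62, 0.65, 0.66, 0.67,
     0.55, 0.58, 0.62, 0.66, 0.73]

rows = short_to_long(x, y, z)
print(len(rows))  # prints: 15
print(rows[0])    # prints: (1.2, 10, 0.6)
print(rows[5])    # prints: (1.2, 20, 0.59)
print(rows[-1])   # prints: (5.2, 30, 0.73)
```

Each resulting row is one X, Y, Z triplet, so the sixth Z value (0.59) belongs to X = 1.2 and Y = 20 even though it sits on a row with no X or Y entry in the short form.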


Long form stacked data

In the long form, the X, Y, and Z columns are of equal length and can be stored in a data frame. They contain every combination of the X and Y values and their corresponding Z value. The X values are an ascending sequence, and the sequence is repeated for each Y value. The Y values are also ascending, with each value repeated for each X value. This is the format that should be used for creating multipanel graphs.

 

Sample stacked data: long form.

X      Y      Z
---    ---    ----
1.2    10     0.60
2.2    10     0.65
3.2    10     0.67
4.2    10     0.69
5.2    10     0.71
1.2    20     0.59
2.2    20     0.62
3.2    20     0.65
4.2    20     0.66
5.2    20     0.67
1.2    30     0.55
2.2    30     0.58
3.2    30     0.62
4.2    30     0.66
5.2    30     0.73

Specifying Irregular Form Data

For an irregular surface or contour plot you must specify three columns of equal length for X, Y, and Z. The data can be in any order, and the spacing between X values and Y values may be random. Each X,Y,Z triplet defines a position in 3D space. S-PLUS first estimates a set of gridded data, and then plots the data as it does a gridded surface or contour plot. You will get better results if the data are distributed fairly uniformly in X and Y and do not contain sharp "spikes" or "drops" in Z.

 

Irregular data for surface or contour plot

X      Y       Z
---    ----    ----
1.1    51.0    0.60
2.3    11.9    0.65
3.2    35.8    0.67
3.4    29.0    0.69
4.5    21.6    0.71
5.1    43.2    0.59
5.2    10.3    0.62