Scatter Plot

Scatter plots display one set of numerical values on a vertical scale against another set of numerical values on a horizontal scale.

To generate a scatter plot

Choose Graph > Scatter Plot. The dialog shown below appears.

Data page

__image\scatplot1.gif

In the Scatter Plot dialog, the Data page has the following options:

Data

Data Set

Select a data set from the dropdown list, or type into the Data Set edit field either the name of a data set or any expression that evaluates to a data set.

Subset Rows

Enter an S-PLUS expression that identifies the rows to use in the analysis. To use all the rows in the data set, leave this field blank.

Variables

X Axis Value

Select the column that specifies x-values.

Y Axis Value

Select the column that specifies y-values.

Conditioning

Select the columns specifying conditioning values.

Save Graph Object

Save As

Enter a name for the object in which to save the graph.

Plot page

__image\scatplot2.gif

In the Scatter Plot dialog, the Plot page has the following options:

Plot Type

Type

Specify the type of plot desired. This may be "Points", "Lines", "Both Points and Lines", "Overlaid Points and Lines", "Stairstep Lines", or "Vertical Lines".

Pre-Sort Data

Specify whether to sort the values before connecting them with lines. This may be "None" for no sorting, "Sort on X", or "Sort on Y".
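The effect of pre-sorting is easiest to see on a few unordered points. The Python sketch below is illustrative only: the dialog does this step internally, and the function name `presort` is an assumption made for the example. It reorders (x, y) pairs into the sequence in which line segments would join them.

```python
def presort(points, mode="None"):
    """Return points in the order they would be connected by lines.

    mode mirrors the dialog choices: "None", "Sort on X", or "Sort on Y".
    """
    if mode == "None":
        return list(points)          # keep the original row order
    if mode == "Sort on X":
        return sorted(points, key=lambda p: p[0])
    if mode == "Sort on Y":
        return sorted(points, key=lambda p: p[1])
    raise ValueError("unknown mode: " + mode)

pts = [(3, 1), (1, 2), (2, 0)]
print(presort(pts, "Sort on X"))   # [(1, 2), (2, 0), (3, 1)]
print(presort(pts, "Sort on Y"))   # [(2, 0), (3, 1), (1, 2)]
```

With "None", the line would zigzag from x = 3 back to x = 1; sorting on X gives a line that moves left to right.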

Vary Style by Group

Group Variables

Select a grouping variable to create a separate set of points and lines for each level of the grouping variable. Use this to denote different sets of observations by varying the style. 

Vary Color

Check this to vary the color between groups.

Vary Symbol Style

Check this to vary the symbol style between groups.

Vary Line Style

Check this to vary the line style between groups.

Include Legend

Check this to include a legend indicating the style corresponding to each group.

Symbol/Line Color

Color

Select the color to use for symbols and lines. This setting is ignored if Vary Color is checked under Vary Style by Group.

Symbol

Symbol Style

Select the symbol style to use. This setting is ignored if Vary Symbol Style is checked under Vary Style by Group.

Symbol Size

Specify the size of symbol to use. This is the amount of character expansion to use relative to the device's standard size.

Line

Line Style

Specify the line style to use. This setting is ignored if Vary Line Style is checked under Vary Style by Group.

Line Width

Specify the line width to use. This is the width relative to the device's standard line width.

Fit page

__image\scatplot3.gif

In the Scatter Plot dialog, the Fit page has the following options:

Regression

Regression Type

Specify the type of regression line to add. Select "None" for no line, "Least Squares" for the standard least-squares regression, or "Robust" for a robust-MM regression.
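As a reminder of what the "Least Squares" choice computes, the following Python sketch fits a straight line by ordinary least squares. It is not the S-PLUS implementation; the function name `least_squares_line` is hypothetical, and the robust MM fit is not shown.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]            # exactly y = 1 + 2x
print(least_squares_line(x, y))     # (1.0, 2.0)
```

Because every point enters through a squared residual, a single extreme observation can pull this line noticeably, which is the situation the "Robust" option is meant for.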

Smooth

Smoothing Type

Specify the type of smoother to add. Choose from the following options:

None: uses no smoothing.

Kernel: uses the ksmooth function to perform a kernel smooth, which is a generalization of local average smoothing.

Loess: uses the loess function to fit a local regression.

Smoothing Spline: uses the smooth.spline function and the predict.smooth.spline function to calculate predictions from a cubic B-spline. The regression is fit by penalized least squares between knots. For small data vectors (n < 50), a knot is placed at every distinct point. For larger data sets, fewer knots are used in order to keep the computation time manageable.

Supersmoother: uses the supsmu function to compute Friedman's variable span smoother. It uses a symmetric k-nearest neighbor linear least squares fitting procedure. The algorithm is fast, and by default uses cross-validation to pick the span.

User: lets you specify your own smoothing function. See User-Defined Smoothing below.

# Output Points

Specify the number of points to be produced by the smoothing. If this is not specified, the default number of points depends on the smoothing algorithm used.

Kernel Specs

Bandwidth: Enter a numeric value for the kernel bandwidth smoothing parameter. All kernels are scaled so that the upper and lower quartiles of the kernel are 0.25 and -0.25 when the bandwidth is 1. Larger bandwidths give smoother estimates; smaller bandwidths give rougher estimates. The default bandwidth is 0.5.

Kernel: From the dropdown menu, choose Box (a rectangular box), Triangle (a box convolved with itself), Parzen (the Parzen function, a box convolved with a triangle), or Normal (a Gaussian density function). The default kernel is Normal.
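The quartile scaling described above can be made concrete with a small kernel-weighted average. The Python sketch below is an assumption-laden illustration, not the ksmooth source: the names `kernel_weight` and `kernel_smooth` are hypothetical, only the Box and Normal kernels are covered, and the scale constants are derived from the statement that the kernel quartiles sit at plus and minus 0.25 times the bandwidth.

```python
import math

# Scale factors that put the kernel quartiles at +/- 0.25 * bandwidth.
BOX_HALF_WIDTH = 0.5           # uniform on [-a, a] has quartiles at +/- a/2
NORMAL_SD = 0.25 / 0.6744898   # normal quartiles lie at +/- 0.6744898 * sd

def kernel_weight(u, bandwidth, kernel="Normal"):
    """Unnormalized weight for a point at signed distance u from the target."""
    if kernel == "Box":
        return 1.0 if abs(u) <= BOX_HALF_WIDTH * bandwidth else 0.0
    if kernel == "Normal":
        sd = NORMAL_SD * bandwidth
        return math.exp(-0.5 * (u / sd) ** 2)
    raise ValueError("kernel not covered by this sketch: " + kernel)

def kernel_smooth(x, y, x_out, bandwidth=0.5, kernel="Normal"):
    """Kernel-weighted average of y, evaluated at each point of x_out."""
    fitted = []
    for x0 in x_out:
        w = [kernel_weight(xi - x0, bandwidth, kernel) for xi in x]
        total = sum(w)
        if total > 0:
            fitted.append(sum(wi * yi for wi, yi in zip(w, y)) / total)
        else:
            fitted.append(float("nan"))   # no data inside the window
    return fitted

x = [0, 1, 2, 3]
y = [0, 2, 4, 6]
print(kernel_smooth(x, y, [0, 1], bandwidth=2, kernel="Box"))   # [1.0, 2.0]
```

With the Box kernel and a bandwidth of 2, the window reaches one unit on either side of the target, so each fitted value is the plain average of the neighbors within that distance. Raising the bandwidth widens the window and flattens the curve.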

Loess Specs

Span: Enter a number between 0 and 1 that controls the amount of smoothing. Smaller values result in less smoothing; values very close to 0 are not recommended. By default, automatic (variable) span selection is done by cross-validation. Reasonable span values are from 0.3 to 0.5. For small samples (n < 50), or if there are substantial serial correlations between observations close in x-value, use a prespecified fixed span.

Degree: Select the overall degree of the locally fitted polynomial. One gives locally linear fitting, and Two gives locally quadratic fitting.

Family: Select either Symmetric or Gaussian. Symmetric combines local fitting with a robustness feature that guards against distortion by outliers. Gaussian uses local fitting alone, without the robustness feature.
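To show how Span and Degree interact, here is a minimal Python sketch of a degree-one local fit with tricube weights and no robustness iterations (roughly the Gaussian family). It is a simplification under stated assumptions, not the loess algorithm as implemented in S-PLUS: the names `tricube` and `local_linear` are hypothetical, and the window is taken to be the nearest `ceil(span * n)` points.

```python
import math

def tricube(u):
    """Tricube weight: 1 at u = 0, falling to 0 once |u| reaches 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0

def local_linear(x, y, x0, span=0.5):
    """Fit a weighted straight line near x0 and return its value at x0."""
    n = len(x)
    k = max(2, min(n, math.ceil(span * n)))      # points in the window
    h = sorted(abs(xi - x0) for xi in x)[k - 1]  # distance to k-th nearest
    w = [tricube((xi - x0) / h) for xi in x] if h > 0 else [0.0] * n
    if sum(w) == 0:
        # All neighbors sit exactly on the window edge: weight them equally.
        w = [1.0 if abs(xi - x0) <= h else 0.0 for xi in x]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

x = list(range(10))
y = [2 * xi + 1 for xi in x]
print(round(local_linear(x, y, 4.5, span=0.5), 6))
```

A smaller span shrinks the window, so the fit follows local wiggles more closely; a larger span averages over more points and gives a smoother curve.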

Smoothing Spline Specs

Deg. Of Freedom: The degrees of freedom should be between 1 and the number of input data points minus 1. The lower the degrees of freedom, the smoother the line. If Auto is selected, cross-validation is used to choose the value.

Supersmoother Specs

Span: Enter a number between 0 and 1 that controls the amount of smoothing. Smaller values result in less smoothing; values very close to 0 are not recommended. By default, automatic (variable) span selection is done by cross-validation. Reasonable span values are from 0.3 to 0.5. For small samples (n < 50), or if there are substantial serial correlations between observations close in x-value, use a prespecified fixed span.

User-Defined Smoothing

Function Name: Specify the name of the function to use for smoothing. Its first arguments must be:

x: vector of x data

y: vector of y data

z: vector of z data (can be NULL)

w: vector of w data (can be NULL)

subscripts: vector of row indices

panel.num: panel number if conditioned

It must return a list containing the following components:

x: a vector of x data for line drawing

y: a vector of y data for line drawing
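The shape of this contract can be sketched as follows. A smoother used with this dialog must be written as an S-PLUS function; the Python below only mirrors the argument list and the returned components so the structure is easy to see. The name `running_mean_smoother`, the `window` parameter, and the spelling `panel_num` (Python names cannot contain a dot) are assumptions made for the example.

```python
def running_mean_smoother(x, y, z=None, w=None, subscripts=None,
                          panel_num=None, window=3):
    """Hypothetical smoother: a centered running mean of y along sorted x.

    z, w, subscripts and panel_num are accepted to match the required
    argument list but are not used by this simple example.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    half = window // 2
    smoothed = []
    for i in range(len(xs)):
        lo = max(0, i - half)                # window is truncated at the ends
        hi = min(len(xs), i + half + 1)
        smoothed.append(sum(ys[lo:hi]) / (hi - lo))
    # Return the two components needed for line drawing.
    return {"x": xs, "y": smoothed}

print(running_mean_smoother([3, 1, 2], [30, 10, 20]))
# {'x': [1, 2, 3], 'y': [15.0, 20.0, 25.0]}
```

The returned x values are sorted so that the smoothed line is drawn from left to right.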

Other

Other Arguments

Optional arguments for any of the smoothing types can be specified here. For example, if Friedman's supersmoother is used, the underlying supsmu function is called. If you enter bass=5 in the Other Arguments field, that argument is passed to the supsmu function when the smooth is computed.

Titles page

__image\scatplot4.gif

In the Scatter Plot dialog, the Titles page has the following options:

Titles

Main Title

Specify a main title to add at the top of the page.

Subtitle

Specify a subtitle to add at the bottom of the page.

Labels

X Axis Label

Specify a label for the x-axis.

Y Axis Label

Specify a label for the y-axis.

Axes page

__image\scatplot5.gif

In the Scatter Plot dialog, the Axes page has the following options:

Aspect Ratio

Aspect Ratio: This controls the size of the axis area within the designated plot area. Specify "Fill Plot Area" to fill the page, "Bank to 45 Degrees" to use the 45-degree banking rule in each panel, or "Specified Value" to specify a numeric value.

Ratio Value: A numeric value indicating the ratio of height to width.

Scale

X Scale

The scale to use for placing tick marks on the x-axis.

Y Scale

The scale to use for placing tick marks on the y-axis.

Limits

X Limits

The limits for the x-axis. Specify a minimum and a maximum, separated by a comma. For example: "1, 10".

Y Limits

The limits for the y-axis. Specify a minimum and a maximum, separated by a comma. For example: "1, 10".

Relation

X Relation

Controls the relationship between the x-axes in different panels of a conditioned plot. The default value, "Same", ensures that the x-axis on each panel is identical. "Sliced" gives the same number of data units to the x-axis on each panel, ensuring that the number of units per centimeter is identical. "Free" gives each panel an x-axis that accommodates just the data in that panel. For "Sliced" and "Free", axes are drawn for each panel, which uses more space on the display.

Y Relation

Controls the relationship between the y-axes in different panels of a conditioned plot. The choices are the same as for X Relation.
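The three relation settings can be compared by computing the axis limits each would give. The Python sketch below is illustrative, with a hypothetical function name `panel_limits`; it ignores the padding a real plot adds around the data, and centering each "Sliced" axis on its panel's data is an assumption.

```python
def panel_limits(panels, relation="Same"):
    """Return one (low, high) axis range per panel.

    panels is a list of lists holding the data values shown in each panel.
    """
    ranges = [(min(p), max(p)) for p in panels]
    if relation == "Free":
        return ranges                          # each panel fits its own data
    if relation == "Same":
        lo = min(r[0] for r in ranges)
        hi = max(r[1] for r in ranges)
        return [(lo, hi)] * len(panels)        # one shared range
    if relation == "Sliced":
        width = max(hi - lo for lo, hi in ranges)   # common number of units
        return [((lo + hi) / 2 - width / 2, (lo + hi) / 2 + width / 2)
                for lo, hi in ranges]
    raise ValueError("unknown relation: " + relation)

data = [[0, 2], [10, 16]]
print(panel_limits(data, "Same"))     # [(0, 16), (0, 16)]
print(panel_limits(data, "Sliced"))   # [(-2.0, 4.0), (10.0, 16.0)]
print(panel_limits(data, "Free"))     # [(0, 2), (10, 16)]
```

"Sliced" keeps the scale (units per centimeter) comparable across panels while letting each panel move to where its data lie.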

Alternating

Alternating X Axes

Determines whether x-axes alternate from one side of the group of panels to the other.

Alternating Y Axes

Determines whether y-axes alternate from one side of the group of panels to the other.

Multipanel page

__image\scatplot6.gif

In the Scatter Plot dialog, the Multipanel page has the following options:

Layout

Number of Columns/Rows/Pages: Control the layout of the panels by specifying the number of columns, rows, and pages.

Panel Order: Choose from Graph Order or Table Order. Graph Order draws the first panel in the bottom left corner of the graph and continues to the right and then up. Table Order draws the first panel in the upper left corner and continues to the right and then down.

Include Strip Labels: Check this to include strip labels on panels.
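The two panel orders differ only in which row is filled first. As a small Python illustration (the function name `panel_position` and the 0-based indexing are assumptions for the example), the position of each panel on a page can be computed like this:

```python
def panel_position(index, n_cols, n_rows, order="Graph Order"):
    """Return (row_from_top, column), both 0-based, for a panel on one page.

    index is the 0-based drawing order of the panel within the page.
    """
    col = index % n_cols
    filled_row = index // n_cols       # rows are filled left to right
    if order == "Graph Order":
        row_from_top = n_rows - 1 - filled_row   # start at the bottom row
    elif order == "Table Order":
        row_from_top = filled_row                # start at the top row
    else:
        raise ValueError("unknown order: " + order)
    return row_from_top, col

# A 2-column by 2-row page.
print([panel_position(i, 2, 2, "Graph Order") for i in range(4)])
# [(1, 0), (1, 1), (0, 0), (0, 1)]
print([panel_position(i, 2, 2, "Table Order") for i in range(4)])
# [(0, 0), (0, 1), (1, 0), (1, 1)]
```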

Continuous Conditioning

Number of Panels: If a conditioning variable is continuous, its values are divided into intervals, one per panel. This field specifies the number of panels.

Fraction of Shared Points: Create overlapping intervals by specifying the fraction of data points that are shared between two adjacent panels.

Interval Type: Choose from Equal Counts or Equal Ranges. Equal Counts places an equal number of data points in each panel. Equal Ranges makes all the interval widths equal.
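The difference between the two interval types, and the role of the shared fraction, can be sketched in Python. This is a simplified illustration and not the code the dialog runs: the function names are hypothetical, the equal-count rule is a common textbook form of the equal-count algorithm, and overlap is applied only to the equal-count case.

```python
def equal_ranges(values, n_panels):
    """Split the data range into n_panels intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_panels
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n_panels)]

def equal_counts(values, n_panels, overlap=0.5):
    """Intervals holding about the same number of points each.

    overlap is the fraction of points shared by two adjacent intervals.
    """
    xs = sorted(values)
    n = len(xs)
    r = n / (n_panels * (1 - overlap) + overlap)   # points per interval
    intervals = []
    for i in range(n_panels):
        start = i * (1 - overlap) * r
        first = round(1 + start)                   # 1-based ranks
        last = round(r + start)
        intervals.append((xs[first - 1], xs[last - 1]))
    return intervals

values = list(range(1, 13))                        # 1, 2, ..., 12
print(equal_counts(values, 3, overlap=0.5))        # [(1, 6), (4, 9), (7, 12)]
print(equal_ranges([0, 10], 2))                    # [(0.0, 5.0), (5.0, 10.0)]
```

In the equal-count example each interval holds six of the twelve points, and adjacent intervals share three of them, which is the requested fraction of one half.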