The Structure of S-PLUS Expressions

DESCRIPTION:

This help file describes the structure of the S-PLUS language, including the precedence of operators.

SUMMARY:

Assigning a value to a name is achieved with an "arrow" (<-; see Assignment) or with the assign function.
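
As a small illustration (the names x and y are made up for this example), the two forms of assignment give the same result when typed at the top level. The code is plain S and should also run unchanged in R:

```r
x <- c(1, 2, 3)        # assignment with the arrow
assign("y", x * 2)     # the same kind of assignment through the assign function
y                      # the value of y is the vector 2 4 6
```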

Arithmetic binary operators are placed between the (names of) two vectors. Parentheses specify the order of evaluation; see the precedence table below.

The form of a function call is: the name of the function followed by an opening parenthesis, then the list of arguments (if any) followed by a closing parenthesis. Items in the argument list are separated by commas; arguments that are listed out of order must be in the name=value format. In general, argument names need not be typed in full; only enough to uniquely identify the argument is required. The exception is when an argument follows the dot-dot-dot construction ... in the definition--in this case the argument must be given in the name=value form with the name specified exactly.
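
The matching rules can be sketched with a hypothetical function, power.sum, which is defined here only for illustration and is not part of S-PLUS. Its exponent argument comes before ... and may be abbreviated; its multiplier argument comes after ... and must be named in full:

```r
# Hypothetical function, defined only for this illustration.
power.sum <- function(exponent = 1, ..., multiplier = 1) {
        sum(c(...)^exponent) * multiplier
}

power.sum(2, 1, 2, 3)                # positional: exponent is 2, result 14
power.sum(1, 2, 3, exp = 2)          # "exp" abbreviates exponent, result 14
power.sum(1, 2, 3, multiplier = 10)  # exact name after ..., result 50
power.sum(1, 2, 3, mult = 10)        # "mult" is not matched to multiplier;
                                     # it is absorbed by ..., result 15
```

In the last call the abbreviated name is silently treated as one more value passed through ..., which is why exact names matter for arguments that follow it.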

To subscript vectors and arrays, use vname[ expr ] (see Subscript). When extracting portions of a list, use lname[ expr ], lname[[ expr ]] or lname$fname. Use of the single square brackets returns a sublist of lname, while the double square brackets and the $ operator return one component of lname. The advantage of the double bracket form is that components can be extracted by their order in the list as well as by name. Not all of a component name needs to be specified--the requirement is that enough of the beginning be written to uniquely identify the component.
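
A minimal sketch of the three forms, using a hypothetical list named fit whose component names are invented for the example:

```r
# Hypothetical list, built only for this illustration.
fit <- list(coefficients = c(1.5, -0.3),
        residuals = c(0.1, -0.1, 0),
        call = "demo")

fit["coefficients"]     # single brackets: a list of length 1
fit[["coefficients"]]   # double brackets: the numeric vector itself
fit[[2]]                # a component selected by its position
fit$coef                # an abbreviated name is enough when it is unique
```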

Curly braces ("{" and "}") surrounding a number of expressions cause them to be treated as one expression. This is useful when writing functions and with if, for, while, and repeat.

A typed expression may be continued on further lines by ending a line at a place where it is obviously incomplete: with a trailing comma or operator, or with more left parentheses than right parentheses (implying that more right parentheses will follow). The default prompt is "> "; when continuation is expected, the default prompt is "+ ". Conversely, two or more expressions can be placed on a single line if they are separated by semicolons (;).
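
The continuation rules might look as follows in practice. The prompts are omitted so that the lines can be typed as shown, and the names total, values, a and b are made up for the example:

```r
total <- 1 + 2 +        # trailing operator: the expression continues
        3
values <- c(10, 20,     # unclosed parenthesis: continues until it is closed
        30)
a <- 1; b <- 2          # two expressions on one line
```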

The flow of control in for, while, and repeat loops can be controlled with next and break. If next is encountered, the next iteration is immediately begun. The loop is exited if break is encountered. A function can be exited with return( expr ).
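
A small runnable sketch of these statements, using a hypothetical function total.until.negative that is not part of S-PLUS: it skips missing values with next, stops at the first negative value with break, and hands back the result with return.

```r
# Hypothetical helper, defined only for this illustration.
total.until.negative <- function(x) {
        total <- 0
        for(v in x) {
                if(is.na(v)) next    # skip missing values
                if(v < 0) break      # leave the loop at the first negative value
                total <- total + v
        }
        return(total)                # exit the function with the value total
}

total.until.negative(c(1, NA, 2, -5, 10))   # 3
```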

String literals are contained between matching apostrophes or matching double quotes. Characters inside can be escaped by preceding them by the back-slash character: \n (newline), \t (tab), \\ (back-slash), \r (carriage return), \b (backspace). In addition, a back-slash followed by 1 to 3 octal digits represents the character with the corresponding octal representation in ASCII. (The character \0 is not allowed since it is used as the string terminator character as in the C language.) A back-slash preceding other characters is ignored (e.g., "\w" == "w"). This follows C language conventions.
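
A few of these escapes are shown in the sketch below; the name s is made up for the example. The case of a back-slash before an arbitrary character is left out because other S dialects may treat it differently.

```r
s <- "a\tb\n"            # tab and newline escapes
nchar(s)                 # 4 characters: a, tab, b, newline
"\101" == "A"            # octal 101 is the ASCII code of "A"
'single' == "single"     # either kind of quote delimits a string
```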

Any sequence of characters between matching "%" characters, not including a new line, is recognized as an infix operator.
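
For instance, a user-defined operator can be created by assigning a function of two arguments to a quoted name of this form. The operator %+% below is hypothetical and chosen only for illustration:

```r
# Hypothetical operator: paste two strings together without a separator.
"%+%" <- function(a, b) paste(a, b, sep = "")

"S" %+% "-PLUS"          # "S-PLUS"
```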

An expression whose first character is "!" is executed as a UNIX command with no changes. Use curly braces to avoid having "!" as the first character of an expression when it is to mean "not".

A function is defined by the word function followed by matching parentheses that contain the names of the arguments separated by commas. Default values can be given in the name=default.value form. Use ... to pass an arbitrary number of arguments.

FORMAL DESCRIPTION:



The following infix operators are recognized by the parser. They are listed in decreasing precedence. In the event of ties, evaluation is from left to right.

$                       component selection                     HIGH
@                       slot selection
[  [[                   subscripts, elements
^                       exponentiation
-                       unary minus
:                       sequence operator
%anything%              special operator
*  /                    multiply, divide
+  -  ?                 add, subtract, documentation
<  >  <=  >=  ==  !=    comparison
!                       not
&  |  &&  ||            and, or
~                       formulas
<<-                     permanent assignment to working data
<-  ->  =               assignment                              LOW
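
A few expressions that follow from the table above, with the values they produce noted in the comments:

```r
-2^2         # -4: exponentiation is applied before unary minus
-1:3         # -1 0 1 2 3: unary minus is applied before the sequence operator
1:3 * 2      # 2 4 6: the sequence is formed before the multiplication
1:(3 * 2)    # 1 2 3 4 5 6: parentheses override the default order
```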



Expressions in S-PLUS are typed by the user, parsed, and evaluated. The following rules define the expressions considered legal by the parser, and the mode of the corresponding object.

LITERALS                        MODE
number                          "numeric"
string                          "character"
name                            "name"
comment                         "comment"
complex                         "complex"

Function Definition
function ( formals ) expr       "function"

Calls
expr infix expr                 "call"
expr %anything% expr
unary expr
expr ( arglist )
expr [ arglist ]
expr [[ arglist ]]
expr $ fname

Assignment
expr <- expr                    "<-"
expr _ expr
expr -> expr
expr <<- expr                   "<<-"

Conditional
if( expr ) expr                 "if"
if( expr ) expr else expr

Iteration
for( name in expr ) expr        "for"
repeat expr                     "repeat"
while ( expr ) expr             "while"

Flow
break                           "break"
next                            "next"
return ( expr )                 "return"
( expr )                        "("
{ exprlist }                    "{"



The additional syntactic forms introduced in the above rules are defined as follows:

exprlist: expr
          exprlist ; expr
arglist:  arg
          arglist , arg
formals:  empty
          formal
          formals , formal
arg:      empty
          expr
          fname =
          fname = expr
formal:   name
          ...
          name = expr
fname:    name
          string



Notice that the above rules are rules of syntax, and that there may be additional semantic rules that determine what expressions can be evaluated. In particular, the left-hand-side of assignment expressions is syntactically an expression, but only certain of them, involving subscripts, attributes, and names, are allowable at execution time.

Numeric literals (numbers) are defined by the following rules:

numeric: integer
         float
complex: numeric "i"
         numeric [+\-] numeric "i"
integer:  digit+
exponent: "e" [+\-]? integer
float:  integer exponent
        integer "." digit* exponent?
        "." integer exponent?

For compatibility with R 2.5, a numeric followed by a capital "L" is taken to be an integer, unless it has a fractional part or is too large to be represented as an integer (in which case it is parsed as a floating point numeric). If there is no trailing L, a sequence of digits is considered integer (if not bigger than the largest possible integer) and other forms of numbers are considered floating point.
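
The literal forms can be checked with is.integer and mode, as in the sketch below. It is limited to cases that should behave the same in S-PLUS and R; a bare sequence of digits such as 12 is deliberately left out because the two languages may classify it differently.

```r
is.integer(12L)     # TRUE: the trailing L marks an integer
is.integer(1.5)     # FALSE: a literal with a fractional part is floating point
mode(1.2e3)         # "numeric"
mode(3.9+4.5i)      # "complex"
```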

Names are defined by:

name: (.|letter) (.|letter|digit)*

If you wish to use a variable name that does not fit this pattern, surround it with backquotes, as in `Graduation %` <- 81.2 or log(`2nd Dose`).

The R language uses a very similar syntax. The main difference is that in R the underscore does not mean assignment and may be part of a name. The parse function has a mode argument to parse R code.

EXAMPLES:

1.2e3     # twelve hundred
1.2e-2    # twelve-thousandths
3.9+4.5i  # a complex number
function(...) {
        ...
        for(i in 1:10) {
                ...
                if(test1) next       # go immediately to the next iteration
                                     # if test1 is TRUE
                if(test2) break      # exit the for loop
                                     # if test2 is TRUE
                if(test3) return(x)  # exit the function with the value x
                                     # if test3 is TRUE
        }
        ...
}