Manipulate S-PLUS Datasets with SQL Commands

DESCRIPTION:

This function accesses and manipulates S-PLUS datasets using simple SQL commands, including select, update, insert, and delete.

This function requires the bigdata library section to be loaded.

USAGE:

bd.sql(sql, ...)

REQUIRED ARGUMENTS:

sql
a list or character vector of valid SQL statements.

OPTIONAL ARGUMENTS:

...
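the datasets referenced in the SQL statements. Supply each one either as alias=dataset, where the alias is the name used for the dataset in the SQL, or unnamed. An alias is needed whenever the dataset's name contains '.', which is not permissible in SQL names (for example, fuel=fuel.frame). If no alias is supplied for a dataset, the dataset's own name is used.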

VALUE:

NULL

DETAILS:

When you use this function, be aware of the following limitations and of which statements and operations are supported:

You can use the WHERE clause only on the first referenced data table in a SQL statement.

The following functionality is not implemented: distinct; mathematical functions in set or select, such as abs, round, floor, and so on; natural join; union; merge; between; and subqueries.

The following statements are supported: Select, Insert, Delete, and Update.

EXAMPLES:

######################
##  SELECT COMMAND
######################
# In this example all the data is selected from fuel.frame.
# To access fuel.frame, an alias is used (in this case,
# we call it fuel)
bd.sql("select * from fuel", fuel=fuel.frame)

# In this example two sets of data are retrieved.
# In the first, we do a simple calculation based
# on two columns from the input dataset.  In the
# second we retrieve Weight from the input when
# Type is not Van or Sporty.  The alias here is
# fuelDataSet.
bd.sql(c("select Mileage*Fuel from fuelDataSet",
          "select Weight from fuelDataSet where Type!='Van' and Type!='Sporty'"),
        fuelDataSet=fuel.frame)

# Full outer join by row.  Notice that no alias is
# given for heart.  It is simply referenced as heart.
bd.sql("select * from fuel, heart", fuel=fuel.frame, heart)

# Inner join by row
bd.sql("select * from fuel join heart", fuel=fuel.frame, heart)

# Cross product of all rows in fuel with itself
bd.sql("select * from fuel cross join fuel", fuel=fuel.frame)

# Left outer join of fuel to itself with left key column of Weight
# and right key column of Fuel
bd.sql("select * from leftFuel left outer join rightFuel on leftFuel.Weight=rightFuel.Fuel",
        leftFuel=fuel.frame, rightFuel=fuel.frame)

# Chained joins (the result of the first join is joined again)
bd.sql("select * from leftFuel left outer join rightFuel on leftFuel.Weight=rightFuel.Weight right outer join secondRightFuel on Weight=rightFuel.Weight",
        leftFuel=fuel.frame, rightFuel=fuel.frame, secondRightFuel=fuel.frame)

# Aggregation
bd.sql("select avg(Fuel) from fuel group by Type", fuel=fuel.frame)
bd.sql("select min(Fuel), max(Fuel) from fuel group by Type", fuel=fuel.frame)
bd.sql("select min(Fuel), max(Fuel) from fuel group by Type having avg(Weight) > 3000", fuel=fuel.frame)

# Define column source
bd.sql("select fuel.Fuel, heart.id from fuel, heart", fuel=fuel.frame, heart=heart)
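
# Range filter without "between" (a sketch, not a documented
# recipe).  Because "between" is not implemented, the range is
# written as two comparisons joined by "and".  This assumes
# that "<=" is accepted in the same way as the ">=" used in
# the DELETE example below; the bounds 20 and 25 are arbitrary.
bd.sql("select Weight from fuel where Mileage >= 20 and Mileage <= 25", fuel=fuel.frame)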


######################
##  INSERT COMMAND
######################
# Single row insertion (entire row)
bd.sql("insert into fuel values (1, 1, 1, 1, 'Domestic')", fuel=fuel.frame)

# Single row insertion (selected columns)
bd.sql("insert into fuel (Fuel, Type) values (1, 'Domestic')", fuel=fuel.frame)

# Multiple row insertion (entire row)
bd.sql("insert into fuel values (1, 1, 1, 1, 'Domestic') values (2, 2, 2, 2, 'Domestic')",
        fuel=fuel.frame)

# Multiple row insertion (selected columns)
bd.sql("insert into fuel (Fuel, Type) values (1, 'Domestic') values (2, 'Domestic')",
        fuel=fuel.frame)

######################
##  DELETE COMMAND
######################
# Simple row deletion
bd.sql("delete from fuel where Mileage >= 20 or Type = 'Sporty'", fuel=fuel.frame)

######################
##  UPDATE COMMAND
######################
# Apply transformation to all rows in Fuel column
bd.sql("update fuel set Fuel=Fuel*1.15", fuel=fuel.frame)

# Apply transformation only to rows where Mileage is 18
bd.sql("update fuel set Fuel=Fuel*1.15 where Mileage=18",
        fuel=fuel.frame)

# Apply multiple transformations when Mileage is 18
bd.sql("update fuel set Fuel=Fuel*1.15, Type='Mileage18' where Mileage=18",
        fuel=fuel.frame)