Duplicated rows.

DESCRIPTION:

Determine which rows in a data set are unique.

This function requires the bigdata library section to be loaded.

USAGE:

bd.duplicated(data, columns=NULL,
              name="DUPLICATED", copy=F)

REQUIRED ARGUMENTS:

data
input data set: a bdFrame, data.frame, bdVector or ordinary vector.

OPTIONAL ARGUMENTS:

columns
names or numbers of columns; the output indicates which rows have unique combinations of these columns. If NULL then all columns are used.
name
name for the new column which indicates which rows are unique.
copy
if TRUE, the original columns are retained; otherwise, only the new column is returned.

VALUE:

A bdFrame or data.frame (the former if data is a bdFrame or bdVector). If copy=FALSE, this contains a single column of logical values indicating which rows were unique. If copy=TRUE, this contains the new column in addition to the old columns (whether or not they were included in the columns argument).

DETAILS:

This function is called by duplicated when the argument is a bdVector or bdFrame. The advantage of calling this function directly is that it allows you to keep the original columns (with copy=TRUE) alongside the new indicator column.

SEE ALSO:

EXAMPLES:

x <- bdFrame(a=c(1:2,1:2), b=c(1:3,2))
bd.duplicated(x)
bd.duplicated(x, copy=T)
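# A further sketch, using only the arguments documented above:
# compare rows on column "a" alone, keep the original columns,
# and give the indicator column a different name.
# ("DUP.A" is an arbitrary name chosen for illustration.)
bd.duplicated(x, columns="a", name="DUP.A", copy=T)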