Big Data Object Information.

DESCRIPTION:

Extracts internal information about a bdFrame or bdVector object, including the files used for storing the data, and their sizes.

This function requires the bigdata library section to be loaded.

USAGE:

bd.object.info(obj)

REQUIRED ARGUMENTS:

obj
The object to be analyzed. If it is not a bdVector or bdFrame, an error is generated.

VALUE:

A list with the following elements:

valid: This is TRUE if obj is a valid big data object. If this is FALSE, the rest of the elements of this list are NULL.

columnNum: A bdVector is actually represented as a column within a bdFrame. If obj is a bdVector, this element is the column number of this bdVector, and the elements below such as frameRowBytes contain information about the bdFrame containing this bdVector. If obj is not a bdVector, this element is NULL.

columnElements: If obj is a bdVector, this is the number of elements in the vector. This should be the same as the frameRows value. If obj is not a bdVector, this element is NULL.

columnElementBytes: If obj is a bdVector, this is the number of bytes used to represent each element in the vector. If obj is not a bdVector, this element is NULL.

columnTotalBytes: If obj is a bdVector, this is the total number of bytes used to represent the data in this vector. If obj is not a bdVector, this element is NULL.

frameRows: The number of rows in the bdFrame.

frameColumns: The number of columns in the bdFrame.

frameRowBytes: The number of bytes used to represent each row in the bdFrame.

frameTotalBytes: The total number of bytes used to represent the data in the bdFrame.

cacheDir: The directory containing the files representing this object.

dataFile: The file name of the file containing the data values for this object.

dataFileExists: This is TRUE if the data file exists.

dataFileBytes: The size in bytes of the data file.

metaFile: The file name of the file containing the meta-data for this object.

metaFileExists: This is TRUE if the meta-data file exists.

metaFileBytes: The size in bytes of the meta-data file.

blobFile: The file name of a file used for representing variable-length "blob" data associated with the object. This file is normally empty, except in a few special cases.

blobFileExists: This is TRUE if the blob file exists.

blobFileBytes: The size in bytes of the blob file.

DETAILS:

This function extracts information about the internal files used to represent a bdFrame or bdVector, such as the file names and sizes. If the performance of a big data operation seems too slow, it may be helpful to examine the size of the data files.

This function exposes the fact that a bdVector is actually represented as a column in a bdFrame. If the underlying bdFrame has many columns, manipulating the bdVector may involve reading the entire bdFrame data file. In some cases, it may be more efficient to copy the vector data out of the larger bdFrame before manipulating it. For example, the expression mybdframe[[1]] creates a bdVector that still references the original mybdframe. The alternative bd.select.rows(mybdframe,columns=1)[[1]] first copies the column into a one-column bdFrame and then extracts the bdVector from that copy.

EXAMPLES:

bd.object.info(as.bdFrame(fuel.frame))
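
# A minimal sketch of reading individual elements of the result, and of
# the column-copy technique described in DETAILS. The variable names
# (fuel.bd, info, vec.ref, vec.copy) are illustrative only, and
# library(bigdata) is assumed to be how the bigdata library section is loaded.
library(bigdata)

fuel.bd <- as.bdFrame(fuel.frame)
info <- bd.object.info(fuel.bd)

# Check validity before using the other elements, which are NULL otherwise.
info$valid

# Frame dimensions and storage sizes.
info$frameRows
info$frameColumns
info$frameRowBytes
info$frameTotalBytes

# Location and size of the data file backing the object.
info$cacheDir
info$dataFile
info$dataFileBytes

# A bdVector taken directly from the bdFrame references the original frame.
vec.ref <- fuel.bd[[1]]

# A bdVector taken from a one-column copy references only that copy.
vec.copy <- bd.select.rows(fuel.bd, columns=1)[[1]]

# Compare the underlying frames. Per DETAILS, the copy should report a
# one-column frame, while the reference reports the full frame.
bd.object.info(vec.ref)$frameColumns
bd.object.info(vec.copy)$frameColumns
bd.object.info(vec.ref)$frameRowBytes
bd.object.info(vec.copy)$frameRowBytes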