This function requires the bigdata library section to be loaded.
bd.tally(expr=NULL, reset=F)
"ms":
The current time in milliseconds.
This will be zero the first time
bd.tally is called.
This is reset to zero when the
reset argument is true.
pipelines:
The number of bigdata "pipelines" that have been executed.
Currently each pipeline contains a single "node".
Normally, this element is the
nodes value plus the
dc.nodes value.
errors:
The number of bigdata pipelines that have terminated with an error.
nodes:
The number of normal nodes that have been executed.
Most simple bigdata operations are implemented with normal nodes.
dc.nodes:
The number of "data cache" nodes that have been executed.
These nodes are implemented differently from normal nodes.
splus.scripts:
The number of nodes executed that run Splus scripts.
Functions such as
bd.block.apply run Splus scripts.
sorts:
The number of sorting operations performed.
blocks:
The number of data blocks processed by normal nodes.
in.bytes:
The number of bytes read from file caches.
If a given file cache is read in multiple passes,
this number includes the bytes for only one read pass.
out.bytes:
The number of bytes written to file caches.
scan.bytes:
Some data cache nodes need to scan output data caches to collect statistics.
This value is the number of bytes read by data cache nodes for this purpose.
sort.bytes:
The number of bytes written during sort operations.
This may be much larger than the final sorted data,
since it includes temporary files written while sorting large datasets.
s2b.bytes:
The number of bytes transferred from small data objects to big data objects,
via functions such as
as.bdFrame.
b2s.bytes:
The number of bytes transferred from big data objects to small data objects,
via functions like
bd.coerce.
Note that printing a
bdFrame or
bdVector
extracts part of a big data object, and converts it to a small data object.
copy.bytes:
The number of bytes copied between database directories
when assigning variables to a different database.
If this is excessive, you may be able to reduce it
by avoiding assignments between databases.
This function is used to examine the performance of bigdata operations.
If an expression
EXPR is taking an exceptionally long time,
it may be useful to execute
bd.tally(EXPR)
and review the output value
to see whether the expression is executing a large number of bigdata nodes,
or reading or writing many bytes.
bd.tally(as.bdFrame(fuel.frame))
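A sketch of one way the counters might be inspected, assuming the bigdata library section is attached and that the tally is returned as a named numeric vector containing the components listed above (this sketch is illustrative S-PLUS and is not runnable outside that environment):

```s
# Reset all counters, then evaluate the expression under study
# and capture the tally it produces.
bd.tally(reset=T)
counts <- bd.tally(as.bdFrame(fuel.frame))

# Inspect selected counters described above, e.g. how many
# pipelines ran and how many bytes moved from small to big data.
# (Assumes the tally can be indexed by component name.)
counts["pipelines"]
counts["s2b.bytes"]

# Normally pipelines equals the sum of normal and data cache nodes.
counts["pipelines"] == counts["nodes"] + counts["dc.nodes"]
```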