This function requires the bigdata library section to be loaded.
bd.tally(expr=NULL, reset=F)
"ms":
The current time in milliseconds. This will be zero the first time
bd.tally is called. This is reset to zero when the reset argument is
true.
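
For example, the counters can be zeroed immediately before timing a
single expression (a minimal sketch; indexing the result by element
name assumes the return value is a named vector, and fuel.frame is
just the example dataset used below):

    bd.tally(reset=T)                         # zero all counters, including "ms"
    tally <- bd.tally(as.bdFrame(fuel.frame))
    tally["ms"]                               # milliseconds spent on this expression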
pipelines:
The number of bigdata "pipelines" that have been executed. Currently
each pipeline contains a single "node". Normally, this element equals
the nodes value plus the dc.nodes value.
errors:
The number of bigdata pipelines that have terminated with an error.
nodes:
The number of normal nodes that have been executed. Most simple
bigdata operations are implemented with normal nodes.
dc.nodes:
The number of "data cache" nodes that have been executed. These nodes
are implemented differently from normal nodes.
splus.scripts:
The number of nodes executed that run Splus scripts. Functions such as
bd.block.apply run Splus scripts.
sorts:
The number of sorting operations performed.
blocks:
The number of data blocks processed by normal nodes.
in.bytes:
The number of bytes read from file caches. If a given file cache is
read repeatedly in multiple passes, this number will only include the
bytes for one read pass.
out.bytes:
The number of bytes written to file caches.
scan.bytes:
Some data cache nodes need to scan output data caches to collect
statistics. This value is the number of bytes read by data cache nodes
for this purpose.
sort.bytes:
The number of bytes written during sort operations. This may be much
larger than the final sorted data, since it includes temporary files
written while sorting large datasets.
s2b.bytes:
The number of bytes transferred from small data objects to bigdata
objects, via functions such as as.bdFrame.
b2s.bytes:
The number of bytes transferred from big data objects to small data
objects, via functions such as bd.coerce. Note that printing a bdFrame
or bdVector extracts part of a big data object and converts it to a
small data object.
copy.bytes:
The number of bytes copied between database directories when assigning
variables to a different database. If this is excessive, you may be
able to reduce it by calling .
This function is used to examine the performance of bigdata
operations. If an expression EXPR is taking an exceptional amount of
time, it may be useful to execute bd.tally(EXPR) and review the output
value, to see whether the expression is executing a large number of
bigdata nodes, or reading or writing many bytes.
bd.tally(as.bdFrame(fuel.frame))
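
The relationships among the counters described above can be inspected
directly (a sketch, not output from an actual session; the use of
bd.sort as the timed expression and the named-vector indexing are
assumptions):

    t <- bd.tally(bd.sort(as.bdFrame(fuel.frame)))
    t[c("pipelines", "nodes", "dc.nodes")]      # pipelines should equal nodes + dc.nodes
    t[c("in.bytes", "out.bytes", "sort.bytes")] # bytes moved through file caches and sorts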