Measuring Dervish Performance
In addition to the native Tcl time command, Dervish provides extensions for
measuring system performance. timerStart resets the performance measurements
and starts acquiring them. timerLap takes a "snapshot" of the current
performance measurements and returns them as a Tcl Extension keyed list.
timerLap can be called repeatedly to report the measurements accumulated since
the last timerStart at different points within the code.
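A typical calling pattern can be sketched as follows. timeStages is a
hypothetical helper, not part of Dervish: it starts the timer once and records
a lap after each stage. The sketch assumes it runs inside Dervish, where
timerStart and timerLap exist; the guard at the end skips the demonstration
anywhere else.

```tcl
# Hypothetical helper (not a Dervish command): run each script in the
# caller's scope and collect the timerLap result taken after it.
# Every lap is cumulative, i.e. measured from the single timerStart.
proc timeStages {scripts} {
    set laps {}
    timerStart
    foreach script $scripts {
        uplevel 1 $script
        lappend laps [timerLap]
    }
    return $laps
}

# Demonstration; skipped when the Dervish timer commands are not loaded.
if {[llength [info commands timerLap]] > 0} {
    set laps [timeStages {
        {for {set i 0} {$i < 100000} {incr i} {}}
        {after 500}
    }]
    foreach lap $laps {
        puts $lap
    }
}
```

Because each lap is measured from the same timerStart, the time spent in a
single stage is the difference between two consecutive laps.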
To use the times returned by timerLap correctly, one must understand how the
timing measurements are obtained. The times, returned in seconds, are converted
from the number of clock ticks that an operation takes. Because clock ticks are
granular, they limit the resolution of the reported times. On most UNIX
systems, the clock rate is 100 Hz (see the HZ macro definition in
/usr/include/sys/param.h), so one tick is 10 milliseconds. This coarse clock
rate does not allow the system to resolve short-lived events accurately.
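The effect of tick granularity can be illustrated with a little arithmetic.
The sketch below is an idealized model, not how Dervish computes its times:
reportedSeconds is a hypothetical helper, HZ is assumed to be 100, and each
interval is assumed to start exactly on a tick boundary.

```tcl
# Assumption: HZ = 100, the common value quoted above; check your system.
set hz 100

# Hypothetical helper: seconds reported for an interval of trueUsec
# microseconds when only whole clock ticks are counted.
proc reportedSeconds {trueUsec hz} {
    set tickUsec [expr {1000000 / $hz}]        ;# microseconds per tick
    set ticks [expr {$trueUsec / $tickUsec}]   ;# integer division
    return [expr {double($ticks) / $hz}]
}

foreach usec {4000 13000 250000} {
    puts [format "true %.3f s -> reported %.2f s" \
        [expr {$usec / 1000000.0}] [reportedSeconds $usec $hz]]
}
```

In this model a 4 ms operation is reported as taking no time at all and a
13 ms operation loses 3 ms, while the 250 ms interval is reported exactly.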
With longer durations, the elapsed time and the overall CPU time will be
reasonably accurate. However, the breakdown of the overall CPU time among
- user mode time for the current process
- system mode time for the current process
- user mode time for the current process' children
- system mode time for the current process' children
can still be quite skewed.
The problem is that many operations, split between user mode and system mode,
can be very short, shorter than a single clock tick. Time may not be charged to
a particular mode at all because it falls between clock ticks. Time may also be
charged to the wrong mode, because the charge depends on which mode is current
at a clock tick boundary. Consider the drawing below:
                  clock ticks                           +--+
-|----------|----------|----------|----------|-   where |UU| is user mode
  +-------+--+       +--+--------+-+--------+           +--+
  |UUUUUUU|SS|       |SS|UUUUUUUU|S|UUUUUUUU|           |SS| is system mode
  +-------+--+       +--+--------+-+--------+           +--+
In both cases, system mode is charged for the CPU time even though considerably
more time was spent in user mode.
Unfortunately, these accounting errors will accumulate over time; one cannot
rely on them "balancing" themselves out over the long term.
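The mischarging described above can be reproduced with a small model. This is
a sketch under stated assumptions, not the kernel's actual accounting code:
chargeByTick is a hypothetical procedure that charges each whole tick to the
mode running at the tick boundary, and the segment lengths are made-up values
chosen so that a short system-mode burst straddles every 10 ms tick.

```tcl
# Hypothetical model of tick-based accounting (not Dervish code): at each
# tick boundary the whole tick is charged to the mode running at that
# instant. Modes: U = user, S = system, I = process not running.
# segments is a flat list: mode length mode length ... (microseconds).
proc chargeByTick {segments tickUsec} {
    array set actual  {U 0 S 0}
    array set charged {U 0 S 0}
    set start 0
    set nextTick $tickUsec
    foreach {mode len} $segments {
        set end [expr {$start + $len}]
        if {$mode ne "I"} {
            incr actual($mode) $len
        }
        # Charge every tick boundary that falls inside this segment.
        while {$nextTick < $end} {
            if {$mode ne "I"} {
                incr charged($mode) $tickUsec
            }
            incr nextTick $tickUsec
        }
        set start $end
    }
    return [list $actual(U) $actual(S) $charged(U) $charged(S)]
}

# Mostly user-mode work, but a short system call straddles each 10 ms tick.
set segments {U 9500 S 1000 U 9000 S 1000 U 9000 S 1000}
foreach {trueU trueS chargedU chargedS} [chargeByTick $segments 10000] break
puts "actual:  user $trueU us, system $trueS us"
puts "charged: user $chargedU us, system $chargedS us"
```

In this model the process spends 27500 microseconds in user mode and 3000 in
system mode, yet all three ticks (30000 microseconds) are charged to system
mode and none to user mode.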
One other condition, which should not be encountered often, occurs when the
system time (not the time zone or the Daylight Saving Time state) is changed
between a timerStart and a subsequent timerLap. In that case, the elapsed time
will be inaccurate; the CPU times are not affected.
_______________________________________________________________________________
NAME
timerStart - Reset the timer and start it
SYNOPSIS (Tcl Syntax)
timerStart
ARGUMENTS
(None)
RETURN VALUES
   TCL_OK      Success. The interp result string is empty.
   TCL_ERROR   Failure. The interp result string contains an error message.
_______________________________________________________________________________
DESCRIPTION
The timer is reset and started.
_______________________________________________________________________________
NAME
timerLap - Measure and display current times
SYNOPSIS (Tcl Syntax)
timerLap
ARGUMENTS
(None)
RETURN VALUES
   TCL_OK      Success. The interp result string is a Tcl Extension keyed
               list containing the elapsed and CPU times since the last
               timerStart.
   TCL_ERROR   Failure. The interp result string contains an error message.
_______________________________________________________________________________
DESCRIPTION
Measures the current times without stopping the timer, and returns the elapsed
and CPU times since the last timerStart.
A Tcl Extension keyed list is returned:
{ELAPSED 5.390} {CPU {{OVERALL 0.030} {UTIME 0.010} {STIME 0.010} {CUTIME 0.000} {CSTIME 0.010}}}
The values are in seconds.
The CPU subkeys correspond to the fields returned by the UNIX times() call:
   UTIME    user mode time used by the current process
   STIME    system mode time used by the current process
   CUTIME   user mode time used by all the current process' children
   CSTIME   system mode time used by all the current process' children
The CPU.OVERALL time is the sum of the component CPU times listed above.
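The relationship between CPU.OVERALL and its components can be checked against
the sample result shown above. This is a minimal sketch: klget is a
hypothetical helper written in plain Tcl so that the example runs without the
keyed-list extension. Inside Dervish, the Tcl Extension keyed-list commands
(for example keylget) would normally be used instead.

```tcl
# Hypothetical helper: look up a dotted key (e.g. CPU.UTIME) in a keyed
# list, treating it as nested {key value} pairs.
proc klget {klist path} {
    foreach key [split $path .] {
        set found 0
        foreach pair $klist {
            if {[lindex $pair 0] eq $key} {
                set klist [lindex $pair 1]
                set found 1
                break
            }
        }
        if {!$found} {
            error "key \"$key\" not found"
        }
    }
    return $klist
}

# The sample timerLap result shown above.
set lap {{ELAPSED 5.390} {CPU {{OVERALL 0.030} {UTIME 0.010} {STIME 0.010} {CUTIME 0.000} {CSTIME 0.010}}}}

# Add up the four component CPU times and compare with CPU.OVERALL.
set sum 0.0
foreach part {UTIME STIME CUTIME CSTIME} {
    set sum [expr {$sum + [klget $lap CPU.$part]}]
}
puts [format "components sum to %.3f, OVERALL is %s" \
    $sum [klget $lap CPU.OVERALL]]
```

For the sample, the four components add up to 0.030 seconds, which matches the
reported CPU.OVERALL value.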
Performance values, especially the individual (non-overall) CPU components,
can be inaccurate because of the clock tick granularity described above.