Accessing the processor timer
Attempts to measure very small time intervals are often frustrated by the intermittent background activity that is part of the operating system and by the processing time consumed by the system time routines. One approach to solving this problem is to access the processor timer directly to determine the beginning and ending times of measurement intervals, run the measurements repeatedly, and then filter the results to remove periods when an interrupt intervened.
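The filtering step can be sketched as follows. This is a minimal illustration, not a routine from any system library: the function name filter_interrupted and the rule it applies (keep samples within a chosen factor of the smallest one) are assumptions made for the example, and a real benchmark might prefer a different rejection rule.

```c
#include <stddef.h>

/* Hypothetical helper (not part of any system library): keeps only the
 * samples that are at most `factor` times the smallest sample, on the
 * assumption that the smallest sample ran without an interrupt.
 * Survivors are compacted to the front of the array in their original
 * order. Returns the number of samples kept. */
size_t filter_interrupted(double *samples, size_t n, double factor)
{
    size_t i, kept = 0;
    double min;

    if (n == 0)
        return 0;

    /* find the smallest measurement */
    min = samples[0];
    for (i = 1; i < n; i++)
        if (samples[i] < min)
            min = samples[i];

    /* keep measurements close to the smallest one */
    for (i = 0; i < n; i++)
        if (samples[i] <= min * factor)
            samples[kept++] = samples[i];

    return kept;
}
```

With a factor of 2.0, a run of measurements near 1 microsecond that contains a single 50-microsecond outlier would keep everything except the outlier.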
A trio of library subroutines makes access to the TimeBase registers architecture-independent. The subroutines are as follows:
- read_real_time(): This subroutine obtains the current time from the appropriate source and stores it as two 32-bit values.
- read_wall_time(): This subroutine obtains the raw TimeBase register value from the appropriate source and stores it as two 32-bit values.
- time_base_to_time(): This subroutine ensures that the time values are in seconds and nanoseconds, performing any necessary conversion from the TimeBase format.
The time-acquisition and time-conversion functions are separated in order to minimize the overhead of time acquisition.
#include <stdio.h>
#include <stdlib.h>    /* for exit() */
#include <sys/time.h>

int main(void) {
    timebasestruct_t start, finish;
    int val = 3;
    int w1, w2;
    double time;

    /* get the time before the operation begins */
    read_real_time(&start, TIMEBASE_SZ);

    /* begin code to be timed */
    printf("This is a sample line %d \n", val);
    /* end code to be timed */

    /* get the time after the operation is complete */
    read_real_time(&finish, TIMEBASE_SZ);

    /* call the conversion routines unconditionally, to ensure */
    /* that both values are in seconds and nanoseconds regardless */
    /* of the hardware platform. */
    time_base_to_time(&start, TIMEBASE_SZ);
    time_base_to_time(&finish, TIMEBASE_SZ);

    /* subtract the starting time from the ending time */
    w1 = finish.tb_high - start.tb_high; /* seconds; probably zero */
    w2 = finish.tb_low - start.tb_low;   /* nanoseconds */

    /* if there was a carry from low-order to high-order during */
    /* the measurement, we may have to undo it. */
    if (w2 < 0) {
        w1--;
        w2 += 1000000000;
    }

    /* convert the net elapsed time to floating point microseconds */
    time = ((double) w2)/1000.0;
    if (w1 > 0)
        time += ((double) w1)*1000000.0;

    printf("Time was %9.3f microseconds \n", time);
    exit(0);
}
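The subtraction and carry handling in the sample can be isolated in a small helper, which makes the arithmetic easier to check on its own. The sketch below is an illustration only: struct sec_nsec and elapsed_us are hypothetical names, and the struct merely stands in for a time value that has already been converted to seconds and nanoseconds.

```c
/* Hypothetical type for illustration: a time value already expressed
 * in seconds and nanoseconds, as the conversion routine leaves it. */
struct sec_nsec {
    unsigned int sec;   /* whole seconds */
    unsigned int nsec;  /* nanoseconds, 0 to 999999999 */
};

/* Returns finish - start in floating-point microseconds. */
double elapsed_us(struct sec_nsec start, struct sec_nsec finish)
{
    /* widen to signed 64-bit so the differences can go negative */
    long long secs  = (long long) finish.sec  - (long long) start.sec;
    long long nsecs = (long long) finish.nsec - (long long) start.nsec;

    /* undo a carry from nanoseconds into seconds */
    if (nsecs < 0) {
        secs--;
        nsecs += 1000000000LL;
    }

    return (double) secs * 1000000.0 + (double) nsecs / 1000.0;
}
```

For example, a start of 5 s 900000000 ns and a finish of 6 s 100000000 ns yields 200000.0 microseconds: the nanosecond difference is negative, so one second is borrowed.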
To minimize the overhead of calling and returning from the timer routines, you can experiment with binding the benchmark nonshared (see When to use dynamic linking and static linking).
If this were a real performance benchmark, the code would be measured repeatedly. If a number of consecutive repetitions were timed collectively, an average time for the operation could be calculated, but it might include interrupt handling or other extraneous activity. If a number of repetitions were timed individually, the individual times could be inspected for reasonableness, but the overhead of the timing routines would be included in each measurement. It may be desirable to use both techniques and compare the results. In any case, you would want to consider the purpose of the measurements in choosing the method.
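The two measurement styles might be sketched as follows. To keep the example portable, it uses the POSIX clock_gettime() call as a stand-in timer rather than the subroutines described in this section; that substitution, and the names now_us, sample_op, time_collectively, and time_individually, are assumptions made for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* request clock_gettime() from <time.h> */
#endif
#include <time.h>

/* Stand-in timer (assumption): POSIX monotonic clock, in microseconds. */
static double now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec * 1000000.0 + (double) ts.tv_nsec / 1000.0;
}

/* Hypothetical operation under test: a short busy loop. */
static void sample_op(void)
{
    volatile int sink = 0;
    int i;
    for (i = 0; i < 1000; i++)
        sink += i;
}

/* Time `reps` consecutive repetitions as one interval and return the
 * average per repetition. Timer overhead is paid only once, but any
 * interrupt during the run is folded into the average. */
double time_collectively(void (*op)(void), int reps)
{
    double start, finish;
    int i;

    start = now_us();
    for (i = 0; i < reps; i++)
        op();
    finish = now_us();
    return (finish - start) / reps;
}

/* Time each repetition separately, storing one result per repetition
 * in out_us. Each result includes the overhead of the timer calls,
 * but individual results can be inspected for reasonableness. */
void time_individually(void (*op)(void), int reps, double *out_us)
{
    int i;
    for (i = 0; i < reps; i++) {
        double start = now_us();
        op();
        out_us[i] = now_us() - start;
    }
}
```

Comparing the collective average with the smallest individual time gives a rough sense of how much of each individual measurement is timer overhead and how much of the average is extraneous activity.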