Theory

Based on the observation that most of the work in debugging is in locating where the error first occurs we suggest the following approach to automate the debugging process.

It is assumed that the user has two programs available: An old program A giving the correct answer and a new program B giving the incorrect answer. The task we wish to solve is to determine where the error first occurs in the new program. It is also conceivable that instead of an old program one has access through a file to the correct values of certain variables at different stages of the computation.

We start by defining a comparative break-point. Such a break-point includes a pair of expressions such that involves variables from A and variables from B. We also need a line number in program A and a line number in B, defining where the expressions are to be evaluated. The variables used in must be in scope in line , similarly for and .

Thus given the programs A and B a comparative break-point is defined by the two pairs .

Each time program A executes line the value of is evaluated using the current values of the variables in A. The value of is then stored in a queue. Similarly when program B executes line the value of is evaluated and stored in a separate queue. Whenever both queues are non-empty the first elements of the queues are compared for equality and deleted. Thus the value of and at the k'th time lines and are executed will be compared. This holds true even if the programs should execute asynchronously.

We define two types of comparative break-points: blocking and non-blocking. When a program encounters a blocking break-point the program halts after storing its value to the queue. When both programs have stored their associated values and the comparison has been performed the programs are notified of the outcome. If both programs continue execution, but if they halt. View demonstration of blocking break-point, either or using .

The rationale for this behavior is that if programs A and B are running inside two separate debuggers, control is passed back to the debuggers. Thus the programmer can now perform traditional debugging of the two programs before deciding on whether to terminate or continue the execution.

When a non-blocking break-point is encountered the program does not wait for the result of the comparison, but continues execution as soon as the associated value has been stored in the queue. If a mismatch is found when comparing the values in the queues an error message is output to the user. Thus non-blocking break-points are mainly for the detection of where an error occurs.

The main reason for including non-blocking break-points is speed of execution. A blocking break-point will cause the programs to synchronize at each break-point thus slowing down the overall execution. Also if and are matrix values it may take a significant amount of time to transmit these values and perform the comparison.

Apart from speeding up the execution, non-blocking break-points are slightly more flexible than the blocking ones. If A encounters a blocking break-point fewer times than B, this would cause A to hang indefinitely. Also if two blocking break-points have been defined such that and a deadlock will occur with program A waiting for input at and program B waiting for input at . Both of these situations can be avoided by using non-blocking break-points.

Comparative break-points should preferably be used in such a manner that deadlocks are avoided. It is, however, the responsibility of the user to ensure this.

In the Section on applications we present some examples to motivate where comparative break-points can be of use.

Floating point comparisons