original graph being valid. This pair of vertices forms
the regression predecessor of the original “bad” commit.
Although the main idea of the bisection method is
based on the monotonicity of the function valid, it is
guaranteed that the algorithm finds a regression
predecessor of the “bad” commit even if the function
valid is not monotone.
There are two main drawbacks of git bisect compared
to RPA. First, the bisection algorithm does not
tend to find the latest regression predecessor. Second,
experiments (see the following section) demonstrate
that git bisect evaluates more commits than RPA. The
reason for this behaviour is that RPA prefers shortest
paths, while git bisect prefers vertices with the highest
associated number. To demonstrate the difference,
let us consider a graph with one leaf and two paths
connecting the root with the leaf. If one path is very
short and the other one very long, then RPA prefers
the short path, while git bisect evaluates vertices on the
long one. If a graph contains only one path leading to
an invalid leaf, git bisect evaluates the same vertices
as RPA combined with binary search.
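The two-path example can be made concrete with a small sketch. It assumes, following the git bisect algorithm overview cited in this paper, that git bisect discards the ancestors of the known-good commit and scores each remaining commit with min(X, N - X), where X is the number of its remaining ancestors (including itself) and N is the number of remaining commits. The vertex names, the helper functions and the path lengths are hypothetical and serve illustration only. RPA itself is not implemented here; the sketch only computes the shortest root-to-leaf path that, according to the text above, RPA prefers.

```python
from collections import deque


def two_path_graph(long_len):
    """Child -> parents map: root 'r', leaf 't', a short path via 'a'
    and a long path via 'b1' .. 'b<long_len>' (hypothetical names)."""
    parents = {"r": [], "a": ["r"], "b1": ["r"]}
    for i in range(2, long_len + 1):
        parents[f"b{i}"] = [f"b{i - 1}"]
    parents["t"] = ["a", f"b{long_len}"]
    return parents


def shortest_path(parents, root, leaf):
    """Breadth-first search from the leaf back towards the root."""
    next_towards_leaf = {leaf: None}
    queue = deque([leaf])
    while queue:
        v = queue.popleft()
        if v == root:
            break
        for p in parents[v]:
            if p not in next_towards_leaf:
                next_towards_leaf[p] = v
                queue.append(p)
    path, v = [], root
    while v is not None:
        path.append(v)
        v = next_towards_leaf[v]
    return path


def ancestors(parents, v, excluded):
    """All ancestors of v (including v) that are not in `excluded`."""
    seen, stack = set(), [v]
    while stack:
        u = stack.pop()
        if u in seen or u in excluded:
            continue
        seen.add(u)
        stack.extend(parents[u])
    return seen


def bisect_scores(parents, good, bad):
    """Assumed git bisect scoring: min(X, N - X) per remaining commit."""
    excluded = ancestors(parents, good, set())
    candidates = ancestors(parents, bad, excluded)
    n = len(candidates)
    scores = {}
    for v in candidates:
        x = len(ancestors(parents, v, excluded))
        scores[v] = min(x, n - x)
    return scores


graph = two_path_graph(long_len=8)
print(shortest_path(graph, "r", "t"))
scores = bisect_scores(graph, good="r", bad="t")
print(max(scores, key=scores.get))
```

With a long path of eight vertices, the shortest path is r, a, t, whereas the highest score under the assumed rule falls on b5, in the middle of the long path.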
There is also further related work that deals with
problems similar to ours. Heuristics for automated
culprit finding (Ziftci and Ramavajjala, 2015) are used
to isolate one or more code changes that are
suspected of causing a code failure in a sequence of
project versions. They assume that the codebase is
tested/validated regularly (e.g. after every n commits)
using some test suite. If a bug is detected, they search
for the culprit only among the changes to the codebase
that have been made since the latest run of the
test suite. The individual versions are rated according
to their potential to cause the failure (e.g. versions
with many code changes are rated higher), and versions
with a high rating are tested first. The culprit finding
technique (Ziftci and Ramavajjala, 2015) is efficiently
applicable only for searching in a short-term history,
and it assumes that there is only one culprit.
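The rating idea can be illustrated with a minimal sketch. It is not the implementation of Ziftci and Ramavajjala; it only assumes that each version after the last passing test run has a rating (here, a hypothetical count of changed lines) and that a version is confirmed as the culprit when the test fails on it but passes on its predecessor. The names find_culprit, changed_lines and passes are made up for this example.

```python
def find_culprit(versions, changed_lines, passes):
    """versions: oldest to newest; versions[0] is the last version known
    to pass. Candidates are tried in order of decreasing rating.
    Returns the culprit (or None) and the number of test runs used."""
    cache = {}

    def check(i):
        # Run the (hypothetical) test at most once per version.
        if i not in cache:
            cache[i] = passes(versions[i])
        return cache[i]

    order = sorted(
        range(1, len(versions)),
        key=lambda i: changed_lines[versions[i]],
        reverse=True,
    )
    for i in order:
        # Assumed confirmation rule: fails here, passes one version earlier.
        if not check(i) and check(i - 1):
            return versions[i], len(cache)
    return None, len(cache)


versions = ["v0", "v1", "v2", "v3", "v4", "v5"]
changed_lines = {"v1": 3, "v2": 120, "v3": 5, "v4": 40, "v5": 2}
# Hypothetical ground truth: the failure is introduced in v2.
passes = lambda v: v in {"v0", "v1"}
print(find_culprit(versions, changed_lines, passes))
```

In this toy history the highest-rated version happens to be the culprit, so two test runs suffice. When the rating is misleading, the same loop may need to test most of the candidates.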
Delta debugging (Zeller, 1999) is a methodology
for automating the debugging of programs using a
hypothesis-trial-result loop. Given some code and
a test case that detects a bug in that code, the
delta debugging algorithm can be used to trim the
functions and lines of the code that are not needed
to reproduce the bug. Delta debugging cannot be
used for finding regression points in a VCS. However,
we believe that it can be incorporated into RPA and
improve its performance by reducing the portion of
code that needs to be validated by the function valid.
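The trimming step can be sketched as follows. This is a simplified, complement-only reduction loop in the spirit of the minimizing procedure (ddmin) known from the delta debugging literature, not the exact algorithm of the cited paper. The predicate fails is a hypothetical stand-in for "the test case still reproduces the bug on this subset of lines".

```python
def ddmin(items, fails):
    """Shrink `items` while `fails(items)` stays true.
    Simplified sketch: only complements of chunks are tried."""
    assert fails(items), "the full input must reproduce the bug"
    n = 2  # current number of chunks
    while len(items) >= 2:
        chunk = max(len(items) // n, 1)
        subsets = [items[i:i + chunk] for i in range(0, len(items), chunk)]
        reduced = False
        for i in range(len(subsets)):
            # Drop one chunk and keep the rest.
            complement = [x for j, s in enumerate(subsets) if j != i for x in s]
            if fails(complement):
                items = complement
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(items):
                break  # single elements were tried; nothing more to drop
            n = min(n * 2, len(items))  # refine the granularity
    return items


# Hypothetical example: the bug needs "lines" 3 and 6 to be present.
lines = list(range(8))
print(ddmin(lines, lambda subset: 3 in subset and 6 in subset))
```

In the example, the eight lines are reduced to the two that are required to reproduce the failure. Each call to fails corresponds to one run of the test case, so the cost of the reduction is measured in test runs.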
Regression testing (Agrawal et al., 1993) and
continuous integration testing (Duvall, 2007) are types of
software testing that verify that previously developed
and tested software still performs correctly after
it has been changed or interfaced with other software.
These two techniques are suitable for fixing bugs that
are detected right after they are introduced. However,
if a bug that has been present in a codebase for some
time is detected, e.g. because the coverage of the
tests has been extended, a technique like RPA needs to
be used. That is, RPA and regression testing/continuous
integration testing are mutually orthogonal techniques.
SZZ (Sliwerski et al., 2005; Kim et al., 2006) is
an algorithm for identifying commits in a VCS that
introduced bugs; however, it works in a quite different
setting. It assumes that the bug has already been fixed
and that the commit that fixed the bug is explicitly
known or can be found in a log file. This makes it
possible to identify the particular lines of code that
fixed the bug, and this information is then exploited
while searching for the bug-introducing commit. In our
setting, the bugs are not fixed yet, thus SZZ cannot be used.
In our previous work (Bendík et al., 2016), a structure
similar to RADAG appears. However, that structure
is monotone, and therefore the problem formulated
in (Bendík et al., 2016) substantially differs from
the regression predecessors problem, and the algorithm
presented in that work cannot be used for finding
regression points in RADAGs.
Finally, we relate the regression predecessors problem
to well-known problems from graph theory. The
latest regression point can be found using the
breadth-first search (BFS) algorithm (Jungnickel, 1999).
However, as our goal is to minimize the number of
validity queries, BFS is not suitable because it queries
every vertex. Therefore, we propose a new, specialized algorithm.
4 EXPERIMENTAL RESULTS
We demonstrate the performance of the variants of
RPA on two types of use cases. We first focus on the
problem of finding a regression predecessor of a single
invalid leaf. We then focus on the problem of finding
regression predecessors of a set of invalid leaves. We
also compare the performance of the RPA variants to
that of the git bisect tool (Git bisect documentation,
2018; Git bisect algorithm overview, 2018).
As benchmarks, we use large real open source
projects, taken from the GitHub open source showcases
(Github Showcases, 2018), with at least 8 active
branches or at least 1000 commits. Due to the
size of the projects, it would be intractable to build
and test all commits in these projects. Therefore, we
use those projects from (Github Showcases, 2018)
that employ Travis CI (Travis CI, 2018). Travis CI
is a service used to build and test projects hosted at
GitHub and the results of all tests that were run on