that our validation has missed a behavioral change that was also missed by TASTING. Nevertheless, given that Valgrind recorded all calls correctly, such a miss could only stem from a change that touched nothing but the initial value of a global variable. Since TASTING explicitly includes the local hashes of such variables in the hashes of all functions that access or reference them, we are confident that TASTING would behave correctly even if the validation missed such a change.
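To illustrate this line of argument, the following sketch (simplified Python, not TASTING's actual implementation) shows how folding the local hash of a referenced global variable into a function's fingerprint makes a change to the variable's initial value visible in every function that accesses it:

    import hashlib

    def local_hash(source_fragment: str) -> str:
        # Local hash of a single program entity (simplified here to its source text).
        return hashlib.sha256(source_fragment.encode()).hexdigest()

    def function_fingerprint(function_src: str, referenced_globals: list[str]) -> str:
        # Fold the local hashes of all referenced global variables into the
        # function's own hash, as described above.
        h = hashlib.sha256(local_hash(function_src).encode())
        for declaration in referenced_globals:
            h.update(local_hash(declaration).encode())
        return h.hexdigest()

    # A change that only touches the initial value of the global still changes
    # the fingerprint of the accessing function:
    old = function_fingerprint("int check(int input) { return input > threshold; }",
                               ["int threshold = 10;"])
    new = function_fingerprint("int check(int input) { return input > threshold; }",
                               ["int threshold = 20;"])
    assert old != new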
Regarding the applicability of our approach, we consider the need to enumerate all data sources of each test case a major challenge for the integration into existing test systems. Normally, these dependencies are not as explicit as TASTING requires. However, with file-grained tracing methods, like EKSTAZI (Gligoric et al., 2015), such dependencies can be discovered and integrated at a coarse-grained level. Similarly, we can integrate components that are written in languages without local-hashing support at the file-grained level, as we have done for assembler source code.
The largest obstacle to generalizing our approach to other programming languages is the link function. For our static analysis, we assume that the statically-derivable reference graph is largely equal to the dynamically-observable references. However, if this over-approximation is too imprecise for a given language, it can cause every function to influence every test-case fingerprint, resulting in no end-to-end savings. For example, for a scripting-language interpreter (e.g., Python), the actual call hierarchy is largely driven by the interpreted program, not by the static structure of the interpreter loop. In such cases, our approach would work better at the level of the interpreted language. An alternative would be to combine TASTING with fine-grained function tracing.
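The following hypothetical interpreter fragment (a sketch in Python, not taken from any real interpreter) illustrates why the static reference graph over-approximates in this setting: the dispatch loop statically references every handler, even though a concrete program exercises only a few of them, so a change to any handler would enter every test-case fingerprint that reaches the loop:

    def op_add(stack): stack.append(stack.pop() + stack.pop())
    def op_mul(stack): stack.append(stack.pop() * stack.pop())
    def op_print(stack): print(stack.pop())

    # The dispatch table makes every handler statically reachable from run().
    HANDLERS = {"ADD": op_add, "MUL": op_mul, "PRINT": op_print}

    def run(program, stack):
        # Statically, run() references all handlers through HANDLERS; dynamically,
        # only the handlers named in `program` are ever called.
        for op in program:
            if isinstance(op, int):
                stack.append(op)
            else:
                HANDLERS[op](stack)

    run([2, 3, "ADD", "PRINT"], [])  # op_mul is never called, yet statically linked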
6 RELATED WORK
Regression testing, and, in our case, more specifically RTS, is a topic that has attracted a lot of attention in the last 30 years, as surveyed in several large literature reviews (Biswas et al., 2011; Engström et al., 2010; Yoo and Harman, 2012). Because of the large body of research, we will only give a short roundup of important RTS techniques before we discuss other content-based caching techniques that inspired TASTING.
Regression-test Selection.
Many RTS techniques use a two-step approach: (1) For each test case, they derive the set of covered program entities that are used or validated by the given test. (2) They compare two versions, derive the set of changed program entities, and intersect it with each test's dependencies to select or dismiss that test for re-execution. TASTING differs fundamentally from these methods, since we do not compare two versions but derive a semantic fingerprint from a single version and associate it with the test result.
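Schematically, the two selection strategies can be contrasted as follows (a simplified sketch; the function and parameter names are illustrative and not taken from any tool):

    def select_change_based(coverage: dict[str, set[str]], changed: set[str]) -> set[str]:
        # Classic two-step RTS: re-run every test whose covered entities
        # intersect the set of entities changed between two versions.
        return {test for test, entities in coverage.items() if entities & changed}

    def select_fingerprint_based(fingerprints: dict[str, str], cache: dict[str, str]) -> set[str]:
        # Fingerprint-based reuse: re-run every test whose fingerprint, derived
        # from a single version, has no stored result yet.
        return {test for test, fp in fingerprints.items() if fp not in cache}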
One dimension in which RTS techniques differ is the granularity of the entities used for test-dependency detection. There are techniques that work on the textual level (Vokolos and Frankl, 1997), on the data-flow level (Harrold and Soffa, 1988; Taha et al., 1989), on the statement level (Rothermel and Harrold, 1996), on the function level (Chen et al., 1994), on the method level (Ren et al., 2004), on the class level (Orso et al., 2004), on the module level (Leung and White, 1990), on the file level (Gligoric et al., 2015), or on the level of whole software projects (Elbaum et al., 2014; Gupta et al., 2011). With HyRTS (Zhang, 2018), a method that varies the granularity depending on the change is also available. In general, it has been noted (Gligoric et al., 2015) that a finer granularity results in higher analysis overheads but also in less severe over-approximations. For TASTING, we choose function-level granularity, because calling functions is the technical link between test case and SUT. However, as the local-hash calculation works on the AST, which captures the hierarchical organization of program entities, other granularities are also possible.
Another dimension is the method used to detect dependencies between program entities. This can be achieved either completely statically (Kung et al., 1995; Ren et al., 2004; Rothermel and Harrold, 1996) or by inspecting recorded test-case execution traces (Gligoric et al., 2015; Orso et al., 2004; Chen et al., 1994). While it is easier to argue the soundness of the static methods, dynamic methods result in smaller dependency sets, which reduces the frequency of unnecessary re-executions. In this dimension, TASTING uses a purely static analysis to calculate its link function, but a combination with dynamic trace information should be possible without compromising soundness.
Most similar to TASTING is EKSTAZI (Gligoric et al., 2015), which works on the file level and dynamically traces the files that a given test accesses (e.g., Java .class files). For these files, it calculates a content-based hash and executes those test cases whose accessed files have changed. While EKSTAZI provides a smart-hashing method that hides unnecessary information (e.g., build dates) from the hash function, it only uses those hashes to identify changes on the file level, making it a change-based RTS method. TASTING not only uses a more fine-grained method to include only the relevant information in the hash but also uses the content-based hash to identify test-execution results.
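To make the distinction concrete, the following sketch (simplified; not EKSTAZI's actual implementation, and the normalization rule is hypothetical) shows file-level smart hashing in the spirit of EKSTAZI, where volatile metadata such as build dates is stripped before hashing, and contrasts its change-based use with TASTING's use of fingerprints as keys into a cache of test results:

    import hashlib, re

    def smart_file_hash(path: str) -> str:
        # Content hash of a file with volatile metadata (here: lines that look
        # like embedded build dates) stripped out, so that irrelevant changes
        # do not invalidate tests.
        with open(path, "rb") as f:
            content = f.read()
        normalized = b"\n".join(
            line for line in content.splitlines()
            if not re.match(rb"^\s*//\s*Build-Date:", line)
        )
        return hashlib.sha256(normalized).hexdigest()

    def file_changed(path: str, previous_hashes: dict[str, str]) -> bool:
        # Change-based use: compare against the hashes recorded in the last run.
        return smart_file_hash(path) != previous_hashes.get(path)

    def cached_result(fingerprint: str, result_cache: dict[str, str]):
        # Result-cache use (TASTING-style): the fingerprint itself keys the
        # stored test result; None means the test must be re-executed.
        return result_cache.get(fingerprint)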