Systematic Comparison of Six Open-source Java Call Graph Construction Tools

Judit Jász (https://orcid.org/0000-0001-6176-9401), István Siket (https://orcid.org/0000-0003-4064-1489), Edit Pengő (https://orcid.org/0000-0002-4500-8693), Zoltán Ságodi (https://orcid.org/0000-0001-5828-6265) and Rudolf Ferenc (https://orcid.org/0000-0001-8897-7403)

University of Szeged, Department of Software Engineering, Árpád tér 2., H-6720 Szeged, Hungary
Keywords: Java, Call Graph, Static Analysis, Tool Comparison.
Abstract: Call graphs provide the groundwork for numerous analysis algorithms and tools. However, in practice, their construction may have several ambiguities, especially for object-oriented programming languages like Java. The characteristics of the call graphs, which are influenced by building requirements such as scalability, efficiency, completeness, and precision, can greatly affect the output of the algorithms utilizing them. Therefore, it is important for developers to know a well-defined set of criteria based on which they can choose the most appropriate call graph builder tool for their static analysis applications. In this paper, we studied and compared six static call graph creator tools for Java. Our aim was to identify linguistic and technical properties that might induce differences in the generated call graphs besides the obvious differences caused by the various call graph construction algorithms. We evaluated the tools on multiple real-life open-source Java systems and performed a quantitative and qualitative assessment of the resulting graphs. We have shown how different outputs could be generated by the different tools. By manually analyzing the differences found on larger programs, we also found differences that we did not expect based on our preliminary assumptions.
1 INTRODUCTION
Producing high-quality software is an important re-
quirement of today’s industrial development, so, nat-
urally, there are many tools and methodologies avail-
able to aid quality management. A subset of these
tools are static source code analyzers that help pro-
grammers eliminate flaws and rule violations early
on by automatically analyzing the subject system and
highlighting its potentially erroneous parts. However,
their capabilities can differ significantly depending on
the complexity of the internal representations and al-
gorithms they use.
Call graphs are directed graphs representing con-
trol flow relationships among the methods of a pro-
gram. The nodes of the graph denote the methods,
while an edge from node a to node b indicates that
method a invokes method b. Call graphs can either be
considered static or dynamic depending on whether
they were constructed during static or dynamic analysis.
Static graphs tend to overapproximate the actual call graph, while the quality and precision of dynamic graphs are heavily influenced by the size and quality of the corresponding test suite. In this work
we are considering static call graphs only.
As call graphs are the main building blocks for
modeling interprocedural control and data flow, their
soundness can greatly affect the results of subsequent
analyses. Developers need to carefully consider how
their call graphs are constructed before they incorpo-
rate them into a novel algorithm. In the case of object-
oriented languages, the target of a call often depends
on the runtime behavior of the program, therefore,
a static call graph builder has to make assumptions
about what methods could be called, resulting in pos-
sible imprecisions. Call graph builder algorithms ad-
dressing this challenge have an extensive literature,
including detailed comparisons (Lhoták, 2007), (Tip and Palsberg, 2000), (Grove et al., 1997), (Murphy et al., 1998), (Grove and Chambers, 2001), (Lhoták and Hendren, 2006). However, there are other factors
that influence the structure of a call graph as well, for
example, the handling of different kinds of initializa-
tions or anonymous classes. In this paper, we aimed
to study these factors through the evaluation of six
static call graph builder tools for Java. The following
Research Questions (RQs) guided the direction of our
research:
RQ1: How does the different handling of Java’s
language features affect the resulting call graphs?
RQ2: How different could the call graphs be in
practice?
RQ3: Do we get the same graphs if we ignore the
known differences?
We constructed an example code full of lan-
guage features that we expected to challenge the tools
(available as an online appendix, see Section 4) – and
compared the resulting graphs. We also performed an
evaluation on four real-life Java systems in order to
study the differences on a bigger scale. The results of
the Maven (https://github.com/apache/maven) and ArgoUML (http://argouml.tigris.org/) projects are presented in
this paper, while the results of the other projects are
available as part of the online appendix.
The rest of the paper is organized as follows. Sec-
tion 2 provides a brief background on call graphs and
also discusses the related literature. The six tools we
compare are described in Section 3. We define the
steps of our evaluation process in Section 4 and ana-
lyze the results quantitatively and qualitatively in Sec-
tion 5. Finally, threats to the validity of our results are
examined in Section 6 before we draw our conclu-
sions in Section 7.
2 RELATED WORK
Call graphs are the basis of many software analy-
sis algorithms, such as control flow analysis, pro-
gram slicing, program comprehension, bug predic-
tion, refactoring, bug-finding, verification, security
analysis, and whole-program optimization (Weiser,
1981), (Feng et al., 2014), (Christodorescu and Jha,
2003), (Wagner et al., 1994). The precision and recall of these applications depend largely on the soundness and completeness of the call graphs they use.
Moreover, call graphs can be employed to visual-
ize the high level control flow of the program, thus
helping developers understand how the code works.
There are several studies about dynamic call graph-
based fault detection, like the work of Eichinger et
al. (Eichinger et al., 2008), who created and mined
weighted call graphs to achieve more precise bug lo-
calization. Liu et al. (Liu et al., 2005) constructed be-
havior graphs from dynamic call graphs to find non-
crashing bugs and suspicious code parts with a classi-
fication technique.
Regardless of whether the examined language
is low-level and binary or high-level and object-
oriented, call graph construction can always lead to
some difficulties (Bacon and Sweeney, 1996), (Reif
et al., 2016). A call graph is accurate if it contains
exactly those methods and call edges that might get
utilized during an actual execution of the program.
However, in some cases, these can be hard to calcu-
late. For example, if several call targets are possi-
ble for a given call site, then deeper examination is
needed to determine which ones to connect as pre-
cisely as possible. This examination can be done
in a context-dependent or context-independent man-
ner; naturally, the choice influences the generated call
graph. Context-dependent methods are more accu-
rate in return for greater resource usage. To miti-
gate the resource demands of such methods, the anal-
ysis of the programs often only starts from the main
method or a few entry points instead of starting from
every method. This might result in a less accurate
call graph. To improve the accuracy of context-
independent methods, the following algorithms can
be used for object-oriented languages: Class Hierar-
chy Analysis (CHA) (Dean et al., 1995), Rapid Type
Analysis (RTA) (Bacon and Sweeney, 1996), Hybrid Type Analysis (XTA) (Tip and Palsberg, 2000), and Variable Type Analysis (VTA) (Sundaresan et al., 2000).
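To make the difference concrete, consider the following minimal example (our own illustration): a CHA-based builder connects the call site to every override in the class hierarchy, whereas RTA restricts the targets to the types that are actually instantiated.

    abstract class Shape { abstract double area(); }

    class Circle extends Shape {
        double r = 1.0;
        double area() { return Math.PI * r * r; }
    }

    class Square extends Shape {
        double s = 1.0;
        double area() { return s * s; }
    }

    public class Demo {
        public static void main(String[] args) {
            Shape sh = new Circle();   // only Circle is ever instantiated
            double a = sh.area();      // CHA links this site to both Circle.area() and
                                       // Square.area(); RTA can prune the Square.area()
                                       // edge, since Square is never instantiated
            System.out.println(a);
        }
    }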
Another important question during call graph creation is the handling of library calls (Ali and Lhoták, 2012).
2012). Including library calls not only makes the call
graph bigger, it also requires the analysis of the li-
braries which can be quite resource consuming. How-
ever, the exclusion of library elements may cause in-
accuracies when developers implement library inter-
faces or inherit from library classes. The analysis
of library classes might involve private, inaccessible
methods as well. Reif et al. (Reif et al., 2016) discussed the problem that commonly used call graph builder algorithms, such as CHA and RTA, do not handle libraries separately according to their availability. The algorithm they recommend reduces the number of call edges by 30% compared to other existing implementations. The tools we selected for our comparison represent library calls and library methods at various levels of detail.
As mentioned in Section 1, many comparative
studies are available about call graph creation. Grove
et al. (Grove et al., 1997) implemented a framework
for comparing call graph creation algorithms and as-
sessed the results with regard to precision and per-
formance. Murphy et al. (Murphy et al., 1998) car-
ried out a study similar to ours about the comparison
of five static call graph creators for C. They identi-
fied significant differences in how the tools handled
typical C constructs like macros. Hoogendorp gave
an overview of call graph creation for C++ programs
in his thesis (Hoogendorp, 2010). Antal et al. (Antal
et al., 2018) conducted a comparison on JavaScript
static call graph creator tools. Similarly to our work,
they collected five call graph builders and analyzed
the handling of JavaScript language elements and the
performance as well. As a result, they provided the
characterization of the tools that can help in select-
ing the one, which is most suitable for a given task.
Tip et al. (Tip and Palsberg, 2000) tried to improve
the precision of RTA by introducing a new algorithm.
On average, they reduced the number of methods by
1.6% and the number of edges by 7.2%, which can
be a considerable amount in the case of larger pro-
grams. Lhoták (Lhoták, 2007) compared static call
graphs generated by Soot (Sable Research Group,
2019) and dynamic call graphs created with the help
of the *J (Sable *J, 2019) dynamic analyzer. He built
a framework to compare call graphs, discussed the
challenges of the comparisons, and presented an al-
gorithm to find the causes of the potential differences
in call graphs.
Reif et al. (Reif et al., 2018) dealt with the un-
soundness of Java call graphs. They compared the
call graph creator capabilities of two analyzer tools,
WALA and Soot. They evaluated different configu-
rations of the tools on a small testbed. Their main
goal was to decide whether a tool handles a specific
language element or not, and, unlike our work, they did
not investigate the way it is handled. An assessment
suite for the comparison of different call graph tools
is proposed as well. Our work is similar to their
study, however, we performed an in-depth examina-
tion to identify what differences can occur between
call graph builder tools. The six tools we selected for
our research have various properties and ways of analysis; for example, there are both source- and bytecode-based analyzers, while Reif et al. only analyzed bytecode-based tools. As a result, we provide a full range of
factors that can cause ambiguities in the call graph
creation. There are obvious factors, like the handling
of polymorphism and library calls that were analyzed
before. However, we highlight other, less evident as-
pects as well that have to be considered before using
or developing a call graph builder tool.
3 CALL GRAPH CONSTRUCTION TOOLS
We studied numerous static analyzer tools for Java
to decide whether they could generate or could
be easily modified to generate call graphs. We
searched for widely available, open-source programs
from recent years, which could analyze complex,
real-life Java systems. We discarded many plug-in-
based tools, as they produced only a visual output
(e.g., CallGraph Viewer (CallGraphViewer, 2019)),
while other promising candidates were not robust
enough on larger systems (e.g., Java Call Hierarchy
Printer (Badenski, 2019)). In some cases, the call
graphs had to be extracted directly from the inner rep-
resentation of the analyzer. However, we eliminated
any tool that did not provide enough information
to reconstruct the caller-callee relationships between
compilation units without major development (e.g.,
JavaParser (Danny van Bruggen, Federico Tomassetti,
Nicholas Smith, Cruz Maximilien, 2019)).
The description of the six tools that met our selec-
tion criteria is presented below.
Soot (Sable Research Group, 2019) is a widely used
language manipulation and optimization framework
developed by the Sable Research Group at McGill
University. It supports analysis up to Java 9 and works
on the compiled binaries. Although its latest offi-
cial release was in 2012, the project is still active on
GitHub, from where we acquired the 3.2.0 release,
which was the latest version at the time. Soot has a
built-in call graph creator functionality that can be pa-
rameterized with multiple algorithms. We employed
the CHA algorithm during construction.
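A minimal sketch of how such a CHA-based graph can be requested programmatically (our own illustration, assuming Soot 3.x on the classpath and compiled classes under target/classes; our actual harness differs in its configuration details):

    import java.util.Collections;
    import soot.PackManager;
    import soot.Scene;
    import soot.jimple.toolkits.callgraph.CallGraph;
    import soot.jimple.toolkits.callgraph.Edge;
    import soot.options.Options;

    public class SootChaDemo {
        public static void main(String[] args) {
            // Whole-program mode is required for call graph construction.
            Options.v().set_whole_program(true);
            Options.v().set_process_dir(Collections.singletonList("target/classes"));
            Options.v().set_allow_phantom_refs(true);   // tolerate missing library classes
            Options.v().set_output_format(Options.output_format_none);
            // Select the CHA-based call graph builder.
            Options.v().setPhaseOption("cg.cha", "enabled:true");
            Scene.v().loadNecessaryClasses();
            PackManager.v().runPacks();

            CallGraph cg = Scene.v().getCallGraph();
            for (Edge e : cg) {   // CallGraph is Iterable<Edge>
                System.out.println(e.src().getSignature() + " -> " + e.tgt().getSignature());
            }
        }
    }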
OpenStaticAnalyzer (OSA) (DSE University of
Szeged, 2019) is an open-source, multi-language
static analyzer framework developed by the Univer-
sity of Szeged. It calculates source code metrics,
detects code clones, performs reachability analysis,
and finds coding rule violations in Java, JavaScript,
Python, and C# projects. Besides the recursive
directory-based analysis of the source code, OSA is
also capable of wrapping the build system (Maven or Ant) of the project under examination. This can make the analysis more precise, as generated files will be handled as well. We extracted the call graph of a project by traversing OSA's Abstract Syntax Tree (AST)-like internal representation (an AST represents the syntactic structure of the source code in a hierarchical tree-like form) and collecting every available piece of invocation information.
SPOON (Pawlak et al., 2015) is an open-source,
feature-rich Java analyzer and transformation tool for
research and industrial purposes. It is actively main-
tained, supports Java up to version 9, and while sev-
eral higher-level concepts (e.g., reachability) are not
provided "out of the box", the necessary infrastruc-
ture is accessible for users to develop their own.
SPOON performs a directory analysis of the source
code and builds an AST-like metamodel, which is the
basis for these further analyses and transformations.
Similarly to the above mentioned OSA implementa-
tion, the call information can be obtained by process-
ing the AST-like inner representation of SPOON. The
library is well-documented and provides a visual rep-
resentation of its metamodel, which helped us thor-
oughly study its structure. We used the 7.0.0 version
for our research.
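A condensed sketch of this kind of extraction (our own illustration, assuming SPOON 7.x; our real exporter also handles constructor calls and other invocation kinds):

    import spoon.Launcher;
    import spoon.reflect.CtModel;
    import spoon.reflect.code.CtInvocation;
    import spoon.reflect.declaration.CtMethod;
    import spoon.reflect.visitor.filter.TypeFilter;

    public class SpoonCallDemo {
        public static void main(String[] args) {
            Launcher launcher = new Launcher();
            launcher.addInputResource("src/main/java");   // directory analysis of the sources
            launcher.buildModel();
            CtModel model = launcher.getModel();

            // Collect every method invocation node of the AST-like metamodel.
            for (CtInvocation<?> inv : model.getElements(new TypeFilter<>(CtInvocation.class))) {
                CtMethod<?> caller = inv.getParent(CtMethod.class);
                if (caller != null) {
                    System.out.println(caller.getSignature() + " -> "
                            + inv.getExecutable().getSignature());
                }
            }
        }
    }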
Java Call Graph (JCG) (Georgios Gousios, 2019) is
an Apache BCEL (Apache Commons, 2019) based
utility for constructing static and dynamic call graphs.
It can be considered a small project, as it has only
one major contributor, Georgios Gousios, whose last
commit (at the time of writing) is from October, 2018.
It supports the analysis of Java 8 features and requires
a jar file as an input. A special feature of the analyzer
is the detection of unreachable code. As a result, the
call graph does not include calls from code segments
that are never executed.
WALA (WALA, 2019) is a static and dynamic an-
alyzer for Java bytecode (supporting syntactic ele-
ments up to Java 8) and JavaScript. Originally, it was
developed by the IBM T.J. Watson Research Cen-
ter; now it is actively developed as an open-source
project. Similarly to Soot, it also has a built-in call
graph generation feature with a wide range of graph
building algorithms. We used the ZeroOneContain-
erCFA graph builder for our research, as it performs
the most complex analysis. It provides an approxima-
tion of the Andersen-style pointer analysis (Andersen,
1994) with unlimited object-sensitivity for collection
objects. The generator had to be parametrized with
the entry points, from which the call graphs would
be built. To make the results similar to the results
of the other tools, we treated all non-private, non-
abstract methods as entry points (instead of just the
main methods). For other configuration options, we
used the default settings provided in the documenta-
tion and example source codes.
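The essence of our WALA setup can be sketched as follows (our own illustration; class names such as ClassHierarchyFactory follow the releases current at the time and may differ between WALA versions, and the entry point computation is simplified to the main-method helper):

    import java.io.File;
    import com.ibm.wala.ipa.callgraph.AnalysisCacheImpl;
    import com.ibm.wala.ipa.callgraph.AnalysisOptions;
    import com.ibm.wala.ipa.callgraph.AnalysisScope;
    import com.ibm.wala.ipa.callgraph.CallGraph;
    import com.ibm.wala.ipa.callgraph.CallGraphBuilder;
    import com.ibm.wala.ipa.callgraph.Entrypoint;
    import com.ibm.wala.ipa.callgraph.impl.Util;
    import com.ibm.wala.ipa.cha.ClassHierarchy;
    import com.ibm.wala.ipa.cha.ClassHierarchyFactory;
    import com.ibm.wala.util.config.AnalysisScopeReader;

    public class WalaDemo {
        public static void main(String[] args) throws Exception {
            // Analysis scope: the application jar plus a standard exclusions file.
            AnalysisScope scope = AnalysisScopeReader.makeJavaBinaryAnalysisScope(
                    "app.jar", new File("exclusions.txt"));
            ClassHierarchy cha = ClassHierarchyFactory.make(scope);

            // We used all non-private, non-abstract methods as entry points;
            // the helper below only collects the main methods.
            Iterable<Entrypoint> entrypoints = Util.makeMainEntrypoints(scope, cha);
            AnalysisOptions options = new AnalysisOptions(scope, entrypoints);

            // 0-1-container-CFA: Andersen-style pointer analysis with unlimited
            // object sensitivity for collection objects.
            CallGraphBuilder<?> builder = Util.makeZeroOneContainerCFABuilder(
                    options, new AnalysisCacheImpl(), cha, scope);
            CallGraph cg = builder.makeCallGraph(options, null);
            System.out.println("call graph nodes: " + cg.getNumberOfNodes());
        }
    }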
Eclipse JDT (Eclipse JDT, 2019) is one of the main
components of the Eclipse SDK (Eclipse, 2019). It
provides a built-in Java compiler and a full model
for Java sources. We created a JDT-based plugin for Eclipse Oxygen, which supports code up to Java 10, to extract the call graph from JDT's extensive, AST-like inner representation.
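The core of this extraction can be sketched as follows (our own simplified illustration; the actual plugin runs inside Eclipse, where the project's classpath is supplied to the parser so that bindings resolve reliably):

    import org.eclipse.jdt.core.dom.*;

    public class JdtCallDemo extends ASTVisitor {
        @Override
        public boolean visit(MethodInvocation node) {
            // Binding resolution identifies the declared target of the call site.
            IMethodBinding binding = node.resolveMethodBinding();
            if (binding != null) {
                System.out.println("call to " + binding.getDeclaringClass().getQualifiedName()
                        + "." + binding.getName());
            }
            return true;   // keep visiting nested invocations
        }

        public static void main(String[] args) {
            ASTParser parser = ASTParser.newParser(AST.JLS10);
            parser.setKind(ASTParser.K_COMPILATION_UNIT);
            parser.setResolveBindings(true);                  // required for method bindings
            parser.setEnvironment(null, null, null, true);    // running VM's boot classpath
            parser.setUnitName("A.java");
            parser.setSource("class A { void f() { g(); } void g() { } }".toCharArray());
            CompilationUnit unit = (CompilationUnit) parser.createAST(null);
            unit.accept(new JdtCallDemo());
        }
    }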
4 EVALUATION PROCESS
In Java, methods can be distinguished by fully quali-
fied names, which include the package name, the class
name, the name of the method, and the list of the pa-
rameter types. These methods can be referred to by their
name and by the appropriate parameter types. How-
ever, the nomenclature of some language elements
is not standardized, for example, the naming of the
anonymous classes and methods, or the notations of
lambda expressions. Moreover, it is also possible that
compiler-generated code parts are not present in some
source-code based representations.
In applications, call graphs can only be used effec-
tively if the call dependencies among the nodes solely
cover real dependencies and also include all those that
indicate any data or control dependencies. In order to
compare the soundness and unsoundness of the call
graphs generated by different tools, we need to iden-
tify the corresponding nodes, i.e., the targets of the potential invocations, in multiple graphs. Naturally,
each tool produced a slightly different output. For ex-
ample, OSA and WALA use the standardized naming
convention (see https://docs.oracle.com/en/java/javase/11/docs/specs/jni/intro.html), while others employ their own notation
system. To illustrate, here are two different represen-
tations of the public void foo(String[] str) method:
OSA: foo([Ljava/lang/String;)V
Soot: void foo(java.lang.String[])
Despite the different textual forms, these two rep-
resentations can be matched easily. We had to im-
plement a specific graph loader for each tool to han-
dle the aspects of its method naming convention. A
method name unification algorithm was introduced to
overcome all notational differences. However, two
language features, the anonymous and generic code
elements, needed extra consideration, therefore, the
line information of the methods was also involved in
the method pairing process. We note that line information was not always available or reliable. The development and fine-tuning steps of the method name unification algorithm, and the validation of the pairing mechanism based on it, were discussed in one of our previous articles (Pengő and Ságodi, 2019).
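As an illustration of the purely notational part of the problem, the following simplified sketch (our own code; the published algorithm also copes with anonymous and generic names) rewrites a JVM-style descriptor such as foo([Ljava/lang/String;)V into the source-like form void foo(java.lang.String[]):

    import java.util.ArrayList;
    import java.util.List;

    public class DescriptorDemo {
        // Converts e.g. "foo([Ljava/lang/String;)V" to "void foo(java.lang.String[])".
        static String toSourceForm(String jvmName) {
            String name = jvmName.substring(0, jvmName.indexOf('('));
            String paramPart = jvmName.substring(jvmName.indexOf('(') + 1, jvmName.indexOf(')'));
            String retPart = jvmName.substring(jvmName.indexOf(')') + 1);
            List<String> params = new ArrayList<>();
            int i = 0;
            while (i < paramPart.length()) i = parseType(paramPart, i, params);
            List<String> ret = new ArrayList<>();
            parseType(retPart, 0, ret);
            return ret.get(0) + " " + name + "(" + String.join(", ", params) + ")";
        }

        // Parses one type descriptor starting at position i; returns the next position.
        static int parseType(String s, int i, List<String> out) {
            int dims = 0;
            while (s.charAt(i) == '[') { dims++; i++; }
            String base;
            switch (s.charAt(i)) {
                case 'V': base = "void"; break;     case 'Z': base = "boolean"; break;
                case 'B': base = "byte"; break;     case 'C': base = "char"; break;
                case 'S': base = "short"; break;    case 'I': base = "int"; break;
                case 'J': base = "long"; break;     case 'F': base = "float"; break;
                case 'D': base = "double"; break;
                case 'L':
                    int end = s.indexOf(';', i);
                    base = s.substring(i + 1, end).replace('/', '.');
                    i = end;
                    break;
                default: throw new IllegalArgumentException("bad descriptor: " + s);
            }
            StringBuilder type = new StringBuilder(base);
            for (int d = 0; d < dims; d++) type.append("[]");
            out.add(type.toString());
            return i + 1;
        }

        public static void main(String[] args) {
            System.out.println(toSourceForm("foo([Ljava/lang/String;)V"));
        }
    }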
We performed the graph comparisons on the unified
graphs. We analyzed the kinds of nodes and edges
that were found by each tool.
First, we performed a comparison on a small Java
sample code (454 LOC) to identify how the tools
handle different Java language elements. The code,
through trivial test cases, helps highlight how the
language features impact the generated call graphs.
We tested the handling of polymorphism, reflection,
lambda expressions, etc. with one simple example
each, then manually studied the outputs. Native JNI
calls and callback functionalities are not tested be-
cause their handling is far beyond the capabilities of
ordinary static analyzer tools. After the in-depth ex-
amination of the sample code we conducted an analy-
sis of large, real-life Java projects in order to measure
the impact of various handling procedures on a large
scale.
The source of our tool, which compares the different outputs of the call graph builders, along with the call graph tools used, the example code, and the analyzed programs with the comparison results, is available in our online appendix at http://www.inf.u-szeged.hu/ferenc/papers/StaticJavaCallGraphs.
5 COMPARISON OF CALL GRAPHS
To answer our research questions, in this section we
first characterize the language elements responsible
for the variance of call graphs. We perform a qualita-
tive and quantitative analysis, and finally we classify
the differences of the resulting call graphs manually,
so that we can learn more about the causes of the dif-
ferences.
5.1 Handling of Language Features
In this subsection, we summarize the language elements that are handled differently and, therefore, cause differences in the generated call graphs.
Initializer Methods. The handling of the different
types of initializations is one of the main sources
of differences. Naturally, all of the tools represent
constructor calls. With the exception of JDT, all of
them detect and connect generated default construc-
tors even without the instantiation of an object, and
derived classes’ calls to super constructors are repre-
sented as well. In case of AST-based call graphs, ini-
tializer blocks and constructors have different nodes
in the call graph. Bytecode based call graph builders
represent such nodes as one. The initializer meth-
ods of nested classes also cause discrepancies in the
graphs, because bytecode based tools (Soot, WALA
and JCG) represent a reference to the outer class as an
additional parameter in the parameter list. Obviously,
source code based tools miss this parameter, since it
is not present in the actual code. Both solutions are acceptable and neither lessens the accuracy of the calls, although it makes the node pairing more challenging.
Static initializer blocks are executed when a class
is loaded by the class loader of the Java Virtual Ma-
chine. This is a dynamic process, triggered by dif-
ferent types of usage of a class, therefore, represent-
ing static initializer nodes in a static call graph can be
cumbersome and incidental. All tools represent static
initializer blocks, however, with different details and
call edges. A large part of Soot’s graphs are made up
of these nodes. When a class is used and it has at least
one static field declared, Soot inserts a corresponding
static initializer node.
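To make the three kinds of initializers concrete, consider this small example (our own illustration): source-based tools can give the instance initializer block and the constructor separate nodes, while in bytecode both end up in <init> and the static block becomes <clinit>.

    class Config {
        static int limit;
        int size;

        static {              // compiled into Config.<clinit>()
            limit = load();
        }

        {                     // instance initializer: merged into every <init> in bytecode
            size = limit;
        }

        Config() {            // compiled into Config.<init>()
            System.out.println("size = " + size);
        }

        static int load() { return 42; }   // expected <clinit> -> load() edge

        public static void main(String[] args) {
            new Config();     // first use triggers <clinit>, then <init>
        }
    }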
Polymorphism. Polymorphism is one of the most im-
portant traits of an object-oriented language. It occurs most often when a parent class reference is used to
refer to a child class object. However, polymorphism
can cause inaccuracies in the call graphs, as static an-
alyzers might be unable to decide whether an object
reference is of its declared type or any subtype of its
declared type. So, when a method is invoked, instead
of linking the proper overridden method, most of the
analyzers only link the parent method in the graph.
This problem can be resolved by employing an algo-
rithm that tries to approximate the call target.
Only WALA and Soot use an advanced algo-
rithm, namely a type of points-to analysis, whilst the
other tools rely on simple Name Based Resolution
(NBR) (Tip and Palsberg, 2000). The NBR tools,
such as OSA, SPOON, JCG, and JDT represent poly-
morphic methods with their static type. As there are
many comparative studies about call graph builder al-
gorithms (Lhoták, 2007), (Tip and Palsberg, 2000), (Grove et al., 1997), (Murphy et al., 1998), (Grove and Chambers, 2001), (Lhoták and Hendren, 2006),
the thorough examination of the handling of polymor-
phism is not in the focus of our research. In our cur-
rent evaluation, Soot uses the Class Hierarchy Anal-
ysis (CHA) to resolve the target of the polymorphic
call. CHA makes the assumption that every over-
ridden implementation of methods on a given inher-
itance hierarchy is callable at the call sites. In many
cases, this will clearly result in false positive call-
ing relationships, as we will see in the discussion
of anonymous classes. The ZeroOneContainerCFA
algorithm of WALA is more sophisticated, but the implementations are incorrect in the sense that neither WALA nor Soot resolves method invocations of the default methods of interfaces. If a method is not overridden
in the derived class, JCG generates a copy of the base
method in the derived class. This copy is callable if the static type of the object at the call site makes it possible; however, it does not refer to the original method or to the methods called by the original. So an application that traverses the possible execution paths will miss some potential paths.
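A minimal case of the default-method issue (our own test-style example): the greet() call below resolves to the interface's default implementation, the very kind of edge that the evaluated configurations of WALA and Soot missed.

    interface Greeter {
        default String greet() { return "hello"; }   // default method, not overridden below
    }

    class EnglishGreeter implements Greeter { }

    public class DefaultMethodDemo {
        public static void main(String[] args) {
            Greeter g = new EnglishGreeter();
            System.out.println(g.greet());   // expected edge: main -> Greeter.greet()
        }
    }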
Anonymous Source Code Elements. Anonymous
methods and classes can cause difficulties in the node
pairing process because there is no standard way of
naming. If the tools do not provide valid line infor-
mation, some anonymous methods will remain un-
matched, which results in different call edges as well.
However, our examination revealed that this is not
the most important reason for differences caused by
the anonymous elements. The instantiations of inner
classes and the calls of constructors work well in ev-
ery tool. Although the naming of the inner classes
can be different for each tool, the pairings of the cor-
responding nodes of the graphs are made feasible with
the help of additional line information. Given the properties discussed for polymorphism, it is not surprising that OSA, SPOON, and JCG are not able to link calls to the methods overridden by anonymous classes. The CHA algorithm of Soot can also be problematic: since the methods of anonymous classes are not reachable in many contexts, it is not always correct to consider them part of the hierarchy. In many cases, this can cause false positive call relations in the call graph.
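For example (our own illustration), the run() call below should be linked to the method of the anonymous class; NBR-based tools connect it to Runnable.run() only, while CHA may additionally link every other Runnable implementation in scope.

    public class AnonymousDemo {
        public static void main(String[] args) {
            Runnable task = new Runnable() {   // anonymous class, e.g. AnonymousDemo$1
                @Override
                public void run() { helper(); }
            };
            task.run();   // ideally linked to AnonymousDemo$1.run(), which calls helper()
        }

        static void helper() { System.out.println("work"); }
    }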
Generic Elements. In order to implement generics,
the Java compiler applies type erasure, where it re-
places all type parameters in generic types with their
bounds, or with Object if the type parameters are unbounded. This type erasure is used by the call graph
tools in most cases. Although WALA and Soot are
also using type erasure to specify the target method,
whose definition contains at least one generic param-
eter, these tools propagate the types of the actual pa-
rameters into the called method.
When using JDT, we collect information from the
AST representation of the analyzed program, and we
determine the target of a particular call with method
binding information at the call site. If a generic
method is instantiated with different types, we get
more nodes in the call graph, which represent the
same method. Admittedly, this is a shortcoming of our exporter, since JDT provides all the necessary information with which we could approach the accuracy of either WALA or Soot.
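To illustrate (our own example): after erasure, both call sites below target the single bytecode method first(Ljava/util/List;)Ljava/lang/Object;, while a binding-based source representation may produce a distinct node per instantiation, which is exactly the duplication we described for our JDT exporter.

    import java.util.List;

    public class GenericDemo {
        // After type erasure this is first(List) returning Object in the bytecode.
        static <T> T first(List<T> list) { return list.get(0); }

        public static void main(String[] args) {
            String s = first(List.of("a", "b"));   // instantiated with T = String
            Integer i = first(List.of(1, 2));      // instantiated with T = Integer
            System.out.println(s + " " + i);
        }
    }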
Java 8. Java 8 introduced the concept of functional
interfaces. These are interface classes that contain ex-
actly one abstract method. Lambda expressions and
method references, which are also new features of
Java 8, can be used to reference such functional in-
terfaces. Since lambda expressions cannot, strictly
speaking, be considered methods, their interpretation
in a graph that represents methods as nodes is a bit
cumbersome. Out of the six tools, only WALA cre-
ates dedicated nodes for lambda expressions, while the other tools represent them with the interface they implement. Similarly to lambda expressions, WALA also
handles functional interfaces with specific nodes, to
which the lambda and method reference nodes are
connected. Although it would not be impossible (by tracing the inner calls of the Java libraries), we concluded from our research that call graph builder tools mostly fail to detect, and therefore to represent, which actual methods are called through such invocations.
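A minimal example of the constructs discussed (our own illustration): most of the tools link the r.run() site to Runnable.run() rather than to the lambda body, and the s.get() call through the method reference is typically not connected to hello() either.

    import java.util.function.Supplier;

    public class LambdaDemo {
        static String hello() { return "hello"; }

        public static void main(String[] args) {
            Runnable r = () -> System.out.println("lambda body");   // lambda for a functional interface
            Supplier<String> s = LambdaDemo::hello;                 // method reference

            r.run();                       // which node should this edge point to?
            System.out.println(s.get());   // ideally main -> hello(), via Supplier.get()
        }
    }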
Dynamic Method Calls. The reflection and the
method handle mechanism of Java make it possible
to determine the target of a method invocation dynamically at runtime. On our sample code we tested
whether any of the six tools could determine the tar-
gets of a basic reflection and a method handle call.
Since none of the tools provided a solution for the
handling of these invocations, we are not dealing with
them in the rest of the paper, similarly to native JNI
calls and callback methods.
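The two dynamic invocation mechanisms we probed look like this (our own test-style example); neither call to target() was connected by any of the six tools:

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;
    import java.lang.reflect.Method;

    public class DynamicCallDemo {
        public static void target() { System.out.println("invoked dynamically"); }

        public static void main(String[] args) throws Throwable {
            // Reflection: the target is a runtime string, invisible to static analysis.
            Method m = DynamicCallDemo.class.getMethod("target");
            m.invoke(null);

            // Method handle: the same problem via the java.lang.invoke API.
            MethodHandle mh = MethodHandles.lookup().findStatic(
                    DynamicCallDemo.class, "target", MethodType.methodType(void.class));
            mh.invokeExact();
        }
    }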
Our RQ1 was: “How does the different handling
of Java's language features affect the resulting call
graphs?” The handling of some language elements
causes additional nodes to appear in the call graph.
This is necessary in some cases (e.g. in the case of
default constructors or generated methods), while in
others it is only a technical help for call graph con-
struction (e.g. linking inherited methods). In some
cases there are many potential targets of a method
call. In such situations, different call graph tools can
have a different number of call edges from the given
call site.
5.2 Quantitative Differences of Call Graphs
We summarized the results of our sample code in Ta-
bles 1 and 2. The numbers in the main diagonals of
the tables are the number of methods (e.g. Soot found
114 methods, see Table 1) or calls (e.g. Soot identified
404 invocations, see Table 2) found by the tools, while
the top left cell contains the number of distinct meth-
ods found by the different tools, i.e., it is the number
of methods or calls in the union (the six tools together
found 176 distinct methods and 472 different method
invocations). The number of calls discovered by the
individual tools ranges from 211 to 404. The percent-
ages in the row of a tool show the ratio of its methods
or calls that were found by the other tool presented
in the given column. For example, WALA found 249
calls and 175 of them were found by JDT as well,
which results in 70.28%. Conversely, JDT detected 211 call edges, of which WALA also found 175. However, as JDT has fewer calls than WALA,
the ratio is higher, 82.94%. This means that the table
is not symmetrical.
For an easier visual overview, the percentages
above 80% are colored green, while the percentages
below 60% are red. Tables 1 and 2 show that the re-
sults of OSA and SPOON are well aligned on the ex-
ample code, OSA covers all the methods and edges
of the SPOON’s graph. SPOON connects three addi-
tional library methods into the graph, and this causes
Table 1: Common methods of the sample code.
176 Soot OSA SPOON JCG WALA JDT
Soot 114 76.39% 78.47% 81.25% 82.64% 73.61%
OSA 92.44% 119 100.00% 95.80% 91.60% 94.12%
SPOON 92.62% 97.54% 122 95.90% 91.80% 94.26%
JCG 86.67% 84.44% 86.67% 135 85.19% 81.48%
WALA 93.70% 85.83% 88.19% 90.55% 127 82.68%
JDT 90.60% 95.73% 98.29% 94.02% 89.74% 117
Table 2: Calls of the sample code.
472 Soot OSA SPOON JCG WALA JDT
Soot 404 51.73% 52.48% 53.96% 58.42% 43.56%
OSA 89.70% 233 100.00% 94.85% 89.27% 82.83%
SPOON 87.60% 96.28% 242 92.56% 87.19% 82.64%
JCG 87.20% 88.40% 89.60% 250 86.40% 74.80%
WALA 94.78% 83.63% 84.74% 86.75% 249 70.28%
JDT 83.41% 91.47% 94.79% 88.63% 82.94% 211
Table 3: Methods of the Maven project.
7,567 Soot OSA SPOON JCG WALA JDT
Soot 4,769 33.84% 50.05% 55.19% 43.24% 53.66%
OSA 64.33% 2,509 77.40% 75.69% 57.83% 74.33%
SPOON 63.34% 51.76% 3,748 87.22% 44.18% 82.87%
JCG 68.36% 49.34% 84.91% 3,849 45.31% 87.63%
WALA 92.31% 64.94% 74.55% 78.09% 2,236 84.70%
JDT 61.38% 47.09% 75.58% 82.71% 45.60% 4,239
Table 4: Methods of the ArgoUML project.
28,987 Soot OSA SPOON JCG WALA JDT
Soot 14,905 61.61% 66.51% 69.23% 31.02% 68.26%
OSA 50.61% 18,148 55.11% 56.11% 21.66% 57.95%
SPOON 86.19% 86.97% 11,447 92.93% 35.42% 91.05%
JCG 66.23% 65.28% 68.55% 15,574 26.97% 70.43%
WALA 96.74% 82.27% 85.36% 88.00% 4,783 87.10%
JDT 78.98% 82.35% 82.02% 85.97% 32.48% 12,929
the slight difference.
Having analyzed the sample code, we evaluated
the tools on larger projects as well, namely ArgoUML-0.35.1 (http://argouml-downloads.tigris.org/source/browse/argouml-downloads/trunk/www/argouml-0.35.1/) and Maven-3.6.0 (https://mvnrepository.com/artifact/org.apache.maven/maven-core/3.6.0). ArgoUML is
a UML modeling tool with 180 KLOC, while Maven
is a library tool with 80 KLOC, and older versions
of both are also presented in the Qualitas Corpus
Table 5: Calls of the Maven project.
70,192 Soot OSA SPOON JCG WALA JDT
Soot 63,839 3.52% 6.29% 8.28% 6.00% 6.46%
OSA 47.95% 4,684 64.50% 63.86% 38.32% 61.38%
SPOON 56.24% 42.32% 7,139 85.28% 31.18% 80.25%
JCG 59.76% 33.83% 68.87% 8,840 32.34% 74.07%
WALA 99.92% 46.81% 58.04% 74.55% 3,835 53.17%
JDT 59.97% 41.81% 83.31% 95.22% 29.65% 6,877
Table 6: Calls of the ArgoUML project.
332,806 Soot OSA SPOON JCG WALA JDT
Soot 292,212 8.10% 8.42% 8.77% 3.88% 8.58%
OSA 45.13% 52,441 49.96% 49.88% 13.25% 56.14%
SPOON 80.10% 85.26% 30,730 82.78% 24.03% 89.45%
JCG 63.80% 65.13% 63.33% 40,163 19.01% 71.11%
WALA 98.65% 60.48% 64.27% 66.45% 11,491 56.52%
JDT 71.89% 84.38% 78.79% 81.86% 18.62% 34,888
database (Tempero et al., 2010). Although this paper compares the call graphs of these two programs only, we repeated our evaluation on other projects too. The results of these measurements are also available in our online appendix at http://www.inf.u-szeged.hu/ferenc/papers/StaticJavaCallGraphs.
Tables 3-6 show the differences among the tools. As both examples show, Soot represents many more methods in both graphs. One likely reason for this is the detailed portrayal of static initializer nodes, and the representation of all overridden methods caused by the CHA algorithm. On the opposite side,
WALA contains fewer methods (and thus fewer edges) than all the others, thanks to its more precise pointer analysis, and to the fact that WALA only processes methods reachable from certain entry points. We can also observe (which did not become apparent from our sample code) that OSA's methods often differ from those of the other tools. The reason for this is that OSA's analysis is library based, meaning that it analyzes every library on the analysis path, but can only resolve library references based on names or the actual type, not through the declarations. In the case of polymorphic calls, this results in the creation of nodes that differ in name from the corresponding nodes of the other tools' graphs. Since, in most cases, the called methods do not even have line information, pairing cannot be achieved. Moreover, OSA will not process library functions that other tools bring in by searching the project class path.
In RQ2, we investigate the extent to which call
graphs can differ in practice. As we can see based on
the analyzed programs, we get considerably different
graphs. Although our pairings are not correct in every case, as the earlier OSA example showed, the numbers of edges and nodes clearly display the discrepancy. Even on the small example, one tool defines twice as many edges as another (Table 2), and in the case of the large projects, the numbers of edges differ even more considerably.
5.3 Examination of the Causes of Differences
The main question of this section is whether we could get the same graphs if we eliminated all the known differences of the tools. A given application that is built
on call graphs may demand different call graph prop-
erties. Is it possible to bring the tools’ outputs closer
together in order to really be able to compare the gen-
erated graphs, therefore, assist in finding a suitable
tool or method for a particular call graph based appli-
cation? If we ignore the known differences from the
generated call graphs, do we get the same resulting
graph by each of the tools?
The structure of call graphs is influenced by the following three attributes of a call graph creator tool:
- How are the crucial language elements handled?
- What method is used in processing the data: does it use library analysis or some kind of pointer or reachability analysis?
- How does it deal with dynamic calls that cannot be resolved during static analysis? In other words, what kind of algorithm does it use to make the connected calls more accurate?
In this section, we are going to non-exhaustively
demonstrate how much filtering out a small discrep-
ancy can help in bringing graph results closer to-
gether. In each step, we create a subgraph from the
original with the help of a given filtering mechanism
and compare it to the call graph that was composed
in the previous step (or with the original if it is the
first step). The filtering is applied to the nodes; naturally, the edges connected to those nodes are eliminated as well. Thus, the results of the tools gradually converge.
Table 7: Common calls of the Maven project after eliminating the clinit calls detected only by Soot (the number of investigated methods is 7,140).
69,200 Soot OSA SPOON JCG WALA JDT
Soot 62,847 3.57% 6.39% 8.41% 6.10% 6.56%
OSA 47.95% 4,684 64.50% 63.86% 38.32% 61.38%
SPOON 56.24% 42.32% 7,139 85.28% 31.18% 80.25%
JCG 59.76% 33.84% 68.87% 8,840 32.34% 74.07%
WALA 99.92% 46.81% 58.04% 74.55% 3,836 53.17%
JDT 59.97% 41.81% 83.31% 95.22% 29.65% 6,877
Table 8: Common calls of the ArgoUML project after eliminating the clinit calls detected only by Soot (the number of investigated methods is 28,489).
320,036 Soot OSA SPOON JCG WALA JDT
Soot 279,442 8.47% 8.81% 9.17% 4.06% 8.98%
OSA 45.13% 52,441 49.96% 49.89% 13.25% 56.14%
SPOON 80.10% 85.26% 30,730 82.78% 24.03% 89.45%
JCG 63.80% 65.13% 63.33% 40,163 19.01% 71.11%
WALA 98.65% 60.48% 64.27% 66.45% 11,491 56.52%
JDT 71.89% 84.38% 78.79% 81.86% 18.62% 34,888
Eliminating Differences Caused by Language El-
ements. In Section 5.1 we saw that certain lan-
guage elements can significantly increase the amount
of nodes, and, therefore, edges a graph has, which
can cause large differences. For example, every time
a class with a static member was used, Soot attached
a reference to that class’s static initializer block. For
this reason, many edges became part of the graph that
in reality might not be executed. In the first step, we
decided to filter out the static initializer nodes that
appear in Soot’s graphs in the following way: the
static initializer nodes only have incoming edges in
the graphs of Soot and do not appear in the other
tools’ graphs. Table 7 and 8 show how much closer
the tools outputs get to each other after the elimina-
tion of the large number of static initializer nodes and
their connections. Compared to Tables 5 and 6, the
difference is observable only in the first row. It is ob-
vious, as only Soot’s connections were involved in the
filtering process. In case of Maven, we detracted 992
edges from Soot’s graph this way, 12,770 in the case
of ArgoUML (the number of nodes for Maven is 62,
for ArgoUML it is 498). We concluded that the first
filtering step did not bring Soot that much closer to
the other tools.
Table 9: Common calls of the Maven project after eliminating library calls (the number of investigated methods is 4,216).
7,530 Soot OSA SPOON JCG WALA JDT
Soot 4,139 27.04% 53.52% 61.97% 33.97% 56.8%
OSA 45.23% 2,474 59.78% 58.57% 35.49% 59.58%
SPOON 54.34% 36.29% 4,076 85.35% 28.75% 83.02%
JCG 60.4% 34.12% 81.92% 4,247 31.29% 89.45%
WALA 99.93% 62.4% 83.3% 94.46% 1,407 83.58%
JDT 58.88% 36.91% 84.75% 95.14% 29.45% 3,993
Table 10: Common calls of the ArgoUML project after eliminating library calls (the number of investigated methods is 21,252).
59,215 Soot OSA SPOON JCG WALA JDT
Soot 34,574 42.51% 41.15% 42.44% 16.39% 43.96%
OSA 43.63% 33,685 43.34% 48.78% 12.59% 53.82%
SPOON 87.48% 89.76% 16,264 82.56% 26.94% 91.98%
JCG 64.26% 71.96% 58.8% 22,834 17.76% 74.42%
WALA 100.00% 74.84% 77.29% 71.54% 5,668 71.33%
JDT 78.09% 93.14% 76.86% 87.3% 20.77% 19,463
Eliminating Algorithmic Differences. Often, it is
not because of language elements that graphs become
very different. Examining the nodes of the sample
code, we noticed that the tools handle Java library
calls in various ways. Some tools, like OSA, represent
library calls with less accuracy, as they do not connect
methods called within library methods. Other tools
provide more detailed information about calls outside
the source of the examined project. When we make a
decision about the extent to which a call graph should
handle call relationships between library functions,
we have to consider how important examining the de-
pendencies generated by the execution paths is for
us. In certain cases, these library functions may call
the project's own methods through callback methods,
creating data dependencies. Call graph based appli-
cations may be sensitive to this property. If we elim-
inate the library methods and their edges, the origi-
nally more detailed and less detailed call graphs will
be more comparable.
Tables 9 and 10 show the differences of the graphs
after eliminating library calls. The graphs are more
similar in most cases, but there are exceptions because
we filtered out edges that were detected by both tools,
which reduced the similarity. We can observe that many of the edges filtered out from Soot's graph were not detected by the other tools. This means that Soot
represents the library nodes with more detail in the
graph, which is not always necessary for every use
case.
Table 11: Common calls of the Maven project between methods recognized by all tools (the number of investigated methods is 1,366).
1,567 Soot OSA SPOON JCG WALA JDT
Soot 1,542 65.3% 65.37% 66.8% 59.21% 65.56%
OSA 99.6% 1,011 100.00% 98.91% 84.47% 99.9%
SPOON 97.49% 97.78% 1,034 96.71% 82.59% 97.78%
JCG 100.00% 97.09% 97.09% 1,030 85.44% 97.38%
WALA 100.00% 93.54% 93.54% 96.39% 913 93.87%
JDT 99.70% 99.61% 99.70% 98.92% 84.52% 1,014
Table 12: Common calls of the ArgoUML project between methods recognized by all tools (the number of investigated methods is 3,761).
9,477 Soot OSA SPOON JCG WALA JDT
Soot 9,333 44.36% 44.62% 39.8% 52.81% 42.42%
OSA 98.85% 4,188 99.98% 87.32% 96.35% 95.56%
SPOON 97.56% 98.10% 4,268 86.25% 95.10% 94.07%
JCG 99.73% 98.17% 98.82% 3,725 96.89% 93.32%
WALA 100.00% 81.86% 82.35% 73.22% 4,929 78.19%
JDT 97.8% 98.86% 99.18% 85.87% 95.21% 4,048
Eliminating Processing Differences. Finally, we ex-
amined the differences that come from the different
processing approaches of the tools. We can see a lot
of red numbers in the columns of WALA, because it
ignores many nodes that are taken into account by the
other tools. One possible reason can be that WALA
starts the call graph builder algorithm only from cer-
tain nodes and uses the call information of the reach-
able nodes only. Other tools take all methods into
account, while there are tools that consider only those
methods that are reachable from public methods. To
be able to compare the edges properly, without suffer-
ing from the differences that come from the different
node sets of the graphs, we only kept those nodes (and
the corresponding edges) that were found by all tools.
We know that we lost a lot of edges this way, but thus we are able to better compare the tools and their capabilities, and, at the same time, to see how the different capabilities influence the differences of the graphs. From this experiment, we initially expected that we would find differences among the results of tools using different pointer analysis techniques, while tools using the same algorithms would give much more similar results. As we can see in Tables 11 and 12, the results do not support our assumption because, for example, the results of SPOON, OSA, and JDT differ.
In order to find out the cause of the differences,
we manually examined and classified them. In case of
Maven, 719 out of the 1,567 edges are not found by
at least one tool, which means that 46% of the edges
are not “common”. The ratio for ArgoUML is much
worse, because 6,112 out of the 9,477 edges were not
found by all tools, which is 64%. Soot and WALA apply a different pointer analysis than the other tools,
which explains most of the edges that were only dis-
covered by these two tools (or only discovered by the
other tools). There are 62 edges in the Maven project
and 872 edges in the ArgoUML project that cause dif-
ferences among the graphs, but they cannot be ex-
plained by the various pointer analysis algorithms.
Their examination revealed previously undiscovered
causes. Besides the reasonable differences, we found
faults in the graphs as well, because the tools repre-
sented call edges that correspond to invalid call paths
or execution order.
In the case of SPOON and JDT, one such fault was that the initializations of the static blocks were connected to the init blocks. Besides, JDT and OSA are not able to de-
tect the calls of class member initializations (15 edges
for ArgoUML) while the other tools handle this prop-
erly. However, it is not consistent, because in some
cases, JDT can recognize such edges, but OSA can-
not. Another interesting observation is that SPOON
inserts an extra loop edge among the init methods
when the class has a default constructor (19 and 42
edges for Maven and ArgoUML, respectively).
Another example of the differences is when a class A is imported from a given package but another class A is used with a fully qualified name. SPOON did not distinguish the two classes, although only one of them was referenced. Since both classes contained a method with the same signature as the invoked one, SPOON created two call edges, which led to a mistake.
Besides, for Maven, there were 2 cases when SPOON
created call edges, although the called method had a
different number of parameters than the caller pro-
vided. In rare cases, OSA and SPOON handle overloaded methods (method overloading allows a class to have more than one method with the same name but differing parameter lists) improperly: it may happen that a call is not connected to the correct method (1 and 2 edges for Maven and ArgoUML, respectively).
The handling of super classes is also not consis-
tent among the tools. Soot, WALA and JCG con-
nect super.method calls found in inner classes to
the caller method in 3 cases (ArgoUML), while our
JDT extension left out the super constructor calls 184
times for ArgoUML.
It is interesting that the analyzers working on bytecode (Soot, JCG and WALA) created loops in the call graph that do not exist in the actual code (7 cases in ArgoUML). This happened when a method was overridden but the return types of the methods were dif-
ferent. In this case, the compiler generates a bridge method to handle the different return type, and the generated and the original methods are connected. Since, in source code, methods cannot be distinguished based on return types alone, and such method pairs cannot be written by hand, the source code based analyzers made no distinction between such methods and handled them as one. This tricky compiler solution yielded the loop edges for the bytecode analyzers, because these two "different" nodes were merged into one, and the edge between them became a loop edge.
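This corresponds to covariant return types (our own minimal reconstruction): javac emits a synthetic bridge method Object get() in Child that merely calls Number get(), and merging the two same-named nodes turns that internal call into a self-loop.

    class Parent {
        Object get() { return "parent"; }
    }

    class Child extends Parent {
        @Override
        Number get() { return 42; }   // covariant return type: javac also emits a synthetic
                                      // bridge method "Object get()" that calls this one
    }

    public class BridgeDemo {
        public static void main(String[] args) {
            Parent p = new Child();
            System.out.println(p.get());   // dispatches through the bridge method in bytecode
        }
    }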
We found a special case in which an anonymous
class was implemented in a parameter list of a method
and there was a call from the method of the anony-
mous class. Soot created a call edge from the outer
method instead of the method of the anonymous class.
Finally, we already experienced in our example code that JCG handles inherited methods in a different way, which causes differences in the call graph representation. Most of the differences come from the representation, because JCG represents an inherited method with its own node in the inheriting class, and the invocations refer to this generated node.
The question of RQ3 was whether, by eliminat-
ing all the known differences, we get the same graphs
for the tools or not. The answer is no. As we have
seen, there were many minor differences between the graphs that we, as lay users, had not thought of. Depend-
ing on which features are more important to us in an
application (e.g.: the precision of the control flow in-
formation, or the dependencies defined between the
methods), we must take into account the features of
the call graph tools and choose the most appropriate
for our purposes.
6 THREATS TO VALIDITY
We only collected open-source Java analyzer tools
that either had an appropriate call graph output or
could easily be extended with a call graph generation
functionality. Although we have thoroughly investi-
gated many other tools, we still cannot rule out the
possibility of having missed some which could have
fulfilled our selection requirements. In addition, the
tools have many parameters that influence the construction of call graphs (e.g., different kinds of pointer analysis), but since we focused on the tools instead of the differences of their algorithms, we executed each
tool with only one configuration. We were not look-
ing for an optimal setting, but one that reliably works
on the tested inputs. Our goal was not to compare
the call graph builder algorithms themselves, but to
gather how many different outputs these tools could
generate. We wanted to collect all the potential rea-
sons for the differences of the investigated call graphs.
Moreover, in case of OSA, SPOON and JDT, we
implemented the call graph exporter ourselves, there-
fore, it was possible for us to make mistakes. As previously discussed, errors occurred in the handling of field initializations, for example. Natu-
rally, there may always be an error if the data extrac-
tion is left to the user.
Even though our sample code contains the fea-
tures of Java 8, we did not take into account the configuration XML files and other resources of the analyzed projects.
The examination of runtime annotations was also ig-
nored, because static analyzers represent the calls de-
fined by them as calls to interface methods.
We had to develop a method name unification pro-
cess to handle the different representations provided
by the tools. The implemented node pairing program
was tested on our sample code which contains all ap-
plicable Java 8 features. Because of the anonymous
and generic elements, it was not possible to rely solely
on the methods' names, so we had to include the line information as well. However, the call graph tools did
not always provide reliable line information, there-
fore, not every possible node pair was identified by
the program.
7 CONCLUSION
One of the main pillars of software analysis is call
graph creation, and, although, it might seem straight-
forward, there are countless factors that can influence
the final result. Consequently, there is a wide collec-
tion of literature on call graphs. Many papers study
how to improve the accuracy, completeness, or effec-
tiveness of call graph creation, while others focus on
the comparison (and, mainly, the differences) of ex-
isting approaches. Our goal was to present these dif-
ferences from a practical perspective.
As the basis of our comparison, we examined how six open-source call graph building tools (which, while not an all-encompassing overview of the current state of the art, can be considered a representative sample) perform when analyzing both an artificial example and some larger-scale, open-source projects. Before
our detailed comparison, we investigated how the dif-
ferent calling contexts are represented by the tools.
Already, in this comparison, we have seen that, in
some cases, the tools also differ in their way of pro-
cessing. We evaluated the impact of these differences on the resulting call graphs using four open-source programs. The results of two (Maven and ArgoUML)
were thoroughly discussed in this paper, while the
data of the rest is available in the online appendix. We
tried to select the most diverse inputs possible for the
analysis (e.g., libraries and executables as well). We
have shown that the outputs of the different call graph
creator tools may differ significantly. We have also
shown on Maven and ArgoUML the extent to which
the different factors affected the differences of the call
graphs. With this, we have practically emphasized the
parameters that can significantly determine the result-
ing call graphs.
ACKNOWLEDGEMENTS
Ministry of Human Capacities, Hungary grant 20391-
3/2018/FEKUSTRAT is acknowledged. The project
was supported by the European Union, co-financed
by the European Social Fund (EFOP-3.6.3-VEKOP-
16-2017-00002). This research was supported by the
EU-funded Hungarian national grant GINOP-2.3.2-
15-2016-00037 titled “Internet of Living Things”.
Thanks to Tamás Aladics for the technical background work.
REFERENCES
Ali, K. and Lhoták, O. (2012). Application-only call
graph construction. In Proceedings of the 26th
European Conference on Object-Oriented Program-
ming, ECOOP’12, pages 688–712, Berlin, Heidel-
berg. Springer-Verlag.
Andersen, L. O. (1994). Program Analysis and Specializa-
tion for the C Programming Language. PhD thesis,
University of Copenhagen.
Antal, G., Hegedűs, P., Tóth, Z., Ferenc, R., and Gyimóthy,
T. (2018). Static JavaScript Call Graphs: a Compar-
ative Study. In Proceedings of the 18th IEEE Inter-
national Working Conference on Source Code Analy-
sis and Manipulation, SCAM 2018, pages 177–186.
IEEE.
Apache Commons (2019). Apache BCEL Home Page.
https://commons.apache.org/proper/commons-bcel.
[Online; accessed 2019].
Bacon, D. F. and Sweeney, P. F. (1996). Fast Static Anal-
ysis of C++ Virtual Function Calls. SIGPLAN Not.,
31(10):324–341.
Badenski, P. (2019). Call Hierarchy Printer GitHub Page.
https://github.com/pbadenski/call-hierarchy-printer.
[Online; accessed 2019].
CallGraphViewer (2019). Callgraph viewer home page.
https://marketplace.eclipse.org/content/callgraph-
viewer. [Online; accessed 2019].
Christodorescu, M. and Jha, S. (2003). Static analysis of ex-
ecutables to detect malicious patterns. In Proceedings
of the 12th Conference on USENIX Security Sympo-
sium - Volume 12, SSYM’03, pages 12–12, Berkeley,
CA, USA. USENIX Association.
Danny van Bruggen, Federico Tomassetti, Nicholas Smith,
Cruz Maximilien (2019). JavaParser - for processing
Java code Homepage.
https://javaparser.org/. [Online; accessed 2019].
Dean, J., Grove, D., and Chambers, C. (1995). Optimiza-
tion of Object-Oriented Programs Using Static Class
Hierarchy Analysis. In Tokoro, M. and Pareschi,
R., editors, ECOOP’95 — Object-Oriented Program-
ming, 9th European Conference,
Åarhus, Denmark,
August 7–11, 1995, pages 77–101, Berlin, Heidelberg.
Springer Berlin Heidelberg.
DSE University of Szeged (2019). OpenStaticAnalyzer
GitHub Page.
https://github.com/sed-inf-u-szeged/
OpenStaticAnalyzer. [Online; accessed 2019].
Eclipse (2019). Eclipse home page.
www.eclipse.org/eclipse/. [Online; accessed 2019].
Eclipse JDT (2019). Eclipse jdt home page.
http://www.eclipse.org/jdt/. [Online; accessed 2019].
Eichinger, F., Böhm, K., and Huber, M. (2008). Min-
ing Edge-Weighted Call Graphs to Localise Software
Bugs. In Machine Learning and Knowledge Discov-
ery in Databases, pages 333–348, Berlin, Heidelberg.
Springer Berlin Heidelberg.
Feng, Y., Anand, S., Dillig, I., and Aiken, A. (2014).
Apposcopy: Semantics-based detection of android
malware through static analysis. In Proceedings of
the 22Nd ACM SIGSOFT International Symposium
on Foundations of Software Engineering, FSE 2014,
pages 576–587, New York, NY, USA. ACM.
Georgios Gousios (2019). Java Call Graph GitHub Page.
https://github.com/gousiosg/java-callgraph.
Grove, D. and Chambers, C. (2001). A framework for call
graph construction algorithms. ACM Trans. Program.
Lang. Syst., 23(6):685–746.
Grove, D., DeFouw, G., Dean, J., and Chambers, C.
(1997). Call Graph Construction in Object-oriented
Languages. In Proceedings of the 12th ACM SIG-
PLAN Conference on Object-oriented Programming,
Systems, Languages, and Applications, OOPSLA ’97,
pages 108–124, New York, NY, USA. ACM.
Hoogendorp, H. (2010). Extraction and visual exploration
of call graphs for Large Software Systems. Master’s
thesis, University of Groningen.
Lhoták, O. (2007). Comparing call graphs. In ACM SIG-
PLAN/SIGSOFT Workshop on Program Analysis for
Software Tools and Engineering, pages 37–42.
Lhoták, O. and Hendren, L. (2006). Context-Sensitive
Points-to Analysis: Is It Worth It? In Mycroft, A. and
Zeller, A., editors, Compiler Construction, pages 47–
64, Berlin, Heidelberg. Springer Berlin Heidelberg.
Liu, C., Yan, X., Yu, H., Han, J., and Yu, P. S. (2005). Min-
ing Behavior Graphs for “Backtrace” of Noncrashing
Bugs. In SDM.
Murphy, G. C., Notkin, D., Griswold, W. G., and Lan,
E. S. (1998). An Empirical Study of Static Call
Graph Extractors. ACM Trans. Softw. Eng. Methodol.,
7(2):158–191.
Pawlak, R., Monperrus, M., Petitprez, N., Noguera, C., and
Seinturier, L. (2015). Spoon: A Library for Imple-
menting Analyses and Transformations of Java Source
Code. Software: Practice and Experience, 46:1155–
1179.
Pengő, E. and Ságodi, Z. (2019). A preparation guide for Java call graph comparison: Finding a match for your methods. Acta Cybernetica. In press.
Reif, M., Eichberg, M., Hermann, B., Lerch, J., and Mezini,
M. (2016). Call Graph Construction for Java Li-
braries. In Proceedings of the 2016 24th ACM SIG-
SOFT International Symposium on Foundations of
Software Engineering, FSE 2016, pages 474–486,
New York, NY, USA. ACM.
Reif, M., Kübler, F., Eichberg, M., and Mezini, M. (2018).
Systematic evaluation of the unsoundness of call
graph construction algorithms for java. In Compan-
ion Proceedings for the ISSTA/ECOOP 2018 Work-
shops, ISSTA ’18, pages 107–112, New York, NY,
USA. ACM.
Sable *J (2019). Sable *J Home Page.
http://www.sable.mcgill.ca/starj/. [Online; accessed
2019].
Sable Research Group (2019). Sable/Soot GitHub Page.
https://github.com/Sable/soot. [Online; accessed
2019].
Sundaresan, V., Hendren, L., Razafimahefa, C., Vallée-Rai,
R., Lam, P., Gagnon, E., and Godin, C. (2000). Prac-
tical virtual method call resolution for java. SIGPLAN
Not., 35(10):264–280.
Tempero, E., Anslow, C., Dietrich, J., Han, T., Li, J.,
Lumpe, M., Melton, H., and Noble, J. (2010). Qualitas
corpus: A curated collection of java code for empiri-
cal studies. In 2010 Asia Pacific Software Engineering
Conference (APSEC2010), pages 336–345.
Tip, F. and Palsberg, J. (2000). Scalable propagation-based
call graph construction algorithms. In Proceedings
of the 15th ACM SIGPLAN Conference on Object-
oriented Programming, Systems, Languages, and Ap-
plications, OOPSLA ’00, pages 281–293, New York,
NY, USA. ACM.
Wagner, T. A., Maverick, V., Graham, S. L., and Harrison,
M. A. (1994). Accurate static estimators for program
optimization. SIGPLAN Not., 29(6):85–96.
WALA (2019). WALA Home Page. http://wala.sourceforge.net/wiki/index.php/Main_Page. [Online; accessed 2019].
Weiser, M. (1981). Program slicing. In Proceedings of the
5th International Conference on Software Engineer-
ing, ICSE ’81, pages 439–449, Piscataway, NJ, USA.
IEEE Press.