Investigating Order Information in API-Usage Patterns:
A Benchmark and Empirical Study
Ervina Çergani (1), Sebastian Proksch (2), Sarah Nadi (3) and Mira Mezini (1)
(1) Software Technology Group, Technische Universität Darmstadt, Darmstadt, Germany
(2) Software Evolution and Architecture Lab, University of Zürich, Zürich, Switzerland
(3) Department of Computing Science, University of Alberta, Alberta, Canada
Keywords:
API Usage Pattern Types, Code Repositories, Events Mining, Empirical Evaluation, Benchmark.
Abstract:
Many approaches have been proposed for learning Application Programming Interface (API) usage patterns from code repositories. Depending on the underlying technique, the mined patterns may (1) be strictly sequential, (2) consider partial order between method calls, or (3) not consider order information. Understanding the trade-offs between these pattern types with respect to real code is important in many applications (e.g., code recommendation or misuse detection). In this work, we present a benchmark consisting of an episode mining algorithm that can be configured to learn all three types of patterns mentioned above. Running our benchmark on an existing dataset of 360 C# code repositories, we empirically study the resulting API usage patterns per pattern type. Our results show practical evidence that not only do partial-order patterns represent a generalized superset of sequential-order patterns, but partial-order mining also finds additional patterns missed by sequence mining, which are used by a larger number of developers across code repositories. Additionally, our study empirically quantifies the importance of the order information encoded in sequential-order and partial-order patterns for representing correct co-occurrences of code elements in real code. Furthermore, our benchmark can be used by other researchers to explore additional properties of API patterns.
1 INTRODUCTION
Application Programming Interfaces (APIs) provide
effective means for code reuse. Client developers of an API must be aware of how to correctly use it in order to avoid errors. An API usage pattern encodes a
set of API methods that are frequently used together,
optionally complemented by constraints like the order
in which methods must be called. API patterns are
used as the basis for various applications such as API
documentation generation (Montandon et al., 2013),
automated code completion (Nguyen et al., 2012),
bug or anomaly detection (Wasylkowski et al., 2007),
and code search (Zhong et al., 2009a).
Many techniques have been proposed to learn
three kinds of patterns from code repositories (Robil-
lard et al., 2013): (1) No-order patterns are unordered
sets of frequently used methods (e.g., (Negara et al., 2014; Nguyen et al., 2016)) and encode that calls of methods, say a, b, and c, frequently co-occur in code, but do not include information about the order of the calls. (2) Sequential-order patterns (e.g., (Pradel et al., 2010; Raychev et al., 2014)) additionally encode facts such as that a has to be called before b, and b before c. (3) Partial-order patterns (e.g., (Nguyen et al., 2012)) are modelled as graphs and can encode, e.g., that a must be called first, but that the order in which b and c are called afterwards is irrelevant.
However, so far, we lack systematic studies of the
tradeoffs between the different types of patterns in
representing source code in practice. A comparison of different pattern types with regard to some predefined metrics is challenging, because each approach in the literature uses a different learning technique with configurations specific to its data set (e.g., frequency threshold), a different representation for usage
examples and patterns, and might even be specifically
tied to a particular programming language or input
form (e.g., source code vs. bytecode).
In this paper, we address this challenge and present, to the best of our knowledge, the first empirical
comparison of API pattern types to investigate their
effectiveness in representing API usages in the wild.
The different pattern types we compare consider constraints of a different nature between method calls, and thus understanding what exactly they are able to mine
in a concrete setting constitutes an interesting and relevant subject in many software engineering applications (e.g., code recommendation or misuse detection).
To provide a fair setting, we use a common data set of 360 open-source GitHub C# repositories with over 68M lines of code (Proksch et al., 2016) and adopt an established mining algorithm that can be customized to mine all three types of patterns: episode mining (Achar et al., 2012). Episode mining is a well-known machine learning technique used to discover partially ordered sets of events, called episodes (patterns in our terminology), from an event stream. In our setting, events are method declarations or invocations (cf. Section 3.2). We can mine all three pattern types by adjusting certain parameters of the episode mining algorithm. With this experimental setup in place, we can produce sequential-order, partial-order, and no-order patterns using the same mining algorithm and the same data set. Our experimental setup is publicly available as a benchmark (http://www.st.informatik.tu-darmstadt.de/artifacts/patternTypes/) and can be used by other researchers to perform similar empirical studies.
In this first study, we compare pattern types in terms of three metrics: (1) Expressiveness quantifies the richness of the language corresponding to a pattern type whose grammar rules are the mined patterns. We measure expressiveness as the number of words (i.e., derived sequences of method calls) in the language. This measure indicates how well the mined patterns abstract over the variety of concrete API usages observed in source code. Conceptually, one would expect that patterns with less structure encode a richer language. The question is, though, to what extent the differences in expressiveness between pattern types materialize in the wild. (2) Consistency quantifies the extent to which the words in the language defined by the mined patterns are actually found in the code. This is to judge how truthfully the mined API usage patterns represent the actual API usage constraints implicitly encoded in source code. From a practical perspective, this metric gives us insights about the relevance of the order information encoded in sequential-order and partial-order patterns. (3) Generalizability measures whether the usages a pattern encodes are specific to a single code context or whether they generalize to multiple contexts. In language terminology, this metric indicates whether the learned model is applicable across domains/projects or whether we learn domain-specific languages (models). This is important to understand the applicability of the information encoded in the learned patterns.
The contributions of this paper are as follows:
1. We identify a general episode mining algorithm to fairly compare different pattern types and adapt it to the domain of mining code patterns.
2. We define three metrics on which we base the comparison between the different pattern types:
expressiveness, consistency and generalizability.
3. We perform an empirical study that compares the
three pattern types based on the defined metrics.
The implications we derive from our results help in building better applications based on API usages.
4. We provide a public benchmark that can be used
by other researchers to evaluate additional metrics
for API usage pattern types.
2 RELATED WORK
Here, we present existing API usage mining techniques and representations, and discuss other studies
that have investigated API usages in practice.
2.1 API Usage Representations
API usage representations can be divided into three
types: no-order, sequential-order, and partial-order.
No-Order Patterns. The simplest form of learning API usage patterns is to look at frequent co-occurrences of code elements, while ignoring the order in which these code elements occur. Frequent item-set mining is an example in this category and variations of it have been commonly used (Michail, 2000; Negara et al., 2014; Nguyen et al., 2016).
Sequential-order Patterns. To take code semantics into account, many API usage representations consider order information. For example, calling the constructor of an API type must happen before calling any of its methods. The patterns mined by sequence mining encode a strict sequential order between the code elements in a pattern. Existing approaches are based on, but not limited to, using information from the API's source code (Acharya and Xie, 2009; Wasylkowski et al., 2007), API documentation (Zhong et al., 2009b), program control-flow structure (Ramanathan et al., 2007), and program execution traces (Gabel and Su, 2008; Pradel et al., 2010). Statistical models have also been used to predict the next code element (e.g., a method call), given a current context (e.g., sequences of already seen method calls). Examples include n-gram language models (Raychev et al., 2014) and statistical generative models (Pham et al., 2016). Additionally, after identifying sequences, some techniques rely on clustering to build pattern abstractions (Wang et al., 2013; Buse and Weimer, 2012; Zhong et al., 2009a).
Partial-order Patterns. This pattern type allows more flexibility in representing code semantics, e.g., that code elements b and c must occur after code element a, but that their order (b before or after c) is not relevant. Graph-based techniques like GraLan (Nguyen and Nguyen, 2015), GraPacc (Nguyen et al., 2012), and JSMiner (Nguyen et al., 2014) represent source code as a graph to identify frequent sub-graph patterns. Automata-based techniques, i.e., Finite State Machines (FSMs), represent code as a set of states (e.g., method calls) and a transition function between the states. The framework by (Acharya et al., 2007) extracts API usage patterns directly from client code. This framework is based on FSMs for generating execution traces along different program paths. In their terminology, partial order expresses choices between alternative code elements. In our terminology, a partial-order pattern includes strict and/or unordered pairs of code elements.
2.2 Empirical Studies of API Usages
Researchers have extracted API usages by mining software repositories and have studied the characteristics of these usages or used them in various applications. Usage patterns are explored in (Ma et al., 2006)
from the Java Standard API with an early version of the Qualitas Corpus, which contains 39 open-source Java applications. A study on a larger corpus (5,000
projects) on usages of both core Java and third-party
API libraries is performed in (Qiu et al., 2016). The
diversity of API usages in object-oriented software is
empirically analyzed in (Mendez et al., 2013). In their
context, diversity is defined as the different statically
observable combinations of method calls on the same
project. Multiple dimensions of API usages are explored in (De Roover et al., 2013), such as the scope
of projects and APIs, the metrics of API usages (e.g.,
number of project classes extending API classes), the
API’s metadata, and project versus API-centric views.
The empirical study on API usages presented in (Zhong and Mei, 2018) focuses on how different types of APIs are used. Our work is mainly concerned with API patterns instead of single usages. Most previous work focuses on comparing one learning technique with other learning techniques that mine the same pattern type. For example, the framework presented in (Pradel et al., 2010) is used to evaluate three mining approaches that learn all sequences of API method calls. Instead, we focus on understanding the trade-offs between different pattern types.
The work in (Robillard et al., 2013) provides a more comprehensive survey on API property inference and discusses over 60 techniques developed for mining frequent API usage patterns. Overall, existing
studies focus on different aspects of API usages, but do not analyze the differences between API usage pattern types. Our work fills this gap and investigates the trade-offs between different API usage pattern types in practice with respect to three metrics: expressiveness, consistency, and generalizability.
3 EPISODE MINING FOR API PATTERNS
We briefly overview the episode mining algorithm and then explain how we use it to mine patterns from open-source C# GitHub repositories in three steps: (a) generate an event stream by transforming source code into a stream of events, (b) apply the episode mining algorithm to mine API usage patterns, and (c) filter the resulting partial-order patterns.
3.1 Episode Mining Algorithm
To support the detection of sequential-order, partial-order, and no-order patterns in source code, we use the episode mining algorithm (Achar et al., 2012) for the following reasons. First, it facilitates the comparison of different pattern types, since it provides one configuration parameter for each type. The other option would be to use different learning algorithms, one per pattern type. In this case, ensuring the same baseline for the empirical comparisons would be difficult, since each algorithm might use different configurations and input formats. Second, it is a general-purpose machine learning algorithm that has performed well in other applications: text mining (Achar and Sastry, 2015), positional data (Haase and Brefeld, 2014), and multi-neuronal spike data (Achar et al., 2012). Third, the implementation of the episode mining algorithm (Achar et al., 2012) is publicly available.
The term episode describes a partially ordered set of events. Frequent episodes can be found in an event stream through an Apriori-like algorithm (Agrawal et al., 1993). Such an algorithm exploits principles of dynamic programming to combine already frequent episodes into larger ones (Mannila et al., 1997). The algorithm alternates episode candidate generation and counting phases, so that infrequent episodes are discarded due to the downward closure lemma (Achar et al., 2012). The counting phase tracks the occurrences of episodes in the event stream using Finite State Automata (FSA). More specifically, at the k-th iteration, the algorithm generates all possible episodes with k events by self-joining frequent episodes from the previous iteration
consisting of k − 1 events each. The resulting episodes are episode candidates that need to be verified in the subsequent counting phase. A given episode is frequent if it occurs often enough in the event stream. A user-defined frequency threshold defines the minimum number of occurrences for an episode to be frequent. An entropy threshold determines whether there is sufficient evidence that two events occur in either order or not. All frequent episodes that fulfill the minimum frequency and entropy thresholds are output by the algorithm in a given iteration k, and all infrequent episodes are simply discarded. The next iteration begins with generating episodes of size k + 1. The entropy threshold is specific to partial-order patterns. It has a value between 0 and 1, inclusive. A value of 0 means that no order will be mined, resulting in no-order patterns. A value of 1 means a strict ordering of events, resulting in sequential-order patterns. Values between 0 and 1 result in partial-order patterns, with varying levels of strictness. We mine the three pattern types by adjusting the configuration parameter of the episode mining algorithm: NOC for No-Order Configuration, SOC for Sequential-Order Configuration, and POC for Partial-Order Configuration. More details about the algorithm can be found in the work by Achar et al. (Achar et al., 2012).
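To make the alternation of candidate generation and counting concrete, the following is a minimal, simplified sketch of an Apriori-style episode miner in Python. It is not the implementation of Achar et al. (2012): for brevity it treats episodes as unordered event sets (the NOC case) and omits the FSA-based counting and the entropy handling that distinguish SOC and POC; the function names are illustrative.

from itertools import combinations

def mine_episodes(stream, freq_threshold, max_size):
    # Simplified Apriori-style episode mining over a list of per-method
    # event sequences; episodes are modelled as unordered event sets.
    counts = {}
    for method_events in stream:
        for e in set(method_events):
            counts[e] = counts.get(e, 0) + 1
    frequent = [frozenset([e]) for e, c in counts.items() if c >= freq_threshold]
    all_frequent = list(frequent)
    k = 1
    while frequent and k < max_size:
        # Candidate generation: self-join frequent k-episodes into (k+1)-episodes.
        candidates = {a | b for a, b in combinations(frequent, 2) if len(a | b) == k + 1}
        # Counting phase: keep only candidates that occur often enough.
        frequent = [c for c in candidates
                    if sum(1 for ev in stream if c <= set(ev)) >= freq_threshold]
        all_frequent.extend(frequent)
        k += 1
    return all_frequent

# Three tiny "method bodies" as event sequences; with threshold 2, the miner
# reports {a}, {b}, {c}, {a,b}, {a,c}, {b,c}, and {a,b,c}.
print(mine_episodes([["a", "b", "c"], ["a", "c", "b"], ["a", "b"]], 2, 3))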
3.2 Mining API Usage Patterns
Event Stream Generation. In our context, an event is any method declaration or method invocation. To transform a repository of source code into the stream representation expected by the episode mining algorithm, we iterate over all source files and traverse each Abstract Syntax Tree (AST) depth-first. Whenever we encounter a method declaration or method invocation node in the AST, we emit a corresponding event to the stream. We use a fully-qualified naming scheme for methods to avoid ambiguous references. We handle the two types of nodes we are interested in as follows:
Method Invocation is the fundamental information that represents an API usage, for which we want to learn patterns. While a resolved AST might point to a concrete method declaration, we generalize this reference to the method that originally introduced the signature of the referenced method, i.e., a method that was originally declared in an interface or an abstract base class. The reason is that the original declaration defines the contract that all derived classes should adhere to, according to Liskov's substitution principle (Martin, 2003). Assuming that this principle is universally followed, we can reduce noise in the dataset by storing the original reference.
Method Declarations represent the start of an enclosing method context that groups the contained method calls. We emit two different kinds of events for an encountered method declaration. Super Context: If a method overrides another one, we include a reference to the overridden method, i.e., the encountered method overrides a method in an abstract base class. This serves as context information that might be important for the meaning of a pattern. First Context: Following the same reasoning as for the super context, we include a reference to the method that was declared in the interface that originally introduced the current method signature, which could be further up the type hierarchy of the current class.
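As a small illustration of the depth-first emission described above, the following Python sketch walks a toy AST and emits first-context, super-context, and invocation events; the Node class, its field names, and the event string format are assumptions for illustration, not the AST representation of the actual dataset.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    kind: str                              # "MethodDeclaration" or "MethodInvocation"
    name: str                              # fully-qualified method name
    first_context: Optional[str] = None    # declaration that introduced the signature
    super_context: Optional[str] = None    # directly overridden method, if any
    children: List["Node"] = field(default_factory=list)

def emit_events(node: Node, stream: List[str]) -> None:
    # Depth-first traversal emitting one event per declaration context and invocation.
    if node.kind == "MethodDeclaration":
        if node.first_context:
            stream.append("first:" + node.first_context)
        if node.super_context:
            stream.append("super:" + node.super_context)
    elif node.kind == "MethodInvocation":
        # Invocations are generalized to the original declaration; here we
        # assume `name` already refers to it.
        stream.append("inv:" + node.name)
    for child in node.children:
        emit_events(child, stream)

# A method implementing IDisposable.Dispose that calls two stream methods.
decl = Node("MethodDeclaration", "C.Dispose()",
            first_context="System.IDisposable.Dispose()",
            children=[Node("MethodInvocation", "System.IO.StreamWriter.Flush()"),
                      Node("MethodInvocation", "System.IO.StreamWriter.Close()")])
stream: List[str] = []
emit_events(decl, stream)
print(stream)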
We apply heuristics to optimize the event stream generation. (1) We filter duplicated source code, e.g., projects that include the same source files in multiple solutions or that add their references through nested submodules in the version control system. (2) We ignore auto-generated source code (e.g., UI classes generated from XML templates), since it does not reflect human-written code. (3) We ignore methods of project-specific APIs (i.e., APIs declared within the same project) to avoid learning project-specific patterns. Our goal is to learn general patterns that have the potential to be re-used across contexts. (4) We ignore references in the data set that point to unresolved types or type elements. These cases indicate transformation errors in the original dataset, caused by, for example, an incomplete class path. (5) We do not process empty methods, nor do we include their method declarations in the event stream.
Learning API Usage Patterns. We feed the generated event stream to the episode mining algorithm after fixing the threshold values: frequency and entropy (as evaluated in Section 4.2).
Filtering Partial-order Patterns. While SOC and NOC generate episode candidates that are either sequences or sets of events, respectively, POC might generate episode candidates of all three types, since it contains the sequential and no-order types as special cases. In case all the episode candidates in POC are considered frequent episodes during the counting phase, all of them are output by the algorithm. This implies that in every iteration (i.e., pattern size), POC might output redundant patterns containing the same set of events but differing in the order information. For illustration, assume that POC generates episode candidates in iteration 3 by combining the following patterns from iteration 2: a → b and a → c. The episode candidates in iteration 3 will be: a → b → c and a → c → b as sequences, and a → (b, c) as a partial-order pattern covering all possible orderings between the two newly connected events b and c. The partial-order episode a → (b, c) represents both a → b → c and a → c → b. However, if all three episode candidates turn out to be frequent in the subsequent counting phase, the two sequences will also be carried over to the next iteration. These redundant patterns are meaningless for source code representation, though, and we filter them out in each iteration.
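The following is a minimal sketch of such a redundancy filter, assuming that a pattern is represented as a list of "layers" (events in later layers must follow earlier ones, events within a layer are unordered) and that redundancy is checked by comparing the word sets the patterns admit; this representation is illustrative and simpler than the general partial orders handled by the miner.

from itertools import permutations, product

def words(pattern):
    # All event sequences admitted by a layered pattern, e.g.
    # a -> (b, c) is [["a"], ["b", "c"]] and admits abc and acb.
    layer_orders = [list(permutations(layer)) for layer in pattern]
    return {tuple(e for layer in combo for e in layer) for combo in product(*layer_orders)}

def filter_redundant(patterns):
    # Drop a pattern whose words are strictly contained in another mined pattern,
    # e.g. a -> b -> c and a -> c -> b when a -> (b, c) is also mined.
    return [p for p in patterns
            if not any(q is not p and words(p) < words(q) for q in patterns)]

mined = [[["a"], ["b"], ["c"]],    # a -> b -> c
         [["a"], ["c"], ["b"]],    # a -> c -> b
         [["a"], ["b", "c"]]]      # a -> (b, c)
print(filter_redundant(mined))     # only a -> (b, c) remains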
4 EVALUATION SETUP
This section describes the data set we use, presents
the analyses of the frequency and entropy thresholds
used with the episode mining algorithm, and defines
the metrics for pattern comparison.
4.1 Data Set
We use an established dataset that consists of a curated collection of 2,857 C# solutions extracted from 360 GitHub repositories (Proksch et al., 2016) with a total of 68M lines of source code, covering a wide range of applications and project sizes that provide many examples of API usages. The data set uses a specialized AST-like representation of source code with fully-qualified type references and elements. This relieves us from the burden of compiling the code to get resolved typing information and makes it easier to transform the source code into the event stream (we use the visitors provided with the dataset for this transformation).
We find 138K type declarations in the dataset that extend a base class or implement an interface. These type declarations contain 610K method declarations. Out of these, 50K (first context plus super context) override or implement a method declaration introduced in a dependency. The same dependency can be used in other projects, so focusing on these reusable methods provides valuable context information for the API usage. We find 2M method invocations across all method bodies of the data set.
4.2 Frequency and Entropy Thresholds
The episode mining algorithm uses two thresholds: frequency and entropy. The threshold values directly impact the number of patterns learned: higher threshold values mean stronger evidence in the source code that a given pattern occurs. In this section, we empirically evaluate the effects of the threshold values on the number of patterns learned by the three
configurations (NOC, SOC, POC), and select the ones to use for the empirical evaluations presented in Section 5.

Figure 1: Frequency and entropy threshold analyses. (a) Number of patterns learned by POC for entropy thresholds between 0.0 and 1.0 at frequency levels 200, 210, 220, 230, and 240. (b) Number of patterns learned by SOC (entropy = 1.0), NOC (entropy = 0.0), and POC (entropy = 0.72) for frequency thresholds between 150 and 2,150.
Entropy Threshold. Since this threshold is specific to POC, we first focus on analyzing the number of patterns learned by POC for different entropy and frequency thresholds. Our analyses reveal an increasing number of learned patterns as the entropy threshold grows, at every frequency level. This is expected, since for entropy values near 0.0 the algorithm learns mainly unordered sets of events that abstract over several usages. On the other hand, for entropy values near 1.0 the algorithm learns mainly sequences of events, one for each frequent sequence. For simplicity, Figure 1a shows only a few frequency levels, but similar curves are produced at other frequency levels as well. We observe that for every examined frequency level, POC learns a fairly stable number of patterns in the entropy segment [0.55, 0.75]. A stable number of patterns for different threshold values means that the patterns are not much affected by small fluctuations of the threshold values, making them preferable compared to an unstable set of patterns that is easily affected by small changes in the threshold values. Our data analyses within this segment reveal that the minimal variation in the number of patterns occurs for values of 0.71–0.72. Hence, we use the entropy threshold of 0.72 in the following analysis of the frequency threshold and in our empirical evaluations in Section 5.
Frequency Threshold. Our analyses in Figure 1b show that SOC and POC learn a comparable number of patterns for different frequency values, while NOC learns fewer patterns at every frequency level compared to the other two. This is due to the order information: while SOC and POC may learn multiple patterns for the same set of events, NOC simplifies them to a single pattern. We select a frequency value that gives a good trade-off between the total number of patterns learned per configuration and a comparable number of patterns learned across configurations. Our analyses reveal that this is achieved at a frequency threshold of 345, which we use in the rest of our evaluations. A comparable number of patterns across configurations avoids bias towards one configuration.
4.3 Metrics for Pattern Comparison
We define the following metrics to quantify different
properties of the mined patterns in our experiments.
Expressiveness. Using formal language terminology, an API usage pattern can be seen as a grammar rule of a language over an alphabet of method declarations/invocations (events). The more words the sub-language it defines has, the more expressive a pattern is. A sequential-order pattern (a → b → c), when seen as a grammar rule, defines a language with a single word {abc}. A partial-order pattern (a → (b, c)) defines a language with two words, {abc, acb}. A no-order pattern (a, b, c) defines a language with six words {abc, acb, bac, bca, cab, cba}. The expressiveness of a pattern type is determined by the number of patterns (grammar rules) it defines, and how well these patterns abstract over the variety of concrete API usages observed in source code.
To investigate how the three configurations (SOC, POC, and NOC) compare to each other in terms of expressiveness, we calculate three metrics for each configuration pair (c1, c2): (a) exact(c1, c2) is the number of patterns that are exactly the same in c1 and c2; (b) subsumed(c1, c2) = (x, y) is a pair that represents the number of patterns x learned by c1 that subsume y patterns learned by c2. We say that a pattern p1 subsumes a pattern p2 iff they relate the same set of events and all words defined by p2 are also defined by p1, e.g., the grammar rule of a no-order pattern (a, b, c) subsumes both the grammar rules (a → (b, c)) and (a → b → c) of the partial-order and sequential-order patterns, respectively; (c) new(c1, c2) is the number of patterns learned by c1 that include events for which c2 does not learn any pattern.
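As a small, self-contained illustration of these definitions, the sketch below enumerates the words of a pattern given as a set of must-precede constraints and checks subsumption by word-set containment; the representation is illustrative, not the benchmark's internal one.

from itertools import permutations

def pattern_words(events, before):
    # All words admitted by a pattern; `before` contains pairs (x, y)
    # meaning x must be called before y.
    return {p for p in permutations(events)
            if all(p.index(x) < p.index(y) for x, y in before)}

def subsumes(p1, p2):
    # p1 subsumes p2 iff both relate the same events and every word of p2
    # is also a word of p1.
    return p1[0] == p2[0] and pattern_words(*p2) <= pattern_words(*p1)

events = ("a", "b", "c")
sequential = (events, {("a", "b"), ("b", "c")})   # a -> b -> c: 1 word
partial    = (events, {("a", "b"), ("a", "c")})   # a -> (b, c): 2 words
no_order   = (events, set())                      # (a, b, c):   6 words
print(len(pattern_words(*sequential)), len(pattern_words(*partial)), len(pattern_words(*no_order)))
print(subsumes(no_order, partial), subsumes(partial, sequential))   # True True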
Consistency. The three pattern types differ in the extent to which they preserve code structure. While no-order patterns cannot represent any structure, sequential-order patterns can encode an absolute order of events, and partial-order patterns can even represent complex control flow that is imposed by control structures like if. We establish the consistency metric as a way to quantify how important the order information encoded by sequential-order and partial-order patterns is in practice. The metric takes values in ]0.0, 1.0], and for a given pattern p it is defined as:

consistency(p) = Occs(p) / OccsSet(p)    (1)

where Occs(p) is the number of occurrences of p, and OccsSet(p) is the number of co-occurrences of the events in p regardless of their order. A high consistency emphasizes the importance of the encoded order. A low consistency means that in most cases the respective code elements occur in an order different from the one encoded in the pattern, suggesting that the structural information encoded by the pattern is irrelevant.
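To make the metric concrete, here is a minimal sketch, assuming the same must-precede pattern representation as above and an event stream already split into method contexts; counting by simple containment is a simplification of the FSA-based counting used by the miner.

def occurs_ordered(context, events, before):
    # True if all pattern events occur in `context` respecting the pattern's order.
    if not all(e in context for e in events):
        return False
    pos = {e: context.index(e) for e in events}
    return all(pos[x] < pos[y] for x, y in before)

def consistency(contexts, events, before):
    # Occs(p) / OccsSet(p): ordered occurrences over unordered co-occurrences.
    occs_set = sum(1 for c in contexts if all(e in c for e in events))
    occs = sum(1 for c in contexts if occurs_ordered(c, events, before))
    return occs / occs_set if occs_set else 0.0

# Pattern a -> (b, c) observed in three method contexts: consistency 2/3.
contexts = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
print(consistency(contexts, ["a", "b", "c"], {("a", "b"), ("a", "c")}))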
Generalizability. Finding instances of a pattern in multiple contexts indicates that the pattern represents an abstraction over a set of similar API usages, e.g., used by multiple developers. On the other hand, a very local pattern might suggest that it does not generalize beyond a specific context, e.g., it might only be used by a specific developer. To quantify the generalizability of a pattern, we count the number of contexts in which we can observe it at two different levels of granularity that complement each other: (a) The method declaration level measures whether instances of a pattern are found within a single method declaration (the latter refers to the highest declaration in the type hierarchy that originally introduced the current method signature) or across method declarations (method-specific versus cross-method pattern). (b) The code repository level measures whether instances of a pattern are found in one or in multiple repositories (repository-specific versus cross-repository pattern). Knowledge about the generalizability of patterns is important for judging the versatility of the pattern in later applications.
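A simple way to compute the two generalizability levels is sketched below; the (repository, enclosing method) pairs per pattern occurrence are an illustrative representation of what the benchmark tracks, not its actual data structures.

def generalizability(occurrences):
    # `occurrences` holds one (repository, enclosing_method) pair per pattern instance.
    methods = {m for _, m in occurrences}
    repositories = {r for r, _ in occurrences}
    return {
        "cross-method": len(methods) > 1,          # method-declaration level
        "cross-repository": len(repositories) > 1, # code-repository level
    }

occ = [("repoA", "System.IDisposable.Dispose()"),
       ("repoA", "System.Collections.IEnumerator.MoveNext()"),
       ("repoB", "System.IDisposable.Dispose()")]
print(generalizability(occ))   # {'cross-method': True, 'cross-repository': True}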
5 STUDY RESULTS
This section presents the results of our empirical
study. All experiments are performed with a frequency threshold of 345. For POC, we use an entropy
threshold of 0.72 (cf. Section 4.2). First, we show statistics about the learned patterns, and then study them along the dimensions presented in Section 4.3.
5.1 Pattern Statistics
Here, we analyze the learned patterns in terms of their size and the number of API types they encode.
Pattern size refers to the number of events in a pattern. Our approach learns patterns with up to 7 events in each configuration. The number of learned patterns decreases for larger pattern sizes at the same rate in each configuration. Almost all mined patterns (97%) involve 5 events or less. This result matches the intuition that it is less probable that many developers write large code snippets in exactly the same way.
API types within a pattern reflects the number of API types a pattern encodes interactions for. Across all learned patterns, 75% involve interactions between events from multiple API types (across configurations). Only 28% of the patterns with 2-4 events involve interactions between events from a single API type. All patterns with 5 or more events involve multiple API types. The maximum number of API types involved within a pattern is 5, and patterns involving two API types make up the majority (40%).
5.2 Expressiveness
Table 1 shows the expressiveness metric results. For each configuration pair (c1, c2), Total shows the total number of patterns learned by c1 and c2, respectively.

Table 1: Expressiveness results per configuration pair.

           (POC, SOC)      (NOC, POC)     (NOC, SOC)
exact      858             248            0
subsumed   (260;346)       (716;986)      (853;1,204)
new        116             17             128
Total      (1,234;1,204)   (981;1,234)    (981;1,204)
POC vs. SOC. These configurations learn 858 equal patterns, which implies that out of the 1,234 patterns learned by POC, 70% are sequences and only 30% include a partial order between events.
Observation 5.1
Most of the API usage patterns found in the wild define a strict order between events (70%), while the other 30% abstract over different API usage variants.
Furthermore, subsumed(POC, SOC) is (260;346), i.e., 260 partial-order patterns learned by POC subsume 346 sequences learned by SOC. The 260 partial-order patterns encode 572 different sequences, i.e., the 346 sequences mined by SOC plus 226 others. Recall that multiple sequential-order patterns can be represented by a single partial-order pattern.
Finally, new(POC, SOC) is 116, meaning that for the events included in 116 partial-order patterns, there are no sequences learned by SOC. The 116 partial-order patterns encode 308 sequences of events that individually do not occur often enough in source code.
For this reason, SOC does not mine them. On the other hand, POC represents different variants of sequences for the same set of events in a single pattern, which increases the partial-order pattern's occurrence count and makes it match the frequency threshold.
From these results, we can conclude that the patterns learned by POC represent a superset of the patterns learned by SOC.
Observation 5.2
The API usage specifications encoded by partial-order patterns fully represent the specifications encoded by sequential-order patterns. Furthermore, partial-order mining learns 116 additional patterns over events for which sequence mining cannot learn any sequence.
NOC vs. POC. As shown in Table 1, exact(NOC, POC) = 248, which means that 20% of the patterns learned by POC are exactly the same as the ones learned by NOC. Recall that no-order patterns are mined in POC when the involved events occur often enough in either order.

Observation 5.3
In 20% of the cases, partial-order patterns encode events that occur in either order in the wild.

Furthermore, subsumed(NOC, POC) is (716;986), i.e., 716 no-order patterns learned by NOC subsume 986 patterns learned by POC. Note that a single no-order pattern simplifies several partial-order patterns by dropping the order information.
Finally, new(NOC, POC) is 17, i.e., 17 patterns learned by NOC include events for which POC does not learn any pattern. These patterns are missed by POC because either (a) none of the sequences over the events occurs frequently enough (recall that sequences are a special case of partial-order patterns), and/or (b) there is not enough evidence in the source code that the events occur frequently enough in either order (as specified by the entropy threshold).
From these results, we can conclude that no-order patterns represent a superset of partial-order patterns.
NOC vs. SOC. Table 1 shows that NOC and SOC learn 0 equal patterns, which is obviously the case, since NOC learns only sets of events and SOC learns only strict-order sequences, i.e., there cannot be any overlap between the patterns learned by these two configurations. We find that subsumed(NOC, SOC) is (853;1,204). In other words, all sequential-order patterns are subsumed by 853 no-order patterns. Note that multiple sequential-order patterns can be simplified into a single no-order pattern by removing the order constraints.
Finally, new(NOC, SOC) is 128, i.e., for 128 patterns learned by NOC there are no sequences mined by SOC. None of the sequences over these events occurs frequently enough in the source code.
Observation 5.4
No-order patterns match all sequential-order patterns; furthermore, the no-order configuration learns 128 additional patterns for which the sequential-order configuration could not learn any sequences.
Analysis of the Results. To recap, sequence mining misses sequences of events that are captured by partial-order and no-order patterns. To understand what code structures they represent, we explored the mined patterns and found examples that explain this phenomenon in the source code of Graphical User Interfaces (GUIs). Using a GUI component typically requires calling its constructor first, but the order in which properties like color or size are configured is irrelevant. A miner thus finds many UI code examples with high variation and low support for each individual example. This reveals two disadvantages of sequential-order miners. First, if the individual support for each variant of the GUI component usage is high enough, then redundant patterns will be identified, one sequence for each variant. Second, if the target threshold is not met by one or more sequence variants, the corresponding sequence pattern will be missed. In the same situation, each variant counts as support for patterns with a more abstract representation, such as partial-order and no-order patterns, which thus may pass the threshold more easily. Compared with no-order patterns, partial-order patterns can additionally preserve order information.
5.3 Consistency
Based on the results in Section 5.2, one may conclude that no-order patterns define a richer language compared to the other two types. The question arises: why should one use expensive mining approaches (sequence or partial-order mining) if we can learn a richer language from source code using computationally cheaper mining approaches such as frequent item-set mining? However, this would be a valid conclusion only if the words in the language mined by NOC are valid, i.e., if the order between events in a pattern does not really matter. To analyze this, we investigate the consistency of the mined sequential-order and partial-order patterns with the co-occurrences of events in code.

Our results reveal high consistency in sequential-order (avg. 0.90) and partial-order patterns (avg. 0.96). This suggests that the order information encoded in both sequential-order and partial-order patterns is crucial for the correct co-occurrence of events in the wild, and that simplifying them into no-order patterns would result in losing important order information between events.
Observation 5.5
Partial-order and sequential-order mining learn important order information regarding the co-occurrences of events within a pattern.
5.4 Generalizability
In this section, we present the generalizability metric results at the two granularity levels explained in Section 4.3: method declaration and code repository.
Method Declaration. Our results empirically show that most of the patterns (98%) learned by each configuration are used across method declarations. If a pattern occurs across method declarations, it means that it generalizes to different implementation tasks.
Observation 5.6
Most of the patterns learned find applicability to a
large variety of implementation tasks.
Next, we analyze whether the learned patterns are used by multiple developers, or whether they represent specific coding styles of a given repository and its developers.
Code Repository. Table 2 shows our results for different configurations and pattern sizes. The column Patterns shows the total number of patterns and the absolute number and percentage of general patterns learned by each configuration. The next columns show the same information as Patterns, but for different pattern sizes, where the last column (6+ events) shows the information for patterns that have 6 or more events.
Our results show that the patterns learned by POC and SOC have almost the same percentage of generalizability (48% vs. 47%), regardless of their size. This means that more than half of the patterns mined by each of these configurations are learned from API usages within a single repository. While such repository-specific patterns are useful to the developers of that particular repository, they may reflect a very specific way of using certain API types, which may not be useful to a general set of developers.
As the table shows, NOC learns slightly more general patterns (58%). However, recall that these more generalizable patterns come at the cost of missing order information between events.
Table 2: Code repository generalizability level for different configurations and pattern sizes.

         Patterns          2 events          3 events          4 events         5 events        6+ events
Config   Total  General    Total  General    Total  General    Total  General   Total  General  Total  General
POC      1,234  594 (48%)  573    472 (82%)  283    106 (38%)  212    15 (7%)   122    1 (1%)   44     0 (0%)
SOC      1,204  561 (47%)  562    458 (82%)  270    92 (34%)   206    10 (5%)   122    1 (1%)   44     0 (0%)
NOC      981    572 (58%)  528    445 (84%)  226    108 (48%)  132    17 (13%)  70     2 (3%)   25     0 (0%)
Observation 5.7
No-order patterns tend to be more generalizable
(58%) compared to sequential and partial-order
patterns (47% and 48%), which tend to be over-
specified due to the order constraints they encode.
We analyzed the patterns learned exclusively by POC (recall Table 1) and found that 114 out of the 116 patterns are general patterns used across repositories. To find out why most of the patterns learned exclusively by POC are general patterns, we checked whether there is any relation between generalizability and pattern order. We find that strict-order patterns (exact(POC, SOC)) are less generalizable (37%) compared to patterns that contain a partial order between events (subsumed: 62%, and new: 98%). This confirms our hypothesis that there is a relation between generalizability and pattern order. Furthermore, most of the patterns (90%) learned exclusively by POC include method calls only from the standard library, which further explains their re-usability across repositories.
Table 2 shows that across configurations, the percentage of general patterns learned is higher for smaller patterns and significantly decreases for bigger patterns. Furthermore, for patterns with 6 or more events, we learn only repository-specific patterns. Specifically, around 70% of the general patterns (independent of the configuration) are 2- and 3-event patterns. Most of the patterns with 4 or more events are repository-specific patterns. This makes sense, since the probability that multiple developers with different coding styles and different application domains write a similar, long piece of code is very low.
Observation 5.8
Small code patterns of 2 and 3 events are more generalizable compared to larger code patterns of 4 or more events, which mainly encode constraints of API usages from a single repository.
We further analyzed the repository-specific patterns and found that 93% of them are learned from testing code and include API types that refer to an old version of a common assembly that is used in no other repository. Filtering out testing code may help mining algorithms learn only general patterns. An empirical validation of this hypothesis, however, needs to be performed in the future.
Remark: For the sake of completeness, we experimented with other threshold values (frequency and entropy) and analyzed the generalizability of the patterns across repositories. The results did not show higher generalizability ratios for any of the pattern types compared to the ones presented above. This supports the suitability of the threshold values selected in Section 4.2.
6 IMPLICATIONS
Based on the pattern statistics (Section 5.1) and the results in Section 5, we derive the following implications:
Implication 1 (derived from Section 5.1). Mining techniques based on the frequent occurrence of source code in code bases are unlikely to learn large code patterns (more than 7 method calls with our concrete parameters), since it is less probable that developers write large code snippets in exactly the same way. If the main goal is to learn large code patterns, then other techniques need to be considered.
Implication 2 (derived from Section 5.1). Code analysis techniques should consider interactions between objects of different API types while extracting facts from source code. Even though such analyses are expensive, since data-flow dependencies need to be considered, they are important for mining relevant patterns from source code.
Implication 3 (derived from Observations 5.1 and 5.5). While covering a good amount of the usages seen in source code, sequential-order mining may lead to false positives in applications such as misuse detection, for example, if the true pattern is a → (b, c), but a strict-order miner has only learned a → b → c and the code written by the developer is a → c → b. On the other hand, while no-order mining might seem to learn a larger variety of API usages in source code, it might result in false negatives in such applications. Following the same example, the developer might have written b → a → c, and a no-order pattern cannot detect that b and c should occur strictly after a. We can conclude that partial-order mining learns better API usage patterns for such applications.
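To illustrate the difference, the sketch below flags a call sequence as a misuse when it contains all pattern events but violates one of the pattern's must-precede constraints (a hypothetical checker, using the same pattern representation as in Section 4.3):

def violates(call_sequence, events, before):
    # Flag a misuse only if all pattern events occur but their order is broken.
    if not all(e in call_sequence for e in events):
        return False
    pos = {e: call_sequence.index(e) for e in events}
    return any(pos[x] >= pos[y] for x, y in before)

events = ["a", "b", "c"]
sequential = {("a", "b"), ("b", "c")}   # a -> b -> c
partial    = {("a", "b"), ("a", "c")}   # a -> (b, c)
no_order   = set()                      # (a, b, c)

print(violates(["a", "c", "b"], events, sequential))  # True:  false positive
print(violates(["a", "c", "b"], events, partial))     # False: correct
print(violates(["b", "a", "c"], events, no_order))    # False: false negative
print(violates(["b", "a", "c"], events, partial))     # True:  correct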
Implication 4 (derived from Observation 5.2). Partial-order mining might be more appropriate for learning API usage patterns in applications such as code recommendation, since multiple sequences can be represented by a single partial-order pattern, decreasing the total number of patterns that need to be part of the model. With sequence mining, multiple patterns need to be recommended to the developer for the same set of events, and valid sequences might even be missed if they do not occur frequently enough in the training source code.
Implication 5 (derived from Observation 5.5). Before deciding which mining approach to use in a specific application, developers need to know the trade-offs in terms of order information and computational complexity. Sequential-order and partial-order mining are computationally expensive approaches but learn important order information about the co-occurrence of events in a pattern, while no-order mining approaches do not require expensive computations but, on the other hand, do not learn any order information about the co-occurrence of events in a pattern.
Implication 6 (derived from Observation 5.8). If the main goal is to learn large code patterns (4-7 events), then recommenders should focus on a repository-specific mining approach and produce recommendations catered to the repository's developers. However, if the goal is to learn general patterns that can be used by many developers, then researchers should be aware that they might end up mining small patterns (2 and 3 events).
7 THREATS TO VALIDITY
Internal Validity. We generate the event stream based on static analysis, not on dynamic execution traces. Even though this may not represent valid execution traces, it does represent how the code is written by developers. In this paper, we focus on learning code patterns to represent source code as it is written in code editors. Also, our event stream considers only intra-procedural analysis, since we are interested in learning patterns that occur within methods. Using inter-procedural analysis might affect our results.
The episode mining algorithm learns only injective episodes, where all events are distinct, i.e., the algorithm does not handle multiple occurrences of the same event in a pattern. For example, method invocations such as IEnumerator.MoveNext() or StringBuilder.Append() are usually called multiple times in the code. The patterns we learn contain a single instance of such events. While this is a limitation, it is also an advantage in terms of pattern generalizability. Specifically, the mined pattern does not prescribe a strict number of occurrences that would lead to mismatches between it and an otherwise valid code snippet that has a different number of occurrences.
The algorithm relies on user-defined parameters: the frequency and entropy thresholds. While the configuration parameter depends on the type of patterns one is interested in, deciding on adequate frequency and entropy thresholds is not an easy task, and the choice affects the results. We mitigate this threat by empirically evaluating the thresholds and choosing the best combination of frequency and entropy thresholds for the given data set (cf. Section 4.2).
The episode mining algorithm is available only in a sequential (non-parallelized) implementation and is hence inefficient. However, this paper does not advocate using episode mining per se, but rather uses it as a baseline for comparing different configurations. This limitation can be addressed by parallelizing the algorithm's implementation.
External Validity. In this paper, we do not learn
patterns for project-specific API types. Extracting
code patterns for project-specific API types can still
be achieved using the episode-mining algorithm we
use. Comparing project-specific patterns between different types of projects is an interesting task for future
work.
We learn code patterns only for method declarations and invocations, excluding all other code structures such as loops, conditions, exceptions, etc. This is because the focus of this paper is on comparing different code pattern types (sequential, partial, and no-order), instead of specifically learning complex patterns that include all code structures. Since learning code patterns while considering other code structures is important for supporting certain development tasks, we plan to enrich the code patterns that we learn with additional code structures. This requires modifying our event stream generation, which is an engineering task rather than a conceptual limitation.
Finally, we analyze the trade-offs between different pattern types using the same set of code repositories written in the same programming language. We also use a single learning algorithm that we configure to produce different pattern types. We use an established data set of 360 repositories that have over 68M lines of source code to ensure that we analyze large amounts of code and different coding styles. However, we cannot generalize our results beyond our current dataset and learning algorithm.
8 CONCLUSIONS
In this paper, we present the first benchmark for analyzing the trade-offs between three pattern types (sequential, partial, and no-order) with respect to real code. Our approach consists of three steps: the transformation of source code into a stream of events, the adaptation of an event mining algorithm to the special context of pattern mining for software engineering, and the filtering of the resulting patterns.
Our empirical investigation shows that different types of patterns are learned in code repositories. While there are trade-offs between pattern types in terms of expressiveness, consistency, and generalizability, they are comparable in terms of pattern size and number of API types. Our results empirically show that the sweet spot is partial-order patterns, which are a superset of sequential-order patterns without losing valuable order information, as no-order patterns do. Partial-order mining finds additional patterns that are missed by sequence mining and that generalize across repositories. Compared to no-order mining, partial-order mining learns a smaller percentage of cross-repository patterns (48% vs. 58%), due to the order constraints between events within a pattern. Evaluation results show that all three configurations end up learning only repository-specific patterns for pattern sizes of 6 or more events. Furthermore, our results empirically show the consistency of the order information in sequential-order and partial-order patterns: on average 90% and 96%, respectively.
Our findings are useful indications for researchers
who work with code patterns in applications such as
code recommendation and misuse detection.
ACKNOWLEDGEMENTS
This work has been supported by the European Research Council with grant No. 321217, and by the German Science Foundation (DFG) in the context of the CROSSING Collaborative Research Center (SFB #1119, project E1). The authors want to thank Raajay Viswanathan for the technical support with the episode mining algorithm, and Ulf Brefeld for the useful suggestions on the analyses of the data presented in this paper. The authors take full responsibility for the content of the paper.
REFERENCES
Achar, A., Laxman, S., Viswanathan, R., and Sastry, P.
(2012). Discovering injective episodes with general
partial orders. Data Mining and Knowledge Disco-
very, pages 67–108.
Achar, A. and Sastry, P. (2015). Statistical significance
of episodes with general partial orders. Information
Sciences, pages 175–200.
Acharya, M. and Xie, T. (2009). Mining API error-handling
specifications from source code. In International Con-
ference on Fundamental Approaches to Software En-
gineering, pages 370–384.
Acharya, M., Xie, T., Pei, J., and Xu, J. (2007). Mining
API patterns as partial orders from source code: from
usage scenarios to specifications. In European Soft-
ware Engineering Conference and the ACM SIGSOFT
Symposium on The Foundations of Software Engineer-
ing, pages 25–34.
Agrawal, R., Imieliński, T., and Swami, A. (1993). Mining association rules between sets of items in large databases. In ACM SIGMOD, pages 207–216.
Buse, R. P. and Weimer, W. (2012). Synthesizing api usage
examples. In Proceedings of the 34th International
Conference on Software Engineering, pages 782–792.
IEEE Press.
De Roover, C., Lammel, R., and Pek, E. (2013). Multi-
dimensional exploration of api usage. In Program
Comprehension (ICPC), 2013 IEEE 21st Internatio-
nal Conference on, pages 152–161. IEEE.
Gabel, M. and Su, Z. (2008). Javert: fully automatic mining
of general temporal properties from dynamic traces.
In ACM SIGSOFT International Symposium on Foun-
dations of Software Engineering, pages 339–349.
Haase, J. and Brefeld, U. (2014). Mining positional data
streams. In International Workshop on New Frontiers
in Mining Complex Patterns, pages 102–116.
Ma, H., Amor, R., and Tempero, E. (2006). Usage patterns
of the java standard api. In Software Engineering Con-
ference, 2006, pages 342–352.
Mannila, H., Toivonen, H., and Inkeri Verkamo, A. (1997).
Discovery of frequent episodes in event sequences.
Data Mining and Knowledge Discovery, pages 259–
289.
Martin, R. C. (2003). Agile software development: princi-
ples, patterns, and practices. Prentice Hall PTR.
Mendez, D., Baudry, B., and Monperrus, M. (2013). Em-
pirical evidence of large-scale diversity in API usage
of object-oriented software. In Source Code Analysis
and Manipulation, pages 43–52.
Michail, A. (2000). Data mining library reuse patterns using
generalized association rules. In International Confe-
rence on Software Engineering, pages 167–176.
Montandon, J. E., Borges, H., Felix, D., and Valente, M. T.
(2013). Documenting APIs with examples: Lessons
learned with the APIMiner platform. In WCRE, pages
401–408.
Negara, S., Codoban, M., Dig, D., and Johnson, R. E.
(2014). Mining fine-grained code changes to detect
unknown change patterns. In International Confe-
rence on Software Engineering, pages 803–813.
Nguyen, A. T., Hilton, M., Codoban, M., Nguyen, H. A.,
Mast, L., Rademacher, E., Nguyen, T. N., and Dig, D.
(2016). Api code recommendation using statistical le-
arning from fine-grained changes. In ACM SIGSOFT
International Symposium on Foundations of Software
Engineering, pages 511–522.
Nguyen, A. T. and Nguyen, T. N. (2015). Graph-based
statistical language model for code. In International
Conference on Software Engineering, pages 858–868.
Nguyen, A. T., Nguyen, T. T., Nguyen, H. A., Tam-
rawi, A., Nguyen, H. V., Al-Kofahi, J., and Nguyen,
T. N. (2012). Graph-based pattern-oriented, context-
sensitive source code completion. In International
Conference on Software Engineering, pages 69–79.
Nguyen, H. V., Nguyen, H. A., Nguyen, A. T., and Nguyen,
T. N. (2014). Mining interprocedural, data-oriented
usage patterns in javascript web applications. In Inter-
national Conference on Software Engineering, pages
791–802.
Pham, H. V., Vu, P. M., Nguyen, T. T., et al. (2016). Lear-
ning API usages from bytecode: a statistical approach.
In International Conference on Software Engineering,
pages 416–427.
Pradel, M., Bichsel, P., and Gross, T. R. (2010). A frame-
work for the evaluation of specification miners based
on finite state machines. In IEEE International Con-
ference on Software Maintenance, pages 1–10.
Proksch, S., Amann, S., Nadi, S., and Mezini, M. (2016).
A dataset of simplified syntax trees for c#. In Interna-
tional Conference on Mining Software Repositories,
pages 476–479.
Qiu, D., Li, B., and Leung, H. (2016). Understanding the
api usage in java. Information and Software Techno-
logy, pages 81–100.
Ramanathan, M. K., Grama, A., and Jagannathan, S.
(2007). Path-sensitive inference of function prece-
dence protocols. In International Conference on Soft-
ware Engineering, pages 240–250.
Raychev, V., Vechev, M., and Yahav, E. (2014). Code com-
pletion with statistical language models. In ACM SIG-
PLAN Notices, pages 419–428.
Robillard, M. P., Bodden, E., Kawrykow, D., Mezini, M.,
and Ratchford, T. (2013). Automated API property
inference techniques. IEEE Transactions on Software
Engineering, pages 613–637.
Wang, J., Dang, Y., Zhang, H., Chen, K., Xie, T., and
Zhang, D. (2013). Mining succinct and high-coverage
api usage patterns from source code. In Proceedings
of the 10th Working Conference on Mining Software
Repositories, pages 319–328. IEEE Press.
Wasylkowski, A., Zeller, A., and Lindig, C. (2007). De-
tecting object usage anomalies. In European Software
Engineering Conference and the ACM SIGSOFT Sym-
posium on The Foundations of Software Engineering,
pages 35–44.
Zhong, H. and Mei, H. (2018). An empirical study on API
usages. IEEE Transaction on Software Engineering.
Zhong, H., Xie, T., Zhang, L., Pei, J., and Mei, H. (2009a).
MAPO: Mining and recommending API usage pat-
terns. In European Conference on Object-Oriented
Programming, pages 318–343.
Zhong, H., Zhang, L., Xie, T., and Mei, H. (2009b). In-
ferring resource specifications from natural language
API documentation. In International Conference on
Automated Software Engineering, pages 307–318.