RTFM: Towards Understanding Source Code

using Natural Language Processing

Maximilian Galanis

, Vincent Dietrich

, Bernd Kast

and Michael Fiegert

Siemens AG, Corporate Technology, Otto-Hahn-Ring 6, 81739 Munich, Germany

Keywords:

Semantics-based Software Engineering, Information Extraction, Natural Language Processing, Planning.

Abstract:

The manual conﬁguration of today’s autonomous systems for new tasks is becoming increasingly difﬁcult due

to their complexity. One solution to this problem is to use planning algorithms that can automatically syn-

thesize suitable data processing pipelines for the task at hand and thus simplify the conﬁguration. Planners

usually rely on models, which are created manually based on already existing methods. These methods are

often provided as part of domain speciﬁc code libraries. Therefore, using existing planners on new domains

requires the manual creation of models based on the methods provided by other libraries. To facilitate this,

we propose a system that generates an abstract semantic model from C++ libraries automatically. The neces-

sary information is extracted from the library using a combination of static source code analysis to analyze its

header ﬁles and natural language processing (NLP) to analyze its ofﬁcial documentation. We evaluate our ap-

proach on the perception domain with two popular libraries: HALCON and OpenCV. We also outline how the

extracted models can be used to conﬁgure data processing pipelines for the perception domain automatically

by using an existing planner.

1 INTRODUCTION

Autonomous systems are expected to adapt to chang-

ing tasks quickly. A good example for this are indus-

trial assembly tasks with small lot sizes. Such cases

demand robots that quickly adapt to new work pieces.

Unfortunately, the rapid reconﬁguration of these com-

plex autonomous systems is infeasible when under-

taken manually. One solution to this problem is the

use of planning algorithms for automatic reconﬁgu-

ration. They can synthesize suitable data process-

ing pipelines automatically and, therefore, reduce the

needed engineering and time requirements to recon-

ﬁgure autonomous systems (Kast et al., 2018).

Planners usually require models that formalize the

description of the target domain. The algorithms and

data structures used in the models are often provided

by code libraries. These libraries are usually domain

speciﬁc and contain a wide variety of methods and

algorithms for their target domain. To add a new do-

main to the planning algorithm, it is therefore neces-

sary to model the algorithms and data structures used

in the library. However, such models are still created

https://orcid.org/0000-0001-8483-0535

https://orcid.org/0000-0003-0568-9727

https://orcid.org/0000-0001-7838-3142

https://orcid.org/0000-0002-6371-6394

manually. The main reason for this is that the needed

semantic information is often not explicitly described

as part of the source code. It is usually only described

in the documentation of the library, which engineers

can understand using their background knowledge.

Because understanding not only the source code but

also the documentation requires signiﬁcant domain

knowledge, the creation of such models is a time con-

suming task.

We propose a system that can create an abstract

semantic model of a given C++ library automati-

cally. Our approach combines source code analy-

sis to extract the application programming interface

(API) with state-of-the-art natural language process-

ing (NLP) to understand the code documentation. The

source code analysis is used to provide a dependable

knowledge base, which can be augmented with addi-

tional information but never be falsiﬁed. We use this

knowledge base and extend it with semantic infor-

mation that is extracted from the ofﬁcial documenta-

tion. Our system creates semantic models of the func-

tions and data structures of the library. It is machine-

readable and represents a high-level abstraction of the

underlying source code. Thereby, the model elimi-

nates the need to understand and manipulate source

code, which has the potential to reduce the time and

engineering knowledge requirements. Our model rep-

resents the data structures of the library as concepts

430

Galanis, M., Dietrich, V., Kast, B. and Fiegert, M.

RTFM: Towards Understanding Source Code using Natural Language Processing.

DOI: 10.5220/0009826604300437

In Proceedings of the 17th International Conference on Informatics in Control, Automation and Robotics (ICINCO 2020), pages 430-437

ISBN: 978-989-758-442-8

and the functions operating on these data structures

as operators (Kast et al., 2019). More generally, one

can also view operators as algorithms and concepts

as the (semantic) data structures the algorithms work

on. In practice, operators always have concepts as in-

puts and outputs, with which we model the function

parameters and return types.

Throughout this paper, we focus on the perception

domain for which excellent libraries, like HALCON

and OpenCV

exist. We demonstrate the applicability

of our approach by extracting models from both com-

puter vision libraries and outline how the extracted

model can be used to synthesize perception pipelines.

1.1 Problem Analysis

Extracting abstract semantic models from a C++ li-

brary and its documentation is a hard problem, as such

a system must adapt to varying documentation and

API styles. Therefore, we evaluated the differences

in both aspects for three perception libraries: HAL-

CON, OpenCV and the PCL

. The analysis shows

that the declaration of a function as the sole infor-

mation source is usually insufﬁcient for the automatic

creation of an abstract model. Consider a function

that returns void and has a pointer as its parameter. In

this case, the C++ declaration provides no informa-

tion on whether this parameter is an input, an output

or both, of this function. While the const qualiﬁer can

deﬁne parameters as unchanging and thus as inputs,

it is often not available. However, this information

is described as part of the function’s documentation.

Either with explicit labels, or as part of the natural

language description. Therefore, the source code but

especially its documentation provide additional meta

information about the functions, like whether a pa-

rameter is an input or an output.

To add semantic knowledge to the model, the pa-

rameter’s data type is important. Similarly to the in-

put/output information, knowledge about the seman-

tic content is contained in both the API and the doc-

umentation. A library could deﬁne a class Image,

which it uses to represent images. Unfortunately, it is

common to obfuscate this information by using wrap-

per classes that can contain a wide variety of data

types, like the HObject

class in HALCON. If this is

the case, only the documentation describes the seman-

tic type of a parameter. Similarly to before, this meta

information can either be encoded by explicit labels

in the documentation (e.g., in HALCON), or in the

https://www.mvtec.com/products/halcon

https://opencv.org

http://pointclouds.org/

https://www.mvtec.com/doc/halcon/1811/en/HObject.html

Table 1: Comparison of HALCON, OpenCV and the PCL

concerning their API and documentation. NL stands for nat-

ural language.

Library

Documentation HALCON OpenCV PCL

input/output explicit NL explicit

semantic type explicit NL NL

availability custom doxygen doxygen

API

data types obfuscated partly ob-

fuscated

explicit

const qualiﬁer consistent inconsistent consistent

natural language description of the parameter (e.g., in

OpenCV).

Our ﬁndings are illustrated in Table 1. The com-

parison motivates a highly modular design to cope

with the heterogeneity in the design of both the API as

well as the ofﬁcial documentation of these libraries.

Furthermore, the examples of OpenCV and PCL il-

lustrate that it is necessary to understand the natural

language documentation in order to obtain, e.g., the

semantic types of parameters.

1.2 Contribution

In this paper, we make the following contributions:

(1) Proposal of an architecture for knowledge extrac-

tion with multiple information sources. (2) Evalua-

tion of the combination of source code analysis in

combination with state-of-the-art NLP methods on

the task of interpreting source code and its documen-

tation. (3) Demonstration of the applicability of a

fully functional model extraction system (MES) on

the HALCON and OpenCV libraries.

2 RELATED WORK

In this section, we discuss how NLP and informa-

tion extraction tools are currently used in the software

engineering ﬁeld to extract information from meta

knowledge sources like the ofﬁcial documentation or

forum discussions.

Closely related to this work, is the trend in soft-

ware engineering to explore methods to interpret

source code and the corresponding natural language

documents. A good example of this is the work pre-

sented by (Zhang and Hou, 2013). They use seman-

tic analysis to extract negative mentions of API fea-

tures from forum discussions and then extract prob-

lematic API features. Another way to interpret natu-

ral language is presented by (Zhong et al., 2009) who

RTFM: Towards Understanding Source Code using Natural Language Processing

431

Step 2

Step 1

Step 3 Step 4 Step 5

Step 6 Step 7 Step 8

Exporter

Main Module

ASG

AutoWig

ASG reduced ASG

extended ASG

Parses all

headers of the

library and

creates an ASG

CodeGenerator Planner Interface

ASG ASG

Merger

NLPDoc

Merges the

information

extracted within

steps three four

and ﬁve

Generates the

source code for

each function

call

Converts the

ASG to models

for the planner

Interprets the

information

extracted by the

DocParser

extended ASG

Parses the

ofﬁcial

documentation

Applies

heuristics to

infer extra

information

from the ASG

DocParserHeaderParser

Filter

Filters out all

unneeded nodes

reduced ASG extended ASG

extended ASG

Figure 1: Overview of the different modules used in the harvester. Every block corresponds to a module of the harvester. The

structure of this system can be divided into three phases. Preprocessing (blue), knowledge acquisition (orange) and export

(green). Each phase is divided into several modules.

use NLP to infer resource speciﬁcations from natu-

ral language documentation. In their approach, they

map class methods to ﬁve predeﬁned types of meth-

ods based on their associated description. To classify

the class methods into the categories, they use word-

level classiﬁcation of the corresponding documenta-

tion based on Hidden Markov Models (HMMs) (Ra-

biner and Juang, 1986). Using these classes, they

can then detect bugs in source code by, e.g., detect-

ing that a manipulation method is called before a cre-

ation method. Another classiﬁcation problem is ap-

proached by (LeClair et al., 2018), who attempt to

do a high-level classiﬁcation of software projects into

predeﬁned categories based on their source code. For

this, the authors use word embeddings that were pre-

trained to encode the similarity of high-level project

descriptions and low-level source code.

Probably most similar to our work is presented by

(Li et al., 2018). They tackle the API accessibility

issue by mining an API knowledge graph from mul-

tiple API sources (e.g., ofﬁcial documentation, API

tutorials) so that developers can be warned of possi-

ble caveats of the APIs before they run into problems.

To extract the API caveats from the documentation,

(Li et al., 2018) crawl the online API documenta-

tion and extract the textual description of the APIs.

Then, they construct an API skeleton graph from the

reference documentation by exploiting the provided

information like the fully qualiﬁed names of the en-

tities, inheritance information, data type references,

function/method return types, and function/method

parameter types. To extract the API caveat sen-

tences, a pattern-based approach (e.g., regular expres-

sions) was used. Finally, using co-reference reso-

lution, sentences containing API caveats are linked

to their corresponding API entity in the knowledge

graph. Their user study provides evidence that this

knowledge graph improves API caveat accessibility

among developers.

Like most research in the ﬁeld of software engi-

neering, our approach focuses on the quest to reduce

the amount of work needed to build and maintain soft-

ware using program analysis and information extrac-

tion. However, unlike the existing research, which

tries to support the developer, the here presented sys-

tem goes further and explores ways to automate the

role of the software developer partially.

3 METHODOLOGY

Our MES is designed to be modular and expandable.

We achieve this by using the abstract semantic graph

(ASG) of the target library as our knowledge base

throughout the analysis. The ASG is extracted from

the header ﬁles of the target library using static source

code analysis and represents all the code components

that are part of the source as nodes and the semantic

relationships between the nodes as edges. Our sys-

tem is designed in such a way that all modules have

access to the current knowledge in the form of the

ASG. Any information that is extracted by the mod-

ules, like the extracted semantic value for each pa-

rameter or whether a parameter is an input or output,

is then added directly to the correct nodes in the ASG

themselves. Figure 1 illustrates this design and the

currently implemented modules.

ICINCO 2020 - 17th International Conference on Informatics in Control, Automation and Robotics

432

Out of the modules in the MES, the Exporter mod-

ule is unique in that it does not implement any func-

tionality itself, but acts as the main module and de-

ﬁnes the order in which the other modules are called.

Therefore, there are no architectural restrictions in the

order in which independent modules can be called.

Each module has access to all of the information that

was harvested so far. This also enables the creation

of custom pipelines for different target libraries. This

ﬂexibility is fundamental because a fully generic so-

lution to the model extraction problem is not feasible

yet, as illustrated in subsection 1.1. The Exporter also

provides a default pipeline that is used for all follow-

ing experiments.

Parsing the Library and Filtering. First, we ob-

tain our knowledge base, the ASG of the target li-

brary. For this, we use the parser module of Au-

toWIG

(Fernique and Pradal, 2017), which inter-

nally uses Clang

to create one coherent ASG for the

header ﬁles of the target library. By using their parser

module instead of using Clang directly, we obtain an

ASG instead of the abstract syntax trees that Clang

produces. The decisive reason to rely on only the

header ﬁles for the source code analysis is that the full

source code is not necessarily available, as it is not

uncommon to only distribute a binary version. The

resulting ASG is then modiﬁed to include several ad-

ditional attributes. These additional attributes allow

us to store the information that is extracted in the fol-

lowing modules directly inside the correct nodes of

the ASG, which drastically simpliﬁes the data han-

dling.

The next step is to decide which nodes in the ASG

should be processed. By default, the ASG includes

not only the code components of the given library but

also all code components of the library’s dependen-

cies, like the C++ standard library. Therefore, we ﬁl-

ter the ASG to only include nodes of a given names-

pace, as it is common practice to have all user-facing

code components of a library in a uniﬁed namespace.

Applying Heuristics. After obtaining and ﬁltering

the ASG, we try to extract the semantic knowledge

that is contained the ASG itself. As elaborated in

subsection 1.1, the source code (and thus the ASG)

can contain some semantic information about whether

function parameters are inputs or outputs. Therefore,

we extract this information with the HeaderParser

module, which analyzes both, the parameter quali-

ﬁers and their data types. This module then marks

all parameters or return values that are represented by

https://autowig.readthedocs.io/en/latest/index.html

https://clang.llvm.org/

fundamental types

, as well as all parameters that are

qualiﬁed with the const qualiﬁer, as inputs. Like const

parameters, parameters of fundamental types are also

guaranteed to be inputs because they cannot contain

any pointers, references or classes.

Extracting and Analyzing the Documentation.

After extracting the semantic information of the

source code, our system proceeds to analyze the of-

ﬁcial documentation of the target library. However, to

analyze the documentation it must be parsed ﬁrst. For

this task, we use documentation speciﬁc parsers that

extract the natural language comments of the code

components that are contained in the provided ASG.

This is necessary as the documentation styles of dif-

ferent libraries varies signiﬁcantly. After linking the

extracted comments to the corresponding nodes in the

ASG, there are several ways to analyze it. In the case

of HALCON, all the needed information

is already

explicitly stated in the documentation and thus parsed

directly by the DocParser and its custom heuristic.

Unfortunately, such a well-documented library is rare.

Therefore, we use NLP to analyze the documenta-

tion. In particular, we analyze the parameter com-

ments, as they contain the information about a param-

eter’s mode (input or output) as well as its semantic

content. To extract this knowledge, a pre-trained neu-

ral network that is based on the transformer architec-

ture (Vaswani et al., 2017), is used. This network is

then ﬁne-tuned on two tasks: (1) Input/Output clas-

siﬁcation (I/O task). (2) Semantic type classiﬁcation

(Sem type task). The trained network is then used

to classify the parameters and the resulting labels are

added to the corresponding node in the ASG.

Merging the Different Knowledge Sources. As

the last step during the knowledge extraction, we

merge the information from the different modules.

Consider the information whether a function param-

eter is an input or output. During the analysis in the

MES, several modules like the HeaderParser, or the

NLPDoc module might classify the parameter into in-

put and output. In case these information sources

contradict each other, a way to fuse both information

sources is needed. To accomplish this, the Merger

module uses a weighted majority vote to determine

the ﬁnal result for each of the two classiﬁcation tasks

given the results of several modules. This implemen-

In C++, this includes the boolean type as well as integer,

ﬂoat and character types:

https://en.cppreference.com/w/cpp/language/types.

This includes whether a parameter is an input or output of

the function and the parameter’s semantic value.

RTFM: Towards Understanding Source Code using Natural Language Processing

433

tation was chosen to be ﬂexible enough to allow for

additional modules that might be added in the future.

Generating the Source Code. Once the needed in-

formation is extracted and merged, the MES generates

the necessary glue code. This code is responsible for

calling the underlying function using the inputs pro-

vided by the input concepts of the operator as well as

passing the outputs of the function to the output con-

cepts. The CodeGenerator relies on cppyy (Lavrijsen

and Dutta, 2016)

to create Python bindings for the

underlying library automatically.

By using these bindings, the underlying C++ li-

brary can be called directly within Python, which sim-

pliﬁes the code generation process.

Interface to the Planner. The last module of our

system is responsible for exporting the harvested in-

formation to the used planner. As mentioned be-

fore, our planner utilizes a bi-modal approach, which

splits the model into procedural knowledge (opera-

tors) and declarative knowledge (concepts). There-

fore, the last module interfaces with the existing plan-

ner (Kast et al., 2018) to create the necessary opera-

tors and concepts. If available, concepts are created

to reﬂect the semantic type of the data they hold. In

case no semantic type information was extracted, this

module falls back to using the data type of the param-

eter that is being processed. Because of the modular-

ity of our MES, the extracted information could also

be exported to other planners by adding an additional

planner interface.

4 EVALUATION

Automatically Synthesizing Operator Pipelines.

To showcase that the extracted models are suitable for

automatically synthesizing data processing pipelines,

our planner was presented with a select list of ex-

tracted models from the HALCON library, some start-

ing facts, and a goal

. Additionally, this experiment

motivates the extraction of semantic information dur-

ing the model extraction.

The initial inputs (facts) are two parameters and

an image, as illustrated in Figure 2. The goal of the

planner is then to ﬁnd a suitable chain of operators

that produce the given goal, which, in this case, is an

https://cppyy.readthedocs.io/en/latest/

Facts and goals are speciﬁc instantiations of concepts. An

example of a speciﬁc instantiation of a concept is the

number 42, which is a speciﬁc instance of the concept

number.

HalconCpp TupleDeg

HalconCpp Threshold

HalconCpp Connection

HalconCpp GetImageSize

HalconCpp ReadImage

HalconCpp SelectShape

HalconCpp EllipticAxis

HalconCpp AreaHoles

number = 2.3

ﬁlename read = image path

number = 0

number = 128

Figure 2: Initially available operators (hexagons) and in-

stances (green and blue) as well as the desired goal instance

(yellow). The given image path points to the depicted paper

clip.

Table 2: Planning performance with and without semantic

type information.

Semantic Informa-

tion

Visited Nodes Improvement

No semantic types 324

With semantic types 115 64.5 %

instance of the concept number with the value 2.3. In

this experiment, the paper clip in the input image is

rotated by 2.3 degrees. Thus, by solving this task, the

planner indirectly solves a range of problems - identi-

fying the angle of single objects in high contrast im-

ages. As illustrated in Figure 3, the planner is able to

ﬁnd a suitable operator pipeline that solves the given

task. While the provided example is relatively simple,

this approach has been shown to scale to more com-

plex perception pipelines (Dietrich et al., 2018). It

has also been shown that the necessary ﬁne-tuning of

parameters can be addressed with optimization based

approaches (Dietrich et al., 2019).

In their C++ interface, HALCON uses very few

data types, such as HObject and HTuple. How-

ever, our planner mainly evaluates the compatibility

of operators based on their input and output concepts.

HALCON’s API design, therefore, results in a vast

search space. As illustrated in Table 2, the additional

semantic information, which we extracted from HAL-

CON’s documentation drastically improves the plan-

ning performance. While the semantic type informa-

tion provided by explicit labels in HALCON’s docu-

mentation is valuable, this experiment also shows its

limits. Because the instance was labeled as a number

and could not be deﬁned more precisely as e.g., an

angle in degrees, deﬁning goals could be problematic

in more complex scenarios.

ICINCO 2020 - 17th International Conference on Informatics in Control, Automation and Robotics

434

HalconCpp Threshold

self = ”0”

HalconCpp ReadImage

self = ”961.092”

self = ”0.0402974”

self = ”364.321”

HalconCpp TupleDeg

RegionImage

self = ”2.30887”

HalconCpp EllipticAxis

self = ”klammer.jpg”

self = ”128”

Figure 3: Data processing pipeline synthesized by the planner. Operators are illustrated as ovals, while concepts are illustrated

with rectangles. The content of the more complex concepts Image and Region is not displayed.

Table 3: Obtained NLP accuracy on the different tasks on

both libraries. All results are the average of 5 runs with

different seeds. Table 7 gives a brief overview of all used

datasets.

Task Dataset Accuracy

I/O io 87.2 %

I/O io balanced 84.2 %

Sem type sem type 89.6 %

I/O cv io 97.9 %

I/O cv io balanced 89.8 %

Natural Language Processing Results. Unfortu-

nately, only few libraries include machine-readable

labels in their documentation. Generally, it is neces-

sary to analyze the natural language comment that is

associated with the function parameters to obtain the

needed semantic information.

To be able to evaluate the contents and meaning

of the natural language comments, we use the SciB-

ERT (Beltagy et al., 2019) language model (LM).

We then ﬁne-tuned separate SciBERT instances on

the two tasks for HALCON as well as OpenCV us-

ing the labeled data that is provided by HALCON’s

documentation and a manually labeled dataset for

OpenCV. For the I/O classiﬁcation task, we provide

an additional balanced dataset for both libraries, in

which the samples of both classes are equally likely.

This is needed because the class input is much more

common in the I/O classiﬁcation datasets. Table 8 il-

lustrates ten samples of our dataset for each library,

while Table 7 summarizes the used datasets. The sam-

ples of our dataset show that we use a combination

of the parameter comment and the parameter name

as the input to the LM. We found that this slightly

improves the classiﬁcation performance of the ﬁne-

tuned LM, as illustrated in Figure 4. Finally, we ap-

ply similar hyperparameters during our experiments

to the ones (Beltagy et al., 2019) use during their

evaluation. The most important hyperparameters are

listed in Table 6.

As illustrated in Table 3, using the combination

of the parameter comments and the parameter name

as features to classify the parameters yields promis-

ing results on both tasks. Even on balanced datasets,

the obtained accuracy only drops by a small margin,

which shows that the models indeed correctly learn to

discern the different classes.

Table 4: Performance of SciBERT (ﬁne-tuned on HAL-

CON) on the OpenCV dataset. All results are the average

of 5 runs with different seeds.

Task Dataset

Train

Dataset

Eval

Accuracy

I/O io cv io 90.1 %

I/O io cv io balanced 84.2 %

Sem type sem type cv sem type 33.9 %

Transferring Knowledge. Creating a labeled

dataset for each new library that should be modeled

contradicts the basic thought of an automatic MES.

Therefore, we examine the performance of the LMs

that were ﬁne-tuned on HALCON on OpenCV -

a new domain. The results of this evaluation are

illustrated in Table 4. The results show that the

model ﬁne-tuned on the I/O classiﬁcation task does

generalize well to this unseen library and achieves

a 84.2 % accuracy on the cv io balanced dataset.

This result is comparable to the accuracy this model

achieved in its training domain: HALCON.

Unfortunately, the classiﬁcation of the parame-

ters into their semantic types does not perform as

well. While the model trained on the Sem type

task achieved 89.6 % in its training domain, it only

achieves 33.9 % on OpenCV, which is only barely bet-

ter than chance in this case. However, images are

classiﬁed correctly with 85.5 % in the cv sem type

dataset, suggesting that images are described simi-

larly, allowing the classiﬁer to perform well for this

class.

The samples from the datasets illustrated in Ta-

ble 8 help to understand some of the problems con-

cerning the transfer performance of the Sem type

task. Consider the semantic type image. All of the

samples of this class contain the word image mak-

ing it easier to classify. This is the case for samples

in both libraries and might explain the good perfor-

mance the LM achieved for this particular class. For

the other classes, however, classiﬁcation is more difﬁ-

cult, even for an experienced engineer. One example

for this is the semantic type string. The samples 6

and 16 in Table 8 are both strings but unlike the im-

age class, the samples do not share common words.

Additionally, both parameters could just as well be a

number to select pre-deﬁned options in an enum given

the parameter comment.

RTFM: Towards Understanding Source Code using Natural Language Processing

435

Table 5: NLP performance after merging the information

from the HeaderParser module. All results are the average

of 5 runs with different seeds.

Task Dataset Train Dataset Eval Accuracy

I/O io cv io 93.2 %

I/O io cv io balanced 90.1 %

Performance after Merging. The I/O classiﬁcation

task is special in that it has two knowledge sources

currently: the ASG, and the NLP analysis of the pa-

rameter comment. Thus, the knowledge from the

ASG can be merged with the result from the NLP

analysis. In the case of this experiment, the extra in-

formation is used to minimize the reliance on NLP.

The current implementation of the Merger module

uses following weights: (1) HeaderParser: 4 (2) Doc-

Parser: 2 (3) NLPDoc: 1 This means that NLP is only

used on parameters that could not be classiﬁed with

certainty by the HeaderParser module i.e., parame-

ters that are not qualiﬁed with the const qualiﬁer and

are not a fundamental type. As shown in Table 5, this

method improves the ﬁnal accuracy.

5 CONCLUSION

In this paper, we presented an approach to extract

abstract semantic models of C++ libraries automati-

cally, which we evaluated on the perception domain

with two popular computer vision libraries. We also

outlined how the extracted models of the percep-

tion libraries can be used within a planner to fur-

ther automate the creation of perception pipelines. It,

therefore, lowers the engineering barriers to develop

robotics and automation solutions that can adapt to

new tasks automatically via planning. Our approach

is based on the combination of static source code anal-

ysis and NLP, which is used to interpret the corre-

sponding documentation. Because we did not make

any domain speciﬁc assumptions, we expect our ap-

proach to perform similarly on other domains.

Our evaluation shows the beneﬁts of additional se-

mantic information on the planning performance. The

required semantic information can be extracted with

a heuristics-based parser in case a machine-readable

documentation is provided (i.e., HALCON). More

generally, however, it is necessary to use NLP to ex-

tract semantic knowledge. Therefore, we ﬁne-tuned

a state-of-the-art LM on two classiﬁcation tasks to

extract semantic information. Our results show that

this approach works well in the training domain. Un-

fortunately, applying the trained model to another li-

brary showed mixed results. While it worked well for

the input/output classiﬁcation task, the semantic type

classiﬁcation task showed the limits of the used LM.

Future work could extend the here described static

analysis with dynamic program analysis to validate

the extracted labels. Additionally, an interesting re-

search direction would be to take more information

into account, like the other parameters of the func-

tion or the functions’ description. Finally, a more di-

verse training data set could improve the transfer per-

formance.

REFERENCES

Beltagy, I., Lo, K., and Cohan, A. (2019). SciBERT: A pre-

trained language model for scientiﬁc text. 2019 Con-

ference on Empirical Methods in Natural Language

Processing (EMNLP), arXiv:1903.10676.

Dietrich, V., Kast, B., Fiegert, M., Albrecht, S., and Beetz,

M. (2019). Automatic conﬁguration of the structure

and parameterization of perception pipelines. In 2019

19th International Conference on Advanced Robotics

(ICAR), pages 312–319.

Dietrich, V., Kast, B., Schmitt, P., Albrecht, S., Fiegert, M.,

Feiten, W., and Beetz, M. (2018). Conﬁguration of

perception systems via planning over factor graphs. In

2018 IEEE International Conference on Robotics and

Automation (ICRA), pages 6168–6174.

Fernique, P. and Pradal, C. (2017). AutoWIG: Auto-

matic generation of Python bindings for C++ libraries.

CoRR, abs/1705.11000.

Kast, B., Albrecht, S., Feiten, W., and Zhang, J. (2019).

Bridging the gap between semantics and control for

industry 4.0 and autonomous production. In 2019 15th

International Conference on Automation Science and

Engineering (CASE), pages 780–787.

Kast, B., Dietrich, V., Albrecht, S., Feiten, W., and Zhang, J.

(2018). A hierarchical planner based on set-theoretic

models: Towards automating the automation for au-

tonomous systems. In 16th International Conference

on Informatics in Control, Automation and Robotics

(ICINCO).

Lavrijsen, W. T. L. P. and Dutta, A. (2016). High-

performance Python-C++ bindings with pypy and

cling. In Proceedings of the 6th Workshop on

Python for High-Performance and Scientiﬁc Comput-

ing, pages 27–35.

LeClair, A., Eberhart, Z., and McMillan, C. (2018). Adapt-

ing neural text classiﬁcation for improved software

categorization. In 2018 IEEE International Confer-

ence on Software Maintenance and Evolution (IC-

SME), pages 461–472.

Li, H., Li, S., Sun, J., Xing, Z., Peng, X., Liu, M., and

Zhao, X. (2018). Improving API caveats accessibil-

ity by mining API caveats knowledge graph. In 2018

IEEE International Conference on Software Mainte-

nance and Evolution (ICSME), pages 183–193.

Rabiner, L. and Juang, B. (1986). An introduction to hidden

markov models. IEEE ASSP Magazine, pages 4–16.

ICINCO 2020 - 17th International Conference on Informatics in Control, Automation and Robotics

436

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,

L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I.

(2017). Attention is all you need. In Advances in

Neural Information Processing Systems (NIPS), pages

5998–6008.

Zhang, Y. and Hou, D. (2013). Extracting problematic API

features from forum discussions. In 2013 21st In-

ternational Conference on Program Comprehension

(ICPC), pages 142–151.

Zhong, H., Zhang, L., Xie, T., and Mei, H. (2009). In-

ferring resource speciﬁcations from natural language

API documentation. In 2009 IEEE/ACM Interna-

tional Conference on Automated Software Engineer-

ing (ASE), pages 307–318.

APPENDIX

Table 6: Hyperparameters used to ﬁne-tune the SciBERT

language model on the I/O and Sem type classiﬁcation

tasks.

Learning Rate 2 × 10

−5

Number of Epochs 3

Batch Size 32

Warmup Ratio 6 %

Weight Decay 0.0

Learning Rate Decay Linear

Table 7: Overview of the used datasets.

Library Task Dataset Name

HALCON

I/O io

I/O io balanced

Sem type sem type

OpenCV

I/O cv io

I/O cv io balanced

Sem type cv sem type

1 2 3

Number of Epochs

0.5

0.6

0.7

0.8

0.9

1.0

Accuracy

without parameter name

with parameter name

1 2 3

Number of Epochs

without parameter name

with parameter name

Figure 4: Illustration of the effects of adding the parame-

ter name to the parameter comment on the I/O (left) and

Sem type (right) classiﬁcation tasks. Drawn with 95 % con-

ﬁdence intervals on ten runs with varying seeds.

Table 8: Illustration of 10 samples from both the HALCON

and the OpenCV dataset.

RTFM: Towards Understanding Source Code using Natural Language Processing

437