Integration of Heterogeneous Modeling Languages via Extensible and
Composable Language Components
Arne Haber¹, Markus Look¹, Antonio Navarro Perez¹, Pedram Mir Seyed Nazari¹, Bernhard Rumpe¹, Steven Völkel² and Andreas Wortmann¹
¹ Software Engineering, RWTH Aachen University, Aachen, Germany
² Volkswagen Financial Services, Braunschweig, Germany
Keywords:
Modeling Language Engineering, MDE, Modeling Language Integration.
Abstract:
Effective model-driven engineering of complex systems requires appropriately describing different specific
system aspects. To this end, efficient integration of different heterogeneous modeling languages is essential.
Modeling language integration is onerous and requires in-depth conceptual and technical knowledge and effort.
Traditional modeling language integration approaches require language engineers to compose monolithic
language aggregates for a specific task or project. Adapting these aggregates to different contexts requires vast
effort and makes them hardly reusable. This contribution presents a method for the engineering of grammar-based
language components that can be independently developed, are syntactically composable, and ultimately
reusable. To this end, it introduces the concepts of language aggregation, language embedding, and language
inheritance, as well as their realization in the language workbench MontiCore. The result is a generalizable,
systematic, and efficient syntax-oriented composition of languages that allows the agile employment of
modeling languages efficiently tailored for individual software projects.
1 INTRODUCTION
Engineering of non-trivial software systems requires
reducing the conceptual gap between problem do-
mains and solution domains (France and Rumpe,
2007). Model-driven engineering (MDE) aims at
achieving this by raising the level of abstraction from
programming of a complete system implementation
to abstract modeling of domain and system aspects.
In this way, models are raised to the level of primary
development artifacts. Different aspects of complex
software systems need to be expressed in different
modeling languages. The UML (Object Management
Group, 2010), for instance, contains seven
structure modeling languages, with class diagrams
probably being the most famous, and seven behav-
ior modeling languages as well, e.g., statecharts and
activity diagrams. Integration of modeling languages
for a software project either requires composing the
languages specifically for this project a priori, or de-
signing the independent languages with composition
in mind - but without prior assumptions of the ac-
tual composition. The former approach yields mono-
lithic language aggregates which are hardly reusable
for different projects.
We propose an approach to syntax-oriented black-box
integration of grammar-based textual languages
developed around the notions of language aggregation,
language embedding, and language inheritance
(Schindler, 2012; Völkel, 2011). This approach
addresses all aspects of syntax-oriented language
integration, namely concrete syntax, abstract syntax,
symbol tables, and context conditions. It is based on
previous work on syntactic modeling language
integration (Krahn et al., 2008) and introduces new
mechanisms for inter-language model validation. These
new mechanisms were briefly introduced at the GEMOC
workshop at MODELS 2013 (Look et al., 2013).
This contribution explains the concepts and their
implementation with the language workbench
MontiCore (Krahn et al., 2010) in detail.
At first, we will motivate the need for language
integration in MDE on the example of a cloud-based
web system in Sect. 2. Afterwards, Sect. 3 explains
the concepts for language integration, and Sect. 4
their support through a language integration frame-
work. Section 5 discusses concepts related to our
work. Section 6 concludes this contribution with an
outlook on future work and a summary.
Haber A., Look M., Navarro Perez A., Mir Seyed Nazari P., Rumpe B., Völkel S. and Wortmann A..
Integration of Heterogeneous Modeling Languages via Extensible and Composable Language Components.
DOI: 10.5220/0005225000190031
In Proceedings of the 3rd International Conference on Model-Driven Engineering and Software Development (MODELSWARD-2015), pages 19-31
ISBN: 978-989-758-083-3
Copyright © 2015 SCITEPRESS (Science and Technology Publications, Lda.)
2 MOTIVATION
To illustrate our approach we will first motivate the
concepts of language aggregation, language embed-
ding, and language inheritance by the example of a
cloud-based web system that is described by various,
heterogeneous models. The techniques employed in
this example will be described in detail in the follow-
ing sections. Throughout the example, different needs
for language integration will arise which we catego-
rize as follows:
Language aggregation integrates different model-
ing languages by mutually relating their concepts
such that their models can be interpreted together,
yet remain independent.
Language embedding denotes the composition of
different modeling languages by embedding con-
cepts of one language into declared extension
points of another. Models of the new language
thereby contain concepts of both languages.
Language inheritance is the definition of new lan-
guages on the basis of existing languages through
reuse and modification of existing language con-
cepts.
Consider a system that receives streams of sen-
sor data from a multitude of internet-connected sen-
sor hardware, e.g., temperature and wattage sensors
in buildings, analyzes these streams for patterns, and
persists the data into a database. Our aim is to specify
this system by using models in a way detailed enough
to generate a significant amount of its implementation
automatically.
To this end, various system aspects, such as its
overall architecture, the data it operates on, and its
deployment onto a runtime infrastructure, need to be
addressed individually by appropriate modeling lan-
guages, including but not limited to architecture mod-
els, data models, and deployment models. Yet, many
of those aspects are not independent, but mutually re-
lated. For instance, for each specified type of data,
there are processing logic and database structures spe-
cific to it. Consequently, the languages need to be in-
tegrated in such a way that their models can reference
each other and be interpreted together.
In addition, some aspects of the system are of a
more general nature and apply to a wide range of sys-
tem kinds. Conversely, other aspects are specific to
the application domain at hand. For instance, archi-
tectural distribution is not only relevant in web sys-
tems, but also in many embedded systems, whereas
aspects like session management are more specific to
web systems. It is desirable to separate the language
concepts for general aspects from those for specific
aspects in order to facilitate the modularization of
languages into reusable language components. Besides,
such reuse of existing general languages reduces the
need for developers to learn new notations and concepts.
Figure 1: Domain model SensorData for sensor measurements.
[Class diagram (CD) with the classes Request, SensorMeasurement,
Credentials, and the generic class SensorValue<T>. The method
getValues(String key) carries the embedded HQL query
SELECT Object(v) FROM SensorValue v WHERE v.key = :key.]
In the following, we model the domain model and
the software architecture of our system using several
modeling languages. In doing so, we stress the differ-
ent needs of language integration that arise from our
scenario.
2.1 Domain and Data Modeling
Nearly all object-oriented systems operate in the con-
text of a domain model. Such models describe the ap-
plication’s real world context in terms of classes and
associations between classes. Their most prominent
role is to serve as the basis for the application’s fun-
damental data structures that are used for computa-
tion, communication, and state persistence. The latter
is typically realized by means of a database that op-
erates according to a database paradigm, such as rela-
tional databases or one of various NoSQL flavors. In
our example we focus on a relational database which
allows for complex queries on data. Such queries
are typically expressed in SQL or one of its derivatives,
such as Hibernate’s HQL (Hibernate website,
http://hibernate.org/), depending on the technology
underlying the application’s persistence.
Class diagrams (CD) are the foundational model-
ing language in which domain models are described.
We formulate them by means of a textual syntax de-
fined by the UML/P (Rumpe, 2011; Rumpe, 2012;
Schindler, 2012), a variant of the UML focused on
precise semantics and applicability to generative soft-
ware engineering. In any case, CDs mainly consist of
classes, interfaces, and associations.
Figure 1 shows a graphical representation of the
domain model for our example. It consists of classes
representing the messages received by the system and
the values they convey. Class SensorValue is pa-
rameterized with a generic parameter T which might
be bound with a type defined by another class or by
any type defined by an external language, e.g., Java.
MODELSWARD2015-3rdInternationalConferenceonModel-DrivenEngineeringandSoftwareDevelopment
20
Furthermore, its method getValues(key) contains
an embedded HQL expression that specifies its imple-
mentation.
2.2 Software Architecture
Cloud-based web systems typically consist of many
different components that are distributed over differ-
ent hardware nodes in a network. In addition, soft-
ware components and hardware nodes may replicate
dynamically at runtime to meet changing system load
levels. We model the overall software architecture
with a component and connector architecture descrip-
tion language (ADL) (Medvidovic and Taylor, 2000)
clArc that is derived from the ADL MontiArc (Haber
et al., 2012b) and includes cloud-specific language
extensions. Figure 2 shows the graphical represen-
tation of such a software architecture model. It is for-
mulated in terms of hierarchically decomposed com-
ponents. Components realize the system’s function-
ality by interacting through directed communication
channels over which they exchange messages asyn-
chronously. Channels are at each end connected to
typed ports which in sum represent a component’s in-
terface. These elements are defined by MontiArc.
The example shows two of the extensions clArc
introduces. Firstly, components may be marked as
replicating, indicating that they depict multiple run-
time instances in the actual system. Secondly, com-
ponents may have service ports in addition to reg-
ular ports. These ports represent “vertical” opera-
tional interfaces that integrate the component with
its runtime environment. In our example, the soft-
ware architecture model describes a service im-
plemented by the SensorDataSubmissionHandler
component. This component is decomposed into
four interconnected components. The DataStore and
EventBroadcaster component interact with their
runtime environment by referencing operational inter-
faces that they request and provide. These interfaces
are specified by an external language, e.g., Java. The
PatternMatcher component uses the language ex-
tension of replication to indicate that it may increase
or decrease its quantity dynamically at runtime ac-
cording to system load. Port declarations reference
type definitions given by the CD in Fig. 1, thereby de-
noting that messages exchanged via that port have to
correspond to the port’s referenced type. Moreover,
service ports reference the CD as a whole, indicating
that the operational interface represented by that port
is made such that it corresponds to basic database op-
erations inferred from the data model interpretation
of that CD. It is evident that integration concepts be-
tween these different and heterogeneous models are
required. In the following section we define such
concepts on the language level.
Figure 2: The software architecture model depicting the
service SensorDataSubmissionHandler which receives and
processes streams of sensor data. [clArc diagram; an
annotation marks a port type as a reference to the class
in the domain model.]
3 LANGUAGE INTEGRATION
CONCEPTS
In (Look et al., 2013) we already gave a first re-
alization of the language integration concepts de-
fined above and outlined their implementation for
grammar-based languages in the language workbench
MontiCore (Krahn et al., 2010). Here, we describe
these concepts as well as their application in detail.
In particular, we give a detailed description of the in-
tegration concepts, the integrated abstract syntax trees
(AST), and of references between AST nodes. Please
note that the corresponding parser integration
mechanisms and the resulting challenges are already
discussed in (Krahn et al., 2010).
The following descriptions make use of the ex-
tended grammar format defined by MontiCore. Such
grammars serve to systematically derive both the con-
crete syntax and the abstract syntax of a language,
as well as language processing infrastructure such as
parsers and pretty-printers. Described briefly, every
production of a grammar implies the existence of an
AST node class of the same name. The non-terminals
of that production form the set of attributes of the
AST node class. Their entirety forms the signature
of that class. In addition, grammars can define ab-
stract productions and interface productions. Other
productions can extend abstract productions and im-
plement interface productions, indicating that their
non-terminals can be used anywhere the extended/im-
plemented production’s non-terminals are used. Con-
sequently, the resulting AST node classes incorporate
the respective extended/implemented signatures into
their own. Abstract productions differ from normal
productions in that they need to have at least one normal
production that extends them. Interface productions
work the same way but do not define concrete
syntax and hence only consist of non-terminals. An
in-depth description of the MontiCore grammar format
is given in (Krahn et al., 2010).
Figure 3: The resulting ASTs for (a) aggregation, (b) embedding, and (c) inheritance. Aggregation results in separate ASTs for each
model. Embedding results in a single AST with subtrees embedded at the leaves of the host language. Inheritance also results
in a single AST containing extended nodes of the sublanguage (cf. (Look et al., 2013)).
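The mapping from productions to AST node classes described above can be pictured with a minimal Python sketch. It is not MontiCore's generated code (MontiCore generates Java classes); the interface production CDElement and all class and field names below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


class ASTNode:
    """Base class of all AST nodes in this sketch."""


class ASTCDElement(ASTNode):
    """Stands in for a hypothetical interface production `CDElement`.

    It defines no concrete syntax of its own; it only marks where
    implementing productions may be used.
    """


@dataclass
class ASTCDAttribute(ASTCDElement):
    """Hypothetical production: CDAttribute implements CDElement."""
    type_name: str
    name: str


@dataclass
class ASTCDMethod(ASTCDElement):
    """Hypothetical production: CDMethod implements CDElement."""
    return_type: str
    name: str


@dataclass
class ASTCDClass(ASTNode):
    """One node class per production; its nonterminals become attributes."""
    modifier: str
    name: str
    # Any node implementing the interface production is accepted here.
    elements: List[ASTCDElement] = field(default_factory=list)


sensor_value = ASTCDClass(
    modifier="public",
    name="SensorValue",
    elements=[ASTCDAttribute("String", "key"), ASTCDAttribute("T", "value")],
)
```

The design point is that ASTCDClass only depends on the interface node type, so further implementing productions can be added without touching it.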
3.1 Language Aggregation
Language aggregation combines multiple languages
into a collection of languages (called language fam-
ily), such that models of these languages can be in-
terpreted together but remain formulated in separate
artifacts, as shown in Sect. 2.2. There, port decla-
rations reference type definitions given by the class
diagram. The individual languages in a language fam-
ily are loosely coupled and able to mutually reference
each other’s elements. For instance, a declaration in a
model of one language may reference a type declared
by a model of another language.
Figure 4 shows how aggregation works on a con-
ceptual level and how concrete aggregations can be
defined within the MontiCore framework. The left
half shows two grammars (MCG) while the right half
shows a class diagram and a clArc model that corre-
spond to their respective grammar on the left. The
upper left part shows an excerpt of the grammar for
UML/P CDs. In particular, it shows the production
of the CDClass nonterminal which defines a class as
consisting of a modifier, i.e., private, protected,
or public, followed by the keyword class and the
name of the class. An opening curly bracket follows,
enclosing arbitrarily many attributes, methods, or
constructors in arbitrary order, and closing with a
curly bracket. The upper right part contains an
instance of the production in concrete syntax in which
a SensorValue class is defined. The lower left part
shows an excerpt of the clArc grammar in which the
production for a component port declaration is de-
fined. As explained earlier, the Type of a port is a
name interpreted as a reference to a data type. The
lower right part shows an instance of the production
referencing a SensorValue type.
Via language aggregation CD and ADL models
can be combined such that the type references in the
ADL model are interpreted as references to class def-
initions in the CD. The technical realization of lan-
guage aggregation works in two steps. Firstly, every
model of every language is parsed individually, re-
sulting in an AST for each model as shown in Fig. 3.
In our example, the type reference in the AST of the
architecture model is represented by an AST node
of type Name containing the name of the reference,
whereas the class definition in the AST of the CD is
represented by an AST node of type CDClass. Sec-
ondly, the references are related to each other by a
symbol table. Conceptually, the symbol table man-
ages a kind of link between AST nodes. Technically,
the links are implemented by adapter classes to al-
low for flexible linking to AST nodes from other lan-
guages. The details of this are described in Sect. 4.
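The two steps above can be sketched as follows. This is a simplified illustration in Python, not the MontiCore API: the classes TypeSymbol, CDClassAdapter, and SymbolTable are hypothetical names, and parsing is replaced by hand-built AST nodes.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CDClassNode:
    """AST node from the CD language."""
    name: str


@dataclass
class PortNode:
    """AST node from the clArc language."""
    direction: str
    type_name: str  # raw name only, as produced by the parser
    name: str


class TypeSymbol:
    """What the ADL expects of a referenced type (assumed interface)."""

    def type_name(self) -> str:
        raise NotImplementedError


class CDClassAdapter(TypeSymbol):
    """Presents a CD class as an ADL type; belongs to neither language."""

    def __init__(self, cd_class: CDClassNode):
        self.cd_class = cd_class

    def type_name(self) -> str:
        return self.cd_class.name


class SymbolTable:
    def __init__(self) -> None:
        self._entries: Dict[str, TypeSymbol] = {}

    def define(self, name: str, symbol: TypeSymbol) -> None:
        self._entries[name] = symbol

    def resolve(self, name: str) -> Optional[TypeSymbol]:
        return self._entries.get(name)


# Step 1: every model is parsed on its own (parsing is elided here).
cd_ast = [CDClassNode("SensorValue")]
arc_ast = [PortNode("out", "SensorValue", "dataToStore")]

# Step 2: the symbol table links raw names to adapted foreign nodes.
table = SymbolTable()
for cd_class in cd_ast:
    table.define(cd_class.name, CDClassAdapter(cd_class))

resolved = {port.name: table.resolve(port.type_name) for port in arc_ast}
```

Because the adapter is a separate artifact, neither AST node class needs to know the other language exists.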
Language aggregation is adequate for modeling
different aspects of a system, each of which can be un-
derstood on their own. Each aspect is then described
by individual model documents in specialized modeling
languages. Through aggregation, these models are
related to each other without imposing a tight coupling
between them. Thereby, models can be reused
in different combinations and in a modular way. For
instance, the same CD can be used to define the same
types which can then be referenced by different mod-
els and in different development projects.
3.2 Language Embedding
Language embedding combines languages such that
they can be used in a single model. To this end, an em-
bedding language incorporates elements from other
languages at distinguished extension points. Even
though this gives the impression of tight coupling,
the individual languages are still developed indepen-
dently and integrated in a black-box way. References
between elements of embedding and embedded lan-
guages work similarly to language aggregation. Sect.
2.1 shows an example of CDs with embedded HQL.
MODELSWARD2015-3rdInternationalConferenceonModel-DrivenEngineeringandSoftwareDevelopment
22
CDClass =
Modifier "class" Name "{"
( CDAttribute
| CDConstructor
| CDMethod)* "}"
ClArcPort =
("in" | "out")
Type Name? ["[*]"]? ;
public class SensorValue {
String key;
T value;
}
out SensorValue dataToStore
types are modeled in a
class diagram
adaptation of language
constructs
MCG
MCG clArc
CD
Figure 4: The mechanism for aggregating languages. By adaptation between elements of two independent languages refer-
encing between them is achieved. The right half of the figure shows concrete models of aggregated languages referencing
each other.
Figure 5 shows how the embedding is accom-
plished within the MontiCore framework.
The upper left part again shows an excerpt of the
CD grammar with the production of the nonterminal
CDMethod. The production describes a method with
a modifier, a return type, a name followed by param-
eters enclosed in round brackets, and a Body. The
body is defined as an external nonterminal. Such
external productions act as the extension points into
which elements from other languages can be embedded.
In fact, the language is not complete as long as
its external productions have not been bound to
external language elements. Note that neither the
grammar nor the external production contains any
information about filling the external nonterminal,
leaving the binding to a later stage. The lower left part
shows an excerpt of an HQL grammar containing the
definition of an HQL block statement that encloses
multiple statements with curly brackets. The production
of the HQLStatement nonterminal comprises further
nonterminals, such as SELECT or INSERT statements. Again,
the HQL grammar does not contain any explicit
reference to language embedding. The actual embedding
is specified in the language configuration model,
shown in the middle part of Fig. 5. The model maps
the nonterminal HQLBlock to the external nonterminal
Body. It is also possible to embed several languages
into a single external production by mapping several
nonterminals to it, for instance, an additional
nonterminal from vanilla SQL.
After parsing, the resulting AST consists of different
nodes of the different languages, as shown in Fig. 3.
Nodes from embedded languages manifest as subtrees
attached in place of the node representing the external
production.
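The late binding of an external nonterminal can be illustrated with a small Python sketch. It assumes hand-written stand-in parsers rather than MontiCore's generated ones; the Node class, the embedding dictionary, and both parse functions are hypothetical and only mimic the roles of the CD grammar, the HQL grammar, and the separate language configuration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    kind: str
    language: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def parse_hql_block(text: str) -> Node:
    """Stand-in for the HQLBlock production of the HQL grammar."""
    inner = text.strip()
    if not (inner.startswith("{") and inner.endswith("}")):
        raise ValueError("HQLBlock must be enclosed in curly brackets")
    statement = " ".join(inner[1:-1].split())
    block = Node("HQLBlock", "HQL")
    block.children.append(Node("HQLStatement", "HQL", statement))
    return block


# Separate configuration artifact: external nonterminal -> embedded parser.
# Neither stand-in parser mentions the other language.
embedding: Dict[str, Callable[[str], Node]] = {"Body": parse_hql_block}


def parse_cd_method(signature: str, body_text: str) -> Node:
    """Stand-in for CDMethod; delegates the external Body to its binding."""
    if "Body" not in embedding:
        raise LookupError("external nonterminal Body is unbound")
    method = Node("CDMethod", "CD", signature)
    # The embedded subtree replaces the node of the external production.
    method.children.append(embedding["Body"](body_text))
    return method


ast = parse_cd_method(
    "public Object getValues(String key)",
    "{ SELECT Object(v) FROM SensorValue v WHERE v.key = :key }",
)
```

Swapping the dictionary entry for another parser would embed a different language without changing either stand-in grammar.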
Language embedding is especially useful when
the language developer does not want to force the use
of a specific language but wants to allow choosing the
sublanguage later. For instance, it can be used to
embed different action languages within a structural
language to specify behavior.
3.3 Language Inheritance
Language inheritance can be used to extend or refine
an existing language. For this purpose, MontiCore
allows defining new languages on the basis of existing
languages by reusing, modifying and overriding their
productions. The example in Sect. 2.2 illustrates how
the clArc language extends the MontiArc language
and adds cloud-specific extensions.
Figure 6 illustrates how this extension is defined
within MontiCore’s grammar format. The upper left
part shows a production (with details omitted) from
the MontiArc grammar which specifies the nontermi-
nal ArcPort. Each port starts with either the key-
word in or the keyword out followed by a Type and
a Name. Both Type and Name define possible identifiers
for type and instance names, similar to the naming
scheme used in Java. The upper right part shows a
model element that conforms to the production shown
in the upper left part. The lower left part shows an
excerpt of the textual clArc grammar extending the
MontiArc grammar. The name of the grammar is fol-
lowed by the keyword extends and a reference to the
extended grammar. The production of the nontermi-
nal ClArcPort also contains the keyword extends
and a reference to the name of the extended production
inherited from the MontiArc grammar. The
right-hand side of the production contains all elements
present in the parent production and adds the
possibility of specifying an additional terminal [*] which
denotes replicating ports. It is not obligatory to keep
all elements of the parent production. Instead, it is
also possible to leave some out, reorder them, add new
elements in between, or even remove all elements. The
lower right part shows a model element that corresponds
to the production on the lower left.
Figure 5: The mechanism for embedding languages. By declaring an external nonterminal Body and a separate mapping
artifact, nonterminals of arbitrary languages, such as BlockStatement, can be embedded. The right half of the figure shows
a concrete model with the embedded element.
Grammar excerpt for CDs (MCG):
    CDMethod =
      Modifier ReturnType
      Name "(" CDParameters* ")"
      Body | ";" ;
    external Body;
Language configuration:
    language embedding {
      HQLBlock in
      Body;
    }
Grammar excerpt for HQL (MCG):
    HQLBlock = "{"
      HQLStatement*
      "}" ;
    HQLStatement = ...
CD models:
    public Object getValues(String key);
    public Object getValues(String key){
      SELECT Object(v)
      FROM SensorValue v
      WHERE v.key = :key
    }
Productions of the extended language (the “parent”
language) are “virtually” copied into the extending
new language where they can be referenced from new
productions. In addition, new productions can indi-
vidually extend productions from the parent language
and thereby inherit that production’s interface. This
means that the extending production can be used any-
where the non-terminal from the extended production
is used. The resulting new AST nodes consequently
implement the signatures of their parent counterparts.
The right part in Fig. 3 illustrates the structure
of ASTs from inheriting languages. The generated
parser for the sublanguage is able to parse text
corresponding to the parent language as well as text
corresponding to the sublanguage, and consequently
creates an AST containing node types from both
languages. Since parent grammars and nodes are
referenced by names, name collisions can occur. To
prevent this, the language designer may use fully qualified
names formed by the respective grammar’s package,
the grammar’s name and the name of the production.
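As a rough illustration of a sublanguage parser that accepts both parent and extended syntax, the following Python sketch handles the two port productions with one regular expression. It is a hand-written stand-in, not the parser MontiCore generates; the decision to return a plain ArcPort node for parent-language text is an assumption of this sketch.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArcPort:
    """AST node of the parent language (MontiArc)."""
    direction: str
    type_name: str
    name: Optional[str]


@dataclass
class ClArcPort(ArcPort):
    """AST node of the sublanguage; usable wherever an ArcPort is expected."""
    replicating: bool = False


# ArcPort   = ("in" | "out") Type Name? ;
# ClArcPort = ("in" | "out") Type Name? ["[*]"]? ;
_PORT = re.compile(r"(in|out)\s+(\w+)(?:\s+(\w+))?\s*(\[\*\])?$")


def parse_clarc_port(text: str) -> ArcPort:
    """Parse a port declaration of either the parent or the sublanguage."""
    match = _PORT.match(text.strip())
    if match is None:
        raise ValueError(f"not a port declaration: {text!r}")
    direction, type_name, name, star = match.groups()
    if star:
        return ClArcPort(direction, type_name, name, replicating=True)
    return ArcPort(direction, type_name, name)


plain = parse_clarc_port("out SensorValue dataToStore")
replicated = parse_clarc_port("out SensorValue dataToStore[*]")
```

Since ClArcPort subclasses ArcPort, code written against the parent node type keeps working on ASTs that contain sublanguage nodes.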
Language inheritance is particularly useful for
reusing existing concepts of languages while extend-
ing them with new concepts. It is applicable when
the inheriting language is conceptually similar to the
parent language.
4 LANGUAGE INTEGRATION
FRAMEWORK
In this section, we describe a framework to (1) imple-
ment symbol table infrastructures that relate model
elements in composed languages in a black-box way
and to (2) configure them for concrete language com-
positions based on the concepts introduced above.
Section 3 described how language embedding and
language inheritance manifest in the ASTs of pro-
cessed models that are instances of combined lan-
guages (cf. Fig. 3). Both mechanisms make infor-
mation about embedded or inherited model elements
directly available in the AST for further model pro-
cessing, such as code generation. However, this is
not the case for models of aggregated languages in
which elements of one language reference elements
of another language by name. Here, the AST nodes
only contain the raw name of the referenced model
element. The same holds for embedding and embed-
ded languages which refer to each other through raw
names as well. Consequently, additional infrastruc-
ture is necessary to translate raw name references into
information about referenced model elements.
Figure 6: The mechanism for inheritance between languages. By declaring an extension, languages can inherit from each
other and are able to override productions. The right half of the figure shows a concrete model using the inheritance.
Grammar excerpt for MontiArc (MCG):
    grammar MontiArc {
      ArcPort =
        ("in" | "out")
        Type Name?;
    }
Grammar excerpt for clArc (MCG):
    grammar ClArc extends
        ArchitectureDiagram {
      ClArcPort extends ArcPort =
        ("in" | "out")
        Type Name? ["[*]"]? ;
    }
MontiArc (MA) model:
    out SensorValue dataToStore
clArc model:
    out SensorValue dataToStore[*]
In the following, we describe an infrastructure
named symbol table that allows (a) acquiring
information from referenced models and (b) transparently
interpreting elements of one language as elements
of another. Compared to traditional symbol table
techniques, our realization must be able to translate
these different kinds of concepts between languages.
For example, automata know about states
and input signals, whereas Java knows nothing about
these. To integrate these languages nonetheless, the
concept of a state must be translated into Java in a
meaningful way. For instance, states could be mapped
to an enumeration or to subclasses (as, e.g., in
the state pattern (Gamma et al., 1995)). To keep both
languages independent, we cannot define this
translation in either language. Instead, we need to define
it as a separate artifact during language integration.
The ability to do this is one key feature of our symbol
table framework.
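One way to picture such a separate translation artifact is the following Python sketch, which maps the states of an automaton to an enumeration-like type entry. The classes StateEntry, JavaEnumEntry, and StatesAsEnumAdapter, as well as the naming convention for the generated type, are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StateEntry:
    """Symbol-table entry of a hypothetical automaton language."""
    name: str


@dataclass
class JavaEnumEntry:
    """What a Java-like language understands: an enum type with constants."""
    type_name: str
    constants: List[str]


class StatesAsEnumAdapter:
    """Integration artifact that is defined outside both languages."""

    def __init__(self, automaton_name: str, states: List[StateEntry]):
        self._automaton_name = automaton_name
        self._states = states

    def as_java_enum(self) -> JavaEnumEntry:
        # Assumed convention: one upper-case constant per state.
        constants = [state.name.upper() for state in self._states]
        return JavaEnumEntry(f"{self._automaton_name}State", constants)


adapter = StatesAsEnumAdapter(
    "Receiver", [StateEntry("idle"), StateEntry("receiving")]
)
enum_entry = adapter.as_java_enum()
```

A second adapter could map the same states to subclasses instead, which is why the translation is kept out of both language definitions.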
4.1 Symbol Table Concepts
A symbol table is a data structure that is used to store
and to resolve identifiers within a language. An iden-
tifier, such as a name, is associated with further infor-
mation from the corresponding language element. In
this way detailed information may be gathered from
the symbol table by resolving an element using its
name. In (Völkel, 2011) the most important parts of a
symbol table are defined. These are:
Definition 1 (Entry and Kind). A symbol is an entry
in a symbol table that represents a named element of
a model. It has a well defined signature determined
by its kind. This structure allows storing and reading
kind-specific information together with the entry.
An entry may be in distinct states that represent
the completeness of its name and the associated
information. For an unqualified entry, only its unqualified
name, e.g., SensorValue, is known. If the fully
qualified name, e.g., de.se.SensorValue, is known, the
entry is in the qualified state. If an entry is in the full
state, further information or entries, e.g., the
methods of type de.se.SensorValue, are associated
with the entry.
Definition 2 (Scope). A scope of an entry is that part
of a model in which the element that is represented by
the entry may be referenced by its name. Thus, the
entry is visible and not hidden within its scope.
Definition 3 (Namespace). A namespace is a section
of a model in which names and corresponding en-
tries are managed together in symbol tables. Usu-
ally, namespaces are attached to nonterminals that
open namespaces and are thus organized hierarchi-
cally. A namespace may import names and corre-
sponding entries from other namespaces, e.g., from a
parent namespace. Entries with the same name stored
in different namespaces may shadow each other, if the
namespaces are related.
The structural relation of these elements is depicted
in Fig. 7. A NameSpace may have arbitrarily
many child namespaces and an optional parent
namespace to represent a namespace hierarchy. A
SymbolTable associates entries (Entry) with their
name. Entries are serialized in a preprocessed,
condensed form that allows for fast loading of their
contained information.
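The interplay of hierarchical namespaces and shadowing can be sketched in a few lines of Python. This is a simplification under stated assumptions: imports, exports, and forwarding between namespaces are not modeled, each namespace holds a single dictionary as its symbol table, and the method names are invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    name: str
    kind: str


class NameSpace:
    """A namespace with an optional parent and arbitrarily many children."""

    def __init__(self, label: str, parent: Optional["NameSpace"] = None):
        self.label = label
        self.parent = parent
        self.children: List["NameSpace"] = []
        self.table: Dict[str, Entry] = {}  # the symbol table of this namespace

    def open_child(self, label: str) -> "NameSpace":
        child = NameSpace(label, parent=self)
        self.children.append(child)
        return child

    def define(self, entry: Entry) -> None:
        self.table[entry.name] = entry

    def resolve(self, name: str) -> Optional[Entry]:
        """Look up a name locally first, then in the enclosing namespaces."""
        scope: Optional[NameSpace] = self
        while scope is not None:
            if name in scope.table:
                return scope.table[name]  # inner entries shadow outer ones
            scope = scope.parent
        return None


root = NameSpace("SensorValue")
root.define(Entry("key", "attribute"))
method = root.open_child("getValues")
method.define(Entry("key", "parameter"))
```

Resolving "key" inside the method namespace yields the parameter, while the class namespace still sees the attribute.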
IntegrationofHeterogeneousModelingLanguagesviaExtensibleandComposableLanguageComponents
25
SymbolTable
NameSpace
Entry
parent
children
*
encapsulated
imported
*
exported
*
forwarded
*
1
name
0,1
0,1
CD
Figure 7: Hierarchical namespaces, their symbol tables and
related entries (cf. (V
¨
olkel, 2011)).
Figure 8: Technical components of a symbol table (cf.
(V
¨
olkel, 2011)).
4.2 Symbol Table Components
The previously described symbol table and names-
pace structure has to be created for each language.
The MontiCore framework (Grönniger et al., 2006)
provides infrastructure for a uniform development of
technical symbol table components for modeling lan-
guages. The most important classes and interfaces are
depicted in Fig. 8.
Each concrete modular modeling language is represented by
the interface ILanguage. It offers the technical components
needed to create namespaces and symbol tables for an
instance of that language. Its entry creators (subclasses of
ConcreteASTAndNameSpaceVisitor) are used to set up the
namespace hierarchy of a model. Then, entries for model
elements are created and organized in the symbol tables of
the given namespace hierarchy. The registered
IInheritedEntriesCalculatorClients are used to compute
whether entries from imported namespaces are hidden by
locally defined entries. An IQualifierClient has to be
provided for each element kind of a language whose instances
may be referenced within the current or another model, such
as the referenced type of a field. The concrete qualifier
client is used to transfer entries of the corresponding kind
from the unqualified to the qualified state. Resolver
clients (IResolverClient) have to be provided for each entry
kind that may be referenced within the current namespace
hierarchy. The registered deserializers (EntryDeserializer)
load serialized entries from externally referenced models.
They are used to transition the entries that represent a
referenced model element from the qualified to the full
entry state. Associated context conditions that extend the
abstract class AbstractContextCondition are used to check
whether processed models are well-formed.
This way, a concrete ILanguage module offers all means to
process models or model parts of a certain language and to
produce a corresponding namespace and symbol table
hierarchy. The provided infrastructure additionally supports
inter-model relations, which allow resolving external
information defined in related models. The following
subsection describes how these components can be combined in
several ways to realize the language integration concepts
presented in Sect. 3.
Figure 9: Technical realization of MontiCore's language
composition mechanisms (cf. (Schindler, 2012; Völkel,
2011)). [Class diagram (CD): AbstractLanguage implements the
interface ILanguage; LanguageComponent (single language or
language fragment, e.g., MontiArc without embeddings) and
CompositeLanguage (complete language with embedded
fragments, e.g., CD with embedded HQL and Java, holding any
number of components); ModelingLanguage (MontiCore-specific
information, e.g., file extension, framework);
LanguageFamily (aggregation of different kinds of languages,
e.g., clArc using Java or CD types); DSLTool.]
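The entry states mentioned above (unqualified, qualified, full) can be illustrated with a minimal Java sketch. It is not MontiCore's implementation: the names EntryState, SketchEntry, qualify, and load are hypothetical, and the assumption that qualification prepends a package name is made only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names for the three entry states named in the text.
enum EntryState { UNQUALIFIED, QUALIFIED, FULL }

// Simplified stand-in for a symbol table entry; not MontiCore's Entry class.
final class SketchEntry {
    private String name;
    private EntryState state = EntryState.UNQUALIFIED;
    private String details; // only available in the FULL state

    SketchEntry(String name) { this.name = name; }

    String getName() { return name; }
    EntryState getState() { return state; }
    String getDetails() { return details; }

    // Role of a qualifier client: unqualified -> qualified.
    // Assumption: qualifying means prefixing the name with a package.
    void qualify(String packageName) {
        if (state != EntryState.UNQUALIFIED) {
            throw new IllegalStateException("entry is already qualified");
        }
        name = packageName + "." + name;
        state = EntryState.QUALIFIED;
    }

    // Role of a deserializer: qualified -> full, using stored information.
    void load(Map<String, String> serializedEntries) {
        if (state != EntryState.QUALIFIED) {
            throw new IllegalStateException("entry must be qualified first");
        }
        String stored = serializedEntries.get(name);
        if (stored == null) {
            throw new IllegalArgumentException("no serialized entry for " + name);
        }
        details = stored;
        state = EntryState.FULL;
    }
}

final class EntryStateDemo {
    public static void main(String[] args) {
        // Stands in for the condensed, serialized form of an external model.
        Map<String, String> serialized = new HashMap<>();
        serialized.put("shop.Customer", "class Customer { String name; }");

        SketchEntry type = new SketchEntry("Customer");
        type.qualify("shop");
        type.load(serialized);
        System.out.println(type.getName() + " is " + type.getState());
    }
}
```

The explicit state checks make the intended order visible: an entry that refers to an external model element has to be qualified before its serialized counterpart can be loaded.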
4.3 Configuration of Language Compositions
Language integration requires different effort depending on
the type of integration. The composition takes place
hierarchically to enable application of mechanisms in the
best possible order. Fig. 9 shows the different language
concepts required to achieve this compositionality.
Language aggregation of two or more existing modeling
languages is implemented in LanguageFamily instances. These
gather the different independent ModelingLanguages together
with the inter-language infrastructure, such as resolvers,
qualifiers, and adapters for symbol table integration, as
well as factories and inter-language context conditions.
Language families are used by MontiCore's DSLTools to
process sets of heterogeneous but related models. The clArc
language family, for instance, comprises a modeling language
for architecture models and a modeling language for CDs.
A ModelingLanguage is a black-box language and contains
language-specific information such as the file extension. It
may contain either a single language or a composition of
embedded languages, such as CDs with embedded HQL.
Therefore, modeling languages contain a hierarchy of
ILanguage implementations. Based on this, MontiCore creates
the infrastructure to parse model instances accordingly.
This infrastructure contains the correct combination of
parser and lexer for the model at hand.
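How a language family could hand each model of a heterogeneous set to the matching modeling language can be sketched as follows. This is a simplified illustration only: the classes ModelingLanguageSketch and LanguageFamilySketch, the selection by file extension, and the extensions "arc" and "cd" are assumptions and do not reproduce MontiCore's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a black-box modeling language.
final class ModelingLanguageSketch {
    private final String name;
    private final String fileExtension;

    ModelingLanguageSketch(String name, String fileExtension) {
        this.name = name;
        this.fileExtension = fileExtension;
    }

    String getName() { return name; }

    // A language accepts a model file if the file name ends with its extension.
    boolean accepts(String fileName) {
        return fileName.endsWith("." + fileExtension);
    }
}

// Hypothetical stand-in for a language family aggregating independent languages.
final class LanguageFamilySketch {
    private final List<ModelingLanguageSketch> languages = new ArrayList<>();

    LanguageFamilySketch add(ModelingLanguageSketch language) {
        languages.add(language);
        return this;
    }

    // Select the first registered language that accepts the given model file.
    Optional<ModelingLanguageSketch> languageFor(String fileName) {
        return languages.stream().filter(l -> l.accepts(fileName)).findFirst();
    }
}

final class LanguageFamilyDemo {
    public static void main(String[] args) {
        LanguageFamilySketch family = new LanguageFamilySketch()
            .add(new ModelingLanguageSketch("Architecture", "arc"))
            .add(new ModelingLanguageSketch("ClassDiagram", "cd"));

        for (String file : new String[] {"Shop.arc", "Customer.cd", "notes.txt"}) {
            String language = family.languageFor(file)
                .map(ModelingLanguageSketch::getName)
                .orElse("no language registered");
            System.out.println(file + " -> " + language);
        }
    }
}
```

The languages themselves stay unaware of each other; only the family knows which languages are aggregated.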
For single languages, modeling languages contain only a
single LanguageComponent, which provides the necessary
symbol table infrastructure and context conditions. A
LanguageComponent contains the information required to
process symbols of the respective single language, i.e., how
entries are created, deserialized, qualified, and resolved,
which context conditions are available, and which entry
types are exported (see Sect. 4.2). For embedded languages,
modeling languages contain a hierarchy of
CompositeLanguages. These are composed of the embedded
languages, each of which is represented by an implementation
of ILanguage. Language components and composite languages
are implemented as subclasses of AbstractLanguage, which
provides the functionality common to both. They implement
the interface ILanguage to allow utilization in a composite
(Gamma et al., 1995). Using the composite pattern for
embedding allows reusing the resulting language combinations
easily in different contexts, e.g., to embed Java and HQL
into CD. Composite languages and language families can be
considered as the glue between languages and their symbol
tables, as they hold the required adapters, resolvers,
qualifiers, and context conditions for their specific
composition.
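The composite structure described above can be sketched in Java as follows. The class names are taken from Fig. 9; the methods getName, getContextConditions, addContextCondition, and add, as well as the representation of context conditions as strings, are assumptions made for this illustration and are not MontiCore's actual signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Common interface of leaves and composites (name from Fig. 9, methods assumed).
interface ILanguage {
    String getName();
    List<String> getContextConditions();
}

// Shared functionality of language components and composite languages.
abstract class AbstractLanguage implements ILanguage {
    private final String name;
    private final List<String> ownConditions = new ArrayList<>();

    AbstractLanguage(String name) { this.name = name; }

    @Override public String getName() { return name; }

    void addContextCondition(String condition) { ownConditions.add(condition); }

    protected List<String> ownContextConditions() { return ownConditions; }
}

// Leaf: a single language or language fragment.
final class LanguageComponent extends AbstractLanguage {
    LanguageComponent(String name) { super(name); }

    @Override public List<String> getContextConditions() {
        return new ArrayList<>(ownContextConditions());
    }
}

// Composite: holds embedded languages plus its own inter-language conditions.
final class CompositeLanguage extends AbstractLanguage {
    private final List<ILanguage> components = new ArrayList<>();

    CompositeLanguage(String name) { super(name); }

    CompositeLanguage add(ILanguage language) {
        components.add(language);
        return this;
    }

    @Override public List<String> getContextConditions() {
        List<String> all = new ArrayList<>(ownContextConditions());
        for (ILanguage component : components) {
            all.addAll(component.getContextConditions());
        }
        return all;
    }
}

final class CompositeDemo {
    public static void main(String[] args) {
        LanguageComponent cd = new LanguageComponent("CD");
        cd.addContextCondition("class names are unique");
        LanguageComponent hql = new LanguageComponent("HQL");
        hql.addContextCondition("queried types exist");
        LanguageComponent java = new LanguageComponent("Java");

        // Combine CD and HQL first, then reuse the combination with Java.
        CompositeLanguage cdWithHql = new CompositeLanguage("CD+HQL");
        cdWithHql.add(cd).add(hql);
        cdWithHql.addContextCondition("HQL types refer to CD classes");

        CompositeLanguage full = new CompositeLanguage("CD+HQL+Java");
        full.add(cdWithHql).add(java);

        System.out.println(full.getName() + ": " + full.getContextConditions());
    }
}
```

Because a composite is itself an ILanguage, the CD and HQL combination can be nested into further composites without changing any of the participating language components.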
Figure 9 also shows that the order of language aggregation
is arbitrary and depends on the language engineer. Whether a
subset of the embedded languages should define a new
ModelingLanguage solely depends on the desire to reuse this
combination. It might, for example, be useful to combine CD
and HQL first, and to reuse the resulting combination with
different action languages. Please note that language
inheritance is not reflected in this structure, as the
resulting combined abstract syntax does not necessarily
require any interaction on the symbol level. If such
interaction is necessary, however, usually a new 'main'
entry has to be created that contains the additional
information resulting from the language extension. Thus,
clArc introduces a new component entry to contain the new
language features. To be reusable with the existing language
integration infrastructure of the inherited language, these
new entries need to be adapted to its entries accordingly.
Hence, clArc component entries need to be adapted to
MontiArc component entries. Similar to the development of
the symbol table of a new language, entries, entry creators,
qualifiers, and resolvers have to be registered for elements
added by the inheriting language. The inheriting language
can also reuse context conditions of the inherited language
and add new ones.
Figure 10: Language composition for the UML/P
ClArcCDLanguage (cf. (Schindler, 2012)). [Class diagram
(CD): the ModelingLanguage ClArcCDLanguage, the
CompositeLanguage CompositeHQLJavaCDLanguage, and the
LanguageComponents CDLanguageComponent,
JavaLanguageComponent, and HQLLanguageComponent.]
Figure 10 illustrates the language composition mechanisms
using the clArc CD language ClArcCDLanguage. The language
allows modeling CDs with embedded Java and HQL as
illustrated in Sect. 2.1. As this embedding happens on the
concrete syntax, there is no need to reference models of
other languages by name. Therefore, the language is
implemented as a modeling language that contains the
composite language CompositeHQLJavaCDLanguage, which
realizes the embedding. CompositeHQLJavaCDLanguage is
composed of language components for CD, Java, and HQL, and
contains adapters between the three languages as well as
inter-language context conditions. Language-internal context
conditions are defined in the language component.
Cross-language inter-model context conditions are defined in
language families, whereas cross-language intra-model
context conditions are defined in composite languages.
Adaptation between entries of two embedded languages, such
as the types of embedded HQL and the embedding CD, requires
the composite language to provide adapters, qualifiers, and
resolvers for type pairs. Adaptation between aggregated
languages of a language family requires configuring these
elements in the language family instead. Figure 11 shows the
elements required for the adaptation between type entries of
an HQL language and type entries of a CD language.
Integration requires providing a new adapter factory that is
marked as responsible for creating entries of a certain type
(here HQL type entries for CD types). When a CD type needs
to be qualified or resolved for embedded HQL, the factory
produces an adapter which behaves like an HQL type entry,
but delegates all methods to the adapted CD type entry.
Figure 11: Adaptation between type entries of HQL and those
of the CD language. [Class diagram (CD): CD2HQLAdapterFactory
(AdapterFactory) with the operation TypeEntry
create(CDTypeEntry t), CDType2HQLQualifier, and
CDType2HQLResolver; CDType2HQLAdapter refers to a
CDTypeEntry as adaptee, and its String getName() returns
adaptee.getName(); HQLTypeEntry (type entry of the HQL
language) and CDTypeEntry (type entry of the CD language)
each offer String getName(). A note labels the adaptation
elements as part of the language integration
infrastructure.]
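The adaptation shown in Fig. 11 can be sketched in Java as follows. The class names, the adaptee field, and the delegating getName() follow the figure; modeling HQLTypeEntry as an interface, returning HQLTypeEntry instead of the more general TypeEntry from the factory, and the helper describe are simplifying assumptions for this sketch.

```java
import java.util.Objects;

// Type entry of the CD language (simplified).
final class CDTypeEntry {
    private final String name;

    CDTypeEntry(String name) { this.name = name; }

    String getName() { return name; }
}

// Type entry of the HQL language, reduced to an interface for this sketch.
interface HQLTypeEntry {
    String getName();
}

// Behaves like an HQL type entry but delegates to the adapted CD type entry.
final class CDType2HQLAdapter implements HQLTypeEntry {
    private final CDTypeEntry adaptee;

    CDType2HQLAdapter(CDTypeEntry adaptee) {
        this.adaptee = Objects.requireNonNull(adaptee);
    }

    @Override public String getName() {
        return adaptee.getName();
    }
}

// Factory responsible for creating HQL type entries for CD types.
final class CD2HQLAdapterFactory {
    HQLTypeEntry create(CDTypeEntry t) {
        return new CDType2HQLAdapter(t);
    }
}

final class AdapterDemo {
    // Stands in for HQL-side code that only knows HQL type entries.
    static String describe(HQLTypeEntry type) {
        return "HQL sees type " + type.getName();
    }

    public static void main(String[] args) {
        CDTypeEntry customer = new CDTypeEntry("Customer");
        HQLTypeEntry adapted = new CD2HQLAdapterFactory().create(customer);
        System.out.println(describe(adapted));
    }
}
```

Neither CDTypeEntry nor the HQL-side code refers to the other language; only the adapter and its factory, which belong to the composite language or language family, know both.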
Using adaptation on the levels of composite languages and
language families allows developing languages without
considering a posteriori integration. As the languages are
free from integration premises, they can be composed
arbitrarily.
5 RELATED WORK
We have presented three mechanisms for the integration of
modeling languages. Integration takes place on the
syntactical level and enables language aggregation, language
embedding, and language inheritance. Related to our
contribution are other studies and approaches on general
syntax-oriented language integration. We discuss neither
language integration for specific language families (Barja
et al., 1994; Groenewegen and Visser, 2008), as these are
usually specifically created to be integrated, nor semantic
language integration (Grönniger and Rumpe, 2011; Hedin and
Magnusson, 2003; Wende et al., 2010; Wyk et al., 2008).
A study on language composition mechanisms distinguishes the
mechanisms “language extension, language restriction,
language unification, self-extension, and extension
composition” (Erdweg et al., 2012). The authors’ notion of
language extension also requires that languages can be
composed a posteriori and distinguishes language extension
from language integration. Our approach to language
extension allows overwriting nonterminals from the extended
language in order to reduce expressiveness. The proposed
notion of “language unification” matches our definition of
language aggregation, where two independent languages can be
used “unchanged by adding glue code only”. In their
definition of “self-extension”, the authors start from a
different definition of “language embedding” than we do:
there, language embedding means that a “domain-specific
language is embedded into a host language by providing a
host-language program that encapsulates the domain-specific
concepts and functionality” (Hudak, 1998), which defines the
use of domain-specific programs and is hardly recognizable
as language embedding. Accordingly, the authors’ definition
of “self-extension” requires that “the language can be
extended by programs of the language itself while reusing
the language’s implementation unchanged”, which also allows
“embedding” languages as strings into the host language,
e.g., SQL queries or regular expressions in Java. MontiCore
does not provide an explicit “self-extension” mechanism, but
supports it by embedding action languages that allow the
definition of programs. MontiCore languages also support the
language extension composition mechanisms denoted as
incremental extension and language unification as defined in
(Erdweg et al., 2012).
A recent study on language workbenches (Erdweg et al., 2013)
provides an overview of existing tools and their features.
In particular, Ensō (van der Storm et al., 2014), Más (Más
website http://www.mas-wb.com), MetaEdit+ (Kelly et al.,
1996), MPS (Dmitriev, 2004), Onion (Erdweg et al., 2013),
Rascal (Klint et al., 2009), Spoofax (Kats and Visser,
2010), SugarJ (Erdweg et al., 2011), the Whole Platform
(Solmi, 2005), and Xtext (Eysholdt and Behrens, 2010) are
reviewed. This review considers four dimensions: syntax,
validation, semantics, and editor services. According to the
overview, all of the presented tools are able to achieve
syntactical composition via different mechanisms.
Nevertheless, composition on the validation level depends on
the validation features of the respective workbench. Only
MPS, SugarJ, and Xtext provide validation for naming and
type checking similar to our approach to syntax-oriented
language integration, which covers concrete syntax, abstract
syntax, symbol tables, and context conditions. In (Voelter,
2013), the composition of languages in MPS is shown in more
detail. To compose types, new type definition rules have to
be applied to infer types via unification. Since MontiCore
is not projectional and uses independent parsers, we define
these connections between AST elements via our adaptation
mechanisms and not via generic type definition rules.
Furthermore, (Voelter, 2013) distinguishes between language
combination, extension, reuse, and embedding. While language
combination and reuse are similar to our notion of
aggregation, language extension corresponds to our language
inheritance, and language embedding is congruent with our
concept of embedding.
The authors of (Tomassetti et al., 2013) highlight
cross-language context conditions as an important source of
errors, propose to develop reusable cross-language context
conditions, and sketch how these can be implemented with
their language workbench. It remains to be discussed how
context conditions checking semantic properties specific to
a language family can be designed for reuse.
Another approach to deal with the complexity of language
integration is to employ domain-specific embedded languages
(DSELs) in a host language (e.g., Scala) (Hofer and
Ostermann, 2010). Regarding our example, this circumvents
the problems arising from using data types across languages
and allows reusing existing development infrastructure.
These approaches focus on syntax-oriented integration as
well, but language reuse is limited to languages of the same
host language, and DSELs often lack explicit meta-models
usable for integration purposes.
Attribute grammars (Knuth, 1968) allow enriching grammar
symbols with computation rules. Research in attribute
grammars has led to promising results regarding language
integration, such as forwarding (Wyk et al., 2002), and has
produced capable language workbenches as well (Wyk et al.,
2008). Using multiple inheritance of attribute grammars is
another interesting approach to language integration
(Mernik, 2013), which suffices to realize the language
composition mechanisms identified in (Erdweg et al., 2012).
6 FUTURE WORK AND CONCLUSION
Engineering of complex software systems requires MDE, where
language integration can help to deal with the heterogeneity
of the modeling languages involved. We introduced language
aggregation, language embedding, and language inheritance by
example. These language integration techniques allow
integration of languages without stipulating possible
integration partners or mechanisms a priori. This enables
composing languages with minimal effort.
We have illustrated how these integration mechanisms are
implemented in MontiCore. To achieve cross-language
resolution of names, the symbol table, language families,
and modeling languages were introduced. Language families
and modeling languages contain the glue to enable
cross-language model usage. This glue is implemented in the
form of adapters between entries of symbol tables. The
presented concepts and framework have been evaluated with
various languages for different domains (Haber et al.,
2012b; Haber et al., 2013; Haber et al., 2012a; Haber et
al., 2011a; Haber et al., 2011b; Haber et al., 2011c;
Navarro Pérez and Rumpe, 2013; Thomas et al., 2013; Ringert
et al., 2013a; Ringert et al., 2013b).
In the future, we will examine whether parts of the symbol
table infrastructure can be generated from the language and
its models directly. We will further investigate whether the
different language integration definition mechanisms (e.g.,
grammars for embedding, symbol table for aggregation) can be
unified. Furthermore, it might be possible to generate the
cross-language infrastructure from enriched models of the
respective language as well.
REFERENCES
Barja, M. L., Paton, N. W., Fernandes, A. A. A., Williams,
M. H., and Dinn, A. (1994). An effective deductive
object-oriented database through language integration. In
Proc. 20th Int. Conf. on Very Large Data Bases (VLDB).
Dmitriev, S. (2004). Language oriented programming:
The next programming paradigm. JetBrains onBoard,
1(2).
Erdweg, S., Giarrusso, P. G., and Rendel, T. (2012). Lan-
guage composition untangled. In Proceedings of the
Twelfth Workshop on Language Descriptions, Tools,
and Applications.
Erdweg, S., Rendel, T., Kästner, C., and Ostermann, K.
(2011). SugarJ: Library-based syntactic language
extensibility. In ACM SIGPLAN Notices.
Erdweg, S., van der Storm, T., Völter, M., Boersma, M.,
Bosman, R., Cook, W. R., Gerritsen, A., Hulshout, A.,
Kelly, S., Loh, A., et al. (2013). The state of the art in
language workbenches. Springer.
Eysholdt, M. and Behrens, H. (2010). Xtext: Implement
your language faster than the quick and dirty way.
In Proceedings of the ACM International Conference
Companion on Object Oriented Programming Sys-
tems Languages and Applications Companion.
France, R. and Rumpe, B. (2007). Model-driven Devel-
opment of Complex Software: A Research Roadmap.
Future of Software Engineering (FOSE ’07).
Gamma, E., Helm, R., Johnson, R., and Vlissides, J. (1995).
Design patterns: elements of reusable object-oriented
software. Addison-Wesley Professional.
Groenewegen, D. and Visser, E. (2008). Declarative ac-
cess control for webdsl: Combining language integra-
tion and separation of concerns. In Web Engineering,
2008. ICWE’08. Eighth International Conference on.
Grönniger, H., Krahn, H., Rumpe, B., Schindler, M., and
Völkel, S. (2006). MontiCore 1.0 - Ein Framework zur
Erstellung und Verarbeitung domänenspezifischer Sprachen.
Technical Report Informatik-Bericht 2006-04, Software
Systems Engineering Institute, Braunschweig University of
Technology.
Grönniger, H. and Rumpe, B. (2011). Modeling language
variability. In Foundations of Computer Software.
Modeling, Development, and Verification of Adaptive
Systems. Springer.
Haber, A., Hölldobler, K., Kolassa, C., Look, M., Rumpe,
B., Müller, K., and Schaefer, I. (2013). Engineering
Delta Modeling Languages. In Proceedings of the
17th International Software Product Line Conference.
Haber, A., Kutz, T., Rendel, H., Rumpe, B., and Schaefer, I.
(2011a). Delta-oriented Architectural Variability Us-
ing MontiCore. In ECSA ’11 5th European Confer-
ence on Software Architecture: Companion Volume.
Haber, A., Rendel, H., Rumpe, B., and Schaefer, I. (2011b).
Delta Modeling for Software Architectures. In
Tagungsband des Dagstuhl-Workshop MBEES: Modellbasierte
Entwicklung eingebetteter Systeme VII.
Haber, A., Rendel, H., Rumpe, B., and Schaefer, I. (2012a).
Evolving Delta-oriented Software Product Line Ar-
chitectures. In Large-Scale Complex IT Systems. De-
velopment, Operation and Management, 17th Mon-
terey Workshop 2012, Oxford, UK, March 19-21,
2012.
Haber, A., Rendel, H., Rumpe, B., Schaefer, I., and van der
Linden, F. (2011c). Hierarchical Variability Modeling
for Software Architectures. In Proceedings of Inter-
national Software Product Lines Conference (SPLC
2011).
Haber, A., Ringert, J. O., and Rumpe, B. (2012b). MontiArc
- Architectural Modeling of Interactive Distributed
and Cyber-Physical Systems. Technical Report AIB-
2012-03, RWTH Aachen.
Hedin, G. and Magnusson, E. (2003). Jastadd—an aspect-
oriented compiler construction system. Science of
Computer Programming.
Hibernate website http://hibernate.org/.
Hofer, C. and Ostermann, K. (2010). Modular domain-
specific language components in scala. In ACM SIG-
PLAN Notices.
Hudak, P. (1998). Modular domain specific languages and
tools. In Software Reuse, 1998. Proceedings. Fifth In-
ternational Conference on.
Kats, L. C. and Visser, E. (2010). The spoofax language
workbench: Rules for declarative specification of lan-
guages and ides. SIGPLAN Not.
Kelly, S., Lyytinen, K., and Rossi, M. (1996). Metaedit+ a
fully configurable multi-user and multi-tool case and
came environment. In Advanced Information Systems
Engineering.
Klint, P., van der Storm, T., and Vinju, J. (2009). Rascal:
A domain specific language for source code analysis
and manipulation. In Source Code Analysis and Ma-
nipulation, 2009. SCAM’09. Ninth IEEE International
Working Conference on.
Knuth, D. E. (1968). Semantics of context-free languages.
Mathematical systems theory.
Krahn, H., Rumpe, B., and Völkel, S. (2008). MontiCore:
Modular development of textual domain specific
languages. In Proceedings of Tools Europe.
Krahn, H., Rumpe, B., and Völkel, S. (2010). MontiCore: a
framework for compositional development of domain
specific languages. STTT.
Look, M., Perez, A. N., Ringert, J. O., Rumpe, B., and
Wortmann, A. (2013). Black-box Integration of Het-
erogeneous Modeling Languages for Cyber-Physical
Systems. In Proceedings of the 1st Workshop on the
Globalization of Modeling Languages (GEMOC), Mi-
ami, Florida, USA.
Medvidovic, N. and Taylor, R. (2000). A Classification
and Comparison Framework for Software Architec-
ture Description Languages. IEEE Transactions on
Software Engineering.
Mernik, M. (2013). An Object-oriented Approach to Lan-
guage Compositions for Software Language Engi-
neering. Journal of Systems and Software.
Más website http://www.mas-wb.com.
Navarro Pérez, A. and Rumpe, B. (2013). Modeling Cloud
Architectures as Interactive Systems. In 2nd International
Workshop on Model-Driven Engineering for High Performance
and Cloud computing (MDHPCL).
Object Management Group (2010). OMG Unified Modeling
Language (OMG UML), Superstructure Version 2.3 (10-05-05).
http://www.omg.org/spec/UML/2.3/Superstructure/PDF/.
Ringert, J. O., Rumpe, B., and Wortmann, A. (2013a). From
Software Architecture Structure and Behavior Model-
ing to Implementations of Cyber-Physical Systems. In
Software Engineering 2013 Workshop Proceedings.
Ringert, J. O., Rumpe, B., and Wortmann, A. (2013b).
MontiArcAutomaton : Modeling Architecture and
Behavior of Robotic Systems. In Workshops and Tuto-
rials Proceedings of the International Conference on
Robotics and Automation (ICRA).
Rumpe, B. (2011). Modellierung mit UML, volume 2nd
Edition. Springer.
Rumpe, B. (2012). Agile Modellierung mit UML:
Codegenerierung, Testfälle, Refactoring, volume 2nd
Edition. Springer.
Schindler, M. (2012). Eine Werkzeuginfrastruktur zur
agilen Entwicklung mit der UML/P. Aachener
Informatik-Berichte, Software Engineering, Band 11.
Shaker.
Solmi, R. (2005). Whole platform. PhD thesis, University
of Bologna.
Thomas, U., Hirzinger, G., Rumpe, B., Schulze, C., and
Wortmann, A. (2013). A New Skill Based Robot Pro-
gramming Language Using UML/P Statecharts. In
Proceedings of the 2013 IEEE International Confer-
ence on Robotics and Automation (ICRA), Karlsruhe,
Germany.
Tomassetti, F., Vetro, A., Torchiano, M., Voelter, M., and
Kolb, B. (2013). A model-based approach to lan-
guage integration. In Modeling in Software Engineer-
ing (MiSE), 2013 5th International Workshop on.
van der Storm, T., Cook, W. R., and Loh, A. (2014). The
design and implementation of Object Grammars. Sci-
ence of Computer Programming.
MODELSWARD2015-3rdInternationalConferenceonModel-DrivenEngineeringandSoftwareDevelopment
30
Völkel, S. (2011). Kompositionale Entwicklung
domänenspezifischer Sprachen. Aachener Informatik-Berichte,
Software Engineering Band 9. 2011. Shaker Verlag.
Voelter, M. (2013). Language and ide modularization and
composition with mps. In Generative and Trans-
formational Techniques in Software Engineering IV.
Springer Berlin Heidelberg.
Wende, C., Thieme, N., and Zschaler, S. (2010). A Role-
based Approach Towards Modular Language Engi-
neering. In Proceedings of the Second International
Conference on Software Language Engineering.
Wyk, E. V., Bodin, D., Gao, J., and Krishnan, L. (2008). Sil-
ver: an Extensible Attribute Grammar System. Elec-
tronic Notes in Theoretical Computer Science.
Wyk, E. V., de Moor, O., Backhouse, K., and Kwiatkowski,
P. (2002). Forwarding in attribute grammars for modular
language design. In Proc. 11th Intl. Conf. on
Compiler Construction, volume 2304 of LNCS.
IntegrationofHeterogeneousModelingLanguagesviaExtensibleandComposableLanguageComponents
31