structure from an input codified using a specific syn-
tax, the implementation of the mandatory language
processor requires the software engineer to build a
grammar-based language specification for the input
data and also to implement the conversion from the
parse tree returned by the parser to the desired data
structure, which is an instance of the data model.
Whenever the language specification has to be
modified, the language designer has to manually
propagate changes throughout the entire language
processor tool chain, from the specification of the
grammar defining the formal language (and its adap-
tation to specific parsing tools) to the correspond-
ing data model. These updates are time-consuming,
tedious, and error-prone. By making such changes
labor-intensive, the traditional language processing
approach hampers the maintainability and evolution
of the language used to represent the data (Kats et al.,
2010).
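The coupling described above can be illustrated with a minimal Java sketch. It is not taken from any particular tool: the `ParseNode`, `Assignment`, and `TraditionalPipeline` names, the one-rule grammar, and the hand-written `parse` method (standing in for a generated parser) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generic parse-tree node, as a generated parser might return it.
final class ParseNode {
    final String symbol;                               // grammar symbol, e.g. "ID" or "NUM"
    final String text;                                 // matched text for leaves, null otherwise
    final List<ParseNode> children = new ArrayList<>();

    ParseNode(String symbol, String text) {
        this.symbol = symbol;
        this.text = text;
    }
}

// Data model: the structure the application actually wants.
final class Assignment {
    final String name;
    final int value;

    Assignment(String name, int value) {
        this.name = name;
        this.value = value;
    }
}

final class TraditionalPipeline {
    // Assumed grammar, maintained separately in the parser specification:
    //   assignment ::= ID '=' NUM

    // Hand-written stand-in for the parser generated from that grammar.
    static ParseNode parse(String input) {
        String[] parts = input.trim().split("\\s*=\\s*");
        if (parts.length != 2 || !parts[0].matches("[a-z]+") || !parts[1].matches("[0-9]+")) {
            throw new IllegalArgumentException("not an assignment: " + input);
        }
        ParseNode root = new ParseNode("assignment", null);
        root.children.add(new ParseNode("ID", parts[0]));
        root.children.add(new ParseNode("EQ", "="));
        root.children.add(new ParseNode("NUM", parts[1]));
        return root;
    }

    // Hand-written conversion from parse tree to data model.
    // The child indices 0 and 2 silently depend on the shape of the grammar rule.
    static Assignment toModel(ParseNode tree) {
        String name = tree.children.get(0).text;
        int value = Integer.parseInt(tree.children.get(2).text);
        return new Assignment(name, value);
    }

    public static void main(String[] args) {
        Assignment a = toModel(parse("x = 42"));
        System.out.println(a.name + " -> " + a.value);
    }
}
```

In this sketch, changing the rule (say, to `let ID '=' NUM`) would require touching the grammar, the parser, and the child indices in `toModel`, which is the kind of manual propagation the text refers to.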
Moreover, it is not uncommon for different applications
to use the same language. For example, the
compiler, different code generators, and other tools
within an IDE, such as the editor or the debugger,
all need to handle the full syntax of a
programming language. Unfortunately, maintaining them
typically requires keeping several copies of the
same language specification synchronized.
The idea behind model-based language specifi-
cation is that, starting from a single abstract syntax
model (ASM) that represents the core concepts in a
language, language designers can develop one or sev-
eral concrete syntax models (CSMs). These CSMs
can suit the specific needs of the desired textual or
graphical representation. The ASM-CSM mappings
can be specified, for instance, by annotating the
abstract syntax model with the constraints needed to
transform the elements in the abstract syntax into their
concrete representation.
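As a rough illustration of annotating an abstract syntax model with concrete-syntax constraints, the following Java sketch declares its own `@Syntax` annotation and derives a grammar-like rule from it by reflection. The annotation, its attributes, and the `ConcreteSyntax.ruleFor` helper are hypothetical and defined here only for the example; they are not the API of any existing tool.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical annotation: concrete-syntax constraints attached to an ASM field.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Syntax {
    int order();                 // position of the element in the textual form
    String pattern();            // regular expression the element must match
    String prefix() default "";  // literal that must precede the element, if any
}

// Abstract syntax model: only the concepts of the language.
final class Assignment {
    @Syntax(order = 1, pattern = "[a-z]+")
    String name;

    @Syntax(order = 2, pattern = "[0-9]+", prefix = "=")
    String value;
}

final class ConcreteSyntax {
    // Derives a grammar-like rule from the annotations found on the model class.
    static String ruleFor(Class<?> asm) {
        List<Field> fields = new ArrayList<>();
        for (Field f : asm.getDeclaredFields()) {
            if (f.isAnnotationPresent(Syntax.class)) {
                fields.add(f);
            }
        }
        // Reflection does not guarantee declaration order, so sort explicitly.
        fields.sort(Comparator.comparingInt(f -> f.getAnnotation(Syntax.class).order()));

        StringBuilder rule = new StringBuilder(asm.getSimpleName()).append(" ::=");
        for (Field f : fields) {
            Syntax s = f.getAnnotation(Syntax.class);
            if (!s.prefix().isEmpty()) {
                rule.append(" '").append(s.prefix()).append("'");
            }
            rule.append(" /").append(s.pattern()).append("/");
        }
        return rule.toString();
    }

    public static void main(String[] args) {
        System.out.println(ruleFor(Assignment.class));
    }
}
```

The point of the sketch is that the concrete rule is computed from the annotated model rather than written by hand, so a second concrete syntax would only need a different set of constraints over the same classes.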
This way, the ASM representing the language can
be modified as needed without having to worry about
the language processor and the peculiarities of the
chosen parsing technique, since the corresponding
language processor will be automatically updated. In
this case, the language designer does not have to man-
ually propagate changes throughout the language pro-
cessor tool chain. Also, when different applications
use the same language, there is no need to keep or
maintain duplicate language models.
Finally, as the ASM is not bound to a particu-
lar parsing technique, evaluating alternative and/or
complementary parsing techniques is possible with-
out having to propagate their constraints into the
language model. Therefore, by using an ASM,
model-based language specification completely decouples
language specification from language processing,
which can be performed using whichever
parsing techniques are suitable for the formal language
implicitly defined by the abstract model and its
concrete mapping.
Figure 1: Traditional language processing. (Diagram labels: context-free grammar, e.g. BNF; attribute grammar; conceptual model; concrete syntax model; abstract syntax model; textual representation; parser; abstract syntax tree.)
Figure 2: Model-based language processing. (Diagram labels: context-free grammar, e.g. BNF; conceptual model; concrete syntax model; abstract syntax model; textual representation; parser; abstract syntax graph.)
A diagram summarizing the traditional language
design process is shown in Figure 1, whereas the cor-
responding diagram for the model-based approach is
shown in Figure 2.
Note that ASMs may represent non-tree
structures, hence the use of the term ‘abstract syntax
graph’ in Figure 2.
ModelCC is a parser generator that supports a
model-based approach to the design of language pro-
cessing systems (Quesada et al., 2011; Quesada,
2012).
Its starting ASM is created by defining classes
that represent language elements and establishing re-
lationships among those elements. Once the ASM is
established, constraints can be imposed over language
elements and their relationships as annotations in or-
der to produce the desired ASM-CSM mappings.
The ASM is built on top of basic language el-
ements, which can be viewed as the tokens in the
model-driven specification of a language. ModelCC
provides the necessary mechanisms to combine those
basic elements into more complex language con-
structs, which correspond to the use of concatenation,
selection, and repetition in the syntax-driven specifi-
cation of languages.
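One plausible way to picture this correspondence in plain Java, without using ModelCC itself, is to let field composition play the role of concatenation, subclassing the role of selection, and a collection-valued field the role of repetition. The class names below are hypothetical and the sketch omits any annotations or parsing.

```java
import java.util.Arrays;
import java.util.List;

// Selection: an Expression is one of its concrete subclasses.
abstract class Expression {
    abstract int evaluate();
}

// Basic language element, comparable to a token in a grammar.
final class IntegerLiteral extends Expression {
    final int value;

    IntegerLiteral(int value) {
        this.value = value;
    }

    @Override
    int evaluate() {
        return value;
    }
}

// Repetition: a Sum holds any number of operands.
final class Sum extends Expression {
    final List<Expression> operands;

    Sum(Expression... operands) {
        this.operands = Arrays.asList(operands);
    }

    @Override
    int evaluate() {
        int total = 0;
        for (Expression e : operands) {
            total += e.evaluate();
        }
        return total;
    }
}

// Concatenation: a Binding is a name followed by an expression.
final class Binding {
    final String name;
    final Expression value;

    Binding(String name, Expression value) {
        this.name = name;
        this.value = value;
    }
}

final class ModelDemo {
    public static void main(String[] args) {
        Binding b = new Binding("x", new Sum(new IntegerLiteral(1), new IntegerLiteral(2)));
        System.out.println(b.name + " = " + b.value.evaluate());
    }
}
```

Because `Sum` may contain other `Sum` instances, and a model could share one element between several parents, the resulting instance need not be a tree.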
A Domain-Specific Language for Abstract Syntax Model to Concrete Syntax Model Mappings