A Domain-Specific Language for Abstract Syntax Model to Concrete Syntax Model Mappings

Luis Quesada, Fernando Berzal and Juan-Carlos Cubero

Department of Computer Science and Artificial Intelligence, University of Granada, CITIC, 18071, Granada, Spain

Keywords: Model-driven Software Development, Language Specification, Parser Generators, Abstract Syntax Model, Concrete Syntax Model.

Abstract: Model-based parser generators such as ModelCC effectively decouple language design from language processing. ModelCC allows the specification of the abstract syntax model of a language as a set of language elements and their relationships. ModelCC provides the necessary mechanisms to specify the mapping from the abstract syntax model (ASM) to a concrete syntax model (CSM). This mapping can be specified as a set of metadata annotations on top of the abstract syntax model itself or by means of a domain-specific language (DSL). Using a domain-specific language to specify the mapping from abstract to concrete syntax models allows the definition of different concrete syntax models for the same abstract syntax model. In this paper, we describe the ModelCC domain-specific language for ASM-CSM mappings and we showcase its capabilities by using the ModelCC ASM-CSM DSL to define itself.

1 INTRODUCTION

Model-based language specification techniques (Kleppe, 2007) decouple language design from language processing and automatically generate the corresponding language grammar, thus making the language design process less arduous.

ModelCC is a model-based parser generator (Quesada et al., 2011; Quesada, 2012) that allows the specification of the abstract syntax model of a language as a set of classes, which represent language elements, and relationships between those classes or language elements.

ModelCC allows mapping the abstract syntax model to concrete syntax models by imposing constraints over language elements and their relationships using either metadata annotations or a domain-specific language for the specification of language constraints.

In this paper, we propose the ModelCC domain-specific language for abstract syntax model to concrete syntax model mappings (hereafter referred to as the ModelCC DSL for ASM-CSM mappings) and present its specification in a model-based way using ModelCC. This domain-specific language ultimately allows model-based parser generators to decouple abstract syntax models from concrete syntax models.

Section 2 introduces model-based language specification and the ModelCC model-based parser generator. Section 3 describes the ModelCC domain-specific language for ASM-CSM mappings. Finally, Section 4 presents our conclusions and future work.

2 MODEL-BASED LANGUAGE SPECIFICATION

Most existing language specification techniques (Aho et al., 2006) require the language designer to provide a textual specification of the language grammar. The proper specification of such a grammar is a nontrivial process that depends on the lexical and syntax analysis techniques to be used, since each kind of technique requires the grammar to comply with a specific set of constraints. Each analysis technique is characterized by its expressive power, and this expressive power determines whether a given analysis technique is suitable for a particular language. The most significant constraints on formal language specification originate from the need to consider context-sensitivity, the need to perform an efficient analysis, and the inability of some techniques to resolve conflicts caused by grammar ambiguities.

In practice, when we want to build a complex data structure from an input encoded using a specific syntax, the implementation of the mandatory language processor requires the software engineer to build a grammar-based language specification for the input data and also to implement the conversion from the parse tree returned by the parser to the desired data structure, which is an instance of the data model.

Quesada L., Berzal F. and Cubero J.
A Domain-Specific Language for Abstract Syntax Model to Concrete Syntax Model Mappings.
DOI: 10.5220/0004671701580165
In Proceedings of the 2nd International Conference on Model-Driven Engineering and Software Development (MODELSWARD-2014), pages 158-165
ISBN: 978-989-758-007-9
© 2014 SCITEPRESS (Science and Technology Publications, Lda.)

Whenever the language specification has to be modified, the language designer has to manually propagate changes throughout the entire language processor tool chain, from the specification of the grammar defining the formal language (and its adaptation to specific parsing tools) to the corresponding data model. These updates are time-consuming, tedious, and error-prone. By making such changes labor-intensive, the traditional language processing approach hampers the maintainability and evolution of the language used to represent the data (Kats et al., 2010).

Moreover, it is not uncommon for different applications to use the same language. For example, the compiler, different code generators, and other tools within an IDE, such as the editor or the debugger, typically need to grapple with the full syntax of a programming language. Unfortunately, their maintenance typically requires keeping several copies of the same language specification synchronized.

The idea behind model-based language specification is that, starting from a single abstract syntax model (ASM) that represents the core concepts in a language, language designers can develop one or several concrete syntax models (CSMs). These CSMs can suit the specific needs of the desired textual or graphical representation. The ASM-CSM mappings can be performed, for instance, by annotating the abstract syntax model with the constraints needed to transform the elements in the abstract syntax into their concrete representation.
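As a rough illustration of one ASM served by several CSMs, the following Python sketch keeps the concrete syntax outside the model class. It is not ModelCC code: the `Assignment` class, the two constraint dictionaries, and the `render` function are hypothetical names introduced here, and for brevity the sketch writes text rather than parsing it.

```python
from dataclasses import dataclass


@dataclass
class Assignment:
    """ASM element: only the concepts, no keywords or delimiters."""
    target: str
    value: int


# Two hypothetical CSMs for the same element, kept apart from the class.
C_LIKE = {"prefix": "", "separator": " = ", "suffix": ";"}
VERBOSE = {"prefix": "set ", "separator": " to ", "suffix": ""}


def render(node, csm):
    """Write an Assignment in the concrete syntax described by csm."""
    return f'{csm["prefix"]}{node.target}{csm["separator"]}{node.value}{csm["suffix"]}'


node = Assignment("x", 1)
print(render(node, C_LIKE))   # x = 1;
print(render(node, VERBOSE))  # set x to 1
```

Changing or adding a concrete syntax only touches a dictionary, while the `Assignment` class stays as it is.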

This way, the ASM representing the language can be modified as needed without having to worry about the language processor and the peculiarities of the chosen parsing technique, since the corresponding language processor will be automatically updated. In this case, the language designer does not have to manually propagate changes throughout the language processor tool chain. Also, when different applications use the same language, there is no need to keep or maintain duplicate language models.

Finally, as the ASM is not bound to a particular parsing technique, evaluating alternative and/or complementary parsing techniques is possible without having to propagate their constraints into the language model. Therefore, by using an ASM, model-based language specification completely decouples language specification from language processing, which can be performed using whichever parsing techniques are suitable for the formal language implicitly defined by the abstract model and its concrete mapping.

Figure 1: Traditional language processing. [Diagram labels: context-free grammar (e.g. BNF), attribute grammar, conceptual model, concrete syntax model, abstract syntax model, textual representation (input), parser, abstract syntax tree (output).]

Figure 2: Model-based language processing. [Diagram labels: context-free grammar (e.g. BNF), conceptual model, concrete syntax model, abstract syntax model, textual representation (input), parser, abstract syntax graph (output).]

A diagram summarizing the traditional language design process is shown in Figure 1, whereas the corresponding diagram for the model-based approach is shown in Figure 2.

It should be noted that ASMs may represent non-tree structures, hence the use of the term ‘abstract syntax graph’ in Figure 2.

ModelCC is a parser generator that supports a model-based approach to the design of language processing systems (Quesada et al., 2011; Quesada, 2012).

Its starting ASM is created by defining classes that represent language elements and establishing relationships among those elements. Once the ASM is established, constraints can be imposed over language elements and their relationships as annotations in order to produce the desired ASM-CSM mappings.

The ASM is built on top of basic language elements, which can be viewed as the tokens in the model-driven specification of a language. ModelCC provides the necessary mechanisms to combine those basic elements into more complex language constructs, which correspond to the use of concatenation, selection, and repetition in the syntax-driven specification of languages.
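To make the correspondence concrete, the following Python sketch shows one way the three constructs can surface in an object model. It assumes the usual model-based reading (composition for concatenation, subclassing for selection, list-valued members for repetition); the classes are hypothetical and are not taken from ModelCC.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Expression:
    """Abstract element: its subclasses are the alternatives (selection)."""


@dataclass
class Number(Expression):
    """Basic element, comparable to a token."""
    value: int


@dataclass
class Sum(Expression):
    """Composite element: members in sequence (concatenation)."""
    left: Expression
    right: Expression


@dataclass
class Program:
    """List-valued member: zero or more elements (repetition)."""
    expressions: List[Expression] = field(default_factory=list)


# An instance of the model: two expressions, the first one a composite.
program = Program([Sum(Number(1), Number(2)), Number(3)])
```

Note that nothing in these classes says how a `Sum` is written down; that is left to the ASM-CSM mapping.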

ADomain-SpecificLanguageforAbstractSyntaxModeltoConcreteSyntaxModelMappings

159

Table 1: The metadata annotations supported by the ModelCC model-based parser generator.

Constraints on...       Annotation       Function
...patterns             @Pattern         Pattern matching definition of basic language elements.
                        @Value           Field where the recognized input element will be stored.
...delimiters           @Prefix          Element prefix(es).
                        @Suffix          Element suffix(es).
                        @Separator       Element separator(s) in lists of elements.
...cardinality          @Optional        Optional elements.
                        @Minimum         Minimum element multiplicity.
                        @Maximum         Maximum element multiplicity.
...evaluation order     @Associativity   Element associativity (e.g. left-to-right).
                        @Composition     Eager or lazy composition for nested composites.
                        @Priority        Element precedence level/relationships.
...composition order    @Position        Position of an element member relative to another.
                        @FreeOrder       The positions of all the element members may vary.
...references           @ID              Identifier of a language element.
                        @Reference       Reference to a language element.
Custom constraints      @Constraint      Custom user-defined constraint.
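The idea of attaching such constraints to model classes as metadata can be sketched with Python decorators. This is only an analogy under stated assumptions: `pattern`, `prefix`, `recognize`, and the `Variable` class are hypothetical stand-ins written for this example, not the ModelCC annotations or API.

```python
import re


def pattern(regex):
    """Hypothetical stand-in for @Pattern: record a regex on the class."""
    def decorate(cls):
        cls.constraints = {**getattr(cls, "constraints", {}), "pattern": regex}
        return cls
    return decorate


def prefix(text):
    """Hypothetical stand-in for @Prefix: record a literal prefix on the class."""
    def decorate(cls):
        cls.constraints = {**getattr(cls, "constraints", {}), "prefix": text}
        return cls
    return decorate


@prefix("$")
@pattern(r"[a-zA-Z][a-zA-Z0-9_]*")
class Variable:
    def __init__(self, name):
        self.name = name  # plays the role of the @Value field


def recognize(cls, text):
    """Build an instance if text is the prefix followed by a pattern match."""
    rules = cls.constraints
    lead = rules.get("prefix", "")
    if not text.startswith(lead):
        return None
    body = text[len(lead):]
    if re.fullmatch(rules["pattern"], body) is None:
        return None
    return cls(body)
```

Here the concrete syntax (`$` and the identifier pattern) lives on the class itself, which is exactly the coupling that a separate mapping language avoids.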

3 ModelCC DSL FOR ASM-CSM MAPPINGS

In ModelCC, the constraints imposed over ASMs to map them to particular CSMs can be declared as metadata annotations on the model itself. Now supported by all the major programming platforms, metadata annotations are often used in reflective programming and code generation (Fowler, 2002). Table 1 summarizes the set of constraints supported by ModelCC.

However, in order to allow the developer to specify several mappings, ModelCC also allows the specification of separate input files corresponding to separate sets of constraints by using the ModelCC DSL for ASM-CSM mappings.

In this section, we describe the ModelCC DSL for ASM-CSM mappings. We provide the ModelCC implementation of a parser for the DSL as an ASM complemented with annotations.

As an example of the usage of the language, we also provide the ModelCC implementation of a parser for the DSL as an ASM complemented with constraint specifications written in the DSL itself.

Subsection 3.1 outlines the language features. Subsection 3.2 provides the definition of the language as an ASM complemented with metadata annotations. Subsection 3.3 provides several equivalent definitions of the language as an ASM complemented with constraint specification files written in the language itself.

3.1 Language Features

The ModelCC DSL for ASM-CSM mappings supports the following features:

• The definition of constraints on patterns, delimiters, evaluation order, and references to language elements.

• The property-like specification of constraints for language elements and their members.

• The grammar-like specification of the concrete syntax of language elements by means of a regular-expression-like language.

While the semantics of property-like constraint definitions is equivalent to that of metadata annotation constraint definitions, grammar-like constraint specification allows for a more intuitive specification of ASM-CSM mappings.

Grammar-like constraint definitions may be more intuitive to traditional language designers who are familiar with syntax-driven language specification tools. Such constraint definitions can be redundant with the ASM as, for example, they can also include multiplicity constraints. ModelCC checks and reports if any syntax implicit in grammar-like constraint definitions conflicts with the language ASM.

Finally, ModelCC checks, reports, and ignores any constraints on language elements or language element members that do not exist.
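The check-report-ignore behavior for nonexistent targets can be sketched as follows. This is a minimal Python illustration under our own assumptions about the data layout (a dictionary from element names to member names, and constraints as tuples); `filter_constraints` is a hypothetical name and does not reflect how ModelCC implements the check.

```python
def filter_constraints(model, constraints):
    """Split constraints into those with valid targets and warning messages.

    model maps element names to the set of their member names;
    constraints is a list of (target, constraint_id, value) tuples, where
    target is "Element" or "Element.member".
    """
    kept, warnings = [], []
    for target, constraint_id, value in constraints:
        element, _, member = target.partition(".")
        if element not in model:
            warnings.append(f"unknown element: {element}")
        elif member and member not in model[element]:
            warnings.append(f"unknown member: {target}")
        else:
            kept.append((target, constraint_id, value))
    return kept, warnings


model = {"Element": {"name"}, "Identifier": {"name"}}
constraints = [
    ("Element.name", "separator", "."),
    ("Element.label", "prefix", "#"),  # member does not exist
    ("Comment", "prefix", "//"),       # element does not exist
]
kept, warnings = filter_constraints(model, constraints)
```

Only the first constraint survives; the other two are reported and dropped rather than aborting the whole mapping.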

MODELSWARD2014-InternationalConferenceonModel-DrivenEngineeringandSoftwareDevelopment

160

[UML class diagram. Classes, members, and annotations recovered from the figure:]

CSM
- constraints : ConstraintDefinition[] (0..*)

ConstraintDefinition
- target : Element
- @Prefix("[") @Suffix("]") constraintID : Identifier (0..1)
- @Prefix(":") constraint : ConstraintSpecification (0..1)

Element
- @Separator(".") name : Identifier[]

Identifier
- @Value name : String

ConstraintSpecification

ClausureSpecification
- constraint : ConstraintSpecification

OptionalSpecification
- constraint : ConstraintSpecification

PositiveClausureSpecification
- constraint : ConstraintSpecification

ParenthesizedSpecification
- constraint : ConstraintSpecification

SequenceSpecification
- constraints : ConstraintSpecification[]

PrecedenceSpecification
- @Separator("\<") constraints : ConstraintSpecification[]

AlternativeSpecification
- @Separator("\|") constraints : ConstraintSpecification[]

LiteralSpecification
- literal : Literal

Literal

Integer
- @Value value : int

Boolean
- @Value value : boolean

PatternSpecification
- pattern : Pattern

Pattern
- @Value regEx : String

ElementSpecification
- element : Element

Class-level annotations shown in the diagram: @Pattern("[a-zA-Z][a-zA-Z0-9_]*"), @Suffix("\""), @Prefix("\""), @Pattern(RegExMatcher), @Suffix("\*"), @Suffix("\+"), @Suffix("\)"), @Suffix("\?"), @Prefix("\("), @Priority(precedes=AlternativeSpecification), @Priority(precedes={AlternativeSpecification,SequenceSpecification}).

Figure 3: Definition of the ModelCC DSL for ASM-CSM mappings in ModelCC.

3.2 ModelCC Definition of the DSL for ASM-CSM Mappings

The ASM of the language is designed first. Then, it is mapped to a CSM by imposing constraints using metadata annotations on the model classes.

The resulting model, depicted as a UML class diagram in Figure 3, can be processed by ModelCC to generate the corresponding parser.

Figure 3 demonstrates the need for an alternative way of specifying constraints:

• When metadata annotations are used to define CSMs on top of the ASM, the concrete syntax is interleaved with the abstract syntax model in a way that burdens it, much as language processing is coupled with language specification in traditional syntax-driven language specification techniques.

• Also, separate CSMs cannot be defined on top of the ASM using metadata annotations.

3.3 Separating ASM and CSM

Once an initial implementation of the ModelCC DSL for ASM-CSM mappings provides a bootstrap, we provide implementations of the language that consist of an ASM and separate constraint definitions written in the language itself.

The bare model is depicted as a UML class diagram in Figure 4.

Starting from this ASM, we provide three different ASM-CSM mappings for the language.

• Grammar-like Specification. Figure 5 presents a grammar-like constraint set specified using the ModelCC DSL for ASM-CSM mappings.

  Some of the advantages of grammar-like mappings can be observed in the constraints specified for the ConstraintDefinition language element. A single constraint specification can include prefix constraints, suffix constraints, and language element member order constraints. Also, the constraint specification for ConstraintDefinition includes two multiplicity constraints (optionality, represented by the regex-like “?” operator) that are redundant with the ASM. ModelCC checks these multiplicity constraints for consistency with the ASM and reports any conflict at parser generation time.

  Another illustrative case of grammar-like mappings can be observed in the constraints specified for the Element language element. Although its member name is defined as a list in the ASM, the grammar-like constraint specification uses a classical explicit-list specification to specify the separator for list members.

• Property-like Specification. Figure 6 presents a property-like constraint set specified using the ModelCC DSL for ASM-CSM mappings.

  The property-like specification of ASM-CSM mappings mimics the specification of constraints on ASMs using metadata annotations. It can be observed that the constraints are specified as properties of language elements.

• Mixed Specification. Figure 7 presents another equivalent constraint set specified using the ModelCC DSL for ASM-CSM mappings.

  In this case, some constraints are specified grammar-like and some constraints are specified property-like. For example, separators in lists are specified using property-like constraint definitions, which may seem more intuitive to some language designers.

  It should be noted that constraint definitions differ from grammar rules in that several of them can be specified for separate members of the same language element, as can be observed in the ConstraintDefinition language element.
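The consistency check between redundant multiplicity constraints and the ASM, mentioned above for the grammar-like specification, can be pictured with a small Python sketch. The representation (dictionaries from member names to an optionality flag) and the name `multiplicity_conflicts` are assumptions made for this example only.

```python
def multiplicity_conflicts(asm_optional, grammar_optional):
    """Return the members whose optionality differs between ASM and rule.

    Both arguments map member names to True (optional) or False (mandatory).
    Members that the rule mentions but the ASM lacks are reported as well.
    """
    return sorted(
        member
        for member, optional in grammar_optional.items()
        if asm_optional.get(member) != optional
    )


# ConstraintDefinition in the ASM: constraintID and constraint are 0..1.
asm = {"target": False, "constraintID": True, "constraint": True}

# target ("[" constraintID "]")? (":" constraint)?
consistent_rule = {"target": False, "constraintID": True, "constraint": True}

# A rule that drops the "?" around constraintID would conflict with the ASM.
conflicting_rule = {"target": False, "constraintID": False, "constraint": True}
```

With these inputs, the first rule yields no conflicts and the second one reports `constraintID`, which is the kind of message a designer would expect at parser generation time.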

Finally, it should be noted that ASMs that are complemented with metadata annotations can also be complemented with files written in the ModelCC DSL for ASM-CSM mappings.

Metadata annotation constraints represent default values that apply, unless otherwise specified, to all the ASM-CSM mappings of a language.
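One way to read "default values that apply unless otherwise specified" is as a dictionary merge in which the mapping file wins. The Python sketch below follows that reading; the key layout and the function name `effective_constraints` are hypothetical, and the override order is our interpretation rather than a documented ModelCC rule.

```python
def effective_constraints(annotation_defaults, mapping_file):
    """Combine annotation defaults with the constraints of one mapping.

    Keys are (target, constraint_id) pairs; entries from the mapping file
    take precedence over the annotation defaults.
    """
    return {**annotation_defaults, **mapping_file}


defaults = {
    ("Element.name", "separator"): ".",
    ("Identifier.name", "pattern"): "[a-zA-Z][a-zA-Z0-9_]*",
}
slash_mapping = {("Element.name", "separator"): "/"}

merged = effective_constraints(defaults, slash_mapping)
```

In `merged`, the separator comes from the mapping file while the identifier pattern is inherited from the annotations, and `defaults` itself is left unchanged for other mappings.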

4 CONCLUSIONS AND FUTURE WORK

ModelCC is a model-based parser generator that allows using metadata annotations or a domain-specific language to specify abstract syntax model to concrete syntax model mappings.

In this paper, we have proposed and described the ModelCC domain-specific language for abstract syntax model to concrete syntax model mappings (ModelCC DSL for ASM-CSM mappings). This DSL allows the specification of separate abstract syntax model to concrete syntax model mappings.

As an example, we have specified the ModelCC DSL for ASM-CSM mappings as an ASM and several equivalent ASM-CSM mappings written in the DSL itself.

In the future, we plan to apply model-based language specification techniques to problems such as data integration and natural language processing. We also plan to incorporate different reference resolution techniques into ModelCC.

MODELSWARD2014-InternationalConferenceonModel-DrivenEngineeringandSoftwareDevelopment

162

[UML class diagram. Classes and members recovered from the figure:]

CSM
- constraints : ConstraintDefinition[] (0..*)

ConstraintDefinition
- target : Element
- constraintID : Identifier (0..1)
- constraint : ConstraintSpecification (0..1)

Element
- name : Identifier[]

Identifier
- name : String

ConstraintSpecification

ClausureSpecification
- constraint : ConstraintSpecification

OptionalSpecification
- constraint : ConstraintSpecification

PositiveClausureSpecification
- constraint : ConstraintSpecification

ParenthesizedSpecification
- constraint : ConstraintSpecification

SequenceSpecification
- constraints : ConstraintSpecification[]

PrecedenceSpecification
- constraints : ConstraintSpecification[]

AlternativeSpecification
- constraints : ConstraintSpecification[]

LiteralSpecification
- literal : Literal

Literal

Integer
- value : int

Boolean
- value : boolean

PatternSpecification
- pattern : Pattern

Pattern
- regEx : String

ElementSpecification
- element : Element

Figure 4: Definition of the abstract syntax model of the ModelCC DSL for ASM-CSM mappings in ModelCC.

ConstraintDefinition: target ("[" constraintID "]")? (":" constraint)?
Element: name ("." name)*
Identifier.name: "[a-zA-Z][a-zA-Z0-9_]*"
ClausureSpecification: constraint "\*"
OptionalSpecification: constraint "\?"
PositiveClauseSpecification: constraint "\+"
ParenthesizedSpecification: "\(" constraint "\)"
ConstraintSpecification: SequenceSpecification < PrecedenceSpecification
                         < AlternationSpecification
AlternationSpecification: constraints ("\|" constraints)*
PrecedenceSpecification: constraints ("\<" constraints)*
Boolean.value: "true|false"
Integer.value: "[0-9]+"

Figure 5: Grammar-like specification of the mapping from the abstract syntax model to the concrete syntax model of the ModelCC DSL for ASM-CSM mappings, written in the ModelCC DSL for ASM-CSM mappings itself.

ADomain-SpecificLanguageforAbstractSyntaxModeltoConcreteSyntaxModelMappings

163

ConstraintDefinition.constraintID[prefix]: "\["
ConstraintDefinition.constraintID[suffix]: "\]"
ConstraintDefinition.constraint[prefix]: ":"
Element.name[separator]: "."
Identifier.name: "[a-zA-Z][a-zA-Z0-9_]*"
ClausureSpecification[suffix]: "\*"
OptionalSpecification[suffix]: "\?"
PositiveClauseSpecification[suffix]: "\+"
ParenthesizedSpecification[prefix]: "\("
ParenthesizedSpecification[suffix]: "\)"
SequenceSpecification[precedes]: AlternationSpecification
                                 PrecedenceSpecification
ConstraintSpecification: SequenceSpecification < PrecedenceSpecification
AlternationSpecification.constraints[separator]: "\|"
PrecedenceSpecification[precedes]: AlternationSpecification
PrecedenceSpecification.constraints[separator]: "\<"
Boolean.value: "true|false"
Integer.value: "[0-9]+"

Figure 6: Property-like specification of the mapping from the abstract syntax model to the concrete syntax model of the ModelCC DSL for ASM-CSM mappings, written in the ModelCC DSL for ASM-CSM mappings itself.
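To show how a property-like line such as `Element.name[separator]: "."` decomposes into a target, an optional constraint identifier, and a value, the following Python sketch uses a regular expression. It is a simplified reading of Figure 6 that only handles quoted values (not element lists such as the `precedes` entries), and `parse_property_line` is a hypothetical helper, not the parser that ModelCC generates.

```python
import re

IDENT = r"[a-zA-Z][a-zA-Z0-9_]*"  # identifier pattern used in Figure 6
LINE = re.compile(
    rf'(?P<target>{IDENT}(?:\.{IDENT})*)'      # Element or Element.member
    rf'(?:\[(?P<constraint_id>{IDENT})\])?'    # optional [constraintID]
    r'\s*:\s*"(?P<value>.*)"'                  # colon, then a quoted value
)


def parse_property_line(line):
    """Return (target, constraint_id, value), or None if the line does not fit."""
    match = LINE.fullmatch(line.strip())
    if match is None:
        return None
    return match.group("target"), match.group("constraint_id"), match.group("value")
```

For example, `parse_property_line('Element.name[separator]: "."')` yields the target `Element.name`, the constraint identifier `separator`, and the value `.`; a line without a bracketed identifier yields `None` in the second position.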

ConstraintDefinition: "[" constraintID "]"
ConstraintDefinition: ":" constraint
Element.name[separator]: "."
Identifier.name: "[a-zA-Z][a-zA-Z0-9_]*"
ClausureSpecification: constraint "\*"
OptionalSpecification: constraint "\?"
PositiveClauseSpecification: constraint "\+"
ParenthesizedSpecification: "\(" constraint "\)"
ConstraintSpecification: SequenceSpecification < PrecedenceSpecification
                         < AlternationSpecification
AlternationSpecification.constraints[separator]: "\|"
PrecedenceSpecification.constraints[separator]: "\<"
Boolean.value: "true|false"
Integer.value: "[0-9]+"

Figure 7: Mixed specification of the mapping from the abstract syntax model to the concrete syntax model of the ModelCC DSL for ASM-CSM mappings, written in the ModelCC DSL for ASM-CSM mappings itself.

ACKNOWLEDGMENTS

Work partially supported by research project TIN2012-36951, “NOESIS: Network-Oriented Exploration, Simulation, and Induction System”, cofunded by the Spanish Ministry of Economy and the European Regional Development Fund (FEDER).

REFERENCES

Aho, A. V., Lam, M. S., Sethi, R., and Ullman, J. D. (2006). Compilers: Principles, Techniques, and Tools. Addison Wesley, 2nd edition.

Fowler, M. (2002). Using metadata. IEEE Software, 19(6):13–17.

Kats, L. C. L., Visser, E., and Wachsmuth, G. (2010). Pure and declarative syntax definition: Paradise lost and regained. In Proceedings of the ACM International Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA’10), pages 918–932.

Kleppe, A. (2007). Towards the generation of a text-based IDE from a language metamodel. volume 4530 of Lecture Notes in Computer Science, pages 114–129.

Quesada, L. (2012). A model-driven parser generator with reference resolution support. In Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering, pages 394–397.

Quesada, L., Berzal, F., and Cubero, J.-C. (2011). A language specification tool for model-based parsing. In Proceedings of the 12th International Conference on Intelligent Data Engineering and Automated Learning. Lecture Notes in Computer Science, volume 6936, pages 50–57.

ADomain-SpecificLanguageforAbstractSyntaxModeltoConcreteSyntaxModelMappings

165