An Intermediate Language for Compilation to Scripting Languages

∗

Paola Giannini and Albert Shaqiri

Computer Science Institute, DiSIT, Università del Piemonte Orientale

Via Teresa Michel 11, 15121 Alessandria, Italy

Keywords:

Scripting Languages, Functional Languages, Intermediate Language, Translation.

Abstract:

In this paper we introduce an intermediate language for the translation of F#, a polymorphically typed functional language relying on the .NET platform, to different scripting languages, such as Python and JavaScript. This intermediate language (IL for short) is an imperative language, with constructs that make it possible to move a code fragment outside its definition environment during the translation. Definitions of names (variables and functions) are done in blocks, as in Python (and JavaScript), and do not have to statically precede their use. We present a translation of a core F# (including mutable variables) into IL.

1 INTRODUCTION

Implementing an application in JavaScript (or any other dynamically typed language) can cause problems due to the absence of type checking. Such problems can lead to unexpected application behaviour followed by onerous debugging. Although dynamic type checking and automatic type casting shorten the programming time, they introduce serious difficulties in the maintenance of medium to large applications. This is the reason why dynamically typed languages are used mostly for prototyping and quick scripting.

We propose to deal with these problems by using dynamically typed languages as “assembly languages” to which we translate source code written in F#, which is statically typed. In this way, we take advantage of the F# type checker and type inference system, as well as other F# constructs and paradigms such as pattern matching, classes, discriminated unions, namespaces, etc., and we may use the safe imperative features introduced via F# mutable variables. There are also the advantages of using an IDE such as Microsoft Visual Studio (code organization, debugging tools, IntelliSense, etc.).

To provide translation to different target languages we introduce an intermediate language, IL for short. This is useful, for instance, for translating to Python, which does not have complete support for functions as a first-class concept, or for translating to JavaScript, with or without libraries such as jQuery.

∗ This work has been partially supported by MIUR CINA-Compositionality, Interaction, Negotiation, Autonomicity for the future ICT society.

Our aim is to prove the correctness of the compilers produced. To do that we formalize IL, and the translation from the source language to IL. The language IL is imperative, and has some of the characteristics of scripting languages that make them flexible but difficult to check, such as blocks in which definition and use of variables may be interleaved, and in which the use of a variable may precede its definition. (IL is partly inspired by IntegerPython, see (Ranson et al., 2008).) Therefore, the proof of correctness of the translation from the source language already covers most of the gap from F# to the target scripting languages. In IL we also have some constructs that may be used to safely manipulate fragments of open code.

The paper is organized as follows. In Section 2, we introduce the challenges of the translation from F# to Python and JavaScript via some examples that led us to introduce our intermediate language. We also outline the translation from IL to both JavaScript and Python. In Section 3 we define the fragment of F# used as source language, and in Section 4 we formalize IL. The formal translation from F# to IL is defined in Section 5, where it is stated to preserve the dynamic semantics of F#. In Section 6 we compare our work with the work of others, and finally in Section 7 we summarize our work, briefly discussing the implementation issues and highlighting our plans for future work.

Giannini P. and Shaqiri A.

An Intermediate Language for Compilation to Scripting Languages.

DOI: 10.5220/0004588600920103

In Proceedings of the 8th International Joint Conference on Software Technologies (ICSOFT-EA-2013), pages 92-103

ISBN: 978-989-8565-68-6

 2013 SCITEPRESS (Science and Technology Publications, Lda.)

2 TRANSLATION BY EXAMPLES: DESIGN CHOICES

In the fragment of F# that we consider as the source of our translation we have the typical functional language constructs: function definition and application, integers, booleans, addition and the conditional expression, and an imperative fragment including mutable variables, assignment, and sequences of expressions. On the left-hand side of an assignment there must be a variable that was introduced with the mutable modifier.

2.1 Sequences of Expressions

Many F# constructs can be directly mapped to JavaScript (or Python), but when this is not the case we obtain a semantically equivalent behaviour by using the primitives offered by the target language. E.g., in F# a sequence of expressions is itself an expression, while in JavaScript and Python it is a statement. Suppose we want to translate a piece of code that calculates a Fibonacci number, binds the result to a name and also records whether the result is even or odd. In Fig. 1 we have one possible F# implementation.

    let z = 7
    let mutable even = false
    let x =
        let rec fib x =
            if x < 3 then 1
            else fib(x - 1) + fib(x - 2)
        let temp = fib z
        even <- (temp % 2 = 0)
        temp

Figure 1: F# program containing a sequence of expressions.

As we can see, on the right-hand side of “let x =” we have a sequence of expressions: the definition of the function fib followed by the definition of temp etc. This sequence is, in F#, an expression. If we directly map this code into JavaScript we obtain the syntactically incorrect code of Fig. 2. This program is syntactically wrong, since on the right-hand side of an assignment we must have an expression, while a sequence of expressions is, in JavaScript, a statement. To transform a sequence of statements into an expression, in JavaScript, we wrap the sequence into a function, and to execute it we call the function, i.e., we use a JavaScript closure and application. Also, the whole program is wrapped into an entry point function. In this way, the code of Fig. 3 is correct. Unfortunately, the same cannot be done in Python as its support for

    var z = 7;
    var even = false;
    var x =
        var fib = function (x) {
            if (x < 3) return 1;
            else return fib(x-1)+fib(x-2)};
        var temp = fib(z);
        even = (temp % 2) == 0;
        temp;
    return x;

Figure 2: Naive translation into JavaScript of a sequence of expressions.

    (function() {
        var z = 7;
        var even = false;
        var x = (function () {
            var fib = function (x) {
                if (x < 3) return 1;
                else return fib(x-1)+fib(x-2)};
            var temp = fib(z);
            even = (temp % 2) == 0;
            return temp })();
        return x })();

Figure 3: Correct JavaScript translation.

closures is partial. So we have to define a temporary function, say temp1, in the global scope, and to execute it we have to call temp1 in the place where the original sequence should be. However, variables such as even will be out of the scope of their definition, and this would make the translation wrong. To obtain a semantically equivalent behaviour, we have to pass to temp1 the variable even, by reference, since it may be modified in the body of temp1. Note that this problem is not present in JavaScript, where the closure is defined and called in the scope of even. Another problem in Python is related to lambdas, whose body must be an expression (not a sequence). So we define the function temp2 whose body contains the statements that should be placed where an expression is expected. In Fig. 4 we can see the translation of the code into Python. The class ByRef is used to wrap the mutable variable even to obtain a parameter called by reference. The Python code generator inserts the needed wrapping and unwrapping before and after the call of temp1, and in the body of temp1.

The problem we illustrated above occurs whenever in the target language we get a statement where an expression is expected. Since the target languages handle the situation differently, we abstract from this specific problem, and consider the more general problem of moving “open code” from its context, replacing it with an expression having the same behaviour. Taking inspiration from work on dynamic binding,

AnIntermediateLanguageforCompilationtoScriptingLanguages

    def temp1(w, z):
        def temp2(w, fib, x):
            if (x < 3): return 1
            else: return fib(x-1)+fib(x-2)
        fib = lambda x: temp2(w, fib, x)
        temp = fib(z)
        w.value = ((temp % 2) == 0)
        return temp

    def __main__():
        z = 7
        even = False
        wrapper1 = ByRef(even)
        x = temp1(wrapper1, z)
        even = wrapper1.value
        return x

    __main__();

Figure 4: Correct Python translation.
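Fig. 4 uses the class ByRef without showing its definition. The following is a minimal runnable sketch, assuming that ByRef is nothing more than a one-field mutable cell (an assumption, since the text does not define it). The entry point is renamed main and takes z as a parameter purely so that the sketch can be tried on several inputs.

```python
class ByRef:
    """Hypothetical one-field cell used to simulate call-by-reference."""

    def __init__(self, value):
        self.value = value


def temp1(w, z):
    # temp2 holds the statements that cannot appear in a lambda body.
    def temp2(w, fib, x):
        if x < 3:
            return 1
        else:
            return fib(x - 1) + fib(x - 2)

    fib = lambda x: temp2(w, fib, x)
    temp = fib(z)
    w.value = (temp % 2) == 0  # assignment to the caller's `even`
    return temp


def main(z):
    even = False
    wrapper1 = ByRef(even)   # wrap before the call
    x = temp1(wrapper1, z)
    even = wrapper1.value    # unwrap after the call
    return x, even
```

Calling main(7) goes through the same wrap, call, unwrap sequence as __main__ in Fig. 4 and returns both the computed number and the updated flag.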

see (Nanevski, 2003) and recent work by the authors, see (Ancona et al., 2013), we define a pair of boxing/unboxing constructs, that we call stm2exp and exc. The construct stm2exp wraps “open code” (in this case a sequence of expressions), providing the information on the environment needed for its execution, that is, the mutable and immutable variables occurring in it. This construct defines a value, similar to a function closure. The construct exc is used to execute the code contained in stm2exp. To do this it must provide values for the immutable variables, in our example the variable z, and bindings for the mutable variables to variables in the current environment, since when executing the code we have to modify the variable even.

With these constructs, the F# code of Fig. 1 would be translated into the IL code in Fig. 5. All the let constructs are translated to variable definitions. The sequence of statements on the right-hand side of “let x =” is packed into a stm2exp expression. Its first component is the translation of the sequence of statements; the second, w->EV, says that in the execution environment there should be a rebinding of the global name EV to a variable. Such a variable may (in this case will) be modified by the execution of the code through assignment to the local variable w. The third component says that a value for u must be provided. The variable u is not modified by the execution of the code. We choose to use global names to unbind/rebind mutable variables, EV in our example, so that the local variables can be consistently renamed without affecting the semantics of the construct, like formal parameters of functions. Names such as EV, instead, are global to the whole program.

    def y = stm2exp(
        def fib =
            fun x ->
                if x < 3 then 1
                else (fib (x-1) + fib (x-2));
        def temp = fib u;
        w <- temp % 2 = 0;
        temp,
        w->EV, u);
    def z = 7;
    def even = false;
    def x = exc(y, EV->even, z);

Figure 5: Translation of the F# sequence of expressions into the intermediate language.

To obtain the result that we would have by evaluating the sequence of statements in the current environment, the variable x is assigned the exc expression applied to y, which is bound to stm2exp(···). The name EV is bound to the (mutable) variable even and the variable u will be assigned the value of the variable z. Regarding the different treatment of mutable and immutable variables, notice that, even though our intermediate language is imperative, we know, since we are translating F# code, that some variables are immutable, so we have to provide just an initial value.

The constructs stm2exp and exc have a different translation into the target languages JavaScript and Python. In particular, for JavaScript we can take advantage of the fact that the closure wrapping the code can be inlined in the position where we have exc, so we can substitute both the mutable and immutable variables, whereas the translation to Python treats the two kinds of variables differently.
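One way to read the pair stm2exp/exc operationally is sketched below in Python. This is an illustration only, not the output of the code generator: the class Ref, the functions stm2exp and exc, and the dictionary-based rebinding are all assumptions made for this sketch.

```python
class Ref:
    """Mutable cell standing for a mutable variable (assumption of this sketch)."""

    def __init__(self, value):
        self.value = value


def stm2exp(body, unbindings, immutables):
    """Box open code.

    body       -- function taking (refs, vals): refs maps local mutable names
                  to Ref cells, vals maps local immutable names to values
    unbindings -- dict: local mutable name -> global name, e.g. {"w": "EV"}
    immutables -- list of local immutable names, e.g. ["u"]
    """
    return (body, unbindings, immutables)


def exc(boxed, rebindings, values):
    """Run boxed code.

    rebindings -- dict: global name -> Ref of a variable in the current environment
    values     -- values for the immutable variables, in order
    """
    body, unbindings, immutables = boxed
    # The names unbound by stm2exp must be among those rebound by exc.
    missing = set(unbindings.values()) - set(rebindings)
    if missing:
        raise KeyError(f"no rebinding for global names: {sorted(missing)}")
    refs = {local: rebindings[name] for local, name in unbindings.items()}
    vals = dict(zip(immutables, values))
    return body(refs, vals)


def fib_block(refs, vals):
    # Counterpart of the block wrapped by stm2exp in Fig. 5.
    def fib(x):
        return 1 if x < 3 else fib(x - 1) + fib(x - 2)

    temp = fib(vals["u"])
    refs["w"].value = (temp % 2 == 0)
    return temp


y = stm2exp(fib_block, {"w": "EV"}, ["u"])
z = 7
even = Ref(False)
x = exc(y, {"EV": even}, [z])
```

The immutable variable u only receives a value, while the mutable variable w shares the cell of even, so the assignment inside the boxed code is visible to the caller.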

2.2 Dynamic Type Checking

JavaScript, and many dynamically typed languages, lack a rigorous type system. On the contrary, in F# if we write a function that adds two integers, say:

    let add x y = x + y

we get

    val add : int -> int -> int

because, even though we do not specify type information, the interpreter infers the type shown after the function definition. Therefore, there is no way of calling add with arguments that are not of type integer. However, if our translation into the intermediate code produced a function whose body was simply x+y, which in turn could be translated into the corresponding expression in both JavaScript and Python, the target JavaScript function could be called as, e.g., add("foo")(1), obtaining the string "foo1", which is not what we wanted. In Python the situation would be better, in the sense that we cannot call add on a string and an integer; however, due to overloading we

ICSOFT2013-8thInternationalJointConferenceonSoftwareTechnologies

can call it on two floating-point numbers, obtaining a floating-point number. To prevent this, the translation into the intermediate language, which follows, inserts dynamic checks on the parameters of functions.

    def add = fun x ->
        def x1 = check(int, x);
        fun y ->
            def y1 = check(int, y);
            x1 + y1;

These checks are translated into dynamic type checking in JavaScript and Python. In JavaScript we use the function checkInt (that we defined), which returns its argument if it is an integer, and fails, raising an exception, if it is not:

    var add = function (x) {
        var x1 = checkInt(x);
        return function(y) {
            var y1 = checkInt(y);
            return x1 + y1 } }

Similarly for Python:

    def temp__1(y, x):
        y1 = checkInt(y)
        return (x + y1)

    def temp__2(x):
        x1 = checkInt(x)
        return lambda y: temp__1(y, x1)

    add = lambda x: temp__2(x)
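The definition of checkInt is not shown in the text. A minimal Python sketch follows, under the assumption that it accepts only true integers (bool is excluded explicitly, because Python treats it as a subclass of int) and that failure is signalled with TypeError; both choices are assumptions of this sketch.

```python
def checkInt(v):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(v, int) and not isinstance(v, bool):
        return v
    raise TypeError(f"expected int, got {type(v).__name__}")


def temp__1(y, x):
    y1 = checkInt(y)
    return x + y1


def temp__2(x):
    x1 = checkInt(x)
    return lambda y: temp__1(y, x1)


add = lambda x: temp__2(x)
```

With this definition add(1)(2) evaluates normally, while a call such as add(1.5)(2) fails as soon as the first argument is checked.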

3 CORE F#

The syntax for the core F# language is presented in Fig. 6. We sacrificed minimality for clarity, including constructs, such as let, let mutable, and let rec, that are used in the practice of programming and that raise challenges in the translation to dynamic languages. We also did not introduce imperative features through reference types, but through mutable variables, since this is closer to the imperative style of programming. Moreover, we present a typed version without type inference, since this is performed by the F# compiler. In the type system we omit type variables, as they do not add complexity to the translation.

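\[
\begin{array}{rcl}
e & ::= & x \mid n \mid \mathsf{tr} \mid \mathsf{fls} \mid e + e \mid \mathsf{if}\ e\ \mathsf{then}\ e\ \mathsf{else}\ e \\
  & \mid & \mathsf{fun}\ x{:}T\ \texttt{->}\ e \mid \mathsf{let}\ [\mathsf{mutable}]\ x = e\ \mathsf{in}\ e \\
  & \mid & e\ e \mid \mathsf{let\ rec}\ \overline{x{:}T = v}\ \mathsf{in}\ e \mid x\ \texttt{<-}\ e \mid e, e \\
T & ::= & \mathsf{int} \mid \mathsf{bool} \mid T \to T \\
v & ::= & n \mid \mathsf{tr} \mid \mathsf{fls} \mid \mathsf{fun}\ x{:}T\ \texttt{->}\ e
\end{array}
\]

Figure 6: Syntax of F#.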

In the grammar for expressions, in Fig. 6, the square brackets “[. . .]” delimit an optional part of the syntax, we use x, y, z for variable names, and the overbar sequence notation is used according to (Igarashi et al., 2001). For instance: “$\overline{x{:}T = v}$” stands for “$x_1{:}T_1 = v_1 \cdots x_n{:}T_n = v_n$”. The empty sequence is denoted by “∅”. For an expression e the free variables of e, FV(e), are defined in the standard way. An expression e is closed if FV(e) = ∅.

The let rec construct introduces mutually recursive variables. Variable names in this construct are meant to be bound to functions (as seen for fib in the example of Fig. 1). The let construct (followed by an optional mutable modifier) binds the variable x to the value resulting from the evaluation of the expression on the right-hand side of = in the evaluation of the body of the construct. As usual the notation let f x = $e_1$ in $e_2$ is shorthand for let f = fun x:T -> $e_1$ in $e_2$, where T is the type of x. Similarly for let rec. In the concrete syntax of the examples, as in F#, “,” and in are replaced by a line break without indentation.

When the let construct is followed by mutable the variable introduced is mutable. Only mutable variables may be used on the left-hand side of an assignment. This restriction is enforced by the type system of the language. The type system also enforces the restriction that the body of a function cannot contain free mutable variables, even though it may contain bound mutable variables. So, the function f in Fig. 7 is not correct, whereas the definition of g that follows is correct. A type environment Γ is defined by:

    let mutable z = 0
    let f x =
        if (x > 0) then z <- x
        else z <- -x
    let g x =
        let mutable w = 0
        if (x > 0) then w <- x
        else w <- -x

Figure 7: Typing functions in F#.

Γ ::= x:T, Γ | x:T!, Γ | ∅

that is, Γ associates variables with types, possibly followed by !. If the type is followed by ! this means that the variable was introduced with the mutable modifier. Let † denote either ! or the empty string, and let dom(Γ) = {x | x:T† ∈ Γ}. We assume that for any

AnIntermediateLanguageforCompilationtoScriptingLanguages

variable x, there is at most one associated type in Γ. We say that the expression e has type T in the environment Γ if the judgement

Γ ⊢ e : T

is derivable from the rules of Fig. 8. In the rules of Fig. 8, with Γ[Γ′] we denote the type environment such that dom(Γ[Γ′]) = dom(Γ) ∪ dom(Γ′) and:

• if x:T† ∈ Γ′ then x:T† ∈ Γ[Γ′], and

• if x:T† ∈ Γ and x ∉ dom(Γ′), then x:T† ∈ Γ[Γ′].

In the following we describe the most interesting rules.

Consider rule (TYABS): to type the body of a function we need assumptions on its free variables and formal parameter. From the definition of Γ[Γ′] we have that the assumptions on its free variables must coincide with the ones present in the environment of the definition of the function. Moreover, none of them may have been declared as mutable. However, in the environment in which the function is defined, Γ[Γ′], there can be mutable variables, as long as they are not needed to type the body of the function. In the example of Fig. 7, if the definition of the function f were typable, it should have been typed from the environment Γ[Γ′] = z:int!; therefore, to type its body we would have used the environment z:int!, x:int, i.e., Γ′ = z:int!. However, this is not possible. Instead, the definition of g, which is again typed in Γ[Γ′] = z:int!, not having z free in its body, can be typed from x:int, by defining Γ′ = ∅.

The rules (TYLET) and (TYLETMUT) bind a variable, x, to the expression $e_1$ in the expression $e_2$. So the expression $e_2$ is typed in a type environment in which x is associated with the type of $e_1$. In the rule (TYLETMUT) the type is followed by ! so that inside $e_2$ the variable x may be used on the left-hand side of an assignment, see rule (TYASSIGN).
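The typing discipline described above can be sketched as a small checker. The sketch below is in Python and covers the core constructs except let rec; the tuple encoding of expressions, the name type_of and the exception IllTyped are assumptions of this sketch. It reads (TYABS) algorithmically by taking Γ′ to be the immutable part of the current environment, which is one possible choice rather than the only one the rule allows.

```python
class IllTyped(Exception):
    """Raised when no typing rule applies."""


def type_of(e, env):
    """Type of expression e in env, a dict: name -> (type, is_mutable)."""
    tag = e[0]
    if tag == "num":
        return "int"
    if tag == "bool":
        return "bool"
    if tag == "var":
        if e[1] not in env:
            raise IllTyped(f"unbound variable {e[1]}")
        return env[e[1]][0]
    if tag == "add":
        if type_of(e[1], env) != "int" or type_of(e[2], env) != "int":
            raise IllTyped("operands of + must be int")
        return "int"
    if tag == "if":
        if type_of(e[1], env) != "bool":
            raise IllTyped("condition must be bool")
        t1, t2 = type_of(e[2], env), type_of(e[3], env)
        if t1 != t2:
            raise IllTyped("branches must have the same type")
        return t1
    if tag == "fun":
        _, x, t, body = e
        # (TYABS): the body may only rely on immutable variables.
        immutable = {y: b for y, b in env.items() if not b[1]}
        immutable[x] = (t, False)
        return ("fun", t, type_of(body, immutable))
    if tag == "app":
        tf, ta = type_of(e[1], env), type_of(e[2], env)
        if not (isinstance(tf, tuple) and tf[1] == ta):
            raise IllTyped("ill-typed application")
        return tf[2]
    if tag in ("let", "letmut"):
        _, x, e1, e2 = e
        return type_of(e2, {**env, x: (type_of(e1, env), tag == "letmut")})
    if tag == "assign":
        _, x, rhs = e
        if x not in env or not env[x][1]:
            raise IllTyped(f"{x} is not a mutable variable in scope")
        if type_of(rhs, env) != env[x][0]:
            raise IllTyped("assigned value has the wrong type")
        return env[x][0]
    if tag == "seq":
        type_of(e[1], env)
        return type_of(e[2], env)
    raise IllTyped(f"unknown construct {tag}")


# Simplified versions of f and g from Fig. 7 (the conditional is dropped).
f = ("letmut", "z", ("num", 0),
     ("fun", "x", "int", ("assign", "z", ("var", "x"))))
g = ("letmut", "z", ("num", 0),
     ("fun", "x", "int",
      ("letmut", "w", ("num", 0), ("assign", "w", ("var", "x")))))
```

On these two terms the checker rejects f, whose body assigns to the free mutable variable z, and accepts g, whose mutable variable w is bound inside the function body.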

Our core F# language has imperative features, so for the definition of the operational semantics we use a store. The runtime configurations are pairs “expression, store”, e | ρ, where a store ρ is a mapping between locations and values:

$l_1 \mapsto v_1, \ldots, l_n \mapsto v_n$

In Fig. 9 we define:

• runtime expressions, which are expressions including locations (generated by the evaluation of mutable variable definitions);

• evaluation contexts defining, in conjunction with rule (CTX-F), the reduction strategy of the language, which is call-by-value, with left-to-right evaluation, and

• the rules for the evaluation relation, −→.

In the rules, with e[x := e′] we denote the result of substituting x with e′ in e, with renaming if needed. Moreover, ρ[x ↦ v] is defined by: ρ[x ↦ v](x) = v, and ρ[x ↦ v](y) = ρ(y), when x ≠ y.

The evaluation of the sum expression assumes that the operands are integers, and returns n, which is the numeral corresponding to the sum of the values of $n_1$ and $n_2$. For the conditional expression we have two rules corresponding to the (boolean) value of the condition. Both the evaluation of the application, rule (APP-F), and that of let, rule (LET-F), substitute x with its value in the body of the construct. This is in accord with the fact that x is immutable. Instead, for a variable defined mutable, rule (LETMUT-F), a new location l is generated and added to the store with the initial value v, and the variable x is substituted with l. Therefore, during evaluation, expressions may contain locations. Indeed, since variables on the left-hand side of assignments were always introduced by let mutable, when an assignment is evaluated, rule (ASSIGN-F), we have a configuration l <- v | ρ, which is evaluated by changing the value of the location l to be v. The evaluation of let rec, rule (REC-F), produces the body e in which each variable $x_i$ is substituted with a let rec expression with body $v_i$, so that if $x_i$ is evaluated all the variables $\overline{x}$ will be substituted with their definitions $\overline{v}$. Evaluation of a location, rule (VAR-F), produces the value associated with it in the store. Finally, in rule (CTX-F) the context E selects the first sub-expression to be evaluated. We can show that evaluation is deterministic.

The typing rules in Fig. 8 are for the (source) expression language, so they do not include a rule for locations. To type run-time expressions we need a store environment Σ assigning types to locations. The type judgement should therefore be:

Γ | Σ ⊢ e : T

and the typing rule for locations is

Γ | Σ ⊢ l : Σ(l) (TYLOCF)

All the other rules are obtained by putting Γ | Σ on the left-hand side of ⊢ in the typing rules of Fig. 8.

Definition 1. A store ρ is well-typed with respect to a type environment Γ and a store environment Σ, written Γ | Σ ⊢ ρ, if dom(ρ) = dom(Σ), and for all l ∈ dom(ρ), we have that Γ | Σ ⊢ ρ(l) : Σ(l).

Types are preserved by reduction, and progress holds, as the following two theorems state.

Theorem 2 (Preservation). Let Γ | Σ ⊢ e : T, and ρ be such that Γ | Σ ⊢ ρ. If e | ρ −→ e′ | ρ′, then Γ | Σ′ ⊢ e′ : T, for some Σ′ ⊇ Σ such that Γ | Σ′ ⊢ ρ′.

Theorem 3 (Progress). Let ∅ | Σ ⊢ e : T, then either e is a value or for any store ρ such that ∅ | Σ ⊢ ρ there are e′ and ρ′ such that e | ρ −→ e′ | ρ′.

ICSOFT2013-8thInternationalJointConferenceonSoftwareTechnologies

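\[
\Gamma \vdash n : \mathsf{int}\ \ \text{(TYNUM)}
\qquad
\Gamma \vdash \mathsf{tr}, \mathsf{fls} : \mathsf{bool}\ \ \text{(TYBOOL)}
\]
\[
\frac{\Gamma \vdash e_1 : \mathsf{int} \quad \Gamma \vdash e_2 : \mathsf{int}}{\Gamma \vdash e_1 + e_2 : \mathsf{int}}\ \text{(TYSUM)}
\qquad
\frac{\Gamma \vdash e : \mathsf{bool} \quad \Gamma \vdash e_1 : T \quad \Gamma \vdash e_2 : T}{\Gamma \vdash \mathsf{if}\ e\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2 : T}\ \text{(TYIF)}
\]
\[
\frac{\Gamma'[x{:}T] \vdash e : T' \quad \forall y, T''.\ y{:}T''! \notin \Gamma'}{\Gamma[\Gamma'] \vdash \mathsf{fun}\ x{:}T\ \texttt{->}\ e : T \to T'}\ \text{(TYABS)}
\qquad
\frac{\Gamma \vdash e_1 : T \to T' \quad \Gamma \vdash e_2 : T}{\Gamma \vdash e_1\ e_2 : T'}\ \text{(TYAPP)}
\]
\[
\frac{x{:}T\dagger \in \Gamma}{\Gamma \vdash x : T}\ \text{(TYVAR)}
\qquad
\frac{\Gamma \vdash e_1 : T \quad \Gamma[x{:}T] \vdash e_2 : T'}{\Gamma \vdash \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : T'}\ \text{(TYLET)}
\]
\[
\frac{\Gamma[\overline{x{:}T}] \vdash v_i : T_i \ (1 \le i \le n) \quad \Gamma[\overline{x{:}T}] \vdash e : T}{\Gamma \vdash \mathsf{let\ rec}\ \overline{x{:}T = v}\ \mathsf{in}\ e : T}\ \text{(TYREC)}
\]
\[
\frac{\Gamma \vdash e_1 : T \quad \Gamma[x{:}T!] \vdash e_2 : T'}{\Gamma \vdash \mathsf{let\ mutable}\ x = e_1\ \mathsf{in}\ e_2 : T'}\ \text{(TYLETMUT)}
\]
\[
\frac{\Gamma \vdash e : T \quad x{:}T! \in \Gamma}{\Gamma \vdash x\ \texttt{<-}\ e : T}\ \text{(TYASSIGN)}
\qquad
\frac{\Gamma \vdash e_1 : T \quad \Gamma \vdash e_2 : T'}{\Gamma \vdash e_1, e_2 : T'}\ \text{(TYSEQ)}
\]

Figure 8: Typing rules of core F#.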

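\[
\begin{array}{rcll}
e & ::= & \cdots \mid l & \text{runtime expression} \\
E & ::= & [\,] \mid E + e \mid n + E \mid \mathsf{if}\ E\ \mathsf{then}\ e\ \mathsf{else}\ e \mid E\ e \mid v\ E \mid \mathsf{let}\ [\mathsf{mutable}]\ x = E\ \mathsf{in}\ e & \text{evaluation contexts} \\
  & \mid & l\ \texttt{<-}\ E \mid E, e &
\end{array}
\]
\[
\begin{array}{ll}
n_1 + n_2 \mid \rho \longrightarrow n \mid \rho \quad \text{if } \tilde{n} = \tilde{n}_1 +_{\mathsf{int}} \tilde{n}_2 & \text{(SUM-F)} \\
\mathsf{if}\ \mathsf{tr}\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2 \mid \rho \longrightarrow e_1 \mid \rho & \text{(IFTRUE-F)} \\
\mathsf{if}\ \mathsf{fls}\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2 \mid \rho \longrightarrow e_2 \mid \rho & \text{(IFFALSE-F)} \\
(\mathsf{fun}\ x{:}T\ \texttt{->}\ e)\ v \mid \rho \longrightarrow e[x := v] \mid \rho & \text{(APP-F)} \\
\mathsf{let}\ x = v\ \mathsf{in}\ e \mid \rho \longrightarrow e[x := v] \mid \rho & \text{(LET-F)} \\
\mathsf{let\ rec}\ \overline{x{:}T = v}\ \mathsf{in}\ e \mid \rho \longrightarrow e[x_i := (\mathsf{let\ rec}\ \overline{x{:}T = v}\ \mathsf{in}\ v_i) \mid 1 \le i \le n] \mid \rho & \text{(REC-F)} \\
\mathsf{let\ mutable}\ x = v\ \mathsf{in}\ e \mid \rho \longrightarrow e[x := l] \mid \rho[l \mapsto v] \quad l \notin dom(\rho)\ \text{new} & \text{(LETMUT-F)} \\
l\ \texttt{<-}\ v \mid \rho \longrightarrow v \mid \rho[l \mapsto v] \quad l \in dom(\rho) & \text{(ASSIGN-F)} \\
v, e \mid \rho \longrightarrow e \mid \rho & \text{(SEQ-F)} \\
l \mid \rho \longrightarrow v \mid \rho \quad \text{if } \rho(l) = v & \text{(VAR-F)} \\[1ex]
\dfrac{e \mid \rho \longrightarrow e' \mid \rho' \quad E \neq [\,]}{E[e] \mid \rho \longrightarrow E[e'] \mid \rho'} & \text{(CTX-F)}
\end{array}
\]

Figure 9: Operational semantics of core F#.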

4 INTERMEDIATE LANGUAGE

The intermediate language, IL, is an imperative language with three syntactic categories: expressions, statements and blocks. We introduce the construct that wraps code that needs to be moved from its definition environment, and the one that executes such code in the runtime environment.

The syntax of IL is presented in Fig. 10. We introduce the distinction between expressions and statements as many target languages do. This facilitates the translation process and prevents some errors while building the intermediate abstract syntax tree, see (Appel, 1998) for a similar choice. Blocks are sequences of statements or expressions ended by an expression. In our translation we flatten the nested structure of let constructs, so we need blocks in which definitions and expressions/statements may be intermixed. Moreover, since we do not have a specific let rec construct, the use of a variable may precede its definition, e.g., when defining mutually recursive (or simply recursive) functions. Statements may be either assignments or variable definitions. Our compiler handles many more statements, but these are enough to show the ideas

AnIntermediateLanguageforCompilationtoScriptingLanguages

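\[
\begin{array}{rcl}
bl & ::= & st; bl \mid e; bl \mid e \\
st & ::= & x\ \texttt{<-}\ e \mid \mathsf{def}\ x = e \\
e & ::= & x \mid n \mid \mathsf{tr} \mid \mathsf{fls} \mid e + e \mid \mathsf{fun}\ x\ \texttt{->}\ \{bl\} \mid e\ e \\
  & \mid & \mathsf{if}\ e\ \mathsf{then}\ \{bl\}\ \mathsf{else}\ \{bl\} \mid \mathsf{check}(T, e) \\
  & \mid & \mathsf{stm2exp}(\{bl\}, \overline{y \mapsto Y}, \overline{x}) \\
  & \mid & \mathsf{exc}(e, \overline{Y \mapsto y}, \overline{e}) \\
T & ::= & \mathsf{int} \mid \mathsf{bool} \\
v & ::= & n \mid \mathsf{tr} \mid \mathsf{fls} \mid \mathsf{fun}\ x\ \texttt{->}\ \{bl\} \\
  & \mid & \mathsf{stm2exp}(\{bl\}, \overline{y \mapsto Y}, \overline{x})
\end{array}
\]

Figure 10: Syntax of IL.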

behind the design of IL. Our intermediate language is inspired (especially for the block structure) by IntegerPython, see (Ranson et al., 2008). Variables are statically scoped, in the sense that, if there is a definition of the variable x in a block, all the free occurrences of x in the block refer to this definition. However, we can have occurrences of x preceding its definition. E.g.,

def f = fun y -> { x };

def x = 5;

f 2

correctly returns 5, whereas the following code would

produce a run-time error:

    def x = 7;
    if (x > 3) then {
        def f = fun y -> { x };
        f 2;
        def x = 5;
        3 }
    else { 4 }

since when f is called the variable x, defined in the inner block, has not yet been assigned a value. Instead, if x were not defined in the inner block, as in the following

    def x = 7;
    if (x > 3) then {
        def f = fun y -> { x };
        f 2 }
    else { 4 }

the block would return 7, since x is bound in the en-

closing block. This is also the behaviour in JavaScript

and Python.
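The three IL blocks above have direct Python counterparts, sketched below. Since a Python if does not open a new scope, the inner block is modelled by a nested function; this modelling, and the function names, are assumptions of the sketch.

```python
def forward_reference():
    # x is defined after f, but before f is called.
    f = lambda y: x
    x = 5
    return f(2)


def shadowed_too_late():
    x = 7

    def inner_block():
        f = lambda y: x  # refers to inner_block's own x, defined below
        f(2)             # x has no value yet: Python raises NameError
        x = 5
        return 3

    return inner_block() if x > 3 else 4


def enclosing_binding():
    x = 7

    def inner_block():
        f = lambda y: x  # no local x: refers to the enclosing x
        return f(2)

    return inner_block() if x > 3 else 4
```

The second function fails at the call f(2), which mirrors the run-time error of the IL block in which x is redefined after its use.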

The construct stm2exp is used to move a block, bl, outside its definition context. To produce a closed term, the mutable variables free in bl, $\overline{y}$, are unbound by associating them to global names $\overline{Y}$ not subject to renaming. The variables $\overline{x}$, instead, are immutable variables free in bl, i.e., they are not modified by the execution of bl. The metavariables X, Y, Z are used to denote names.

The operational semantics of IL, see Fig. 11, is given by defining a reduction relation for blocks. So our configurations will be pairs: “block, store”. In order to specify the order of reduction we define evaluation contexts for blocks, containing evaluation contexts for expressions. As for F#, we have to add locations, l, to the syntax of expressions, as they are generated during the evaluation of blocks. Moreover, we add two constructs wrapping blocks: {bl} and eval(bl). The first will be used to do the initial allocation of variables needed to reproduce the previously described semantics, and the second to execute a block in a position where an expression would be required. Note that these expressions are not in IL but are just introduced to describe its semantics.

As for F#, the evaluation contexts of Fig. 11 specify a call-by-value, left-to-right reduction strategy.

The first rule is used before the evaluation of a block to allocate the variables defined in a block. The function def mapping a block to the set of variables defined in it is defined by:

• def(e) = ∅

• def(e;bl) = def(x <- e;bl) = def(bl), and

• def(def x=e;bl) = {x} ∪ def(bl).

The initial value of the locations is set to undefined, ?, so if a variable is accessed before the evaluation of an assignment or a definition for this variable, undErr is returned. Note that this will never happen for IL programs which are translations of F# programs. After this initial allocation a block will not contain free variables (but locations).
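The function def and the initial allocation can be sketched in a few lines of Python. The encoding of blocks as lists of tagged tuples, the location names lc1, lc2, ... and the use of LookupError for undErr are assumptions of this sketch, not part of the formal definition.

```python
UNDEFINED = object()  # stands for the undefined value "?"


def defined_vars(block):
    """Return def(bl) for a block given as a list of items.

    Items are tuples (hypothetical encoding):
      ("def", x, e)     definition   def x = e
      ("assign", x, e)  assignment   x <- e
      ("exp", e)        expression   e
    Only the items of the block itself are inspected, not nested blocks.
    """
    return {item[1] for item in block if item[0] == "def"}


def alloc(block, store):
    """Sketch of rule (ALLOC): one fresh, undefined location per defined variable."""
    env = {}
    for item in block:
        if item[0] == "def" and item[1] not in env:
            loc = f"lc{len(store) + 1}"  # fresh as long as all keys are lc1..lcN
            store[loc] = UNDEFINED
            env[item[1]] = loc
    return env


def read(loc, store):
    """Sketch of rules (LOCDEF) and (LOCUND)."""
    if store[loc] is UNDEFINED:
        raise LookupError("undErr")
    return store[loc]
```

Applied to a block shaped like the program of Fig. 5, alloc maps y, z, even and x to four fresh locations, and read fails on each of them until a definition or assignment stores a value.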

Rules (ASSIGN) and (DEF) continue the execution of the expressions/statements in a block in a store in which the value of location l is v. So after this the value of l is not undefined. Rule (EXP) throws away the value of an expression and continues the execution of the block. The rules for + and if are trivial. Rule (APP) allocates a location in the memory, assigning the value of the actual parameter to it; then the location is substituted for the formal parameter in the body of the function. Note that, being in an imperative language, the formal parameter could be modified in the body of the function; however, this change would not be visible in the calling environment, since the location is new. After this allocation the execution continues with the evaluation of the body {bl}, i.e., applying rule (ALLOC). The rules (TYPEYES) and (TYPENO) check whether a value is of the right primitive type. The function typeof from values to types is defined by: typeof(tr) = typeof(fls) = bool, typeof(n) = int, and undefined for the other values. The evaluation of the exc construct, rule (STTOEXP), expects the first argument to be a stm2exp such that the names of its unbindings are a subset of those of the rebindings provided by exc. If this is the case, it allocates new locations for the immutable variables $\overline{x}$ (as in rule (APP) for the formal parameter); for the unbound variables $\overline{y}$, instead, it substitutes the associated locations (via the correspondence of the names in $\overline{Y}$ and $\overline{Z}$). So through assignment to the (local) variables in $\overline{y}$ the execution environment may be modified. The resulting block is wrapped in the eval construct. Rule (EVAL) returns its value. (Evaluation inside eval is done by the (CTX) rule.) Finally, access to a location may return undErr if the location has not been initialized by an assignment or a definition statement. Rule (CTX) evaluates the first sub-expression selected by the evaluation context. In case the evaluation produces an error, rule (CTXERROR) returns the error at the top level. Note that, given a block bl, if there are S and e such that bl = S[e], then S is unique. So evaluation is deterministic.

\[
\begin{array}{rcll}
e & ::= & \cdots \mid l \mid \{bl\} \mid \mathsf{eval}(bl) & \text{runtime expression} \\
S & ::= & l\ \texttt{<-}\ E; bl \mid \mathsf{def}\ l = E; bl \mid E; bl \mid E & \text{block evaluation context} \\
E & ::= & [\,] \mid E + e \mid n + E \mid E\ e \mid v\ E \mid \mathsf{if}\ E\ \mathsf{then}\ \{bl\}\ \mathsf{else}\ \{bl\} \mid \mathsf{check}(T, E) & \text{expression evaluation context} \\
  & \mid & \mathsf{exc}(E, \overline{Z \mapsto l}, \overline{e}) \mid \mathsf{exc}(v, \overline{Z \mapsto l}, \overline{v}\, E\, \overline{e}) \mid \mathsf{eval}(S) &
\end{array}
\]
\[
\begin{array}{ll}
\{bl\} \mid \rho \longrightarrow bl[\overline{x := l}] \mid \rho[\overline{l \mapsto\ ?}] \quad \text{if } \overline{x} = def(bl),\ \overline{l} \notin dom(\rho)\ \text{new} & \text{(ALLOC)} \\
l\ \texttt{<-}\ v; bl \mid \rho \longrightarrow bl \mid \rho[l \mapsto v] & \text{(ASSIGN)} \\
\mathsf{def}\ l = v; bl \mid \rho \longrightarrow bl \mid \rho[l \mapsto v] & \text{(DEF)} \\
v; bl \mid \rho \longrightarrow bl \mid \rho & \text{(EXP)} \\
n_1 + n_2 \mid \rho \longrightarrow n \mid \rho \quad \text{if } \tilde{n} = \tilde{n}_1 +_{\mathsf{int}} \tilde{n}_2 & \text{(SUM)} \\
(\mathsf{fun}\ x\ \texttt{->}\ \{bl\})\ v \mid \rho \longrightarrow \{bl[x := l]\} \mid \rho[l \mapsto v] \quad l \notin dom(\rho)\ \text{new} & \text{(APP)} \\
\mathsf{if}\ \mathsf{tr}\ \mathsf{then}\ \{bl_1\}\ \mathsf{else}\ \{bl_2\} \mid \rho \longrightarrow \{bl_1\} \mid \rho & \text{(IFTRUE)} \\
\mathsf{if}\ \mathsf{fls}\ \mathsf{then}\ \{bl_1\}\ \mathsf{else}\ \{bl_2\} \mid \rho \longrightarrow \{bl_2\} \mid \rho & \text{(IFFALSE)} \\
\mathsf{check}(T, v) \mid \rho \longrightarrow v \mid \rho \quad \text{if } typeof(v) = T & \text{(TYPEYES)} \\
\mathsf{check}(T, v) \mid \rho \longrightarrow \mathsf{typeErr} \quad \text{if } typeof(v) \neq T & \text{(TYPENO)} \\
\mathsf{exc}(\mathsf{stm2exp}(\{bl\}, \overline{y \mapsto Y}, \overline{x}), \overline{Z \mapsto l'}, \overline{v}) \mid \rho \longrightarrow & \text{(STTOEXP)} \\
\quad \mathsf{eval}(\{(bl[\overline{x := l}])[y_i := l'_j \mid Y_i = Z_j,\ 1 \le i \le n]\}) \mid \rho[\overline{l \mapsto v}] \quad \text{if } \overline{Y} \subseteq \overline{Z},\ \overline{l} \notin dom(\rho)\ \text{new} & \\
\mathsf{eval}(v) \mid \rho \longrightarrow v \mid \rho & \text{(EVAL)} \\
l \mid \rho \longrightarrow v \mid \rho \quad \text{if } \rho(l) = v & \text{(LOCDEF)} \\
l \mid \rho \longrightarrow \mathsf{undErr} \mid \rho \quad \text{if } \rho(l) =\ ? & \text{(LOCUND)} \\[1ex]
\dfrac{e \mid \rho \longrightarrow e' \mid \rho' \quad S \neq [\,]}{S[e] \mid \rho \longrightarrow S[e'] \mid \rho'} & \text{(CTX)} \\[2ex]
\dfrac{e \mid \rho \longrightarrow err \quad err = \mathsf{typeErr} \lor \mathsf{undErr} \quad S \neq [\,]}{S[e] \mid \rho \longrightarrow err} & \text{(CTXERROR)}
\end{array}
\]

Figure 11: Runtime expressions, evaluation contexts and operational semantics rules for IL.

An IL program is a closed block, bl. The initial configuration for a program is {bl} | [].

Let us look at an example of evaluation. Consider the program of Fig. 5. Applying rule (ALLOC) to the block enclosed in brackets we get the configuration bl | ρ where bl is

def lc1 = stm2exp(...);

def lc2 = 7;

def lc3 = fls;

def lc4 = exc(lc1, EV->lc3, lc2);

lc4

and ρ = [lc1 ↦ ?, lc2 ↦ ?, lc3 ↦ ?, lc4 ↦ ?].

Applying (DEF) three times we get $bl_1 \mid \rho_1$ where $bl_1$ = def lc4 = exc(lc1, EV → lc3, lc2);lc4 and $\rho_1$ = [lc1 ↦ stm2exp(...), lc2 ↦ 7, lc3 ↦ fls, lc4 ↦ ?]. From rule (CTX), where S is def lc4 = E;lc4 and E is exc([], EV → lc3, lc2), applying rule (LOCDEF) we get $bl_2 \mid \rho_1$ where $bl_2$ is def lc4 = exc(stm2exp(...), EV → lc3, lc2);lc4. From rule (CTX), where $S_1$ is def lc4 = $E_1$;lc4 and $E_1$ is exc(stm2exp(...), EV → lc3, []), applying rule (LOCDEF) we get $bl_3 \mid \rho_1$ where $bl_3$ is def lc4 = exc(stm2exp(...), EV → lc3, 7);lc4. Again by rule (CTX), where $S_2$ is def lc4 = $E_2$;lc4 and $E_2$ = [], and applying rule (STTOEXP), we get def lc4 = eval({$bl_4$});lc4 | $\rho_1$, where $bl_4$ is

def fib =

fun x ->

if x < 3 then 1

else (fib (x-1) + fib (x-2));

def temp = fib 7;

lc3 <- temp % 2 = 0;

temp

The evaluation proceeds inside the eval construct, with rule (CTX) where $S_3$ is def lc4 = $E_3$;lc4 and $E_3$ is eval([ ]), applying rule (ALLOC), and producing the configuration $bl_5 \mid \rho_2$ where $\rho_2$ = [lc1 ↦ stm2exp(...), lc2 ↦ 7, lc3 ↦ fls, lc4 ↦ ?, lc5 ↦ ?, lc6 ↦ ?], and $bl_5$ is def lc4 = eval({$bl_6$});lc4 where $bl_6$ is

AnIntermediateLanguageforCompilationtoScriptingLanguages

def lc5 =

fun x ->

if x < 3 then 1

else (lc5 (x-1) + lc5 (x-2));

def lc6 = lc5 7;

lc3 <- lc6 % 2 = 0;

lc6

We can see how recursion is handled and how the assignment to lc3, when evaluated, modifies the location of the initial variable even.
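The behaviour of this example can be approximated in Python, under the assumption that a location is modelled as a one-element list and that `stm2exp` and `exc` are ordinary helper functions. These are hypothetical names mirroring the IL constructs, not the output of the actual compiler.

```python
def stm2exp(block, mutable_names, params):
    """Package a block with the names (Y) it expects to be rebound and its parameters."""
    return {"block": block, "mutable": tuple(mutable_names), "params": tuple(params)}


def exc(packaged, bindings, args):
    """Run a packaged block; `bindings` maps names (Z) to locations, `args` fills the parameters."""
    missing = set(packaged["mutable"]) - set(bindings)
    if missing:  # side condition Y ⊆ Z of rule (STTOEXP)
        raise KeyError(f"unbound names: {sorted(missing)}")
    if len(args) != len(packaged["params"]):
        raise TypeError("wrong number of arguments")
    return packaged["block"](bindings, *args)


def body(env, z):
    # def fib = fun x -> if x < 3 then 1 else (fib (x-1) + fib (x-2));
    def fib(x):
        return 1 if x < 3 else fib(x - 1) + fib(x - 2)

    temp = fib(z)                      # def temp = fib z;
    env["EV"][0] = (temp % 2 == 0)     # assignment through the rebound name EV
    return temp


even = [False]                          # the location of the variable even (lc3)
packaged = stm2exp(body, ["EV"], ["z"])
result = exc(packaged, {"EV": even}, [7])
```

With argument 7 the block returns fib 7 = 13 and stores `False` in the cell bound to `EV`; running it again with argument 6 returns 8 and updates the same cell to `True`, which shows that the extruded block acts on the location supplied at the point of execution.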

5 TRANSLATION OF CORE F# INTO IL

In our translation we flatten the let constructs, transforming them into definitions of the corresponding variables followed by the translation of the expression in their body. Therefore, we have to take into account the fact that in an IL block we may have forward binding. E.g., if

let y = 3 in

if ( y = 3) then (

let f = (fun x -> y)

let y = 5

(f 0) )

else 4

is translated into

def y = 3;

if ( y = 3) then (

def f = (fun x -> { y });

def y = 5;

(f 0) )

else 4

The translation is incorrect, since in the IL code the occurrence of y in the body of f is bound to the definition of y that follows. Therefore the F# expression evaluates to 3 whereas its translation in IL evaluates to 5. In the translation we use renaming to resolve this problem.
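The same capture problem can be observed directly in Python, one of the intended target languages. The following sketch uses two hypothetical functions, `naive` and `renamed`, and relies on the fact that a Python closure refers to the variable itself, not to the value it had when the closure was created. Python scopes variables per function rather than per block, so the example is wrapped in functions.

```python
def naive():
    # Direct transcription of the incorrect translation.
    y = 3
    if y == 3:
        f = lambda x: y   # refers to the variable y, whatever its later value
        y = 5             # the second definition overwrites the captured y
        return f(0)
    return 4


def renamed():
    # Same code after renaming the inner definition of y.
    y = 3
    if y == 3:
        f = lambda x: y
        y1 = 5            # fresh name: f still sees y = 3
        return f(0)
    return 4
```

Here `naive()` returns 5, while `renamed()` returns 3, matching the value of the original F# expression.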

As explained in Section 2, sequences of expressions will be mapped to sequences of statements, and we use the stm2exp and exc constructs to simulate the behaviour of a sequence of statements with an expression. So we define two translations of F# expressions: the first to IL expressions, $[\![\cdot]\!]^{exp}_{I,M}$, and the second to IL blocks, $[\![\cdot]\!]^{bl}_{I,M}$. The translations are parametrized by the sets of the immutable variables, I, and mutable variables, M, of the context of the F# expression that is translated. The translations produce, in addition to an IL expression/block, also a sequence of top-level definitions of variables bound to stm2exp expressions. In the following we present the translations for function definitions, sequences of expressions, and the let construct, which exemplify the technique used.

In the formal definition of the translation, δ is a metavariable denoting a declaration of a variable “def x=e” and $\bar{\delta}$ a sequence of declarations separated by “;” (semicolon).

The translations of F# function definitions to IL blocks or expressions,

$[\![\mathtt{fun}\ x{:}T\ \texttt{->}\ e]\!]^{exp}_{I,M}$ and $[\![\mathtt{fun}\ x{:}T\ \texttt{->}\ e]\!]^{bl}_{I,M}$,

are both equal to:

$(\mathtt{fun}\ x\ \texttt{->}\ \{\mathtt{def}\ y = \mathtt{check}(T, x);\ bl[x := y]\},\ \bar{\delta})$

where $[\![e]\!]^{bl}_{I \cup \{x\},M} = (bl, \bar{\delta})$. So the translation of a function produces a function whose body is the translation of the body (to a block) of the original function. In the translation of the body of the function the variable x is added to the set of free immutable variables I. The formal parameter is replaced with a new variable resulting from the type checking of the original parameter. See the discussion about dynamic type checking in Section 2.
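As an illustration of the shape of the generated function body, the following Python sketch shows a dynamic check guarding the parameter. The helper `check`, the exception `TypeErr` and the function `translated_successor` are assumed names; the sketch takes `typeof(v) = T` to mean an exact match of the runtime type, which is an assumption and not necessarily what the compiler emits.

```python
class TypeErr(Exception):
    """Stands for the typeErr outcome of rule (TYPENO)."""


def check(expected_type, value):
    """Return value unchanged if its runtime type is exactly expected_type."""
    if type(value) is not expected_type:
        raise TypeErr(f"expected {expected_type.__name__}, got {type(value).__name__}")
    return value


def translated_successor(x):
    # Shape of the translation of: fun x:int -> x + 1
    y = check(int, x)   # def y = check(T, x);
    return y + 1        # body with x replaced by y
```

Calling `translated_successor(41)` returns 42, while passing a string raises `TypeErr` before the body is evaluated.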

In the following, we introduce the definition of the wrapping needed to extrude a block from its definition environment and how the construct exc rebinds it in the run-time environment.

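Definition 4. Given an IL block bl, and the disjoint sets of variables $I = \{\bar{x}\}$ and $M = \{\bar{y}\}$, let blockToExp(bl, I, M) be

$(\mathtt{exc}(z,\ \bar{Y} \mapsto \bar{y},\ \bar{x}),\ \delta)$

where:

• δ is $\mathtt{def}\ z{:}T'' = \mathtt{stm2exp}(bl,\ \bar{y} \mapsto \bar{Y},\ \bar{x})$

• z is a new variable and $\bar{Y}$ are new names.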

Let blockToExp(bl, I, M) = (e, δ). We can prove that, for all stores ρ, {δ;e} | ρ −→⋆ v | ρ′ if and only if {bl} | ρ −→⋆ v | ρ′′. So the evaluation of the definition δ followed by the generated expression produces the same result as the evaluation of the original block. The difference in the content of the final stores is due to the fact that the evaluation of the definition δ allocates a location and assigns it the stm2exp expression, to subsequently substitute this value for the location in the exc expression. However, since the variable z is new, it does not interfere with the evaluation of the original block/expression.

To give the translation of both sequences of expressions and of the let constructs, we introduce the formal definition of the top-level variable definitions of F# expressions; then we define the renaming needed to avoid the capture of forward definitions described at the beginning of this section.

Definition 5. 1. Let e be an F# expression. The function $\mathit{def}_{F\#}(e)$, returning the set of variables defined at the top level of e, is defined as follows:

ICSOFT2013-8thInternationalJointConferenceonSoftwareTechnologies

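• $\mathit{def}_{F\#}(\mathtt{let\ [mutable]}\ x = e_1\ \mathtt{in}\ e_2) = \{x\} \cup \mathit{def}_{F\#}(e_2)$,

• $\mathit{def}_{F\#}(\mathtt{let\ rec}\ \bar{x}{:}\bar{T} = \bar{v}\ \mathtt{in}\ e) = \{\bar{x}\} \cup \mathit{def}_{F\#}(e)$,

• $\mathit{def}_{F\#}(e_1, e_2) = \mathit{def}_{F\#}(e_1) \cup \mathit{def}_{F\#}(e_2)$, and

• $\mathit{def}_{F\#}(e) = \emptyset$ for all other expressions e.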

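2. Let e be an F# expression and $\bar{x}$ a set of variables. The function $\mathit{rn}(e, \bar{x})$ renames the top-level definitions of the variables $\bar{x}$ in e as follows:

• if e is $\mathtt{let\ [mutable]}\ x = e_1\ \mathtt{in}\ e_2$, then $\mathit{rn}(e, \bar{x})$ is

$\mathtt{let\ [mutable]}\ x = e_1\ \mathtt{in}\ \mathit{rn}(e_2, \bar{x})$ if $x \notin \bar{x}$

$\mathtt{let\ [mutable]}\ z = e_1\ \mathtt{in}\ \mathit{rn}(e_2\{x \mapsto z\}, \bar{x})$ if $x \in \bar{x}$ and z is new

• if e is $\mathtt{let\ rec}\ \bar{y}{:}\bar{T} = \bar{v}\ \mathtt{in}\ e'$, then $\mathit{rn}(e, \bar{x})$ is

$\mathtt{let\ rec}\ \bar{y}{:}\bar{T} = \bar{v}\ \mathtt{in}\ \mathit{rn}(e', \bar{x})$ if $\bar{y} \cap \bar{x} = \emptyset$

$\mathtt{let\ rec}\ \bar{z}{:}\bar{T} = (\bar{v}\{\bar{y} \mapsto \bar{z}\})\ \mathtt{in}\ \mathit{rn}(e'\{\bar{y} \mapsto \bar{z}\}, \bar{x})$ if $\bar{y} \cap \bar{x} \neq \emptyset$ and $\bar{z}$ are new

• if e is $e_1, e_2$ then $\mathit{rn}(e, \bar{x})$ is $\mathit{rn}(e_1, \bar{x}), \mathit{rn}(e_2, \bar{x})$

• $\mathit{rn}(e, \bar{x})$ is e for all other expressions e.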
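The two functions of Definition 5 can be sketched in Python over a deliberately tiny expression type. Everything here is an assumption made for illustration: the node classes `Num`, `Var`, `Let` and `Seq`, the helper `subst`, and the fresh-name scheme `y_1`, `y_2`, and so on. The `let rec` case is omitted to keep the sketch short.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Let:
    name: str
    bound: object
    body: object
    mutable: bool = False


@dataclass(frozen=True)
class Seq:
    first: object
    second: object


def defs(e):
    """Variables defined at the top level of e."""
    if isinstance(e, Let):
        return {e.name} | defs(e.body)
    if isinstance(e, Seq):
        return defs(e.first) | defs(e.second)
    return set()


def subst(e, old, new):
    """Rename the free occurrences of variable `old` in e to `new`."""
    if isinstance(e, Var):
        return Var(new) if e.name == old else e
    if isinstance(e, Let):
        bound = subst(e.bound, old, new)
        body = e.body if e.name == old else subst(e.body, old, new)
        return Let(e.name, bound, body, e.mutable)
    if isinstance(e, Seq):
        return Seq(subst(e.first, old, new), subst(e.second, old, new))
    return e


def rn(e, names, fresh):
    """Rename the top-level definitions of the variables in `names`."""
    if isinstance(e, Let):
        if e.name not in names:
            return Let(e.name, e.bound, rn(e.body, names, fresh), e.mutable)
        z = fresh(e.name)
        return Let(z, e.bound, rn(subst(e.body, e.name, z), names, fresh), e.mutable)
    if isinstance(e, Seq):
        return Seq(rn(e.first, names, fresh), rn(e.second, names, fresh))
    return e


def make_fresh():
    """Return a generator of new names of the form base_1, base_2, ..."""
    counter = count(1)
    return lambda base: f"{base}_{next(counter)}"


# let f = y in (let y = 5 in f): the bound expression of f mentions an outer y
e = Let("f", Var("y"), Let("y", Num(5), Var("f")))
renamed = rn(e, {"y"}, make_fresh())
```

In this run `defs(e)` is `{"f", "y"}` and `renamed` is the same expression with the inner definition of `y` renamed to `y_1`, while the occurrence of `y` in the expression bound to `f` is left untouched.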

The translation of an F# sequence of expressions to an IL block is:

$[\![e_1, e_2]\!]^{bl}_{I,M} = (bl_1; bl_2,\ \bar{\delta}; \bar{\delta}')$

where:

• $[\![e_1]\!]^{bl}_{I,M} = (bl_1, \bar{\delta})$

• $[\![\mathit{rn}(e_2, \bar{z})]\!]^{bl}_{I,M} = (bl_2, \bar{\delta}')$ and $\bar{z} = \mathit{def}_{F\#}(e_2) \cap FV(e_1)$

The translation of the sequence is the sequence of blocks which are the translations of the two expressions to blocks. However, before translating the second expression, $e_2$, we rename all the variables defined in it that are free in $e_1$, since in $e_1$ these variables are bound to their definitions in the enclosing environment. In this way we preserve the semantics of the source language.

The translation of an F# sequence of expressions to an IL expression is:

$[\![e_1, e_2]\!]^{exp}_{I,M} = (e,\ \delta; \bar{\delta})$

where:

• $[\![e_1, e_2]\!]^{bl}_{I,M} = (bl, \bar{\delta})$ and

• blockToExp(bl, I, M) = (e, δ).

That is, we first translate the sequence to a block, and then return an exc expression and the definition of a new variable bound to an stm2exp expression, see Definition 4. Note that the sets of mutable and immutable variables of the environment are needed to generate the correct matching for the expressions exc and stm2exp.

The translation of the let construct to an IL block is:

$[\![\mathtt{let}\ x = e_1\ \mathtt{in}\ e_2]\!]^{bl}_{I,M} = (\mathtt{def}\ x = e';\ bl,\ \bar{\delta}; \bar{\delta}')$

where

• $[\![e_1]\!]^{exp}_{I,M} = (e', \bar{\delta})$ and

• $[\![\mathit{rn}(e_2, \bar{z})]\!]^{bl}_{I \cup \{x\},M} = (bl, \bar{\delta}')$ with $\bar{z} = \mathit{def}_{F\#}(e_2) \cap FV(e_1)$

That is, we translate $e_1$ into an IL expression and the body of the let, $e_2$, into a block. For the translation of $e_2$ the variable x is added to the immutable variables of the context. Before translating $e_2$ we rename all the variables defined in $e_2$ that are free in $e_1$ (as for the translation of sequences of expressions).

The translation of let mutable differs only in the fact that, in the translation of $e_2$, the variable x, being mutable, is added to M.

Note that this translation produces a block: the definition of x followed by a block. Moreover, the translation of the expression on the right-hand side of the definition of x, that is $e_1$, must be an IL expression. Looking at the F# code of Fig. 1, this means that the following F# expression:

let rec fib x =

if x < 3 then 1

else fib(x - 1) + fib(x - 2)

let temp = fib z

even <- (temp % 2 = 0)

temp

which is a sequence of expressions, must be translated to an IL expression.

The translation of a let expression to an IL expression is defined as the translation of a sequence of expressions to an IL expression, in which $[\![\mathtt{let}\ x = e_1\ \mathtt{in}\ e_2]\!]^{bl}_{I,M}$ substitutes $[\![e_1, e_2]\!]^{bl}_{I,M}$.

Properties of the Translation. The translation preserves the dynamic semantics of F# expressions. That is, let e be an F# program, and $[\![e]\!]^{bl}_{\emptyset,\emptyset} = (bl, \bar{\delta})$. Then e | [] −→⋆ v | ρ if and only if {$\bar{\delta}$;bl} | [] −→⋆ v | ρ′ for some ρ′. From this result and the fact that F# programs do not get stuck, we can derive that the translation of an F# program does not evaluate to an error or get stuck.

6 COMPARISONS WITH OTHER WORK

Similar projects exist and are based on similar translation techniques, although, as far as we know, we are the first to introduce an intermediate language allowing translation to many target languages. Pit, see (Fahad, 2012), and FunScript, see (Bray, 2013), are open source F# to JavaScript compilers. They support only translation to JavaScript. FunScript has support for integration with JavaScript code. Websharper, see (Intellifactory, 2012), is a professional web and mobile development framework. As of version 2.4 an open source license is available. It is a very rich framework offering extensions for ExtJs, jQuery, Google Maps, WebGL and many more. Again, it supports only JavaScript.

F# Web Tools is an open source tool whose main objective is not the translation to JavaScript; instead, it tries to solve the difficulties of web programming: “the heterogeneous nature of execution, the discontinuity between client and server parts of execution and the lack of type-checked execution on the client side”, see (Petříček and Syme, 2012). It does so by using meta-programming and monadic syntax. One of its features is translation to JavaScript. Finally, a translation between OCaml byte code and JavaScript is provided by Ocsigen, and described in (Vouillon and Balat, 2011).

On the theoretical side, a framework integrating statically and dynamically typed (functional) languages is presented in (Matthews and Findler, 2009). Support for dynamic languages is provided with ad hoc constructs in Scala, see (Moors et al., 2012). A construct similar to stm2exp is studied in recent work by one of the authors, see (Ancona et al., 2013), where it is shown how to use it to realize dynamic binding and meta-programming, an issue we are planning to address. The only work to our knowledge that proves the correctness of a translation from a statically typed functional language with imperative features to a scripting language (namely JavaScript) is (Fournet et al., 2013).

7 CONCLUSIONS AND FUTURE WORK

In this paper we introduced IL, an intermediate language for the translation of a significant fragment of F# to scripting languages such as Python and JavaScript. The translation is shown to preserve the dynamic semantics of the original language. A preliminary version of this paper was presented at ICTCS 2012, see (Giannini et al., 2012), which has not published proceedings. We have a prototype implementation of the compiler that can be found at http://www.bluestormproject.org/. The compiler is implemented in F# and is based on two metaprogramming features offered by the .Net platform: quotations and reflection. Our future work will be, on the practical side, to use the intermediate language to integrate F# code and JavaScript or Python native code. (Some of the features of IL, such as dynamic type checking, were originally introduced for this purpose.) A previous implementation of the translation supported other features such as namespacing, classes, pattern matching, discriminated unions, etc. We are in the process of adding them to the current implementation, since some of these features have poor or no support at all in JavaScript or Python. On the theoretical side, we are planning to complete the proofs of correctness of the translations. We need to formalize our target languages Python and JavaScript, and then prove the correctness of the translation from IL to them. (We anticipate that these proofs will be easier than the one from F# to IL.) Moreover, we want to formalize the integration of native code, and more generally meta-programming, along the lines of recent work by one of the authors, see (Ancona et al., 2013). We are also considering extending the type system for the intermediate language with polymorphic types, which is, as shown in (Ahmed et al., 2011), non-trivial.

ACKNOWLEDGEMENTS

We warmly thank Daniele Mantovani for his support and involvement in the topic of the paper. We also thank the anonymous referees of a previous version of the paper for pointing out some problems which led to a substantial revision of the intermediate language. Any misinterpretation of their suggestions is, of course, our responsibility.

REFERENCES

Ahmed, A., Findler, R. B., Siek, J. G., and Wadler, P.

(2011). Blame for all. In Proceedings of POPL 2011,

Austin, TX, USA, ACM, pages 201–214.

Ancona, D., Giannini, P., and Zucca, E. (2013). Recon-

ciling positional and nominal binding. In ITRS 2012,

EPTCS.

Appel, A. W. (1998). Modern Compiler Implementation in

ML. Cambridge University Press.

Bray, Z. (2013). Funscript. http://tomasp.net/ﬁles/funscript/

tutorial.html.

Fahad, M. S. (2012). Pit - F Sharp to JS compiler. http://

pitfw.org/.

Fournet, C., Swamy, N., Chen, J., Dagand, P.-E., Strub, P.-Y., and Livshits, B. (2013). Fully abstract compilation to javascript. In POPL, pages 371–384. ACM.

Giannini, P., Mantovani, D., and Shaqiri, A. (2012). Lever-

aging dynamic typing through static typing. ICTCS

2012. http://ictcs.di.unimi.it/papers/paper

4.pdf.

Igarashi, A., Pierce, B., and Wadler, P. (2001). Feather-

weight Java: A minimal core calculus for Java and

GJ. ACM TOPLAS, 23(3):396–450.

Intellifactory (2012). Websharper 2010 platform. http://

websharper.com/.

Matthews, J. and Findler, R. B. (2009). Operational seman-

tics for multi-language programs. ACM Trans. Pro-

gram. Lang. Syst., 31(3).

ICSOFT2013-8thInternationalJointConferenceonSoftwareTechnologies

102

Moors, A., Rompf, T., Haller, P., and Odersky, M. (2012).

Scala-virtualized. In Kiselyov, O. and Thompson,

S., editors, Proceedings of PEPM 2012, Philadelphia,

Pennsylvania, USA, ACM, pages 117–120.

Nanevski, A. (2003). From dynamic binding to state

via modal possibility. In PPDP’03, pages 207–218.

ACM.

Petříček, T. and Syme, D. (2012). AFAX: Rich client/server web applications in F#. http://www.scribd.com/doc/54421045/Web-Apps-in-F-Sharp.

Ranson, J. F., Hamilton, H. J., and Fong, P. W. L. (2008). A semantics of python in isabelle/hol. Technical Report CS-2008-04, CS Department, University of Regina, Saskatchewan.

Vouillon, J. and Balat, V. (2011). From bytecode

to javascript: the js of ocaml compiler. http://

www.pps.univ-paris-diderot.fr/∼balat/publi.php.

AnIntermediateLanguageforCompilationtoScriptingLanguages

103