A Semantic Framework for the Security Analysis 251
– $\mathsf{gas} \in \mathbb{N}_{256}$ is the current amount of gas still available for execution;
– $\mathsf{pc} \in \mathbb{N}_{256}$ is the current program counter;
– $\mathsf{m} \in \mathbb{B}^{256} \to \mathbb{B}^{8}$ is a mapping from 256-bit words to bytes that represents the
local memory;
– $\mathsf{i} \in \mathbb{N}_{256}$ is the current number of active words in memory;
– $\mathsf{s} \in \mathcal{L}(\mathbb{B}^{256})$ is the local 256-bit word stack of the stack machine.
The execution of each internal transaction starts in a fresh machine state, with
an empty stack, memory initialized to all zeros, and program counter and active
words in memory set to zero. Only the gas is instantiated with the gas value
available for the execution.
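The machine state and its initialization can be sketched as a small data structure. This is a minimal illustration, not part of the formal semantics: the names `MachineState`, `fresh_state` and `read_byte` are hypothetical, and memory is modelled as a sparse dictionary so that unset addresses read as zero.

```python
from dataclasses import dataclass, field
from typing import Dict, List

WORD_MOD = 2 ** 256  # values of 256-bit machine words lie in [0, 2^256)


@dataclass
class MachineState:
    """Local machine state mu = (gas, pc, m, i, s); names are illustrative."""
    gas: int                                          # gas still available
    pc: int = 0                                       # program counter
    m: Dict[int, int] = field(default_factory=dict)   # sparse memory: address -> byte
    i: int = 0                                        # number of active memory words
    s: List[int] = field(default_factory=list)        # word stack, top at index 0

    def read_byte(self, address: int) -> int:
        # Memory starts as all zeros, so addresses never written read as 0.
        return self.m.get(address, 0)


def fresh_state(gas: int) -> MachineState:
    """State at the start of an internal transaction: only gas is instantiated."""
    if not 0 <= gas < WORD_MOD:
        raise ValueError("gas must fit into 256 bits")
    return MachineState(gas=gas)
```

For example, `fresh_state(100)` yields a state with `pc == 0`, `i == 0`, an empty stack, and memory that reads as zero everywhere.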
3.4 Small-Step Rules
In the following, we will present a selection of interesting small-step rules in
order to illustrate the most important features of the semantics.
To demonstrate the overall design of the semantics, we start with the
example of the arithmetic instruction ADD, which adds two values on
the machine stack. Note that as the word size of the stack machine is 256 bits, all
arithmetic operations are performed modulo $2^{256}$.
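$$
\frac{\begin{array}{c}
\iota.\mathsf{code}[\mu.\mathsf{pc}] = \mathsf{ADD} \qquad \mu.\mathsf{s} = a :: b :: s \qquad \mu.\mathsf{gas} \geq 3 \\
\mu' = \mu[\mathsf{s} \to (a + b) :: s][\mathsf{pc} \mathrel{+}= 1][\mathsf{gas} \mathrel{-}= 3]
\end{array}}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to (\mu', \iota, \sigma, \eta) :: S}
$$

$$
\frac{\iota.\mathsf{code}[\mu.\mathsf{pc}] = \mathsf{ADD} \qquad (|\mu.\mathsf{s}| < 2 \,\lor\, \mu.\mathsf{gas} < 3)}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to \mathit{EXC} :: S}
$$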
We use dot notation to access components of the different state
parameters. We name the components with the variable names introduced for
these components in the last section, written in sans-serif style. In addition, we
use the usual notation for updating components: t[c → v] denotes that the
component c of tuple t is updated with value v. To express incremental
updates more simply, we additionally use the notation t[c += v] to denote
that the (numerical) component c is incremented by v, and similarly t[c −= v]
for decrementing a component c of t.
The execution of the arithmetic instruction ADD only performs local changes
in the machine state, affecting the local stack, the program counter, and the
gas budget. To decide which instruction to execute, the currently
executed code (which is part of the execution environment) is accessed at the
position of the current program counter. The cost of an ADD instruction is
a constant three units of gas, which are subtracted from the gas budget in the
machine state. Like every other instruction, ADD can fail due to a lack of gas or due
to an underflow of the machine stack. In this case, the exception state is entered
and the execution of the current internal transaction is terminated. For better
readability, we use here the slightly sloppy ∨ notation for combining the two
error cases in one inference rule.
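The two ADD rules can be read as a single step function. The following is a minimal sketch under simplifying assumptions: only the components touched by ADD (gas, pc, s) are modelled, the lookup of the current instruction in the code is omitted, and the names `Mu`, `step_add` and the string marker `EXC` are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Tuple, Union

WORD_MOD = 2 ** 256   # arithmetic is performed modulo 2^256
ADD_COST = 3          # constant gas cost of ADD, as stated in the text
EXC = "EXC"           # illustrative marker for the exception state


@dataclass(frozen=True)
class Mu:
    """Reduced machine state: only the components that ADD touches."""
    gas: int
    pc: int
    s: Tuple[int, ...]  # stack, top element first


def step_add(mu: Mu) -> Union[Mu, str]:
    # Error rule: stack underflow or insufficient gas leads to EXC.
    if len(mu.s) < 2 or mu.gas < ADD_COST:
        return EXC
    # Success rule: pop a and b, push (a + b) mod 2^256, advance pc, charge gas.
    a, b, *rest = mu.s
    return replace(
        mu,
        s=((a + b) % WORD_MOD, *rest),
        pc=mu.pc + 1,
        gas=mu.gas - ADD_COST,
    )
```

Returning a new `Mu` instead of mutating the old one mirrors the functional update notation used in the rule.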
252 I. Grishchenko et al.
A more interesting example of a semantic rule is the one for the CALL instruction,
which initiates an internal call transaction. Calling involves several
corner cases, which results in several inference rules for this
instruction. Here, we only present one rule to illustrate the main functionality. More
precisely, we present the case in which the account that should be called exists,
the call stack limit of 1024 is not reached yet, and the account initiating the
transaction has a sufficiently large balance for sending the specified amount of
wei to the called account.
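$$
\frac{\begin{array}{c}
\iota.\mathsf{code}[\mu.\mathsf{pc}] = \mathsf{CALL} \qquad \mu.\mathsf{s} = g :: \mathit{to} :: \mathit{va} :: \mathit{io} :: \mathit{is} :: \mathit{oo} :: \mathit{os} :: s \\
\sigma(\mathit{to}) \neq \bot \qquad |A| + 1 < 1024 \qquad \sigma(\iota.\mathsf{actor}).\mathsf{b} \geq \mathit{va} \qquad \mathit{aw} = M(M(\mu.\mathsf{i}, \mathit{io}, \mathit{is}), \mathit{oo}, \mathit{os}) \\
c_{\mathit{call}} = C_{\mathit{gascap}}(\mathit{va}, 1, g, \mu.\mathsf{gas}) \qquad c = C_{\mathit{base}}(\mathit{va}, 1) + C_{\mathit{mem}}(\mu.\mathsf{i}, \mathit{aw}) + c_{\mathit{call}} \\
\mu.\mathsf{gas} \geq c \qquad \sigma' = \sigma\langle \mathit{to} \to \sigma(\mathit{to})[\mathsf{b} \mathrel{+}= \mathit{va}]\rangle\langle \iota.\mathsf{actor} \to \sigma(\iota.\mathsf{actor})[\mathsf{b} \mathrel{-}= \mathit{va}]\rangle \\
d = \mu.\mathsf{m}[\mathit{io}, \mathit{io} + \mathit{is} - 1] \qquad \mu' = (c_{\mathit{call}}, 0, \lambda x.\,0, 0, \epsilon) \\
\iota' = \iota[\mathsf{sender} \to \iota.\mathsf{actor}][\mathsf{actor} \to \mathit{to}][\mathsf{value} \to \mathit{va}][\mathsf{input} \to d][\mathsf{code} \to \sigma(\mathit{to}).\mathsf{code}]
\end{array}}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to (\mu', \iota', \sigma', \eta) :: (\mu, \iota, \sigma, \eta) :: S}
$$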
For performing a call, the parameters to this call need to be specified on the
machine stack. These are the amount of gas g that should be given as budget to
the call, the recipient to of the call and the amount va of wei to be transferred
with the call. In addition, the caller needs to specify the input data that should
be given to the transaction and the place in memory where the return data of
the call should be written after successful execution. To this end, the remaining
arguments specify the offset and size of the memory fragment that input data
should be read from (determined by io and is) and return data should be written
to (determined by oo and os).
Calculating the cost in terms of gas for the execution is quite complicated in
the case of CALL, as it is influenced by several factors including the arguments
given to the call and the current machine state. First of all, the gas that should
be given to the call (here denoted by $c_{\mathit{call}}$) needs to be determined. This value is
not necessarily equal to the value g specified on the stack, but also depends on
the value va transferred by the call and the currently available gas. In addition,
as the memory needs to be accessed for reading the input value and writing the
return value, the number of active words in memory might be increased. This
effect is captured by the memory extension function M. As accessing additional
words in memory costs gas, this cost needs to be taken into account in the
overall cost. The cost resulting from an increase in the number of active words
is calculated by the function $C_{\mathit{mem}}$. Finally, there is also a base cost charged for
the call that depends on the value va. As the cost also depends on the specific case
for calling that is considered, the cost calculation functions receive a flag (here
1) as an argument. These technical details are spelled out in the full version [22].
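To make the memory-related part of the cost concrete, here is a minimal sketch of the memory extension function M and the memory cost function. The text above defers the exact definitions to the full version, so the formulas below are an assumption taken from the Ethereum Yellow Paper (active words grow to cover the accessed range, and memory costs 3 gas per word plus a quadratic term); the names `mem_extend` and `mem_cost` are hypothetical. The functions $C_{\mathit{gascap}}$ and $C_{\mathit{base}}$ are left out because their definitions are not given here.

```python
def mem_extend(active_words: int, offset: int, size: int) -> int:
    """Memory extension M(i, offset, size), assuming the Yellow Paper definition.

    An access of size 0 never extends memory; otherwise memory must cover
    all bytes up to offset + size, counted in 32-byte words.
    """
    if size == 0:
        return active_words
    return max(active_words, (offset + size + 31) // 32)


def mem_cost(old_words: int, new_words: int) -> int:
    """Gas charged for growing memory from old_words to new_words (assumed formula)."""
    def total(words: int) -> int:
        return 3 * words + (words * words) // 512

    return total(new_words) - total(old_words)


# Active words after a CALL reading input at [0, 64) and writing output at [64, 96):
aw = mem_extend(mem_extend(0, 0, 64), 64, 32)
```

With these assumed definitions, `aw` is 3 in the example and `mem_cost(0, aw)` is 9.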
The call itself then has several effects: First, it transfers the balance from
the executing state (actor in the execution environment) to the recipient (to).
To this end, the global state is updated. Here we use a special notation for the
functional update on the global state using ⟨⟩ instead of []. Second, for initializing
the execution of the initiated internal transaction, a new regular execution state
Logging instructions. The logging operation appends a new log entry to the log
series. The log series keeps track of archived and indexable checkpoints in the execution
of Ethereum bytecode. The motivation for the log series is to allow external observers
to track the program execution. A log entry consists of the address of the currently
executing account, up to four 'topics' (specified on the stack), and a fraction of the memory.
There are four logging instructions, but as seen before we describe their effects using
common rules, parameterising the instruction by the amount of log information read
from the stack.
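$$
\frac{\begin{array}{c}
\omega_{\mu,\iota} = \mathsf{LOG}n \qquad \mu.\mathsf{s} = \mathit{pos}_m :: \mathit{size} :: (s_1 \mathbin{+\!\!+} s_2) \qquad |s_1| = n \\
\mathit{aw} = M(\mu.\mathsf{i}, \mathit{pos}_m, \mathit{size}) \qquad c = C_{\mathit{mem}}(\mu.\mathsf{i}, \mathit{aw}) + 375 + 8 \cdot \mathit{size} + n \cdot 375 \\
\mathit{valid}(\mu.\mathsf{gas}, c, |\mu.\mathsf{s}|) \qquad \mu' = \mu[\mathsf{s} \to s_2][\mathsf{pc} \mathrel{+}= 1][\mathsf{gas} \mathrel{-}= c][\mathsf{i} \to \mathit{aw}] \\
d = \mu.\mathsf{m}[\mathit{pos}_m, \mathit{pos}_m + \mathit{size} - 1] \qquad \eta' = \eta[\mathsf{L} \to \eta.\mathsf{L} \mathbin{+\!\!+} [(\iota.\mathsf{actor}, s_1, d)]]
\end{array}}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to (\mu', \iota, \sigma, \eta') :: S}
$$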
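$$
\frac{\begin{array}{c}
\omega_{\mu,\iota} = \mathsf{LOG}n \\
\mu.\mathsf{s} = \mathit{pos}_m :: \mathit{size} :: (s_1 \mathbin{+\!\!+} s_2) \qquad |s_1| = n \qquad \mathit{aw} = M(\mu.\mathsf{i}, \mathit{pos}_m, \mathit{size}) \\
c = C_{\mathit{mem}}(\mu.\mathsf{i}, \mathit{aw}) + 375 + 8 \cdot \mathit{size} + n \cdot 375 \qquad \neg \mathit{valid}(\mu.\mathsf{gas}, c, |\mu.\mathsf{s}|)
\end{array}}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to \mathit{EXC} :: S}
$$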
$$
\frac{\omega_{\mu,\iota} = \mathsf{LOG}n \qquad |\mu.\mathsf{s}| < n + 2}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to \mathit{EXC} :: S}
$$
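The three LOG rules can be combined into one step function, sketched below under stated assumptions. The memory functions follow the Ethereum Yellow Paper definitions (they are not spelled out in this text), the predicate valid is simplified to a gas check only, and all names (`step_log`, `mem_extend`, `mem_cost`, the `EXC` marker) are hypothetical.

```python
from typing import Dict, List, Tuple, Union

EXC = "EXC"  # illustrative marker for the exception state
LogEntry = Tuple[str, Tuple[int, ...], bytes]


def mem_extend(active_words: int, offset: int, size: int) -> int:
    # Assumed Yellow Paper definition of the memory extension function M.
    return active_words if size == 0 else max(active_words, (offset + size + 31) // 32)


def mem_cost(old_words: int, new_words: int) -> int:
    # Assumed Yellow Paper memory cost: 3 gas per word plus a quadratic term.
    total = lambda w: 3 * w + (w * w) // 512
    return total(new_words) - total(old_words)


def step_log(n: int, stack: Tuple[int, ...], gas: int, active_words: int,
             memory: Dict[int, int], actor: str, log: List[LogEntry]
             ) -> Union[str, Tuple[Tuple[int, ...], int, int, List[LogEntry]]]:
    # Third rule: fewer than n + 2 stack elements leads to EXC.
    if len(stack) < n + 2:
        return EXC
    pos, size = stack[0], stack[1]
    topics, rest = stack[2:2 + n], stack[2 + n:]
    aw = mem_extend(active_words, pos, size)
    cost = mem_cost(active_words, aw) + 375 + 8 * size + n * 375
    # Second rule, with valid(...) simplified to the gas check only (assumption).
    if gas < cost:
        return EXC
    # First rule: read the logged bytes and append the entry to the log series.
    data = bytes(memory.get(addr, 0) for addr in range(pos, pos + size))
    return rest, gas - cost, aw, log + [(actor, topics, data)]
```

The function returns the updated stack, remaining gas, active word count and log series, leaving the program counter increment to the caller.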
Halting instructions. The execution of a RETURN command requires reading data from
the local memory. Consequently, the cost for memory consumption is charged. Additionally,
the read data is recorded in the halting state in order to potentially propagate it
to the caller.
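$$
\frac{\begin{array}{c}
\omega_{\mu,\iota} = \mathsf{RETURN} \\
\mu.\mathsf{s} = \mathit{io} :: \mathit{is} :: s \qquad \mathit{aw} = M(\mu.\mathsf{i}, \mathit{io}, \mathit{is}) \qquad c = C_{\mathit{mem}}(\mu.\mathsf{i}, \mathit{aw}) \\
\mathit{valid}(\mu.\mathsf{gas}, c, |s|) \qquad d = \mu.\mathsf{m}[\mathit{io}, \mathit{io} + \mathit{is} - 1] \qquad g = \mu.\mathsf{gas} - c
\end{array}}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to \mathit{HALT}(\sigma, g, d, \eta) :: S}
$$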
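$$
\frac{\begin{array}{c}
\omega_{\mu,\iota} = \mathsf{RETURN} \qquad \mu.\mathsf{s} = \mathit{io} :: \mathit{is} :: s \\
\mathit{aw} = M(\mu.\mathsf{i}, \mathit{io}, \mathit{is}) \qquad c = C_{\mathit{mem}}(\mu.\mathsf{i}, \mathit{aw}) \qquad \neg \mathit{valid}(\mu.\mathsf{gas}, c, |s|)
\end{array}}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to \mathit{EXC} :: S}
$$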
$$
\frac{\omega_{\mu,\iota} = \mathsf{RETURN} \qquad |\mu.\mathsf{s}| < 2}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to \mathit{EXC} :: S}
$$
The execution of a STOP command halts execution without propagating any data
to the caller.
3. The execution of the called code ends with an exception. In this case the remaining
arguments are removed from the caller’s stack and instead 0 is written to the caller’s
stack. The caller does not get the remaining gas refunded.
As the first two cases can be treated analogously, we just need two rules for returning
from a call.
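$$
\frac{\begin{array}{c}
\omega_{\mu,\iota} = \mathsf{CALL} \\
\mu.\mathsf{s} = g :: \mathit{to} :: \mathit{va} :: \mathit{io} :: \mathit{is} :: \mathit{oo} :: \mathit{os} :: s \qquad \mathit{to}_a = \mathit{to} \bmod 2^{160} \\
\mathit{flag} = (\sigma(\mathit{to}_a) = \bot) \mathbin{?} 0 : 1 \qquad \mathit{aw} = M(M(\mu.\mathsf{i}, \mathit{io}, \mathit{is}), \mathit{oo}, \mathit{os}) \\
c_{\mathit{call}} = C_{\mathit{gascap}}(\mathit{va}, \mathit{flag}, g, \mu.\mathsf{gas}) \qquad c = C_{\mathit{base}}(\mathit{va}, \mathit{flag}) + C_{\mathit{mem}}(\mu.\mathsf{i}, \mathit{aw}) + c_{\mathit{call}} \\
\mu' = \mu[\mathsf{i} \to \mathit{aw}][\mathsf{s} \to 1 :: s][\mathsf{pc} \mathrel{+}= 1][\mathsf{gas} \mathrel{+}= \mathit{gas} - c][\mathsf{m} \to \mu.\mathsf{m}[[\mathit{oo}, \mathit{oo} + \mathit{os} - 1] \to d]]
\end{array}}
{\Gamma \vDash \mathit{HALT}(\sigma', \mathit{gas}, d, \eta') :: (\mu, \iota, \sigma, \eta) :: S \to (\mu', \iota, \sigma', \eta') :: S}
$$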
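$$
\frac{\begin{array}{c}
\omega_{\mu,\iota} = \mathsf{CALL} \\
\mu.\mathsf{s} = g :: \mathit{to} :: \mathit{va} :: \mathit{io} :: \mathit{is} :: \mathit{oo} :: \mathit{os} :: s \qquad \mathit{to}_a = \mathit{to} \bmod 2^{160} \\
\mathit{flag} = (\sigma(\mathit{to}_a) = \bot) \mathbin{?} 0 : 1 \qquad \mathit{aw} = M(M(\mu.\mathsf{i}, \mathit{io}, \mathit{is}), \mathit{oo}, \mathit{os}) \\
c_{\mathit{call}} = C_{\mathit{gascap}}(\mathit{va}, \mathit{flag}, g, \mu.\mathsf{gas}) \qquad c = C_{\mathit{base}}(\mathit{va}, \mathit{flag}) + C_{\mathit{mem}}(\mu.\mathsf{i}, \mathit{aw}) + c_{\mathit{call}} \\
\mu' = \mu[\mathsf{i} \to \mathit{aw}][\mathsf{s} \to 0 :: s][\mathsf{pc} \mathrel{+}= 1][\mathsf{gas} \mathrel{-}= c]
\end{array}}
{\Gamma \vDash \mathit{EXC} :: (\mu, \iota, \sigma, \eta) :: S \to (\mu', \iota, \sigma, \eta) :: S}
$$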
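The two rules for returning from a call can be sketched as one function that resumes the caller. This is an illustration under assumptions: the total cost c and the new active word count aw are passed in precomputed (their definitions depend on functions not given here), return data longer than os bytes is truncated (the rule does not say how the lengths are reconciled), and the names `Mu` and `return_from_call` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Mu:
    """Caller's machine state (illustrative field names)."""
    gas: int
    pc: int
    i: int
    s: Tuple[int, ...]   # stack, top element first
    m: Dict[int, int]    # sparse memory: address -> byte


def return_from_call(mu: Mu, cost: int, aw: int,
                     halted: Optional[Tuple[int, bytes]]) -> Mu:
    """Resume the caller after the callee finished.

    halted is (remaining gas, return data) for a HALT state and None for EXC.
    """
    # Pop the seven CALL arguments; only the output offset and size are used here.
    _g, _to, _va, _io, _is, oo, os_, *rest = mu.s
    if halted is None:
        # Exception case: push 0, charge the full cost, no refund, memory unchanged.
        return Mu(gas=mu.gas - cost, pc=mu.pc + 1, i=aw, s=(0, *rest), m=dict(mu.m))
    gas_left, data = halted
    memory = dict(mu.m)
    for k, byte in enumerate(data[:os_]):  # truncation to os bytes is an assumption
        memory[oo + k] = byte
    # Success case: push 1, refund the callee's remaining gas, charge the cost.
    return Mu(gas=mu.gas + gas_left - cost, pc=mu.pc + 1, i=aw, s=(1, *rest), m=memory)
```

Note that the caller's gas is only charged at this point, when control returns, which matches the rules above.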
The two other instructions for calling (CALLCODE and DELEGATECALL) are
similar to CALL.
The CALLCODE instruction differs only in that the control flow is not
handed over to the called contract; instead, only its code is executed in the environment of
the calling account. This means in particular that the amount of money transferred is
only relevant as a guard for the call, but does not need to be actually transferred. In
addition, if the account whose code should be executed does not exist, this
account is not created, but only the empty code is run. However, the amount of
Ether specified on the stack still influences the execution cost.
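$$
\frac{\begin{array}{c}
\omega_{\mu,\iota} = \mathsf{CALLCODE} \\
\mu.\mathsf{s} = g :: \mathit{to} :: \mathit{va} :: \mathit{io} :: \mathit{is} :: \mathit{oo} :: \mathit{os} :: s \qquad \mathit{to}_a = \mathit{to} \bmod 2^{160} \qquad \sigma(\mathit{to}_a) \neq \bot \\
|A| + 1 \leq 1024 \qquad \sigma(\iota.\mathsf{actor}).\mathsf{b} \geq \mathit{va} \qquad \mathit{aw} = M(M(\mu.\mathsf{i}, \mathit{io}, \mathit{is}), \mathit{oo}, \mathit{os}) \\
c_{\mathit{call}} = C_{\mathit{gascap}}(\mathit{va}, 1, g, \mu.\mathsf{gas}) \qquad c = C_{\mathit{base}}(\mathit{va}, 1) + C_{\mathit{mem}}(\mu.\mathsf{i}, \mathit{aw}) + c_{\mathit{call}} \\
\mathit{valid}(\mu.\mathsf{gas}, c, |s| + 1) \qquad d = \mu.\mathsf{m}[\mathit{io}, \mathit{io} + \mathit{is} - 1] \qquad \mu' = (c_{\mathit{call}}, 0, \lambda x.\,0, 0, \epsilon) \\
\iota' = \iota[\mathsf{sender} \to \iota.\mathsf{actor}][\mathsf{value} \to \mathit{va}][\mathsf{input} \to d][\mathsf{code} \to \sigma(\mathit{to}_a).\mathsf{code}]
\end{array}}
{\Gamma \vDash (\mu, \iota, \sigma, \eta) :: S \to (\mu', \iota', \sigma, \eta) :: (\mu, \iota, \sigma, \eta) :: S}
$$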
Figure 2: The EtherTrust definitions for ADD, CALL and RETURN.
5 AN EVM MODEL SPECIALIZED FOR GAS ANALYSIS
The gas mechanism ensures that a contract can only run a finite number of
“local” instructions, i.e., instructions whose effect is local to the current
contract (no call, return, etc.). As mentioned above, when a contract $c_1$
calls another contract $c_2$ with, say, g units of gas, the gas associated to
$c_1$ is not charged immediately. Thus, using Yellow Paper semantics, a
contract $c_1$ calling itself would loop indefinitely. The Yellow Paper
prevents this by fixing a call stack size limit. If a contract exhausts the
stack limit then its execution ends with an exception. However, unlike other
virtual machines, the EVM has no exception catching mechanism. When an
exception is raised in a contract c, the execution of c stops, the information
of the contract c is popped from the stack, and the control flow goes back to
the previous contract in the stack if it exists; otherwise the execution stops.
To sum up, termination of contracts in the formal semantics of the Yellow
Paper is enforced by the gas mechanism and the fact that the call stack is
finite. In the following, to formally prove termination, we prove that,
whatever the contracts may be, the call stacks decrease w.r.t. a well-founded
ordering. First, we define the call stacks and the frames composing the call
stacks based on the formal small-step semantics of (Gavin, 2014; Gavin, 2019)
and (Grishchenko et al., 2018b; Grishchenko et al., 2018a).
The Maximal Call Stack Size. The maximal call stack size is denoted by
stack_lim. We assume that stack_lim is a natural number strictly greater
than 0.
Abstraction of the Frames. To run a contract $c_1$, the EVM stores information
in the call stack. In the following, we call this information a frame.
Following (Grishchenko et al., 2018b), our frames can denote standard program
execution frames, HALT frames and EXC frames. In our EVM model specialized for
gas analysis, we can abstract frames by three different frame forms: either
Ok(g, pc, p, e), Halt(g, e) or Exception, where g is a gas value, pc is a
program counter, p is a program code, and e is an environment. As in
(Grishchenko et al., 2018b), this environment is an abstraction of the global
state of the system σ. In our model, this environment maps contract names to
the associated codes. An Ok(g, pc, p, e) frame represents a standard execution
frame (µ, ι, σ, η), where we abstract away η and most parts of µ (including the
execution stack and the local memory). In µ, we only keep track of the program
counter µ.pc and the available gas µ.gas. Similarly, we forget everything about
ι except ι.code, the current program to execute. In σ, we only follow the
contract names associated to code and forget about all other types of
information. A Halt(g, e) frame represents a contract that successfully reaches
a RETURN instruction, where g is the gas remaining after the execution of the
contract (the refund) and e is the (possibly) modified environment. In
particular, e may contain new contract names and their associated code. In
contrast, the result value d and the effect η are not stored in our abstract
version of the semantics, because they have no impact on the control flow nor
on gas consumption. In particular, if a conditional jump depends on the result
d, then this is modeled in our abstract semantics by the fact that the abstract
Jump instruction can jump to any valid position in the current contract.
Finally, an Exception frame represents a contract whose execution has failed
because it exhausted the available gas, overflowed the
SECRYPT 2020 - 17th International Conference on Security and Cryptography
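The three abstract frame forms described in this section, Ok(g, pc, p, e), Halt(g, e) and Exception, can be sketched as plain data types together with the abstraction of a concrete frame. This is a minimal illustration, not the model used in the paper: the class names, the dictionary layout of the concrete frame, and the function `abstract_frame` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

Code = Tuple[str, ...]   # a program as a sequence of instruction names (assumption)
Env = Dict[str, Code]    # environment e: contract name -> associated code


@dataclass(frozen=True)
class Ok:
    """Standard execution frame, reduced to gas, pc, code and environment."""
    gas: int
    pc: int
    code: Code
    env: Env


@dataclass(frozen=True)
class Halt:
    """Successful end of a contract: remaining gas (the refund) and environment."""
    gas: int
    env: Env


@dataclass(frozen=True)
class ExceptionFrame:
    """Failed execution; carries no data."""


Frame = Union[Ok, Halt, ExceptionFrame]


def abstract_frame(mu: dict, iota: dict, sigma: dict) -> Ok:
    """Abstract a concrete frame (mu, iota, sigma, eta): eta is dropped entirely,
    only gas and pc survive from mu, only the code from iota, and only the
    name-to-code mapping from sigma."""
    env = {name: account["code"] for name, account in sigma.items()}
    return Ok(gas=mu["gas"], pc=mu["pc"], code=iota["code"], env=env)
```

For instance, abstracting a frame whose machine state also holds a stack and memory yields an `Ok` value that no longer mentions them.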