corresponding to these queries and the business for-
mulae (identified in Section 2.1) will be generated in
the programming language desired by the end-user.
In the initial prototype of our approach, we generate
the final code in Python using well-tested, openly
available techniques (Richter, 2022; C.W., 2022).
These techniques automatically identify the participating
formulae in the input CS sheet and convert
them into Python. Using these techniques, the final
code for the input CS sheet in Figure 1, as per the
CRG in Figure 3, will be as shown below:
import os, sys

# Function synthesized for Total premium in current year
def getTotalPremium(policyID):
    # `database` is assumed to be an open DB-API connection created elsewhere
    cursor = database.cursor()
    cursor.execute("SELECT SUM(PREM_AMT) "
                   "FROM POLICY_PREM_RECEIPT WHERE ...")
    # SUM(...) yields one row with one column; return that scalar so it
    # can be used directly in the arithmetic of getAdminFee
    result = cursor.fetchone()
    return result[0]

# Functions for V, W, B, T, R will be synthesized similar to getTotalPremium

# Function synthesized for Admin Fee
def getAdminFee(policyID):
    V = getPolicyValue(policyID)
    P = getTotalPremium(policyID)
    W = getTotalWithdrawals(policyID)
    AFThresh = max(V, (P - W))
    B = getAdminFeeBase(policyID)
    T = getWaiverThreshold(policyID)
    R = getMaxRatePolicyValue(policyID)
    AF = B if AFThresh < T else V * R
    return AF
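As a quick check of the Admin Fee formula in the listing above, the following minimal sketch evaluates the same expression as a pure function, with the database look-ups replaced by plain arguments. The function name admin_fee and all numeric values are hypothetical and chosen only to exercise both branches of the conditional; they are not taken from the CS sheet in Figure 1.

```python
def admin_fee(V, P, W, B, T, R):
    """Evaluate the Admin Fee formula from the listing on plain numbers.

    V: policy value, P: total premium, W: total withdrawals,
    B: admin fee base, T: waiver threshold, R: max rate on policy value.
    """
    af_thresh = max(V, P - W)
    return B if af_thresh < T else V * R


# Hypothetical inputs, for illustration only.
# Case 1: max(1000, 1200 - 500) = 1000 is below T, so the base fee B applies.
low = admin_fee(V=1000.0, P=1200.0, W=500.0, B=25.0, T=5000.0, R=0.5)
# Case 2: max(8000, 1200 - 500) = 8000 is not below T, so the fee is V * R.
high = admin_fee(V=8000.0, P=1200.0, W=500.0, B=25.0, T=5000.0, R=0.5)
print(low, high)
```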
The constructed CRG (Figure 3) plays a vital role
in the generation of the final code. The example CS sheet
(Figure 1) contains two business formulae, in cells
B11 and B14, but as is evident in the CRG, the formula in
cell B14 needs to be executed before the formula in
cell B11. This sequential dependency between business
formulae is captured by the directed edge from cell
B14 to B11 in the CRG, and hence the code for the
formula in cell B14 appears before that for cell B11.
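One common way to derive such an emission order from a dependency graph is a topological sort. The sketch below is only an illustration of that idea using Python's standard graphlib module (Python 3.9+); the text does not state which ordering algorithm the prototype uses. The dictionary crg_dependencies and the helper emission_order are hypothetical names, and only the B14 to B11 edge is taken from the description above.

```python
from graphlib import TopologicalSorter

# Hypothetical CRG fragment: each key maps to the set of cells it depends on.
# Only the dependency of B11 on B14 comes from the text.
crg_dependencies = {
    "B11": {"B14"},
    "B14": set(),
}


def emission_order(dependencies):
    """Return cells ordered so that every cell follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = emission_order(crg_dependencies)
print(order)
```

With this ordering, the code for the formula in B14 would be emitted first, so its result is available when the code for B11 runs.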
3 RELATED WORK
The key challenge for program synthesis is the di-
versity of user intent and the inability to express the
intent precisely. Natural language descriptions (Li
and Jagadish, 2014; Desai et al., 2016) are the most
common way of stating user intent, but they are ambiguous
and can result in incorrect programs. Specifying
user intent through input-output examples (Gulwani,
2016) works for a certain class of string-manipulation
programs, but it has not been used effectively in the query
synthesis paradigm. State-of-the-art query synthesis
techniques either require the user to provide partial
queries (Bastani et al., 2019) or create large input-
output tables (Wang et al., 2017; Takenouchi et al.,
2020). Creating such complicated artifacts is as complex
and effort-intensive as writing the query manually.
To address these drawbacks, we have proposed
a novel, concise, and precise way of specifying user
intent in the form of Calculation Specification (CS)
sheets. A CS sheet represents the mathematical and
logical relationship between the inputs and outputs
of a program. Furthermore, end-users, who are
not programming experts, may find it more approachable
and natural to provide valid value(s) for the variables
participating in the calculations than to create
complicated input-output tables and partial
queries.
Another challenge for program synthesis is enu-
merating and searching programs in a large and
intractable program space. In the query synthesis
paradigm, existing techniques (Desai et al., 2016;
Yaghmazadeh et al., 2017) have not overcome this
challenge of program space explosion. A few techniques
(Li and Jagadish, 2014; Wang et al., 2017;
Takenouchi et al., 2020) do reduce the search
space but require the end-user to provide additional
hints, such as the selection of aggregate functions and
constants, which cannot be provided by end-users
who typically do not have technical knowledge
of the query language and its syntax. In contrast, in our
proposed approach we plan to reduce the search space
by filtering the enumerated queries based on the expected
query output and other heuristics described in detail
in Section 2.3. Hence, the end-user does not need to
give technical guidance, such as the selection of aggregate
functions, constants, or conditions, to reduce
the search space for queries.
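The idea of filtering enumerated queries by their expected output can be illustrated with a minimal sketch. It is not the procedure of Section 2.3, which is not reproduced here; it only shows the general principle under stated assumptions. The table and column names POLICY_PREM_RECEIPT and PREM_AMT come from the listing in Section 2, while the POLICY_ID column, the sample rows, the candidate set, and the helper filter_by_expected_output are hypothetical.

```python
import sqlite3


def filter_by_expected_output(conn, candidates, params, expected, tol=1e-9):
    """Keep candidate queries whose single scalar result matches `expected`."""
    kept = []
    for sql in candidates:
        row = conn.execute(sql, params).fetchone()
        value = row[0] if row else None
        if value is not None and abs(value - expected) <= tol:
            kept.append(sql)
    return kept


# Hypothetical in-memory data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE POLICY_PREM_RECEIPT (POLICY_ID TEXT, PREM_AMT REAL)")
conn.executemany(
    "INSERT INTO POLICY_PREM_RECEIPT VALUES (?, ?)",
    [("P1", 100.0), ("P1", 250.0), ("P1", 150.0), ("P2", 900.0)],
)

# Enumerate one candidate query per aggregate function.
aggregates = ["SUM", "AVG", "MAX", "MIN", "COUNT"]
candidates = [
    f"SELECT {agg}(PREM_AMT) FROM POLICY_PREM_RECEIPT WHERE POLICY_ID = ?"
    for agg in aggregates
]

# The end-user supplies only the expected value (here 500.0 for policy P1),
# not the choice of aggregate function.
matches = filter_by_expected_output(conn, candidates, ("P1",), expected=500.0)
print(matches)
```

In this toy setting only the SUM candidate reproduces the expected value, so the aggregate function is inferred from the example value rather than supplied as a hint.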
4 CONCLUSION AND FUTURE WORK
In this paper, we have presented a novel idea to synthesize
code corresponding to CS sheets. As described
in the paper, business rules can be easily expressed
in CS sheets through a combination of text and formulae.
This makes the CS sheet a precise and machine-interpretable
way to specify calculations. We have
also proposed a novel query synthesis approach that
first formulates a custom PBE spec for each input
in the CS sheet and then uses text and value inference to
synthesize a database query corresponding to each PBE
spec. Lastly, the synthesized database queries and the
business formulae specified in the CS sheet are trans-