from fiscal audits of the municipality of São Paulo,
related to service taxes. The city of São Paulo plays an
outstanding role in the Brazilian economy.
In 2016, São Paulo contributed 33.71% of the Brazilian
Gross Domestic Product (GDP) (São Paulo State
Government, 2019).
In 2019, São Paulo’s revenue from municipal taxes
represented 20% of the revenue collected by all
Brazilian municipalities (São Paulo Commercial
Association, 2019). The predominant share of tax
revenue in the total income of the municipality
corroborates its importance (São Paulo City Hall, 2019).
In 2018, tax revenue amounted to 56.72% of the total
revenue, making it the main source of the municipality's
income. Among all municipal taxes, service taxes were
the most relevant, corresponding to 49.97% of the
revenue, followed by property taxes, which contributed
33.45%.
One of the main actions that can help to increase
São Paulo’s tax revenue is the implementation of tax
audit plans oriented to tax compliance. We have
practical results in the Brazilian municipality of
São Paulo demonstrating that audits originating from
tax compliance actions, which aim to guide taxpayers on
how to comply with tax laws, increased our service tax
revenue by 15%. The main cause of this increase is the
rise in risk perception, since taxpayers realize they
are under surveillance. Sometimes merely sending
messages to taxpayers telling them that they will be
subject to scrutiny can increase compliance (Alm, 2019).
We believe that Machine Learning can help our
compliance-oriented audit plans to be better targeted,
due to the predictive power of its techniques and
algorithms. Thus, applying Machine Learning to audit
plans can lead to higher tax revenue. Predicting tax
crimes can allow our local government to plan fiscal
audits precisely, before crimes are committed,
compelling taxpayers to comply with tax laws and
regulations.
Some governments apply Machine Learning in
crime prediction. Police in Venice (Bernasconi, 2018)
and in Chicago (Fingas, 2017) use Machine Learning to
predict crimes such as robberies, shootings and
murders. The Internal Revenue Service (IRS) of the
United States of America (Olavsrud, 2019) applies
Machine Learning to detect identity theft and
pre-refund fraud in the tax system. In comparison, our
work aims to predict different types of crimes that are
specific to the service tax system of the municipality
of São Paulo, such as refusal to provide documents to
fiscal authorities. Other governments use Machine
Learning for tax fraud prediction. The governments of
Chile (González and Velásquez, 2012) and Spain
(López et al., 2019) have case studies based on Neural
Networks. In our work, we apply and compare a broader
set of Machine Learning algorithms, such as Random
Forests, Logistic Regression and Ensemble Learning.
In this work, we apply Machine Learning techniques
and algorithms with the goal of predicting service tax
crimes against the tax system of the municipality of
São Paulo. As input, we use data from our fiscal
audits. In general terms, our methods encompass the
following steps: feature selection; data extraction
from our fiscal audits database; data partitioning;
model training and testing; model evaluation; and model
validation. The results of our case study highlight
the tax crime prediction performance of Random Forests
and also their capability of adapting to new data. We
are not aware of any other work that uses Random
Forests to predict crimes against São Paulo’s service
tax system.
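As a rough illustration of the partitioning, training, testing and evaluation steps listed above, the following minimal sketch uses only the Python standard library. It is not our actual implementation: the data are synthetic, the two numeric features and all function names are hypothetical, and a small hand-written Logistic Regression (one of the algorithms we compare) stands in for the models trained in this work. Feature selection, database extraction and model validation are not shown.

```python
import math
import random


def make_synthetic_audits(n, seed=0):
    """Hypothetical data: two numeric features per audit and a 0/1 crime label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        # Assumption for illustration: audits with a crime have higher feature values.
        x1 = rng.gauss(2.0 if label else 0.0, 1.0)
        x2 = rng.gauss(1.5 if label else 0.0, 1.0)
        rows.append(([x1, x2], label))
    return rows


def split(rows, test_fraction=0.3, seed=0):
    """Data partitioning: shuffle, then hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def score(model, x):
    """Predicted probability that an audit finds a crime."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logistic(rows, epochs=300, lr=0.5):
    """Model training: batch gradient descent on the log-loss."""
    w, b = [0.0] * len(rows[0][0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in rows:
            err = score((w, b), x) - y
            grad_w = [g + err * xj for g, xj in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


def evaluate(model, rows, threshold=0.5):
    """Model evaluation: accuracy, precision and recall on held-out rows."""
    tp = fp = fn = tn = 0
    for x, y in rows:
        pred = 1 if score(model, x) >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / len(rows),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


train_rows, test_rows = split(make_synthetic_audits(500))
model = train_logistic(train_rows)
metrics = evaluate(model, test_rows)
print(metrics)
```

Holding out the test rows before training keeps the reported metrics from being inflated by data the model has already seen. In practice, a library implementation of each compared algorithm would replace the hand-written trainer.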
This paper is organized as follows: Section 2
reviews the related works; Section 3 provides
theoretical background on fiscal authorities, tax
audits, crimes against the tax system and Machine
Learning; Section 4 explains our methods; Section 5
presents and discusses the results of our case study;
and Section 6 concludes the paper and suggests future
work.
2 RELATED WORKS
Some related works on the use of Machine Learning
in crime prediction and tax fraud detection deserve
mention. One example is the use of Machine Learning by
the Italian Police with the goal of predicting crimes
(Bernasconi, 2018). In this case, Machine Learning
extracts patterns from data about the time and location
of previous crimes. It triggers alerts indicating where
and when a crime is highly likely to occur. This led to
more precise crime prediction, resulting in the arrest
of a man at a hotel bar in Venice just before he was
about to commit a robbery.
Another case study that applied Machine Learning
to predict crimes comes from the Chicago Police
(Fingas, 2017). The solution analyses crime statistics,
social and economic data, weather and location records,
and data from gunshot sensors. Whenever Machine
Learning predicts a crime with high probability, the
solution sends an alert to the police officers’
smartphones. The Chicago Police reports a reduction in
the number of shootings and murders after the adoption
of Machine Learning.
Some governments have applied Machine Learn-
ing for tax fraud prediction. One example is the
IRS of the United States of America, which imple-