contributions to meet the needs identified from data
analyzed.
Qualitative features comparison. The proposed
solution provides storage services, such as databases
for the microservices, and uses S3 object storage to
store all user information. The results obtained can
be reproduced on demand and the original files can be
shared, unlike DESI (Rodriguez, 2015), which provides
only temporary storage of the processed records.
Additionally, the prototype integrates a feature for
checking data quality that performs unit tests on the
input data, unlike the KE Tool (Barlas & Heavey, 2016),
which performs unit tests only on the methods of the
classes that perform the processing.
Unlike KE Tool (Barlas & Heavey, 2016) and
DESI (Rodriguez, 2015), the prototype has a more
robust user interface with navigation, information,
and visualization elements that facilitate the
visualization of distributions and, particularly, of
the Markov chain analysis results. Decisions are
presented using the usual agile architecture diagrams,
such as context, deployment, components, classes,
and so on, unlike KE Tool (Barlas & Heavey, 2016),
which illustrates a single high-level diagram of the
system elements.
The way in which the components fulfill the
requirements is described below.
Manage Projects: The project microservice has
methods for creating projects. Once a project has
been created, the user can invite other users to view
the content of the dashboards through a private link,
which is generated in the project microservice. When
a user wants to invite another user to the application,
a record of the guest’s email address is saved in the
database, and an email invitation to the project is
then sent to the guest.
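The invitation flow described above can be illustrated with a minimal sketch. It is not the prototype's code: the `Project` class, the `invite_guest` function, the `send_email` callback, and the placeholder base URL are hypothetical names introduced here, and an in-memory list stands in for the database.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical in-memory stand-in for a project record."""
    name: str
    # Random, hard-to-guess token used to build the private dashboard link.
    link_token: str = field(default_factory=lambda: secrets.token_urlsafe(16))
    guest_emails: list = field(default_factory=list)


def invite_guest(project, email, send_email, base_url="https://example.invalid"):
    """Record the guest's email first, then send the invitation with the link."""
    email = email.strip().lower()
    if email not in project.guest_emails:
        project.guest_emails.append(email)  # stands in for the database write
    link = f"{base_url}/projects/{project.link_token}"
    send_email(email, f"You have been invited to '{project.name}': {link}")
    return link
```

Passing the mail sender as a callback keeps the sketch independent of any particular email service.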
Fit Distribution: In the probability distribution
fitting section of the dashboard, once the user
selects a particular activity, the microservice
identifies the best-fitting distribution.
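As an illustration of how a best-fitting distribution might be selected, the sketch below fits two candidates by maximum likelihood and picks the one with the lowest AIC. The selection criterion, the candidate set, and all function names are assumptions made for this example; the text does not state which criterion or distributions the microservice uses. The exponential candidate assumes strictly positive durations.

```python
import math
from statistics import fmean


def fit_exponential(data):
    """MLE for an exponential: rate = 1 / mean. Returns (params, log-likelihood)."""
    rate = 1.0 / fmean(data)
    loglik = sum(math.log(rate) - rate * x for x in data)
    return {"rate": rate}, loglik


def fit_normal(data):
    """MLE for a normal: sample mean and (biased) standard deviation."""
    mu = fmean(data)
    var = fmean([(x - mu) ** 2 for x in data])
    loglik = sum(
        -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        for x in data
    )
    return {"mu": mu, "sigma": math.sqrt(var)}, loglik


# Candidate name -> (fitting function, number of free parameters).
CANDIDATES = {"exponential": (fit_exponential, 1), "normal": (fit_normal, 2)}


def best_distribution(data):
    """Return (name, params, aic) of the candidate with the lowest AIC."""
    results = []
    for name, (fit, n_params) in CANDIDATES.items():
        params, loglik = fit(data)
        results.append((2 * n_params - 2 * loglik, name, params))
    aic, name, params = min(results, key=lambda r: r[0])
    return name, params, aic
```

Further candidates (for example lognormal or gamma) could be added to `CANDIDATES` without changing the selection logic.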
Check Data Quality: Rules were generated to
validate the data at a stage before processing. It is
worth mentioning that these rules do not prevent the
user from continuing with the statistics generation
process; they serve to alert the user so that errors in
the data do not compromise the estimation results.
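A minimal sketch of such non-blocking validation rules follows. The field names (`activity`, `start`, `end`) and the three rules are hypothetical examples, not the rules used in the prototype; the point is only that the checks return warnings instead of raising errors, so processing can continue.

```python
from datetime import datetime


def check_record(record):
    """Return a list of warning strings for one activity record (empty if clean)."""
    warnings = []
    start, end = record.get("start"), record.get("end")
    if not record.get("activity"):
        warnings.append("missing activity name")
    if start is None or end is None:
        warnings.append("missing timestamp")
    elif end < start:
        warnings.append("end precedes start")
    return warnings


def check_dataset(records):
    """Map row index -> warnings. Never raises, so the user may still proceed."""
    report = {}
    for index, record in enumerate(records):
        found = check_record(record)
        if found:
            report[index] = found
    return report


# Illustrative records only; the values are made up for the example.
SAMPLE = [
    {"activity": "triage", "start": datetime(2020, 1, 1, 8, 0),
     "end": datetime(2020, 1, 1, 8, 10)},
    {"activity": "", "start": datetime(2020, 1, 1, 9, 0),
     "end": datetime(2020, 1, 1, 8, 0)},
]
```

Calling `check_dataset(SAMPLE)` flags only the second record, and the caller decides whether to continue.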
User Interface: The processing microservice has
a dashboard developed in Dash and Plotly that has
three pages where all the statistics generated are
displayed. This microservice is a Python module in
which each page of the dashboard is an independent
module, which facilitates the maintenance and
editing of the visual components. The data is
presented in graphs and tables, making it easy for the
user to quickly learn about the data distributions.
Markov Chains Validation: We implemented
features for fitting Markov Chains and performing
hypothesis testing to verify Markov chain properties.
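To make the idea concrete, the sketch below estimates transition probabilities from an observed state sequence and computes a Pearson chi-square statistic for the null hypothesis that the next state is independent of the current one, a contingency-table test in the spirit of Anderson and Goodman (1957). Which tests the prototype actually runs is not stated here, so the choice of test and all function names are assumptions.

```python
from collections import Counter


def transition_counts(sequence):
    """Count observed one-step transitions (current state, next state)."""
    return Counter(zip(sequence, sequence[1:]))


def transition_matrix(sequence):
    """Maximum likelihood estimate: count(i -> j) / count(i -> any)."""
    counts = transition_counts(sequence)
    row_totals = Counter()
    for (i, _), n in counts.items():
        row_totals[i] += n
    return {(i, j): n / row_totals[i] for (i, j), n in counts.items()}


def independence_chi2(sequence):
    """Pearson chi-square for H0: next state independent of current state.

    Returns (statistic, degrees_of_freedom); compare the statistic with a
    chi-square critical value to decide whether to reject H0.
    """
    counts = transition_counts(sequence)
    rows, cols = Counter(), Counter()
    for (i, j), n in counts.items():
        rows[i] += n
        cols[j] += n
    total = sum(counts.values())
    stat = 0.0
    for i in rows:
        for j in cols:
            expected = rows[i] * cols[j] / total
            stat += (counts.get((i, j), 0) - expected) ** 2 / expected
    return stat, (len(rows) - 1) * (len(cols) - 1)
```

For a two-state sequence the test has one degree of freedom, so a statistic above 3.841 rejects independence at the 5% level.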
6 CONCLUSION
By reviewing the literature and examining the real
data, we defined the basic requirements that an IDM
solution for DES should fulfil: managing the input
data, verifying the quality of the data, and
processing and presenting process statistics in
dashboards. We also analyzed the probability
distributions to be implemented in such an application
by using a real case. The proposed solution therefore
introduces a cloud architecture, based on a
microservices pattern, that satisfies these
requirements and is designed to enable high
performance, availability, scalability, and security.
The novelty of this paper lies in the integration of
Markov chain modeling into IDM, the proposed cloud
architecture, the design, development and testing of
the software, and the implementation with real data in
the context of ED. The built application has elements
that had not been previously used in similar tools,
such as cloud computing services, containers, unit
testing on data and interactive visualization.
Additionally, the application implements
straightforward and intuitive navigation tools to
improve the user experience.
As future work, the results obtained in the
evaluation of the properties of Markov chains give
rise to the question of how to approach the
preparation of data for simulation models that
consider routing probabilities for the problem of
overcrowding in ED. Lastly, it would be desirable to
adjust the code so that the processing is generic for
data from similar sources where IDM is required, such
as manufacturing.
REFERENCES
Anderson, T. W., & Goodman, L. A. (1957). Statistical
inference about Markov chains. The Annals of
Mathematical Statistics, 28(1), 89–110.
Baum, L. E., & Petrie, T. (1966). Statistical inference for
probabilistic functions of finite state Markov
chains. The Annals of Mathematical Statistics, 37(6),
1554–1563.
Barlas, P., & Heavey, C. (2016). KE tool: an open source
software for automated input data in discrete event