8.4 (Pedrozo and Vaz, 2014), other index
recommendation studies have used MySQL 5.5 for
trial and implementation. NoSQL databases are no
exception: MongoDB has also become an object of
research on index recommendation using mining
algorithms (Ameri et al., 2015). However, open-source
databases such as MariaDB do not yet offer index
recommendation as a default feature, and little
research on index recommendation has used MariaDB
as a case study. One way to address the index
selection problem on MariaDB is to implement
heuristic principles in Java.
Java is an object-oriented programming language
(Hermawan, 2004). Java is also multiplatform: its
programs can run on various operating systems, so
the tool can be used regardless of the operating
system a user works on. Java also provides pattern
matching through regular expressions, which can be
used to filter the SQL statements to be optimized.
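As an illustration of the regular expression filtering mentioned above, the following is a minimal sketch and not the implementation used in this research. The class name SqlFilter, the method selectStatements, and the rule that only SELECT ... FROM statements are kept are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: keeps only statements that look like SELECT queries. */
class SqlFilter {
    // Assumption: only SELECT ... FROM statements are candidates for indexing.
    // (?is) makes the match case-insensitive and lets '.' span line breaks.
    private static final Pattern SELECT_STMT =
            Pattern.compile("(?is)^\\s*SELECT\\b.*\\bFROM\\b.*");

    /** Returns the trimmed statements that match the SELECT pattern. */
    static List<String> selectStatements(List<String> statements) {
        List<String> kept = new ArrayList<>();
        for (String sql : statements) {
            if (SELECT_STMT.matcher(sql).matches()) {
                kept.add(sql.trim());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = List.of(
                "SELECT name FROM customer WHERE city = 'Bandung'",
                "INSERT INTO customer (name) VALUES ('Ani')",
                "  select id from orders where total > 100");
        // Prints the two SELECT statements and skips the INSERT.
        selectStatements(input).forEach(System.out::println);
    }
}
```

A real workload would need further rules, for example for subqueries or for UPDATE and DELETE statements with WHERE clauses; the pattern here only shows the filtering idea.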
This research aims to build Java-based software for
index recommendation by implementing a heuristic
method with an index selection algorithm, as in
previous work (Chaudhuri et al., 2004). What differs
in this research is that the index selection process
takes the general log, or individual queries, as
input and outputs the column names recommended as
indexes, as the result of the selection process
carried out on the MariaDB database.
2 BACKGROUND
2.1 Index Selection
A database is a shared collection of logically
related data and its description, designed to meet
the information needs of an organization (Connolly
and Begg, 2005). Users control the database through
a Database Management System (DBMS) (Connolly and
Begg, 2005). According to Connolly and Begg,
database design has three phases: conceptual,
logical, and physical. The tool developed in this
research falls within the physical design phase, and
its output is intended to support the physical
database design process. According to Winand (2012),
database performance can be measured by assessing
several aspects: data volume, system load, and
response time and throughput. This study uses
response time and throughput, the latter expressed
as queries per second (QPS) calculated from the
average query execution over iterations of 60
seconds.
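The throughput measurement can be sketched as follows. This is only one possible reading of the description above: it assumes that the number of queries completed in each 60-second iteration is known and that the reported QPS is the arithmetic mean over the iterations. The class name Throughput and its methods are hypothetical.

```java
import java.util.List;

/** Hypothetical helper for the QPS metric described in the text. */
class Throughput {
    // Iteration length stated in the text.
    static final int ITERATION_SECONDS = 60;

    /** Queries per second for a single 60-second iteration. */
    static double qps(long queriesExecuted) {
        return (double) queriesExecuted / ITERATION_SECONDS;
    }

    /** Mean QPS over several iterations (assumption: plain arithmetic mean). */
    static double averageQps(List<Long> queriesPerIteration) {
        return queriesPerIteration.stream()
                .mapToDouble(Throughput::qps)
                .average()
                .orElse(0.0); // no iterations measured
    }

    public static void main(String[] args) {
        // Two iterations: 6000 queries (100 QPS) and 9000 queries (150 QPS).
        System.out.println(averageQps(List.of(6000L, 9000L))); // prints 125.0
    }
}
```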
An index is a data structure that organizes data
records in storage to optimize certain types of
retrieval operations (Raghu and Gehrke, 2004). The
right index can increase data processing speed
(Ameri et al., 2015). The algorithm in this research
is based on the AISIO heuristic principle (Pedrozo
and Vaz, 2014), in which index selection has four
main processes. Some details of that principle are
not entirely clear, such as how single and
multi-level index candidates are created, so other
references are used to fill them in, such as the
index selection process for SQL Server (Chaudhuri
and Narasayya, 1997), which builds candidates by
permuting column names. The resulting heuristics,
adapted to MariaDB and applied in this research,
are as follows:
1. Parsing and filtering queries. This step obtains
the initial index candidates, in the form of the
names of the columns contained in the input, as in
all previous research on index recommendation.
2. Retrieving previous indexes. This step is an
addition in this study, because previous studies
rarely considered the existing indexes, even
though an existing index could already be optimal.
3. Identifying index configurations. After the index
candidates are obtained as column names, the next
step is to build index configurations, that is, sets
of indexes based on combinations of column names,
so that multi-level indexes appear as in the main
heuristic. The configurations are built following
the research on SQL Server (Chaudhuri and
Narasayya, 1997), which found that the optimal
combination uses the value j = 2. Because column
order matters in a combination, the configurations
are generated by permutation.
4. Configuration cost evaluation. Cost evaluation
measures QPS or response time to find the
best-performing index configurations among all
configurations before enumeration, as in previous
studies (Chaudhuri and Narasayya, 1997).
5. Configuration enumeration. After the set of index
configurations with the best performance is
obtained, enumeration is carried out as in the
research for SQL Server (Chaudhuri and Narasayya,
1997). However, the best enumeration candidates are
selected not by best storage fit but by a selection
factor that picks the candidates appearing most
often in the workload, based on another study
(Ameri, 2016). In other references, this selection
factor is treated as a selectivity factor
calculated using the naïve probabil-