In 2015, the Ministry of Villages, Development of
Underdeveloped Areas and Transmigration, through
Presidential Regulation No. 12 of 2015, focused on
village development. This is worth noting because
targeted development is absolutely necessary.
In this study, the decision tree classification of
regions in Indonesia used secondary data obtained
from the Central Bureau of Statistics on the number
and percentage of poor people, expenditure per
capita, life expectancy, average length of schooling,
literacy rate, village facilities and infrastructure,
accessibility, and the percentage of disasters and
conflicts in villages. The study area was limited to
Java Island, because about 60% of Indonesia's
population lives in Java and 58% of Indonesia's
Gross Domestic Product is generated there. Java
Island is therefore assumed to be representative for
the classification of underdeveloped areas in
Indonesia.
Java is known to have six regions in the
underdeveloped category. This research analyses
these underdeveloped regions further to identify the
factors that cause them to be classified as
underdeveloped. The research is expected to help
the government determine whether a region is
underdeveloped using a method, derived from the
model result, that is simpler than the method the
government currently uses. It is also expected to
assist the government in planning follow-up
measures on regional underdevelopment so that
national development can be on target.
2 LITERATURE REVIEW
One common data mining method is the decision
tree, which transforms a large collection of facts
into a tree that represents rules. It is one of the
most popular classification methods because it is
easily interpreted by humans. The concept of a
decision tree is to transform data into a tree model
and a set of rules.
The data for a decision tree are usually expressed
in tabular form with attributes and records. An
attribute is a parameter used as a criterion in
forming the tree. For example, in deciding whether
to play tennis, the criteria considered are weather,
wind, and temperature. One of the attributes, called
the target attribute, states the solution for each
data item. Each attribute has values called
instances; for example, the weather attribute has
the instances bright, cloudy, and rainy.
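The tabular layout described above can be sketched
in Python as follows. This is only an illustration:
the four records, the instance values for wind and
temperature, and the target attribute name "play"
are hypothetical and are not taken from the data
used in this study.

```python
# Hypothetical records for the play-tennis example. The values are
# illustrative assumptions, not data from the study.
records = [
    {"weather": "bright", "wind": "weak", "temperature": "hot", "play": "yes"},
    {"weather": "bright", "wind": "strong", "temperature": "mild", "play": "no"},
    {"weather": "cloudy", "wind": "weak", "temperature": "mild", "play": "yes"},
    {"weather": "rainy", "wind": "strong", "temperature": "cool", "play": "no"},
]

# The target attribute holds the solution (class) of each record.
TARGET = "play"

# Every other column is an attribute that can serve as a split criterion.
attributes = [name for name in records[0] if name != TARGET]

# The instances of an attribute are the distinct values it takes.
instances = {a: sorted({r[a] for r in records}) for a in attributes}

for name, values in instances.items():
    print(name, values)
```

Each dictionary is one record, each key is an
attribute, and the set of distinct values under a
key gives the instances of that attribute.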
A decision tree is a set of IF-THEN rules. Each
path in the tree is associated with a rule, in which
the premise consists of the set of nodes encountered
along the path, and the conclusion consists of the
class attached to the leaf of the path. Figure 1
shows the decision tree structure.
Figure 1: Decision tree structure.
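The correspondence between paths and IF-THEN rules
can be illustrated with a minimal Python sketch. The
tree below is a hypothetical play-tennis tree chosen
for illustration, not the tree in Figure 1, and the
nested-dictionary layout and the function name
extract_rules are assumptions of this sketch.

```python
# Hypothetical tree: an internal node is a dict with a test attribute and
# one branch per instance; a leaf is simply the class label as a string.
tree = {
    "attribute": "weather",
    "branches": {
        "bright": {
            "attribute": "wind",
            "branches": {"weak": "play", "strong": "do not play"},
        },
        "cloudy": "play",
        "rainy": "do not play",
    },
}


def extract_rules(node, premise=()):
    """Return one IF-THEN rule per root-to-leaf path."""
    if isinstance(node, str):
        # Leaf reached: the tests collected on the way form the premise.
        conditions = " AND ".join(f"{a} = {v}" for a, v in premise) or "TRUE"
        return [f"IF {conditions} THEN {node}"]
    rules = []
    for value, child in node["branches"].items():
        test = (node["attribute"], value)
        rules.extend(extract_rules(child, premise + (test,)))
    return rules


rules = extract_rules(tree)
for rule in rules:
    print(rule)
```

The number of rules equals the number of leaves,
since every leaf closes exactly one path.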
The first part of the decision tree is the root
node, each branch of the tree is a division based on
a test result, and each end point (leaf) is a
resulting class. A decision tree has three types of
nodes, as follows:
1. The root node, which has no incoming branch and
has more than one outgoing branch, or sometimes
no outgoing branch at all. This node is usually the
attribute that has the greatest influence on a
particular class.
2. The internal node, which has exactly one incoming
branch and more than one outgoing branch.
3. The leaf node, which is an end node that has
exactly one incoming branch and no outgoing
branch, and which carries a class label.
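The three node types can be expressed as a small
data structure. The following Python sketch is one
possible representation under stated assumptions:
the class Node, the helpers node_kind and
count_kinds, and the play-tennis tree are all
hypothetical and serve only to illustrate the
definitions above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    attribute: Optional[str] = None  # test attribute (root and internal nodes)
    label: Optional[str] = None      # class label (leaf nodes only)
    children: Dict[str, "Node"] = field(default_factory=dict)


def node_kind(node: Node, has_parent: bool) -> str:
    """Classify a node by its incoming and outgoing branches."""
    if not has_parent:
        return "root"  # no incoming branch
    return "internal" if node.children else "leaf"


def count_kinds(node: Node, has_parent: bool = False, counts=None) -> Dict[str, int]:
    """Walk the tree top-down and count how many nodes of each kind it has."""
    if counts is None:
        counts = {"root": 0, "internal": 0, "leaf": 0}
    counts[node_kind(node, has_parent)] += 1
    for child in node.children.values():
        count_kinds(child, True, counts)
    return counts


# Hypothetical tree used only for illustration.
tree = Node(attribute="weather", children={
    "bright": Node(attribute="wind", children={
        "weak": Node(label="play"),
        "strong": Node(label="do not play"),
    }),
    "cloudy": Node(label="play"),
    "rainy": Node(label="do not play"),
})

kind_counts = count_kinds(tree)
print(kind_counts)
```

In this representation a node is a leaf exactly when
it has a parent and no children, which matches the
third definition in the list.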
The initial stage is the test at the root node.
Depending on the outcome of this test, testing
continues on the corresponding branch. The same
applies at each internal node, where a new test
condition is applied, until a leaf node is reached.
In general, a decision tree system adopts a top-down
search strategy over its solution space. To classify
an unknown sample, its attribute values are tested
against the decision tree by tracing a path from the
root node to a leaf node, and the class at that leaf
is then predicted for the new sample.
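The top-down classification of an unknown sample can
be sketched in a few lines of Python. The tree, the
sample values, and the function name classify are
hypothetical and are used only to illustrate the
path-tracing procedure described above.

```python
# Hypothetical tree: an internal node is a dict with a test attribute and
# one branch per instance; a leaf is the class label as a string.
tree = {
    "attribute": "weather",
    "branches": {
        "bright": {
            "attribute": "wind",
            "branches": {"weak": "play", "strong": "do not play"},
        },
        "cloudy": "play",
        "rainy": "do not play",
    },
}


def classify(node, sample):
    """Trace the path from the root to a leaf and return the leaf's class."""
    while isinstance(node, dict):
        value = sample[node["attribute"]]  # apply the test at this node
        node = node["branches"][value]     # follow the matching branch
    return node


new_sample = {"weather": "bright", "wind": "strong"}
predicted = classify(tree, new_sample)
print(predicted)
```

This sketch assumes that every attribute value of
the sample appears as a branch in the tree; an
unseen value would raise a KeyError and would need
separate handling.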
The decision tree is widely used in data mining
because it has several advantages, as follows:
1. It is inexpensive to build.
2. It is easy to interpret.
3. It accommodates missing data.
4. It integrates easily with database systems.
ICPS 2018 - 2nd International Conference Postgraduate School