stop it from branching. The tree can be evaluated by
cross-validation or by manually setting aside part of
the data for validation.
3 MINING SOLUTIONS FOR
RADAR DETECTION ERROR
In this paper, SQL Server 2005 is selected as the data
mining platform, and the decision tree algorithm is
used to mine the radar detection error. SQL Server
2005 provides data mining functions including SQL
Server Integration Services (SSIS) and SQL Server
Analysis Services (SSAS) (Deli Zhu, 2007).
Integration Services is used for data pre-processing,
while Analysis Services provides multiple data
mining algorithms.
3.1 Data Pre-Processing
Data pre-processing is an important step in data
mining. The original data supplied for mining usually
lacks consistency and contains many redundant and
null values. Pre-processing therefore cleans the
original data, including the noise in it. It mainly
consists of the following procedures.
1) Data conversion, integration and matching.
Before mining, the difference between the radar's
measured data and the real tracking data of the
target must be obtained. Because the measured data
and the real tracking data are stored in different
files and use different step lengths, the tracks must
be matched and the step lengths reconciled by
three-point interpolation.
2) Data consistency processing. The data must be
clean and consistent in order to improve the
accuracy of data mining. In this paper, the 3σ rule is
used to identify abnormal errors.
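The two procedures above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the SSIS workflow used in the paper: it assumes that both data sources carry timestamps, that "three-point interpolation" means quadratic Lagrange interpolation through the three nearest truth samples, and that an error is defined as the measured value minus the true value. All function names are hypothetical.

```python
from statistics import mean, stdev


def three_point_interp(ts, ys, t):
    """Quadratic Lagrange interpolation through the three samples around t."""
    if len(ts) < 3:
        raise ValueError("need at least three samples")
    # Index of the sample closest to t, clamped so that i-1 and i+1 exist.
    i = min(range(len(ts)), key=lambda k: abs(ts[k] - t))
    i = max(1, min(len(ts) - 2, i))
    t0, t1, t2 = ts[i - 1], ts[i], ts[i + 1]
    y0, y1, y2 = ys[i - 1], ys[i], ys[i + 1]
    l0 = (t - t1) * (t - t2) / ((t0 - t1) * (t0 - t2))
    l1 = (t - t0) * (t - t2) / ((t1 - t0) * (t1 - t2))
    l2 = (t - t0) * (t - t1) / ((t2 - t0) * (t2 - t1))
    return y0 * l0 + y1 * l1 + y2 * l2


def measurement_errors(meas_t, meas_v, truth_t, truth_v):
    """Assumed convention: error = measured value - truth at the same time."""
    return [v - three_point_interp(truth_t, truth_v, t)
            for t, v in zip(meas_t, meas_v)]


def three_sigma_filter(errors):
    """Keep only the errors within three standard deviations of the mean."""
    mu, sigma = mean(errors), stdev(errors)
    return [e for e in errors if abs(e - mu) <= 3 * sigma]
```

Note that the 3σ rule only becomes meaningful with a reasonably large sample, since a single outlier in a very small sample inflates the standard deviation enough to hide itself.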
3.2 Error Data Mining
Based on the above analysis, in the radar
detection error mining, the true distance and true
position are used as the input attributes, while the
distance error and position error are used as the
forecast attributes. The C4.5-based decision tree is
used to build an error model, and the reserved testing
(holdout) method is used to evaluate the accuracy of
the decision tree. This method divides the entire
sampled data into a training data set and a testing
data set that do not intersect. After
pre-processing, the training data set contains 20
tracks and 5052 samples, and the testing data
set contains 1 track and 1092 samples. The
structure of the sampled data is shown in Table 1, in
which Det_D and Det_B represent the distance and
position of the target measured by radar respectively; D
and B represent the true distance and true position
of the target; and ΔD and ΔB represent the distance error
and position error respectively. Distance is given in
meters and position in degrees.
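As a rough illustration of the sample structure and of the reserved testing split, the sketch below holds out whole tracks so that the training and testing sets cannot intersect. The `track` field, the class and function names, and the sign convention (error = measured minus true) are assumptions made for the example; the paper does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    track: int    # hypothetical track identifier
    det_d: float  # Det_D: measured distance, meters
    det_b: float  # Det_B: measured position, degrees
    d: float      # D: true distance, meters
    b: float      # B: true position, degrees

    @property
    def delta_d(self) -> float:
        # Assumed convention: measured minus true.
        return self.det_d - self.d

    @property
    def delta_b(self) -> float:
        return self.det_b - self.b


def holdout_split(samples, test_tracks):
    """Split by track so that no track appears in both sets."""
    test_tracks = set(test_tracks)
    train = [s for s in samples if s.track not in test_tracks]
    test = [s for s in samples if s.track in test_tracks]
    return train, test
```

Splitting by track rather than by individual sample mirrors the paper's setup of 20 training tracks and 1 testing track.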
The modeling is done in SQL Server 2005. The
behavior of the decision tree algorithm is determined
by the parameters selected for the mining model. In this
paper, the entropy-based method is used to
calculate the split scores, and binary splitting is
designated as the method for splitting nodes. After
processing, the mining model generates one
decision tree for distance error and one for position
error.
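To make the idea of an entropy-scored binary split concrete, the following sketch searches one continuous attribute for the threshold with the highest information gain. It is a generic textbook illustration, not a description of how SQL Server computes its scores internally: it assumes the target has already been discretized into class labels, and all names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_binary_split(xs, labels):
    """Return (threshold, information gain) of the best two-way split."""
    pairs = sorted(zip(xs, labels))
    n = len(pairs)
    base = entropy(labels)
    best_thr, best_gain = None, 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal attribute values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        gain = base - (i / n) * entropy(left) - ((n - i) / n) * entropy(right)
        if gain > best_gain:
            best_thr = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_gain = gain
    return best_thr, best_gain
```

A binary split always produces exactly two children per node, which matches the two-way conditions (for example `<` and `>=` on the same threshold) seen in the resulting tree.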
Pruning trims the decision tree according
to the minimum number of samples that each leaf
node must contain, and the trimmed tree is evaluated
with the testing data set. With a
minimum of 140 samples per leaf
node, the generated decision tree for
distance error shows good forecasting
performance. With a minimum of 280
samples per leaf node,
the generated decision tree for position error shows
good forecasting performance.
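The effect of the minimum leaf size can be illustrated with a small regression tree written from scratch. This is a simplified sketch under several assumptions: splits are chosen by squared error rather than by the entropy score used in the paper, each leaf stores a constant mean rather than a linear formula, and the quadratic-time split search is only suitable for small data sets. All names are hypothetical.

```python
from statistics import mean


def sse(ys):
    """Sum of squared deviations from the mean."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def build_tree(rows, min_leaf):
    """rows: list of (features, target); every leaf keeps >= min_leaf rows."""
    best = None  # (score, feature index, threshold)
    for f in range(len(rows[0][0])):
        ordered = sorted(rows, key=lambda r: r[0][f])
        for i in range(min_leaf, len(rows) - min_leaf + 1):
            lo, hi = ordered[i - 1][0][f], ordered[i][0][f]
            if lo == hi:
                continue
            score = (sse([y for _, y in ordered[:i]])
                     + sse([y for _, y in ordered[i:]]))
            if best is None or score < best[0]:
                best = (score, f, hi)  # left: value < hi, right: value >= hi
    if best is None:
        return {"leaf": mean(y for _, y in rows), "n": len(rows)}
    _, f, thr = best
    left = [r for r in rows if r[0][f] < thr]
    right = [r for r in rows if r[0][f] >= thr]
    return {"feature": f, "thr": thr,
            "left": build_tree(left, min_leaf),
            "right": build_tree(right, min_leaf)}


def predict(tree, x):
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["feature"]] < tree["thr"] else tree["right"]
    return tree["leaf"]


def rmse(tree, rows):
    return (sum((predict(tree, x) - y) ** 2 for x, y in rows) / len(rows)) ** 0.5


def pick_min_leaf(train, test, candidates):
    """Choose the leaf size whose tree has the lowest error on the test set."""
    return min(candidates, key=lambda m: rmse(build_tree(train, m), test))
```

Calling `pick_min_leaf(train, test, [70, 140, 280])` on lists of `((D, B), error)` pairs would compare candidate leaf sizes in the same spirit as the evaluation described above; the candidate values here are only examples.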
The following is part of the decision tree model
for distance error:
B>=160.639
-- B<162.453
-- -- D<93611.621:
ΔD=102.212-5.585*(B-163.452) (439)
-- -- D>=93611.621:
ΔD=18.956+5.476*(B-166.170)-(D-98656.788) (612)
-- B>=162.453
A rule can be read off from the above: if
B>=160.639, B<162.453 and D<93611.621, the
distance error model is
ΔD=102.212-5.585*(B-163.452), and the node
contains 439 samples. Other rules can be derived similarly.
Together, these rules cover the whole training
data set, and their aggregate
constitutes the radar detection error models for
distance and position.
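The rule quoted above can be written directly as a function. The sketch below encodes only that single rule and returns `None` for inputs outside its conditions, since the remaining branches of the tree are not fully listed; the function name is hypothetical.

```python
def distance_error_rule(b, d):
    """Distance error (meters) for the one rule quoted in the text.

    b: true position in degrees, d: true distance in meters.
    Returns None when the inputs fall outside this rule's conditions.
    """
    if 160.639 <= b < 162.453 and d < 93611.621:
        return 102.212 - 5.585 * (b - 163.452)
    return None
```

A complete error model would chain one such branch per leaf of the tree, for both the distance and the position trees.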
ISME 2016 - Information Science and Management Engineering IV