Table 2: Prediction performance of formulas in the same category.

                     R(VM)      P(VM)      CPU(S)     P(S)       P(S) revision
Relative error       3.45E-03   1.97E-03   4.82E-02   4.86E-02   1.19E-04
Average R^2          0.889      0.917      0.220      0.208      1.000
Hit rate             95.96%     98.57%     82.01%     81.85%     100.00%
Reduction volume     1241.46    1304.32    900.37     893.20     1339.25
Execution time (s)   0.144      0.070      0.680      0.690      0.029
those R^2 values on the data stream will be used as a metric to evaluate how well the predictions fit the raw data:
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - f_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}    (11)
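For concreteness, the following minimal Python sketch computes Eq. (11); the names y (raw values) and f (predicted values) are illustrative.

```python
def r_squared(y, f):
    """Coefficient of determination R^2, as in Eq. (11)."""
    n = len(y)
    y_mean = sum(y) / n
    ss_res = sum((y[i] - f[i]) ** 2 for i in range(n))    # residual sum of squares
    ss_tot = sum((y[i] - y_mean) ** 2 for i in range(n))  # total sum of squares
    return 1.0 - ss_res / ss_tot
```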
5.3 Formula Revisions
In the experiment, we monitor the reduction process and measure the performance of each formula using the aforementioned criteria. One interesting finding is that prediction ability varies significantly between formulas. Some formulas make very accurate predictions in a short execution time, while others fail frequently and cost more time to train new formulas. The reason for this discrepancy lies in the correlation model. In the reduction process, if a prediction formula does not meet the accuracy requirement, the predictor needs to learn a new regression formula to replace it, so that the formula is always up to date. However, some dependent indicators may be hard to predict from the selected regressor indicators if their correlations are not high enough. The framework then spends considerable time updating those formulas frequently, even though the overall performance improves very little.
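The following is a minimal sketch of this monitor-and-retrain step; the linear formula shape, the history window, and the error threshold are assumptions for illustration, not the paper's exact method.

```python
import numpy as np

ERROR_THRESHOLD = 1e-2  # assumed accuracy requirement

def fit_formula(x_hist, y_hist):
    """Learn a replacement regression formula y ~ a*x + b from recent history."""
    a, b = np.polyfit(x_hist, y_hist, deg=1)
    return lambda x: a * x + b

def reduce_point(formula, x, y, x_hist, y_hist):
    """Predict one dependent-indicator value; retrain the formula on a miss."""
    rel_error = abs(formula(x) - y) / max(abs(y), 1e-12)
    if rel_error <= ERROR_THRESHOLD:
        return formula, True                   # hit: raw value need not be kept
    return fit_formula(x_hist, y_hist), False  # miss: replace with a new formula
```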
Therefore, to solve this problem, the predictor needs to revise those inefficient formulas in the next reduction loop. We denote the dependent indicators of those foot-dragging formulas as slowDS, and we need to expand RS to include a subset of slowDS to enhance prediction ability, since the results have shown that the current RS cannot make accurate predictions on those indicators. Among all possible solutions, adding the complete set of slowDS to RS would solve the issue all at once, but it would obviously achieve minimal data reduction. Instead, this work reconsiders the correlations between the indicators of slowDS in the correlation model: it obtains the disconnected subgraphs of the DAG that contain only slowDS nodes, and adds the root nodes of those subgraphs to RS. For instance, if any indicators of slowDS are correlated, they must lie in the same subgraph, so the corresponding root node serves as the regressor indicator for the other nodes in the next loop; otherwise, all slowDS indicators serve as regressor indicators. By this gradual means of expanding RS, an appropriate RS/DS split can be identified over iterations of the reduction loop.
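A minimal sketch of this expansion rule, assuming the correlation model is available as an adjacency map from each node to its children; all names here are illustrative.

```python
def expand_rs(rs, slow_ds, dag):
    """Add the root nodes of the slowDS-induced subgraphs to RS."""
    # Restrict the correlation DAG to edges whose endpoints are both in slowDS.
    induced = {u: [v for v in dag.get(u, []) if v in slow_ds] for u in slow_ds}
    has_parent = {v for children in induced.values() for v in children}
    # Roots are slowDS nodes with no parent inside the induced subgraph.
    roots = {u for u in slow_ds if u not in has_parent}
    return rs | roots
```

If no two slowDS indicators are correlated, every node is a root and all of slowDS moves to RS, matching the fallback described above.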
In this experiment, the CMBDR framework selects the root nodes of the correlation model as RS in the first loop. The individual performance of indicators in the first loop is evaluated in the first four columns of Table 2, each column representing the average performance of the indicators in the same category. Under the initial RS/DS configuration, the indicators of R(VM_ij) and P(VM_ij) evidently outperform CPU(S_i) and P(S_i), with higher reduction volume, better prediction accuracy, and much less execution time. To acquire better performance in the second loop, we need to update RS/DS. By querying the correlation model, we find that CPU(S_i) and P(S_i) are highly correlated. We therefore move the single indicator CPU(S_i) from DS to RS and call Algorithm 1 to update the regressor set CRS for P(S_i), so that CPU(S_i) is used to predict P(S_i) in the second loop. The resulting performance, shown in the fifth column of Table 2, improves dramatically in both accuracy and execution speed, and the relative error reaches the same level as P(VM_ij).
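A small sketch of this update step; update_crs below merely stands in for Algorithm 1, whose actual interface is not reproduced here, and `parents` maps each indicator to its parent nodes in the correlation DAG.

```python
def update_crs(dep, rs, parents):
    """Stand-in for Algorithm 1: choose regressors for `dep` from the updated RS."""
    return [p for p in parents.get(dep, []) if p in rs]

def move_to_rs(indicator, rs, ds, parents):
    """Move one indicator from DS to RS and refresh the CRS of each dependent."""
    ds.discard(indicator)
    rs.add(indicator)
    return {dep: update_crs(dep, rs, parents) for dep in ds}
```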
We also measure the overall reduction performance of the monitoring data in the reduction process, to verify the validity of the predictor and to assess the improvement offered by the RS/DS update. As Figure 6 depicts, in both the first and second loops the predictor reduces the raw monitoring data to slightly above one third of the original volume, and the reduction of the two loops is nearly the same even though the predictor involves more regressor indicators in the second loop. However, these new regressor indicators dramatically improve the processing speed and accuracy of the predictor. As Figures 7 and 8 show, the second loop doubles the processing speed of the first loop and increases the average prediction accuracy by almost an order of magnitude. The second-loop results in Figures 7 and 8 also illustrate that, for a data center with 2000 indicators, this predictor can reduce a full day's monitoring data within 100 seconds while keeping the average relative error on the order of 10^{-3}.
Above all, this predictor can cut down the volume of data collection in monitoring systems while preserving prediction accuracy.