MFE = \frac{1}{n} \sum_{i=1}^{n} (D_i - F_i).    (6)
Another metric used in this evaluation is MAD
(Mean Absolute Deviation). Equation (7) shows how
this metric is computed.
MAD = \frac{1}{n} \sum_{i=1}^{n} |D_i - F_i|.    (7)
The third metric used in this evaluation is MAPE
(Mean Absolute Percentage Error). Equation (8)
shows how this metric is computed. Although MAPE, also known as MMRE, is the most common measure of forecast accuracy, it has an important weakness, as shown in (Foss, Stensrud, Kitchenham, and Myrtveit, 2002).
MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{D_i - F_i}{D_i} \right| \times 100.    (8)
The last metric used in this evaluation is
WMAPE (Weighted Mean Absolute Percentage
Error). Equation (9) shows how WMAPE is
computed.
WMAPE = \frac{\sum_{i=1}^{n} |D_i - F_i|}{\sum_{i=1}^{n} D_i} \times 100.    (9)
For analyzing the available project data, we
developed a software prototype of our Behavioral
Project Monitoring Framework. This software
prototype has a forecasting module, implementing
the Work Behavior Prediction method.
The project data is provided as Microsoft Project Plan files, which are available on a monthly basis for several months in the case of project X, and on a weekly basis for several weeks in the case of project Y.
The software prototype automatically computes the four metrics for all project tasks for which data is available, so that an index i of D and F in equations (6), (7), (8), and (9) refers to one task.
4 RESULTS AND DISCUSSION
The forecast evaluation results are presented in Table 1 for project X and in Table 2 for project Y. Table 1 and Table 2 show the prediction time span, which is measured in months in the case of project X, and in weeks in the case of the smaller project Y.
The main reason for making predictions over such time spans is that project development data is available on a monthly basis in the case of project X, and on a weekly basis in the case of project Y.
Consequently, forecasts at the end of the prediction
time span can be compared to existing information
regarding project progress.
The four evaluation metrics presented in the previous section are computed for Velocity Trend prediction (VTP in Table 1 and Table 2) and for our prediction method, Work Behavior Prediction (WBP in Table 1 and Table 2).
A prediction method is considered better than the other for a given case if more than half of the available metric values are lower for that method (for all four metrics used in this evaluation, lower means better).
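This majority rule can be sketched as a small helper, hypothetical and not part of the described prototype; the two arguments are assumed to be aligned lists of the four metric values for the same case.

```python
# Hypothetical comparison rule: method A beats method B on a case when
# A has the lower value on more than half of the available metrics
# (lower is better for all four metrics used in this evaluation).
def a_is_better(metrics_a, metrics_b):
    wins = sum(1 for a, b in zip(metrics_a, metrics_b) if a < b)
    return wins > len(metrics_a) / 2
```

With four metrics, a method therefore needs at least three lower values to be declared better for a case.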
In Table 1 and Table 2, the cases in which our
prediction method (WBP) is better than Velocity
Trend prediction (VTP) are shaded.
Analyzing the results presented in Table 1 and
considering all the available 24 presented cases, our
prediction method (WBP) proves to be
systematically better than Scrum’s Velocity Trend
prediction (VTP). The 1 month prediction time span
shows the lowest differences between the two
prediction methods. Even so, in 7 of the 8 cases our prediction method has a lower MFE, meaning that it is more "on target" than the competing Velocity Trend method. The 2 month prediction time span shows
better results for our prediction method in 6 of the 9
cases. For the 3 month prediction time span, according to the metric values, our prediction method is better in 6 of the 7 cases. The results presented in Table 1 suggest that, for long term prediction and considering the available information, our method is more appropriate for decision support than the popular Velocity Trend prediction. For example, for
case 17 (Table 1), using Work Behavior Prediction,
the project manager knows two months ahead of
time where project tasks will be in terms of work
progress with an average absolute prediction error
per task of only 10 working days (see MAD for case
17 in Table 1) meaning 2 calendar weeks. Applying
Velocity Trend Prediction on the same data and for
the same time span, the average absolute error per
task is 35 working days, meaning one and a half calendar months, which almost equals the prediction time span.
Analyzing the results shown in Table 2 and
considering all the available 10 cases, we conclude that our prediction method is better than Velocity
Trend prediction for project Y also. For 1 week
prediction time span, our method shows better
results in 3 of the 4 cases. For the other prediction
time spans (2, 3, and 4 weeks), our prediction
method is better in all the cases.
Just like for project X, the results for project Y,
which are presented in Table 2, suggest that, for long
ICSOFT 2011 - 6th International Conference on Software and Data Technologies