Accurate and reliable software cost prediction
can significantly increase the productivity of an
organisation and facilitate the decision-making
process (Briand et al., 1999), while, at the
same time, it is highly important to both developers
and customers (Leung and Fan, 2002). Project
managers commonly stress the importance of
improving estimation accuracy and the need for
methods to support better estimates, as these can
help diminish the problems concerning the software
development process and contribute to better project
planning, tracking and control, thus paving the way
for successful project delivery (Lederer and Prasad,
1992). Once a satisfactory and reliable software cost
model is devised, it can then be used for efficiently
developing software applications in an increasingly
competitive and complex environment. The model
may thus constitute the basis for contract
negotiations, project charging, classification of tasks
and allocation of human resources, task progress
control, monitoring of personnel and other resources
according to the time schedule, etc.
The parameters anticipated to affect software
development cost are not easy to define, are highly
ambiguous and are difficult to measure, particularly
at the early project stages. The hypothesis here is
that if we manage to detect those project
characteristics that decisively influence the
evolution of software cost and assess their impact,
then we may provide accurate estimations. Therefore, finding the
fundamental characteristics of the software process
is critical, as these can lead to the creation of various
computational models that aim at measuring or
predicting certain factors affecting this process, such
as software development effort, quality and
productivity. The work presented in this paper aims
to provide accurate predictions of software
development cost by utilising computational
intelligence methods along with Input Sensitivity
Analysis (ISA) to find the optimal set of input
parameters that appear to best describe the cost of a
software project, especially in the early phases of
the software development life-cycle (SDLC).
The rest of the paper is organised as follows:
Section 2 presents a brief overview of the relevant
software cost estimation literature and outlines the
basic concepts of artificial neural networks, the latter
constituting the basis of our modelling attempt.
Section 3 introduces the data series used for
experimentation and describes in detail the
suggested cost estimation methodology. Section 4
applies the methodology and
discusses the experimental results obtained,
commenting on the factors that mostly affect
software cost. Finally, Section 5 draws the
concluding remarks and suggests future research
steps.
2 COST ESTIMATION MODELS:
A THEORETIC BACKGROUND
In the late 1950s and the 1960s, researchers and
software engineers began focusing on software cost
estimation. Since then, various estimation techniques
and models have been proposed in order to achieve
better and more accurate cost prediction. Software
cost estimation is conceived in this paper as the
process of predicting the amount of effort required
to develop software. The success of this process lies
with the quality of the data and the selected
parameters used for performing the estimation.
A considerable number of the models used for
software cost estimation are either cost-oriented
models, providing direct estimates of effort, or
constraint models, expressing the relationship between
the parameters affecting effort over time. COCOMO
(Constructive Cost Model), an example of a cost
model, has a primary cost factor (size) and a number
of secondary adjustment factors or cost drivers
affecting productivity. Since its first publication
(Boehm, 1981) it has been revised into a newer version
called COCOMO II (Boehm et al., 1995), updated later
in (Boehm, 1997), which combines three cost models,
each corresponding to a stage in the software
life-cycle: Applications Composition, Early Design,
and Post Architecture. The revised model appears to be
more useful for a wider collection of techniques and
technologies.
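The structure described above, a primary size factor adjusted by a number of cost drivers, can be illustrated with a minimal sketch. It assumes the Intermediate COCOMO form, effort = a * KLOC^b * EAF, with the (a, b) pairs commonly quoted from Boehm (1981); the function name and the cost-driver multipliers in the usage line are hypothetical and serve only for illustration. It does not reproduce COCOMO II.

```python
from math import prod

# (a, b) pairs commonly quoted for Intermediate COCOMO (Boehm, 1981).
# Treat them as an assumption of this sketch, not as calibrated values.
COEFFICIENTS = {
    "organic": (3.2, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def cocomo_effort(kloc, mode="organic", cost_drivers=()):
    """Return estimated effort in person-months.

    kloc         -- size in thousands of lines of code (primary cost factor)
    mode         -- development mode selecting the (a, b) pair
    cost_drivers -- iterable of effort multipliers (secondary factors);
                    an empty iterable means a nominal project (EAF = 1.0)
    """
    if kloc <= 0:
        raise ValueError("kloc must be positive")
    a, b = COEFFICIENTS[mode]
    eaf = prod(cost_drivers, start=1.0)  # effort adjustment factor
    return a * kloc ** b * eaf


if __name__ == "__main__":
    # Hypothetical multipliers: one driver raising effort, one lowering it.
    print(round(cocomo_effort(32.0, "organic", [1.15, 0.91]), 1))
```

Keeping the multipliers as a plain iterable makes the multiplicative role of the cost drivers explicit: each one scales the size-based estimate up or down.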
SLIM (Software Life-cycle Model), an example
of a constraint model, is applied to large projects
exceeding 70,000 lines of code, and assumes that
effort for software projects is distributed similarly to
a collection of Rayleigh curves (Putnam and Myers,
1992). It supports most of the popular size
estimating methods including ballpark techniques,
function points (Boehm et al., 2000), component
mapping, GUI (object) sizing, sizing by module etc.
(visit the Quantitative Software Management
website: http://www.qsm.com, for more information
on recently developed SLIM tools). SLIM follows a
stepwise approach utilising the software and manpower
build-up equations; the parameters that must be known
upfront for the model to be applicable are the system
size, the manpower acceleration and the technology
factor.
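How the three parameters above determine effort and schedule can be sketched as follows. The sketch assumes the textbook form of Putnam's equations, namely the software equation S = C * K^(1/3) * t_d^(4/3) and the manpower build-up equation D = K / t_d^3, with K the total effort in person-years and t_d the development time in years. It also uses a single Rayleigh curve rather than a collection of curves. The function names and the numeric inputs are hypothetical, and the commercial SLIM tools are not reproduced here.

```python
import math


def putnam_schedule_and_effort(size_sloc, technology_factor, manpower_buildup):
    """Solve the two equations for development time t_d and total effort K.

    Substituting K = D * t_d**3 into S = C * K**(1/3) * t_d**(4/3)
    gives S = C * D**(1/3) * t_d**(7/3), hence the closed form below.
    """
    t_d = (size_sloc / (technology_factor * manpower_buildup ** (1 / 3))) ** (3 / 7)
    total_effort = manpower_buildup * t_d ** 3
    return t_d, total_effort


def rayleigh_staffing(t, total_effort, t_d):
    """Staffing level at time t on a Rayleigh curve that peaks at t = t_d."""
    return (total_effort / t_d ** 2) * t * math.exp(-t ** 2 / (2 * t_d ** 2))


def cumulative_effort(t, total_effort, t_d):
    """Effort spent up to time t (the integral of the staffing curve)."""
    return total_effort * (1 - math.exp(-t ** 2 / (2 * t_d ** 2)))


if __name__ == "__main__":
    # Illustrative inputs only: size, technology factor, manpower acceleration.
    t_d, k = putnam_schedule_and_effort(100_000, 5_000, 15)
    for year in (0.5 * t_d, t_d, 2 * t_d):
        print(round(year, 2), round(rayleigh_staffing(year, k, t_d), 1))
```

Under these assumptions roughly 39% of the total effort, 1 - exp(-0.5), has been spent by the time staffing peaks at t_d.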
Software cost models are evaluated against
certain error criteria, most commonly by
comparing the estimated with the actual effort.
Existing software cost models experience