6. The choice of variable discretisation, structure
learning algorithms, parameter estimation
algorithms, and the number of categories used in the
discretisation all affect the accuracy of the results,
and there are no clear-cut guidelines on which
choice is best. The best choice may simply depend
on the dataset being used and the amount of data
available, and may have to be found by trial and
error (Mendes and Mosley, 2008).
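As a minimal sketch of why the discretisation choice matters, the Python fragment below bins the same set of effort values with two common schemes (equal-width and equal-frequency) and two different numbers of categories. The effort values and the function names are hypothetical and serve illustration only; they are not taken from the project data or from any BN tool.

```python
from bisect import bisect_right
from statistics import quantiles


def equal_width_cuts(values, k):
    """Return k-1 cut points that split [min, max] into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Return k-1 cut points placed at the sample quantiles."""
    return quantiles(values, n=k, method="inclusive")


def discretise(values, cuts):
    """Map each value to the index (0..k-1) of the bin it falls into."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical, skewed effort values in person-hours (illustrative only).
effort_hours = [12, 15, 20, 22, 30, 35, 40, 60, 120, 400]

for k in (3, 5):
    for name, make_cuts in (("equal-width", equal_width_cuts),
                            ("equal-frequency", equal_frequency_cuts)):
        cuts = make_cuts(effort_hours, k)
        print(k, name, discretise(effort_hours, cuts))
```

With this skewed sample and three categories, equal-width binning places nine of the ten projects in the lowest category, whereas equal-frequency binning spreads them across the categories as 3, 3 and 4. The two schemes would therefore yield very different probability tables for the same node.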
Therefore, given the abovementioned constraints, as
part of an NZ-government-funded project on using
Bayesian Networks for Web effort estimation, we
decided to develop several expert-based
company-specific Web effort BN models, with the
participation of numerous local Web companies in
the Auckland region, New Zealand. The
development and successful deployment of one of
these models is the subject and contribution of this
paper. The model detailed herein is large,
containing 37 factors and over 40 causal
relationships. It is much more complex than the
expert-based Web effort estimation model presented
in (Mendes et al., 2009), which comprises 15
factors and 14 causal relationships. This is the first
time that a study in either Web or Software
Engineering describes the creation and use of a large
expert-based BN model. In addition, we also believe
that our contribution goes beyond the area of Web
engineering given that the process presented herein
can also be used to build BN models for non-Web
companies.
Note that we are not suggesting that data-driven
and hybrid BN models should not be used. On the
contrary, they have been successfully employed in
numerous domains (Woodberry et al., 2004);
however, the specific domain context of this paper,
that of Web effort estimation, provides other
challenges (described above) that led to the
development of solely expert-driven BN models.
We would also like to point out that in our view
Web and software development differ in a number
of areas, such as: Application Characteristics,
Primary Technologies Used, Approach to Quality
Delivered, Development Process Drivers,
Availability of the Application, Customers
(Stakeholders), Update Rate (Maintenance Cycles),
People Involved in Development, Architecture and
Network, Disciplines Involved, Legal, Social, and
Ethical Issues, and Information Structuring and
Design. A detailed discussion on this issue is
provided in (Mendes et al., 2005).
The remainder of the paper is organised as
follows: Section 2 provides a description of the
overall process used to build and validate BNs;
Section 3 details how this process was applied to
build the expert-based Web effort BN that is the
focus of this paper.
Finally, conclusions and comments on future work
are given in Section 4.
2 GENERAL PROCESS USED TO
BUILD BNS
The BN presented in this paper was built and
validated using an adaptation of the Knowledge
Engineering of Bayesian Networks (KEBN) process
proposed in (Woodberry et al., 2004). Within the
context of this paper the author was the knowledge
engineer (KE), and two Web project managers from
a well-established Web company in Auckland were
the domain experts (DEs).
The three main steps within the adapted KEBN
process are Structural Development, Parameter
Estimation, and Model Validation. This process
iterates over these steps until a complete BN is built
and validated. Each of these three steps is detailed
below:
Structural Development: This step represents the
qualitative component of a BN, which results in a
graphical structure comprised of, in our case, the
factors (nodes, variables) and causal relationships
identified as fundamental for effort estimation of
Web projects. In addition to identifying variables,
their types (e.g. query variable, evidence variable)
and causal relationships, this step also comprises the
identification of the states (values) that each variable
should take, and whether they are discrete or
continuous. In practice, currently available BN tools,
including the BN software used in this study, require
that continuous variables be discretised by
converting them into multinomial variables. The BN’s
structure is refined through an iterative process. This
structure construction process has been validated in
previous studies (Druzdzel and van der Gaag, 2000;
Fenton et al., 2004; Mahoney and Laskey, 1996;
Neil et al., 2000; Woodberry et al., 2004) and uses
the principles of problem solving employed in data
modelling and software development (Studer et al.,
1998). As will be detailed later, existing literature in
Web effort estimation and knowledge from the
domain experts were employed to elicit the Web
effort BN’s structure. Throughout this step the
knowledge engineer(s) also evaluate(s) the structure
of the BN, which is done in two stages. The first entails
checking whether: variables and their values have a
clear meaning; all relevant variables have been
included; variables are named conveniently; all
ICEIS 2011 - 13th International Conference on Enterprise Information Systems