shown by QCD, and the project principle sets which targets of QCD are to be improved. A rule in the project principle is therefore a pair of a status to be improved and an action needed to improve the corresponding QCD target, called the "learning target", for projects. A learning target defines which viewpoints the learner should acquire.
Decision making by project managers consists of two stages of operation: project managers check the progress, and they take actions based on that progress. From this viewpoint, there are two kinds of project principles: "check progress" and "take action".
This simulator can give the learner feedback on whether the learner's operation follows the project principle. However, the learner's operation cannot be compared directly with a project principle, because the situations and the actions appropriate to them vary from project to project. To compare the learner's operation, it is necessary to generate a reference operation, that is, an operation that follows the project principle.
The project principle is thus needed to generate the reference operation. However, generating a project principle that is useful for various projects takes a lot of time, because trainers must simulate and analyze many operations for each project in order to obtain a good result. This paper addresses how to generate such a project principle and the reference operation.
3 GENERATION METHOD OF
REFERENCE OPERATION
USING REINFORCEMENT
LEARNING
3.1 Outline of Automatic Generation
Method
We propose a method for generating the reference operation using reinforcement learning and decision tree learning. Figure 3 shows the outline of the generation method.
In order to generate the reference operation based on the project principle, this method uses optimal operations, which lead to the best results for a learning target set in advance. The optimal operations can be considered to include correct judgments and actions that improve the status of the project. Because project management should keep the QCD values small, an optimal operation is defined as the operation that minimizes the objective function f(operation) defined as follows:

f(operation) = Σ_{i ∈ LearningTarget} f_i(operation)    (1)
Figure 3: Outline of proposed method of the reference operation.
where LearningTarget includes Q, C, or D, operation is a learner's operation, and f_i(operation) is the value of i output by the simulator after executing operation. For example, when the learning target is "Q and D", the objective function is f(operation) = f_Q(operation) + f_D(operation).
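As a minimal sketch (the simulator and its outputs are not specified in this excerpt, so `simulate` and its dummy return values are hypothetical stand-ins), the objective function of formula (1) can be expressed as:

```python
# Sketch of objective function (1): sum the simulator outputs f_i
# over the QCD indicators chosen as the learning target.

def simulate(operation):
    """Hypothetical simulator: returns QCD results after executing an operation."""
    # Dummy values; the real simulator would run the project model.
    return {"Q": 3.0, "C": 120.0, "D": 5.0}

def objective(operation, learning_target):
    """f(operation) = sum of f_i(operation) for i in LearningTarget."""
    results = simulate(operation)
    return sum(results[i] for i in learning_target)

# Example: learning target "Q and D"
print(objective(operation=None, learning_target=["Q", "D"]))  # 3.0 + 5.0 = 8.0
```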
First, this method generates results for operations by using the simulation program of operations, which simulates various types of operations automatically. Second, it generates the optimal operations from those results by reinforcement learning, which can compute them faster than brute-force search. Third, it generates the project principle as a group of rules from the optimal operations by decision tree learning. Finally, the reference operation is generated by the project principle execution program, which takes actions following the project principle.
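The four stages above can be sketched as the following pipeline. All function names and data structures are hypothetical placeholders; the real systems are the simulator, the reinforcement learner, the decision tree learner, and the project principle execution program of Figure 3.

```python
# Hypothetical sketch of the four-stage generation pipeline.

def simulate_operations(project_model):
    """Stage 1: simulate various operations; return (operation, result) pairs."""
    return [({"action": "overtime"}, {"Q": 2, "C": 110, "D": 4}),
            ({"action": "none"},     {"Q": 3, "C": 100, "D": 6})]

def find_optimal_operation(histories, learning_target):
    """Stage 2: pick the operation minimizing the objective (RL stand-in)."""
    return min(histories, key=lambda h: sum(h[1][i] for i in learning_target))

def learn_project_principle(optimal_operations):
    """Stage 3: derive IF-THEN rules (decision tree learning stand-in)."""
    return [("delay_on_critical_path", "overtime directive")]

def execute_principle(rules, status):
    """Stage 4: produce the reference operation by matching the rules."""
    return [action for condition, action in rules if status.get(condition)]

histories = simulate_operations(project_model=None)
best = find_optimal_operation(histories, ["Q", "D"])
rules = learn_project_principle([best])
reference = execute_principle(rules, {"delay_on_critical_path": True})
print(reference)  # ["overtime directive"]
```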
3.2 Optimal Operation Generating
System
To generate a project principle, it is necessary to use optimal operations for various types of projects. Generating an optimal operation requires minimizing the objective function of formula (1), but doing so by brute-force search takes a lot of time. The proposed method therefore searches for an operation that improves the status of the project, using reinforcement learning, which determines future actions so as to maximize reward based on past actions. In the reinforcement learning, the project state s_t (t: time) is the set of QCD values given by the simulator, an action a is one of the selectable actions in the simulator, and the reward r(s_t, a) is defined as formula (2)
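As a minimal sketch of this setup (the reward of formula (2) is not shown in this excerpt, so a placeholder reward is used; the simulator dynamics, actions, and all names below are hypothetical), tabular Q-learning over simulator states could look like:

```python
import random

# Minimal tabular Q-learning sketch: state s_t = QCD values from the
# simulator, action a = a selectable simulator action, reward r(s_t, a)
# = placeholder (formula (2) is defined in the paper, not here).

random.seed(0)  # for reproducibility of this sketch
ACTIONS = ["progress check", "overtime directive", "do nothing"]

def step(state, action):
    """Hypothetical simulator step: returns (next_state, reward)."""
    q, c, d = state
    if action == "overtime directive":
        next_state = (q, c + 10, max(d - 1, 0))  # more cost, less delay
    else:
        next_state = (q, c, d)
    reward = -(next_state[0] + next_state[2])  # placeholder for r(s_t, a)
    return next_state, reward

def q_learning(episodes=200, alpha=0.1, gamma=0.9, epsilon=0.1):
    Q = {}
    for _ in range(episodes):
        state = (3, 100, 5)  # initial QCD values
        for _ in range(10):  # finite horizon per episode
            if random.random() < epsilon:
                action = random.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))
            next_state, reward = step(state, action)
            best_next = max(Q.get((next_state, a), 0.0) for a in ACTIONS)
            old = Q.get((state, action), 0.0)
            Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return Q

Q = q_learning()
start = (3, 100, 5)
print(max(ACTIONS, key=lambda a: Q.get((start, a), 0.0)))
```

Under this placeholder reward, "overtime directive" reduces the delay term and should come to dominate the learned Q-values at the initial state.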
A Generation Method of Reference Operation using Reinforcement Learning on Project Manager Skill-up Simulator