A NEW REPRESENTATION AND PLANNER FOR COMPUTER BATCH JOB SCHEDULING, EXECUTION MONITORING, PROBLEM DIAGNOSIS AND CORRECTION

Tracey Lall

Department of Computer Science, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, U.S.A.

Keywords: Planning, Scheduling, Execution, Diagnosis, Automation.

Abstract:

Modern enterprise computer environments use commercial schedulers to run and monitor computer batch jobs and processes. Currently, job schedules must either be manually designed to include diagnosis and error correction behaviours for failed jobs, or failures must be handled by support staff at execution time, requiring them to be on call while these jobs run. Automating these manual tasks using planning techniques requires a compact representation of contingent plans, handling and monitoring of actions which have a variable duration, handling of actions which are triggered by external events, and planning for knowledge goals. No single existing planner currently provides all of these features. We present a novel plan representation which, drawing on existing scheduler representations, provides all these features in an integrated manner. A planner implementation using this representation with a new action logic is described, along with key worked examples from the domain.

1 INTRODUCTION

Modern enterprise computer environments involve the scheduled running of hundreds of computer batch jobs, programs and processes on a collection of machines. These consist of computer programs which are run to achieve a specific task, for example a job to generate a report and email it to a user. These jobs are executed at predetermined times and monitored for successful completion by commercial schedulers according to predefined job schedules. Schedules need to be designed to avoid conflicts between certain jobs (for example, running a report against a database while the database is undergoing a maintenance job) and to ensure that batch job outputs are produced by the required deadlines. In all cases the job schedule definitions must be created in advance by the support team, and unless explicit recovery logic is programmed into the definitions, job failures will require manual intervention to diagnose the cause of the error and to take appropriate corrective actions. This makes support of such environments very costly: typically, for every dollar spent on computing infrastructure, between 2 and 10 times that amount is spent on ongoing management (Murch, 2004).

Previous approaches to the automation of batch job control (Ennis, 1986) have utilized a pattern-based rule approach to identifying error situations. The disadvantage of this approach is that error handling rules must be hand-coded by a skilled operator for each computer environment. We seek instead to create a planner which is able to generate contingent plans for job execution, monitoring and error correction based on the known behaviours of the batch jobs and the processes comprising the computer system.

From analysis of the domain, there are some key aspects which need to be addressed by the planner:

• The representation of contingent plans must be in a form understandable to support staff. To avoid combinatorial explosion in the size of plans, it should also be compact and support the remerging of contingent execution branches.

• The representation needs to support actions which have a variable duration and which hence require monitoring for completion.

• The representation must allow reasoning about triggered actions and events, that is, actions and events which occur as soon as a particular set of conditions becomes true.

• The representation must support planning for knowledge goals in order to diagnose job failure root causes which are not directly sensable.


Lall T. (2010).

A NEW REPRESENTATION AND PLANNER FOR COMPUTER BATCH JOB SCHEDULING, EXECUTION MONITORING, PROBLEM DIAGNOSIS

AND CORRECTION.

In Proceedings of the 2nd International Conference on Agents and Artiﬁcial Intelligence - Artiﬁcial Intelligence, pages 277-284

DOI: 10.5220/0002723602770284

 SciTePress

These features are not provided by any single existing planner. We present a novel plan representation which addresses these requirements by drawing upon the existing commercial scheduler representations. The representation describes the plan as a dynamical system which, in conjunction with the external world, evolves according to simple dynamics. We describe an action logic for reasoning about this dynamical system which provides both forwards temporal projection inferences and means-end based inferences to support partial order contingent planning. We outline the operation of an implemented planner using this logic on two key examples from the domain to demonstrate how it builds the plan in such a way that the combined plan and world state evolve into a goal state on all contingencies.

2 COMPUTER BATCH JOB ENVIRONMENTS

2.1 Commercial Schedulers

Automated schedulers (ComputerAssociates, 2002), (UC4, 2008) exist which allow job schedules and dependencies to be defined. A job is run once its scheduled time (if defined) is reached and its start conditions become true. For example, the start conditions for a batch job jobC might be:

success(jobA) and not running(jobB)

This job will run as soon as jobA is in the success state (i.e. has completed with a nominal process exit status) and jobB is not in the running state. Such schedulers employ an event processor which, given a set of job definitions, constantly checks whether any jobs are ready for execution.
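The start-condition check performed by such an event processor can be sketched as follows. This is only an illustration and not the interface of any commercial scheduler: the `Job` class, the `ready_jobs` function and the "waiting" state are assumptions introduced here, while "success" and "running" mirror the predicates in the example above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Job:
    name: str
    # Hypothetical encoding: a start condition is a predicate over the job states.
    start_condition: Callable[[Dict[str, str]], bool]

def ready_jobs(jobs: List[Job], states: Dict[str, str]) -> List[str]:
    """Return the names of not-yet-started jobs whose start condition holds."""
    return [job.name for job in jobs
            if states.get(job.name) == "waiting" and job.start_condition(states)]

# success(jobA) and not running(jobB)
job_c = Job("jobC", lambda s: s["jobA"] == "success" and s["jobB"] != "running")

states = {"jobA": "running", "jobB": "running", "jobC": "waiting"}
before = ready_jobs([job_c], states)   # jobA has not yet succeeded
states.update(jobA="success", jobB="success")
after = ready_jobs([job_c], states)    # both parts of the condition now hold
```

A real event processor would repeat this check whenever a job state changes rather than on demand.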

2.2 Illustrative Example Scenarios

The following example scenarios, from a case study conducted on a real-world production computer environment, demonstrate some of the key features of this domain:

• A report script generates a report for a given date by processing an input file. The input file is only received after it has been generated by an exogenous external event. The report generation takes a variable amount of time and must be monitored for completion. Once the report is generated, the report file is transferred by ftp to a remote server for use by an external job. This example demonstrates the need for reasoning about exogenous events and triggered events, and for the execution and monitoring of durative actions.

• A database error must be repaired, where the error value can be 1 or 2. To check for internal database errors, a test script checkDB can be run which takes as an argument the error condition e it is checking for and outputs True if the database has that error or False if it doesn't. There is also a repairDB script which takes as an argument an error number e and repairs that error condition (or does nothing if the database does not have that condition). Using these scripts, in the event that a job which accesses the database fails, the error condition may be determined using the checkDB script, and once the error number is determined the repairDB script can be called with this error number to repair the error condition. This example demonstrates the need for contingent planning, handling of merging of plan branches and planning for knowledge goals.

2.3 Existing Planner Applicability

There is an enormous range of existing planners with a broad range of capabilities. Any planner suitable for this domain must be able to handle the open world assumption, i.e. it must be a contingent planner able to produce plans whose execution is conditional upon observations made at execution time. Planners such as Puccini (Golden, 1998), PKS (Petrick and Bacchus, 2002), GPT (Bonet and Geffner, 2001), MBP (Bertoli et al., 2001), CC-Golog (Grosskreutz and Lakemeyer, 2000) and C-Buridan (Draper et al., 1994) are all able to plan in an open world, and all except Puccini perform contingent reasoning. However, these planners are unable to handle the other requirements. Puccini, PKS, MBP and GPT are all linear planners: they don't employ partial order based reasoning and hence are unable to reason about actions of varying durations. All of the planners except GPT use a branching tree plan representation where, each time an observation is made and an action is predicated on that observation, a branch is introduced into the plan with no remerging of the plan branches, which can lead to combinatorial explosion in plan size. GPT uses a more compact plan representation in the form of an MDP policy, which is a mapping from planner belief states to actions.

None of the planners (except CC-Golog, which has a high-level while, dowait language construct) include action monitoring in the plan representation.

ICAART 2010 - 2nd International Conference on Agents and Artificial Intelligence


3 NEW PLAN REPRESENTATION

A new plan representation was formulated to meet the domain requirements. The representation is a 'plan as program' type approach (as in (Levesque et al., 1997)) where the plan is a simple form of program which is interpreted at runtime into execution of actions according to the control structure and sequence defined by the program.

In order to incorporate the action monitoring aspects and to allow actions to be triggered by external conditions, the approach taken is to represent the plan itself using a language very similar to that used by commercial job schedulers, where agent actions are executed as soon as their associated start conditions become true. This representation's viability as a runtime execution model which handles triggered events and durative actions has been demonstrated via its use in commercial schedulers. Additionally, its readability and interpretability by human operators have been validated. This simple 'programming language' also obeys simple, formally definable dynamics and hence can be directly reasoned about using an appropriate action logic.

Most partial order planners use a mixed representation where the plan represents both plan construction time reasoning (such as causal links, threats, protections and orderings) and the runtime execution. In this representation the plan consists purely of its constituent parts described below, and planner deliberation is performed separately using an action logic which reasons about the dynamics of this plan.

The form of a plan in this representation is defined below:

<plan> ::= <job> | <planVariableValue> [<plan>]

<job> ::= ("name:" <identifier>
           "command:" <command> ;
           "status:" <status> ;
           "startConditions:" <startCondition>*)

<status> ::= "Initialised" | "Executing" | "Completed"

<planVariableValue> ::= ("name:" i_<identifier> "value:" <value>)

<command> ::= [<identifier> "="] ["] <string> [<parameter>*] ["]

A command contains a string description of an internal (planner inbuilt) command, or a quoted string description of an external operating system script, and a list of parameters for the command. A parameter may be a plan variable or a constant. The return value from the command may optionally be assigned to a plan variable.

The command is run as soon as the job status is Executing. During planner deliberation, for each command there are corresponding event definition(s) used by the action logic to determine the effects (on both planner variables and world fluents) of executing the command under nominal and non-nominal conditions.

A plan variable (whose identifier is prefixed with i_ to distinguish it from external world fluent identifiers) associates a value with a named variable. Its value may be set either from the results of an external command script or by an internal planner command (such as assignment). Such variables can be used for representing the value of fluents in the world (see the discussion of knowledge representation below). The value of a plan variable may change during execution of the plan.

A job's start conditions are a set of formulas describing conditions involving a plan variable, a job status, or a world fluent whose value is continuously sensed and accessible to the planner at runtime.
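As an illustration of the plan form described above, the sketch below encodes jobs and plan variables as simple Python data structures. It is not the data model of the implemented planner: the class and field names (`Job`, `PlanVariable`, `result_variable`, `start_conditions`) are assumptions made for this example, and start conditions are kept as uninterpreted strings.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Union

class Status(Enum):
    INITIALISED = "Initialised"
    EXECUTING = "Executing"
    COMPLETED = "Completed"

@dataclass
class PlanVariable:
    name: str                        # prefixed with i_ by convention
    value: Optional[object] = None   # may change while the plan executes

@dataclass
class Job:
    name: str
    command: str                     # internal command or external script
    start_conditions: List[str] = field(default_factory=list)
    result_variable: Optional[str] = None   # optional target for the return value
    status: Status = Status.INITIALISED

# A plan is a flat collection of jobs and plan variable values.
Plan = List[Union[Job, PlanVariable]]

plan: Plan = [
    PlanVariable("i_dbState_is1"),
    Job("check1", "checkDB 1", result_variable="i_dbState_is1"),
]
```

Keeping the plan flat, with no explicit branching nodes, reflects the point that all control flow is carried by the start conditions.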

3.1 Plan Execution Dynamics

Execution of the plan consists of letting the plan and world state evolve according to the following dynamics, which consist of three forms of event:

• Action Start Event - In order to execute the plan, the plan executor constantly monitors all conditions which are defined as start conditions. This is why start conditions must be formulas involving either plan variables or world fluent values to which the scheduler has direct and continuous access. As soon as all of the start conditions for a job become true and the current job state is Initialised, the job state is set to Executing. If the job command is an external command, it is run in the real world via the operating system command line using the specified parameter values, at which point any immediate effects of the command start take place in the external world. If the command is an internal command (such as a planner variable assignment), the executor executes that command and updates the specified planner variable accordingly.

• External Action Completion - When the action command process (which may have a varying duration) has completed (and any effects of the command have taken place in the external world), any results returned from the action are assigned to the specified plan variable and the job status is set to Completed.


• Exogenous Events - These are events which are not under the direct control of the planner and which occur when various conditions become true in the external world. These events must also be reasoned about during plan construction.
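The three forms of event above can be illustrated with a small tick-based simulation. This is a sketch under simplifying assumptions rather than the executor used in the paper: time is discretised into ticks, each command's duration is a fixed number known only to the simulation, and the names `Job`, `step`, `on_complete` and the fluent keys are hypothetical.

```python
class Job:
    def __init__(self, name, start_condition, on_complete, duration):
        self.name = name
        self.start_condition = start_condition  # callable: state -> bool
        self.on_complete = on_complete          # callable applied at completion
        self.remaining = duration               # ticks until the command finishes

def step(jobs, exogenous, state):
    """Advance the combined plan and world state by one tick."""
    for trigger, effect in exogenous:           # exogenous events
        if trigger(state):
            state.update(effect)
    for job in jobs:
        key = job.name + ".status"
        if state[key] == "Executing":           # external action completion
            job.remaining -= 1
            if job.remaining == 0:
                job.on_complete(state)
                state[key] = "Completed"
        elif state[key] == "Initialised" and job.start_condition(state):
            state[key] = "Executing"            # action start event

state = {"inputFile.exists": False, "report.exists": False,
         "gen.status": "Initialised"}
exogenous = [(lambda s: not s["inputFile.exists"], {"inputFile.exists": True})]
gen = Job("gen", lambda s: s["inputFile.exists"],
          lambda s: s.update({"report.exists": True}), duration=2)

history = []
for _ in range(3):
    step([gen], exogenous, state)
    history.append(state["gen.status"])
```

Here the job starts in the tick in which the input file appears and stays in the Executing state until its simulated duration has elapsed, which is the behaviour that makes monitoring for completion necessary.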

3.2 Plan Variables and Knowledge Representation

Fluents which are directly sensable at little or no cost and which may be continuously sensed are considered automatic fluents. Such fluents may be directly referenced in the plan (e.g. a start condition for a job can involve an automatic fluent). Reasoning about fluents which are not directly sensable, and attainment of knowledge goals, is achieved using the approach of epistemic fluents: for each world fluent whose value is not directly observable, a plan variable is created which represents the value of that fluent in the real world. For example, if actions need to be predicated on the value of the database internal state (the world fluent dbState, which is not directly observable), the planner can create a plan variable called i_dbState which represents the value of the fluent in the real world. This variable can then be used to control action execution or used as an action parameter. These variables are similar to the concept of runtime variables used by the Puccini planner. A goal to gain knowledge of the database internal state would be represented during plan construction as the subgoal i_dbState = dbState. The planner can achieve this knowledge goal by assigning i_dbState based on the output of appropriate sensing actions. Using this concrete concept of knowledge, knowledge goals may be formulated and reasoned about using standard causal reasoning (without the need for modal logic as is used, for example, in the PKS planner).

4 ACTION LOGIC

The planning problem is the problem of defining the initial state of the plan such that the combined plan and world system evolves so that a state satisfying the goal conditions occurs on the trajectory of every possible initial contingency. In order to construct such a plan, the planner requires an action logic which is able to perform forwards temporal projection based upon the initial plan and world state. To do this, every job command must have a corresponding set of event descriptions which define what the start and completion effects of the command execution are on both the plan state (job status, planner variables) and fluents in the external world. Exogenous events must also be reasoned about. The event definitions used by the action logic are STRIPS (Fikes, 1971) style descriptors, with conditions which must hold for that event to occur and the effect conditions produced by that event. These triggered events are defined by their trigger state descriptor and effect state descriptors. The meaning of the event descriptors is different from that in other planners since all the events are considered as triggered events: the occurrence of a state in which all the trigger conditions hold is not just a prerequisite for that event but entails the occurrence of the event and its effect conditions. See Table 1 for an example set of event definitions for a job.

A state descriptor consists of a unique name (used for readability purposes during inferences) and a set of fluent conditions which hold in that state (similar to the state definition used in the fluent calculus (Thielscher, 1999)). Predefined names are reserved for currentState and goalState. The conditions which hold in a state are denoted by a set of condition predicates Holds(currentState, condition), where condition is a condition on a fluent (either internal or external), e.g. freeDiskSpace ≥ 5000. Fluents not included in the effect state definition retain the same value as prior to the event. In this approach, all changes (including those arising from agent actions) are considered to be the result of triggered events. In the dynamic model all event effects are deterministic and all uncertainty is represented in the initial state descriptor (any condition not specified as holding in the currentState is not determined). Contingencies are defined as a sub-state of the current state, defined by a further set of conditions which hold beyond those that hold in the current state, e.g. Contingency(dbState = 0).

Table 1: Start, Success and Failure event definitions for command "genReport ?date".

Event   | Trigger conditions                              | Effect conditions
--------|-------------------------------------------------|------------------------------------------------
Start   | status = Initialised                            | status = Executing
Success | status = Executing ∧ inputFile.exists = True    | status = Completed ∧ report?date.exists = True ∧ report?date.contents = ?date ∧ report?date.location = localServer
Failure | status = Executing ∧ inputFile.exists = False   | status = Completed
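The triggered-event semantics can be made concrete with a short sketch: an event fires exactly when all of its trigger conditions hold, and fluents not mentioned in the effect keep their previous values. The encoding below, with states as dictionaries of equality conditions and the helper names `holds` and `apply_event`, is an assumption made for illustration, and it treats an unspecified fluent simply as not satisfying any trigger. The example instantiates the success event of Table 1 with ?date = 1220.

```python
def holds(state, conditions):
    """True when every (fluent, value) condition is satisfied in the state."""
    return all(state.get(fluent) == value for fluent, value in conditions.items())

def apply_event(state, trigger, effect):
    """Return the successor state; the state is unchanged if the trigger fails."""
    if not holds(state, trigger):
        return dict(state)
    # Fluents absent from the effect retain their prior values.
    return {**state, **effect}

trigger = {"status": "Executing", "inputFile.exists": True}
effect = {"status": "Completed", "report1220.exists": True,
          "report1220.contents": "1220", "report1220.location": "localServer"}

before = {"status": "Executing", "inputFile.exists": True, "freeDiskSpace": 5000}
after = apply_event(before, trigger, effect)
untouched = apply_event({"status": "Initialised"}, trigger, effect)
```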

Event occurrence is defined as the first occurrence of its trigger state, not by a time value (as in the event calculus (Shanahan, 2000)). The advantage of this is that the same defined event can occur at different times on different contingencies. This allows for reasoning about plan branch merges where an event may draw its support from different causal sources under different contingencies (which might involve the event happening at different times under those different contingencies). Inferences exist to consolidate proven occurrences on different contingencies. For example, if a state occurrence is proven on the contingency where dbState = 0 and on the contingency where dbState ≠ 0, then the state occurrence is proven on the trajectory of the current state.
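The consolidation step can be pictured as a coverage check. In the sketch below a contingency is modelled, purely as an assumption for illustration, as the set of values that the uncertain fluent may take; an occurrence is proven for the current state once the contingencies on which it has been proven jointly cover every possible value. The function name and the three-value domain for dbState are hypothetical.

```python
def proven_on_current_state(possible_values, proven_contingencies):
    """True when the proven contingencies together cover all possible values."""
    covered = set().union(*proven_contingencies)
    return set(possible_values) <= covered

db_state_values = {0, 1, 2}            # assumed domain of the fluent dbState
is_zero = {0}                          # contingency dbState = 0
not_zero = db_state_values - is_zero   # contingency dbState != 0

partial = proven_on_current_state(db_state_values, [is_zero])
merged = proven_on_current_state(db_state_values, [is_zero, not_zero])
```

With only the dbState = 0 contingency proven the check fails, and it succeeds once the complementary contingency has been proven as well.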

The predicate Occurs (one of whose arguments is a contingency descriptor) is used to reason about occurrences of trajectory predicates such as conditions, states and events, and of partial order planning predicates such as event orderings and protections (Weld, 1994), under different contingencies. The Occurs predicate is true when the specified trajectory predicate occurs on all trajectories of any initial state which belongs to that contingency.

Some of the key action logic inferences are shown in Tables 2, 3 and 6.

Table 2: State update inference.

Occurs(StateOn(triggerState), contingency)
⟹
Occurs(StateOn(effectState), contingency) ∧
Occurs(OrderingOn(triggerState, effectState), contingency)

Table 3: Causal support inference.

Occurs(StateOn(stateA), contingency1) ∧
HoldsInState(stateA, conditionA) ∧
(conditionA ⟹ conditionB) ∧
Occurs(ProtectionOn(stateA, stateB, conditionA), contingency2) ∧
Occurs(OrderingOn(stateA, stateB), contingency3)
⟹
Occurs(StateConditionOn(stateB, conditionB), contingency1 ∩ contingency2 ∩ contingency3)

4.1 Partial Order Action Logic Inferences

Since the actions have variable durations, a linear planning approach cannot be followed. In order to perform forwards and backwards temporal projection, the action logic must support partial order planning predicates such as causal support, orderings and protection of conditions between events. Backwards inferences must include resolution of threats using the techniques of promotion, demotion and separation (Pryor and Collins, 1996). All of these forms of reasoning must be supported in a contingent manner; hence there are trajectory predicates defined to allow contingent reasoning about occurrences of all of these.

Due to space considerations the inferences can only be sketched in this paper, but the worked examples are intended to illustrate some of the key inferences.

5 IMPLEMENTED PLANNER

The approach taken for reasoning with this action language is that advocated by (Stone, 1998) and (Shanahan, 2000), of planning as an abductive inference process. The agent uses backwards inferences which make abductive choices about the jobs and plan variables it places into the plan. Once a choice about the plan components has been made, the planner performs forwards inferences to determine the evolution of the plan over time under different contingencies, and the plan is considered complete once it has been proved that the goal state occurs on all possible contingencies. The planner was implemented using the Drools rules engine (JBoss, 2007), with each inference in the action logic corresponding to a production rule. Drools contains an automated logical retraction facility which was used to implement search backtracking.

6 WORKED EXAMPLES

6.1 Handling Exogenous Events, Action Monitoring and Triggered Actions

In this example (solved by the implemented planner) from the previous discussion, the goal is to produce a report file "remoteReport1220" on a remote server. A report generation batch job which takes a date parameter generates a report on the local server which has contents corresponding to the specified date. The process requires as input a file inputFile which is generated by an exogenous event. An ftp action exists which copies a specified file from the local server to the remote server under a new file name.

The event definitions for these are shown in Tables 1, 4 and 5. Note that an object-oriented naming convention is used to name fluents which correspond to attributes of an object (such as a file).

Table 4: Start, Success and Failure event definitions for action "ftpToRemote ?file".

Event   | Trigger conditions                                                                                      | Effect conditions
--------|---------------------------------------------------------------------------------------------------------|------------------------------------------------
Start   | status = Initialised                                                                                    | status = Executing
Success | status = Executing ∧ ?file.exists = True ∧ ?file.location = localServer ∧ ?file.contents = ?contents    | status = Completed ∧ remote?file.exists = True ∧ remote?file.location = remoteServer ∧ remote?file.contents = ?contents
Failure | status = Executing ∧ ?file.exists = False                                                               | status = Completed

Table 5: Event definition for exogenous event externalFileGen.

Trigger conditions       | Effect conditions
-------------------------|------------------------
inputFile.exists = False | inputFile.exists = True

The first key inference is an abductive inference to provide support for the goal condition remoteReport1220.exists = True by adding into the plan a new job to run the action ftpToRemote with the parameter ?file = "Report1220" and instantiating all its associated events (start event, success event, fail event). It then sets subgoals to prove that the ftp start event is triggered, that the condition remoteReport1220.exists = True is protected from the successful ftp event effect state to the goalState, and that the successful event effect is ordered before the goal state. Similar subgoals are created for the location conditions.

Because file existence is considered an automatically sensed fluent, the planner inserts the start condition Report1220.exists = True for the ftpToRemote job. This becomes part of the trigger state definition for the ftp start event, which means that the ftp command will not be executed until the condition Report1220.exists = True is true. (The file contents are not an automatically sensed fluent, so this cannot be inserted as a start condition for the action.)

Using the same forms of inference to provide support for the condition Report1220.exists = True, the planner creates a new job for "genReport ?date" with the parameter substitution ?date = "1220".

The planner achieves the required ordering between the report generation and ftp actions by inserting an explicit planner ordering between the two agent actions, adding genReportJob.status = Completed to the start conditions for ftpToRemote.

Support for the inputFile.exists = True trigger condition for genReportJob is obtained from the event externalFileGen. Since the externalFileGen trigger state has no conditions, using the inference "All state conditions proven then state occurrence proven" the planner is able to prove occurrence of the externalFileGen event on all trajectories.

Table 6: All state conditions proven then state occurrence proven.

∀ condition ∈ Condition, contingency ∈ Contingency
s.t. HoldsInState(state, condition) ∧
Proven(Occurs(StateConditionOn(state, condition), contingency))
⟹
Proven(Occurs(StateOn(state), contingency))

From the occurrence of externalFileGen, using a series of forwards inferences, including the "State update inference", "All state conditions proven then state occurrence proven" and other partial order inferences using orderings and protections, the planner is able to prove occurrence of the events externalFileGen, genReportJob success and ftpToRemote success, and occurrence of the goalState on all trajectories.

The complete plan consists of the following:

(name: genReport1220,
 command: "runReport 1220",
 status: Initialised,
 startConditions: inputFile.exists=True)

(name: ftpToRemote_Report1220,
 command: "ftp Report1220",
 status: Initialised,
 startConditions: Report1220.exists=True,
 genReport1220.status=Completed)
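The forward projection of this plan can be sketched by firing triggered events until no further event applies. The code below is an illustrative reconstruction and not the planner's inference engine: events are instantiated by hand from Tables 1, 4 and 5 with ?date = 1220 and ?file = Report1220, the jobs' start conditions are folded into the start-event triggers, the contents and location fluents are omitted for brevity, and names such as `EVENTS`, `project`, `gen.status` and `ftp.status` are assumptions.

```python
# (event name, trigger conditions, effect conditions)
EVENTS = [
    ("externalFileGen",
     {"inputFile.exists": False},
     {"inputFile.exists": True}),
    ("genReport.start",
     {"gen.status": "Initialised", "inputFile.exists": True},
     {"gen.status": "Executing"}),
    ("genReport.success",
     {"gen.status": "Executing", "inputFile.exists": True},
     {"gen.status": "Completed", "Report1220.exists": True}),
    ("ftp.start",
     {"ftp.status": "Initialised", "Report1220.exists": True,
      "gen.status": "Completed"},
     {"ftp.status": "Executing"}),
    ("ftp.success",
     {"ftp.status": "Executing", "Report1220.exists": True},
     {"ftp.status": "Completed", "remoteReport1220.exists": True}),
]

def project(initial_state):
    """Fire each event on the first occurrence of its trigger state."""
    state = dict(initial_state)
    trace = []
    fired = True
    while fired:
        fired = False
        for name, trigger, effect in EVENTS:
            if name not in trace and all(state.get(k) == v
                                         for k, v in trigger.items()):
                state.update(effect)
                trace.append(name)
                fired = True
    return state, trace

initial = {"inputFile.exists": False, "Report1220.exists": False,
           "remoteReport1220.exists": False,
           "gen.status": "Initialised", "ftp.status": "Initialised"}
final, trace = project(initial)
```

Under these assumptions the projection reaches a state in which remoteReport1220.exists is True, with the ftp job starting only after the report job has completed.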

6.2 Planning with Knowledge Goals and Merged Contingencies

This example (on which the planner implementation is currently being evaluated) is from the repair database error scenario previously discussed, where the database error must be determined and the repair action called with the appropriate error. There is no action to directly determine the database internal error condition; instead the only available sensing command is "checkDB ?e", which checks whether the database has a particular error e (a value of 1 or 2). The knowledge acquisition part of this problem is analogous to the standard knowledge planning problem, the safe combination problem (Petrick and Bacchus, 2002).

The event schema definitions for the internal plan variable assignment action assign, the checkDB command and the repairDB action are shown in Tables 7, 8 and 9.

Table 7: Event definition for command "assign ?x ?y".

Trigger conditions   | Effect conditions
---------------------|-------------------------------
status = Initialised | status = Completed ∧ (?x = ?y)

Table 8: Event definitions for command ?result = "checkDB ?e".

Trigger conditions   | Effect conditions
---------------------|------------------------------------------------
status = Initialised | status = Executing
status = Executing   | status = Completed ∧ (?result = (dbState = ?e))

Table 9: Start, Success and Failure event definitions for command "repairDB ?e".

Event   | Trigger conditions                | Effect conditions
--------|-----------------------------------|--------------------------------
Start   | status = Initialised              | status = Executing
Success | status = Executing ∧ dbState = ?e | status = Completed ∧ dbState = 0
Failure | status = Executing ∧ dbState ≠ ?e | status = Completed

In the initial state:

Holds(currentState, dbState = 1|2) (where the "|" signifies that the value held is one of the specified values)

The goal is to repair the database error:

Holds(goalState, dbState = 0)

In order to provide support for this condition, the planner creates a repair job and an epistemic program variable i_dbState which is specified as the parameter for the repair action. It then creates a subgoal that i_dbState = dbState at the time the repair job is run (since the repair action only works if its runtime parameter is equal to the database error).

The planner establishes support for the condition i_dbState = dbState by adding a job with the assignment command "assign i_x i_y" with the substitutions i_x = i_dbState, i_y = 1 and with the additional required assumption that the initial contingency is one where the condition dbState = 1 holds. This proves causal support for i_dbState = dbState, but only on the contingency dbState = 1. It similarly proves causal support from "assign i_dbState 2" on the contingency dbState = 2.

The planner must prevent the threats that these jobs pose to each other's support of i_dbState = dbState. It performs this threat resolution via separation (Pryor and Collins, 1996), whereby a threat by a conflicting action is resolved by ensuring that the threatening action does not occur in the same contingency where the threatened causal support is needed. The planner determines that the "assign i_dbState 1" command must occur on the contingency where dbState = 1 and must not occur on the contingency where dbState ≠ 1.

In order to provide the necessary conditioning for the "assign i_dbState 1" job, the planner introduces contingency control on this action by creating a new boolean plan variable i_dbState_is1 which represents the truth or falsity of the proposition dbState = 1. It adds truth of this planner variable as a start condition to the "assign i_dbState 1" job.

The planner establishes the value of i_dbState_is1 using the sensing action command "i_dbState_is1 = checkDB 1".

To prove ordering of the "repairDB i_dbState" command after the epistemic fluent has been correctly set, the planner adds i_dbState ≠ null to its start conditions to ensure it occurs after i_dbState has been set.

From forwards inferences, the occurrence of i_dbState = dbState in the trigger state for the "repairDB i_dbState" success event is proven on the contingency dbState = 1, and similarly it is also proven to occur on the contingency dbState = 2. Using an inference which combines proven occurrences of the same event across different contingencies, the planner proves the occurrence of i_dbState = dbState in the "repairDB i_dbState" trigger state on all the trajectories of the current state. From this it is able to prove occurrence of the successful repair action and the goal state on the trajectory of the current state.

The ﬁnal plan is:

(name: i_dbState, value: null)
(name: i_dbState_is1, value: null)
(name: i_dbState_is2, value: null)

(name: check1,
 command: i_dbState_is1="checkDB 1",
 status: Initialised, startConditions: )

(name: check2,
 command: i_dbState_is2="checkDB 2",
 status: Initialised, startConditions: )

(name: assign1,
 command: i_dbState=1,
 status: Initialised,
 startConditions: i_dbState_is1=True)

(name: assign2,
 command: i_dbState=2,
 status: Initialised,
 startConditions: i_dbState_is2=True)

(name: repairDB,
 command: "repairDB i_dbState",
 status: Initialised,
 startConditions: i_dbState not null)
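
To make the behaviour of this plan concrete, the following is a minimal simulation sketch in Python. It is not the scheduler or planner described in the paper. It assumes a simple polling executor that runs any Initialised job whose start condition holds, that "checkDB k" returns whether the hidden dbState equals k, and that repairDB succeeds exactly when its argument equals dbState. The `Job` class, the "Success" status value and the `repaired` variable are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Variables = Dict[str, object]


@dataclass
class Job:
    name: str
    start_condition: Callable[[Variables], bool]
    run: Callable[[Variables, int], None]  # (plan variables, hidden dbState)
    status: str = "Initialised"


def build_plan() -> List[Job]:
    def check(k: int) -> Callable[[Variables, int], None]:
        # Sensing action: records whether the hidden dbState equals k.
        def run(v: Variables, db_state: int) -> None:
            v[f"i_dbState_is{k}"] = (db_state == k)
        return run

    def assign(k: int) -> Callable[[Variables, int], None]:
        def run(v: Variables, db_state: int) -> None:
            v["i_dbState"] = k
        return run

    def repair(v: Variables, db_state: int) -> None:
        # Assumed semantics: repair succeeds only with the correct argument.
        v["repaired"] = (v["i_dbState"] == db_state)

    def always(v: Variables) -> bool:
        return True

    return [
        Job("check1", always, check(1)),
        Job("check2", always, check(2)),
        Job("assign1", lambda v: v["i_dbState_is1"] is True, assign(1)),
        Job("assign2", lambda v: v["i_dbState_is2"] is True, assign(2)),
        Job("repairDB", lambda v: v["i_dbState"] is not None, repair),
    ]


def execute(db_state: int) -> Tuple[Variables, List[str]]:
    variables: Variables = {
        "i_dbState": None, "i_dbState_is1": None, "i_dbState_is2": None,
    }
    jobs = build_plan()
    trace: List[str] = []
    progressed = True
    while progressed:  # poll until no further job can start
        progressed = False
        for job in jobs:
            if job.status == "Initialised" and job.start_condition(variables):
                job.run(variables, db_state)
                job.status = "Success"
                trace.append(job.name)
                progressed = True
    return variables, trace
```

Calling `execute(1)` and `execute(2)` runs the same five-job plan under the two contingencies. Under these assumptions only one of the two assign jobs ever starts, and the single repairDB job is shared by both branches, which is the remerging behaviour the representation is designed to express.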

7 CONCLUSIONS AND FUTURE WORK

In this paper we have presented a new contingent plan representation which offers advantages with respect to action monitoring, handling of triggered events and compactness of plan branch representation, and which can handle planning for knowledge goals. We have briefly sketched how a planner is able to reason about this plan representation and how it can generate plans for some key domain scenarios. An implementation has been successfully demonstrated on the first example and is currently being evaluated against the second example. A further evaluation is planned on an example where the values of two independent fluents must be sensed, with the aim of demonstrating that search time scales linearly with the number of sensed fluents (not exponentially, as is the case with a planner which does not allow execution branches to remerge). Future extensions to the planner could introduce other inference rules, for example temporal inferences which reason about time conditions, and specialised inferences which build more complex predesigned plan structures, with the constructed plan validated using the forwards dynamical inferences.
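
As a rough illustration of why branch remerging matters, the sketch below counts plan elements rather than search time, so it does not reproduce the planned evaluation. It assumes, as in the worked example, that each sensed fluent needs one check job and one assign job per possible value plus one job that consumes the result, and compares this with the number of leaves in a branching plan that never remerges. Both function names are hypothetical.

```python
def merged_plan_jobs(num_fluents: int, values_per_fluent: int) -> int:
    """Jobs in a remerging plan: per fluent, one check and one assign job
    for each value, plus one job that consumes the sensed value."""
    return num_fluents * (2 * values_per_fluent + 1)


def tree_plan_leaves(num_fluents: int, values_per_fluent: int) -> int:
    """Leaves of a contingent plan tree that branches on every fluent
    and never remerges its branches."""
    return values_per_fluent ** num_fluents


# One fluent with two values matches the five jobs of the final plan above.
single = merged_plan_jobs(1, 2)

# With ten independent two-valued fluents the difference is already large.
merged_ten = merged_plan_jobs(10, 2)
tree_ten = tree_plan_leaves(10, 2)
```

Under these assumptions the remerging plan grows linearly with the number of sensed fluents, while the non-remerging tree grows exponentially.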

Although created for the computer batch job domain, the representation could be applied in any domain where actions are triggered in response to external events; this includes workflows and event-driven architectures. It is hoped that this representation will undergo further development and application to such domains.


ICAART 2010 - 2nd International Conference on Agents and Artificial Intelligence
