ADAPTIVE EXECUTION OF SOFTWARE SYSTEMS ON PARALLEL

MULTICORE ARCHITECTURES

Thomas Rauber

Department of Computer Science, University of Bayreuth, Bayreuth, Germany

Gudula R¨unger

Department of Computer Science, Chemnitz University of Technology, Chemnitz, Germany

Keywords:

Parallel execution, Multicore systems, Incremental transformation, Transformation toolset, Dynamic schedul-

ing.

Abstract:

Software systems are often implemented based on a sequential ﬂow of control. However, new developments

in hardware towards explicit parallelism within a single processor chip require a change at software level

to participate in the tremendous performance improvements provided by hardware. Parallel programming

techniques and efﬁcient parallel execution schemes that assign parallel program parts to cores of the target

machine for execution are required.

In this article, we propose a design approach for generating parallel software for existing business software

systems. The parallel software is structured such that it enables a parallel execution of tasks of the software

system based on different execution scenarios. The internal logical structure of the software system is used

to create software incarnations in a ﬂexible way. The transformation process is supported by a transformation

toolset which preserves correctness and functionality.

1 INTRODUCTION

Many advances in software technology and business

computing are enabled by a steady increase in mi-

croprocessor performance and manufacturing tech-

nology. The performance increase will continue dur-

ing the next years (Kuck, 2005; Koch, 2005). How-

ever, technological constraints have forced hardware

manufacturersto consider multicore design to provide

further increasing performance. For these multicore

designs, multiple simple CPU cores are used on the

same processor die instead of a single complex CPU

core. It is expected that within a few years, a typical

desktop processor provides tens or hundreds of exe-

cution cores which may be conﬁgured according to

the needs of a speciﬁc application area (Kuck, 2005).

The design change towards multicore processors

requires a fundamental change in software develop-

ment, since the computing power of the new proces-

sors can only be utilized efﬁciently if the application

program provides coordination structures which en-

able a mapping of different execution threads to dif-

ferent cores. A large variety of parallel programming

languages, scheduling algorithms, and programming

techniques have already been proposed, but they often

require the speciﬁcation of low-level synchronization

and usually parallelization operations that are error-

prone and require a lot of programming experience

to get correct and efﬁcient programs. It is often ar-

gued that more abstract features need to be developed

and integrated into programming languages and sys-

tems to facilitate the use of parallelism (Sutter and

Larus, 2005; Sutter, 2005). This is also important for

business software and the new development towards

multicore provides new challenges and opportunities.

In particular, the use of parallelism allows the inte-

gration of new functionalities, e.g., by running useful

tasks continuously, like an automatic backup utility or

statistics, offering more potential for real-time infor-

mation on demand (Reinders, 2006).

The contribution of this paper is to propose an ex-

ecution environment for business software which en-

ables a parallel execution on arbitrary multicore plat-

forms. The approach decouples the parallel execution

from the speciﬁcation of the business logics of the

software systems, so that the programmer can con-

191

Rauber T. and Rünger G. (2010).

ADAPTIVE EXECUTION OF SOFTWARE SYSTEMS ON PARALLEL MULTICORE ARCHITECTURES.

In Proceedings of the 12th International Conference on Enterprise Information Systems - Information Systems Analysis and Speciﬁcation, pages

191-198

DOI: 10.5220/0002894901910198

 SciTePress

centrate on the requirements of the software system.

A runtime system performs the actual mapping of the

software computations to the execution platform. For

a different platform, a different mapping may lead to

the best execution. The speciﬁc approach comprises

the following novel issues:

• We propose to structure business programsas a set

of cooperating tasks that are brought to execution

by a specialized runtime system.

• We propose a programming technique to imple-

ment such programs and present a scheduling

technique that assigns parallel computations to

cores at runtime.

• We propose an interactive transformation process

organized in several steps, which enables interac-

tive decisions for forming and orchestrating the

tasks. The transformation process is supported by

a transformation toolset.

In summary, a ﬂexible and adaptive execution

scheme results, which can deal with dynamic, adap-

tive, and long-running software requirements, cap-

turing typical characteristics of software systems for

business computing.

The rest of the paper is organized as follows: Sec-

tion 2 presents the execution environment proposed

for executing business software systems. Section

3 presents the priority-based scheduling algorithm.

Section 4 outlines the transformation process and the

architecture of the transformation toolset. Section 5

discusses related work. Section 6 concludes.

2 EXECUTION ENVIRONMENT

In this section, we propose an execution environment

for an adaptive execution of business software sys-

tems on multicore architectures.

2.1 Overview

The execution environment is based on a decomposi-

tion of the computations of the software system into

tasks. This decomposition can be supported by a

transformation system as outlined in Section 4. In

the following, we assume that the decomposition into

tasks is available. Each task has a unique identiﬁer

(TID - task identiﬁer) and speciﬁes the computations

to be performed. Moreover, each task provides an in-

terface specifying the input data of the task as well as

the output data which results by executing the task.

Input and output data may cause dependencies be-

tween tasks: if a task B uses data computed by task A,

then there is a dependence A → B. In this case, B can-

not be executed before the executionof A has been ﬁn-

ished. Dependencies between tasks can be illustrated

by a task dependence graph (TDG). The nodes of the

TDG are the tasks executed by the business software

system. The edges of the TDG represent the depen-

dencies between the tasks.

During the execution of the software system, tasks

can be dynamically created. This can happen accord-

ing to an interactive request by the user of the soft-

ware system. New tasks can also be created during

the execution of other tasks according to the needs

of the computations requested. Thus, the task struc-

ture is not ﬁxed when starting the software system,

but evolves dynamically, and so does the TDG.

As described above, dependencies in the TDG re-

strict the execution order of tasks, requiring a sequen-

tial execution in the presence of a dependency. On

the other hand, two or more tasks can be executed in

parallel if there are no dependencies between them.

This gives room for an efﬁcient use of multiple cores

as they are provided by current and future multicore

processors. In particular, the execution environment

uses a runtime system with several components. The

central component is a task distribution engine (TDE)

which controls the dynamic deployment of tasks to

cores for execution. The tasks that are ready for ex-

ecution are stored in a special data structure from

which they are retrieved by the TDE, see Fig. 1 for

an illustration. During the execution of a task, new

tasks may be generated. If these are ready for exe-

cution, they are inserted into the set of ready tasks.

If newly created tasks are not ready for execution be-

cause, e.g., they must wait for an external event or

for the arrival of data, they are inserted into a pool of

pre-tasks. They are moved to the set of ready tasks as

soon as all their requirements are fulﬁlled.

insert

ready tasks

task distribution

engine

retrieves

core

. . .

core

deployment

. . .

Figure 1: Illustration of the functionality of the task distri-

bution engine.

3 ADAPTIVE SCHEDULING

In this section, more details about the scheduling of

tasks by the TDE are described. In particular, the use

of the priority vector is discussed and two scheduling

ICEIS 2010 - 12th International Conference on Enterprise Information Systems

192

modes for the TDE and their corresponding schedul-

ing algorithms are presented.

3.1 Priority Vector and Task Status

The scheduling of tasks includes decision rules for the

assignment of tasks to speciﬁc cores at a speciﬁc time.

The decisions are usually based on a cost measure ap-

propriate for the speciﬁc situation.

In this article, we consider long-running applica-

tion programs with a dynamic creation of tasks during

runtime. For this kind of application, we propose a

new cost measure. The cost of a task M is captured in

a priority vector

pr(M) = (RD(M),PR(M), HR(M))

comprising informationabout the status RD(M) of the

task, a priority information PR(M), and information

about hardware restrictions HR(M) for the execution

of M. In the following, we describe which type of

information can be captured in the priority vector.

Ready States. A task can be in one of two possi-

ble states: If all constraints and requirements are ful-

ﬁlled, a task is ready to be executed (i.e., RD(M) = 1)

and can be selected by the scheduling algorithm for

an actual execution. If there are constraints or re-

quirements which are not fulﬁlled, a task is not ready

for execution (i.e., RD(M) = 0) and is currently not

considered by the scheduling algorithm. Tasks can

change their ready state during the runtime of the ap-

plication. When a task M is created, its state is set to

RD(M) = 0. This is done for the tasks started at the

beginning as well as for the tasks being created during

runtime by another task. Although these newly cre-

ated tasks are spawned from their parent task, there

might be constraints to be fulﬁlled by other tasks.

These constraints or requirements might be caused by

data to be produced by other program parts and which

are required for a correct execution or by speciﬁc or-

der of task execution in which other tasks have to be

executed before the newly created tasks.

The state of the task M is set to RD(M) = 1, when

all constraints and requirements become true in the

course of the execution. The adaptation of the status

is done dynamically by the TDE, which considers the

tasks of the application as a dynamic graph data struc-

ture Dep with tasks as nodes and edges representing

constraints between tasks. This data structure is im-

plicit and dynamically changing, since new tasks are

created and the entire graph structure is known only

when the execution of the application is ﬁnished. The

dynamic graph structure is usually different from the

task creation treeCre in which a task is a parent of all

the tasks it has created. The difference of the dynamic

graph Dep from the creation tree Cre reﬂects the fact

that there can be dependencies to other tasks than the

parent task. In this paper, we restrict the graph Dep

to constraints between fully executed tasks and tasks

to be executed next, i.e., dependencies between a run-

ning task and a ready task do not exist. The ready state

of the tasks is updated regularly by the TDE, chang-

ing the set of ready tasks RT, and the scheduling al-

gorithm can access the current set RT.

Priority Values. The second entry in the priority

vector pr(M) is the priority value PR(M), which can

be used to inﬂuence the execution order of ready

tasks. The scheduling algorithm described in the fol-

lowing subsection selects the task to be executed next

according to this priority information and, thus, the

priority information is an important decision base in

cases where more ready tasks exist than the cores can

execute next. The priority value can capture different

types of priorities, depending on the situation which

is reasonable for a speciﬁc software.

Priority values can be chosen according to exe-

cution properties, like the expected execution time

or critical path information, as well as according to

business logics aspects, like the importance of a task.

When the execution time is considered, there are sev-

eral possibilities to set priorities. First, the priority

value is high when the execution time is high, so

that expensive tasks are executed as early as possible.

Also, the opposite can be useful, so that cheap tasks

(with high priority values in this case) are executed

ﬁrst and, thus, the set of ready tasks remains smaller.

The critical path is a reasonable priority base when

the overall execution time of the application is impor-

tant and should be minimized. The critical path is the

longest path from the root task of the graph structure

Dep to a leaf. The tasks on a critical path have to

be executed one after another. For a low execution

time, it is reasonable to execute the tasks of the crit-

ical path as soon as they are ready and before other

tasks. The priority values based on the importance

of the task can be set by the application program. A

very important task has a high value, and other tasks

get lower values down to very minor tasks with the

lowest value. In all cases, the priority value is a pos-

itive natural value greater than one, which abstracts

from the way of choosing the values. The scheduling

algorithm works only with these values.

We assume priority values which can be changed

as long as the corresponding task is a ready task. With

varying priority values, more ﬂexible priorities can be

chosen. For example, tasks which should be executed

after a ﬁxed time interval need a high priority value

ADAPTIVE EXECUTION OF SOFTWARE SYSTEMS ON PARALLEL MULTICORE ARCHITECTURES

193

as soon as the task has to be executed. Examples are

statistical evaluations which should be executed from

time to time or ﬁnal accounts, which are to be exe-

cuted every month or year. Flexible priority values

are also important for fairness purposes. A fair exe-

cution of tasks means that each task should ﬁnally be

executed. In principle, it can happen that a task with

low priority is never executed when enough tasks with

higher priority are constantly created, so that there are

not enough cores left for the execution of the low-

priority tasks. In those cases, it can be useful to have

a mechanism to raise a priority value after a certain

period of time elapsed since the task became ready.

Hardware Restrictions. The hardware restriction

information of a task is related to its internal paral-

lelism. To be executed in parallel, a task is required

to be implemented in a multi-threaded way, so that it

can be executed by more than one core. For the task

administration and scheduler, the amount of potential

parallelism is important and is captured in the hard-

ware restriction number.

When a task can be executed only sequentially due

to its internal implementation, the hardware restric-

tion number is set to 1. When a task can be exe-

cuted in parallel, the hardware restriction number is

set to the number of cores which should be exploited

at most, i.e., the number of cores executing task M

will be smaller or equal to HR(M). The scheduling al-

gorithm can assign as many cores to that task as avail-

able. This guarantees a ﬂexible use of the cores and

an adaptive scheduling of the tasks. The highest pos-

sible number for HR(M) is the total number of cores

available for the speciﬁc application on the execution

platform. However, it is reasonable to set lower HR-

values. First, tasks with maximum HR-value will be

problematic to schedule to free cores. Second, there

is often an optimal number p

opt

of cores for a task

which results in the lowest execution time and the use

of more than p

opt

cores would increase the execution

time of that task and would waste hardware resources.

3.2 Scheduling Algorithm for

Single-task Deployment

Task scheduling is based on the set of ready tasks RT

that is maintained for each time of execution and that

is updated if new tasks become ready for execution

according to their ready state or if tasks are retrieved

from RT for deployment to cores. For the TDE, two

different modes of operation are available: single-task

deployment and multi-task deployment.

For single-task deployment, tasks are taken from

the set of ready tasks RT one at a time and are as-

Algorithm 1: Scheduling for single-task de-

ployment mode of the TDE.

begin

sort set RT according to priority values;

foreach (execution step of the TDE) do

while (p ≥ 1 cores are idle) do

select one task M from RT

with highest priority;

assign M to q = min(p, HR(M))

cores;

p = p− q;

if (M creates new task M

′

) then

insert M

′

into RT and re-sort RT;

wait for some cores to become idle;

signed to idle cores as long as there are cores avail-

able, see Alg. 1 for an overview of the underlying

scheduling algorithm. In particular, the TDE waits

for one or multiple cores to become idle after hav-

ing ﬁnished the execution of their current task. As

soon as this happens, the TDE takes the next task M

with the highest priority from RT and assigns it to as

many cores as possible for execution, taking a pos-

sible hardware restriction HR(M) into account. The

deployment of tasks to cores continues until no idle

cores are available any more. If this happens, the TDE

stops deployment and waits for other cores to become

idle. During the execution of a task, new tasks may be

created or may get ready for execution. If so, they are

inserted into RT and RT is kept sorted according to

the task priority values. Thus, in single-task deploy-

ment mode, the TDE tries to assign as many cores as

possible to the next task with highest priority. This

behavior can be changed by switching to multi-task

deployment mode.

3.3 Scheduling Algorithm for

Multi-task Deployment

In multi-task deployment mode, the TDE selects not

only one but q > 1 tasks for deployment as soon as

some cores become idle, see Alg. 2 for an overview.

In practice, q can be ﬁxed, but it may also be selected

such that it depends on the number of cores becoming

idle in the current steps. The TDE tries to schedule the

q tasks retrieved from RT such that the resulting exe-

cution time is minimized. To perform the deployment,

the TDE uses an estimated execution time T(M, p) for

task M on p cores. Before ﬁxing the scheduling, the

TDE compares different possibilities for the task ar-

rangement. In particular, the TDE tries to arrange the

p cores that are currently available into g < p groups

ICEIS 2010 - 12th International Conference on Enterprise Information Systems

194

Algorithm 2: Scheduling for multi-task deploy-

ment mode of the TDE.

begin

foreach (execution step of the TDE) do

let P be the set of idle cores, p = |P|;

select q tasks {M

,... ,M

} from RT

with highest priority;

min

∑

i=1

T(M

, p); best = p;

foreach (g ∈ (set of divisors of p)) do

partition P into g subsets G

,... ,G

of size p

= p/g;

sort {M

,... ,M

} such that

T(M

, p

) ≥ ... ≥ T(M

, p

);

for (j = 1, . ..,q) do

assign M

to G

with smallest

acc. exec. time T

acc

);

act

(g) = max

1≤ j≤g

acc

);

if (T

act

(g) < T

min

) then

min

= T

act

(g); best = g;

execute M

,... ,M

on g equal-sized

sets of cores;

if (M

creates new tasks) then

insert these tasks into RT;

adapt priority vectors and re-sort RT;

of equal size p

= p/g and then assigns the tasks to

these groups in decreasing order of their estimated ex-

ecution time. If p

> HR(M) for a task M, we assume

T(M, p

) = ∞. Therefore, this group arrangement is

not competitive for the ﬁnal deployment selection.

The resulting overallexecution time T

act

(g) for the

current step is determined by the accumulated execu-

tion time of the slowest group. The scheduling algo-

rithm determines the group arrangement which leads

to the smallest overall execution time T

min

by inves-

tigating all useful number of groups and uses this ar-

rangement for the actual task deployment. For the ﬁ-

nal deployment of the groups, the priority values can

be considered again, and the tasks assigned to one

group can be deployed in the order of decreasing pri-

ority values. The advantageof this version of the TDE

is that it does not only use the priorities of the tasks,

but is also takes the actual execution times into con-

sideration and tries to minimize the overall execution

time. In contrast to the single-task mode, this mode

may also assign less than the maximum number of

cores to a task M, as speciﬁed by HD(M), if this is

beneﬁcial.

4 TRANSFORMATION

APPROACH

In this section, we show how the generation of a task-

based version of a business software system can be

integrated into an interactive transformation frame-

work. The framework has been originally designed

for the generation of client-server programs (Rauber

and R¨unger, 2007; Hunold et al., 2009), but it can

be extended so that the single components of the dis-

tributed system can executed in a task-based way on

different cores of a multicore system. This is espe-

cially useful for server components that must yield a

large throughput of requests.

4.1 Requirements and Design Goals

The transformation of an existing software system

into a software system that can run efﬁciently on mod-

ern multicore architectures can be done by an interac-

tive process which is organized according to the spe-

ciﬁc requirements. The main requirements are to keep

the business logics of the given software system, but

to increase the ﬂexibility such that an automatic adap-

tion of the execution to a given hardware platform is

obtained. In particular, the following main require-

ments can be identiﬁed:

• Hardware Flexibility. The resulting software

system should be executable on different multi-

core architectures in an efﬁcient way;

• Distributed Interaction. The resulting software

system should be easy to integrate into a dis-

tributed system that uses remote methods for co-

ordination and data exchange;

• Software Flexibility. The resulting software sys-

tem should be ﬂexible such that business software

of a speciﬁc enterprise can be easily extended by

providing additional functionality. Such exten-

sions are useful since more hardware resources

are available to execute additional computations

such that an additional beneﬁt for users results.

• Efﬁciency. The resulting software system should

be efﬁcient in the sense that the resources of the

multicore hardware are efﬁciently used;

• Scalability. The resulting software system should

be scalable in the sense that additional hardware

resources can be efﬁciently used and that larger

and more data sets can be added without leading

to signiﬁcant performance degradations, if at the

same time more hardware resources are added.

To meet these requirements, we propose an incre-

mental transformation process. Important aspects of

ADAPTIVE EXECUTION OF SOFTWARE SYSTEMS ON PARALLEL MULTICORE ARCHITECTURES

195

Explicit

Workflow

Software Software

with as

configurable

Subsystem Subsystem

Distributed

SystemWorkflow

Software

Subsystem

Distributed

Software

guided by

explicit

Workflow

Software

Parallel

execution

parallel

locally

Software

Distributed

FSP

Flexible Software Package

FSR

Flexible Software Representation

Figure 2: Transition diagram for Flexible Software Repre-

sentation (FSR).

the transformation process is the use of an interme-

diate representation on which the transformations are

performed. This representation is given as auxiliary

program structure. The representation is language

independent and appropriate for the transformation

goals. The auxiliary program structure is a hierarchi-

cal structure which captures the static software struc-

ture. The highest level of the hierarchical structure

is the coordination structure. This level calls spe-

ciﬁc modules which encapsulate the original func-

tionality and which can exhibit a further hierarchical

structure. The hierarchical program structure is the

basis for a module structure which decomposes the

given monolithic program code. This module struc-

ture is exploited (i) to create a ﬂexible program struc-

ture which can be used for enterprise speciﬁc soft-

ware (software ﬂexibility) and (ii) to decide about ex-

plicit parallel execution (hardware ﬂexibility). Start-

ing with this program representation an incremental

transformation process transforms the given software

system into an adaptive parallel system. Interactive

decisions guide the transformation process.

To support the transformation process we pro-

pose a transformation toolset to interactively trans-

form software. The next section describes this toolset

in more detail. The transformation process starts with

the input program and the speciﬁcation of the module

structure, which is essential and has to be provided by

software experts, e.g., by using a clustering method

and interactive design. Software experts are also re-

sponsible for making the business processes explicit

by using a workﬂow description.

4.2 Transformation System

The transformation towards a parallel execution is in-

tegrated into a transformation system that we have

proposed and developed to increase the modularity of

monolithic software systems, to extract the business

logics in form of an explicit workﬂow, and to parti-

tion the system into an explicit structure of cooperat-

ing components (Rauber and R¨unger, 2007; Hunold

et al., 2009). The extension addresses the different

components extracted and partitions them further into

interacting modules which can then be executed by

the TDE as tasks. The transformation is based on a

Flexible Software Representation (FSR) which now

contains also the aspect of a parallel execution, see

Fig. 2. In the following, we are particularly inter-

ested in the combination of a distributed execution

(DS) with an adaptive parallel execution (PE) for the

single components.

The transformation of the FSR into a combination

of DS and PE is performed on the intermediate rep-

resentation, which consists of an upper and a lower

level. The upper level captures the cooperation and

coordination of software parts which are then mapped

as components to different sites of a distributed sys-

tem. The lower level captures the partitioning of the

computations of the different components into tasks

based on the assumption that each site of the dis-

tributed system provides multiple cores for execution.

Both levels together describe the functionality of the

resulting software system in an intermediate format.

Additional static software components are needed to

make the software system executable. In particular,

the TDE is needed to ensure a parallel execution of a

single component. Moreover, coordination and com-

munication services are needed for a correct interac-

tion between the distributed components.

4.3 Transformation Decisions

The entire transformation is organized in an incre-

mental transformation process including (interactive)

transformation decisions. Figure 3 illustrates the

coarse structure of the transformation process and

the decision tree. One path along the transformation

direction corresponds to one speciﬁc transformation

process producing one business software system us-

ing a speciﬁc software mode.

The upper level is based on an extraction of the

logical structure from the business software system

which is transformed into the modular structure in

the intermediate representation. This is the basis for

creating the coordination structure and a coordina-

tion program for the orchestration of the modules of

the distributed system. These are obtained from code

fragments from the original software system and are

converted into components which can communicate

with other components over predeﬁned interfaces us-

ing additional component services. The actual dis-

tributed execution is administrated by a distributed

runtime system.

ICEIS 2010 - 12th International Conference on Enterprise Information Systems

196

Specification

Level

Intermediate

Represenation

Execution

Modular

StructureStructure

Modules

Logical

Program

Services

Components

Business

Software

System

Coordination

Set of

filterselect

transform transform

createcreate

include

FSR

Coordination

Structure

Task

dependencies

tasks

Compontents/

Modules

Distributed Parallel Distributed

System Execution System

locally parallel

Execution

Distributed

Runtime System

Parallel

Runtime Library

Distributed

Runtime System

include

DS+PE

Code

Fragments

Figure 3: Transformation decision and transformation process to generate a distributed system and a parallel execution.

The lower level is based on an interactive identi-

ﬁcation of tasks that are extracted from the modules

constructed by the upper level. Based on their data

access pattern, tasks may have dependencies that can

be captured by a task dependency graph. The creation

of tasks by other tasks may also lead to dependencies.

The actual parallel execution of a component and the

adaptation to a speciﬁc execution environment is con-

trolled by a parallel runtime library which brings the

tasks to execution. The TDE is an important part of

this runtime library.

Fig. 4 gives an overview of a distributed execu-

tion of the resulting software system. The distributed

execution is controlled by a coordination component

which orchestrates the execution of the different com-

ponents on one site. At each site involved in the dis-

tribution execution, such a coordination component is

used. All data accesses of the components are per-

formed via the coordination layer which transfers the

accesses to the corresponding services provided by

the framework. Remote execution of components is

done via the communication service which can be per-

formed in different ways, including Enterprise Java

Beans (EJB). The parallel execution of one compo-

nent is hidden within the single components.

5 RELATED WORK

The transformation of software systems has been con-

sidered by many research groups. Most approaches

concentrate on the transformation into modular or

object-oriented systems or on the extraction of the

business logics. New approaches also consider dis-

tributed solutions, e.g. by providing middleware

component component

database

local

...

component

coordination

coordination layer

controls

remote access

executor

remote

communication

service

security consistency

service

network

access

distributed software DS

access

local

access

Figure 4: Distributed software system DS generated by the

transformation process working with a coordination com-

ponent.

solutions for data integration (Akers et al., 2004;

Menkhaus and Frei, 2004). For distributed systems,

performance aspects play an important role (Litoiu,

2004), since additional latency and transfer times may

be necessary, e.g., when replacing legacy software by

web services using protocols like SOAP (Simple Ob-

ject Access Protocol) (Brunner and Weber, 2002). For

the transformation of software systems, approaches

like DMS (Baxter et al., 2004) have been developed

and have been applied to large software systems (Ak-

ers et al., 2004). The use of automatic program trans-

formations is considered in (Akers et al., 2007).

The parallel execution of software systems on

multicore architectures using the SOA (Service-

Oriented Architecture) approach is considered in

(Isaacson, 2009). Multi-threaded programming lan-

guages with direct support for a parallel execution in-

clude Java, Cray’s Chapel, Sun’s Fortress, and IBM’s

X10. But these approaches require a re-formulation

of existing software systems, which usually requires

a re-formulation of large parts of the code.

ADAPTIVE EXECUTION OF SOFTWARE SYSTEMS ON PARALLEL MULTICORE ARCHITECTURES

197

Although there are many different approaches,

there exists no generally accepted method for the in-

cremental transformation of software systems. An

important reason for this lies in the fact that there

are many different distributed platforms like CORBA

or EJB that cannot be combined in an arbitrary way.

Therefore, a distributed realization often requires a

new implementation of the business logics.

The Model Driven Architecture (MDA) approach

(Siegel, 2005) addresses this problem and uses a

model-based approach for the step-wise generation of

distributed, component-based software. The model-

driven development od parallel software for multicore

in the area of embedded systems is considered in (Hsi-

ung, P. et al, 2009). Support for a simpliﬁcation of

the transition to parallel software is collected by the

COMPASS project (Sethumadhavan et al., 2009).

6 CONCLUSIONS

The portability and efﬁcient execution on multicore

architectures will be an important property of all soft-

ware products, including business software. In this

article, we have proposed a hybrid task-based paral-

lel programming model in which a software system is

decomposed into tasks, which may or may not be exe-

cuted in parallel to each other and additionally have an

internal multi-threadedimplementation. The software

system can exhibit a dynamic behavior such that new

tasks can be activated during the execution of another

task. The correct and efﬁcient execution on a mul-

ticore platform is supported by a task administration

and a scheduler at application program level. Both are

integrated into a separate runtime library which sup-

ports the execution of arbitrary task-based software

systems.

In summary, we have proposed a new hybrid

parallel programming environment which is suitable

for dynamic, long-running business software sys-

tems. A software system can be newly designed for

the proposed program environment. In addition, we

have proposed a transformation mechanism to mi-

grate legacy software into the new execution model.

REFERENCES

Akers, R., Baxter, I., and Mehlich, M. (2004). Re-

Engineering C++ Components Via Automatic Pro-

gram Transformation. In Proc. of ACM Symposium on

Partial Evaluation and Program Manipulation, pages

51–55. ACM Press.

Akers, R., Baxter, I., Mehlich, M., Ellis, B., and Luecke, K.

(2007). Case study: Re-engineering c++ component

models via automatic program transformation. Inf.

Softw. Technol., 49(3):275–291.

Baxter, I., Pidgeon, C., and Mehlich, M. (2004). DMS: Pro-

gram Transformations for Practical Scalable Software

Evolution. In Proc. of the 26th Int. Conf. on Software

Engineering, pages 625–634. IEEE Press.

Brunner, R. and Weber, J. (2002). Java Web Services. Pren-

tice Hall.

Hsiung, P. et al (2009). Model-driven development of multi-

core embedded software. In IWMSE ’09: Proc. of

the 2009 ICSE Workshop on Multicore Software Engi-

neering, pages 9–16. IEEE Computer Society.

Hunold, S., Krellner, B., Rauber, T., Reichel, T., and

R¨unger, G. (2009). Pattern-based Refactoring of

Legacy Software Systems. In Proc. of the 11th

Int. Conf. on Enterprise Information Systems (ICEIS),

pages 78–89. Springer.

Isaacson, C. (2009). Software Pipelines and SOA: Releasing

the Power of Multi-Core Processing. Addison-Wesley

Professional.

Koch, G. (2005). Discovering Multi-Core:Extending the

Beneﬁts of Moore’s Law. Intel White Paper, Technol-

ogy@Intel Magazine.

Kuck, D. (2005). Platform 2015 Software-Enabling Inno-

vation in Parallelism for the next Decade. Intel White

Paper, TechnologyIntel Magazine.

Litoiu, M. (2004). Migrating to Web Services: a perfor-

mance engineering approach. Journal of Software

Maintenance and Evolution: Research and Practice,

16:51–70.

Menkhaus, G. and Frei, U. (2004). Legacy System Integra-

tion using a Grammar-based Transformation System.

CIT - Journal of Computing and Information Technol-

ogy, 12(2):95 – 102.

Rauber, T. and R¨unger, G. (2007). Transformation of

Legacy Business Software into Client-Server Archi-

tectures. In Proc. of the 9th Int. Conf. on Enterprise

Information Systems, pages 36–43. INSTICC.

Reinders, J. (2006). Sea Change in the Software World.

Intel Software Insight, pages 3–8.

Sethumadhavan, S., Arora, N., Ganapathi, R., Demme, J.,

and Kaiser, G. (2009). COMPASS: A Community-

driven Parallelization Advisor for Sequential Soft-

ware. In Proc. of the 2009 ICSE Workshop on Mul-

ticore Software Engineering, pages 41–48. IEEE.

Siegel, J. (2005). Why use the Model Driven Architecture to

Design and Build DistributedApplications. In Proc. of

Int.Conf.Software Engineering, page 37. ACM Press.

Sutter, H. (2005). The free lunch is over – a fundamental

turn toward concurrency in software. Dr.Dobb’s Jour-

nal, 30(3).

Sutter, H. and Larus, J. (2005). Software and the Concur-

rency Revolution. ACM Queue, 3(7):54–62.

ICEIS 2010 - 12th International Conference on Enterprise Information Systems

198