Hierarchical Decomposition Framework for Steiner Tree Packing Problem

Hanbum Ko¹, Minu Kim², Han-Seul Jeong³, Sunghoon Hong³, Deunsol Yoon³, Youngjoon Park³, Woohyung Lim³, Honglak Lee³, Moontae Lee³, Kanghoon Lee³, Sungbin Lim⁴ and Sungryull Sohn³

¹ Department of Artificial Intelligence, Korea University, Korea
² Kim Jaechul Graduate School of AI, KAIST, Korea
³ LG AI Research, U.S.A.
⁴ Department of Statistics, Korea University, Korea

{hanseul.jeong, sunghoon.hong, dsyoon, yj.park, w.lim}@lgresearch.ai,

This work was conducted during an internship at LG AI Research.
Keywords:
Hierarchical Reinforcement Learning, Neural Combinatorial Optimization, Steiner Tree Packing Problem.
Abstract:
In this paper, we address the complex combinatorial optimization (CO) challenge of efficiently connecting
objects at minimal cost, specifically within the context of the Steiner Tree Packing Problem (STPP). Tradi-
tional methods often involve subdividing the problem into multiple Steiner Tree Problems (STP) and solving
them in parallel. However, this approach can fail to provide feasible solutions. To overcome this limitation,
we introduce a novel hierarchical combinatorial optimizer (HCO) that applies an iterative process of dividing
and solving sub-problems. HCO reduces the search space and boosts the chances of getting feasible solutions.
This paper proposes, for the first time, a learning-based approach to STPP, introducing the iterative decomposition method HCO. Our experiments demonstrate that HCO outperforms existing learning-based
methods in terms of feasibility and solution quality, while showing better training efficiency and generalization performance.
1 INTRODUCTION
The challenge of optimizing connections at minimal
cost is crucial across various domains. In logistics and
transportation networks, efficient route design is cru-
cial for connecting warehouses, markets, and facto-
ries. Urban planning requires the cost-effective link-
age of infrastructure, while chip design demands the
minimal-cost connection of electronic components to
enhance data movement and signal stability. De-
termining the optimal configuration from numerous
possibilities is a complex combinatorial optimization
(CO) problem, often referred to as the Steiner Tree
Packing Problem (STPP). STPP aims to identify a set
of minimum-cost edges that connect specific terminal
nodes without overlap, and it can be decomposed into
simpler Steiner Tree Problems (STP) (Karp, 1972),
where the objective is to connect all given terminal
nodes with the least number of edges.
The approach of dividing a CO problem into
multiple sub-problems and solving them in parallel
(completing the entire division before solving any
sub-problem) has seen numerous attempts in various
problems such as the vehicle routing problem (VRP)
and the traveling salesman problem (TSP) (Nowak-Vila
et al., 2018; Hou et al., 2022; Fu et al., 2021; Ye et al., 2024). This
separation of concerns, which involves splitting the
model into one that only divides the problem and
another that solves the divided problems, enhances
time efficiency by solving the problems in parallel.
The STPP is another CO problem that can be divided
into multiple sub-problems (i.e., STP). However, this
method significantly increases the risk of not obtain-
ing a feasible solution. To address this disadvan-
tage, we propose a hierarchical combinatorial opti-
mizer (HCO) that iteratively processes the dividing
and solving stages. An example of this is shown in
Figure 1. The Two-stage Divide Method (TAM) (Hou
et al., 2022) involves dividing all sub-problems before
solving any of them, whereas the HCO represents a
case where the process of dividing and solving sub-
problems is done iteratively. In the leftmost graph,
which is an STPP instance, circular nodes represent
normal nodes, while square nodes represent terminal
nodes. The different colors of the square nodes in-
dicate different types of terminal nodes, and the ob-
jective of STPP is to connect terminal nodes of the
same type with minimal cost. The main issue with ap-
plying the TAM to solve the STPP is that it assumes
all nodes of a specific color (apricot, in this case) are
selected before choosing the next sub-problem to di-
vide. This leads to a scenario where green nodes may
not have the opportunity to connect with the only fea-
sible node. Conversely, the HCO, by re-evaluating the
solution of the divided STP problem, creates the pos-
sibility to connect the green terminal nodes.
We refer to problems where a feasible solution
may not emerge as feasibility-hard problems. In this
study, we propose the HCO method to more effec-
tively address these challenges. Additionally, we
mathematically demonstrate that HCO can reduce the
search space through latent mapping to a smaller so-
lution space. We argue that HCO has the following
advantages: First, as observed in the examples in Fig-
ure 1, it can more effectively find feasible solutions.
Second, by reducing the search space, it alleviates
the burden on models that must consider both the ob-
jective and feasibility of the CO problem simultane-
ously. Third, it benefits from time efficiency by utiliz-
ing accurate and fast models that solve sub-problems
well. Lastly, the model that divides a large problem
into smaller ones reduces the distribution shift
issue by partitioning the graph. This ensures that
the low-level solver only sees STP sub-problems of
similar size, thus offering advantages in generalization.
In this paper, we detail the Markov decision
process (MDP) formulation for finding solutions to CO
problems and introduce the application of this formulation
to the STPP. We then describe how we apply
this formulation to solve problems using a hierarchical
policy involving high-level and low-level policies.
Our experiments demonstrate that the HCO, defined
in this manner, more successfully finds feasible solutions
and produces higher-quality solutions compared
to the method of dividing and solving the problem
at once (TAM) and methods that solve STPP
at once without hierarchical decomposition (Khalil
et al., 2017; Kool et al., 2019).
Our contributions are summarized as follows:
• To our knowledge, this is the first work to solve the Steiner tree packing problem using an end-to-end learning framework.
• We propose a novel decomposition approach for general CO problems that results in sub-problems with smaller search spaces.
• We demonstrate that HCO is advantageous over other methods in terms of finding feasible solutions and the optimality gap of the found solutions. Additionally, HCO exhibits the ability to generalize to larger problem sizes.
2 RELATED WORKS
Combinatorial Optimization with Feasibility-
Hard Constraints. There have been few attempts
to directly tackle CO problems with feasibility-hard
constraints using RL. Ma et al. (2021) proposed
learning two separate RL models, with each model
respectively solving the constraint satisfaction and
objective optimization problems. Cappart et al.
(2021) manually shaped the reward to bias the RL
process toward predicting feasible solutions and
combined this approach with constraint programming
methods to guarantee the feasibility of the solutions.
Our work indirectly tackles the feasibility-hard
constraint by decomposing the given constraint
satisfaction problem into two easier sub-problems
with smaller problem size and search space, allowing
the learning algorithm to efficiently solve each
sub-problem.
Decomposition of Combinatorial Optimization
Problem. Several decomposition methodologies
have been introduced to address a variety of CO chal-
lenges. Nowak-Vila et al. (2018) introduced a Divide-
and-Conquer (DnC) framework focusing on scale-
invariant problems, which assumes the problem can
be split into sub-problems, solved independently, and
subsequently merged to form a full solution. Hou
et al. (2022), Fu et al. (2021) and Ye et al. (2024) proposed
domain-specific approaches to divide problems and merge
partial solutions (via heatmaps and MCTS) for VRP
and TSP, respectively.
Figure 1: Comparison of two decomposition methods in the Steiner tree packing problem (STPP). (Top) TAM divides the
problem into subproblems all at once, allowing overlapping vertices between them. The entire solution may be infeasible if
the solutions of subproblems overlap. (Bottom) HCO, on the other hand, solves the problem iteratively by: 1) selecting a
sub-graph to define a subproblem, 2) solving the identified subproblem, 3) reflecting its solution, and repeating this process
until the entire problem is solved. Since HCO defines each subproblem by considering all previous solutions, it can more
effectively find a feasible solution compared to TAM.
Despite their utility, the ap-
plicability of such DnC-based methods is limited by
their reliance on scale invariance, often leading to in-
feasible solutions when this assumption is violated.
In a different vein, local search-based methods by
Song et al. (2020) and Li et al. (2021) require an initial solution
to iteratively refine decision variables, which is a notable limitation.
tigated a bi-level formulation, where the high-level
problem modifies the given problem instance and the
low-level problem solves the modified instance, to
ease problem solving. However, this occasionally re-
sulted in more complex instances (e.g., a feasible so-
lution may not exist) and suboptimal solutions. Our
algorithm HCO also adopts the bi-level formulation
but sidesteps these issues by preserving the original
problem structure.
3 PRELIMINARIES
3.1 Combinatorial Optimization as
Markov Decision Process
Combinatorial optimization (CO) is a mathematical
optimization over a finite set, with a discrete feasible
solution space. Formally, a combinatorial optimiza-
tion problem can be written as follows.
$$\operatorname*{argmin}_{\mathbf{x} \in \mathcal{X}} \{ f(\mathbf{x}) : \mathbf{x} \in \mathcal{F} \} \tag{1}$$

where $\mathcal{X}$ is a finite support for the variable $\mathbf{x}$, $\mathcal{F} \subseteq \mathcal{X}$ is a set of feasible solutions¹, and $f : \mathcal{X} \to \mathbb{R}$ is an
objective function of the CO problem. For instance,
a mixed integer linear programming (MILP) problem
with $n$ variables and $m$ constraints can be written in
the form

$$\operatorname*{argmin}_{\mathbf{x} \in \mathbb{Z}^{p} \times \mathbb{R}^{n-p}} \{ \mathbf{c}^{\top}\mathbf{x} : \mathbf{A}\mathbf{x} \leq \mathbf{b},\ \mathbf{x} \geq \mathbf{0} \} \tag{2}$$

where $\mathbf{A} \in \mathbb{R}^{m \times n}$, $\mathbf{b} \in \mathbb{R}^{m}$, and $\mathbf{c} \in \mathbb{R}^{n}$.
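As a concrete instance of the form (2), the sketch below builds a tiny MILP with OR-Tools, the solver we later use as a baseline; the specific variables, coefficients, and bounds are illustrative assumptions rather than anything taken from the paper.

```python
# A minimal MILP of the form (2), built with OR-Tools' linear solver wrapper.
# All coefficients here are illustrative assumptions.
from ortools.linear_solver import pywraplp

solver = pywraplp.Solver.CreateSolver("SCIP")  # MILP-capable backend

# Variables: one integer and one continuous (p = 1, n - p = 1), both >= 0.
x0 = solver.IntVar(0, solver.infinity(), "x0")
x1 = solver.NumVar(0, solver.infinity(), "x1")

# Constraints Ax <= b.
solver.Add(2 * x0 + 1 * x1 <= 10)
solver.Add(1 * x0 + 3 * x1 <= 15)

# Objective: minimize c^T x.
solver.Minimize(-3 * x0 - 2 * x1)

if solver.Solve() == pywraplp.Solver.OPTIMAL:
    print(x0.solution_value(), x1.solution_value(), solver.Objective().Value())
```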
Most CO problems can be formulated as a
Markov Decision Process (MDP) (Khalil et al., 2017; Gasse
et al., 2019). Formally, this requires two assumptions on the CO
problem: 1) the solution space $\mathcal{X}$ of the original problem
(1) is a finite vector space, and 2) the objective $f$
is linear on $\mathcal{X}$, so that for any given decomposition
of $\mathcal{X}$ into a direct sum of subspaces $\mathcal{X} = \mathcal{X}_1 \oplus \cdots \oplus \mathcal{X}_n$,
we have $f(\mathbf{x}) = \sum_{i=1}^{n} f(\mathbf{x}_i)$ for each $\mathbf{x}_i \in \mathcal{X}_i$. Then, the
original problem (1) can be written as the following
sequential decision making problem:
$$\operatorname*{argmin}_{\substack{\mathbf{x}_t \in \mathcal{X}_t,\ t=1,\cdots,H \\ \mathcal{X} = \mathcal{X}_1 \oplus \cdots \oplus \mathcal{X}_H}} \Big\{ \sum_{t=1}^{H} f(\mathbf{x}_t) : \sum_{t=1}^{H} \mathbf{x}_t \in \mathcal{F} \Big\}. \tag{3}$$
The sequential decision can be thought of as choosing
for each timestep $t$ an action $\mathbf{x}_t \in \mathcal{X}_t$, to receive
a reward $R(s_t, a_t) = -f(a_t)$ and a large negative
penalty $-c \leq -\sup_{\mathbf{x} \in \mathcal{X}} f(\mathbf{x})$ if and only if any future
choice of action inevitably leads to an infeasible solution
at the end of the horizon. The optimal policy
$\pi^{*} \in \Pi$ for the original problem can be found upon
maximizing the expected return:

¹ $\mathcal{F}$ is either discrete itself or can be reduced to a discrete set.
$$\pi^{*} = \operatorname*{argmax}_{\pi \in \Pi} \mathbb{E}_{\pi}\Big[ \sum_{t=1}^{H} R(s_t, a_t) \,\Big|\, s_0 \Big]. \tag{4}$$
We defer the rest of the details to the Appendix. Note
that for some CO problems (e.g., TSP, MVC, MaxCut),
carefully designing the action space can make
the constraint trivially satisfied (Khalil et al., 2017),
in which case a reinforcement learning algorithm
can efficiently solve the problem. However, when
constraint satisfaction is not guaranteed, reinforcement
learning methods often suffer from the
sparse reward problem and do not learn efficiently.
In this work, we focus on a challenging CO problem
where designing the action space cannot guarantee
constraint satisfaction (i.e., a feasibility-hard
constraint): the Steiner tree packing problem.
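The reward structure just described can be sketched as follows; the objective and the feasibility check are placeholders for a problem-specific environment, not the paper's implementation.

```python
# Sketch of the per-step reward for a CO problem cast as an MDP: the negative
# objective of the chosen partial solution, minus a large penalty c once
# infeasibility has become inevitable. c should satisfy c >= sup_x f(x) so
# that every feasible trajectory outscores every infeasible one.
def step_reward(f, x_t, infeasibility_inevitable, c):
    r = -f(x_t)
    if infeasibility_inevitable:
        r -= c
    return r

# Toy usage: the objective is the total weight of the selected edges.
f = lambda edges: sum(w for (_, _, w) in edges)
print(step_reward(f, [(0, 1, 0.3), (1, 2, 0.5)], False, c=100.0))  # -0.8
```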
3.2 Steiner Tree Packing Problem
A Steiner Tree Problem (STP) can be thought of as a
generalization of the minimum spanning tree problem:
given a weighted graph and a subset of its vertices
(called terminals), one aims to find a tree (called
a Steiner tree) that spans all terminals (but not
necessarily all nodes) with minimum weight. Although
the minimum spanning tree problem can be solved in
polynomial time, the Steiner tree problem itself is
an NP-complete combinatorial problem (Karp, 1972).
Formally, let $G = (V, E)$ be an undirected weighted
graph, $w_e$ for $e \in E$ its edge weights, and $T \subseteq V$
the terminals. Then, a Steiner tree $S$ is a tree that
spans $T$ such that its edge weight is minimal. Hence,
the optimization problem for STP can be written as
follows.
$$\operatorname*{argmin}_{\mathbf{x} \in 2^{E}} \Big\{ \sum_{e \in \mathbf{x}} w_e : \mathbf{x} \in \Sigma_T \Big\} \tag{5}$$
where $2^{E}$ is the power set of $E$, and $\Sigma_T$ is the set of all
Steiner trees that span $T$. A more generalized version
of the above Steiner tree problem is called the Steiner
Tree Packing Problem (STPP), where one has a collection
$\mathcal{T}$ of $N$ disjoint non-empty sets $T_1, \cdots, T_N$ of
terminals, called nets, that have to be packed with disjoint
Steiner trees $S_1, \cdots, S_N$ spanning each of the nets
$T_1, \cdots, T_N$. The optimization problem for STPP can
be written similarly, with $N$ variables:
$$\operatorname*{argmin}_{\mathbf{x}_1, \cdots, \mathbf{x}_N \in 2^{E}} \Big\{ \sum_{n \leq N} \sum_{e \in \mathbf{x}_n} w_e : \mathbf{x}_n \in \Sigma_{T_n},\ G[\mathbf{x}_n] \cap G[\mathbf{x}_m] = \varnothing\ \ \forall n \neq m \Big\} \tag{6}$$

where $G[\mathbf{x}]$ is the subgraph of $G$ generated by $\mathbf{x} \subseteq E$.
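To make definitions (5) and (6) concrete, the sketch below (assuming networkx) computes a 2-approximate Steiner tree for each net and checks the vertex-disjointness required by (6). Solving the nets independently like this is exactly what can break feasibility, as discussed in the introduction.

```python
# Sketch of an STPP instance per equations (5)-(6): one Steiner tree per net,
# with packing requiring the trees to be vertex-disjoint. The graph and nets
# are illustrative assumptions.
import networkx as nx
from networkx.algorithms.approximation import steiner_tree

G = nx.Graph()
G.add_weighted_edges_from([
    (0, 1, 0.2), (1, 2, 0.4), (2, 3, 0.3),
    (3, 4, 0.5), (4, 0, 0.7), (1, 3, 0.6),
])
nets = [{0, 2}, {3, 4}]  # terminal sets T_1, T_2

# 2-approximate Steiner tree for each net, solved independently.
trees = [steiner_tree(G, list(T), weight="weight") for T in nets]
cost = sum(d["weight"] for S in trees for *_, d in S.edges(data=True))

# Feasibility of the packing: no vertex may appear in two trees.
used = [set(S.nodes) for S in trees]
feasible = all(used[i].isdisjoint(used[j])
               for i in range(len(used)) for j in range(i + 1, len(used)))
print(f"total cost = {cost:.2f}, vertex-disjoint = {feasible}")
```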
4 BI-LEVEL DECOMPOSITION
FOR STEINER TREE PACKING
PROBLEM
The goal of this section is to formulate our bi-level
framework for STPP and the corresponding MDP,
which allows us to use a hierarchical reinforcement
learning policy that efficiently learns to solve CO
problems.
Let us introduce a continuous surjective latent
mapping $\varphi : \mathcal{X} \to \mathcal{Y}$ onto a vector space $\mathcal{Y}$ such that
$|\mathcal{Y}| \leq |\mathcal{X}|$. Then, the problem (1) admits a hierarchical
solution concept:

$$\mathbf{y}^{*} = \operatorname*{argmin}_{\mathbf{y} \in \mathcal{Y}} \{ f(L(\mathbf{y})) : \varphi^{-1}(\mathbf{y}) \cap \mathcal{F} \neq \varnothing \} \tag{7}$$

$$\text{where } L(\mathbf{y}) := \operatorname*{argmin}_{\mathbf{x} \in \varphi^{-1}(\mathbf{y})} \{ f(\mathbf{x}) : \mathbf{x} \in \mathcal{F} \}. \tag{8}$$
We refer to problem (7) as the high-level problem, and
(8) as the low-level sub-problem induced by the high-level
action $\mathbf{y}$ in (7). Note that the hierarchical solution
concept still attains the optimality guarantee of
the original problem, since $\varphi$ is a surjection and $\mathcal{X}$ is
finite.
The advantages of such a hierarchical formulation
are: (i) searching for feasible solutions over $\mathcal{Y}$ rather
than $\mathcal{X}$ reduces the size of the search space; (ii) learning
to obtain an optimal solution can be done by
two different learnable agents (namely, the high-level
agent for (7) and the low-level agent for (8)), where
the task for each agent is easier than the original
problem; and (iii) the generalization capability (with
respect to the problem size) increases when using
learnable agents, since the high-level agent can be
made to always provide sub-problems (for the low-level
agent) of the same size, regardless of the size of
the original problem.
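For small spaces, the hierarchical solution concept (7)-(8) can be written directly as a brute-force search; the callables phi_preimage, feasible, and f below are placeholders standing in for the problem definition.

```python
# Brute-force sketch of the bi-level search in (7)-(8) for tiny spaces.
def low_level(y, phi_preimage, feasible, f):
    # L(y) in (8): the best feasible x within phi^{-1}(y), if any exists.
    cands = [x for x in phi_preimage(y) if feasible(x)]
    return min(cands, key=f) if cands else None

def high_level(Y, phi_preimage, feasible, f):
    # (7): search over the smaller latent space Y; optimality of the
    # combined search follows from phi being a surjection.
    best = None
    for y in Y:
        x = low_level(y, phi_preimage, feasible, f)
        if x is not None and (best is None or f(x) < f(best)):
            best = x
    return best
```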
4.1 Hierarchical Decomposition for
Steiner Tree Packing Problem
For CO problems defined on a weighted graph $G = (V, E)$
with edge weights $w_e$ for each $e \in E$, it is
straightforward and beneficial to choose a latent
mapping $\varphi : E \to V$ from the set of edges to the set
of nodes.² Specifically, we consider a version of $\varphi$
such that for any input $\mathbf{x} \subseteq E$, $u \in \varphi(\mathbf{x})$ if and only
if $(u, v) \in \mathbf{x}$ for some $v \in V(G)$. Such a mapping $\varphi$
satisfies $\varphi^{-1}(\mathbf{y}) = G[\mathbf{y}]$ for any set of vertices $\mathbf{y} \subseteq V$,
where we slightly overload the notation for the generated
subgraph $G[\mathbf{y}]$.

² Since $|V| \in O(\sqrt{|E|})$, we have $|\mathcal{Y}| \ll |\mathcal{X}|$ for large graphs.
Figure 2: (Left) We tackle the Steiner tree packing problem, which aims to find a minimum-weight tree spanning all the
terminal nodes (square boxes) for each type (color) without overlap. (Middle) We propose to decompose the given problem
into high-level and low-level sub-problems via a mapping $\varphi : \mathcal{X} \to \mathcal{Y}$, which are solved separately. This facilitates learning,
since the search spaces of the high-level problem ($|\mathcal{Y}| \ll |\mathcal{X}|$) and the low-level problem ($|\varphi^{-1}(\mathbf{y})| \ll |\mathcal{X}|$) are much smaller
than that of the original problem. (Right) In the MDP formulation, the high-level agent chooses a set of nodes (shaded region)
to define a sub-graph for each terminal type (color), and the low-level agent finds a tree spanning all the terminal nodes within
the sub-graph.
For example, the high-level problem of an STPP (6) can be written as follows:

$$\operatorname*{argmin}_{\mathbf{y}_1, \cdots, \mathbf{y}_N \in 2^{V}} \Big\{ \sum_{n \leq N} \sum_{e \in L(\mathbf{y}_n)} w_e : L(\mathbf{y}_n) \in \Sigma_{T_n},\ \mathbf{y}_n \cap \mathbf{y}_m = \varnothing\ \ \forall n \neq m \Big\} \tag{9}$$
which is now a node-selection problem (instead of
the original edge-selection problem), where $L(\mathbf{y}_n)$ is a
solution of the low-level subproblem:

$$L(\mathbf{y}_n) := \operatorname*{argmin}_{\mathbf{x} \subseteq E(G[\mathbf{y}_n])} \Big\{ \sum_{e \in \mathbf{x}} w_e : \mathbf{x} \in \Sigma_{T_n} \Big\}. \tag{10}$$
Notice that the high-level problem (9) has a reduced
search space (from $2^{E}$ to $2^{V}$), and
the low-level subproblem corresponds to a single STP
on a smaller subgraph $G[\mathbf{y}_n]$. Therefore, the original
NP-hard problem (6) is decomposed into two smaller
NP-hard problems. The overview of our hierarchical
decomposition method for STPP is illustrated in Figure 2.
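A sketch of this particular $\varphi$, assuming networkx: $\varphi$ maps an edge set to the set of endpoints it touches, and $\varphi^{-1}(\mathbf{y})$ is recovered as the induced subgraph $G[\mathbf{y}]$.

```python
# The latent mapping of Section 4.1: phi sends an edge set to the set of
# endpoints it touches; its preimage is the induced subgraph G[y].
import networkx as nx

def phi(edge_set):
    return {u for e in edge_set for u in e[:2]}

def phi_inverse(G, node_set):
    return G.subgraph(node_set)  # G[y]: every edge of G inside node_set

G = nx.path_graph(5)                      # nodes 0..4, edges (i, i+1)
x = [(1, 2), (2, 3)]
y = phi(x)                                # {1, 2, 3}
print(sorted(phi_inverse(G, y).edges()))  # [(1, 2), (2, 3)]
```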
4.2 MDP Formulation and Hierarchical
Policy for Steiner Tree Packing
Problem
The high-level MDP $\mathcal{M}_{hi}$ for STPP is based on
the sequential decision making $\mathbf{y}_1, \cdots, \mathbf{y}_N$ in equation
(9). Formally, a state in the MDP is a tuple
$s_t = (G, \mathcal{T}, S_t, t)$, where $G$ is the weighted graph of the
problem, $\mathcal{T}$ the collection of sets of terminals, and
$S_t \subseteq V(G)$ a partial solution constructed until the
current timestep $t$ via previous actions. An action $a_t$ is
to select a set of vertices $\mathbf{y}_t \subseteq V(G) \setminus S_t$ which includes
a tree that spans the terminals $T_t \in \mathcal{T}$ as a subgraph of
$G[\mathbf{y}_t]$. In turn, the subgraph $G[\mathbf{y}_t]$ is forwarded to the
low-level agent, which solves the STP on the given subgraph
by choosing edges from $E(G[\mathbf{y}_t])$. Then,
the high-level agent receives the negative of the sum of
the edge weights of the low-level solution $L(\mathbf{y}_t)$ as
a reward, and appends the solution $L(\mathbf{y}_t)$ to the previous
partial solution $S_t$.³ If the low-level solution
$L(a_t)$ does not exist, or when any future choice of actions
$a_{t+1}, \cdots, a_N$ leads to an infeasible solution of the
given STPP, the high-level agent receives a large negative
penalty $-C < 0$.
In the prior decomposition method (i.e., TAM),
$V(L(\mathbf{y}_t))$ and $V(G[\mathbf{y}_t])$ are the same. In other words,
once nodes are selected as a high-level action, they
are all used in the low-level solution, which prevents
the next high-level policy from selecting any of the
$V(L(\mathbf{y}_t))$ from the previous step. However, in the
STPP, as shown in Figure 1, using all nodes that constitute
the low-level problem, $V(G[\mathbf{y}_t])$, often leads to
a low probability of finding a feasible solution. HCO
reduces the likelihood of this occurring by removing
only $V(L(\mathbf{y}_t))$ from $V(G)$ after obtaining the low-level
solution. We defer detailed settings of our MDP formulation
to the Appendix.
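The iterative divide-and-solve loop of this MDP can be sketched as follows, again assuming networkx; the 2-approximate Steiner-tree routine stands in for the MILP low-level solver of our implementation, and high_level_policy is a placeholder for the learned policy. Note that only the nodes of the low-level solution $L(\mathbf{y}_t)$, rather than all of $G[\mathbf{y}_t]$, are removed from the remaining graph.

```python
# Sketch of HCO's high-level loop on an STPP instance: pick a node set y_t
# covering the current net, solve an STP on G[y_t], then remove only the
# solution nodes V(L(y_t)) before moving to the next net.
import networkx as nx
from networkx.algorithms.approximation import steiner_tree

def hco_rollout(G, nets, high_level_policy):
    remaining, solution, total = set(G.nodes), [], 0.0
    for T in nets:
        y = high_level_policy(G.subgraph(remaining), T)  # y_t in V(G) \ S_t
        sub = G.subgraph(y | T)           # terminals are auto-included
        ts = list(T)
        if not all(nx.has_path(sub, ts[0], t) for t in ts[1:]):
            return None                   # net cannot be connected: penalty
        tree = steiner_tree(sub, ts, weight="weight")    # L(y_t)
        solution.append(tree)
        total += tree.size(weight="weight")
        remaining -= set(tree.nodes)      # remove V(L(y_t)) only
    return solution, total
```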
Model Architecture. Since STPP is a CO problem
defined over a weighted graph, we use a graph
neural network (GNN) to encode the state representation
with the policy network $\pi_\theta$ and the value
function $V^{\pi_\theta}$ that serves as a baseline for actor-critic
methods (Konda and Tsitsiklis, 1999). Let
$G$ be the weighted graph with edge weights $w_e$ as
an edge feature for each $e \in E(G)$.
³ Here, the low-level sub-problem can again be defined by an MDP $\mathcal{M}_{lo}$, which is equivalent to the original decision-making process (3) but on a smaller problem instance $\mathrm{span}(\varphi^{-1}(a_t))$.
First, a $D$-dimensional node feature $\boldsymbol{\mu}_v$ is computed for each
node $v \in V$; please refer to the Appendix for our detailed
choice of node and edge features. Then, the
extracted features are encoded with a graph attention
network (GAT) (Veličković et al., 2018) and an attention
network (AT) (Vaswani et al., 2017). GAT aggregates
information across the neighbors in the
graph to capture local connectivity, but it is limited
in modeling long-range dependencies (Vaswani
et al., 2017). We overcome this limitation by using
the attention network, which captures
long-range dependencies by encoding relations between
all pairs of nodes (i.e., ignoring the graph structure).
Global structures are further encoded via a
graph embedding layer, which embeds particular subsets
of node features into groups based on their characteristics.
We use behavioral cloning and reinforcement
learning to train the high-level policy, and use a mathematical
solver (i.e., a MILP solver) for solving the low-level
problem (10) in our implementation.⁴
5 EXPERIMENTS
5.1 Setting
Dataset. To evaluate the learning efficiency and
generalization capacity of each algorithm, we cre-
ated feasibility-hard STPP instances at various scales.
Our instances included graphs of 40, 60, 80, and 100
nodes. For each graph category, we generated 50,000
instances for training, 1,000 for testing, and 100 for
validation. Graphs were created using the Watts-
Strogatz (WS) model (Watts and Strogatz, 1998)
with mean node degree $k \sim \mathrm{Uniform}\{3, 4, 5, 6\}$ and
rewiring probability $\beta \sim \mathrm{Uniform}(0, 1)$.
were uniformly random within [0,1]. A subset of ver-
tices was selected as terminals (see Section 3.2) to
construct the STPP instances. To avoid unsolvable
or trivially solvable instances, we designed a termi-
nal selection algorithm ensuring instances were both
solvable and non-trivial. We partitioned the random
graph into $|\mathcal{T}| = N_{type}$ subgraphs using Lukes'
algorithm (Lukes, 1974), then chose $N_{terminal}$ terminals per
subgraph. Each subgraph was guaranteed to contain a spanning tree, and
we filtered out trivial or unsolvable instances.
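The generator can be sketched as follows, assuming networkx. lukes_partitioning requires a tree and a size cap, so this sketch applies it to a minimum spanning tree and derives the nets from the resulting parts; the paper's exact terminal-selection and filtering rules are not reproduced here.

```python
# Sketch of the feasibility-hard instance generator: a Watts-Strogatz graph
# with uniform random edge weights, partitioned via Lukes' algorithm to
# place N_terminal terminals per part. Details beyond the text are assumptions.
import random
import networkx as nx
from networkx.algorithms.community import lukes_partitioning

def make_instance(n=40, n_type=2, n_terminal=2, seed=0):
    rng = random.Random(seed)
    k = rng.choice([3, 4, 5, 6])                      # mean node degree
    G = nx.connected_watts_strogatz_graph(n, k, p=rng.random(), seed=seed)
    for u, v in G.edges:
        G[u][v]["weight"] = rng.random()              # weights in [0, 1]
    spanning = nx.minimum_spanning_tree(G, weight="weight")
    parts = lukes_partitioning(spanning, max_size=n // n_type)
    nets = [set(rng.sample(sorted(p), n_terminal))
            for p in parts[:n_type] if len(p) >= n_terminal]
    return G, nets
```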
⁴ Since our decomposition keeps the size of the low-level subproblem small, we can run the MILP solver in a short time.
Figure 3: Training performance of HCO for problem
size n = 40. We pre-trained the agents via behavioral
cloning for 5 million episodes (vertical dotted line
in the figure), and then finetuned them via reinforcement learning
afterwards. We report the performance averaged over 4 random
seeds.
Baselines. We compare our model with following
baselines. MILP-t utilizes the mixed integer linear
programming (MILP) solver to identify the best so-
lution within a specified time limit, denoted by t.
Our implementation employs OR-Tools (Perron and
Furnon, 2022) to achieve this. For a fair compari-
son, we set t to 1 second for MILP-t roughly matching
the execution time of the compared methods. MILP-∞,
on the other hand, is the MILP solver operating
without any time constraints, enabling it to find the
optimal solution. PathFinder (McMurchie and Ebel-
ing, 1995) is a heuristic algorithm designed to solve
the STPP, as detailed in Section 2. We used a pub-
licly available implementation of Lee et al. (2022),
which includes two variations of low-level solvers:
the shortest-path finding variant (PathFinder-SP) and
the two-approximation variant (PathFinder-TA). Flat
represents a non-hierarchical RL agent that constructs
a solution by sequentially selecting nodes one by one,
following the methodology outlined in Khalil
et al. (2017) and Kool et al. (2019). The Two-stage Divide Method
(TAM) (Hou et al., 2022) addresses the STPP by de-
composing it entirely in a single step, after which each
sub-problem is addressed individually. This method
differs from the HCO approach, which iteratively se-
lects and solves one sub-problem at a time.
Training. We discovered that starting training with
a random policy made it difficult to obtain feasible so-
lutions, resulting in minimal learning signals. There-
fore, for HCO training, we perform pre-training us-
ing behavioral cloning, followed by fine-tuning with
IMPALA (Espeholt et al., 2018). The behavioral
cloning data was generated using the OR-Tools solver,
and training was conducted using a cross-entropy loss.
The training was conducted for 500K episodes to en-
sure convergence to some extent in all methods. We
used the reinforcement learning framework RLlib for
training with IMPALA. Hyperparameters were deter-
mined based on performance on the validation set,
and the detailed process is described in Appendix.
Table 1: Results for the Steiner tree packing problem. Gap and FSR denote the average optimality gap and feasible solution
ratio, respectively, while ET represents the average time taken to process a test instance. We report the performance averaged
over 4 random seeds.

                                  n = 40                      n = 60
Method                  Gap     FSR     ET (ms)     Gap     FSR     ET (ms)
Learning-based
  HCO (Ours)            0.039   0.969   38          0.031   0.953   99
  TAM                   0.105   0.885   17          0.128   0.914   46
  Flat                  0.045   0.957   106         0.087   0.935   252
Heuristic
  MILP-1s               0.000   1.000   127         0.000   0.982   501
  PathFinder-SP         0.112   0.974   5           0.165   0.990   8
  PathFinder-TA         0.116   0.966   19          0.147   0.974   48
Exhaustive
  MILP-∞                0.000   1.000   125         0.000   1.000   532

                                  n = 80                      n = 100
Method                  Gap     FSR     ET (ms)     Gap     FSR     ET (ms)
Learning-based
  HCO (Ours)            0.054   0.932   124         0.056   0.892   246
  TAM                   0.246   0.620   47          0.291   0.520   47
  Flat                  0.087   0.902   529         0.062   0.905   679
Heuristic
  MILP-1s               0.001   0.832   975         0.000   0.035   1007
  PathFinder-SP         0.150   0.976   20          0.155   0.970   35
  PathFinder-TA         0.149   0.965   115         0.150   0.954   170
Exhaustive
  MILP-∞                0.000   1.000   1648        0.000   1.000   4685
Table 2: Generalization performance in terms of Common
Instance Gap (CIG). The CIG measures the Gap averaged
over the instances where all methods (HCO, TAM, Flat)
found feasible solutions. Models are trained with n = 40
and are tested with n = {40, 60, 80, 100}. We report performance
averaged over 4 random seeds.

               n = 40            n = 60
HCO (Ours)     0.033 (±0.001)    0.033 (±0.002)
TAM            0.090 (±0.007)    0.332 (±0.026)
Flat           0.065 (±0.012)    0.086 (±0.014)

               n = 80            n = 100
HCO (Ours)     0.041 (±0.003)    0.059 (±0.006)
TAM            1.017 (±0.113)    2.910 (±0.126)
Flat           0.120 (±0.014)    0.260 (±0.056)
Table 3: The average likelihood of the decomposition policy
selecting trap nodes, optimal nodes, and redundant nodes
for test instances with n = 40 and $|\mathcal{T}| = 2$.

               Trap     Optimal   Redundant
HCO (Ours)     0.422    0.887     0.608
TAM            0.494    0.898     0.747
Evaluation. We use three metrics to evaluate the
algorithm’s capability to minimize the cost and sat-
isfy the constraint. Feasible solution ratio (FSR)
is the ratio of instances where a feasible (i.e., con-
straint is satisfied) solution was found by the method.
Since the solution cost can be computed only for
a feasible solution, we also introduce the metric
optimality gap (Gap), measuring the average of
the cost suboptimality in feasible solutions found:
$\mathrm{Gap} = \frac{\text{algorithm cost}}{\text{optimal cost}} - 1$. Finally, elapsed time (ET)
measures the average wall clock time taken to solve
each instance in the test set. We report the perfor-
mance averaged over four random seeds.
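The two cost metrics reduce to simple arithmetic; restated in code for clarity:

```python
# Gap and FSR exactly as defined above.
def gap(algorithm_cost, optimal_cost):
    return algorithm_cost / optimal_cost - 1.0

def fsr(feasible_flags):
    return sum(feasible_flags) / len(feasible_flags)

print(gap(1.12, 1.0), fsr([True, True, False, True]))  # ~0.12, 0.75
```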
5.2 Result
Generalization to Unseen Instances with the Same
Graph Size. Table 1 summarizes the performance
of each method when the training and test sets con-
sist of graphs of the same size but are mutually ex-
clusive. The top three rows compare the learning-
based algorithms: HCO, TAM, Flat. Overall, HCO
is the most performant algorithm in terms of Gap
and FSR. We observe that HCO consistently outper-
forms Flat with much less computation required (i.e.,
smaller ET). Our hierarchical framework improves
the sample efficiency of reinforcement learning by re-
ducing the search space in high-level and low-level
problems, leading to HCO’s superior performance in
Gap and FSR. Also, Flat sequentially solves the
problem by choosing nodes one by one without decomposition,
which requires a larger number of feed-forward
passes (i.e., higher ET) per problem compared to
HCO. As shown in Figure 1, TAM has a high probability
of failing to obtain a feasible solution, resulting
in a relatively low FSR. Additionally, we observed
that learning signals from infeasible solutions
significantly destabilize the training process. The rest
of the table summarizes the performance of heuristic
and exhaustive search-based methods. MILP-1s and
MILP-∞ perform search over the solution space, where
MILP-1s constrains the search based on computation
time and MILP-∞ searches exhaustively. We
note that MILP-based approaches excel in minimizing
the cost once a feasible solution is found, but struggle
to find any feasible solution, especially as the
problem size grows. This results in drastic degradation
in FSR for MILP-1s and in ET for MILP-∞ for
larger-sized problems (n = 80 and 100). When we
compare PathFinder-based approaches to HCO, HCO
is significantly advantageous in terms of Gap. We attribute
the high FSR of PathFinder to its iterative
negotiated-congestion-avoidance algorithm, which is
tailored for finding feasible solutions in STPP.
Generalization to Unseen and Larger Instances.
To facilitate a clearer comparison, we introduce Com-
mon Instance Gap (CIG) metric, which evaluates the
Gap exclusively on instances where every compared
method identified a feasible solution. This approach
ensures a fair comparison ground. We present the CIG
performance, averaged across four distinct random
seeds in Table 2 (values in parentheses represent
standard error). The results demonstrate HCO’s su-
perior performance over baseline methods across all
tested instance sizes (n = 40,60,80, 100), with a no-
tably wider performance margin against Flat in larger
or more novel instances (n = 80 and 100). This un-
derscores the bi-level formulation’s effectiveness in
enhancing generalization capabilities, even in more
challenging scenarios.
Analysis of Decomposition-Based Models. We
further evaluate the decomposition performance of
the decomposition-based models, HCO and TAM.
When the model decomposes the given problem into
sub-problems for each net, it must consider three
types of nodes. First, the trap node is a critical part of
another net’s solution and must not be used for solv-
ing the current net. Second, the optimal node is in-
cluded in the optimal solution. Third, the redundant
node refers to nodes that do not fall into either of the
previous categories. Generally, including redundant
nodes in the solution may increase the cost (i.e., yield
a suboptimal solution). Effective decomposition should avoid trap
and redundant nodes while incorporating as many op-
timal nodes as possible. To assess this, we measure
the average likelihood of each decomposition algo-
rithm (i.e., high-level policy) selecting trap, optimal,
and redundant nodes. Our analysis was conducted on
all instances with two nets (i.e., |T | = 2) from the
n = 40 test instances. The results are presented in Ta-
ble 3. Overall, TAM shows a higher likelihood of se-
lecting any type of node compared to HCO. Notably,
the likelihood of selecting optimal nodes is similar for
both HCO and TAM. This indicates that while both
methods are capable of effectively identifying nodes
that contribute to the optimal solution, TAM is less
successful at avoiding negative nodes (i.e., trap and
redundant nodes), which may lead to infeasible solu-
tions or suboptimality. This is further supported by
the results in Tables 1 and 2, where TAM shows a
larger Gap and lower FSR due to its higher tendency
to select redundant and trap nodes. This difference
in performance can be attributed to the methods’ ap-
proaches. Unlike TAM, which decomposes all prob-
lems at once, HCO re-evaluates and adjusts its decom-
position after reviewing the results of the low-level
problem, leading to more cautious decision-making.
Learning Curves. The learning curve for n = 40 is
shown in Figure 3. The supervised pre-training im-
proves both FSR and Gap of all the methods but the
performance improvement plateaus around 4M steps.
During RL finetuning, the FSR rapidly improves for
all the methods while Gap worsens. The observed
rise in FSR, indicating more feasible solutions, corre-
lates with a growing gap. This suggests that instances
solved later in training often need more training to
reach optimality. Consequently, as FSR rises sharply,
the model encounters many new instances for which
it initially only finds suboptimal (i.e., higher Gap) so-
lutions, leading to an increase in Gap.
6 CONCLUSIONS
In this work, we proposed a novel hierarchical approach
to address the challenge of finding feasible solutions
for STPP. It uses a latent mapping to decompose
the solution search space, resulting in more sample-efficient
learning due to separation of concerns and
a smaller search space, and in improved generalization
from a homogeneous problem size for the low-level
policy. We showed that the proposed decomposition
framework is generally applicable to a broader scope of
combinatorial optimization problems. The effective-
ness of our method was evaluated on various sizes of
STPP instances, demonstrating improved sample ef-
ficiency and generalization capability, outperforming
heuristic, mathematical optimization, and learning-
based algorithms designed for STPP.
ACKNOWLEDGEMENTS
This work was supported by the National Research
Foundation of Korea (NRF) grant funded by the Korea
government (MSIT) (RS-2024-00410082).
REFERENCES
Cappart, Q., Moisan, T., Rousseau, L.-M., Prémont-Schwarz, I., and Cire, A. A. (2021). Combining
reinforcement learning and constraint programming
for combinatorial optimization. In Proceedings of
the AAAI Conference on Artificial Intelligence, vol-
ume 35, pages 3677–3687.
Espeholt, L., Soyer, H., Munos, R., Simonyan, K., Mnih,
V., Ward, T., Doron, Y., Firoiu, V., Harley, T., Dun-
ning, I., Legg, S., and Kavukcuoglu, K. (2018). IM-
PALA: Scalable distributed deep-RL with importance
weighted actor-learner architectures. In Dy, J. and
Krause, A., editors, Proceedings of the 35th Interna-
tional Conference on Machine Learning, volume 80
of Proceedings of Machine Learning Research, pages
1407–1416. PMLR.
Fu, Z.-H., Qiu, K.-B., and Zha, H. (2021). Generalize
a small pre-trained model to arbitrarily large tsp in-
stances. In Proceedings of the AAAI Conference on
Artificial Intelligence, volume 35, pages 7474–7482.
Gasse, M., Chételat, D., Ferroni, N., Charlin, L., and Lodi,
A. (2019). Exact combinatorial optimization with
graph convolutional neural networks. Advances in
Neural Information Processing Systems, 32.
Hou, Q., Yang, J., Su, Y., Wang, X., and Deng, Y. (2022).
Generalize learned heuristics to solve large-scale vehi-
cle routing problems in real-time. In The Eleventh In-
ternational Conference on Learning Representations.
Karp, R. M. (1972). Reducibility among combinatorial
problems. In Complexity of computer computations,
pages 85–103. Springer.
Khalil, E., Dai, H., Zhang, Y., Dilkina, B., and Song, L.
(2017). Learning combinatorial optimization algo-
rithms over graphs. In Guyon, I., Luxburg, U. V.,
Bengio, S., Wallach, H., Fergus, R., Vishwanathan,
S., and Garnett, R., editors, Advances in Neural Infor-
mation Processing Systems, volume 30. Curran Asso-
ciates, Inc.
Konda, V. and Tsitsiklis, J. (1999). Actor-critic algorithms.
Advances in neural information processing systems,
12.
Kool, W., van Hoof, H., and Welling, M. (2019). Atten-
tion, learn to solve routing problems! In International
Conference on Learning Representations.
Lee, K., Park, Y., Jeong, H.-S., Yoon, D., Hong, S., Sohn,
S., Kim, M., Ko, H., Lee, M., Lee, H., Kim, K., Kim,
E., Cho, S., Min, J., and Lim, W. (2022). ReSPack: A
large-scale rectilinear steiner tree packing data gener-
ator and benchmark. In NeurIPS 2022 Workshop on
Synthetic Data for Empowering ML Research.
Li, S., Yan, Z., and Wu, C. (2021). Learning to delegate for
large-scale vehicle routing. Advances in Neural Information Processing Systems, 34.
Lukes, J. A. (1974). Efficient algorithm for the partitioning
of trees. IBM Journal of Research and Development,
18(3):217–224.
Ma, Y., Hao, X., Hao, J., Lu, J., Liu, X., Xialiang, T., Yuan,
M., Li, Z., Tang, J., and Meng, Z. (2021). A hi-
erarchical reinforcement learning based optimization
framework for large-scale dynamic pickup and deliv-
ery problems. Advances in Neural Information Pro-
cessing Systems, 34:23609–23620.
McMurchie, L. and Ebeling, C. (1995). PathFinder: A
negotiation-based performance-driven router for FPGAs.
In Proceedings of the 1995 ACM Third International
Symposium on Field-Programmable Gate Arrays,
pages 111–117.
Nowak-Vila, A., Folqué, D., and Bruna, J. (2018). Divide
and conquer networks.
Perron, L. and Furnon, V. (2022). OR-Tools.
Song, J., Lanka, R., Yue, Y., and Dilkina, B. (2020). A general
large neighborhood search framework for solving
integer linear programs. Advances in Neural Information Processing Systems, 33.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,
L., Gomez, A. N., Kaiser, L. u., and Polosukhin,
I. (2017). Attention is all you need. In Guyon,
I., Luxburg, U. V., Bengio, S., Wallach, H., Fer-
gus, R., Vishwanathan, S., and Garnett, R., editors,
Advances in Neural Information Processing Systems,
volume 30. Curran Associates, Inc.
Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P.,
and Bengio, Y. (2018). Graph attention networks.
In International Conference on Learning Representations.
Wang, R., Hua, Z., Liu, G., Zhang, J., Yan, J., Qi, F., Yang,
S., Zhou, J., and Yang, X. (2021). A bi-level frame-
work for learning to solve combinatorial optimization
on graphs. Advances in Neural Information Process-
ing Systems, 34:21453–21466.
Watts, D. J. and Strogatz, S. H. (1998). Collective dynamics
of "small-world" networks. Nature, 393(6684):440–442.
Ye, H., Wang, J., Liang, H., Cao, Z., Li, Y., and Li, F.
(2024). Glop: Learning global partition and local
construction for solving large-scale routing problems
in real-time. In Proceedings of the AAAI Conference
on Artificial Intelligence, volume 38, pages 20284–
20292.
APPENDIX
Combinatorial Optimization as Markov
Decision Process
In this section, we provide a full illustration of a se-
quential optimization process for a general CO prob-
lem. Our goal is to design a corresponding MDP for
equation 3. Intuitively, the sequential decision mak-
ing process in equation 3 can be thought of choosing
for each timestep t an action x
x
x
t
X
t
, until we have
a full solution x
x
x =
t
x
x
x
t
. However, note that we are
constructing X
1
,··· ,X
H
sequentially, i.e., we do not
have the subspace decomposition X = X
1
···X
H
beforehand. Hence, assume we have constructed
X
1
,··· ,X
t
until the current timestep t. Let us define
W
t
the remaining subspace that are yet to be decom-
posed, i.e., X = X
1
···X
t
W
t
. Then, the sequen-
tial decision making is equivalent to choosing for each
timestep t a subspace X
t
W
t
and consequently an ac-
tion x
x
x
t
X
t
, until we have a trivial subspace W
t
= {0
0
0}.
Sequential decision making process for the bi-
level decomposition in equation 7 and 8 can be formu-
lated similarly. The key is to construct the subspace
decomposition Y = Y
1
···Y
N
sequentially, while
obtaining the partial solution for the original problem
from the low-level sub-problem (8) simultaneously.
MDP Formulation
Below we provide a full description of our MDP for-
mulation for STPP in section 4.2.
State. The state $s_t$ of the high-level MDP $\mathcal{M}_{hi}$ consists
of a graph $G$, the collection of sets of terminals $\mathcal{T}$, a
partial solution $S_t \subseteq V(G)$ constructed until timestep
$t$ (i.e., the nodes of the disjoint trees that span terminals
$T_1, \cdots, T_{t-1}$), and the current timestep $t$. The
graph $G$ provides general information about the problem
instance, i.e., the connectivity of the graph via
edges. In practice, the graph $G$ can be represented by
the adjacency matrix $A \in \mathbb{R}^{|V| \times |V|}$, where $a_{ij}$ takes the
edge weight $w_{ij}$. This information can be further encoded
via the message-passing layers of GAT and AT in
the GNN. $\mathcal{T}$, $S_t$ and $t$ provide node features for the current
timestep, and are essential for generating a graph
embedding. From the state information $s_t$, we extract
the node features of the graph for further encoding via
the GNN model. For a node $v \in V$, we denote the node
features of the vertex $v$ as $x_v := (x_o, x_\tau, x_d) \in \mathbb{Z}^3$. The
first node feature $x_o \in \{0, 1\}$ denotes whether node
$v$ is included in the current partial solution: if
$v$ is selected as part of the partial solution, we define
$x_o = 1$; otherwise, it is 0. The second node feature
$x_\tau \in \{0, 1, \cdots, N_{type}\}$ indicates the terminal type (i.e.,
$x_\tau = k$ if and only if $v \in T_k$), where the indices are labeled
in the order that the high-level MDP solves for;
non-terminal nodes are assigned a value of 0. The
last feature, $x_d$, denotes the degree of node $v \in V$.
The edges of the graph are also assigned edge
features; we only use the edge weights as the edge
features in this paper. Our choice of node and edge
features is summarized in Tables 4 and 5.
Action. The action $a_t$ of the high-level MDP $\mathcal{M}_{hi}$ at
timestep $t$ is to select a set of vertices $\mathbf{y}_t \subseteq V(G) \setminus S_t$,
where $S_t$ is the partial solution constructed during the
previous timesteps. Intuitively, the action $a_t$ selects
a node set that includes all terminals of the current
type at timestep $t$. The selected set of nodes
creates a subgraph $G[a_t]$ of $G$ generated by the nodes
$a_t$, and should correspond to an STP instance (with a single
type of terminals). The low-level agent then computes
a minimum-weight tree that spans all terminals
of the current type from the subgraph $G[a_t]$. In practice,
we further assist the agent by designing the MDP environment
for $\mathcal{M}_{hi}$ such that the terminal
nodes of the current type are automatically selected
by the environment internally. Hence, the action $a_t$
results in the subgraph $G[a_t \cup V(T_t)]$ instead of $G[a_t]$.
Reward. Designing a reward with an optimality
guarantee is a non-trivial task. In particular, we wish
to construct a reward where all feasible solutions result
in a higher reward than that of any infeasible solution.
Also, feasible solutions with better solution
quality (i.e., an objective closer to the optimal
solution) should be assigned a higher reward.
Hence, given a final solution $\mathbf{x}$ of a CO problem (1),
a reward with an optimality guarantee can be compactly
formulated as $r(\mathbf{x}) = c \cdot \mathbb{1}_{\mathcal{F}}(\mathbf{x}) - f(\mathbf{x})$, where
$c \geq \sup_{\mathbf{x} \in \mathcal{X}} f(\mathbf{x})$, $f$ is the objective function of problem
(1), and $\mathbb{1}_{\mathcal{F}}(\cdot)$ is an indicator function. Note that
this form of the reward is equivalent to what is described
in Section 3.1: achieving the maximal return
will result in the same optimal policy.⁵ Finally, recall
that our objective $f$ is linear in $\mathcal{X}$. For each partial
solution $\mathbf{x}_t$ constructed at timestep $t$, we are able to
decompose our reward function as follows:

$$r(\mathbf{x}_t) = c \cdot \mathbb{1}_{\mathcal{F}}\Big(\sum_{k=1}^{t} \mathbf{x}_k\Big) - f(\mathbf{x}_t) \tag{11}$$
Transition. Our transition in the MDP is deterministic;
the change in the partial solution alters the node feature
$x_o$ from 0 to 1, which results in different node and
graph embeddings from the GNN.

⁵ Providing incentives for $\mathbf{x} \in \mathcal{F}$ instead of a penalty when $\mathbf{x} \notin \mathcal{F}$ scales all rewards to be non-negative.

Table 4: Node features.

Notation     Value                              Description
$x_o$        $\{0, 1\}$                         Is partial solution
$x_\tau$     $\{0, 1, 2, \ldots, N_{type}\}$    Terminal type
$x_d$        $\mathbb{Z}$                       Node degree

Table 5: Edge feature.

Notation     Value           Description
$w_e$        $\mathbb{R}$    Edge weight (= cost)
Termination. Our STPP environment terminates
when it is unable to generate a feasible STPP solution
or when a feasible solution is found. The cases
where generating a feasible STPP solution is impossible include
(1) no possible actions remaining, (2) any choice of
actions in future timesteps inevitably resulting in an infeasible
solution, or (3) the choice of action $a_t$ in the
high-level MDP $\mathcal{M}_{hi}$ resulting in an unsolvable STP instance
$G[a_t]$.
GNN Architecture
Encoder. Given a graph $G$, we first extract a $D$-dimensional
node embedding $\boldsymbol{\mu}_v$ for each node $v \in V$,
where $D$ denotes the number of features described above.
Note that we use $D = 3$, with $x_v = (x_o, x_\tau, x_d)$
representing the features as described
in Table 4. Let $\rho : \mathbb{R}^{D} \to \mathbb{R}^{D \cdot p}$ be a fixed vectorization
mapping of a given node feature $x_v$, and let
$\theta_0 : \mathbb{R}^{D \cdot p} \to \mathbb{R}^{p}$ be a linear mapping. We obtain the
initial node embedding $\boldsymbol{\mu}_v$ as follows:

$$\boldsymbol{\mu}_v = \mathrm{ReLU}(\theta_0(\rho(x_v))). \tag{12}$$

Next, we encode the node embeddings $M := (\boldsymbol{\mu}_1, \cdots, \boldsymbol{\mu}_{|V|}) \in \mathbb{R}^{|V| \cdot p}$ via the graph attention network
(GAT) and the attention network (AT). Formally, let $\Theta_i : \mathbb{R}^{n_h \times p} \to \mathbb{R}^{2p}$ and $\theta_i : \mathbb{R}^{2p} \to \mathbb{R}^{p}$ for $i = 1, \cdots, l$ be linear
mappings, where $n_h$ denotes the number of heads
of GAT. For convenience, we slightly overload the notation,
writing $\Theta_i(M) := (\Theta_i(\boldsymbol{\mu}_1), \cdots, \Theta_i(\boldsymbol{\mu}_{|V|}))$, and
use a similar notation for $\theta_i$. We recursively encode
the node embedding as follows:

$$\widetilde{M}^{(i-1)} = \mathrm{AT}(\Theta_i(\mathrm{GAT}(M^{(i-1)}; G)); G) \tag{13}$$

$$M^{(i)} = \theta_i(\mathrm{ReLU}(M^{(i-1)} \,\|\, \widetilde{M}^{(i-1)})) \tag{14}$$

for each $i = 1, \cdots, l$. Here, we define $M^{(0)} \equiv M$,
and write $(\cdot \| \cdot)$ for $\mathrm{CONCATENATE}(\cdot, \cdot)$. We denote
our graph encoder function $\mathrm{Enc}$ succinctly as follows,
omitting some details for brevity:

$$\mathrm{Enc}(\cdot; G) := \underbrace{(\mathrm{AT} \circ \mathrm{GAT}) \circ \cdots \circ (\mathrm{AT} \circ \mathrm{GAT})}_{l \text{ times}}(\cdot; G). \tag{15}$$
Graph Embedding, Logit, and Probability.
Given the node embedding of the last layer $M^{(l)}$, we
obtain an embedding $\boldsymbol{\mu}_G$ for the entire graph. Instead
of simply averaging over all the nodes, we enrich the
graph embedding by grouping the nodes into three
subsets based on their characteristics: the current terminals
$T_t$, the partial solution $S_t$, and the non-terminal nodes
$\bar{V} := V \setminus \bigcup_{T \in \mathcal{T}} T$. The embeddings are averaged
within each subset, concatenated, and projected to
obtain the graph embedding $\boldsymbol{\mu}_G$. Formally, the graph
embedding layer $\mathrm{Emb}$ is written as follows:

$$\mathrm{Emb}(\cdot; t) := \Psi(\cdot, T_t) \,\|\, \Psi(\cdot, S_t) \,\|\, \Psi(\cdot, \bar{V}) \tag{16}$$

where $\Psi(M^{(l)}, A)$ performs average pooling
over the set of node embeddings that belong to
$A$. The graph embedding $\boldsymbol{\mu}_G$ is obtained as $\boldsymbol{\mu}_G = \mathrm{Emb}(M^{(l)}; t)$. Finally, the logit value, which is used
to compute the probability of choosing each node
$v \in V$, is computed as follows:

$$\mathrm{logit}_v = w_4\big(w_3(\mathrm{ReLU}(w_1(\boldsymbol{\mu}_G) \,\|\, w_2(\boldsymbol{\mu}_v^{(l)}))) + \boldsymbol{\mu}_v^{(0)}\big) \quad \forall v \in V \tag{17}$$

$$p_v = \mathrm{softmax}(\mathrm{logit}_v) \quad \forall v \in V \tag{18}$$

where $w_1 : \mathbb{R}^{3p} \to \mathbb{R}^{p}$, $w_2 : \mathbb{R}^{p} \to \mathbb{R}^{p}$, $w_3 : \mathbb{R}^{2p} \to \mathbb{R}^{p}$,
and $w_4 : \mathbb{R}^{p} \to \mathbb{R}^{2}$ are linear functions.
Value Function. The value function $V^{\pi_\theta}$ uses a
model with a similar GNN architecture and a
simple multi-layer perceptron (MLP) that sequentially
projects ($\mathbb{R}^{3p} \to \mathbb{R}^{3p} \to \mathbb{R}^{p} \to \mathbb{R}^{1}$), and does not share
weights with the policy network:

$$V^{\pi_\theta}(s_t) = \mathrm{MLP}(\mathrm{Emb}(\mathrm{Enc}(M; G); t)). \tag{19}$$
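One encoder layer of equations (13)-(14) can be sketched in PyTorch with torch_geometric as follows; since some dimensionalities above are ambiguous, the wiring is a best-effort reading (with p = 128 and 8 heads from the hyperparameter section) rather than the reference implementation.

```python
# Sketch of one (GAT -> AT) encoder layer per equations (13)-(14): GAT for
# local connectivity, full self-attention for long-range dependency, then
# concatenation with the input and a linear projection theta_i.
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv

class EncoderLayer(nn.Module):
    def __init__(self, p=128, heads=8):
        super().__init__()
        self.gat = GATConv(p, p // heads, heads=heads)  # output width stays p
        self.theta_cap = nn.Linear(p, p)                # Theta_i
        self.at = nn.MultiheadAttention(p, heads, batch_first=True)
        self.theta = nn.Linear(2 * p, p)                # theta_i after concat

    def forward(self, mu, edge_index):
        h = self.theta_cap(self.gat(mu, edge_index))    # GAT then Theta_i
        h, _ = self.at(h.unsqueeze(0), h.unsqueeze(0), h.unsqueeze(0))
        h = h.squeeze(0)                                # AT ignores graph structure
        return self.theta(torch.relu(torch.cat([mu, h], dim=-1)))  # eq. (14)
```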
Hyperparameters
We set the hyperparameters based on the performance
of Flat and considered the following hyperparameter
ranges for the 40-node experiment:

• p = {32, 64, 128, 256}
• l = {3, 4, 5}
• number of heads = {4, 8}
• IL learning rate = {1e-3, 1e-4, 1e-5, 1e-6}
• weight decay = {1e-6, 5e-7}
• RL learning rate = {1e-3, 1e-4, 1e-5, 1e-6}
• entropy coefficient = {1, 1e-1, 1e-2}
• value loss coefficient = {1e-1, 1, 3, 5}

Thirty hyperparameter combinations were randomly
selected from the given ranges, and the combination
that performed best on the validation set was used.
This parameter combination was used consistently for
all methods. The GNN model uses a hidden dimension
of p = 128 and l = 5 for both the GAT and AT encoder
layers. Both GAT and AT use 8 heads, with
a dropout rate of 0.5 in IL; dropout is not used
in RL training. A batch size of 64 is used, and the
learning rate is initialized to $10^{-4}$, decaying by a factor of 0.99
per epoch. A fine-tuned value of $5 \times 10^{-7}$ is used for
weight decay. To prevent divergence during learning, the
gradient norm is clipped to 1. HCO trains for 100
epochs, where one epoch updates the model on the BC
data for every step of every episode. In the RL
phase, a batch size of 30, a learning rate of $10^{-6}$, a
discount factor of 0.99, an entropy coefficient of 0.01,
and a value function loss coefficient of 5 are used. The
number of workers used in IMPALA is set to 30.
Training and Evaluation
For imitation learning, we first collect demonstration
data using the optimal solver, MILP, as the expert
policy. The expert policy $\pi_{expert}$ is defined as
$\pi_{expert}(a_t) = 1$ if $a_t \in A_t$ (the expert's action set) and $\pi_{expert}(a_t) = 0$ otherwise.
The nodes that are not selected by the expert are
excluded from the loss calculation. HCO uses a cross-entropy
loss for BC training, and one epoch is defined
as updating HCO for every step of every instance.
During the evaluation phase, the softmax function is
applied to the logit values of each node to obtain probabilities,
and nodes with probability values exceeding
0.5 are selected as high-level actions. The training
and evaluation are carried out on a single machine
comprising an AMD EPYC 7R32 CPU and an NVIDIA
A10G GPU.
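A minimal sketch of the behavioral-cloning loss and the 0.5-threshold action selection described above, assuming per-node two-class logits as in equations (17)-(18); the expert mask is a placeholder, and the exclusion of expert-unselected nodes could be added as a per-node loss weight (omitted here).

```python
# Sketch of the BC objective and thresholded high-level action selection.
import torch
import torch.nn.functional as F

def bc_loss(logits, expert_mask):
    # logits: [num_nodes, 2] per-node scores from the policy network;
    # expert_mask: 1 where the expert (MILP) selected the node.
    return F.cross_entropy(logits, expert_mask.long())

def select_action(logits):
    probs = torch.softmax(logits, dim=-1)[:, 1]  # P(select node v), eq. (18)
    return probs > 0.5                           # high-level action y_t
```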