Text-Based Feature-Free Automatic Algorithm Selection

Amanda Salinas-Pinto¹ (https://orcid.org/0009-0007-2216-4371), Bryan Alvarado-Ulloa¹ (https://orcid.org/0009-0008-7468-5723), Dorit Hochbaum² (https://orcid.org/0000-0002-2498-0512), Matías Francia-Carramiñana¹ (https://orcid.org/0009-0000-8680-7347), Ricardo Ñanculef¹ (https://orcid.org/0000-0003-3374-0198) and Roberto Asín-Achá¹ (https://orcid.org/0000-0002-1820-9019)

¹Universidad Técnica Federico Santa María, Chile
²University of California, Berkeley, U.S.A.
{amanda.salinas, bryan.alvarado}@usm.cl, dhochbaum@berkeley.edu,
Keywords:
Algorithm Selection, Deep Learning, SAT, CSP.
Abstract:
Automatic Algorithm Selection involves predicting which solver, among a portfolio, will perform best for a
given problem instance. Traditionally, the design of algorithm selectors has relied on domain-specific fea-
tures crafted by experts. However, an alternative approach involves designing selectors that do not depend on
domain-specific features, but receive a raw representation of the problem’s instances and automatically learn
the characteristics of that particular problem using Deep Learning techniques. Previously, such raw represen-
tation was a fixed-sized image, generated from the input text file specifying the instance, which was fed to
a Convolutional Neural Network. Here we show that a better approach is to use text-based Deep Learning
models that are fed directly with the input text files specifying the instances. Our approach improves on the
image-based feature-free models by a significant margin and furthermore matches traditional Machine Learn-
ing models based on basic domain-specific features, known to be among the most informative features.
1 INTRODUCTION
Automatic Algorithm Selection (AAS) aims to pre-
dict the optimal solver for a given problem instance
from a portfolio. Traditionally, this process relies on
domain-specific features crafted by experts, which,
while effective, limits scalability and transferability
due to the need for extensive domain knowledge and
labor-intensive analysis.
Recent advances in Deep Learning (DL) (Vaswani
et al., 2017), where models learn from raw data, offer
a compelling alternative to feature-based models. Pre-
vious work (Loreggia et al., 2016) in AAS has trans-
formed raw data into fixed-sized images processed by
Convolutional Neural Networks (CNNs), but this still
requires image-processing techniques.
Our study introduces a novel text-based deep
learning approach that directly processes raw tex-
tual files specifying problem instances, simplifying
the computational pipeline, and enhancing represen-
tation.
In this paper, we present our text-based deep
learning framework for AAS and evaluate its perfor-
mance against traditional image-based and feature-
based models. Our analysis shows that text-based
models are superior in capturing complex informa-
tion in problem descriptions, leading to more effective
and adaptable algorithm selection strategies as com-
pared to image-based methods. Nevertheless, a performance gap remains with respect to specialized feature-based models, and closing this gap will be the basis of future research in the area of feature-free algorithm selection.
Our contributions include demonstrating the feasi-
bility of text-based deep learning for AAS and provid-
ing a thorough analysis of how these techniques out-
perform existing feature-free methods. We establish
new benchmarks, advancing the field of feature-free
AAS, and offer insights into the performance gap be-
tween feature-free and feature-based methodologies.
The subsequent sections review relevant literature,
define key terms and criteria, outline our text-based
AAS framework, present empirical assessments, and
conclude with findings and future research directions.
2 RELATED WORK
2.1 Algorithm Selection Systems
Automatic Algorithm Selection (AAS), introduced by
(Rice, 1976), optimizes computational processes by
selecting the most suitable algorithm for a given prob-
lem instance. This approach is rooted in the “No Free
Lunch” theorem (Adam et al., 2019), which posits
that no single algorithm universally excels across all
scenarios.
AAS typically employs a training phase to asso-
ciate problem instance features with algorithm per-
formance. The trained model then evaluates new in-
stances to predict the most effective algorithm. Re-
cent literature has explored AAS in various domains,
including timetabling (Seiler et al., 2020; Bossek and Neumann, 2022), SAT (Xu et al., 2008), and Multi-Agent Path-Finding (Bulitko, 2016; Achá et al., 2022).
Kerschke et al. (Kerschke et al., 2019) provide a
comprehensive survey of algorithm selection and con-
figuration, introducing a taxonomy that distinguishes
between “per-set” and “per-instance” methods. Our
focus is on “per-instance” AAS, which considers each
problem instance individually.
While many AAS systems employ complex strate-
gies, such as the hybrid methodology of semi-static
solver schedules (3S) (Kadioglu et al., 2011) or Aut-
ofolio (Lindauer et al., 2015), our study concentrates
on straightforward approaches. We assume an ML
model receives an instance characterization and se-
lects a single solver to execute until completion or
the time limit is reached.
Most AAS research relies on domain-specific,
expert-crafted features. However, an alternative ap-
proach involves developing ML methods that uti-
lize raw/generic instance representations, allowing
the learning process to identify relevant features au-
tonomously. This approach was first explored by
(Loreggia et al., 2016).
2.2 Deep Learning for Algorithm
Portfolios
(Loreggia et al., 2016) introduced a groundbreaking
approach to Automatic Algorithm Selection (AAS)
based on deep learning. Unlike traditional AAS tech-
niques that use hand-crafted, domain-specific fea-
tures, this method leverages generic raw data: the text file contents describing the problems.
The process transforms text files into a fixed-size
image format suitable for Convolutional Neural Net-
work (CNN) analysis:
1. Convert the textual input into a vector of ASCII codes.
2. Reorganize the vector into a $\sqrt{N} \times \sqrt{N}$ matrix, where $N$ is the total character count.
3. Resize the resulting "ASCII image" to a uniform scale.
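A minimal sketch of this conversion (our reimplementation; OpenCV handles the resizing, and the 128×128 target size is an illustrative assumption):

```python
import numpy as np
import cv2  # OpenCV, used for the final resizing step

def text_to_ascii_image(text: str, side: int = 128) -> np.ndarray:
    """Turn a raw instance file into a fixed-size 'ASCII image'."""
    codes = np.frombuffer(text.encode("ascii", errors="replace"), dtype=np.uint8)
    n = int(np.ceil(np.sqrt(codes.size)))        # side of the square matrix
    padded = np.zeros(n * n, dtype=np.uint8)     # zero-pad the tail
    padded[:codes.size] = codes
    return cv2.resize(padded.reshape(n, n), (side, side),
                      interpolation=cv2.INTER_AREA)
```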
The CNN can be trained as a multi-class classifier,
multi-label classifier, or regressor. Evaluated using
SAT and Constraint Satisfaction Problems (CSP) in-
stances, this method showed potential to outperform
the Single Best Solver (see Subsection 2.3).
Despite its successes, this approach may not per-
form as well as methods utilizing domain-specific fea-
tures.
2.3 Performance Metric for
Meta-Solvers
We define an algorithm-selection-based meta-solver
as a system comprising a portfolio of solvers. It ana-
lyzes an input instance and runs one or more solvers to
resolve it. A solver solves an instance if it can decide
its satisfiability (for decision problems) or find and
certify the optimal solution (for optimization prob-
lems) within a time limit.
All our meta-solvers here operate uniformly:
1. Accept an input instance.
2. Use an ML model to predict the most efficient
solver, identify capable solvers, or estimate solv-
ing times.
3. Select and run one solver based on these predic-
tions.
We evaluate the meta-solver’s performance using two
baselines:
Single Best Solver (SBS): The solver performing
best on average across all training instances.
Virtual Best Solver (VBS): A hypothetical meta-
solver always choosing the most effective algo-
rithm for each instance.
Performance is measured using the PAR10 metric (Lindauer et al., 2019). For a solver $s$ on instance $i$:
$$
m_s(i) = \begin{cases} t_s(i) & \text{if } t_s(i) \le \tau \\ 10\tau & \text{otherwise,} \end{cases}
$$
where $\tau$ is the timeout constant and $t_s(i)$ is the solving time.
We use the performance measure $\hat{m}$ (Lindauer et al., 2019) to evaluate meta-solvers:
$$
\hat{m}_{ms} = \frac{m_{ms} - m_{VBS}}{m_{SBS} - m_{VBS}}. \tag{1}
$$
Values of $\hat{m}_{ms}$ close to 0 indicate performance near the VBS, while values close to 1 suggest performance similar to the SBS. Values above 1 indicate that the meta-solver is less effective than the SBS.
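As a concrete reference, here is a minimal sketch of both computations, assuming PAR10 scores are stored as a solvers-by-instances NumPy array (the array layout and function names are our own):

```python
import numpy as np

def par10(times: np.ndarray, tau: float) -> np.ndarray:
    """PAR10 score: the runtime if within the timeout tau, 10*tau otherwise."""
    return np.where(times <= tau, times, 10.0 * tau)

def m_hat(scores: np.ndarray, selected: np.ndarray) -> float:
    """Normalized measure: ~0 means close to the VBS, ~1 close to the SBS.

    scores:   (n_solvers, n_instances) matrix of PAR10 values
    selected: for each instance, the solver index chosen by the meta-solver
    """
    m_ms = scores[selected, np.arange(scores.shape[1])].mean()
    m_vbs = scores.min(axis=0).mean()   # best solver on each instance
    m_sbs = scores.mean(axis=1).min()   # solver with the best average
    return (m_ms - m_vbs) / (m_sbs - m_vbs)
```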
3 TEXT-BASED FEATURE-FREE
AAS
We follow the approach of (Loreggia et al., 2016), working directly with raw problem-instance representations. Our Deep Learning models are fed raw
text representations, rather than pre-processed image-
like inputs.
3.1 Architecture Overview
Figure 1: Overall architecture of our text-based Deep Learning Model for AAS.
Our architecture (Figure 1) is a modified Transformer neural network that uses only the encoder component, similar to that of (Vaswani et al., 2017). The input text is truncated, tokenized, and converted into embeddings $x = \langle x_1, x_2, \ldots, x_n \rangle$. The encoder's outputs $z = \langle z_1, z_2, \ldots, z_n \rangle$ are fused into a global descriptor $\bar{z}$ using Global Max Pooling (Christlein et al., 2019), then mapped to a prediction through a fully connected output layer.
3.2 Tokenizers and Embeddings
We explore two tokenization approaches:
Pre-trained Tokenization: Using SentencePiece
(Kudo and Richardson, 2018).
Trained Tokenization: Using Charformer (Tay
et al., 2021).
3.3 Encoder Architecture
Our encoder computes $M = 4$ hierarchical transformations $Z^{(k)} = \mathrm{EBlock}(Z^{(k-1)})$. Each block includes a self-attention mechanism and a position-wise feed-forward net. The self-attention mechanism computes:
$$
P = \mathrm{SelfAttention}(Z) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d'}}\right) Z, \tag{2}
$$
where $Q$, $K$, and $Z$ are learnable matrices that project $Z^{(k-1)}$ into a $d'$-dimensional latent space. We use multi-head attention with $H = 4$ heads. The final block's output $Z^{(k)}$ is obtained after applying a residual connection (He et al., 2016) and layer normalization (Ba et al., 2016) around each sublayer. We did not use positional embeddings.
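A minimal PyTorch sketch of this architecture under the stated hyperparameters (M = 4 blocks, H = 4 heads, d = 128, no positional embeddings); the feed-forward width and layer names are our assumptions, and the output head follows one of the framings of Section 3.4:

```python
import torch
import torch.nn as nn

class TextEncoderAAS(nn.Module):
    """Encoder-only Transformer + Global Max Pooling + prediction head."""

    def __init__(self, vocab_size: int, n_solvers: int, d: int = 128,
                 heads: int = 4, blocks: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)   # no positional embeddings
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=heads,
                                           dim_feedforward=4 * d,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=blocks)
        self.head = nn.Linear(d, n_solvers)        # one output per solver

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        z = self.encoder(self.embed(tokens))       # (batch, seq, d)
        z_bar = z.max(dim=1).values                # Global Max Pooling
        return self.head(z_bar)                    # activation/loss per Sec. 3.4
```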
3.4 Problem Framing Strategies
We explore three strategies:
Multi-Class Classification: Identifies the most suit-
able solver and the meta-solver runs it. The output
layer is a softmax function, and the loss function
is categorical cross-entropy.
Multi-Label Classification: Identifies all solvers ca-
pable of solving the instance within the defined
time limit τ. Each solver corresponds to an ele-
ment in the output vector, with a sigmoid func-
tion applied element-wise. The loss is measured
through the Hamming loss function. Since the
probabilities here are not complementary, they de-
termine the likelihood that a solver will be fit for
the problem instance. The meta-solver executes
the solver that exhibits the highest likelihood.
Regression: Estimates normalized log delta runtime
for each solver. The mean squared error function
serves as the loss function, and the output layer is
linear. The meta-solver runs the solver predicted
to have the shortest runtime.
$$
r_{s,i} = \log\left(1 + m_s(i) - \min_{s \in S} m_s(i)\right), \qquad
y_{s,i} = \frac{r_{s,i} - \mathrm{mean}(r_{s,i})}{\mathrm{std}(r_{s,i})}
$$
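A sketch of this target construction from a PAR10 score matrix; standardizing per solver (across instances) is our reading of mean($r_{s,i}$) and std($r_{s,i}$):

```python
import numpy as np

def regression_targets(scores: np.ndarray) -> np.ndarray:
    """Normalized log-delta runtimes y[s, i] from PAR10 scores m[s, i]."""
    # Delta w.r.t. the best solver on each instance, log-compressed.
    r = np.log1p(scores - scores.min(axis=0, keepdims=True))
    # Standardization; using per-solver statistics is an assumption.
    return (r - r.mean(axis=1, keepdims=True)) / r.std(axis=1, keepdims=True)
```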
4 EXPERIMENTAL SETUP AND
BASELINES
4.1 Libraries and Hardware
We implemented our Deep Learning models in
Python 3.10, using PyTorch 2.0.0. For the text-
based models, we used the Charformer tokenizer 0.0.4 (as implemented in https://github.com/lucidrains/charformer-pytorch) and SentencePiece 0.2.0. For the image scaling needed by the image-based models, we used OpenCV 4.7.0.72. The feature-based models were implemented using scikit-learn 1.4.2.
The experiments were carried out on a machine
with an Intel Xeon Skylake (2x16 @2.1 GHz) pro-
cessor and an Nvidia A40 GPU. The machine runs
Scientific Linux 7 and has 48GB of RAM.
4.2 Benchmark Sets
To evaluate our approach, we first aimed to use the same benchmark sets as (Loreggia et al., 2016). However, the precise sets of instances and partitions used in that study were not disclosed publicly and could not be provided by the authors when asked via personal communication. We then searched for benchmarks of a similar nature for which the instance files and the hand-crafted features used in the AAS community were available. Unfortunately, we could not find meaningful benchmark sets similar to the ones named "SAT Random" and "SAT Crafted" in (Loreggia et al., 2016). However, we were able to collect the most interesting benchmark sets reported in that study, "SAT Industrial" and "CSP". These benchmark sets are the most interesting because of their diversity in size and complexity and the complementarity of their solvers.
SAT Industrial. This benchmark includes in-
stances used in the SAT competition between 2003
and 2016 in the industrial/application categories. The
performance of the solvers in these competitions was
retrieved from ASLib, specifically from the SAT03-16-
INDU-ALGO scenario. We removed 269 instances
that could not be solved by any solver in the port-
folio within the given τ time limit. After filtering,
the dataset contains 1,730 instances and 10 different solvers.
CSP. We used the benchmark from the 2009 CSP competition (https://www.cril.univ-artois.fr/CSC09/results/globalbybench.php?idev=30&idcat=38&idSubCat=60). The performance data for each solver was obtained from the PROTEUS-2014 scenario (Hurley et al., 2014) in ASLib. We filtered the instances by removing the "easy" instances that could be solved by all solvers within a time limit equivalent to that needed to compute the instance's features, as well as the "difficult" instances that were not solved by any of the solvers within the given time limit τ. This resulted in a total of 1,613 instances and 22 different solvers.
4.3 Data Partitioning and Evaluation
Criteria
We split each benchmark into train and test datasets.
For the training dataset we used 80% of the instances, and the remaining 20% was reserved as the test
dataset. The training dataset is used for training and
model selection, while the test dataset is used to com-
pare the in-production performance of the best text-
based, image-based, and feature-based approaches.
To select the best model for each approach, we
performed 10-fold cross-validation with the training
set. We compared the models based on the $\hat{m}$ metric associated with a meta-solver using them. We then selected the best model based on the mean $\hat{m}$ metric
across the different folds.
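This selection protocol can be sketched as follows; `train_fn` and `pick_fn` are hypothetical callbacks for fitting a candidate and extracting its per-instance solver choices, and `m_hat` is the metric sketched in Section 2.3:

```python
import numpy as np
from sklearn.model_selection import KFold

def select_best(candidates, X, scores, train_fn, pick_fn):
    """Return the candidate with the lowest mean m-hat over 10 folds."""
    kf = KFold(n_splits=10, shuffle=True, random_state=0)
    mean_scores = []
    for make_model in candidates:
        fold_vals = []
        for train_idx, val_idx in kf.split(X):
            model = train_fn(make_model(), X, scores, train_idx)
            picks = pick_fn(model, X, val_idx)        # solver index per instance
            fold_vals.append(m_hat(scores[:, val_idx], picks))
        mean_scores.append(np.mean(fold_vals))
    return candidates[int(np.argmin(mean_scores))]
```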
4.4 Feature-Based Models
To offer a comprehensive view of our study on
feature-free models, we also implement and evalu-
ate feature-based models employing both state-of-the-
art crafted features and basic informative features,
using Random Forest models. The comparison of
feature-free models with these feature-based counter-
parts serves a dual purpose: firstly, to analyze and
document the performance disparities between these
two paradigms, and secondly, to provide the research
community with a benchmark on the effectiveness of
applying state-of-the-art crafted features in a straight-
forward manner on ASLib scenarios that are widely
used.
Basic Features: Two basic features extracted from
the text describing a problem instance are: the
number of variables and the number of con-
straints. The motivation for these two features is that instance size usually ranks among the simplest and most informative characteristics. We expect that
training ML models on these two features estab-
lishes a baseline for the other methods.
We note here that, for the CSP benchmark, the number of variables and constraints in the text file differ from the direct_nvariables and direct_nclauses features of the ASLib scenario, since the latter seem to be computed after grounding the CSP formula to SAT.
Full Set of Features: These features represent the
state-of-the-art in domain-specific algorithm se-
lection, as provided in the corresponding scenar-
ios of ASLib. All these 483 SAT features were
introduced in (Xu et al., 2008), and constitute the de facto standard for AAS in SAT.
For CSP, ASLib provides the 198 domain-specific
features as proposed in (Hurley et al., 2014). We
note here that our evaluation does not consider the
time needed to compute all these features, even
though some of them are expensive to compute
and others are captured during runtime, from a
reference solver.
Although these features and scenarios are com-
monly referenced in the literature, we were unable
to find reported performance values ($\hat{m}$) for meta-
solvers that utilize these features directly. Con-
sequently, our aim is to document these values to
serve as a reference for future research.
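Returning to the basic features: for SAT instances in DIMACS CNF format, both values can be read directly from the problem line; a minimal sketch (the CSP format would need its own parser):

```python
def basic_features(path: str) -> tuple[int, int]:
    """Read (n_variables, n_constraints) from a DIMACS CNF problem line."""
    with open(path) as f:
        for line in f:
            if line.startswith("p cnf"):
                # DIMACS problem line: "p cnf <n_variables> <n_clauses>"
                _, _, n_vars, n_clauses = line.split()
                return int(n_vars), int(n_clauses)
    raise ValueError(f"no DIMACS problem line in {path}")
```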
4.5 Image-Based Models
We implemented the approach presented by (Loreg-
gia et al., 2016) carefully following the experimen-
tal setup described there. For the training, we used
Stochastic Gradient Descent (SGD) with Nesterov momentum of 0.9 and a learning rate of 0.03. As the first layer, we included a batch normalization layer, as proposed in (Ioffe and Szegedy, 2015). The output layer
changes depending on the learning task, as mentioned
in Section 3.4. We set a training batch size of 128 and
100 epochs.
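In PyTorch, this training configuration corresponds to the following sketch, where `cnn` stands for the image-based CNN being trained:

```python
import torch

# Sketch: lr and Nesterov momentum follow the setup described above.
optimizer = torch.optim.SGD(cnn.parameters(), lr=0.03,
                            momentum=0.9, nesterov=True)
```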
4.6 Text-Based Models
Due to limitations on the hardware needed to train our model on arbitrarily sized instances, we truncated the instances to 10,000 characters. To avoid introducing biases into the model, we removed from the text files any comments and other kinds of meta-information, such as the folder name where the instance is located or the name of the instance's generator. In a preliminary evaluation, we noted that these meta-information fields may unfairly help the text-based models, and we decided not to consider this information.
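A minimal sketch of this preprocessing for DIMACS-style SAT files, where comment lines start with "c" (treating every such line as removable meta-information is an assumption; CSP files would need their own rule):

```python
def preprocess(path: str, max_chars: int = 10_000) -> str:
    """Drop comment/meta-information lines, then truncate the raw text."""
    with open(path) as f:
        kept = [line for line in f if not line.startswith("c")]
    return "".join(kept)[:max_chars]
```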
For training our text-based models, we used the AdamW optimizer (Loshchilov and Hutter, 2017) with a learning rate of $10^{-5}$. We set a batch size of 8 samples and an embedding size $d$ of 128. The training was set to take 100 epochs.
Since the sequence length produced by SentencePiece can vary among instances while our encoder accepts a fixed-length sequence, we computed the median length of SentencePiece's output, truncating longer sequences and padding shorter ones. The vocabulary size $v$ for SentencePiece was set to 1024. Charformer, which operates at the character level, had a vocabulary size of 257 (256 ASCII values plus one token reserved for padding). We set the max block size and the downsample factor to their default values (4). Additionally, we employed the block attention scores proposed in Section 2.1.4 of (Tay et al., 2021) to form latent subwords.
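A sketch of this fixed-length encoding with the sentencepiece Python API (the model file name and pad id are hypothetical, and training the tokenizer itself is elided):

```python
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="spm_v1024.model")  # hypothetical model

def encode_fixed(text: str, length: int, pad_id: int = 0) -> list[int]:
    """Encode to token ids, then truncate or pad to the median training length."""
    ids = sp.encode(text, out_type=int)
    return ids[:length] + [pad_id] * max(0, length - len(ids))
```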
5 RESULTS
5.1 Feature-Based Validation Results
Table 1: $\hat{m}$ metric values across 10-fold validation sets for different handcrafted-features-based meta-solvers on the CSP and SAT Industrial benchmark sets. Here, HF = handcrafted-features-based, F = full set of features, B = basic set of features, ML = multi-label model, Reg = regression model, MC = multi-class model.

Model     CSP            SAT Industrial
HF-F-ML   0.409 ± 0.064  0.680 ± 0.312
HF-F-Reg  0.416 ± 0.087  0.640 ± 0.291
HF-F-MC   0.546 ± 0.066  0.676 ± 0.228
HF-B-ML   0.638 ± 0.068  1.054 ± 0.361
HF-B-Reg  0.557 ± 0.066  0.939 ± 0.365
HF-B-MC   0.582 ± 0.075  1.22 ± 0.387
Table 1 shows the average and standard deviation
of the $\hat{m}$ values computed across 10-fold cross-
validation subsets for six feature-based meta-solvers.
The first three meta-solvers are based on the full set
of features provided in the ASLib, while the last three
meta-solvers only use the two basic features related
to the size of the instances. For a fair compari-
son with our feature-free model, these feature-based
meta-solvers can cast AAS as a multi-label task (ML),
a regression task (Reg), or a multi-class (MC) prob-
lem. As can be seen, for the CSP benchmark, the most
successful meta-solver using the full set of features
is the one based on multi-label classification (ML).
In contrast, for the SAT Industrial benchmark, the
best meta-solver, using the full set of features, is the
one based on regression (Reg). Nevertheless, we note
that even for these state-of-the-art crafted features, the
meta-solvers are quite sensitive to the test set in SAT,
as is evident from the considerable standard deviation.
Regarding the meta-solvers using only the two
basic features, the meta-solvers based on regression
show better performance in both benchmark sets. We
note that, on average, only using these two basic fea-
tures allows the meta-solvers to outperform the SBS.
For CSP, we found a considerable margin of advan-
tage, and for SAT Industrial, a smaller margin.
We report the performance of these feature-based
solvers on the test set in Subsection 5.4. All the results
reported here are consistent with the literature.
5.2 Image-Based Validation Results
Table 2: $\hat{m}$ metric values across 10-fold validation sets for different image-based meta-solvers on the CSP and SAT Industrial benchmark sets. Here, Im = image-based, ML = multi-label model, Reg = regression model, MC = multi-class model.

Model   CSP            SAT Industrial
Im-ML   0.640 ± 0.088  1.25 ± 0.407
Im-Reg  0.609 ± 0.104  1.14 ± 0.346
Im-MC   0.898 ± 0.109  1.66 ± 0.527
Table 2 shows the statistics of the $\hat{m}$ values com-
puted across 10-fold cross-validation subsets for three
image-based meta-solvers. For a fair comparison with
our text-based model, we trained image-based meta-
solvers based on multi-label, regression, and multi-
class formulations. The results in Table 2 demonstrate
that, although the regression approach was not con-
sidered in (Loreggia et al., 2016), the most successful
image-based meta-solver is the one based on regres-
sion for both benchmark sets.
The meta-solver for CSP outperforms CSP’s Sin-
gle Best Solver by a significant margin while main-
taining a considerable gap with the Virtual Best
Solver for CSP. These results are in line with the ones
reported in (Loreggia et al., 2016). However, an ex-
act match between our image-based results and those
in (Loreggia et al., 2016) is virtually impossible since
the training/validation/test partitions differ.
Image-based SAT meta-solvers cannot outperform
the Single Best Solver. This result diverges from
the results of (Loreggia et al., 2016), which reported
an image-based meta-solver that outperforms SBS on
SAT. This discrepancy may be due to differences
in the specific SAT industrial benchmark set used or
differences in the training/test partitions. However,
we also observe that the performance of the SAT
image-based meta-solver varies significantly depend-
ing on the training and validation set (standard devia-
tion of 0.346 among cross-validation folds).
5.3 Text-Based Validation Results
Table 3: $\hat{m}$ metric values across 10-fold validation sets for different text-based ML models on the CSP and SAT Industrial benchmark sets. Here, Txt = text-based, Cha = trained tokenizer (Charformer), Sen = pre-trained tokenizer (SentencePiece), ML = multi-label model, Reg = regression model, MC = multi-class model.

Model        CSP            SAT Industrial
Txt-Cha-ML   0.488 ± 0.047  0.952 ± 0.281
Txt-Cha-Reg  0.469 ± 0.050  0.889 ± 0.303
Txt-Cha-MC   0.581 ± 0.076  1.312 ± 0.354
Txt-Sen-ML   0.482 ± 0.082  1.078 ± 0.252
Txt-Sen-Reg  0.536 ± 0.100  1.119 ± 0.448
Txt-Sen-MC   0.608 ± 0.120  1.470 ± 0.319
Table 3 shows the average and standard deviation of
the $\hat{m}$ values for our text-based meta-solvers com-
puted by 10-fold cross-validation. The first three
meta-solvers are text-based models jointly trained
with the tokenizer (Charformer), while the last three
meta-solvers use the pre-trained tokenizer (Sentence-
Piece). As can be seen, the most successful meta-
solver is the one that uses a regression model jointly
trained with the tokenizer.
The CSP meta-solver significantly improves the
performance of the SBS for this domain. With an av-
erage $\hat{m}$ value equal to 0.469 and a small standard de-
viation, this meta-solver’s performance can be inter-
preted as closer to the VBS than to the SBS.
Regardless of the formulation, obtaining an $\hat{m}$ lower than 1 for SAT Industrial was impossible using image-based methods. Notably, our best text-based meta-solver outperforms the Single Best Solver with an average $\hat{m}$ value of 0.889 on this benchmark. Neverthe-
less, as for the previous models, the standard devi-
ation is high (0.303), which suggests that the meta-
solver’s performance varies considerably depending
on the validation instances used.
5.4 Test Set Results
Here we compare feature-based, image-based and
text-based meta-solvers on the test set of each bench-
mark. For each category, we selected the best ap-
proach using 10-fold cross-validation, and trained the
model with the whole training set. Again, we note that
the results for feature-based models are reported as a reference, both to provide perspective and to document the performance of meta-solvers built with straightforward models.
As anticipated, the meta-solvers that yield the best
Table 4: $\hat{m}$ metric values on the test set for the best model of each approach and benchmark set.

Model        CSP    SAT Industrial
HF-B-Reg     0.549  0.975
HF-F-ML      0.442  —
HF-F-Reg     —      0.674
Im-Reg       0.642  1.309
Txt-Cha-Reg  0.556  1.037
results are those that utilize expert-designed features
specific to the domain. In the case of CSP, the meta-
solver employing a multi-label classification model
achieves an $\hat{m}$ value of 0.442. This significantly nar-
rows the performance disparity between the SBS and
the VBS in CSP scenarios. Similarly, for the SAT In-
dustrial benchmark, the regression-based meta-solver
records an $\hat{m}$ value of 0.674. Considering the com-
plexity of this benchmark, this score is notably satis-
factory. These outcomes align with those from con-
temporary meta-solvers specialized for CSP and SAT
Industrial. It is important to note that this assess-
ment only gauges the effectiveness of the features in
a well-adjusted ML model. This overview omits the
consideration that many sophisticated features, while
beneficial, are computationally intensive and may not
be regularly employed in elaborate Algorithm Se-
lection Systems that utilize both a presolver and a
solver scheduler. Hence, the current $\hat{m}$ values of the
meta-solvers that incorporate these advanced features
likely represent a lower bound for any straightforward
methodology.
When comparing the two feature-free meta-
solvers, our text-based method significantly surpasses
the image-based method and nearly matches the per-
formance of the meta-solvers that incorporate the two
basic crafted features. This suggests that the image-
based models may fail to capture even basic infor-
mation, such as the size of the problem instance.
Conversely, the text-based models appear capable of
recognizing information akin to these features, even
though our system uses only basic vanilla encoders.
Converting these $\hat{m}$ scores to average running times
reveals that the expected average time for the text-
based model is approximately 13% lower than that
of the image-based model for the CSP benchmark.
For the SAT Industrial benchmark, this reduction
is about 20%. Collectively, these figures demon-
strate that our novel text-based feature-free frame-
work significantly decreases the performance gap be-
tween feature-free and feature-based Algorithm Selection systems.
6 CONCLUSIONS AND FUTURE
WORK
We present here a novel approach to Automatic Al-
gorithm Selection that leverages the capabilities of
text-based deep learning models. Our results clearly
demonstrate that this method not only simplifies the
feature extraction process (by eliminating the need for image-based preprocessing) but also significantly
enhances the performance of existing feature-free al-
gorithm selection paradigms. By directly processing
raw textual descriptions of problem instances, our ap-
proach has shown a marked improvement over tradi-
tional, image-based CNN approaches in terms of both
performance and robustness across benchmarks.
The effectiveness of our method was validated
through extensive experiments on benchmarks con-
taining a variety of problem instances. The experi-
mental results underscore the potential of deep learn-
ing techniques that operate directly on raw data, pro-
viding a more scalable and flexible end-to-end solu-
tion for the field of AAS.
Our experiments confirm that, to date, no feature-free algorithm selection approach can outperform meta-solvers based on validated domain-specific features crafted by experts. However, results also
show that text-based feature-free models can match
the performance of meta-solvers based on basic in-
formative features. This finding suggests that deep
learning methods can learn problem representations
beyond the crudest and most elementary characterization.
While our study has made significant strides in the
application of text-based models to algorithm selec-
tion, several avenues remain open for further explo-
ration. Future work may include:
More Complex AAS Systems: Our proposal can
serve as the basis for more complex AAS systems, in-
cluding dynamic portfolios and schedulers.
More Complex ML Models: More complex
transformer architectures can also be tested. Be-
sides, AAS can be framed in a more sophisticated
way to leverage advances in ranking, metric learn-
ing, and recommender systems.
Handling the Whole Text Files: A plethora of ar-
chitectures have been proposed for long text mod-
eling in deep learning. These methods should be
systematically evaluated to overcome the limita-
tions of our text-based meta-solver.
Anytime AAS: Extending our method to Any-
time Algorithm Selection could significantly ben-
efit environments where decisions should be made
based on the available computational resources.
Transfer Learning: Exploring transfer learning
techniques to adapt models trained on one set
of problem instances to handle others effectively
could contribute to a general purpose AAS.
Interpretable AI Models: Enhancing the inter-
pretability of deep learning models used in AAS
to provide insights into why certain algorithms are
preferred for specific instances could help refine the models further and gain users' trust.
Benchmarks and Datasets: Applying our frame-
work to other domains, possibly including opti-
mization problems whose domain metrics $\hat{m}$ in-
volve the values of the objective function.
In conclusion, the research presented in this paper sets
a new benchmark in the field of feature-free AAS and
opens up numerous possibilities for the evolution of
more intelligent and autonomous algorithm selection
systems. Our future efforts will focus on expanding
the capabilities of our framework and exploring these
promising directions to further enhance the field of
algorithm selection.
ACKNOWLEDGEMENTS
The 1st, 2nd, 3rd, 4th, and 6th authors are supported in part by the NSF AI Institute award 2112533.
REFERENCES
Achá, R. A., López, R., Hagedorn, S., and Baier, J. A. (2022). Multi-agent path finding: A new boolean encoding. Journal of Artificial Intelligence Research, 75:323–350.
Adam, S. P., Alexandropoulos, S.-A. N., Pardalos, P. M.,
and Vrahatis, M. N. (2019). No free lunch theorem: A
review. Approximation and optimization: Algorithms,
complexity and applications, pages 57–82.
Ba, J. L., Kiros, J. R., and Hinton, G. E. (2016). Layer
normalization. arXiv preprint arXiv:1607.06450.
Bossek, J. and Neumann, F. (2022). Exploring the feature
space of tsp instances using quality diversity. In Pro-
ceedings of the Genetic and Evolutionary Computa-
tion Conference, pages 186–194.
Bulitko, V. (2016). Evolving real-time heuristic search al-
gorithms. In Artificial Life Conference Proceedings
13, pages 108–115. MIT Press.
Christlein, V., Spranger, L., Seuret, M., Nicolaou, A., Král, P., and Maier, A. (2019). Deep generalized max pooling. In 2019 International Conference on Document Analysis and Recognition (ICDAR), pages 1090–1096. IEEE.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep resid-
ual learning for image recognition. In Proceedings of
the IEEE conference on computer vision and pattern
recognition, pages 770–778.
Hurley, B., Kotthoff, L., Malitsky, Y., and O’Sullivan, B.
(2014). Proteus: A hierarchical portfolio of solvers
and transformations. In Integration of AI and OR
Techniques in Constraint Programming: 11th Inter-
national Conference, CPAIOR 2014, Cork, Ireland,
May 19-23, 2014. Proceedings 11, pages 301–317.
Springer.
Ioffe, S. and Szegedy, C. (2015). Batch normalization: Ac-
celerating deep network training by reducing internal
covariate shift. In International conference on ma-
chine learning, pages 448–456. PMLR.
Kadioglu, S., Malitsky, Y., Sabharwal, A., Samulowitz, H.,
and Sellmann, M. (2011). Algorithm selection and
scheduling. In International Conference on Principles
and Practice of Constraint Programming, pages 454–
469. Springer.
Kerschke, P., Hoos, H. H., Neumann, F., and Trautmann, H.
(2019). Automated algorithm selection: Survey and
perspectives. Evolutionary computation, 27(1):3–45.
Kudo, T. and Richardson, J. (2018). Sentencepiece: A sim-
ple and language independent subword tokenizer and
detokenizer for neural text processing. arXiv preprint
arXiv:1808.06226.
Lindauer, M., Hoos, H. H., Hutter, F., and Schaub, T.
(2015). Autofolio: An automatically configured al-
gorithm selector. Journal of Artificial Intelligence Re-
search, 53:745–778.
Lindauer, M., van Rijn, J. N., and Kotthoff, L. (2019). The
algorithm selection competitions 2015 and 2017. Ar-
tificial Intelligence, 272:86–100.
Loreggia, A., Malitsky, Y., Samulowitz, H., and Saraswat,
V. (2016). Deep learning for algorithm portfolios. In
Thirtieth AAAI Conference on Artificial Intelligence.
Loshchilov, I. and Hutter, F. (2017). Decoupled weight de-
cay regularization. arXiv preprint arXiv:1711.05101.
Rice, J. R. (1976). The algorithm selection problem. In Ad-
vances in computers, volume 15, pages 65–118. Else-
vier.
Seiler, M., Pohl, J., Bossek, J., Kerschke, P., and Traut-
mann, H. (2020). Deep learning as a competitive
feature-free approach for automated algorithm selec-
tion on the traveling salesperson problem. In Interna-
tional Conference on Parallel Problem Solving from
Nature, pages 48–64. Springer.
Tay, Y., Tran, V. Q., Ruder, S., Gupta, J., Chung, H. W.,
Bahri, D., Qin, Z., Baumgartner, S., Yu, C., and Met-
zler, D. (2021). Charformer: Fast character transform-
ers via gradient-based subword tokenization. arXiv
preprint arXiv:2106.12672.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,
L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I.
(2017). Attention is all you need. Advances in neural
information processing systems, 30.
Xu, L., Hutter, F., Hoos, H. H., and Leyton-Brown, K.
(2008). Satzilla: portfolio-based algorithm selection
for sat. Journal of artificial intelligence research,
32:565–606.