graphs. In this paper we propose a novel augmentation approach called Graph Augmentation with Module Swapping (GAMS), which tries to automatically detect coherent portions of the graphs (motifs) and augment the dataset by producing new graphs in which similar motifs are swapped between different graphs. The idea behind this approach is that structural representations, just like image and text data, satisfy some form of compositionality: each graph is formed by the composition of several simpler coherent substructures, loosely connected with one another according to some unknown compositional rule. Under this assumption, swapping modules amounts to interchanging base substructures within existing composition templates.
2 STATE OF THE ART
Approaches for augmenting graph data are relatively limited and mostly consist of heuristics for selecting which elements of the structure (nodes/edges) to perturb in order to generate a new graph from a single sample. One of the simplest and most widely used approaches for graph augmentation is graph perturbation, which consists of a series of edge addition and removal operations between existing nodes of a single graph. This leaves several degrees of freedom and a large number of parameters. In particular, edge
selection can be random or follow a given heuristic,
making this more of a meta-approach with several dif-
ferent instances depending on the perturbation strat-
egy adopted. Perturbation approaches are in general simple to implement and fast to execute. However, in their simplest incarnation they give little to no control over which part of the structure gets modified, resulting in graphs that have very little to do with the originals and that do not preserve the underlying semantic label. For instance, the algorithm might add several edges in a sparse area of the graph or remove them from a dense area, producing graphs that are very dissimilar from the original.
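To make this meta-approach concrete, here is a minimal sketch of random edge perturbation (our illustration, not taken from the cited works; the number of operations n_ops and the add/remove balance p_add are hypothetical parameters):

import random
import networkx as nx

def perturb_edges(graph: nx.Graph, n_ops: int = 10,
                  p_add: float = 0.5) -> nx.Graph:
    """Apply n_ops random edge additions/removals between existing nodes."""
    g = graph.copy()
    nodes = list(g.nodes())
    for _ in range(n_ops):
        if random.random() < p_add and len(nodes) >= 2:
            # addition: connect two nodes picked uniformly at random
            u, v = random.sample(nodes, 2)
            g.add_edge(u, v)
        elif g.number_of_edges() > 0:
            # removal: delete an existing edge picked uniformly at random
            u, v = random.choice(list(g.edges()))
            g.remove_edge(u, v)
    return g

Replacing the uniform random choices with a heuristic (e.g. degree-biased sampling) yields the different instances of the meta-approach mentioned above.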
The simplest instance of edge perturbation is given by DropEdge (Rong et al., 2019), which consists of randomly dropping edges of the input graph
for each training iteration. This method was designed
to alleviate over-fitting and over-smoothing when
training Graph Convolutional Networks (GCNs).
Through DropEdge, we are actually generating differ-
ent randomly deformed copies of the original graph,
thus increasing randomness and diversity of the in-
put data and reducing the risk of over-fitting. Moreover, DropEdge can also be treated as a message-passing reducer: in GCNs, message passing between adjacent nodes is conducted along edge paths, and removing certain edges renders node connections sparser, thus avoiding over-smoothing to some extent when the GCN goes very deep (Rong et al., 2019). This algorithm is
well suited for deep learning algorithms, with DropE-
dge being executed at the end of each training epoch,
generating the new dataset for the next epoch.
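As a minimal sketch (assuming an undirected networkx graph and a drop rate p; our illustration, not the authors' implementation):

import random
import networkx as nx

def drop_edge(graph: nx.Graph, p: float = 0.2) -> nx.Graph:
    """Return a deformed copy of the graph with a random fraction p of its edges dropped."""
    g = graph.copy()
    edges = list(g.edges())
    for u, v in random.sample(edges, int(p * len(edges))):
        g.remove_edge(u, v)
    return g

# one freshly deformed copy per training epoch, e.g.:
# for epoch in range(n_epochs):
#     train_one_epoch(model, drop_edge(original_graph))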
The AdaEdge algorithm (Chen et al., 2019) op-
timizes the graph topology based on the model pre-
dictions. The method consists of iteratively training GNN models and conducting edge removal/addition operations based on the predictions, in order to adaptively adjust the graph to the learning target. Experimental results
in general cases show that this method can signifi-
cantly relieve the over-smoothing issue and improve
model performance, which further provides a com-
pelling perspective towards better GNN performance
(Chen et al., 2019). More specifically, GNNs are
trained on the original graph and then the graph topol-
ogy is adjusted based on the prediction result of the
model by deleting inter-class edges and adding intra-
class edges. The GNN is then retrained on the updated
graph, and the topology optimization and model re-
training are iterated multiple times.
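One such iteration could be sketched as follows, where predict_labels and train_gnn are hypothetical stand-ins for the model interface; the actual method additionally gates each edit by prediction confidence, which we omit here:

import networkx as nx

def adaedge_step(graph: nx.Graph, labels: dict) -> nx.Graph:
    """Adjust topology from predicted labels: delete inter-class edges, add intra-class edges."""
    g = graph.copy()
    for u, v in list(g.edges()):
        if labels[u] != labels[v]:  # inter-class edge: delete
            g.remove_edge(u, v)
    for u, v in nx.non_edges(graph):
        if labels[u] == labels[v]:  # intra-class non-edge: add
            g.add_edge(u, v)
    return g

# topology optimization and retraining iterated multiple times:
# for _ in range(n_rounds):
#     labels = predict_labels(model, graph)  # hypothetical helper
#     graph = adaedge_step(graph, labels)
#     model = train_gnn(graph)               # hypothetical helper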
The GAUG (Zhao et al., 2020) methods follow a
similar concept to DropEdge and AdaEdge, driving
the augmentation based on the results of the trained
classifier. The goal is to improve node classifica-
tion by mitigating propagation of noisy edges. Neural
edge predictors like GAE (Kipf and Welling, 2016)
are able to latently learn class-homophilic tenden-
cies in existent edges that are improbable, and nonex-
istent edges that are probable (Zhao et al., 2020).
GAUG's key idea is to leverage information inherent in the graph to predict which non-existent edges should likely exist and which existent edges should likely be removed in the graph G to produce modified graph(s) G_m that improve model performance; a minimal sketch of this edit step is given below.
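Assuming a generic edge_probability function standing in for any neural edge predictor (e.g. a GAE), and hypothetical add/remove thresholds, the edit step could be sketched as follows (our illustration, not the authors' code):

import networkx as nx

def gaug_edit(graph: nx.Graph, edge_probability,
              add_thr: float = 0.9, rm_thr: float = 0.1) -> nx.Graph:
    """Deterministically edit G into G_m using predicted edge probabilities."""
    g_m = graph.copy()
    # probable non-existent edges are added...
    for u, v in nx.non_edges(graph):
        if edge_probability(u, v) >= add_thr:
            g_m.add_edge(u, v)
    # ...while improbable existent edges are removed
    for u, v in list(graph.edges()):
        if edge_probability(u, v) <= rm_thr:
            g_m.remove_edge(u, v)
    return g_m

The deterministic version of this step is the core of GAUG-M, described below.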
• GAUG-M: this procedure starts by using an edge predictor function to obtain edge probabilities for all possible and existing edges in G. The role of the edge predictor is flexible and can generally be replaced with any suitable method. Then, using the predicted edge probabilities, we deterministically add (remove) new (existing) edges to create a modified graph G_m, which is used as input to a GNN node classifier.
This method is suited to the modified-graph setting, i.e. when we apply one or multiple graph transformation operations f : G → G_m, such that G_m replaces G for both training and inference.
• GAUG-O: this method is complementary to GAUG-M, as it is applied in the original-graph setting, i.e. when we apply many transformations f_i : G → G_m^i