‘move left’. Thus, for node D in Figure 2 the correct
answer would be ‘reach it by moving down’.
Below, we will refer to the trained neural network
predicting the last step in the tree as Lara (for ‘last’).
After we have trained Lara, we use another piece of
code to find a path from a given node to O; we call
this code Una (for ‘untangle’). Starting from a given
node, Una asks Lara how this node is classified, and
then performs the opposite move; then the same step
is repeated until O is reached. For example, if Una
starts at node D, Lara classifies this node as ‘move
down’; Una therefore applies the opposite move, ‘move up’, reaching node A, then asks Lara again, and so on, until O is reached.
Figure 3: Diagram showing how Una works. A trained classifier Lara is applied to a state to produce an action; then the action is reversed.
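For concreteness, Una's loop might look like the following Python sketch. The functions `lara_classify`, `reverse_move`, `apply_move` and `is_origin` are hypothetical placeholders standing in for the trained classifier and for whatever state and move representation is actually used.

```python
def una_untangle(state, lara_classify, apply_move, reverse_move,
                 is_origin, max_steps=1000):
    """Sketch of Una's loop.

    `lara_classify(state)` is assumed to return the move by which Lara
    thinks the state was reached from its parent in the tree;
    `reverse_move` gives the opposite move; `apply_move` performs a move
    on a state; `is_origin` tests whether O has been reached.  All of
    these are hypothetical placeholders.
    """
    path = []
    for _ in range(max_steps):
        if is_origin(state):
            return path                      # reached O
        last_move = lara_classify(state)     # e.g. 'move down' for node D
        undo = reverse_move(last_move)       # e.g. 'move up'
        state = apply_move(state, undo)
        path.append(undo)
    return None                              # gave up within max_steps
```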
Note that the tree does not include all nodes in the
network, for example, node E in Figure 2 is not in the
tree. Lara is not trained on the nodes which are not in
the tree. However, due to the regular structure of the
network, we can hope that Lara will be able to gen-
eralize to these nodes and produce a reasonable rec-
ommendation for Una. For example, for node E one
can hope that Lara will conjecture that this node can
be reached from above or from the left, thus directing
Una, correctly, up or to the left.
In this paper we apply this idea to untangling
braids. However, before we delve into braid theory,
let us explain how this idea could potentially work on
another, more familiar example. Suppose we want to
train an agent to solve the Rubik’s cube. Visualise
a network in which nodes are all possible positions
of the Rubik’s cube (there are about $10^{20}$ of them),
and in which two nodes are connected with an edge
if the nodes can be produced from one another by
one move (that is, one face turn). The network has
a regular structure, with each node connected to its
neighbors by a small number of possible moves. De-
note the solved position of the Rubik’s cube by O.
As described above, we can perform a breadth-first search in the network starting from O. As a rough estimate, conducting this breadth-first search up to depth d reaches $10^d$ nodes. It is known that one needs
20 moves (or 26 moves, depending on the exact def-
inition of moves (Kunkle and Cooperman, 2007)) to
reach every node from O, and it is not feasible to build
a tree containing $10^{20}$ nodes. Realistically, a tree that
one can build would be much smaller, for example, a
tree can contain about $10^7$ nodes. Thus, it is impor-
tant that Lara can generalize well from this relatively
small tree to the unfathomably large network.
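The tree on which Lara is trained can be produced by an ordinary breadth-first search from O; one way to do this is sketched below. The function `neighbors(state)`, assumed to yield (move, next state) pairs, is a hypothetical placeholder, and the returned dictionary maps every discovered state to the last move on a shortest path from O, which is exactly the label that Lara has to predict.

```python
from collections import deque

def build_training_tree(origin, neighbors, max_depth):
    """Breadth-first search from `origin` up to depth `max_depth`.

    `neighbors(state)` is assumed to yield (move, next_state) pairs for
    the states reachable from `state` in one move.  The returned dict
    maps each discovered state to the last move used to reach it from
    the origin, i.e. the label on which Lara is trained.
    """
    last_move = {origin: None}
    frontier = deque([(origin, 0)])
    while frontier:
        state, depth = frontier.popleft()
        if depth == max_depth:
            continue
        for move, nxt in neighbors(state):
            if nxt not in last_move:          # keep only the first (shortest) path
                last_move[nxt] = move
                frontier.append((nxt, depth + 1))
    return last_move
```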
3 BRAIDS
Braids are mathematical objects from low-
dimensional topology or, to be more precise,
knot theory (that is, the study of the relative position
of curves in space). A braid on n strands consists
of n ropes whose left-hand ends are fixed one under
another and whose right-hand ends are fixed one
under another; you can imagine that the braid is laid
out on a table, and the ends of the ropes are attached
to the table with nails. Figures 4, 5 show examples
of braids on 3 strands with 10 crossings. Braids are
important because, on the one hand, they are useful
building blocks of knots and other constructions of
low-dimensional topology and, on the other hand,
have a simple structure and can be conveniently
studied using mathematics and, as in this study,
experimented with using computers.
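Although braids are described here pictorially, for computations they are commonly encoded as words in the Artin generators: a crossing of strands i and i+1 is written as the signed integer +i or -i, depending on which strand passes over. The snippet below shows this encoding on a made-up example; it is an assumed representation given only for illustration, not necessarily the one used elsewhere in the paper.

```python
# A braid on 3 strands with 10 crossings, encoded as a list of signed
# Artin generators: +i / -i denotes a positive / negative crossing of
# strands i and i+1.  This particular word is made up for illustration
# and is not one of the braids shown in Figures 4 and 5.
example_braid = [1, -2, 1, 2, -1, -1, 2, -2, 1, -1]

n_strands = 3  # generators range over 1 .. n_strands - 1
assert all(1 <= abs(g) <= n_strands - 1 for g in example_braid)

# The braid with no crossings at all is the empty word.
canonical_trivial_braid = []
```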
The braids in Figures 4, 5 can be untangled, that is,
all crossings can be removed by moving certain parts
of strands up or down, as needed (without touching
the ends of the ropes); after the braid is untangled, the
braid diagram will look as in Figure 6, which shows
what we will call the canonical trivial braid. Not ev-
ery braid can be untangled. Those braids that can
be untangled are called trivial braids. The task that
we explore in this research is untangling braids using
ideas from Section 2.
When one studies braids (or knots) and how to un-
tangle them, the untangling process is split into el-
ementary local changes, affecting 2 or 3 consecutive
crossings, called Reidemeister moves (Kassel and Tu-
raev, 2008). Somewhat confusingly, the moves for
untangling braids are called the second Reidemeister
move and the third Reidemeister move; there exists a
move called the first Reidemeister move, but it is used
only with knots and not with braids (Lickorish, 2012).
Please see all forms of the second Reidemeister move
in Figures 7, 8. The meaning of each of these figures
is that a braid fragment shown on the left can be re-
placed by a braid fragment shown on the right, or vice
versa.
All forms of the third Reidemeister move are
shown in Figures 9, 10, 11, 12, 13, 14.
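In the signed-generator encoding sketched above, the second Reidemeister move (in its removing direction) deletes an adjacent pair of mutually inverse crossings, and one form of the third Reidemeister move replaces a fragment $\sigma_i \sigma_{i+1} \sigma_i$ by $\sigma_{i+1} \sigma_i \sigma_{i+1}$. The sketch below illustrates these two cases only; the other variants shown in the figures, which differ in the signs of the crossings, are handled analogously and are omitted here.

```python
def reidemeister_2(word, pos):
    """Second Reidemeister move, removing direction: if the crossings at
    positions pos and pos+1 are mutually inverse, delete both."""
    if pos + 1 < len(word) and word[pos] == -word[pos + 1]:
        return word[:pos] + word[pos + 2:]
    return None  # the move is not applicable at this position

def reidemeister_3(word, pos):
    """One form of the third Reidemeister move: replace the fragment
    (i, i+1, i) by (i+1, i, i+1) for positive crossings i and i+1."""
    if pos + 2 < len(word):
        a, b, c = word[pos], word[pos + 1], word[pos + 2]
        if a == c and a > 0 and b == a + 1:
            return word[:pos] + [b, a, b] + word[pos + 3:]
    return None  # the move is not applicable at this position

# The second move removes two crossings; the third only rearranges them.
assert reidemeister_2([1, -1, 2], 0) == [2]
assert reidemeister_3([1, 2, 1], 0) == [2, 1, 2]
```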
As you can see, the second Reidemeister move, when applied in one of its two directions, removes two cross-
ings from a braid; thus, if our aim is to untangle
the braid, it seems like a good move to use. How-
ever, not every Reidemeister move removes cross-
ings from a braid; one can say that some Reidemeis-
ter moves (including all versions of the third Reide-