area and perimeter features deal with both size and
shape of the feature. For example, a very large perimeter
for a small area is indicative of a rough contour or a
concave region. The Intensity Sum, Intensity Mean,
Intensity Variance, Minimum Intensity and Maximum
Intensity features deal with the intensity distribution of
the region. The sum and mean of the intensities are
indicative of the overall brightness of the region. The
intensity variance and minimum and maximum inten-
sities are indicative of the overall change in intensity
from the centre of the region to the edge of the region.
A very high variance in intensity or a large
maximum-to-minimum intensity ratio is indicative
of an unevenly bright spot. The radius mean, radius
variance, minimum radius and maximum radius are
calculated as the mean, variance, minimum and maximum,
respectively, of the distances from the centre of
the region to each contour pixel. The radius mean
and radius variance features also deal with the overall
shape of the region. True beads tend to show a slightly
oval shape and therefore have a moderate radius vari-
ance and a rather stable minimum to maximum radius
ratio. For any particular manufacturing image, beads
are also found to be oriented in the same way. The
orientation feature is used to encapsulate this prop-
erty. It is calculated as the angle the longest axis of
the region makes with the y-axis.
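The radius and orientation features described above can be illustrated with a minimal sketch. The function names `radial_features` and `orientation_deg` are hypothetical. Two choices are assumptions of this sketch rather than details given in the text: the region centre is taken as the centroid of the contour pixels, and the longest axis is taken as the longest chord between two contour pixels.

```python
import math
from statistics import fmean, pvariance


def radial_features(contour):
    """Radius statistics from a list of (x, y) contour pixels.

    The centre is assumed to be the centroid of the contour pixels.
    """
    cx = fmean([x for x, _ in contour])
    cy = fmean([y for _, y in contour])
    radii = [math.hypot(x - cx, y - cy) for x, y in contour]
    return {
        "radius_mean": fmean(radii),
        "radius_variance": pvariance(radii),
        "radius_min": min(radii),
        "radius_max": max(radii),
    }


def orientation_deg(contour):
    """Angle in degrees, in [0, 180), between the longest chord and the y-axis.

    Brute-force search over all pixel pairs; adequate for small contours.
    """
    best_len, best_dx, best_dy = -1.0, 0.0, 0.0
    for i, (x1, y1) in enumerate(contour):
        for x2, y2 in contour[i + 1:]:
            length = math.hypot(x2 - x1, y2 - y1)
            if length > best_len:
                best_len, best_dx, best_dy = length, x2 - x1, y2 - y1
    # atan2(dx, dy) measures the angle from the y-axis; an axis has no
    # direction, so the result is folded into [0, 180).
    return math.degrees(math.atan2(best_dx, best_dy)) % 180.0


# Synthetic slightly oval contour: semi-axes 4 (along x) and 2 (along y).
oval = [(4 * math.cos(math.radians(t)), 2 * math.sin(math.radians(t)))
        for t in range(0, 360, 10)]
features = radial_features(oval)
angle = orientation_deg(oval)
```

For this synthetic oval the minimum-to-maximum radius ratio is about 0.5, and the longest axis lies along x, so the angle with the y-axis is about 90 degrees.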
A binary one-to-one support vector machine is
then trained using the above 12 features to classify
the extracted regions as bead and non-bead regions.
Classification results are presented in Section 5.
3.2 Fluorescent Bead Detection
The same set of features extracted for detecting
manufacturing beads is also extracted for fluorescent
bead detection. However, for the final classification,
a number of support vector machine classifiers are
trained instead of a single support vector machine.
This is necessary because beads of different batches
in the manufacturing process differ heavily in
appearance, in both size and intensity. Furthermore,
estimating the batch of the bead directly from the
intensity becomes a multi-class classification problem,
which can significantly reduce the accuracy, even
when classification is done by a max-wins voting
strategy. For our particular problem we found that training
a single one-versus-all support vector machine classifier
for each batch gave the best results.
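The one-versus-all arrangement can be illustrated with a short sketch. The helper names `one_vs_all_labels` and `assign_batch` are hypothetical. The rule used to combine the per-batch classifier outputs (take the batch with the largest positive decision value, otherwise reject the region) is an assumption of this sketch, since the text does not specify how the outputs are combined.

```python
def one_vs_all_labels(batch_ids, target_batch):
    """Binary training labels for the classifier of one batch.

    +1 for regions of `target_batch`, -1 for everything else
    (other batches and non-bead regions, marked here with None).
    """
    return [1 if b == target_batch else -1 for b in batch_ids]


def assign_batch(decision_values):
    """Combine per-batch classifier outputs (assumed rule, not from the text).

    `decision_values` maps batch id -> signed decision value of that batch's
    classifier for one region. Returns the winning batch id, or None if no
    classifier accepts the region.
    """
    batch, score = max(decision_values.items(), key=lambda item: item[1])
    return batch if score > 0 else None


# Hypothetical training set: one batch id per extracted region, None = non-bead.
batch_ids = ["A", "B", None, "A", "C"]
labels_for_a = one_vs_all_labels(batch_ids, "A")
```

Each batch gets its own binary label vector built this way, so every classifier only has to separate one batch from all remaining regions.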
4 BEAD PATTERN MATCHING
Once the bead patterns and their respective batches
have been estimated, the relative locations of the beads
are used to find matches. For a particular batch the
intensity, shape, size and orientation of all beads are very
similar and therefore these features cannot be used to
distinguish between them. The only feature that dis-
tinguishes a bead is the relative position of other beads
with respect to that bead. That is, the pattern formed
by the neighbours of a bead is the identifier of the
bead.
The bead matching is done in two steps. In the first
step, the graph spectrum of the fully connected weighted
graph formed using the bead and its 3 nearest neighbours
is used to find a region of the manufacturing image
that is most likely to have a matching pattern. The
spectrum of the affinity matrix of a graph has the useful
property of being invariant to rotation and labelling.
This allows the first step of the matching to be rotation
invariant. Furthermore, using the normalized Laplacian
of the graph instead of the adjacency matrix makes
the matching invariant to scale. The edge weights
are simply the Euclidean distances between the bead
centres. The graph spectrum is calculated by an
eigenvalue decomposition of the normalized graph
Laplacian, which is calculated as follows:
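\[
L(u, v) =
\begin{cases}
1, & \text{if } u = v \\
-\dfrac{w(u, v)}{\sqrt{d_u d_v}}, & \text{if } u \text{ and } v \text{ are adjacent} \\
0, & \text{otherwise}
\end{cases}
\tag{1}
\]
where
\[
d_u = \sum_{v} w(u, v) \tag{2}
\]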
and w(u, v) is the weight of the edge between nodes u
and v.
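As an illustration of Eqs. (1) and (2), the following is a minimal NumPy sketch of the spectrum computation for one bead and its nearest neighbours. The function names `knn_pattern` and `normalized_laplacian_spectrum` and the bead coordinates are hypothetical, and this is a sketch under those assumptions rather than the implementation used for the reported results.

```python
import numpy as np


def knn_pattern(points, index, k=3):
    """Coordinates of bead `index` together with its k nearest neighbours."""
    pts = np.asarray(points, dtype=float)
    dist = np.linalg.norm(pts - pts[index], axis=1)
    nearest = np.argsort(dist)[: k + 1]  # includes the bead itself (distance 0)
    return pts[nearest]


def normalized_laplacian_spectrum(pattern):
    """Ascending eigenvalues of the normalized Laplacian of the fully
    connected graph whose edge weights are Euclidean distances."""
    pts = np.asarray(pattern, dtype=float)
    # w(u, v): pairwise distances between bead centres (zero on the diagonal).
    w = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    d = w.sum(axis=1)                    # d_u = sum_v w(u, v), Eq. (2)
    lap = -w / np.sqrt(np.outer(d, d))   # off-diagonal entries of Eq. (1)
    np.fill_diagonal(lap, 1.0)           # L(u, u) = 1
    return np.linalg.eigvalsh(lap)       # symmetric matrix, real eigenvalues


# Illustrative bead centres (made-up coordinates).
pattern = np.array([[0.0, 0.0], [1.0, 0.2], [0.3, 1.1], [-0.8, 0.5]])
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
# Relabel, rotate, scale and translate the same pattern.
moved = 2.5 * pattern[[2, 0, 3, 1]] @ rot.T + np.array([10.0, -4.0])
same = np.allclose(normalized_laplacian_spectrum(pattern),
                   normalized_laplacian_spectrum(moved))
```

Scaling all distances by a common factor scales each d_u by the same factor, so the ratio in Eq. (1) is unchanged; this is why the two spectra agree for the scaled pattern as well as for the rotated and relabelled one.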
The choice of the number of nearest neighbours
depends on the amount of mismatch in the graphs.
For our implementation the value of 3 was chosen
empirically. Using 3 nearest neighbours means that, in
order to find a correct match, there should be at least
one bead in the fluorescent image whose 3-nearest-neighbour
pattern matches the 3-nearest-neighbour
pattern of its true corresponding bead in the manufacturing
image. This is a reasonable assumption, particularly
for dense patterns. In the case of sparse patterns,
using even 2 nearest neighbours produced good results.
Using a small number of nearest neighbours is necessary
because in this step we intend to find matches
which are very similar to each other. In particular, we
try to find matches where nodes are not missing and
differences in the two graphs are only because of error
in determining the location of the beads during the
detection process. This step, however, provides many
possible matches and is used to locate the regions
of the manufacturing image that are likely to have the