Benchmarking Binarisation Techniques for 2D Fiducial Marker Tracking
Yves Rangoni and Eric Ras
Public Research Centre Henri Tudor, 29 Avenue J. F. Kennedy, L-1855 Luxembourg-Kirchberg, Luxembourg
Keywords:
Binarisation, Tangible User Interface, Object Tracking, 2D Marker Recognition, Computer Vision Frame-
work, Benchmark.
Abstract:
This paper proposes a comparative study of different binarisation techniques for 2D fiducial marker track-
ing. The application domain is the recognition of objects for Tangible User Interface (TUI) using a tabletop
solution. In this case, the common technique is to use markers, attached to the objects, which can be identi-
fied using camera-based pattern recognition techniques. Among the different operations that lead to a good
recognition of these markers, the step of binarisation of the greyscale image is the most critical one. We propose to investigate how this important step can be improved not only in terms of quality but also in terms of computational efficiency. State-of-the-art thresholding techniques are benchmarked on this challenging task. A real-world tabletop TUI is used to perform an objective and goal-oriented evaluation through the ReacTIVision framework. A computationally efficient implementation of one of the best window-based thresholders is proposed in order to satisfy the real-time processing of a video stream. The experimental results reveal that an improvement of up to 10 points in the fiducial tracking recognition rate can be reached when selecting the right thresholder over the embedded method, while being more robust and still remaining time-efficient.
1 INTRODUCTION
Binarisation is a process that converts an image
from colour or greyscale space to a black-and-white
representation called a binary image. Many opti-
cal recognition solutions need this technique as a pre-processing step to improve the chances of successful recognition.
It is a challenging task that occurs quite early in a recognition stream (Lee et al., 1990), so that failing at this step generally has irreversible effects on the following steps. Sometimes, in the worst case, binarisation is even mandatory, when the following steps of the recognition can only work on binary images. In other cases, binarisation is in fact similar to segmentation, and is also a crucial task to tackle.
Among the different application domains, binarisation has always been a challenging problem, for example in optical music recognition (Ashley et al., 2007) or document image analysis (Trier and Taxt, 1995), (Sezgin and Sankur, 2004). It is still a relevant research area in image, video, and computer vision applications (Saini et al., 2012) and is widely applied in fields like medical imaging, remote sensing, and image retrieval.
In this paper, the application is the tracking of 2D fiducial markers on a tabletop Tangible User Interface (TUI). TUIs are designed to allow users to interact with digital information through the physical environment, more precisely through physical objects called tangibles. There are many situations where using physical objects is still more intuitive than classical computer devices, especially when tasks are solved collaboratively, i.e., a multitude of physical objects can provide multiple access points to a team of users. The aim of a TUI is then to go be-
yond the WIMP “Windows, Icons, Menus & Point-
ers” paradigm (van Dam, 1997), which is currently
the classical way for interacting with computers. The
state of the art proposes several frameworks and sys-
tems for interacting through the physical environment
while maintaining a coupling with a digital model
(Ishii, 2008), (Dunser et al., 2010), (Ullmer and Ishii,
2001).
To be able to deploy such an interactive user in-
terface, a specific hardware and software framework
must be designed. One of the most common ways for
tracking the physical objects is to use computer vi-
sion methods on specific fiducial markers attached to
the physical objects.
The issue raised in this paper is then quite close to the detection of 2D-matrix QR barcodes (Chen et al., 2012), with the extra constraints of real-time processing and accurate position and rotation angle detection.
For a fast and robust tracking, especially on a tabletop TUI, a good technical solution consists in a method called "diffuse illumination", as proposed in (Kaltenbrunner et al., 2006). It consists in evenly distributing light in the space underneath the table top. This diffuse light shines through a frosted acrylic glass pane on the table top (Fig. 1).

Figure 1: ReacTIVision framework diagram (http://reactivision.sourceforge.net).

Figure 2: Some samples of fiducial markers from ReacTIVision.
Special patterns are usually printed on the bottom
of the objects (Fig. 2). When an object is on the table,
the marker and the glass pane are facing each other,
and the marker can be recognised by the tracking sys-
tem by transparency.
The 2D markers reflect the infra-red light back
through the glass surface to the inner bottom of the
table where a camera records the image (Fig. 3).
The frosted acrylic glass is translucent, which means that it is, on the one hand, transparent enough to allow the camera to see the black and white symbols and, on the other hand, opaque enough to display some visual feedback from a video projector on the surface.
The camera has an infra-red pass-through filter and a wide-angle lens. It captures almost the entire surface while blocking out visible light. The marker/non-marker segmentation is then much easier to perform; for example, the image from the video projector, which is noise for marker tracking, is not captured by the modified camera.
This kind of set-up has already been developed by
us (Maquil and Ras, 2012). It will be described in
Section 4, and is used for experimentation in this pa-
per.
Figure 3: Example of an image of five markers acquired by an IR camera. (Note: this image has been enhanced in terms of contrast and luminosity to improve readability when printed or accessed electronically. The original raw image has a very tight tonal range (numbers in red).)

Several optical marker tracking systems exist, such as ARToolKit (Kato and Billinghurst, 1999) or the D-touch system of (Costanza and Robinson, 2003), but some studies (Zhang et al., 2002) show that they perform slowly when the image's size increases.
The contribution of this paper is to enhance the recognition method proposed by (Bencina et al., 2005) by tuning the binarisation step, which provides the best possible input for the "topological region adjacency based fiducial recognition" introduced by the previously cited authors.
The paper is organised as follows: first, a concise survey of marker recognition is given. Second, the different benchmarked binarisation methods are described. Then, after a description of the experimental set-up, the results of the binarisation benchmark are discussed in Section 5. Finally, just before concluding, a section on a fast and concrete implementation is presented.
2 MARKER RECOGNITION
This section explains how the recognition system works. We consider the method presented here to be the best trade-off among the available state-of-the-practice technologies. The binarisation efficiency will be evaluated through the results of this recognition method, i.e. a good binarisation is one that decreases the error rate. We emphasise that the evaluation is goal-oriented and context-dependent, but we can at least propose a fair and quantifiable metric, contrary to subjective or global approaches (Leedham et al., 2002), which have been discarded on purpose.
Figure 4: (a) ReacTIVision fiducial, (b) black and white leaves and their average centroid, (c) black leaves and their average centroid, (d) vector used to deduce the fiducial's orientation, (e) corresponding left heavy depth sequence (Bencina et al., 2005).

The topological region adjacency-based fiducial recognition of (Bencina et al., 2005) constructs a region adjacency graph from a binary image through a
process of segmentation. It basically describes the hi-
erarchy of the image as a tree of transitions between
white and black blobs and vice-versa. In this un-
ordered rooted tree, the image regions can be seen
as Russian dolls. The data structure that stores the
tree is highly efficient; all the basic operations are in
O(log(n)) time complexity and O(n) space complex-
ity (Cormen et al., 2001).
The symbols also use a predefined topology of black and white regions (Fig. 4a). The fiducial tree generation makes use of the number of nodes, the maximum depth, and the number of black and white leaves (Fig. 4b) to create a set of unique identifiers. The effective drawing of a marker is performed by a genetic algorithm which also takes into account the space between regions and the location of the centroids (Fig. 4c). The orientation is deduced from the two centroids (Fig. 4d).
The recognition then consists in finding sub-graphs in the region adjacency graph that have exactly the same "encoding" as those of the generated markers. In the ReacTIVision framework, the method of (Asai et al., 2003), the left heavy depth sequence (Fig. 4e), has been used to perform the sub-graph scanning.
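To make the encoding more concrete, the following minimal Python sketch shows one way a canonical depth sequence for an unordered rooted tree could be computed. It is an illustrative approximation of the left heavy depth sequence of (Asai et al., 2003), not the ReacTIVision implementation; the exact ordering convention may differ, and the tree representation is an assumption of this sketch.

```python
def depth_sequence(children, node, depth=0):
    """Canonical depth sequence of an unordered rooted tree (sketch).

    `children` maps a node to the list of its children, i.e. an adjacency
    structure comparable to a candidate sub-tree of the region adjacency
    graph. Sibling sub-sequences are sorted so that isomorphic trees always
    yield the same sequence, whatever the order in which their regions were
    discovered during segmentation.
    """
    sub = sorted((depth_sequence(children, c, depth + 1)
                  for c in children.get(node, [])), reverse=True)
    seq = [depth]
    for s in sub:
        seq += s
    return seq

# Two isomorphic topologies listed in different orders give the same code,
# which is what makes such a sequence usable as a marker identifier.
t1 = {"r": ["a", "b"], "a": ["c"], "b": []}
t2 = {"r": ["b", "a"], "a": ["c"], "b": []}
assert depth_sequence(t1, "r") == depth_sequence(t2, "r")  # [0, 1, 2, 1]
```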
It can easily be understood that the method works and is efficient if and only if the input image is totally clean. In practice, a single pixel that breaks a region will produce a completely different adjacency graph from the expected one. Any salt-and-pepper noise is likely to add leaves to the graph and, as a consequence, destroy the expected chain codes, making the sub-graph identification impossible or creating false-positive detections.
In (Bencina et al., 2005), the authors acknowledged that the binarisation step is important. They proposed a tile-based variation of Bernsen's method (Bernsen, 1986). They also pointed out that this choice was mainly driven by speed considerations. They mentioned having tested some other alternatives, but there is neither additional information about them nor further discussion on the pivotal role of the thresholder in the global tracking system.

In practice, when dealing with diffuse illumination, practitioners are faced with many day-to-day problems concerning illumination, camera behaviour and limitations, marker printing, hardware tuning, calibration, etc. In return, they often get poor tracking results, which are simply unacceptable for the end-users. As a result, we decided to devote a complete study to optimising the binarisation step.
3 BINARISATION METHODS
For several decades, many binarisation methods have been developed. In (Sezgin and Sankur, 2004), the authors proposed a taxonomy in six categories, depending on the exploitation of: histogram shape information, measurement space clustering, histogram entropy information, image attribute information, spatial information, and local characteristics.

Choosing the right binarisation is never straightforward (Chen et al., 2012), and there are no rules or guidelines to support the decision. The only solution is to benchmark them all. On top of that, the most advanced techniques require some empirical parameters to be set. Their fine tuning is crucial to obtain a good behaviour, and this task can be as difficult as the recognition itself, in the sense that it resembles Sayre's paradox.
Binarisation and segmentation can be seen as similar problems, and finding a good method invariably runs into two difficulties (Zhang et al., 2003): the inability to effectively compare the different methods, or even different parameterizations of any given method, and the inability to determine whether one binarisation method or parameterization is best for all images or for a class of images (e.g. natural images, medical images, etc.).
Even a fair benchmarking of the algorithms is not possible unless the application, the constraints, the goals, and the way to find the best parameters for each algorithm are clearly specified.

In the context of 2D marker tracking, we can assume that the task to perform is a segmentation of non-destructive testing (NDT) images, as termed by (Sezgin and Sankur, 2004).
The methods we benchmarked are:
ICPRAM2014-InternationalConferenceonPatternRecognitionApplicationsandMethods
618
Histogram-based: Sezan (1985), Tsai (1985),
Ramesh et al. (1995)
Measurement-space or clustering-based: Ridler
and Calvard (1978), Otsu (1979), Lloyd (1985),
Kittler and Illingworth (1986), Yanni and Horne
(1994), Jawahar et al. (1997)
Entropy-based: Dunn et al. (1984), Kapur et
al. (1985), Abutaleb (1989), Li and Lee (1993),
Shanbhag (1994), Yen et al. (1995), and Brink
and Pendock (1996)
Local thresholding, window based: White and
Rohrer (1983), Bernsen (1986), Niblack (1986),
Mardia and Hainsworth (1988), Yanowitz and
Bruckstein (1989), Sauvola and Pietikäinen
(2000), and Gatos et al. (2004)
Original ReacTIVision’s thresholders: simple,
simple adaptive, overlapped adaptive, tiled
Bernsen.
Full references for these well-known algorithms can be found in (Sezgin and Sankur, 2004). For the ReacTIVision methods, the simple thresholder uses a global and fixed value for binarising any image. The simple adaptive one considers the image subdivided into a grid; for each cell of the grid, the mean grey value is taken as the threshold for binarisation. The overlapped adaptive one works like the simple adaptive, except that the mean value is computed in a wider cell than the one where the binarisation is applied. The tiled Bernsen is based on (Bernsen, 1986): it works separately on each cell of the image, computes the mean intensity in a larger surrounding cell, and areas of low dynamic range are set to black (Bencina et al., 2005).
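As an illustration of how lightweight these embedded methods are, the following NumPy sketch reproduces the spirit of the simple adaptive thresholder (each grid cell is thresholded by its own mean); the cell-size parameter and its default value are our own assumptions, not ReacTIVision's.

```python
import numpy as np

def simple_adaptive(img, cell=32):
    """Sketch of a simple adaptive (tile-based) thresholder.

    The greyscale image is split into a grid of `cell` x `cell` tiles and
    each tile is binarised against its own mean grey value.
    """
    out = np.empty_like(img, dtype=np.uint8)
    h, w = img.shape
    for y in range(0, h, cell):
        for x in range(0, w, cell):
            tile = img[y:y + cell, x:x + cell]
            out[y:y + cell, x:x + cell] = np.where(tile > tile.mean(), 255, 0)
    return out
```

In the same spirit, the overlapped adaptive variant would simply compute the mean on a window larger than the tile being written.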
4 CONTROLLED EXPERIMENT
As presented in Section 1, a tabletop TUI set-up (Maquil and Ras, 2012) is used for experimentation (Fig. 5). The following sections elaborate on the environmental setting of the controlled experiment, describe how the sample dataset was created in order to compare the different binarisation methods, and present the data collection method.
4.1 Environmental Setting
In our experiment, we use a wide-angle, IR-modified version of the PlayStation Eye camera (http://peauproductions.com/store/index.php?main_page=product_info&cPath=1_110&products_id=568). It captures 640×480 pixel frames at up to 75 fps and analyses, in our case, a rectangular surface of 70×110 cm. This webcam embeds a fairly good sensor chip allowing for more effective low-light operation. Another nice feature is its capability of outputting uncompressed video, so that there are no compression artefacts compared to some other webcams.

Figure 5: Tabletop TUI used for experimentation. It is a hollow frame of wood. White paper at the bottom reflects the IR light coming from the upper borders. The camera is positioned on the floor, between the projector and the mirror, recording by transparency the markers placed on the top.
Because of the camera's fish-eye effect and its relatively low resolution compared to the surface to watch, we avoid putting the 2D markers at the extreme corners of the table. We decided on a "dead zone" of 10% (i.e. a border of around 5 cm on the surface).
The 2D markers we used are printed on white paper with a laser printer. They all come from the miniset amoeba symbol database, composed of exactly twelve markers (6 with a white background, and 6 with a black background). Each of them is printed on a 7×7 cm surface, and a second set is printed on 5×5 cm pieces of paper. The first set is quite easy to recognise because the pixel density is sufficient to see each black or white dot of a fiducial. The second set is quite close to the limits of the sampling capabilities of the camera: only one or two pixels can cover each dot of a fiducial, so the poorer binarisation methods are penalised.

The factor of the controlled experiment was the binarisation method being tested. The dependent variable, the marker recognition score, is explained later.
4.2 Preparing The Sample Frames
To evaluate each binarisation method fairly and in a real application case, a protocol has been set up. A frame of markers consists in a greyscale snapshot from the camera while the twelve markers are on the table. The markers are located randomly on the table, with a random orientation as well. Each frame is stored to feed a database for experimentation. Position and orientation are always randomised before acquiring a frame. To create a representative database, several snapshots have been taken over 5 days, at four different hours of the day, so that the illumination of the frames is never the same. A series of five frames was captured for each configuration and for each set of fiducials. Four configurations with two sets, five frames each, during five days, create a database of 200 sample frames. Hence, the database is composed of 2 400 markers to recognise. We took care to voluntarily "stress" the hardware by printing small markers and positioning them in places where the digital camera experiences the strongest difficulties (mainly at the borders of the table).
4.3 Data Collection
Each binarisation method is applied on the 200 frames (Fig. 6). For the methods requiring parameters, like the window-based ones, the best parameter values are chosen by exhaustive search.
Dependent variable: The metric to evaluate the quality of a binarisation algorithm takes into account the recognition results of the left heavy depth sequence. In addition, if a 2D marker M is recognised, the precision of the detection of its position $P_{rec}(M)$ as well as of its rotation angle $R_{rec}(M)$ is taken into account. A marker is said to be perfectly recognised (a score S equal to 1) if it appears on the table and if its position and orientation are perfect ($P_{opt}(M)$, $R_{opt}(M)$). Otherwise, if present, the initial score of 1 is decreased by the relative distance between the perfect position and the recorded one, and also decreased by the distance between the optimal rotation angle and the recognised one (Eqn. 1).

$$S(M) = 1 - \frac{1}{2}\Big(\big\|P_{rec}(M) - P_{opt}(M)\big\| + \big\|R_{rec}(M) - R_{opt}(M)\big\|\Big) \qquad (1)$$
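A direct transcription of Eqn. 1 is given below as a minimal sketch, under the assumption that positions are normalised table coordinates and rotation angles normalised turns; the normalisation is not detailed in the text and is an assumption of this sketch.

```python
import numpy as np

def marker_score(p_rec, p_opt, r_rec, r_opt):
    """Score S(M) of a detected marker following Eqn. 1 (sketch).

    `p_rec`/`p_opt` are the recognised/optimal 2D positions and
    `r_rec`/`r_opt` the recognised/optimal rotation angles; both are
    assumed normalised so that a perfect detection scores exactly 1.
    """
    pos_err = np.linalg.norm(np.asarray(p_rec, float) - np.asarray(p_opt, float))
    rot_err = abs(float(r_rec) - float(r_opt))
    return 1.0 - 0.5 * (pos_err + rot_err)
```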
Most of the time, when ReacTIVision finds a marker, there is no reason to have a huge error in the location or orientation computation, so that the score is always extremely close to 1. In some cases, it allows making a difference between two binarisations able to identify all the fiducials; the one with better precision in location and orientation is slightly privileged. In extremely rare cases, some binarisations might produce false markers: a phantom marker can appear in the tracking system while there was no marker on the tabletop. In that case, the scoring function quickly degrades the score of this phantom marker.

Figure 6: Snapshot of the binarised image with the tiled Bernsen thresholder of ReacTIVision.
4.4 Implementation
For the most part, the binarisation methods come from the Gamera framework (Droettboom et al., 2003); this includes Abutaleb, Bernsen, Brink, Gatos, Niblack, Otsu, Sauvola, White, and Tsai. Ridler's method comes from the Mahotas library (Coelho, 2013). The ReacTIVision methods are used as they stand. The other ones have been written in Python.

Note that, during preliminary experimentation, it was discovered that there was probably a mistake in the Sauvola implementation of Gamera (line 394 in binarization.hpp, rev. 1389), which was not compliant with the original equation (Eqn. 2). It has been decided to correct it before running this method.
5 RESULTS
The results of the experimentation are summarised in Figure 7. The ranking complies with our expectations: globally, the window-based methods perform best (e.g. Gatos), and these kinds of methods have proved to perform well in other situations. On the contrary, basic methods like REAC simple were doomed to failure because of their inability to deal with non-uniform illumination. Still, none of the methods behaves perfectly. It was quite foreseeable that the embedded methods of ReacTIVision, in italics in the figure, get quite poor results (on average $\bar{S}(\mathrm{REAC}) = 0.61$), although the tiled Bernsen is still well ranked (with a mean score of 0.79). As discussed previously, they were not designed to output the best quality, but to be time-efficient.

Figure 7: Marker recognition scores for each binarisation method. The higher, the better. The box-and-whisker plots are ordered according to the mean score obtained.
In this regard, the experimentation also showed that quality and time have a positive correlation. It is difficult to select a method, considering that the best ones require setting empirical parameters, which can be a hard task. For day-to-day use, a consistent method is probably preferable, like the Sauvola binarisation ($\bar{S}(\mathrm{Sauvola}) = 0.89$, $\sigma_{S(\mathrm{Sauvola})} = 0.006$).
Most of the outliers come from markers positioned close to a border of the table. As can be seen, the results do not exceed 0.9, mainly due to the limitation of the hardware configuration, with the camera recording frames at a density just over 10 dpi. Note that even better results can be achieved if the hardware is not "stressed" as it has been for the purpose of this specific experimentation.
6 TIME CONSIDERATION
Processing a VGA video stream at 75 images per second is in theory almost equivalent to processing a 23-Megapixel image. In practice, working on sequences of "small" images is always slower and cannot take advantage of recent multi-core computers. In our case, the video stream must be processed in order and without any delay, to avoid latency while moving objects on the table. As a consequence, many parallel computing techniques cannot be applied or would be unprofitable.
Although the global trend is changing, the computational efficiency of an algorithm is rarely taken into account. Several hundreds of publications can easily be found on binarisation, or on closely related segmentation problems, but almost none of them really considers the time spent on a CPU to achieve the task. Authors logically favour the quality of the results rather than address the issue of time complexity.
The average complexity of global binarisation techniques is O(n), where n is the total number of pixels in an image; local and window-based techniques are close to O(n²) when the window becomes wide (which is often necessary to reach a good quality). The most advanced methods, based on AI techniques, hierarchical pyramids, or recursive programming, are roughly in the O(n³) complexity class.
As it would not be feasible to write perfect code for each binarisation method that exists in the literature, we propose, as an example, to focus on only one of them. The best trade-off according to the results of the experiments, which also behaves consistently across the entire test set, is the Sauvola binarisation (Sauvola and Pietikäinen, 2000). This window-based method is easy to code; its formula is given by Equation 2, where $m_W(i,j)$ and $\sigma_W(i,j)$ are the mean and the standard deviation of the grey values inside the window W centred on pixel (i, j).

$$T_{W,K}(i,j) = m_W(i,j) \times \left[1 + K\left(\frac{\sigma_W(i,j)}{128} - 1\right)\right] \qquad (2)$$
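Read literally, Eqn. 2 translates into the following naive per-pixel sketch (O(W²) work per pixel); it is kept here only to make the formula explicit, and the default window size and K value are placeholders rather than tuned settings.

```python
import numpy as np

def sauvola_threshold_at(img, i, j, w=15, k=0.3):
    """Naive transcription of Eqn. 2 for a single pixel (i, j)."""
    half = w // 2
    win = img[max(0, i - half):i + half + 1,
              max(0, j - half):j + half + 1].astype(float)
    m, s = win.mean(), win.std()          # m_W(i, j) and sigma_W(i, j)
    return m * (1.0 + k * (s / 128.0 - 1.0))
```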
To prevent a slowdown of the processing when using large values for the sliding window W, the integral image-based solution of (Shafait et al., 2008) has been chosen. It has been implemented directly inside the C++ source code of ReacTIVision as a new FrameThresholder.
Sauvola’s method outputs then a better binarisa-
tion as shown previously, with little extra computation
compared to the default thresholder of ReacTIVision.
It can still proceed at 75fps (Fig. 8) without any con-
straint on the size of the window for thresholding.
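For reference, a NumPy sketch of the integral-image formulation is given below, in the spirit of (Shafait et al., 2008): two integral images give the windowed mean and standard deviation with a constant number of look-ups per pixel, so the run time no longer depends on the window size. It is only a sketch; the actual patch lives in the ReacTIVision C++ code and may differ in details such as border handling.

```python
import numpy as np

def sauvola_integral(img, w=15, k=0.3):
    """Sauvola binarisation (Eqn. 2) using integral images (sketch)."""
    img = img.astype(np.float64)
    h, wd = img.shape
    half = w // 2

    # Zero-padded integral images: ii[y, x] = sum of img[:y, :x].
    ii = np.pad(img, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    ii2 = np.pad(img ** 2, ((1, 0), (1, 0))).cumsum(0).cumsum(1)

    ys, xs = np.arange(h)[:, None], np.arange(wd)[None, :]
    y1, y2 = np.clip(ys - half, 0, h), np.clip(ys + half + 1, 0, h)
    x1, x2 = np.clip(xs - half, 0, wd), np.clip(xs + half + 1, 0, wd)
    area = (y2 - y1) * (x2 - x1)

    # Window sums of intensities and squared intensities (4 look-ups each).
    s1 = ii[y2, x2] - ii[y1, x2] - ii[y2, x1] + ii[y1, x1]
    s2 = ii2[y2, x2] - ii2[y1, x2] - ii2[y2, x1] + ii2[y1, x1]
    mean = s1 / area
    std = np.sqrt(np.maximum(s2 / area - mean ** 2, 0.0))

    threshold = mean * (1.0 + k * (std / 128.0 - 1.0))  # Eqn. 2
    return np.where(img > threshold, 255, 0).astype(np.uint8)
```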
Figure 8: Elapsed time to process the optimised Sauvola binarisation while increasing the sliding window size. The x axis is half of the window size W, e.g. x = 10 means that the window is a square whose side length is 10 × 2 + 1 = 21 pixels. The y axis is the computation time in µs for processing one frame. It can be observed that there is no huge gap in the computation time between a small window (around 2.4 ms for a 3 × 3 square) and a large window (around 2.7 ms for a 145 × 145 square).

Although W could quite easily be deduced from the fiducial's size, we decided to automatically compute both empirical parameters W and K, where K is the sensitivity weight on the adjusted variance σ. We propose an optimisation method similar to (Rangoni et al., 2009). The cost function is in our case the number of correctly recognised markers placed in a known configuration at the beginning of the experiment. We assumed that during a work session on the tangible table, the global illumination is unlikely to change suddenly.
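A minimal sketch of this calibration step is shown below. The candidate grids for W and K, and the `binarise`/`recognise` callables standing in for the thresholder and the ReacTIVision pipeline, are assumptions of the sketch rather than the actual implementation.

```python
def calibrate(frame, expected_ids, binarise, recognise,
              half_windows=range(3, 73, 2), ks=(0.1, 0.2, 0.3, 0.4, 0.5)):
    """Pick (W, K) maximising the number of recognised markers on a frame
    whose marker layout is known (the cost function described above).

    `binarise(frame, w, k)` returns a binary image and `recognise(binary)`
    the set of marker identifiers found in it; both are placeholders here.
    """
    best_w, best_k, best_hits = None, None, -1
    for w in half_windows:
        for k in ks:
            hits = len(recognise(binarise(frame, w, k)) & set(expected_ids))
            if hits > best_hits:
                best_w, best_k, best_hits = w, k, hits
    return best_w, best_k
```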
7 CONCLUSIONS
In this paper, several state-of-the-art binarisation tech-
niques have been benchmarked in the scope of 2D
marker tracking. The evaluation was quantitative and
goal-directed through the use of an existing marker
tracking software, ReacTIVision. The protocol for
evaluation was strict and controlled, and made use
of a real tabletop tangible user interface built by the
authors, so that the outcomes presented are represen-
tative of real-case applications. Similarly to other
computer vision tasks, the results rank window-based
methods first. Among several possible techniques,
we focused on the best trade-off, the Sauvola binarisation, and showed how it can be integrated directly into the ReacTIVision framework to perform efficiently and quickly. Thanks to an improvement of almost 10 points, this method allows better recognition and tracking of the most difficult fiducials while remaining more robust and time-efficient with the current set-up. For future work, apart from improving the hardware itself, working directly in the greyscale space is one of the next steps that could be conducted; another one would consist in using the history of the frames already processed to improve the recognition of the current one.
ACKNOWLEDGEMENTS
The present project is supported by the National Re-
search Fund, Luxembourg and cofunded under the
Marie Curie Actions of the European Commission
(FP7-COFUND).
REFERENCES
Asai, T., Arimura, H., Uno, T., and Nakano, S. I. (2003).
Discovering frequent substructures in large unordered
trees. In 6th International Conference on Discovery
Science, volume 2843, pages 47–61.
Ashley, J., Laurent, B., Greg, P., and Fujinaga, E. I. (2007).
A comparative survey of image binarisation algo-
rithms for optical recognition on degraded musical
sources. In 8th International Conference on Music
Information Retrieval, pages 509–512.
Bencina, R., Kaltenbrunner, M., and Jordà, S. (2005). Improved topological fiducial tracking in the reacTIVision system. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 99–103.
Bernsen, J. (1986). Dynamic thresholding of gray-level
images. In 8th International Conference on Pattern
Recognition, pages 1251–1255.
Chen, Q., Du, Y., Lin, R., and Tian, Y. (2012). Fast QR code
image process and detection. In Wang, Y. and Zhang,
X., editors, Internet of Things, volume 312 of Commu-
nications in Computer and Information Science, pages
305–312. Springer Berlin Heidelberg.
Coelho, L. P. (2013). Mahotas: Open source software for
scriptable computer vision. Journal of Open Research
Software, 1.
Cormen, T. H., Stein, C., Rivest, R. L., and Leiserson, C. E.
(2001). Introduction to Algorithms. McGraw-Hill
Higher Education, 2nd edition.
Costanza, E. and Robinson, J. A. (2003). A region ad-
jacency tree approach to the detection and design of
fiducials. In Vision, Video and Graphics, pages 63–
70.
Droettboom, M., MacMillan, K., and Fujinaga, I. (2003).
The Gamera framework for building custom recogni-
tion systems. In Symposium on Document Image Un-
derstanding Technologies, pages 275–286.
Dunser, A., Looser, J., Grasset, R., Seichter, H., and
Billinghurst, M. (2010). Evaluation of tangible user
interfaces for desktop AR. In International Sympo-
sium on Ubiquitous Virtual Reality, pages 36–39.
Ishii, H. (2008). The tangible user interface and its evolu-
tion. In Communications of the ACM, pages 32–36.
Kaltenbrunner, M., Jordà, S., Geiger, G., and Alonso, A.
(2006). The reacTable: A collaborative musical in-
strument. In Workshop on Tangible Interaction in Col-
laborative Environments, pages 406–411.
Kato, H. and Billinghurst, M. (1999). Marker tracking and
HMD calibration for a video-based augmented reality
conferencing system. In 2nd IEEE and ACM Interna-
tional Workshop on Augmented Reality (IWAR), pages 85–94.
Lee, S. U., Chung, S. Y., and Park, R. H. (1990). A com-
parative performance study of several global thresh-
olding techniques for segmentation. Computer Vision,
Graphics, and Image Processing, 52(2):171–190.
Leedham, G., Varma, S., Patankar, A., and Govindaraju, V. (2002). Separating text and background in degraded document images: A comparison of global thresholding techniques for multi-stage thresholding. In 8th International Workshop on Frontiers in Handwriting Recognition, pages 244–249.
Maquil, V. and Ras, E. (2012). Collaborative problem solv-
ing with objects: Physical aspects of a tangible table-
top in technology-based assessment. In From Re-
search to Practice in the Design of Cooperative Sys-
tems: Results and Open Challenges, pages 153–166.
Rangoni, Y., van Beusekom, J., and Breuel, T. M. (2009).
Language independent thresholding optimization us-
ing a Gaussian mixture modelling of the character
shapes. In Proceedings of the International Workshop
on Multilingual OCR, pages 1–5. ACM.
Saini, R., Dutta, M., and Kumar, R. (2012). A compara-
tive study of several image segmentation techniques.
Journal of Information and Operations Management,
3(1):21–24.
Sauvola, J. and Pietikäinen, M. (2000). Adaptive document image binarization. Pattern Recognition, 33:225–236.
Sezgin, M. and Sankur, B. (2004). Survey over image
thresholding techniques and quantitative performance
evaluation. Journal of Electronic Imaging, 13(1):146–
168.
Shafait, F., Keysers, D., and Breuel, T. M. (2008). Efficient
implementation of local adaptive thresholding tech-
niques using integral images. In Document Recogni-
tion and Retrieval, volume 6815.
Trier, Ø. D. and Taxt, T. (1995). Evaluation of binarization methods for document images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17:312–315.
Ullmer, B. and Ishii, H. (2001). Emerging frameworks for
tangible user interfaces. In Human-Computer Interac-
tion in the New Millennium, pages 915–931. John M.
Carroll, ed. Addison-Wesley.
van Dam, A. (1997). Post-WIMP user interfaces. Commu-
nications of the ACM, 40(2):63–67.
Zhang, X., Fronz, S., and Navab, N. (2002). Visual marker detection and decoding in AR systems: A comparative study. In International Symposium on Mixed and Augmented Reality, pages 97–106.
Zhang, H., Fritts, J. E., and Goldman, S. A. (2003). An
entropy based objective evaluation method for image
segmentation. Storage and Retrieval Methods and Ap-
plications for Multimedia, pages 38–49.
BenchmarkingBinarisationTechniquesfor2DFiducialMarkerTracking
623