A COMBINED TECHNIQUE FOR DETECTING OBJECTS IN
MULTIMODAL IMAGES OF PAINTINGS
Dmitry Murashov
Dorodnicyn Computing Centre of RAS, Vavilov st. 40, 119333, Moscow, Russian Federation
Keywords: Multimodal Images, Information-theoretical Image Difference Measure, Fine-art Paintings, Object
Detection.
Abstract: A combined technique for detecting objects in multimodal images, based on specific object detectors and an image difference measure, is presented. Information-theoretical measures of image difference are proposed. The conditions of applicability of these measures for detecting artefacts in multimodal images are formulated. The technique based on the proposed measures is successfully used for detecting repainting and retouching areas in images of fine-art paintings. It requires segmentation of only one of the analyzed images.
1 INTRODUCTION
In this paper, a problem concerning the analysis of images taken in different spectral bands is considered. Multispectral images are widely used in the restoration and attribution of paintings. Images obtained in different modalities provide information invisible to the human eye (Kirsh, 2000). For example, ultraviolet (UV) fluorescence reveals newly applied materials (repainting or retouching areas, see Figure 1; the painting is kept at the Historical State Museum, Moscow). One of the problems that should be solved to support the formulation of restoration tasks is the discovery and localization of repainting or retouching areas. It is necessary to detect objects or artefacts in the UV image and find their corresponding positions in the visible image.
The images under study are JPEG images of size 1640 by 1950 pixels with a depth of 8 or 24 bits per pixel. Several properties of these images affect the solution of the problem. First, the illumination of the painting is uneven. Second, the regions of interest have various intensity profiles and contrast. Third, the objects may differ in size from tens to several hundreds or even thousands of pixels. Fourth, the objects vary in shape and intensity level. Hence the regions of interest may be partitioned into classes according to their appearance in the images, and different detectors should be applied. We will consider two images $U$ and $V$ of size $m \times n$ of the same scene acquired in different spectral ranges and quantized into $K$ and $L$ gray levels respectively. Let us assume that $K$ objects $O_k^U$, $k = 1, \dots, K$, are visible in image $U$, and $L$ objects $O_l^V$, $l = 1, \dots, L$, are visible in $V$. The objects $O_k^U$ and $O_l^V$ are considered to be connected sets of pixels having gray values $k$ and $l$ respectively. It is necessary to localize the objects $O_k^U$ that are visible in image $U$ and absent in $V$.
Figure 1: Images of the painting taken in optical (a) and
UV (b) spectral bands.
For solving the problem, the following strategy is
proposed: (a) a measure of image difference will be
introduced; (b) taking into account the specific appearance of repainting areas in the ultraviolet image, the dark objects will be localized using
specific detectors; (c) applying the image difference
measure, the repainting areas will be selected from
the objects detected at step (b). Previously developed approaches to detecting differences in images are briefly reviewed in the next section.
2 RELATED WORKS
In (Minakhin, 2009), damages of negatives obtained by the three-color photography technique are detected by a procedure based on logical operations with binary images. The procedure is computationally expensive for large images and does not provide reliable detection.
In (Heitz, 1990), a method for automated detection of hidden information (so-called "events") using a photograph and an X-ray image of a painting is described. Detection of "events" is based on the analysis of a two-level hierarchical description of the image pair. The method seems to be rather computationally expensive, because it requires segmentation and feature extraction in each image of the pair. In (Daly, 1993; Petrovic, 2004), the authors evaluate visual differences in images and multispectral image sequences utilizing a model of the human visual system. The formulation of the problem of evaluating visual differences in images does not fully match the problem considered in this paper, which deals with the detection of objects visible in one image only and invisible in the other. In
(Kammerer, 2004), the authors developed a software tool for visual examination and comparison of IR photographs and a color image to support art historians in understanding the differences and similarities between the preliminary sketches and the final painting. The study of the existing approaches has shown that an efficient technique for solving the problem considered in this paper has not been developed to date. In several works, information-theoretical techniques are used for evaluating the similarity and differences of images.
In the next section, information-theoretical measures of image difference are considered.
3 INFORMATION MEASURES
Information-theoretical techniques work directly with image data; no preprocessing or segmentation is required. In (Viola, 1995; Escolano, 2009,
and many others), the mutual information measure
of image similarity for multimodality image
registration is presented. In paper (Zhang, 2004), a
new information pseudo metric is introduced. The
metric used is a sum of conditional entropies
H(X|Y)+H(Y|X), where X and Y are random
variables denoting grayscale values at pixels in two
images. In (Rockinger, 1998), to evaluate the
temporal stability and consistency of the fused
image sequence, a quality measure based on the
mutual information is proposed.
In this paper, unlike in the image registration and image fusion problems, a measure is needed that detects objects visible only in one image of the analyzed pair and invisible in the other. Therefore the required measure should not be symmetric.
To use the information-theoretical approach, a stochastic model of the relation between the images is needed. Let the grayscale values of the images of different modalities at a point $(x, y)$ be described by discrete random variables $U(x, y)$ and $V(x, y)$, quantized into finite numbers of levels $K$ and $L$ and taking discrete values $u$ and $v$. As the images $U$ and $V$ capture the same scene of the real world, there exists a relation between the variables $U(x, y)$ and $V(x, y)$. A model analogous to the one given in (Escolano, 2009) will be used:

$U(Tr(x, y)) = F(V(x, y)) + \xi(x, y)$,   (1)

where $Tr$ is the coordinate transformation (for registered images we have $U(Tr(x, y)) = U(x, y)$); $F$ is the gray-level transform function relating the images of the two modalities; $\xi(x, y)$ is a random variable modeling the appearance of artefacts. Expression (1) is considered as a model of a discrete stochastic information system with input $V$ and output $U$. The conditional entropies are defined as follows:
$H(U \mid V) = -\sum_{k=1}^{K} \sum_{l=1}^{L} p(u_k, v_l) \log \dfrac{p(u_k, v_l)}{p(v_l)}$,   (2)

$H(V \mid U) = -\sum_{k=1}^{K} \sum_{l=1}^{L} p(u_k, v_l) \log \dfrac{p(u_k, v_l)}{p(u_k)}$,   (3)

where $p(u_k)$ and $p(v_l)$ are the probability mass functions of the variables $U$ and $V$, and $p(u_k, v_l)$ is the joint probability mass function (p.m.f.) of these variables.
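As an illustration of how expressions (2)-(3) can be estimated in practice, the following sketch computes both conditional entropies from the joint histogram of two registered 8-bit grayscale images. This is a minimal sketch in Python/NumPy; the function name is hypothetical, and the default of 64 quantization levels follows the setting used later in Section 4.

```python
import numpy as np

def conditional_entropies(u, v, levels=64):
    """Estimate H(U|V) and H(V|U), expressions (2)-(3), from the joint
    histogram of two registered 8-bit grayscale images."""
    # Quantize both images to the chosen number of gray levels.
    uq = np.clip((u.astype(float) * levels / 256).astype(int), 0, levels - 1)
    vq = np.clip((v.astype(float) * levels / 256).astype(int), 0, levels - 1)
    # Joint histogram -> joint p.m.f. p(u_k, v_l).
    joint, _, _ = np.histogram2d(uq.ravel(), vq.ravel(),
                                 bins=levels, range=[[0, levels], [0, levels]])
    p_uv = joint / joint.sum()
    p_u = np.broadcast_to(p_uv.sum(axis=1)[:, None], p_uv.shape)   # p(u_k)
    p_v = np.broadcast_to(p_uv.sum(axis=0), p_uv.shape)            # p(v_l)
    nz = p_uv > 0
    h_u_given_v = -np.sum(p_uv[nz] * np.log2(p_uv[nz] / p_v[nz]))  # (2)
    h_v_given_u = -np.sum(p_uv[nz] * np.log2(p_uv[nz] / p_u[nz]))  # (3)
    return h_u_given_v, h_v_given_u
```

Under condition (4) below, the first returned value is close to zero, while objects visible only in $U$ increase it.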
We propose to use the conditional entropies $H(U \mid V)$ and $H(V \mid U)$ for estimating the difference of image $U$ from $V$. The following statement gives the conditions for using $H(U \mid V)$ as a measure of image difference.

Statement 1. The difference of images $U$ and $V$ can be measured by the conditional entropy $H(U \mid V)$ if
the following conditions are satisfied:

$p(v_l) = p(u_k, v_l)$, $k = 1, \dots, K$; $l = 1, \dots, L$,   (4)

or

$p(v_l) \ne p(u_k, v_l)$ and $p(u_k) = p(u_k, v_l)$, $k = 1, \dots, K$; $l = 1, \dots, L$,   (5)

where $p(u_k)$, $p(v_l)$, and $p(u_k, v_l)$ are the probability mass functions.
The proof of the statement follows directly from expression (2). The following examples illustrate the statement. Let $K$ objects $O_k^U$, $k = 1, \dots, K$, be visible in image $U$, and $L$ objects $O_l^V$, $l = 1, \dots, L$, in $V$. The images $U$ and $V$ are registered. Let condition (4) be valid. In this case, $H(U \mid V) = 0$ (see expression (2)). This gives $O_k^U \cap O_k^V = O_k^U \cup O_k^V$ for $1 \le k \le M$, $M \le K$, and $O_k^U \supset O_l^V$ for $M < k \le K$, $K < l \le L$ (see Figure 2). The joint histogram of $U$ and $V$ is shown in Figure 3.
Figure 2: Images $U$ and $V$ providing $H(U \mid V) = 0$.
Figure 3: Joint histogram of images U and V shown in
Figure 2.
Let condition (5) be valid. Then $O_l^U \cap O_l^V = O_l^U \cup O_l^V$ for $1 \le l \le M$, $M \le L$, and $O_l^V \supset O_k^U$ for $M < l \le L$ (see Figure 4). The joint histogram of images $U$ and $V$ is shown in Figure 5. In this case, $H(U \mid V) > 0$, and its value depends on the values of the corresponding probability mass function, which is defined by the geometry of the objects. If $p(u_k) \ne p(u_k, v_l)$, then $H(U \mid V)$ will also include information about the objects visible in $V$ and invisible in $U$, which does not meet the formulated requirements to $H(U \mid V)$.
Figure 4: Images $U$ and $V$ providing $H(U \mid V) > 0$.
Figure 5: Joint histogram of images U and V shown in
Figure 4.
For the analysis of image pairs it is necessary to localize and visualize the differences. The value of $H(U \mid V)$, computed with a specified number of quantization levels in the neighborhood of each pixel of the images under study, can be used as an indicator of artefacts in image $U$. The size of the neighborhood and the number of grayscale levels are chosen so as to: (a) satisfy conditions (4)-(5); (b) correctly estimate the probability mass functions; (c) provide reasonable precision of difference localization. The probability mass functions are estimated using the joint histogram of the image pair (Rajwade, 2006).
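A direct (unoptimized) sketch of this localization step is given below, under the assumption that a single window is used both for p.m.f. estimation and for entropy evaluation; the function name and the default parameters are illustrative.

```python
import numpy as np

def local_h_u_given_v(u, v, levels=64, win=11):
    """Map of local H(U|V): the joint p.m.f. of the quantized images is
    estimated in a win x win neighbourhood of every pixel and expression
    (2) is evaluated there (straightforward, unoptimized version)."""
    uq = np.clip((u.astype(float) * levels / 256).astype(int), 0, levels - 1)
    vq = np.clip((v.astype(float) * levels / 256).astype(int), 0, levels - 1)
    r = win // 2
    h_map = np.zeros(u.shape, dtype=float)
    for y in range(r, u.shape[0] - r):
        for x in range(r, u.shape[1] - r):
            uw = uq[y - r:y + r + 1, x - r:x + r + 1].ravel()
            vw = vq[y - r:y + r + 1, x - r:x + r + 1].ravel()
            joint = np.zeros((levels, levels))
            np.add.at(joint, (uw, vw), 1.0)            # local joint histogram
            p_uv = joint / joint.sum()
            p_v = np.broadcast_to(p_uv.sum(axis=0), p_uv.shape)
            nz = p_uv > 0
            h_map[y, x] = -np.sum(p_uv[nz] * np.log2(p_uv[nz] / p_v[nz]))
    return h_map
```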
An example of the application of the measure $H(U \mid V)$ is shown in Figure 6. A color image with objects embedded in the blue channel (a), its red channel (b), and its blue channel (c) are presented. Here, the blue channel is denoted as $U$ and the red channel as $V$. In Figure 6(d) one can see that all of the artefacts embedded in the fragment "sky" are detected using the entropy $H(U \mid V)$, whereas they fail to be detected in the textured regions "forest" and "field", which have a high variance of grayscale values. There it is impossible to choose a pixel neighborhood size and a number of quantization levels that satisfy conditions (4)-(5). In this case, the efficient way to detect artefacts is to use the entropy $H(V \mid U)$ defined by (3).
The following statement provides the conditions for using $H(V \mid U)$ as a measure of image difference.

Statement 2. The difference of images $U$ and $V$ can be measured by the conditional entropy $H(V \mid U)$ if the following conditions are satisfied:

$p(u_k) = p(u_k, v_l)$, $k = 1, \dots, K$; $l = 1, \dots, L$,   (6)

or
$p(v_l) \ne p(u_k, v_l)$ and $p(u_k) \ne p(u_k, v_l)$, $k = 1, \dots, K$; $l = 1, \dots, L$, and   (7)

$\exists u_0 : p(u_0) \ne \sum_{l=1}^{L} p(u_0, v_l)$.   (8)

The proof follows directly after substituting expressions (6)-(8) into (3).
Figure 6: A color image with objects embedded in the blue channel (a), the red channel (b), the blue channel (c), and visualized local values of $H(U \mid V)$ (d).
Condition (6) provides $H(V \mid U) = 0$ if $U$ and $V$ are identical. Conditions (7)-(8) give $H(V \mid U) > 0$ if $U$ contains an object invisible in $V$. Condition (8) gives $O_0^U \cap O_l^V = \emptyset$ for $l = 1, \dots, L$ (see Figure 7(a, b)). The local values of the entropies $H(U \mid V)$ and $H(V \mid U)$ are shown in Figure 7(c, d).
Figure 7: Images $U$ and $V$ (a, b); visualized local values of $H(U \mid V)$ (c) and $H(V \mid U)$ (d).
An example of applying the measure $H(V \mid U)$ to a fragment of a color photograph with high variance values is shown in Figure 8.

Figure 8: (a) Region "field" of the image given in Figure 6(a); (b) visualized local values of $H(V \mid U)$.
The proposed measures of image difference were tested using a color image of size 988 by 1631 pixels (see Figure 6(a)) with uniform and textured regions (sky, forest, field). 140 blurred dark disks of diameter $d$ from 4 to 8 pixels were embedded in one of the color channels (see Figure 6(c)). All of the disks of diameter $d = 8$ pixels were detected using the measures $H(U \mid V)$ and $H(V \mid U)$. The results of the experiment for embedded objects with $d = 4$ pixels are shown in Table 1.
The measures were also tested on twenty images taken from the Berkeley Segmentation Dataset BSDS500. Fifty-four blurred dark disks of diameter 4 and 8 pixels were embedded into one of the color channels. All of the objects with $d = 8$ pixels were detected. On average, more than 95% of the objects with $d = 4$ pixels were successfully localized.
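For completeness, a sketch of how such test objects can be embedded into one colour channel is given below; the darkening depth and blur width are assumed values, since the paper does not report them.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def embed_dark_disks(channel, centers, diameter=8, depth=80, blur=1.0):
    """Embed blurred dark disks of a given diameter into one 8-bit colour
    channel; `depth` and `blur` are assumed values."""
    out = channel.astype(float).copy()
    disks = np.zeros_like(out)
    yy, xx = np.mgrid[0:out.shape[0], 0:out.shape[1]]
    r = diameter / 2.0
    for cy, cx in centers:
        disks[(yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2] = 1.0
    disks = gaussian_filter(disks, sigma=blur)   # blur the disk borders
    return np.clip(out - depth * disks, 0, 255).astype(np.uint8)
```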
The results of calculating the image difference measures on noisy image data (see Figure 1) have shown that it is impossible to obtain a mask of the objects with the required accuracy without the use of special detectors. In the next section, a combined technique based on specific object detectors and the information-theoretical image difference measure is presented.
4 APPLICATION TO THE
IMAGES OF PAINTINGS
The proposed information-theoretical measure of image difference $H(U \mid V)$ is applied to the task of detecting regions of intrusion into the author's paint layer of fine-art paintings, using images in the ultraviolet and visible spectral bands. The input images are shown in Figure 1(a, b). We assume that the images are perfectly registered and that uneven illumination is compensated. The repainting areas appear as dark spots in the UV image (see Figure 1(b)) and are practically invisible in the photograph (see Figure 1(a)). Taking
into account the properties of the objects of interest (see Introduction), detectors of two types are applied to the grayscale version of the ultraviolet image.
Table 1: The results of the test for d = 4.

Image region | Number of objects | Detected objects | Percentage
Sky          | 68                | 66               | 97
Forest       | 28                | 27               | 96
Field        | 44                | 40               | 91
Total        | 140               | 133              | 95
The first detector is aimed at localizing large dark regions (so-called "basins") and is based on the operation of morphological grayscale reconstruction (Soille, 2004). Let $U$ be the grayscale version of the ultraviolet image of the painting. A mask of the dark regions is obtained as

$M_1^U = T(U_{bas} - U_{dom})$,   (9)

where $T(\cdot)$ is the threshold operation, and $U_{dom}$ and $U_{bas}$ are the images of the grayscale "domes" and "basins" found in $U$:

$U_{dom} = U - R_U(U - g)$,   (10)

where $R_U(U - g)$ is the result of morphological reconstruction by geodesic dilation of mask $U$ from marker $U - g$, and $g$ is the relative height of the domes. $U_{bas}$ is found as follows:

$U_{bas} = R_U(U + h) - U$,   (11)

where $R_U(U + h)$ is the result of morphological reconstruction by geodesic erosion of mask $U$ from marker $U + h$, and $h$ is the relative depth of the basins. The pixel-by-pixel subtraction in (9) is used to enhance the contrast of the image $U_{bas}$ in order to increase the accuracy of thresholding.
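A possible implementation of this detector with the grayscale reconstruction routine of scikit-image is sketched below; the values of g, h, and the threshold are illustrative assumptions, not parameters reported in the paper.

```python
import numpy as np
from skimage.morphology import reconstruction

def dark_basin_mask(u, g=40, h=40, thresh=10):
    """First detector, expressions (9)-(11): grayscale 'domes' and
    'basins' of the UV image U via morphological reconstruction.
    g, h, and thresh are illustrative values."""
    u = u.astype(float)
    # (10) domes: U minus reconstruction-by-dilation of U from marker U - g
    domes = u - reconstruction(u - g, u, method='dilation')
    # (11) basins: reconstruction-by-erosion of U from marker U + h, minus U
    basins = reconstruction(u + h, u, method='erosion') - u
    # (9) mask of large dark regions: threshold the contrast-enhanced basins
    return (basins - domes) > thresh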
The second detector is intended for localizing rather small image objects. The algorithm is based on the locally adaptive thresholding technique proposed in (Niblack, 1986). The detecting function is defined in the following way:

$\mu(x, y) = \begin{cases} 1, & u(x, y) < u_M; \\ 0, & u(x, y) \ge u_M, \end{cases}$   (12)

where $u_M = \bar{u}(x, y) + q\sigma$; $\bar{u}(x, y)$ is the mean value of the grayscale levels in some neighbourhood of a point $(x, y)$; $\sigma$ is the standard deviation; $q$ is a constant. The mask image is defined as follows:

$M_2^U(x, y) = \mu(x, y)$.   (13)
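The second detector can be sketched with the Niblack threshold available in scikit-image; note that scikit-image defines the local threshold as mean − k·std, so passing k = −q reproduces u_M from (12). The window size and the value of q are illustrative assumptions.

```python
import numpy as np
from skimage.filters import threshold_niblack

def small_dark_object_mask(u, window_size=25, q=-0.2):
    """Second detector, expressions (12)-(13): locally adaptive Niblack
    threshold u_M = mean + q * std; window_size and q are illustrative."""
    u = u.astype(float)
    # scikit-image computes T = mean - k * std, so k = -q gives u_M.
    u_m = threshold_niblack(u, window_size=window_size, k=-q)
    return u < u_m   # detecting function: 1 where u(x, y) < u_M
```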
The images of the binary masks obtained using
detectors described by the expressions (9-11) and
(12-13) are shown in Figure 9.
Figure 9: Images of the binary masks $M_1^U$ (a) and $M_2^U$ (b) of the objects found in the UV image.
Not all of the detected objects correspond to the regions of interest. The next step of the approach is selecting the repainting areas from the objects detected at the previous step. For this purpose we extract markers using the image difference measure proposed above. Visualized local values of the entropies $H(U \mid V)$ and $H(V \mid U)$, calculated for the grayscale versions of the input images shown in Figure 1, are presented in Figure 10(a, b). The probability mass functions are estimated for 64 grayscale levels in 11x11 windows, and the entropies are calculated in a 3x3 pixel neighbourhood. To obtain the markers of the required objects, the following operation is applied:

$M^{U|V}(x, y) = T(H(U \mid V) \cdot (H(U) - H(V \mid U)))$,   (14)

where $T(\cdot)$ is the threshold operation, $H(U)$ is the entropy of the ultraviolet image, and $(\cdot)$ denotes the operation of pixel-by-pixel image multiplication. The operations of image multiplication and subtraction improve the contrast of the thresholded image and increase the accuracy of thresholding. The mask of the required objects visible only in the UV image can be obtained as

$M(U, V) = R_{M^U}(M^{U|V})$,   (15)

where $M^U = M_1^U \vee M_2^U$ and "$\vee$" is the logical "OR" operation. The obtained mask (15) of the objects visible only in the UV image is shown in Figure 11(a).
The mask combined with the image in the optical spectral range is presented in Figure 11(b). The proposed combined technique was successfully
applied to eight pairs of UV and visible images of
fine-art paintings.
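A sketch of the selection step (14)-(15) is given below, assuming that the entropy maps and the detector masks have already been computed; the threshold is an illustrative value, and binary morphological reconstruction keeps only those connected components of the combined detector mask that contain a marker.

```python
import numpy as np
from skimage.morphology import reconstruction

def select_repainting_areas(m1, m2, h_u_given_v, h_v_given_u, h_u, t=0.5):
    """Selection step, expressions (14)-(15): entropy-based markers pick
    out the detected objects visible only in the UV image. The threshold
    t is an illustrative value."""
    # (14) markers of the required objects
    markers = (h_u_given_v * (h_u - h_v_given_u)) > t
    # combined detector output  M^U = M1 OR M2
    m_u = np.logical_or(m1, m2)
    # (15) binary reconstruction keeps the components of M^U holding a marker
    seed = np.logical_and(markers, m_u).astype(float)
    rec = reconstruction(seed, m_u.astype(float), method='dilation')
    return rec > 0
```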
Figure 10: Visualized local values of the conditional entropies $H(U \mid V)$ (a) and $H(V \mid U)$ (b).
Figure 11: The resultant mask of the repainting areas (a) and the mask combined with the photograph of the painting (b).
5 CONCLUSIONS
A combined technique for detecting objects in multimodal images based on specific object detectors and an image difference measure has been presented. Two information-theoretical measures of image difference are proposed. The conditions of applicability of these measures for detecting artefacts in multimodal images are formulated. The computing experiment has shown the efficiency of the proposed measures. The combined technique has been successfully applied for detecting repainting and retouching areas of fine-art paintings. The objects having a grayscale relief similar to that of the repainting areas are localized in the UV image by means of specific detectors. Subsequently, the "true" objects are selected using the considered information-theoretical difference measure. The proposed technique requires segmentation of only one of the analyzed images, unlike the technique described in (Heitz, 1990). Future research will be aimed at the development of automated procedures for choosing the window size and the number of quantization levels for estimating the conditional entropies.
ACKNOWLEDGEMENTS
This work was supported by the Russian Foundation
for Basic Research, grant No 09-07-00368.
REFERENCES
Daly S., 1993. The visible differences predictor: an
algorithm for the assessment of image fidelity. Digital
images and human vision, MIT Press, Cambridge.
Escolano, F., Suau, P., Bonev, B., 2009. Information
Theory in Computer Vision and Pattern Recognition.
London: Springer Verlag.
Heitz, F., Maitre, H. de Couessin, C., 1990. Event
Detection in Multisource Imaging: Application to Fine
Arts Painting Analysis. IEEE Transactions on Acoustics, Speech, and Signal Processing, 38(1), 695-704.
Kammerer, P., Hanbury, A., Zolda, E., 2004. A
Visualization Tool for Comparing Paintings and Their
Underdrawings. In EVA’2004, Conference on
Electronic Imaging & the Visual Arts, 148–153.
Kirsh, A., and Levenson, R. S., 2000. Seeing through
paintings: Physical examination in art historical
studies. Yale U. Press, New Haven, CT.
Minakhin, V., Murashov, D., Davidov, Yu., Dimentman,
D., 2009. Compensation for Local Defects in an Image
Created Using a Triple-Color Photo Technique.
Pattern Recognition and Image Analysis: Advances in
Mathematical Theory and Applications. MAIK
"Nauka/Interperiodica". 19, 1. 137 – 158.
Niblack, W., 1986. An Introduction to Digital Image
Processing. Prentice Hall, Englewood Cliffs, NJ.
Petrovic, V., Xydeas, C., 2004. Evaluation of Image
Fusion Performance with Visible Differences.
ECCV'2004, LNCS, 3023, 380-391.
Rajwade, A., Banerjee, A., Rangarajan, A., 2006.
Continuous Image Representations Avoid the
Histogram Binning Problem in Mutual Information
Based Image Registration, Proc. of IEEE International
Symposium on Biomedical Imaging (ISBI). 840-843.
Rockinger, O., Fechner, T., 1998. Pixel-Level Image
Fusion: The Case of Image Sequences. SPIE Proc.
Signal Processing, Sensor Fusion, and Target
Recognition VII, Ivan Kadar, Editor, 3374, 378-388.
Soille, P., 2004. Morphological Image Analysis:
Principles and Applications, Springer-Verlag. Berlin.
Viola, P., 1995. Alignment by Maximization of Mutual
Information. PhD Thesis, MIT.
Zhang, J., Rangarajan, A. 2004. Affine image registration
using a new information metric. In: IEEE Computer
Vision and Pattern Recognition (CVPR), 1, 848-855.