IMAGE SEQUENCE SUPER-RESOLUTION BASED ON LEARNING

USING FEATURE DESCRIPTORS

Ana Carolina Correia R

ezio, William Robson Schwartz and Helio Pedrini

Institute of Computing, University of Campinas, SP, 13083-852 Campinas, Brazil

Keywords:

Super-resolution, Machine Learning, Feature Descriptors, Image Resolution.

Abstract:

There is currently a growing demand for high-resolution images and videos in several domains of knowledge,

such as surveillance, remote sensing, medicine, industrial automation, microscopy, among others. High res-

olution images provide details that are important to tasks of analysis and visualization of data present in the

images. However, due to the cost of high precision sensors and the limitations that exist for reducing the size

of the image pixels in the sensor itself, high-resolution images have been acquired from super-resolution meth-

ods. This work proposes a method for super-resolving a sequence of images from the compensation residual

learned by the features extracted in the residual image and the training set. The results are compared with

some methods available in the literature. Quantitative and qualitative measures are used to compare the results

obtained with super-resolution techniques considered in the experiments.

1 INTRODUCTION

The resolution of a digital image is directly related

to its quality. This resolution refers to the level of

detail of its visual representation, that is, the higher

the resolution, the greater its precision to represent it

in relation to the actual image (Chaudhuri, 2001a).

In many applications, an image or a sequence of

images presents poor quality due to several physical

limitations of the sensors, such as low spatial resolu-

tion, optical distortions, noise, and limited dynamic

range (Bascle et al., 1996). In general, these degrada-

tions seriously undermine the process of image analy-

sis. In surveillance systems, for example, images cap-

tured from low cost cameras may hinder the recogni-

tion of individuals or the identiﬁcation of a vehicle li-

cense plate at a certain scene (Lin et al., 2005). In ad-

dition, a person or an object usually occupies a small

region in the ﬁeld of camera view, such that the por-

tion of pixels of interest in the image is usually very

small.

In several ﬁelds of knowledge, such as medicine,

biology, geology, industrial automation, surveillance,

remote sensing, among others, there is a great demand

for high-resolution images (Chaudhuri, 2001b; Patti

and Altunbasak, 2001).

Due to factors associated with cost and limitations

of acquisition devices, an alternative is to increase

the resolution and improve the psicovisual quality of

the images by applying super-resolution techniques.

Some images can suffer from a process of degradati-

on of its quality (low resolution images) due to some

factors, such as lens aberration, incorrect focus, sen-

sor displacement during acquisition, object displace-

ment, lack or excess of lighting (Bascle et al., 1996).

Super-resolution techniques have received much

attention in recent years (Liu et al., 2008; Lucien,

1999; Lin et al., 2005; Sun et al., 2008; Baker and

Kanade, 2002), whose main objective is to increase

the spatial resolution of images, removing distortions

in the acquisition, highlighting details, such as bor-

ders, or recovering important information from the set

of images.

Super-resolution techniques can be applied to a

single image, a sequence of multiple images, or

videos. Super-resolution methods for a single im-

age aim at increasing the image resolution from the

enhancement of its most relevant information, with-

out introducing blurring. Figure 1 illustrates im-

ages at low and high resolution. Super-resolution

methods for multiple images seek to create an im-

age with higher resolution from fusion information

present in multiple low resolution images. Super-

resolution methods for videos aim at generating a

high-resolution video from low-resolution frames.

The development of techniques for increasing the

sharpness of image details becomes important, since

they can improve subsequent steps of data analysis

and interpretation.

This paper presents a learning-based method for

generating high resolution images from correspond-

ing versions of the images at low-resolution. Unlike

135

Carolina Correia Rézio A., Robson Schwartz W. and Pedrini H..

IMAGE SEQUENCE SUPER-RESOLUTION BASED ON LEARNING USING FEATURE DESCRIPTORS.

DOI: 10.5220/0003861701350144

In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP-2012), pages 135-144

ISBN: 978-989-8565-03-7

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

Figure 1: Examples of images (a) at low resolution and (b)

at high resolution. Images extracted from (Sun et al., 2010).

the technique developed by Yu et al. (Yu et al., 2008),

this work presents a methodology for super-resolution

also based on locally linear embedding (LLE) and

residual compensation, however, the representation of

the residual is carried out by means of feature descrip-

tors, which allows a better estimation of the super-

resolved images and avoids various artifacts. Addi-

tionally, the proposed method applies a more robust

technique for estimating the initial super-resolved im-

age than that used by Yu et al. (Yu et al., 2008),

called prior gradient proﬁle (GPP) (Sun et al., 2010),

which has a greater focus on edge preservation. The

method is proposed for super-resolving multiple low-

resolution images, instead of only a single image, on

a high resolution image.

For extraction of feature descriptors, the

method uses the histogram of shearlet coefﬁ-

cients (HSC) (Schwartz et al., 2011). The HSC is

based on an accurate multi-scale analysis provided

by the shearlet transform and the use of histograms

for estimating the distribution of edge orientation.

Experimental results show that the description of the

residual using HSC provides a signiﬁcant reduction

of artifacts resulting from the application of super-

resolution. The effectiveness of the proposed method

is assessed by quantitative and qualitative measures.

This paper is organized as follows. Section 2

describes some important concepts and related work

on super-resolution. Section 3 presents the proposed

super-resolution method applied to a sequence of im-

ages. Experimental results are shown and discussed

in Section 4. Finally, Section 5 concludes the paper

with ﬁnal remarks.

2 RELATED WORK

Spatial resolution is associated with the size of the

smallest details visible in the image. The smaller the

sampling interval among the image points, that is, the

higher the density of points in the image, the higher

the spatial resolution of the image. This does not

mean that an image containing a large number of pix-

els necessarily has higher spatial resolution than one

with fewer pixels.

Several super-resolution methods have been pro-

posed in the literature for improving the spatial res-

olution of the images. The main objective of these

methods is to estimate a high-resolution (HR) image

from one or more low resolution (LR) images in order

to highlight details, without the addition of artifacts in

the resulting image.

The main approaches found in the literature can be

classiﬁed into three categories: methods based on de-

tail enhancement, reconstruction-based methods and

learning-based methods. These approaches are de-

scribed below.

2.1 Methods based on Detail

Enhancement

In methods based on detail enhancement, the estima-

tion of pixel values in the high resolution image is

performed by interpolating the intensity or color of

the pixels already present in the input image. After-

wards, a process is applied to reduce artifacts included

in the interpolated images.

There are several interpolation methods in the lit-

erature (Gonzalez and Woods, 2007; Lucien et al.,

1997), such that the most commonly used are the

nearest neighbor, bilinear and bicubic interpolation.

In the nearest neighbor interpolation, the orig-

inal pixel value f (x, y) is assigned to the nearest

pixel f

, y

) in the resampled image. Although the

method is simple and has low computational cost, its

main disadvantages are the generation of distortion in

ﬁne details and creation of jagged edges in the image.

The bilinear interpolation calculates the intensity

of each pixel value f

, y

) by the weighted average

of four pixels adjacent away from the nearest neigh-

bors. The resulting image presents smoothing at the

edges and phase distortion, resulting in image blur-

ring.

Bicubic interpolation seeks to obtain a smooth es-

timate of the gray level or color at each pixel f

(x, y)

from a larger number of neighboring points of the

original image, which form a polynomial of low de-

gree. Fine details are preserved, edges are smoothed

and distortions are minimized in the resulting image.

VISAPP 2012 - International Conference on Computer Vision Theory and Applications

136

The super-resolution method called prior gradient

proﬁle (GPP) is a parametric distribution that de-

scribes the shape and sharpness of gradient proﬁles

in natural images. One of the observations presented

by Sun et al. (Sun et al., 2008; Sun et al., 2010) is

that the statistical form of these proﬁles is stable and

invariant to the resolution of the image. From this in-

formation, a statistical relation of the image gradient

sharpness can be learned between HR and LR images.

Using the learned proﬁle gradient relation, it is

possible to provide a constraint on the ﬁeld gradient

of the HR image. Combining it with the constraint

reconstruction, it is possible to recover a high quality

image. The advantages of the GPP mentioned in (Sun

et al., 2008; Sun et al., 2010) include: (a) the gra-

dient proﬁle is not a constraint of smoothness, there-

fore, the edges can be well recovered in the HR image

both at small and large scales; (b) the artifacts com-

mon in super-resolution, such as jagged edges, can be

avoided in the ﬁeld gradient.

The high-resolution image in the GPP technique

can be obtained from the following equation

t+1

= HR

− τ

∂E(HR)

∂HR

(1)

where

∂E(HR)

∂HR

= ((HR ∗ G) ↓ − LR) ↑ ∗ G −

β(∇

HR − ∇

), ∇

(x) = r(d(x, x

))∇

G represents the Gaussian ﬁlter, ∗ is the convolution

operator, ↓ is the down-sampling operator, ∇

represents the image ﬁeld gradient of the HR image,

∇

represents the ﬁeld gradient of the LR im-

age, and ↑ is the up-sampling operator. Term ∇

transforms the observed ﬁeld gradient into the target

ﬁeld gradient by mapping the shape and sharpness

proﬁle of the observed gradient.

2.2 Methods based on Reconstruction

Reconstruction-based methods consist in the synthe-

sis of a single high resolution image from a sequence

of low-resolution images (Borman and Stevenson,

1998). In these methods, it is assumed that the cap-

tured images of low resolution (LR) are very sim-

ilar to each other. There are few details that set

them apart, allowing the generation of new informa-

tion to recover the image details in high resolution

(HR) (Patti and Altunbasak, 2001).

These methods recover details of sequences and

add them to the estimated high-resolution image.

Usually, they have three different phases: i) reg-

istration of the images, ii) generation of the high-

resolution grid, where their values are interpolated

from the recorded images, iii) removal of noise (Park

et al., 2003).

Irani and Peleg (Irani and Peleg, 1991) developed

a technique called iterative back-projection (IBP),

similar to the method of back-projection developed

for reconstruction of tomographic images. Each low-

resolution pixel is modeled as the projection of a

given region in the high resolution image that will

be estimated. The process starts with a ﬁrst estimate

of the high resolution to simulate the low-resolution

images. Iteratively, information is added to this esti-

mated image, from a new gradient image, based on

the error between the simulated images and the ob-

served images. This gradient image is the sum of all

errors between the low resolution image and the high-

resolution image estimated by the transformation pro-

cess given by the motion estimation between the low-

resolution images. The method uses an iterative pro-

cedure to minimize the error between the original data

and the output of the model.

Projection onto convex sets (POCS) (Stark, 1988)

uses the low resolution image to produce a new image

through subpixel displacements in rows and columns.

The displacement aims at minimizing the effects of

aliasing and allowing the retrieval of new information

for the high resolution image.

The POCS method seeks to solve the problem

from a priori information described in the form of

convex sets of constraints. The search for this solution

consists of ﬁnding a value that belongs to the inter-

section of the sets. This is an iterative method, which

produces, for a ﬁnite number of steps, intermediate

solutions. The iteration ends when convergence oc-

curs or the process is interrupted by a predetermined

criterion.

Some improvements in reconstruction methods

based on POCS were developed by Patti and Altun-

basak (Patti and Altunbasak, 2001). First, the dis-

cretization of the model of continuous image forma-

tion is enhanced to allow the use of high-order in-

terpolation methods. Second, the constraint sets are

modiﬁed to reduce the number of edges present in the

estimated high resolution image.

Reconstruction by super-resolution produces an

image or a set of high resolution images from a set

of low-resolution images. In the last two decades,

several super-resolution methods have been proposed,

some of them presented in (Borman and Stevenson,

1998; Farsiu et al., 2004a). These methods are often

very sensitive to noise and to their data model, which

limits their usefulness. In (Farsiu et al., 2004b), an

alternative approach was proposed using minimiza-

tion standards and robust regularization before deal-

ing with different data models and noise.

IMAGE SEQUENCE SUPER-RESOLUTION BASED ON LEARNING USING FEATURE DESCRIPTORS

137

2.3 Methods based on Learning

In super-resolution methods based on learning, the

main objective is to estimate information, which is

not present in the original low-resolution image, from

a set of training samples.

The method proposed by Yu et al. (Yu et al., 2008)

for super-resolution of face images uses projection

onto convex sets (POCS) and residual compensation.

First, the original high resolution image is estimated

by POCS and then the residual compensation is per-

formed. The high frequency information in the im-

age is reconstructed from the learning between the

two training data sets of corresponding low and high

resolution residual images. The super-resolution im-

age is generated from the sum between the initially

estimated image and the image reconstructed by the

weights of learned residuals. The method uses a ma-

chine learning algorithm called locally linear embed-

ding (LLE) (Chang et al., 2004).

First, the method reconstructs the high resolu-

tion image initially estimated by POCS, similar to

the original high resolution image. Afterwards, it es-

timates the residual compensation and the high fre-

quency information is reconstructed by learning be-

tween the correspondence of sets of high and low res-

olution residual images. Assuming a set of N vectors

of dimension D (Yu et al., 2008), then

x = [x

, x

, ..., x

] (2)

For a new vector x

, the Euclidean distance d

be-

tween all other vectors x is calculated as

− x

)

− x

) (3)

The K nearest neighbors of x

are deﬁned by the

Euclidean distance and, therefore, the best weights

w to rebuild x

from K neighbors are determined

(w = [w

, w

, ..., w

]), where w

represents the

contribution of x

for x

reconstruction. If x

is not be-

tween the K nearest neighbors of x

, then w

= 0, else

0 < w

< 1 and

∑

= 1.

The optimal weight w

is estimated using the least

squares algorithm

∑

i, j

∑

i, j

(4)

where P = G

−1

and G is the covariance matrix deﬁned

i, j

= (x

− x

)

− x

) (5)

Since G is a singular matrix, that is, it has no in-

verse, the optimal solution can not be found by Equa-

tion 5. A relatively simple solution is to assign a small

value of α to the diagonal of G

G = G + α × I (6)

where I is a unit matrix.

The learning method proposed in (Yu et al., 2008)

considers an initial low resolution image g obtained

by degrading f (g = H f ) and an initial estimate of

super-resolution

f obtained by the POCS method.

From the known degrading model H, the low reso-

lution residual is calculated as

ˆg = H

f r = g − ˆg (7)

Using the LLE algorithm to learn the weights of

the reconstruction from the low-resolution residual

r and the set of residual images, the method recon-

structs the high-resolution residual ˆs from the cal-

culated weights and the corresponding data at high

resolution. From the high-resolution residual, super-

resolution image

n+1

is obtained by

n+1

+ α ˆs ∈ (0, 1] (8)

3 PROPOSED METHOD

According to the previous section, there are

three main approaches to enhancing image res-

olutions: methods based on detail enhancement,

reconstruction-based methods and learning-based

methods. The method proposed in this work is based

on learning, in which training samples are used to es-

timate the information regarding details that are not

present in the lower resolution images. This super-

resolution technique has been inspired by (Yu et al.,

2008), which employs residual compensation.

The method developed in this work considers a

sequence of low resolution images that will be com-

bined into a single higher resolution image.

The training set, which is sampled from the se-

quence being processed due to high similarity among

the samples, needs to be pre-processed prior to the

execution, as shown in Figure 2.

Image samples in the training set are considered as

to be the high resolution (HR). These images will be

smoothed and downsampled to obtain low resolution

(LR) images. The LR images have their resolution in-

creased, generating images called SR, which will be

smoothed and downsampled to generate the images

. The low resolution residuals are estimated by

the difference between LR and LR

, which the cor-

responding high resolution residuals are estimated by

the difference between images HR and SR. Feature

extraction is performed for each low resolution resid-

uals, resulting on a feature vector.

After the pre-processing, the proposed super-

resolution method for multiple images is executed, ac-

cording to the steps shown in Figure 3. This process

is described in the following paragraphs.

VISAPP 2012 - International Conference on Computer Vision Theory and Applications

138

Figure 2: Pre-processing of the training set sampled from the image sequence.

Figure 3: Data ﬂow of the proposed method.

The execution process starts with the original se-

quence, denoted HR. This sequence is smoothed and

then downsampled, which will generate a new se-

quence called LR. The proposed method will be ap-

plied to this sequence. This is necessary so that the

original sequence can be used as a reference to assess

the results achieved.

The second step consists of increasing the resolu-

tion of the sequence LR, generating a new sequence,

called SR, with the same resolution as the original se-

quence. Then, the images are fused in pairs, in which

one of them is the ﬁrst images of the sequence (refer-

ence image), as shown in Figure 4.

The process starts with N images. After the fu-

sion, the result consists of N = N − 1 images. If

the sequence size N becomes equal to 1, the result-

ing super-resolved image has been obtained and the

process is stopped. Otherwise, the resulting sequence

IMAGE SEQUENCE SUPER-RESOLUTION BASED ON LEARNING USING FEATURE DESCRIPTORS

139

(a) Proposed method.

(b) Iterative process.

Figure 4: Proposed super-resolution method for a sequence of images.

is smoothed and downsampled, generating a new se-

quence LR

, in which the super-resolution for each

image will be obtained by applying the procedure de-

scribed in Section 3.1. This fusion process is illus-

trated in Figure 4(b) and Algorithm 17 presents the

main steps of the method described during this sec-

tion.

3.1 Super-resolution for Single Images

The procedure to perform super-resolution for a sin-

gle image is depicted in Figure 5. It starts by resam-

pling the low resolution image (LR) to the desired size

using a standard interpolation method such as bicu-

bic, bilinear or nearest-neighbor, or another super-

resolution method from the literature, such as GPP,

POCS or IBP (described in Section 2). This results in

VISAPP 2012 - International Conference on Computer Vision Theory and Applications

140

Figure 5: Super-resolution for single images.

an initial estimation of the super-resolved image, de-

noted SR.

Require: img: original image sequence, n: number

of images on this sequence, zoom: zoom factor.

Ensure: listSR: resulting image.

1: for all i = 2 to n do

2: listImg

i−1

= fusion(downsample(img

, zoom),

downsample(img

,zoom))

3: end for

4: n ← n − 1

5: while n > 1 do

6: for all i = 1 to num elements(listImg) do

7: test = smooth(listImg

)

8: test = downsample(teste, 1/zoom)

9: listSR

= singleImage(test)

10: end for

11: for all i = 2 to n do

12: listAux

i−1

= fusion(listSR

, listSR

)

13: end for

14: listImg = listAux

15: n ← n − 1

16: end while

17: return listSR

Algorithm 1: Image sequence super-resolution method.

If the initial estimation SR satisﬁes a similar-

ity threshold based on the root mean square error

(RMSE) between the LR and a smoothed and down-

sampled version of SR, the procedure stops. Other-

wise, the image is modiﬁed by considering the resid-

ual information. SR is added to the high resolution

residual according to Equation 9. This residual is re-

constructed by the weighted sum of several samples

belonging to the training set.

n+1

= SR

+ α resHR (9)

To estimate the weights for the sum, features are

extracted from the low resolution residuals (resLR)

and added to a feature vector. From this vector, the

weights of k nearest-neighbors are estimated using

LLE (the k-th training samples which have the most

similar residuals to resLR). Since feature vector is as-

sociated to a low and a high-resolution sample, the

high resolution weights are used to estimate resHR in

Equation 9, where α ∈ (0, 1] denotes the importance

of the residual estimation (this parameter is experi-

mentally estimated in the next section). This process

is repeated a certain number of iterations (also esti-

mated experimentally in the next section).

IMAGE SEQUENCE SUPER-RESOLUTION BASED ON LEARNING USING FEATURE DESCRIPTORS

141

Figure 6: Examples of frames for taxi video sequence (Nagel, 2011).

Figure 7: Corresponding points between two images used

to fuse both images.

0 50 100 150 200

100

120

140

RMSE

Number of Iterations

alpha = 0.001

alpha = 0.01

alpha = 0.1

alpha = 0.5

alpha = 0.8

alpha = 1

Figure 8: RMSE as a function of the number of iterations

and α.

4 EXPERIMENTAL RESULTS

This section describes the experiments and compares

the results achieved by applying the proposed method.

All experiments were conducted using a Intel Core 2

Duo 2.2 GHz processor with 6GBytes of RAM on the

Windows 7 operating system.

4.1 Dataset

To evaluate the proposed method, a public access

video sequence with 41 frames was used (Nagel,

2011) (some frames are shown in Figure 6). Each

frame is in grayscale and has 256×191 pixels. In our

experiments, the frames were cropped to 128×128

pixels, since some of the compared methods work

only for square images.

4.2 Evaluation Metrics

The results obtained were assessed through the root

mean square error (RMSE) and the structural simi-

larity (SSIM), quantitative and qualitative measures,

respectively.

RMSE considers two images of equal sizes, such

that RMSE equal to 0 represents that both images are

identical. In the experiments, we consider the differ-

ence between the reference image and the resulting

super-resolved image. Equation 10 shows the RMSE,

where M and N denote the image dimensions.

RMSE =

M−1

∑

x=0

N−1

∑

y=0

[SR(x, y) − HR(x, y)]

(10)

SSIM, proposed in (Wang et al., 2004), measures

the similarity between two images considering local

correlation, contrast and structures according to

SSIM(x, y) =

(2µ

)(2σxy +C

)

(µ

+ µ

)(σ

)

(11)

where constants C

and C

prevent from numerical

instabilities when (µ

+ µ

) and (σ

) are close to

zero. As suggested in (Wang et al., 2004), we used

= 0.01 and C

= 0.03. In this measure, the result-

ing values are normalized between 0 and 1, and the

closer to 1, the better the image quality.

4.3 Parameter Settings

To evaluate the proposed method, a set of parameters

has been deﬁned in the experiments. First, the super-

resolution method used to resample the input image

to the desired size (as described in Section 3.1) was

the GPP.

In the fusion process, the images are considered in

pairs, in which one is considered as the reference im-

age. As illustrated in Figure 4(b), the reference image

corresponds to the ﬁrst image of the sequence. To fuse

two images (lines 2 and 12 of Algorithm 17), the SIFT

algorithm (Lowe, 2004) was applied to detect corre-

sponding points (Figure 7). From such corresponding

points an afﬁne transform is computed to register both

images.

VISAPP 2012 - International Conference on Computer Vision Theory and Applications

142

Table 1: Results achieved by the video-based super-resolution methods for a zoom factor of 2×.

Zoom factor 2×

Method 6 frames 12 frames

RMSE SSIM RMSE SSIM

IBP (Irani and Peleg, 1991) 32.5863 0.7476 34.0467 0.7081

POCS (Stark, 1988) 22.4413 0.8511 30.0020 0.7442

Proposed 10.7686 0.9604 18.0773 0.8893

Table 2: Results achieved by the video-based super-resolution methods for a zoom factor of 4×.

Zoom factor 4×

Method 6 frames 12 frames

RMSE SSIM RMSE SSIM

IBP (Irani and Peleg, 1991) 29.8295 0.7648 30.7796 0.7417

POCS (Stark, 1988) 21.5620 0.8571 29.4161 0.7501

Proposed 14.0965 0.9303 16.1489 0.9161

Original

6 frames 12 frames 6 frames 12 frames

(2×) (2×) (4×) (4×)

IBP

POCS

Proposed method

Figure 9: Results obtained for zoom factors of 2× and 4× and for sequences with 6 and 12 frames.

Finally, to achieve better results, experiments

were conducted using different setups for parameters

α (Equation 9) and the number of iterations for the

super-resolution considering single images. Accord-

ing to Figure 8, the method presents higher accuracy

for low values of α, such as 0.001. In addition to

that, the best results have been achieved with 150 it-

erations.

4.4 Results and Comparisons

Using the parameters estimated in the previous sec-

tion, the proposed method was compared to two

video-based super-resolution methods: IBP (Irani and

Peleg, 1991) and POCS (Stark, 1988) . Tables 1 and 2

show the results for zoom factors of 2× and 4× for

image sequences with 6 and 12 frames (usually, long

sequences are not considered due to the large amount

of noise inserted). These values were achieved com-

paring the super-resolved image to the reference im-

age (ﬁrst image of the sequence).

Quantitative (Tables 1 and 2) and qualitative (Fig-

ure 9) results demonstrate the high accuracy of the

proposed method. According to the results shown in

Figure 9, it is also possible to see that the proposed

method presents sharper super-resolved images com-

pared to the other methods.

IMAGE SEQUENCE SUPER-RESOLUTION BASED ON LEARNING USING FEATURE DESCRIPTORS

143

5 CONCLUSIONS

Due to the increasing demand for high resolution

images and sensor limitations, super-resolution tech-

niques are fundamental tools to obtain high resolution

data.

This work proposed a super-resolution technique

based on the fusion of multiple images through resid-

ual compensation using a learning technique and

feature extraction. The comparisons performed in

our experiments showed that the proposed method

achieved higher accuracy when compared to other ap-

proaches available in the literature.

ACKNOWLEDGEMENTS

The authors are grateful to FAPESP, CAPES and

CNPq for the ﬁnancial support. This research was

partially supported by FAPESP grant 2010/10618-3.

REFERENCES

Baker, S. and Kanade, T. (2002). Limits on Super-

Resolution and How to Break Them. IEEE Transac-

tions on Pattern Analysis and Machine Intelligence,

24(9):1167–1183.

Bascle, B., Blake, A., and Zisserman, A. (1996). Motion

Deblurring and Super-resolution from an Image Se-

quence. In Fourth European Conference on Computer

Vision, pages 573–582. Springer-Verlag.

Borman, S. and Stevenson, R. L. (1998). Super-Resolution

from Image Sequences - A Review. In Midwest Sym-

posium on Circuits and Systems, pages 374–378.

Chang, H., Yeung, D.-Y., and Xiong, Y. (2004). Super-

Resolution through Neighbor Embedding. In IEEE

Conference on Computer Vision and Pattern Recogni-

tion (CVPR), pages 275–282.

Chaudhuri, S. (2001a). Super-Resolution Imaging. The

Springer International Series in Engineering and

Computer Science.

Chaudhuri, S. (2001b). Super-Resolution Imaging. Kluwer

Academic Publishers.

Farsiu, S., Robinson, D., Elad, M., and Milanfar, P. (2004a).

Advances and Challenges in Super-Resolution. Inter-

national Journal of Imaging Systems and Technology,

14:47–57.

Farsiu, S., Robinson, M. D., Elad, M., and Milanfar, P.

(2004b). Fast and Robust Multiframe Super Res-

olution. IEEE Transactions on Image Processing,

13(10):1327–1344.

Gonzalez, R. and Woods, R. (2007). Digital Image Process-

ing. Prentice Hall.

Irani, M. and Peleg, S. (1991). Improving Resolution by Im-

age Registration. Graphical Models and Image Pro-

cessing, 53(3):231–239.

Lin, F. C., Fookes, C. B., Chandran, V., and Sridha-

ran, S. (2005). Investigation into Optical Flow

Super-Resolution for Surveillance Applications. In

APRS Workshop on Digital Image Computing: Pat-

tern Recognition and Imaging for Medical Applica-

tions, Brisbane, Australia.

Liu, X., Song, D., Dong, C., and Li, H. (2008). MAP-Based

Image Super-Resolution Reconstruction. In Proceed-

ings of World Academy of Science, pages 208–211.

Lowe, D. (2004). Distinctive Image Features from Scale-

Invariant Keypoints. International Journal of Com-

puter Vision, 60:91–110.

Lucien, W. (1999). Deﬁnitions and Terms of Reference in

Data Fusion. IEEE Transactions on Geosciences and

Remote Sensing, 37(3):1190–1193.

Lucien, W., Ranchin, T., and Mangolini, M. (1997). Fu-

sion of Satellite Images of Different Spatial Res-

olutions: Assessing the Quality of Resulting Im-

ages. Photogrammetric Engineering & Remote Sens-

ing, 63:691–699.

Nagel, H.-H. (2011). Image Sequence

Server. Institut f

ur Algorithmen und

Kognitive Systeme, Universit

at Karlsruhe.

http://i21www.ira.uka.de/image sequences/.

Park, S. C., Park, M. K., and Kang, M. G. (2003).

Super-Resolution Image Reconstruction: A Techni-

cal Overview. IEEE Signal Processing Magazine,

20(3):21–36.

Patti, A. J. and Altunbasak, Y. (2001). Artifact Reduc-

tion for Set Theoretic Super Resolution Image Recon-

struction with Edge Adaptive Constraints and Higher-

Order Interpolants. IEEE Transactions on Image Pro-

cessing, 10(1):179–186.

Schwartz, W., da Silva, R., Davis, L., and Pedrini, H.

(2011). A Novel Feature Descriptor Based on the

Shearlet Transform. In IEEE International Confer-

ence on Image Processing, Brussels, Belgium.

Stark, H. (1988). Theory of Convex Projection and its Ap-

plication to Image Restoration. IEEE International

Symposium on Circuits and Systems, pages 963–964.

Sun, J., Sun, J., Xu, Z., and Shum, H.-Y. (2008). Image

Super-resolution using gradient proﬁle prior. IEEE

Conference on Computer Vision and Pattern Recogni-

tion.

Sun, J., Sun, J., Xu, Z., and Shum, H.-Y. (2010). Gradi-

ent Proﬁle Prior and Its Applications in Image Super-

Resolution and Enhancement. IEEE Transactions on

Image Processing.

Wang, Z., Bovik, A., Sheikh, H., and Simoncelli, E. (2004).

Image Quality Assessment: From Error Visibility to

Structural Similarity. IEEE Transactions on Image

Processing, 13(4):600–612.

Yu, H., Xiang, M., Hua, H., and Chun, Q. (2008). Face

Image Super-Resolution through POCS and Residue

Compensation. IET Conference Publications, pages

494–497.

VISAPP 2012 - International Conference on Computer Vision Theory and Applications

144