The paper is structured to reflect the significant
developments in Single Image Super Resolution
(SISR), focusing on balancing data-driven accuracy
with computational efficiency. Chapter two
introduces traditional SISR techniques, setting the
stage for understanding foundational methods.
Chapter three shifts to advanced models like CNNs,
GANs, and specifically CARN, highlighting their
impact on improving resolution while considering
practical application constraints. The final chapter
concludes with a summary of key findings and
explores potential future research directions,
emphasizing the ongoing quest for more efficient and
higher-quality SISR methods. This paper provides a
concise yet comprehensive overview of the field’s
evolution and current challenges.
2 PERFORMANCE EVALUATION METRICS
2.1 Peak Signal to Noise Ratio
Peak Signal to Noise Ratio (PSNR) is a metric
commonly used to evaluate image quality. It
quantifies the disparity between two images, typically
an original reference image (I) and its reconstructed
or super-resolved counterpart (K), through the
following formula:
$$\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{MAX_I^2}{\mathrm{MSE}(I,K)}\right) \tag{1}$$
In this context, $MAX_I$ signifies the maximum
possible pixel value in the image, and $\mathrm{MSE}(I, K)$
represents the Mean Squared Error between the
original and the reconstructed images. A higher
PSNR value indicates smaller discrepancies
between I and K, implying superior image quality.
Nonetheless, although this metric is straightforward
and quantifies reconstruction error precisely, PSNR
does not consistently align with human visual
perception and may therefore misrepresent the
perceived quality of images.
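Equation (1) translates directly into code. The following NumPy sketch assumes 8-bit images (so $MAX_I = 255$); the function name and signature are illustrative, not from the paper:

```python
import numpy as np

def psnr(original, reconstructed, max_val=255.0):
    """Peak Signal-to-Noise Ratio between two images, per Eq. (1)."""
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    # Mean Squared Error between the reference and the reconstruction
    mse = np.mean((original - reconstructed) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no error to measure
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Note that a uniform shift of every pixel by 10 gray levels and localized heavy distortion of a few pixels can yield the same MSE, hence the same PSNR, even though they look very different — the perceptual mismatch discussed above.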
2.2 Structural Similarity Index
The Structural Similarity Index (SSIM) is designed to
overcome the limitations of traditional metrics like
PSNR by considering more comprehensive aspects of
image quality such as detail, brightness, and contrast
(9). SSIM evaluates the similarity between two images
in a way that aligns more closely with human visual
perception. The formula for computing SSIM is given
by (9):
$$\mathrm{SSIM}(I, K) = \frac{(2\mu_I \mu_K + c_1)(2\sigma_{IK} + c_2)}{(\mu_I^2 + \mu_K^2 + c_1)(\sigma_I^2 + \sigma_K^2 + c_2)} \tag{2}$$
The original and super-resolved images are
denoted by I and K, respectively, and their average
luminance values are $\mu_I$ and $\mu_K$. Their variances are
$\sigma_I^2$ and $\sigma_K^2$, while the covariance between I and K is
$\sigma_{IK}$. The constants $c_1$ and $c_2$ are added to stabilize
the division with a weak denominator.
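Equation (2) can be sketched in NumPy as below. This is a simplified whole-image version: the standard SSIM averages the same statistic over local sliding windows. The constants follow the common choice $c_1 = (0.01 \cdot MAX_I)^2$ and $c_2 = (0.03 \cdot MAX_I)^2$, which is an assumption here since the text does not fix them:

```python
import numpy as np

def ssim_global(I, K, max_val=255.0):
    """Whole-image SSIM per Eq. (2); the reference metric averages
    this statistic over local windows rather than computing it globally."""
    c1 = (0.01 * max_val) ** 2  # stabilizing constants (assumed values)
    c2 = (0.03 * max_val) ** 2
    I = np.asarray(I, dtype=np.float64)
    K = np.asarray(K, dtype=np.float64)
    mu_I, mu_K = I.mean(), K.mean()          # average luminance
    var_I, var_K = I.var(), K.var()          # variances
    cov_IK = ((I - mu_I) * (K - mu_K)).mean()  # covariance of I and K
    num = (2 * mu_I * mu_K + c1) * (2 * cov_IK + c2)
    den = (mu_I ** 2 + mu_K ** 2 + c1) * (var_I + var_K + c2)
    return num / den
```

By construction the index equals 1 for identical images and decreases as luminance, contrast, or structure diverge.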
2.3 Learned Perceptual Image Patch
Similarity
Learned Perceptual Image Patch Similarity (LPIPS)
utilizes deep learning to assess image quality in a
manner that aligns closely with human visual
perception. It addresses the limitations of traditional
metrics like PSNR and SSIM by incorporating
variations in human perception. LPIPS calculates
similarity by analysing image patches through deep
neural networks, effectively capturing perceptual
differences that may be overlooked by other metrics.
This method offers a nuanced understanding of image
quality, proving especially beneficial in applications
requiring high visual fidelity, such as medical
imaging, where preserving detail is paramount
(Zhang et al. 2018).
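The core idea of LPIPS — comparing images in a deep-feature space rather than pixel space — can be illustrated with the toy sketch below. This is emphatically not LPIPS itself: the real metric (available via the `lpips` package) uses pretrained AlexNet/VGG activations with learned per-channel weights, whereas here a single random convolution serves as a placeholder feature extractor:

```python
import numpy as np

def feature_space_distance(img0, img1, n_filters=8, seed=0):
    """Toy LPIPS-style distance: L2 distance between channel-normalized
    feature maps. A random 3x3 convolution stands in for the pretrained
    deep network that real LPIPS uses (Zhang et al. 2018)."""
    rng = np.random.default_rng(seed)
    filters = rng.standard_normal((n_filters, 3, 3))  # placeholder "conv layer"

    def features(img):
        img = np.asarray(img, dtype=np.float64)
        # all 3x3 patches, shape (H-2, W-2, 3, 3)
        patches = np.lib.stride_tricks.sliding_window_view(img, (3, 3))
        feats = np.einsum("hwij,fij->fhw", patches, filters)
        # unit-normalize each feature map, as LPIPS does per channel
        feats /= np.linalg.norm(feats, axis=(1, 2), keepdims=True) + 1e-12
        return feats

    return float(np.mean((features(img0) - features(img1)) ** 2))
```

Identical images yield a distance of zero; distortions that alter structure move the images apart in feature space even when their pixel-wise error is small, which is why such metrics track perception more closely than PSNR.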
The suitability and limitations of these metrics
depend on the context. For example, while PSNR
works well for quantifying signal reconstruction
quality, it may not be a reliable indicator when
visual fidelity is crucial. On the other hand, SSIM and
LPIPS provide a more nuanced evaluation of image
quality, which is especially important in fields like
medical imaging where preserving fine details is
essential (Wang et al. 2020).
3 KEY DATASETS IN SISR RESEARCH
3.1 Set5 and Set14
The Set5 dataset, introduced in 2012, comprises five
high-resolution images, including a variety of scenes
and objects to test the robustness of super-resolution
methods across different content types (Bevilacqua et
al. 2012). As shown in FIGURE 1, images in Set5 are
carefully selected to represent common photographic
subjects, such as landscapes, animals, and urban
scenes.