quisition, since light at this wavelength is not perceived
by the human eye and therefore does not lead to a
decrease in pupil size (Toslak et al., 2018).
Regarding the image focusing process, retinal cameras
can be equipped with a mechanical focusing system that
displaces a compensation lens which, when combined with
the optics of the eye, matches the image plane to the
retina. This focus control mechanism is designed to
compensate for possible refractive errors in the subject’s
eyes (which can be different for each eye). The optical
system of the EFS includes an aspheric objective lens that
minimizes dioptre errors and the level of optical
aberrations produced in the retinal image (Melo et al.,
2018). However, this manual focusing process is error
prone, especially when performed by inexperienced
examiners operating a handheld device, and may lead to
suboptimal images, which is undesirable for eye screening
purposes.
This work presents an automatic focus assessment
approach for non-mydriatic NIR fundus images that
can be executed by a smartphone on a handheld de-
vice. Our method allows the system to find the best
focus value based on the NIR preview images during
the manual alignment step, optimising the retinal im-
age quality at the moment of image capture.
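The selection of the best focus value from preview images can be illustrated with a minimal sketch. It assumes that each preview frame is available as a 2D list of grey levels tagged with the focus position at which it was taken. The names `squared_gradient` and `best_focus_position`, the stand-in focus measure and the toy frames are hypothetical and are not taken from the EFS application.

```python
def squared_gradient(frame):
    """Mean squared horizontal difference: a simple stand-in focus measure."""
    diffs = [
        (row[x + 1] - row[x]) ** 2
        for row in frame
        for x in range(len(row) - 1)
    ]
    return sum(diffs) / len(diffs)


def best_focus_position(previews, measure=squared_gradient):
    """Return (position, score) of the preview frame with the highest score.

    previews: iterable of (focus_position, frame) pairs (assumed layout).
    """
    scored = [(measure(frame), position) for position, frame in previews]
    score, position = max(scored)
    return position, score


# Toy preview frames (hypothetical data): the edge is sharpest at position 0.
previews = [
    (-2, [[100, 110, 120, 130], [100, 110, 120, 130]]),
    (0, [[0, 0, 255, 255], [0, 0, 255, 255]]),
    (2, [[60, 100, 160, 200], [60, 100, 160, 200]]),
]

best_position, best_score = best_focus_position(previews)
print(best_position, best_score)
```

Any of the focus measures compared in this work could be passed as `measure`; the argmax step itself does not depend on the metric.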
We also present an extensive comparative analysis of
focus features and measures, evaluating a new DCT-based
function against a group of gradient-based,
statistical-based and Laplacian-based functions in the
same experimental setup. The first approach consisted of
a discriminative analysis of the performance of all
metrics. Subsequently, we assessed the impact of
introducing Machine Learning models in the analysis,
taking into account the balance between performance and
computational processing time.
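As an illustration of the three classical families named above, the following sketch implements one representative of each: Tenengrad (gradient-based), grey-level variance (statistical-based) and variance of the Laplacian (Laplacian-based). These are common textbook choices and are an assumption here; the exact functions compared in this work may differ. The code is dependency-free for clarity, and the helper names and the synthetic striped image are hypothetical.

```python
from statistics import pvariance

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
BOX = [[1 / 9] * 3 for _ in range(3)]  # 3x3 mean filter, used to mimic defocus


def convolve3x3(img, kernel):
    """Valid-mode 3x3 correlation of a 2D list-of-rows image."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(
                kernel[dy][dx] * img[y + dy - 1][x + dx - 1]
                for dy in range(3)
                for dx in range(3)
            )
            for x in range(1, w - 1)
        ]
        for y in range(1, h - 1)
    ]


def tenengrad(img):
    """Gradient-based: mean squared Sobel gradient magnitude."""
    gx = convolve3x3(img, SOBEL_X)
    gy = convolve3x3(img, SOBEL_Y)
    vals = [a * a + b * b for ra, rb in zip(gx, gy) for a, b in zip(ra, rb)]
    return sum(vals) / len(vals)


def gray_variance(img):
    """Statistical-based: population variance of the grey levels."""
    return pvariance([p for row in img for p in row])


def laplacian_variance(img):
    """Laplacian-based: population variance of the Laplacian response."""
    lap = convolve3x3(img, LAPLACIAN)
    return pvariance([p for row in lap for p in row])


# Synthetic test image: vertical stripes, two pixels wide (hypothetical data).
sharp = [[255 if (x // 2) % 2 == 0 else 0 for x in range(12)] for _ in range(12)]
blurred = convolve3x3(sharp, BOX)  # smoothed copy standing in for a defocused frame

for measure in (tenengrad, gray_variance, laplacian_variance):
    print(measure.__name__, measure(sharp) > measure(blurred))
```

All three measures score the striped image higher than its smoothed copy, which is the behaviour a focus function is expected to show.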
This paper is structured as follows: Section 1
presents the motivation and objectives of this work;
Section 2 summarizes the related work and applications
found in the literature; Section 3 provides an
overview of the system architecture including the data
collection, followed by the methodologies studied for
the retinal focus assessment; in Section 4, the results
and discussion are presented; and finally, the conclu-
sions and future work are drawn in Section 5.
2 RELATED WORK
Unlike tabletop retinal cameras, the EFS is a handheld
device and, despite the help of its image acquisition
logic, it still requires a low level of training. Therefore,
our research group developed strategies to maximize
the quality of the acquisition, minimizing reflections
associated with poor alignment of the device and
out-of-focus images due to errors in the manual focusing process.
Through the implementation of a flexible eyecup and
an internal luminous fixation system, the examiner
can ask the patient to fix their gaze on a certain point,
stabilizing the acquisition device and making all the
images consistent in terms of imaged area (Soares
et al., 2020). In addition, the authors sought to imple-
ment other strategies that could enable a more robust
imaging system, which is fundamental for medical
screening purposes. Since the native digital camera
applications of state-of-the-art smartphones cannot apply
their automatic focus tool to NIR images, the team felt
the need to create a suitable focus assessment pipeline
to be integrated into a previously developed EFS mobile
application.
To the best of our knowledge, published works on
autofocusing in retinal imaging are scarce. Moscaritolo
et al. (2009) propose an algorithm to assess optic nerve
sharpness that generates a quantitative index. However,
the authors use images captured with conventional
tabletop mydriatic devices in the visible spectrum, and
do not present an extensive study comparing focus
metrics with the same experimental setup.
Regarding automatic focus algorithms for non-mydriatic
retinal imaging with NIR illumination, Marrugo et al.
(2012) propose a passive auto-focus measure based on the
directional variance of the normalized discrete cosine
transform (DCT). A focusing window is selected such that
it contains retinal structures for the computation of the
normalized DCT. Subsequently, a weighted directional
sampling of the normalized DCT is calculated and, finally,
the focus measure is the variance over all considered
directions. Although a comparative analysis of the results
with other metrics is presented, the data are from
tabletop retinal cameras (Marrugo et al., 2014), unlike the
present work, which addresses a mobile and handheld
imaging system. The approach proposed in this work
computes the focus score as the ratio between the high and
low frequencies of the DCT image. Pre-defined masks
exclude the regions of the DCT image related to the noise
component and those related to the basic frequencies. In
addition, this work aims to study the impact of
feature-based machine learning with a set of classifiers
evaluated in the EFS use case.
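The idea of a focus score built from the ratio between high and low DCT frequencies, with masked noise and basic-frequency regions, can be sketched as follows. This is only an illustration under stated assumptions: the diagonal bands defined by `u + v`, the thresholds `basic_cut`, `split` and `noise_cut`, the use of squared coefficients as energy, and the synthetic window are all hypothetical and are not the masks or parameters of the proposed method.

```python
import math


def dct2(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)
    basis = [
        [
            math.sqrt((1 if k == 0 else 2) / n)
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ]
        for k in range(n)
    ]
    # Transform along the vertical axis, then along the horizontal axis.
    tmp = [
        [sum(basis[u][i] * block[i][j] for i in range(n)) for j in range(n)]
        for u in range(n)
    ]
    return [
        [sum(tmp[u][j] * basis[v][j] for j in range(n)) for v in range(n)]
        for u in range(n)
    ]


def dct_focus_ratio(block, basic_cut=0, split=None, noise_cut=None):
    """High-band over low-band DCT energy, skipping masked coefficients.

    Assumed masks: coefficients with u + v <= basic_cut (basic frequencies)
    and u + v > noise_cut (noise) are ignored; `split` separates low from high.
    """
    n = len(block)
    split = n // 2 if split is None else split
    noise_cut = (3 * n) // 2 if noise_cut is None else noise_cut
    coeffs = dct2(block)
    low = high = 0.0
    for u in range(n):
        for v in range(n):
            band = u + v
            if band <= basic_cut or band > noise_cut:
                continue  # masked region
            energy = coeffs[u][v] ** 2
            if band <= split:
                low += energy
            else:
                high += energy
    return high / low if low > 0 else 0.0


def synthetic_window(n, detail):
    """Coarse pattern plus fine detail whose amplitude drops with defocus."""
    def wave(j, k):
        return math.cos(math.pi * (2 * j + 1) * k / (2 * n))

    row = [128 + 40 * wave(j, 2) + 40 * detail * wave(j, 12) for j in range(n)]
    return [list(row) for _ in range(n)]


in_focus = dct_focus_ratio(synthetic_window(16, detail=0.5))
defocused = dct_focus_ratio(synthetic_window(16, detail=0.1))
print(in_focus > defocused)
```

Because the synthetic window is built from two DCT basis patterns, the ratio reduces to the squared relative amplitude of the fine detail, so attenuating that detail lowers the score. A practical implementation would rely on an optimized DCT routine rather than this direct computation.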
BIOIMAGING 2021 - 8th International Conference on Bioimaging