and patient satisfaction (Segura-Sampedro et al., 2017). In the latter reference, as well as in similar studies (Nordheim et al., 2014), photos or videos of the wound and completed questionnaires constitute the data exchanged between the patient and the doctor, but it is always the doctor who analyzes and evaluates the received information. However, to our knowledge, none of the reviewed methods or Apps is able to automatically estimate a reliable pre-diagnosis based on visual data and to filter out those wounds that clearly present a good evolution, without the intervention of the physician. Along this line, the Department of
General and Digestive Surgery of the University Hos-
pital Son Espases is collaborating with the Systems,
Robotics and Vision group of the University of the
Balearic Islands, in order to go one step further in the
design and implementation of a vision-based mobile
App for telemedicine that can help in the estimation of a pre-diagnosis of abdominal post-surgery wounds. The objective is to automatically separate those wounds that potentially present inflammation as a sign of infection, and therefore need a face-to-face evaluation at the hospital, from those that follow a normal evolution and can be managed at home, saving time and medical resources.
The novelty of this work lies more in the methodology itself and its application than in the pipeline of visual algorithms designed to achieve the objective. This method will need to be wrapped into a future compact mobile App, which will contain additional functionalities to enhance the communication and data exchange between doctors and patients.
The wound analysis process consists of a pipeline that involves the following steps: a) record, with the mobile phone, a video sequence of the abdominal zone around the wound, from side to side, viewing the same area from different perspectives and viewpoints; b) extract images from the video sequence; c) extract and track common visual features in all the images; d) build a 3D sparse point cloud using a Structure From Motion (SFM) (Hartley and Zisserman, 2003) algorithm; e) build a dense point cloud and a textured meshed surface; f) establish a polyline in a selected portion of the 3D model and fit a plane to this polyline; this plane is intended to be either tangent to the abdomen surface or crossing the abdominal area below the wound; g) compute the distance between each point of the 3D model and the plane, and emit a diagnosis as a function of these distances.
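To make steps f) and g) concrete, the following Python sketch (an illustration under our own naming assumptions, not the exact implementation of this work) fits a plane to a set of polyline points by least squares and computes the signed distance of every model point to that plane; the arrays polyline_pts and model_pts are hypothetical placeholders for the selected polyline vertices and the dense point cloud.

    import numpy as np

    def fit_plane(points):
        # Least-squares plane through 3D points via SVD;
        # returns (centroid, unit normal).
        centroid = points.mean(axis=0)
        _, _, vt = np.linalg.svd(points - centroid)
        normal = vt[-1]                      # direction of least variance
        return centroid, normal / np.linalg.norm(normal)

    def signed_distances(points, centroid, normal):
        # Signed point-to-plane distance for every 3D point.
        return (points - centroid) @ normal

    # Hypothetical inputs: polyline vertices selected on the 3D model
    # and the dense point cloud as an (N, 3) array.
    polyline_pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.1, 0.0],
                             [2.0, 0.0, 0.1], [3.0, -0.1, 0.0]])
    model_pts = np.random.rand(1000, 3) * 3.0

    c, n = fit_plane(polyline_pts)
    d = signed_distances(model_pts, c, n)
    # Points well above the fitted plane could indicate a swollen area.
    print("maximum elevation over the fitted plane:", d.max())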
2 METHODOLOGY
Firstly, the patient must record with the mobile phone a video of the wound, from side to side of the abdominal area, in order to obtain views from different perspectives and viewpoints. The second step is automatic and consists of extracting all the images from the video sequence. Once the images have been extracted, the 3D reconstruction process starts automatically with feature tracking.
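As an illustration of this extraction step, a minimal Python sketch using OpenCV's VideoCapture is given below; the file names are hypothetical, and the step parameter is kept at 1 since the method described here extracts all frames.

    import cv2
    import os

    def extract_frames(video_path, out_dir, step=1):
        # Save every `step`-th frame of the video as a PNG image.
        os.makedirs(out_dir, exist_ok=True)
        cap = cv2.VideoCapture(video_path)
        idx = saved = 0
        while True:
            ok, frame = cap.read()
            if not ok:          # end of the sequence
                break
            if idx % step == 0:
                cv2.imwrite(os.path.join(out_dir, "frame_%04d.png" % saved), frame)
                saved += 1
            idx += 1
        cap.release()
        return saved

    # Hypothetical usage:
    # n = extract_frames("wound_video.mp4", "frames", step=1)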
The SFM geometric theory (Hartley and Zisserman, 2003) is based on tracking a set of world points projected in several images taken by the same camera from different viewpoints. These projected points and their correspondences in the subsequent images are obtained through a classical visual feature detection and matching process (Hartley and Zisserman, 2003) using two well-known detectors invariant to rotation and scale: one with a scalar descriptor, SIFT (Lowe, 2004), and one with binary descriptors, ORB (Rublee et al., 2011). Both techniques have extensively proven their excellent performance in terms of number of features, robustness and traceability. Invariance to scale and rotation is important for this kind of application, since the image key points must be identified in all frames of the video sequence, which show the affected area from different viewpoints. Figure 1 shows an image of a surgical wound, provided by the University Hospital Son Espases, with the visual features obtained using the two different detectors.
Figure 1: Feature detection with: (a) SIFT, (b) ORB.
The feature detection with the two tested detectors has been implemented with the OpenCV feature detector functions. The descriptor matching has been implemented with the FLANN (Muja and Lowe, 2009) matcher library. Good matches (inliers) are considered to be those whose distance between correspondences in different images is under a certain threshold (typically, either 0.02 or 2 times the minimum distance between all the matches, whichever is larger). Bad matches are discarded.
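A minimal sketch of this detection and matching stage follows; it is an illustration rather than the exact implementation, it uses SIFT (ORB's binary descriptors would require FLANN's LSH index instead of a KD-tree), and the image file names are hypothetical. The inlier rule mirrors the threshold described above.

    import cv2

    img1 = cv2.imread("view_a.png", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("view_b.png", cv2.IMREAD_GRAYSCALE)

    # Scale- and rotation-invariant detector with scalar descriptors
    # (cv2.xfeatures2d.SIFT_create() in older OpenCV builds)
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # FLANN with a KD-tree index, suitable for SIFT's float descriptors
    index_params = dict(algorithm=1, trees=5)   # 1 = FLANN_INDEX_KDTREE
    search_params = dict(checks=50)
    flann = cv2.FlannBasedMatcher(index_params, search_params)
    matches = flann.match(des1, des2)

    # Keep "good" matches: distance under a threshold derived from the
    # minimum distance over all matches, as described above
    min_dist = min(m.distance for m in matches)
    good = [m for m in matches if m.distance <= max(2 * min_dist, 0.02)]
    print("%d inliers out of %d matches" % (len(good), len(matches)))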
Given the projection matrices, the 3D coordinates of a world point can be obtained from its corresponding image points (in this case, visual features) identified in several views (matching) using triangulation. Ideally, the 3D point should lie at the intersection of all back-projected rays. In general, however, these rays will not intersect in a single point due to the errors inherent in the feature matching process. The 3D coordinates of the world point are obtained by minimizing the reprojection error.
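In practice, an initial estimate can be sketched with OpenCV's triangulatePoints, which solves the linear (DLT) formulation for two views; a non-linear refinement minimizing the reprojection error can then follow. The projection matrices P1 and P2 and the matched pixel coordinates below are hypothetical stand-ins for the outputs of the previous steps.

    import cv2
    import numpy as np

    # Hypothetical 3x4 projection matrices of two views (K [R | t])
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

    # Matched feature coordinates in both images, shape (2, N)
    pts1 = np.array([[100.0, 150.0], [120.0, 160.0]]).T
    pts2 = np.array([[ 98.0, 149.0], [118.0, 159.0]]).T

    # Linear triangulation (DLT); returns homogeneous coordinates (4, N)
    X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)
    X = (X_h[:3] / X_h[3]).T   # Euclidean 3D points, shape (N, 3)
    print(X)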