2.3 CT-Navigation Matching
To match the navigation path to the binary tree encoding
the CT bronchial anatomy, we identify frames
traversing a higher bronchial level (binary tree level)
and orient the entering branches to choose the tree
node. A given frame at time t can be categorized
from the multiplicity of lumen centres, $NLM$, as:
1) Frame within the same bronchial level ($NLM_{t+1} = NLM_t$);
2) Frame approaching a bronchial level ($NLM_{t+1} > NLM_t$);
3) Frame traversing a bronchial level ($NLM_{t+1} < NLM_t$).
Starting at the top node
of the binary tree, each time a frame traverses a
bronchial level, the tree level is increased and the path
node sequence is updated by adding "1" or "0" depending
on the entering branch orientation. The centre
point with highest likelihood is considered to be
the scope's current position and defines the entering
branch. Its orientation is defined by its relative position
with respect to the disappearing centres. If its
x-coordinate is larger than the average x-coordinate
of the vanishing points, the node is
labelled "1", and "0" otherwise.
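The categorization and path update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the `Centre` record, its `track_id` field and all function names are hypothetical, centres are assumed to keep a stable identity across frames so that vanishing centres can be found, and the entering centre is taken as the highest-likelihood centre of frame t+1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Centre:
    track_id: int      # hypothetical identity kept by the tracker across frames
    x: float           # x-coordinate of the lumen centre in the image
    likelihood: float  # likelihood of being the scope's current position


def classify_transition(n_t, n_next):
    """Categorize a frame from the lumen-centre counts NLM_t and NLM_{t+1}."""
    if n_next == n_t:
        return "within"
    if n_next > n_t:
        return "approaching"
    return "traversing"


def entering_label(frame_t, frame_next):
    """Return "1" or "0" for a traversing frame, or None if undecidable."""
    surviving_ids = {c.track_id for c in frame_next}
    vanishing = [c for c in frame_t if c.track_id not in surviving_ids]
    if not frame_next or not vanishing:
        return None  # assumption: skip frames where the rule cannot be applied
    # Assumption: the entering branch is the most likely centre of frame t+1.
    entering = max(frame_next, key=lambda c: c.likelihood)
    mean_x = sum(c.x for c in vanishing) / len(vanishing)
    return "1" if entering.x > mean_x else "0"


def path_from_frames(frames):
    """Build the node sequence by scanning consecutive frame pairs."""
    path = []
    for frame_t, frame_next in zip(frames, frames[1:]):
        kind = classify_transition(len(frame_t), len(frame_next))
        if kind == "traversing":
            label = entering_label(frame_t, frame_next)
            if label is not None:
                path.append(label)
    return path


# Toy sequence: two bifurcations, entering the right branch and then the left.
frames = [
    [Centre(1, 100.0, 0.9)],
    [Centre(2, 80.0, 0.6), Centre(3, 140.0, 0.8)],   # approaching
    [Centre(3, 130.0, 0.9)],                         # traversing
    [Centre(3, 128.0, 0.9)],                         # within
    [Centre(4, 60.0, 0.7), Centre(5, 150.0, 0.5)],   # approaching
    [Centre(4, 70.0, 0.9)],                          # traversing
]
print(path_from_frames(frames))
```

In this toy sequence the first traversal keeps the centre to the right of the vanishing one and the second keeps the centre to its left, so the resulting node sequence is ("1", "0").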
Figure 2 illustrates the main steps involved in our
CT-video matching based on the codification of airway
anatomical landmarks. We show the skeleton of
a segmented CT scan (left image) that represents the
centre airway line and the final binary tree data struc-
ture for the first 3 bronchial levels (central image). We
have labelled the skeleton branching points according
to their corresponding binary tree nodes, so that the
green path would correspond to the node sequence
(1, 0, 1, 0). The rightmost images illustrate the identification
of lumen centres and their matching to the binary
tree (central image) representing the bronchial
exploration path. Lumen centres in the right images are
plotted in green, with dots indicating the one corresponding
to the scope's current position. We show two
representative cases: a frame within the same bronchial
level (top images) and a traversing frame (bottom images).
The node sequence associated with these frames
is shown on the central tree in green.
3 EXPERIMENTS
3.1 Experiment 1: Tracked Centres Accuracy
We have compared, under intervention conditions, the
quality of centres tracked using the method of Section 2.2
(labelled MSER) to the method in (Sánchez et al., 2015b),
exclusively based on local maxima (labelled LMx). Both
methods have been applied to 3 ultrathin bronchoscopy
videos performed for the study of peripheral
pulmonary nodules at Hospital de Bellvitge. Videos
were acquired using an Olympus Exera III HD Ultrathin
videobronchoscope. For each video, we considered
one proximal (up to the 6th division) and one distal
(above the 6th) fragment. The maximum bronchial
level achieved in our ultrathin explorations was between
the 10th and 12th, which is in the range of the maximum
expected level reachable by ultrathin navigation
(Asano et al., 2013). Fragments included the most
common artefacts of intra-operative videos: bronchoscope
collision with the bronchial wall, bubbles due
to the anaesthesia and patient coughing.
For each fragment, we sampled 10 consecutive
frames every 50 frames. These frames were annotated
by 2 clinical experts to mark false detections and the
position of missed centres. Inspired by crowdsourcing
strategies (Maier-Hein et al., 2015), annotations were
blended to get a unique ground truth using the intersection
of the two annotated point sets, as illustrated
in Fig. 3. Ground truth sets were used to compute precision
(Prec) and recall (Rec) for each set of consecutive
frames. These scores are taken for all such sets in
distal and proximal fragments for statistical analysis.
We have used a t-test for paired data to assess significant
differences across the methods' average precision and
recall, and confidence intervals (CI) to report average
expected ranges. Tests and CIs have been computed
at significance level α = 0.05.
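The evaluation steps above can be sketched as follows. This is a simplified illustration, not the evaluation code of this work: function names are hypothetical, annotated points are assumed to be already matched to hashable identifiers (so the intersection is a plain set intersection), the conventions for empty sets are assumptions, and the critical value of the Student t distribution is passed in by the caller rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def blend_annotations(points_a, points_b):
    """Ground truth as the intersection of the two experts' point sets."""
    return set(points_a) & set(points_b)


def precision_recall(detected, ground_truth):
    """Precision and recall of detected centres against the ground truth."""
    detected, ground_truth = set(detected), set(ground_truth)
    true_pos = len(detected & ground_truth)
    # Assumption: an empty set on either side scores 1.0 by convention.
    prec = true_pos / len(detected) if detected else 1.0
    rec = true_pos / len(ground_truth) if ground_truth else 1.0
    return prec, rec


def paired_t_statistic(scores_a, scores_b):
    """t statistic of the paired differences (n - 1 degrees of freedom).

    Raises ZeroDivisionError if all paired differences are identical.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def confidence_interval(values, t_crit):
    """Two-sided CI for the mean; t_crit is the tabulated critical value."""
    half_width = t_crit * stdev(values) / sqrt(len(values))
    return mean(values) - half_width, mean(values) + half_width


# Toy example with made-up identifiers and scores (not data from the study).
gt = blend_annotations({"c1", "c2", "c4"}, {"c1", "c2", "c4", "c5"})
prec, rec = precision_recall({"c1", "c2", "c3"}, gt)

rec_a = [0.9, 0.8, 1.0, 0.9]   # recall per frame set, method A
rec_b = [0.8, 0.6, 0.9, 0.7]   # recall per frame set, method B
t_stat = paired_t_statistic(rec_a, rec_b)
diffs = [a - b for a, b in zip(rec_a, rec_b)]
ci_low, ci_high = confidence_interval(diffs, t_crit=3.182)  # alpha=0.05, 3 dof
```

The statistic would then be compared with the critical value for n - 1 degrees of freedom at α = 0.05; a CI for the difference that excludes zero corresponds to a significant difference at that level.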
Table 1 reports CIs for each score and method at
proximal and distal levels, as well as p-values for the
difference between MSER and LMx scores. At the proximal
level, both methods perform equally, but MSER
keeps its quality scores at distal levels. This introduces
significant differences (p-val < 0.05) in distal
and total Prec and Rec. Such differences are
larger for Rec, with a CI for the difference equal to
[0.05, 0.16] for distal bronchi and [0.03, 0.16] overall.
It is worth noticing that the proposed method always
attains 100% precision and a recall over 86%, with
non-significant differences between distal and proximal
levels (p-val > 0.7).
To validate the stability of our tracked centres in
full explorations, we have applied our MSER tracking
to one of the complete videos. The chosen video
starts at the carina, reaches the 11th level and includes
back-and-forth navigation with bronchoscope rotation.
Concerning image quality, there are illumination saturation
artefacts at the most distal levels and some
fragments were recorded using narrow band imaging.
The original video with the tracked centres on each
image frame with a colour legend indicating candi-
date centres discarded by the Kalman filter, tracked