switch leads per cardiac cycle (Zhang et al., 2005). In
order to take this difference in lead selection into
account, the BB and BL values were also calculated.
The BB values were computed according to the
method adopted first in (Martínez et al., 2004) and
later in (Zhang et al., 2005; Vázquez-Seisdedos et al.,
2011). This method defines the T wave end per beat
by selecting the lead in which the detection error,
between the automatically and manually annotated T
wave end, is minimal. The BL method selects the
ECG lead in which the BB method placed the most T
wave ends. If both leads were selected for an equal
number of T wave ends, the first lead was chosen.
From the viewpoint
of a human operator, this is a more realistic
procedure. The results of both methods can be found
in Table 2.
Table 2: Comparison of the overall mean and standard
deviation (sd) of the differences, in ms, between the
automatically detected and manually annotated T wave
ends for all methods with the supplementary BB and BL
protocols.
            SEMI           TAN            TRA            INT
Lead    mean    sd     mean    sd     mean    sd     mean    sd
BB      -4.9   15.1    -8.0   16.3    11.8   29.7     2.2   20.0
BL      -6.2   17.6    -7.6   19.0    14.3   37.5     3.9   22.9
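
The BB and BL protocols described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the data layout (one list of per-lead errors per beat) and the example values are assumptions made for illustration. The rule for a per-beat tie is also an assumption, since the text only specifies the tie rule for the BL lead count.

```python
from collections import Counter

def best_per_beat(errors):
    """BB: per beat, keep the lead whose detection error is smallest in magnitude.

    errors[b][l] is the automatic minus manual T wave end (ms) for beat b, lead l.
    Returns a list of (lead_index, error) pairs, one per beat.
    """
    selected = []
    for beat in errors:
        # min() returns the first minimum, so a per-beat tie goes to the first
        # lead (assumption: the text does not say how per-beat ties are handled).
        lead = min(range(len(beat)), key=lambda i: abs(beat[i]))
        selected.append((lead, beat[lead]))
    return selected

def best_lead(errors):
    """BL: keep the single lead that BB selected for the most beats."""
    wins = Counter(lead for lead, _ in best_per_beat(errors))
    # max() returns the first maximum, so equal counts select the first lead,
    # matching the tie rule stated in the text.
    lead = max(range(len(errors[0])), key=lambda i: wins[i])
    return lead, [beat[lead] for beat in errors]

# Made-up errors in ms for four beats on two leads (illustration only).
errors = [[-4.0, 10.0], [6.0, -2.0], [3.0, 8.0], [-12.0, 5.0]]
bb = best_per_beat(errors)
bl_lead, bl_errors = best_lead(errors)
print(bb)
print(bl_lead, bl_errors)
```

In this example each lead wins two beats, so BL falls back to the first lead and retains the larger errors of that lead, whereas BB keeps the smaller error of every beat. This is why the BB spread can be expected to be lower than the BL spread.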
When applying the BB protocol, it was observed
that the overall sd obtained by each of the methods
was lower than the sd obtained for leads I and
II. This was expected, since the lead in which the
detection error is minimal was selected per beat. This
protocol is most in accordance with the annotation
method of the cardiologists, since they made their
annotation by examining both leads and based their
decision on the best lead (Martínez et al., 2004).
In clinical practice, the best lead can be selected
after the ECG recorder is set up. Hence, the BL results
are clinically the most relevant for everyday T wave
end detection. We demonstrated that the TRA
method is the least repeatable of all methods tested
(sd=37.5ms), whilst the SEMI method is the most
repeatable one (sd=17.6ms). The INT (integral) method
scores best in terms of accuracy (mean=3.9ms).
It should be noted that the mean and sd calculation
was simplified: one value was computed per record
and the overall mean and sd were computed as the
average of these values. This method does not take
the number of annotated T wave ends in each record
into account, so every record carries equal weight
regardless of how many beats were annotated.
Therefore, we opted to generate Bland-
Altman plots. These allow a direct comparison
between all manual annotations and the T wave end
selections of the four algorithms. Only the BB values
were taken into account, since this protocol is most in
accordance with the annotation method of the
cardiologists. Based on the Bland-Altman plots of the
respective QT intervals, Q being manually annotated
by the cardiologists, an evaluation of the agreement
of the methods was performed. The results of the
evaluation are depicted in Figure 4.
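
The difference between the simplified per-record averaging and a pooled Bland-Altman evaluation can be sketched as follows. This is an illustrative sketch only: the function names and the example differences are hypothetical, and the limits of agreement are assumed to follow the conventional definition of bias plus or minus 1.96 sd, which the text does not state explicitly.

```python
from statistics import mean, stdev

def per_record_summary(records):
    """Simplified calculation: average the per-record means and sds.

    Every record gets equal weight, however many beats it contains.
    """
    means = [mean(r) for r in records]
    sds = [stdev(r) for r in records]
    return mean(means), mean(sds)

def bland_altman_limits(diffs, k=1.96):
    """Pooled bias and limits of agreement over all individual differences.

    diffs holds automatic minus manual QT intervals in ms. The factor
    k=1.96 is the conventional choice and an assumption here.
    """
    bias = mean(diffs)
    spread = k * stdev(diffs)
    return bias, bias - spread, bias + spread

# Made-up differences in ms: a short record and a longer one.
record_a = [2.0, 4.0]
record_b = [10.0, 12.0, 8.0, 10.0, 12.0, 8.0]

simple_mean, simple_sd = per_record_summary([record_a, record_b])
bias, lower, upper = bland_altman_limits(record_a + record_b)
print(simple_mean, simple_sd)
print(bias, lower, upper)
```

With these values the per-record average of the means is 6.5 ms, while the pooled bias is 8.25 ms, because the longer record contributes six of the eight differences.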
The TRA method shows the widest limits of
agreement (-109.10/87.27ms), which supports the
previous findings. Likewise, the narrowest limits
were found for the SEMI method (-75.69/85.01ms),
although the agreement of the INT method was only
slightly worse (-84.48/80.70ms). The obtained biases
are in the range of those reported earlier (Panicker
et al., 2009; Vázquez-Seisdedos et al., 2011).
The results of the SEMI method could be
explained by the small influence of baseline wander
and U waves on the detection of the T wave end.
Because of their low amplitude, U waves have
significantly less influence on the sum of the squared
differences than the T waves. However, it
should be noted that the method is highly operator
dependent. This is highlighted by the clustering
of the difference points in the Bland-Altman plot. All
QT intervals computed for a record are biased by the
difference between the end point selected on the
template T wave and the manually annotated T wave
end. This leaves the QT variability relatively
unaffected, but alters the QT lengths.
This operator dependency should be taken into
account when using this method in QT interval
analysis.
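
The effect of a template end point offset can be illustrated with a small numerical sketch. It assumes the idealised case in which the operator's choice shifts every QT interval of a record by the same constant; the QT values and the 12 ms offset are made up for illustration.

```python
from statistics import mean, stdev

# Hypothetical manually annotated QT intervals of one record, in ms.
manual_qt = [380.0, 390.0, 400.0, 390.0, 400.0]

# Assumed constant offset introduced by the operator's template end point.
template_offset = 12.0
semi_qt = [qt + template_offset for qt in manual_qt]

# The mean QT length moves by the offset; the spread does not change.
shift_in_mean = mean(semi_qt) - mean(manual_qt)
change_in_sd = stdev(semi_qt) - stdev(manual_qt)
print(shift_in_mean, change_in_sd)
```

Under this assumption the whole record moves as one cluster in the Bland-Altman plot, while the beat-to-beat variability within the record is preserved.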
Apart from the limits of agreement, the biggest
difference between the algorithms was observed
in the cloud of points on the right. This cloud contains
the longest QT intervals, including biphasic T waves
and fusions with the U wave. Both the TAN and TRA
methods were outperformed by the INT method for the
detection of the actual ends of these QT intervals.
This is probably because the TAN and TRA
methods rely on the detection of the T wave peak,
making it harder to detect more complex biphasic T
waves or fused T and U waves.