3.1 Analysis of the Computational Cost Constraints
We now evaluate how limiting the available computational cost
affects performance, and which groups of features are selected
most often and prove most useful. Table 3 displays the error rate
and the selection rate (percentage of appearance) of each group as
a function of the maximum cost allowed, in MFLOPS, using the LSLD.
The error rate counts both kinds of wrong decision: the system
reports a drone when there is none in the environment, or reports
no drone when one is present. A group is considered to appear when
one or more of its features is selected. Table 4 displays the same
results using the LSQD.
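The two quantities reported in the tables can be sketched as follows. This is a minimal illustration, assuming binary decisions (1 for drone, 0 for no drone) and feature selections stored as sets; the function names, feature names and toy values are hypothetical and are not taken from the paper.

```python
def error_rate(y_true, y_pred):
    """Fraction of wrong decisions: false alarms plus missed detections.

    y_true, y_pred: equal-length lists of 0 (no drone) / 1 (drone).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one decision is required")
    false_alarms = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    misses = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return (false_alarms + misses) / len(y_true)


def group_selected(selected_features, group_features):
    """A group 'appears' if one or more of its features is selected."""
    return any(f in selected_features for f in group_features)


def selection_rate(runs, group_features):
    """Percentage of runs in which the group appears."""
    if len(runs) == 0:
        raise ValueError("at least one run is required")
    hits = sum(1 for selected in runs if group_selected(selected, group_features))
    return 100.0 * hits / len(runs)


# Toy values, purely illustrative (not data from the paper).
truth = [1, 1, 0, 0, 1, 0, 0, 1]
preds = [1, 0, 0, 1, 1, 0, 0, 1]
print(error_rate(truth, preds))  # one miss + one false alarm over 8 -> 0.25

runs = [{"f1", "f4"}, {"f2"}, {"f4", "f5"}, {"f1", "f2"}]
print(selection_rate(runs, {"f1", "f2"}))  # group appears in 3 of 4 runs -> 75.0
```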
At the beginning, the system selects groups G3, G4 and G5 in
almost 100% of the cases because of the low threshold imposed
(0.5 MaxMFLOPS). When this value is increased to 1 MaxMFLOPS, the
spectral features appear. With the restriction set at 1.5
MaxMFLOPS, the MFCCs start to be selected. At higher values
(3.5 MaxMFLOPS), group G2, which is composed of features related
to the pitch, is selected. At 4.0 MaxMFLOPS the algorithm can
select whatever it needs, because the sum of all the costs is
lower than this value.
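The effect of the budget on which groups can be chosen can be illustrated with a small sketch. It is not the selection algorithm used in this work: it assumes an exhaustive search over group subsets, and the group names, costs and error function are hypothetical placeholders rather than the values behind Tables 3 and 4.

```python
from itertools import combinations


def feasible_subsets(costs, budget):
    """Yield every non-empty subset of groups whose total cost fits the budget."""
    names = sorted(costs)
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            if sum(costs[g] for g in subset) <= budget:
                yield subset


def best_subset(costs, budget, error_of):
    """Return the feasible subset with the lowest error, or None if nothing fits."""
    return min(feasible_subsets(costs, budget), key=error_of, default=None)


# Hypothetical costs (in MFLOPS) and error model, for illustration only.
costs = {"cheap": 0.2, "medium": 0.9, "expensive": 2.0}
gains = {"cheap": 0.05, "medium": 0.10, "expensive": 0.25}


def toy_error(subset):
    """Toy error: each selected group lowers a 0.5 baseline by a fixed gain."""
    return 0.5 - sum(gains[g] for g in subset)


for budget in (0.5, 1.5, 4.0):
    print(budget, best_subset(costs, budget, toy_error))
```

With these placeholder values, the tightest budget admits only the cheapest group, while a budget above the sum of all costs leaves the search unconstrained, which mirrors the 4.0 MaxMFLOPS case described above.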
In general, LSQD works better than LSLD, since its error rate is
lower in most cases, especially when the cost constraint is very
restrictive. The importance of some features is reflected in the
tables. For instance, once group G1 (MFCCs and ∆MFCCs) appears
(from 1.5 MaxMFLOPS onwards), its selection rate is 100%. In fact,
the parameter that best reflects the importance of G1 is the error
rate, which falls significantly when that group appears (from
57.5% to 28.5% with LSLD, and from 41.9% to 23.4% with LSQD).
Something similar happens when G2 (pitch, HNR and RUF) appears
(from 3.5 MaxMFLOPS onwards): again, its selection rate is 100%
and its contribution to the performance of the system is
substantial (the error falls from 30.1% to 15.7% with LSLD and
from 23.8% to 15.5% with LSQD). The importance of pitch could be
directly related to the characteristic frequency of the sound of
drones, which depends on the size of the device, the number of
blades and the speed.
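To put the error drops quoted above on a common scale, the following sketch computes the absolute reduction (in percentage points) and the relative reduction for each case. The error rates are the ones stated in the text; the helper name and the dictionary layout are only illustrative.

```python
def reductions(before, after):
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = before - after
    relative = 100.0 * absolute / before
    return absolute, relative


# Error rates (%) before and after each group starts to be selected.
cases = {
    ("G1", "LSLD"): (57.5, 28.5),
    ("G1", "LSQD"): (41.9, 23.4),
    ("G2", "LSLD"): (30.1, 15.7),
    ("G2", "LSQD"): (23.8, 15.5),
}

for (group, detector), (before, after) in cases.items():
    absolute, relative = reductions(before, after)
    print(f"{group} {detector}: -{absolute:.1f} points ({relative:.1f}% relative)")
```

In relative terms, the appearance of G1 roughly halves the LSLD error, and in both cases the drop is larger for LSLD than for LSQD.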
With regard to the rest of the features, G3 seems to work well
only when using LSLD, judging by its high selection rate with that
detector. The same applies to G8 when using LSQD. Other groups
(G5, G6 and G7) seem to be more robust to changes in the detector
used, since they present a high selection rate for both detectors.
3.2 Analysis of the Model of Drone and Other No-drone Sounds
Next, we analyze the error obtained for each of the drone models
included in the database. Table 5 shows the different drone
models, the duration of each of them and the error obtained. These
results use the constraint and detector that gave the lowest error
among the previous cases (13.4% of error with 4.0 MFLOPS and
LSLD).
Table 5 shows that the Parrot AR is the best detected model (0%
error rate), while the worst one is the UDI 817 (50% error). The
latter could be due to the small presence of this model in the
database. As can be observed, a large proportion of the database
belongs to the DJI Phantom 3, which obtains an error rate of
12.2%.
As mentioned previously, the dataset includes no-drone sounds
that are present in smart city environments and can easily be
confused with the sound of a drone. Table 6 details the no-drone
sounds, their duration and the error obtained.
The results show that the most confusing sounds are the fire
siren, the radial saw and construction work (with error rates of
40.7%, 36.4% and 22.5%, respectively). This could be because the
fundamental frequency of these sounds lies in the frequency range
of drones (one or two hundred Hz). In contrast, other sounds such
as helicopter, excavator, motorbike or plane are very well
detected as no-drone sounds, with error rates below 3%. This is
especially interesting in the case of other aerial vehicles
(helicopter, plane), since they are the most likely to conflict
with drones: they share the same working space (the sky) and could
appear at the same time.
4 CONCLUSIONS
The aim of this work is to develop a system capable of detecting
the presence of drones in real time. To this end, different
experiments related to Smart Sound Processing (SSP) have been
carried out, covering feature extraction, feature selection and
detectors. The objective of the algorithms is to minimize the
error rate while controlling the computational cost, which is
achieved through a constraint on the number of operations per
second (MFLOPS).
Regarding the selected features, the results show that MFCCs and
pitch-related features are the best subsets of features for the
problem at hand, for both linear and quadratic detectors.
Depending on the desired final error rate and on the resources of
the processing device, a compromise should be reached be-
ICPRAM 2019 - 8th International Conference on Pattern Recognition Applications and Methods