can do the identification automatically with high
accuracy (>97%). Using the MS COCO metric
system, the model that produced the highest mAP
value (0.751) was the configuration framework FR
using image-set C, LR 0.005 and IOU 0.7. When
comparing the models, the nested ANOVAs showed
significant differences in mAP values between the
SSD and FR frameworks, as expected from previous
studies (Arcos-Garcia et al., 2018; Janahiraman et al.,
2019); however, the significance of the remaining factors and variables within the models had not been explored previously. Almost all of the
models using FR achieved mAP values over 0.7 with
the highest reported value of 0.751, whereas the
models using the SSD framework achieved mAP
values under 0.7 with the highest reported value of
0.698. Interestingly, there was very little difference
between any of the models in terms of accuracy. All
models were able to positively classify the majority
of test images with an accuracy of 94.4–97.8%.
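For concreteness, the mAP figures above follow the COCO convention of averaging per-class average precision. A minimal all-point-interpolated AP computation, sketched in pure Python with made-up detection scores rather than the study's data, might look like:

```python
def average_precision(scored, n_gt):
    """All-point interpolated AP from (confidence, is_true_positive)
    pairs and the number of ground-truth objects."""
    scored = sorted(scored, key=lambda s: -s[0])  # highest confidence first
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in scored:
        tp += is_tp
        fp += not is_tp
        recalls.append(tp / n_gt)
        precisions.append(tp / (tp + fp))
    # make the precision curve monotonically non-increasing (the envelope)
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_r) * p
        prev_r = r
    return ap

# two ground-truth head capsules; the middle detection is a false positive
print(average_precision([(0.9, True), (0.8, False), (0.7, True)], n_gt=2))
```

COCO mAP then averages this quantity over classes and over IOU matching thresholds from 0.50 to 0.95.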
Previous studies have shown that there is no
universal LR value (Chudzik et al., 2020),
suggesting that each model and its associated neural
network would require an optimisation of its own LR
value. When experimenting with hyperparameter
values, the combination of learning rates and model
architectures showed significant relationships.
Significant effects were found when the SSD
framework was paired with LR 0.01, and when the
FR framework was paired with LR 0.0005. There was
no significant relationship between the different IOU
values trialled and mAP values. However, there was
a small effect on model performance (a 1.61% difference in the strength of the relationship with and without IOU). Taken together, this suggests that varying the IOU threshold hyperparameter has a negligible effect on the general performance of the models.
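The IOU threshold in question is the intersection-over-union overlap a predicted box must reach with a ground-truth box to count as a correct detection. A minimal sketch, assuming boxes in (x1, y1, x2, y2) format:

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

# a detection counts as a match only if IoU clears the chosen threshold
predicted, truth = (10, 10, 50, 50), (12, 12, 48, 48)
print(iou(predicted, truth) >= 0.7)  # → True (IoU is 0.81 here)
```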
The deep learning method proposed here utilises
trained object detection models and can classify
images in less than a second. In its present state, the workflow built on object detection and deep learning requires chironomids to be collected on site and euthanised, and their head capsules to be mounted on microscope slides. These slides are then viewed through a microscope and photographed.
Images then need to be transferred to a computer, where they can be examined by the object detection models, which classify each chironomid head capsule into one of the three genera. The initial stages
require the use of costly workstations and an expert
to work out the optimum training conditions.
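At inference time, the classification step described above amounts to taking the model's most confident detection. A minimal sketch; the label map, detection format, and score threshold here are hypothetical, not the study's actual configuration:

```python
# hypothetical label map; the real model's three genera and class ids may differ
GENERA = {1: "genus_A", 2: "genus_B", 3: "genus_C"}

def classify_head_capsule(detections, score_threshold=0.5):
    """Return the genus of the highest-scoring detection above the
    threshold, or None when no confident detection was found."""
    best = max(detections, key=lambda d: d["score"], default=None)
    if best is None or best["score"] < score_threshold:
        return None
    return GENERA.get(best["class_id"])

print(classify_head_capsule([
    {"class_id": 2, "score": 0.91},
    {"class_id": 1, "score": 0.40},
]))  # → genus_B
```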
However, once the actual model has been developed,
anyone with access to a computer can use it. When
combined with a camera device, such as an affordable
USB camera, this automatic computer model could be
used to identify chironomid larvae specimens just by
passing them in front of a camera feed rather than
using digital images exclusively. It is worth noting, however, that this demonstration covers only a very small fraction of chironomid diversity: just three genera were detected out of an estimated 200+ genera worldwide, and identification was not resolved to the species level, at which an estimated 20,000+ species exist worldwide. The use of computer vision
models and, in particular, deep learning techniques for object detection in the ecological sciences is still in its infancy. This study, however, illustrates how
this technique can be used to rapidly identify
taxonomically challenging organisms. It is envisaged
that future work in object detection will open new
opportunities for biological diversity and
biomonitoring, not only of chironomids but also of other groups of freshwater organisms.
REFERENCES
Arcos-García, Á., Álvarez-García, J. and Soria-Morillo, L.
(2018). Evaluation of deep neural networks for traffic
sign detection systems. Neurocomputing, 316, 332-344.
Ärje, J., Melvad, C., Jeppesen, M.R., et al. (2020). Automatic image-based identification and biomass estimation of invertebrates. Methods in Ecology and Evolution, 11, 922-931. https://doi.org/10.1111/2041-210X.13428.
Azhar, M.A.H.B., Hoque, S. and Deravi, F. (2012). Automatic identification of wildlife using Local Binary Patterns. IET Conference on Image Processing (IPR 2012), pp. 1-6.
Bentler, P. and Satorra, A. (2010). Testing Model Nesting
and Equivalence. Psychological Methods. 15, 111–123.
Biggs, J., Williams, P., Whitfield, M., Fox, G. and Nicolet,
P. (2000). Biological techniques of still water quality
assessment phase 3: method development. Environment
Agency R & D Technical report E110. Environment
Agency. Bristol. UK.
Bondi, E., Fang, F., Hamilton, M., et al. (2018). Spot poachers in action: Augmenting conservation drones with automatic detection in near real time. 32nd AAAI Conference on Artificial Intelligence, 7741-7746.
Bose, S. and Kumar, V. (2020). Efficient inception V2
based deep convolutional neural network for real-time
hand action recognition. IET Image Processing, 14, 688-696.
Cao, X., Chai, L., Jiang, D., et al. (2018). Loss of
biodiversity alters ecosystem function in freshwater
streams: potential evidence from benthic
macroinvertebrates. Ecosphere, 9, e02445.
Chudzik, P., Mitchell, A., Alkaseem, M., et al. (2020).
Mobile real-time grasshopper detection and data
aggregation framework. Scientific Reports, 10, 1150.
ICPRAM 2022 - 11th International Conference on Pattern Recognition Applications and Methods