Importance Order Ranking for Texture Extraction:
A More Efficient Pooling Operator Than Max Pooling?
S. Vargas Ibarra[a], V. Vigneron[b], J.-Ph. Conge[c] and H. Maaref[d]
Univ. Evry, Université Paris-Saclay, IBISC EA 4526, Evry, France
{vincent.vigneron, hichem.maaref, sofia.vargasibarra, jean-philippe.conge}@univ-evry.fr
Keywords:
Deep Learning, Pooling Function, Rank Aggregation, LBP, Segmentation, Contour Extraction.
Abstract:
Much of the success of convolutional neural networks (CNNs) lies in translation invariance. The other part resides
in the fact that, thanks to a judicious choice of architecture, the network is able to make decisions taking into
account the whole image. This work provides an alternative way to extend the pooling function, which we name
rank-order pooling, capable of extracting texture descriptors from images. Rank-order pooling layers are
non-parametric, independent of the geometric arrangement or sizes of the image regions, and can therefore
better tolerate rotations. Rank-order pooling functions produce images capable of emphasizing low/high
frequencies, contours, etc. We show that rank-order pooling leads to CNN models which can optimally exploit
information from their receptive field.
1 INTRODUCTION
Convolutional neural network (CNN) architectures are
augmented by multi-resolution (pyramidal) structures,
which come from the idea that the network needs to
see different levels of resolution to produce good
results. A CNN stacks four different types of processing
layers: convolution, pooling, ReLU and fully-connected
(Goodfellow et al., 2016).
Placed between two convolutional layers, the
pooling layer receives several input feature maps.
Pooling (i) reduces the number of parameters in the
model (subsampling) and the computations in the
network while preserving the important characteristics
of the feature maps, (ii) improves the efficiency of the
network, and (iii) helps to avoid over-learning (overfitting).
Thus, the pooling layer makes the network less
sensitive to the position of features: the fact that an
object is a little higher or lower, or even that it has a
slightly different orientation, should not cause a
radical change in the classification of the image.
The max-pooling function, for example, downsamples
the input representation (image, hidden-layer
output matrix, etc.), reducing its dimensionality.
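As a minimal sketch of this downsampling, the NumPy snippet below applies non-overlapping 2 x 2 max pooling to a small array. The function name max_pool2d, the window size k and the sample values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def max_pool2d(x: np.ndarray, k: int = 2) -> np.ndarray:
    """Non-overlapping k x k max pooling of a 2-D array (stride = k)."""
    h, w = x.shape
    # Assumption of this sketch: crop so both dimensions are multiples of k.
    h, w = h - h % k, w - w % k
    # Split the array into (h/k) x (w/k) blocks of size k x k.
    blocks = x[:h, :w].reshape(h // k, k, w // k, k)
    # Keep only the largest value of each block.
    return blocks.max(axis=(1, 3))

# Hypothetical 4 x 4 input (e.g. a tiny feature map).
image = np.array([[1, 3, 2, 0],
                  [4, 2, 1, 1],
                  [0, 1, 5, 6],
                  [2, 2, 7, 8]])
pooled = max_pool2d(image)
print(pooled.shape)  # each spatial dimension is halved
print(pooled)
```

Each output value summarizes one 2 x 2 window, so the 16 input values are reduced to 4; where the maximum sits inside its window no longer matters, which is the source of the positional tolerance described above.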
[a] https://orcid.org/0000-0003-3102-4315
[b] https://orcid.org/0000-0001-5917-6041
[c] https://orcid.org/0000-0002-8641-0312
[d] https://orcid.org/0000-0002-1192-7333
Weaknesses of pooling functions are well identified
(Yu et al., 2014): (i) they do not preserve all
spatial information, (ii) the maximum chosen by
max-pooling in the pixel grid is not the true maximum,
and (iii) average pooling assumes a single mode with a
single centroid. The question is how to take the
characteristics of the (input image) regions being pooled
into account, optimally, in the pooling operation.
Part of the answer lies in the work of Lazebnik et al., who
demonstrated the importance of the spatial structure
of pooling neighborhoods (Lazebnik et al., 2006):
indeed, local spatial variations of image pixel intensities
(commonly called textures in image processing)
characterize an "organized area phenomenon" (Haralick,
1979) which cannot be captured in pooling layers.
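Weakness (i) can be illustrated with a small, hypothetical example: two regions that contain the same pixel values in different spatial arrangements (two different textures) are indistinguishable after max or average pooling. The arrays and the helper pool_whole_region below are assumptions made for illustration only.

```python
import numpy as np

# Two hypothetical 4 x 4 regions with the same pixel values (eight 0s, eight 9s):
# vertical stripes versus a checkerboard.
stripes = np.array([[0, 9, 0, 9]] * 4)
checker = (np.indices((4, 4)).sum(axis=0) % 2) * 9

def pool_whole_region(region: np.ndarray):
    """Pool the full region into a single (max, mean) pair."""
    return region.max(), region.mean()

print(pool_whole_region(stripes))
print(pool_whole_region(checker))
# The two textures differ pixel-wise, yet both pooled summaries coincide.
print(np.array_equal(stripes, checker))
```

Both summaries depend only on the multiset of values in the window, not on how those values are laid out, so the texture information is lost at this stage.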
This paper proposes a new pooling operation that is
independent of the geometric arrangement or sizes of
image regions and can therefore better tolerate
rotations. It is based on the Savage definition of rank
order (Savage, 1956) and is simple to implement.
Notations
Throughout this paper, small Latin letters $a, b, \ldots$ represent
integers. Small bold letters $\mathbf{a}, \mathbf{b}$ stand for
vectors and capital letters $A, B$ for matrices or tensors,
depending on the context. The dot product between
two vectors is denoted $\langle \mathbf{a}, \mathbf{b} \rangle$. We denote by
$\|\mathbf{a}\| = \sqrt{\langle \mathbf{a}, \mathbf{a} \rangle}$ the $\ell_2$ norm of a vector.
$X_1, \ldots, X_n$ are non-ordered variates and $x_1, \ldots, x_n$ non-ordered
observations. "Ordered statistics" means either
$p_{(1)} \le \ldots \le p_{(n)}$ (ordered variates) or
$p_{(1)} \le \ldots \le p_{(n)}$ (ordered observations).
The extreme order statistics are
Ibarra, S., Vigneron, V., Conge, J. and Maaref, H.
Importance Order Ranking for Texture Extraction: A More Efficient Pooling Operator Than Max Pooling?.
DOI: 10.5220/0011142200003271
In Proceedings of the 19th International Conference on Informatics in Control, Automation and Robotics (ICINCO 2022), pages 585-594
ISBN: 978-989-758-585-2; ISSN: 2184-2809
Copyright © 2022 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved