by (Savchenko et al., 2022) achieved the best accuracy
of 78.75%. All methods used a softmax layer or a
linear transformation as the classifier. The models
were trained for at least 50 epochs. The respective
accuracies, obtained with 10-fold cross-validation, are
reported in Table 3 (Deep FER section). These results
further support the usability of our dataset in
real-world applications.
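The 10-fold cross-validation protocol mentioned above can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the helper names (`stratified_k_folds`, `cross_validate`, `fit_centroids`, `predict_centroid`) are hypothetical, the features are synthetic 2-D points rather than InFER images, and a nearest-centroid classifier stands in for the deep models with softmax or linear classifiers. It is not the training code used for Table 3.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def cross_validate(features, labels, fit, predict, k=10, seed=0):
    """Return one accuracy per fold; each fold is held out exactly once."""
    accuracies = []
    for held_out in stratified_k_folds(labels, k, seed):
        test_set = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in test_set]
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


def fit_centroids(xs, ys):
    """Stand-in classifier (assumption): store the mean feature vector per class."""
    grouped = defaultdict(list)
    for x, y in zip(xs, ys):
        grouped[y].append(x)
    return {y: [sum(col) / len(col) for col in zip(*rows)]
            for y, rows in grouped.items()}


def predict_centroid(model, x):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


# Synthetic stand-in data: 7 classes (one per basic expression), 20 samples each.
data_rng = random.Random(42)
features, labels = [], []
for cls in range(7):
    for _ in range(20):
        features.append([cls * 10 + data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)])
        labels.append(cls)

scores = cross_validate(features, labels, fit_centroids, predict_centroid, k=10)
print(f"mean accuracy over {len(scores)} folds: {sum(scores) / len(scores):.4f}")
```

Stratifying by expression label keeps every fold class-balanced, so the mean over the ten held-out folds is comparable to a single reported accuracy figure.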
India is a culturally and ethnically rich country,
home to about 1.4 billion people of various racial
identities who have migrated to and settled in the
subcontinent. In this context, there was a need for an
India-specific, ethnically diverse dataset comprising
all seven basic human facial expressions.
The proposed InFER dataset comprises 10,200
images and 4,200 videos of seven basic facial
expressions with their age, gender, and ethnic labels.
Subjects were selected so as to avoid dataset bias with
respect to ethnicity, age, class, or gender. Moreover,
since posed human expressions lack realism, we adopted
a two-way collection strategy: posed expressions were
captured from human subjects, while realistic
spontaneous/acted expressions were collected on a
crowd-sourced basis from online sources. We also
conducted extensive experimentation on baseline models
and available state-of-the-art deep-learning-based
models, showing that our proposed dataset can be
deployed for real-world applications. The Multi-Ethnic
Indian Facial Expression Recognition (InFER) dataset
should help researchers train and validate their
algorithms for real-world applications.
InFER: A Multi-Ethnic Indian Facial Expression Recognition Dataset