Dataset Generation for Egyptian Arabic Sign Language
Mariam Ibrahim 1 (https://orcid.org/0009-0006-9560-2312), Milad Ghantous 2 (https://orcid.org/0009-0008-1726-6987) and Nada Sharaf 1 (https://orcid.org/0000-0002-0681-6743)
1 Faculty of Informatics and Computer Science, German International University, Cairo, Egypt
2 Faculty of Media Engineering and Technology, German University in Cairo, Cairo, Egypt
Keywords:
Egyptian Arabic Sign Language (EASL), Sign Language Recognition, Sign Language Datasets, Text-Video
Alignment, Landmark Detection, Skeletal Joint Points, Arabic Sign Languages, Dialect-Specific Sign
Language, Cultural Representation, Machine Learning in Sign Language Translation, Accessibility, Deaf
Community, Gesture Synchronization, Continuous Sign Language Recognition, Computer Vision,
Non-Western Sign Languages, Linguistic Diversity.
Abstract:
This literature review explores the existing body of work related to Egyptian Arabic Sign Language (EASL)
datasets, focusing on translation and text-to-video alignment, and examining relevant hand and face landmark
detection methodologies, including the use of skeletal joint point analysis. With a particular emphasis on the
research gaps in datasets, alignment accuracy, and computer vision models tailored for Arabic dialects, this
review aims to highlight the limitations and challenges within the current literature. Despite advancements in
general sign language research, EASL remains understudied, leaving significant gaps in the development of
resources and tools for accurate gesture translation and synchronization. The review concludes by identifying
the need for dialect-specific resources and advanced alignment techniques to support the growth of accessible,
region-specific sign language datasets.
1 INTRODUCTION
The study of sign language recognition and transla-
tion represents a rapidly evolving field at the inter-
section of linguistics, computer vision, and artificial
intelligence. Recent advancements have enabled re-
searchers to bridge critical communication gaps be-
tween Deaf and hearing communities, making infor-
mation and services more accessible worldwide.
Various initiatives have been developed to sup-
port individuals with hearing impairments in differ-
ent settings, particularly in educational environments.
Tools have been developed to automatically annotate live lectures and provide comprehensive notes, especially for students using Arabic (Nasser et al., 2020; Mohamed et al., 2021). The availability of relevant
sign language generators can significantly enhance
this support by making lectures more engaging and
accessible. Such technologies would enable hearing-
impaired students to follow along more easily, pro-
moting greater engagement and understanding in the
classroom. Efforts have also been made to digitize
various aspects of the Arabic language and its dialects, especially with advances in the recognition and use of Arabic speech (Nabil et al., 2024;
Safwat et al., 2023). This digitization is crucial in en-
abling users in this digital era to access Arabic tools
across different domains, facilitating broader engage-
ment and utilization of technology in their native lan-
guage (Akila et al., 2015; Kassem et al., 2016).
Several requirements are central to these advancements, including datasets and models that aim to capture the linguistic and gestural complexity of sign languages. While significant
progress has been made in recognizing and trans-
lating widely studied sign languages like American
Sign Language (ASL) and British Sign Language
(BSL), there remains a critical gap in resources and
tools for underrepresented sign languages, particu-
larly those from non-Western contexts. Egyptian
Arabic Sign Language (EASL) is one such underex-
plored language that presents unique challenges and
opportunities for researchers. EASL is uniquely sit-
uated within the linguistic landscape, combining el-
ements of Modern Standard Arabic, Egyptian Ara-
bic dialects, and influences from other sign lan-
guages in the region. This complexity underscores
the need for specialized datasets and recognition mod-
els that can capture the nuances of EASL's gram-
matical structure, hand shapes, and facial expres-
sions (Papastratis et al., 2021). Recent studies have
highlighted the importance of developing compre-
hensive datasets and recognition systems for Arabic
Sign Languages. Aloysius and Geetha (2020) conducted a thorough review of vision-based continuous sign language recognition, emphasizing recent deep learning advancements, and a framework combining DeepLabv3+ and BiLSTM for Arabic sign language recognition has yielded promising results (Papastratis et al., 2021).
Additionally, Elnasharty (2024a) proposed a real-
time sign language detection model using TensorFlow
and OpenCV, demonstrating the potential for more
dynamic EASL recognition systems.

Challenges
in studying EASL are multifaceted, ranging from
linguistic complexities to technological limitations.
EASL gestures often involve unique combinations
of hand shapes, orientations, and facial expressions
that differ significantly from Western sign languages
(Moustafa et al., 2024). These elements are deeply
tied to Egyptian culture and social norms, making
them less amenable to standard models trained on
ASL or BSL datasets (Moustafa et al., 2024).

A
promising avenue of research lies in the integration
of advanced machine learning techniques. Methods
like graph convolutional networks (GCNs) and recur-
rent neural networks (RNNs) have shown potential in
capturing the dynamic interactions of hand gestures
and facial expressions, which are central to EASL
communication (Papastratis et al., 2021). Such ap-
proaches not only offer a pathway to more accurate
recognition but also address the limitations of tradi-
tional landmark detection frameworks.

This introduction sets the stage for a detailed review of methods relevant to Egyptian Arabic Sign Language (EASL),
highlighting advancements in sign language datasets,
alignment technologies, and landmark detection. The
review seeks to identify key gaps in EASL research,
aiming to develop technologies that are inclusive and
responsive to the needs of the Egyptian Deaf commu-
nity, thereby enhancing accessibility and representa-
tion in sign language studies.
2 BACKGROUND AND RELATED
WORK
This section provides a comprehensive review of
the existing research in the field of sign language
recognition and translation, with a particular focus
on Egyptian Arabic Sign Language (EASL). It ex-
plores the foundational contributions made in devel-
oping datasets, alignment techniques, and landmark
detection models, which have advanced the study
of sign language recognition globally. While sig-
nificant progress has been made for widely studied
languages like American Sign Language (ASL) and
British Sign Language (BSL), this review highlights
the challenges and gaps in adapting these methods
to culturally and linguistically distinct languages like
EASL.
The subsequent sections address the development
and application of sign language datasets, particularly
for Arabic dialects, and discuss text-video alignment
for translation and advancements in landmark detec-
tion for gesture recognition. The review highlights the
need for culturally tailored resources and innovative
approaches to enhance representation for languages
like EASL, aiming to support future research that fo-
cuses on inclusivity and accessibility in sign language
recognition and translation.
2.1 Egyptian Arabic Sign Language
(EASL)
Egyptian Arabic Sign Language (EASL) is a dis-
tinct and complex language within the broader con-
text of Arabic sign languages. Its unique phono-
logical structure relies on specific hand orientations
and facial expressions to encode grammatical mean-
ing, setting it apart from other sign languages like
ASL or Gulf Arabic Sign Language (Younes et al.,
2023; Mohamed, 2024). Recent research has high-
lighted the importance of capturing the full linguis-
tic richness of EASL, including its regional varia-
tions and continuous signing patterns. The devel-
opment of EASL-specific datasets has seen signifi-
cant progress in recent years. The ArSL2018 dataset,
introduced by Rastgoo et al. (2020), contains over
54,000 images of Arabic Sign Language gestures,
providing a substantial foundation for research. How-
ever, this dataset primarily focuses on isolated signs
and may not fully capture the complexities of continu-
ous signing in EASL. More recent efforts have aimed to address these limitations, including a real-time sign language detection model using TensorFlow and OpenCV that demonstrates the potential for more dynamic EASL recognition systems (Bani Baker et al., 2023). Additionally, Mosleh et al. (2024) introduced
a vision-based method for identifying Arabic hand
signs and converting them to Arabic speech, achiev-
ing a 90% recognition rate. The multimodal nature of
EASL, combining hand shapes, orientations, move-
ments, and facial expressions, continues to pose chal-
lenges for existing models (Mohamed, 2024). Recent
studies have explored various deep learning archi-
tectures to capture these complexities. For instance,
Latif et al. (2020) utilized transfer learning and deep CNN fine-tuning to enhance the recognition accu-
racy of 32 hand motions in ArSL (Elnasharty, 2024a).
Expanding datasets to include diverse regional varia-
tions and continuous signing remains crucial for ad-
vancing EASL research. Given Egypt’s central role
in the Arab world, advancements in EASL research
could significantly impact the development of inclu-
sive technologies for Arabic-speaking Deaf commu-
nities across the region (Hassan et al., 2024). Fu-
ture research directions should focus on creating more
comprehensive EASL datasets that capture regional
dialects, continuous signing, and the full range of lin-
guistic features. This will enable the development
of more accurate and culturally sensitive recognition
systems, ultimately improving communication acces-
sibility for the Egyptian Deaf community.
2.2 Sign Language Datasets
The development of comprehensive sign language
datasets is foundational to advancing sign language
recognition and translation technologies. Recent ad-
vancements in dataset creation have focused on un-
derrepresented languages, including Egyptian Arabic
Sign Language (EASL). Multimodal datasets incor-
porating RGB-D videos and skeletal joint tracking
have been explored to capture the intricacies of non-
Western sign languages, highlighting the importance
of cultural and linguistic specificity in dataset design
(Adaloglou et al., 2021; Al-Shamayleh et al., 2020).
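To make the notion of a multimodal sample concrete, the following sketch shows one plausible record layout for such a dataset. This is a minimal illustration; the field names and array shapes are assumptions for this example rather than an established standard.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SignSample:
    """One hypothetical record in a multimodal sign language dataset."""
    rgb_path: str        # path to the RGB video clip
    depth_path: str      # path to the aligned depth stream (RGB-D capture)
    joints: np.ndarray   # (frames, joints, 3) skeletal joint coordinates
    gloss: str           # gloss label for the signed utterance
    signer_id: int       # anonymized signer ID for signer-independent splits
    region: str          # regional dialect tag, relevant for EASL variation


sample = SignSample(
    rgb_path="clips/0001_rgb.mp4",
    depth_path="clips/0001_depth.mp4",
    joints=np.zeros((120, 25, 3), dtype=np.float32),  # placeholder skeleton
    gloss="THANK-YOU",
    signer_id=7,
    region="Cairo",
)
```

Recording signer identity and region in each sample makes signer-independent evaluation and dialect-aware sampling straightforward, which matters for a language with strong regional variation such as EASL.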
Pioneering contributions include the American
Sign Language Lexicon Video Dataset (ASLLVD),
which provides a lexicon-based collection of isolated
ASL signs (Neidle et al., 2012; Adaloglou et al.,
2021), and the RWTH-PHOENIX-Weather dataset,
which captures continuous signing in German Sign
Language (GSL) (Koller et al., 2015; Camgoz et al.,
2018). These datasets have advanced understand-
ing of temporal dependencies and syntactic structures
in sign language. However, their language-specific
focus limits applicability to languages with differ-
ing grammar, lexicon, and cultural nuances, such as
EASL (Bakalla, 1975; Tharwat et al., 2021).
Efforts to create datasets tailored to Arabic sign
languages remain sparse but promising. For instance,
the Kafr El Sheikh Dataset offers a small-scale re-
source for EASL, focusing on basic linguistic features
and gestures. Expanding such datasets to address con-
tinuous signing and regional variations will be crucial
for further advancements (Al-Shamayleh et al., 2020;
Luqman and El-Alfy, 2021).
Table 1: Summary of Sign Language Datasets.

Dataset                     | Language | Features                                        | Limitations
ASLLVD                      | ASL      | Isolated signs, lexicon-based                   | Limited to isolated signs
RWTH-PHOENIX-Weather        | GSL      | Continuous signing, temporal dependencies       | Does not transfer to languages with different grammar
Kafr El Sheikh Dataset      | EASL     | Small-scale resource, basic linguistic features | Does not address regional variations
General multimodal datasets | Various  | RGB-D videos, skeletal tracking                 | Must be tailored to individual languages
2.3 Sign Language Translation and
Text-Video Alignment
Recent advancements in dataset generation and align-
ment for sign language recognition have focused on
improving automation and addressing the challenges
of culturally specific sign languages. Key developments from 2023-2024 include the following:
Automated Annotation and Alignment: Recent
studies have explored more efficient methods to an-
notate and align sign language videos with text. Elnasharty (2024b) proposed a real-time sign language
detection model using TensorFlow and OpenCV,
demonstrating potential for dynamic EASL recogni-
tion systems. This approach could significantly re-
duce the manual effort required in annotation.
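Neither cited work publishes its alignment procedure, but the core idea of text-video alignment can be sketched with dynamic time warping: given one feature vector per video frame and one embedding per gloss in the transcript, a monotonic warping path assigns every frame to a gloss. The following is a minimal sketch under those assumptions; in practice the frame features would come from a CNN or landmark extractor and the gloss embeddings from a text encoder.

```python
import numpy as np


def dtw_align(frame_feats: np.ndarray, gloss_feats: np.ndarray):
    """Monotonically align video frames to glosses.

    frame_feats: (n_frames, d) per-frame feature vectors.
    gloss_feats: (n_glosses, d) one embedding per gloss in the transcript.
    Returns a list of (frame_index, gloss_index) pairs covering every frame.
    """
    n, m = len(frame_feats), len(gloss_feats)
    # Pairwise Euclidean distance between each frame and each gloss.
    cost = np.linalg.norm(frame_feats[:, None, :] - gloss_feats[None, :, :], axis=-1)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j],      # stay on this gloss
                                                 acc[i - 1, j - 1])  # advance to next gloss
    # Backtrack the cheapest monotonic path from the final cell.
    path, i, j = [], n, m
    while i > 0:
        path.append((i - 1, j - 1))
        if j > 1 and acc[i - 1, j - 1] <= acc[i - 1, j]:
            j -= 1
        i -= 1
    return path[::-1]
```

Learned alignment objectives such as CTC or attention replace the fixed Euclidean cost in modern systems, but the monotonicity constraint illustrated here is the same.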
Cross-Modal Learning: Aly et al. (2024) intro-
duced a deep learning framework combining CNNs
and LSTMs for Arabic sign language recognition,
achieving high accuracy in capturing both spatial and
temporal features. This method shows promise for
handling the complex multimodal aspects of EASL,
including facial expressions and hand movements.
Large-Scale Datasets: The development of larger,
more diverse datasets has been crucial. Jiang et al. (2024) highlighted the importance of expanding sign
language datasets to include regional variations and
continuous signing patterns. While not specific to
EASL, these principles are applicable and essential
for developing robust EASL recognition systems.
Transformer-Based Models: Transformer models
have shown significant potential in sign language pro-
cessing. Bani Baker et al. (2023) utilized transfer
learning and vision transformer approaches for Ara-
bic Sign Language recognition, demonstrating im-
proved performance in capturing the nuances of Ara-
bic sign languages.
2.4 Hand and Face Landmark
Detection in Sign Language Videos
Advancements in hand and face landmark detection
have significantly enhanced sign language recogni-
tion systems by capturing crucial gesture and ex-
pression data, essential for accurate interpretation.
This is particularly impactful for non-Western lan-
guages like Egyptian Arabic Sign Language (EASL).
Google’s MediaPipe framework excels in real-time
gesture recognition and sign language detection by in-
tegrating hand, face, and pose estimation. The frame-
work’s ability to extract 3D hand landmarks from 2D
images has made it particularly useful for sign lan-
guage recognition, offering a cost-effective and effi-
cient solution (Podder et al., 2023). However, chal-
lenges remain when applying these frameworks to
non-Western sign languages like EASL. Cultural dif-
ferences in hand gestures and facial expressions can
lead to misinterpretations or reduced accuracy. To
address this, researchers have proposed hybrid mod-
els that combine deep learning techniques with graph
neural networks (GNNs) to better capture the nuanced
gestures and expressions in EASL (Miah et al., 2023).
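As a concrete illustration of landmark-based feature extraction, the sketch below uses MediaPipe's Holistic solution to collect hand and pose landmarks from a signing video. Module paths and options follow the legacy `mp.solutions` API and may differ across MediaPipe versions, so this should be read as indicative rather than definitive.

```python
import cv2
import mediapipe as mp
import numpy as np


def _to_array(landmarks, n_points):
    """Convert a MediaPipe landmark list to (n_points, 3); zero-pad if absent."""
    if landmarks is None:
        return np.zeros((n_points, 3), dtype=np.float32)
    return np.array([[p.x, p.y, p.z] for p in landmarks.landmark], dtype=np.float32)


def extract_landmarks(video_path):
    """Per-frame hand and body landmarks for a signing video."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    with mp.solutions.holistic.Holistic(min_detection_confidence=0.5,
                                        min_tracking_confidence=0.5) as holistic:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV decodes frames as BGR.
            result = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            frames.append(np.concatenate([
                _to_array(result.left_hand_landmarks, 21),   # 21 points per hand
                _to_array(result.right_hand_landmarks, 21),
                _to_array(result.pose_landmarks, 33),        # 33 body pose points
                # result.face_landmarks (468 points) can be stacked the same
                # way to capture the non-manual features emphasized above.
            ]))
    cap.release()
    return frames
```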
Recent studies have explored the use of Convolutional
Neural Networks (CNNs) in conjunction with Long
Short-Term Memory (LSTM) networks to improve
the accuracy of sign language recognition. This ap-
proach, known as CNNSa-LSTM, has shown promis-
ing results in capturing both spatial and temporal fea-
tures of sign language gestures (Podder et al., 2023).
The integration of self-attention mechanisms in these
models has further enhanced their ability to focus
on relevant features, improving overall performance (Baihan et al., 2024).
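The exact CNNSa-LSTM architecture is not reproduced here; the sketch below only illustrates the general pattern in Keras: a small per-frame CNN, an LSTM over the frame sequence, and a self-attention layer that re-weights the most informative frames. All layer sizes, the clip length, and the class count are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 50           # assumed gloss vocabulary
FRAMES, H, W = 32, 64, 64  # assumed clip length and frame size

inputs = tf.keras.Input(shape=(FRAMES, H, W, 3))
# Per-frame spatial features from a small CNN shared across time steps.
x = layers.TimeDistributed(layers.Conv2D(32, 3, activation="relu"))(inputs)
x = layers.TimeDistributed(layers.MaxPooling2D())(x)
x = layers.TimeDistributed(layers.Conv2D(64, 3, activation="relu"))(x)
x = layers.TimeDistributed(layers.GlobalAveragePooling2D())(x)
# Temporal modelling over the frame sequence.
x = layers.LSTM(128, return_sequences=True)(x)
# Self-attention re-weights frames so salient gestures dominate the summary.
x = layers.MultiHeadAttention(num_heads=4, key_dim=32)(x, x)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

To overcome the limitations of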
existing datasets, researchers have begun developing
more comprehensive and diverse datasets specific to
Arabic Sign Languages. For instance, the RGB Ara-
bic Alphabet Sign Language (AASL) dataset, com-
prising 7,857 raw and fully labelled RGB images of
Arabic sign language alphabets, has become a valu-
able resource for training and evaluating recognition
models (Al-Barham et al., 2023). Additionally, ef-
forts are being made to create datasets that capture the
regional variations and continuous signing patterns
unique to EASL (Al-Barham et al., 2023). The devel-
opment of real-time sign language detection models
using frameworks like TensorFlow and OpenCV has
demonstrated the potential for more dynamic EASL
recognition systems (Elnasharty, 2024a). These ad-
vancements, combined with the growing availability
of EASL-specific datasets, are paving the way for
more accurate and culturally sensitive sign language
recognition technologies. As research in this field
continues to evolve, there is a growing focus on devel-
oping models that can adapt to the unique character-
istics of EASL, including its regional variations and
complex grammatical structures. Recent work has ex-
plored the use of YOLOv8, a cutting-edge object de-
tection algorithm, for real-time sign language recog-
nition, achieving high accuracy rates. Furthermore,
the integration of large language models (LLMs) with
sign language recognition systems is opening new
possibilities for more natural and context-aware com-
munication interfaces (Ahmad et al., 2024). These
ongoing advancements promise to enhance commu-
nication accessibility for the Egyptian Deaf commu-
nity and contribute to the broader goals of inclusive
technology development. As the field progresses, we
can expect to see more sophisticated, real-time, and
culturally sensitive sign language recognition systems
that bridge the communication gap between deaf and
hearing individuals.
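For the YOLOv8 direction mentioned above, the source does not specify a configuration, but a minimal real-time loop with the ultralytics package might look like the following; the checkpoint name is a hypothetical EASL fine-tuned detector, not a published model.

```python
import cv2
from ultralytics import YOLO

# Hypothetical YOLOv8 checkpoint fine-tuned on EASL sign classes.
model = YOLO("easl_signs_yolov8n.pt")

cap = cv2.VideoCapture(0)  # live webcam stream
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, verbose=False)  # one detection pass per frame
    annotated = results[0].plot()          # draw boxes and class labels
    cv2.imshow("EASL signs", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```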
2.5 Sign Language Recognition
The ability to accurately recognize sign language is a
cornerstone of bridging communication gaps between
deaf and hearing communities. Recent advancements
in sign language recognition leverage cutting-edge
technologies to improve accuracy and scalability, en-
abling applications ranging from real-time translation
to gesture-based human-computer interaction. This
subsection explores two critical aspects of sign lan-
guage recognition: multimodal approaches that inte-
grate various input modalities to enhance recognition
accuracy and real-time systems designed for practical
deployment.
Multimodal systems combine data from hand ges-
tures, facial expressions, skeletal motion, and even
audio cues, creating a richer representation of sign
language input. These approaches are particularly
valuable for addressing the complexities of languages
like Egyptian Arabic Sign Language (EASL), where
non-manual features such as facial expressions play
a critical grammatical role (Tharwat et al., 2021;
Luqman and El-Alfy, 2021). In parallel, real-
time recognition systems are increasingly viable due
to advancements in computational efficiency, such
as lightweight neural architectures and transformer-
based models, which maintain high accuracy with low
latency (Vaswani et al., 2017; Attia et al., 2023). To-
gether, these innovations set the stage for developing
inclusive and culturally specific recognition systems
tailored to the needs of underrepresented languages
like EASL.
2.5.1 Multimodal Approaches for Sign
Language Recognition
Multimodal approaches enhance sign language recog-
nition by combining hand gestures, facial expres-
sions, and skeletal motion with advanced RGB-D
cameras and inertial sensors. These methods effec-
tively capture the detailed motions crucial for under-
standing EASL, where hand movements and facial
expressions are closely linked (Tharwat et al., 2021;
Luqman and El-Alfy, 2021).
For instance, studies using depth-based sensors
have demonstrated improvements in gesture segmen-
tation and recognition by providing three-dimensional
spatial data that traditional video methods lack (Ren
et al., 2013; Al-Shamayleh et al., 2020). Addition-
ally, the integration of audio modalities for translating
speech to sign language has opened new avenues for
real-time applications, enabling more accessible tech-
nologies for the hearing-impaired (Cao et al., 2019;
Adaloglou et al., 2021).
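A minimal late-fusion sketch of this idea, assuming precomputed skeletal and facial landmark streams (the shapes and vocabulary size below are placeholders, not values from the cited studies):

```python
import tensorflow as tf
from tensorflow.keras import layers

FRAMES = 32        # assumed clip length
NUM_CLASSES = 50   # assumed gloss vocabulary

# Two modality streams: skeletal joints and facial landmarks per frame.
skel_in = tf.keras.Input(shape=(FRAMES, 75 * 3), name="skeleton")  # 75 joints x (x, y, z)
face_in = tf.keras.Input(shape=(FRAMES, 468 * 3), name="face")     # face-mesh points

skel = layers.LSTM(128)(skel_in)  # temporal summary of manual features
face = layers.LSTM(64)(face_in)   # temporal summary of non-manual features

# Late fusion: concatenate modality summaries before classification.
fused = layers.Concatenate()([skel, face])
fused = layers.Dense(128, activation="relu")(fused)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(fused)

model = tf.keras.Model([skel_in, face_in], outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Depth or inertial streams would enter the same way as additional inputs, each with its own temporal encoder.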
2.5.2 Real-Time Sign Language Recognition
Real-time sign language recognition systems have
gained traction with the increasing computational
power of edge devices and advancements in model
optimization. Techniques involving lightweight con-
volutional models and quantized neural networks
have made it feasible to deploy recognition systems
on mobile and embedded platforms (Tharwat et al.,
2021).
For EASL, real-time systems must account for
cultural and linguistic nuances, including rapid ges-
ture transitions and context-dependent facial expres-
sions. Solutions such as attention-based LSTMs and
transformer-based architectures have shown promise
in achieving high accuracy while maintaining low la-
tency (Attia et al., 2023; Luqman and El-Alfy, 2021).
By integrating these approaches with multimodal in-
puts, real-time systems can provide robust perfor-
mance in diverse environments.
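One common route to such deployment is post-training quantization with TensorFlow Lite, sketched below; the small stand-in network is only a placeholder for a trained recognizer.

```python
import tensorflow as tf

# Stand-in for a trained recognizer (e.g. a landmark-sequence classifier).
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(32, 225)),  # 32 frames x 225 features
    tf.keras.layers.Dense(50, activation="softmax"),
])

# Post-training quantization shrinks the model for mobile and embedded targets.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("easl_recognizer.tflite", "wb") as f:
    f.write(tflite_model)
```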
2.6 Sign Language Generation
Sign language generation (SLG) complements recog-
nition efforts by producing sign language content in
forms such as animations, avatars, or synthesized
videos. SLG plays a crucial role in accessibility for
Deaf communities by visually representing spoken or
written text. However, SLG faces unique technical
challenges, particularly for non-Western languages
like Egyptian Arabic Sign Language (EASL). The
primary issues include the lack of annotated datasets
that capture cultural nuances and grammatical struc-
tures, and the complexity of replicating EASL's mul-
timodal nature, where subtle facial expressions and
intricate hand movements convey critical meaning.
Challenges and Opportunities: Existing SLG
systems often rely on datasets and models designed
for Western sign languages such as ASL or German
Sign Language, which struggle with EASL's unique
syntactic and morphological features (Stoll et al.,
2020; Camgoz et al., 2018). Advances in compu-
tational frameworks, such as HamNoSys and neu-
ral motion retargeting, offer promising pathways for
generating linguistically rich content for non-Western
languages like EASL (Prikhodko et al., 2020; Zhang
et al., 2022).
Avatars and Animation: Virtual sign language
avatars are a popular medium in SLG. Motion capture
technology and linguistic rules, as demonstrated by
McDonald et al. (2016), enable these avatars to pro-
duce sign language gestures. However, their move-
ments often lack the fluidity and cultural authenticity required for natural communication, as shown in Figure 1 (Kipp et al., 2011). In contrast, 3D animation
techniques utilize kinematic models to create lifelike
gestures, but they require significant computational
resources and specialized expertise.
Figure 1: Sign Language Avatars: Animation and Compre-
hensibility (from (Kipp et al., 2011)).
Video Synthesis: Recent advancements in deep
learning, such as GAN-based models like SignGAN
(Stoll et al., 2020), have introduced new capabilities
for generating realistic signing videos. Pose-based
representations, such as those proposed by Natarajan and Elakkiya (2022), enhance temporal consis-
tency and visual accuracy in synthesized videos. For
EASL, these methods hold particular promise due to
their ability to capture gesture variations and cultural
nuances.
Models and Frameworks: Gloss-to-sign
pipelines translate linguistic gloss annotations into
sign language animations or videos. Neural machine
translation techniques, including Transformers, have
been adapted to improve temporal alignment and
naturalness in gloss-to-sign generation (Camgoz
et al., 2018). Text-to-sign pipelines, which extend
this process to raw text input, present broader chal-
lenges, requiring sophisticated parsing and semantic
understanding to produce culturally relevant signs.
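As a deliberately simplified sketch of the gloss-to-sign idea (a recurrent encoder-decoder rather than the Transformer pipelines cited above), a model can map a gloss sequence to a sequence of pose vectors; the vocabulary size, skeleton dimension, and sequence lengths are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

VOCAB = 500                      # assumed gloss vocabulary size
POSE_DIM = 75 * 3                # assumed skeleton: 75 keypoints x (x, y, z)
MAX_GLOSSES, MAX_FRAMES = 16, 64

gloss_in = tf.keras.Input(shape=(MAX_GLOSSES,), dtype="int32")
x = layers.Embedding(VOCAB, 128, mask_zero=True)(gloss_in)  # embed gloss tokens
x = layers.Bidirectional(layers.LSTM(128))(x)               # summarize the gloss sentence
x = layers.RepeatVector(MAX_FRAMES)(x)                      # one copy per output frame
x = layers.LSTM(256, return_sequences=True)(x)              # temporal pose decoder
poses = layers.TimeDistributed(layers.Dense(POSE_DIM))(x)   # keypoints per frame

model = tf.keras.Model(gloss_in, poses)
model.compile(optimizer="adam", loss="mse")  # regress keypoint coordinates
```

The decoded pose sequence can then drive an avatar or condition a video synthesis model such as those discussed above.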
Encoding Systems: Frameworks like HamNoSys
(Prikhodko et al., 2020) and SignWriting (Grushkin,
2017) provide structured ways to encode sign lan-
guage gestures. HamNoSys captures detailed infor-
mation about hand shapes, movements, and orien-
tations, making it valuable for animating avatars as
shown in Figure 2. SignWriting, a visually intuitive
method for documenting signs, is increasingly inte-
grated into computational pipelines for SLG.
Figure 2: The five components of signs in sign languages
(from (Prikhodko et al., 2020)).
Hybrid Approaches: Hybrid models that com-
bine avatars with GAN-based video synthesis aim to
balance realism with scalability. These systems gen-
erate lifelike gestures while leveraging the flexibility
of digital avatars to ensure accessibility (Stoll et al.,
2020).
Synthetic Data Generation for Sign Language:
Synthetic data generation using techniques such as
Generative Adversarial Networks (GANs) and 3D
avatar-based simulations has emerged as a promising
solution to address the scarcity of large-scale sign lan-
guage datasets (Natarajan and Elakkiya, 2022). For
EASL, synthetic data generation can mitigate chal-
lenges posed by regional variability and limited data
availability. By simulating culturally specific gestures
and facial expressions, researchers can train mod-
els more effectively, reducing overfitting to Western-
dominated datasets. Recent work has also explored
the use of motion-capture systems to generate high-
fidelity skeletal data that aligns closely with real-
world EASL signing (Adaloglou et al., 2021; Luqman
and El-Alfy, 2021).
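A minimal GAN sketch for this kind of synthetic skeletal data is given below; all shapes are assumed and the adversarial training loop is omitted. Real EASL recordings would supply the genuine batches against which the discriminator is trained.

```python
import tensorflow as tf
from tensorflow.keras import layers

LATENT, FRAMES, POSE_DIM = 64, 32, 75 * 3  # assumed shapes

# Generator: noise vector -> synthetic skeletal pose sequence.
generator = tf.keras.Sequential([
    layers.Dense(256, activation="relu", input_shape=(LATENT,)),
    layers.RepeatVector(FRAMES),
    layers.LSTM(256, return_sequences=True),
    layers.TimeDistributed(layers.Dense(POSE_DIM)),
])

# Discriminator: pose sequence -> probability it is a real recording.
discriminator = tf.keras.Sequential([
    layers.LSTM(128, input_shape=(FRAMES, POSE_DIM)),
    layers.Dense(1, activation="sigmoid"),
])

noise = tf.random.normal((4, LATENT))
fake_sequences = generator(noise)       # (4, FRAMES, POSE_DIM) synthetic clips
scores = discriminator(fake_sequences)  # (4, 1) realism scores
```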
3 RESEARCH GAP
The research landscape for Egyptian Arabic Sign
Language (EASL) reveals significant gaps across lin-
guistic, technological, and cultural domains. While
there have been advancements in Arabic Sign Lan-
guage recognition, notably by Aly et al. (2024) and Bani Baker et al. (2023), EASL-specific challenges persist. The lack
of extensive datasets that capture EASL's regional
variations and dialectal complexities hinders the de-
velopment of effective recognition models. Current
datasets, like ArSL2018, focus mainly on isolated
signs and do not adequately represent the dynamic na-
ture of natural EASL communication.
Current sign language recognition systems, as
highlighted by Aloysius and Geetha (2020), show a bias to-
wards analyzing hand gestures alone, overlooking the
critical role of facial expressions and body posture
in EASL. This oversight limits the effectiveness of
recognition models, which fail to capture the full
range of EASL communication. Moreover, the inade-
quacy of existing frameworks like MediaPipe in accu-
rately detecting EASL-specific gestures calls for the
development of culturally adapted computer vision al-
gorithms. Aligning EASL gestures with textual rep-
resentations poses a significant challenge, with cur-
rent methods struggling to capture the unique tim-
ing and non-verbal elements of EASL. This issue
is compounded by regional variations across Egypt,
as Alotaibi (2023) notes, adding variability to ges-
ture execution and interpretation. Additionally, the
potential of using Generative Adversarial Networks
(GANs) to enhance EASL datasets is underexplored.
While promising for addressing data scarcity, their
use must carefully preserve cultural and linguistic
authenticity to accurately reflect natural signing nu-
ances.
The development of real-time EASL recognition
systems, while progressing as demonstrated by Elnasharty (2024b), still faces significant challenges
in terms of efficiency, accuracy, and adaptability
to diverse signing styles and environmental condi-
tions. This gap is particularly pronounced in resource-
constrained settings, where high-performance com-
puting infrastructure may not be readily available.
Addressing these interconnected challenges requires
a multidisciplinary approach that integrates advanced
machine learning techniques, linguistic expertise, and
cultural insights. The resolution of these research
gaps is crucial not only for advancing the field of sign
language recognition but also for developing inclusive
technologies that can significantly enhance communi-
cation accessibility for the Egyptian Deaf community
and potentially serve as a model for other underrepre-
sented sign languages globally.
4 CONCLUSIONS
The studies reviewed highlight advancements in sign
language recognition, especially in datasets, transla-
tion models, and landmark detection for ASL and
BSL. Yet, Egyptian Arabic Sign Language (EASL)
remains underexplored. Current tools often under-
perform for EASL due to their reliance on Western
sign language datasets that do not account for EASL's
unique structures and cultural nuances. Future re-
search should focus on developing comprehensive
EASL datasets, employing multimodal approaches
for gesture analysis, and creating adaptive models
that recognize the linguistic and cultural specifics
of EASL, including its use of facial expressions for
grammatical context.
Efforts to address gaps in sign language recog-
nition should involve collaborative research between
linguists, computer scientists, and the Deaf commu-
nity. This interdisciplinary approach aims to develop
resources that are linguistically precise and culturally
representative. Progress in tackling EASL-specific
challenges could inform similar advancements for
other underrepresented sign languages, setting new
standards for inclusivity and cultural sensitivity. This
would enhance communication access and improve
the quality of life for Deaf communities both in Egypt
and globally.
ACKNOWLEDGEMENTS
This paper acknowledges the use of OpenAI’s Chat-
GPT for generating text in sections discussing related
work and proposed research gaps. The AI-generated
content was reviewed and revised to ensure accuracy
and alignment with the paper’s objectives.
REFERENCES
Adaloglou, N., Kouris, L., Theodorakis, S., et al. (2021). A
comprehensive study on deep learning-based methods
for sign language recognition. IEEE Transactions on
Pattern Analysis and Machine Intelligence.
Ahmad, S. I., Sabir, N., Abid, A., and Hussain, A. (2024).
Sign assist: Real-time isolated sign language recog-
nition and translator model connecting sign language
users with gpt model. In Proc. AVSEC 2024, pages
82–88.
Akila, G., El-Menisy, M., Khaled, O., Sharaf, N., Tarhony,
N., and Abdennadher, S. (2015). Kalema: Digitizing
arabic content for accessibility purposes using crowd-
sourcing. In Computational Linguistics and Intelli-
gent Text Processing: 16th International Conference,
CICLing 2015, Cairo, Egypt, April 14-20, 2015, Pro-
ceedings, Part II 16, pages 655–662. Springer.
Al-Barham, M., Alsharkawi, A., Al-Yaman, M., Al-
Fetyani, M., Elnagar, A., SaAleek, A. A., and Al-
Odat, M. (2023). Rgb arabic alphabets sign language
dataset. arXiv preprint arXiv:2301.11932.
Al-Shamayleh, A. S., Ahmad, R., Jomhari, N., and
Abushariah, M. A. (2020). Automatic arabic sign
language recognition: A review, taxonomy, open
challenges, research roadmap and future directions.
Malaysian Journal of Computer Science, 33(4):306–
343.
Aloysius, N. and Geetha, M. (2020). Understanding vision-
based continuous sign language recognition. Multime-
dia Tools and Applications, 79(31):22177–22209.
Aly, S., Osman, B., Aly, W., and Saber, M. (2024).
Arabic sign language recognition using deep ma-
chine learning. Multimedia Tools and Applications,
80(15):22331–22354.
Attia, N. F., Ahmed, M. T. F. S., and Alshewimy, M. A.
(2023). Efficient deep learning models based on ten-
sion techniques for sign language recognition. Intelli-
gent systems with applications, 20:200284.
Baihan, A., Alutaibi, A. I., Alshehri, M., and Sharma, S. K.
(2024). Sign language recognition using modified
deep learning network and hybrid optimization: a hy-
brid optimizer (ho) based optimized cnnsa-lstm ap-
proach. Scientific Reports, 14(1):26111.
Bakalla, M. H. (1975). Bibliography of Arabic linguistics.
Mansell London.
Bani Baker, Q., Alqudah, N., Alsmadi, T., and Awawdeh,
R. (2023). Image-based arabic sign language recogni-
tion system using transfer deep learning models. Ap-
plied Computational Intelligence and Soft Computing,
2023(1):5195007.
Camgoz, N. C., Koller, O., Hadfield, S., and Bowden, R.
(2018). Neural sign language translation. Proceedings
of CVPR.
Cao, Z., Hidalgo, G., Simon, T., Wei, S.-E., and Sheikh, Y.
(2019). Openpose: Realtime multi-person 2d pose es-
timation using part affinity fields. IEEE Transactions
on Pattern Analysis and Machine Intelligence.
Elnasharty, M. S. (2024a). Using deep learning technology
for real-time sign language detection and recognition
at public libraries in egypt: An experimental study.
Scientific Journal of Library, Archives & Information
(SJLAI), 6(17).
Elnasharty, M. S. (2024b). Using deep learning technology
for real-time sign language detection and recognition
at public libraries in egypt: An experimental study.
Scientific Journal of Libraries, Documents and Infor-
mation, 6(1).
Grushkin, D. A. (2017). Writing signed languages: What
for? what form? American annals of the deaf,
161(5):509–527.
Hassan, M. A., Ali, A. H., and Sabri, A. A. (2024).
Enhancing communication: Deep learning for ara-
bic sign language translation. Open Engineering,
14(1):20240025.
Jiang, X. et al. (2024). Recent advances on deep learning
for sign language recognition. Computer Modeling in
Engineering & Sciences, 139(3):2399–2450.
Kassem, L., Sabty, C., Sharaf, N., Bakry, M., and Abden-
nadher, S. (2016). tashkeelwap: A game with a pur-
pose for digitizing arabic diacritics.
Kipp, M., Heloir, A., and Nguyen, Q. (2011). Sign language avatars: Animation and comprehensibility. In Intelli-
gent Virtual Agents: 10th International Conference,
IVA 2011, Reykjavik, Iceland, September 15-17, 2011.
Proceedings 11, pages 113–126. Springer.
Koller, O., Zargaran, S., Ney, H., and Bowden, R. (2015).
Continuous sign language recognition: Towards large
vocabulary statistical recognition systems handling
multiple signers. Computer Vision and Image Under-
standing, 141:108–125.
Latif, G., Mohammad, N., AlKhalaf, R., AlKhalaf, R., Al-
ghazo, J., and Khan, M. (2020). An automatic arabic
sign language recognition system based on deep cnn:
An assistive system for the deaf and hard of hearing.
International Journal of Computing and Digital Sys-
tems, 9(4):715–724.
Luqman, H. and El-Alfy, E.-S. M. (2021). Towards hy-
brid multimodal manual and non-manual arabic sign
language recognition: marsl database and pilot study.
Electronics, 10(14):1739.
McDonald, J., Wolfe, R., Schnepp, J., Hochgesang, J., Jam-
rozik, D. G., Stumbo, M., Berke, L., Bialek, M., and
Thomas, F. (2016). An automated technique for real-
time production of lifelike animations of american
sign language. Universal Access in the Information
Society, 15:551–566.
Miah, A. S. M., Hasan, M. A. M., Jang, S.-W., Lee, H.-S.,
and Shin, J. (2023). Multi-stream general and graph-
based deep neural networks for skeleton-based sign
language recognition.
Mohamed, A., Nasser, N., and Sharaf, N. (2021). Auto-
matic code-switched lecture annotation. In Interactive
Mobile Communication, Technologies and Learning,
pages 464–477. Springer.
Mohamed, N. A. E. (2024). Arabic sign language and vi-
tal signs monitoring using smart gloves for the deaf.
Engineering Research Journal (Shoubra), 53(2):185–
191.
Mosleh, M. A., Assiri, A., Gumaei, A. H., Alkhamees,
B. F., and Al-Qahtani, M. (2024). A bidirectional ara-
bic sign language framework using deep learning and
fuzzy matching score. Mathematics, 12(8):1155.
Moustafa, A., Rahim, M. M., Khattab, M. M., Zeki, A. M.,
Matter, S. S., Soliman, A. M., and Ahmed, A. M.
(2024). Arabic sign language recognition systems: A
systematic review. Indian Journal of Computer Sci-
ence and Engineering, 15:1–18.
Nabil, M., Abdalla, A., Sharaf, N., and Sabty, C. (2024).
Bridging the gap: developing an automatic speech
recognition system for egyptian dialect integration
into chatbots. In International Conference on Appli-
cations of Natural Language to Information Systems,
pages 119–125. Springer.
Nasser, N., Salah, J., Sharaf, N., and Abdennadher, S.
(2020). Automatic lecture annotation. In 2020 IEEE
Frontiers in Education Conference (FIE), pages 1–9.
IEEE.
Natarajan, B. and Elakkiya, R. (2022). Dynamic gan
for high-quality sign language video generation from
skeletal poses using generative adversarial networks.
Soft Computing, 26(23):13153–13175.
Neidle, C., Thangali, A., and Sclaroff, S. (2012). Chal-
lenges in development of the american sign language
lexicon video dataset (asllvd) corpus. Proceedings of
LREC.
Papastratis, I., Chatzikonstantinou, C., Konstantinidis, D.,
Dimitropoulos, K., and Daras, P. (2021). Artificial
intelligence technologies for sign language. Sensors,
21(17):5843.
Podder, K. K., Ezeddin, M., Chowdhury, M. E., Sumon, M.
S. I., Tahir, A. M., Ayari, M. A., Dutta, P., Khandakar,
A., Mahbub, Z. B., and Kadir, M. A. (2023). Signer-
independent arabic sign language recognition system
using deep learning model. Sensors, 23(16):7156.
Prikhodko, A., Grif, M., and Bakaev, M. (2020). Sign lan-
guage recognition based on notations and neural net-
works. In Digital Transformation and Global Society:
5th International Conference, DTGS 2020, St. Peters-
burg, Russia, June 17–19, 2020, Revised Selected Pa-
pers 5, pages 463–478. Springer.
Rastgoo, R., Kiani, K., and Escalera, S. (2020). Hand sign
language recognition using multi-view hand skeleton.
Expert Systems with Applications, 150:113336.
Ren, Z., Yuan, J., Meng, Z., and Zhang, J. (2013). Robust
part-based hand gesture recognition using kinect sen-
sor. In IEEE Transactions on Multimedia, volume 15,
pages 1110–1120. IEEE.
Safwat, S., Salem, M. A.-M., and Sharaf, N. (2023). Build-
ing an egyptian-arabic speech corpus for emotion
analysis using deep learning. In Pacific Rim Inter-
national Conference on Artificial Intelligence, pages
320–332. Springer.
Stoll, S., Camgoz, N. C., Hadfield, S., and Bowden, R.
(2020). Text2sign: Towards sign language production
using neural machine translation and generative ad-
versarial networks. In International Journal of Com-
puter Vision (IJCV).
Tharwat, G., Ahmed, A. M., and Bouallegue, B. (2021).
Arabic sign language recognition system for alphabets
using machine learning techniques. Journal of Elec-
trical and Computer Engineering, 2021(1):2995851.
Vaswani, A., Shazeer, N., Parmar, N., et al. (2017). Atten-
tion is all you need. Advances in Neural Information
Processing Systems.
Younes, S. M., Gamalel-Din, S. A., Rohaim, M. A., and
Elnabawy, M. A. (2023). Automatic translation of
arabic text to arabic sign language using deep learn-
ing. Journal of Al-Azhar University Engineering Sec-
tor, 18(68):566–579.
Zhang, H., Li, W., Liu, J., Chen, Z., Cui, Y., Wang, Y., and
Xiong, R. (2022). Kinematic motion retargeting via
neural latent optimization for learning sign language.
IEEE Robotics and Automation Letters, 7(2):4582–
4589.