Federated Machine Learning Framework for Soil Classification in Smart Agriculture

Marwen Ghabi, Sofiane Khalfallah and Hela Ltifi

University of Sfax, National School of Electronics and Telecommunications of Sfax, Sfax, Tunisia

University of Sousse, Centre’s Computer Science Research Laboratory (PRINCE LAB), 4002, Sousse, Tunisia

University of Sfax, Research Groups in Intelligent Machines, Sfax, Tunisia

Keywords:

Federated Learning, Artiﬁcial Intelligence, Smart Agriculture, Data Privacy, Machine Learning as a Service

(MLaaS), Microservices Architecture, Soil Classiﬁcation, Predictive Models, Scalability, Resilience.

Abstract:

In the area of smart agriculture, data management and analysis play a key role in improving agricultural prac-

tices. However, the centralization of data poses major challenges in terms of conﬁdentiality, especially due to

the sensitivity of information collected from farms. Federated learning addresses these concerns by enabling

the training of AI models in a decentralized manner, where data remains localized while sharing only model

updates. This approach ensures conﬁdentiality while facilitating collaboration between different data sources.

This study presents an innovative solution that combines federated learning with a modular microservices-

based architecture to deploy predictive models as a machine learning service. This architecture, consisting of

microservices dedicated to data management, local model training, federated aggregation, and Application

Programming Interface (API) delivery, enables real-time predictions to be delivered in a scalable and resilient

manner. To illustrate this approach, a case study on soil type classiﬁcation was conducted. The results show

that our method not only preserves the conﬁdentiality of distributed agricultural data, but also improves the

accuracy of agricultural recommendations. The integration of federated learning into a microservices archi-

tecture represents a signiﬁcant step forward, offering new perspectives for artiﬁcial intelligence in complex

environments requiring conﬁdentiality and scalability.

1 INTRODUCTION

In a global context where efficiency and sustainability have become agricultural priorities, artificial intelligence (AI) and its sub-domains such as machine learning (ML) offer promising solutions to address these challenges. AI can significantly improve crop management, resource optimization and crop yield forecasting. However, the collection and processing of massive data from farms around the world raises major privacy and data security concerns.

Federated learning (FL) appears as an effective so-

lution to this problem, allowing us to train AI models

without centralizing sensitive data. This decentralized

approach ensures that information remains on local

devices (e.g., ground sensors or weather stations) and

only model parameters such as gradients are shared

between devices to create a global model. Therefore,

this signiﬁcantly reduces the risks of privacy breaches

while maintaining high model performance.

In the area of smart agriculture, FL is particularly

relevant. By combining diverse data sources such as

ground sensors, drones, or satellite images, AI mod-

els can adapt to speciﬁc local conditions while bene-

ﬁting from shared global knowledge. This approach

improves the accuracy of agricultural forecasts and

optimises agricultural practices by integrating local

data while preserving confidentiality (Žalik and Žalik, 2023). In addition, the deployment of machine learning as a service (MLaaS) within a microservices architecture allows agricultural actors to easily access

AI models via standardized interfaces, without the

need for in-depth technical expertise. This creates a

ﬂexible and scalable infrastructure for farms, allow-

ing them to adapt these technologies to their speciﬁc

needs. MLaaS architecture is particularly suitable

for distributed environments such as the Internet of

Things, where devices are geographically dispersed

but require centralized access to AI services (Bacciu

et al., 2017).


Ghabi, M., Khalfallah, S. and Ltiﬁ, H.

Federated Machine Learning Framework for Soil Classiﬁcation in Smart Agriculture.

DOI: 10.5220/0013182200003890

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 17th International Conference on Agents and Artiﬁcial Intelligence (ICAART 2025) - Volume 3, pages 782-788

ISBN: 978-989-758-737-5; ISSN: 2184-433X

Finally, the combination of FL and MLaaS within

a microservices infrastructure not only protects agri-

cultural data but also accelerates the adoption of AI

technologies in this critical sector. We demonstrate that this approach provides effective scalability and facilitates the integration of AI solutions into farmers' daily operations.

2 LITERATURE REVIEW

State of the art:

Federated learning has become a key approach to overcoming privacy challenges in training artificial intelligence models. Traditionally, data was centralized on a single server to train ML models, which posed serious confidentiality issues, particularly in sensitive areas such as health and agriculture. FL allows data to be kept on local devices while sharing only model updates. This approach, well explored in various fields, is just beginning to develop in the smart agriculture sector (Žalik and Žalik, 2023) (Bacciu et al., 2017).

In agriculture, sensors, satellite images and drones generate vast amounts of data that must be analyzed to make informed decisions about crops, soil management, and weather. However, the geographic and environmental diversity of farms makes it difficult to train a single model that could be applied to all regions. FL circumvents this obstacle by training models specific to each region while benefiting from a federated aggregation of information. For example, FL can improve the accuracy of agricultural recommendations by integrating local data without requiring explicit sharing (Žalik and Žalik, 2023). At the same time, the adoption of Machine Learning as a Service (MLaaS) in agriculture democratizes access to sophisticated models without requiring end users (farmers and farm managers) to have advanced technical knowledge (Assem et al., 2016) (Bonawitz et al., 2019). MLaaS uses a micro-services architecture, where AI services are accessible via APIs and can be easily integrated with existing farm management tools. This also allows for better model scalability and maintenance, as well as reduced infrastructure costs (García et al., 2020).

Identiﬁed gaps:

• Few recent studies have explored the integration

of federated learning for soil classiﬁcation in agri-

culture.

• The current literature on MLaaS in the agricul-

tural sector remains limited with regard to its ap-

plication to complex machine learning models.

• The combination of federated learning and

MLaaS in a micro-service architecture for soil

classiﬁcation remains unexplored.

The proposed architecture in this study combines fed-

erated learning and deployment via MLaaS in a mod-

ular and scalable framework for soil classiﬁcation.

Using a micro-service architecture, this approach en-

sures the conﬁdentiality of distributed agricultural

data while facilitating model deployment. To our

knowledge, no other work has proposed such a com-

bination for soil classiﬁcation, which marks the main

contribution of this study.

3 CURRENT LIMITATIONS AND

CHALLENGES

While FL has many benefits, persistent challenges limit its widespread adoption in agriculture. One of the main challenges is the latency of communications between local devices and the central server, especially in rural areas where connectivity is limited. In addition, exchanging federated model updates can consume a large amount of bandwidth, which is not always practical in an agricultural environment (Bacciu et al., 2017). Finally, data heterogeneity is another important barrier. Agricultural IoT devices vary in data quality, sampling frequency, and data formats, making it difficult to train robust models (Assem et al., 2016) (Sengupta et al., 2020). In summary, while FL and MLaaS offer new opportunities for smart agriculture, their large-scale adoption requires technological improvements, including better connectivity and standardization of IoT devices. These advances could potentially revolutionize the way agricultural decisions are made, making agriculture more sustainable and productive.

4 METHODOLOGY

This section outlines the methodology used to integrate KNN (k-Nearest Neighbors) into a federated learning framework that meets soil classification requirements and fits a deployment architecture structured around Machine Learning as a Service (MLaaS). By using federated learning, this approach enables the training of robust machine learning models while preserving the confidentiality of agricultural data collected from multiple local sources. By integrating KNN into the model development service offered within MLaaS, we aim to:


• Optimize Soil Classiﬁcation. Use KNN to pro-

vide a simple and effective local data-based classi-

ﬁcation solution, ensuring accurate and rapid pre-

dictions.

• Preserve Data Confidentiality. Through federated learning, KNN models are trained locally without transferring sensitive data, thus protecting agricultural information.

• Facilitate Model Access via Microservices. De-

ploying KNN in a microservices architecture al-

lows real-time access to models via APIs, while

ensuring ﬂexibility and scalability for different

users.

• Improve Model Efﬁciency. By combining feder-

ated learning and MLaaS, this approach reduces

the costs of centralized data storage and manage-

ment while maximizing performance through lo-

cal model training.

• Accelerate Innovation and Decision-Making. Offering an MLaaS platform simplifies access to machine learning, removes constraints related to model deployment and maintenance, and encourages innovation in agricultural practices.

4.1 Data Collection and Preparation

Soil data, collected using sensors installed on various farms, includes parameters such as chemical composition, texture, and moisture. Each local sensor collects this data over a specified period, and the data is then broken down into learning intervals. The data set is standardized to mitigate extreme variations that could affect the convergence of the overall model (Žalik and Žalik, 2023) (Bacciu et al., 2017). Due to the diversity of farms, data sets vary in size and quality. Data pre-processing techniques, such as smoothing data and imputing missing values, have been applied to ensure consistency of training data across the federated devices. These steps ensure that locally trained models remain comparable before the aggregation phase.
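The imputation and standardization steps described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it assumes missing readings are encoded as `None`, and it picks mean imputation and z-score standardization as concrete techniques. The function name `impute_and_standardize` and the sample readings are hypothetical.

```python
from statistics import mean, pstdev

def impute_and_standardize(rows):
    """Mean-impute missing values (None) per column, then z-score each column."""
    n_cols = len(rows[0])
    columns = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_mean = mean(observed)
        # Replace missing readings with the mean of the observed ones.
        filled = [r[j] if r[j] is not None else col_mean for r in rows]
        mu = mean(filled)
        sigma = pstdev(filled)
        if sigma == 0:
            # A constant column carries no information; map it to zeros.
            columns.append([0.0 for _ in filled])
        else:
            columns.append([(v - mu) / sigma for v in filled])
    # Transpose the processed columns back into rows.
    return [list(row) for row in zip(*columns)]

# Hypothetical sensor readings on one node: (pH, moisture %, clay %)
readings = [
    (6.5, 20.0, 30.0),
    (7.0, None, 25.0),
    (5.5, 30.0, None),
    (6.0, 25.0, 35.0),
]
clean = impute_and_standardize(readings)
```

Each node could run this step on its own data before training. Whether standardization statistics are computed per node or agreed across nodes is a design choice the text does not specify.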

Soil texture, which can be determined in the ﬁeld

or the laboratory, is essential for classifying soils ac-

cording to their physical texture. It can be assessed by

qualitative methods such as texture-to-touch or more

precise techniques such as the hydrometer method.

This characteristic plays a key role in agriculture by

allowing the assessment of crop suitability and pre-

dicting the response of the soil to environmental and

management conditions, such as drought or calcium

requirements. Texture is determined from particles less than two millimetres in size, namely sand, silt and clay.

4.2 Federated Training

The soil classiﬁcation model relies on the k-nearest

neighbours (KNN) algorithm to identify soil types

from numerical features, using a tailored federated

learning approach. Each local node trains an instance

of the KNN model on its data, building and storing

local neighbour sets and associated distances. Af-

ter several training rounds, the nodes transmit their

neighbour sets and distances to the central server.

This server aggregates this information to create a

global KNN model, combining the neighbour sets and

adjusting the average distances. The process is repeated until the global model achieves satisfactory accuracy. This approach preserves the confidentiality of the data, which remains on the local nodes without being transferred to the central server, in line with the data confidentiality principles of federated learning (Bonawitz et al., 2019). Regular evaluation of the model performance and necessary adjustments ensure efficient soil classification while respecting confidentiality and security constraints (García et al., 2020).
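One way to picture the exchange of neighbour sets and distances is the sketch below. It is an illustration under stated assumptions, not the authors' implementation: it assumes that, for a given query, each node returns only its k nearest (distance, label) pairs, and that the server merges them, keeps the k globally nearest, and takes a majority vote. The function names and the sample data (sand %, silt %, clay %) are hypothetical.

```python
import math
from collections import Counter

def local_neighbours(local_data, query, k):
    """Runs on a node: return the k nearest (distance, label) pairs.
    Raw feature vectors never leave the node."""
    scored = [(math.dist(features, query), label) for features, label in local_data]
    return sorted(scored)[:k]

def aggregate_and_classify(neighbour_sets, k):
    """Runs on the server: merge per-node neighbour sets, keep the k
    globally nearest pairs, and return the majority label."""
    merged = sorted(pair for node_set in neighbour_sets for pair in node_set)
    votes = Counter(label for _, label in merged[:k])
    return votes.most_common(1)[0][0]

def federated_predict(nodes, query, k=3):
    """Simulate one query across all nodes."""
    neighbour_sets = [local_neighbours(data, query, k) for data in nodes]
    return aggregate_and_classify(neighbour_sets, k)

# Hypothetical local datasets: ((sand %, silt %, clay %), soil type)
node_a = [((80, 10, 10), "sandy"), ((75, 15, 10), "sandy"), ((20, 20, 60), "clay")]
node_b = [((25, 15, 60), "clay"), ((30, 20, 50), "clay"), ((40, 40, 20), "loam")]

prediction = federated_predict([node_a, node_b], (78, 12, 10), k=3)
```

Because each node contributes its own k nearest pairs, the merged top k matches what a centralized KNN over the pooled data would select, while only distances and labels are transmitted.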

4.3 Micro-Services Architecture for

MLaaS

Our architecture is designed using a micro-services approach, providing modular and scalable management of machine learning services. Microservices allow a complex application to be decomposed into a set of independent services, each performing a specific business process (Žalik and Žalik, 2023). This modularity facilitates the integration, deployment and maintenance of AI models while meeting the specific needs of each farm.

Figure 1 presents the following three components

of the architecture:

• Data Collection Service. This microservice is re-

sponsible for collecting data from various sources

such as IoT sensors or images captured by drones.

The data is stored in a distributed manner across

different local nodes to ensure low latency and

better management of local resources. In smart

farming, this data may include parameters such as

soil temperature, water content, or images of the

terrain.

• Model Management Service. This microservice is responsible for managing the training of federated learning models on local nodes. It coordinates the training phases, synchronization of model weights, and global updates. The KNN (k-nearest neighbours) algorithm, rather than a CNN, is used for classification or recommendation tasks. Each local farm trains a KNN model based on its specific data and shares the relevant parameters with the cloud. The central server aggregates the parameters to improve the overall performance of the model while preserving the privacy of local data.

• Inference Service. This service provides REST APIs to farmers, allowing them to use the trained models to get recommendations on crop types or resource management (e.g. irrigation). User input (such as field characteristics) is passed to the service, which returns a prediction based on the global KNN model.

Figure 1: KNN Based Federated Learning Architecture.
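The request handling behind such an Inference Service might look like the sketch below. The paper does not give the API schema, so the input fields (`sand`, `silt`, `clay`), the response key `soil_type`, the status codes, and the stand-in classifier are all assumptions made for illustration. The handler is written as a plain function so that it could be mounted behind any HTTP framework.

```python
import json

REQUIRED_FIELDS = ("sand", "silt", "clay")  # hypothetical input schema

def handle_predict(body, classify):
    """Validate a JSON request body and return (HTTP status, JSON response body)."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "body must be valid JSON"})
    if not isinstance(payload, dict):
        return 400, json.dumps({"error": "body must be a JSON object"})
    missing = [f for f in REQUIRED_FIELDS if f not in payload]
    if missing:
        return 422, json.dumps({"error": "missing fields", "fields": missing})
    try:
        features = tuple(float(payload[f]) for f in REQUIRED_FIELDS)
    except (TypeError, ValueError):
        return 422, json.dumps({"error": "fields must be numeric"})
    # Delegate to whatever model the service wraps (the global KNN model in the paper).
    return 200, json.dumps({"soil_type": classify(features)})

def stub_classifier(features):
    """Placeholder standing in for the global model; not a real soil classifier."""
    sand, _silt, _clay = features
    return "sandy" if sand > 50 else "clay"

status, response = handle_predict('{"sand": 80, "silt": 10, "clay": 10}', stub_classifier)
```

Keeping validation separate from the model call lets the service reject malformed input before any prediction is attempted.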

4.4 Deployment via MLaaS

The global model obtained is then deployed via a micro-services architecture as a machine learning service. Each service corresponds to a specific function of the ML pipeline, including data management, training, and inference. Services are encapsulated in Docker containers and orchestrated using Kubernetes to ensure the scalability and resilience of the system (Bacciu et al., 2017).

Users, such as farmers or farm managers, can interact with the system via RESTful APIs to get predictions on specific soil types or agricultural recommendations. This deployment also allows for continuous updates of the global model, as newly collected data sets can be integrated into the federated training process without interrupting service operations.

Figure 2: Federated learning architecture: cloud model.

4.5 Evaluation and Validation

The model was evaluated on an independent test

dataset, representing a diversity of agricultural condi-

tions. Performance metrics include precision, recall,

and F1 score to assess the model’s ability to classify

different soil types correctly. Additionally, the im-

pact of federated learning on privacy preservation was

evaluated by comparing the performance with a cen-

tralized model (Assem et al., 2016) (Sengupta et al.,

2020).
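For reference, the per-class metrics named above can be computed as in the following sketch. It uses the standard definitions of precision, recall and F1. The label lists are hypothetical and are not the study's data, and the function name `classification_report` is chosen here for illustration.

```python
from collections import Counter

def classification_report(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} from paired label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1, actual)
    return report

# Hypothetical labels for illustration only
y_true = ["clay", "clay", "loam", "sandy", "sandy", "loam"]
y_pred = ["clay", "loam", "loam", "sandy", "sandy", "loam"]
report = classification_report(y_true, y_pred)
```

Here one "clay" sample is misclassified as "loam", which lowers recall for "clay" and precision for "loam" while "sandy" stays perfect.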

5 RESULTS AND DISCUSSION

5.1 Soil Classiﬁcation Results

The soil classification model, trained with federated learning, showed promising results in accuracy and robustness under different agricultural conditions. On data collected from local farms, the model achieved an accuracy of 98%, with an overall accuracy of 1.0 (100%) across the entire dataset. A REST API can be used to view the report of the AI model based on the KNN algorithm. The classification report breaks the results down by class according to several metrics: precision, recall, F1-score, and support, showing excellent results (provided the validation dataset is sufficiently representative). This accuracy is slightly higher than that of traditional centralized models, especially for soils with complex geophysical characteristics (Žalik and Žalik, 2023) (Bacciu et al., 2017).

The KNN algorithm performed well in this environment, allowing for rapid convergence of local model weights while preserving the integrity of local data. Comparing the performance of the federated model with that of a conventional centralized model, we observed that the federated algorithm provided comparable or even superior results without requiring the transfer of sensitive data between devices. These results are consistent with recent studies showing that federated algorithms are effective in heterogeneous and dispersed data environments (Assem et al., 2016) (Bonawitz et al., 2019).

Figure 3: KNN classiﬁcation report.

Figure 3 shows that evaluation on the validation set yields an accuracy of 100%, indicating that the KNN model correctly classified all 48 records. The confusion matrix and classification report confirm that the model performed excellently, with perfect accuracy in every class.

5.2 MLaaS System Performance and

Optimization

The microservices architecture used to deploy the model as a machine learning service has allowed for increased flexibility and scalability. The MLaaS system has shown resilience to individual component failures, ensuring that users can still access the service despite minor interruptions. In addition, using Docker containers orchestrated by Kubernetes has made it easier to automate deployment and resource management.

One of the main advantages of MLaaS is that it allows end users (farmers and managers) to access precise recommendations without the need for sophisticated equipment or advanced technical skills (García et al., 2020) (Bacciu et al., 2017).

This signiﬁcantly reduces the costs associated

with adopting AI technologies on farms. The CI/CD

pipeline conﬁguration (continuous integration / con-

tinuous delivery) allowed for a seamless update of the

machine learning model, ensuring that new data col-

lected could be used to reﬁne and improve predictions

continuously.

Figure 4: Deployment Architecture for MLaaS in Federated

Learning.

This diagram shows the main steps of the DevOps

pipeline for deploying and running an MLaaS service.

- Visual Studio Code (VS Code): Lightweight and versatile code editor, used to write, debug and manage development projects.

- Git: Version control system used to manage source code, Dockerfiles and necessary configurations, and to track collaborative changes.

- Jenkins: Automates the CI/CD pipeline, handling

repository cloning, building Docker images, testing,

and deployment.

- Docker Build: Jenkins uses the Dockerﬁles to build

the Docker images for the model and services.

- Docker Image: Docker images are created for the

ML model and associated services.

- Docker Container: Docker images are deployed as

containers on the server.

- Cloud: Online hosted infrastructure to deploy and

run Docker containers to run MLaaS in production.

5.3 Comparison with Existing

Approaches

One of the main contributions of this study is the application of federated learning to smart agriculture, an area that remains largely unexplored. Unlike centralized training approaches, our approach ensures that local data remains protected while allowing for global analysis across multiple farm devices and regions (Assem et al., 2016). Compared to previous work using MLaaS models for the Internet of Things (IoT), our approach is distinguished by better data privacy management and improved performance in geographically dispersed environments (Sengupta et al., 2020) (Kairouz et al., 2019).

The results also show that federated learning can


outperform traditional data management methods in

environments with limited network connections, as is

often the case in rural areas. This opens the way for

wider adoption of this technology, not only in agri-

culture but also in other sectors where data privacy

and limited access to infrastructure are major con-

cerns (McMahan et al., 2017) (Li et al., 2020).

5.4 Limitations of the Approach and

Prospects for Improvement

Although the results of this study are promising, some

limitations remain. One of the main limitations is the

latency of communications between local devices and

the central server, which can affect the speed of fed-

erated model training. This is particularly problem-

atic in agricultural areas that are poorly connected. In

addition, the heterogeneity of IoT devices and data

formats can lead to inconsistencies in local model up-

dates (Smith et al., 2017).

In the future, more research is needed to improve the model's resilience to these challenges. One possible direction would be to explore data compression before transmission, as well as advanced optimization techniques to reduce the bandwidth required for exchanges between local devices and the central server. In addition, the integration of reinforcement learning algorithms could allow for a dynamic adaptation of the model to specific local conditions, thus further improving the accuracy of predictions (Park and Sim, 2020) (Konečný et al., 2016).
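As an illustration of what compression before transmission could look like, the sketch below applies uniform 8-bit quantization to a list of floating-point values. This technique was not evaluated in the study and is only one of many options. The function names and the sample distances are hypothetical.

```python
def quantize(values):
    """Uniformly quantize floats to one byte each; returns (codes, offset, step)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / 255 if hi > lo else 1.0
    # Clamp to 255 to guard against floating-point overshoot at the maximum.
    codes = bytes(min(255, round((v - lo) / step)) for v in values)
    return codes, lo, step

def dequantize(codes, lo, step):
    """Approximate reconstruction on the receiving side."""
    return [lo + c * step for c in codes]

# Hypothetical values a node might send to the server
distances = [0.0, 0.12, 0.5, 0.87, 1.0]
codes, lo, step = quantize(distances)
restored = dequantize(codes, lo, step)
```

Each value is sent as one byte instead of an 8-byte float, at the cost of a reconstruction error of at most half a quantization step.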

6 CONCLUSIONS AND FUTURE

WORK

Federated learning is an innovative and promising so-

lution for managing agricultural data in the area of

smart agriculture. This approach offers an effective

alternative to centralized methods, allowing machine

learning models to be trained without the need for

massive data transfer, thus ensuring the conﬁdential-

ity and security of local information. In this study,

we demonstrated that FL applied to soil classiﬁcation

provides high-precision results while adapting to the

geographic and technological constraints of agricul-

tural operations.

In addition, the integration of our solution within a

microservices architecture via an MLaaS service has

proven its effectiveness in terms of scalability and

ﬂexibility. This approach allows for seamless interac-

tion between different users of the system while pro-

viding continuous update capability and rapid deploy-

ment. Applying this technology to smart farming can

revolutionize agricultural practices, making it easier

to make decisions and optimizing crop management

through personalized recommendations based on soil

data.

However, some limitations such as communication latency and the heterogeneity of IoT devices require further research. Improvements can be made, including through the optimization of federated training processes and network resource management in environments with limited connectivity. In the future, the integration of reinforcement learning techniques and data compression methods could further enhance the effectiveness of this approach.

In conclusion, federated learning combined with a

distributed service architecture is a breakthrough for

smart agriculture. It offers not only high performance

but also better data protection, a crucial factor in the

development of more sustainable and innovative agri-

culture.

REFERENCES

Assem, H., Xu, L., Buda, T. S., and O’Sullivan, D. (2016).

Machine learning as a service for enabling internet of

things and people. Personal and Ubiquitous Comput-

ing.

Bacciu, D., Chessa, S., Gallicchio, C., and Micheli, A.

(2017). On the need of machine learning as a service

for the internet of things. In Proceedings of the 1st In-

ternational Conference on Internet of Things and Ma-

chine Learning.

Bonawitz, K., Eichner, H., Grieskamp, W., Huba, D., Ingerman, A., Ivanov, V., Kiddon, C., Konečný, J., Mazzocchi, S., McMahan, H., Overveldt, T. V., Petrou, D., Ramage, D., and Roselander, J. (2019). Towards federated learning at scale: System design. In Proceedings of the 2nd SysML Conference.

García, L., Parra, L., Jiménez, J., Lloret, J., and Lorenz, P. (2020). IoT-based smart irrigation systems: An overview on the recent trends on sensors and IoT systems for irrigation in precision agriculture. Sensors, 20(4):1042.

Kairouz, P., McMahan, H., et al. (2019). Advances and

open problems in federated learning. Foundations and

Trends® in Machine Learning, 11(3-4):185–383.

Konečný, J., McMahan, H., Yu, F., et al. (2016). Federated learning: Strategies for improving communication efficiency. In Proceedings of the 19th International Conference on Neural Information Processing Systems.

Li, T., Sahu, A., Talwalkar, A., and Smith, V. (2020). Fed-

erated learning: Challenges, methods, and future di-

rections. volume 37, pages 50–60.

McMahan, H., Moore, E., Ramage, D., and Arcas, B.

(2017). Communication-efﬁcient learning of deep net-

works from decentralized data. In Proceedings of


the 20th International Conference on Artiﬁcial Intelli-

gence and Statistics.

Park, J. and Sim, K. (2020). Blockchain-based dynamic

federated learning for secure smart agriculture. IEEE

Access, 8:139109–139118.

Sengupta, S., Basak, P., and Saikat, S. (2020). Distributed machine learning in IoT: The future of distributed edge computing. IEEE Transactions on Industrial Informatics.

Smith, V., Chiang, C., Sanjabi, M., and Talwalkar, A.

(2017). Federated multi-task learning. In Proceed-

ings of the 31st International Conference on Neural

Information Processing Systems.

Žalik, K. R. and Žalik, M. (2023). A review of federated learning in agriculture. Sensors, 23(23):9566.
