solve the problem of insufficient samples at each participant, following the main principle of collecting the parameters of each local participant at the central server. The central server aggregates all the participants' parameters and then broadcasts them back to each participant. Each participant repeatedly interacts with the central server until the final model loss reaches a threshold or the training reaches the specified number of iterations (a minimal sketch of this interaction loop is given after this list).
(2) Vertical Federated Learning (VFL). When two data sets share many of the same users but differ in their features, the data sets are split along the vertical (feature) dimension, and the portion of data belonging to the same users but with different features is taken out for training. This method is also called longitudinal federated learning. Vertical federated learning mainly addresses the problem of insufficient data and feature dimensions.
(3) Federated Transfer Learning (FTL). When both the users and the user features of the two data sets overlap little, the data are not split; instead, transfer learning is used to overcome the shortage of data or labels. We refer to this technique as federated transfer learning.
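Returning to the horizontal setting in (1), the sketch below makes the basic interaction loop concrete using a toy linear model and synthetic participant data; the model, data, learning rate, and loss threshold are illustrative assumptions only, not part of any cited method.

import numpy as np

# Toy illustration of the horizontal federated learning loop described in (1).
rng = np.random.default_rng(0)
participants = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(4)]
w = np.zeros(3)  # global parameters held by the central server

for rnd in range(100):  # the specified number of iterations
    local_params = []
    for X, y in participants:  # every participant trains on its own local data
        local_params.append(w - 0.1 * X.T @ (X @ w - y) / len(y))
    w = np.mean(local_params, axis=0)  # the server aggregates the participants' parameters
    # the aggregated parameters are broadcast back at the start of the next round
    loss = np.mean([np.mean((X @ w - y) ** 2) for X, y in participants])
    if loss <= 1.0:  # stop once the final model loss reaches a threshold
        break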
Focusing on the above three aspects, this paper reports the latest research advances in the federated learning field. In detail, representative methods are introduced in Section 2, including their design ideas, basic frameworks, key steps, advantages, and disadvantages. The performance of various federated learning methods is compared in Section 3. Section 4 summarizes the existing challenges and future development directions, which is expected to bring new insights to the federated learning field.
2 METHOD
2.1 Horizontal Federated Learning
Horizontal federated learning mainly focuses on how
to train models in parallel across multiple
participants, while protecting data privacy for each
participant. A representative horizontal federated learning algorithm is federated averaging (FedAvg), proposed by McMahan et al. at Google in 2016 (McMahan, 2016). During the joint training stage, the cloud center server randomly selects a fixed proportion of clients to participate in each round of training. Each participating client then trains locally for several iterations before uploading its gradient or parameter updates to the cloud center server for aggregation, which effectively reduces the number of communication rounds required by the traditional training method. Compared with the previous algorithm, federated averaging can reduce the number of communication rounds by a factor of 10 to 100 and speed up the convergence of the model.
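A minimal FedAvg-style sketch follows, using a toy linear model and synthetic client data. The client fraction C, number of local epochs E, batch size B, and learning rate are illustrative assumptions, not the settings of the original paper.

import numpy as np

rng = np.random.default_rng(0)
# Synthetic clients with different data sizes; each holds its own (X, y).
clients = [(rng.normal(size=(n, 3)), rng.normal(size=n)) for n in (20, 50, 80, 120)]
w = np.zeros(3)                      # global model kept on the cloud center server
C, E, B, lr = 0.5, 5, 16, 0.05       # client fraction, local epochs, batch size, learning rate

for rnd in range(50):                # communication rounds
    m = max(1, int(C * len(clients)))
    selected = rng.choice(len(clients), size=m, replace=False)  # fixed proportion of clients
    updates, sizes = [], []
    for k in selected:
        X, y = clients[k]
        w_k = w.copy()
        for _ in range(E):           # several local iterations before uploading
            idx = rng.permutation(len(y))
            for s in range(0, len(y), B):
                b = idx[s:s + B]
                w_k -= lr * X[b].T @ (X[b] @ w_k - y[b]) / len(b)
        updates.append(w_k)
        sizes.append(len(y))
    # The server aggregates the uploaded parameters, weighting each client by its
    # data size, and broadcasts the result at the start of the next round.
    w = np.average(updates, axis=0, weights=np.array(sizes) / sum(sizes))

Running E local epochs before each upload is what lets FedAvg trade extra local computation for far fewer communication rounds.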
Based on the federated averaging algorithm, Li et al. (Li, 2018) proposed the FedProx algorithm in 2018, which is designed to address the Non-IID (non-independent and identically distributed) data problem in federated learning. FedProx dynamically adjusts the amount of local work performed by different clients to improve communication efficiency, making the algorithm better suited to Non-IID joint modeling scenarios.
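A sketch of a FedProx-style local update is given below. FedProx has each client approximately minimize its own loss plus a proximal term (mu/2)||w - w_global||^2 that keeps the local model close to the current global model, so different clients can safely perform different amounts of local work. The quadratic toy loss, function names, and hyperparameter values are illustrative assumptions rather than the reference implementation.

import numpy as np

def local_grad(w, X, y):
    # Gradient of a toy quadratic (least-squares) local loss; purely illustrative.
    return X.T @ (X @ w - y) / len(y)

def fedprox_local_update(w_global, X, y, mu=0.1, lr=0.05, steps=20):
    # Approximately solve  min_w  F_k(w) + (mu / 2) * ||w - w_global||^2.
    # The proximal term penalizes drifting away from the current global model.
    w = w_global.copy()
    for _ in range(steps):
        g = local_grad(w, X, y) + mu * (w - w_global)
        w = w - lr * g
    return w

Allowing steps to differ from client to client corresponds to the uneven local work that FedProx is designed to tolerate on Non-IID data.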
The MOCHA algorithm (Smith, 2017) is another representative horizontal federated learning approach, built on a multi-task learning strategy. It enables personalized modeling by using a multi-task learning framework to learn independent but related models for each client, improving the ability to handle heterogeneous networks. Federated SGD (Liu, 2020)
is a simple baseline algorithm for stochastic gradient
descent (SGD) training in a federated learning
system. During each training round, participants
perform SGD updates using local data and send the
updated model parameters to the server for
aggregation. These algorithms have their own
characteristics and are suitable for different
application scenarios and data features.
FedDyn (Durmus, 2021) belongs to the category
of horizontal federated learning. Participants can share model parameters or gradient information for joint model training while keeping the data stored locally and preserving its privacy. FedDyn aims to solve the problem of dynamic participation in federated learning, where the number of participants and the data distribution change constantly because participants can join or leave the federated network at any time. Traditional federated learning frameworks usually assume that participants are static, that is, the number and identity of participants are fixed during training. However, in practical applications, participants may dynamically join or leave the federated network for various reasons (such as device failure, network disconnection, or user exit).
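The sketch below only illustrates this dynamic setting: in each round the server aggregates over whichever clients happen to be reachable, weighted by their data sizes. It is not the actual FedDyn update rule; the toy model, synthetic data, and availability probability are assumptions made purely for illustration.

import numpy as np

rng = np.random.default_rng(1)
clients = [(rng.normal(size=(40, 3)), rng.normal(size=40)) for _ in range(5)]
w = np.zeros(3)

for rnd in range(30):
    # Clients join or leave between rounds (device failure, disconnection, user exit, ...).
    available = [k for k in range(len(clients)) if rng.random() > 0.3]
    if not available:
        continue                      # nobody reachable this round; keep the current model
    updates, sizes = [], []
    for k in available:
        X, y = clients[k]
        w_k = w - 0.1 * X.T @ (X @ w - y) / len(y)   # local training (a single step here)
        updates.append(w_k)
        sizes.append(len(y))
    # Aggregate only over the clients that actually participated in this round.
    w = np.average(updates, axis=0, weights=np.array(sizes) / sum(sizes))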
The FedDyn algorithm mainly includes initialization, local training, model aggregation, dynamic adjustment, and iterative optimization. First, the
server initializes the global model parameters and
distributes these parameters to clients participating in
federated learning. Each client then trains the model
using its data. Different learning rates or regularization strategies may be applied in this process to accommodate Non-IID data distributions. After local training, each client sends
the model updates back to the server. The server is
responsible for aggregating these updates. To