UAVs allocation. When the user preference is accurately predicted, useful data can be cached. If a requested content is already in the UAV's cache memory, it can be sent directly to the user from the UAV, without forwarding the request to a remote BS. Thus, the user request delay is reduced.
Low-Density Parity-Check (LDPC) codes (Gallager, 1963; Guo, 2005) are powerful forward error correction schemes that have been widely investigated and have been adopted in the 5G standard. Adaptive coding and modulation (Hanzo et al., 2005) is another attractive transmission technology: a high-rate channel code and a high-order modulation scheme are employed to increase the transmission rate when the channel quality is good, whereas a low-rate channel code and a low-order modulation scheme are employed to improve the transmission reliability when the channel quality is poor. In this contribution, an Adaptive LDPC Coded Modulation (ALDPC-CM) scheme is investigated and utilized for the UAV-user communication link. The ALDPC-CM scheme can provide a near-capacity transmission rate for a given channel Signal-to-Noise Ratio (SNR). Hence, a communication link with a high SNR leads to a shorter transmission period (i.e., a lower transmission delay).
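To illustrate the adaptive principle, the following minimal sketch selects a code rate and modulation order from an SNR-indexed lookup table. The switching thresholds and the mode set are hypothetical assumptions for illustration, not the configuration used in this paper:

```python
# Hedged sketch of adaptive coded modulation mode selection.
# The SNR thresholds and (code rate, modulation) pairs below are
# illustrative assumptions, not the thresholds used in this paper.
MODES = [
    # (min SNR in dB, LDPC code rate, modulation, bits/symbol)
    (-1.0, 1/3, "QPSK",   2),
    ( 4.0, 1/2, "QPSK",   2),
    ( 9.0, 2/3, "16QAM",  4),
    (14.0, 3/4, "64QAM",  6),
    (19.0, 5/6, "256QAM", 8),
]

def select_mode(snr_db: float):
    """Return the highest-throughput (rate, modulation, bits/symbol)
    whose SNR threshold the channel still satisfies."""
    chosen = MODES[0]
    for mode in MODES:
        if snr_db >= mode[0]:
            chosen = mode
    return chosen[1], chosen[2], chosen[3]

# Good channel -> high-rate code and high-order modulation, hence more
# information bits per symbol and a shorter transmission period.
for snr in (0.0, 10.0, 20.0):
    rate, mod, bps = select_mode(snr)
    print(f"SNR {snr:5.1f} dB -> rate {rate:.2f}, {mod}, "
          f"{rate * bps:.2f} info bits/symbol")
```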
The rest of this paper is organized as follows. The caching model is investigated in Section 2, while our system model is detailed in Section 3. Our simulation results are discussed in Section 4, while our conclusions are summarized in Section 5.
2 CACHING MODEL
In this section, the Latent Dirichlet Allocation algorithm is outlined and the user preference model is presented.
2.1 Latent Dirichlet Allocation
Latent Dirichlet Allocation (LDA) is a generative probabilistic model (Blei et al., 2003) that characterizes the document generation process with a graphical model. The grey parameters shown in Fig. 1 are latent variables that cannot be observed, while the remaining variables are observed from the input data. The LDA algorithm iteratively simulates the generation process and estimates the latent variables: the generated data is evaluated using a cost function, and the latent variables are iteratively optimized based on this cost. Once the model has converged, it can be used to predict the content of future user requests.
The LDA model is able to reveal the topic composition of a document and the word probabilities associated with each topic. It can classify a new document into different classes and predict new words. This is useful for our system, since users who share the same interests may gather around the same geographical location under specific application scenarios. LDA is a Bag-of-Words (BoW) model that neglects the order of words in a document; in our case, the order of user requests is likewise unimportant. Hence, the LDA algorithm can be employed to perform user request prediction. More explicitly, LDA is composed of document generation and parameter estimation, and its graphical model is shown in Fig. 1. The meanings of the symbols used in Fig. 1 are presented in Table 1. In our context, a 'document' represents the 'content requested by a user'.
We assume that there are a total of |T| = N_T topics, e.g. cars, movies, weather, news, etc., which are controlled by the Dirichlet distribution Dir(α). For each user, the topics of interest can be regarded as independent and identically distributed (i.i.d.) random variables sampled from Dir(α). For example, Bob (Bob ∈ C) is interested in news 40% of the time, in cars 10%, in movies 50%, and in the remaining topics 0%. Then, we can compute the User-Topics distribution of Bob, θ^(d). On the other hand, we have the Topic-Contents distribution φ^(t) of each topic t ∈ T. These distributions are controlled by another Dirichlet distribution, Dir(β). Thus, we can sample from θ^(d) and φ^(t) to generate a content w_i^(d) for Bob. We continue to do this until all N_d contents have been generated for each d ∈ C. Likewise, this model can generate contents for each user. In general, the probability that a word blank w is filled by a term t is given by:
p(w = t) = ∑_k φ^(k)(w = t | z = k) θ^(d)(z = k),    (1)

where ∑_k θ^(d)(z = k) = 1.
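To make the generative process concrete, the following minimal sketch (our illustration; the topic count, dictionary size and Dirichlet hyperparameters below are arbitrary assumptions) samples θ^(d) from Dir(α) and φ^(t) from Dir(β), generates N_d contents for one user, and computes the term marginals of Eq. (1):

```python
import numpy as np

rng = np.random.default_rng(0)

N_T = 4        # number of topics |T| (assumed small for illustration)
n_w = 10       # dictionary size (assumed)
N_d = 5        # number of contents to generate for one user
alpha = 0.5    # symmetric Dirichlet hyperparameter for User-Topics
beta = 0.1     # symmetric Dirichlet hyperparameter for Topic-Contents

# User-Topics distribution theta^(d) ~ Dir(alpha): user d's topic interests.
theta_d = rng.dirichlet(alpha * np.ones(N_T))

# Topic-Contents distributions phi^(t) ~ Dir(beta), one row per topic.
phi = rng.dirichlet(beta * np.ones(n_w), size=N_T)

# Generate N_d contents: pick a topic z ~ theta^(d), then a term w ~ phi^(z).
contents = []
for _ in range(N_d):
    z = rng.choice(N_T, p=theta_d)
    w = rng.choice(n_w, p=phi[z])
    contents.append(w)

# Term marginals of Eq. (1): p(w = t) = sum_k phi^(k)(t) theta^(d)(k).
p_w = theta_d @ phi
assert np.isclose(p_w.sum(), 1.0)
print("requested contents:", contents)
print("p(w = t) per term:", np.round(p_w, 3))
```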
2.2 User Preference Model
To simulate the user request behaviour and to generate the related distributions, we performed LDA clustering over a text dataset called the 20-Newsgroups dataset, which was first introduced in (Lang, 1995). It is a popular dataset for experiments in text applications, composed of 20 groups of news. Fig. 2(a) shows the main topics of the dataset. We need to preprocess the dataset to filter out the stop words before the LDA algorithm is employed. Then, we extract n_w = 1000 significant words from the dataset to compose a dictionary; a minimal sketch of this pipeline is given below.
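The following is a minimal sketch of the described pipeline using scikit-learn. It is our illustration only: the paper does not specify its exact tokenization or hyperparameters, so the built-in dataset loader, the English stop-word list and the choice of N_T = 20 topics are assumptions:

```python
# Hedged sketch: load 20-Newsgroups, filter stop words, keep the
# n_w = 1000 most frequent words as the dictionary, then fit LDA.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = fetch_20newsgroups(subset="train",
                          remove=("headers", "footers", "quotes")).data

# Stop-word filtering and a 1000-word dictionary (n_w = 1000).
vectorizer = CountVectorizer(stop_words="english", max_features=1000)
X = vectorizer.fit_transform(docs)

# Fit LDA; N_T = 20 topics is assumed here to match the 20 news groups.
lda = LatentDirichletAllocation(n_components=20, random_state=0)
theta = lda.fit_transform(X)   # per-document topic mixtures, theta^(d)
phi = lda.components_          # unnormalized topic-word weights, ~phi^(t)

# Top words per topic give an interpretable Topic-Contents view.
terms = vectorizer.get_feature_names_out()
for t, row in enumerate(phi[:3]):
    top = [terms[i] for i in row.argsort()[-5:][::-1]]
    print(f"topic {t}: {top}")
```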
Fig. 2(b) shows the preprocessing of the dataset. Hence, once we