
s′. For the sake of clarity, we abstract this operation using a function F : {set of all possible cell states} → {set of all possible cell states}, with s′ = F(s, x).
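As a minimal sketch of this abstraction (the tanh recurrence, weight names, and dimensions below are hypothetical stand-ins, not the cells used in this paper), any recurrent cell can be wrapped as a black-box update F:

import numpy as np

def make_state_update(W_s, W_x, b):
    # Returns a black-box update F with s' = F(s, x).
    # The tanh recurrence is purely illustrative; F could wrap any
    # recurrent cell (LSTM, GRU, ...) with the same signature.
    def F(s, x):
        return np.tanh(W_s @ s + W_x @ x + b)
    return F

rng = np.random.default_rng(0)
F = make_state_update(rng.normal(size=(4, 4)),   # hypothetical 4-dim state
                      rng.normal(size=(4, 3)),   # hypothetical 3-dim input
                      np.zeros(4))
s = np.zeros(4)
x = rng.normal(size=3)
s_next = F(s, x)   # s' = F(s, x)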
RNNs are prone to overfitting and hence not robust to errors in data collection. Further, they perform poorly when the dataset is highly varied and complex. Our model overcomes these issues by using the above-described state information s in a probabilistic setting to predict the next-step data. In other words, tDLGM uses relevant latent information from the past to predict the next data point. In particular, it predicts the parameters of the probability distribution of the next data point. Other stochastic models exist for time-series data prediction, e.g., the Time-series Generative Adversarial Network (Time-GAN) and the Variational Recurrent Neural Network (VRNN) (Yoon et al., 2019; Chung et al., 2015).
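A minimal sketch of this idea, assuming a Gaussian output and linear parameter heads purely for illustration (neither is claimed to be tDLGM's actual architecture):

import numpy as np

def predict_next_distribution(s, W_mu, W_sigma):
    # Map a state s to the parameters (mu, sigma) of a Gaussian over the
    # next data point; the linear heads and the Gaussian family are assumptions.
    mu = W_mu @ s
    sigma = np.exp(W_sigma @ s)   # exp keeps the scale positive
    return mu, sigma

rng = np.random.default_rng(1)
s = rng.normal(size=4)            # state summarising past information
mu, sigma = predict_next_distribution(s,
                                      rng.normal(size=(2, 4)),
                                      0.1 * rng.normal(size=(2, 4)))
v_next = rng.normal(mu, sigma)    # sample the predicted next data point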
1.1 Literature Survey
Previous nondeterministic models have been devel-
oped to address the issue of complex time-series data.
Two notable examples are Time-GAN and VRNN
(Yoon et al., 2019; Chung et al., 2015). Both mod-
els share a common characteristic, which is the idea
of modeling a latent variable. We define ξ as a vec-
tor of latent variables and v ∈ V as values from a
time-series dataset. Both Time-GAN and VRNN base
their design on the Variational Auto-Encoder (VAE), which is used in situations where the prior p(ξ) of a latent variable is known but the posterior p(ξ|v) is not (Kingma, 2013). If the posterior were known, new data points could be accurately generated by sampling from the prior p(ξ). VAEs address this unknown relationship by approximating the posterior p(ξ|v) through a recognition model. The approximated posteriors are then used to train a generator model, which can then create new data points by sampling from the prior. VRNN and Time-GAN do this but with the additional constraint that their latent variables are conditioned on a state.
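A minimal sketch of this recognition-generator loop, with linear maps standing in for the trained networks (an illustrative assumption, not the architecture of the cited works):

import numpy as np

rng = np.random.default_rng(2)

# Illustrative linear recognition and generator models; real VAEs use deep
# networks trained with a variational objective.
W_enc_mu  = rng.normal(size=(2, 5))
W_enc_sig = 0.1 * rng.normal(size=(2, 5))
W_dec     = rng.normal(size=(5, 2))

def recognition(v):
    # Approximate the posterior p(xi | v) with a Gaussian q(xi | v).
    return W_enc_mu @ v, np.exp(W_enc_sig @ v)

def generator(xi):
    # Map a latent sample xi back to data space.
    return W_dec @ xi

# Training-time path: encode an observed v, sample xi, reconstruct.
v = rng.normal(size=5)
mu, sigma = recognition(v)
xi = mu + sigma * rng.standard_normal(2)      # reparameterised sample
v_recon = generator(xi)

# Generation-time path: sample directly from the prior p(xi) = N(0, I).
v_new = generator(rng.standard_normal(2))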
Time-GAN is based on the idea of a Generative
Adversarial Network (GAN) (Yoon et al., 2019; Kar-
ras et al., 2017). The GAN architecture is usually
constructed with one generator and one discrimina-
tor model. The generator is trained to create values,
while the discriminator is trained to discern true and
generated values. Time-GAN moves this to the latent
space, meaning that the discriminator discerns be-
tween true and generated latent variables, and the gen-
erator is tasked with fooling the discriminator. In parallel, the latent variables are used to train another model, which reconstructs v from ξ.
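A structural sketch of this latent-space adversarial setup, with random linear maps standing in for the trained embedding, generator, discriminator, and recovery networks (illustrative assumptions, not Time-GAN's actual components):

import numpy as np

rng = np.random.default_rng(3)

# Random linear maps stand in for the trained networks.
W_e, W_g = rng.normal(size=(2, 5)), rng.normal(size=(2, 2))
w_d, W_r = rng.normal(size=2), rng.normal(size=(5, 2))

embed    = lambda v:  W_e @ v           # data  -> latent (true xi)
generate = lambda z:  W_g @ z           # noise -> latent (generated xi)
score    = lambda xi: float(w_d @ xi)   # discriminator acts in latent space
recover  = lambda xi: W_r @ xi          # latent -> reconstructed data

v = rng.normal(size=5)
xi_true = embed(v)
xi_fake = generate(rng.standard_normal(2))

d_true, d_fake = score(xi_true), score(xi_fake)  # discriminator discerns latents
v_recon = recover(xi_true)                       # parallel reconstruction of v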
VRNN has a more straightforward usage of infer-
ence (Chung et al., 2015). It trains a set of neural networks, based on previous states, that approximate the distribution of a latent variable. Samples from this distribution are then used to generate values. Our model has properties
similar to VRNN. We will, therefore, discuss VRNN
in further detail in the next section.
1.2 Our Contributions and Place in Literature
As previously stated, VRNN is based on the idea of
VAE, which can be used when the prior of a latent
variable is known (p(ξ)), but the posterior (p(ξ|v))
is not (Kingma, 2013). VRNN solves this by train-
ing a function that extracts latent variables from pre-
vious states. VRNN does this through two samples
per time-step t. Specifically, given the previous state s_{t−1}, they define a latent variable as

ξ_t ∼ N(µ_{0,t}, σ²_{0,t}),                                    (1)

where [µ_{0,t}, σ_{0,t}] = p(s_{t−1}) and p is typically a neural network. This is then used to sample a value v,

v_t ∼ N(µ_{x,t}, σ²_{x,t}),                                    (2)

where [µ_{x,t}, σ_{x,t}] = p_x(p_z(ξ_t), s_{t−1}), and p_x and p_z are
both neural networks. This structure, with one sam-
ple for the latent variable and another to generate v, works well for time-series data. However, we believe
that two layers of sampling hinder the potential ro-
bustness of the generative model. More samples can
result in more intricate distributions. Therefore, we
want a generative model for time-series data in which
the layers of combined samples can be set as a param-
eter of the model.
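A minimal sketch of the two-layer sampling in Eqs. (1) and (2), with random linear maps standing in for the networks p, p_z, and p_x (illustrative assumptions, not the published VRNN implementation):

import numpy as np

rng = np.random.default_rng(4)

# Random linear maps stand in for the networks p, p_z, and p_x; the
# dimensions (4-dim state, 2-dim latent, 1-dim value) are arbitrary choices.
W_p  = rng.normal(size=(4, 4))        # s_{t-1} -> [mu_0, log sigma_0]
W_pz = rng.normal(size=(3, 2))        # xi_t -> feature p_z(xi_t)
W_px = rng.normal(size=(2, 7))        # [p_z(xi_t), s_{t-1}] -> [mu_x, log sigma_x]

def vrnn_step(s_prev):
    # Eq. (1): sample the latent variable conditioned on the previous state.
    out = W_p @ s_prev
    mu_0, sigma_0 = out[:2], np.exp(out[2:])
    xi_t = mu_0 + sigma_0 * rng.standard_normal(2)

    # Eq. (2): sample the value conditioned on xi_t and the previous state.
    out = W_px @ np.concatenate([W_pz @ xi_t, s_prev])
    mu_x, sigma_x = out[0], np.exp(out[1])
    v_t = mu_x + sigma_x * rng.standard_normal()
    return xi_t, v_t

xi_t, v_t = vrnn_step(rng.normal(size=4))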
Deep Latent Gaussian Model (DLGM) was de-
veloped by Rezende et al. in 2014 to solve the issue
of scalable inference in deep neural network models
(Rezende et al., 2014). It is trained through approxi-
mate inference in layers and, as such, combines mul-
tiple Gaussian samples. This means that the num-
ber of layered samples can vary depending on the
dataset’s needs, allowing for more complex distribu-
tions compared to VRNN, where there are two layers.
This allows DLGM to learn complex patterns, gen-
erate new values, and perform inference. However,
it cannot, despite these excellent properties, accom-
modate time-series data. We address this by com-
bining DLGM with the idea of latent variables con-
ditioned on states. The result is a novel recognition-
generator structure that utilizes two recognition mod-
els, one for state and one for latent variables. It differs
from VRNN through the use of two recognition mod-
els and the interleaving of state and latent variables.
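A minimal sketch of layered Gaussian sampling in the spirit of DLGM, where the number of layers is a parameter of the generator; the tanh transformations and additive noise are illustrative assumptions, not the exact DLGM generator:

import numpy as np

rng = np.random.default_rng(5)

def sample_layered_gaussians(n_layers, dim):
    # Compose n_layers Gaussian samples: start from a top-layer sample and,
    # at every layer, transform it and inject fresh Gaussian noise. The tanh
    # maps are illustrative stand-ins for learned per-layer transformations.
    h = rng.standard_normal(dim)
    for _ in range(n_layers):
        W = rng.normal(size=(dim, dim))
        h = np.tanh(W @ h) + rng.standard_normal(dim)
    return h

# The number of layered samples is a parameter of the model.
v_shallow = sample_layered_gaussians(n_layers=2, dim=3)
v_deep    = sample_layered_gaussians(n_layers=5, dim=3)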