2 RELATED WORK
A road bridge normally bends and deforms as a vehicle
passes over it. If we assume that all
of a vehicle’s properties could be obtained in advance,
the bridge structural response including normal strain,
shear strain and displacement would be predictable by
using a bridge model. Existing modeling techniques
can be classified according to two major approaches,
namely explicit modeling and implicit modeling. The
explicit approaches use FEA (Mohamed et al., 2017;
Mohamed and Tahar, 2017), where bridge models are
typically created manually and optimized by iterative
model updates using a test vehicle (Wu et al., 2017).
By comparing undamaged and damaged models (Shah
et al., 2018), the damage can be localized.
On the other hand, the implicit approach abandons
explicit construction of bridge structural models, not
least because accurate FEA modeling is costly. In this
approach, there are two main methods for anomalous
behavior detection. The first method is based on using
model parameters that dominate the bridge dynamics,
such as natural frequencies (Bicanic and Chen, 1997),
damping ratios (Cao et al., 2017), and stress influence
lines (Chen et al., 2014). The second method is based
on physical observation where the anomaly is defined
as a discrepancy between sensor data and predictions.
The predicted data can be static (Liu and Wang, 2010;
Ma and Bi, 2011) or dynamic (Zhang et al., 2018).
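The observation-based definition of an anomaly can be illustrated with a minimal sketch. It assumes that the prediction and the measurement are sampled at the same instants, and that a fixed threshold on the absolute residual is an acceptable decision rule. The function names and the threshold are hypothetical and are not taken from the cited works.

```python
def anomaly_scores(measured, predicted):
    """Return the absolute residual |measured - predicted| for each sample."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    return [abs(m - p) for m, p in zip(measured, predicted)]


def flag_anomalies(measured, predicted, threshold):
    """Mark samples whose residual exceeds a user-chosen threshold (assumption)."""
    return [score > threshold for score in anomaly_scores(measured, predicted)]
```

In practice the threshold would have to be calibrated on responses recorded while the bridge is known to be healthy.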
Traditionally, the transient signals were explained
by using Kalman filtering (Bing et al., 2011; Xiao and
Fang, 2016; Quansheng et al., 2010; Palanisamy and
Sim, 2015) for the case of quasistatic linear responses.
In a recent approach, Neves et al. (Neves et al., 2017;
Neves et al., 2018) modeled bridge vibration signals
by using a neural network that took the five preceding
acceleration samples (a 5-gram), axle loads, and axle positions as
its input.
Aspects of bridge dynamics such as normal strain
and displacement can be estimated by introducing the
concept of an influence line (Chen et al., 2014; Huang
et al., 2016). The strain response of components, e.g.,
flanges, girders, and deck slabs, may be explained by
a linear response model where the strain measurement
s(t) at time t is proportional to the product of axle load
w(x, t) and the value of the influence line i(x) at axle
point x:
\[ s(t) \approx \hat{s}(t) = \int_{0}^{l} w(x, t)\, i(x)\, dx, \tag{1} \]
where l is the bridge length. The function i(x) denotes
a proportionality factor for w(x, t), which is specific to
each bridge.
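For a vehicle whose axles act as point loads, the integral in Eq. (1) reduces to a sum over the axles currently on the bridge. The following sketch evaluates that sum under several assumptions that are not part of the original text: the vehicle travels at constant speed, each axle is described by its distance behind the front axle, and the influence line is a triangle peaking at mid-span. All function and parameter names are hypothetical.

```python
def triangular_influence_line(x, length, peak=1.0):
    """Illustrative influence line: zero at both supports, `peak` at mid-span.

    The triangular shape is an assumption; a real i(x) is specific to
    each bridge and sensor location.
    """
    if x < 0.0 or x > length:
        return 0.0  # the axle is not on the bridge
    return peak * (1.0 - abs(2.0 * x / length - 1.0))


def predicted_strain(t, speed, axle_offsets, axle_loads, length,
                     influence_line=triangular_influence_line):
    """Evaluate Eq. (1) for point loads: sum of load_k * i(x_k(t)).

    axle_offsets[k] is the distance of axle k behind the front axle, so
    its position at time t is x_k(t) = speed * t - axle_offsets[k].
    """
    total = 0.0
    for offset, load in zip(axle_offsets, axle_loads):
        position = speed * t - offset
        total += load * influence_line(position, length)
    return total
```

Calling `predicted_strain` over a range of `t` values yields the predicted response curve for one vehicle passage.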
To predict the dynamic responses or to extract the
influence line from sensor data, the vehicle properties,
including speed, loci, axle positions, and weights, are
needed. Zaurin and Catbas (Zaurin and Catbas, 2011)
investigated the collection of vehicle properties
via video surveillance. A problem with their approach
was that the targets were limited to test vehicles with
known axle loads. The most obvious approach to axle
weighing is to use an axle-load meter. However, it is
hard to retrofit an axle-load meter to existing bridges
because this meter needs paving work for installation.
Moreover, an axle-load meter is fragile and requires
frequent repair. Additionally, accurate axle weighing
may impose severe limits on vehicle traveling speed.
An alternative solution uses a bridge weigh-in-motion
(BWIM) (Lydon et al., 2015; Yu et al., 2016) system
that estimates axle weights by Eq. (1), but if the bridge
becomes damaged, the influence line may change and
lead to inaccurate axle-load estimates. The influence
line may also change its shape if the running position
in the lane changes. Accordingly, such a system needs
a complex model to handle the large number and wide
variety of strain-response patterns collected in advance
by test runs.
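The BWIM idea of estimating axle weights from Eq. (1) can be sketched as a linear least-squares problem: when the axle positions and the influence line are known, each strain sample is a linear combination of the unknown axle loads. The example below is a sketch under stated assumptions, not the method of the cited systems. It assumes constant vehicle speed, a known triangular influence line, and noise-free strain samples, and it solves the normal equations by Gaussian elimination. All names are hypothetical.

```python
def influence(x, length):
    """Assumed triangular influence line with unit peak at mid-span."""
    if x < 0.0 or x > length:
        return 0.0
    return 1.0 - abs(2.0 * x / length - 1.0)


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: axle loads are not identifiable")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def estimate_axle_loads(times, strains, speed, axle_offsets, length):
    """Least-squares axle loads from strain samples, using Eq. (1) for point loads."""
    # Row j holds i(x_k(t_j)) for every axle k.
    rows = [[influence(speed * t - d, length) for d in axle_offsets]
            for t in times]
    n = len(axle_offsets)
    # Normal equations (A^T A) w = A^T s.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    ats = [sum(r[i] * s for r, s in zip(rows, strains)) for i in range(n)]
    return solve_linear(ata, ats)
```

If the influence line passed to the estimator differs from the one that actually generated the strain signal, for instance after damage, the recovered loads are biased, which is the weakness noted above.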
In a previous paper (Kawakatsu et al., 2018a), we
abandoned the collection of axle weights. Instead, we
proposed an anomaly-detection method based on the
assumption that vehicle appearances and bridge strain
responses may share common features for the passing
vehicles. As we have previously reported (Kawakatsu
et al., 2018b; Kawakatsu et al., 2019), the strain data
themselves contain rich information about the passing
vehicles. By using two convolutional neural networks
(CNNs) for video and strain signals and by comparing
the video and sensor data in a common feature space,
we could successfully identify anomalous responses.
The main problem with that work is the interpretation of
anomaly scores, which are not directly associated with
physical abnormalities.
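The comparison in a common feature space can be illustrated as follows. The sketch assumes that each of the two networks has already produced a feature vector for the same vehicle passage, and it uses cosine distance as the anomaly score. The choice of metric is an assumption made for illustration only; the text above does not state which distance was used.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v); 0 means identical direction."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def anomaly_score(video_feature, strain_feature):
    """Hypothetical score: distance between video and strain embeddings."""
    return cosine_distance(video_feature, strain_feature)
```

A score of this kind says only that the two modalities disagree, not which physical quantity has changed, which is the interpretation problem described above.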
3 MEDIA-FUSION GAN
Fig. 1 illustrates the architecture of the generative
network. The network comprises two subnets, namely
the encoder and the decoder. The encoder extracts the
features of each vehicle, and the decoder predicts the
strain signal. Both subnets involve many preactivated
residual blocks (He et al., 2016). Note that we applied
an additional activation function to the output of each
block in addition to the two activation functions inside
the block. In this paper, we used leaky ReLU (Maas,
2013) for all activation functions except for those in
the output layers.
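The block structure described above can be sketched in a few lines. This is a simplified illustration, not the network of Fig. 1: dense layers on plain vectors stand in for the actual layers, normalization is omitted, and the leaky ReLU slope of 0.01 is an assumed value. The ordering follows the description: two activations inside the block, each applied before its layer, and one additional activation on the block output.

```python
def leaky_relu(values, slope=0.01):
    """Leaky ReLU applied elementwise; the slope value is an assumption."""
    return [v if v > 0.0 else slope * v for v in values]


def linear(weights, values):
    """Dense layer without bias: one output per weight row."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def preactivated_residual_block(x, w1, w2, slope=0.01):
    """Pre-activation residual block with an extra activation on its output.

    Both weight matrices must be square with the dimension of x so that
    the skip connection can be added elementwise.
    """
    h = linear(w1, leaky_relu(x, slope))   # first activation, then layer
    h = linear(w2, leaky_relu(h, slope))   # second activation, then layer
    y = [a + b for a, b in zip(x, h)]      # identity skip connection
    return leaky_relu(y, slope)            # additional output activation
```

With all-zero weights the residual branch vanishes and the block reduces to a leaky ReLU of its input, which is a convenient sanity check.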
The encoder network shown in Fig. 1(a) is derived
from the SpiNet (Kawakatsu et al., 2018a). To obtain