whenever the bus speed is below a given threshold
(because otherwise people outside the bus could be
listened to for a long time). The other strategy
consisted of filtering the collected paths in order
to discard paths too short in time or distance (which
could yield sporadic proximities of cars and buses).
Still, we have no means to prevent a person carrying
more than one Wi-Fi enabled device from being counted
more than once, nor can we count people not carrying
a Wi-Fi enabled device.
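The two filtering strategies can be sketched as follows. This is a minimal illustration and not the deployed code: the Sighting record, the function names and all three threshold values are assumptions made for the example, and the path length is approximated by the straight-line distance between the first and last sighting.

```python
from dataclasses import dataclass
from math import radians, sin, cos, asin, sqrt

# All thresholds below are hypothetical placeholders, not values from the paper.
MIN_SPEED_KMH = 5.0      # ignore frames captured while the bus is slower than this
MIN_DURATION_S = 120.0   # discard paths shorter than this in time
MIN_DISTANCE_M = 300.0   # discard paths shorter than this in distance


@dataclass
class Sighting:
    device: str       # anonymized device identifier
    t: float          # capture time, in seconds
    lat: float        # bus latitude at capture time
    lon: float        # bus longitude at capture time
    speed_kmh: float  # bus speed at capture time


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two GPS fixes."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(a))


def plausible_passengers(sightings):
    """Return the set of devices whose path survives both filters."""
    paths = {}
    for s in sightings:
        if s.speed_kmh < MIN_SPEED_KMH:
            continue  # strategy 1: bus too slow, bystanders may be heard
        paths.setdefault(s.device, []).append(s)

    kept = set()
    for device, path in paths.items():
        path.sort(key=lambda s: s.t)
        first, last = path[0], path[-1]
        duration = last.t - first.t
        distance = haversine_m(first.lat, first.lon, last.lat, last.lon)
        # strategy 2: drop paths too short in time or in distance
        if duration >= MIN_DURATION_S and distance >= MIN_DISTANCE_M:
            kept.add(device)
    return kept
```

Note that the result is a set of devices, not of people, so the limitation stated above still applies: a passenger with two devices appears twice, and a passenger with none does not appear.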
The system also included a strategy, Fake Network
Advertisement (FNA), which was meant to force the
discovery of otherwise undetectable devices (i.e., de-
vices with an enabled Wi-Fi interface but not sending
any frames). In the real deployment scenario it proved
to work, but its benefits were small (about 2% more
devices were discovered because of FNA).
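To make the idea of advertising a fake network concrete, the sketch below assembles the bytes of a minimal IEEE 802.11 beacon frame for a given SSID. It is an illustration under stated assumptions and not the implementation used in the system: the function name, the SSID and the BSSID are hypothetical, only the SSID element is included (real beacons usually carry further elements such as supported rates), and the radiotap header and monitor-mode injection needed to actually transmit the frame are omitted.

```python
import struct

BROADCAST = b"\xff" * 6


def build_beacon(ssid: str, bssid: bytes, seq: int = 0) -> bytes:
    """Assemble a minimal 802.11 beacon frame (no radiotap header, no FCS)."""
    ssid_bytes = ssid.encode("utf-8")
    if len(bssid) != 6:
        raise ValueError("BSSID must be 6 bytes")
    if len(ssid_bytes) > 32:
        raise ValueError("SSID must be at most 32 bytes")

    header = (
        b"\x80\x00"    # frame control: management frame, subtype beacon
        + b"\x00\x00"  # duration
        + BROADCAST    # destination address: broadcast
        + bssid        # source address
        + bssid        # BSSID
        + struct.pack("<H", (seq & 0x0FFF) << 4)  # sequence control
    )
    fixed = struct.pack(
        "<QHH",
        0,       # timestamp (normally filled in by the hardware)
        100,     # beacon interval, in time units of 1024 microseconds
        0x0001,  # capability information: ESS bit set, privacy bit clear
    )
    ssid_element = bytes([0, len(ssid_bytes)]) + ssid_bytes  # element ID 0 = SSID
    return header + fixed + ssid_element
```

A device that has a matching SSID in its list of known networks may react to such a beacon by transmitting, which is what makes an otherwise silent device observable.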
A prototype of the collecting system was developed
using a Raspberry Pi and an external GPS sensor. The
collector was deployed on a Porto city bus for some
months. After filtering the collected data, we found
similar occupancy levels on some weekdays, which leads
us to conclude that the results obtained are probably
legitimate, i.e., on average they yield a percentage
of the exact population traveling in the bus. We tried
to obtain the ticket validation data to further assess
the quality of our observations, but it was not
possible.
2 RELATED WORK
Abedi et al. (Abedi et al., 2013) performed a study to
evaluate Wireless Local Area Network (WLAN) tech-
nologies, Wi-Fi and Bluetooth, as a way of detecting
devices used by people. The authors performed stud-
ies regarding discovery time and popularity of use.
Wi-Fi surpassed Bluetooth on both metrics, registering
an average discovery time of 1.4 seconds, while
Bluetooth registered 10.6 seconds. On the popularity
test, Wi-Fi was responsible for 92% of the total
number of devices detected by both WLAN technologies.
Musa et al. (Musa and Eriksson, 2012) used similar
concepts to track devices inside moving vehicles.
Their approach included interesting techniques to
increase the amount of data received. These techniques
are mostly aimed at increasing the rate of frames
received from devices, whereas our objective is to
increase the number of devices detected. A variation
of one of those techniques, Popular SSID AP Emulation,
is the FNA implemented in our system. While they
implemented theirs with a fully functional AP, our
system only emitted beacon frames to elicit frames
from otherwise silent devices.
Kostakos et al. (Kostakos et al., 2010) proposed a
solution to obtain the Origin Destination (OD) matrix
by using Bluetooth technology. This solution could
accurately determine a user’s origin and destination
on a trip. However, the percentage of detected
passengers was low: only about 9.7% of travelers were
detected. The authors state that the low number of
travelers detected is due to the fact that a traveler
is required to have a device with Bluetooth active and
set to discoverable mode, which according to (O’Neill
et al., 2006) only 7.5% of individuals do.
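As an illustration of how an OD matrix can be derived from device detections, the sketch below takes the first stop at which a device is seen as its origin and the last one as its destination. This is a simplified example and not the method of Kostakos et al.: the function name, the (device, time, stop) input layout and the stop labels are hypothetical.

```python
from collections import Counter


def od_matrix(detections):
    """Build origin-destination counts from (device, time, stop) detections.

    The first stop at which a device is seen is taken as its origin and the
    last one as its destination. Devices whose origin and destination
    coincide are skipped.
    """
    first_last = {}
    for device, t, stop in sorted(detections, key=lambda d: d[1]):
        if device not in first_last:
            first_last[device] = [stop, stop]
        else:
            first_last[device][1] = stop

    matrix = Counter()
    for origin, destination in first_last.values():
        if origin != destination:
            matrix[(origin, destination)] += 1
    return matrix
```

With a detection rate near 9.7%, such counts describe only the detected sample, so they would have to be scaled before being read as passenger flows.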
Bullock et al. (Bullock et al., 2010) deployed a
tracking system at the new Indianapolis International
Airport to measure passenger transit times between
security checkpoints. This work is also based on
Bluetooth, and it also exhibited a low success rate:
only 5% to 6.8% of individuals were detected.
Abedi et al. (Abedi et al., 2013) discuss some
practical challenges in the collection and monitoring
of crowd data, but they were concerned with people
moving in open areas, and not with people traveling
inside a transportation vehicle.
Shlayan et al. (Shlayan et al., 2016) proposed a
system using Bluetooth and Wi-Fi technology in order
to estimate the OD matrix and wait times. The authors
performed two pilot tests of the system in New York,
one at the Atlantic Avenue Subway Station (aimed at
subway systems) and another at the Port Authority
Transit Facility (aimed at pedestrian flows). The
approach chosen by the authors was different from
ours: they relied on positioning Bluetooth and Wi-Fi
sensors in stations, and not in transportation
vehicles as in our system. Similarly to (Kostakos et
al., 2010), the results show a small number of devices
detected by Bluetooth: less than 4% of all detected
devices in two separate tests. Therefore, they
concluded that Wi-Fi is a far more viable alternative.
The study by Shlayan et al. only considered Wi-Fi
probe requests (which can limit the sample size of the
obtained results), and encryption was necessary to
anonymize the records (due to the nature of the
implemented system architecture). In contrast, we
used all kinds of Wi-Fi communications to infer the
presence of a personal device, and our records are not
encrypted, since they are fully anonymized once they
leave the collecting device.
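The text does not state here how records are anonymized on the collecting device. Purely as an illustration, one common option is to replace each MAC address with a keyed hash computed on the collector, using a key that never leaves it. The sketch below assumes this approach; the key handling, the function name and the truncation to 16 hexadecimal characters are hypothetical choices, and strictly speaking the result is a pseudonym that stays stable for as long as the key is kept.

```python
import hashlib
import hmac
import secrets

# Hypothetical: a random key generated on the collector and never exported.
DEVICE_KEY = secrets.token_bytes(32)


def anonymize_mac(mac: str, key: bytes = DEVICE_KEY) -> str:
    """Map a MAC address to a stable pseudonym using HMAC-SHA256."""
    normalized = mac.lower().replace("-", ":").encode("ascii")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()[:16]
```

Because the same address always maps to the same pseudonym under a given key, paths can still be reconstructed per device without the server ever seeing the original address.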
3 PROPOSED SOLUTION
The proposed system architecture aims to create a
client-server system in which the client is a collector
module responsible for collecting data regarding trav-