which means that time-related features are good
classifiers to characterize encrypted and VPN traffic.
5.2 Analysis of Scenario B
In this scenario, all encrypted and VPN traffic is
mixed together in one dataset, and the objective is
to characterize the traffic without first separating
VPN from non-VPN traffic. We therefore have
14 types of traffic: 7 encrypted and 7 VPN traffic
categories. The results are shown in Figure 3 (parts
In this case, the pattern 'shorter timeout, better
accuracy' is not as clear as in the previous scenario
(5.1). For example, using the C4.5 algorithm, the Pr
of VPN-Browsing, VPN-Mail, and Mail with 15 sec is
0.771, 0.739, and 0.671 respectively, values lower
than the 0.809, 0.786, and 0.79 obtained with 120 sec.
The KNN results are similar: the Pr of the
VPN-Browsing, VPN-Chat, and VPN-Mail traffic
categories is (0.691, 0.501, 0.688) for a 15 sec ftm,
no higher than the Pr obtained with 120 sec (0.743,
0.501, 0.688). On the other hand, the highest average
Pr across the different ftm values is around 0.783
for C4.5 and 0.711 for KNN, around 0.5
points lower than the best values from Scenario A.
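The comparison above can be made concrete with a minimal sketch in Python. It is not the authors' experimental setup. It assumes that Pr denotes per-class precision, TP / (TP + FP), computed separately for each of the traffic labels. The helper per_class_precision and the toy label lists are hypothetical and serve only as illustration. The C4.5 values in c45_pr are the ones quoted in the text.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Return {label: TP / (TP + FP)} for each label predicted at least once."""
    predicted = Counter(y_pred)  # TP + FP per label
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)  # TP per label
    return {label: correct[label] / count for label, count in predicted.items()}


# Toy example (hypothetical labels, not taken from the paper's dataset).
y_true = ["VPN-Mail", "Mail", "VPN-Mail", "VPN-Chat", "Mail", "VPN-Chat"]
y_pred = ["VPN-Mail", "VPN-Mail", "VPN-Mail", "VPN-Chat", "Mail", "Mail"]
toy = per_class_precision(y_true, y_pred)

# C4.5 precision values quoted in the text, keyed by flow timeout in seconds.
c45_pr = {
    "VPN-Browsing": {15: 0.771, 120: 0.809},
    "VPN-Mail": {15: 0.739, 120: 0.786},
    "Mail": {15: 0.671, 120: 0.790},
}

# Precision gained by moving from the 15 sec to the 120 sec timeout.
gain = {label: round(pr[120] - pr[15], 3) for label, pr in c45_pr.items()}

for label, delta in gain.items():
    print(f"{label}: {delta:+.3f}")
```

For these three categories the difference is positive in every case, which is why the 'shorter timeout, better accuracy' pattern of Scenario A does not carry over cleanly to Scenario B.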
In this paper we have studied the effectiveness of
time-related features in addressing the challenging
problem of characterizing encrypted traffic and
detecting VPN traffic. We have proposed a set of
time-related features and used two common machine
learning algorithms, C4.5 and KNN, as classification
techniques. Our results show that the proposed set of
time-related features is a good basis for
classification, achieving accuracy levels above 80%.
C4.5 and KNN performed similarly in all experiments,
although C4.5 achieved better results. Of the two
scenarios proposed, characterization in two steps
(Scenario A) vs. characterization in one step
(Scenario B), the first generated better results. In
addition to our main objective, we have also found
that our classifiers perform better when the flows
are generated using shorter timeout values, which
contradicts the common assumption of using 600s as
the timeout duration. As future work we plan to
expand our work to other applications and types of
encrypted traffic, and to further study the
application of time-based features to characterize
encrypted traffic.
Characterization of Encrypted and VPN Traffic using Time-related Features