proper degenerate space on real data is virtually impossible: as the difference between two or more eigenvalues of $Q$ approaches zero, the numerical values of the associated PCs can become discontinuous in a real-time analysis. This by itself does not represent an error in an absolute sense in computing $C_n$, as the specific $C_n$ is the same as the one that would be computed off-line.
• In the two previous cases the discontinuity was due to an ambiguity of the diagonalization matrix. A third source of discontinuity lies in the temporal evolution of the eigenvalues. Consider two one-dimensional eigenspaces, one associated with a variance that is increasing in time and the other with a variance that is decreasing: there will be a time step $\bar{n}$ at which the two eigenspaces are degenerate. This is called a "level crossing" and corresponds, in the algorithm, to an effective swap in the "correct" order of the eigenvectors. To restore continuity, the two components must be swapped, as in the sketch after this list.
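A minimal sketch of such a continuity correction, written in Python with NumPy (the function name and the greedy matching strategy are illustrative assumptions, not the paper's implementation): consecutive sets of eigenvectors are matched by largest absolute inner product, which undoes a level-crossing swap and, as a byproduct, also fixes the sign ambiguity discussed above.

```python
import numpy as np

def align_components(P_prev, P_new):
    """Reorder and sign-align the columns of P_new to follow P_prev.

    Hypothetical helper: each old eigenvector is greedily matched to the
    new eigenvector with the largest absolute inner product, restoring
    continuity across a level crossing; signs are then flipped so that
    each matched pair points the same way.
    """
    k = P_prev.shape[1]
    overlap = P_prev.T @ P_new            # (k, k) inner products
    order = np.empty(k, dtype=int)
    taken = np.zeros(k, dtype=bool)
    for i in range(k):                    # greedy matching, old -> new
        scores = np.where(taken, -np.inf, np.abs(overlap[i]))
        order[i] = np.argmax(scores)
        taken[order[i]] = True
    P_aligned = P_new[:, order]
    signs = np.sign(np.sum(P_prev * P_aligned, axis=0))
    signs[signs == 0] = 1.0               # guard against a zero overlap
    return P_aligned * signs
```

At the crossing step $\bar{n}$ the two eigenvalues coincide and any matching is acceptable; immediately after it, the greedy match restores the pre-crossing order of the components.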
3 EXAMPLES AND RESULTS
A publicly available data set was used for this example: it consists of snapshot measurements of 27 variables from a distillation column, with a sampling rate of one sample every three days, measured over 2.5 years. Sampling rate, and time in general, are not relevant per se for PCA. Nevertheless, since we have discussed the continuity issue, it is interesting to see how the algorithm behaves on physical variables representing the continuous evolution of a physical system.
The variables represent temperatures, pressures, flows and other kinds of measurements (the database is of industrial origin and the exact meaning of all the variables is not specified). Details are available on-line (Dunn, 2011).
This kind of data set includes variables that are strongly correlated with each other, variables with a large variance, and variables that remain almost constant over several samples. In Figure 1 we display the time evolution of the variables and the standard batch PCA. In Figure 3 the evolution of the covariance matrix $Q$ and the incremental PCs are shown. Notice that the values $p_n$ do not coincide with the ones computed with the batch method until the last sample. The matrix $Q$ converges almost monotonically to the covariance matrix computed with the batch method. Note that at the beginning the Frobenius norm of the difference between the two matrices sometimes grows as samples are added; the number of samples needed for $Q$ to settle to its final value depends on the regularity of the data, and the variations in $Q$ may themselves provide an interesting description of the analyzed process. This behaviour is expected for the estimator of the covariance matrix until $m \lesssim n$. While the sample covariance matrix is an unbiased estimator for $n \to \infty$, it is known to converge inefficiently (Smith, 2005).
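As a concrete illustration of this convergence check, here is a short Python sketch (assuming the standard rank-one running update of mean and covariance; the function name is hypothetical) that tracks the Frobenius norm of the difference between the running estimate $Q_n$ and the batch covariance:

```python
import numpy as np

def covariance_convergence(X):
    """Return ||Q_n - Q_batch||_F for n = 2, ..., N as samples arrive.

    A minimal sketch: mu and the scatter matrix M2 are updated with one
    rank-one term per sample (Welford-style), and the unbiased running
    covariance Q_n = M2 / (n - 1) is compared with the batch estimate.
    """
    N, m = X.shape
    Q_batch = np.cov(X, rowvar=False)     # batch covariance of all samples
    mu = np.zeros(m)
    M2 = np.zeros((m, m))
    norms = []
    for n, x in enumerate(X, start=1):
        delta = x - mu
        mu = mu + delta / n               # running mean over n samples
        M2 = M2 + np.outer(delta, x - mu) # rank-one scatter update
        if n > 1:
            norms.append(np.linalg.norm(M2 / (n - 1) - Q_batch, 'fro'))
    return norms
```

On data like those of Figure 3 this curve need not be monotone at first, in agreement with the fluctuations noted above.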
In order to quantify the efficiency of the algorithm, the computational cost of the proposed incremental solution has been compared with the batch implementation provided by the Matlab built-in function pca, on an Intel Core i7-7700HQ CPU running at 2.81 GHz with the Windows 10 operating system. The results are shown in Figure 2. The time required to execute the incremental algorithm grows linearly with the number of samples, while the execution time of the batch implementation grows with the size of the dataset. As reasonably expected, the batch implementation is more efficient than the incremental one when the PCA is computed on the whole dataset, while the incremental implementation is more efficient when samples are added incrementally.
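The following rough benchmarking sketch reproduces the "single case" versus "cumulative" comparison in Python (the incremental step below, a covariance update followed by re-diagonalization, is only a naive stand-in for the exact algorithm; all names are illustrative):

```python
import time
import numpy as np

def benchmark(X, checkpoints):
    """Compare cumulative incremental cost with repeated batch PCA.

    Illustrative sketch: the batch branch recomputes the PCA from scratch
    at every step, while the incremental branch reuses the running mean
    and scatter matrix. Each returned row holds
    (n, batch single, batch cumulative, incremental cumulative).
    """
    m = X.shape[1]
    mu, M2 = np.zeros(m), np.zeros((m, m))
    incr_cum, batch_cum = 0.0, 0.0
    rows = []
    for n in range(1, X.shape[0] + 1):
        t0 = time.perf_counter()
        x = X[n - 1]
        delta = x - mu
        mu += delta / n                   # running mean
        M2 += np.outer(delta, x - mu)     # running scatter matrix
        if n > 1:
            np.linalg.eigh(M2 / (n - 1))  # PCs of the running covariance
        incr_cum += time.perf_counter() - t0
        t0 = time.perf_counter()
        if n > 1:
            np.linalg.eigh(np.cov(X[:n], rowvar=False))  # batch, from scratch
        batch_single = time.perf_counter() - t0
        batch_cum += batch_single
        if n in checkpoints:
            rows.append((n, batch_single, batch_cum, incr_cum))
    return rows
```

For example, `benchmark(np.random.randn(250, 27), {50, 100, 150, 200, 250})` mimics the sample counts of Figure 2; the qualitative trends, not the absolute times, are the point of the comparison.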
[Figure 2 plot: Time [s] versus Samples; series: Batch (cumulative), Batch (single case), Incremental (cumulative).]
Figure 2: Time required to execute the incremental PCA and the batch implementation as a function of the number of samples. For the batch algorithm, both the time required to compute the PCA on the given number of samples (single case) and the cumulative time required to perform the PCA with each additional sample (cumulative) are shown. The computational time is measured empirically and can be affected by small fluctuations due to the activity of the operating system: in order to take this into account, the average times (darker lines) and their standard deviations (error bars) are computed over 33 trials. The batch implementation is more efficient than the incremental one when the PCA is computed on the whole dataset, while the incremental implementation is more efficient when samples are added incrementally.
In Figure 5 the variance of the whole incremental
PCA is shown. Comparing it with Figure 1 (bottom),
it is evident that the incremental PCs that are not lin-
early independent over the whole sampling time have
a slightly different distribution of the variance com-