signals can effectively be processed to extract
meaningful features, which, when fed into a trained
classification encoder and a subsequent generative
network, can generate representative images of the
perceived visual stimulus. When comparing the SSIM scores of the AC-GAN models, we found that the best reconstruction was achieved by the AC-GAN without modulation (average SSIM = 0.291, i.e. 65% average similarity between the generated images and the ground truth), and the worst by the AC-GAN with modulation and multiplication (average SSIM = 0.2074, i.e. 60% average similarity). The average Dice scores for the three AC-GAN models were lower than the average SSIM scores (without modulation: 43.73%; with modulation and multiplication: 39%; with modulation and concatenation: 35.69%). Because the Dice score measures only pixel-by-pixel overlap between the reconstructed images and the ground truth, and disregards structural, luminance, and contrast information, we consider the SSIM score the more accurate reconstruction metric.
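To make the difference between the two metrics concrete, the snippet below shows how both can be computed for a reconstructed digit and its ground truth. This is a minimal sketch using scikit-image; the binarization threshold of 0.5 used for the Dice score is an illustrative assumption, not necessarily the one used in our evaluation.

    import numpy as np
    from skimage.metrics import structural_similarity as ssim

    def dice_score(recon, truth, threshold=0.5):
        # Dice compares pixel-by-pixel overlap only:
        # Dice = 2*|A & B| / (|A| + |B|).
        # The 0.5 threshold assumes images normalized to [0, 1].
        a = recon > threshold
        b = truth > threshold
        return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

    def evaluate(recon, truth):
        # SSIM additionally weighs local structure, luminance and
        # contrast, which Dice ignores entirely.
        s = ssim(recon, truth, data_range=truth.max() - truth.min())
        return s, dice_score(recon, truth)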
In this study, we utilized EEG data collected by a different group (MindBigData (Vivancos and Cuesta, 2022)), over whose experimental design, recording, and collection processes we had no control. Furthermore, EEG signals are inherently noisy and contaminated by muscular artifacts and eye blinks, particularly in the signals from the frontal electrodes. Careful preprocessing of the MindBigData EEG signals reduced the data available for training and testing from 51,895 samples to 1,958 samples, only 3.77% of the total dataset. Even though the final EEG dataset used for encoding and classification was small, our CNN model reached an average classification accuracy of 92%, which is close to the current state of the art (~96% (Mahapatra and Bhuyan, 2023)). Given the single-subject nature of these data, it is unclear how the results would generalize to a larger population. We believe this could be a fruitful area for future investigation.
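Regarding the preprocessing step, the kind of filtering and amplitude-based epoch rejection involved can be sketched with MNE-Python (Gramfort et al., 2013). The channel names, sampling rate, filter band, and 100 µV rejection threshold below are illustrative assumptions, not the exact parameters of our pipeline.

    import numpy as np
    import mne

    # Illustrative parameters, not the exact MindBigData settings.
    sfreq = 128.0
    ch_names = ["AF3", "F7", "F3", "FC5"]
    data = np.random.randn(len(ch_names), int(sfreq * 60)) * 1e-5  # volts

    info = mne.create_info(ch_names=ch_names, sfreq=sfreq, ch_types="eeg")
    raw = mne.io.RawArray(data, info)

    # Band-pass filter to suppress slow drift and high-frequency noise.
    raw.filter(l_freq=1.0, h_freq=40.0)

    # Cut fixed-length epochs and reject any whose peak-to-peak
    # amplitude exceeds 100 µV, discarding most eye-blink and
    # muscle artifacts.
    events = mne.make_fixed_length_events(raw, duration=2.0)
    epochs = mne.Epochs(raw, events, tmin=0.0, tmax=2.0, baseline=None,
                        reject=dict(eeg=100e-6), preload=True)
    print(f"kept {len(epochs)} of {len(events)} epochs")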
Despite the AC-GAN model's ability to generate conditioned images, it is unable to rectify misclassifications based solely on the encoded EEG latent vector and conditioning label. As a result, when the model generates an image that is incorrectly classified, the resulting image exhibits a low SSIM score, indicating poor similarity to the ground truth (e.g. the reconstructed ‘2’ digit by the AC-GAN without modulation (see Fig. 9), or the reconstructed ‘0’ digit by the AC-GAN with modulation and concatenation (see Fig. 11)). Paradoxically, in some cases a misclassified image may still receive a relatively high score, suggesting a potential discrepancy between the model's perception of the image and its actual fidelity to the ground truth (e.g. the reconstructed ‘6’ digit by the AC-GAN with modulation and multiplication (see Fig. 10)). This is because, in the current model design, the predicted class conditioning label has a stronger influence on the generated image than the encoded latent vector.
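The two conditioning schemes can be summarized as follows. This is a simplified sketch (the layer sizes and the embedding-based label encoding are illustrative, not the exact architecture of our models), but it shows why a wrong label dominates: the label embedding enters the generator input directly, whether concatenated with or multiplied into the latent vector.

    import torch
    import torch.nn as nn

    class ConditionedInput(nn.Module):
        # Combine the encoded EEG latent vector z with the predicted
        # class label y, either by concatenation or by element-wise
        # multiplication (modulation). Dimensions are illustrative.
        def __init__(self, latent_dim=100, n_classes=10, mode="concat"):
            super().__init__()
            self.mode = mode
            self.embed = nn.Embedding(n_classes, latent_dim)

        def forward(self, z, y):
            e = self.embed(y)                    # label embedding
            if self.mode == "concat":
                return torch.cat([z, e], dim=1)  # (batch, 2 * latent_dim)
            return z * e                         # modulation by multiplication

A misclassified y therefore steers the generator toward the wrong digit regardless of the visual information carried by z.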
A future extension of our study would be to apply a reinforcement learning agent to the class-label prediction. This agent would penalize incorrect class labels, thus reducing their influence on the generated image relative to the encoded latent vector. In cases where the classification is incorrect, the agent would provide unsupervised learning adjustments that may help correct the AC-GAN's prediction, or flag the output with an uncertainty score.
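While a full reinforcement learning agent is beyond the scope of this paper, a simple precursor of the idea, purely illustrative and not implemented here, is to scale the label embedding by the classifier's softmax confidence, so that an uncertain label influences the generator input less than the EEG latent vector (here, embed is the label-embedding module from the previous sketch):

    import torch

    def confidence_weighted_label(embed, logits):
        # Illustrative sketch only: weight the label embedding by the
        # classifier's softmax confidence so uncertain labels
        # contribute less to the generator input.
        probs = torch.softmax(logits, dim=1)
        conf, y = probs.max(dim=1)          # confidence and label
        return conf.unsqueeze(1) * embed(y), conf

The proposed agent would instead learn such a weighting from a reward that penalizes incorrect labels, rather than fixing it to the softmax confidence.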
ACKNOWLEDGEMENTS
This work was supported by the EU HORIZON 2020
Project ULTRACEPT under Grant 778062.
REFERENCES
Fulford J, Milton F, Salas D, Smith A, Simler A, Winlove
C, Zeman A. (2018). The neural correlates of visual
imagery vividness – An fMRI study and literature
review. Cortex. 105: 26-40
Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-
Farley D, Ozair S, Courville A, Bengio Y. (2014)
Generative Adversarial Nets. https://arxiv.org/abs/1406.2661
Gramfort A, Luessi M, Larson E, Engemann D, Strohmeier
D, Brodbeck C, Goj R, Jas M, Brooks T, Parkkonen L,
Hämäläinen M. (2013) MEG and EEG data analysis
with MNE-Python. Frontiers in Neuroscience.
7(267):1-13
Gurumurthy S, Sarvadevabhatla R, Babu R. (2017)
DeLiGAN: Generative Adversarial Networks for
Diverse and Limited Data. IEEE Conference on
Computer Vision and Pattern Recognition (CVPR).
IEEE. 166-174. https://doi.org/10.1109/cvpr.2017.525
Jiao Z, You H, Yang F, Li X, Zhang H, Shen D. (2019)
Decoding EEG by visual-guided deep neural networks.
IJCAI, pp. 1387–1393
Khare S, Choubey RN, Amar L, Udutalapalli V. (2022).
NeuroVision: perceived image regeneration using
cProGAN. Neural Computing and Applications 34:
5979–5991