Figure 4: Embedding using t-SNE: (a) before AEWGAN; (b) after AEWGAN.
In summary, the AEWGAN-based method proposed in
this paper showed excellent results in both sampling
and anomaly detection for wafer data, which are
high-dimensional and imbalanced.
5 CONCLUSIONS
We observed that the proposed AEWGAN is an efficient
anomaly detection method for semiconductor
manufacturing process data that are high-dimensional
and imbalanced. AEWGAN first trains an autoencoder
using normal data only. The abnormal data are then
oversampled using the WGAN and fed as input into the
previously trained model. Finally, anomaly detection
is carried out within the latent space.
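The three stages summarized above can be sketched on toy data. This is a minimal illustration under loudly stated assumptions, not the implementation used in this paper: the encoder is a linear stand-in (the top principal direction of the normal data, which is what a 1-D linear autoencoder with squared-error loss recovers up to scale), the oversampler is Gaussian jitter in place of a trained WGAN generator, and the latent-space detector is a nearest-centroid rule. All names (`make_point`, `encode`, `oversample`, `is_abnormal`) and all data are hypothetical.

```python
import math
import random

random.seed(0)
DIM = 6  # toy size for illustration; the wafer data in the paper has 152 dimensions


def make_point(shift):
    """Synthetic sample: feature 0 has the largest variance and carries the anomaly shift."""
    return [random.gauss(shift, 3.0)] + [random.gauss(0.0, 1.0) for _ in range(DIM - 1)]


normal_train = [make_point(0.0) for _ in range(400)]
abnormal_train = [make_point(18.0) for _ in range(8)]  # scarce minority class

# Stage 1: fit the encoder on normal data only.
# Stand-in (assumption): top principal direction via power iteration on the covariance.
mean = [sum(col) / len(normal_train) for col in zip(*normal_train)]
centered = [[v - m for v, m in zip(row, mean)] for row in normal_train]
cov = [[sum(r[i] * r[j] for r in centered) / len(centered) for j in range(DIM)]
       for i in range(DIM)]
w = [1.0] * DIM
for _ in range(100):
    w = [sum(cov[i][j] * w[j] for j in range(DIM)) for i in range(DIM)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]


def encode(x):
    """Project a sample onto the 1-D latent space."""
    return sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, mean))


# Stage 2: oversample the abnormal class.
def oversample(samples, n_new, noise=0.5):
    """Stand-in for a trained WGAN generator: jitter randomly chosen abnormal samples."""
    return [[v + random.gauss(0.0, noise) for v in random.choice(samples)]
            for _ in range(n_new)]


synthetic = oversample(abnormal_train, 200)

# Stage 3: detect anomalies in the latent space with a nearest-centroid rule.
z_normal = [encode(x) for x in normal_train]
z_abnormal = [encode(x) for x in abnormal_train + synthetic]
c_normal = sum(z_normal) / len(z_normal)
c_abnormal = sum(z_abnormal) / len(z_abnormal)


def is_abnormal(x):
    """Flag a sample whose latent code lies closer to the abnormal centroid."""
    z = encode(x)
    return abs(z - c_abnormal) < abs(z - c_normal)


test_normal = [make_point(0.0) for _ in range(200)]
test_abnormal = [make_point(18.0) for _ in range(50)]
recall = sum(is_abnormal(x) for x in test_abnormal) / len(test_abnormal)
false_alarm = sum(is_abnormal(x) for x in test_normal) / len(test_normal)
print(f"abnormal recall={recall:.2f}, false alarm rate={false_alarm:.2f}")
```

The order of the stages is the point of the sketch: the encoder never sees abnormal data during fitting, and the oversampled abnormal data enter only afterwards, as input to the already fitted encoder.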
In the experiment, we used 152-dimensional
semiconductor wafer data with extreme class imbalance.
The results showed that the proposed AEWGAN
classified abnormal data better than the other models
compared, and the visual comparison through t-SNE
also indicated efficient anomaly detection.
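Classification performance on the abnormal class matters here because overall accuracy can be misleading under extreme imbalance. The following sketch illustrates this with made-up labels; the function name `abnormal_class_metrics` and the counts are hypothetical and are not results from this paper, and this section does not restate which metrics the experiments report.

```python
def abnormal_class_metrics(y_true, y_pred, positive=1):
    """Accuracy plus precision, recall, and F1 for the abnormal (positive) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 990 normal (0) and 10 abnormal (1) samples.
y_true = [0] * 990 + [1] * 10
always_normal = [0] * 1000                          # never flags anything
detector = [0] * 980 + [1] * 10 + [1] * 8 + [0] * 2  # 10 false alarms, 8 of 10 anomalies found

trivial = abnormal_class_metrics(y_true, always_normal)
useful = abnormal_class_metrics(y_true, detector)
print(trivial)
print(useful)
```

In this toy case the classifier that never flags anything reaches an accuracy of 0.99 with an abnormal-class recall of 0, while the detector that finds 8 of the 10 anomalies has a slightly lower accuracy of 0.988 and a recall of 0.8.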
The method is expected to be applicable to
semiconductor manufacturing processes with various
production systems, i.e., data with many variables and
few abnormal samples, so it is likely to be practical
in a wide variety of areas. Future work will extend
anomaly detection beyond manufacturing process
sensor data to time series data that depend on past
values. Further quantitative comparisons will also be
needed.
ACKNOWLEDGEMENTS
This work was supported by the National Research
Foundation of Korea (NRF) grant funded by the
Korea government (MSIT) (NRF-2019R1A2C2005949).
This work was also supported
by the BK21 Plus (Big Data in Manufacturing and
Logistics Systems, Korea University) and by
Samsung Electronics Co., Ltd.
REFERENCES
Akbani, R., Kwek, S., and Japkowicz, N., 2004. Applying
support vector machines to imbalanced datasets. In
European conference on machine learning (pp. 39-50).
Springer, Berlin, Heidelberg.
An, J., and Cho, S., 2015. Variational autoencoder based
anomaly detection using reconstruction probability.
Special Lecture on IE, 2, 1-18.
Arjovsky, M., Chintala, S., and Bottou, L., 2017.
Wasserstein GAN. arXiv preprint arXiv:1701.07875.
Chandola, V., Banerjee, A., and Kumar, V., 2009. Anomaly
detection: A survey. ACM Computing Surveys (CSUR),
41(3), 15.
Chawla, N. V., 2009. Data mining for imbalanced datasets:
An overview. In Data mining and knowledge discovery
handbook (pp. 875-886). Springer, Boston, MA.
Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer,
W. P., 2002. SMOTE: Synthetic minority over-sampling
technique. Journal of Artificial Intelligence
Research, 16, 321-357.
Chawla, N. V., Japkowicz, N., and Kotcz, A., 2004. Special
issue on learning from imbalanced data sets. ACM
Sigkdd Explorations Newsletter, 6(1), 1-6.
Chen, B., Wan, J., Shu, L., Li, P., Mukherjee, M., and Yin,
B., 2017. Smart factory of industry 4.0: Key
technologies, application case, and challenges. IEEE
Access, 6, 6505-6519.
Douzas, G., and Bacao, F., 2018. Effective data generation
for imbalanced learning using conditional generative
adversarial networks. Expert Systems with Applications,
91, 464-471.
Freeman, J., 1995. Outliers in statistical data. Journal of the
Operational Research Society, 46(8), 1034-1035.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B.,
Warde-Farley, D., Ozair, S., ... and Bengio, Y., 2014.