[Performance and scalability] From the
implementation point of view, our goal is to
make execution in the cloud faster and to make
better use of the cloud's scalability features.
These properties are especially important for
large-scale analysis (during the training
phase) and for serving many users or other
applications. In particular, we will study how
existing technologies such as Mesos (Hindman
et al., 2011) or Kubernetes can be used for
that purpose.
REFERENCES
Abadi, D.J., 2007. Column stores for wide and sparse data.
In Proceedings of the Conference on Innovative Data
Systems Research (CIDR), 292–297.
Ahmed, M., Mahmood, A.N., Hu, J., 2016. A survey of
network anomaly detection techniques. Journal of
Network and Computer Applications 60, 19–31.
Berthold, M.R., Cebron, N., Dill, F., Gabriel, T., Koetter,
T., Meinl, T., Ohl, P., Sieb, C., Thiel, K., Wiswedel, B.,
2007. KNIME: The Konstanz Information Miner. In
Studies in Classification, Data Analysis,
and Knowledge Organization (GfKL), Freiburg,
Germany, Springer-Verlag.
Borg, I., Groenen, P.J., 2005. Modern Multidimensional
Scaling: Theory and Applications. Springer, New York,
NY, USA.
Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis:
Forecasting and Control, Rev. Edition, San Francisco:
Holden-Day.
Chandola, V., Banerjee, A., Kumar, V., 2009. Anomaly
Detection: A Survey. ACM Computing Surveys 41(3).
Chandola, V., Banerjee, A., Kumar, V., 2012. Anomaly
Detection for Discrete Sequences: A Survey, IEEE
Transactions on Knowledge and Data Engineering,
24(5).
Copeland, G.P., Khoshafian, S.N., 1985. A decomposition
storage model. In SIGMOD 1985, 268–279.
de Leeuw, J., 1988. Convergence of the majorization
methods for multidimensional scaling. Journal of
Classification, 5(2):163–180.
Dean, J., Ghemawat, S., 2004. MapReduce: Simplified data
processing on large clusters. In Sixth Symposium on
Operating System Design and Implementation
(OSDI'04), 137–150.
Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L.A., 2006.
Feature Extraction: Foundations and Applications.
Springer, New York, NY, USA.
Kandel, S. et al., 2011. Research Directions in Data
Wrangling: Visualizations and transformations for
usable and credible data. Information Visualization,
10(4), 271–288.
Khreich, W., Khosravifar, B., Hamou-Lhadj, A., Talhi, C.,
2017. An anomaly detection system based on variable
N-gram features and one-class SVM. Information and
Software Technology 91, 186–197.
Kruskal, J.B., 1964. Multidimensional scaling by
optimizing goodness of fit to a nonmetric hypothesis.
Psychometrika, 29(1):1–27.
Manning, C.D., Raghavan, P., Schütze, H., 2008. Scoring,
term weighting, and the vector space model.
Introduction to Information Retrieval. p. 100.
McKinney, W., 2010. Data Structures for Statistical
Computing in Python. In Proceedings of the 9th Python
in Science Conference (SciPy 2010), 51–56.
McKinney, W., 2011. pandas: a Foundational Python
Library for Data Analysis and Statistics. In Proc.
PyHPC 2011.
Hindman, B., Konwinski, A., Zaharia, M., Ghodsi, A.,
Joseph, A.D., Katz, R., Shenker, S., Stoica, I., 2011.
Mesos: A Platform for Fine-Grained Resource Sharing
in the Data Center. Proc. 8th USENIX conference on
Networked systems design and implementation (NSDI
2011), 295–308.
Saia, R., Carta, S., 2017. A Frequency-domain-based
Pattern Mining for Credit Card Fraud Detection, In
Proc. 2nd International Conference on Internet of
Things, Big Data and Security (IoTBDS 2017),
386–391.
Savinov, A., 2014. ConceptMix: Self-Service Analytical
Data Integration Based on the Concept-Oriented
Model, Proc. 3rd International Conference on Data
Technologies and Applications (DATA 2014), 78–84.
Savinov, A., 2016. DataCommandr: Column-Oriented Data
Integration, Transformation and Analysis.
International Conference on Internet of Things and Big
Data (IoTBD 2016), 339–347.
Singhal, A., 2001. Modern Information Retrieval: A Brief
Overview. Bulletin of the IEEE Computer Society
Technical Committee on Data Engineering, 24(4):
35–43.
Schölkopf, B., Williamson, R., Smola, A., Shawe-Taylor,
J., Platt, J., 1999. Support Vector Method for Novelty
Detection. In Proc. 12th International Conference on
Neural Information Processing Systems (NIPS 1999),
582–588.
Smola, A.J., Schölkopf, B., 2004. A Tutorial on Support
Vector Regression. Statistics and Computing,
14(3): 199–222.
Zadrozny, P., Kodali, R., 2013. Big Data Analytics Using
Splunk: Deriving Operational Intelligence from Social
Media, Machine Data, Existing Data Warehouses, and
Other Real-Time Streaming Sources. Apress, Berkeley.
Zaharia, M., Chowdhury, M., Das, T. et al., 2012. Resilient
distributed datasets: a fault-tolerant abstraction for
in-memory cluster computing. In Proc. 9th USENIX
conference on Networked Systems Design and
Implementation (NSDI'12).
IoTBDS 2018 - 3rd International Conference on Internet of Things, Big Data and Security