of Facebook, it is observed that the structure and the
mobility of the data are addressed, while the modality
and security perspectives of the data in the
architecture are not clarified.
3.2.2 Twitter
The Twitter data is often streaming and unstructured
(Mishne et al., 2013). Data security does not have an
impact on the architectural design. Similarly,
modality does not play a significant role: the
architectural components are specialized for textual
data rather than visual or audio data. Query
processing is one of the main functionalities of the
system, and supporting optimization strategies are
applied to it. Real-time query processing in
particular is the main objective of the system, while
the architecture also sufficiently meets the batch
processing requirements. In addition, the system
employs metadata management.
The data integration capability of Twitter is limited.
The first implementation of the Twitter system was
Hadoop-based, which did not meet latency
requirements. Therefore, the second implementation
employs a custom in-memory processing engine.
Ingestion is not mentioned in
the architectural descriptions. The Data Processing
feature of Twitter is presented in Figure 11.
Figure 11: Feature diagram of the Data Processing
feature of Twitter.
In terms of data analysis, besides the querying
functionalities, optimization is also emphasized in
Twitter's software architecture. Moreover, this
architecture is also capable of sophisticated ad hoc
analyses that are designed to answer specific
questions.
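The characterization of Twitter given in this section
can be recorded as a small feature configuration. The
following Python sketch is illustrative only: the
feature names, the Status categories and the
features_with helper are assumptions introduced here
and are not taken from the feature diagram. The
statuses and notes paraphrase the description above.

```python
from enum import Enum


class Status(Enum):
    ADDRESSED = "addressed"
    LIMITED = "limited"
    NO_IMPACT = "no impact on the design"
    NOT_MENTIONED = "not mentioned"


# Feature names are hypothetical labels (assumption); each status and
# note paraphrases the Twitter description in this section.
TWITTER_FEATURES = {
    "data structure": (Status.ADDRESSED, "unstructured"),
    "data mobility": (Status.ADDRESSED, "streaming"),
    "data modality": (Status.NO_IMPACT, "textual only"),
    "data security": (Status.NO_IMPACT, ""),
    "real-time query processing": (Status.ADDRESSED, "main objective"),
    "batch processing": (Status.ADDRESSED, "sufficient"),
    "query optimization": (Status.ADDRESSED, ""),
    "metadata management": (Status.ADDRESSED, ""),
    "data integration": (Status.LIMITED, ""),
    "ingestion": (Status.NOT_MENTIONED, ""),
}


def features_with(status, features=TWITTER_FEATURES):
    """Return the sorted names of all features that carry the given status."""
    return sorted(name for name, (s, _note) in features.items() if s is status)


if __name__ == "__main__":
    # Print one line per status with the features that carry it.
    for status in Status:
        print(f"{status.value}: {', '.join(features_with(status))}")
```

Grouping the features by status makes the gaps in the
architectural descriptions, such as ingestion, easy to
spot when several systems are compared.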
As a result, we observed that while all the
top-level features are covered by the architectures
of the two systems (Facebook and Twitter), the Big
Data security-related features could not be derived
from the architectural descriptions of the systems.
4 CONCLUSIONS
In this study, we have derived a feature model for Big
Data systems using a systematic domain analysis
process. Based on selected relevant papers, we have
been able to derive both the common and variant
features of Big Data systems and represent them as a
feature diagram. We have discussed the features
separately and illustrated the adoption of the feature
model for characterizing parts of two different
systems, namely Facebook and Twitter. The
feature diagram provides an initial insight into the
overall configuration space of Big Data systems. As
future work, we plan to illustrate our approach for
deriving Big Data architectures.
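To illustrate how a feature model with common and
variant features constrains the configuration space,
the following Python sketch may help. It is a toy
example under stated assumptions: the Feature class,
the violations function, the feature names and their
mandatory or optional assignments are hypothetical
and do not reproduce the feature diagram derived in
this study.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    name: str
    mandatory: bool = True  # True: common feature, False: variant feature
    children: list = field(default_factory=list)


def violations(feature, selected):
    """List rule violations of a feature selection against the feature tree.

    Rules checked: a selected feature must include its mandatory children,
    and a child may only be selected together with its parent.
    """
    problems = []
    for child in feature.children:
        if feature.name in selected and child.mandatory and child.name not in selected:
            problems.append(f"missing mandatory feature: {child.name}")
        if child.name in selected and feature.name not in selected:
            problems.append(f"{child.name} selected without parent {feature.name}")
        problems.extend(violations(child, selected))
    return problems


# Hypothetical toy model (assumption), not the feature model of the paper.
MODEL = Feature("Big Data System", children=[
    Feature("Data Processing", children=[
        Feature("Query Processing"),
        Feature("Batch Processing", mandatory=False),
    ]),
    Feature("Data Security", mandatory=False, children=[
        Feature("Encryption", mandatory=False),
    ]),
])

if __name__ == "__main__":
    config = {"Big Data System", "Data Processing", "Query Processing"}
    print(violations(MODEL, config))
```

A configuration is considered valid here when
violations returns an empty list. Cross-tree
constraints such as requires and excludes are
deliberately left out of this sketch.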
REFERENCES
Araújo, J., Baniassad, E., Clements, P., Moreira, A.,
Rashid, A., & Tekinerdogan, B., 2005. Early aspects:
The current landscape. Technical Notes, CMU/SEI
and Lancaster University.
Arango, G., 1994. Domain Analysis Methods. In Software
Reusability, W. Schäfer, R. Prieto-Díaz, and M.
Matsumoto (Eds.), Ellis Horwood, New York,
pp. 17-49.
Ballard, C., Compert, C., Jesionowski, T., Milman, I.,
Plants, B., Rosen, B., & Smith, H., 2014. Information
Governance Principles and Practices for a Big Data
Landscape, IBM Redbooks.
Chapelle, D., 2013. Big Data & Analytics Reference
Architecture, An Oracle White Paper.
Czarnecki, K., Hwan, C., Kim, P., & Kalleberg, K. T.,
2006. Feature models are views on ontologies. In
Software Product Line Conference, 2006 10th
International (pp. 41-51). IEEE.
Geerdink, B., 2013. A reference architecture for big data
solutions introducing a model to perform predictive
analytics using big data technology. In Internet
Technology and Secured Transactions (ICITST), 2013
8th International Conference for (pp. 71-76). IEEE.
Harsu, M., 2002. A survey on domain engineering.
Tampere University of Technology.
Kang, K. C., Cohen, S. G., Hess, J. A., Novak, W. E., &