While working on the patterns, we discovered
that some of the introduced structural changes can
have a very different impact on different processor
architectures: the same change can significantly reduce
execution time on one processor while increasing it
on another. We also
observed that even optimizations with a positive
performance impact can negatively affect the
stability of the computing task and the maintainability and
portability of the codebase, which may make them
infeasible in practice. Software developers are not always
familiar with existing optimization practices, their
efficiency on particular CPU versions, or the efficient
use of modern CPU capabilities, which is why it is
particularly important to increase the number of best
practice patterns stored in the pattern repository. A larger
repository would also help us further improve the
methodology and the platform.
The platform also needs to be extended with
support for testing computing tasks on different CPU
architectures, which would require changes at
the hardware level and in the platform itself.
Another possible direction for evolving the
ReCoTOS platform is to add support for optimizing
computing tasks designed for edge computing. This
would require extending the cloud computing
platform with several edge nodes, as well as changes
to the container orchestration and to a number of
platform modules.
ACKNOWLEDGEMENTS
The research leading to these results has received
funding from the project "Competence Centre of
Information and Communication Technologies" of
EU Structural funds, contract No. 1.2.1.1/18/A/003
signed between IT Competence Centre and Central
Finance and Contracting Agency, Research No. 1.13
"Resource-saving Computing Task Optimization
Solutions".