Performance Computing Applications, 22(2):177–193.
Dawson, C. and Aizinger, V. (2005). A discontinuous
Galerkin method for three-dimensional shallow water
equations. Journal of Scientific Computing,
22(1–3):245–267.
Dietrich, J., Tanaka, S., Westerink, J., Dawson, C.,
Luettich, R. A., Jr., Zijlema, M., Holthuijsen, L., Smith, J.,
Westerink, L., and Westerink, H. (2012). Performance
of the unstructured-mesh, SWAN+ADCIRC model in
computing hurricane waves and surge. Journal of
Scientific Computing, 52(2):468–497.
Eeckhout, L., Bell, R. H., Stougie, B., De Bosschere, K.,
and John, L. K. (2004). Control flow modeling in
statistical simulation for accurate and efficient
processor design studies. In Proceedings of the 31st Annual
International Symposium on Computer Architecture,
2004, pages 350–361. IEEE.
Genbrugge, D. and Eeckhout, L. (2009). Chip
Multiprocessor Design Space Exploration through
Statistical Simulation. IEEE Transactions on Computers,
58(12):1668–1681.
Göddeke, D., Komatitsch, D., Geveler, M., Ribbrock, D.,
Rajovic, N., Puzovic, N., and Ramirez, A. (2013).
Energy efficiency vs. performance of the numerical
solution of PDEs: An application study on a low-power
ARM-based cluster. J. Comput. Phys., 237:132–150.
Imperas Software Ltd. (2015). OVP Guide to Using
Processor Models. Imperas Buildings, North Weston,
Thame, Oxfordshire, OX9 2HA, UK. Version 0.5,
docs@imperas.com.
Imperas Software Ltd. (2016). Description of Altera Cyclone V SoC.
http://www.ovpworld.org/library/wikka.php?wakka=AlteraCycloneVHPS.
Last visit on 31.03.2016.
ITMC TU Dortmund (2015). Official LiDO website.
https://www.itmc.uni-dortmund.de/dienste/hochleistungsrechnen/lido.html.
Last visit on 26.03.2015.
John Hardman (2016). Official NAS Parallel Benchmarks
Website. http://www.nas.nasa.gov/publications/npb.html.
Last visit on 12.04.2016.
KALRAY Corp. (2015). Official Kalray MPPA processor
website. http://www.kalrayinc.com/kalray/products/#processors.
Last visit on 31.03.2015.
Kerbyson, D. J. and Jones, P. W. (2005). A performance
model of the Parallel Ocean Program. International
Journal of High Performance Computing Applications,
19(3):261–276.
Miller, J., Kasture, H., Kurian, G., Gruenwald, C.,
Beckmann, N., Celio, C., Eastep, J., and Agarwal, A.
(2010). Graphite: A distributed parallel simulator for
multicores. In IEEE 16th International Symposium on
High Performance Computer Architecture (HPCA),
2010, pages 1–12.
Nair, R., Choi, H.-W., and Tufo, H. (2009).
Computational aspects of a scalable high-order discontinuous
Galerkin atmospheric dynamical core. Computers &
Fluids, 38(2):309–319.
NVIDIA Corp. (2015). Official NVIDIA SECO development
kit website. https://developer.nvidia.com/seco-development-kit.
Last visit on 31.03.2015.
Rajovic, N., Carpenter, P. M., Gelado, I., Puzovic, N.,
Ramirez, A., and Valero, M. (2013). Supercomputing
with commodity CPUs: Are mobile SoCs ready for
HPC? In Proceedings of the International Conference
on High Performance Computing, Networking,
Storage and Analysis, SC '13, pages 40:1–40:12, New
York, NY, USA. ACM.
Rajovic, N., Rico, A., Puzovic, N., Adeniyi-Jones, C., and
Ramirez, A. (2014). Tibidabo: Making the case for an
ARM-based HPC system. Future Generation Computer
Systems, 36(0):322–334.
Reuter, B., Aizinger, V., and Köstler, H. (2015). A
multi-platform scaling study for an OpenMP parallelization
of a discontinuous Galerkin ocean model. Computers &
Fluids, 117:325–335.
Ringler, T., Petersen, M., Higdon, R. L., Jacobsen, D.,
Jones, P. W., and Maltrud, M. (2013). A
multi-resolution approach to global ocean modeling. Ocean
Modelling, 69:211–232.
Schoenwetter, D., Ditter, A., Kleinert, B., Hendricks, A.,
Aizinger, V., Köstler, H., and Fey, D. (2015). Tsunami
and Storm Surge Simulation using Low Power
Architectures – Concept and Evaluation. In SIMULTECH
2015 - Proceedings of the 5th International
Conference on Simulation and Modeling Methodologies,
Technologies and Applications, pages 377–382.
Shu, C.-W. (2016). High order WENO and DG methods
for time-dependent convection-dominated PDEs:
A brief survey of several recent developments. Journal
of Computational Physics, 316:598–613.
Tanaka, S., Bunya, S., Westerink, J. J., Dawson, C., and
Luettich, R. A. (2011). Scalability of an unstructured
grid continuous Galerkin based hurricane storm surge
model. J. Sci. Comput., 46(3):329–358.
Wallcraft, A., Hurlburt, H., Townsend, T., and Chassignet,
E. (2005). 1/25 degree Atlantic Ocean simulation using
HYCOM. In Users Group Conference, 2005, pages 222–225.
Worley, P. and Levesque, J. (2004). The performance
evolution of the Parallel Ocean Program on the Cray X1. In
Proceedings of the 46th Cray User Group Conference,
pages 17–21.
Cache Aware Instruction Accurate Simulation of a 3-D Coastal Ocean Model on Low Power Hardware