Hierarchical Supporting Structure for Dynamic Organization in Many-core Computing Systems

Liang Guang, Syed M. A. H. Jafri, Bo Yang, Juha Plosila, Hannu Tenhunen

Abstract

This paper presents hierarchical supporting structures for dynamic organization in many-core computing systems. With profound hardware variations and unpredictable errors, dependability is becoming a challenging issue in emerging many-core systems. To provide fault tolerance against processor failures or performance degradation, dynamic organization is proposed, which allows clusters to be created and updated at run time. Hierarchical supporting structures are designed for each level of monitoring agents to enable the tracing, storing and updating of component and system status. These structures need to follow software/hardware co-design so that their overhead stays small and scalable while they accommodate the functions of the agents on the corresponding level. The paper covers the architectural design, functional simulation and implementation analysis. The study demonstrates that the proposed structures facilitate dynamic organization in case of processor failures and incur small area overhead on many-core systems.
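
The idea of run-time cluster updates can be illustrated with a minimal Python sketch. It is not the authors' design: the class names (`PlatformAgent`, `ClusterAgent`), the two-level hierarchy, the status table layout and the "take the lowest-numbered free processor" replacement policy are all assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    HEALTHY = "healthy"
    FAILED = "failed"


@dataclass
class ClusterAgent:
    """Hypothetical mid-level agent: stores the members of one cluster."""
    name: str
    members: set[int] = field(default_factory=set)


@dataclass
class PlatformAgent:
    """Hypothetical top-level agent: stores processor status and cluster tables."""
    status: dict[int, Status] = field(default_factory=dict)
    clusters: dict[str, ClusterAgent] = field(default_factory=dict)

    def add_processors(self, ids):
        # Register processors; all start out healthy in this sketch.
        for pid in ids:
            self.status[pid] = Status.HEALTHY

    def free_processors(self):
        # Healthy processors that are not assigned to any cluster.
        used = set()
        for cluster in self.clusters.values():
            used |= cluster.members
        return sorted(
            pid for pid, state in self.status.items()
            if state is Status.HEALTHY and pid not in used
        )

    def create_cluster(self, name, size):
        # Create a cluster at run time from currently free processors.
        free = self.free_processors()
        if len(free) < size:
            raise RuntimeError("not enough healthy free processors")
        self.clusters[name] = ClusterAgent(name, set(free[:size]))
        return self.clusters[name]

    def report_failure(self, pid):
        # Update the status table, then repair the affected cluster if possible.
        self.status[pid] = Status.FAILED
        for cluster in self.clusters.values():
            if pid in cluster.members:
                cluster.members.discard(pid)
                free = self.free_processors()
                if free:
                    cluster.members.add(free[0])
                    return free[0]
        return None  # no cluster affected, or no spare available


def demo():
    platform = PlatformAgent()
    platform.add_processors(range(6))
    platform.create_cluster("app0", 4)        # assumed policy picks 0, 1, 2, 3
    replacement = platform.report_failure(2)  # processor 2 fails at run time
    return replacement, sorted(platform.clusters["app0"].members)
```

Calling `demo()` marks processor 2 as failed and swaps in a free healthy processor, so the cluster keeps its size. In the paper the corresponding tables are realized through software/hardware co-design; this sketch only models the bookkeeping in software.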



Paper Citation


in Harvard Style

Guang L., Jafri S. M. A. H., Yang B., Plosila J. and Tenhunen H. (2013). Hierarchical Supporting Structure for Dynamic Organization in Many-core Computing Systems. In Proceedings of the 3rd International Conference on Pervasive Embedded Computing and Communication Systems - Volume 1: SANES, (PECCS 2013), ISBN 978-989-8565-43-3, pages 252-261. DOI: 10.5220/0004389702520261


in Bibtex Style

@conference{sanes13,
author={Liang Guang and Syed M. A. H. Jafri and Bo Yang and Juha Plosila and Hannu Tenhunen},
title={Hierarchical Supporting Structure for Dynamic Organization in Many-core Computing Systems},
booktitle={Proceedings of the 3rd International Conference on Pervasive Embedded Computing and Communication Systems - Volume 1: SANES, (PECCS 2013)},
year={2013},
pages={252-261},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004389702520261},
isbn={978-989-8565-43-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Pervasive Embedded Computing and Communication Systems - Volume 1: SANES, (PECCS 2013)
TI - Hierarchical Supporting Structure for Dynamic Organization in Many-core Computing Systems
SN - 978-989-8565-43-3
AU - Guang L.
AU - Jafri S. M. A. H.
AU - Yang B.
AU - Plosila J.
AU - Tenhunen H.
PY - 2013
SP - 252
EP - 261
DO - 10.5220/0004389702520261