supports not only DVFS but power capping of
CPUs. However, RAPL itself does not cap the
power consumption of the entire node.
Towards the above-mentioned goal, this paper
proposes a hybrid power capping method that combines
a static method and a dynamic method. When the
application starts, we determine the initial frequency
statically based on our power and performance model.
Then, while the application is running, we dynamically
change the frequencies based on monitored power
consumption. This dynamic phase is introduced to
correct any power excess, which may be caused by
model errors. Through experiments using a compute
node with an NVIDIA GPU, we demonstrate that
neither the static approach nor the dynamic approach
can satisfy the goal alone, and that combining the two
is essential.
2 BACKGROUND
Controlling the power consumption of supercomputers,
which may reach the order of megawatts, is becoming
an important issue for protecting the environment
and reducing energy costs.
Power capping is even more important for systems
with accelerators, whose power fluctuation is larger.
As a step towards power capping for whole systems,
this paper focuses on power capping and energy saving
on a compute node equipped with CPUs and GPUs.
In previous HPC systems, it was more difficult
to obtain the power consumption of each node due to
the lack of smart power sensors or meters. Thus, in
order to estimate the power consumption of applications
on such nodes, statistical power models based on
performance counters have been constructed (Nagasaka
et al., 2010).
More recently, detailed monitoring of node power
consumption has become much easier due to the spread
of power sensors in computer systems. Real-time power
consumption can be obtained with interfaces such as
RAPL for Intel CPUs and the NVIDIA Management
Library (NVML) for NVIDIA GPUs. Such interfaces
are used for both power monitoring and control; for
example, the GPU clock speed can be configured by
using NVML.
Beyond processor-level power, with the spread of
inexpensive smart meters, the HPC community has
started to develop a specification for a power
monitoring and control API for HPC systems (Laros, 2014).
As an example of a working system, TSUBAME-KFC
(Endo et al., 2014), ranked No. 1 in the world
in the GREEN500 List², has a detailed monitoring
system, which can monitor not only CPU and GPU
power but also the AC power of each node at intervals
of one second.
In response to the spread of monitoring systems,
we design a power capping method that uses real-time
monitoring, as described in the next section.
3 PROPOSED POWER CAPPING
METHODS
In this section, we discuss two simple power-capping
methods, a dynamic method and a static method, and
then combine them into a hybrid method. Our
goal is to keep power consumption lower than a given
power budget during the execution of user applications.
Generally, it is hard to avoid instantaneous
excess of power; instead, we minimize the duration
for which the power consumption exceeds the limit
(hereafter, excess duration). Our goal also includes
the optimization of energy consumption while keeping
the excess duration to a minimum.
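To make the excess duration metric concrete, the following minimal sketch
computes it, together with the energy consumption, from a sampled power trace.
The function names, the fixed sampling interval, and the sample values are
hypothetical and serve only as an illustration; they are not taken from the
system evaluated in this paper.

```python
from typing import Sequence


def excess_duration(power_w: Sequence[float],
                    power_cap_w: float,
                    interval_s: float = 1.0) -> float:
    """Total time in seconds during which sampled power exceeds the cap.

    Assumes (for illustration) that samples are taken at a fixed interval
    and that each sample represents the power for the whole interval.
    """
    return sum(interval_s for p in power_w if p > power_cap_w)


def energy_joules(power_w: Sequence[float],
                  interval_s: float = 1.0) -> float:
    """Energy in joules, approximated as sum of power times interval."""
    return sum(p * interval_s for p in power_w)


# Hypothetical node power samples in watts, one per second.
samples = [180.0, 210.0, 205.0, 190.0, 220.0]
print(excess_duration(samples, power_cap_w=200.0))
print(energy_joules(samples))
```

With a 200 W cap, three of the five samples exceed the limit, so the excess
duration of this hypothetical trace is 3 seconds. A capping method is
preferable when it lowers this value first and the energy second.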
We currently focus on power capping of a single
node equipped with an NVIDIA GPU accelerator.
Our power capping method is designed to support
applications that have various characteristics; however,
we assume that a single application is running on a
node at a time. We also currently make the following
assumptions about the application: it mainly uses
GPUs for its computation, and a single CPU
core is mainly used for initializing the application
and controlling the GPU. The application uses GPU
kernel functions that have similar characteristics to
each other; thus, GPU power consumption while kernel
functions are running does not change drastically.
3.1 Dynamic Power Capping Method
Here we describe the power capping technique that
dynamically changes the clock speeds of GPUs. With this
method, we continuously monitor the power consumption
of the node, and if it reaches or exceeds the
power limit, we decrease the GPU clock speeds.
Conversely, if the power consumption is much lower
than the limit, we increase the speed.
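The control rule described above can be sketched as follows. This is a
minimal illustration under stated assumptions, not the implementation used in
our experiments: the function names, the margin that defines "much lower than
the limit", the list of clock speeds, and the power trace are all
hypothetical. The sketch also replays a fixed trace, whereas in a real system
the measured power would respond to each clock change.

```python
from typing import List, Sequence


def next_clock_index(power_w: float, power_cap_w: float,
                     clock_idx: int, num_clocks: int,
                     margin_w: float = 20.0) -> int:
    """Choose the next GPU clock index (clocks sorted from low to high).

    margin_w is a hypothetical threshold for "much lower than the limit".
    The clock moves by at most one step per call.
    """
    if power_w >= power_cap_w and clock_idx > 0:
        return clock_idx - 1          # at or over the limit: slow down
    if power_w < power_cap_w - margin_w and clock_idx < num_clocks - 1:
        return clock_idx + 1          # well under the limit: speed up
    return clock_idx                  # otherwise keep the current clock


def run_dynamic_capping(power_trace_w: Sequence[float], power_cap_w: float,
                        clocks_mhz: Sequence[int], start_idx: int,
                        margin_w: float = 20.0) -> List[int]:
    """Apply the rule once per monitored sample; return the chosen clocks."""
    idx = start_idx
    chosen = []
    for p in power_trace_w:
        idx = next_clock_index(p, power_cap_w, idx, len(clocks_mhz), margin_w)
        chosen.append(clocks_mhz[idx])
    return chosen


# Hypothetical clock steps (MHz) and node power samples (W).
clocks = [324, 614, 705, 758]
trace = [210.0, 205.0, 190.0, 170.0, 170.0]
print(run_dynamic_capping(trace, 200.0, clocks, start_idx=3))
```

For this hypothetical trace and a 200 W limit, the clock is lowered twice,
held while the power stays within the margin, and then raised again one step
per sample.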
While this dynamic method is simple, we still
have to take care of the control amount and the control
interval. As for the control amount, we adopt a simple
method that changes the clock by only a single step at
a time. As for the control interval, there is a tradeoff; if the
interval is too long, we may miss the sudden increase
² http://www.green500.org