Authors:
Pengmiao Zhang 1; Rajgopal Kannan 2; Anant V. Nori 3 and Viktor K. Prasanna 1
Affiliations:
1 University of Southern California, U.S.A.; 2 US Army Research Lab-West, U.S.A.; 3 Processor Architecture Research Lab, Intel Labs, U.S.A.
Keyword(s):
Attention, Memory Access Prediction, Graph Analytics.
Abstract:
Graphs are widely used to represent real-life applications including social networks, web search engines, bioinformatics, etc. With the rise of Big Data, graph analytics offers significant potential for exploring challenging problems on relational data. Graph analytics is typically memory-bound. One way to hide memory access latency is through data prefetching, which relies on accurate memory access prediction. Traditional prefetchers with pre-defined rules cannot adapt to the complex memory access patterns of graph analytics. Recently, Machine Learning (ML) models, especially Long Short-Term Memory (LSTM), have shown improved performance for memory access prediction. However, existing models have shortcomings, including unstable LSTM models, interleaved patterns in labels based on consecutive deltas (differences between addresses), and large output dimensions. We propose A2P, a novel attention-based memory access prediction model for graph analytics. We apply multi-head attention to extract features, which are easier to train than LSTM. We design a novel bitmap labeling method, which collects future deltas within a spatial range and makes the patterns easier to learn. By constraining the prediction range, bitmap labeling provides up to 5K× compression of the model output dimension. We further introduce the novel concept of a super page, which allows model predictions to extend beyond the boundary of a physical page. For the widely used GAP benchmark, our results show that for the top three predictions, A2P outperforms the state-of-the-art LSTM-based model by 23.1% w.r.t. Precision, 21.2% w.r.t. Recall, and 10.4% w.r.t. Coverage.
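To illustrate the bitmap labeling idea described in the abstract, the following is a minimal sketch (not the authors' implementation): future access deltas within a spatial range set bits in a fixed-length multi-hot label, so the output dimension depends only on the range, not on the number of distinct deltas in the trace. The cache-line size, spatial range, and lookahead window below are illustrative assumptions.

```python
# Minimal sketch of bitmap labeling for memory access prediction.
# Assumed parameters (not from the paper): 64-byte cache lines,
# a spatial range of +/-128 lines, and a 16-access lookahead window.
import numpy as np

LINE_BYTES = 64          # assumed cache-line size
RANGE_LINES = 128        # assumed spatial range: deltas in [-128, +128) lines
LOOKAHEAD = 16           # assumed number of future accesses to label

def bitmap_label(addresses, t):
    """Build a fixed-length multi-hot label for the access at index t.

    Each future access within the lookahead window whose line delta falls
    inside the spatial range sets one bit; the bitmap length (2 * RANGE_LINES)
    bounds the model's output dimension regardless of trace size.
    """
    label = np.zeros(2 * RANGE_LINES, dtype=np.uint8)
    base_line = addresses[t] // LINE_BYTES
    for addr in addresses[t + 1 : t + 1 + LOOKAHEAD]:
        delta = addr // LINE_BYTES - base_line
        if -RANGE_LINES <= delta < RANGE_LINES:
            label[delta + RANGE_LINES] = 1   # shift so the bit index is non-negative
    return label

# Example on a short synthetic address trace
trace = [0x1000, 0x1040, 0x10C0, 0x1100, 0x2000, 0x1180]
print(bitmap_label(trace, 0))
```

Under this sketch, a prediction range of ±128 lines yields a 256-bit label, which is how constraining the range compresses the output dimension relative to predicting raw delta values.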