Authors:
Shay Horovitz (1); Yair Arian (2) and Noam Peretz (2)
Affiliations:
(1) School of Computer Science, College of Management Academic Studies, Israel; (2) Huawei, China
Keyword(s):
Application Faults, Transaction, Cloud, Insights, Online.
Abstract:
Performance debugging and fault isolation in distributed cloud applications are difficult and complex. Existing Application Performance Management (APM) solutions allow manual investigation across a huge space of metrics, topology, functions, service calls, attributes and values, which is a frustrating task that demands both time and resources. It would be beneficial to gain explainable insights about a faulty transaction, whether the fault is an error or a performance degradation, such as specific attributes and/or URL patterns that are correlated with the problem and can characterize it. Yet this is challenging: it requires substantial storage, memory and processing resources, and insights are expected as soon as the problem occurs. Because cloud resources are limited and expensive, supporting a large number of applications with many transaction types is impractical. We present Perceptor, a system for online automatic characteristics discovery of faulty application transactions in the cloud. Perceptor discerns attributes and/or values correlated with transaction error or performance degradation events. It does so with minimal resource consumption by using sketch structures and operating in streaming mode. Over an extensive set of experiments in the cloud, with various applications and transactions, Perceptor discovered non-trivial, relevant fault properties that were validated by an expert.
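To make the idea of sketch-based streaming correlation concrete, the following is a minimal illustrative sketch, not Perceptor's actual implementation. The abstract only says "sketch structures"; the choice of a count-min sketch, the lift score, and every name below (`CountMinSketch`, `FaultCharacterizer`, `observe`, `lift`) are assumptions made for illustration. It counts attribute values over all transactions and over faulty ones in fixed memory, then scores how over-represented a value is among the faulty transactions.

```python
import hashlib


class CountMinSketch:
    """Fixed-memory approximate counter (may overestimate, never underestimates)."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _indexes(self, key):
        # One independent hash per row, derived by salting BLAKE2b with the row id.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"),
                digest_size=8,
                salt=row.to_bytes(8, "little"),
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row, col in self._indexes(key):
            self.rows[row][col] += count

    def estimate(self, key):
        return min(self.rows[row][col] for row, col in self._indexes(key))


class FaultCharacterizer:
    """Hypothetical helper: tracks attribute=value frequencies per stream."""

    def __init__(self):
        self.all_counts = CountMinSketch()
        self.fault_counts = CountMinSketch()
        self.all_total = 0
        self.fault_total = 0

    def observe(self, attributes, faulty):
        # Process one transaction at a time; nothing is stored per transaction.
        self.all_total += 1
        if faulty:
            self.fault_total += 1
        for name, value in attributes.items():
            key = f"{name}={value}"
            self.all_counts.add(key)
            if faulty:
                self.fault_counts.add(key)

    def lift(self, name, value):
        # Share of the value among faulty transactions divided by its overall share.
        key = f"{name}={value}"
        overall = self.all_counts.estimate(key)
        if self.fault_total == 0 or overall == 0:
            return 0.0
        fault_share = self.fault_counts.estimate(key) / self.fault_total
        overall_share = overall / self.all_total
        return fault_share / overall_share


# Usage: every tenth transaction is faulty and comes from a made-up region.
characterizer = FaultCharacterizer()
for i in range(100):
    faulty = i % 10 == 0
    region = "eu-west" if faulty else "us-east"
    characterizer.observe({"region": region, "method": "GET"}, faulty)

region_lift = characterizer.lift("region", "eu-west")
method_lift = characterizer.lift("method", "GET")
```

A value with lift well above 1 (here the made-up `region=eu-west`) is a candidate characteristic of the fault, while a value present in every transaction (`method=GET`) scores about 1. Memory use depends only on the sketch dimensions, not on the number of transactions or distinct values.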