Special Issue Paper
Analyzing performance traces using temporal formulas
Article first published online: 3 FEB 2014
Copyright © 2014 John Wiley & Sons, Ltd.
Software: Practice and Experience
Special Issue: Software Tools and Techniques for Monitoring and Prediction of Cloud Services
Volume 44, Issue 7, pages 777–792, July 2014
How to Cite
Ryckbosch, F. and Diwan, A. (2014), Analyzing performance traces using temporal formulas. Softw: Pract. Exper., 44: 777–792. doi: 10.1002/spe.2256
- Issue published online: 10 JUN 2014
- Manuscript Accepted: 1 JAN 2014
- Manuscript Revised: 26 JUL 2013
- Manuscript Received: 22 FEB 2013
Keywords
- performance analysis
- trace matching
- temporal logic
- large-scale online applications
- long-tail latencies
While profiling is invaluable for debugging performance problems that affect the common case, it is of little help in tracking performance problems that affect the slowest 1% of the operations (i.e., long-tail latencies). For Web service providers, these long-tail latencies affect both the cost of the service and the user experience. Because interactions between operations are often responsible for long-tail latency, we must analyze fine-grained traces to investigate their cause.
Unfortunately, analyzing traces is difficult because one needs to reason over long chains of events and because this reasoning often requires significant domain knowledge about what the event sequences mean. This paper shows how formulas in linear temporal logic can be used to analyze traces. Given such formulas, our system searches through traces to find matches and extracts relevant information from each match. We demonstrate that our system is scalable and that it enables us to investigate long-tail performance problems at Google.
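To make the idea of matching temporal formulas against a trace concrete, here is a minimal sketch in Python. It is not the system described in the paper: the finite-trace semantics, the combinator names (`atom`, `until`, `matches`, and so on), the event fields `op` and `ts`, and the sample trace are all assumptions made for illustration. The example formula flags a request during which a garbage collection starts before the request ends.

```python
from typing import Callable, Dict, List

# Hypothetical event record: {"op": <event name>, "ts": <timestamp in ms>}.
Event = Dict[str, object]
# A formula maps (trace, position) to True/False, using finite-trace semantics.
Formula = Callable[[List[Event], int], bool]

def atom(pred: Callable[[Event], bool]) -> Formula:
    """Holds at position i if the event there satisfies pred."""
    return lambda t, i: i < len(t) and pred(t[i])

def neg(f: Formula) -> Formula:
    return lambda t, i: not f(t, i)

def conj(f: Formula, g: Formula) -> Formula:
    return lambda t, i: f(t, i) and g(t, i)

def nxt(f: Formula) -> Formula:
    """X f: f holds at the next position (false at the end of the trace)."""
    return lambda t, i: i + 1 < len(t) and f(t, i + 1)

def eventually(f: Formula) -> Formula:
    """F f: f holds at some position from i onward."""
    return lambda t, i: any(f(t, j) for j in range(i, len(t)))

def always(f: Formula) -> Formula:
    """G f: f holds at every position from i onward."""
    return lambda t, i: all(f(t, j) for j in range(i, len(t)))

def until(f: Formula, g: Formula) -> Formula:
    """f U g: g eventually holds, and f holds at every position before that."""
    def check(t: List[Event], i: int) -> bool:
        for j in range(i, len(t)):
            if g(t, j):
                return True
            if not f(t, j):
                return False
        return False
    return check

def matches(formula: Formula, trace: List[Event]) -> List[int]:
    """Return every position in the trace at which the formula holds."""
    return [i for i in range(len(trace)) if formula(trace, i)]

# Illustrative atoms over the assumed "op" field.
is_start = atom(lambda e: e["op"] == "request_start")
is_end = atom(lambda e: e["op"] == "request_end")
is_gc = atom(lambda e: e["op"] == "gc_start")

# A request starts, and a GC begins before that request ends.
gc_during_request = conj(is_start, nxt(until(neg(is_end), is_gc)))

# Made-up sample trace: the second request overlaps a garbage collection.
trace: List[Event] = [
    {"op": "request_start", "ts": 0},
    {"op": "request_end", "ts": 5},
    {"op": "request_start", "ts": 10},
    {"op": "gc_start", "ts": 12},
    {"op": "gc_end", "ts": 40},
    {"op": "request_end", "ts": 45},
]

hits = matches(gc_during_request, trace)
# Extract information from each match, here the start timestamp.
hit_timestamps = [trace[i]["ts"] for i in hits]
```

On the sample trace, only the second request satisfies the formula, so `hits` contains the index of its `request_start` event. This sketch re-evaluates the formula from scratch at every position, which is quadratic in the worst case; a system meant to scale to large traces would need a more efficient matching strategy.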