Performance and power modeling of multi-tier Web services. These projects use statistical machine learning (SML) to develop composable models that predict the performance of multi-tier Internet services and answer what-if questions about them. Automatic Workload Evaluation (AWE) simultaneously predicts several aspects of system performance under a previously unseen workload, through a novel application of Kernel Canonical Correlation Analysis. On a real customer workload, our approach achieves predictions within 20% of measured values more than 80% of the time, even in cases where the database’s built-in query optimizer gives poor estimates. We also explored extending the technique to Hadoop jobs and autotuned concurrent scientific codes. Fingerprinting the Datacenter characterizes past datacenter performance crises using a compact, searchable representation of datacenter state. It builds on our earlier work with such representations of per-machine state, improving on it both in scale and in retrieval accuracy.
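The accuracy figure quoted above (predictions within 20% of measured values more than 80% of the time) can be made concrete with a small sketch. This is only one plausible reading of that metric, not the evaluation code used in the papers: the function name `fraction_within`, the use of relative error against the measured value, and the sample numbers are all assumptions made for illustration.

```python
def fraction_within(predicted, measured, tolerance=0.20):
    """Return the fraction of predictions within `tolerance` (relative) of the measured value.

    Hypothetical helper: relative error is taken against the measured value,
    which is an assumption about how "within 20%" is defined.
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if not measured:
        raise ValueError("need at least one measurement")
    hits = sum(
        1
        for p, m in zip(predicted, measured)
        if abs(p - m) <= tolerance * abs(m)
    )
    return hits / len(measured)


if __name__ == "__main__":
    # Made-up query runtimes in seconds, for illustration only.
    predicted = [10.0, 12.5, 30.0, 7.0, 100.0]
    measured = [10.0, 10.0, 20.0, 8.0, 90.0]
    print(fraction_within(predicted, measured))
```

Under this reading, a model meets the stated bar on a workload when `fraction_within(predicted, measured, 0.20)` exceeds 0.8 for the metric in question.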
- Students: Kristal Curtis, Yanpei Chen, Kaushik Datta (Par Lab)
- Collaborators: Moises Goldszmidt (Microsoft Research Silicon Valley, a RAD Lab Affiliate); Harumi Kuno, Umeshwar Dayal, Janet Wiener (HP Labs, a RAD Lab Affiliate)
- Alumni: Peter Bodik (now at Microsoft Research), Rean Griffith (postdoc, now at VMware), Archana Ganapathi (now at Splunk), Charles Sutton (RAD Lab postdoc, now professor at Univ. of Edinburgh)
Recent Papers: (PDFs can mostly be found here once camera-ready versions are submitted)
- Peter Bodík, Armando Fox, Michael Franklin, Michael Jordan, David Patterson. Characterizing, Modeling and Generating Workload Spikes for Stateful Services. Proc. SOCC 2010.
- Peter Bodík, Moises Goldszmidt, Hans Andersen, Armando Fox, Dawn Woodard. Fingerprinting the Datacenter: Automated Classification of Performance Crises. Proc. EuroSys 2010, to appear.
- Archana Ganapathi, Yanpei Chen, Randy Katz, Armando Fox, David Patterson. Statistics-Driven Workload Modeling for the Cloud. Proc. Workshop on Self-Managing Database Systems (SMDB 2010), to appear.
- Peter Bodík, Rean Griffith, Charles Sutton, Armando Fox, Michael Jordan, David Patterson. Automatic Exploration of Datacenter Performance Regimes. Proc. Workshop on Automatic Control for Datacenters and Clouds (ACDC 2009), Barcelona, Spain.
- Archana Ganapathi, Kaushik Datta, Armando Fox, David Patterson. Using Machine Learning to Auto-tune a Stencil Code on a Multicore Architecture. Proc. HotPar 2009.
- Archana Ganapathi, Harumi Kuno, Umeshwar Dayal, Janet Wiener, Armando Fox, Michael Jordan, David Patterson. Predicting Multiple Performance Metrics for Queries: Better Decisions Enabled by Machine Learning. Proc. ICDE 2009.
- Peter Bodík, Moisés Goldszmidt, Armando Fox. HiLighter: Automatically Building Robust Signatures of Performance Behavior for Small- and Large-Scale Systems. Proc. SysML 2008, San Diego, CA, December 2008.
- Peter Bodík, Charles Sutton, Armando Fox, David Patterson, Michael Jordan. Response-Time Modeling for Resource Allocation and Energy-Informed SLAs. Proc. SysML 2007, Vancouver, BC, December 2007.
The intersection of power management and performance is of particular interest to us. Datacenter operators want to save power only if doing so carries no risk of violating the SLA (e.g., because performance is slower in a lower-power mode). Our goal is therefore to construct SML models that predict performance (i.e., SLA compliance) from resource utilization and power state, allowing us to put parts of the system into a lower-power state without violating the SLA. Early results using nonlinear quantile regression show that we can keep the CPU in a low-power state for a higher percentage of the time than the CPU’s built-in power management policy (AMD PowerNow) does, while triggering few or none of the SLA violations caused by the built-in policy. Our eventual goal is for a collection of such analysis tools to inform the decisions of a “Datacenter Director” that makes global policy within the datacenter, in contrast to most current approaches, in which components manage their own power and often end up working at cross-purposes.
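To illustrate how a quantile prediction could feed a power decision, here is a minimal sketch. It is not the project's model: it shows only the pinball (quantile) loss that quantile regression minimizes in general, plus a hypothetical decision rule. The function names, the power-state labels, and the latency numbers are assumptions, and the predicted tail latencies are taken as given rather than fitted here.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss for quantile level tau in (0, 1).

    Under-predictions are weighted by tau and over-predictions by (1 - tau),
    so minimizing this loss targets the tau-quantile rather than the mean.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must be strictly between 0 and 1")
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def choose_power_state(predicted_tail_ms, sla_ms):
    """Return the lowest-power state whose predicted tail latency meets the SLA.

    predicted_tail_ms: list of (state_name, predicted latency quantile in ms),
    ordered from lowest to highest power (hypothetical state names).
    Falls back to the last (highest-power) state if no state meets the SLA.
    """
    for state, latency_ms in predicted_tail_ms:
        if latency_ms <= sla_ms:
            return state
    return predicted_tail_ms[-1][0]


if __name__ == "__main__":
    # Made-up predictions of a high latency quantile at the current utilization.
    predictions = [("low_power", 180.0), ("full_power", 90.0)]
    print(choose_power_state(predictions, sla_ms=200.0))
    print(choose_power_state(predictions, sla_ms=150.0))
```

Predicting a high quantile of latency, rather than its mean, matches the way SLAs are usually phrased (a bound that most requests must meet), which is why a lower-power state is chosen here only when its predicted tail latency stays under the bound.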