Archive for category Systems research

New RAD Lab papers

We continue to make progress on applying machine learning to problems in deploying and operating datacenter-scale systems…

  • Peter Bodik’s paper on “Fingerprinting the Datacenter” (joint work with Moises Goldszmidt at Microsoft Research Silicon Valley and Dawn Woodard at Cornell) was accepted to EuroSys 2010, where I’ll also be giving a tutorial on Web 2.0 applications;
  • Wei Xu presented an online version of his work on data mining of console logs (joint with Ling Huang at Intel Research Berkeley) at ICDM 2009 last month;
  • Dr. Archana Ganapathi filed her PhD dissertation (yay!!) and just had a paper on statistics-driven workload modeling for cloud jobs like Hadoop (joint work with Yanpei Chen) accepted to the Self-Managing Database Systems workshop (SMDB 2010);
  • The RAD Lab will be featured in the VMware GoVirtual webzine later this month. Stay tuned!

…and of course we are planning submissions to SOCC and WebApps as well. See the students’ pages or my project pages for more details!

I’d like to disabuse early-career grad students of certain misconceptions…

  1. You are rarely the best judge of the most important material or best presentation strategy for your talk. Corollary: Give one or more practice talks.
  2. Writing is much harder than you think. Corollary 1: You are not that great a writer. Corollary 2: If you don’t have a solid draft 1-2 weeks before the conference deadline, you’re starting with 2 strikes.
  3. 80% or more of submitted papers are rejected. Corollary: You need feedback from colleagues and outsiders to improve your paper. A poor way to get feedback is to submit the paper, wait 6 months, and get a rejection with cryptic reviews. A better way is left as an exercise to the reader. (Thanks to Mike Franklin for this particular way of looking at the “get feedback” issue.)
  4. When you write up your work, remember that nobody cares what you did, only why it advances the state of the art. Edit accordingly. Corollary: write and edit an outline and paragraph map before you start writing prose. It’s much easier to rearrange or eliminate material at this level than at the prose level.
  5. The reviewer has 20 other papers waiting to be reviewed and is looking for a reason to set yours aside and move on. Corollary: your job is to ensure no such opening is provided—whether by unsupported statements, poor writing, rambling style, etc.
  6. More coming soon.

E-filing your PhD thesis? Why not file your VM as well?

UC Berkeley has finally started accepting electronic (PDF) thesis filing. The trees thank them. I remember, though, that shortly after I filed my (hardcopy) thesis, I lost the ability to even regenerate the PDF from the LaTeX sources: I didn’t have the right packages, some figures didn’t get tarred up properly, etc. And as for trying to run the sizable chunks of software that others and I built and reported on… fuhggedaboudit.

But hey, with disk space practically free now, if I were graduating today I would also “file” a copy of the VM images used to format my thesis and run the experiments. Some of my students are doing cloud computing research, so some of their VMs are already stored as Amazon AMIs, but why not snapshot a VM image of their laptop as well? We’d be one step closer to truly reproducible results in CS research.

Undergrad projects in cloud computing

  • Write a SCADS client app in RoR—a clone of eBay, or some other interesting big-data app  (Lead: Amber or Allen)
  • Get a Rails environment running on the JRuby interpreter, with the ability to call existing SCADS client library functions, so RoR apps can run in-process with SCADS (Lead: Marcelo?)
  • Devise a Ruby gem that encapsulates SCADS functionality to wrap the above (Lead: Brandon)
  • Write a crawler for Twitter data and metadata; collect a bunch of it, then write MapReduce jobs to compute statistics such as the density of friendships and structural properties of the follower graph, and use the collected tweets to populate the SCADr database (Lead: Aaron or Tim)
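
To give a feel for the "density of friendships" statistic in the last project, here is a minimal sketch in plain Python that mimics the map/reduce structure in-process. Everything in it is an assumption for illustration: the record format (follower, followee), the function names, and the definition of density as distinct directed edges divided by n(n-1). It is not SCADS, Hadoop, or Twitter API code, and funneling all values for one key through a single reducer would not scale to real data.

```python
from collections import defaultdict

def map_follow(record):
    """Map step: one (follower, followee) record emits an edge and both users."""
    follower, followee = record
    yield ("edge", (follower, followee))
    yield ("user", follower)
    yield ("user", followee)

def reduce_distinct(key, values):
    """Reduce step: count the distinct values seen for a key."""
    return key, len(set(values))

def run_mapreduce(records, mapper, reducer):
    """Toy single-process driver: group mapper output by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())

def follower_density(records):
    """Directed-graph density: distinct edges / (n * (n - 1))."""
    counts = run_mapreduce(records, map_follow, reduce_distinct)
    n = counts.get("user", 0)
    e = counts.get("edge", 0)
    if n < 2:
        return 0.0  # density is undefined for fewer than two users
    return e / (n * (n - 1))

# Hypothetical crawl output: a follows b and c, b follows a.
sample = [("a", "b"), ("b", "a"), ("a", "c")]
density = follower_density(sample)
```

A real job would split the user count and the edge count across many reducers (or use combiners), but the shape of the computation stays the same.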

Dynamic programming in the cloud

A HotOS 2009 paper and talk described “wave computing” on batch jobs (MapReduce style). The problem is that batch jobs often do wasteful I/O or computation when multiple workers solve identical subproblems. For example, “top 10 daily files” and “top 10 weekly files” run as separate jobs even though they read much of the same input.

The authors propose specific techniques for identifying optimization opportunities, but the more general opportunity is supporting dynamic programming in the cloud. Their approach inspects the actual queries to infer automatically what the common subtasks might be, whereas in some dynamic programming problems you can state the shared subproblems explicitly.
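
Here is a rough sketch of what "expressing the shared subproblem explicitly" could look like for the daily/weekly example, using ordinary memoization in Python. This is my own illustration, not the approach from the HotOS paper: the access log, the function names, and the SCANS counter are all hypothetical, and a single-process cache stands in for whatever a cloud framework would do to share results across jobs.

```python
from collections import Counter
from functools import lru_cache

# Hypothetical access log: day -> file names requested that day.
ACCESS_LOG = {
    "mon": ["a", "b", "a"],
    "tue": ["b", "c", "b"],
}

# Tracks how many times each day's raw log is actually scanned.
SCANS = Counter()

@lru_cache(maxsize=None)
def daily_counts(day):
    """Shared subproblem: per-file request counts for one day (memoized)."""
    SCANS[day] += 1
    return Counter(ACCESS_LOG[day])

def top_daily(day, k=10):
    """'Top k daily files' job: reads the memoized per-day counts."""
    return daily_counts(day).most_common(k)

def top_weekly(days, k=10):
    """'Top k weekly files' job: sums the same per-day counts, no rescan."""
    total = Counter()
    for day in days:
        total.update(daily_counts(day))  # adds counts; cached value is not mutated
    return total.most_common(k)

monday_top = top_daily("mon", k=1)
week_top = top_weekly(("mon", "tue"), k=1)
# After both jobs, each day's log has been scanned exactly once.
```

The point is only that once the subproblem (per-day counts) has a name, both jobs can depend on it and the expensive scan happens once per day instead of once per job.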

Add failure injection to Cloudstone

At the Cloud Computing Workshop this month we’ll be presenting Cloudstone, a Web 2.0 “social events” app in 2 implementations (Rails & PHP) complete with a workload generator and test automation scripts. The idea is that it can be used as a realistic Web 2.0 app with realistic workloads for benchmarking cloud computing, recovery/scaling scenarios, etc.

A great addition would be scripts that inject various kinds of failures, both app-level (e.g., DB timeout or connection reset) and machine-level (e.g., a machine shuts down unexpectedly, or suffers heavy packet loss or other I/O interference), to test datacenter automation scenarios designed to deal with these problems under load.
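
To make the app-level half concrete, here is a minimal sketch of a fault injector in Python. It is not part of Cloudstone (which is Rails and PHP); the class name, the rate parameters, and the idea of wrapping each DB call are assumptions for illustration. Machine-level failures such as shutdowns or packet loss would need OS or network tooling and are not covered here.

```python
import random

class FaultInjector:
    """Wraps calls and, with configured probabilities, raises a fault instead."""

    def __init__(self, timeout_rate=0.0, reset_rate=0.0, seed=None):
        if timeout_rate < 0 or reset_rate < 0 or timeout_rate + reset_rate > 1.0:
            raise ValueError("rates must be non-negative and sum to at most 1.0")
        self.timeout_rate = timeout_rate
        self.reset_rate = reset_rate
        self.rng = random.Random(seed)  # seedable so test runs are repeatable
        self.injected = {"timeout": 0, "reset": 0}

    def call(self, func, *args, **kwargs):
        """Either inject a simulated failure or run func normally."""
        roll = self.rng.random()  # uniform in [0.0, 1.0)
        if roll < self.timeout_rate:
            self.injected["timeout"] += 1
            raise TimeoutError("injected DB timeout")
        if roll < self.timeout_rate + self.reset_rate:
            self.injected["reset"] += 1
            raise ConnectionResetError("injected connection reset")
        return func(*args, **kwargs)

def fetch_events():
    """Stand-in for a real DB query in the app under test (hypothetical)."""
    return ["event-1", "event-2"]

# Roughly 10% timeouts and 5% resets on wrapped calls.
injector = FaultInjector(timeout_rate=0.10, reset_rate=0.05, seed=1)
outcomes = {"ok": 0, "failed": 0}
for _ in range(100):
    try:
        injector.call(fetch_events)
        outcomes["ok"] += 1
    except (TimeoutError, ConnectionResetError):
        outcomes["failed"] += 1
```

Keeping a count of injected faults makes it possible to check afterward that the automation under test noticed as many failures as were actually injected.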

Email me if you want to work on this!
