Archive for category Cloud computing

Crossing the software education chasm

Dave Patterson and I wrote an extended op-ed in this month’s Communications of the ACM on how and why we reinvented UC Berkeley’s undergraduate software engineering course to bring it more in line with modern development methodologies.

When we wrote it, we didn’t even know we were going to offer this content as an online course. As it turns out, the same reasons we believe the course worked well at Berkeley also allowed us to offer it as a MOOC.

What are your thoughts?  Are you doing something similar at your institution?  Have suggestions for us?  Leave comments here or on the CACM article!

Finally made it into SCIENCE, albeit via the back door

A SCIENCE editor attending the National Academy of Engineering FRONTIERS event last September asked if I’d write a short Perspectives article on why scientists should check out cloud computing as a way to help with their work.

I did, and it appears in the January 28, 2011 issue.  You can download a single copy for personal, noncommercial use and without the right to redistribute by clicking here.

Undergrad projects in cloud computing

  • Write a SCADS client app in RoR—a clone of eBay, or some other interesting big-data app  (Lead: Amber or Allen)
  • Get a Rails environment running on the JRuby interpreter, with the ability to call existing SCADS client library functions, so RoR apps can run in-process with SCADS (Lead: Marcelo?)
  • Devise a Ruby gem that encapsulates SCADS functionality to wrap the above (Lead: Brandon)
  • Write a crawler for Twitter data and metadata and collect a large sample of it, then write MapReduce jobs to compute statistics such as the density of friendships and properties of the follower graph structure; the collected tweets can also be used to populate the SCADr database (Lead: Aaron or Tim)

Dynamic programming in the cloud

A HotOS 2009 talk and paper described “wave computing” on batch jobs (MapReduce style). The problem is that batch jobs often do wasteful I/O or computation when multiple workers solve identical subproblems. For example, “top 10 daily files” and “top 10 weekly files” are separate jobs, even though both read much of the same data.
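To make the daily/weekly example concrete, here is a minimal sketch in plain Python (not MapReduce, and not the paper's method). It assumes each day's log is just a list of accessed file names; the function names and sample data are made up for illustration. The point is only that the weekly answer can be built from per-day counts that the daily job already computed, instead of rescanning the raw logs.

```python
from collections import Counter

def daily_counts(log):
    """Count accesses per file for one day's log (a list of file names)."""
    return Counter(log)

def top_n(counts, n=10):
    """Return the n most frequently accessed files as (name, count) pairs."""
    return counts.most_common(n)

def weekly_counts(per_day):
    """Merge already-computed daily counts instead of rescanning raw logs."""
    total = Counter()
    for counts in per_day:
        total.update(counts)
    return total

# Hypothetical sample data: three days of access logs.
logs = [
    ["a.txt", "b.txt", "a.txt"],
    ["b.txt", "c.txt"],
    ["a.txt", "c.txt", "c.txt"],
]

per_day = [daily_counts(log) for log in logs]    # the only pass over raw data
daily_top = [top_n(counts, 2) for counts in per_day]
weekly_top = top_n(weekly_counts(per_day), 2)    # reuses the daily results
```

Here `per_day` plays the role of the shared subproblem: both the daily and the weekly "top" queries consume it, so the expensive scan happens once.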

They propose specific solutions to identify optimization opportunities, but the more general opportunity is supporting dynamic programming in the cloud. Their approach inspects the actual queries to determine automatically what the common subtasks might be; in some dynamic programming problems, though, you can express the shared subproblems explicitly.
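As a rough illustration of expressing subproblems explicitly, the sketch below has the caller declare which subproblems each problem depends on, and a small solver computes each distinct one exactly once. Everything here is an assumption for illustration: the `solve` helper, the `subproblems`/`combine` callbacks, and the toy lattice-path problem are hypothetical, and the in-memory dict merely stands in for whatever shared store cloud workers would consult.

```python
def solve(key, subproblems, combine, cache):
    """Solve `key` once; reuse cached results for repeated subproblems."""
    if key in cache:
        return cache[key]
    parts = [solve(k, subproblems, combine, cache) for k in subproblems(key)]
    cache[key] = combine(key, parts)
    return cache[key]

# Toy problem: count monotone lattice paths from (0, 0) to (r, c).
def path_subproblems(key):
    """Explicitly name the subproblems that `key` depends on."""
    r, c = key
    if r == 0 or c == 0:
        return []                      # base case: a single straight path
    return [(r - 1, c), (r, c - 1)]

def path_combine(key, parts):
    """Base cases contribute 1; otherwise sum the subproblem results."""
    return sum(parts) if parts else 1

cache = {}                             # stand-in for a shared result store
paths = solve((2, 2), path_subproblems, path_combine, cache)
```

Because the dependencies are declared up front, the solver needs no query analysis to find the overlap: `(1, 1)` is requested by both `(1, 2)` and `(2, 1)` but is computed only once.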