Thursday, May 31, 2012

Should you self-publish? Should anybody?

Many colleagues have asked us about the experience of self-publishing our textbook.  In a previous post I talked about the DIY technology I harnessed to produce the actual artifacts (both the printed book and the ebook).  In this post I’ll talk about being self-sufficient and doing the other things publishers presumably do for you.

Advice: have a plan for proofreading and errata. I’ve never had a publisher, and I’m a stickler for writing, so proofreading with a fine-tooth comb is something I do anyway. But if it’s not something you do, you won’t have a publisher to help with that.  Dozens of minor errata were reported by readers; we used a Google Form (an HTML form backed by a Google Docs spreadsheet) to collect them.

Collecting errata has been challenging, because Amazon has its own mechanism that allows readers of Kindle books to report errors.  The information reported through that mechanism is relayed sporadically and not sanity-checked; factually incorrect “corrections” from readers are passed straight through, as are complaints from readers who aren’t sophisticated enough to operate their ebook reader devices.  However, those people are Amazon’s customers (and as an author, you are not), so we just have to learn to deal with this.

Advice: set your expectations for “service”. One thing a publisher normally handles is distribution. Amazon’s Kindle Direct Publishing handles that for the Kindle book, but until recently, their service & support for publishers was terrible. (Note that I don’t call it “customer service”. People who buy books are Amazon’s customers. Authors are not.) Recently, though, because of the highly visible success of our MOOC, which uses both the Kindle book and Amazon Web Services infrastructure, Amazon has become much more interested in speaking about strategic things with us, and has given us the level of support usually given to real publishers. We just made our book available on the Nook store and expect to sell it via Google Books starting next month; I’ll report back on whether they are any more responsive to indie authors.

Advice: tell your purchasers to follow you or otherwise let you notify them of updates. A key reason we wanted to do an ebook was the ability to get bug fixes and new content into readers’ hands quickly.  Each release of the book has a version number, starting from 0.8.0 in January 2012 up to 0.8.5 in May 2012.  We applied the errata fixes ourselves, using GitHub to track all book content and tagging the releases as we would with code, and every erratum has a corresponding version number.  Amazon initially told us they’d notify purchasers and allow them to re-download updated versions of the ebook, but they waffled.  (This is now fixed, but only by Amazon’s decision to give us special treatment.)  Without that support, I’m not sure we’d be able to get updates into readers’ hands.  Even so, while readers can now re-download updated versions of our book, it’s up to us (not Amazon) to notify readers when new versions are available.  We can use the MOOC registration email lists to hit many of those people, but others will have to find out for themselves.  We’re now encouraging readers to follow us on Twitter, and we’ll put that text into the next release of the book.
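The bookkeeping behind "every erratum has a corresponding version number" can be sketched in a few lines of Python. This is only an illustration under assumptions: the errata entries, field names, and function names below are hypothetical, and only the 0.8.x version numbering comes from the releases described above.

```python
def parse_version(text):
    """Turn a release string such as '0.8.5' into a comparable tuple (0, 8, 5)."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical errata log: each entry records the release in which it was fixed.
ERRATA = [
    {"id": 1, "note": "typo in chapter 1", "fixed_in": "0.8.1"},
    {"id": 2, "note": "broken cross-reference", "fixed_in": "0.8.3"},
    {"id": 3, "note": "wrong figure caption", "fixed_in": "0.8.5"},
]


def fixes_missing_from(reader_version, errata=ERRATA):
    """Return the errata fixed in releases newer than the reader's copy."""
    have = parse_version(reader_version)
    return [e for e in errata if parse_version(e["fixed_in"]) > have]


if __name__ == "__main__":
    # A reader holding release 0.8.2 is missing every fix made after it.
    for erratum in fixes_missing_from("0.8.2"):
        print(erratum["id"], erratum["note"], "fixed in", erratum["fixed_in"])
```

Comparing versions as integer tuples rather than as strings keeps the ordering correct if a release such as 0.8.10 ever follows 0.8.9.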

Advice: have a plan for spreading the word via professional forums.  We had already been spreading the word about our course: we had presented posters or talks at CSEET, SIGCSE and ICSE, wrote an article espousing the teaching of software engineering using SaaS+Agile, and so on.  We’d been collecting names of people who might be interested in trying out our ideas, so naturally we told them about the book too, and offered most of them complimentary copies.  (Unlike with a publisher, the cost of the comp copies comes out of pocket for us, though the print-on-demand house we use, CreateSpace, allows authors to purchase author copies at a price lower than list.)

Advice: be prepared to do your own feet-on-the-ground marketing and publicity.  My marketing experience as a nonprofit theater Board member came in handy.  Following the pattern I’ve used in that world, we designed a postcard and had it printed and direct-mailed by PSPrint to a mailing list we purchased (~600 software engineering professors). The list was ad hoc and included few top-tier departments; in retrospect I’m not sure it was worth the roughly $500 we paid. Successful practice in arts marketing is to follow the postcard with an email reminder a week or two later, but the firm that sold us the list wouldn’t sell us the corresponding email list, so we paid people to scrape the Web and collect the addresses manually (I know, we could’ve Turked it). Then we found out we couldn’t import those addresses into an email list manager such as the excellent MailChimp or ConstantContact, since due to CAN-SPAM laws you may only import email addresses of people who have directly opted in via your own website.  (There’s now an area on our book’s website where you can express interest in the beta program that uses the book.)  So we sent one-on-one emails to just about all those people (600 or so in all).  We also personally reached out to colleagues in top-25 departments with whom we had good personal relationships.  It wasn’t a huge amount of work but it was time not spent on writing.

Tuesday, May 29, 2012

How to visit San Francisco

If you’re staying in an urban area (SF, Berkeley, Oakland) don’t rent a car while you’re here.  If you’re staying in outlying areas, you might use a car to get to and park at a BART station, but parking and traffic in SF are a headache you don’t need.


1.  Get a Clipper card, a debit farecard that works on almost all transit systems in the Bay Area.  Ideally, order one online to be mailed to you (takes 5-7 business days), and you can immediately set up Autoload, which reloads your card from a credit card or bank account so you can “set and forget”.  Unused value never expires (though you cannot get it back as cash).  If you don’t have 5-7 days of lead time, you can buy this card at Walgreens drugstores anywhere in the Bay Area, or at the Muni vending machines in the underground stations at Embarcadero, Montgomery, Powell or Civic Center.  Unfortunately you can’t yet buy them at the airports.  Once you have the card, put some money on it at any Add Value machine in the underground stations or at any Walgreens.  Both methods let you use credit cards or cash.  When your trip is over, keep the card for your next visit.  (You’ll surely want to come back.)

Various separate agencies run Bay Area transit, but the Clipper card works on all of them.  BART runs fast trains that connect San Francisco, Oakland, Berkeley, and the outlying suburbs.  Muni runs the buses, trolleys, and “Metro” streetcar/subways in SF.  Other agencies run buses in other counties.  Caltrain runs a commuter rail system from downtown SF (station is adjacent to AT&T Park, the baseball stadium) to San Jose which stops all along the Peninsula.

2.  If you have a smartphone (iPhone, Android, …) bookmark NextBus, which provides real-time bus arrivals for most Bay Area agencies based on transceivers mounted in the buses.  It uses GPS to detect where you are and give you departures of nearby buses, or you can select a specific route, stop and direction.  You also may want the Embark iBART app, which has both schedule info and real-time arrivals for BART trains.

3.  Use Google Maps (or the built-in Maps app on the iPhone) to get public transit directions between any two points.  The route and connection info is accurate, but the specific connection times often aren’t because buses may be delayed during peak hours, etc.  The ideal app would combine the directions from Google Maps with bus departure times from NextBus, but as far as I know that app doesn’t exist yet.

4.  Become a member of Zipcar or CityCarShare, both of which have rent-by-the-hour stations all over the Bay Area.  If you’re already a member, everything should just work.  Zipcar has way more vehicles and pickup spots.

Tips on seeing specific things:

The best way to see the Golden Gate Bridge is to bike across, and take the ferry back from Sausalito or Tiburon.  Bike rentals are available next to the Hyatt Embarcadero Center hotel at the Embarcadero BART station and various locations along the Embarcadero between the Ferry Building and Fisherman’s Wharf.  The rentals include helmets, locks, and excellent maps.  If you don’t want to bike the bridge, the next best way to see it is to get to the bridge plaza on Golden Gate Transit bus 10 or 70, which is a lot faster than getting there on Muni since the GG Transit buses make very few stops in SF.

The Ferry Building (just outside Embarcadero BART station) was beautifully restored in the early 2000s and now hosts an urban-agriculture farmers market several days/evenings a week, at which all products must originate within 150 miles of the building.  It is well worth a visit.  Ferries do still sail from here; if the weather is nice, you can joyride the Alameda-Oakland Ferry, which makes a stop at Oakland Jack London Square and another in Alameda before returning to the Ferry Building (whole trip takes about an hour and you get great views of the Bay Bridge and the lay of the land).  A nice “triangle schlep route” is to take the ferry from the Ferry Building to Alameda/Oakland, end up at Pier 39, walk over to Ghirardelli Square and cable-car back downtown from there.

The Castro is famous for being the gay epicenter of SF, but it's also just a terrifically vibrant neighborhood with some of the best-preserved examples of Victorian architecture, with many houses dating to before the turn of the 20th century.  Get there by taking Muni Metro line J, K, L, or M.

SF has the US’s largest Chinatown, highlights of which include Waverly Place (Tin How temple dates back to Gold Rush days), Ross Alley (home of the tiny Golden Gate Fortune Cookie factory, where you can see them being handmade and eat them right off the production line), and the Vital Tea tasting room (sample a huge variety of traditional teas that will change your idea of what tea is).

Haight/Ashbury was the epicenter of the 1967 "Summer of Love" and is still a neighborhood whose beatnik/hippie roots are very prominent.  Get there by taking Muni bus #6 from downtown/Market Street.

Some of the best views of the city (in the late morning/early afternoon, before the fog rolls in) can be had from Diamond Heights/Twin Peaks and Lands End/the Cliff House out on the beach.

Have an outdoor lunch on a nice day at The Ramp (about a mile south of AT&T Park) or Gordon Biersch (on the Embarcadero, under the Bay Bridge).

Locals tend to steer clear of Fisherman’s Wharf, but it does have those fascinating sea lions, and other things in its general vicinity are worth visiting, like the Musée Mécanique containing hundreds of really old (some from the early 1900s) mechanical amusement devices, the USS Pampanito submarine, Fort Mason Park, the Maritime Museum, the recently restored Crissy Field, and the Municipal Pier.  Traffic is awful around there, so either bike there from the bike rental at Embarcadero, walk there from the Embarcadero BART (~20 minutes), or take the F-Market aboveground streetcar from in front of the Ferry Building.

There’s a fascinating collection of restored old ships and interpretive outdoor exhibits of SF’s maritime history at the Hyde Street Pier (just beyond Fisherman’s Wharf along the Embarcadero).  You can also get there on Muni bus 19-Polk or 45-Stockton.  Nearby is the lovely curving Municipal Pier (at the foot of Fort Mason Park, which has excellent views of the Golden Gate and Alcatraz).  A further hike along the length of Crissy Field will take you right to the foot of the Golden Gate Bridge south anchorage and historic Fort Point, now open to the public, a fort/lookout that guarded the Golden Gate before invaders were able to drive into San Francisco over the bridge.

Outside of SF, Berkeley is well worth a visit, for both the campus and the vaguely funky Telegraph Avenue area just south of the campus’s Sather Gate.  The best way to get there is to take BART to Downtown Berkeley station.

The best way to visit Golden Gate Park is Muni Metro N-Judah from downtown (runs along southern edge of park) or 44-O’Shaughnessy bus from our neighborhood (stops next to the Citibank right across from BART).

Dolores Park/Mission Dolores is the oldest original construction still standing in SF.  You can reach it via Muni Metro J-Church or by walking a few blocks from the 16th St./Mission BART station.

Thursday, May 24, 2012

Thanks, Mr. Ambani, for thwarting free online education in India

Given our efforts to provide free education in Software Engineering—an area where India is known to be a strong player—it was particularly disconcerting to read that numerous Web sites including Vimeo and Pastebin have been blocked in India at the ISP level due to a court order.

According to articles in the New York Times and the Wall Street Journal, it sounds like you should make your displeasure known to Reliance Entertainment, a media company largely controlled by media mogul Anil Ambani.

Other news sources report that some ISPs are still allowing Vimeo and Pastebin through, and that proxies and VPNs can be used to circumvent the blockage.

We don’t plan to move our content, and we’re confident the world’s largest democracy will respond appropriately to what amounts to censorship, as we said in our SaaS TV Chat this week.

$5 Amazon gift card if you recommend a printer that doesn’t suck

I’m trying to buy a replacement all-in-one printer for my mom and dad (they have G4 & G5 Macs, an iPad, and two iPhone 4s), and all the ones I’ve investigated (including those sold by Apple in their retail stores, which I take as an endorsement) have had one or more of the following problems:

  • Wifi advertised as a connectivity option, but doesn’t work, works flakily, or works for printing but not scanning (and you can’t activate both wifi and USB at once without redoing factory setup)

  • Gobbles ink, perhaps deliberately, and/or refuses to print or do ANYTHING when ANY of the ink cartridges is low; and increasingly, they can’t be refilled because they are chipped

  • Printing quality sucks

I’m looking for an all-in-one that has the following properties.  If I end up buying the one you recommend, I’ll send you a $5 Amazon gift card.

MUST HAVE features:

  • Works seamlessly with a Mac, without having to install obtrusive vendor software, which is usually among the worst quality software out there (we have 3rd-party scanning software such as VueScan already)

  • Can function with OS 10.5.x (doesn’t require ≥10.6)

  • Ink consumption isn’t deliberately rapacious and eco-hostile (can’t refill cartridges, can’t print at all when printer decides cartridges are low)

  • Decent print quality for photos—consumer grade is fine

HIGHLY DESIRABLE but not dealbreaker if absent:

  • Wifi connectivity that actually works and can be set up by mere mortals

  • AirPrint (print photos directly from iOS devices)

Friday, May 11, 2012

latex2ebook now on GitHub: make PDF & ebooks from same sources

I finally got around to extracting the complex toolchain used to create our textbook into a separate project.

You can check it out on GitHub as armandofox/latex2ebook

It’s far from complete, has many restrictions on what you can and cannot do, has many quirks arising from both LaTeX weirdnesses and the limitations of current ebook formats, and is scantily documented (I’ll add more docs as time permits).  Still, it should do what it advertises—allow you to create both a printable PDF and a .mobi format (Kindle, Sony and many other readers) ebook from the same set of LaTeX sources, if you follow some careful rules.

Next task is to add support for epub output.  Can anyone point me to succinct documentation for the epub format, and/or an open-source tool that takes the various epub assets and wraps them up into the .epub distribution file?
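For what it's worth, the wrapping step itself is small: an .epub file is a ZIP archive whose first entry must be a file named mimetype, stored uncompressed and containing the text application/epub+zip, with a META-INF/container.xml that points to the package file. Below is a minimal Python sketch of that step only. It assumes you already have an unpacked directory holding those files, and the function name build_epub is just illustrative; it is not part of latex2ebook.

```python
import os
import zipfile


def build_epub(source_dir, epub_path):
    """Zip an unpacked EPUB directory tree into a .epub file.

    Assumes source_dir already contains a 'mimetype' file and a
    META-INF/container.xml, and that epub_path lies outside source_dir.
    """
    mimetype_path = os.path.join(source_dir, "mimetype")
    with zipfile.ZipFile(epub_path, "w") as archive:
        # 'mimetype' must be the first entry and must not be compressed.
        archive.write(mimetype_path, "mimetype", compress_type=zipfile.ZIP_STORED)
        for folder, _subdirs, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                # Archive names always use forward slashes, whatever the OS.
                arcname = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written as the first entry
                archive.write(full_path, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

Calling build_epub("book", "book.epub") would package a directory named book. The sketch does not validate the contents, so a separate checker is still worth running on the result.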

Thursday, May 10, 2012

About UC Berkeley CS169 “Software Engineering”

Since I find myself periodically explaining our “reinvented” CS169 to my colleagues, and since our SaaS MOOC is based on it, I thought I’d write up this short description.  (The official link to the Berkeley course homepage is here, but it changes each semester depending on who is teaching it.)

Background. CS 169 is Berkeley’s upper-division (seniors and some juniors) software engineering course.  The way it’s taught varies widely depending on the instructor. This post describes how I teach it, often with the help of Dave Patterson and recently also Koushik Sen.  It’s not a required class in the major, but rather one of several classes that satisfy specific requirements such as a design project, technical communication, etc.

Prerequisites. This is the only undergrad course at Berkeley that claims to address the topic of Software Engineering.  As such, it’s ambitious and fast-paced, with five 1-week programming assignments, five quizzes, and a significant team project with an external customer, all in a single 14-week semester.  Students should be comfortable with at least one other language and with basic programming concepts such as object orientation, classes and inheritance, recursion, and higher-order functions.  Prior to this course, Berkeley CS students take CS61A Structure & Interpretation of Computer Programs, which introduces the four major programming paradigms (until recently using Abelson & Sussman’s awesome “wizard book”, now using Python); CS61B Data Structures using Java; and (usually) CS61C Great Ideas in Computer Architecture (aka Machine Structures), using C, MIPS assembly, and more Java (for a Hadoop assignment).

History. While working on the RAD Lab project (2006-2011), we needed SaaS apps to show off our machine-learning-based technology for automating various aspects of cloud operations.  Since no Berkeley course taught SaaS, we created an informal seminar course in 2007 to bootstrap a cohort of undergraduates to create showcase apps using Rails.  It was so popular that we offered it again, increasing the focus on TDD and good software practices.  Our colleague Paul Hilfinger then observed that we were well on the way to teaching the basics of Software Engineering in a format that students were enthusiastic about, so why not go all-out and teach CS 169 this way?  We agreed, and we did a “dry run” of the beefed-up course in Fall 2009, debugged it, and offered the “SaaS version” of CS 169 for the first time in Fall 2010.  Enrollments have been increasing by nearly 50% per offering.

Course       Term         Enrollment
CS 98/198    Spring ‘07   25
CS 194       Fall ‘08     35
CS 194       Fall ‘09     50
CS 169       Fall ‘10     75
CS 169       Spring ‘12   115
CS 169       Fall ‘12     180
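As a quick sanity check on the "nearly 50% per offering" figure, the growth between consecutive offerings can be computed directly from the enrollment numbers above. This is just illustrative arithmetic in Python; the function names are made up for the example.

```python
# Enrollment figures from the table above, in chronological order.
ENROLLMENTS = [
    ("CS 98/198", "Spring '07", 25),
    ("CS 194", "Fall '08", 35),
    ("CS 194", "Fall '09", 50),
    ("CS 169", "Fall '10", 75),
    ("CS 169", "Spring '12", 115),
    ("CS 169", "Fall '12", 180),
]


def growth_rates(rows):
    """Percentage change in enrollment between consecutive offerings."""
    counts = [count for _course, _term, count in rows]
    return [100.0 * (later - earlier) / earlier
            for earlier, later in zip(counts, counts[1:])]


def average_growth(rows):
    """Geometric-mean growth per offering, as a percentage."""
    counts = [count for _course, _term, count in rows]
    steps = len(counts) - 1
    return 100.0 * ((counts[-1] / counts[0]) ** (1.0 / steps) - 1.0)


if __name__ == "__main__":
    for rate in growth_rates(ENROLLMENTS):
        print(f"{rate:.1f}%")
    print(f"average per offering: {average_growth(ENROLLMENTS):.1f}%")
```

The step-by-step rates range from 40% to roughly 57%, and the geometric mean over the five steps comes out a little over 48%, which is consistent with the claim.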

Practices taught and course format. The course teaches software engineering techniques based on the Software Engineering Body of Knowledge (SWEBOK) using SaaS+Agile+cloud as the vehicle and Rails as the framework.  A partial list of what we cover includes test-driven development, behavior-driven/user-centric design, design patterns, legacy code and refactoring, deployment (including “SaaS Performance & Security 101”), and working effectively as part of a small team (using version control with branches, estimating progress toward customer-driven goals, work planning, etc.).  Our recent article explains why we believe these choices bridge the gap between what academics believe should be covered in software engineering courses and what industry wants to see in graduates of those programs.  (Contrary to what one might expect, leading software companies do not want us to become trade schools teaching specific tools, languages or frameworks; they want skills that transcend these, including dealing with legacy code and working in a team serving a nontechnical customer.)  We chose Rails because it has the best testing and code-grooming tools and its developer community places high value on beautiful code and thorough testing.  There are two lectures and one small-section meeting (~30-40 students) per week, weekly programming assignments, bi-weekly short-answer quizzes, no “big” midterm or final exam, and a 6 to 8 week course project.  We are experimenting with pair programming as well.

Course project. We work with on-campus organizations including The Berkeley Group to identify external customers (some nonprofits, some on-campus units, some others) whose needs could be addressed in part by a SaaS prototype.  “Two-pizza” teams of 4-6 students bid on the projects they’re most interested in and we match them up.  During each of four 2-week iterations, students meet with their customer, use lo-fi mockups and user stories to agree on goals, use BDD and TDD to develop new functionality and tests, and deploy to the public cloud on Heroku.  Per-iteration design/progress reviews with course staff (TAs) help identify problems and provide technical guidance where needed; we have found no substitute for this critical part of the software craftsmanship apprentice process.  (In Spring 2012, two full-time TAs monitored 25 teams; we hope to improve this ratio in Fall 2012.)  Each team’s progress is publicly tracked and estimated (and visible to customer and course staff) using the free Pivotal Tracker throughout, and grading is based heavily on (a) demonstrated responsiveness to customer feedback on deployed functionality, (b) demonstrated improvement in ability to estimate how much work will be completed by end of iteration, (c) sound use of agile processes as demonstrated by BDD scenarios (which Cucumber turns into integration tests), good test coverage, and reasonable complexity and beauty metrics (cohesion, lack of code smells, etc.) on code, which is publicly accessible on GitHub for review by course staff at any time.  At the end of the course, students present their work in a poster/demo session attended by course staff, the external customers, and invited guests such as industry practitioners and VCs.  Many students reported that their customers were so delighted that they were trying to hire the students to continue the work over the summer.  Two projects from Spring ’12 were already deployed in production with real users by the time the posters were presented.

We’ve started gathering screencasts and customer interviews highlighting representative projects; more are being added all the time.

(Coming soon: aggregate code statistics for Spring 2012 projects, including test coverage & code cleanliness metrics)

Scalable grading. Given the growth in popularity of the course and CS courses in general here, we had already been thinking of ways to scale the grading by repurposing testing and code grooming tools such as RSpec, Cucumber, Mechanize, reek, flog/flay, and Webdriver (Selenium) both to assess correctness of student code at a fine grain and give nontrivial feedback on code quality.  When we agreed to offer the first 5 weeks of the course as a massive open online course in Feb/Mar 2012, it forced the issue and made us sit down and write the autograders.  Of course, these are no substitute for actual interaction with an instructor; indeed, the autograders have freed up our teaching staff to focus on creating additional review material and holding design meetings.  (If you’re an instructor interested in participating in our in-classroom beta program, we’ll even run the autograders for you.)
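To give a flavor of what "assessing correctness at a fine grain" means, here is a toy sketch in Python. It is purely illustrative and is not the course's autograder, which is built on the Ruby tools named above: the grade function, the test-case format, and the sample submission are all hypothetical.

```python
def grade(submission, cases):
    """Run each (description, args, expected) case against a submitted function.

    Returns (score, feedback): the percentage of cases passed and one
    feedback line per case.
    """
    feedback = []
    passed = 0
    for description, args, expected in cases:
        try:
            actual = submission(*args)
        except Exception as error:  # a crash counts as a failed case
            feedback.append(
                f"FAIL {description}: raised {type(error).__name__}: {error}")
            continue
        if actual == expected:
            passed += 1
            feedback.append(f"PASS {description}")
        else:
            feedback.append(
                f"FAIL {description}: expected {expected!r}, got {actual!r}")
    score = 100.0 * passed / len(cases) if cases else 0.0
    return score, feedback


def student_is_palindrome(text):
    """Hypothetical (buggy) submission: forgets to ignore letter case."""
    return text == text[::-1]


# Hypothetical test cases for a palindrome exercise.
CASES = [
    ("simple palindrome", ("level",), True),
    ("non-palindrome", ("rails",), False),
    ("mixed case", ("Level",), True),
]

if __name__ == "__main__":
    score, feedback = grade(student_is_palindrome, CASES)
    print(f"score: {score:.1f}")
    for line in feedback:
        print(line)
```

Reporting one line per case, rather than a single pass/fail, is what lets a student see which behavior is missing; feedback on code quality would need separate tooling on top of this.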

Teaching assistant duties & prereqs. (Thanks to head TA Richard Xia for this info.)  The course is approximately split into two halves, with homeworks and quizzes dominating the first half and the project dominating the second half. During the first half, each TA runs a discussion section of ~30 students (1-2 hours/week + 2-4 hours to prep and review material), holds office hours (2 hours/week), monitors the online question forums on Piazza (4-6 hours), and handles miscellaneous tasks such as individual emails and regrade requests (4 hours).  In addition, for the first offering of the course the content creation included not only homeworks, quizzes, and section material, but also the grading rubrics for the autograders for each type of evaluation.  For the second half of the course, we converted most of the discussion sections into project meetings with the students in which we met with each group for about 10 minutes each week, so less time was spent preparing homework/quiz material and the section-prep time was replaced by evaluating project checkpoints.  A few additional hours per week were spent managing the online course, but as we fine-tune the material and autograder logistics, we expect that the online course can be managed by a single 10-hour-per-week TA, leaving the CS169 TAs free for direct interactions with the students, especially during the project.

Book. Modern software touches many subsystems of different types, each of which has historically been the focus of some CS subspecialty.  For example, SaaS encompasses datacenter computing, databases, OO programming, network security and performance, and user-centric design, plus nontechnical topics such as how to work with nontechnical customers and deliver a user-centric design. While there are many great (and not-so-great) books and online resources on each of these topics, a reading list cobbled together from them is impractical, lacks a “through-narrative”, and is very hard to get students to take seriously.  We finally decided in early 2011 to create our own book that would introduce enough of each topic to function as a SaaS engineer and weave them together in a way that made sense for a one-semester (or shorter) course. Our division of topics into largely-standalone subgroups allows instructors with less time or a less-sophisticated audience to select subsets of material appropriate for their needs.  We decided very early to self-publish to keep the price low (currently $10 ebook/$20 print book).  The book is now in its beta draft, and we expect the First Edition to be ready by Spring 2013.

Sunday, May 6, 2012

Does Amazon KDP want to engage authors or commoditize them?

We just released a new version of our textbook on May 1, 2012.

As we promised our alpha-edition buyers, we fixed hundreds of errata and added two new chapters.

(UPDATE: we’ve deleted the spreadsheet rows corresponding to the fixed errata, which numbered over 200.  So if you look at this list now you’ll find only newer errata reported in the last couple of weeks.  You can use Google Spreadsheets’ versioning feature to see an older version of the spreadsheet showing all the errata we fixed.)

Amazon Kindle Direct Publishing (KDP) had told us that when we did this, all we had to do was:

(a) email Amazon and have them notify readers of the changes and tell them that they could receive a free re-download

(b) email our readers (if we knew who they were) and tell them to contact Amazon customer support and request a free re-download

We have done (b), so Amazon can expect to hear from a lot of readers, especially given that I tried to do (a) two days ago and finally got a robot response from KDP saying “Send us a detailed list of your changes. If we agree they’re major, we’ll notify people.  If they cause formatting issues, we’ll pull your ebook” (as they did without telling us in January).

I tried to reply to this email, as instructed, with a description of the changes and a link to the errata page on which the problems were reported.  Indeed, Amazon itself had emailed us a number of haphazard complaints about the ebook’s content (some of which were factually wrong or themselves contained speling errerrs) during the 3 months since it was first released.

My reply email bounced.

If I were an Amazon customer having this experience trying to buy a book, someone would likely get fired.  Yet the only reason our previous author problems got any attention is that we used out-of-band research relationships to escalate them through an organization that until then had shown not the slightest sign of giving a shit about the concerns of authors.

Perhaps Amazon will become one more faceless channel for companies like BookLocker or Vook, competing against Google Books and iBooks in a race-to-the-bottom business.  Publishers may take 85 cents on the dollar, but at least they don’t respond with robot emails.  Amazon KDP should decide if it wants to positively engage authors, or commoditize them.  It can’t do both.

Look for the book on Google Books in the next week or so.