Showing posts with label joel spolsky. Show all posts

Friday, April 30, 2010

The Ramp

I love stories about performance problems. Recently, my friend Debra Lilley sent me this one:
I went to see a very large publishing company about 6 months after they went live. I asked them what their biggest issue was, and they told me querying in GL was very slow, which I was able to fix quite easily. (There was a very simple concatenated index trick for the Chart of Accounts segments that people just never used.) Then I asked if there was anything else. The manager said no, but the clerk who sat behind him said, “I have a problem.” His manager seemed embarrassed, but when I pressed him, the clerk continued, “Every day I throw away reams of paper from our invoice listing.”

I asked to look at the request, which ran a simple listing of all invoices entered at a scheduled time each day. I opened up the schedule screen and there was a tick box to “Increment date on each run.” This was not ticked, and they were running the report from day 1, every day. When they accepted the system at go live there was no issue. I think all system implementations should include a 3- or 6-month review. Regardless of how good the implementers are, their setup is based on the information known at the time. In production, that information (volumes, etc.) often changes, and when it does, it can affect your decisions.
My friends Connie Smith and Lloyd Williams call this performance antipattern The Ramp. With the ramp, processing duration increases as the system is used. This invoicing system exhibited ramp behavior, because every invoicing process execution would take just a little bit longer and print just a few more pages than the prior execution did.
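
Here is a minimal sketch of that scheduling behavior, assuming (purely for illustration) that exactly one one-page invoice is entered each day. The function `pages_per_run` and its `increment_date` flag are hypothetical stand-ins for the tick box on the schedule screen, not the real application's interface.

```python
from datetime import date, timedelta


def pages_per_run(go_live, days, increment_date):
    """Simulate a daily invoice listing and return the pages printed on each run.

    Assumes one one-page invoice is entered per day (a simplification).
    `increment_date` mimics the "Increment date on each run" tick box.
    """
    invoice_dates = [go_live + timedelta(days=i) for i in range(days)]
    report_from = go_live
    pages = []
    for run in range(days):
        today = go_live + timedelta(days=run)
        # The listing prints every invoice entered from report_from through today.
        printed = [d for d in invoice_dates if report_from <= d <= today]
        pages.append(len(printed))
        if increment_date:
            # The next run starts the day after this one, so nothing is reprinted.
            report_from = today + timedelta(days=1)
    return pages


ramp = pages_per_run(date(2010, 1, 1), 5, increment_date=False)
flat = pages_per_run(date(2010, 1, 1), 5, increment_date=True)
print(ramp)  # [1, 2, 3, 4, 5]
print(flat)  # [1, 1, 1, 1, 1]
```

With the flag off, each run reprints everything since go-live, so the page count climbs by one every day. With it on, every run prints only that day's invoice.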

The problem of the ramp reminds me of a joke I heard when I was young. A boy, one who is athletically very talented but not too bright, takes on a job as a stripe painter for the highway department. The department gives him a bucket of paint and a brush and drives him out to the highway he’s supposed to paint. His first day on the job, he paints a stripe almost seven miles long. This is an utterly stunning feat, for no one previously had ever painted more than five miles in a day. The department was ecstatic. Apparently, this boy’s true calling was to paint roadways.

The excitement abated a little bit on the second day, when the boy painted only five miles of highway. But still, five miles is the best that anyone had ever done before him. But on the third day, the distance dropped to two miles, and on the fourth day, it fell to less than one mile.

The department managers were gravely concerned, especially after having been so excited on the first couple of days. So they had a driver go out to fetch the boy, to bring him back to the office to explain why his productivity had been so outstanding at first but had then declined so horribly.

The reason was easy to understand, the boy explained. Every day he painted, he kept getting farther and farther away from where he had set his paint bucket on the first day.

I’ve known people who’ve written linked list insertion algorithms this way. Joel Spolsky has written about string library functions in C that work this way. I’ve seen people write joins in SQL that work this way. And Debra’s publishing company ran their invoices this way.
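
To make the linked list case concrete, here is a small sketch. The class names `RampList` and `TailList` are invented for this illustration. The first walks from the head on every append, the way the boy walks back to his paint bucket. The second remembers where the list ends.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


class RampList:
    """Singly linked list that walks from the head on every append."""

    def __init__(self):
        self.head = None
        self.steps = 0  # nodes walked past, summed over all appends

    def append(self, value):
        node = Node(value)
        if self.head is None:
            self.head = node
            return
        cur = self.head
        while cur.next is not None:  # trudge from the bucket to the end of the stripe
            cur = cur.next
            self.steps += 1
        cur.next = node


class TailList:
    """Same list, but it keeps a tail pointer, so each append is constant work."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.steps = 0  # never incremented: no walking is needed

    def append(self, value):
        node = Node(value)
        if self.head is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node


def fill(lst, n):
    """Append n items and return how many nodes were walked past in total."""
    for i in range(n):
        lst.append(i)
    return lst.steps


print(fill(RampList(), 1000))  # 498501
print(fill(TailList(), 1000))  # 0
```

Building a 1,000-item list costs the first version 498,501 pointer hops, which is (n – 1)(n – 2)/2 for n = 1,000. The second version makes none.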

When you have the ramp problem, individual response times increase linearly. ...Which is bad. But overall response time—through the history of using such an application—varies in proportion to the square of the number of items being processed. ...Which is super-duper bad.

Imagine, in the invoicing problem that Debra solved, that the system had been processing just one invoice per day and that each invoice is only one page long. Given that she was at a “very large publishing company,” it’s certain that the volume was greater than this, but for the sake of simplifying my argument, let’s assume that there was just one new invoice each day. Then, with the “Increment date on each run” box left unchecked, there would be one invoice to print on day 1, two on day 2, etc. On any day n, there would be n invoices to print.

Obviously, the response time on any given day n would thus be n times longer than it needed to be. At the end of the first year of operation with the new application, the daily invoice run would take 365 times longer to print than it did on the first day of the year.

But the pain each day of invoice generation is not all there is to the problem. The original concern was expressed in terms of all the paper that was wasted. That paper waste is important, not just because of the environmental impact of unnecessary paper consumption, but also because of all the computing power expended over the operational history of the application required to generate those pages. That includes the resources (the electrical power, the CPU cycles, the memory, the disk and network I/Os, etc.) that could have been put to better use doing something else.

In the grossly over-simplified invoicing system I’ve asked you to imagine (which creates only one invoice per day), the total number of pages printed as of the end of day n is 1 + 2 + ... + n, which is n(n + 1)/2. All but n of those pages are unnecessary. Thus the total number of wasted pages that will have been printed by the end of day n is n(n + 1)/2 – n, which is n(n – 1)/2, or (n² – n)/2. The number of invoices that should never have been printed is proportional to the square of the number of days using the application.

To get a sense for what that means, think about this (remember, all these points refer to a grossly over-simplified system that creates only one invoice per day):
  • By the end of the first month, you'll have printed 465 pages when you only needed 30. That’s 435 unnecessary pages.
  • But by the end of the first year, you’ll have printed 66,795 pages instead of 365. That’s 66,430 unnecessary pages. It’s 27 unnecessary 2,500-page boxes of paper.
  • And by the end of the fifth year, you’ll have used 668 boxes of paper to print 1,668,051 pages instead of using just one box to print 1,826 pages. The picture below shows how tremendously wasteful this is.
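
The first two figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic for the one-invoice-per-day simplification. The function names are made up for the example, and the 2,500-page box size is the one used in the list above.

```python
import math

PAGES_PER_BOX = 2500  # box size used in the list above


def total_pages(n):
    """Pages printed through day n when every run reprints from day 1."""
    return n * (n + 1) // 2


def wasted_pages(n):
    """Pages printed beyond the n that were actually needed."""
    return total_pages(n) - n


def wasted_boxes(n):
    """Whole boxes of paper needed to hold the wasted pages."""
    return math.ceil(wasted_pages(n) / PAGES_PER_BOX)


for days in (30, 365):
    print(days, total_pages(days), wasted_pages(days), wasted_boxes(days))
# 30 465 435 1
# 365 66795 66430 27
```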

When total effort varies as the square of something (like the number of items to process, or the number of days you’ve been using an application), it’s bad, bad news for efficiency. It means that every time your something doubles, your performance (time, materials consumption, etc.) will degrade by a factor of four. Every time your something increases by a factor of ten, your performance will degrade by a factor of a hundred. When your something increases a hundredfold, performance will degrade by a factor of 10,000.
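
One rough way to spot this behavior is a doubling test: measure the cost at some size, measure it again at twice that size, and look at the ratio. The sketch below uses page counts from the simplified invoicing example as the "cost". The helper `doubling_ratio` is a name invented here, and on a real system you would plug in measured times instead.

```python
def doubling_ratio(cost, n):
    """Return cost(2n) / cost(n): about 2 suggests linear growth, about 4 quadratic."""
    return cost(2 * n) / cost(n)


def needed_pages(n):
    """One new page per day: linear in the number of days."""
    return n


def ramp_pages(n):
    """Reprint everything each day: 1 + 2 + ... + n pages in total."""
    return n * (n + 1) // 2


print(doubling_ratio(needed_pages, 1000))            # 2.0
print(round(doubling_ratio(ramp_pages, 1000), 2))    # 4.0
```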

Algorithm analysts characterize algorithms that behave this way as O(n²), pronounced “big-oh of n-squared.” O(n²) performance is no way to live. The good news is that you can usually break yourself out of an O(n²) regime. Sometimes, as Debra’s story illustrates, the solution isn’t even technical: she solved her client’s problem by using an option designed into the end-user interface.

No matter where the problem is (whether it’s a problem with use, setup, implementation, design, or concept), it’s worth significant time and effort to find the O(n²) problems in your system and eliminate them. Whenever you need reassurance of that idea, just glance again at the image of the paper boxes shown here.

And by the way, do you remember my post about “Just go look at it?” Tally one for Debra, for the win.

Friday, April 3, 2009

Cary on Joel on SSD

Joel Spolsky's article on Solid State Disks is a great example of a type of problem my career is dedicated to helping people avoid. Here's what Joel did:
  1. He identified a task needing performance improvement: "compiling is too slow."
  2. He hypothesized that converting from spinning rust disk drives (thanks mwf) to solid state, flash hard drives would improve performance of compiling. (Note here that Joel stated that his "goal was to try spending money, which is plentiful, before [he] spent developer time, which is scarce.")
  3. So he spent some money (which is, um, plentiful) and some of his own time (which is apparently less scarce than that of his developers) replacing a couple of hard drives with SSD. If you follow his Twitter stream, you can see that he started on it 3/25 12:15p and wrote about having finished at 3/27 2:52p.
  4. He was pleased with how much faster the machines were in general, but he was disappointed that his compile times underwent no material performance improvement.
Here's where Method R could have helped. Had he profiled his compile times to see where the time was being spent, he would have known before the upgrade that SSD was not going to improve response time. Given his results, his profile for compiling must have looked like this:
100%  Not disk I/O
  0%  Disk I/O
----  ------------
100%  Total
I'm not judging whether he wasted his time by doing the upgrade. By his own account, he is pleased at how fast his SSD-enabled machines are now. But if, say, the compiling performance problem had been survival-threateningly severe, then he wouldn't have wanted to expend two business days' worth of effort upgrading a component that was destined to make zero difference to the performance of the task he was trying to improve.
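
To illustrate why a profile settles the question before any money is spent, here is a minimal sketch. It is not Method R tooling and not a real compiler profiler. The `Profile` class and `best_case` function are invented for this example, and `best_case` is just Amdahl's law applied to one line of a profile.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Profile:
    """Accumulate elapsed wall-clock time per named category."""

    def __init__(self):
        self.seconds = defaultdict(float)

    @contextmanager
    def timed(self, category):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.seconds[category] += time.perf_counter() - start

    def shares(self):
        """Return each category's fraction of the total measured time."""
        total = sum(self.seconds.values())
        return {name: secs / total for name, secs in self.seconds.items()}


def best_case(total_seconds, share, speedup):
    """Response time if only the component with this share gets `speedup` times faster."""
    return total_seconds * ((1 - share) + share / speedup)


profile = Profile()
with profile.timed("Not disk I/O"):
    sum(i * i for i in range(100_000))  # stand-in for CPU-bound compile work
with profile.timed("Disk I/O"):
    pass  # stand-in: this imaginary task does no measurable disk work

for name, share in profile.shares().items():
    print(f"{share:4.0%}  {name}")

# If disk I/O is 0% of a 600-second compile, a 100x faster disk changes nothing.
print(best_case(600, 0.0, 100))  # 600.0
```

The point of `best_case` is that the share of response time a component accounts for puts a ceiling on what upgrading that component can buy you.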

So, why would someone embark upon a performance improvement project without first knowing exactly what result he should be able to expect? I can think of some good reasons:
  • You don't know how to profile the thing that's slow. Hey, if it's going to take you a week to figure out how to profile a given task, then why not spend half that time doing something that your instincts all say is surely going to work?
  • Um, ...
Ok, after trying to write them all down, I think it really boils down to just one good reason: if profiling is too expensive (that is, you don't know how, or it's too hard, or the tools to do it cost too much), then you're not going to do it. I don't know how I'd profile a compile process on a Microsoft Windows computer. It's probably possible, but I can't think of a good way to do it. It's all about knowing; if you knew how to do it, and it were easy, you'd do it before you spent two days and a few hundred bucks on an upgrade that might not give you what you wanted.

I do know that in the Oracle world, it's not hard anymore, and the tools don't cost nearly as much as they used to. There's no need anymore to upgrade something before you know specifically what's going to happen to your response times. Why guess... when you can know.

Thursday, September 4, 2008

Business of Software 2008, day 2

Greetings from the second and final day of "Business of Software 2008, the first ever Joel on Software conference."

Yesterday was a hard act to follow, but today met the challenge. Today's roster:
Some of today's highlight ideas for me (again, with apologies to the speakers for the crude summarization):
  • Nothing is difficult to someone who doesn't know what he's talking about. (Johnson)
  • Creating more artifacts and meetings is no answer. (Johnson)
  • Entrepreneurs are better entrepreneurs when they're not worried about their personal balance sheet. (Jennings)
  • "In the software field, we don't have to deal with the perversions of matter." (Stallman)
  • VCs say 65% of failed new ventures are the result of people problems with founding or management teams. (Wasserman)
  • Websites are successful to the extent that they're as self-evident as possible. (Krug)
  • Sensible usability testing is absolutely necessary and, better yet, possible and even inexpensive. You can even download a script at Steve's site. (Krug)
  • The huge chasm between #1 and #2 is all about elements of happiness, aesthetics, and culture. (Spolsky)
Steve Johnson and Steve Krug gave truly superb presentations. Steve Krug I knew about beforehand, from his book. Steve Johnson I did not know, but I do now. These are people I'll take courses from someday. And of course, Joel Spolsky... I had seen him speak before, so I knew what to expect. He's one of the best speakers I've ever watched. I've asked him to keynote at Hotsos Symposium 2009. We'll see what he says.

Wednesday, September 3, 2008

Business of Software 2008, day 1

Greetings from Boston, where I'm attending "Business of Software 2008, the first ever Joel on Software conference."

It has been fantastic so far. Here's a featured presenters roll call for the day:
That's not to mention the eight Pecha Kucha presentations, although I will mention two that I particularly enjoyed by Jason Cohen of SmartBear Software ("Agile marketing") and Alexis Ohanian, founder of Reddit ("How to start, run, and sell a web 2.0 startup"). Alexis won the contest, which netted him a new MacBook Air. Not bad for 6 minutes 40 seconds of work. ;-)

Here are some of the highlight ideas of the day for me (with apologies to the speakers for, in some cases, crudely over-simplifying their ideas):
  • Ideas that spread win. (Godin)
  • The leader of a tribe begins as a heretic. (Godin, Livingston)
  • Premature optimization is bad. In business too. Not just code. (Fried, Shah)
  • Interruptions are bad. Meetings are worse. (Fried, Sink, Livingston)
  • "Only two things grow forever: businesses and tumors." Unless you take intelligent action. (Fried)
  • Pricing is hard. Really, really hard. (Shah)
  • Business plans are usually stupid. (Fried, Shah, Livingston)
  • Software specs are usually stupid. (Fried)
  • An important opportunity cost of raising VC money is the time you're not spending working on the business of your actual business. (Shah)
  • The most common cause of startup failure isn't competition, it's fear. (Livingston)
  • Your first idea probably sucks. (Fried, Sink, Shah, Livingston)
  • Radical mood swings are part of the territory for founding a company. (Livingston)
An overarching belief that I think bonds almost all of the 300 people here at the event is this: If you're not working on your passion, then you're wasting yourself. It is inspiring to meet so many people at one time who are living courageously without compromising this belief. Re-SPECT.

I think a good conference should provide three main intellectual benefits for people:
  1. You can expose yourself to new ideas, which can make you wiser.
  2. You can fortify some of the beliefs you already had, which can make you more confident.
  3. You can learn better ways to explain your beliefs to others, which can make you more effective.
And then of course there's networking, fun, and all that stuff—that's easy. So far, this event is ringing the bell on every dimension that I needed. Absolutely A+.

Tuesday, July 1, 2008

Multitasking: Productivity Killer

A couple of years ago, I read Joel Spolsky's article "Human Task Switches Considered Harmful," and it resonated mightily. The key take-away from that article is this: Never let people work on more than one thing at once. Amen. The nice thing about Joel's article is that it explains why in a very compelling way.

Last week, a good friend emailed me a link to an article by Christine Rosen called "The Myth of Multitasking," which goes even further. It quotes one group of researchers at the University of California at Irvine, who found that workers took an average of twenty-five minutes to recover from interruptions such as phone calls or answering e-mail and return to their original task.

So it's not just me.

The "benefits" of human multitasking are an illusion. Looking or feeling busy is no substitute for accomplishment.

Here's a passage from the Rosen article that might get your attention, if I haven't already:
...Research has also found that multitasking contributes to the release of stress hormones and adrenaline, which can cause long-term health problems if not controlled, and contributes to the loss of short-term memory.
Translation: Trying too hard to do the information overload thing makes you sick, and it makes you stupid.

For as long as I can remember, I've hated the times I've been "forced" to multitask, and I've loved those segments of my life when I've been free to lock down on a train of thought for hours at a time. I believe deep down that multitasking is bad—at least for me—and literature like the two articles I've discussed here supports that feeling in a compelling way.

Here's a checklist of decisions that I resolve to implement myself:
  • When you need to sit down and write, whether it's code or text, close your door, and turn off your phone and your email. (Or just work the 10pm-to-4am shift like I did with Optimizing Oracle Performance.)
  • When you're in a classroom, if you're really trying to learn something, turn off your email and your browser.
  • When you're managing someone, make sure he's working on one thing at a time. It's obviously important that this one thing should be the right thing to be working on. But it's actually worse to be working on two things than working on just one wrong thing. Read Spolsky. You'll see.