
Wednesday, October 20, 2010

Virtual Seminar: "Systematic Oracle SQL Optimization in Real Life"

On November 18 and 19, I’ll be presenting along with Tanel Põder, Jonathan Lewis, and Kerry Osborne in a virtual (GoToWebinar) seminar called Systematic Oracle SQL Optimization in Real Life. Here are the essentials:

What: Systematic Oracle SQL Optimization in Real Life.
Learn how to think clearly about Oracle performance, find your performance problems, and then fix them, whether you’re using your own code (which you can modify) or someone else’s (which you cannot).
Who: Cary Millsap, Tanel Põder, Jonathan Lewis, Kerry Osborne
When: 8:00am–12:00 noon US Pacific Time, Thursday and Friday, 18–19 November 2010
How much: 475 USD (375 USD if you register before 1 November 2010)

The format will be two hours per speaker: an hour and a half for presentation time, and a half hour for questions and answers. Here’s our agenda (all times are listed in USA Pacific Time):

Thursday
  8:00a–10:00a   Cary Millsap: Thinking Clearly about Performance
  10:00a–12:00n  Tanel Põder: Understanding and Profiling Execution Plans
Friday
  8:00a–10:00a   Jonathan Lewis: Writing Your SQL to Help the Optimizer
  10:00a–12:00n  Kerry Osborne: Controlling Execution Plans (without touching the code)

This is going to be a special event. My staff and I can’t wait to see it ourselves. I hope you will join us.

Friday, November 20, 2009

Performance Optimization with Global Entry. Or Not?

As I entered the 30-minute "U.S. Citizens" queue for immigration back into the U.S. last week, the helpful "queue manager" handed me a brochure. This is a great place to hand me something to read, because I'm captive for the next 30 minutes as I await my turn with the immigration officer at the Passport Control desk. The brochure said "Roll through Customs faster."

Ok. I'm listening.

Inside the brochure, the first page lays out the main benefits:
  • bypass the passport lines
  • no paper Customs declaration
  • in most major U.S. airports
Well, that's pretty cool. Especially as I'm standing only 5% deep in a queue with a couple hundred people in it. And look, there's a Global Entry kiosk right there with its own special queue, with nobody—nobody!—in it.

If I had this Global Entry thing, I'd have a superpower that would enable me to zap past the couple hundred people in front of me, and get out of the Passport Control queue right now. Fantastic.

So what does this thing cost? It's right there in the brochure:
  1. Apply online at www.globalentry.gov. There is a non-refundable $100 application fee. Membership is valid for five years. That's $20 a year for the queue-bypassing superpower. Not bad. Still listening.
  2. Schedule an in-person interview. Next, I have to book an appointment to meet someone at the airport for a brief interview.
  3. Complete the interview and enrollment. I give my interview, get my photo taken, have my docs verified, and that's it, I'm done.
So, all in all, it doesn't cost too much: a hundred bucks and probably a couple hours one day next month sometime.

What's the benefit of the queue-bypassing superpower? Well, it's clearly going to knock a half-hour off my journey through Passport Control. I immigrate three or four times per year on average, and today's queue is one of the shorter ones I've seen, so that's at least a couple hours per year that I'd save... Wow, that would be spectacular: a couple more hours each year in my family's arms instead of waiting like a lamb at the abattoir to have my passport controlled.

But getting me into my family's arms 30 minutes earlier is not really what happens. The problem is a kind of logic that people I meet get hung up in all the time. When you think about subsystem (or resource) optimization, it looks like your latency savings for the subsystem should go straight to your system's bottom line, but that's often not what happens. That's why I really don't care about subsystem optimization; I care about response time. I could say that a thousand times, but my statement is too abstract to really convey what I mean unless you already know what I mean.

What really happens in the airport story is this: if I had used Global Entry on my recent arrival, it would have saved me only a minute or two. Not half an hour, not even close.

It sounds crazy, doesn't it? How can a service that cuts half an hour off my Passport Control time not get me home at least a half hour earlier?

You'll understand once I show you a sequence diagram of my arrival. Here it is (at right). You can click the image to embiggen it, if you need.

To read this sequence diagram, start at the top. Time flows downward. This sequence diagram shows two competing scenarios. The multicolored bar on the left-hand side represents the timeline of my actual recent arrival at DFW Airport, without using the Global Entry service. The right-hand timeline is what my arrival would have looked like had I been endowed with the Global Entry superpower.

You can see at the very bottom of the timeline on the right that the time I would have saved with Global Entry is minuscule: only a minute or two.
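
If you can't see the image, here's a rough sketch of the same comparison in code. It's a minimal model, not a measurement: the durations below (in minutes) are assumptions chosen only to match the shape of the story, that is, a long Passport Control queue and an even longer wait for the bag.

    # Rough model of the two arrival timelines. All durations are in minutes
    # and are illustrative assumptions, not measurements of an actual arrival.

    def arrival(passport_queue, passport_service, bag_ready, customs_queue, customs_service):
        """Minutes from deplaning to exiting the airport."""
        t = passport_queue + passport_service   # clear Passport Control
        t = max(t, bag_ready)                   # the bag arrives when it arrives
        t += customs_queue + customs_service    # clear Customs
        return t

    # Without Global Entry: a long passport queue, a short customs queue.
    without_ge = arrival(passport_queue=30, passport_service=1, bag_ready=45,
                         customs_queue=2, customs_service=1)

    # With Global Entry: no passport queue, no customs queue... but the same bag.
    with_ge = arrival(passport_queue=0, passport_service=1, bag_ready=45,
                      customs_queue=0, customs_service=1)

    print(without_ge, with_ge, without_ge - with_ge)   # 48 46 2: a couple of minutes saved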

The real problem is easy to see in the diagram: Queue for Baggage Claim is the great equalizer in this system. No matter whether I'm a Global Entrant or not, I'm going to get my baggage when the good people outside with the Day-Glo Orange vests send it up to me. My status in the Global Entry system has absolutely no influence over what time that will occur.

Once I've gotten my baggage, the Global Entry superpower would have again swung into effect, allowing me to pass through the zero-length queue at the Global Entry kiosk instead of waiting behind two families at the Customs queue. And that's the only net benefit I would have received.

Wait: there were only two families in the Customs queue? What about the hundreds of people I was standing behind in the Passport Control queue? Well, many of them were gone already (either they had hand-carry bags only, or their bags had come off earlier than mine). Many others were still awaiting their bags on the Baggage Claim carousel. Because bags trickle out of the baggage claim process, there isn't the huge all-at-once surge of demand at Customs that there is at Passport Control when a plane unloads. So the queues are shorter.

At any rate, there were four queues at Customs, and none of them was longer than three or four families. So the benefit of Global Entry—in exchange for the $100 and the time spent doing the interview—for me, this day, would have been only the savings of a couple of minutes.

Now, if—if, mind you—I had been able to travel with only carry-on luggage, then Global Entry would have provided me significantly more value. But when I'm returning to the U.S. from abroad, I'm almost never allowed to carry on any bag other than my briefcase. Furthermore, I don't remember ever clearing Passport Control to find my bag waiting for me at Baggage Claim. So the typical benefit to me of enrolling in Global Entry, unfortunately, appears to be only a fraction of the duration required to clear Customs, which in my case is almost always approximately zero.

The problem causing the low value (to me) of the Global Entry program is that the Passport Control resource hides the latency of the Baggage Claim resource. No amount of tuning upon the Passport Control resource will affect the timing of the Baggage In Hand milestone; the time at which that milestone occurs is entirely independent of the Passport Control resource. And that milestone—as long as it occurs after I queue for Baggage Claim—is a direct determinant of when I can exit the airport. (Gantt or PERT chart optimizers would say that Queue for Baggage Claim is on the critical path.)

How could a designer make the airport experience better for the customer? Here are a few ideas:
  • Let me carry on more baggage. This idea would allow me to trot right through Baggage Claim without waiting for my bag. In this environment, the value of Global Entry would be tremendous. Well, nice theory; but allowing more carry-on baggage wouldn't work too well in the aggregate. The overhead bins on my flight were already stuffed to maximum capacity, and we don't need more flight delays induced by passengers who bring more stuff onboard than the cabin can physically accommodate.
  • Improve the latency of the baggage claim process. The sequence diagram shows clearly that this is where the big win is. It's easy to complain about baggage claim, because it's nearly always noticeably slower than we want it to be, and we can't see what's going on down there. Our imaginations inform us that there's all sorts of horrible waste going on.
  • Use latency hiding to mask the pain of the baggage claim process. Put TV sets in the Baggage Claim area, and tune them to something interesting instead of infinite loops of advertising. At CPH, they have a Danish hot dog stand in the baggage claim area. They also have a currency exchange office in there. Excellent latency hiding ideas if you need a snack or some DKK walkin'-around-money.
Latency hiding is a weak substitute for improving the speed of the baggage claim process. The killer app would certainly be to make Baggage Claim faster. Note, however, that just making Baggage Claim a little bit faster wouldn't make the Global Entry program any more valuable. To make Global Entry any more valuable, you'd have to make Baggage Claim fast enough that your bag would be waiting for anyone who cleared the full Passport Control queue.

So, my message today: When you optimize, you must first know your goal. So many people optimize subsystems (resources) that they think are important, but optimizing subsystems is often not a path to optimizing what you really want. At the airport, I really don't give a rip about getting out of the Passport Control queue if it just means I'm going to be dumped earlier into a room where I'll have to wait for my baggage until whenever it happens to show up.

Once you know what your real optimization goal is (that's Method R step 1), then the sequence diagram is often all you need to get your breakthrough insight that helps you either (a) solve your problem or (b) understand when there's nothing further that you can really do about it.

Wednesday, February 18, 2009

Throughput versus Response Time

I like Doug Burns's recent blog post called Time Matters: Throughput vs. Response Time. If you haven't read it, please do. The post and its comment thread are excellent.

The principle Doug has recognized is why the knee in the performance curve is defined as the traffic intensity (think utilization, or load) value at which, essentially, the ratio of response time to throughput is minimized. It's not just the place where response time is minimized (which, as Doug observed, is when there's no load at all except for you, ...which is awesome for you, but not so good for business).
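
To make that concrete, here's a minimal sketch. It assumes a single-server (M/M/1) queue, which is my illustration rather than anything from Doug's post: response time is S/(1 − ρ), and throughput is proportional to utilization ρ. Minimizing the ratio of the two puts the knee at about 50% utilization for one server; with more servers, the knee moves to higher utilizations.

    # Where the knee falls for a single-server (M/M/1) queue: the utilization
    # that minimizes response time divided by throughput.
    S = 1.0  # service time; the knee's location doesn't depend on this value

    def response_time(rho):
        return S / (1.0 - rho)

    def throughput(rho):
        return rho / S

    candidates = [i / 1000.0 for i in range(1, 1000)]
    knee = min(candidates, key=lambda rho: response_time(rho) / throughput(rho))
    print(knee)   # 0.5: for one server, the knee sits at 50% utilization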

I'd like to emphasize a couple of points. First, batch and interactive workloads have wholly different performance requirements, which several people have already noted in their comments to Doug's post. With batch work, people are normally concerned with maximizing throughput. With online work, individual people care more about their own response times than group throughput, although those people's managers probably care more about group throughput. The individual people probably care about group throughput too, but not so much that they're happy about staying late after work to provide it when their individual tasks run so slowly they can't finish them during the normal working day.

In addition to having different performance requirements, batch workload can often be scheduled differently, too. If you're lucky, you can schedule your batch workload deterministically. For example, maybe you can employ a batch workload manager that feeds workload to your system like a carefully timed IV drip, to keep your system's CPU utilization pegged at 100% without causing your CPU run-queue depth to exceed 1.0. But online workload is almost always nondeterministic, which is to say that it can't be scheduled at all. That's why you have to keep some spare un-utilized system capacity handy; otherwise, your system load goes out past the nasty knee in your performance curve, and your users' response times behave exponentially in response to microscopic changes in load, which results in much Pain and Suffering.
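
Sticking with the same single-server sketch from above (again, an illustration, not a model of your system), here's what that headroom buys you: each step toward 100% utilization costs far more response time than the step before it.

    # How response time degrades as utilization climbs (single-server sketch).
    S = 1.0  # service time stays fixed; only the load changes
    for rho in (0.50, 0.80, 0.90, 0.95, 0.99):
        R = S / (1.0 - rho)
        print(f"utilization {rho:.0%}: response time is {R:.0f}x the service time")
    # 50% -> 2x, 80% -> 5x, 90% -> 10x, 95% -> 20x, 99% -> 100x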

My second point is one that I find that a lot of people don't understand very well: Focusing on individual response time—as in profiling—for an individual business task is an essential element in a process to maximize throughput, too. There are good ways to make a task faster, and there are bad ways. Good ways eliminate unnecessary work from the task without causing negative side-effects for tasks you're not analyzing today. Bad ways accidentally degrade the performance of tasks other than the one(s) you're analyzing.

If you stick to the good ways, you don't end up with the see-saw effect that most people seem to think of when they hear "optimize one business task at a time." You know, the idea that tuning A breaks B; then tuning B breaks A again. If this is happening to you, then you're doing it wrong. Trying to respond to performance problems by making global parameter changes commonly causes the see-saw problem. But eliminating wasteful work creates collateral benefits that allow competing tasks on your system to run faster because the task you've optimized now uses fewer resources, giving everything else freer and clearer access to the resources they need, without having to queue so much for them.
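
Here's a toy example of the distinction, in Python rather than SQL, with names and data invented purely for illustration. The slow version does work it doesn't need to do; the fast version produces the same answer with less work, so speeding it up takes nothing away from any other task.

    # A made-up example of eliminating wasted work from one task.

    def slow_report(orders, valid_ids):
        # Wasteful: scans the whole valid_ids list once per order.
        return [o for o in orders if o["customer_id"] in valid_ids]

    def fast_report(orders, valid_ids):
        # Same answer, less work: build the lookup structure once.
        valid = set(valid_ids)
        return [o for o in orders if o["customer_id"] in valid]

    orders = [{"customer_id": i % 5000} for i in range(100_000)]
    valid_ids = list(range(0, 5000, 2))
    assert slow_report(orders, valid_ids) == fast_report(orders, valid_ids)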

Figuring out how to eliminate wasteful work is where the real fun begins. A lot of the tasks we see are fixable by changing just a little bit of source code. I mean the 2,142,103-latch query that consumes only 9,098 latches after fixing; things like that. A lot more are fixable by simply collecting statistics correctly. Others require adjustments to an application's indexing strategy, which can seem tricky when you need to optimize across a collection of SQL statements (here comes the see-saw), but even that is pretty much a solved problem if you understand Tapio Lahdenmäki's work (except for the inevitable politics of change control).

Back to the idea of Doug's original post, I wholeheartedly agree that you want to optimize both throughput and response time. The business has to decide what mixture is right. And I believe it's crucial to focus on eliminating waste from each individual competing task if you're going to have any hope of optimizing anything, whether you care more about response time, or throughput.

Think about it this way... A task cannot run at its optimal speed unless it is efficient. You cannot know whether a task is efficient without measuring it. And I mean specifically and exactly it, not just part of "it" or "it" plus a bunch of other stuff surrounding it. That's what profiling is: the measurement of exactly one interesting task that allows you to determine exactly where that task spends its time, and thus whether that task is spending your system's time and resources efficiently.
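
If you want to see what that looks like in code, Python's built-in cProfile module is one way to do it; the_task below is just a stand-in for whatever business task you actually care about.

    import cProfile
    import pstats

    def the_task():
        # Stand-in for the one business task you care about measuring.
        return sum(i * i for i in range(1_000_000))

    # Measure exactly this task: not the whole system, not "it plus other stuff".
    profiler = cProfile.Profile()
    profiler.enable()
    the_task()
    profiler.disable()

    # Report where the task's time went, most expensive calls first.
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)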

You can improve a system without profiling, and maybe you can even optimize one without profiling. But you can't know whether a system is optimal without knowing whether its tasks are efficient, and you can't know whether a given task is efficient without profiling it.

When you don't know, you waste time and money. This is why I contend that the ability to profile a single task is absolutely vital to anyone wanting to optimize performance.

Tuesday, February 3, 2009

Reading Knuth

I've been reading a little Knuth for the past couple of days. I've seen the famous statement, "premature optimization is the root of all evil," quoted many times, and yesterday I decided to read it in its full context. You can find Knuth's whole article here (thank you, Wikipedia):
Knuth, Donald: Structured Programming with go to Statements. Computing Surveys 6:4 (1974), 261–301.
I want to quote the passage for you that contains the famous line about premature optimization. It's on page 268 (the eighth page of the article):
There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Most people stop here when they quote Knuth. But actually, it just keeps getting better:
Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail. After working with such tools for seven years, I've become convinced that all compilers written from now on should be designed to provide all programmers with feedback indicating what parts of their programs are costing the most; indeed, this feedback should be supplied automatically unless it has been specifically turned off.
What a beautiful expression of the idea that's at the core of what I do for a living.

If you haven't seen "For Developers: Making Friends with the Oracle Database" yet, I hope you'll take a look. My aim is to help Oracle application developers writing Java, PHP, C#—or anything else—understand how to find that critical code that Knuth wrote about in 1974. ...And, just as importantly, how to ignore the other 97% of your code that you shouldn't be worrying about.