Thursday, October 1, 2015

What I Wanted to Tell Terry Bradshaw

I met Terry Bradshaw one time. It was about ten years ago, in front of a movie theater near where I live.

When I was little, Terry Bradshaw was my enemy because, unforgivably to a young boy, he and his Pittsburgh Steelers kept beating my beloved Dallas Cowboys in Super Bowls. As I grew up, though, his personality on TV talk shows won me over, and I enjoy watching him to this day on Fox NFL Sunday. After learning a little bit about his life, I’ve grown to really admire and respect him.

I had heard that he owned a ranch not too far from where I live, and so I had it in mind that inevitably I would meet him someday, and I would say thank you. One day I had that chance.

I completely blew it.

My wife and I saw him there at the theater one day, standing by himself not far from us. It seemed like if I were to walk over and say hi, maybe it wouldn’t bother him. So I walked over, a little bit nervous. I shook his hand, and I said, “Mr. Bradshaw, hi, my name is Cary.” I would then say this:

I was a big Roger Staubach fan growing up. I watched Cowboys vs. Steelers like I was watching Good vs. Evil.

But as I’ve grown up, I have gained the deepest admiration and respect for you. You were a tremendous competitor, and you’re one of my favorite people to see on TV. Every time I see you, you bring a smile to my face. You’ve brought joy to a lot of people.

I just wanted to say thank you.

Yep, that’s what I would say to Terry Bradshaw if I got the chance. But that’s not how it would turn out. How it actually went was like this, …my big chance:

Me: I was a big Roger Staubach fan growing up.
TB: Hey, so was I!
Me: (stunned)
TB: (turns away)
The End

I was heartbroken. It bothers me still today. If you know Terry Bradshaw or someone who does, I wish you would please let him know. It would mean a lot to me.

…I did learn something that day about the elevator pitch.

Thursday, September 17, 2015

The Fundamental Challenge of Computer System Performance

The fundamental challenge of computer system performance is for your system to have enough power to handle the work you ask it to do. It sounds really simple, but helping people meet this challenge has been the point of my whole career. It has kept me busy for 26 years, and there’s no end in sight.

Capacity and Workload

Our challenge is the relationship between a computer’s capacity and its workload. I think of capacity as an empty box representing a machine’s ability to do work over time. Workload is the work your computer does, in the form of programs that it runs for you, executed over time. Workload is the content that can fill the capacity box.


Capacity Is the One You Can Control, Right?

When the workload gets too close to filling the box, what do you do? Most people’s instinctive reaction is that, well, we need a bigger box. Slow system? Just add power. It sounds so simple, especially since—as “everyone knows”—computers get faster and cheaper every year. We call that the KIWI response: kill it with iron.

KIWI... Why Not?

As welcome as KIWI may feel, KIWI is expensive, and it doesn’t always work. Maybe you don’t have the budget right now to upgrade to a new machine. Upgrades cost more than just the hardware itself: there’s the time and money it takes to set it up, test it, and migrate your applications to it. Your software may cost more to run on faster hardware. What if your system is already the biggest and fastest one they make?

And as weird as it may sound, upgrading to a more powerful computer doesn’t always make your programs run faster. There are classes of performance problems that adding capacity never solves. (Yes, it is possible to predict when that will happen.) KIWI is not always a viable answer.

So, What Can You Do?

Performance is not just about capacity. Though many people overlook them, there are solutions on the workload side of the ledger, too. What if you could make workload smaller without compromising the value of your system?
It is usually possible to make a computer produce all of the useful results that you need without having to do as much work.
You might be able to make a system run faster by making its capacity box bigger. But you might also make it run faster by trimming down that big red workload inside your existing box. If you only trim off the wasteful stuff, then nobody gets hurt, and you’ll have winning all around.

So, how might one go about doing that?

Workload

“Workload” is a compound of two words. It is useful to think about those two words separately.


The amount of work your system does for a given program execution is determined mostly by how that program is written. A lot of programs make their systems do more work than they should. Your load, on the other hand—the number of program executions people request—is determined mostly by your users. Users can waste system capacity, too; for example, by running reports that nobody ever reads.

Both work and load are variables that, with skill, you can manipulate to your benefit. You do it by improving the code in your programs (reducing work), or by improving your business processes (reducing load). I like workload optimizations because they usually save money and work better than capacity increases. Workload optimization can seem like magic.

The Anatomy of Performance

This simple equation explains why a program consumes the time it does:
r = cl        or        response time = call count × call latency
Think of a call as a computer instruction. Call count, then, is the number of instructions that your system executes when you run a program, and call latency is how long each instruction takes. How long you wait for your answer, then—your response time—is the product of your call count and your call latency.

Some fine print: It’s really a little more complicated than this, but actually not that much. Most response times are composed of many different types of calls, all of which have different latencies (we see these in program execution profiles), so the real equation looks like r = c₁l₁ + c₂l₂ + ... + cₙlₙ. But we’ll be fine with r = cl for this article.
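Here is a minimal sketch of that longer equation in Python. The call types, counts, and latencies are hypothetical numbers chosen only for illustration; a real profile would come from measuring your own program.

```python
# Hypothetical profile: (call type, call count, latency per call in seconds).
# These names and numbers are invented for illustration only.
profile = [
    ("cpu", 1000, 0.0005),
    ("disk read", 200, 0.005),
    ("network round-trip", 50, 0.02),
]


def response_time(profile):
    """Return r = c1*l1 + c2*l2 + ... + cn*ln for the given profile."""
    return sum(count * latency for _, count, latency in profile)


r = response_time(profile)
for name, count, latency in profile:
    # Each line shows one term of the sum: call count times call latency.
    print(f"{name:20s} {count:6d} calls x {latency:.4f} s = {count * latency:.3f} s")
print(f"response time r = {r:.3f} s")
```

Sorting a profile like this by each term's contribution is one way to see which call type deserves attention first.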

Call count depends on two things: how the code is written, and how often people run that code.
  • How the code is written (work) — If you were programming a robot to shop for you at the grocery store, you could program it to make one trip from home for each item you purchase. Go get bacon. Come home. Go get milk... It would probably be dumb if you did it that way, because the duration of your shopping experience would be dominated by the execution of clearly unnecessary travel instructions, but you’d be surprised at how often people write programs that act like this.
  • How often people run that code (load) — If you wanted your grocery store robot to buy 42 things for you, it would have to execute more instructions than if you wanted to buy only 7. If you found yourself repeatedly discarding spoiled, unused food, you might be able to reduce the number of things you shop for without compromising anything you really need.
Call latency is influenced by two types of delays: queueing delays and coherency delays.
  • Queueing delays — Whenever you request a resource that is already busy servicing other requests, you wait in line. That’s a queueing delay. It’s what happens when your robot tries to drive to the grocery store, but all the roads are clogged with robots that are going to the store to buy one item at a time. Driving to the store takes only 7 minutes, but waiting in traffic costs you another 13 minutes. The more work your robot does, the greater its chances of being delayed by queueing, and the more such delays your robot will inflict upon others as well.
  • Coherency delays — You endure a coherency delay whenever a resource you are using needs to communicate or coordinate with another resource. For example, if your robot’s cashier at the store has to talk with a specific manager or other cashier (who might already be busy with a customer), the checkout process will take longer. The more times your robot goes to the store, the worse your wait will be, and everyone else’s, too.
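The grocery robot's call count can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the cart capacity is hypothetical, and I'm treating the 7-minute drive as a one-way trip and ignoring traffic entirely.

```python
ONE_WAY_MINUTES = 7  # uncongested drive time from the example; assumed to be one-way


def trips_one_item_at_a_time(item_count):
    """The wasteful program: one round trip to the store per item."""
    return item_count


def trips_batched(item_count, cart_capacity):
    """Round trips needed when the robot carries up to cart_capacity items per trip."""
    return -(-item_count // cart_capacity)  # integer ceiling division


def travel_minutes(trips):
    """Total driving time for the given number of round trips, with no queueing."""
    return trips * 2 * ONE_WAY_MINUTES


ITEMS = 42            # the shopping list from the example
CART_CAPACITY = 50    # hypothetical: everything fits in one trip

naive = trips_one_item_at_a_time(ITEMS)
batched = trips_batched(ITEMS, CART_CAPACITY)
print(f"one item per trip: {naive} trips, {travel_minutes(naive)} minutes of driving")
print(f"batched:           {batched} trip(s), {travel_minutes(batched)} minutes of driving")
```

The sketch covers only the call count side. It says nothing about queueing or coherency delays, which would make the one-item-per-trip version look even worse.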

The Secret

This r = cl thing sure looks like the equation for a line, but because of queueing and coherency delays, the value of l increases when c increases. This causes response time to act not like a line, but instead like a hyperbola.


Because our brains tend to conceive of our world as linear, nobody expects for everyone’s response times to get seven times worse when you’ve only added some new little bit of workload, but that’s the kind of thing that routinely happens with performance. ...And not just computer performance. Banks, highways, restaurants, amusement parks, and grocery-shopping robots all work the same way.
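One way to see the nonlinear shape is with a textbook single-server queueing model (M/M/1), in which response time equals service time divided by (1 − utilization). That model is my assumption for this sketch; real systems are messier, and the sketch ignores coherency delays altogether.

```python
def response_time(service_time, utilization):
    """M/M/1 approximation: response time = service time / (1 - utilization).

    utilization is the fraction of time the resource is busy, in [0, 1).
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be at least 0 and less than 1")
    return service_time / (1.0 - utilization)


SERVICE_TIME = 1.0  # hypothetical unit of work, so results read as multiples

for rho in (0.10, 0.50, 0.80, 0.90, 0.95):
    multiple = response_time(SERVICE_TIME, rho)
    print(f"utilization {rho:.0%}: response time is {multiple:.1f}x the service time")
```

Under this model, going from 50% to 90% utilization takes response time from 2 times the service time to 10 times. The load didn't even double, but the wait got five times worse.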

Response times are tremendously sensitive to your call counts, so the secret to great performance is to keep your call counts small. This principle is the basis for perhaps the best and most famous performance optimization advice ever rendered:
The First Rule of Program Optimization: Don’t do it.

The Second Rule of Program Optimization (for experts only!): Don’t do it yet.

The Problem

Keeping call counts small is really, really important. This makes being a vendor of information services difficult, because it is so easy for application users to make call counts grow. They can do it by running more programs, by adding more users, by adding new features or reports, or even by just the routine process of adding more data every day.

Running your application with other applications on the same computer complicates the problem. What happens when all these applications’ peak workloads overlap? It is a problem that Application Service Providers (ASPs), Software as a Service (SaaS) providers, and cloud computing providers must solve.

The Solution

The solution is a process:
  1. Call counts are sacred. They can be difficult to forecast, so you have to measure them continually. Understand that. Hire people who understand it. Hire people who know how to measure and improve the efficiency of your application programs and the systems they reside on.
  2. Give your people time to fix inefficiencies in your code. An inexpensive code fix might return many times the benefit of an expensive hardware upgrade. If you have bought your software from a software vendor, work with them to make sure they are streamlining the code they ship you.
  3. Learn when to say no. Don’t add new features (especially new long-running programs like reports) that are inefficient, that make more calls than necessary. If your users are already creating as much workload as the system can handle, then start prioritizing which workload you will and won’t allow on your system during peak hours.
  4. If you are an information service provider, charge your customers for the amount of work your systems do for them. The economic incentive to build and buy more efficient programs works wonders.

Thursday, August 20, 2015

Messed-Up App of the Day: Crux CCH-01W

Today’s Messed-Up App of the Day is the “Crux CCH-01W rear-view camera for select 2007-up Jeep Wrangler models.”

A rear-view camera is an especially good idea in the Jeep Wrangler, because it is very difficult to see behind the vehicle. The rear seat headrests, the wiper motor housing, the spare tire, and the center brake light all conspire to obstruct much of what little view the window had given you to begin with.

The view is so bad that it’s easy to, for example, accidentally demolish a mailbox.

I chose the Crux CCH-01W because it is purpose-built for our 2012 Jeep Wrangler. It snaps right into the license plate frame. I liked that. It had 4.5 out of 5.0 stars in four reviews at crutchfield.com, my favorite place to buy stuff like this. I liked that, too.

But I do not like the Crux CCH-01W. I returned it because our Jeep will be safer without this camera than with it. Here’s the story.

My installation process was probably pretty normal. I had never done a project like this before, so it took me longer than it should have. Crux doesn’t include any installation instructions with the camera, which is a little frustrating, but I knew that from the reviews. There is a lot of help online, and Crutchfield helped as much as I needed. After all the work of installing it, it was a huge thrill when I first shifted into Reverse and—voilà!—a picture appeared in my dashboard.

However, that was where the happiness would end. When I tried to use the camera, I noticed right away that the red, yellow, and green grid lines that the camera superimposes upon its picture didn’t make any sense. The grid lines showed that I was going to collide with the vehicle on my left that clearly wasn’t in jeopardy (an inconvenient false positive), and they showed that I was all-clear on the right when in fact I was about to ram into my garage door facing (a dangerous false negative).

The problem is that the grid lines are offset about two feet to the left. Of course, this is because the camera is about two feet to the left of the vehicle’s centerline. It’s above the license plate, below the left-hand tail light.

So then, to use these grid lines, you have to shift them in your mind about two feet to the right. In your mind. There’s no way to adjust them on the screen. Since this camera is designed exclusively for the left-hand corner of a 2007-up Jeep Wrangler, shouldn’t the designers have adjusted the location of the grid lines to compensate?

So, let’s recap. The safety device I bought to relieve driver workload and improve safety will, unfortunately, increase driver workload and degrade safety.

That’s bad enough, but it doesn’t end there. There is a far worse problem than just the misalignment of the grid lines.

Here is a photo of my little girl standing a few feet behind the Jeep, directly behind the right rear wheel:

And here is what the camera shows the driver while she is standing there:

No way am I keeping that camera on the vehicle.

It’s easy to understand why it happens. The camera, which has a 120° viewing angle, is located so far off the vehicle centerline that it creates a blind spot behind the right-hand corner of the vehicle and grid lines that don’t make sense.
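A rough, flat, top-down sketch of that geometry follows. The two-foot camera offset and the two viewing angles come from this story; the wheel position is a hypothetical number, and I'm assuming the camera points straight back and that the stated angle is its horizontal field of view, neither of which I have verified.

```python
import math


def min_visible_distance(lateral_offset_ft, view_angle_deg):
    """Distance behind the camera at which an object first enters the view cone.

    lateral_offset_ft is how far to the side of the camera the object stands.
    Assumes a flat 2-D view with the camera aimed straight back.
    """
    half_angle = math.radians(view_angle_deg / 2)
    return lateral_offset_ft / math.tan(half_angle)


CAMERA_LEFT_OF_CENTER_FT = 2.0   # from the story: camera is about two feet left of center
WHEEL_RIGHT_OF_CENTER_FT = 2.5   # hypothetical position of the right rear wheel

# Offset camera with a 120-degree view, looking toward the right rear wheel line.
offset_camera = min_visible_distance(
    CAMERA_LEFT_OF_CENTER_FT + WHEEL_RIGHT_OF_CENTER_FT, 120
)
# Centerline camera with a 170-degree view, same wheel line.
center_camera = min_visible_distance(WHEEL_RIGHT_OF_CENTER_FT, 170)

print(f"offset 120-degree camera: blind for the first {offset_camera:.1f} ft")
print(f"center 170-degree camera: blind for the first {center_camera:.1f} ft")
```

With these assumed numbers, the offset camera cannot see anything along the right wheel line until it is a couple of feet behind the vehicle, while the centered wide-angle camera picks it up within a few inches.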

The Crux CCH-01W is one of those products that seems like nobody who designed it ever actually had to use it. I think it should never have been released.

As I was shopping for this project, my son and a local professional installer advised me to buy a camera that mounted on the vehicle centerline instead of this one. I didn’t take their advice because the reviews for the CCH-01W were good, and the price was $170 less. Fortunately, Crutchfield has a generous return policy, and the center-mounting 170°-view replacement camera that I’ll install this weekend has arrived today.

I’ve learned a lot. The second installation will go much more quickly than the first.

Wednesday, July 29, 2015

I Wish I Sold More

I flew home yesterday from Karen’s memorial service in Jacksonville, on a connecting flight through Charlotte. When I landed in Charlotte, I walked with all my stuff from my JAX arrival gate (D7) to my DFW departure gate (B15). The walk was more stressful than usual because the airport was so crowded.

The moment I set my stuff down at B15, a passenger with expensive clothes and one of those permanent grins established eye contact, pointed his finger at me, and said, “Are you in First?”

Wai... Wha...?

I said, “No, platinum.” My first instinct was to explain that I had a right to occupy the space in which I was standing. It bothers me that this was my first instinct.

He dropped his pointing finger, and his eyes went no longer interested in me. The big grin diminished slightly.

Soon another guy walked up. Same story: the I’m-your-buddy-because-I’m-pointing-my-finger-at-you thing, and then, “First Class?” This time the answer was yes. “ALRIGHT! WHAT ROW ARE YOU IN?” Row two. “AGH,” like he’d been shot in the shoulder. He holstered his pointer finger, the cheery grin became vaguely menacing, and he resumed his stalking.

One guy who got the “First Class?” question just stared back. So, big-grin guy asked him again, “Are you in First Class?” No answer. Big-grin guy leaned in a little bit and looked him square in the eye. Still no answer. So he leaned back out, laughed uncomfortably, and said half under his breath, “Really?...”

I pieced it together watching this big, loud guy explain to his traveling companions, so everybody could hear him, that he just wanted to sit in Row 1 with his wife, but he had a seat in Row 2. And of course it will be so much easier to take care of it now than to wait and take care of it when everybody gets on the plane.

Of course.

This is the kind of guy who sells things to people. He has probably sold a lot of things to a lot of people. That’s probably why he and his wife have First Class tickets.

I’ll tell you, though, I had to battle against hoping he’d hit his head and fall down on the jet bridge (I battled coz it’s not nice to hope stuff like that). I would never have said something to him; I didn’t want to be Other Jackass to his Jackass. (Although people might have clapped if I had.)

So there’s this surge of emotions, none of them good, going on in my brain over stupid guy in the airport. Sales reps...

This is why Method R Corporation never had sales reps.

But that’s like saying I’ve seen bad aircraft engines before and so now in my airline, I never use aircraft engines. Alrighty then. In that case, I hope you like gliders. And, hey: gliders are fine if that makes you happy. But a glider can’t get me home from Florida. Or even take off by itself.

I wish I sold more Method R software. But never at the expense of being like the guy at the airport. It seems I’d rather perish than be that guy. This raises an interesting question: is my attitude on this topic just a luxury for me that cheats my family and my employees out of the financial rewards they really deserve? Or do I need to become that guy?

I think the answer is not A or B; it’s C.

There are also good sales people, people who sell a lot of things to a lot of people, who are nothing like the guy at the airport. People like Paul Kenny and the honorable, decent, considerate people I work with now at Accenture Enkitec Group who sell through serving others. There were good people selling software at Hotsos, too, but the circumstances of my departure in 2008 prevented me from working with them. (Yes, I do realize: my circumstances would not have prevented me from working with them if I had been more like the guy at the airport.)

This need for duality—needing both the person who makes the creations and the person who connects those creations to people who will pay for them—is probably the most essential of the founder’s dilemmas. These two people usually have to be two different people. And both need to be Good.

In both senses of the word.

My Friend Karen

My friend Karen Morton passed away on July 23, 2015 after a four-month battle against cancer. You can hear her voice here.

I met Karen Morton in February 2002. The day I met her, I knew she was awesome. She told me the story that, as a consultant, she had been doing something that was unheard-of. She guaranteed her clients that if she couldn’t make things on their systems go at least X much faster on her very first day, then they wouldn’t have to pay. She was a Give First person, even in her business. That is really hard to do. After she told me this story, I asked the obvious question. She smiled her big smile and told me that her clients had always paid her—cheerfully.

It was an honor when Karen joined my company just a little while later. She was the best teammate ever, and she delighted every customer she ever met. The times I got to work with Karen were bright spots in my life, during many of the most difficult years of my career. For me, she was a continual source of knowledge, inspiration, and courage.

This next part is for Karen’s family and friends outside of work. You know that she was smart, and you know she was successful. What you may not realize is how successful she was. Your girl was famous all over the world. She was literally one of the top experts on Earth at making computing systems run faster. She used her brilliant gift for explaining things through stories to become one of the most interesting and fun presenters in the Oracle world to go watch, and her attendance numbers proved it. Thousands of people all over the world know the name, the voice, and the face of your friend, your daughter, your sister, your spouse, your mom.

Everyone loved Karen’s stories. She and I told stories and talked about stories, it seems like, all the time we were together. Stories about how Oracle works, stories about helping people, stories about her college basketball career, stories about our kids and their sports, ...

My favorite stories of all—and my family’s too—were the stories about her younger brother Ted. These stories always started out with some middle-of-the-night phone call that Karen would describe in her most somber voice, with the Tennessee accent turned on full-bore: “Kar’n: This is your brother, Theodore LeROY.” Ted was Karen’s brother Teddy Lee when he wasn’t in trouble, so of course he was always Theodore LeROY in her stories. Every story Karen told was funny and kind.

We all wanted to have more time with Karen than we got, but she touched and warmed the lives of literally thousands of people. Karen Morton used her half-century here on Earth with us as well as anyone I’ve ever met. She did it right.

God bless you, Karen. I love you.

Friday, February 27, 2015

What happened to “when the application is fast enough to meet users’ requirements?”

On January 5, I received an email called “Video” from my friend and former employee Guðmundur Jósepsson from Iceland. His friends call him Gummi (rhymes with “do me”). Gummi is the guy whose name is set in the ridiculous monospace font on page xxiv of Optimizing Oracle Performance, apparently because O’Reilly’s Linotype Birka font didn’t have the letter eth (ð) in it. Gummi once modestly teased me that this is what he is best known for. But I digress...

His email looked like this:


It’s a screen shot of frame 3:12 from my November 2014 video called “Why you need a profiler for Oracle.” At frame 3:12, I am answering the question of how you can know when you’re finished optimizing a given application function. Gummi’s question is, «Oi! What happened to “when the application is fast enough to meet users’ requirements?”»

Gummi noticed (the good ones will do that) that the video says something different than the thing he had heard me say for years. It’s a fair question. Why, in the video, have I said this new thing? It was not an accident.

When are you finished optimizing?

The question in focus is, “When are you finished optimizing?” Since 2003, I have actually used three different answers:
When are you finished optimizing?
  A. When the cost of call reduction and latency reduction exceeds the cost of the performance you’re getting today.
    Source: Optimizing Oracle Performance (2003) pages 302–304.
  B. When the application is fast enough to meet your users’ requirements.
    Source: I have taught this in various courses, conferences, and consulting calls since 1999 or so.
  C. When there are no unnecessary calls, and the calls that remain run at hardware speed.
    Source: “Why you need a profiler for Oracle” (2014) frames 2:51–3:20.
My motive behind answers A and B was the idea that optimizing beyond what your business needs can be wasteful. I created these answers to deter people from misdirecting time and money toward perfecting something when those resources might be better invested improving something else. This idea was important, and it still is.

So, then, where did C come from? I’ll begin with a picture. The following figure allows you to plot the response time for a single application function, whatever “given function” you’re looking at. You could draw a similar figure for every application function on your system (although I wouldn’t suggest it).


Somewhere on this response time axis for your given function is the function’s actual response time. I haven’t marked that response time’s location specifically, but I know it’s in the blue zone, because at the bottom of the blue zone is the special response time RT. This value RT is the function’s top speed on the hardware you own today. Your function can’t go faster than this without upgrading something.

It so happens that this top speed is the speed at which your function will run if and only if (i) it contains no unnecessary calls and (ii) the calls that remain run at hardware speed. ...Which, of course, is the idea behind this new answer C.

Where, exactly, is your “requirement”?

Answer B (“When the application is fast enough to meet your users’ requirements”) requires that you know the users’ response time requirement for your function, so, next, let’s locate that value on our response time axis.

This is where the trouble begins. Most DBAs don’t know what their users’ response time requirements really are. Don’t despair, though; most users don’t either.

At banks, airlines, hospitals, telcos, and nuclear plants, you need strict service level agreements, so those businesses invest in quantifying them. But realize: quantifying all your functions’ response time requirements isn’t about a bunch of users sitting in a room arguing over which subjective speed limits sound the best. It’s about knowing your technological speed limits and understanding how close to those values your business needs to pay to be. It’s an expensive process. At some companies, it’s worth the effort; at most companies, it’s just not.

How about using, “well, nobody complains about it,” as all the evidence you need that a given function is meeting your users’ requirement? It’s how a lot of people do it. You might get away with doing it this way if your systems weren’t growing. But systems do grow. More data, more users, more application functions: these are all forms of growth, and you can probably measure every one of them happening where you’re sitting right now. All these forms of growth put you on a collision course with failing to meet your users’ response time requirements, whether you and your users know exactly what they are, or not.

In any event, if you don’t know exactly what your users’ response time requirements are, then you won’t be able to use “meets your users’ requirement” as your finish line that tells you when to stop optimizing. This very practical problem is the demise of answer B for most people.

Knowing your top speed

Even if you do know exactly what your users’ requirements are, it’s not enough. You need to know something more.

Imagine for a minute that you do know your users’ response time requirement for a given function, and let’s say that it’s this: “95% of executions of this function must complete within 5 seconds.” Now imagine that this morning when you started looking at the function, it would typically run for 10 seconds in your Oracle SQL Developer worksheet, but now after spending an hour or so with it, you have it down to where it runs pretty much every time in just 4 seconds. So, you’ve eliminated 60% of the function’s response time. That’s a pretty good day’s work, right? The question is, are you done? Or do you keep going?
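A requirement phrased like that one can be checked mechanically once you have measurements. Here is a minimal sketch; the response times in it are hypothetical samples, not measurements from any real system.

```python
def meets_requirement(response_times, limit_seconds=5.0, fraction=0.95):
    """Return True if at least `fraction` of executions finished within `limit_seconds`."""
    if not response_times:
        raise ValueError("need at least one measurement")
    within = sum(1 for t in response_times if t <= limit_seconds)
    return within / len(response_times) >= fraction


# Hypothetical samples, in seconds, for the same function before and after tuning.
this_morning = [10.2, 9.8, 10.5, 9.9, 10.1, 10.0, 9.7, 10.3, 10.4, 9.6]
after_one_hour = [4.1, 3.9, 4.0, 4.2, 3.8, 4.0, 4.1, 3.9, 4.0, 4.3]
with_two_stragglers = [4.0] * 18 + [6.5, 7.2]  # 18 of 20 within the limit

print("this morning:       ", meets_requirement(this_morning))
print("after one hour:     ", meets_requirement(after_one_hour))
print("with two stragglers:", meets_requirement(with_two_stragglers))
```

Note that "runs pretty much every time in 4 seconds" and "95% within 5 seconds" are different claims. The third sample set has a typical time of 4 seconds and still misses the requirement.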

Here is the reason that answer C is so important. You cannot responsibly answer whether you’re done without knowing that function’s top speed. Even if you know how fast people want it to run, you can’t know whether you’re finished without knowing how fast it can run.

Why? Imagine that 85% of those 4 seconds are consumed by Oracle enqueue, or latch, or log file sync calls, or by hundreds of parse calls, or 3,214 network round-trips to return 3,214 rows. If any of these things is the case, then no, you’re absolutely not done yet. If you were to allow some ridiculous code path like that to survive on a production system, you’d be diminishing the whole system’s effectiveness for everybody (even people who are running functions other than the one you’re fixing).
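To put a number on the round-trip example: if the client could fetch several rows per round-trip instead of one, the call count would drop sharply. The sketch below is only arithmetic; the fetch sizes and the per-round-trip latency are hypothetical, and how you would actually change the fetch size depends on your client software.

```python
def round_trips(rows, rows_per_fetch):
    """Number of network round-trips needed to return `rows` rows."""
    if rows_per_fetch < 1:
        raise ValueError("rows_per_fetch must be at least 1")
    return -(-rows // rows_per_fetch)  # integer ceiling division


ROWS = 3214                 # row count from the example above
ROUND_TRIP_SECONDS = 0.001  # hypothetical network latency per round-trip

for rows_per_fetch in (1, 10, 100, 1000):
    calls = round_trips(ROWS, rows_per_fetch)
    seconds = calls * ROUND_TRIP_SECONDS
    print(f"{rows_per_fetch:5d} rows per fetch: {calls:5d} round-trips, {seconds:.3f} s on the network")
```

This is r = cl again: the latency of each round-trip stays the same, and the response time falls because the call count does.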

Now, sure, if there’s something else on the system that has a higher priority than finishing the fix on this function, then you should jump to it. But you should at least leave this function on your to-do list. Your analysis of the higher priority function might even reveal that this function’s inefficiencies are causing the higher-priority function’s problems. Such can be the nature of inefficient code under conditions of high load.

On the other hand, if your function is running in 4 seconds and (i) its profile shows no unnecessary calls, and (ii) the calls that remain are running at hardware speeds, then you’ve reached a milestone:
  1. if your code meets your users’ requirement, then you’re done;
  2. otherwise, either you’ll have to reimagine how to implement the function, or you’ll have to upgrade your hardware (or both).
There’s that “users’ requirement” thing again. You see why it has to be there, right?
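The decision described above is small enough to write down as a function. This is just a sketch of the logic in this section; the names are mine, and deciding whether a call is "unnecessary" or running "at hardware speed" still takes a profile and a person who can read it.

```python
from enum import Enum


class NextStep(Enum):
    KEEP_OPTIMIZING = "remove unnecessary calls or fix slow ones"
    DONE = "done"
    REDESIGN_OR_UPGRADE = "reimagine the function, upgrade hardware, or both"


def next_step(has_unnecessary_calls, calls_at_hardware_speed, meets_requirement):
    """Apply the top-speed test first, then the users' requirement."""
    if has_unnecessary_calls or not calls_at_hardware_speed:
        # The function has not reached its top speed on today's hardware.
        return NextStep.KEEP_OPTIMIZING
    if meets_requirement:
        return NextStep.DONE
    # At top speed and still too slow: tuning the existing code cannot help.
    return NextStep.REDESIGN_OR_UPGRADE


# The 4-second function whose time is mostly 3,214 round-trips: not done yet,
# even though it meets the 5-second requirement.
print(next_step(has_unnecessary_calls=True, calls_at_hardware_speed=True, meets_requirement=True))
```

The sketch leaves out priorities. If something else on the system matters more right now, KEEP_OPTIMIZING means "leave it on the to-do list," not "drop everything."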

Well, here’s what most people do. They get their functions’ response times reasonably close to their top speeds (which, with good people, isn’t usually as expensive as it sounds), and then they worry about requirements only if those requirements are so important that it’s worth a project to quantify them. A requirement is usually considered really important if it’s close to your top speed or if it’s really expensive when you violate a service level requirement.

This strategy works reasonably well.

It is interesting to note here that knowing a function’s top speed is actually more important than knowing your users’ requirements for that function. A lot of companies can work just fine not knowing their users’ requirements, but without knowing your top speeds, you really are in the dark. A second observation that I find particularly amusing is this: not only is your top speed more important to know, your top speed is actually easier to compute than your users’ requirement (…if you have a profiler, which was my point in the video).

Better and easier is a good combination.

Tomorrow is important, too

When are you finished optimizing?
  A. When the cost of call reduction and latency reduction exceeds the cost of the performance you’re getting today.
  B. When the application is fast enough to meet your users’ requirements.
  C. When there are no unnecessary calls, and the calls that remain run at hardware speed.
Answer A is still a pretty strong answer. Notice that it actually maps closely to answer C. Answer C’s prescription for “no unnecessary calls” yields answer A’s goal of call reduction, and answer C’s prescription for “calls that remain run at hardware speed” yields answer A’s goal of latency reduction. So, in a way, C is a more action-oriented version of A, but A goes further to combat the perfectionism trap with its emphasis on the cost of action versus the cost of inaction.

One thing I’ve grown to dislike about answer A, though, is its emphasis on today in “…exceeds the cost of the performance you’re getting today.” After years of experience with the question of when optimization is complete, I think that answer A under-emphasizes the importance of tomorrow. Unplanned tomorrows can quickly become ugly todays, and as important as tomorrow is to businesses and the people who run them, it’s even more important to another community: database application developers.

Subjective goals are treacherous for developers

Many developers have no way to test, today, the true production response time behavior of their code, which they won’t learn until tomorrow. ...And perhaps not until some remote, distant tomorrow.

Imagine you’re a developer using 100-row tables on your desktop to test code that will access 100,000,000,000-row tables on your production server. Or maybe you’re testing your code’s performance only in isolation from other workload. Both of these are problems; they’re procedural mistakes, but they are everyday real-life for many developers. When this is how you develop, telling you that “your users’ response time requirement is n seconds” accidentally implies that you are finished optimizing when your query finishes in less than n seconds on your no-load system of 100-row test tables.
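Here is a rough sketch of why the 100-row test tells you so little. It counts block visits for two hypothetical access paths: one that reads every block of the table, and one that walks a balanced index. The block size, index fanout, and per-call latency are all made-up numbers, and real databases are far more complicated than this.

```python
ROWS_PER_BLOCK = 100     # hypothetical
INDEX_FANOUT = 100       # hypothetical entries per index block
CALL_SECONDS = 0.001     # hypothetical latency per block visit


def full_scan_calls(rows):
    """Block visits needed to read every row in the table."""
    return -(-rows // ROWS_PER_BLOCK)  # integer ceiling division


def index_lookup_calls(rows):
    """Block visits to find one row through a balanced index (its depth)."""
    depth, reach = 1, INDEX_FANOUT
    while reach < rows:
        reach *= INDEX_FANOUT
        depth += 1
    return depth


for label, rows in (("desktop test", 100), ("production", 100_000_000_000)):
    scan = full_scan_calls(rows)
    lookup = index_lookup_calls(rows)
    print(f"{label:12s} full scan: {scan:>13,d} calls ({scan * CALL_SECONDS:,.3f} s)   "
          f"index lookup: {lookup} calls ({lookup * CALL_SECONDS:.3f} s)")
```

On the 100-row table, both paths cost one call, so a stopwatch cannot tell them apart. At production volume under these assumptions, one path makes six calls and the other makes a billion.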

If you are a developer writing high-risk code—and any code that will touch huge database segments in production is high-risk code—then of course you must aim for the “no unnecessary calls” part of the top speed target. And you must aim for the “and the calls that remain run at hardware speed” part, too, but you won’t be able to measure your progress against that goal until you have access to full data volumes and full user workloads.

Notice that to do both of these things, you must have access to full data volumes and full user workloads in your development environment. To build high-performance applications, you must do full data volume testing and full user workload testing in each of your functional development iterations.

This is where agile development methods yield a huge advantage: agile methods provide a project structure that encourages full performance testing for each new product function as it is developed. Contrast this with the terrible project planning approach of putting all your performance testing at the end of your project, when it’s too late to actually fix anything (if there’s even enough budget left over by then to do any testing at all). If you want a high-performance application with great performance diagnostics, then performance instrumentation should be an important part of your feedback for each development iteration of each new function you create.
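As one illustration of what lightweight performance instrumentation can look like, here is a minimal Python sketch that records a call count and total elapsed time for each named call. The class and the call names are hypothetical; this is not the instrumentation built into Oracle or any particular product.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Profile:
    """Accumulates call counts and elapsed seconds per named call."""

    def __init__(self):
        self.counts = defaultdict(int)
        self.seconds = defaultdict(float)

    @contextmanager
    def call(self, name):
        """Time the enclosed block and charge it to `name`."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.counts[name] += 1
            self.seconds[name] += time.perf_counter() - start

    def report(self):
        """Return (name, call count, total seconds), largest total first."""
        ranked = sorted(self.seconds.items(), key=lambda kv: kv[1], reverse=True)
        return [(name, self.counts[name], secs) for name, secs in ranked]


profile = Profile()
for _ in range(3):
    with profile.call("fetch"):   # stand-in for a real database or network call
        time.sleep(0.001)
with profile.call("parse"):       # stand-in for a cheap call
    pass

for name, count, secs in profile.report():
    print(f"{name:8s} {count:3d} calls  {secs:.4f} s")
```

A report like this gives you both halves of r = cl for every iteration, so a growing call count shows up in development instead of in production.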

My answer

So, when are you finished optimizing?
  A. When the cost of call reduction and latency reduction exceeds the cost of the performance you’re getting today.
  B. When the application is fast enough to meet your users’ requirements.
  C. When there are no unnecessary calls and the calls that remain run at hardware speed.
There is some merit in all three answers, but as Dave Ensor taught me inside Oracle many years ago, the correct answer is C. Answer A specifically restricts your scope of concern to today, which is especially dangerous for developers. Answer B permits you to promote horrifically bad code, unhindered, into production, where it can hurt the performance of every function on the system. Answers A and B both presume that you know information that you probably don’t know and that you may not need to know. Answer C is my favorite answer because it tells you exactly when you’re done, using units you can measure and that you should be measuring.

Answer C is usually a tougher standard than answer A or B, and when it’s not, it is the best possible standard you can meet without upgrading or redesigning something. In light of this “tougher standard” kind of talk, it is still important to understand that what is optimal from a software engineering perspective is not always optimal from a business perspective. The term optimized must ultimately be judged within the constraints of what the business chooses to pay for. In the spirit of answer A, you can still make the decision not to optimize all your code to the last picosecond of its potential. How perfect you make your code should be a business decision. That decision should be informed by facts, and these facts should include knowledge of your code’s top speed.

Thank you, Guðmundur Jósepsson, of Iceland, for your question. Thank you for waiting patiently for several weeks while I struggled putting these thoughts into words.