Monday, February 14, 2011

It’s Conference Season!

My favorite mode of life is being busy doing something that I enjoy and that I know, beyond a doubt, is the Right Thing to be doing. Any hour I get to spend in that zone is a precious gift.

I’ve been in that zone nearly continuously for the past three weeks. I’ve been doing two of my favorite things: lots of consulting work (helping, earning, and learning), and lots of software development work (which helps me help, earn, and learn even faster).

I’m looking forward to the next four weeks, too, because another Right Thing that I love to do is talk with people about software performance, and three of my favorite events where I can do that are coming right up:
  • RMOUG Training Days, Denver CO — I leave tomorrow. I’m looking forward to reuniting with lots of good friends. My stage time will be Wednesday, February 16th, when I’ll talk about material from my new “Mastering Performance with Extended SQL Trace” paper. 
  • NoCOUG Winter Conference, Pleasanton CA — I’ll be in the east Bay Area on Thursday, February 24th, presenting the keynote address, where I’ll discuss whether Exadata means never having to “tune” again, and then I’ll spend two hours helping people to think clearly about performance.
  • Hotsos Symposium, Irving TX — I’ll present “Thinking Clearly about Performance” on Monday, March 7th. I love the agenda at this event. It’s a high quality lineup that is dedicated purely to Oracle software performance. This is one of the very few conferences where I can enjoy sitting and just watching for whole days at a time. If you are interested in Oracle system performance, do not miss this. 
Happy Valentine’s Day. I shall hope to see you soon.

Friday, January 21, 2011

Describing Performance Improvements (Beware of Ratios)

Recently, I received into my Spam folder an ad claiming that a product could “...improve performance 1000%.” Claims in that format have bugged me for a long time, at least as far back as the 1990s, when some of the most popular Oracle “tips & techniques” books of the era used that format a lot to state claims.

Beware of claims worded like that.

Whenever I see “...improve performance 1000%,” I have to do extra work to decode what the author has encoded in his tidy numerical package with a percent-sign bow. The two performance improvement formulas that make sense to me are these:
  1. Improvement = (b − a)/b, where b is the response time of the task before repair, and a is the response time of the task after repair. This formula expresses the proportion (or percentage, if you multiply by 100%) of the original response time that you have eliminated. It can’t be bigger than 1 (or 100%) without invoking reverse time travel.
  2. Improvement = b/a, where b and a are defined exactly as above. This formula expresses how many times faster the after response time is than the before one.
Since 1000% is bigger than 100%, it can’t have been calculated using formula #1. I assume, then, that when someone says “...improve performance 1000%,” he means that b/a = 10, which, expressed as a percentage, is 1000%. What I really want to know, though, is what were b and a? Were they 1000 and 1? 1 and .001? 6 and .4? (...In which case, I would have to search for a new formula #3.) Why won’t you tell me?
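To see how the two formulas diverge, here is a quick sketch (my own illustration, with hypothetical numbers; the function names are mine, not standard terminology):

```python
def improvement_proportion(b, a):
    """Formula 1: the proportion of the original response time eliminated.
    It can never exceed 1 (that is, 100%)."""
    return (b - a) / b

def improvement_factor(b, a):
    """Formula 2: how many times faster the task is after the repair."""
    return b / a

# Three hypothetical before/after pairs (times in seconds) that all yield
# the same "1000% improvement" claim under formula 2, yet eliminate very
# different amounts of real time.
for b, a in [(1000.0, 100.0), (10.0, 1.0), (0.010, 0.001)]:
    print(f"b={b}s, a={a}s: "
          f"formula 1 = {improvement_proportion(b, a):.0%}, "
          f"formula 2 = {improvement_factor(b, a):.0f}x, "
          f"time saved = {b - a}s")
```

All three pairs produce the same percentage under either formula, which is exactly why the percentage alone tells you so little: only stating b and a themselves removes the ambiguity.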

Any time you see a ‘%’ character, beware: you’re looking at a ratio. The principal benefit of ratios is also their biggest flaw. A ratio conceals its denominator. That, of course, is exactly what ratios are meant to do—it’s called normalization—but it’s not always good to normalize. Here’s an example. Imagine two SQL queries A and B that return the exact same result set. What’s better: query A, with a 90% hit ratio on the database buffer cache? or query B, with a 99% hit ratio?

Query    Cache hit ratio
A        90%
B        99%

As tempting as it might be to choose the query with the higher cache hit ratio, the correct answer is...
There’s not enough information given in the problem to answer. It could be either A or B, depending on information that has not yet been revealed.
Here’s why. Consider the two distinct situations listed below. Each situation matches the problem statement. For situation 1, the answer is: query B is better. But for situation 2, the answer is: query A is better, because it does far less overall work. Without knowing more about the situation than just the ratio, you can’t answer the question.

Situation 1
Query    Cache lookups    Cache hits    Cache hit ratio
A        100              90            90%
B        100              99            99%

Situation 2
Query    Cache lookups    Cache hits    Cache hit ratio
A        10               9             90%
B        100              99            99%
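A short sketch (my own illustration) makes the concealed denominator visible for both situations:

```python
def hit_ratio(lookups, hits):
    """Cache hit ratio: hits divided by lookups. Note that the lookups
    (the denominator, i.e. the total work done) vanish from the result."""
    return hits / lookups

# (lookups, hits) for each query in each situation described above.
situations = {
    "Situation 1": {"A": (100, 90), "B": (100, 99)},
    "Situation 2": {"A": (10, 9),   "B": (100, 99)},
}

for name, queries in situations.items():
    for query, (lookups, hits) in queries.items():
        print(f"{name}, query {query}: "
              f"ratio = {hit_ratio(lookups, hits):.0%}, "
              f"total lookups = {lookups}")
```

In situation 2, query A’s “worse” 90% ratio conceals that it did one-tenth the total work of query B, which is precisely the information the ratio was designed to hide.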

Because a ratio hides its denominator, it’s insufficient for explaining your performance results to people (unless your aim is intentionally to hide information, which I’ll suggest is not a sustainable success strategy). It is still useful to show a normalized measure of your result, and a ratio is good for that. I didn’t say you shouldn’t use them. I just said they’re insufficient. You need something more.

The best way to think clearly about performance improvements is to state the actual measurements and treat the ratio as a parenthetical bit of additional information, as in:
  • I improved response time of T from 10s to .1s (99% reduction).
  • I improved throughput of T from 42t/s to 420t/s (10-fold increase).
There are three critical pieces of information you need to include here: the before measurement (b), the after measurement (a), and the name of the task (here, T) that you made faster. I’ve talked about b and a before, but I’ve slipped this T thing in on you all of a sudden, haven’t I?

Even authors who give you b and a have a nasty habit of leaving off the T, which is even worse than leaving off the before and after numbers, because it implies that using their magic has improved the performance of every task on the system by exactly the same proportion (either p% or n-fold), which is almost never true. That is because it’s rare for any two tasks on a given system to have “similar” response time profiles (defining similar in the proportional sense). For example, imagine the following two quite dissimilar profiles:

Task A
Response time    Resource
100%             Total
90%              CPU
10%              Disk I/O

Task B
Response time    Resource
100%             Total
90%              Disk I/O
10%              CPU

No single component upgrade can have equal performance improvement effects upon both these tasks. Making CPU processing 2× faster will speed up task A by 45% and task B by 5%. Likewise, making Disk I/O processing 10× faster will speed up task A by 9% and task B by 81%.
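The arithmetic behind those figures is just Amdahl’s law applied to each profile; this sketch (and its function name) is my own illustration:

```python
def speedup_effect(fraction, component_speedup):
    """Proportion of total response time eliminated when a component that
    consumes `fraction` of the response time is made `component_speedup`
    times faster (Amdahl's law)."""
    return fraction * (1 - 1 / component_speedup)

# Task A spends 90% of its response time on CPU and 10% on disk I/O;
# task B is the reverse.
print(f"CPU 2x faster:   task A improves {speedup_effect(0.90, 2):.0%}, "
      f"task B improves {speedup_effect(0.10, 2):.0%}")
print(f"Disk 10x faster: task A improves {speedup_effect(0.10, 10):.0%}, "
      f"task B improves {speedup_effect(0.90, 10):.0%}")
```

Only the fraction of response time a component consumes, together with how much faster that component gets, determines the improvement; no single upgrade can affect both profiles equally.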

For a vendor to claim any noticeable, homogeneous improvement across the board on any computer system containing tasks A and B would be an outright lie.

Friday, January 14, 2011

An Axiomatic Approach to Algebra and Other Aspects of Life

Not many days pass that I don’t think a time or two about James R. Harkey. Mr. Harkey was my high school mathematics teacher. He taught me algebra, geometry, analytic geometry, trigonometry, and calculus. What I learned from Mr. Harkey influences—to this day—how I write, how I teach, how I plead a case, how I troubleshoot, .... These are the skills I’ve used to earn everything I own.

Prior to Mr. Harkey’s algebra class, algebra for me was just a morass of tricks to memorize: “Take the constant to the other side...”; “Cancel the common factors...”; “Flip the fraction and multiply...” I could practice for a while and then solve problems just like the ones I had been practicing, by applying memorized transformations to superficial patterns that I recognized, but I didn’t understand what I had been taught to do. Without continual practice, the rules I had memorized would evaporate, and then once more I’d be able to solve only those problems for which I could intuit the answer: “7x + 6 = 20” would have been easy, but “7/x – 6 = 20” would have stumped me. This made, for example, studying for final exams quite difficult.

On the first day of Mr. Harkey’s class, he gave us his rules. First, his strict rules of conduct in the classroom lived up to his quite sinister reputation, which was important. Our studies began with a single 8.5" × 14" sheet of paper that apparently he asked us to label “Properties A” (because that’s what I wrote in the upper right-hand corner; and yes, I still have it). He told us that we could consult this sheet of paper on every homework assignment and every exam he’d give. And here’s how we were to use it: every problem would be executed one step at a time; every step would be written down; and beside every step we would write the name of the rule from Properties A that we invoked to perform that step.

You can still hear us now: Holy cow, that’s going to be a lot of extra work.

Well, that’s how it was going to be. Here’s what each homework and test problem had to look like:


The first few days of class, we spent time reviewing every single item on Properties A. Mr. Harkey made sure we all agreed that each axiom and property was true before we moved on to the real work. He was filling our toolbox.

And then we worked problem after problem after problem.

Throughout the year, we did get to shift gears a few times. Not every ax + b = c problem required fourteen steps all year long. After some sequence of accomplishments (I don’t remember what it was—maybe some set number of ‘A’ grades on homework?), I remember being allowed to write the number of the rule instead of the whole name. (When did you first learn about foreign keys? ☺) Some accomplishments after that, we’d be allowed to combine steps like 3, 4 and 5 into one. But we had to demonstrate a pattern of consistent mastery to earn a privilege like that.

Mr. Harkey taught algebra as most teachers teach geometry or predicate logic. Every problem was a proof, documented one logical step at a time. In Mr. Harkey’s algebra class, your “answer” to a homework problem or test question wasn’t the number that x equals, it was the whole proof of how you arrived at the value of x in your answer. Mr. Harkey wasn’t interested in grading your answers. He was going to grade how you got your answers.

The result? After a whole semester of this, I understood algebra, and I mean thoroughly. You couldn’t make a good grade in Mr. Harkey’s algebra class without creating an intimate comprehension of why algebra works the way it does. Learning that way supplies you for a whole lifetime: I still understand it. I can make dimensioned drawings of the things I’m going to build in my shop. I can calculate the tax implications of my business decisions. I can predict the response time behavior of computer software. I can even help my children with their algebra. Nothing about algebra scares me, because I still understand all the rules.

When I help my boys with their homework, I make them use Mr. Harkey’s axiomatic approach with my own Properties A that I made for them. (I rearranged Mr. Harkey’s rules to better illuminate the symmetries among them. If Mr. Harkey had been handy with the laptop computer, which didn’t exist when I was in school, I imagine he’d have done the same thing.)

Invariably, when one of my boys misses a math problem, it’s for the same stupid reason that I make mistakes in my shop or at work. It’s because he’s tried to do steps in his head instead of writing them all down, and of course he’s accidentally integrated an assumption into his work that’s not true. When you don’t have a neat and orderly audit trail to debug, the only way you can fix your work is to start over, which takes more time (which itself increases frustration levels and degrades learning) and which bypasses perhaps the most important technical skill in all of Life today: the ability to troubleshoot.
Theory: Redoing an n-step math problem instead of learning how to propagate a correction to an error made in step n − k through step n is how we get to a society in which our support analysts know only two solutions to any problem: (a) reboot, and (b) reinstall.
It’s difficult to teach people the value of mastering the basics. It’s difficult enough with children, and it’s even worse with adults, but great teachers and great coaches understand how important it is. I’m grateful to have met my share, and I love meeting new ones. Actually, I believe my 11-year-old son has a baseball practice with one tomorrow. We’ll have to check his blog in about 30 years.

Thursday, January 13, 2011

New paper "Mastering Performance with Extended SQL Trace"

Happy New Year.

It’s been a busy few weeks. I finally have something tangible to show for it: “Mastering Performance with Extended SQL Trace” is the new paper I’ve written for this year’s RMOUG conference. Think of it as a 15-page update to chapter 5 of Optimizing Oracle Performance.

There’s lots of new detail in there. Some highlights:
  • How to enable and disable traces, even in uncooperative applications.
  • How to instrument your application so that tracing the right code path during production operation of your application becomes dead simple.
  • How to make that instrumentation highly scalable (think 100,000+ tps).
  • How timestamps since 10.2 allow you to know your recursive call relationships without guessing.
  • How to create response time profiles for calls and groups of calls, with examples.
  • Why you don’t want to be on Oracle 11g prior to 11.2.0.2.0.
I hope you’ll be able to make productive use of it.

Wednesday, October 20, 2010

Virtual Seminar: "Systematic Oracle SQL Optimization in Real Life"

On November 18 and 19, I’ll be presenting along with Tanel Põder, Jonathan Lewis, and Kerry Osborne in a virtual (GoToWebinar) seminar called Systematic Oracle SQL Optimization in Real Life. Here are the essentials:

What: Systematic Oracle SQL Optimization in Real Life.
Learn how to think clearly about Oracle performance, find your performance problems, and then fix them, whether you’re using your own code (which you can modify) or someone else’s (which you cannot modify).
Who: Cary Millsap, Tanel Põder, Jonathan Lewis, Kerry Osborne
When: 8am–12n US Pacific Time Thursday and Friday 18–19 November 2010
How much: 475 USD (375 USD if you register before 1 November 2010)

The format will be two hours per speaker: an hour and a half for presentation time, and a half hour for questions and answers. Here’s our agenda (all times are listed in USA Pacific Time):

Thursday    8:00a–10:00a    Cary Millsap: Thinking Clearly about Performance
            10:00a–12:00n   Tanel Põder: Understanding and Profiling Execution Plans
Friday      8:00a–10:00a    Jonathan Lewis: Writing Your SQL to Help the Optimizer
            10:00a–12:00n   Kerry Osborne: Controlling Execution Plans (without touching the code)

This is going to be a special event. My staff and I can’t wait to see it ourselves. I hope you will join us.

Thursday, October 7, 2010

Agile is Not a Dirty Word

While I was writing Brown Noise in Written Language, Part 2, twice I came across the word “agile.” First, the word “agility” was in the original sentence that I was criticizing. Joel Garry picked up on it and described it as “a code word for ‘sloppy programming.’” Second, if you read my final paragraph, you might have noticed that I used the term “waterfall” to describe one method for producing bad writing. Waterfall is a reliable method for producing bad computer software too, in my experience, and for exactly the same reason. Whenever I disparage “waterfall,” I’m usually thinking fondly of “agile,” which I consider to be “waterfall’s” opposite. I was thinking fondly of “agile,” then, when I wrote that paragraph, which put me at odds with Joel’s disparaging description of the word. Such conflict usually motivates me to write something.

In my career, I’ve almost always had one foot in each of two separate worlds. These days, one foot is in the Oracle world. There, I have all my old buddies from having worked at Oracle Corporation for over a decade, from companies like Miracle and Pythian, the Oracle ACEs and ACE Directors, Oracle OpenWorld, ODTUG, and a couple dozen or so user groups that I visit every year. The other foot is in the business of software. There, I have colleagues and friends from 37signals and Fog Creek and Red Gate and Pragmatic Marketing, the Business of Software conference, and the dozens of blogs and tweets that I study every day in order to fuel a company that makes not just software that meets a list of requirements, but software that makes you feel like something magical has been accomplished when you run it.

In my Oracle world, agile is a dirty word. I have to actually be careful when I use it. To my Oracle practitioner colleagues, the A-word means, as Joel wrote, “sloppy programming.” In my business of software world, though, “agile” means wholesome golden goodness, an elegant solution to the absolutely most difficult problems in our field. I’m not being facetious one little bit here, either. The two most important influences in my professional life in the past decade have been, far and away:
  1. Eli Goldratt’s The Goal: A Process of Ongoing Improvement
  2. Kent Beck’s Extreme Programming Explained: Embrace Change (2nd Edition)
Far and away.

I don’t mention this among most of my Oracle friends. I don’t blurt out the A-word to them, any more than I’d blurt out the F-word at my parents’ dinner table. To talk with my Oracle friends about the goodness of “A-word development” would go over like an enthusiastic hour-long lecture on urophagia.

A lot of really smart people are very anti-“agile.” I’m pretty sure that it’s mostly because they’ve seen project leaders in the Oracle market segment using the A-word to sell—or justify—some really bad decisions (see Table 1). So the word “agile” itself has been co-opted in our Oracle culture now to mean sloppy, stupid, unprofessional, irresponsible, immature, or naive. That’s ok. I’ve had words taken away from me before. (Like “scalability,” which today is little more than some vague synonym for “fast” or “good”; or “methodology,” which apparently people think sounds cooler than “method.” ...Ok, I am actually a little angry at the agile guys for that one.) That doesn’t mean I can’t still use the concepts.

Table 1.
What people think agile means: No written requirements specification; therefore, no disciplined way to match software to requirements.
What agile means: You write your requirements as computer programs that test your software, instead of writing them in natural-language documents that a human has to read and interpret to re-test your software every time a developer commits new source code.

What people think agile means: No testing phase; therefore, no testing.
What agile means: You test your software before every commit to your source code repository, by running your automated test suite.

What people think agile means: No written design specification; therefore, developers just “design” as they go.
What agile means: You iterate your design along with your code, but design changes are always accompanied by changes to the automated test programs (which, remember, are the specification).

What people think agile means: Rapid prototyping always results in the production code being, well, a rapid, fragile prototype.
What agile means: When you can’t know how (or whether) something will work, you build it and find out, but only the parts you know you’ll really need. You use the knowledge learned from those experiences to build the one you’ll keep.

Agile is not a synonym for sloppy. On the contrary, you're not really doing agile if you’re not extraordinarily disciplined. I think that is why a lot of people who try agile hit so hard when they fail. I hope you will check out Balancing Agility and Discipline: A Guide for the Perplexed, coauthored by Barry Boehm (yes, that Barry Boehm) if you feel perplexed and in need of guidance.

As with any label, I hope you’ll realize that when you use a word that stands for a complex collection of thought, not everyone who hears or reads the word sees the same mental picture. When this happens, the word ceases being a tool and becomes part of a new problem.

Friday, October 1, 2010

Brown Noise in Written Language, Part 2

Here is some more thinking on the subject of brown noise in written language, stimulated by Joel Garry’s comment to my prior post.

My point is not an appeal for more creative writing in the let’s-use-lots-of-adverbs sense. It’s an appeal for clarity of expression. More fundamentally, it is an appeal for having an idea to express in the first place. If you have an actual idea and express it in a useful way, then maybe you've created something that is not spam (even if it happens to be a mass mailing), because it yields some value to your audience.

My point is about being creative only to the extent that if you haven’t created an interesting thought to convey by the time you’ve written something, then you don’t deserve—and you’re not going to get—my attention. (Except you might get me to criticize your writing in my blog.)

What Lanham calls “the Official Style” is a tool for solving two specific problems: There’s (1) “I have no clear thought to express, yet I'm required to write something today.” And (2) “I have a thought I'd like to express, but I'm afraid that if I just come out and say it, I'll get in trouble.” Problem #1 happens, for example, to school children who are required to write when they really don’t have anything in mind to be passionate about. Problem #2 happens to millions who live out the Emperor’s New Clothes every day of their lives. They don’t “get” what their mission is or why it’s important, so when they’re required to write, they encrypt their material to hide from their audience that they don’t get it. The result includes spam, mission statements, and 98% of the PowerPoint presentations you’ll ever see in your life.

I’m always more successful when I orient my thoughts in the direction of gratitude, so a better Part 1 post from me would have been structured as:
  1. Wow, look at this horrible, horrible sentence. I am so lucky I don't have to live and work in an environment where this kind of expression (and by implication, this kind of thinking) is deemed acceptable.
  2. I highly recommend Lanham's Revising Prose. It is brilliant. It helps you fix this kind of writing, and—more importantly—the kind of thinking that leads to it.
  3. I’m grateful for the work of people like Lanham, Fried, Heinemeier-Hansson, and many others, who help us understand and appreciate clear thinking and courageous writing.
Writing is not just output. Writing is an iterative process—along with thinking, experimenting, testing—that creates new thought. If you try to use the waterfall approach when you write—“Step 1: Do all your thinking; Step 2: Do all your writing”—then you’ll miss the whole point of how writing clarifies and creates new thought. That is why learning how to revise prose is so important. It’s not just about how to make writing better. As Lanham illustrates in dozens of examples throughout his book, revising prose forces improvement in the writer’s thinking, which enriches the writer’s life even more than the writing, however tremendous, will enrich the reader.