Tuesday, January 18, 2022

Why the Oak Table Was So Great

This weekend, I watched a wonderful TEDx video by Barbara Sher, called “Isolation Is the Dream-Killer, Not Your Attitude.” Please watch this video. It’s 21 minutes, 18 seconds long.

It reminded me about what was so great about the Oak Table. That’s right: was. It’s not anymore.

Here’s what it was. People I admired, trusted, and liked would gather at a home in Denmark owned by a man named Mogens Nørgaard. Mogens is the kindest and most generous host I have ever encountered. He would give his whole home—every inch—to keep as many of us as he could, for a week, once or twice a year. Twenty, maybe thirty of us. We ate, drank, and slept, all for free, as much and for as long as we wanted. 

And the “us” in that sentence was no normal, regular, everyday “us.” It was Tom Kyte, Lex de Haan, Anjo Kolk, Jonathan Lewis, Graham Wood, Tanel Põder, Toon Koppelaars, Chris Antognini, Steve Adams, Stephan Haisley, James Morle, John Beresniewicz, Jože Senegačnik, Bryn Llewellyn, Tuomas Pystynen, Andy Zitelli, Johannes Djernæs, Michael Möller, Peter Gram, Dan Norris, Carel Jan-Engel, Pete Sharman, Tim Gorman, Kellyn Pot'Vin, Alex Gorbachev, Frits Hoogland, Karen Morton, Robyn Sands, Greg Rahn, and—my goodness—I’m leaving out even more people than I’m listing.

We spent a huge amount of our time sitting together at Mogens’s big oak table, which was big enough for about eight people. Or, in actuality, about twice that. We’d just work. And talk. If there wasn’t a meal on the table, then it would be filled with laptops and power cords covering every square inch. Oops, I mean millimeter. That table had millimeters.

And here’s what was so great about the Oak Table: you could say what you wanted—whatever it was!—and you could have it. You could just say your dream and your obstacle, and someone around the table would know how to make your dream come true.

It’s tricky even trying to remember good examples of people’s dreams, because I’m so far removed from it now. Some of them were nerdy things like, “I wonder how long an Oracle PARSE call would take if we did a 256-table join?” You’d hear, “Hmm, interesting. I think I have a test for that,” and then the next thing you know, Jonathan Lewis would be working on your problem. Or, “Hey, does anyone know how to do such-and-such in vim?” And Johannes Djernæs or Michael Möller would show you how easy it was.

I got into a career-saving conversation late one night with Robyn Sands. She had asked, “Is anybody else having trouble finding good PL/SQL developers? I can’t figure out where they are, if there even are any. Are there?” We talked for a while about why they were so scarce, and then I connected the dots that, hey, I have two superb PL/SQL developers at home on the bench, and I had been desperately trying to find them good work. The story that Robyn and I started some 3:00am over beers resulted in a superb consumer femtocell device for Robyn and a year’s worth of much-needed revenue for my tiny little team.

It was a world where you could have anything you want. Better yet, it was a world where you could dream properly. Today, in isolation, it’s hard to even dream right. After nearly two years of being locked away, I can barely conceive of a world that’s plentiful and joyous like those Oak Table years. I feel much smaller now. (Oh, and it wasn’t COVID-19 that killed that Oak Table experience. It died years before that—but, obviously, it’s a factor today.)

I want it back. I want my friends back. How are we going to do this?

Thursday, December 23, 2021

The New Book Is Ready: “Faster: How to Optimize a System”

My new book called Faster: How to Optimize a System is ready! You can buy the PDF now at method-r.com. The hardcover format should be available at Amazon sometime around mid-January. The hardcover format is available now at Amazon. 

The physical book will be about 250 pages long, in a normal 6 × 9-inch business book format. There are 109 chapters, each of which conveys either a story or a lesson. Most of the lessons are only a couple pages long. A few of the stories are longer than that, but the story chapters are little adventures that read quickly.

So far, the people I’ve shown it to seem to like it. I can’t wait for you to see it.

Buy Faster:

Friday, October 1, 2021

How I Spent My COVID Vacation

I haven’t blogged in, what, ...forever. A couple of years.

I have been writing, though. Just not here. A lot, actually. Here’s some of the stuff I’ve done during my lapse here:

But the biggest reason of all: I’ve spent every spare moment outside of my work obligations since March 2020…

The working title is How to Make Things Faster. It’s a book about making things (including software) go faster, for readers who aren’t necessarily technical. It will be about the size of a Dilbert book and contain pretty much everything I’ve ever learned in 30 years of work, both technical and political. The audience I’m aiming for is every IT professional, business leader, and consultant on Earth. I hope you’ll be able to find it in airport book stores next to Liz Wiseman and Dr. Phil.

The book is story-driven. You’ll get a fun story or two and then half a dozen bite-sized (~2½-page) chapters teaching lessons inspired by the stories, then a new story and more bite-sized lessons. I think that people will be able to enjoy reading the book both serially from front to back, and randomly two or three short chapters at a time.  

To ensure that the material would be accessible to non-technical readers, I hired a very special first reviewer for this project: my mom. My instructions to her were to identify anything that was either unclear or boring. Thanks to her help, other reviewers so far have characterized the book as “fun,” “lively,” “quick,” and “important.”

It’s been a lot of work, and there’s still a lot to do. I’m excited about it, and I hope you’ll be, too. I’ll get it to you as soon as I can.

If you want a peek, you should join me next Thursday, October 7 at the Dallas Oracle Users Group’s “DOUG Training Day 2021,” which will convene live (!) in Grapevine, Texas, not too far from where I’m sitting right now. This will be an important event for me: it’ll be the first time I’ll have presented face-to-face with a live audience in nearly two years, and it’ll be the first time I’ll have presented material from the book. Plus, it’s a keynote for the event, so …the pressure’s on. :-)

If you’re interested in the sound of the material, please consider booking me for a workshop.

Tuesday, July 23, 2019

Sometimes the Simplest Things Are the Most Important

I didn’t ride in First Class a lot during my career, but I can remember one trip, I was sitting in seat 1B. Aisle seat, very front row. Right behind the lavatory bulkhead. The lady next to me in 1A was very nice, and we traded some stories.

I can remember telling her during dinner, wow, just look at us. Sitting in an aluminum tube going 500 miles per hour, 40,000 feet off the ground. It’s 50º below zero out there. Thousands of gallons of kerosene are burning in huge cans bolted to our wings, making it all go. Yet here we sit in complete comfort, enjoying a glass of wine and a steak dinner. And just three feet away from us in that lavatory right there, a grown man is sitting on a toilet.

I said, you know, out of all the inventions that have brought us to where we are here today, the very most important one is probably that wall.

Tuesday, August 15, 2017

Words I Don’t Use, Part 5: “Wait”

The fifth “word I do not use” is the Oracle technical term wait.

The Oracle Wait Interface

In 1991, Oracle Corporation released some of the most important software instrumentation of all time: the “wait” statistics that were implemented in Oracle 7.0. Here’s part of the story, in Juan Loaiza’s words, as told in Nørgaard et. al (2004), Oracle Insights: Tales of the Oak Table.
This stuff was developed because we were running a benchmark that we could not get to perform. We had spent several weeks trying to figure out what was happening with no success. The symptoms were clear—the system was mostly idle—we just couldn’t figure out why.

We looked at the statistics and ratios and kept coming up with theories, the trouble was that none of them were right. So we wasted weeks tuning and fixing things that were not the problem. Finally we ran out of ideas and were forced to go back and instrument the code to figure out what the problem was.

Once the waits were instrumented the problem was diagnosed in minutes. We were having “free buffer” waits because the DBWR was not writing blocks fast enough. It’s amazing how hard that was to figure out with statistics, and how easy it was to figure out once the waits were instrumented.

...In retrospect a lot of the names could be greatly improved. The wait interface was added after the freeze date as a “stealth” project so it did not get as well thought through as it should have. Like I said, we were just trying to solve a problem in the course of a benchmark. The trouble is that so many people use this stuff now that if you change the names it will break all sorts of thing tools, so we have to leave them alone.
Before Juan’s team added this code, the Oracle kernel would show you only how much time its user calls (like parse, exec, and fetch) were taking. The new instrumentation, which included a set of new fixed views like v$session_wait and new WAIT lines in our trace files, showed how much time Oracle’s system calls (like reads, writes, and semops) were taking.

The Working-Waiting Model

The wait interface begat a whole new mental model about Oracle performance, based on the principle of working versus waiting:
Response Time = Service Time + Wait Time
In this formula, Oracle defines service time as the duration of the CPU used by your Oracle session (the duration Oracle spent working), and wait time as the sum of the durations of your Oracle wait events (the duration that Oracle spent waiting). Of course, response time in this formula means the duration spent inside the Oracle Database kernel.

Why I Don’t Say Wait, Part 1

There are two reasons I don’t use the word wait. The first is simply that it’s ambiguous.

The Oracle formula is okay for talking about database time, but the scope of my attention is almost never just Oracle’s response time—I’m interested in the business’s response time. And when you think about the whole stack (which, of course you do; see holistic), there are events we could call wait events all the way up and down:
  • The customer waits for an answer from a user.
  • The user waits for a screen from the browser.
  • The browser waits for an HTML page from the application server.
  • The application server waits for a database call from the Oracle kernel.
  • The Oracle kernel waits for a system call from the operating system.
  • The operating system’s I/O request waits to clear the device’s queue before receiving service.
  • ...
If I say waits, the users in the room will think I’m talking about application response time, the Oracle people will think I’m talking about Oracle system calls, and the hardware people will think I’m talking about device queueing delays. Even when I’m not.

Why I Don’t Say Wait, Part 2

There is a deeper problem with wait than just ambiguity, though. The word wait invites a mental model that actually obscures your thinking about performance.

Here’s the problem: waiting sounds like something you’d want to avoid, and working sounds like something you’d want more of. Your program is waiting?! Unacceptable. You want it to be working. The connotations of the words working and waiting are unavoidable. It sounds like, if a program is waiting a lot, then you need to fix it; but if it’s working a lot, then it is probably okay. Right?

Actually, no.

The connotations “work is virtuous” and “waits are abhorrent” are false connotations in Oracle. One is not inherently better or worse than the other. Working and waiting are not accurate value judgments about Oracle software. On the contrary, they’re not even meaningful; they’re just arbitrary labels. We could just as well have been taught to say that an Oracle program is “working on disk I/O” and “waiting to finish its CPU instructions.”

The terms working and waiting really just refer to different subroutine call types:

“Oracle is workingmeans“your Oracle kernel process is executing a user call”
“Oracle is waitingmeans“your Oracle kernel process is executing a system call”

The working-waiting model implies a distinction that does not exist, because these two call types have equal footing. One is no worse than the other, except by virtue of how much time it consumes. It doesn’t matter whether a program is working or waiting; it only matters how long it takes.

Working-Waiting Is a Flawed Analogy

The working-waiting paradigm is a flawed analogy. I’ll illustrate. Imagine two programs that consume 100 seconds apiece when you run them:

Program AProgram B
DurationCall typeDurationCall type
98system calls (waiting)98user calls (working)
2user calls (working)2system calls (waiting)

To improve program A, you should seek to eliminate unnecessary system calls, because that’s where most of A’s time has gone. To improve B, you should seek to eliminate unnecessary user calls, because that’s where most of B’s time has gone. That’s it. Your diagnostic priority shouldn’t be based on your calls’ names; it should be based solely on your calls’ contributions to total duration. Specifically, conclusions like, “Program B is okay because it doesn’t spend much time waiting,” are false.

A Better Model

I find that discarding the working-waiting model helps people optimize better. Here’s how you can do it. First, understand the substitute phrasing: working means executing a user call; and waiting means executing a system call. Second, understand that the excellent ideas people use to optimize other software are excellent ideas for optimizing Oracle, too:
  1. Any program’s duration is a function of all of its subroutine call durations (both user calls and system calls), and
  2. A program is running as fast as possible only when (1) its unnecessary calls have been eliminated, and (2) its necessary calls are running at hardware speed.
Oracle’s wait interface is vital because it helps us measure an Oracle program’s complete execution duration—not just Oracle’s user calls, but its system calls as well. But I avoid saying wait to help people steer clear of the incorrect bias introduced by the working-waiting analogy.

Monday, August 7, 2017

Words I Don’t Use, Part 4: “Expert”

The fourth “word I do not use” is expert.

When I was a young boy, my dad would sometimes drive me to school. It was 17 miles of country roads and two-lane highways, so it gave us time to talk.

At least once a year, and always on the first day of school, he would tell me, “Son, there are two answers to every test question. There’s the correct answer, and there’s the answer that the teacher expects. ...They’re not always the same.”

He would continue, “And I expect you to know them both.”

He wanted me to make perfect grades, but he expected me to understand my responsibility to know the difference between authority and truth. My dad thus taught me from a young age to be skeptical of experts.

The word expert always warns me of a potentially dangerous type of thinking. The word is used to confer authority upon the person it describes. But it’s ideas that are right or wrong; not people. You should evaluate an idea on its own merit, not on the merits of the person who conveys it. For every expert, there is an equal and opposite expert; but for every fact, there is not necessarily an equal and opposite fact.

A big problem with expert is corruption—when self-congratulators hijack the label to confer authority upon themselves. But of course, misusing the word erodes the word. After too much abuse within a community, expert makes sense only with finger quotes. It becomes a word that critical thinkers use only ironically, to describe people they want to avoid.

Saturday, July 29, 2017

Words I Don’t Use, Part 3: “Best Practice”

The third “word I do not use” is best practice.

The “best practice” serves a vital need in any industry. It is the answer to, “Please don’t make me learn about this; just tell me what to do.” The “best practice” is a fine idea in spirit, but here’s the thing: many practices labeled “best” don’t deserve the adjective. They’re often containers for bad advice.

The most common problem with “best practices” is that they’re not parameterized like they should be. A good practice usually depends on something: if this is true, then do that; otherwise, do this other thing. But most “best practices” don’t come with conditions of execution—they often contain no if statements at all. They come disguised as recipes that can save you time, but they often encourage you to skip past thinking about things that you really ought to be thinking about.

Most of my objections to “best practices” go away when the practices being prescribed are actually good. But the ones I see are often not, like the old SQL “avoid full-table scans” advice. Enforcing practices like this yields applications that don’t run as well as they should and developers that don’t learn the things they should. Practices like “Measure the efficiency of your SQL at every phase of the software life cycle,” are actually “best”-worthy, but alas, they’re less popular because they sound like real work.