Cary Millsap: tom kyte


Tuesday, March 31, 2009

Last call for C. J. Date course

Note added 3 April 2009: When I wrote this post, we were counting down toward 2 April as the date for our preliminary go/no-go decision. That date is now behind us, and we have made the preliminary decision to Go. We are accepting further enrollments. —Cary Millsap

Thursday 2 April 2009 is our last call for enrollment in C. J. Date's course, "How to write correct SQL, and know it: a relational approach to SQL." I'm looking forward to this course more eagerly than anything I've attended in the past ten years, ...maybe twenty.

SQL and I never really got along too well. When I first joined Oracle Corporation in 1989, I was new to relational databases. I had done one hierarchical database project in college. I enjoyed the project ok, but it wasn't something I ever wanted to do again. When I joined Oracle, I didn't know much about relational technology or SQL. In my formative first couple of years at Oracle, though, I just never learned to like the SQL language. Prior to my Oracle career, I designed languages and wrote compilers for a living. From a language design standpoint, it just seemed that SQL (at least "Oracle SQL") could have become something really cool, but it didn't. For Oracle to treat an empty string as NULL, for example, is a decision which I still can't believe made it into the light of day...

I had a lot of respect over the years for the people I met who knew how to make SQL do what they wanted it to do. Dominic Delmolino was one of the first people I ever met who could make SQL do things I had no idea it could do. I'm still amazed when I see the things that Tom Kyte can do with SQL. I was never one of the SQL people.

Lex de Haan is the first person I ever met who really revealed to me what my problem was. A few years ago, Lex delivered a Miracle presentation in Rødby, Denmark, that dropped my jaw. He explained a better way to write an application with SQL. He showed how to write a completely unambiguous specification using a language I understood, predicate calculus ("this set equals that set," that kind of thing). He then showed how to implement that specification in SQL.

Here's the problem, though. SQL doesn't implement many of the set-theory/predicate-calculus operations that I expect. I'm not looking at Lex's notes as I write this, so I'll show you an example outlined recently by Toon Koppelaars, Lex's coauthor on the brilliant book called Applied Mathematics for Database Professionals (Expert's Voice).

In SQL, there's no "set equality" operator. That's right, although SQL is a set processing language, it has no operator for testing whether one set A equals another set B. But set equality "A = B" can be rewritten as "(A is-a-subset-of B) and (B is-a-subset-of A)".

Unfortunately, SQL doesn't have an is-a-subset-of operator either. But "A is-a-subset-of B" can be rewritten into "A minus B = the-empty-set".

But SQL also lacks the concept of an empty set. The way to express that is to test whether the cardinality of a set is zero, as in "count(*)=0".
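The chain of rewrites above (set equality, to two subset tests, to two empty differences, to two count(*)=0 checks) can be sketched as follows. This is only an illustration under stated assumptions, not Lex's or Toon's actual formulation: it uses SQLite through Python's sqlite3 module rather than Oracle, so the standard EXCEPT operator stands in for Oracle's MINUS, and the tables a, b, c and the helper sets_equal are hypothetical names invented for the example.

```python
import sqlite3


def sets_equal(conn, table_a, table_b):
    """Hypothetical helper: True when both tables hold the same set of rows.

    A = B                is rewritten as (A subset-of B) and (B subset-of A)
    A subset-of B        is rewritten as (A minus B) = the-empty-set
    X = the-empty-set    is rewritten as count(*) = 0 over X
    """
    # Table names are interpolated directly; acceptable only in a toy example.
    # EXCEPT (MINUS in Oracle) discards duplicate rows, so this compares sets,
    # not bags.
    query = f"""
        SELECT (SELECT count(*) FROM
                   (SELECT * FROM {table_a} EXCEPT SELECT * FROM {table_b})) = 0
           AND (SELECT count(*) FROM
                   (SELECT * FROM {table_b} EXCEPT SELECT * FROM {table_a})) = 0
    """
    return bool(conn.execute(query).fetchone()[0])


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    CREATE TABLE c (x INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (3), (2), (1);
    INSERT INTO c VALUES (1), (2);
""")

print(sets_equal(conn, "a", "b"))  # same rows, different insertion order
print(sets_equal(conn, "a", "c"))  # c lacks the row 3
```

Both directions of the difference are needed: c minus a is empty even though a and c differ, so testing only one direction would establish a subset relationship, not equality.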

Over the course of an hour-long presentation, Lex showed me a dozen or so operators that are missing from SQL, which we really need for expressing our intentions clearly in SQL. He put structure around the negative feelings I had toward the language. And then he showed an equivalent translation for each missing operator that could be implemented in SQL, which invested back into the language a new power. That's the trick that caused my jaw to fall. In Lex's presentation, the game of writing applications in SQL went from this:

Implement complex thoughts in crappy language that requires me to record my thoughts in a format that doesn't much resemble my thinking.
Worry whether the implementation was really right.

...to this:

Record complex thoughts using a language designed well to record exactly such thoughts.
Translate the specification of the program into SQL, using translation patterns.

Since our predicate calculus expressions were explicit enough to be provable, and since we could prove the correctness of the translations we were using to move from our specification to our SQL, we could actually then prove the correctness of our SQL. It was a beam of hope that developers could actually write correct applications ...and know it!

That's the first day I ever got excited thinking about SQL.

So, on April 27–29 in Dallas, I'll get a chance to enter the next phase of that thinking. On top of that, the message will be delivered by Chris Date, who I really enjoyed at the Hotsos Symposium earlier this month, and who is one of the pioneers who invented the whole world our careers live in. I'm looking forward to it. It should be an interesting classroom, with Chris Date in the front and Karen Morton, Jeff Holt, and some others with me in the back. I hope you won't miss the opportunity.

Like I said, the final day to sign up is this Thursday 2 April 2009. I know that economic times are tough these days, but this is a one-of-a-kind education event that I believe will deliver lasting value to everyone who goes.

Thursday, February 5, 2009

On the Usefulness of Software Instrumentation

I appreciate that in today's reading, Chen Shapira has taught me about the psychological principle of the endowment effect and its influence over people's decision-making about whether to instrument their code. I'm beginning to collect a nice little list of inspiring observations about performance instrumentation. I've posted them in a public wiki. I hope you'll join in.

Here are some samples:

Chen Shapira: The pervasive lack of instrumentation in software products is more a result of psychological bias than real technical concerns. Software vendors can work around these psychological issues by building instrumentation as a default into tools involved in the development and deployment process. Just as Knuth said 35 years ago.

Don Knuth: I've become convinced that all compilers written from now on should be designed to provide all programmers with feedback indicating what parts of their programs are costing the most; indeed, this feedback should be supplied automatically unless it has been specifically turned off.

Tom Kyte: I have yet in 18 years to hear a valid reason why instrumentation should not be done. I have only heard extremely compelling reasons why it must be done.

Tom Kyte: To the developers that say “this is extra code that will just make my code run slower” I respond “well fine, we will take away V$ views, there will be no SQL_TRACE, no 10046 level 12 traces, in fact–that entire events subsystem in Oracle, it is gone”. Would Oracle run faster without this stuff? Undoubtedly—not. It would run many times slower, perhaps hundreds of times slower. Why? Because you would have no clue where to look to find performance related issues. You would have nothing to go on. Without this “overhead” (air quotes intentionally used to denote sarcasm there), Oracle would not have a chance of performing as well as it does. Because you would not have a chance to make it perform well. Because you would not know where even to begin.

Tom Kyte (I don't know of a link to this one; I'm paraphrasing from public statements I've seen him make): Showing where your code spends its time is a necessary function of any production application. The fact that the extra code required to do this consumes extra system capacity is no different from any other feature in your application. As with any required feature, you simply have to size your hardware with the feature in mind.

Wednesday, February 4, 2009

A report about our course in Utrecht

Toine van Beckhoven has blogged about his experience in our recent course in Utrecht, hosted by Miracle Benelux. I also like his post about the new function-based index he created after an inspiration from the Tom Kyte course he attended.