Tuesday, June 23, 2009

An Essay on Science

Richard Feynman defined science as "the belief in the ignorance of experts." Science begins by questioning established ideas. ...Even those ideas promoted by so-called experts.

The value of science that's obvious to everybody is the chance you might discover some valuable truth that nobody else has discovered before. That's the glamorous idea that might motivate you to begin the hard work that science sometimes requires. Science is also valuable to you when you learn that an established idea, no matter how much you may not like it, really is true after all. That second value of science is not as glamorous, but it's just as important. My little prayer with respect to that possibility is, "If an idea I believe is wrong, please let me find out before anybody else does."

Everyone can do science. Not just "scientists"; all of us. But you need to do science "right," or it's not science. Do it right, and you accumulate a little bit of truth. Do it wrong, and you've wasted your time, or worse, you've doomed yourself to waste more of your time in the future, too.

The difference between "right" and "wrong" in science is not some snooty, bureaucratic concept. You don't need a license or a blessing to do science right. You just need to ensure that the cause-effect relationships you choose to believe are actually correct. One of the rules for doing science right is that you measure instead of just asserting your opinion.

Different people have different thresholds of skepticism. Some people believe new ideas, whether they're true or false, with very little persuasion. The people who are persuaded easily to believe false things cannot contribute much useful new knowledge to their communities (irrespective of how much they might publish).

People at the opposite end of the spectrum have very strict standards for what they accept as truth. They're careful about what rows they insert and commit into their minds. There aren't as many of those people, but they're more interesting, with respect to science, because they're the people who can contribute new knowledge to the community.

A lot of what we do in the Oracle world comes down to a person demonstrating, say, in sqlplus, that some certain cause produces some certain effect in some certain version of the database on some certain operating system, and so on. Then the next step is when you have to look at that result and decide for yourself (or perhaps with someone's help), how relevant that result is for you.

Innumerable Oracle debates funnel into the argument that, "Yes, I can see plainly that this situation can happen, but that situation will probably never happen for me." This is what happens, for example, when the doubter believes the prover is using an example that is too contrived to be realistic, or when the doubter's context is different from the prover's. ...Like when the doubter is running a data warehouse, but the prover talks only about transaction processing.

That's where another level of value begins, and it's another place where science can help. It's where the issue becomes proving how relevant a given proposal really is for a given circumstance. One of the nice things about software (and Oracle Database software in particular) is that it's usually easy to write code that will tell you how often something happens, or how long it takes when it does. With software, I don't have to guess. I can measure and know. Software is unusual in the world in that it can be used to measure itself.

At this week's ODTUG conference in Monterey, California, a debate has formed that I'm interested in following. The established idea of the experts in this debate is a passionate belief that you should declare, enable, and index foreign key (FK) relationships. The counter-argument is that you should not.

I've heard probably most of the arguments on the pro side of the debate. (a) If you don't declare/enable/index your FKs, then you have to ensure the correct relationships in your application code, which is error-prone in both obvious and highly subtle ways. (b) Not implementing full FK integrity in the database is developmentally inefficient, because it violates the important principle that you should never duplicate code in an application. (c) Furthermore, the absence of declared/enabled/indexed FK integrity is operationally inefficient at run time, because it blocks the Oracle query optimizer from using code paths that can be hundreds of times faster and more scalable than when it can't rely upon database-enforced FK integrity.
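
Point (a) is the kind of claim that can be demonstrated rather than asserted. What follows is a minimal sketch, not an Oracle demonstration: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in database so the example is self-contained, and the `customer` and `orders` tables are hypothetical. It asks whether the database itself refuses a child row that points at a nonexistent parent, first with FK enforcement on and then with it off.

```python
import sqlite3


def orphan_insert_is_rejected(enforce_fk: bool) -> bool:
    """Return True if the database refuses a child row whose parent does not exist."""
    conn = sqlite3.connect(":memory:")
    # Assumption: SQLite stands in for Oracle here. SQLite enforces declared
    # FKs only when this pragma is ON, which is its analogue of "enabled".
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce_fk else 'OFF'}")
    # Hypothetical schema: orders.customer_id references customer.id.
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(id))"
    )
    conn.execute("INSERT INTO customer (id) VALUES (1)")
    try:
        # Customer 999 does not exist, so this row would be an orphan.
        conn.execute("INSERT INTO orders (id, customer_id) VALUES (100, 999)")
        return False  # the orphan row was accepted
    except sqlite3.IntegrityError:
        return True  # the database refused the orphan row
    finally:
        conn.close()


if __name__ == "__main__":
    print("enforced:  ", orphan_insert_is_rejected(True))
    print("unenforced:", orphan_insert_is_rejected(False))
```

With enforcement off, the orphan row goes in silently, and keeping it out becomes the job of every application that ever writes to the table. The result says nothing about Oracle's behavior or performance; it only shows the shape such a demonstration can take.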

I haven't heard a single compelling argument for the other side of the debate. But, you see, there could be a compelling argument that I haven't heard.

This is what makes this new debate interesting for me. One side or the other is on the brink of learning something important, if the debate is conducted properly.

The first thing the two sides will need to do is agree on whether both sides really are in disagreement and, if they are, what exactly the disagreement is. In lots of debates I've seen, we've figured out after defining the terms and deciding the context of the argument (for example, what kind of application we're talking about), that there's really no debate of principle at all. When that happens, it's debate closed. Each side agrees with the other, and maybe the two sides learn a little more about life on the other side of the fence.

If, after a suitable definition-of-terms process, the two sides really do still have a debate left, then there will be some kind of attestation of facts as both sides see them. One side, for example, will show sqlplus session output to demonstrate the subtle and not-so-subtle ways in which not doing the declare/enable/index thing causes corrupted data and horrific performance penalties. The other side of the argument will then show some contrary evidence that counts in favor of not doing the declare/enable/index thing.

If the sides can't agree on the truth of the "facts" presented, then the debate will collapse, and at least one side (possibly both) will have learned nothing. If each side succeeds in impressing the other side that the "facts" thus presented are actually true, then the debate will move to the discussions of the relevance of the facts just demonstrated. This is where one guy might say something like, "I know that queries with FKs in the database are faster, but I can't afford the performance penalty at data load time." To which the other guy might say, "Yes, loads are a little slower with referential integrity checking, but that's time spent executing a necessary code path to ensure that only correct data can get into your database. And besides, it's not a good trade-off to endure a thousand slow queries a day so that one load can go faster." Rebuttal, counter-rebuttal, and so on. You get the picture.
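
The load-time side of that exchange is something either debater could measure instead of guess. The sketch below is one hedged way to do it, again assuming SQLite via Python's sqlite3 module rather than Oracle; the table names, the row counts, and the `time_load` helper are all hypothetical. It times the same bulk load of child rows with FK checking on and off and reports how many rows arrived.

```python
import sqlite3
import time


def time_load(enforce_fk: bool, parents: int = 100, children: int = 20000):
    """Load `children` child rows; return (elapsed_seconds, rows_loaded)."""
    conn = sqlite3.connect(":memory:")
    # Assumption: SQLite stands in for Oracle; this pragma toggles FK checking.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce_fk else 'OFF'}")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(id))"
    )
    conn.executemany(
        "INSERT INTO customer (id) VALUES (?)", ((i,) for i in range(parents))
    )
    # Every child row points at an existing parent, so both runs load the same data.
    rows = [(i, i % parents) for i in range(children)]
    start = time.perf_counter()
    conn.executemany("INSERT INTO orders (id, customer_id) VALUES (?, ?)", rows)
    conn.commit()
    elapsed = time.perf_counter() - start
    loaded = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    conn.close()
    return elapsed, loaded


if __name__ == "__main__":
    for enforce in (False, True):
        seconds, count = time_load(enforce)
        print(f"FK enforced={enforce}: {count} rows in {seconds:.4f} s")
```

The numbers will vary by machine and say nothing about Oracle. The point is only that the size of the load penalty is a measurable quantity, to be weighed against query-time measurements taken on the system actually under debate.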

I am biased in my estimation of how this will turn out. But I sincerely respect when someone thoughtfully and sincerely challenges an important idea in my professional domain, no matter how well entrenched that idea may be. In fact, the more entrenched, the better. The debate will remain interesting as long as the counter-argument is thoughtful and sincere. As soon as the evidence for the challenging new idea reveals itself to be nonsense, or if the counter-argument context seems irrelevant to me (and the people I'm trying to represent), then I'll lose interest.

The best debate ends in a handshake (real or virtual) in which the people representing both sides learn something they didn't know before, one side perhaps more than the other. The best debaters value learning more than being right. The best debaters respect each other more after the debate because each has helped the other (and the community around them) to advance.

The worst debaters confuse the principles of factual correctness and personal correctness. When the debate shifts from factual to personal, it may become interesting to some people because of the enhanced drama, but the actual usefulness of the debate evaporates. ([Grin] I guess if the debate were solid to begin with, then the usefulness actually sublimates.)

No matter which idea wins this FK debate (right, I said idea, not people), I expect to be happy for the debate to have occurred. That's because I expect the debate to end with a resolution, and either I'm going to learn something completely new, or I'm going to fortify an existing belief. Something new is obviously exciting, and fortification will make me a more effective teacher of my existing belief. Either result benefits me and, I believe, the community.

With science, you get suspense, drama, plot twists, surprises, fortifications, .... Science is fun. I wish more people knew.

12 comments:

Marcin Przepiorowski said...

Hi,
Very true. When you are open and talk with other people, you can always learn. I wish that all people understood that. There are a lot of them who end every discussion because they think they know better.

Unknown said...

Cary,

Nice post, as usual.

I may have missed it, but do you have a link to where we could follow this debate online?

Cheers,

Doug

Cary Millsap said...

Thanks, Pioro.

Doug, this debate doesn't have an online venue yet. I'll let you know if it acquires one. If we're lucky, it'll play out at AskTom.

Joel Garry said...

"Yes, I can see plainly that this situation can happen, but that situation will probably never happen for me."

This is the flaw I often see in the arguments for integrity in the application as opposed to the database. The error is in thinking the app will continue to be the only view of the data over time (in other words, that time or the SDLC is irrelevant). I think people who argue that ought to be forced to do 4 or 5 maintenance programming projects trying to reconcile differing business rules and unclean data with no time to "do it right." There's no scientific way to demonstrate this issue, and there is also a strong argument favoring the app view: the relatively low limits on the complexity of business rules that can be expressed in the db.

I don't think everyone can do good science, and I don't think everyone can do good programming. It would be nice if they would or could, but few people are willing to accede to the rules of science. Most of the time that is no big deal, but there are too many times (like the NASA story in the Feynman link) where it is just sad.

So it's a good thing to advocate science, it's a good thing to pull the curtain back from these mysterious black boxen, but the battle is large and never ending.

Add costs and profit and propaganda into the mix, and the battle is uphill.

Even if you win the battle, you won't win the war, as the message will be lost among the many other battles.


Cary Millsap said...

Joel,

It makes me a little sad to believe you, but I do. The older I get, the more resigned I am to the likelihood of never "winning the war," but the more appreciative I become of the opportunities to score little victories.

—Cary

Marcelle Kratochvil said...

Teaser: The answer to a scenario raised is you move to a virtual roller coaster.

Noons said...

One important distinction in all this is the fundamental difference between experimentation and science.

One can experiment and gather facts. Those can be either verified by others or not.

That is not science.

Science in the context of a scientific approach is the act of deducing a theory that encompasses and explains facts and can be verified experimentally.

Of course in the case of the FK debate and as you well point out, the first cab off the rank should be: what exactly is the difference between the two camps, if there is one. And why.

Only after that is sorted out can we see a true result.

But, I digress.

Brian Tkatch said...

@Noons

"Science in the context of a scientific approach is the act of deducing a theory that encompasses and explains facts and can be verified experimentally."

Nope. People come up with hypotheses, and Science tests theories. Science itself, however, has no mechanism for deducing theories.

Mark Brady said...

@Brian,

"Science tests theories"

EXCELLENT!


It doesn't prove anything. It tests things. I can hypothesize that all coffee cups are white, and you can perform random sampling around the world and come up with only white coffee cups.

So either my hypothesis is correct, your sampling is wrong, or you just haven't found the exception that tests the rule.

I'm always fascinated by science TV shows... they talk about anthropology like a science.
"Clovis man" was the first human in North America... oh, wait... until we find pre-Clovis artifacts. Oops, guess our science was based on insufficient evidence.

Physics: I'll measure the time it takes the car to go from point A to point B. Time is time everywhere, right? Well, no, the clock in the car and the clock used by the observer have different measurements.

And on and on. I think we confuse science with facts. There are people who cling to a theology of science just as strongly as those who cling to religious dogma.

Just look at the global warming debate (I'm not relitigating, just comparing the tenor of the argument). The debate is loaded with "appeals to authority". If 95 Nobel *scientists* agree, it must be true. My problem is that most scientists of previous eras believed that the world was flat, that disease was caused by bad air, that the atom couldn't be divided, that the world would face massive famine because of overpopulation, etc.

Generation after generation science is shown to constantly come to right AND wrong conclusions yet no one is ever in doubt of their own conclusions.

Marcelle Kratochvil said...

Discussion on the topic can now be found at:

http://foreignkeys.blogspot.com/
