Thursday, January 13, 2011

New paper "Mastering Performance with Extended SQL Trace"

Happy New Year.

It’s been a busy few weeks. I finally have something tangible to show for it: “Mastering Performance with Extended SQL Trace” is the new paper I’ve written for this year’s RMOUG conference. Think of it a 15-page update to chapter 5 of Optimizing Oracle Performance.

There’s lots of new detail in there. Some highlights:
  • How to enable and disable traces, even in un-cooperative applications.
  • How to instrument your application so that tracing the right code path during production operation of your application becomes dead simple.
  • How to make that instrumentation highly scalable (think 100,000+ tps).
  • How timestamps since 10.2 allow you to know your recursive call relationships without guessing.
  • How to create response time profiles for calls and groups of calls, with examples.
  • Why you don’t want to be on Oracle 11g prior to 11.2.0.2.0.
I hope you’ll be able to make productive use of it.

10 comments:

Vladimir Begun said...

The trace file alphabet is extended in 11.2.0.2.

http://vbegun.blogspot.com/2010/06/this-is-nice.html

As you can see there are "letters" for LOB related operations, which, of course, is a very useful extension.

Cary Millsap said...

Thank you, Vladimir. We have some studying to do. We're going to figure out whether Oracle simply reclassified LOB read operations as dbcalls. It would be nice if the new dbcalls (LOBREAD for example) still had the capability to throw one or more WAIT lines with the actual syscall data on them. Your example makes it look like that's not the case, but we'll find out.

Cary Millsap said...

Aha: I posted my comment before noticing that you had responded to Jeff's on your blog. What you've shown looks like excellent news. Thank you very much for doing that.

Vladimir Begun said...

:-) No problems. You mentioned 11.2.0.2 and why it's worth to consider using that release but since I have not seen new "alphabet letters" in the article I decided to point to that "small" thing. It just could be a good addition to your article. Play with that, it seems to be a very useful thing.

tb0n3 said...

This was a great paper, thank you! I noticed your footnote on Solaris and microstate accounting, which led me to ask our Unix admins about it. Learned from them that microstate accounting cannot be disabled in Solaris 10, so nice to be able to assume more accuracy on that OS. More info here: http://tinyurl.com/4z2xv54

Cary Millsap said...

tb0n3, thank you; and thank you for the link.

Dallas said...

Another stunningly wonderful paper.

I learned a couple of things, remembered a couple of things I had forgotten, and found a way to make things much easier.

I wasn't aware that jdbc can pass session attributes - that will make things dead simple for me once I start to require it be done in new app development.

It was even better when read from a Kindle.

Bravo!

Cary Millsap said...

Thank you very much, Dallas!

Jay Mehta said...

An excellent white paper as usual. This is on of those "ever green" white paper that you can reference anytime!

I have a question about between call unaccounted-for duration which, as you explained in the section 19, is equal to R (total response time) minus time accounted for top level DB and OS call.

It's easy to account for top level DB calls by looking at e in DB calls at depth=0, but how do you identify top level OS calls and account for in response time analysis? What events, or OS calls, are top level OS calls? Are "SQL*Net message to/from clients" and idle events are the only ones that would contribute to top level OS calls?

If you have multiple FETCH calls, then we would see "SQL*Net message" events interspersed between FETCH calls. Are these SQL*Net message events accounted for e in FETCH DB calls?

Thanks in advance.

Cary Millsap said...

Jay,

Thank you for your very kind words. By inspecting the begin and end times of each call (like I did in the two-column {tim, call} table in section 17), you can know exactly which calls are not children of other calls.

Of course, our Method R software tools described in section 21 (Method R Profiler and mrskew) do this work for you.

Cary