Friday, July 11, 2008

So how do you FIX the problems that "Performance as a Service" helps you find?

I want to respond carefully to Reubin's comment on my Performance as a Service post from July 7. Reuben asked:
i can see how you actually determine for a customer where the pain points are and you can validate user remarks about poor performance. But i don't see from your post how you are going to attack the problem of fixing the performance issue.

i would be most interested in hearing your thoughts on that. I wonder if you guys are going to touch the actual code behind the "order button" you described.
Under the service model I described in the Performance as a Service post, a client using our performance service would have several choices about how to respond to a problem. They could contract our team as consultants; they could use someone else; they could do it themselves; or of course, they could choose to defer the solution.

Attacking the problem of fixing the performance issue is actually the "easy" part. Well, it's not necessarily always easy, but it's the part that my team have been doing over and over again since the early 1990s. We use Method R. Once we've measured the response time in detail with Oracle's extended SQL trace function, we know exactly where the task is spending the end user's time. From there, I think it's fair to say we (Jeff, Karen, etc.) are pretty skilled at figuring out what to do next.

Sometimes, the root cause of a performance problem requires manipulation of the application source code, and sometimes it doesn't. If you do diagnosis right, you should never have to guess which one it is. A lot of people wonder what happens if it's the code that needs modification, but the application is prepackaged, and therefore the source code is out of your control. In my experience, most vendors are very responsive to improving the performance of their products when they're shown unambiguously how to improve them.

If your application is slow, you should be eager to know exactly why it's slow. You should be equally eager to know, whether you wrote the code yourself or someone else wrote it for you. To avoid collecting the right performance diagnostic data for an application because you're afraid of what you might find out is like taking your hands off the wheel and covering your eyes when a child rides his bike out in front of the car that you're driving. There's significant time-value upon information about performance problems. Even if someone else's code is the reason for your performance problem (or whatever truth you might be afraid of learning), you need to know it as early as possible.

The SLA Manager service I talked about is so important because the most difficult part of using Method R is usually the data collection step. The difficulty is almost always more political than technical. It's overcoming the question, "Why should we change the way we collect our data?" I believe the business value of knowing how long your computer system takes to execute tasks for your business is important enough that it will get people into the habit of measuring response time. ...Which is a vital step in solving the data collection problem that's at the heart of every persistent performance problem I've ever seen. I believe the data collection service that I described will help remove the most important remaining barrier to highly manageable application software performance in our market.

No comments: