Wednesday, December 31, 2008

Syncing..., Part 3

In a couple of prior posts, I described some of my requirements for syncing my iPhone with my laptop computer. A couple of months ago, I solved the problem (and a whole bunch of other ones, too). I'm very pleased with the solution I chose, which I'll share with you here.

I bought a new MacBook Pro. Not one of the new solid-brick-of-aluminum ones, but one of the slightly older aluminum ones held together with screws. It has a beautiful 15-inch matte-finish monitor, and it's faster than any portable I've ever used. It runs Windows XP applications in my Fusion VM about 33% faster than the native XP laptop I replaced.

It was a big investment, but I love it. Every time I open up my MacBook, I feel good. Thirty seconds later after it has booted completely from scratch, I love it even more.

VMware Fusion, by the way, is a brilliant penetrant to the barriers I had perceived about moving from Microsoft Windows to Mac OS X. With Fusion, you can make your whole Windows PC run on your Mac if you want to. As you gain comfort in OS X, you can migrate applications from Windows to the native Mac operating system. I digress...

So, now I use iCal, which gives me the ability to subscribe to iCal feeds like the one from TripIt. And ones like webcal://, which automagically populate your calendar with US holidays, so you won't accidentally book a course on Labor Day. (Visit iCal World for more.)

My new system completely solves the travel problems I was talking about. Now, I do this before I travel:
  1. Book travel.
  2. Forward the confirmation emails to
  3. Print my TripIt unified itinerary for my briefcase. (Or not.)
  4. Sync my iPhone with my laptop.
And that's it. No data double-entry. None.

One more requirement I had, though, was syncing my iCal calendar with Google Calendar. I found a solution to that one, too. I tried using the Google Calaboration, but I really didn't like the way it forced me to deal with my separate calendars. The tool I chose is called Spanning Sync. I used their 15-day trial, liked it a lot, and bought a subscription. I love the way I can map from my list of Google calendars to my list of iCal calendars. However, I don't like the way it syncs my contacts, so I just turn that feature off.

I'm intrigued by the Spanning Sync business model as well. You can save $5 on Spanning Sync by clicking here. It works like this (I'm quoting here a Spanning Sync web page)...
After a 15-day free trial period, Spanning Sync usually costs $25/year, but you can save $5 by using my discount code if you decide to buy it:


Also, if you use my code I'll get a $5 referral fee from Spanning Sync. Once you're a subscriber you'll get a code of your own so you can make money every time one of your other friends subscribes to Spanning Sync. Pretty cool!
Anyway, I'm very pleased with the new system, and I'm happy to share the news.

Monday, December 29, 2008

Performance as a Service, Part 2

Over the holiday weekend, Dallas left a comment on my July 7 post that begins with this:
One of the biggest issues I run into is that most of my customers have no SLAs outside of availability.
It's an idea that resonates with a lot of people that I talk to.

I see the following progressive hierarchy when it comes to measuring performance...
  1. Don't measure response times at all.
  2. Measure response times. Don't alert at all.
  3. Measure response times. Alert against thresholds.
  4. Measure response times. Alert upon variances.
Most people don't measure response times at all (category 1), at least not until there's trouble. Most people don't measure response times even then, but some do. Not many people fit into what I've called category 2, because once you have a way to collect response time data, it's too tempting to do some kind of alerting with it.

Category 3 is a world in which people measure response times, and they compare those response times against some pre-specified list of tolerances for those response times. Here's where the big problem that Dallas is talking about hits you: Where does that list of tolerances come from? It takes work to make that list, and preceding that work is the motivation to make that list. Many companies just don't have that motivation.

I think it's the specter of the difficulty in getting to category 3 that prevents a lot of people from moving into category 2. I think that is Dallas's situation.

A few years ago, I would have listed category 3 at the top of my hierarchy, but at CMG'07, in a paper called "Death to Dashboards...," Peg McMahon and Justin Martin made me aware of another level: this notion of alerting based on variance.

The plan of creating a tolerance for every business task you execute on your system works fine for a few interesting tasks, but the idea doesn't scale to systems with hundreds or thousands of instrumented tasks. The task of negotiating, setting, and maintaining hundreds of different tolerances is just too labor-intensive.

Peg and Justin's paper described the notion that not bothering with individual tolerances works just as well—and with very low setup cost—because what you really ought to look out for are changes in response times. (It's an idea similar to what Robyn Sands described at Hotsos Symposium 2008.) You can look for variances without defining tolerances, but of course you cannot do it without measuring response times.

Dallas ends with:
I think one of the things you might offer as part of the "Performance as a Service" might be assisting customers in developing those performance SLAs, especially since your team is very experienced in knowing what is possible.
I of course love that he made that point, because this is exactly the kind of thing we're in business to do for people. Just contact us through We are ready, willing, and able, and now is a great time to schedule something.

There is a lot of value in doing the response time instrumentation exercise, no matter how you do alerting. The value comes in two main ways. First, just the act of measuring often reveals inefficiencies that are easy and inexpensive to fix. We find little mistakes all the time that make systems faster and nicer to use and that allow companies to use their IT budgets more efficiently. Second, response time information is just fascinating. It's valuable for people on both sides of the information supply-and-demand relationship to see how fast things really are, and how often you really run them. Seeing real performance data brings ideas together. It happens even if you don't choose to do alerting at all.

The biggest hurdle is in moving from category 1 to category 2. Once you're at category 2, your hardest work is behind you, and you and your business will have exactly the information you'll need for deciding whether to move on to category 3 or 4.

Monday, December 22, 2008

Cuts, Thinks About Cutting, Part 2

Thank you all for your comments on my prior post.

To Robyn's point in particular, I believe the processes of rehearsing, visualizing, and planning have extraordinarily high value. Visualization and rehearsal are vital, in my opinion, to doing anything well. Ask pilots, athletes, actors, public speakers, programmers, .... Check out "Blue Angels: A Year in the Life" from the Discovery Channel for a really good example.

But there's a time when the incremental value of planning drops beneath the value of actually doing something. I think the location of that "time" and the concept of obsessiveness are related. I'm learning that it's earlier than I had believed when I was younger.

The one sentence in my original blog post that captures the idea I want to emphasize is, "I'm not advocating that design shouldn't happen; I'm advocating that you not pretend your design is build-worthy before you can possibly know."

Thinking of all this reminds me of a "tree house" I wanted to build when I was about 10 years old. I use quotation marks, because it wasn't meant to have anything to do with a tree at all. Here was the general idea of the design:My dad was a pilot, so when I was 10, I was too. The front of this thing was going to have an instrument panel and some windows, and I was going to fly this thing everywhere I could think of.

I drew detailed design drawings, and I dug a hole. (If you have children, by the way, make sure they dig a hole sometime before they grow up. Trust me.) My plan called for a 2x4 frame with plywood walls screwed to that frame. It's when I started gathering the 2x4 lumber that my plan fell apart. See, I had naively planned that 2x4 lumber is 2 inches thick by 4 inches wide. It's not. It's 1-1/2 inches thick by 3-1/2 inches wide. That meant I'd have to redraw my whole plan. Using fractions.

I don't remember exactly what happened next, but I do remember exactly what didn't happen. I didn't redraw my whole plan, and I didn't ever build the "tree house." Eventually, I did finally fill in that hole in the yard.

Here are some lessons to take away from that experience:
  • Not understanding other people's specs well enough is one way to screw up your project.
  • Actually getting started is a sure way to learn what you need to know about those other people's specs.
  • You learn things in reality that you just don't learn when you're working on paper.
  • Not building the "tree house" at all meant that I'd have to wait until I was older to learn that putting wood on (or into) the ground is an impermanent solution.
  • The problems that you think during the design phase are your big problems may not be your actual big problems.

Friday, December 19, 2008

Cuts, Thinks About Cutting

This xkcd comic I saw today reminded me of a conversation a former colleague at Oracle and I had some years ago. During a break in one of the dozens of unmemorable meetings we attended together, he and I were talking about our wood shops. He had built several things lately, which he had told me about. I was in the planning phase of some next project I was going to do. He observed that this seemed to be the conversation we were always having. He argued that it would be better for me to build something, have the fun of building it, actually use the thing I had built, and then learn from my mistakes than to spend so much of my time hatching on what I was going to do next.

I of course argued that I didn't want to waste my very precious whatever-it-was that I would mess up if I didn't do it exactly right on the first try. We went back and forth over the idea, until we had to go back to our unmemorable meeting. I remember him shaking his head as we walked back in.

Sometime after we sat down, he passed a piece of paper over to me that looked like this:It said "cuts" with an arrow pointing to him, and "thinks about cutting" with an arrow pointing to me.

Thus was I irretrievably labeled.

He was right. In recent years, I've grown significantly in the direction of acquiring actual experience, not just imagined experience. I'm still inherently a "math guy" who really enjoys modeling and planning and imagining. But most of the things I've created that are really useful, I've created by building something very basic that just works. Most of the time when I've done that, I've gotten really good use out of that very basic thing. Most of the time, such a thing has enabled me to do stuff that I couldn't have done without it.

In the cases where the very basic thing hasn't been good enough, I've either enhanced it, or I've scrapped it in favor of a new version 2. In either case, the end result was better (and it happened earlier) than if I had tried to imagine every possible feature I was going to need before I built anything. Usually the enhancements have been in directions that I never imagined prior to actually putting my hands on the thing after I built it. I've written about some of those experiences in a paper called "Measure once, cut twice (no, really)".

Incremental design is an interesting subject, especially in the context of our Oracle ecosystem, where there is a grave shortage of competent design. But I do believe that a sensible application of the "measure-once, cut-twice" principle improves software quality, even in projects where you need a well-designed relational database model. I'm not advocating that design shouldn't happen; I'm advocating that you not pretend your design is build-worthy before you can possibly know.

Some good places where you can read more about this philosophy include...

Tuesday, December 16, 2008

A Small Adventure in Profiling

Tonight I'm finishing up some code I'm writing. It's a program that reports on directories full of trace files. I can tell you more about that later. Anyway, tonight, I got my code doing pretty much what I wanted it to be doing, and I decided to profile it. This way, I can see where my code is spending its time.

My program is called lstrc. It's written in Perl. Here's how I profiled it:
23:31:09 $ perl -d:DProf /usr/local/bin/lstrc
The output of my program appeared when I ran it. Then I ran dprofpp, and here's what I got:
23:31:23 $ dprofpp
Total Elapsed Time = 0.411082 Seconds
User+System Time = 0.407182 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
64.1 0.261 0.261 18 0.0145 0.0145 TFK::Util::total_cpu
18.1 0.074 0.076 176 0.0004 0.0004 TFK::Util::timdat
5.65 0.023 0.052 9 0.0026 0.0058 main::BEGIN
1.72 0.007 0.007 1348 0.0000 0.0000 File::ReadBackwards::readline
0.98 0.004 0.022 6 0.0007 0.0036 TFK::Util::BEGIN
0.74 0.003 0.011 18 0.0002 0.0006 TFK::Util::tim1
0.74 0.003 0.004 6 0.0005 0.0006 ActiveState::Path::BEGIN
0.74 0.003 0.014 7 0.0004 0.0019 Date::Parse::BEGIN
0.74 0.003 0.359 2 0.0015 0.1797 main::process_files
0.49 0.002 0.002 4 0.0005 0.0006 Config::BEGIN
0.49 0.002 0.002 177 0.0000 0.0000 File::Basename::fileparse
0.49 0.002 0.002 176 0.0000 0.0000 File::Basename::_strip_trailing_sep
0.49 0.002 0.002 3 0.0005 0.0005 Exporter::as_heavy
0.49 0.002 0.002 6 0.0003 0.0004 File::ReadBackwards::BEGIN
0.25 0.001 0.002 24 0.0001 0.0001 Getopt::Long::BEGIN
What this says is that the function called TFK::Util::total_cpu accounts for 64.1% of the program's total execution time. The thing you couldn't have known (except I'm going to tell you) is that this program is not supposed to execute the function TFK::Util::total_cpu. At all. It's because I didn't specify the --cpu command line argument. (I told you that you couldn't have known.)

Given this knowledge that my code was spending 64.1% of my time executing a function that I didn't even want to run, I was able to add the appropriate branch around the call of TFK::Util::total_cpu. Then, when I ran my code again, it produced exactly the same output, but its profile looked like this:
23:33:07 $ dprofpp
Total Elapsed Time = 0.150279 Seconds
User+System Time = 0.147957 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
50.0 0.074 0.076 176 0.0004 0.0004 TFK::Util::timdat
15.5 0.023 0.053 9 0.0026 0.0058 main::BEGIN
4.73 0.007 0.007 1348 0.0000 0.0000 File::ReadBackwards::readline
2.70 0.004 0.022 6 0.0007 0.0036 TFK::Util::BEGIN
2.70 0.004 0.013 18 0.0002 0.0007 TFK::Util::tim1
2.03 0.003 0.004 6 0.0005 0.0006 ActiveState::Path::BEGIN
2.03 0.003 0.013 7 0.0004 0.0019 Date::Parse::BEGIN
2.03 0.003 0.100 2 0.0015 0.0499 main::process_files
1.35 0.002 0.002 4 0.0005 0.0005 Config::BEGIN
1.35 0.002 0.002 177 0.0000 0.0000 File::Basename::fileparse
1.35 0.002 0.002 176 0.0000 0.0000 File::Basename::_strip_trailing_sep
1.35 0.002 0.002 6 0.0003 0.0004 File::ReadBackwards::BEGIN
1.35 0.002 0.002 3 0.0005 0.0005 Exporter::as_heavy
0.68 0.001 0.002 24 0.0001 0.0001 Getopt::Long::BEGIN
0.68 0.001 0.005 176 0.0000 0.0000 File::Basename::basename

Let me summarize:
Total Elapsed Time = 0.411082 Seconds — before profiling
Total Elapsed Time = 0.150279 Seconds — after profiling
That's about a 64% improvement in response time, in return for about 30 extra seconds of development work.

Profiling—seeing how your code has spent your time—rocks.