Tuesday, December 16, 2008

A Small Adventure in Profiling

Tonight I'm finishing up some code I'm writing. It's a program that reports on directories full of trace files. I can tell you more about that later. Anyway, tonight, I got my code doing pretty much what I wanted it to be doing, and I decided to profile it. This way, I can see where my code is spending its time.

My program is called lstrc. It's written in Perl. Here's how I profiled it:
23:31:09 $ perl -d:DProf /usr/local/bin/lstrc
The output of my program appeared when I ran it. Then I ran dprofpp, and here's what I got:
23:31:23 $ dprofpp
Total Elapsed Time = 0.411082 Seconds
User+System Time = 0.407182 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
64.1 0.261 0.261 18 0.0145 0.0145 TFK::Util::total_cpu
18.1 0.074 0.076 176 0.0004 0.0004 TFK::Util::timdat
5.65 0.023 0.052 9 0.0026 0.0058 main::BEGIN
1.72 0.007 0.007 1348 0.0000 0.0000 File::ReadBackwards::readline
0.98 0.004 0.022 6 0.0007 0.0036 TFK::Util::BEGIN
0.74 0.003 0.011 18 0.0002 0.0006 TFK::Util::tim1
0.74 0.003 0.004 6 0.0005 0.0006 ActiveState::Path::BEGIN
0.74 0.003 0.014 7 0.0004 0.0019 Date::Parse::BEGIN
0.74 0.003 0.359 2 0.0015 0.1797 main::process_files
0.49 0.002 0.002 4 0.0005 0.0006 Config::BEGIN
0.49 0.002 0.002 177 0.0000 0.0000 File::Basename::fileparse
0.49 0.002 0.002 176 0.0000 0.0000 File::Basename::_strip_trailing_sep
0.49 0.002 0.002 3 0.0005 0.0005 Exporter::as_heavy
0.49 0.002 0.002 6 0.0003 0.0004 File::ReadBackwards::BEGIN
0.25 0.001 0.002 24 0.0001 0.0001 Getopt::Long::BEGIN
What this says is that the function called TFK::Util::total_cpu accounts for 64.1% of the program's total execution time. The thing you couldn't have known (except I'm going to tell you) is that this program is not supposed to execute the function TFK::Util::total_cpu. At all. It's because I didn't specify the --cpu command line argument. (I told you that you couldn't have known.)

Given this knowledge that my code was spending 64.1% of my time executing a function that I didn't even want to run, I was able to add the appropriate branch around the call of TFK::Util::total_cpu. Then, when I ran my code again, it produced exactly the same output, but its profile looked like this:
23:33:07 $ dprofpp
Total Elapsed Time = 0.150279 Seconds
User+System Time = 0.147957 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
50.0 0.074 0.076 176 0.0004 0.0004 TFK::Util::timdat
15.5 0.023 0.053 9 0.0026 0.0058 main::BEGIN
4.73 0.007 0.007 1348 0.0000 0.0000 File::ReadBackwards::readline
2.70 0.004 0.022 6 0.0007 0.0036 TFK::Util::BEGIN
2.70 0.004 0.013 18 0.0002 0.0007 TFK::Util::tim1
2.03 0.003 0.004 6 0.0005 0.0006 ActiveState::Path::BEGIN
2.03 0.003 0.013 7 0.0004 0.0019 Date::Parse::BEGIN
2.03 0.003 0.100 2 0.0015 0.0499 main::process_files
1.35 0.002 0.002 4 0.0005 0.0005 Config::BEGIN
1.35 0.002 0.002 177 0.0000 0.0000 File::Basename::fileparse
1.35 0.002 0.002 176 0.0000 0.0000 File::Basename::_strip_trailing_sep
1.35 0.002 0.002 6 0.0003 0.0004 File::ReadBackwards::BEGIN
1.35 0.002 0.002 3 0.0005 0.0005 Exporter::as_heavy
0.68 0.001 0.002 24 0.0001 0.0001 Getopt::Long::BEGIN
0.68 0.001 0.005 176 0.0000 0.0000 File::Basename::basename
Yay.

Let me summarize:
Total Elapsed Time = 0.411082 Seconds — before profiling
Total Elapsed Time = 0.150279 Seconds — after profiling
That's about a 64% improvement in response time, in return for about 30 extra seconds of development work.

Profiling—seeing how your code has spent your time—rocks.

8 comments:

  1. That's cool Carry!
    Regards
    Aman....

    ReplyDelete
  2. I know we've talked about this before but to me this just makes the case stronger for creating ILO-like libraries for other languages (C#, C++, etc). Using those libraries and Aspect Oriented Programming the libraries could be linked in fairly easily.

    ReplyDelete
  3. Carry,

    Why did the system decide to use that function ?

    Bundit

    ReplyDelete
  4. I hope I'm not duplicating a comment when I post this. I got a weird error from the web server.

    Not all profiling data collectors work well. This is especially true for Perl's DProf module. I and others before me have logged bugs that demonstrate DProf's inadequacies. Unfortunately, there's no good solution.

    So, when we say things like "why guess when you can know" we are talking only about environments where you can collect reliable data. In less reliable environments you still need to rely on your experience and intuition.

    ReplyDelete
  5. Bundit,

    My program called that function because I had told it to. In the display function, my code branched around displaying the total CPU time unless --cpu had been specified. But in the main body of the code, I had accidentally neglected to branch around the computation of total CPU time unless --cpu had been specified.

    So it was a bug in my code that led to an unnecessary consumption of end-user response time. It was easy to fix, and the profiler made it very easy to find.

    Hope this helps,

    Cary

    ReplyDelete
  6. Did you ever take a look at Devel::NYTProf? It has nice HTML output. I like that it also has statement/line level profiling.

    ReplyDelete
  7. Thanks Yngvi. No, I didn't know about that. Thank you for the lead. The manpage looks very, very good.

    ReplyDelete
  8. ...and so does the output. Wow, nice.

    ReplyDelete