Thursday, May 19, 2016

Messed-Up App of the Day: Tables of Numbers

Quick, which database is the biggest space consumer on this system?
Database                  Total Size   Total Storage
-------------------- --------------- ---------------
SAD99PS                    635.53 GB         1.24 TB
ANGLL                        9.15 TB         18.3 TB
FRI_W1                       2.14 TB         4.29 TB
DEMO                         6.62 TB        13.24 TB
H111D16                      7.81 TB        15.63 TB
HAANT                         1.1 TB          2.2 TB
FSU                          7.41 TB        14.81 TB
BYNANK                       2.69 TB         5.38 TB
HDMI7                      237.68 GB       476.12 GB
SXXZPP                     598.49 GB         1.17 TB
TPAA                         1.71 TB         3.43 TB
MAISTERS                   823.96 GB         1.61 TB
p17gv_data01.dbf            800.0 GB         1.56 TB
It’s harder than it looks.

Did you come up with ANGLL? If you didn’t, then you should look again. If you did, then what steps did you have to execute to find the answer?

I’m guessing you did something like I did:
  1. Skim the entire list. Notice that HDMI7 has a really big value in the third column.
  2. Read the column headings. Parse the difference in meaning between “size” and “storage.” Realize that the “storage” column is where the answer to a question about space consumption will lie.
  3. Skim the “Total Storage” column again and notice that the wide “476.12” number I found previously has a GB label beside it, while all the other labels are TB.
  4. Skim the table again to make sure there’s no PB in there.
  5. Do a little arithmetic in my head to realize that a TB is 1000× bigger than a GB, so 476.12 is probably not the biggest number after all, in spite of how big it looked.
  6. Re-skim the “Total Storage” column looking for big TB numbers.
  7. The biggest-looking TB number is 15.63 on the H111D16 row.
  8. Notice the trap on the ANGLL row that there are only three significant digits showing in the “18.3” figure, which looks physically the same size as the three-digit figures “1.24” and “4.29” directly above and below it, but realize that 18.3 (which should have been rendered “18.30”) is an order of magnitude larger.
  9. Skim the column again to make sure I’m not missing another such number.
  10. The answer is ANGLL.
That’s a lot of work. Every reader who uses this table to answer that question has to do it.

Rendering the table differently makes your readers’ (plural!) job much easier:
Database          Size (TB)  Storage (TB)
----------------  ---------  ------------
SAD99PS                 .64          1.24
ANGLL                  9.15         18.30
FRI_W1                 2.14          4.29
DEMO                   6.62         13.24
H111D16                7.81         15.63
HAANT                  1.10          2.20
FSU                    7.41         14.81
BYNANK                 2.69          5.38
HDMI7                   .24           .48
SXXZPP                  .60          1.17
TPAA                   1.71          3.43
MAISTERS                .82          1.61
p17gv_data01.dbf        .80          1.56
This table obeys an important design principle:
The amount of ink it takes to render each number is proportional to its relative magnitude.
I fixed two problems: (i) now all the units are consistent (I have guaranteed this feature by adding unit label to the header and deleting all labels from the rows); and (ii) I’m showing the same number of significant digits for each number. Now, you don’t have to do arithmetic in your head, and now you can see more easily that the answer is ANGLL, at 18.30 TB.

Let’s go one step further and finish the deal. If you really want to make it as easy as possible for readers to understand your space consumption problem, then you should sort the data, too:
Database          Size (TB)  Storage (TB)
----------------  ---------  ------------
ANGLL                  9.15         18.30
H111D16                7.81         15.63
FSU                    7.41         14.81
DEMO                   6.62         13.24
BYNANK                 2.69          5.38
FRI_W1                 2.14          4.29
TPAA                   1.71          3.43
HAANT                  1.10          2.20
MAISTERS                .82          1.61
p17gv_data01.dbf        .80          1.56
SAD99PS                 .64          1.24
SXXZPP                  .60          1.17
HDMI7                   .24           .48
Now, your answer comes in a glance. Think back at the comprehension steps that I described above. With the table here, you only need:
  1. Notice that the table is sorted in descending numerical order.
  2. Comprehend the column headings.
  3. The answer is ANGLL.
As a reader, you have executed far less code path in your brain to completely comprehend the data that the author wants you to understand.

Good design is a topic of consideration. And even conservation. If spending 10 extra minutes formatting your data better saves 1,000 readers 2 minutes each, then you’ve saved the world 1,990 minutes of wasted effort.

But good design is also a very practical matter for you personally, too. If you want your audience to understand your work, then make your information easier for them to consume—whether you’re writing email, proposals, reports, infographics, slides, or software. It’s part of the pathway to being more persuasive.

1 comment:

Jared said...

Thanks Cary, interesting as usual.

This is an excellent example what many folks would consider to be obviously good design.

It is obviously less obvious to some.