Showing posts with label donald knuth. Show all posts
Showing posts with label donald knuth. Show all posts

Friday, January 8, 2016

The “Two Spaces After a Period” Thing

Once upon a time, I told my friend Chet Justice why he should start using one space instead of two after a sentence-ending period. I’m glad I did.

Here’s the story.

When you type, you’re inputting data into a machine. I know you like feeling like you’re in charge, but really you’re not in charge of all the rules you have to follow while you’re inputting your data. Other people—like the designers of the machine you’re using—have made certain rules that you have to live by. For example, if you’re using a QWERTY keyboard, then the ‘A’ key is in a certain location on the keyboard, and whether it makes any sense to you or not, the ‘B’ key is way over there, not next to the ‘A’ key like you might have expected when you first started learning how to type. If you want a ‘B’ to appear in the input, then you have to reach over there and push the ‘B’ key on the keyboard.

In addition to the rules imposed upon you by the designers of the machine you’re using, you follow other rules, too. If you’re writing a computer program, then you have to follow the syntax rules of the language you’re using. There are alphabet and spelling and grammar rules for writing in German, and different ones for English. There are typographical rules for writing for The New Yorker, and different ones for the American Mathematical Society.

A lot of people who are over about 40 years old today learned to type on an actual typewriter. A typewriter is a machine that used rods and springs and other mechanical elements to press metal dies with backwards letter shapes engraved onto them through an inked ribbon onto a piece of paper. Some of the rules that governed the data input experience on typewriters included:
  • You had to learn where the keys were on the keyboard.
  • You had to learn how to physically return the carriage at the end of a line.
  • You had to learn your project’s rules of spelling.
  • You had to learn your project’s rules of grammar.
  • You had to learn your project’s rules of typography.
The first two rules listed here are physical, but the final three are syntactic and semantic. Just like you wouldn’t press the ‘A’ key to make a ‘B’, you wouldn’t use the strings “definately” or “we was” to make an English sentence.

On your typewriter, you might not have realized it, but you did adhere to some typography rules. They might have included:
  • Use two carriage returns after a paragraph.
  • Type two spaces after a sentence-ending period.
  • Type two spaces after a colon.
  • Use two consecutive hyphens to represent an em dash.
  • Make paragraphs no more than 80 characters wide.
  • Never use a carriage return between “Mr.” and the proper name that follows, or between a number and its unit.
The rules were different for different situations. For example, when I wrote a book back in the mid 1980s, one of the distinctive typography rules my publisher imposed upon me was:
  • Double-space all paragraph text.
They wanted their authors to do this so that their copyeditor had plenty of room for markup. Such typography rules can vary from one project to another.

Most people who didn’t write for different publishers got by just fine on the one set of typography rules they learned in high school. To them, it looked like there were only a few simple rules, and only one set of them. Most people had never even heard of a lot of the rules they should have been following, like rules about widows and orphans.

In the early 1980s, I began using computers for most of my work. I can remember learning how to use word processing programs like WordStar and Sprint. The rules were a lot more complicated with word processors. Now there were rules about “control keys” like ^X and ^Y, and there were no-break spaces and styles and leading and kerning and ligatures and all sorts of new things I had never had to think about before. A word processor was much more powerful than a typewriter. If you did it right, typesetting could could make your work look like a real book. But word processors revealed that typesetting was way more complicated than just typing.

Doing your own typesetting can be kind of like doing your own oil changes. Most people prefer to just put gas in the tank and not think too much about the esoteric features of their car (like their tires or their turn signal indicators). Most people who went from typewriters to word processors just wanted to type like they always had, using the good-old two or three rules of typography that had been long inserted into their brains by their high school teachers and then committed by decades of repetition.

Donald Knuth published The TeXBook in 1984. I think I bought it about ten minutes after it was published. Oh, I loved that book. Using TeX was my first real exposure to the world of actual professional-grade typography, and I have enjoyed thinking about typography ever since. I practice typography every day that I use Keynote or Pages or InDesign to do my work.

Many people don’t realize it, but when you type input into programs like Microsoft Word should follow typography rules including these:
  • Never enter a blank line (edit your paragraph’s style to manipulate its spacing).
  • Use a single space after a sentence-ending period (the typesetter software you’re using will make the amount of space look right as it composes the paragraph).
  • Use a non-breaking space after a non-sentence-ending period (so the typesetter software won’t break “Mr. Harkey” across lines).
  • Use a non-breaking space between a number and its unit (so the typesetter software won’t break “8 oz” across lines).
  • Use an en dash—not a hyphen—to specify ranges of numbers (like “3–8”).
  • Use an em dash—not a pair of hyphens—when you need an em dash (like in this sentence).
  • Use proper quotation marks, like “this” and ‘this’ (or even « this »).
Of course, you can choose to not follow these rules, just like you can choose to be willfully ignorant about spelling or grammar. But to a reader who has studied typography even just a little bit, seeing you break these rules feels the same as seeing a sentence like, “You was suppose to use apostrophe's.” It affects how people perceive you.

So, it’s always funny to me when people get into heated arguments on Facebook about using one space or two after a period. It’s the tiniest little tip of the typography iceberg, but it opens the conversation about typography, for which I’m glad. In these discussions, two questions come up repeatedly: “When did the rule change? Why?”

Well, the rule never did change. The next time I type on an actual typewriter, I will use two spaces after each sentence-ending period. I will also use two spaces when I create a Courier font court document or something that I want to look like it was created in the 1930s. But when I work on my book in Adobe InDesign, I’ll use one space. When I use my iPhone, I’ll tap in two spaces at the end of a sentence, because it automatically replaces them with a period and a single space. I adapt to the rules that govern the situation I’m in.

It’s not that the rules have changed. It’s that the set of rules was always a lot bigger than most people ever knew.

Thursday, February 5, 2009

On the Usefulness of Software Instrumentation

I appreciate that in today's reading, Chen Shapira has taught me about the psychological principle of the endowment effect and its influence over people's decision-making about whether to instrument their code. I'm beginning to collect a nice little list of inspiring observations about performance instrumentation. I've posted them in a public wiki. I hope you'll join in.

Here are some samples:

Chen Shapira: The pervasive lack of instrumentation in software products is more a result of psychological bias than real technical concerns. Software vendors can work around these psychological issues by building instrumentation as a default into tools involved in the development and deployment process. Just as Knuth said 35 years ago.

Don Knuth: I've become convinced that all compilers written from now on should be designed to provide all programmers with feedback indicating what parts of their programs are costing the most; indeed, this feedback should be supplied automatically unless it has been specifically turned off.

Tom Kyte: I have yet in 18 years to hear a valid reason why instrumentation should not be done. I have only heard extremely compelling reasons why it must be done.

Tom Kyte: To the developers that say “this is extra code that will just make my code run slower” I respond “well fine, we will take away V$ views, there will be no SQL_TRACE, no 10046 level 12 traces, in fact–that entire events subsystem in Oracle, it is gone”. Would Oracle run faster without this stuff? Undoubtedly—not. It would run many times slower, perhaps hundreds of times slower. Why? Because you would have no clue where to look to find performance related issues. You would have nothing to go on. Without this “overhead” (air quotes intentionally used to denote sarcasm there), Oracle would not have a chance of performing as well as it does. Because you would not have a change to make it perform well. Because you would not know where even to begin.

Tom Kyte (I don't know of a link to this one; I'm paraphrasing from public statements I've seen him make): Showing where your code spends its time is a necessary function of any production application. The fact that the extra code required to do this consumes extra system capacity is no different from any other feature in your application. As with any required feature, you simply have to size your hardware with the feature in mind.


Tuesday, February 3, 2009

Reading Knuth

I've been reading a little Knuth for the past couple of days. I've read about the famous statement, "premature optimization is the root of all evil," many times, and yesterday I decided to read his statement in its whole context. You can find Knuth's whole article here (thank you, Wikipedia):
Knuth, Donald: Structured Programming with Goto Statements. Computing Surveys 6:4 (1974), 261–301.
I want to quote the passage for you that contains the famous line about premature optimization. It's on page 268 (the eighth page of the article):
There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Most people stop here when they quote Knuth. But actually, it just keeps getting better:
Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail. After working with such tools for seven years, I've become convinced that all compilers written from now on should be designed to provide all programmers with feedback indicating what parts of their programs are costing the most; indeed, this feedback should be supplied automatically unless it has been specifically turned off.
What a beautiful expression of the idea that's at the core of what I do for a living.

If you haven't seen "For Developers: Making Friends with the Oracle Database" yet, I hope you'll take a look. My aim is to help Oracle application developers writing Java, PHP, C#—or anything else—understand how to find that critical code that Knuth wrote about in 1974. ...And, just as importantly, how to ignore the other 97% of your code that you shouldn't be worrying about.