Sunday, April 6, 2008

Messed-Up App of the Day

Greetings from London Gatwick Airport. Messed-up app of the day: National Express self-service ticketing kiosk, London Heathrow Airport, Terminal 5.

I'm connecting through London on my way home. Unfortunately, my connection requires a bus ride from Heathrow (LHR) to Gatwick (LGW). Not my favorite, but I've done it before and survived. Bus is run by National Express. It's a £19 ticket; apparently only £17 if you're doing the BA/AA ugly connection thing I'm doing. I don't remember having to pay that myself before, but today I'm glad I did.

One way to buy your ticket is through one of the self-service ticket machines outside baggage claim. Here's approximately how that went:

Machine: Where will you be going?

Me: London Gatwick.

Then the machine I was doing business with popped a widget that showed the number of qualifying journeys the machine had found so far. It's one of those "watch the number change really fast" fields. I watched it count through 10, and then 50, and then 100 (jaw drop), and then eventually 500, until finally, about a minute later, the counter reached roughly 2,000. Finally, it said, "2,000 qualifying journeys found", or something to that effect.

Then it showed me the first half dozen or so of those "qualifying journeys."

  • 11:10 Heathrow to Gatwick
  • 11:20 Heathrow to Gatwick
  • 11:30 Heathrow to Gatwick
  • ...

It was 10:55 at the time. I chose 11:10.

"Brilliant."

Let me summarize. This thing computes the schedule for each of the next 2,000 bus rides you might want to take. You only look at six of them. The cost to you for all this information you don't look at is (N+1)*60 seconds of your time, where N is the number of people ahead of you in the queue for this machine. If there are six or more people ahead of you in your queue, then you're probably going to miss the next bus.

I don't know whether it's an Oracle application behind this or not. It doesn't matter. It's stupid either way.

There's a simple solution. Begin by asking approximately when the user wants his bus ride. The default value should be "now." I don't know what percentage of people would ever want to change that value, but I doubt I ever would. Then instead of returning 2,000 qualifying journeys sorted by departure time, return only journeys whose departure time is within ±1 hour of his choice. Of course, list only those departure times that occur in the future.

SQL like this would do it:

select   departure_time, origin, destination, ...
from whatever1, whatever2, ...
where origin = :here
and destination = :there
and departure_time between :your_time - 1/24 and :your_time + 1/24
and departure_time > sysdate
order by 1

Okay, that's functionally dirty because what happens if someone specifies a :your_time value of 3am, and there are no buses scheduled anywhere within 2am-4am?

Then be clever and use analytic functions to find the 10 rows that have departure_time that are closest to :your_time. Be a developer.

The functional difference we're trying to achieve is to stop burning computing and communication resources producing 1,990 rows that nobody wants to see anyway. The benefit? Reduction of human suffering in situations wherein frustrated people who are trying to make their connections are missing buses because a machine that takes their money is spending time making calculations nobody needs.

8 comments:

Toon said...

Hmmm...
The fact that there was a counter fed back to the UI, strongly suggests to me that there was some Object Query Language processing going on in the 'mid-tier', scanning EJB's or whatever they are called this week. Probably zero database-involvement for you at that moment...
Brilliant indeed.

Tom said...

When I was in London back in May, I took the tubes to Victoria Street Station and then got on the Gatwick Express.

Marco Gralike said...
This comment has been removed by the author.
Marco Gralike said...

If we could only get a quarter or a dime of every application / UI out there... displaying this behavior; I guess we then would be rich men...

Dr Quott said...

Well I'm impressed. You managed to make it through Heathrow Terminal 5 and get out the same day and the worst thing you came across was the National Express booking system. £4 billion to create luggage mountian, makes me proud to be British. My guess is that the developer liked seeing the whizzy number gadget counting up the records. It's no fun just seeing it count up to 10, make it count up to 2000 ... genius.

jan van mourik said...

See, that's why I prefer Amsterdam :-)

blogger said...

even the http://www.yorkshiretravel.net site for local bus service shows only next 3 bus times as result and then if the user click more button, it adds further 3 bus times, click more again adds 3 more bus times.

it not using AJAX but still the concept is more useful as not many user want to go beyond next 6 buses.

Jeff said...

Well, I'm frustrated at analytic functions. From SQL Reference 10g Release 2 (B14200-02) and copied in SQL Language Reference 11g Release 1 (B28286-01):

"Analytic functions are the last set of operations performed in a query except for the
final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses are
completed before the analytic functions are processed. Therefore, analytic functions
can appear only in the select list or ORDER BY clause."

I am here to attest to the apparent truthfullness in that quote. The 11.1.0.6 software apparently implements analytic functions that way.

The limit feature in MySQL seems far superior to the techniques found in Oracle. I say 'techniques' because Oracle provides some limit-ability via the index_asc and index_desc hints. The problem--and it's not a common one--is that if someone uses the technique and then the index is dropped or renamed or redefined with the "wrong" columns, then you're left with a query that almost certainly gives the wrong answer. But, as I said, it's a rare problem.

So, given an appropriate index, here's a query that can satisfy the query with probably less than 10 consistent reads (a.k.a. LIOs).

select to_char(departing, :dtfmt) from
(
select * from
(
select /*+ index_asc(departure departure_n1) */ *
from departure
where departing >= to_date(:my_time, :dtfmt)
and (orig,dest) = (:orig,:dest)
and rownum <= 5
union
select /*+ index_desc(departure departure_n1) */ *
from departure
where departing < to_date(:my_time, :dtfmt)
and (orig,dest) = (:orig,:dest)
and rownum <= 5
)
order by abs(departing - to_date(:my_time, :dtfmt))
)
where rownum < 5

Here's the trace data its execution generated.

PARSE #4:c=0,e=632,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1236372631765393
EXEC #4:c=0,e=367,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1236372631776514
FETCH #4:c=0,e=440,p=0,cr=5,cu=0,mis=0,r=1,dep=0,og=1,tim=1236372631777292
FETCH #4:c=4000,e=4928,p=0,cr=0,cu=0,mis=0,r=3,dep=0,og=1,tim=1236372631783013
STAT #4 id=1 cnt=4 pid=0 pos=1 obj=0 op='COUNT STOPKEY (cr=5 pr=0 pw=0 time=0 us)'
STAT #4 id=2 cnt=4 pid=1 pos=1 obj=0 op='VIEW (cr=5 pr=0 pw=0 time=0 us cost=378 size=90 card=10)'
STAT #4 id=3 cnt=4 pid=2 pos=1 obj=0 op='SORT ORDER BY STOPKEY (cr=5 pr=0 pw=0 time=0 us cost=378 size=90 card=10)'
STAT #4 id=4 cnt=10 pid=3 pos=1 obj=0 op='VIEW (cr=5 pr=0 pw=0 time=17 us cost=377 size=90 card=10)'
STAT #4 id=5 cnt=10 pid=4 pos=1 obj=0 op='SORT UNIQUE (cr=5 pr=0 pw=0 time=6 us cost=377 size=90 card=10)'
STAT #4 id=6 cnt=10 pid=5 pos=1 obj=0 op='UNION-ALL (cr=5 pr=0 pw=0 time=36 us)'
STAT #4 id=7 cnt=5 pid=6 pos=1 obj=0 op='COUNT STOPKEY (cr=3 pr=0 pw=0 time=16 us)'
STAT #4 id=8 cnt=5 pid=7 pos=1 obj=84410 op='INDEX RANGE SCAN DEPARTURE_N1 (cr=3 pr=0 pw=0 time=6 us cost=163 size=488619 card=54291)'
STAT #4 id=9 cnt=5 pid=6 pos=2 obj=0 op='COUNT STOPKEY (cr=2 pr=0 pw=0 time=16 us)'
STAT #4 id=10 cnt=5 pid=9 pos=1 obj=84410 op='INDEX RANGE SCAN DESCENDING DEPARTURE_N1 (cr=2 pr=0 pw=0 time=6 us cost=2 size=2952 card=328)'