Steve Simms

Home

Troubleshooting business day calculation in PostgreSQL

Submitted by Steve Simms on Sat, 09/15/2007 - 10:27pm.
  • Programming

One of the stats I keep track of for work is how quickly we get mailings out the door. It’s actually the only stat that I make public, incidentally — partly for personal accountability, and partly because it’s a selling point.

It’s a stat that’s easy enough to calculate right in the database:

turnaround_time := SELECT mailed - submitted;

Catch 1: I don’t want to include time spent waiting for someone to approve a preview or pay for a mailing.

That makes the calculation a little more complicated, but not by much — just subtract the time spent waiting from the total:

turnaround_time := SELECT mailed - submitted - (GREATER(approved, paid) - responded);

Catch 2: I don’t want to include weekends or postal holidays in the turnaround time either, since I don’t work on either, as a general rule.

This is where it gets messy — PostgreSQL doesn’t have a built-in way of calculating business days, and it certainly doesn’t have a database of postal holidays.

So, a while back, I wrote a database function to do that calculation, given two dates and a set of holidays.

Assuming, for the time being, that no two postal holidays will happen on successive days* (or on weekends), the logic looks something like this:

Part 1:

  • If the start time is on a holiday, move the start time to midnight on the next day
  • If the start time is on a weekend, move the start time to midnight on Monday
  • If the start time is on a holiday, move the start time to midnight on the next day

Part 2:

  • If the end time is on a holiday, move the end time to one second before the beginning of the day
  • If the end time is on a weekend, move the end time to one second before midnight on Saturday
  • If the end time is on a holiday, move the end time to one second before the beginning of the day

Part 3:

  • Figure out how much time there is between the adjusted start and end times
  • Subtract two days for every seven
  • If the weekday of the end time is less than the weekday of the start time (i.e. there’s a weekend in between, but not a full week), subtract another two days

Part 4:

  • Subtract one day for every holiday between the start time and the end time

Part 5:

  • Return 0 seconds if the remaining time is negative (e.g. if start and end times are both on a weekend)
  • Return the remaining time otherwise

As far as I know, that logic is complete, and should work.

But it wasn’t working.

Specifically, when the start and end dates were exactly a week apart, I was getting seven work days instead of five, meaning that the “subtract two days for every seven” line had a bug in it.

Here’s the code that should have worked:

gap := gap - ((((EXTRACT(EPOCH FROM gap)/86400.0) / 7.0 - 0.5)::INTEGER * 2) || ' days')::INTERVAL;

That’s a mouthful, so here’s the expanded version:

  • Starting with gap (i.e. the end time minus the start time)…

  • Figure out how many seconds there are in gap (that’s the EPOCH bit — it’s a cheat to turn a PostgreSQL interval into an integer that I can do arithmetic on).

  • Figure out how many days that represents (60 seconds x 60 minutes x 24 hours = 86,400). The .0 keeps the number from becoming an integer just yet (i.e. don’t round off the number).

  • Divide that number by 7 to get the number of weeks. Again, don’t round yet.

  • I only want to take whole weeks into account, so turn the number into an integer now. Since PostgreSQL does this by rounding to the nearest whole number, subtract 0.5 first, which is equivalent to saying “round down”.

  • Multiply that number by two, so that I get two days per week.

  • Subtract that number from the initial gap.

  • Turn it back into a PostgreSQL interval by appending the word “days” and type-casting it into an interval (the two vertical lines, or “pipes”, are the PostgreSQL command to append text).

On the face of it, everything looks fine. So, now comes the debugging.

Taking two arbitrary dates that are a week apart, not on weekends, and which definitely don’t have holidays in my database, I run it through each of the steps in the long line above:

select extract(epoch from (‘1950-01-09’::timestamp - ‘1950-01-02’::timestamp)::interval);
=> 604800

select extract(epoch from (‘1950-01-09’::timestamp - ‘1950-01-02’::timestamp)::interval) / 86400.0;
=> 7

select extract(epoch from (‘1950-01-09’::timestamp - ‘1950-01-02’::timestamp)::interval) / 86400.0 / 7.0;
=> 1

select extract(epoch from (‘1950-01-09’::timestamp - ‘1950-01-02’::timestamp)::interval) / 86400.0 / 7.0 - 0.5; => 0.5

So far, everything’s correct. But watch the next line:

select (extract(epoch from (‘1950-01-09’::timestamp - ‘1950-01-02’::timestamp)::interval) / 86400.0 / 7.0 - 0.5)::integer;
=> 0

That’s not right — 0.5 is supposed to round to 1, not 0. See:

select 0.5::integer;
=> 1

It’s the nasty floating point math, striking again. You know, where 2 + 2 can equal 5 for sufficiently large values of 2.

In pretty much every programming language known to man, you have integers and floating point numbers. Integers are whole numbers, and are stored by being converted to binary and stored in a container that’s a specific number of bytes long. A one-byte container (i.e. eight binary digits, or “bits”) lets you store a range of 256 numbers, because that’s how many combinations of 0 and 1 you can fit into eight characters.

Floating point numbers let you get around that 1:1 limitation by storing numbers in something approximating scientific notation, dividing the bits between both parts of the number. This gives you a huge range of numbers, but it has the drawback that it’s not quite as precise, and sometimes bites you when you’re expecting one thing and get another.

I would’ve expected a warning sign to show up before the last step in my troubleshooting, but apparently I was wrong.

To fix the problem, I added a ::NUMERIC(8,2) type cast after subtracting 0.5 and before turning it into an integer. This tells PostgreSQL to turn it into a specific precision number (eight total digits, two of which are after the decimal point — eight was chosen as an arbitrary and excessively large number for this function), and then round it, which ensures that there won’t be any floating point mishaps this time.

So, now my published turnaround times are actually correct! Given how few mailings this (and two other holiday-related implementation bugs that I corrected along the way) affected, the numbers didn’t change much, but it means I’m not seeing some oddball numbers on my work list any more.

* You can adjust the function to allow for successive holidays by changing “if” to “while”. I haven’t done that yet, because I don’t need it, and because I’m afraid of accidentally creating an infinite loop inside the database on a remote production server. You can also account for whole weeks of holidays by grouping the first and second threesomes of statements in while loops.

Update: As it turns out, there was another bug in the code mentioned above. In order to fake a “round down” function, you need to subtract 0.499… (with as many nines as you need for precision), not 0.5, which I discovered when fixing the case when the two times were a week apart by day, but not by hour (e.g. noon on Monday through 9am the next Monday).

My code for that line is now:

gap := gap - ((((EXTRACT(EPOCH FROM (t2::DATE::TIMESTAMP - t1::DATE::TIMESTAMP))/86400.0) / 7.0 - 0.499)::NUMERIC(8,3)::INTEGER * 2) || ' days')::INTERVAL;

The ::DATE::TIMESTAMP bit is the equivalent of converting the dates to midnight, which means that two Mondays a week apart will always have the two days subtracted, even if it hasn’t quite been a full week.

  • About
  • Humor
  • Programming
  • Tech
  • Sound
  • Contact Info
  • What's a Deef?
My Amazon.com Wish List

Welcome to Deef's Net, Steve's online home.

If you think of his professional web sites as his office and workplace, this is where he hangs out in the evenings, chats about what happened during the day, and explores some hobbies.

Recent Posts

  • Highlights from 2008
  • August E-Mail Statistics
  • A new kind of job
  • A Public Service Postal Announcement
  • You know your computer is slow when...
© 2000-2008, Stephen Simms. All Rights Reserved.