usable in any place a human can be used

20100209

time

[caption id="attachment_708" align="alignright" width="300" caption="Finally was able to get the clock to show me its good side, dirty girl"]clock[/caption]

Don't worry, this won't be a rant about mortality or getting things done or any of the philosophy that has been dominating this blog as of late. This is back to basics: a discussion about software and a particularly tricky aspect of it, time. Not time as in scheduling and managing time, but something far more fundamental: representing time. It is an insidiously tricky problem, and one that can be quite difficult to wrap your head around.


The problem comes from how we think about time as people living normal lives. "I'll meet you at 3 o'clock" is a rather dull and completely normal thing to say to someone. For two normal people living normal lives, the simple phrase "3 o'clock" is plenty to convey when they should meet. This is because there is a great deal of unspoken context between the two parties. If we are meeting for a business meeting, I clearly mean 3:00pm, not 3:00am. If we both work in the same building, I probably mean relative to our shared timezone: 3:00pm EST, not 3:00pm GMT. There is a world of shared unspoken context that makes human-to-human time discussions easy and natural.


Computers are really stupid though; they need everything spelled out. If you were trying to store a time, you might take a naive approach at first and just store the string "3:00". Maybe, if you were really thinking it through, you would store "3:00pm EST." This method soon starts showing its weaknesses, as it's hard to compare times or perform operations on them. How many hours are between 2:00am EST and 5:30pm CST? That is a nasty problem to try to solve unless you have some way to represent times in the abstract.
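To make that question concrete, here is a minimal Python sketch of what an abstract representation buys you. It assumes both times fall on the same calendar day (I picked today's date arbitrarily) and treats EST and CST as fixed offsets of UTC-5 and UTC-6; the variable names are just mine.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EST and CST as fixed offsets from UTC, ignoring DST entirely.
EST = timezone(timedelta(hours=-5), "EST")
CST = timezone(timedelta(hours=-6), "CST")

# Assumption: both times are on the same (arbitrarily chosen) date.
start = datetime(2010, 2, 9, 2, 0, tzinfo=EST)   # 2:00am EST
end = datetime(2010, 2, 9, 17, 30, tzinfo=CST)   # 5:30pm CST

# Subtracting aware datetimes with different offsets compares real instants.
hours_between = (end - start) / timedelta(hours=1)
print(hours_between)  # 16.5
```

Once both values carry their offset, the subtraction is boring, which is exactly the point. Try doing the same with the strings "2:00am EST" and "5:30pm CST".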


In steps a number of formats for representing time. There is the venerable Unix timestamp, which is the number of seconds since midnight UTC on Jan. 1, 1970. As of this writing it stands at 1265738039, but feel free to check for yourself. Then there are numerous proprietary formats, like Microsoft's, Oracle's, etc. These all allow you to represent an exact moment in time in a portable, abstract way, with no dependence on the cavalcade of context we fleshy humans share.
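If you do want to check that timestamp, a quick sketch with Python's standard library turns it back into something a human can read. Nothing here is special to this blog; it is just the stock `datetime` module.

```python
from datetime import datetime, timezone

STAMP = 1265738039  # the Unix timestamp quoted above

# Interpret the seconds-since-epoch count as an instant in UTC.
moment = datetime.fromtimestamp(STAMP, tz=timezone.utc)
print(moment.isoformat())  # 2010-02-09T17:53:59+00:00

# The conversion is lossless: going back yields the same number.
round_trip = int(moment.timestamp())
print(round_trip == STAMP)  # True
```

Note that the number itself carries no timezone at all. The timezone only shows up when you decide how to display it.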


Well, problem solved: just bust out your favorite abstract representation and you are done. Not so fast. There are many other considerations to take into account when dealing with time. There are, of course, the tricky problems of Daylight Saving Time, leap years, and the like. Imagine you are trying to add an event to a calendar system every day at 5:00pm Eastern. You think you could just add it to today, then keep adding 24 hours and creating a new event. DST hits your algorithm over the head at some point and everything is off by an hour, oh no! Also, you now have a ton of data to represent one basic fact: something happens every day at 5:00pm Eastern. It's only one fact, so you should need one record, not an infinite number of 5:00pm records. This hints at the next difficulty of time.
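Here is a small Python sketch of that DST bonk on the head. It assumes Python 3.9+ with the `zoneinfo` module and a system timezone database, and it uses "America/New_York" and the 2010 spring-forward date (March 14) as my stand-ins for "Eastern"; the variable names are made up for illustration.

```python
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo  # standard library since Python 3.9

NEW_YORK = ZoneInfo("America/New_York")  # assumption: "Eastern" means this zone

# 5:00pm on the day before US clocks sprang forward in 2010.
first = datetime(2010, 3, 13, 17, 0, tzinfo=NEW_YORK)

# Naive approach: "tomorrow" is just 24 * 60 * 60 seconds later.
naive_next = datetime.fromtimestamp(first.timestamp() + 24 * 60 * 60, tz=NEW_YORK)

# Calendar-aware approach: advance the date, keep the wall-clock time.
next_date = first.date() + timedelta(days=1)
correct_next = datetime.combine(next_date, time(17, 0), tzinfo=NEW_YORK)

print(naive_next.hour)    # 18, an hour late on the wall clock
print(correct_next.hour)  # 17

# The "correct" next occurrence is only 23 real hours after the first.
elapsed_hours = (correct_next.timestamp() - first.timestamp()) / 3600
print(elapsed_hours)  # 23.0
```

One gotcha worth knowing: in Python, adding a `timedelta` directly to an aware `datetime` does wall-clock arithmetic, so the bug only shows up when you do the math on raw timestamps, which is exactly what the naive design above does.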


Humans think of repeating events (sometimes with complex and insane rules) as commonplace and easy. This thing happens on the third Thursday of every month, unless of course Monday was a holiday, and then it gets shifted to Friday. The problem with time and dates and repeating events is that human beings erected a ton of complex processing rules long before anyone realized we were going to try to digitize them. These rules are difficult to represent and difficult to get right.
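Just to show how quickly even that one rule turns into code, here is a rough Python sketch. The function names and the `holidays` parameter are hypothetical, and I am assuming "Monday" means the Monday of the same week as the third Thursday, which the human version of the rule never bothers to say.

```python
from datetime import date, timedelta

THURSDAY = 3  # date.weekday() counts Monday as 0

def third_thursday(year, month):
    """Return the date of the third Thursday of the given month."""
    first = date(year, month, 1)
    # Days to step forward from the 1st to land on the first Thursday.
    offset = (THURSDAY - first.weekday()) % 7
    return first + timedelta(days=offset + 14)

def meeting_date(year, month, holidays=frozenset()):
    """Third Thursday, shifted to Friday if that week's Monday is a holiday."""
    thursday = third_thursday(year, month)
    monday = thursday - timedelta(days=3)  # assumption: same-week Monday
    if monday in holidays:
        return thursday + timedelta(days=1)
    return thursday

# Presidents' Day 2010 fell on Monday, February 15.
holidays = {date(2010, 2, 15)}
print(meeting_date(2010, 2, holidays))  # 2010-02-19
print(meeting_date(2010, 3, holidays))  # 2010-03-18
```

And that is before anyone asks what happens when the Friday is also a holiday.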


At first the task of representing arbitrary points and spans of time seems fairly straightforward, but it is a complex and nuanced one. Like most things, the devil is in the details. Before you go off half-cocked building your own representation, take a look at some established formats, like Unix timestamps, RFC 5545 (iCalendar), and ISO 8601.
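As a small taste of those formats, here is a Python sketch showing that an ISO 8601 string with an offset pins down the same instant as the Unix timestamp from earlier. The two recurrence strings at the end are RFC 5545 rule values written out by hand purely for illustration; nothing in this snippet parses them.

```python
from datetime import datetime

# ISO 8601: the same instant written with two different UTC offsets.
eastern = datetime.fromisoformat("2010-02-09T12:53:59-05:00")
utc = datetime.fromisoformat("2010-02-09T17:53:59+00:00")

print(eastern == utc)            # True, they are the same moment
print(int(eastern.timestamp()))  # 1265738039

# RFC 5545 recurrence rules: one record instead of endless copies.
# (Illustrative strings only; a real calendar entry also needs a start time.)
EVERY_DAY = "FREQ=DAILY"
THIRD_THURSDAY_MONTHLY = "FREQ=MONTHLY;BYDAY=3TH"
```

The recurrence strings are the "one fact, one record" idea in practice: the rule is stored once and the individual occurrences are computed when needed.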
