strptime and timezones

Tom Anderson

Hello!

Possibly i'm missing something really obvious here. But ...

If i have a date-time string of the kind specified in RFC 1123, like this:

Tue, 12 Aug 2008 20:48:59 -0700

Can i turn that into a seconds-since-the-epoch time using the standard
time module without jumping through substantial hoops?

Apart from the timezone, this can be parsed using time.strptime with the
format:

%a, %d %b %Y %H:%M:%S

You can stick a %Z on the end for the timezone, but that parses timezone
names ('BST', 'EDT'), not numeric specifiers. Also, it doesn't actually
parse anything, it just requires that the timezone that's in the string
matches your local timezone.

Okay, no problem, so you use a regexp to split off the timezone specifier,
parse that yourself, then parse the raw time with strptime.
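The split described above could be sketched roughly as follows. The helper name split_rfc1123 and the regular expression are made up for illustration, and the sketch assumes the string always ends in a numeric offset of the form +HHMM or -HHMM:

```python
import re
import time

# Hypothetical pattern: everything up to the last field, then a sign and HHMM.
_OFFSET_RE = re.compile(r"^(.*\S)\s+([+-])(\d{2})(\d{2})$")

def split_rfc1123(text):
    """Return (struct_time, offset in seconds east of UTC) for an RFC 1123 date."""
    match = _OFFSET_RE.match(text.strip())
    if match is None:
        raise ValueError("no numeric timezone offset in %r" % (text,))
    stamp, sign, hours, minutes = match.groups()
    offset = int(hours) * 3600 + int(minutes) * 60
    if sign == "-":
        offset = -offset
    # The struct_time holds the wall-clock fields only; the zone is returned separately.
    tm = time.strptime(stamp, "%a, %d %b %Y %H:%M:%S")
    return tm, offset
```

Note that %a and %b depend on the current locale, so this only works as written where day and month names are the English abbreviations.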

Now you just need to adjust the parsed time for the timezone. Now, from
strptime, you get a struct_time, and that doesn't have room for a timezone
(although it does have room for a daylight saving time flag), so you can't
add the timezone in before you convert to seconds-since-the-epoch.

Okay, so convert the struct_time to seconds-since-the-epoch as if it were
UTC, then apply the timezone correction. Converting a struct_time to
seconds-since-the-epoch is done with mktime, right? Wrong! That does the
conversion *in your local timezone*. There's no way to tell it to use any
specific timezone, not even just UTC.

So how do you do this?

Can we convert from struct_time to seconds-since-the-epoch by hand? Well,
the hours, minutes and seconds are pretty easy, but dealing with the date
means doing some hairy calculations with leap years, which are doable but
way more effort than i thought i'd be expending on parsing the date format
found in every single email in the world.
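For what it's worth, the leap-year arithmetic does not have to be written by hand: the standard library's calendar.timegm takes the same kind of tuple that strptime returns and treats it as UTC (it is documented as the inverse of time.gmtime). A minimal sketch, where the function names and the offset_east parameter (seconds east of UTC, so -0700 is -25200) are assumptions for illustration:

```python
import calendar
import time

def epoch_from_utc_fields(tm):
    """Seconds since the epoch, reading the fields of tm as UTC."""
    return calendar.timegm(tm)

def epoch_from_zoned_fields(tm, offset_east):
    """Seconds since the epoch for fields recorded in a zone offset_east seconds east of UTC."""
    # A zone east of UTC is ahead of UTC, so its offset is subtracted.
    return calendar.timegm(tm) - offset_east

tm = time.strptime("Tue, 12 Aug 2008 20:48:59", "%a, %d %b %Y %H:%M:%S")
seconds = epoch_from_zoned_fields(tm, -7 * 3600)
```

Unlike mktime, this does not consult the local timezone at all.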

Can we pretend the struct_time is a local time, convert it to
seconds-since-the-epoch, then adjust it by whatever our current timezone
is to get true seconds-since-the-epoch, *then* apply the parsed timezone?
I think so:

import time

def mktime_utc(tm):
    "Return what mktime would return if we were in the UTC timezone"
    return time.mktime(tm) - time.timezone

Then:

def mktime_zoned(tm, tz):
    "Return what mktime would return if we were in the timezone given by tz"
    # tz is the zone's offset in seconds east of UTC, so -0700 is -25200
    return mktime_utc(tm) - tz

The only problem there is that mktime_utc doesn't deal with DST: if tm is
a date for which DST would be in effect for the local timezone, then we
need to subtract time.altzone, not time.timezone. strptime doesn't fill in
the dst flag, as far as i can see, so we have to round-trip via
mktime/localtime:

def isDST(tm):
    # struct_time calls the DST flag tm_isdst
    tm2 = time.localtime(time.mktime(tm))
    assert (tm2.tm_isdst != -1)
    return bool(tm2.tm_isdst)

def timezone(tm):
    if (isDST(tm)):
        return time.altzone
    else:
        return time.timezone

mktime_utc then becomes:

def mktime_utc(tm):
    return time.mktime(tm) - timezone(tm)

And you can of course inline that and eliminate a redundant call to
mktime:

def mktime_utc(tm):
    t = time.mktime(tm)
    # struct_time calls the DST flag tm_isdst
    isdst = time.localtime(t).tm_isdst
    assert (isdst != -1)
    if (isdst):
        tz = time.altzone
    else:
        tz = time.timezone
    return t - tz

So, firstly, does that work? Answer: i've tested it a bit, and yes.
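One way to check a function like mktime_utc is to compare it against calendar.timegm from the standard library, which reads a time tuple as UTC without consulting the local zone. The sketch below restates mktime_utc so that it runs on its own; agrees_with_timegm is a hypothetical helper name. It only makes sense for dates well away from a DST changeover, and it assumes the local zone's standard and DST offsets on that date match the current time.timezone and time.altzone:

```python
import calendar
import time

def mktime_utc(tm):
    """Return what mktime would return if the local timezone were UTC."""
    t = time.mktime(tm)
    # Ask the C library whether DST applies to this local time.
    if time.localtime(t).tm_isdst > 0:
        tz = time.altzone
    else:
        tz = time.timezone
    return t - tz

def agrees_with_timegm(text):
    """True if mktime_utc and calendar.timegm give the same result for text."""
    tm = time.strptime(text, "%Y-%m-%d %H:%M:%S")
    return mktime_utc(tm) == calendar.timegm(tm)
```

Running it for one winter and one summer date, for example agrees_with_timegm("2008-01-15 12:00:00") and agrees_with_timegm("2008-07-15 12:00:00"), exercises both branches in a zone that observes DST.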

Secondly, do you really have to do this just to parse a date with a
timezone? If so, that's ridiculous.

tom
 
Christian Heimes

Tom said:
Secondly, do you really have to do this just to parse a date with a
timezone? If so, that's ridiculous.

No, you don't. :) Download the pytz package from the Python package
index. It's *the* tool for timezone handling in Python. The time zone
definitions are not part of the Python standard library because they
change every few months. Stupid politicians ...

Christian
 
Tom Anderson

Christian said:
No, you don't. :) Download the pytz package from the Python package
index. It's *the* tool for timezone handling in Python. The time zone
definitions are not part of the Python standard library because they
change every few months. Stupid politicians ...

My problem has absolutely nothing to do with timezone definitions. In
fact, it involves less timezone knowledge than the time package supplies!
The wonderful thing about RFC 1123 timestamps is that they give the
numeric value of their timezone, so you don't have to decode a symbolic
one or anything like that. Knowing about timezones thus isn't necessary.

The problem is simply that the standard time package doesn't think that
way, and always assumes that a time is in your local timezone.
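Outside the time module, the standard library's email.utils has parsedate_tz and mktime_tz, which handle this kind of date along with its numeric offset. A minimal sketch, where rfc1123_to_epoch is a made-up wrapper name:

```python
from email.utils import mktime_tz, parsedate_tz

def rfc1123_to_epoch(text):
    """Return seconds since the epoch for a date like 'Tue, 12 Aug 2008 20:48:59 -0700'."""
    # parsedate_tz returns a 10-tuple whose last item is the offset in seconds,
    # or None if the string cannot be parsed.
    parsed = parsedate_tz(text)
    if parsed is None:
        raise ValueError("unparseable date: %r" % (text,))
    # mktime_tz applies the parsed offset and returns a UTC timestamp.
    return mktime_tz(parsed)
```

No timezone database is involved here, since the offset comes straight from the string.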

That said, it does look like pytz might be able to parse RFC 1123 dates.
I'll check it out.

tom
 
