Tom Anderson
Hello!
Possibly i'm missing something really obvious here. But ...
If i have a date-time string of the kind specified in RFC 1123, like this:
Tue, 12 Aug 2008 20:48:59 -0700
Can i turn that into a seconds-since-the-epoch time using the standard
time module without jumping through substantial hoops?
Apart from the timezone, this can be parsed using time.strptime with the
format:
%a, %d %b %Y %H:%M:%S
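As a minimal sketch of that first step (the constant name RFC1123_NO_ZONE is made up for illustration, and %a and %b assume an English/C locale), parsing the example with the zone already removed looks like this:

```python
import time

# Format for the date-time with the numeric zone already removed.
# RFC1123_NO_ZONE is a hypothetical name used only in this sketch.
RFC1123_NO_ZONE = "%a, %d %b %Y %H:%M:%S"

tm = time.strptime("Tue, 12 Aug 2008 20:48:59", RFC1123_NO_ZONE)

# The result is a struct_time holding the wall-clock fields only;
# tm_isdst is left at -1 ("unknown") because nothing in the string sets it.
print(tm.tm_year, tm.tm_mon, tm.tm_mday, tm.tm_hour, tm.tm_min, tm.tm_sec)
print(tm.tm_isdst)
```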
You can stick a %Z on the end for the timezone, but that parses timezone
names ('BST', 'EDT'), not numeric specifiers. Also, it doesn't actually
parse anything, it just requires that the timezone that's in the string
matches your local timezone.
Okay, no problem, so you use a regexp to split off the timezone specifier,
parse that yourself, then parse the raw time with strptime.
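A rough sketch of that splitting step, assuming the zone is always a trailing sign plus four digits (the names ZONE_RE and split_zone are invented here, and the offset is returned in seconds east of UTC, so -0700 becomes a negative number):

```python
import re

# Trailing numeric zone such as "-0700" or "+0530"; hypothetical helper names.
ZONE_RE = re.compile(r"^(.*\S)\s+([+-])(\d{2})(\d{2})$")

def split_zone(s):
    "Return (date-time without zone, zone offset in seconds east of UTC)"
    m = ZONE_RE.match(s.strip())
    if m is None:
        raise ValueError("no numeric timezone found in %r" % s)
    rest, sign, hh, mm = m.groups()
    offset = int(hh) * 3600 + int(mm) * 60
    if sign == "-":
        offset = -offset
    return rest, offset

print(split_zone("Tue, 12 Aug 2008 20:48:59 -0700"))
```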
Now you just need to adjust the parsed time for the timezone. Now, from
strptime, you get a struct_time, and that doesn't have room for a timezone
(although it does have room for a daylight saving time flag), so you can't
add the timezone in before you convert to seconds-since-the-epoch.
Okay, so convert the struct_time to seconds-since-the-epoch as if it were
UTC, then apply the timezone correction. Converting a struct_time to
seconds-since-the-epoch is done with mktime, right? Wrong! That does the
conversion *in your local timezone*. There's no way to tell it to use any
specific timezone, not even just UTC.
So how do you do this?
Can we convert from struct_time to seconds-since-the-epoch by hand? Well,
the hours, minutes and seconds are pretty easy, but dealing with the date
means doing some hairy calculations with leap years, which are doable but
way more effort than i thought i'd be expending on parsing the date format
found in every single email in the world.
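For a sense of what the by-hand route involves, here is a sketch under stated assumptions: Gregorian calendar, fields already in UTC, no leap seconds. The function names are made up for illustration.

```python
# Cumulative days before each month in a non-leap year.
DAYS_BEFORE_MONTH = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]

def is_leap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def leap_years_through(year):
    "Number of leap years from year 1 up to and including year"
    return year // 4 - year // 100 + year // 400

def days_since_epoch(year, month, day):
    "Days from 1970-01-01 to the given date"
    # Whole years since 1970, plus one extra day per leap year among them.
    days = (year - 1970) * 365
    days += leap_years_through(year - 1) - leap_years_through(1969)
    # Whole months of the current year, plus 29 February if already passed.
    days += DAYS_BEFORE_MONTH[month - 1]
    if month > 2 and is_leap(year):
        days += 1
    return days + day - 1

def seconds_since_epoch_utc(tm):
    "Treat the first six fields of tm as UTC and return seconds since the epoch"
    days = days_since_epoch(tm[0], tm[1], tm[2])
    return ((days * 24 + tm[3]) * 60 + tm[4]) * 60 + tm[5]
```

For the example above, 20:48:59 at -0700 is 03:48:59 UTC on 13 August, so seconds_since_epoch_utc((2008, 8, 13, 3, 48, 59)) gives 1218599339.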
Can we pretend the struct_time is a local time, convert it to
seconds-since-the-epoch, then adjust it by whatever our current timezone
is to get true seconds-since-the-epoch, *then* apply the parsed timezone?
I think so:
def mktime_utc(tm):
    "Return what mktime would return if we were in the UTC timezone"
    return time.mktime(tm) - time.timezone
Then:
def mktime_zoned(tm, tz):
    "Return what mktime would return if we were in the timezone given by tz"
    return mktime_utc(tm) - tz
The only problem there is that mktime_utc doesn't deal with DST: if tm is
a date for which DST would be in effect for the local timezone, then we
need to subtract time.altzone, not time.timezone. strptime doesn't fill in
the dst flag, as far as i can see, so we have to round-trip via
mktime/localtime:
def isDST(tm):
    tm2 = time.localtime(time.mktime(tm))
    assert (tm2.isdst != -1)
    return bool(tm2.isdst)

def timezone(tm):
    if (isDST(tm)):
        return time.altzone
    else:
        return time.timezone
mktime_utc then becomes:
def mktime_utc(tm):
    return time.mktime(tm) - timezone(tm)
And you can of course inline that and eliminate a redundant call to
mktime:
def mktime_utc(tm):
    t = time.mktime(tm)
    isdst = time.localtime(t).isdst
    assert (isdst != -1)
    if (isdst):
        tz = time.altzone
    else:
        tz = time.timezone
    return t - tz
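Putting the pieces together, an end-to-end sketch might look like the following. It repeats mktime_utc so that it runs on its own, and parse_rfc1123 and ZONE_RE are hypothetical names. It assumes tz is the parsed offset in seconds east of UTC (so -0700 is -25200), that the local zone has a single standard/DST pair matching time.timezone and time.altzone, and that the parsed wall-clock time does not land on a local DST transition.

```python
import re
import time

ZONE_RE = re.compile(r"^(.*\S)\s+([+-])(\d{2})(\d{2})$")

def mktime_utc(tm):
    "Return what mktime would return if we were in the UTC timezone"
    t = time.mktime(tm)
    # mktime treated tm as local time; undo whichever local offset it applied.
    if time.localtime(t).tm_isdst > 0:
        return t - time.altzone
    return t - time.timezone

def mktime_zoned(tm, tz):
    "Return what mktime would return if we were in the timezone given by tz"
    return mktime_utc(tm) - tz

def parse_rfc1123(s):
    "Return seconds since the epoch for a string like the example above"
    m = ZONE_RE.match(s.strip())
    if m is None:
        raise ValueError("no numeric timezone found in %r" % s)
    rest, sign, hh, mm = m.groups()
    tz = int(hh) * 3600 + int(mm) * 60
    if sign == "-":
        tz = -tz
    tm = time.strptime(rest, "%a, %d %b %Y %H:%M:%S")
    return mktime_zoned(tm, tz)

print(parse_rfc1123("Tue, 12 Aug 2008 20:48:59 -0700"))
```

The same instant written with a different offset, for instance "Wed, 13 Aug 2008 03:48:59 +0000", should come out as the same number.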
So, firstly, does that work? Answer: i've tested it a bit, and yes.
Secondly, do you really have to do this just to parse a date with a
timezone? If so, that's ridiculous.
tom