Various strings to dates.

Amy G

I have seen something about this before on this forum, but my Google search
didn't come up with the answer I am looking for.

I have a list of tuples. Each tuple is in the following format:

("data", "moredata", "evenmoredata", "date string")

The date string is my concern. This is the date stamp from an email.
The problem is that I have a whole bunch of variations when it comes to the
format that the date string is in. For example I could have the following
two tuples:

("data", "moredata", "evenmoredata", "Fri, 23 Jan 2004 00:06:15")
("data", "moredata", "evenmoredata", "Thursday, 22 January 2004 03:15:06")

I know there is some way to use the date string from each of these to get a
date usable by Python, but I cannot figure it out.
I was trying to use time.strptime but have been unsuccessful thus far.

Any help is appreciated.
 
wes weston

Amy,
I hope there is a better way but, if you go here:

http://www.python.org/doc/current/lib/datetime-date.html

The new datetime module may help. This and the time module
should get you where you want to go.

fields = strdate.split(", ")
daystr = fields[0]
daynum = int(fields[1])
monthstr = fields[2]
year = int(fields[3])
# a function to turn monthstr into a month int (month) is still needed

d = datetime.date(year, month, daynum)

wes
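
The month-number function that the comment above says is still needed could be sketched roughly as follows. This is only one possible approach: `month_to_int` is a hypothetical name, and it assumes English month names under the default C locale, since `calendar.month_name` is locale-dependent.

```python
import calendar

def month_to_int(name):
    """Return 1-12 for an English month name or a 3+ letter abbreviation.

    Hypothetical helper; relies on calendar.month_name, which yields
    English names only under the default C locale.
    """
    key = name.strip().strip(",").lower()
    if len(key) < 3:
        raise ValueError("month name too short: %r" % name)
    for number in range(1, 13):
        # Prefix match so that 'jan' and 'january' both map to 1.
        if calendar.month_name[number].lower().startswith(key):
            return number
    raise ValueError("unknown month: %r" % name)

print(month_to_int("January"))
print(month_to_int("Jan"))
```

Requiring at least three letters keeps prefixes such as "Ma" or "Ju" from silently matching the wrong month.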
 
Amy G

No, it won't. Unfortunately, I don't necessarily have a comma-delimited date
string. Thanks for the input though.

The following three date strings are another example of the various date
formats I will encounter here.

Thursday, 22 January 2004 03:15:06
Thursday, January 22, 2004, 03:15:06
2004, Thursday, 22 January 03:15:06

All of these are essentially the same date... just in various formats. I
would like to parse through them and get a comparable format so that I can
display them in chronological order.
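
One way to get a comparable value using only the standard library is to try a short list of `strptime` formats in turn and sort on the result. The sketch below assumes that the four layouts quoted in this thread are the only ones that occur; `KNOWN_FORMATS` and `parse_date` are names made up for illustration, and `datetime.strptime` requires Python 2.5 or later.

```python
from datetime import datetime

# Assumed formats: only the layouts quoted in this thread.
KNOWN_FORMATS = [
    "%a, %d %b %Y %H:%M:%S",      # Fri, 23 Jan 2004 00:06:15
    "%A, %d %B %Y %H:%M:%S",      # Thursday, 22 January 2004 03:15:06
    "%A, %B %d, %Y, %H:%M:%S",    # Thursday, January 22, 2004, 03:15:06
    "%Y, %A, %d %B %H:%M:%S",     # 2004, Thursday, 22 January 03:15:06
]

def parse_date(text):
    """Return a datetime for the first format that matches, else raise."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognised date string: %r" % text)

rows = [
    ("data", "moredata", "evenmoredata", "Fri, 23 Jan 2004 00:06:15"),
    ("data", "moredata", "evenmoredata", "Thursday, 22 January 2004 03:15:06"),
]

# Sort chronologically on the parsed fourth field of each tuple.
rows_sorted = sorted(rows, key=lambda row: parse_date(row[3]))
for row in rows_sorted:
    print(row[3])
```

Any string that matches none of the listed formats raises `ValueError`, which makes unexpected layouts visible instead of mis-sorting them.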


Skip Montanaro

Amy> The following three date strings is another example of the various
Amy> date formats I will encounter here.

Amy> Thursday, 22 January 2004 03:15:06
Amy> Thursday, January 22, 2004, 03:15:06
Amy> 2004, Thursday, 22 January 03:15:06

Assuming you won't have any ambiguous dates (like 1/3/04), just define
regular expressions which label the various fields of interest, then match
your string against them until you get a hit. For
example, the first would be matched by this:
>>> import re
>>> pat = re.compile(r'(?P<wkday>[A-Z][a-z]+),\s+(?P<day>[0-9]{1,2})\s+'
...                  r'(?P<month>[A-Z][a-z]+)\s+(?P<year>[0-9]{4})')
>>> mat = pat.match('Thursday, 22 January 2004 03:15:06')
>>> mat
<_sre.SRE_Match object at 0x...>
>>> mat.groups()
('Thursday', '22', 'January', '2004')
>>> mat.group('month')
'January'

etc. (Extending the regexp to accommodate the time is left as an exercise.)
Once you have a match, pull out the relevant bits, maybe tweak them a bit
(int()-ify things), then create a datetime instance from the result.
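
As a rough illustration of that last step, the sketch below extends the first pattern to capture the time and builds a `datetime` from the named groups. The `MONTHS` table, the group names for the time fields, and `to_datetime` are assumptions added here; they are not taken from Skip's dates module.

```python
import re
from datetime import datetime

# Assumed lookup: full English month names only.
MONTHS = {"January": 1, "February": 2, "March": 3, "April": 4,
          "May": 5, "June": 6, "July": 7, "August": 8,
          "September": 9, "October": 10, "November": 11, "December": 12}

# Matches strings such as 'Thursday, 22 January 2004 03:15:06'.
PAT = re.compile(
    r'(?P<wkday>[A-Z][a-z]+),\s+(?P<day>[0-9]{1,2})\s+'
    r'(?P<month>[A-Z][a-z]+)\s+(?P<year>[0-9]{4})\s+'
    r'(?P<hour>[0-9]{2}):(?P<minute>[0-9]{2}):(?P<second>[0-9]{2})$')

def to_datetime(text):
    """Return a datetime if text matches PAT, otherwise None."""
    mat = PAT.match(text)
    if mat is None:
        return None
    month = MONTHS.get(mat.group('month'))
    if month is None:
        return None
    # int()-ify the numeric fields before building the datetime.
    return datetime(int(mat.group('year')), month, int(mat.group('day')),
                    int(mat.group('hour')), int(mat.group('minute')),
                    int(mat.group('second')))

print(to_datetime('Thursday, 22 January 2004 03:15:06'))
```

Each additional layout would get its own pattern, tried in turn until one returns something other than `None`.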

I do something like this in my dates module. It's old and ugly though:

http://manatee.mojam.com/~skip/python/

Search for "date-parsing module". This was written long before the datetime
module was available and was used for a slightly different purpose. It
recognizes a number of different date range formats in addition to
individual dates. You might be able to snag some regular expression ideas
from it though.

Skip
 
Michael Spencer

This was asked and answered earlier today.

See: https://moin.conectiva.com.br/DateUtil
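
For readers who have not used it, a minimal sketch of what that package offers is below. It assumes the third-party dateutil package (distributed today as python-dateutil) is installed and that its `parser.parse` function recognises the formats involved; the sample tuples are the ones from the original question.

```python
from dateutil.parser import parse  # third-party: python-dateutil

rows = [
    ("data", "moredata", "evenmoredata", "Fri, 23 Jan 2004 00:06:15"),
    ("data", "moredata", "evenmoredata", "Thursday, 22 January 2004 03:15:06"),
]

# parse() returns a datetime.datetime, so the results compare directly.
rows_sorted = sorted(rows, key=lambda row: parse(row[3]))
for row in rows_sorted:
    print(parse(row[3]).isoformat(), row[0])
```

Because the parser guesses the layout, it is worth spot-checking its output on any format that could be ambiguous, such as all-numeric dates.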
 
Amy G

That is exactly what I am looking for. However, I don't have the module
installed. Where can I get it?

 
Amy G

When I tried to run make install, I got the following error message:

warning: install: modules installed to '/usr/lib/python2.2/site-packages/',
which is not in Python's module search path (sys.path) -- you'll have to
change the search path yourself

How do I correct this? Sorry for the newb question.



 
Amy G

Some extra info... when I print sys.path I get this:
['', '/usr/local/lib/python2.2', '/usr/local/lib/python2.2/plat-freebsd5',
'/usr/local/lib/python2.2/lib-tk', '/usr/local/lib/python2.2/lib-dynload',
'/usr/local/lib/python2.2/site-packages']

Doesn't that mean that the directory is already in the path???
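
A quick way to check is to compare the directory named in the warning against the search path directly. The sketch below uses the two sets of paths from the posts above as literal data, since the real values depend on the machine; `install_dir`, `reported_path`, and `on_path` are illustrative names.

```python
import sys

# Directory from the install warning (copied from the post above).
install_dir = '/usr/lib/python2.2/site-packages/'

# Search path as reported above; on a live system use sys.path itself.
reported_path = ['', '/usr/local/lib/python2.2',
                 '/usr/local/lib/python2.2/plat-freebsd5',
                 '/usr/local/lib/python2.2/lib-tk',
                 '/usr/local/lib/python2.2/lib-dynload',
                 '/usr/local/lib/python2.2/site-packages']

def on_path(directory, path_entries):
    """True if directory (ignoring a trailing slash) is in path_entries."""
    wanted = directory.rstrip('/')
    return wanted in [entry.rstrip('/') for entry in path_entries]

print(on_path(install_dir, reported_path))
print(on_path('/usr/local/lib/python2.2/site-packages', reported_path))

# One workaround is to extend the search path at run time.
if not on_path(install_dir, sys.path):
    sys.path.append(install_dir.rstrip('/'))
```

The warning names a directory under /usr/lib, while the listed search path only contains directories under /usr/local/lib, so the two site-packages directories are not the same.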

 
Amy G

Okay. I fixed the problem somewhat. I moved the dateutil directory over to
/usr/local/lib/python2.2/site-packages and I can now import dateutil. But a
call like this:

from dateutil.parser import parse

results in this error:

ImportError: cannot import name parse

I can 'from dateutil.parser import *' but cannot use parse after that.
I can also 'from dateutil import parser' but that doesn't help either.

Sorry for my inexperience here. Thanks for all of the help already.

 
John Roth

This is what I use to parse dates of unknown provenance.
It's laughably overengineered, and I don't include the day
of the week or the time. Given your examples, though,
those should be easy enough to deal with.

HTH
John Roth

class DateContainer(object):
    _typeDict = {}
    _stringValue = ""
    _typeDict["stringValue"] = "String"
    _typeDict["value"] = "String"
    _value = ""
    year = 1
    month = 1
    day = 1
    bc = ""

    def _checkUserFriendlyDate(self, date):
        # The rules for a user friendly date are:
        # 1. The year must be at least three digits, including
        #    leading zeroes if necessary. Day and numeric month
        #    may be no longer than 2 digits.
        # 2. The month may be alphabetic or numeric. If it's
        #    alphabetic, it must be at least three letters long.
        # 3. The epoch may be ad, bc, bce or ce. If omitted, it's
        #    assumed to be ad.
        # 4. After removing the year, epoch and an alphabetic month,
        #    the remaining single piece is the day, or the piece that
        #    is greater than 12.
        # 5. If two pieces remain, the first is the month, the second
        #    is the day. Both are between 1 and 12, inclusive.
        # NOTE: dateTimeParse() is not included in this post; it
        # presumably returns a list of upper-case string parts.
        partList = dateTimeParse(date)
        if not (2 < len(partList) < 5):
            raise ValueError("incorrect part list: %s" % (partList,))
        bc = self._findBC(partList)
        if len(partList) != 3:
            raise ValueError("too many components in date: '%s'" % date)
        year = self._findYear(partList)
        month = self._findAlphaMonth(partList)
        if month != 0:
            day = partList[0]
        else:
            day = self._findDay(partList)
            if day:
                month = partList[0]
            else:
                month, day = partList
        year = self._checkNum(year, 4712)
        day = self._checkNum(day, 31)
        month = self._checkNum(month, 12)
        if bc in ("AD", "CE"):
            bc = ""
        self.year, self.month, self.day, self.bc = year, month, day, bc
        return True

    def _checkNum(self, num, limit):
        # Convert to int and reject values above the given limit.
        result = int(num)
        if result > limit:
            raise ValueError("number '%s' out of range '%s'" % (num, limit))
        return result

    def _findBC(self, partList):
        # Remove and return the epoch marker, if there is one.
        for i in range(len(partList)):
            word = partList[i]
            if word in ("AD", "BC", "CE", "BCE"):
                del partList[i]
                return word
        # XXX if len(partList > 3): error
        return ""

    def _findYear(self, partList):
        # The year is the first all-digit part longer than two digits.
        for i in range(len(partList)):
            word = partList[i]
            if len(word) > 2 and word.isdigit():
                del partList[i]
                return word
        raise ValueError

    def _findAlphaMonth(self, partList):
        # Return 1-12 for an alphabetic month, or 0 if none is present.
        for i in range(len(partList)):
            word = partList[i]
            if word.isalpha():
                del partList[i]
                return ['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN',
                        'JUL', 'AUG', 'SEP', 'OCT', 'NOV',
                        'DEC'].index(word[:3]) + 1
        return 0

    def _findDay(self, partList):
        # A numeric part greater than 12 can only be the day.
        for i in range(len(partList)):
            word = partList[i]
            if word.isdigit() and int(word) > 12:
                del partList[i]
                return word
        return ""

    def _getStringValue(self):
        return self._stringValue

    def _setStringValue(self, value):
        self._checkUserFriendlyDate(value)
        self._stringValue = value

    _typeDict["stringValue"] = "String"
    stringValue = property(_getStringValue, _setStringValue,
                           doc="User Friendly Date")

    def _getValue(self):
        isoDate = "%04u-%02u-%02u %s" % (self.year, self.month, self.day,
                                         self.bc)
        return isoDate.strip()

    def _checkISODate(self, value):
        year = self._checkNum(value[:4], 4712)
        month = self._checkNum(value[5:7], 12)
        day = self._checkNum(value[8:10], 31)
        bc = ""
        if len(value) > 10:
            bc = value[11:].upper()
            if bc not in ("AD", "BC", "BCE", "CE"):
                raise ValueError("unknown epoch: '%s'" % bc)
        if bc in ("AD", "CE"):
            bc = ""
        self.year, self.month, self.day, self.bc = year, month, day, bc
        return

    def _setValue(self, value):
        self._checkISODate(value)
        isoDate = "%04u-%02u-%02u %s" % (self.year, self.month, self.day,
                                         self.bc)
        self.stringValue = isoDate
        return None

    value = property(_getValue, _setValue,
                     doc="ISO Standard Format Date")
 
wes weston

#!/usr/local/bin/python -O

# NOTE: add missing MONTHS; DAYS not used

import datetime
import string

dateList = ["Fri, 23 Jan 2004 00:06:15",
            "Thursday, 22 January 2004 03:15:06",
            "Thursday, January 22, 2004, 03:15:06",
            "2004, Thursday, 22 January 03:15:06"]
MONTHS = [("JANUARY", 1),
          ("JAN", 1),
          ("FEBRUARY", 2),
          ("FEB", 2),
          # etc
          ]
DAYS = [("Monday", 0),
        ("Mon", 0),
        # ........
        ("Thursday", 3),
        ("Thur", 3)
        ]
#--------------------------------------------------------------------
def GetMonthInt(mstr):
    # Return the number of the month whose name contains mstr, or -1.
    #print("mstr=", mstr)
    for t in MONTHS:
        if t[0].find(mstr) > -1:
            return t[1]
    return -1
#--------------------------------------------------------------------
class MyDateTime:
    def __init__(self, oddstr):
        # Drop time tokens (anything with a colon) and trailing commas.
        tokens = oddstr.split()
        temp = []
        for t in tokens:
            if t.find(":") > -1:
                continue
            if t[-1] == ',':
                t = t[:-1]
            temp.append(t)
        tokens = temp
        #for t in tokens:
        #    print(t)
        year = -1
        month = -1
        day = -1
        for t in tokens:
            if t[0] in string.digits:
                # Numbers above 31 are taken as the year, others as the day.
                x = int(t)
                if x > 31:
                    year = x
                else:
                    day = x
                continue
            t = t.upper()
            if t[0] in string.ascii_uppercase:
                x = GetMonthInt(t)
                if x != -1:
                    month = x
                    continue
        if year > -1 and month > -1 and day > -1:
            self.Date = datetime.date(year, month, day)
        else:
            self.Date = None

    def Show(self):
        print(self.Date.ctime())

#--------------------------------------------------------------------
if __name__ == '__main__':
    for date in dateList:
        dt = MyDateTime(date)
        dt.Show()
 
Michael Spencer

[Fixed top-posts: please add future comments at the bottom]

Amy:

The docstring of the dateutil package:
"""
Copyright (c) 2003 Gustavo Niemeyer <[email protected]>

This module offers extensions to the standard python 2.3+
datetime module.
"""
__author__ = "Gustavo Niemeyer <[email protected]>"
__license__ = "PSF License"

notes that it requires the datetime module (new in Python 2.3). It appears
that you are using 2.2. Can you install 2.3.3?
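
A small check along these lines can confirm which situation applies before installing anything. It is only a sketch: `has_datetime` is a made-up helper, and it tests for the module itself and not for a particular Python release.

```python
import sys

def has_datetime():
    """Return True if the standard datetime module can be imported."""
    try:
        import datetime  # in the standard library from Python 2.3 onward
    except ImportError:
        return False
    return True

print("Python version: %d.%d" % sys.version_info[:2])
print("datetime available: %s" % has_datetime())
```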

If you need to stick with 2.2, the question of using datetime in 2.2 was
answered in
http://groups.google.com/[email protected]&rnum=2
with a pointer to a "workalike" module, see:
http://cvs.zope.org/Zope3/src/datetime/

I have not tested whether this works with dateutil.

Cheers

Michael
 
