data regex match

Gary Wessle

Hi

I am having an issue with this match

tx = "now 04/30/2006 then"
data = re.compile('(\d{2})/\1/\1\1', re.IGNORECASE)
d = data.search(tx)
print d

None
I was expecting 04/30/2006, what went wrong?

thanks
 
Fredrik Lundh

Gary said:
I am having an issue with this match

tx = "now 04/30/2006 then"
data = re.compile('(\d{2})/\1/\1\1', re.IGNORECASE)
d = data.search(tx)
print d

None
I was expecting 04/30/2006

really? your pattern matches two digits, followed by a slash, followed
by a byte with the ASCII value 1, followed by a slash, followed by two
bytes with the ASCII value 1. the repr of your pattern string is:

'(\\d{2})/\x01/\x01\x01'

in case you meant to write

r'(\d{2})/\1/\1\1'

(which is the same thing as '(\\d{2})/\\1/\\1\\1')

it's still not close; that pattern matches two digits, followed by a slash,
followed by the *same* two digits, followed by a slash, followed by the
same two digits, followed by the same two digits.

in other words, dates like 20/20/2020 and 12/12/1212.

try

'\d\d/\d\d/\d\d\d\d'

instead.

Gary said:
what went wrong?

(insert obligatory jwz quote here)

</F>
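
Fredrik's two points can be checked directly. The following is a minimal sketch, assuming Python 3 syntax (the thread itself uses Python 2 `print` statements); the names `backref`, `plain` and `samples` are made up for illustration. It compares the raw-string backreference pattern with the four independent digit groups suggested above.

```python
import re

# Raw string, so \1 reaches the regex engine as a backreference to group 1.
backref = re.compile(r'(\d{2})/\1/\1\1')

# Independent digit classes, as suggested in the reply above.
plain = re.compile(r'\d\d/\d\d/\d\d\d\d')

samples = ["now 04/30/2006 then", "20/20/2020", "12/12/1212"]

for text in samples:
    b = backref.search(text)
    p = plain.search(text)
    # group(0) is the whole matched substring; None means no match.
    print(text,
          "| backref:", b.group(0) if b else None,
          "| plain:", p.group(0) if p else None)
```

The backreference pattern only succeeds when every two-digit chunk repeats the first one, which is why the original date is rejected while 20/20/2020 is accepted.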
 
Heiko Wundram

On Tuesday, 02 May 2006 23:06, Gary Wessle wrote:
Hi

I am having an issue with this match

tx = "now 04/30/2006 then"
data = re.compile('(\d{2})/\1/\1\1', re.IGNORECASE)

As always, use a raw string for regular expressions. \d is being interpreted
to mean an ascii character, and not to mean the character class you're trying
to reference here.

Second: \1 (if properly quoted in the string) matches the first group exactly.
Your regex would only match 20/20/2020 or strings of such format.

Third: IGNORECASE is irrelevant here, you're not trying to match letters, are
you?

Anyway, the following works:

dateregex = re.compile(r"(\d{2})/(\d{2})/(\d{4})")
m = dateregex.search(tx)
if m:
    print(m.groups())
else:
    print("No match.")

--- Heiko.
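
For readers on Python 3, Heiko's snippet could be wrapped up roughly as follows. This is only a sketch: `DATE_RE` and `find_date` are hypothetical names, and reading the groups as month, day and year assumes the US ordering of the sample string.

```python
import re

# Same pattern as in the post above: three independent capturing groups.
DATE_RE = re.compile(r"(\d{2})/(\d{2})/(\d{4})")


def find_date(text):
    """Return the three captured digit strings of the first match, or None."""
    m = DATE_RE.search(text)
    return m.groups() if m else None


print(find_date("now 04/30/2006 then"))
print(find_date("no date here"))
```

Note that the pattern only checks the shape of the text. It does not verify that the digits form a valid calendar date, and it will also match inside a longer run of digits.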
 
Rene Pijlman

Gary Wessle:
tx = "now 04/30/2006 then"
data = re.compile('(\d{2})/\1/\1\1', re.IGNORECASE)
d = data.search(tx)
print d

None
I was expecting 04/30/2006

You should expect: NameError: name 're' is not defined

Gary Wessle:
what went wrong?

\1 matches the content of the first group, which is '04'. It doesn't match
'30', '20' and '06'.

Also, you'll need to use a raw string, or backslash your backslashes.
<_sre.SRE_Match object at 0x01287F60>
 
Fredrik Lundh

Heiko said:
As always, use a raw string for regular expressions. \d is being interpreted
to mean an ascii character, and not to mean the character class you're trying
to reference here.

\d isn't an ASCII character, but \1 is.
(\d{2})/?/??

</F>
 
Heiko Wundram

On Tuesday, 02 May 2006 23:34, Fredrik Lundh wrote:
\d isn't an ASCII character, but \1 is.

I tried that just now. Didn't know that \[a-z] weren't all interpreted as
escape sequences... Seems like I learn something every day. ;-)

--- Heiko.
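
The difference between \d and \1 in an ordinary string literal can be inspected with a few lines of Python. This is a sketch assuming Python 3; the variable names are illustrative. The non-raw pattern is written with `\\d` so that newer interpreters do not warn about the unrecognised escape `\d`, and the resulting string is the one shown earlier in the thread.

```python
import re

# In a normal string literal, '\1' is an octal escape: one character, code 1.
one = '\1'
print(len(one), ord(one))

# In a raw string the backslash survives, so the regex engine sees "\1".
raw_one = r'\1'
print(len(raw_one), raw_one == '\\1')

# What the original non-raw literal evaluates to.
non_raw = '(\\d{2})/\1/\1\1'
print(repr(non_raw))

# It never matches the date, but it does match literal \x01 characters.
print(re.search(non_raw, "now 04/30/2006 then"))
print(re.search(non_raw, "04/\x01/\x01\x01") is not None)
```

Using a raw string for every regular expression avoids having to remember which backslash sequences the string parser consumes.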
 