regex with specific list of string

james_027

hi,

How do I write a regex that matches any one of these values: 'jan',
'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov',
'dec'?

Thanks
james
 
Carsten Haese

james_027 said:
hi,

How do I write a regex that matches any one of these values: 'jan',
'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov',
'dec'?

Why regex? You can simply check if the given value is contained in the
set of allowed values:
>>> s = set(['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec'])
>>> 'jan' in s
True
>>> 'spam' in s
False

HTH,
 
Pablo Ziliani

Carsten said:
hi,

How do I write a regex that matches any one of these values: 'jan',
'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov',
'dec'?

Why regex? You can simply check if the given value is contained in the
set of allowed values:

s = set(['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
'sep', 'oct', 'nov', 'dec'])

Also, check the calendar module for a locale-aware (vs. hardcoded) version:
>>> import calendar
>>> [calendar.month_abbr[i].lower() for i in range(1, 13)]
['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec']

If you still want to use regexes, you can do something like:
>>> import re
>>> pattern = '(?:%s)' % '|'.join(calendar.month_abbr[1:13])
>>> pattern
'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)'
>>> match = re.search(pattern, "we are in september", re.IGNORECASE)
>>> match.group()
'sep'
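
A small sketch of the same idea with each name passed through re.escape first. That only matters if a locale's abbreviations happen to contain regex metacharacters (a trailing dot, say), but it costs nothing. The helper name month_pattern is made up for illustration; calendar.month_abbr and re.escape are the standard-library pieces. What it matches depends on the active locale; the example assumes English month names.

```python
import calendar
import re


def month_pattern():
    """Build a non-capturing alternation of the locale's month abbreviations."""
    # month_abbr[0] is an empty string, so start at index 1.
    names = [re.escape(name) for name in calendar.month_abbr[1:13]]
    return '(?:%s)' % '|'.join(names)


pattern = month_pattern()
match = re.search(pattern, "we are in september", re.IGNORECASE)
# Guard against a failed search before calling .group().
found = match.group() if match else None
print(found)
```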

If you want to make sure that the month name begins a word, use the following pattern instead:
>>> pattern = r'(?:\b%s)' % r'|\b'.join(calendar.month_abbr[1:13])
>>> pattern
'(?:\\bJan|\\bFeb|\\bMar|\\bApr|\\bMay|\\bJun|\\bJul|\\bAug|\\bSep|\\bOct|\\bNov|\\bDec)'

If in doubt, Google for "regular expressions in python" or go to http://docs.python.org/lib/module-re.html


Regards,
Pablo
 
Carsten Haese

Unfortunately, that also matches margarine, mayonnaise, and octopus,
just to name a few ;-)
 
Pablo Ziliani

Carsten said:
Unfortunately, that also matches margarine, mayonnaise, and octopus,
just to name a few ;-)

(and so does the solution you sent before :)

This is fine IMO since the OP didn't specify the opposite.

BTW, in my previous post I included an example that ensures that the
month matches at the beginning of a word. That was on the assumption that
he might want to match e.g. "dec" against "December" (BTW, it should
have been r'\b(?:Jan|Feb|...)' instead). To always match a whole word, a
trailing \b can be added to the pattern OR (much better) if the month
can appear both in its abbreviated and full form, he can use the
extended set as follows (I hope this is clear, excuse my Thunderbird...):
>>> pattern = r"\b(?:%s)\b" % '|'.join(calendar.month_name[1:13] + calendar.month_abbr[1:13])
>>> pattern
'\\b(?:January|February|March|April|May|June|July|August|September|October|November|December|Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\\b'
>>> target = "Unlike Julia, I like apricots with mayo in august or sep"
>>> target
'Unlike Julia, I like apricots with mayo in august or sep'
>>> re.search(pattern, target, re.IGNORECASE).group()
'august'
>>> re.findall(pattern, target, re.IGNORECASE)
['august', 'sep']
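
To see what each boundary buys you, here is a rough side-by-side of the three variants discussed in this thread: no boundary, a leading \b, and \b on both sides. The month list is hardcoded and the target sentence is invented for the comparison; only the abbreviations are used, so "august" is not a whole-word match here.

```python
import re

MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
          'jul', 'aug', 'sep', 'oct', 'nov', 'dec']
alternation = '|'.join(MONTHS)

# Matches a month abbreviation anywhere, even in the middle of a word.
unbounded = re.compile(r'(?:%s)' % alternation, re.IGNORECASE)
# Matches only where the abbreviation starts a word.
word_start = re.compile(r'\b(?:%s)' % alternation, re.IGNORECASE)
# Matches only when the abbreviation is the whole word.
whole_word = re.compile(r'\b(?:%s)\b' % alternation, re.IGNORECASE)

target = "Summary: Julia likes mayo in august or sep"

print(unbounded.findall(target))   # ['mar', 'Jul', 'may', 'aug', 'sep']
print(word_start.findall(target))  # ['Jul', 'may', 'aug', 'sep']
print(whole_word.findall(target))  # ['sep']
```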


Regards,
Pablo
 
Carsten Haese

(and so does the solution you sent before :)

No, it doesn't.
>>> s = set(['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
...          'sep', 'oct', 'nov', 'dec'])
>>> 'margarine' in s
False
>>> 'mayonnaise' in s
False
>>> 'octopus' in s
False

This is fine IMO since the OP didn't specify the opposite.

True, but my crystal ball tells me that the OP wants exact matches.
(Extrapolating from another post made by the OP earlier today, I'm
guessing he has a list of column names to include in an SQL "order by"
clause, and he wants to check them for validity first.)
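
If that guess is right, the check might look roughly like the sketch below. Everything in it is hypothetical: the column names, the validate_order_by helper, and the way the clause is assembled are only there to show exact-match validation against a fixed set.

```python
# Hypothetical whitelist of sortable columns; not taken from the OP's code.
ALLOWED_COLUMNS = frozenset(['name', 'created', 'price'])


def validate_order_by(columns):
    """Return the columns as a list if every one is allowed, else raise ValueError."""
    unknown = [col for col in columns if col not in ALLOWED_COLUMNS]
    if unknown:
        raise ValueError("unknown column(s): %s" % ', '.join(unknown))
    return list(columns)


# Only names that matched the whitelist exactly end up in the clause.
order_clause = "ORDER BY " + ", ".join(validate_order_by(['name', 'price']))
print(order_clause)  # ORDER BY name, price
```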
 
Steve Holden

Carsten said:
(and so does the solution you sent before :)

No, it doesn't.
>>> s = set(['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
...          'sep', 'oct', 'nov', 'dec'])
>>> 'margarine' in s
False
>>> 'mayonnaise' in s
False
>>> 'octopus' in s
False

This is fine IMO since the OP didn't specify the opposite.

True, but my crystal ball tells me that the OP wants exact matches.
(Extrapolating from another post made by the OP earlier today, I'm
guessing he has a list of column names to include in an SQL "order by"
clause, and he wants to check them for validity first.)
Well, as somebody else already pointed out, the OP's query was
completely misconceived in the first place, and he would have been
better recasting it in a more natural way.

However, I am not going to claim that my psychic powers are clearly
superior to yours ...

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline
 
james_027

Hi all,

Pablo said:
This is fine IMO since the OP didn't specify the opposite.

Thanks for all your replies, though I don't quite follow the ongoing
argument. What do you mean when you say "the OP didn't specify
the opposite"?

The reason for using a regex is that I am going to use it in
Django's URL pattern.

Thanks
james
 
Steve Holden

james_027 said:
Hi all,


Thanks for all your replies, though I don't quite follow the ongoing
argument. What do you mean when you say "the OP didn't specify
the opposite"?

The reason for using a regex is that I am going to use it in
Django's URL pattern.
Carsten was pointing out that the pattern I gave you would match any
string that *began* with one of the month names, as I didn't include an
element to force a match of the end of the string.

I did this because I assumed you were most interested in finding out how
to match one of a number of alternate strings, and this would likely
only be a part of your final pattern.

If you already have what you need you really don't need to pay much
attention to the rest: it's just geeks picking nits!
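
For the URL case, the alternation would normally sit inside a larger pattern that is anchored at both ends, which settles the margarine question. A rough sketch using only the re module: the archive/ prefix, the month group name, and the month_from_path helper are invented for illustration, and how such a string gets registered with Django depends on the Django version, so that part is left out.

```python
import re

MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
          'jul', 'aug', 'sep', 'oct', 'nov', 'dec']

# ^ and $ force the whole path to match; the named group captures the month.
url_regex = r'^archive/(?P<month>%s)/$' % '|'.join(MONTHS)
compiled = re.compile(url_regex)


def month_from_path(path):
    """Return the captured month, or None if the path does not match."""
    m = compiled.match(path)
    return m.group('month') if m else None


print(month_from_path('archive/mar/'))        # mar
print(month_from_path('archive/margarine/'))  # None
```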

regards
Steve
 
