Regular expression


Sallu

Hi All,
I have a textbox in which I want to prevent the user from
entering accented characters such as ( é ).
I wrote this program:

import re
value="this is Praveen"
#value = 'riché gerry'
if(re.search(r"^[A-Za-z0-9]*$",value)):
    print "Not allowed accent character"
else:
    print "Valid"

output :

sys:1: DeprecationWarning: Non-ASCII character '\xc3' in file regu1.py
on line 3, but no encoding declared; see http://www.python.org/peps/pep-0263.html
for details
Valid

When I comment out value="this is Praveen" and uncomment
value = 'riché gerry'
I still get the same output, even though the value contains an accented character.
 

Sallu

Hi,
Your post is not about re, but about encoding. Next time,
be more careful when choosing the topic for your post!
Did you check what PEP 263 says about encoding?
One of the first things it says is:

"(...)
Defining the Encoding
Python will default to ASCII as standard encoding if no other
encoding hints are given.
(...)"

So when you're using non-ASCII characters you should always
specify the encoding. Again, read PEP 263 for how this can
be done, especially the section "Defining the Encoding", which
describes multiple ways of doing that.
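
To make that concrete, here is a minimal sketch of one of the declaration forms PEP 263 describes, together with the check from the original question. It assumes the file is really saved as UTF-8. The names ASCII_ONLY and is_plain_ascii are made up for illustration, and the space in the character class is an assumption so that multi-word input passes. It is written so that it should run on Python 3 as well as on recent Python 2.

```python
# -*- coding: utf-8 -*-
# PEP 263 declaration: must appear on line 1 or 2 of the source file and
# must match the encoding the file is actually saved in (UTF-8 assumed here).
import re

# Hypothetical pattern: ASCII letters, digits and spaces only. The space is
# an assumption so that multi-word input such as a full name is accepted.
ASCII_ONLY = re.compile(u"^[A-Za-z0-9 ]*$")


def is_plain_ascii(value):
    """Return True when value holds only ASCII letters, digits and spaces."""
    return ASCII_ONLY.match(value) is not None


for value in (u"this is Praveen", u"riché gerry"):
    if is_plain_ascii(value):
        print("Valid")
    else:
        print("Not allowed accent character")
```

Note that in this sketch a match means the value is acceptable, so the "Valid" branch is the one taken on a match.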

I am sorry, sotys. I am actually very new to Python.
import re
import os, sys

string = 'riché'
print string


def strip_accents(string):
    import unicodedata
    return unicodedata.normalize('NFKD', unicode(string)).encode('ASCII', 'ignore')


msg=strip_accents(string)
print msg

Output :

sys:1: DeprecationWarning: Non-ASCII character '\xc3' in file regu.py
on line 4, but no encoding declared; see http://www.python.org/peps/pep-0263.html
for details
riché
Traceback (most recent call last):
  File "regu.py", line 13, in ?
    msg=strip_accents(string)
  File "regu.py", line 10, in strip_accents
    return unicodedata.normalize('NFKD', unicode(string)).encode('ASCII', 'ignore')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
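
The UnicodeDecodeError above is raised by unicode(string), which decodes the byte string with the default ASCII codec and fails on the first byte of é. A minimal sketch of one possible way around it, assuming the input bytes are UTF-8 (the encoding parameter is an addition for illustration, not part of the original function), is to decode explicitly before normalizing:

```python
# -*- coding: utf-8 -*-
import unicodedata


def strip_accents(text, encoding="utf-8"):
    """Return text with accents removed, as plain ASCII-only text.

    The encoding parameter is an assumption: it must name the encoding
    the incoming bytes were actually written in.
    """
    # Decode byte strings explicitly instead of relying on the ASCII default.
    if isinstance(text, bytes):
        text = text.decode(encoding)
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding to ASCII with 'ignore' drops the combining marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(strip_accents(u"riché"))
```

Keep in mind that this silently drops any character with no ASCII base letter, so it suits cleaning input rather than rejecting it.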
 
