How to check...

Lad

Hello,
How can I check that a string does NOT contain NON English characters?
Thanks
L.
 
augustus.kling

Hello,

Try using regular expressions. I'm afraid I don't have any
documentation at hand, but that should give you a starting point for a
web search.

Greetings
 
augustus.kling

Additional info: you will find documentation in the Python help utility by
typing the module name 're' or 'sre'.
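
As a rough illustration of the regular-expression route, here is a minimal sketch. It assumes "English" simply means "ASCII only"; the names ASCII_ONLY and is_ascii_re are made up for this example, and the character class would need adjusting for any other definition.

```python
import re

# Assumption: "English" means every character is in the ASCII range.
# \Z anchors the match at the very end of the string.
ASCII_ONLY = re.compile(r'[\x00-\x7f]*\Z')

def is_ascii_re(text):
    """Return True if text consists only of ASCII characters."""
    return ASCII_ONLY.match(text) is not None

print(is_ascii_re("Hello, world"))
print(is_ascii_re("caf\u00e9"))
```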
 
Daniel Marcel Eichler

Lad said:
How can I check that a string does NOT contain NON English characters?

try:
    foobar.encode('ascii')
except UnicodeError:
    bla  # placeholder: handle the non-ASCII case here

or use string.ascii_letters and enhance it.
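
Wrapped up as a function, the encode approach might look like the sketch below. The name is_ascii_encodable is hypothetical, and the code assumes the input is a text string rather than raw bytes.

```python
def is_ascii_encodable(text):
    """Return True if text can be encoded as ASCII without errors."""
    try:
        text.encode('ascii')
    except UnicodeError:
        # Raised when a character falls outside the ASCII range.
        return False
    return True

print(is_ascii_encodable("plain text"))
print(is_ascii_encodable("na\u00efve"))
```

Catching UnicodeError rather than using a bare except keeps unrelated errors (for example, passing in something that is not a string) visible.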


Kind regards

Daniel
 
John Zenger

This should be just a matter of determining how your string is encoded
(ASCII, UTF, Unicode, etc.) and checking the ord of each character to
see if it is in the contiguous range of English characters for that
encoding. For example, if you know that the string is ASCII or UTF-8,
you could check ord for each character and confirm it is less than 128.
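
A minimal sketch of that check, assuming Python 3, where text strings and byte strings are separate types. The function names are invented for this example.

```python
def is_ascii_text(text):
    """True if every code point in a str is below 128."""
    return all(ord(ch) < 128 for ch in text)

def is_ascii_bytes(data):
    """True if every byte in a bytes object (e.g. UTF-8 data) is below 128."""
    # Iterating over bytes yields integers in Python 3, so no ord() is needed.
    return all(byte < 128 for byte in data)

print(is_ascii_text("abc"))
print(is_ascii_bytes("\u00e9".encode("utf-8")))
```

The byte-level version works for UTF-8 because every non-ASCII character is encoded there using only bytes of 128 or above.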
 
Neil Hodgson

Lad:
> How can I check that a string does NOT contain NON English characters?

It depends on how you define the set of English characters which is
as much a matter of opinion or authority as fact. The following may be
regarded as English despite containing 9 (8 unique) non-ASCII characters:
The €200 encyclopædia deﬁnes the “coördinates” in ¼ ångströms.
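
The count can be checked with a short sketch. It assumes the example sentence contains a euro sign, an æ, an fi ligature, curly quotes, two ö, a ¼ and an å, written here as Unicode escapes so the snippet does not depend on file encoding.

```python
# Assumed reading of the example sentence, spelled with escapes.
sentence = ("The \u20ac200 encyclop\u00e6dia de\ufb01nes the "
            "\u201cco\u00f6rdinates\u201d in \u00bc \u00e5ngstr\u00f6ms.")

# Collect every character outside the ASCII range.
non_ascii = [ch for ch in sentence if ord(ch) > 127]

print(len(non_ascii), len(set(non_ascii)))  # 9 8
```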

Neil
 
Steven D'Aprano

> Hello,
>
> try using regular expressions.

"Some people, when confronted with a problem, think 'I know, I'll use
regular expressions'. Now they have two problems." -- Jamie Zawinski

The original poster asked:

"How can I check that a string does NOT contain NON English characters?"

REs are rather overkill for something so simple, don't you think?

import string
english = string.printable # is this what you want?
english = string.ascii_letters + string.digits # or maybe this?
english = "abc..." # or just manually set the characters yourself

for c in some_string:
    if c not in english:
        print("Not English!!!")
        break
else:
    # the loop finished without hitting break
    print("English!")



If you want it as a function, it is even more flexible:

def all_good(s, goodchars=None):
    # default to every printable ASCII character
    if goodchars is None:
        goodchars = string.printable
    for c in s:
        if c not in goodchars:
            return False
    return True
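
To show how much the answer depends on the chosen character set, here is a sketch of a set-based variant with a few calls. The name all_good_set is made up for this example, and the set conversion is just one possible way to speed up membership tests on long strings.

```python
import string

def all_good_set(s, goodchars=string.printable):
    """Return True if every character of s is in goodchars."""
    allowed = set(goodchars)  # set lookup avoids rescanning goodchars
    return all(c in allowed for c in s)

# Printable ASCII accepts punctuation and spaces.
print(all_good_set("Hello, world!"))
# Letters and digits only: the comma, space and "!" are rejected.
print(all_good_set("Hello, world!", string.ascii_letters + string.digits))
# Any non-ASCII character fails the default check.
print(all_good_set("na\u00efve"))
```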
 
Scott David Daniels

Lad said:
Hello,
How can I check that a string does NOT contain NON English characters?
Thanks
L.

If all you care about is ASCII vs. non-ASCII, you could use:

ord(max(string)) < 128
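
One caveat: max() raises ValueError on an empty string. A minimal sketch that guards against this, using the hypothetical name is_ascii_max and treating the empty string as ASCII by assumption:

```python
def is_ascii_max(text):
    """True if the highest code point in text is below 128."""
    # An empty string has no characters, so it is treated as ASCII here.
    return not text or ord(max(text)) < 128

print(is_ascii_max("abc"))
print(is_ascii_max("abc\u00e9"))
print(is_ascii_max(""))
```

On Python 3.7 and later, the built-in str.isascii() method performs the same check.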
 
