String to valid Python identifier

  • Thread starter Дамјан Георгиевски

Дамјан Георгиевски

Is there an easy function in the stdlib to convert an arbitrary string
into a valid Python identifier, possibly by replacing invalid characters
with _ ?

Python 2.x only, so no need to do Unicode.


--
дамјан ( http://softver.org.mk/damjan/ )

Religion ends and philosophy begins,
just as alchemy ends and chemistry begins
and astrology ends, and astronomy begins.
 

MRAB

Дамјан Георгиевски said:
Is there any easy function in the stdlib to convert any random string in
a valid Python identifier .. possibly by replacing non-valid characters
with _ ?

Python 2.x only, so no need to do Unicode.

Look up the 'maketrans' function in the 'string' module and the
'translate' method of the 'str' class.
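
A minimal sketch of that suggestion, not MRAB's own code. The names VALID, INVALID, TABLE and sanitize are made up for illustration, and the try/except is only there so the same snippet runs on Python 2 (string.maketrans) and Python 3 (str.maketrans). It assumes single-byte input, in line with the "no Unicode" constraint of the question.

```python
import string

# Characters allowed anywhere in an ASCII identifier.
VALID = string.ascii_letters + string.digits + "_"

# Every single-byte character that is NOT allowed (hypothetical helper name).
INVALID = "".join(chr(i) for i in range(256) if chr(i) not in VALID)

try:
    maketrans = string.maketrans   # Python 2
except AttributeError:
    maketrans = str.maketrans      # Python 3

# Map each invalid character to an underscore.
TABLE = maketrans(INVALID, "_" * len(INVALID))

def sanitize(s):
    """Replace characters outside [A-Za-z0-9_] with '_'."""
    return s.translate(TABLE)
```

Note that this does nothing about a leading digit, and that under Python 3 characters above U+00FF would pass through unchanged because the table only covers code points 0 to 255.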
 

Martin v. Löwis

Is there any easy function in the stdlib to convert any random string in
a valid Python identifier .. possibly by replacing non-valid characters
with _ ?

I think this is fairly underspecified as a problem statement. A solution
that would meet your specification would be

def mkident(s):
    return "foo"

It returns a valid Python identifier for any random string.

If you now complain that this gives too many collisions, I propose

def mkident(s):
    return "foo%d" % (hash(s) & 0x7fffffff)

Regards,
Martin
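
To make "valid Python identifier" concrete, here is a small checker, offered as a sketch rather than anything from the thread. It assumes the Python 2 rule (an ASCII letter or underscore followed by ASCII letters, digits or underscores) and also rejects reserved words, which is an extra assumption the original question does not mention. The name is_identifier is hypothetical.

```python
import keyword
import re

# Python 2 identifier syntax: [A-Za-z_] followed by [A-Za-z0-9_]*.
_IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")

def is_identifier(s):
    """True if s matches the ASCII identifier syntax and is not a keyword."""
    return bool(_IDENT_RE.match(s)) and not keyword.iskeyword(s)
```

Both mkident variants above pass this check for any input, which is exactly the point being made: validity alone is easy to satisfy, so the real requirement must be something more.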
 

Дамјан Георгиевски

I think this is fairly underspecified as a problem statement. A
solution that would meet your specification would be

True. I was thinking that if there were an obvious solution I'd get the
answer right away, so it seems there isn't one.

I did try the string.maketrans way, but it was a bit ugly, so I decided
to ask here in case I was missing something. Maybe I'll try maketrans
again.


PS:
By "convert" I didn't mean a full transformation. I was hoping for the
least intrusive transformation that would still leave a discernible
resemblance to the original string.
 

Vlastimil Brom

2009/8/8 Дамјан Георгиевски said:
....

ps.
by "convert" I didn't mean a full transformation... I was more hoping
for the least intrusive transformation that would still leave
discernible resemblance of the original string.




Depending on your needs (speed, readability, ...), a regular
expression replacement might be viable too:

    import re
    print re.sub(r"[^a-zA-Z0-9_]", r"_", u"aábc d:eé_123's - foo?!var")
    # prints: a_bc_d_e__123_s___foo__var
    print re.sub(r"\W", r"_", u"aábc d:eé_123's - foo?!var")  # equivalent
    # prints: a_bc_d_e__123_s___foo__var

Additionally, as with the other approaches, you also have to check for
the (illegal) starting digit and replace it or prepend some valid
prefix.
hth
vbr
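
Putting the regex replacement and the leading-digit check together might look like the sketch below. The function name make_identifier and the prefix parameter are hypothetical. Handling of reserved words and of the empty string are additions that nobody in the thread asked for, included here only as an assumption about what a caller would want.

```python
import keyword
import re

def make_identifier(s, prefix="_"):
    """Turn s into an ASCII identifier that still resembles the input."""
    # Replace every character outside [A-Za-z0-9_] with an underscore.
    ident = re.sub(r"[^A-Za-z0-9_]", "_", s)
    # An identifier may not be empty or start with a digit: prepend a prefix.
    if not ident or ident[0].isdigit():
        ident = prefix + ident
    # Assumption: reserved words such as "class" should be avoided too.
    if keyword.iskeyword(ident):
        ident += "_"
    return ident
```

For example, make_identifier("2nd place") gives "_2nd_place" and make_identifier("class") gives "class_". Different inputs can still collide ("a b" and "a-b" both become "a_b"), so this trades uniqueness for readability.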
 
