Spell-checking Python source code

John Zenger

To my horror, someone pointed out to me yesterday that a web app I
wrote has been prominently displaying a misspelled word. The word was
buried in my code.

Is there a utility out there that will help spell-check literal
strings entered into Python source code? I don't mean spell-check
strings entered by the user; I mean, go through the .py file, isolate
strings, and tell me when the strings contain misspelled words. In an
ideal world, my IDE would do this with a red wavy line.

I guess a second-best thing would be an easy technique to open a .py
file and isolate all strings in it.

(I know that the better practice is to isolate user-displayed strings
from the code, but in this case that just didn't happen.)
 
Ricardo Aráoz

Use the re module, identify the strings and write them to another file,
then open the file with your spell checker. Program shouldn't be more
than 10 lines.
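
A minimal sketch of the regular-expression approach suggested above, written for Python 3. The names STRING_RE, extract_strings and dump_strings are made up for illustration. The pattern only handles ordinary single-line '...' and "..." literals; it will mis-handle triple-quoted strings and will also pick up quoted text inside comments, so treat it as a rough first pass rather than a complete solution.

```python
import re

# Single-line "..." and '...' literals, allowing backslash escapes.
# Assumption: triple-quoted strings and quotes inside comments are
# NOT handled correctly by this simplistic pattern.
DOUBLE = r'"(?:\\.|[^"\\\n])*"'
SINGLE = r"'(?:\\.|[^'\\\n])*'"
STRING_RE = re.compile(DOUBLE + "|" + SINGLE)


def extract_strings(source):
    """Return every quoted literal found in the source text, in order."""
    return STRING_RE.findall(source)


def dump_strings(source_path, output_path):
    """Write each literal found in source_path to output_path, one per line."""
    with open(source_path, encoding="utf-8") as src:
        literals = extract_strings(src.read())
    with open(output_path, "w", encoding="utf-8") as out:
        for literal in literals:
            out.write(literal + "\n")
```

Calling dump_strings("myprogram.py", "strings.txt") (both file names are placeholders) would leave a text file that can be opened in any spell checker.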
 
David

Have a look at Python's tokenize module for the regular expressions
for extracting strings (for all possible Python string formats). On a
Debian box you can find it here: /usr/lib/python2.4/tokenize.py

It would probably be simpler to hack a copy of that script so it
writes all the strings in your source to a text file, which you then
spellcheck.
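
Rather than hacking a copy of tokenize.py, the module can also be imported and used directly. The following is only a sketch, assuming Python 3, where tokenize.generate_tokens() yields tokens from a readline callable; the function names iter_string_tokens and dump_strings are invented for this example. One caveat: from Python 3.12 onward f-strings are split into separate FSTRING_* tokens, so this sketch would not report them.

```python
import io
import tokenize


def iter_string_tokens(source):
    """Yield (line_number, literal_text) for each STRING token in source."""
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type == tokenize.STRING:
            # tok.start is a (row, column) pair; tok.string keeps the quotes.
            yield tok.start[0], tok.string


def dump_strings(source_path, output_path):
    """Write every string literal in source_path to a text file."""
    with open(source_path, encoding="utf-8") as src:
        source = src.read()
    with open(output_path, "w", encoding="utf-8") as out:
        for lineno, literal in iter_string_tokens(source):
            out.write("%d: %s\n" % (lineno, literal))
```

Because the tokenizer understands Python syntax, quoted text inside comments is skipped and triple-quoted strings (including docstrings) come out as single tokens.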

Another method would be to log every string your web app writes to a
text file, then run through your entire site, and then spellcheck
your logfile.
 
Ricardo Aráoz

Nice module:

# Python 2 code: uses tuple parameters and the print statement
import tokenize

def processStrings(type, token, (srow, scol), (erow, ecol), line):
    if tokenize.tok_name[type] == 'STRING':
        print tokenize.tok_name[type], token, \
              (srow, scol), (erow, ecol), line

file = open("myprogram.py")

tokenize.tokenize(
    file.readline,
    processStrings
)

How would you go about writing the output to a file? I mean, I would
like to open the file at main level, pass a handle to the file to
processStrings so it can write to it, and finally close the output file
at main level. Probably a class with a processString method?
 
DaveM

John said:
In an ideal world, my IDE would do this with a red wavy line.

I can't help with your problem, but this is the first thing I turn off in
Word. It drives me _mad_.

Sorry - just had to share that.

DaveM
 
David Trudgett

John said:
In an ideal world, my IDE would do this with a red wavy line.

You didn't mention which IDE you use; however, if you use Emacs, there
is flyspell-prog-mode which does that for you (checks your spelling
"on the fly", but only within comments and strings).

Regards,
David Trudgett
 
Miki

David Trudgett said:
You didn't mention which IDE you use; however, if you use Emacs, there
is flyspell-prog-mode which does that for you (checks your spelling
"on the fly", but only within comments and strings).

Same in Vim (:set spell).

HTH,
 
David

Ricardo Aráoz said:
How would you go about writing the output to a file? I mean, I would
like to open the file at main level and pass a handle to the file to
processStrings to write to it, finally close output file at main level.
Probably a class with a processString method?

tokenize.tokenize() takes a callable object as its second arg. So you
can use a class which you construct with the file, and you give it an
appropriate __call__ method.

http://docs.python.org/ref/callable-types.html

Although with a short script a global var may be simpler.
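
A rough sketch of the callable-class idea, for illustration only. It assumes Python 3, where tokenize.tokenize() no longer accepts a callback as a second argument, so a small helper loops over tokenize.generate_tokens() and calls the handler itself. The names StringWriter, process and main are hypothetical, as is the "line: literal" output format.

```python
import tokenize


class StringWriter:
    """Callable that writes STRING tokens to an already-open file object."""

    def __init__(self, out):
        self.out = out  # file handle opened (and later closed) by the caller

    def __call__(self, tok_type, token, start, end, line):
        # Same five arguments the old tokeneater callback received.
        if tok_type == tokenize.STRING:
            self.out.write("%d: %s\n" % (start[0], token))


def process(readline, handler):
    """Feed every token produced from readline to handler."""
    for tok in tokenize.generate_tokens(readline):
        handler(*tok)  # TokenInfo unpacks to (type, string, start, end, line)


def main(source_path, output_path):
    # Both files are opened and closed at the top level, as asked.
    with open(source_path) as src, open(output_path, "w") as out:
        process(src.readline, StringWriter(out))
```

Under Python 2 the helper is unnecessary: an instance such as StringWriter(out) could be handed straight to tokenize.tokenize() as the second argument.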
 
Benjamin

This is when it's good to put all your UI strings in a file: you get
the advantages of easier spell checking and the ability to translate
the app.
 
