Windows command line problem

MarkE

I'm sure someone else has posted a similar problem but I can't find it,
nor the solution...

I have a python script which accepts a command line argument.
E.g.
python.exe myscript.py -n Foo

I build this as part of a package using distutils with the
bdist_wininst option on a Windows 2K (SP4) machine.
I have tested installing and running it fine on a Windows XP (SP2)
machine. I build my package-installer with a Python.org 2.4.1
distribution which is source-compiled locally. I have installed my
package-installer on a machine running ActiveState Python 2.4.1
installed from a .msi file.
That all works fine.

I have problems delivering it to the test team (of course). After some
investigation, if I install the package on one of our test machines and
butcher my installed file to dump the command line and exit (i.e. print
'hi', sys.argv) then I get the following:
hi ['c:\\Python24\\Lib\\site-packages\\MyPackage\\myscript.py', '\x96n',
'Foo']

If I run it specifying --name instead of -n I get:
hi ['c:\\Python24\\Lib\\site-packages\\MyPackage\\myscript.py',
'\x96-name', 'Foo']

The machine in question is also running XP service pack 2 as far as I
know, with Python.org's 2.4.1 distribution.

Does anyone know why the first character of the option (here '-') is
being changed (to '\x96') in this way? Is it a Unicode/encodings kind
of problem? I can make the problem go away by running with quotes like
this:
python.exe myscript.py "-n" Foo

I'm hoping I can add an entry to my setup.py. Thanks for any and all
help.
Mark
 
Jeff Epler

I don't exactly know what is going on, but '\x96' is the encoding for
u'\N{en dash}' (a character that looks like the ASCII dash,
u'\N{hyphen-minus}', u'\x2d') in the following Windows code pages:
cp1250 cp1251 cp1252 cp1253 cp1254
cp1255 cp1256 cp1257 cp1258 cp874
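
The mapping is easy to check from Python itself. The following is a minimal sketch, written for a current Python 3 interpreter rather than the 2.4.1 used in this thread; the names CODE_PAGES and decoded are illustrative only.

```python
# Decode the single byte 0x96 under each Windows code page listed above
# and report which Unicode character it maps to.
import unicodedata

CODE_PAGES = [
    "cp1250", "cp1251", "cp1252", "cp1253", "cp1254",
    "cp1255", "cp1256", "cp1257", "cp1258", "cp874",
]

decoded = {cp: b"\x96".decode(cp) for cp in CODE_PAGES}

for cp, ch in decoded.items():
    # ascii() keeps the output printable on consoles that cannot show U+2013.
    print(cp, ascii(ch), unicodedata.name(ch))

# The ASCII dash is a different character: U+002D (decimal 45).
print(hex(ord("-")), unicodedata.name("-"))
```

Every code page in the list should report EN DASH for that byte, while the plain dash reports HYPHEN-MINUS.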

Windows is clearly doing something clever.

Jeff

 
MarkE

I'm using getopt. I doubt getopt recognises \x96 as a command line
parameter prefix. I suppose I could iterate over sys.argv doing a
replace but that seems messy. I'd rather understand the problem.
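
For what it's worth, the "iterate over sys.argv doing a replace" idea could look roughly like the sketch below. It targets Python 3, where sys.argv holds text and the en dash shows up as '\u2013'; under Python 2 the arguments are byte strings, so the character to look for would be '\x96' instead. The helper names and the set of lookalike characters are assumptions made for illustration, not anything taken from the script in question.

```python
import getopt

# Dash lookalikes that word processors tend to substitute for "-".
# This set is an assumption; extend it as needed.
DASH_LOOKALIKES = "\u2013\u2014\u2212"  # en dash, em dash, minus sign


def normalize_leading_dash(arg):
    """Replace one leading dash lookalike with an ASCII hyphen-minus."""
    if arg and arg[0] in DASH_LOOKALIKES:
        return "-" + arg[1:]
    return arg


def parse(argv):
    """Parse -n/--name from argv (script name already removed)."""
    cleaned = [normalize_leading_dash(a) for a in argv]
    opts, rest = getopt.getopt(cleaned, "n:", ["name="])
    return dict(opts), rest


print(parse(["\u2013n", "Foo"]))
print(parse(["\u2013-name", "Foo"]))
```

Only the first character of each argument is touched, so dashes inside a value such as Foo-Bar are left alone. The trade-off is that a value which legitimately starts with an en dash would be rewritten too.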

That said, and not understanding code pages all that well, I ran chcp
on the machines where it works and both came back with 850. The machine
where it wasn't working also came back with 850, but then again that
machine now works. So now it's an intermittent bug. Great. I'll try
messing with code pages later and report back if I get anywhere.

I need more coffee before I can do anything remotely clever. Damn you,
Windows, and your lack of a need for coffee.
 
MarkE

This was discovered after consultation with a colleague who shall
remain nameless but, well, nailed it basically.
The answer appears to be:
An example command line for running the script was written in a Word
document. The "Autocorrect" (sic) feature in Word replaces a normal
dash, at least as I know it, with the character Jeff Epler showed above,
u'\N{en dash}', which is a nice big long dash in the Arial font.

If you cut and paste that onto the command line, bad things can happen,
although when I do this on my machine I actually get a "u" with a "^"
on top. For whatever reason it must have looked OK on my colleague's
machine (or possibly this isn't the answer, but I seriously doubt that)
and when he ran the Python script things went awry.
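
The "u" with a "^" on top fits a code page mismatch. Here is a minimal sketch in Python 3, assuming the pasted text went through the cp1252 "ANSI" code page (the thread does not say which one was in use) and was then displayed by a console using code page 850, which is what chcp reported earlier:

```python
EN_DASH = "\u2013"

# Assumption: the pasted character was encoded with cp1252, where the
# en dash is the single byte 0x96.
raw = EN_DASH.encode("cp1252")

# The console reported code page 850, and in cp850 the same byte is a
# different character: a lowercase u with a circumflex (U+00FB).
as_console_sees_it = raw.decode("cp850")

print(repr(raw))
print(ascii(as_console_sees_it))
```

The byte never changes; only the table used to interpret it does, which would also explain why the result can differ from machine to machine.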

Thanks Jeff (and nameless colleague). And beware Word autocorrection.
 
Benji York

MarkE said:
The answer appears to be:
An example command line for running the script was written in a word
document. The "Autocorrect" (sic) feature in word replaces a normal
dash

There is a lesson there I wish more people would learn: Word is not a
text editor. :)
 
sp1d3rx

I think the lesson there is "don't depend on getopt, write your own
command line parser". I always write my own, as it's so easy to do.
 
Benji York

sp1d3rx said:
I think the lesson there is "don't depend on getopt, write your own
command line parser". I always write my own, as it's so easy to do.

While I'll agree that getopt isn't ideal, I find optparse to be much better.
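
For comparison, a small optparse sketch for the -n/--name option from the original post might look like this. The dest and help values are illustrative assumptions, and the argument lists are passed in explicitly so the example does not depend on sys.argv.

```python
from optparse import OptionParser

parser = OptionParser()
# One declaration covers both the short and the long spelling.
parser.add_option("-n", "--name", dest="name", help="name to use")

options, args = parser.parse_args(["-n", "Foo"])
print(options.name)

options, args = parser.parse_args(["--name", "Foo"])
print(options.name)
```

Note that optparse would not rescue the pasted en dash either: an argument that does not start with an ASCII "-" is simply treated as a positional argument.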
 
Steve Holden

sp1d3rx said:
I think the lesson there is "don't depend on getopt, write your own
command line parser". I always write my own, as it's so easy to do.

I suppose you built your own car so you could get out a bit, too? After
all, there's nothing tricky about a simple internal combustion engine,
right? ;-)

regards
Steve
 
sp1d3rx

Here's an example...
---- BEGIN TEST.PY ----

import sys

print "Original:", sys.argv
for arg in sys.argv:
    # In cp1252, '\x93' is the left curly quote and '\x96' the en dash.
    arg = arg.strip('-\x93\x96')  # add chars here you want to strip
    print "Stripped:", arg

---- END TEST.PY ----
 
