Correctly convert from %PATH%=c:\\X;"c:\\a;b" to ['c:\\X', 'c:\\a;b']

chirayuk

Hi,

I am trying to treat an environment variable as a python list - and I'm
sure there must be a standard and simple way to do so. I know that the
interpreter itself must use it (to process $PATH / %PATH%, etc) but I
am not able to find a simple function to do so.

os.environ['PATH'].split(os.pathsep) is wrong on Windows for the case when
PATH="c:\\A;B";c:\\D;
where there is a ';' embedded in the quoted path.
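The failure is easy to reproduce. The snippet below is only an illustration: it hard-codes ';' (the value of os.pathsep on Windows) so that it behaves the same on any platform, and the variable names are made up for the example.

```python
# The PATH value from the post, written as a Python string literal.
value = '"c:\\A;B";c:\\D;'

# A plain split knows nothing about the double quotes, so the quoted
# entry is cut in two and the quote characters stay in the pieces.
naive = value.split(";")
print(naive)
# ['"c:\\A', 'B"', 'c:\\D', '']
```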

Does anyone know of a simple way (addons ok) which would do it in a
cross platform way? If not - I will roll my own. My search has shown
that generally people just use the simple split method as above and
leave it there but it seemed like such a common operation that I
believe there must be a way out for it which I am not seeing.

Thanks,
Chirayu.
 
Michael Spencer

You may be able to bend the csv module to your purpose:

>>> import os
>>> test = """\"c:\\A;B";c:\\D;"""
>>> test1 = os.environ['PATH']
>>> import csv
>>> class path(csv.excel):
...     delimiter = ';'
...     quotechar = '"'
...
>>> csv.reader([test],path).next()
['c:\\A;B', 'c:\\D', '']
>>> csv.reader([test1],path).next()
['C:\\WINDOWS\\system32', 'C:\\WINDOWS', 'C:\\WINDOWS\\System32\\Wbem',
'C:\\Program Files\\ATI Technologies\\ATI Control Panel',
'C:\\PROGRA~1\\ATT\\Graphviz\\bin', 'C:\\PROGRA~1\\ATT\\Graphviz\\bin\\tools',
'C:\\WINDOWS\\system32', 'C:\\WINDOWS', 'C:\\WINDOWS\\System32\\Wbem',
'C:\\Program Files\\ATI Technologies\\ATI Control Panel',
'C:\\PROGRA~1\\ATT\\Graphviz\\bin', 'C:\\PROGRA~1\\ATT\\Graphviz\\bin\\tools',
'c:\\python24', 'c:\\python24\\scripts', 'G:\\cabs\\python\\pypy\\py\\bin']
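For readers on Python 3, where reader objects no longer have a .next() method, the same csv idea might be written roughly as follows. This is a sketch, not part of the original post: the function name split_quoted_path is invented for illustration, and the delimiter is passed directly instead of through a Dialect subclass.

```python
import csv


def split_quoted_path(value, sep=";"):
    """Split a PATH-style string, keeping separators inside quoted entries."""
    # csv.reader accepts any iterable of lines, so a one-element list works.
    reader = csv.reader([value], delimiter=sep, quotechar='"')
    # The single input line produces a single row of fields.
    return next(reader)


print(split_quoted_path('"c:\\A;B";c:\\D;'))
# ['c:\\A;B', 'c:\\D', '']
```

As in the session above, the trailing ';' yields an empty last entry. This approach is only expected to help when the quotes wrap a whole entry, because csv treats a quote as special only at the start of a field.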
HTH
Michael
 
chirayuk


That is a cool use of the csv module.

However, I just realized that the following is also a valid PATH in
windows.

PATH=c:\A"\B;C"\D;c:\program files\xyz"
(The quotes do not need to cover the entire path)

So here is my handcrafted solution.

def WinPathList_to_PyList (pathList):
    pIter = iter(pathList.split(';'))
    OddNumOfQuotes = lambda x: x.count('"') % 2 == 1
    def Accumulate (p):
        bAcc, acc = OddNumOfQuotes(p), [p]
        while bAcc:
            p = pIter.next ()
            acc.append (p)
            bAcc = not OddNumOfQuotes (p)
        return "".join (acc).replace('"','')
    return [q for q in [Accumulate (p) for p in pIter] if q]


So now I need to check if the os is windows.

Wishful thinking: It would be nice if something like this (taking care
of the cases for other OS's) made it into the standard library - the
interpreter must already be doing it.

Thanks,
Chirayu.
 
Jeff Epler

If your goal is to search for files on a windows-style path environment
variable, maybe you don't want to take this approach, but instead wrap
and use the _wsearchenv or _searchenv C library functions
http://msdn.microsoft.com/library/en-us/vclib/html/_crt__searchenv.2c_._wsearchenv.asp

Incidentally, I peeked at the implementation of _searchenv in wine (an
implementation of the win32 API for Unix), and it doesn't do the
quote-processing that you say Windows does. The msdn page doesn't give
the syntax for the variable either, which is pretty typical. Do you
have an "official" page that discusses the syntax?

Jeff

 
Michael Spencer

chirayuk said:
However, I just realized that the following is also a valid PATH in
windows.
PATH=c:\A"\B;C"\D;c:\program files\xyz"
(The quotes do not need to cover the entire path)

Too bad! What a crazy format!

chirayuk said:
So here is my handcrafted solution.

def WinPathList_to_PyList (pathList):
    pIter = iter(pathList.split(';'))
    OddNumOfQuotes = lambda x: x.count('"') % 2 == 1
    def Accumulate (p):
        bAcc, acc = OddNumOfQuotes(p), [p]
        while bAcc:
            p = pIter.next ()
            acc.append (p)
            bAcc = not OddNumOfQuotes (p)
        return "".join (acc).replace('"','')
    return [q for q in [Accumulate (p) for p in pIter] if q]

Does it work?

I get:

Traceback (most recent call last):
  File "<input>", line 1, in ?
  File "pathsplit", line 31, in WinPathList_to_PyList
  File "pathsplit", line 27, in Accumulate
StopIteration

Also, on the old test case, I get:

>>> WinPathList_to_PyList("""\"c:\\A;B";c:\\D;""")
['c:\\AB', 'c:\\D']

Should the ';' within the quotes be removed?

chirayuk said:
So now I need to check if the os is windows.

Wishful thinking: It would be nice if something like this (taking care
of the cases for other OS's) made it into the standard library - the
interpreter must already be doing it.

This sort of 'stateful' splitting is a somewhat common task. If you're feeling
creative, you could write itertools.splitby(iterable, separator_func)

This would be a sister function to itertools.groupby (and possibly derive from
its implementation). separator_func is a callable that returns True if the item
is a separator, False otherwise.

splitby would return an iterator of sub-iterators (like groupby) defined by the
items between split points

You could then implement parsing of crazy source like your PATH variable by
implementing a stateful separator_func
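To make the suggestion concrete, here is one way such a helper could look. Note that splitby does not exist in itertools; this is a hypothetical sketch. It also yields plain lists, not groupby-style sub-iterators, to keep the example short, and the class name QuoteAwareSeparator is invented for illustration.

```python
def splitby(iterable, separator_func):
    """Yield lists of the items found between separator items."""
    group = []
    for item in iterable:
        if separator_func(item):
            # A separator closes the current group and is itself dropped.
            yield group
            group = []
        else:
            group.append(item)
    # Whatever follows the last separator forms the final group.
    yield group


class QuoteAwareSeparator:
    """Stateful predicate: ';' separates only outside double quotes."""

    def __init__(self, sep=";", quote='"'):
        self.sep = sep
        self.quote = quote
        self.in_quotes = False

    def __call__(self, ch):
        if ch == self.quote:
            self.in_quotes = not self.in_quotes
        return ch == self.sep and not self.in_quotes


path = r'c:\A"\B;C"\D;c:\program files\xyz'
parts = ["".join(g).replace('"', "") for g in splitby(path, QuoteAwareSeparator())]
print(parts)
# ['c:\\A\\B;C\\D', 'c:\\program files\\xyz']
```

The predicate carries the "inside quotes" state between calls, which is what lets a generic splitter handle the partially quoted entry.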

Michael
 
Chirayu Krishnappa

I do agree that it is a crazy format - and am amazed that it works at
the prompt.

For the first case - you have a mismatched double quote for test2 at
the end of the string. test2 should be
r'c:\A"\B;C"\D;c:\program files\xyz' instead. For the 2nd case - my
code swallowed the ';' it split on - so I need an acc.append (';') just
before the acc.append(p) in Accumulate. The code then works. It still
needs to be fixed to take care of extra double quotes and also a
missing one (cmd.exe appears to assume one at the end if it did not
find one.)
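One way to cover those cases is a single pass over the characters that tracks whether the scan is currently inside quotes. The following is a sketch in Python 3, not the code from this thread: the name split_win_path is made up, and treating an unclosed quote as closed at the end of the string is an assumption based on the cmd.exe behaviour described above.

```python
def split_win_path(value, sep=";", quote='"'):
    """Split on sep outside double quotes; drop quotes and empty entries."""
    entries, current, in_quotes = [], [], False
    for ch in value:
        if ch == quote:
            # Toggle the state and discard the quote character itself.
            in_quotes = not in_quotes
        elif ch == sep and not in_quotes:
            entries.append("".join(current))
            current = []
        else:
            # Ordinary characters, and separators inside quotes, are kept.
            current.append(ch)
    # Assumption: a quote left open is treated as closed at end of string.
    entries.append("".join(current))
    return [e for e in entries if e]


print(split_win_path('"c:\\A;B";c:\\D;'))
# ['c:\\A;B', 'c:\\D']
print(split_win_path(r'c:\A"\B;C"\D;c:\program files\xyz"'))
# ['c:\\A\\B;C\\D', 'c:\\program files\\xyz']
```

Because the separator is never removed and re-added, the ';' inside quotes survives without a special case.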

The itertools.splitby idea sounds really cool. I did not feel like
writing a state machine as the state was so simple to maintain here -
but I'd like to write a splitby so that it makes it easier to do such
crazy splitting in general.

Chirayu.
 
