Bug in glob.glob for files w/o extensions in Windows

Georgy Pruss

On Windows XP glob.glob doesn't work properly for files without extensions.
E.g. C:\Temp contains 4 files: 2 with extensions, 2 without.

C:\Temp>dir /b *
aaaaa.aaa
bbbbb.bbb
ccccc
ddddd

C:\Temp>dir /b *.
ccccc
ddddd

C:\Temp>python
Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import glob
>>> glob.glob( '*' )
['aaaaa.aaa', 'bbbbb.bbb', 'ccccc', 'ddddd']
>>> glob.glob( '*.' )
[]

It looks like a bug.

Georgy
 
Georgy Pruss

OK, you can call it not a bug, but different behavior.
I've found that the fnmatch module is the reason for that.
Here are some other examples:

C:\temp>dir /b *.*
..eee
aaa.aaa
nnn

C:\temp>dir /b * # it's by def synonym for *.*
..eee
aaa.aaa
nnn

C:\temp>dir /b .*
..eee

C:\temp>dir /b *. # it looks strange too
..eee
nnn


C:\temp>python
['aaa.aaa']
['aaa.aaa', 'nnn']
[]
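
A quick way to see the fnmatch behavior described above, without touching the file system, is to run the patterns against a plain list of names. This is only a sketch: the list `names` is made up from the files in the first post, and `fnmatchcase` is used so the result does not depend on the platform's case rules.

```python
import fnmatch

# Hypothetical listing, copied from the C:\Temp example in the first post.
names = ['aaaaa.aaa', 'bbbbb.bbb', 'ccccc', 'ddddd']

def match(pattern):
    """Return the names that match pattern under fnmatch rules."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]

print(match('*'))    # every name
print(match('*.'))   # only names that literally end with a dot
print(match('*.*'))  # only names that contain a dot
```

Under these rules a dot in the pattern has to match a dot in the name, so `'*.'` selects nothing here.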


It seems that in any case I'll have to extract 'nnn' by myself.
Something like:

if mask.endswith('.'): # no extension implies actually no dots in name at all
    list = glob.glob( mask[:-1] )
    list = filter( lambda x: '.' not in x, list ) # or [x for x in list if '.' not in x]
else:
    list = glob.glob( mask )
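
The same workaround could be wrapped in a small helper. This is a sketch, not part of the glob module: the name `dos_glob` is made up, and unlike the snippet above it only looks for dots in the last path component, so a dotted directory name in the mask does not filter everything out.

```python
import glob
import os

def dos_glob(mask):
    """Glob with the DOS rule that a trailing '.' means 'no dot in the name'."""
    if mask.endswith('.'):
        # Drop the trailing dot, then keep only names without any dot.
        candidates = glob.glob(mask[:-1])
        return [p for p in candidates if '.' not in os.path.basename(p)]
    # Any other mask is passed to glob unchanged.
    return glob.glob(mask)
```

With the four files from the first post, `dos_glob('*.')` would be expected to return only `ccccc` and `ddddd`.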

G-:
 
Jules Dubois

> On Windows XP glob.glob doesn't work properly for files without extensions.
> E.g. C:\Temp contains 4 files: 2 with extensions, 2 without.
> [...]
> C:\Temp>dir /b *.
> ccccc
> ddddd

This is standard Windows behavior. It's compatible with CP/M and therefore
MS-DOS, and Microsoft has preserved this behavior in all versions of
Windows.

Did you ever poke around in the directory system in a FAT partition
(without VFAT)? You'll find that every file name is exactly 11 characters
long and "." is not found in any part of any file name in any directory
entry.

It's bizarre but that's the way it works. If you try

dir /b *

does cmd.exe list only files without extensions?

glob provides "Unix style pathname pattern expansion" as documented in the
_Python Library Reference_: If there's a period (".") in the pattern, it
must match a period in the filename.

> It looks like a bug.

No, it's proper behavior. It's Windows that's (still) screwy.
 
Georgy Pruss

| On Sun, 30 Nov 2003 03:47:38 GMT, in article
| <Georgy Pruss
| wrote:
|
| > On Windows XP glob.glob doesn't work properly for files without extensions.
| > E.g. C:\Temp contains 4 files: 2 with extensions, 2 without.
| > [...]
| > C:\Temp>dir /b *.
| > ccccc
| > ddddd
|
| This is standard Windows behavior. It's compatible with CP/M and therefore
| MS-DOS, and Microsoft has preserved this behavior in all versions of
| Windows.

That's what I meant, wanted and liked.

C'mon guys, I don't care if it's FAT, NTFS, Windows, Linux, VMS or whatever.
All I wanted was to get files w/o dots in their names (on my computer :)).
I did it and I can do it on any system if I need.


| Did you ever poke around in the directory system in a FAT partition
| (without VFAT)? You'll find that every file name is exactly 11 characters
| long and "." is not found in any part of any file name in any directory
| entry.
|
| It's bizarre but that's the way it works. If you try
|
| dir /b *
|
| does cmd.exe list only files without extensions?

By definition it's the same as *.* if my memory serves me right.


| >>>> glob.glob( '*.' )
| > []
| >
|
| glob provides "Unix style pathname pattern expansion" as documented in the
| _Python Library Reference_: If there's a period (".") in the pattern, it
| must match a period in the filename.
|
| > It looks like a bug.
|
| No, it's proper behavior. It's Windows that's (still) screwy.

I see.
Show the world a perfect OS and you'll be a billionaire.

G-:
 
Jules Dubois

On Sun, 30 Nov 2003 06:18:36 GMT, in article
| On Sun, 30 Nov 2003 03:47:38 GMT, in article
|
> C'mon guys, I don't care if it's FAT, NTFS, Windows, Linux, VMS or whatever.
> All I wanted was to get files w/o dots in their names (on my computer :)).

I was just pointing out the reason for the behavior.
| dir /b *
|
| does cmd.exe list only files without extensions?

> By definition it's the same as *.* if my memory serves me right.

I'm sure ".*" was the same as "*.*". Win2k's cmd.exe won't run under Wine,
so I couldn't test "*".
| No, it's proper behavior. It's Windows that's (still) screwy.

> Show the world a perfect OS and you'll be a billionaire.

We agree, then, that every operating system has its good points and its bad
points. (I guess we don't agree on whether "*." should or shouldn't match
files without periods in their name.)
 
Georgy Pruss

|
| We agree, then, that every operating system has its good points and its bad
| points. (I guess we don't agree on whether "*." should or shouldn't match
| files without periods in their name.)

Anyway, "*." is not a bad DOS convention to select files w/o extension, although
it comes from the old 8.3 name scheme. BTW, how can you select files w/o
extension in Unix's shells?

G-:
 
Francis Avila

Georgy Pruss wrote in message ...
> OK, you can call it not a bug, but different behavior.


That's true. But calling dir's behavior "different" here is quite a
euphemism!
> It seems that in any case I'll have to extract 'nnn' by myself.
> Something like:
>
> if mask.endswith('.'): # no extension implies actually no dots in name at all
>     list = glob.glob( mask[:-1] )
>     list = filter( lambda x: '.' not in x, list ) # or [x for x in list if '.' not in x]
> else:
>     list = glob.glob( mask )


I don't understand where 'mask' is coming from. If you want files with no
dots, just filter out those files:

filelist = [file for file in glob.glob('*') if '.' not in file]

Or you can use sets: symmetric difference of all files against the files
with dots.
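
A rough sketch of the set idea follows. The list `names` is a made-up stand-in for `glob.glob('*')`, and `fnmatch.filter` stands in for the second glob call so the example runs anywhere. Because the dotted names are a subset of all names, a plain set difference gives the same result as the symmetric difference.

```python
import fnmatch

# Hypothetical directory listing, used instead of a real glob.glob('*') call.
names = ['aaa.aaa', 'nnn', 'bbb.txt', 'ccc']

all_files = set(names)
with_dots = set(fnmatch.filter(names, '*.*'))  # names containing a dot

# Names without any dot: everything minus the dotted names.
no_dots = sorted(all_files - with_dots)
print(no_dots)
```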

If you're trying to recast glob in windows' image, you'll have to
specialcase '*.*' too. And then what do you do if someone comes along who
*really* wants *only* names with dots in them!?

Trying to shoehorn windows-style semantics into glob is just braindead--the
windows semantics are wrong because dots are not special anymore. For one
thing, we can have more than one of them, and they can be anywhere in the
filename. Both were not true for DOS, whence windows inherited the *.*
nonsense.

Behold the awesome visage of the One True Glob (TM): (Not that I'm starting
a holy war or anything ;)
*.* -> Filename has a dot in it, and that dot cannot be the first or last
char.
This is NOT the same as '*'!!
.* -> Filename has a dot as the first character.
*. -> Filename has a dot as the last character.
* -> Gimme everything.
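
For comparison, here is a sketch of how Python's fnmatch treats these four patterns. The sample names are made up. Note one assumption worth stating loudly: fnmatch on its own applies no first-character or last-character rule, so `'*.*'` also matches `.b` and `d.` here; the special handling of a leading dot is added separately by glob.

```python
import fnmatch

# Hypothetical sample names: plain, dotted, leading dot, trailing dot.
names = ['a', 'c.c', '.b', 'd.']

def hits(pattern):
    """Return the sample names matching pattern under fnmatch rules."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]

for pattern in ['*.*', '.*', '*.', '*']:
    print(pattern, hits(pattern))
```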
 
Serge Orlov

> Anyway, "*." is not a bad DOS convention to select files w/o extension, although
> it comes from the old 8.3 name scheme. BTW, how can you select files w/o
> extension in Unix's shells?

The same way as in Python:
filelist = [file for file in glob.glob('*') if '.' not in file]
Shell:
ls|grep -v [.]
Making up special conventions is not the Python way.

-- Serge.
 
Peter Otten

Georgy said:
Anyway, "*." is not a bad DOS convention to select files w/o extension,
although it comes from the old 8.3 name scheme. BTW, how can you select
files w/o extension in Unix's shells?

ls -I*.*

The -I option tells the ls command what *not* to show.

Peter
 
Gerrit Holl

Francis said:
Behold the awesome visage of the One True Glob (TM): (Not that I'm starting
a holy war or anything ;)
*.* -> Filename has a dot in it, and that dot cannot be the first or last
char.
This is NOT the same as '*'!!
.* -> Filename has a dot as the first character.
*. -> Filename has a dot as the last character.
* -> Gimme everything.

Note that Bash doesn't behave like this either: * does not give
everything, rather it gives everything not starting with a dot. In Bash,
* really means: [!.]*
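
Python's glob follows the same rule for names that start with a dot. The sketch below creates a few empty files in a scratch directory and globs them two ways; the helper name `star_and_dot_star` and the file names are made up for illustration.

```python
import glob
import os
import tempfile

def star_and_dot_star(names):
    """Create empty files in a scratch directory and glob with '*' and '.*'."""
    with tempfile.TemporaryDirectory() as d:
        for n in names:
            open(os.path.join(d, n), 'w').close()
        # '*' skips names that begin with a dot; '.*' asks for them explicitly.
        star = sorted(os.path.basename(p)
                      for p in glob.glob(os.path.join(d, '*')))
        dot_star = sorted(os.path.basename(p)
                          for p in glob.glob(os.path.join(d, '.*')))
    return star, dot_star

print(star_and_dot_star(['a', '.b', 'c.c']))
```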

yours,
Gerrit.
 
Mel Wilson

| On Sun, 30 Nov 2003 03:47:38 GMT, in article
| <Georgy Pruss
| wrote:
|
| > On Windows XP glob.glob doesn't work properly for files without extensions.
| > E.g. C:\Temp contains 4 files: 2 with extensions, 2 without.
| > [...]
| > C:\Temp>dir /b *.
| > ccccc
| > ddddd
|
| This is standard Windows behavior. It's compatible with CP/M and therefore
| MS-DOS, and Microsoft has preserved this behavior in all versions of
| Windows.

That's what I meant, wanted and liked.

C'mon guys, I don't care if it's FAT, NTFS, Windows, Linux, VMS or whatever.
All I wanted was to get files w/o dots in their names (on my computer :)).
I did it and I can do it on any system if I need.

Looks like you need os.path.glob(), which doesn't exist, yet.

Regards. Mel.
 
Francis Avila

Gerrit Holl wrote in message ...
Francis Avila wrote:
Note that Bash doesn't behave like this either: * does not give
everything, rather it gives everything not starting with a dot. In Bash,
* really means: [!.]*


That behavior can be modified with the 'dotglob' shell option:

$ shopt -s dotglob
$ echo *
a .b b .c d ...
 
Francis Avila

Stein Boerge Sylvarnes wrote in message ...

In Windows, how do you create a file with a dot as the last character? In
Unix you can do this, because a dot is just another character in the
filename. It's because you can't do this in Windows that *. is unambiguous.
That's non-standard gnu ls behaviour, I think. (Tested on OpenBSD and
SunOS)


for N in $(ls -1 | grep -v '\.'); do echo "$N"; done

Not positively sure that the -1 option is posix, but it's at least in
OpenBSD and SunOS (in fact, it's the default when output is not to a
terminal).

Bash also has an extglob option:

$ shopt -s extglob dotglob
$ ls -1 *
a
..b
c.c
d.
$ echo !(*.*)
a

There's also @(), ?(), *(), +(). You can use multiple patterns within the
parens by joining with '|'.
 
Jules Dubois

On Sun, 30 Nov 2003 08:59:55 GMT, in article
| (I guess we don't agree on whether "*." should or shouldn't match
| files without periods in their name.)

> Anyway, "*." is not a bad DOS convention to select files w/o extension, although
> it comes from the old 8.3 name scheme. BTW, how can you select files w/o
> extension in Unix's shells?

Touche.
 
Tim Roberts

Georgy Pruss said:
|
| It's bizarre but that's the way it works. If you try
|
| dir /b *
|
| does cmd.exe list only files without extensions?

By definition it's the same as *.* if my memory serves me right.

Actually, truth being stranger than fiction, the NT-based systems and the
16-bit systems (95/98/ME) will give you different answers to this
question...
 
