xml.dom - reading from a file

sashan

Is the way to use DOM for an xml file as follows:
1) Read the file into a string
2) Call xml.dom.minidom.parseString(string)
 
Alex Martelli

sashan said:
Is the way to use DOM for an xml file as follows:
1) Read the file into a string
2) Call xml.dom.minidom.parseString(string)

It's one way, but xml.dom.minidom.parse(f) is generally better. f can
be a filename OR a file object open for reading.
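
A minimal sketch of both call styles, for illustration only. The file name example.xml and its contents are made up, and the snippet writes the file first so that it runs on its own.

```python
import os
import tempfile
import xml.dom.minidom

# Hypothetical sample document, written to a temp file so the example runs as-is.
path = os.path.join(tempfile.mkdtemp(), "example.xml")
with open(path, "w") as out:
    out.write("<root><item>hello</item></root>")

# 1) Pass a filename.
doc_from_name = xml.dom.minidom.parse(path)

# 2) Pass a file object that is open for reading.
with open(path, "rb") as f:
    doc_from_file = xml.dom.minidom.parse(f)

root_tag = doc_from_name.documentElement.tagName
item_text = doc_from_file.getElementsByTagName("item")[0].firstChild.data
```

Either way the result is a Document object, so there is no need to read the file into a string yourself.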


Alex
 
Bengt Richter

Alex Martelli said:
It's one way, but xml.dom.minidom.parse(f) is generally better. f can
be a filename OR a file object open for reading.

That reminds me ...

Is there a BDFL pronouncement or dev consensus on implementation of accepting
either filename or file-object?

E.g., should one
assert type(filename) is str
or
assert isinstance(filename,str)
or
??

and is the file object alternative

assert isinstance(filename, file) # too restrictive IMO
or
assert hasattr(filename,'read') and callable(filename.read) # what about next?
or
??

I guess the generic idea is that filename-when-it-is-a-file-object will be bound to
something that produces a sequence of strings, so shouldn't an iterator/generator
be acceptable as well? (E.g., I expect generator expressions will be handy for
test inputs etc.)

So should one look for a next method?

And, given a generic source of string chunks (must they be str instances or could they
be generator chunks recursively?) is there a blessed efficient wrapper function that will
convert the str chunk stream to an object that can fake a file instance more completely
(e.g., for readline etc.)?
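
One possible shape for such a wrapper, sketched with the io module of current Python versions rather than anything available when this was asked. The names IterStream and file_from_chunks are hypothetical, and the sketch assumes the chunks are plain str instances (no nested generators).

```python
import io


class IterStream(io.RawIOBase):
    """Read-only raw stream over an iterable of bytes chunks (hypothetical helper)."""

    def __init__(self, chunks):
        super().__init__()
        self._chunks = iter(chunks)
        self._leftover = b""

    def readable(self):
        return True

    def readinto(self, buf):
        # Pull chunks until there is data to hand out or the source is exhausted.
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0  # end of stream
        n = min(len(buf), len(self._leftover))
        buf[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def file_from_chunks(chunks, encoding="utf-8"):
    """Wrap an iterable of str chunks as a text file object (hypothetical helper)."""
    raw = IterStream(chunk.encode(encoding) for chunk in chunks)
    return io.TextIOWrapper(io.BufferedReader(raw), encoding=encoding)


# Chunk boundaries do not have to line up with line boundaries.
f = file_from_chunks(["first li", "ne\nsecond", " line\n"])
first = f.readline()
rest = f.read()
```

The chunks are consumed lazily, so a generator works as the source, and the buffered text layer supplies readline, iteration and the other usual reading methods.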

In fact, why not a standard function to convert this kind of either-or argument into
a file instance proxy? Then policy and behavior could be standardized, and people
wouldn't be wondering and re-inventing wheel variants.

Let's see...
['softspace', 'encoding', 'xreadlines', 'readlines', 'flush', 'close', 'seek', '__init__',
 'newlines', '__setattr__', '__new__', 'readinto', 'next', 'write', 'closed', 'tell', 'mode',
 'isatty', 'truncate', 'read', '__getattribute__', '__iter__', 'readline', 'fileno',
 'writelines', 'name', '__doc__', '__delattr__', '__repr__']

Well, probably not all that (except maybe for nice error messages) and maybe the wrapping function
should accept some keyword arguments for 'strict' vs 'warn' and maybe optional callback vs exception
raising?

Oh, and what about when the arg is already a standard file instance? Should e.g., mode be
overridable when that is feasible?

In summary, the proposed goal is to make usage a no-brainer by providing a standard wrapping function
for file-or-filename args.
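
A rough sketch of what such a wrapping function could look like, written for current Python versions. The name as_file is hypothetical and is not a standard library function; the sketch treats str, bytes and os.PathLike values as filenames and assumes anything else is an already-open file object.

```python
import contextlib
import io
import os
import tempfile


@contextlib.contextmanager
def as_file(source, mode="r"):
    """Yield a file object for `source` (hypothetical helper, not a stdlib API).

    Path-like sources are opened and closed here. Anything else is assumed to
    be an already-open file object and is left open for the caller to manage.
    """
    if isinstance(source, (str, bytes, os.PathLike)):
        with open(source, mode) as f:
            yield f
    else:
        yield source


# Usage with a filename (a temp file created just for this example).
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as out:
    out.write("from disk")
with as_file(path) as f:
    from_path = f.read()

# Usage with an in-memory file object.
buffer = io.StringIO("from memory")
with as_file(buffer) as f:
    from_object = f.read()
```

Making it a context manager settles one policy question: the helper closes only the files it opened itself, so a caller-supplied file object stays open afterwards.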

Regards,
Bengt Richter
 
Magnus Lie Hetland

Bengt Richter said:
That reminds me ...

Is there a BDFL pronouncement or dev consensus on implementation of accepting
either filename or file-object?
[snip]

The standard in such cases is usually the "leap before you look"
idiom, I should think, using try/except and catching signature-related
exceptions. In this case you might try to call read() and revert to
opening the file if there is no read method.
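
A small sketch of that idiom, for illustration. The function name read_all is hypothetical, and the temp file exists only to make the example runnable.

```python
import io
import os
import tempfile


def read_all(source):
    """Return the contents of `source`, a file object or a filename (hypothetical helper)."""
    try:
        read = source.read  # file-like objects have a read method
    except AttributeError:
        # No read method: treat source as a filename and open it.
        with open(source, "r") as f:
            return f.read()
    return read()


from_object = read_all(io.StringIO("abc"))

path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as out:
    out.write("xyz")
from_path = read_all(path)
```

Only the attribute lookup sits inside the try block, so an AttributeError raised inside a real read() call is not mistaken for "this is a filename".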
 
dman

Bengt Richter said:
That reminds me ...

Is there a BDFL pronouncement or dev consensus on implementation of
accepting either filename or file-object?
[snip]

Magnus Lie Hetland said:
The standard in such cases is usually the "leap before you look"
idiom, I should think, using try/except and catching signature-related
exceptions. In this case you might try to call read() and revert to
opening the file if there is no read method.

Another ordering would be to try turning the parameter into a file
first, without trying to read from it:

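def my_function(f):
    # If f is a path, open it. If f is already a file object,
    # open() raises TypeError and f is used as it is.
    try:
        f = open(f, "r")
    except TypeError:
        pass

    return f.read()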

If 'f' is a valid path, then you'll have an open file. If it is
already a file you'll get a type error ("coercing to Unicode: need
string or buffer, file found"). If it is neither, then the .read()
will fail.

--
Q: What is the difference between open-source and commercial software?
A: If you have a problem with commercial software you can call a phone
number and they will tell you it might be solved in a future version.
For open-source software there isn't a phone number to call, but you
get the solution within a day.

www: http://dman13.dyndns.org/~dman/ jabber: (e-mail address removed)
 
