XML pickle

castironpi · Feb 13, 2008

Readability of the Pickle module. Can one export to XML, at the cost of
speed and size, for the benefit of user-readability?

It does something else: plus functions do not export their code,
either in interpreter instructions, or source, or anything else; and
classes do not export their dictionaries, just their names. But it
does export in ASCII.
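
The point about names can be checked directly. What follows is a minimal sketch, not anything from the original post: it pickles a standard-library function and class (json.dumps and fractions.Fraction, chosen only for illustration) with protocol 0, the text-based protocol, and shows that the payload holds the module and object name and nothing else.

```python
import json
import pickle
from fractions import Fraction

# Protocol 0 is pickle's original text-based protocol.
function_payload = pickle.dumps(json.dumps, protocol=0)
class_payload = pickle.dumps(Fraction, protocol=0)

# Both payloads are short ASCII strings naming a module and an object.
# No bytecode, no source, and no class dictionary is included.
print(function_payload.decode("ascii"))
print(class_payload.decode("ascii"))

# Unpickling looks the names up again and returns the very same objects.
same_function = pickle.loads(function_payload) is json.dumps
same_class = pickle.loads(class_payload) is Fraction
print(same_function, same_class)
```

Because only the name travels, the receiving side must be able to import the same module, which is why a pickle is not a self-describing data file.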

Pickle checks any __safe_for_unpickling__ and __setstate__ methods,
which enable a little encapsulating, but don't go far.

At the other end of the spectrum, there is an externally-readable
datafile:

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook
xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
<Worksheet ss:Name="Sheet1">
<Table>
<Row>
<Cell><Data ss:Type="String">abc</Data></Cell>
<Cell><Data ss:Type="Number">123</Data></Cell>
</Row>
</Table>
</Worksheet>
</Workbook>

Classes can be arranged to mimic this hierarchy:

class XMLable:
    def __init__( self, **kwar ):
        self.attrs= kwar
class Workbook( XMLable ):
    cattrs= {
        'xmlns': "urn:schemas-microsoft-com:office:spreadsheet",
        'xmlns:ss': "urn:schemas-microsoft-com:office:spreadsheet" }
class Worksheet( XMLable ):
    cattrs= { 'name': 'ss:Name' }
class Table( XMLable ): pass
class Row( XMLable ): pass
class Cell( XMLable ): pass
class Data( XMLable ):
    cattrs= { 'type': 'ss:Type' }

data= Data( content= 'abc', type= 'String' )
cell= Cell( data= data )
row= Row( cells= [ cell ] )
table= Table( rows= [ row ] )
sheet= Worksheet( table= table, name= "Sheet1" )
book= Workbook( sheets= [ sheet ] )

(These might make things cleaner, but are not allowed:

#data= Data( 'abc', 'ss:Type'= 'String' )
#sheet= Worksheet( table= table, 'ss:Name'= "Sheet1" )

For keys can only be identifiers in keyword argument syntax.)
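
A possible workaround, sketched here under the assumption that the constructor accepts **kwargs as XMLable does: unpacking a dict with ** is not limited to identifier-like keys in CPython, as long as the keys are strings. The helper name collect is hypothetical and stands in for XMLable.__init__.

```python
def collect(**kwargs):
    """Return the keyword arguments unchanged, as XMLable.__init__ would store them."""
    return kwargs

# Literal keyword syntax requires identifiers, so  'ss:Type'= 'String'  is a SyntaxError.
# Unpacking a dict has no such restriction, provided every key is a string.
attrs = collect(content="abc", **{"ss:Type": "String"})
print(attrs)
```

The call site is noisier than the disallowed form, but it keeps the XML attribute name in one place instead of going through a separate renaming table.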

How close to this end can the standard library come? Is it more
prevalent than something else that's currently in it? What does the
recipe look like to convert this to XML, either using import xml or
not?

import pickle
print( pickle.dumps( book ) )

is not quite what I have in mind.
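
For comparison, one rough recipe with the standard library's xml.etree.ElementTree is sketched below. It is an illustration under simplifying assumptions, not a finished design: the function to_element is hypothetical, the cattrs renaming and the namespaces are left out, a keyword named content becomes element text, XMLable values (or lists of them) become child elements, and everything else becomes an attribute.

```python
import xml.etree.ElementTree as ET

class XMLable:
    def __init__(self, **kwargs):
        self.attrs = kwargs

class Workbook(XMLable): pass
class Worksheet(XMLable): pass
class Table(XMLable): pass
class Row(XMLable): pass
class Cell(XMLable): pass
class Data(XMLable): pass

def to_element(obj):
    """Recursively turn an XMLable into an Element named after its class."""
    elem = ET.Element(type(obj).__name__)
    for key, value in obj.attrs.items():
        if isinstance(value, XMLable):
            elem.append(to_element(value))        # single child object
        elif isinstance(value, (list, tuple)):
            for child in value:                   # list of child objects
                elem.append(to_element(child))
        elif key == "content":
            elem.text = str(value)                # text inside the element
        else:
            elem.set(key, str(value))             # plain XML attribute
    return elem

data = Data(content="abc", type="String")
cell = Cell(data=data)
row = Row(cells=[cell])
table = Table(rows=[row])
sheet = Worksheet(table=table, name="Sheet1")
book = Workbook(sheets=[sheet])

xml_text = ET.tostring(to_element(book), encoding="unicode")
print(xml_text)
```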

I guess I'm not convinced that 'is currently in use' has always been
or even is the standard by which standard library additions are
judged. If it's not, then I hold that XML is a good direction to go.
Will core developers listen to reason? Does +1 = +1?

George Sakkis · Feb 14, 2008

Readability of the Pickle module. Can one export to XML, at the cost
of speed and size, for the benefit of user-readability?

Take a look at gnosis.xml.pickle, it seems a good starting point.

George

castironpi · Feb 14, 2008

Take a look at gnosis.xml.pickle, it seems a good starting point.

George

The way the OP specifies it, dumps-loads pairs are broken: say if
Table and Worksheet are defined in different modules. He'd have to
have some kind of unifying pair sequence, that says that "Worksheet"
document elements come from WS.py, etc.
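
Such a "unifying pair sequence" could be as small as a dictionary from element names to classes. The sketch below is hypothetical (REGISTRY, xml_class and load are invented names, and the classes are simplified stand-ins); it only shows that once each class registers itself, the loader no longer needs to know which module defined it.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: element tag -> class to rebuild for that tag.
REGISTRY = {}

def xml_class(cls):
    """Class decorator that records the class under its own name."""
    REGISTRY[cls.__name__] = cls
    return cls

class XMLable:
    def __init__(self, children=(), **attrs):
        self.children = list(children)
        self.attrs = attrs

@xml_class
class Worksheet(XMLable):   # could live in WS.py
    pass

@xml_class
class Table(XMLable):       # could live in some other module
    pass

def load(elem):
    """Rebuild an object tree; an unregistered tag raises KeyError."""
    cls = REGISTRY[elem.tag]
    return cls(children=[load(child) for child in elem], **elem.attrib)

sheet = load(ET.fromstring('<Worksheet name="Sheet1"><Table/></Worksheet>'))
print(type(sheet).__name__, sheet.attrs, [type(c).__name__ for c in sheet.children])
```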

Stefan Behnel · Feb 14, 2008

Hi,

Readability of the Pickle module. Can one export to XML, at the cost of
speed and size, for the benefit of user-readability?

Regarding pickling to XML, lxml.objectify can do that:

http://codespeak.net/lxml/objectify.html

however:

It does something else: plus functions do not export their code,
either in interpreter instructions, or source, or anything else; and
classes do not export their dictionaries, just their names. But it
does export in ASCII.

Pickle checks any __safe_for_unpickling__ and __setstate__ methods,
which enable a little encapsulating, but don't go far.

I'm having a hard time to understand what you are trying to achieve. Could you
state that in a few words? That's usually better than asking for a way to do X
with Y. Y (i.e. pickling in this case) might not be the right solution for you.

Stefan

castironpi · Feb 14, 2008

Hi,

Regarding pickling to XML, lxml.objectify can do that:

http://codespeak.net/lxml/objectify.html

however:

I'm having a hard time to understand what you are trying to achieve. Could you
state that in a few words? That's usually better than asking for a way to do X
with Y. Y (i.e. pickling in this case) might not be the right solution for you.

Stefan

The example isn't so bad. It's not clear that it isn't already too
specific. Pickling isn't what I want. XML is persistent too.

XML could go a couple ways. You could export source, byte code, and
type objects. (Pickle could do that too, thence the confusion
originally.)

gnosis.xml and lxml have slightly different outputs. What I'm going
for has been approached a few different times a few different ways
already. If all I want is an Excel-readable file, that's one end of
the spectrum. If you want something more general, but still include
Excel, that's one of many decisions to make. Ideas.

How does lxml export: b= B(); a.b= b; dumps( a )?

It looks like he can create the XML from the objects already.

Stefan Behnel · Feb 14, 2008

The example isn't so bad. It's not clear that it isn't already too
specific. Pickling isn't what I want. XML is persistent too.

XML could go a couple ways. You could export source, byte code, and
type objects. (Pickle could do that too, thence the confusion
originally.)

What I meant was: please state what you are trying to do. What you describe
are the environmental conditions and possible solutions that you are thinking
of, but it doesn't tell me what problem you are actually trying to solve.

gnosis.xml and lxml have slightly different outputs. What I'm going
for has been approached a few different times a few different ways
already. If all I want is an Excel-readable file, that's one end of
the spectrum. If you want something more general, but still include
Excel, that's one of many decisions to make. Ideas.

How does lxml export: b= B(); a.b= b; dumps( a )?

It looks like he can create the XML from the objects already.

In lxml.objectify, the objects *are* the XML tree. It's all about objects
being bound to specific elements in the tree.

Stefan

castironpi · Feb 14, 2008

What I meant was: please state what you are trying to do. What you describe
are the environmental conditions and possible solutions that you are thinking
of, but it doesn't tell me what problem you are actually trying to solve.

What problem -am- I trying to solve? Map the structure -in- to XML.

In lxml.objectify, the objects *are* the XML tree. It's all about objects
being bound to specific elements in the tree.

Stefan

Objects first. Create. The use case is a simulated strategy
tournament.

Stefan Behnel · Feb 14, 2008

Hi,

http://catb.org/~esr/faqs/smart-questions.html#goal

What problem -am- I trying to solve? Map the structure -in- to XML.

http://catb.org/~esr/faqs/smart-questions.html#beprecise

Is it a fixed structure you have, or are you free to use whatever you like?

Objects first. Create.

http://catb.org/~esr/faqs/smart-questions.html#writewell

My guess is that this is supposed to mean: "I want to create Python objects
and then write their structure out as XML". Is that the right translation?

There are many ways to do so, one is to follow these steps:

http://codespeak.net/lxml/objectify.html#tree-generation-with-the-e-factory
http://codespeak.net/lxml/objectify.html#element-access-through-object-attributes
http://codespeak.net/lxml/objectify.html#python-data-types
then maybe this:
http://codespeak.net/lxml/objectify.html#defining-additional-data-classes
and finally this:
http://codespeak.net/lxml/tutorial.html#serialisation

But as I do not know enough about the problem you are trying to solve, except:

The use case is a simulated strategy tournament.

I cannot tell if the above approach will solve your problem or not.

Stefan

castironpi · Feb 14, 2008

Hi,

http://catb.org/~esr/faqs/smart-questions.html#beprecise

Is it a fixed structure you have, or are you free to use whatever you like?

http://catb.org/~esr/faqs/smart-questions.html#writewell

My guess is that this is supposed to mean: "I want to create Python objects
and then write their structure out as XML". Is that the right translation?

There are many ways to do so, one is to follow these steps:

http://codespeak.net/lxml/objectify...eak.net/lxml/objectify.html#python-data-types
then maybe this:http://codespeak.net/lxml/objectify.html#defining-additional-data-cla...
and finally this:http://codespeak.net/lxml/tutorial.html#serialisation

But as I do not know enough about the problem you are trying to solve, except:

I cannot tell if the above approach will solve your problem or not.

Stefan

I was trying to start a discussion on a cool OO design. Problem's
kind of solved; downer, huh?

I haven't completed it, but it's a start. I expect I'll post some
thoughts along with progress. Will Excel read it? We'll see.

A design difference:

Worksheet= lambda parent: etree.SubElement( parent, "Worksheet" )
Table= lambda parent: etree.SubElement( parent, "Table" )
sheet= Worksheet( book ) #parent
table= Table( sheet )
vs.

table= Table() #empty table
sheet= Worksheet( table= table ) #child

I want to call sheet.table sometimes. Is there a lxml equivalent?
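
One way to get the child-first style together with sheet.table is a thin wrapper that owns an element and remembers its named children. This is a sketch with the standard library's xml.etree.ElementTree standing in for lxml.etree; the Node class is hypothetical and is not part of either library.

```python
import xml.etree.ElementTree as ET

class Node:
    """Hypothetical wrapper: owns one element and remembers its named children."""
    def __init__(self, **children):
        self.element = ET.Element(type(self).__name__)
        for name, child in children.items():
            self.element.append(child.element)  # child-first: attach an existing subtree
            setattr(self, name, child)          # makes sheet.table available

class Table(Node): pass
class Worksheet(Node): pass

table = Table()                  # empty table
sheet = Worksheet(table=table)   # child passed to its parent

xml_text = ET.tostring(sheet.element, encoding="unicode")
print(xml_text)
```

The wrapper keeps two views of the same structure (Python attributes and the element tree), so anything that edits the tree directly would have to keep them in step.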

castironpi · Feb 15, 2008

I was trying to start a discussion on a cool OO design. Problem's
kind of solved; downer, huh?

I haven't completed it, but it's a start. I expect I'll post some
thoughts along with progress. Will Excel read it? We'll see.

A design difference:

Worksheet= lambda parent: etree.SubElement( parent, "Worksheet" )
Table= lambda parent: etree.SubElement( parent, "Table" )
sheet= Worksheet( book ) #parent
table= Table( sheet )
vs.

table= Table() #empty table
sheet= Worksheet( table= table ) #child

I want to call sheet.table sometimes. Is there a lxml equivalent?

Minimize redundancy. Are there some possibilities ignored, such as
reading a class structure from an existing Excel XML file, downloading
the official spec, and if one is coding in Windows, how bulky is the
equivalent COM code? One doesn't want to be re-coding the "wheel" if
it's big and hairy.

Ben Finney · Feb 15, 2008

Minimize redundancy.

Please do so by trimming the quoted material; remove anything not
relevant to people reading your reply.

castironpi · Feb 15, 2008

Great!

--
 \       "I moved into an all-electric house. I forgot and left the
  `\   porch light on all day. When I got home the front door wouldn't
_o__)  open."  -- Steven Wright
Ben Finney

castironpi · Feb 15, 2008

I cannot tell if the above approach will solve your problem or not.

Well, declare me a persistent object.

from lxml import etree

SS= '{urn:schemas-microsoft-com:office:spreadsheet}'
book= etree.Element( 'Workbook' )
book.set( 'xmlns', 'urn:schemas-microsoft-com:office:spreadsheet' )
sheet= etree.SubElement(book, "Worksheet")
sheet.set( SS+ 'Name', 'WSheet1' )
table= etree.SubElement(sheet, "Table")
row= etree.SubElement(table, "Row")
cell1= etree.SubElement(row, "Cell")
data1= etree.SubElement(cell1, "Data" )
data1.set( SS+ 'Type', "Number" )
data1.text= '123'
cell2= etree.SubElement(row, "Cell")
data2= etree.SubElement(cell2, "Data" )
data2.set( SS+ 'Type', "String" )
data2.text= 'abc'
out= etree.tostring( book, pretty_print= True, xml_declaration=True )
print( out )
open( 'xl.xml', 'w' ).write( out )

Can you use set( '{ss}Type' ) somehow? And any way to make this look
closer to the original? But it works.

<?xml version='1.0' encoding='ASCII'?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet">
<Worksheet xmlns:ns0="urn:schemas-microsoft-com:office:spreadsheet"
ns0:Name="WSheet1">
<Table>
<Row>
<Cell>
<Data ns0:Type="Number">123</Data>
</Cell>
<Cell>
<Data ns0:Type="String">abc</Data>
</Cell>
</Row>
</Table>
</Worksheet>
</Workbook>

Stefan Behnel · Feb 15, 2008

Well, declare me a persistent object.

Ok, from now on, you are a persistent object.

from lxml import etree

SS= '{urn:schemas-microsoft-com:office:spreadsheet}'
book= etree.Element( 'Workbook' )
book.set( 'xmlns', 'urn:schemas-microsoft-com:office:spreadsheet' )
sheet= etree.SubElement(book, "Worksheet")
sheet.set( SS+ 'Name', 'WSheet1' )
table= etree.SubElement(sheet, "Table")
row= etree.SubElement(table, "Row")
cell1= etree.SubElement(row, "Cell")
data1= etree.SubElement(cell1, "Data" )
data1.set( SS+ 'Type', "Number" )
data1.text= '123'
cell2= etree.SubElement(row, "Cell")
data2= etree.SubElement(cell2, "Data" )
data2.set( SS+ 'Type', "String" )
data2.text= 'abc'
out= etree.tostring( book, pretty_print= True, xml_declaration=True )
print( out )
open( 'xl.xml', 'w' ).write( out )
http://codespeak.net/lxml/tutorial.html#namespaces
http://codespeak.net/lxml/tutorial.html#the-e-factory
http://codespeak.net/lxml/objectify.html#tree-generation-with-the-e-factory

Can you use set( '{ss}Type' ) somehow?

What is 'ss' here? A prefix?

What about actually reading the tutorial?

http://codespeak.net/lxml/tutorial.html#namespaces

And any way to make this look
closer to the original?

What's the difference you experience?

Stefan

castironpi · Feb 15, 2008

Can you use set( '{ss}Type' ) somehow?

What is 'ss' here? A prefix?

What about actually reading the tutorial?

http://codespeak.net/lxml/tutorial.html#namespaces

What's the difference you experience?

Target:
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook
xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
<Worksheet ss:Name="Sheet1">
<Table>
<Row>
<Cell><Data ss:Type="String">abc</Data></Cell>
<Cell><Data ss:Type="Number">123</Data></Cell>
</Row>
</Table>
</Worksheet>
</Workbook>

It helped get me the working one, actually-- the tutorial. 'ss' is,
and I don't know the jargon for it, a local variable, or namespace
variable, prefix?, or something.
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet". The
ElementMaker example is closest, I think, but it's working, so, ...

I'm more interested in a simplification of the construction code, and
at this point I can get goofy and brainstorm. Ideas?

castironpi · Feb 15, 2008

Something else that crept up is:

<?xml version='1.0' encoding='ASCII'?>
<Workbook xmlns="[hugethingA]">
<Worksheet xmlns:ns0="[hugethingA]" ns0:name="WSheet1">
</Worksheet>
<Styles>
<Style xmlns:ns1="[hugethingA]" ns1:ID="s21"/>
</Styles>
</Workbook>

Which xmlns:ns1 gets "redefined" because I just didn't figure out how
to get the xmlns:ns0 definition into the Workbook tag. But too bad for me.

castironpi · Feb 15, 2008

Something else that crept up is:

<?xml version='1.0' encoding='ASCII'?>
<Workbook xmlns="[hugethingA]">
<Worksheet xmlns:ns0="[hugethingA]" ns0:name="WSheet1">
</Worksheet>
<Styles>
<Style xmlns:ns1="[hugethingA]" ns1:ID="s21"/>
</Styles>
</Workbook>

Which xmlns:ns1 gets "redefined" because I just didn't figure out how
to get the xmlns:ns0 definition into the Workbook tag. But too bad for me.

In Economics, they call it "Economy to Scale"- the effect, and the
point, and past it, where the cost to produce N goods on a supply
curve on which 0 goods costs 0 exceeds that on one on which 0 goods
costs more than 0: the opposite of diminishing returns. Does the
benefit of encapsulating the specifics of the XML file, including the
practice, exceed the cost of it?

For an only slightly more complex result, the encapsulated version is
presented; and the hand-coded, unencapsulated one is left as an
exercise to the reader.

book= Workbook()
sheet= Worksheet( book, 'WSheet1' )
table= Table( sheet )
row= Row( table, index= '2' )
style= Style( book, bold= True )
celli= Cell( row, styleid= style )
datai= Data( celli, 'Number', '123' )
cellj= Cell( row )
dataj= Data( cellj, 'String', 'abc' )

46 lines of infrastructure, moderately packed. Note that:

etree.XML( etree.tostring( book ) )

succeeds.
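
The infrastructure itself was not posted. As a rough idea of what parent-first constructors can look like, here is a much smaller sketch built on the standard library's xml.etree.ElementTree rather than lxml. Every function below is a hypothetical stand-in; namespaces and the Style/styleid handling are deliberately left out, and the attribute names Name, Index and Type are assumptions.

```python
import xml.etree.ElementTree as ET

def Workbook():
    return ET.Element("Workbook")

def Worksheet(book, name):
    return ET.SubElement(book, "Worksheet", {"Name": name})

def Table(sheet):
    return ET.SubElement(sheet, "Table")

def Row(table, index=None):
    # Only emit an Index attribute when the caller asks for one.
    attrib = {} if index is None else {"Index": str(index)}
    return ET.SubElement(table, "Row", attrib)

def Cell(row):
    return ET.SubElement(row, "Cell")

def Data(cell, type_name, text):
    data = ET.SubElement(cell, "Data", {"Type": type_name})
    data.text = text
    return data

book = Workbook()
sheet = Worksheet(book, "WSheet1")
table = Table(sheet)
row = Row(table, index="2")
Data(Cell(row), "Number", "123")
Data(Cell(row), "String", "abc")

# Same sanity check as in the post: serialise, then parse the result again.
round_trip = ET.fromstring(ET.tostring(book))
print([d.text for d in round_trip.iter("Data")])
```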

castironpi · Feb 15, 2008

In Economics, they call it "Economy to Scale"- the effect, and the
point, and past it, where the cost to produce N goods on a supply
curve on which 0 goods costs 0 exceeds that on one on which 0 goods
costs more than 0: the opposite of diminishing returns. Does the
benefit of encapsulating the specifics of the XML file, including the
practice, exceed the cost of it?

And for all the management out there, yes. As soon as possible does
mean as crappy as possible. Extra is extra. Assume the sooner the
crappier and the theorem follows. (Now, corroborate the premise...)

P.S. Gluttony is American too.

castironpi · Feb 15, 2008

And for all the management out there, yes. As soon as possible does
mean as crappy as possible. Extra is extra. Assume the sooner the
crappier and the theorem follows. (Now, corroborate the premise...)

The sooner the crappier or the parties waste time.

Stefan Behnel · Feb 16, 2008

Something else that crept up is:

<?xml version='1.0' encoding='ASCII'?>
<Workbook xmlns="[hugethingA]">
<Worksheet xmlns:ns0="[hugethingA]" ns0:name="WSheet1">
</Worksheet>
<Styles>
<Style xmlns:ns1="[hugethingA]" ns1:ID="s21"/>
</Styles>
</Workbook>

Which xmlns:ns1 gets "redefined" because I just didn't figure out how
to get the xmlns:ns0 definition into the Workbook tag. But too bad for me.

What about actually *reading* the links I post?

http://codespeak.net/lxml/tutorial.html#the-e-factory

Hint: look out for the "nsmap" keyword argument.

Stefan
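
The nsmap keyword mentioned above belongs to lxml and is not shown here. As a standard-library analogue of the same idea (declare the prefix once, up front, instead of letting the serializer invent ns0 and ns1), the sketch below uses xml.etree.ElementTree.register_namespace. It is an assumption-laden illustration: ElementTree then writes every element with the ss: prefix rather than using a default namespace, which is namespace-equivalent to the target but not byte-identical, and whether Excel accepts it was not tested.

```python
import xml.etree.ElementTree as ET

NS = "urn:schemas-microsoft-com:office:spreadsheet"
SS = "{%s}" % NS

# Register the prefix once; this is global state for the ElementTree module.
ET.register_namespace("ss", NS)

book = ET.Element(SS + "Workbook")
sheet = ET.SubElement(book, SS + "Worksheet")
sheet.set(SS + "Name", "WSheet1")
styles = ET.SubElement(book, SS + "Styles")
style = ET.SubElement(styles, SS + "Style")
style.set(SS + "ID", "s21")

xml_text = ET.tostring(book, encoding="unicode")
print(xml_text)

# The namespace is declared a single time, on the root element.
declarations = xml_text.count("xmlns:")
print(declarations)
```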
