PyYaml?

Clark C. Evans

On Mon, Sep 20, 2004 at 10:02:49PM +0100, Paul Moore wrote:
| > Serialization security seems generally assigned as a responsibility
| > of the user, who is usually in the best position to gauge their
| > data's effects. The best a serialization format can do is ensure
| > data reconstruction within the bounds described by the user.
|
| As I say, most of this should be in the YAML documentation. I'll be
| charitable and assume that it's just something that hasn't been
| written up yet, but that section in the spec that I quoted looks
| pretty explicit in its vagueness :)

Indeed. I'd go so far as to say it's a blind spot; or, probably more
accurately, something that we have not had time to seriously
address. I think some of the changes to how implicit typing is
specified should help in this regard -- they punt many of the
security issues to the application. If the application wishes to
use a lazy (and hence insecure) approach to mapping tags to native
object implementations, then that should be explicitly requested by
the application. The other faults in PyYaml, as diligently pointed
out by Andrew, are implementation faults and not directly
attributable to YAML itself.

Clark
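
To make the "explicitly requested by the application" point concrete, here is a minimal sketch of a loader step that only builds native objects for tags the application has listed. This is not PyYaml's API: the names SAFE_CONSTRUCTORS, construct, and UnknownTagError are hypothetical, and the tag strings are only illustrative.

```python
# Hypothetical sketch: none of these names come from PyYaml.
class UnknownTagError(ValueError):
    """Raised when a document uses a tag the application did not allow."""


# The application decides which tags may be turned into native objects.
SAFE_CONSTRUCTORS = {
    "!!int": int,
    "!!float": float,
    "!!str": str,
}


def construct(tag, text, constructors=SAFE_CONSTRUCTORS):
    """Build a native object for `text`, but only for an allowed tag."""
    try:
        factory = constructors[tag]
    except KeyError:
        # Refuse instead of looking the tag up lazily in arbitrary modules.
        raise UnknownTagError(f"tag {tag!r} is not allowed") from None
    return factory(text)


if __name__ == "__main__":
    print(construct("!!int", "42"))
    try:
        construct("!python/object:os.system", "echo hi")
    except UnknownTagError as exc:
        print("rejected:", exc)
```

An application that really wants the lazy behaviour would have to pass its own, wider `constructors` mapping, so the insecure choice is visible in the calling code.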
 
Andrew Dalke

Clark said:
Thank you for taking some serious time looking at PyYaml

Well, in truth I was taking time to criticize the OP's
advocacy and explain why we aren't "welcoming it with open
arms".
> YAML has at least two more years of work before it'd be
> ready for even proposing that it be considered as a core library.

Sounds about right. It'll take at least that long to get a
robust implementation and enough people using it to warrant
inclusion.

Best wishes,

Andrew
 
Istvan Albert

Clark said:
- YAML was created for human reading / authoring. We have spent
an enormous amount of time working with real use cases of data
to find a very clean expression of structured data. If you
like Python's use of whitespace to show structure, you will

IMHO you are already too focused on some specific use cases (invoices etc.)
and you'd probably do better to move your entire effort
in that direction.

What you are proposing, whitespace as markup, is not nearly as readable
as XML when the dependency tree gets more complicated. Just because
whitespace indentation works for programming does not mean
it works for data too.

Istvan.
 
Istvan Albert

Andrew said:
[Long post. Summary is I've found three exploits in
pyyaml and at least five limitations w.r.t. the existing
Python pickles. I DO NOT recommend anyone use pyyaml

pulling no punches
 
Chris S.

Istvan said:
IMHO you are already too focused on some specific use cases (invoices etc.)
and you'd probably do better to move your entire effort
in that direction.

What you are proposing, whitespace as markup, is not nearly as readable
as XML when the dependency tree gets more complicated. Just because
whitespace indentation works for programming does not mean
it works for data too.

XML was designed as a general-purpose method of data description, and so
suffers from the requirement of having to explicitly tag everything and
define the meaning of each of these tags. This introduces tremendous
inefficiency for large and even small data structures. For example,
data={'abc':[1,2,3]} (20 bytes) takes 38 bytes to store with YAML,
41 bytes to store with Pickle, and 416 bytes to store with XML (using
the xml_pickle module by David Mertz). I'd agree that readability is
hindered by complex dependencies, but XML's overly verbose syntax is
unreadable for even simple data IMHO. I've yet to find a situation where
a supposedly readable format doesn't benefit from whitespace indentation
or simple notation, two aspects YAML enjoys. The four representations
follow: the Python literal, YAML, Pickle, and XML.

data={'abc':[1,2,3]}

---
abc:
- 1
- 2
- 3

(dp1
S'abc'
p2
(lp3
I1
aI2
aI3
as.

<?xml version="1.0"?>
<!DOCTYPE PyObject SYSTEM "PyObjects.dtd">
<PyObject class="XML_Pickler" id="12405096">
<attr name="stuff" type="dict" id="13060256">
<entry>
<key type="string" value="abc" />
<val type="list" id="12976528">
<item type="numeric" value="1" />
<item type="numeric" value="2" />
<item type="numeric" value="3" />
</val>
</entry>
</attr>
</PyObject>
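
For anyone who wants to repeat the size comparison, here is a rough sketch using only the standard library. It assumes the YAML text is written out by hand exactly as shown above (no YAML library is called), and it leaves out the XML case because xml_pickle is a third-party module. The byte counts will not necessarily match the figures quoted in the post, since they depend on the Python version, the pickle protocol, and the exact whitespace and line endings.

```python
import pickle

data = {"abc": [1, 2, 3]}

# The Python literal as written in the post.
literal = "data={'abc':[1,2,3]}"

# Hand-written YAML matching the listing above (assumption: "\n" line
# endings and a trailing newline).
yaml_text = "---\nabc:\n- 1\n- 2\n- 3\n"

# Protocol 0 is the text-based pickle format, the closest match to the
# pickle listing in the post.
pickled = pickle.dumps(data, protocol=0)

sizes = {
    "python literal": len(literal.encode("ascii")),
    "yaml (hand-written)": len(yaml_text.encode("ascii")),
    "pickle protocol 0": len(pickled),
}

for name, size in sizes.items():
    print(f"{name:22} {size:4d} bytes")
```

Measuring encoded bytes rather than character counts keeps the comparison fair if the data ever contains non-ASCII text.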
 
