Use yaml to store Ruby arrays or hashes in the DB?

George Moschovitis

Hello everybody,

the current version of NDB stores Array/Hash/GeneralRubyObjects using
the Marshal.dump/load methods. This is relatively fast but not portable.
I.e., it is difficult to use the database from an application written in
another language, because the Marshal format is Ruby-specific.

I could use YAML, of course, but I am wondering whether it is fast enough.

Any opinions on this?

best regards,
George Moschovitis
 
Zev Blut

Hello,

George said:
the current version of NDB stores Array/Hash/GeneralRubyObjects using
the Marshal.dump/load methods. This is relatively fast but not portable.
I.e., it is difficult to use the database from an application written in
another language, because the Marshal format is Ruby-specific.

I could use YAML, of course, but I am wondering whether it is fast enough.

Any opinions on this?

That depends on what you consider fast enough.
If you read the Ruby-Talk thread starting at [99960], you can see some
benchmarks.

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/99960
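
For a rough local comparison, a minimal sketch along these lines may help. It is not taken from the thread linked above; the sample data (`records`) and the iteration count are assumptions chosen for illustration, and the timings will vary with the Ruby version and the shape of the data.

```ruby
require "benchmark"
require "yaml"

# Hypothetical sample data: an array of hashes holding only strings and numbers.
records = Array.new(1_000) do |i|
  { "id" => i, "name" => "item #{i}", "tags" => ["a", "b"] }
end

# Time ten dump/load round trips through Marshal.
marshal_time = Benchmark.realtime do
  10.times { Marshal.load(Marshal.dump(records)) }
end

# Time ten dump/load round trips through YAML.
# aliases: true is only a guard in case shared objects are emitted as aliases.
yaml_time = Benchmark.realtime do
  10.times { YAML.safe_load(YAML.dump(records), aliases: true) }
end

puts format("Marshal: %.4fs", marshal_time)
puts format("YAML:    %.4fs", yaml_time)
```

Running it against data shaped like your own records gives a better answer to "fast enough" than any general figure.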

In my experience, YAML is much faster than an XML equivalent using
REXML.

But you need to be careful about what you store as YAML. If you are
only storing Arrays and Hashes of Strings, Numbers, and other such
simple values, you are probably fine, but as soon as custom Ruby objects
start appearing you may be in trouble.
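
As a small illustration of the "probably fine" case, here is a sketch of a round trip with data built only from core types. The `record` hash is a made-up example, not something from NDB.

```ruby
require "yaml"

# Hypothetical record made only of strings, numbers, arrays and hashes.
record = {
  "title"  => "Example",
  "count"  => 3,
  "ratio"  => 0.5,
  "labels" => ["x", "y"]
}

yaml = YAML.dump(record)
puts yaml

# safe_load builds only core types, so it doubles as a check that
# nothing Ruby-specific ended up in the document.
restored = YAML.safe_load(yaml)
puts restored == record
```

The dumped text carries no Ruby-specific type tags, which is what makes it readable from other languages.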

I learned this the hard way when working with objects that
dynamically extend modules. Attached is a simple example of an
interesting difference between Marshal and YAML with regard to
saving an object's state.

I hope this helps.
Best,
Zev
-----
Example code that shows that Marshal can store the modules extended
by a specific object, while YAML cannot.
-----
module TestModule
  def foo
    puts "bar!"
  end
end

x = "String"
x.extend TestModule

puts "Is x a TestModule?"
puts x.is_a?(TestModule)

y = "String y"
puts "Is y a TestModule?"
puts y.is_a?(TestModule)

include Marshal

puts "Dumping and Reloading X"
rx = restore(dump(x))
puts "Is the reloaded X a TestModule?"
puts rx.is_a?(TestModule)
puts "Just to make sure we can still foo."
rx.foo # foo already prints "bar!"; wrapping it in puts would add an empty line
puts "Dumping and Reloading Y (for fun)."
ry = restore(dump(y))
puts "Is the reloaded Y a TestModule? (I hope not.)"
puts ry.is_a?(TestModule)

puts "Now let's try the equivalent in YAML"
require "yaml"
yx = YAML.load(x.to_yaml)
puts "Is the YAML reloaded X a TestModule?"
puts yx.is_a?(TestModule)
 
James Britt

George said:
Hello everybody,

the current version of NDB stores Array/Hash/GeneralRubyObjects using
the Marshal.dump/load methods. This is relatively fast but not portable.
Ie it is difficult to use the database from an application coded in
another language due to Marshal being Ruby specific.

I could use yaml of course but I am wondering if this is fast enough.

How can you store Ruby-specific objects (i.e., custom, user-defined
classes) in a language-neutral way?



James
 
trans. (T. Onoma)

| > Hello everybody,
| >
| > the current version of NDB stores Array/Hash/GeneralRubyObjects using
| > the Marshal.dump/load methods. This is relatively fast but not portable.
| > Ie it is difficult to use the database from an application coded in
| > another language due to Marshal being Ruby specific.
| >
| > I could use yaml of course but I am wondering if this is fast enough.
|
| How can you store Ruby-specific objects (i.e., custom, user-defined
| classes) in a language-neutral way?

You use YAML.

No really. That's all there is to it. If it's not fast enough, I'd advise
writing your own YAML parser in assembly, because honestly this is a high-level
task and YAML is the only really portable solution. (Okay, there are some
XML-based tools, but they will be just as slow.)

Of course, you could use an OODB. Look up 'purple' in RAA; that in itself will
probably interest you. Then read the included docs (the README, I think):
there you will find caveats, along with a suggested OODB alternative.
Maybe that will work.

Have Fun,
T.
 
James Britt

trans. (T. Onoma) said:
On Monday 01 November 2004 10:20 pm, James Britt wrote:
|
| How can you store Ruby-specific objects (i.e., custom, user-defined
| classes) in a language-neutral way?

You use YAML.

No really. That's all there is to it.

The ri data files for 1.8.2 are in YAML. Can I load the following into
a YAML parser in, say, Perl or Python? How does that language make
sense of it?

--- !ruby/object:RI::MethodDescription
aliases: []
block_params:
comment:
  - !ruby/struct:SM::Flow::p
    body: "Exclusive Or---If <em>obj</em> is <tt>nil</tt> or
      <tt>false</tt>, returns <tt>false</tt>; otherwise, returns <tt>true</tt>."
full_name: FalseClass#^
is_singleton: false
name: "^"
params: >
  false ^ obj => true or false

  nil ^ obj => true or false


visibility: public
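
To see where such tags come from, here is a minimal sketch using a made-up `Point` class as a stand-in for something like RI::MethodDescription. It dumps an instance and then parses the result into a node tree only, without rebuilding the object, which is roughly the view a generic parser has of the document.

```ruby
require "yaml"

# Hypothetical stand-in for a user-defined class.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end
end

doc = YAML.dump(Point.new(1, 2))
puts doc

# Parse to a node tree; no Point instance is created at this stage.
root = YAML.parse(doc).root
puts root.class # the node is an ordinary mapping
puts root.tag   # with a Ruby-specific tag attached
```

The mapping itself is plain YAML; what a parser in another language does with the unknown tag depends on that library.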
 
trans. (T. Onoma)

| > On Monday 01 November 2004 10:20 pm, James Britt wrote:
| > | How can you store Ruby-specific objects (i.e., custom, user-defined
| > | classes) in a language-neutral way?
| >
| > You use YAML.
| >
| > No really. That's all there is to it.
|
| The ri data files for 1.8.2 are in YAML. Can I load the following into
| a YAML parser in, say, perl or python? How does that language make
| sense of it?
|
| --- !ruby/object:RI::MethodDescription
| aliases: []
| block_params:
| comment:
| - !ruby/struct:SM::Flow::p
| body: "Exclusive Or---If <em>obj</em> is <tt>nil</tt> or
| <tt>false</tt>, returns <tt>false</tt>; otherwise, returns <tt>true</tt>."
| full_name: FalseClass#^
| is_singleton: false
| name: "^"
| params: >
| false ^ obj => true or false
|
| nil ^ obj => true or false
|
| visibility: public


You'll have to use types to get rid of the !ruby/object cruft.

YAML.add_private_type( 'MethodDescription' ) { |type, val|
  RI::MethodDescription.new( val )
}

That'll load it automatically. Note that val is the parsed document (a hash),
so your initialize method will either have to deal with that or you'll need a
new constructor method instead. I usually define #initialize_yaml(val) and
use that.

To output the type correctly use:

class RI::MethodDescription

  def to_yaml_type
    'MethodDescription'
  end

end

Perl and Python should be able to handle it with no problem from there.
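
A related approach that avoids type tags altogether is to convert the object to a plain hash before dumping and rebuild it after loading. The sketch below is only an illustration: the class `MethodInfo` and its `to_h`/`from_h` methods are made-up names, not part of ri or of any YAML library.

```ruby
require "yaml"

# Hypothetical class used for illustration only.
class MethodInfo
  attr_reader :name, :visibility

  def initialize(name, visibility)
    @name = name
    @visibility = visibility
  end

  # Plain-hash form containing only strings.
  def to_h
    { "name" => name, "visibility" => visibility }
  end

  # Rebuild an instance from the plain-hash form.
  def self.from_h(hash)
    new(hash["name"], hash["visibility"])
  end
end

info = MethodInfo.new("^", "public")
yaml = YAML.dump(info.to_h)
puts yaml

copy = MethodInfo.from_h(YAML.safe_load(yaml))
puts copy.name
puts copy.visibility
```

The cost is that each class has to state explicitly which fields it stores, but the resulting document contains nothing beyond strings and a mapping.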

HTH,
T.

P.S. You can also use a Domain type if you prefer. For more info see:

http://yaml4r.sourceforge.net/doc/
 
George Moschovitis

trans. (T. Onoma) said:
| > Hello everybody,
| >
| > the current version of NDB stores Array/Hash/GeneralRubyObjects using
| > the Marshal.dump/load methods. This is relatively fast but not portable.
| > Ie it is difficult to use the database from an application coded in
| > another language due to Marshal being Ruby specific.
| >
| > I could use yaml of course but I am wondering if this is fast enough.
|
| How can you store Ruby-specific objects (i.e., custom, user-defined
| classes) in a language-neutral way?

You use YAML.

Yep, the latest version uses YAML instead of Marshal. I haven't run any
benchmarks yet.

-g.
 
