Can you explain what this repo (yaml/Marshal) code does?


anne001

The repository is a hash keyed on some kind of checksum derived from the
file.
But what is the use of Marshal? Is it something to do with memory management,
instead of writing to a file?
What is the format of what is written out?

require "digest/md5"
require "fileutils"

REPO_FILE = "repo.bin".freeze

class Repository
  attr_accessor :main_dir, :duplicate_dir, :extensions

  def initialize(extensions = %w{mp3 ogg})
    # was @extension, which left the extensions accessor returning nil
    @extensions = extensions
    @repository = {}
  end

  def process_dir(dir)
    # find all files with the extensions we support
    Dir[File.join(dir, "*.{#{extensions.join(',')}}")].each do |f|
      # Dir[] already returns paths that include dir, so do not join again
      process_file(f)
    end
  end

  def process_file(file)
    digest = digest(file)
    name = @repository[digest]

    if name
      # this content has been seen before
      target = duplicate_dir
      # ...
    else
      target = main_dir
      # ...
    end

    FileUtils.cp( file, File.join( target, File.basename( file ) ) )
  end

  # checksum of the file contents, used as the hash key
  def digest(file)
    Digest::MD5.hexdigest( File.open(file, 'rb') {|io| io.read})
  end

  # restore a previously saved Repository object
  def self.load(file)
    File.open(file, 'rb') {|io| Marshal.load(io)}
  end

  # serialize this whole object to a binary file
  def save(file)
    File.open(file, 'wb') {|io| Marshal.dump(self, io)}
  end
end

repo = begin
  Repository.load( REPO_FILE )
rescue Errno::ENOENT
  # not there => create
  r = Repository.new
  r.main_dir = "foo"
  r.duplicate_dir = "bar"
  r
end

ARGV.each {|dir| repo.process_dir(dir)}

repo.save( REPO_FILE )
http://groups.google.com/group/comp...f86841?lnk=gst&q=repo&rnum=2#fa358e55b7f86841
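
To make the "hash keyed on a checksum" idea concrete, here is a minimal sketch that is separate from the posted class. The helper name find_duplicates and the temporary file names are made up for illustration; only Digest::MD5, File and Dir.mktmpdir come from Ruby itself.

```ruby
require "digest/md5"
require "tmpdir"

# Hypothetical helper (not part of the posted code): returns a hash of
# digest => first path seen, plus the paths whose contents were already seen.
def find_duplicates(paths)
  seen = {}
  duplicates = []
  paths.each do |path|
    key = Digest::MD5.hexdigest(File.binread(path))
    if seen.key?(key)
      duplicates << path   # same bytes as an earlier file
    else
      seen[key] = path     # first time we see this content
    end
  end
  [seen, duplicates]
end

Dir.mktmpdir do |dir|
  a = File.join(dir, "a.mp3")
  b = File.join(dir, "b.mp3")
  c = File.join(dir, "c.mp3")
  File.binwrite(a, "same bytes")
  File.binwrite(b, "same bytes")   # identical content, different name
  File.binwrite(c, "other bytes")

  seen, duplicates = find_duplicates([a, b, c])
  puts seen.size                                        # distinct contents
  puts duplicates.map { |p| File.basename(p) }.inspect  # files flagged as copies
end
```

Because the key depends only on the file contents, two files with different names but identical bytes land on the same hash entry, which is presumably how the posted process_file decides between main_dir and duplicate_dir.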
 

Gregory Brown

The repository is a hash keyed on some kind of checksum derived from the
file.
But what is the use of Marshal? Is it something to do with memory management,

Marshal is a way to save the state of your Ruby objects to a file.
Unlike YAML, it outputs in binary format. It is written in C, so it
can serialize and restore objects very quickly.

It looks like in this code, this is simply how the save and load
functions are implemented, and since 'self' is being passed, it will
just serialize the Repository object to file during save and restore
it during load.

If you need readable, but much slower serialization, you could use
YAML in place of the Marshal calls.
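
As a rough illustration of the difference in output format, the sketch below dumps the same small Hash both ways. The data is invented for the example; Marshal.dump, Marshal.load, YAML.dump and YAML.load are the standard calls.

```ruby
require "yaml"

data = { "artist" => "Example", "tracks" => 3 }  # illustrative data only

binary = Marshal.dump(data)   # a String of raw bytes
text   = YAML.dump(data)      # human-readable text

# The first two bytes of a Marshal dump are the format's major and
# minor version numbers; the rest is a compact binary encoding.
puts binary.bytes.first(2).inspect
puts binary.inspect

# The YAML form can be read and edited in any text editor.
puts text

# Either form round-trips back to an equal Hash.
restored_from_marshal = Marshal.load(binary)
restored_from_yaml    = YAML.load(text)
puts restored_from_marshal == data
puts restored_from_yaml == data
```

Passing an open file as a second argument to Marshal.dump, as the Repository#save method in the question does, writes the same bytes straight to disk instead of returning a String.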
 

anne001

It looks like in this code, this is simply how the save and load
functions are implemented, and since 'self' is being passed, it will
just serialize the Repository object to file during save and restore
it during load.

But why do you need to save and load the objects?
I have never seen code like this before. What do you gain?
What problem does it solve?
 

Timothy Goddard

anne001 said:
But why do you need to save and load the objects?
I have never seen code like this before. What do you gain?
What problem does it solve?

It can be used to store objects on disk for future use (e.g. web
application sessions) or to send objects between Ruby interpreters
(this only works when both sides support the same Marshal format
version and have the object's class loaded).
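
The two conditions mentioned above can be seen in a small sketch. The Session class is hypothetical and exists only for this illustration; removing the constant is just a way to imitate, in one process, a receiver that never loaded the class.

```ruby
# Hypothetical session class, used only for this illustration.
class Session
  attr_reader :user

  def initialize(user)
    @user = user
  end
end

dumped = Marshal.dump(Session.new("anne"))

# Every dump starts with the format version it was written with.
major, minor = dumped.bytes.first(2)
puts "dump uses format #{major}.#{minor}"
puts "this Ruby writes #{Marshal::MAJOR_VERSION}.#{Marshal::MINOR_VERSION}"

# With the class defined, loading gives back a working object.
restored = Marshal.load(dumped)
puts restored.user

# Imitate an interpreter where the class definition is missing.
Object.send(:remove_const, :Session)
error = begin
  Marshal.load(dumped)
  nil
rescue ArgumentError => e
  e
end
puts error.message
```

One related caution: Marshal.load will build whatever objects the data describes, so it should only be given data from a source you trust.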
 

James Edward Gray II

But why do you need to save and load the objects?
I have never seen code like this before. What do you gain?
What problem does it solve?

Well, let's pretend you had some wiki class in your code:

# a mock wiki object...
class WikiPage
  def initialize( page_name, author, contents )
    @page_name = page_name
    @revisions = Array.new

    add_revision(author, contents)
  end

  attr_reader :page_name

  def add_revision( author, contents )
    @revisions << { :created  => Time.now,
                    :author   => author,
                    :contents => contents }
  end

  def wiki_page_references
    [@page_name] + @revisions.last[:contents].scan(/\b(?:[A-Z]+[a-z]+){2,}/)
  end

  # ...
end

Now, let's assume you have a Hash of these things you are using to
run your wiki:

wiki = Hash.new
[ ["HomePage", "James", "A page about the SillyEmailExamples..."],
  ["SillyEmailExamples", "James", "Blah, blah, blah..."] ].each do |page|
  new_page = WikiPage.new(*page)
  wiki[new_page.page_name] = new_page
end

When your script runs you will need to save these pages to a disk
somehow, so you don't lose the site contents between runs. You have
a ton of options here, of course, including using a database or
rolling some method that can write these pages out to files.

Writing them out is a pain though because page contents can be pretty
much anything, so you'll need to come up with a good file format that
allows you to tell where each revision starts and stops. This
probably means escaping some characters, at a minimum.

Or, you can just use Marshal/YAML. With these helpers, saving the
entire wiki is reduced to the trivial:

File.open("wiki.dump", "wb") { |file| Marshal.dump(wiki, file) }

When needed, you can load that back with:

wiki = File.open("wiki.dump", "rb") { |file| Marshal.load(file) }

Those files will be stored in a binary format for Ruby to read. If
you would prefer a human-readable format, replace the word Marshal
with YAML above and make sure your script does a:

require "yaml"

See how easy it is to get instant saving/loading of entire Ruby
structures?

James Edward Gray II
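
For the YAML variant, a minimal sketch follows. It uses a cut-down, hypothetical Page class instead of the WikiPage above, and it keeps the YAML in a String rather than a file. One point that postdates this thread: in newer Ruby versions (Psych 4 and later) YAML.load refuses to build arbitrary objects by default, so the sketch uses YAML.unsafe_load when that method exists and falls back to YAML.load otherwise.

```ruby
require "yaml"

# Hypothetical stand-in for the WikiPage class, kept small on purpose.
class Page
  attr_reader :name, :contents

  def initialize(name, contents)
    @name = name
    @contents = contents
  end
end

wiki = {
  "HomePage" => Page.new("HomePage", "A page about the SillyEmailExamples...")
}

# The dump is plain text that names the class and each instance variable.
yaml_text = YAML.dump(wiki)
puts yaml_text

# Pick the loader that is allowed to rebuild custom classes.
# Only use it on files you wrote yourself.
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
restored = YAML.public_send(loader, yaml_text)

puts restored["HomePage"].name
puts restored["HomePage"].contents
```

As with Marshal, the class has to be defined before loading, and the file should come from a trusted source.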
 
