Memoization, files and Marshal?

D

Daniel Berger

Hi all,

Inspired by a recent blog entry by Mauricio Fernandez, I decided to try
to add a method to the memoize package that would allow users to
memoize using a file based approach rather than memory, either to
conserve memory or for persistance.

However, I'm not sure it's possible. If it is, I'd love to know how.
If it's not, well, pstore it is.

Here's what I tried:

def memoize_to_file(name, file)
meth = method(name)
cache = Hash.new.update(Marshal.load(File.read(file))) rescue nil
cache ||= {}
(class << self; self; end).class_eval do
define_method(name) do |*args|
if cache.has_key?(args)
cache[args]
else
cache[args] ||= meth.call(*args)
File.open(file, "wb+"){ |f| Marshal.dump(cache, f) }
end
end
end
cache
end

# Code snippet
include Memoize
def fib(n)
return n if n < 2
fib(n-1) + fib(n-2)
end
memoize_to_file:)fib, "temp.txt")
fib(10)

However, running that gives me this error: in `fib': undefined method
`+' for #<File:temp.txt (closed)> (NoMethodError)

Any ideas?

Thanks,

Dan
 
A

ara.t.howard

Hi all,

Inspired by a recent blog entry by Mauricio Fernandez, I decided to try
to add a method to the memoize package that would allow users to
memoize using a file based approach rather than memory, either to
conserve memory or for persistance.

However, I'm not sure it's possible. If it is, I'd love to know how.
If it's not, well, pstore it is.

Here's what I tried:

def memoize_to_file(name, file)
meth = method(name)
cache = Hash.new.update(Marshal.load(File.read(file))) rescue nil
cache ||= {}
(class << self; self; end).class_eval do
define_method(name) do |*args|
if cache.has_key?(args)
cache[args]
else
cache[args] ||= meth.call(*args)
File.open(file, "wb+"){ |f| Marshal.dump(cache, f) }
end
end
end
cache
end

# Code snippet
include Memoize
def fib(n)
return n if n < 2
fib(n-1) + fib(n-2)
end
memoize_to_file:)fib, "temp.txt")
fib(10)

However, running that gives me this error: in `fib': undefined method
`+' for #<File:temp.txt (closed)> (NoMethodError)

Any ideas?

Thanks,


harp:~ > cat a.rb
def memoize_to_file(name, file)
cache = Hash.new.update(Marshal.load(File.read(file))) rescue {}
(class << self; self; end).class_eval do
define_method(name) do |*args|
unless cache.has_key?(args)
cache[args] = method(name).call(*args)
File.open(file, "wb+"){|f| Marshal.dump(cache, f) }
end
cache[args]
end
end
end

def fib(n)
return n if n < 2
fib(n-1) + fib(n-2)
end

memoize_to_file "fib", "fib.cache"
n = fib 10
p n



harp:~ > ruby a.rb
55


regards.

-a
--
===============================================================================
| ara [dot] t [dot] howard [at] noaa [dot] gov
| all happiness comes from the desire for others to be happy. all misery
| comes from the desire for oneself to be happy.
| -- bodhicaryavatara
===============================================================================
 
D

Daniel Berger

harp:~ > cat a.rb
def memoize_to_file(name, file)
cache = Hash.new.update(Marshal.load(File.read(file))) rescue {}
(class << self; self; end).class_eval do
define_method(name) do |*args|
unless cache.has_key?(args)
cache[args] = method(name).call(*args)
File.open(file, "wb+"){|f| Marshal.dump(cache, f) }
end
cache[args]
end
end
end

def fib(n)
return n if n < 2
fib(n-1) + fib(n-2)
end

memoize_to_file "fib", "fib.cache"
n = fib 10
p n

As is, I get a stack level too deep error on Windows. If I change the
"cache[args] =" to "cache[args] ||=", I always get back nil.

I'm probably being thick here, but any help is appreciated.

Thanks,

Dan
 
A

ara.t.howard

harp:~ > cat a.rb
def memoize_to_file(name, file)
cache = Hash.new.update(Marshal.load(File.read(file))) rescue {}
(class << self; self; end).class_eval do
define_method(name) do |*args|
unless cache.has_key?(args)
cache[args] = method(name).call(*args)
File.open(file, "wb+"){|f| Marshal.dump(cache, f) }
end
cache[args]
end
end
end

def fib(n)
return n if n < 2
fib(n-1) + fib(n-2)
end

memoize_to_file "fib", "fib.cache"
n = fib 10
p n

As is, I get a stack level too deep error on Windows. If I change the
"cache[args] =" to "cache[args] ||=", I always get back nil.

I'm probably being thick here, but any help is appreciated.

sorry, don't know how that's running on linux in the first place... you need:

--- a.rb 2005-12-31 20:30:59.000000000 -0700
+++ b.rb 2005-12-31 20:31:10.000000000 -0700
@@ -1,9 +1,10 @@
def memoize_to_file(name, file)
+ meth = method(name)
cache = Hash.new.update(Marshal.load(File.read(file))) rescue {}
(class << self; self; end).class_eval do
define_method(name) do |*args|
unless cache.has_key?(args)
- cache[args] = method(name).call(*args)
+ cache[args] = meth.call(*args)
File.open(file, "wb+"){|f| Marshal.dump(cache, f) }
end
cache[args]

with that it works for me on linux and windows.

cheers.

-a
--
===============================================================================
| ara [dot] t [dot] howard [at] noaa [dot] gov
| all happiness comes from the desire for others to be happy. all misery
| comes from the desire for oneself to be happy.
| -- bodhicaryavatara
===============================================================================
 
D

Daniel Berger

harp:~ > cat a.rb
def memoize_to_file(name, file)
cache = Hash.new.update(Marshal.load(File.read(file))) rescue {}
(class << self; self; end).class_eval do
define_method(name) do |*args|
unless cache.has_key?(args)
cache[args] = method(name).call(*args)
File.open(file, "wb+"){|f| Marshal.dump(cache, f) }
end
cache[args]
end
end
end

def fib(n)
return n if n < 2
fib(n-1) + fib(n-2)
end

memoize_to_file "fib", "fib.cache"
n = fib 10
p n

As is, I get a stack level too deep error on Windows. If I change the
"cache[args] =" to "cache[args] ||=", I always get back nil.

I'm probably being thick here, but any help is appreciated.

sorry, don't know how that's running on linux in the first place... you need:

--- a.rb 2005-12-31 20:30:59.000000000 -0700
+++ b.rb 2005-12-31 20:31:10.000000000 -0700
@@ -1,9 +1,10 @@
def memoize_to_file(name, file)
+ meth = method(name)
cache = Hash.new.update(Marshal.load(File.read(file))) rescue {}
(class << self; self; end).class_eval do
define_method(name) do |*args|
unless cache.has_key?(args)
- cache[args] = method(name).call(*args)
+ cache[args] = meth.call(*args)
File.open(file, "wb+"){|f| Marshal.dump(cache, f) }
end
cache[args]

with that it works for me on linux and windows.

cheers.

-a

Thanks, that did the trick nicely. It's now part of memoize 1.2.0. :)

Regards,

Dan
 
A

ara.t.howard

--1yeeQ81UyVL57Vl7
Content-Type: MULTIPART/MIXED; BOUNDARY=1yeeQ81UyVL57Vl7

This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.

--1yeeQ81UyVL57Vl7
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; FORMAT=flowed
Content-ID: <[email protected]>
Content-Disposition: INLINE

Dumping the cache on every miss seems expensive.
I changed the code to save it once every N misses, and made the cache update
atomic to prevent problems with:
* concurrent executions of the memoized method (processes and threads)
* no HD space being left (the previous cache state is preserved)

I also applied those changes to memoize.rb 1.2.0 and generalized it to work
with instance methods, see the patch after memo.rb.

nice idea.

seems like the perfect time to introduce a Memoize::Cache class wrapping
pstore though doesn't it?

cheers.

-a
--
===============================================================================
| ara [dot] t [dot] howard [at] noaa [dot] gov
| all happiness comes from the desire for others to be happy. all misery
| comes from the desire for oneself to be happy.
| -- bodhicaryavatara
===============================================================================
--1yeeQ81UyVL57Vl7
Content-Type: TEXT/PLAIN; CHARSET=us-ascii
Content-ID: <[email protected]>
Content-Description:
Content-Disposition: ATTACHMENT; FILENAME=memoize.rb

module Memoize
MEMOIZE_VERSION = "1.2.0"

def self.memoize(recv, name, file=nil, period=-1)
class << recv; self end.instance_eval{ instance_memoize(name, file, period) }
end

def memoize(name, file=nil, period=-1)
Memoize.memoize(self, name, file, period)
end
end

class Module
# Memoize the method +name+. If +file+ is provided, then the method results
# are stored on disk in addition to the in-memory cache.
def instance_memoize(name, file=nil, period=-1)
meth = instance_method(name)

if file
cache = Hash.new.update(Marshal.load(File.read(file))) rescue {}
else
cache = {}
end

save_cache_proc = lambda do
tmpfile = "#{file}.#{Process.pid}"
begin
File.open(tmpfile, "wb+"){|f| Marshal.dump(cache, f) }
File.rename tmpfile, file
rescue Exception
end
end

at_exit { save_cache_proc.call } if file

calls = period
define_method(name) do |*args|
unless cache.has_key?(args)
cache[args] = meth.bind(self).call(*args)
if file && period != -1 && (calls -= 1) <= 0
calls = period
save_cache_proc.call
end
end
cache[args]
end

cache
end
end

if __FILE__ == $0
def fib(n)
return n if n < 2
fib(n-1) + fib(n-2)
end

def fib2(n)
return n if n < 2
fib2(n-1) + fib2(n-2)
end

def fib3(n)
return n if n < 2
fib3(n-1) + fib3(n-2)
end

include Memoize

module DumbFib
def fib(n)
return n if n < 2
fib(n-1) + fib(n-2)
end
end

DumbFib2 = DumbFib.clone

memoize :fib, "xfib.memoize.cache", 1
memoize :fib2, "xfib2.memoize.cache"
memoize :fib3, "xfib3.memoize.cache", 10

class Dummy1
include DumbFib
instance_memoize :fib, "xfib.dumb1.cache"
end
class Dummy2
include DumbFib2
end
module DumbFib2
instance_memoize :fib, "xfib.dumb2.cache"
end

require 'benchmark'
runs = 1
Benchmark.bmbm(10) do |bm|
dummy1 = Dummy1.new
dummy2 = Dummy2.new
bm.report("period 1") { runs.times{ fib 500 } }
bm.report("period 10"){ runs.times{ fib3 500 } }
bm.report("at_exit") { runs.times{ fib2 500 } }
bm.report("at_exit''") { runs.times{ dummy1.fib 500 } }
bm.report("at_exit'''") { runs.times{ dummy2.fib 500 }; runs = 100000}
end

end

--1yeeQ81UyVL57Vl7--
--1yeeQ81UyVL57Vl7--
 
D

Daniel Berger

This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.



nice idea.

Looks interesting but I get this with windows xp using his sample code:
ruby memoize.rb
Rehearsal ----------------------------------------------
period 1 memoize.rb:39:in `has_key?': stack level too deep
(SystemStackError)
from memoize.rb:39:in `fib'
from memoize.rb:56:in `fib'
from memoize.rb:40:in `fib'
from memoize.rb:56:in `fib'
from memoize.rb:40:in `fib'
from memoize.rb:56:in `fib'
from memoize.rb:40:in `fib'
from memoize.rb:56:in `fib'
... 539 levels...
from c:/ruby184/lib/ruby/1.8/benchmark.rb:293:in `measure'
from c:/ruby184/lib/ruby/1.8/benchmark.rb:261:in `bmbm'
from c:/ruby184/lib/ruby/1.8/benchmark.rb:259:in `bmbm'
from memoize.rb:97

Regards,

Dan
 
M

Mauricio Fernandez

Looks interesting but I get this with windows xp using his sample code:

Rehearsal ----------------------------------------------
period 1 memoize.rb:39:in `has_key?': stack level too deep
(SystemStackError)
from memoize.rb:39:in `fib'
from memoize.rb:56:in `fib'
from memoize.rb:40:in `fib'
from memoize.rb:56:in `fib'
from memoize.rb:40:in `fib'
from memoize.rb:56:in `fib'
from memoize.rb:40:in `fib'
from memoize.rb:56:in `fib'
... 539 levels...

Your stack is too small; fib(500) goes 1000 levels deep. Change
those examples to fib(100) etc.:

batsman@tux-chan:/tmp$ ruby memoize.rb
Rehearsal ---------------------------------------------
period 1 memoize.rb:54:in `fib': let's see how deep we get (RuntimeError)
from memoize.rb:39:in `fib'
from memoize.rb:55:in `fib'
from memoize.rb:39:in `fib'
from memoize.rb:55:in `fib'
from memoize.rb:39:in `fib'
from memoize.rb:55:in `fib'
from memoize.rb:39:in `fib'
from memoize.rb:55:in `fib'
... 993 levels...
from /home/batsman/usr/lib/ruby/1.8/benchmark.rb:293:in `measure'
from /home/batsman/usr/lib/ruby/1.8/benchmark.rb:261:in `bmbm'
from /home/batsman/usr/lib/ruby/1.8/benchmark.rb:259:in `bmbm'
from memoize.rb:96

given

batsman@tux-chan:/tmp$ diff -u memoize.rb.mod memoize.rb
--- memoize.rb.mod 2006-01-02 21:38:18.000000000 +0100
+++ memoize.rb 2006-01-02 21:46:10.000000000 +0100
@@ -51,7 +51,7 @@

if __FILE__ == $0
def fib(n)
- return n if n < 2
+ raise "let's see how deep we get" if n < 2
fib(n-1) + fib(n-2)
end

@@ -97,10 +97,10 @@
dummy1 = Dummy1.new
dummy2 = Dummy2.new
bm.report("period 1") { runs.times{ fib 500 } }
- bm.report("period 10"){ runs.times{ fib3 500 } }
- bm.report("at_exit") { runs.times{ fib2 500 } }
- bm.report("at_exit''") { runs.times{ dummy1.fib 500 } }
- bm.report("at_exit'''") { runs.times{ dummy2.fib 500 }; runs = 100000}
+ #bm.report("period 10"){ runs.times{ fib3 500 } }
+ #bm.report("at_exit") { runs.times{ fib2 500 } }
+ #bm.report("at_exit''") { runs.times{ dummy1.fib 500 } }
+ #bm.report("at_exit'''") { runs.times{ dummy2.fib 500 }; runs = 100000}
end

end
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,982
Messages
2,570,189
Members
46,734
Latest member
manin

Latest Threads

Top