Initialising a hash

Uri Guttman

JB> And yet there are plenty of papers written on automatic memoization,
JB> which would be a pleonasm if it was that simple.

JB> This technique is called “memoization.” The manual version of the
JB> technique generally involves building lookup tables and is a standard
JB> technique in dynamic programming. This process, however, is often
JB> tedious and time-consuming, and it requires significant
JB> modification and debugging of existing code. An “automatic” memoization
JB> facility is one in which existing functions can be programmatically
JB> changed to cache previously seen results in a table,

JB> http://techdigest.jhuapl.edu/td/td1802/hall.pdf p254-255.

JB> (unless I misunderstand automatic memoization)

do you know there is a memoize module on cpan? it does that very same
thing. it is actually fairly simple code. it creates a new sub that uses
a private hash to store previously calculated values. the index is the
fixed set of args into the function. it replaces the regular sub in the
symbol table with the new sub. the new sub keeps the code ref to the
original and calls it when the hash lookup fails. then it stores the new
result in the hash. not complex at all IMO. and it doesn't require any
debugging of existing code - it is a drop in speedup for the right type
of call - trading ram for speed.
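
roughly, the mechanism looks like this (a minimal sketch for
illustration, not the actual Memoize source - the sub name
memoize_by_name and the join-on-$; key are simplifications of mine):

use strict;
use warnings;

# sketch of the wrap-and-replace trick: build a new sub with a private
# cache and install it over the old name in the symbol table
sub memoize_by_name {
    my ($name) = @_;
    no strict 'refs';
    my $orig = \&{$name};          # keep a code ref to the original
    my %cache;                     # private hash, closed over
    no warnings 'redefine';
    *{$name} = sub {
        my $key = join $;, @_;     # naive key: flat scalar args only
        $cache{$key} = $orig->(@_) unless exists $cache{$key};
        return $cache{$key};
    };
}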

uri
 
Janek Schleicher

On 13.02.2014 12:17, Rainer Weikusat wrote:
This means a hash table has to
be used instead of an array and that the argument list needs to be
transformed into a suitable key.

Well, for an HTTP status code lookup, an array lookup would be the most
time-efficient (short of a C solution).
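
For instance (a minimal sketch; the status texts are made up here just
to illustrate the direct indexing):

my @status_text;
@status_text[200, 301, 404, 500] =
    ('OK', 'Moved Permanently', 'Not Found', 'Internal Server Error');

print $status_text[404], "\n";   # direct index, no key hashing involved
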
This is non-trivial because of the
possibility that arguments with the same 'effective content' might
stringify differently, eg, when passing [1,2,3] as an argument, and that
semantically equivalent (in the sense of producing identical results)
argument lists might be different.

Well, that might be the reason why we encapsulate foreign procedure calls.


Greetings, Janek
and be glad to reinvent everything anew, only to find it doesn't work
 
John Bokma

Uri Guttman said:
JB> And yet there are plenty of papers written on automatic memoization,
JB> which would be a pleonasm if it was that simple.

JB> This technique is called “memoization.” The manual version of the
JB> technique generally involves building lookup tables and is a standard
JB> technique in dynamic programming. This process, however, is often
JB> tedious and time-consuming, and it requires significant
JB> modification and debugging of existing code. An “automatic” memoization
JB> facility is one in which existing functions can be programmatically
JB> changed to cache previously seen results in a table,

JB> http://techdigest.jhuapl.edu/td/td1802/hall.pdf p254-255.

JB> (unless I misunderstand automatic memoization)

do you know there is a memoize module on cpan? it does that very same

Yes, I do. The discussion is more about what's memoization, what's
automatic memoization, and what's manual memoization. ;-)
 
Rainer Weikusat

John Bokma said:
[...]
(And yes, Rainer is right that memoisation usually refers to adding
caching in front of an existing naively-implemented function to make it
more efficient.)

And yet there are plenty of papers written on automatic memoization,
which would be a pleonasm if it was that simple.

This technique is called “memoization.” The manual version of the
technique generally involves building lookup tables and is a standard
technique in dynamic programming. This process, however, is often
tedious and time-consuming, and it requires significant
modification and debugging of existing code. An “automatic” memoization
facility is one in which existing functions can be programmatically
changed to cache previously seen results in a table,

http://techdigest.jhuapl.edu/td/td1802/hall.pdf p254-255.

(unless I misunderstand automatic memoization)

I think you misunderstand the agenda of people who are (for one reason
or another) wedded to C++ in a way which not only cannot ever end in a
divorce but where adultery is not only frowned upon but unthinkable. I
think it runs roughly like this: 'Memoization' is sort of a neat term for
making stupidly written code run faster. Since we are the C++ and
resistance is futile, it has to be assimilated. Yet, our language cannot
do this, not even in its most "dynamic" imitations of ancient Java
constructs (the seriously bad joke called 'local classes' which can be
used as a poor man's closures). Holland is in distress! What can we do?
Sophistry to the rescue! We can write functions using caches, so let's
rename "using a cache" as "memoization". We can write the code that way
ourselves; now we have "manual memoization". And we can automate
this compile-time source-code transformation by abusing our Powerful(!)
template system in another way nobody except the author understands (for
exactly 3 minutes)! Phew, that was close, but rejoice, all ye true
believers, we have it now!!

Nevertheless, the original idea was based on having a recursively defined
factorial function named fact.

"To endow fact with the "memo" facility [...] we merely write

newmemo(fact, 100, nonop=) -> fact

This replaces the old definition of fact with a new one which
has a memo apparatus attached such that the rote has an upper,
fixed limit of 100 entries and uses the "=" relation for lookup
purposes."

That's the core of the original description of the idea: it refers to
performing an operation on an existing function which yields, as a
result, a new function of the same name that uses a cache to hold
intermediate results. In case nothing suitable is found in the cache, it
invokes the original function which, in turn, invokes the memoized
function recursively, because that's what's going to happen when the
function presently named fact is being called.

And that's exactly what the Memoize module does, which is roughly the

my @cache;
my $orig = \&fib;
no warnings "redefine";
*fib = sub {
    my ($n) = @_;
    $cache[$n] ||= $orig->($n);
};

from one of Ben Morrow's postings.
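
Spelled out as a complete script (my own sketch built around that
snippet; fib is the usual naive recursive version), the recursion point
becomes visible: once *fib is overwritten, the fib() calls inside the
original resolve through the cache-bearing replacement:

use strict;
use warnings;

# naive recursive Fibonacci: exponential running time on its own
sub fib {
    my ($n) = @_;
    $n < 3 and return 1;
    return fib($n - 1) + fib($n - 2);
}

my @cache;
my $orig = \&fib;
{
    no warnings 'redefine';
    *fib = sub {
        my ($n) = @_;
        # recursive calls inside $orig land here, so every
        # intermediate result is computed exactly once
        $cache[$n] ||= $orig->($n);
    };
}

print fib(40), "\n";   # returns quickly; unmemoized this takes ages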
 
Rainer Weikusat

Janek Schleicher said:
On 13.02.2014 12:17, Rainer Weikusat wrote:
This means a hash table has to
be used instead of an array and that the argument list needs to be
transformed into a suitable key.

Well, for an HTTP status code lookup, an array lookup would be the most
time-efficient (short of a C solution).
This is non-trivial because of the
possibility that arguments with the same 'effective content' might
stringify differently, eg, when passing [1,2,3] as an argument, and that
semantically equivalent (in the sense of producing identical results)
argument lists might be different.

Well, that might be the reason why we encapsulate foreign procedure calls.

FYI: Attaching these two statements in this way to an explanation of why
'general memoization', as performed by the Memoize module, turns out to
be a lot more complicated than contrived examples restricting
themselves to functions with single integer arguments isn't exactly
suitable to convey whatever meaning you meant to convey (if any).
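
To make the key problem concrete: joining @_ into a string only works
for flat scalar arguments; as soon as references are involved, two
calls with the same effective content stringify to different keys. One
way around that (a sketch of mine; the real Memoize module addresses
this by letting you pass a NORMALIZER routine) is to serialize the
argument list canonically:

use JSON::PP;   # core module

# canonical => hash keys are sorted, so equal content yields equal
# keys; works for plain scalars/arrays/hashes, not code refs or objects
my $json = JSON::PP->new->canonical;

sub args_to_key {
    return $json->encode(\@_);
}

# args_to_key(1, [2, 3]) eq args_to_key(1, [2, 3]) holds, although the
# two array refs would stringify to different ARRAY(0x...) addresses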
 
John Bokma

Rainer Weikusat said:
John Bokma said:
Ben Morrow said:
This is a modified function which uses a cache:

my @cache;
sub fib {
    my ($n) = @_;
    $n < 3 and return 1;
    $cache[$n] ||= fib($n - 1) + fib($n - 2);
}

In "An Introduction to Algorithms" this is called memoization
in the chapter on lineair programming. Due to the limits of the
Pascal-like pseudo code used in the book that's all that can be
done. So, I would call this (now) manually memoization.

Assuming the following example:

sub fob {
    $_[0] < 3 ? 1 : fob($_[0] - 2) + fob($_[0] - 1);
}

is fib now a memoized version of fob? And even if it was

sub fub {
    my ($n) = @_;
    $n < 3 and return 1;
    fub($n - 1) + fub($n - 2);
}

would fib be a memoized version of fub?

In "An introduction to algoritms" a /new/ function is created (which
calls an additional function that does the caching). This new function is
called MEMOIZED-MATRIX-CHAIN.

So the authors call a new function with added caching a memoized version
of another function (given earlier), which gives me the impression that
if for all x f(x) = g(x), with f(x) having caching and g(x) not, then f
is called a memoized version of g. Or: if the caller can't detect a
difference between calling f and g other than speed (and memory usage),
is there a difference?

Most likely this also depends on the context of the programming language
itself. There are languages in which programmatically changing f to a
memoized version of f is not possible other than by rewriting the source
(either manually or with a dedicated program). So either we have to say
that in such languages memoization is impossible, or we call any function
that caches a memoized version, whether it was implemented as such
manually or by automatic source rewriting. Maybe even when a non-caching
version would be a contrived one?

[..]
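
For the record: Perl is one of the languages where the programmatic
change is possible, and with the core Memoize module the whole exercise
is two lines (fib as in Ben Morrow's example):

use Memoize;

sub fib {
    my ($n) = @_;
    $n < 3 and return 1;
    return fib($n - 1) + fib($n - 2);
}

memoize('fib');        # installs a caching wrapper over fib
print fib(50), "\n";   # finishes instantly; plain recursion would not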
 
John Bokma

Rainer Weikusat said:
That's the core of the original description of the idea: it refers to
performing an operation on an existing function which yields, as a
result, a new function of the same name that uses a cache to hold
intermediate results. In case nothing suitable is found in the cache, it
invokes the original function which, in turn, invokes the memoized
function recursively, because that's what's going to happen when the
function presently named fact is being called.

The easiest thing to do seems to be to become "real people" and from now
on talk about caching only. ;-)
 
