Erik Veenstra
Imagine you're building a CVS-like repository. A repository
has modules, a module has branches and a branch has _a lot of_
items (history!). This repository is handled by a standalone
server with the FTP protocol as "frontend".
In a naive implementation, you simply load all items (except
the contents of the items) into memory, so they are readily
available. In pure OO modeling theory, this is the usual
assumption: "All objects live in core."
In reality, you don't want to do that. You only want to load
the items of a branch when they are referred to (and optionally
unload them after a while). So we introduce Branch@loaded and
move the invocation of Branch#load from Branch#initialize to
each place in the code where Branch@items is used. *Each*
place, don't forget even one place! Sooner or later, you'll
forget one! It's too tricky... And bad coding...
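
A minimal sketch of that manual approach, to show how easily one
place gets forgotten. Everything in it is an assumption made for
illustration: the class name ManualBranch, the helper
ensure_loaded, the readers item_names and item_count, the
load_count counter and the two sample items.

```ruby
# Manual lazy loading: every reader of @items has to remember the guard.
class ManualBranch
  attr_reader :load_count

  def initialize
    @items = {}
    @loaded = false
    @load_count = 0   # only here to make the effect visible
  end

  def load
    # Stand-in for the expensive fill of @items.
    @items = { "README" => 1, "main.rb" => 2 }
    @loaded = true
    @load_count += 1
  end

  def ensure_loaded
    load unless @loaded
  end

  def item_names
    ensure_loaded
    @items.keys
  end

  def item_count
    # The guard was forgotten here, so a fresh branch reports 0 items.
    @items.size
  end
end
```

Calling item_count on a fresh ManualBranch returns 0, and it only
returns 2 once some other method happened to trigger the load.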
So I came up with this LazyLoad, a generic lazy-loading class.
We can initialize Branch@items to LazyLoad.new(self, :load,
:items) instead of Hash.new. Whenever this object is referred
to (e.g. with @items.keys), LazyLoad#method_missing is invoked.
This method invokes Branch#load, gets the object Branch@items
(which now refers to a filled Hash) and sends the original
message to this Branch@items. Nothing refers to this instance
of LazyLoad anymore, so it dies in peace.
I implemented LazyLoad (see below) and use it in a real
situation. Seems to work. The server starts really fast and the
user thinks that all branches are loaded.
I embedded the backend in the command-line tool as well. If
you use this tool to synchronize the local workset with the
repository, you usually want to load only *one* branch, not
all of them. The speed benefit is huge, whereas the impact on
the code is close to zero!
The code below demonstrates this theory: step 1 is the naive
implementation of Branch, step 2 is the enhanced implementation
of Branch and step 3 implements LazyLoad itself. (Steps 1 and 2
are just examples of the use of LazyLoad. They are not
complete.)
Comments? Ideas? Something I overlooked?
Regards,
Erik V. - http://www.erikveen.dds.nl/
----------------------------------------------------------------
# STEP 1, NAIVE IMPLEMENTATION
class Branch
  def initialize
    @items = {}
    load
  end

  def load
    @items = {}
    # Fill @items... EXPENSIVE, TIME CONSUMING, MEMORY HUNGRY!
  end
end
----------------------------------------------------------------
# STEP 2, INTRODUCING LAZYLOAD
class Branch
  def initialize
    @items = LazyLoad.new(self, :load, :items)
  end

  def load
    @items = {}
    # Fill @items... EXPENSIVE, TIME CONSUMING, MEMORY HUNGRY!
  end
end
----------------------------------------------------------------
# STEP 3, IMPLEMENTATION OF LAZYLOAD
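class LazyLoad
  def initialize(object, load_method, property)
    @object      = object
    @property    = property
    @load_method = load_method
  end

  # Any message that Object does not already answer lands here.
  def method_missing(method_name, *parms, &block)
    # Let the owner replace its property with the real object...
    @object.send(@load_method)
    # ...then forward the original message to that real object.
    target = @object.instance_variable_get("@#{@property}")
    target.send(method_name, *parms, &block)
  end
end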
----------------------------------------------------------------
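
To see the three steps working together, here is a small runnable
sketch. LazyLoad is the class from step 3 (reading the property
with instance_variable_get). DemoBranch, its readers item_names
and item_count, the load_count counter and the sample items are
hypothetical, added only to make the behaviour observable.

```ruby
class LazyLoad
  def initialize(object, load_method, property)
    @object = object
    @load_method = load_method
    @property = property
  end

  def method_missing(method_name, *parms, &block)
    @object.send(@load_method)
    target = @object.instance_variable_get("@#{@property}")
    target.send(method_name, *parms, &block)
  end
end

class DemoBranch
  attr_reader :load_count

  def initialize
    @load_count = 0
    @items = LazyLoad.new(self, :load, :items)   # nothing loaded yet
  end

  def load
    @load_count += 1
    # Stand-in for the expensive fill; replaces the LazyLoad proxy.
    @items = { "README" => 1, "main.rb" => 2 }
  end

  # No guard needed in any reader: the proxy triggers the load.
  def item_names
    @items.keys
  end

  def item_count
    @items.size
  end
end

branch = DemoBranch.new
puts branch.load_count          # 0, construction is cheap
puts branch.item_names.inspect  # ["README", "main.rb"], first use loads
puts branch.item_count          # 2, answered by the real Hash
puts branch.load_count          # 1, the load ran exactly once
```

One thing this sketch does not cover: messages that Object itself
already answers (inspect, to_s, ==, class, ...) never reach
method_missing, so the proxy answers those itself without
triggering the load.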