Jamey Cribbs
Mongoose is a database management system written in Ruby. It has an
ActiveRecord-like interface, uses Skiplists for its indexing, and
Marshal for its data serialization. I named it Mongoose, because, like
Rudyard Kipling's Rikki-Tikki-Tavi, my aim is for it to be small, quick,
and friendly.
You can download it from: http://rubyforge.org/projects/mongoose/
*Credits*
-------------
Thanks to Logan Capaldo for letting me steal a lot of the code from
KirbyRecord.
Thanks to Ezra Zygmuntowicz and Fabien Franzen, whose ez_where Rails
plugin provided much of the inspiration for the query language.
Thanks to everyone who gave me feedback on KirbyBase. I have tried to
put all the lessons learned from developing that library to good use here.
*Features*
---------------
* Pure Ruby, with no external dependencies.
* ActiveRecord-like interface.
* Fast queries on indexed fields (up to 10x faster than KirbyBase).
Indexes are Skiplists, which are just plain fun to play around with.
* Not an in-memory database. Data is only read in from disk when needed
and changes are immediately written out to disk.
* In-memory indexes are initialized from dedicated index files, rather
than rebuilt from scratch upon database initialization (like KirbyBase
does). This can greatly reduce startup times.
* Supports any data type that Marshal supports.
* Table relations supported via has_one, has_many.
*Why?*
-----------
Well, I started to look into performance improvements for KirbyBase.
One thing I noticed was that Ruby takes a comparatively long time
converting strings to native data types like Integer, Time, etc. Since
KirbyBase stores its records as strings, returning a large result set
could take a long time. I found that if I Marshaled records before I
wrote them to disk, a subsequent read of those records was significantly
faster.
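
To make the difference concrete, here is a minimal sketch of the two storage styles. The record and its field names are made up for illustration, and the pipe-delimited text format is only a stand-in for string-based storage, not KirbyBase's actual file layout. The point is simply that the text version has to convert every field back to its native type, while the Marshal version does not.

```ruby
require "time"

# A hypothetical record; the field names are illustrative only.
record = { id: 42, name: "P-51", speed: 437, first_flight: Time.at(0).utc }

# String-based storage: every field is written as text...
line = [
  record[:id],
  record[:name],
  record[:speed],
  record[:first_flight].iso8601
].join("|")

# ...and must be parsed back into its native type on every read.
id, name, speed, first_flight = line.split("|")
from_text = {
  id: Integer(id),
  name: name,
  speed: Integer(speed),
  first_flight: Time.iso8601(first_flight)
}

# Marshal-based storage: the types survive the round trip with no parsing code.
blob = Marshal.dump(record)
from_marshal = Marshal.load(blob)
```

Both paths give back an equal record here; the Marshal path just skips the per-field conversions, which is where the time went on large result sets.
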
About the same time, I read a paper about skiplists. A skiplist is a
data structure that is relatively simple (compared to, say, a B-tree) to
understand and implement in code. I was able to take the pseudo-code in
the paper and implement a Ruby version in a couple of hours. Skiplists
are pretty fast and since they are pretty easy to understand, I think
there is good potential to tweak them. So, I wanted to try using
skiplists for indexing rather than KirbyBase's array-based indexes.
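
For readers who have not met skiplists before, here is a minimal sketch of one in Ruby. It is not Mongoose's index code; the class name, method names, and the MAX_LEVEL and 0.5 promotion probability are all assumptions chosen for illustration. Each node carries an array of forward pointers, and a search drops down a level whenever the next key would overshoot.

```ruby
# Minimal skiplist sketch: keys are kept in sorted order, one value per key.
class SkipList
  MAX_LEVEL = 16 # assumed cap on node height

  Node = Struct.new(:key, :value, :forward)

  def initialize
    @head = Node.new(nil, nil, Array.new(MAX_LEVEL))
    @level = 1
  end

  # Insert a key, or overwrite the value if the key already exists.
  def store(key, value)
    # update[i] is the last node at level i that precedes the new key.
    update = Array.new(MAX_LEVEL, @head)
    node = @head
    (@level - 1).downto(0) do |i|
      node = node.forward[i] while node.forward[i] && node.forward[i].key < key
      update[i] = node
    end

    candidate = node.forward[0]
    if candidate && candidate.key == key
      candidate.value = value
      return value
    end

    height = random_height
    @level = height if height > @level
    fresh = Node.new(key, value, Array.new(height))
    height.times do |i|
      fresh.forward[i] = update[i].forward[i]
      update[i].forward[i] = fresh
    end
    value
  end

  # Return the value stored under key, or nil if the key is absent.
  def fetch(key)
    node = @head
    (@level - 1).downto(0) do |i|
      node = node.forward[i] while node.forward[i] && node.forward[i].key < key
    end
    node = node.forward[0]
    node && node.key == key ? node.value : nil
  end

  # Walk the bottom level, which links every node in key order.
  def keys
    out = []
    node = @head.forward[0]
    while node
      out << node.key
      node = node.forward[0]
    end
    out
  end

  private

  # Flip a coin until it comes up tails; the number of heads sets the height.
  def random_height
    height = 1
    height += 1 while height < MAX_LEVEL && rand < 0.5
    height
  end
end

index = SkipList.new
[30, 10, 20].each { |k| index.store(k, "rec#{k}") }
index.keys      # => [10, 20, 30]
index.fetch(20) # => "rec20"
```

Because the bottom level is an ordinary sorted linked list, range scans come for free, which is part of what makes the structure attractive for an index.
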
I started to retrofit both the Marshal serialization and Skiplists into
KirbyBase, but quickly found that it was going to be more work than just
starting over from scratch with a new design. Besides, I didn't want to
radically change KirbyBase and piss off the current user base (both of
you know who you are).
So, I started from scratch. This also gave me the opportunity to make
two other major changes. First of all, I wanted to keep the query
language as close to KirbyBase's as possible (i.e. Ruby blocks), but I
wanted more control over the query expression. I took the opportunity
to borrow a lot of ideas from Ezra's ez_where plugin. I think this
will give me the capability down the road to tweak the query engine,
based on the query itself.
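
The general idea of "more control over the query expression" can be sketched as follows. Every name here (Clause, Column, QueryBuilder, find_all, and the q[:column] syntax) is hypothetical and is not Mongoose's actual query API. Instead of running an opaque block against each record, the block is run once against a builder whose comparison operators return data describing the comparison.

```ruby
# Hypothetical names throughout; this is not Mongoose's actual query API.

# One captured comparison, e.g. speed > 400.
Clause = Struct.new(:column, :op, :value) do
  # Evaluate the clause against one record (a Hash in this sketch).
  def match?(record)
    record[column].public_send(op, value)
  end
end

# Stands in for a column inside the query block.
class Column
  def initialize(name, clauses)
    @name = name
    @clauses = clauses
  end

  # A comparison records a Clause instead of returning true or false.
  %i[== > < >= <=].each do |op|
    define_method(op) do |value|
      clause = Clause.new(@name, op, value)
      @clauses << clause
      clause
    end
  end
end

class QueryBuilder
  attr_reader :clauses

  def initialize
    @clauses = []
  end

  def [](name)
    Column.new(name, @clauses)
  end
end

# Run the block once to capture the clauses, then apply them to each record.
def find_all(records)
  builder = QueryBuilder.new
  yield builder
  records.select { |rec| builder.clauses.all? { |c| c.match?(rec) } }
end

planes = [
  { name: "P-51", speed: 437 },
  { name: "Spitfire", speed: 362 },
  { name: "Zero", speed: 331 }
]

fast = find_all(planes) { |q| q[:speed] > 400 }
```

Since the clauses are plain data by the time any record is touched, an engine built this way could look at them first, for example to notice that a clause names an indexed column and consult the index instead of scanning.
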
The second thing I changed was that I have finally seen the light about
ActiveRecord, so I stole pretty much all of Logan's KirbyRecord code to
give Mongoose an ActiveRecord-like API.
The end result of all of this (I hope) is a database management system
that is small, easy to use, and fast.
*What about KirbyBase?*