Serialization - filesystem or DBMS


Antimon

Hi,
I'm working on a game server (not a huge project) and I need to save the
whole world state somewhere.

The game world will be object oriented and there can be many different
class types, so I can't just create 4-5 tables in a database and store
the information. So I wanted to have an "ISerializable" interface with two
methods, "Serialize" and "Deserialize": Serialize will return that
object's byte[] representation and Deserialize will use a byte[] to
construct the same object from scratch.
All "ISerializable" objects will be mapped in a static
"Hashtable<Serial, ISerializable>", where "Serial" will be an id number
class assigned to each ISerializable on construction. That way I can write
Serial ids as object pointers during serialization.
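A minimal sketch of that design might look like the following. Note that ISerializable, Serial, and World are the poster's proposed names, not an existing API, and the details here are illustrative assumptions:

```java
import java.util.Hashtable;

// Sketch of the proposed design; ISerializable and Serial are the
// poster's own names, not part of any existing library.
interface ISerializable {
    byte[] serialize();
    void deserialize(byte[] data);
}

// Id number class assigned to each ISerializable on construction.
final class Serial {
    private static int next = 1;
    final int value;
    Serial() { value = next++; }
    @Override public int hashCode() { return value; }
    @Override public boolean equals(Object o) {
        return o instanceof Serial && ((Serial) o).value == value;
    }
}

// Global map from Serial to live object, so serialized data can store
// Serial values in place of raw object references.
class World {
    static final Hashtable<Serial, ISerializable> OBJECTS = new Hashtable<>();
}
```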

This approach seemed fine to me, so I'm thinking about
implementing it on the filesystem or an RDBMS. If I use an RDBMS, there will
be 2 tables :) One for class types, one for object instances. I will be
storing binary data (returned from the Serialize method) in the database.
It seems there's no point in using an RDBMS for something like
this, but if I use the filesystem, I will need to suspend the server every 30
minutes or so and dump the whole world to a file. So a crash may cause
a timewarp. If I use an RDBMS, I can have a continuous saving mechanism:
I can place modified objects into a queue, and a thread can write them
to the database continuously. So if the server goes offline for some
reason, I would only lose the data in the serialization buffer, which is
nothing compared to 30 minutes. An RDBMS would also allow
splitting the server across 2 machines (one for the server application, one for
the RDBMS layer) without any effort.
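The queue-plus-writer-thread idea can be sketched with a BlockingQueue. This is only an outline of the described mechanism; the persist step is a stub, since it could target either a file or a database:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the continuous-save idea: game threads enqueue the
// serialized form of modified objects, and one writer thread drains
// the queue in the background. On a crash, only whatever is still
// sitting in the queue is lost.
class SaveWorker implements Runnable {
    private final BlockingQueue<byte[]> dirty = new LinkedBlockingQueue<>();

    void markDirty(byte[] serialized) { dirty.add(serialized); }

    int pending() { return dirty.size(); }

    @Override public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                byte[] blob = dirty.take(); // blocks until work arrives
                persist(blob);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // clean shutdown
        }
    }

    // Stub: write the blob to the RDBMS or append it to a file here.
    void persist(byte[] blob) {}
}
```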

On the other hand, the filesystem mechanism would be very easy to implement
and maintain.

All suggestions are welcome :) Please help me decide what to do.
 

zero

I'm working on a game server (not a huge project) and I need to save the
whole world state somewhere.
[...]
It seems there's no point in using an RDBMS for something like
this, but if I use the filesystem, I will need to suspend the server every 30
minutes or so and dump the whole world to a file. So a crash may cause
a timewarp. If I use an RDBMS, I can have a continuous saving mechanism.
[...]
On the other hand, the filesystem mechanism would be very easy to implement
and maintain.

It seems like you already have the pros and cons thought out quite well.
An rdbms seems like a heavy tool for this, but for performance it may be
necessary. On the other hand, maybe you could use a separate thread that
continuously (or at least sooner than every 30 minutes) saves the world
state to file, without affecting performance much. As for the difficulty
in implementing and maintaining, it just depends on what you're used to.
Using an rdbms in Java isn't really that much more complicated than using
files. I think both options are about equal, so it doesn't really matter
which you choose. Just make a decision, and stick with it.
 

Roedy Green

All "ISerializable" objects will be mapped in a static
"Hashtable<Serial, ISerializable>", where "Serial" will be an id number
class assigned to each ISerializable on construction. That way I can write
Serial ids as object pointers during serialization.

Why not just use Java's built-in serialisation? You do one I/O and
Java chases dependent objects for you.
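For reference, this is what the one-I/O version looks like: a single writeObject call walks the whole object graph, provided every class in it implements java.io.Serializable. The Player class here is purely illustrative, not from the poster's code:

```java
import java.io.*;
import java.util.ArrayList;
import java.util.List;

class Player implements Serializable {
    String name;
    List<String> inventory = new ArrayList<>();
    Player(String name) { this.name = name; }
}

class BuiltInDemo {
    // One writeObject call serializes the root object and everything
    // reachable from it; no hand-rolled pointer bookkeeping needed.
    static byte[] save(Object root) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(root);
        }
        return bos.toByteArray();
    }

    static Object load(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }
}
```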
 

Antimon

I'm not familiar with Java's built-in serialization. I was thinking
about using the "Externalizable" interface, but I'm worried about error
handling.
I mean, if I just remove a class type from the server, I can check for the
constructor and ask whether to ignore that type during deserialization.
"Serializable", on the other hand, produces object instance
data that is too big. And there's not much difference between Externalizable and
my approach: writeExternal and readExternal methods would need to be
implemented either way. I would just need to take care of pointers and such,
but I would gain full control over serialization.
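For comparison, the Externalizable route looks roughly like this (Npc is an illustrative class): you write each field by hand, which keeps the stream compact, and the class must have a public no-arg constructor:

```java
import java.io.*;

class Npc implements Externalizable {
    String name;
    int x, y;

    public Npc() {}  // required: Externalizable needs a public no-arg ctor
    Npc(String name, int x, int y) { this.name = name; this.x = x; this.y = y; }

    @Override public void writeExternal(ObjectOutput out) throws IOException {
        // Only what you explicitly write here goes into the stream.
        out.writeUTF(name);
        out.writeInt(x);
        out.writeInt(y);
    }

    @Override public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
        name = in.readUTF();
        x = in.readInt();
        y = in.readInt();
    }
}
```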
 

Dimitri Maziuk

Antimon sez:
[...]
It seems there's no point in using an RDBMS for something like
this, but if I use the filesystem, I will need to suspend the server every 30
minutes or so and dump the whole world to a file. So a crash may cause
a timewarp. If I use an RDBMS, I can have a continuous saving mechanism.

I think the main advantage of an RDBMS would be transactions: a crash may
cause a time warp, but the world will be restored to a consistent state.
If you use the filesystem, you'll have to deal with the possibility of a
crash during a file write.

There are a couple of persistence packages you should look into, like
db4o and Hibernate.

Dima
 

Roedy Green

I think the main advantage of an RDBMS would be transactions: a crash may
cause a time warp, but the world will be restored to a consistent state.
If you use the filesystem, you'll have to deal with the possibility of a
crash during a file write.

You can log transactions without a DBMS; you must commit every x
seconds or so to make sure they are fully written to disk. This is
considerably faster than all the before-looks and after-looks you
might do for a database transaction. The disadvantage is that you have to
replay the updating transactions, including the calculations, against
an intact database backup, which can take quite a while before you
have recovered. This was the technique I used back in the 70s for
central banking, on computers with less RAM and CPU power than today's
desktops.
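A roll-your-own log along those lines is, in modern terms, just an append-only file that gets forced to disk every x seconds. This is only a sketch of the idea; the class and record format are illustrative:

```java
import java.io.*;

// Append-only transaction log: append each update as it happens, then
// periodically flush and sync so committed records survive a crash.
class TxLog {
    private final FileOutputStream file;
    private final DataOutputStream out;

    TxLog(String path) throws IOException {
        file = new FileOutputStream(path, true); // append mode
        out = new DataOutputStream(new BufferedOutputStream(file));
    }

    void append(String record) throws IOException {
        out.writeUTF(record);
    }

    // Call every x seconds: push buffers out and force data to the disk.
    void commit() throws IOException {
        out.flush();
        file.getFD().sync();
    }

    void close() throws IOException { out.close(); }
}
```

On recovery you would replay the records in order against the last intact world dump, which is exactly the replay cost described above.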
 

Richard Wheeldon

Antimon said:
I'm not familiar with java's built-in Serialization.

http://mindprod.com/jgloss/serialization.html
Thought I'd save Roedy the trouble :)
"Serializable" on the other hand, produces too big object instance
data.

Sounds like you need to take a look at the "transient" keyword; Java's
serialization should take care of most of this. You could also look at
XML-based serialization, which is very bloated but much easier to
repair manually if required.
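The transient keyword marks a field to be skipped by Java's serialization; on deserialization it comes back as its default value (null, 0, false). The Monster class here is purely illustrative:

```java
import java.io.*;

class Monster implements Serializable {
    String name;
    // Derived/cache data: excluded from the stream, recomputed on load.
    transient int cachedPathCost;

    Monster(String name, int cachedPathCost) {
        this.name = name;
        this.cachedPathCost = cachedPathCost;
    }
}
```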

Alternatively, if you want the RDBMS features but don't want the bloat,
have you tried looking at an embedded database such as HSQLDB?

Richard
 

Hiran Chaudhuri

Roedy Green said:
Why not just use Java's built-in serialisaton? you do one i/o and
Java chases dependent objects for you.

If I understood zero's point, he thinks of an RDBMS for transactionality.
That is one point, but then there is also the overhead of actually mapping
the data to the database.

Maybe some other solution might come in handy. How about an object-oriented
DBMS? Or XML, whether in the filesystem or in a database...

Hiran
 

Roedy Green

The disadvantage is you have to
replay the updating transactions, including the calculations, against
an intact database backup. This can take quite a while before you
have recovered. This was the technique I used back in the 70s for
central banking on computers with less RAM and CPU power than today's desktops.
The other disadvantage is that you must take your database offline
periodically for backup. For most businesses you can take your
website down for maintenance (providing lookup only) for long enough to
copy the flat files, something considerably quicker than any sort of
record-by-record backup.
 

isamura

"Roedy Green" wrote ...
: On Sun, 04 Dec 2005 22:30:07 GMT, Roedy Green
: indirectly quoted someone who said :
:
: >The disadvantage is you have to
: >replay the updating transactions, including the calculations, against
: >an intact database backup. This can take quite a while before you
: >have recovered. This was the technique I used back in the 70s for
: >central banking on computers with les
: the other disadvantage is you must take your database offline
: periodically for backup. For most businesses you can take your
: website down for maintenance providing only lookup for long enough to
: copy the flat files, something considerably quicker than any sort of
: record by record backup.
:
This is not necessarily true if you use MySQL. You can set up slaves to mirror the master DB and get
instant backup copies. You can even go further by stopping a slave and backing that up. Perhaps other
RDBMSs also have this capability.

..k
 

Roedy Green

For most businesses you can take your
: website down for maintenance providing only lookup for long enough to
: copy the flat files, something considerably quicker than any sort of
: record by record backup.
:
This is not necessarily true if you use MySQL. You can setup slaves to mirror the master db and get
instant backup copies. You can even go further by stopping a slave and back that up. Perhaps other
RDBMS also have this capability.

We are differing on the meaning of quicker. The whole point of
using an advanced SQL engine is that it lets you back up without shutting
down; you can't get much quicker than an instantaneous backup. But in
another sense, e.g. in terms of total CPU cycles or total number of
I/Os, such a record-by-record backup has much more total overhead
than shutting down and backing up a flat file a meg a pop.
 

Dimitri Maziuk

Roedy Green sez:
You can log transactions without a DBMS; you must commit every x
seconds or so to make sure they are fully written to disk. This is
considerably faster than all the before looks and after looks you
might do for a database transaction. The disadvantage is you have to
replay the updating transactions, including the calculations, against
an intact database backup. This can take quite a while before you
have recovered. This was the technique I used back in the 70s for
central banking on computers with less ram and CPU power that today's
desktops.

How about the granularity of your saves: did you have lots of small files
or one huge one? How many copies of each? Did you run round-robin on
2 files, or did you create a new savefile every time? If it was one file,
how big was it and how long did it take to write? If it was lots of
little ones, what did you do to avoid name clashes etc. when creating
new ones?

Sure you can log transactions without a DBMS, if you want to write all
that code.

Dima
 

Roedy Green

How about the granularity of your saves: did you have lots of small files
or one huge one? How many copies of each? Did you run round-robin on
2 files, or did you create a new savefile every time? If it was one file,
how big was it and how long did it take to write? If it was lots of
little ones, what did you do to avoid name clashes etc. when creating
new ones?
Back in the 70s you typically logged to mag tape. This was your way of
being sure you had captured everything, no matter how terrible the
crash.
 

Roedy Green

Back in the 70s you typically logged to mag tape. This was your way of
being sure you had captured everything, no matter how terrible the
crash.

Typically back then you assigned two tape drives that automatically
toggled back and forth. As long as the operator got the new empty
tape mounted in time, there was no delay.
 

Larry Coon

Roedy said:
the other disadvantage is you must take your database offline
periodically for backup.

This is definitely not true for Sybase, and I assume it's
also false for most/all modern DBMSs.


Larry Coon
University of California
 

Roedy Green

This is definitely not true for Sybase, and I assume it's
also false for most/all modern dbms's.

But I was not describing an SQL database; I was talking about
roll-your-own transaction replay.
 
