concurrent DBM for Java?

Jim Janney

I need to maintain a data base of small text snippets keyed by
arbitrary strings, without the overhead of a full SQL relational
database. We will have several people putting data into it so it
needs to support concurrent access over a network.

JDBM catches my eye (http://jdbm.sourceforge.net/) but I can't find
any place that says whether it supports concurrent access or not.
Berkeley DB would be perfect but would need a license from Oracle
since this is a commercial product. Gdbm apparently does not support
concurrent access. TDB (http://sourceforge.net/projects/tdb/) looks
good but would require a JNI interface and probably some work to get
it to compile under Windows.

Any recommendations? I can do JNI if necessary but would be just as
happy not to need to. I need something I can get going quickly --
management wants it yesterday, as usual...
 
Arne Vajhøj

Jim Janney said:
I need to maintain a data base of small text snippets keyed by
arbitrary strings, without the overhead of a full SQL relational
database. We will have several people putting data into it so it
needs to support concurrent access over a network.

JDBM catches my eye (http://jdbm.sourceforge.net/) but I can't find
any place that says whether it supports concurrent access or not.
Berkeley DB would be perfect but would need a license from Oracle
since this is a commercial product. Gdbm apparently does not support
concurrent access. TDB (http://sourceforge.net/projects/tdb/) looks
good but would require a JNI interface and probably some work to get
it to compile under Windows.

Any recommendations? I can do JNI if necessary but would be just as
happy not to need to. I need something I can get going quickly --
management wants it yesterday, as usual...

Why are you so sure that a full SQL relational database has
too much overhead?

They work. They are widely used. They are optimized a lot.
They support ACID.

Choosing some rarely used ISAM package that may have
concurrency problems and other bugs, and may not even be optimized
very well, seems like a high risk for little gain to me.

Sure, Google, Yahoo, Facebook etc. use NoSQL databases
today. But they can also afford to spend tens of millions
of dollars on development without blinking. Can you?

Arne
 
Jim Janney

Arne Vajhøj said:
Why are you so sure that a full SQL relational database has
too much overhead?

Five years of experience using a relational database in the
application for which this is intended. They're great for some
things, and we use it for those things. This is not one of them.

They work. They are widely used. They are optimized a lot.
They support ACID.

Choosing some rarely used ISAM package that may have
concurrency problems and other bugs, and may not even be optimized
very well, seems like a high risk for little gain to me.

Sure, Google, Yahoo, Facebook etc. use NoSQL databases
today. But they can also afford to spend tens of millions
of dollars on development without blinking. Can you?

If I had time to debate this I wouldn't have asked the question. I
was hoping someone might have a quick, constructive suggestion.
 
Arne Vajhøj

Jim Janney said:
Five years of experience using a relational database in the
application for which this is intended. They're great for some
things, and we use it for those things. This is not one of them.

And the simple solution of buying some more powerful
database hardware is not an option?

If I had time to debate this I wouldn't have asked the question. I
was hoping someone might have a quick, constructive suggestion.

The fact that using a relational database does not perform
well does not guarantee that a non-relational database will
perform well.

That is just wishful thinking.

If you ask questions for free in a public forum, then
you should be prepared for suggestions of approaches
other than the one you prefer.

Arne
 
Jim Janney

Arne Vajhøj said:
And the simple solution of buying some more powerful
database hardware is not an option?

Tell all of our customers they have to upgrade their network and buy a
faster AS/400? No, that's not going to fly.

The fact that using a relational database does not perform
well does not guarantee that a non-relational database will
perform well.

That is just wishful thinking.

If you ask questions for free in a public forum, then
you should be prepared for suggestions of approaches
other than the one you prefer.

In the abstract it is undoubtedly an interesting technical question.
I don't have time for it now.
 
Jim Janney

Spud said:
Google "key value stores". There are a bunch of them. Unfortunately
most aren't very far along yet, and some have way more overhead than I
think you want. But Tokyo Cabinet gets good reviews, as do a few
others.

I overlooked that one. Thank you.
 
Tom Anderson

Arne Vajhøj said:
And the simple solution of buying some more powerful database hardware
is not an option?

Are you blinking serious? Are you mental? In what universe is it simpler
to use an RDBMS and pay copious dollars for more hardware than to use a
key-value store, a kind of software which is *principally known* for being
simpler than an RDBMS?

The fact that using a relational database does not perform well does not
guarantee that a non-relational database will perform well.

That is just wishful thinking.

No, it's a reasonable hypothesis.

tom
 
Ken

Jim Janney said:
Tell all of our customers they have to upgrade their network and buy a
faster AS/400?  No, that's not going to fly.

You have an AS/400... I don't understand what the issue is... are you
trying to replace the AS/400?

If the AS/400 is fast enough but you think for some reason that SQL is
too slow... if it's an old dying 400 I could see that. But if you only
have a few hundred users and it's a halfway modern iSeries I couldn't
imagine them being able to stress it with SQL, unless they're day
traders!!!

Anyway, if the AS/400 is fast enough, what is wrong with its record
level access? I've never tried to use Java for record level access
(but I very much bet it's possible via JTOpen); I've only done it in RPG
(if you don't program in RPG, see SETLL and CHAIN), and even then I'd
rather use SQL than do record level access. The problem with it is if
you support more than one client. Different clients are going to have
different data needs, so an optimal set of CHAINs for one customer is
not going to be the same for another, and if this simple group of
strings ever has a dependency on data elsewhere... it's just so much
harder to manage than simply using transactions.
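
For what it's worth, record level access from Java should be possible
through JTOpen's KeyedFile class. Very roughly something like this
(untested sketch; the system, library, file and field names are all
made up):

import com.ibm.as400.access.AS400;
import com.ibm.as400.access.AS400File;
import com.ibm.as400.access.AS400FileRecordDescription;
import com.ibm.as400.access.KeyedFile;
import com.ibm.as400.access.Record;
import com.ibm.as400.access.RecordFormat;

public class SnippetLookup {
    public static void main(String[] args) throws Exception {
        // All names below are placeholders for the real system and file.
        AS400 system = new AS400("mysystem", "myuser", "mypassword");
        String path = "/QSYS.LIB/MYLIB.LIB/SNIPPETS.FILE/SNIPPETS.MBR";

        // Let the file describe its own record format.
        RecordFormat format =
            new AS400FileRecordDescription(system, path).retrieveRecordFormat()[0];

        KeyedFile file = new KeyedFile(system, path);
        file.setRecordFormat(format);
        file.open(AS400File.READ_ONLY, 0, AS400File.COMMIT_LOCK_LEVEL_NONE);
        try {
            // Keyed read: the moral equivalent of SETLL + CHAIN in RPG.
            Record record = file.read(new Object[] { "SOMEKEY" });
            if (record != null) {
                System.out.println(record.getField("SNIPTEXT"));
            }
        } finally {
            file.close();
            system.disconnectAllServices();
        }
    }
}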

I haven't done much research on Express-C (isn't it free?), but it might
have the same functionality, and you'd probably be more comfortable
with a flavor of DB2.
 
Jim Janney

Patrick May said:
Jim Janney said:
I need to maintain a data base of small text snippets keyed by
arbitrary strings, without the overhead of a full SQL relational
database. We will have several people putting data into it so it
needs to support concurrent access over a network. [ . . . ]
Berkeley DB would be perfect but would need a license from Oracle
since this is a commercial product.

BDB was my first thought when reading your requirements. It's fast
and trustworthy. How expensive is the commercial license? I suspect
it's worth the amount of risk it eliminates.

Their web page doesn't say. Apparently you tell them what you plan to
do with it and they offer you a price.
 
Jim Janney

Ken said:
You have an AS/400... I don't understand what the issue is... are you
trying to replace the AS/400?

If the AS/400 is fast enough but you think for some reason that SQL is
too slow... if it's an old dying 400 I could see that. But if you only
have a few hundred users and it's a halfway modern iSeries I couldn't
imagine them being able to stress it with SQL, unless they're day
traders!!!

Anyway, if the AS/400 is fast enough, what is wrong with its record
level access? I've never tried to use Java for record level access
(but I very much bet it's possible via JTOpen); I've only done it in RPG
(if you don't program in RPG, see SETLL and CHAIN), and even then I'd
rather use SQL than do record level access. The problem with it is if
you support more than one client. Different clients are going to have
different data needs, so an optimal set of CHAINs for one customer is
not going to be the same for another, and if this simple group of
strings ever has a dependency on data elsewhere... it's just so much
harder to manage than simply using transactions.

I haven't done much research on Express-C (isn't it free?), but it might
have the same functionality, and you'd probably be more comfortable
with a flavor of DB2.

Our customers have AS/400s, some of them dating back to 2001. Either
SQL or record level access would be more than fast enough if the code
ran directly on the 400, but when you connect over a network the
latency goes way up, and I wanted to be able to do a bunch of
unrelated queries (roughly 10 to 200) with no perceptible delay to the
user (this is part of displaying a screen).

I'm beginning to think that the same problem would apply to anything I
do over a network, so I'm currently rethinking my whole approach to
the problem. The data is small enough I can cache it all locally, but
I need to reload it when it changes.
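
Roughly the shape of what I have in mind, sketched in Java (untested,
and the table and column names are just placeholders):

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Timestamp;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Loads the whole snippet table into memory and reloads it only when
 *  a server-side "last changed" timestamp moves. Sketch only. */
public class SnippetCache {
    private final Map<String, String> snippets = new ConcurrentHashMap<>();
    private volatile Timestamp lastLoaded;

    public void refreshIfStale(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT MAX(CHANGED_AT) FROM MYLIB.SNIPPETS")) {
            rs.next();
            Timestamp changed = rs.getTimestamp(1);
            if (changed != null
                    && (lastLoaded == null || changed.after(lastLoaded))) {
                reload(conn, changed);
            }
        }
    }

    private void reload(Connection conn, Timestamp changed) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT SNIPKEY, SNIPTEXT FROM MYLIB.SNIPPETS")) {
            Map<String, String> fresh = new ConcurrentHashMap<>();
            while (rs.next()) {
                fresh.put(rs.getString(1), rs.getString(2));
            }
            snippets.clear();
            snippets.putAll(fresh);
            lastLoaded = changed;
        }
    }

    public String get(String key) {
        return snippets.get(key);
    }
}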
 
Robert Klemme

Jim Janney said:
Our customers have AS/400s, some of them dating back to 2001. Either
SQL or record level access would be more than fast enough if the code
ran directly on the 400, but when you connect over a network the
latency goes way up, and I wanted to be able to do a bunch of
unrelated queries (roughly 10 to 200) with no perceptible delay to the
user (this is part of displaying a screen).

I'm beginning to think that the same problem would apply to anything I
do over a network, so I'm currently rethinking my whole approach to
the problem. The data is small enough I can cache it all locally, but
I need to reload it when it changes.

Network latencies are pretty low these days. And even if you are
spending a few milliseconds the user probably won't notice.

In situations like these, this is what I typically do: I create a toy
example that models the important aspects of the original problem and
benchmark it. In your case I would create a simplified schema that
covers what you need and a console-only application that accesses
it in meaningful ways.
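
To make that concrete, the core of such a console benchmark could be as
small as this (the URL, credentials and table name are placeholders for
your real environment):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class LookupBenchmark {
    public static void main(String[] args) throws Exception {
        // With JTOpen's jt400.jar on the classpath the driver registers itself.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:as400://mysystem", "myuser", "mypassword");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT SNIPTEXT FROM MYLIB.SNIPPETS WHERE SNIPKEY = ?")) {

            int lookups = 200;                 // worst case from the screen logic
            long start = System.nanoTime();
            for (int i = 0; i < lookups; i++) {
                ps.setString(1, "KEY" + i);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        rs.getString(1);       // just touch the data
                    }
                }
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(lookups + " lookups took " + elapsedMs + " ms");
        }
    }
}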

One way to make JDBC with multiple queries fast is to use the JDBC
driver's batch mode. That way you may reduce network round trips. The
efficiency of this of course depends on the driver and DB at hand.
Given that DB2 is a mature product and IBM has been heavily involved in
Java for years, I expect it to work pretty well.
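
Strictly speaking, plain JDBC only batches the write side, which still
matters here since several people will be putting data in. A rough
sketch, with the table and column names invented:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

public class SnippetWriter {
    /** Inserts many snippets with one executeBatch() instead of one
     *  round trip per row (how much that saves depends on the driver). */
    public static void insertAll(Connection conn, Map<String, String> snippets)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO MYLIB.SNIPPETS (SNIPKEY, SNIPTEXT) VALUES (?, ?)")) {
            for (Map.Entry<String, String> e : snippets.entrySet()) {
                ps.setString(1, e.getKey());
                ps.setString(2, e.getValue());
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}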

Another solution would be to write stored procedures that smartly return
the data you need. Whether that is feasible depends of course on the
application logic.
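
Calling such a procedure from JDBC could look roughly like this; the
procedure name and its parameter are of course made up:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class SnippetProcCall {
    /** Fetches all snippets for one screen in a single call to a
     *  hypothetical stored procedure that returns a result set. */
    public static List<String> loadScreen(Connection conn, String screenId)
            throws SQLException {
        List<String> texts = new ArrayList<>();
        try (CallableStatement cs = conn.prepareCall(
                 "{CALL MYLIB.GET_SCREEN_SNIPPETS(?)}")) {
            cs.setString(1, screenId);
            try (ResultSet rs = cs.executeQuery()) {
                while (rs.next()) {
                    texts.add(rs.getString("SNIPTEXT"));
                }
            }
        }
        return texts;
    }
}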

In any case, if customers already have an AS/400 and already have DB2
on that machine, I'd try to use that. The advantage is that they do
not need to buy new licenses, install additional software, train admins,
etc., and you can be sure that the DB is capable of handling the load
(and probably a much higher load).

Depending on your data and access logic it may even be simple to hack
some client side caching together on top of JDBC.
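
A minimal sketch of that kind of per-key cache, just to illustrate the
idea (no invalidation, a single connection, and invented names):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LazySnippetCache {
    private final ConcurrentMap<String, Optional<String>> cache =
        new ConcurrentHashMap<>();
    private final Connection conn;

    public LazySnippetCache(Connection conn) {
        this.conn = conn;
    }

    /** Returns the snippet for a key, hitting the database only on a miss. */
    public String get(String key) {
        return cache.computeIfAbsent(key, this::load).orElse(null);
    }

    private Optional<String> load(String key) {
        try (PreparedStatement ps = conn.prepareStatement(
                 "SELECT SNIPTEXT FROM MYLIB.SNIPPETS WHERE SNIPKEY = ?")) {
            ps.setString(1, key);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? Optional.of(rs.getString(1)) : Optional.empty();
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }
}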

Only if that approach fails, or if you have extreme performance
requirements, would I look into "non standard" solutions.

Kind regards

robert
 
Martin Gregorie

Jim Janney said:
I'm beginning to think that the same problem would apply to anything I
do over a network, so I'm currently rethinking my whole approach to the
problem. The data is small enough I can cache it all locally, but I
need to reload it when it changes.

I seem to remember that AS/400 remote file replication via the
journalling facility worked pretty well, so if you do go for local
caching, remember to look at it. And, of course, it's language-agnostic.
 
