ANNOUNCE: xsdb -- the eXtremely Simple Database goes alpha

Aaron Watters

Links:
Home page with docs and other links:
http://xsdb.sourceforge.net/
Sourceforge project with download links
http://sourceforge.net/projects/xsdb/

Executive Summary:

The xsdb package is an open source database implemented in Python
and hosted on SourceForge.

The xsdb package provides fundamental concurrent database functionality
with concurrency control and recovery. Fundamental characteristics include:

- Extreme portability and ease of installation and use.
- A simple semantics of objects with associated descriptions,
  compatible with the relational model, object modelling methods,
  and other data organizations such as OLAP.
- Multiple access paths and indices.
- Timestamp based concurrency control for safe concurrent database access.
- Commit/Rollback and recovery support.
- A variety of underlying storage implementations with configurable
  features and performance characteristics.
- No intrinsic database size limitations.

The package is intended to provide what you really need from a database
for most applications, without the other stuff (among other goals).

General Technical Notes:
The xsdb package is implemented in Python, and the server mode requires
Stackless Python. An xsdb database (not in server mode) will run using
standard C Python or Java Python (Jython).

Please have a look and give it a try. Thanks very much!

-- Aaron Watters [attempt 2]

===
Even in a perfect world where everyone is equal
I'd still own the movie rights and be working on the sequel
-- Elvis Costello "Every day I write the book"
 
A.M. Kuchling


Probably because Stackless made it easier to write the server without
having to wrestle an async socket library such as Medusa or Twisted.

> IMHO this sort of limitations severely reduce any
> project's potential.

Yeah, but it's his code, so he can do whatever he likes.

Sheesh, Aaron announces a database system that looks really spiffy from the
examples, like ZODB without the pain of ExtensionClass, and the first two
responses are griping about one aspect of it. Sometimes folks don't know
when they're well-off. (Now if only he'd released it before I wrote the
PyCon proposal tracker using PostgreSQL...)

--amk
 
Istvan Albert

A.M. Kuchling said:
Sheesh, Aaron announces a database system that looks really spiffy from the
examples, like ZODB without the pain of ExtensionClass, and the first two
responses are griping about one aspect of it

But that is not just one aspect of it; it is the most
fundamental usability question: will it work on my system?

The answer is no: I have to install another version of Python
to use it.

> Probably because Stackless made it easier to write the server without
> having to wrestle an async socket library such as Medusa or Twisted.

The question that needs to be answered is whether it would
be worth wrestling with those rather than locking out
the vast majority of the potential users.

I have a lot of respect and admiration for everyone who
undertakes a project of this magnitude and complexity.
On the other hand, I think that tying the project to a
Python implementation that most of us have no compelling reason
to use will severely affect its overall impact. And that would
be a shame.

Istvan.
 
Skip Montanaro

Istvan> The question that needs to be answered is whether it would be
Istvan> worth to wrestle with those rather than locking out the vast
Istvan> majority of the potential users.

I imagine Aaron provides the code (it is hosted on SF, after all). All you
need to do is port it to use Twisted or Medusa, then feed the diffs back to
Aaron. If it results in broader reach for xsdb without making the existing
code a nightmare to maintain, he'll probably fold it in.

Skip
 
Aaron Watters

Istvan Albert said:
Why?
IMHO this sort of limitations severely reduce any
project's potential.
Istvan.

Because it's the Right Way (tm) to do it :).

First let me emphasize that only the server layer
uses stackless at present.

I'm using stackless because (aside from the fact that
it was the simplest way to implement the functionality)
database concurrency control requires the following:

If a young transaction tries to read something written by
an old transaction which has not yet committed it must wait
until the old transaction decides to commit or abort.
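
The waiting rule above can be sketched in a few lines. This is only an illustration, not xsdb's actual code: the names `Status`, `Transaction`, and `must_wait` are hypothetical, and it assumes that a smaller timestamp means an older transaction.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ABORTED = "aborted"


@dataclass
class Transaction:
    timestamp: int                  # assumption: smaller timestamp = older
    status: Status = Status.ACTIVE


def must_wait(reader: Transaction, writer: Transaction) -> bool:
    """True when a younger reader meets a write by an older transaction
    that has not yet decided to commit or abort."""
    writer_is_older = writer.timestamp < reader.timestamp
    return writer_is_older and writer.status is Status.ACTIVE


old = Transaction(timestamp=1)
young = Transaction(timestamp=2)
blocked_before = must_wait(young, old)   # old is still active
old.status = Status.COMMITTED
blocked_after = must_wait(young, old)    # old has decided, no need to wait
```

The sketch covers only the case stated in the post; what happens in the other cases (for example, an older reader meeting a younger writer) is not described in the thread and is left out.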

In order to allow transactions to wait the options are:

1) Use an event loop and write the application "inside out",
much like a fortran 4 program attempting to emulate recursion.

2) Use operating system threads (which have very high overhead
and sometimes don't really work the same across different
platforms...)

3) Use stackless.

4) punt: automatically abort any transaction which needs
to wait.

As a first approach I went for (3) because it was easy. I don't
plan to do (1) because I treasure my sanity. I intend to
implement both (2) and (4) as server options before I call
xsdb a "beta", but I want to also keep the stackless version alive.
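
As a rough sketch of what options (2) and (4) might look like with ordinary threads, assuming nothing about how xsdb implements them: the class `Writer`, the exception `TransactionAborted`, and the `policy` argument are all hypothetical names. Only `threading.Event` is a real standard-library API.

```python
import threading


class TransactionAborted(Exception):
    """Raised by the hypothetical 'punt' policy instead of waiting."""


class Writer:
    """Stands in for an older transaction whose outcome others may await."""

    def __init__(self):
        self._done = threading.Event()   # set once the writer commits or aborts
        self.committed = False

    def finish(self, committed):
        self.committed = committed
        self._done.set()                 # wake up every waiting reader

    def wait_for_outcome(self, policy="block", timeout=None):
        if self._done.is_set():
            return self.committed
        if policy == "punt":
            # Option (4): never wait, abort the reader immediately.
            raise TransactionAborted("reader would have to wait")
        # Option (2): block this OS thread until the writer decides.
        if not self._done.wait(timeout):
            raise TimeoutError("writer did not finish in time")
        return self.committed
```

A reader thread would call `wait_for_outcome()` where the Stackless version blocks on a channel; the cost is one OS thread per waiting transaction.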

I'm still wishing that real stackless functionality will make it
into standard Python, but I also don't really understand the
deep implications.
-- Aaron Watters

===
I don't know if you've been loving somebody
I only know it isn't mine. -- Elvis Costello "Alison"
 
Paul Rubin

Aaron Watters said:
In order to allow transactions to wait the options are:

1) Use an event loop and write the application "inside out",
much like a fortran 4 program attempting to emulate recursion.
...

Maybe you could find some clever way to do it with Python generators.
 
Fredrik Lundh

Aaron said:
ANNOUNCE: xsdb -- the eXtremely Simple Database goes alpha

now, how cool is this. hugunin and watters both reappear after
many years, on nearly the same day, both with stuff that shows
that they didn't really give up on Python hacking; they've just
been working on the perfect design...

Aaron said:
In order to allow transactions to wait the options are:

1) Use an event loop and write the application "inside out",
much like a fortran 4 program attempting to emulate recursion.

2) Use operating system threads (which have very high overhead
and sometimes don't really work the same across different
platforms...)

3) Use stackless.

4) punt: automatically abort any transaction which needs
to wait.

5) use an event loop and use a generator for the relevant code;
when you discover that you need to pause, yield to the framework.
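
A minimal sketch of what option (5) could look like, written for Python 3. The scheduler `run` and the two toy tasks are hypothetical and have nothing to do with xsdb's code; the point is only that a task pauses by yielding and the loop decides when to resume it.

```python
from collections import deque


def run(tasks):
    """Round-robin event loop: resume each generator until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)             # run the task up to its next yield
        except StopIteration:
            continue               # task finished, drop it
        ready.append(task)         # task paused, schedule it again


def writer(db, log):
    log.append("writer: start")
    yield                          # pause in the middle of the transaction
    db["x"] = 42
    db["committed"] = True
    log.append("writer: commit")


def reader(db, log):
    while not db.get("committed"):
        log.append("reader: waiting")
        yield                      # hand control back to the framework
    log.append("reader: got %r" % db["x"])


db, log = {}, []
run([reader(db, log), writer(db, log)])
```

After the call, `log` shows the reader yielding twice before the writer commits, and only then reading the value.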

</F>
 
Paul Rubin

Fredrik Lundh said:
5) use an event loop and use a generator for the relevant code;
when you discover that you need to pause, yield to the framework.

This kind of design really could benefit from Raymond Hettinger's PEP
of a while back, proposing being able to pass parameters from the
caller back to the yield statement of a yielded generator.
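
For readers on later Python versions: passing a value from the caller back into a paused generator became possible with the `send()` method (added in Python 2.5 by PEP 342, a later proposal than the one mentioned here). A small sketch, with the hypothetical generator name `accumulator`:

```python
def accumulator():
    """Keep a running total of the values sent in by the caller."""
    total = 0
    while True:
        value = yield total    # the argument of send() becomes the value of this yield
        if value is None:
            break
        total += value


gen = accumulator()
first = next(gen)      # prime the generator: runs to the first yield
second = gen.send(5)   # resumes the generator with value == 5
third = gen.send(7)    # resumes the generator with value == 7
```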
 
Robin Becker

Paul Rubin said:
5) use an event loop and use a generator for the relevant code;
when you discover that you need to pause, yield to the framework.

This kind of design really could benefit from Raymond Hettinger's PEP
of a while back, proposing being able to pass parameters from the
caller back to the yield statement of a yielded generator.

I was just about to ask if generators allow for a stream like mechanism,
but obviously if we're not allowed to change the generator state then it
seems quite hard.
 
Paul Rubin

Robin Becker said:
I was just about to ask if generators allow for a stream like mechanism,
but obviously if we're not allowed to change the generator state then it
seems quite hard.

Well, there's always global state.
 
Aaron Watters

regarding the use of stackless in
http://xsdb.sourceforge.net
Fredrik Lundh said:
when you discover that you need to pause, yield to the framework.

Please tell me I'm missing something, but I don't think
this will really help. The problem is that I need to "yield"
or "suspend" or "send something across a channel" from about
45 places in the code some of which are arbitrarily deep into
multiple recursions. The generator thing will only allow
me to go one level deep into a single call -- no? By contrast
the stackless.channel mechanism is a far more general construct,
allowing me to "yield" at any point without restructuring the
code at all. Stackless rules.
-- Aaron Watters
===
I'm standing in the middle of the desert
waiting for my ship to come in
-- Sheryl Crow "Leaving Las Vegas"
 
Duncan Booth

Aaron Watters wrote:

regarding the use of stackless in
http://xsdb.sourceforge.net


Please tell me I'm missing something, but I don't think
this will really help. The problem is that I need to "yield"
or "suspend" or "send something across a channel" from about
45 places in the code some of which are arbitrarily deep into
multiple recursions. The generator thing will only allow
me to go one level deep into a single call -- no? By contrast
the stackless.channel mechanism is a far more general construct,
allowing me to "yield" at any point without restructuring the
code at all. Stackless rules.

The generator solution may not be appropriate for your task, but it isn't
entirely accurate to say that you can only go one level deep. You can (sort
of) yield from arbitrarily deep function nesting, or even from recursive
functions. The catch though is that you do have to write the code in a
slightly contorted manner in order to yield from below the first function.

The rule to follow is simply: any function which wants to yield, or which
calls a function that wants to yield has to be a generator and has to be
called from a 'for' loop which itself yields.

e.g. A generator that walks a tree recursively:

def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x
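
On Python 3.3 and later, the 'for' loop that re-yields each value can be replaced by `yield from`, though the rule itself still holds: every function on the path must be a generator. A sketch, where the `Node` type is a hypothetical stand-in for whatever tree class is in use:

```python
from collections import namedtuple

# Hypothetical tree node; None marks an empty subtree.
Node = namedtuple("Node", "left label right")


def inorder(t):
    if t:
        yield from inorder(t.left)    # delegate to the sub-generator
        yield t.label
        yield from inorder(t.right)


tree = Node(Node(None, 1, None), 2, Node(None, 3, None))
labels = list(inorder(tree))
```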
 
Jp Calderone

Duncan Booth said:
The generator solution may not be appropriate for your task, but it isn't
entirely accurate to say that you can only go one level deep. You can (sort
of) yield from arbitrarily deep function nesting, or even from recursive
functions. The catch though is that you do have to write the code in a
slightly contorted manner in order to yield from below the first function.

The rule to follow is simply: any function which wants to yield, or which
calls a function that wants to yield has to be a generator and has to be
called from a 'for' loop which itself yields.

e.g. A generator that walks a tree recursively:

def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x

This works, but it is even easier. All you need is top-level code to
handle it:


import types

def unroll(f, *a, **kw):
    # Keep an explicit stack of generators instead of nesting 'for' loops.
    gstack = [iter(f(*a, **kw))]
    while gstack:
        try:
            e = gstack[-1].next()
        except StopIteration:
            gstack.pop()
        else:
            if isinstance(e, types.GeneratorType):
                gstack.append(e)    # descend into the yielded generator
            else:
                yield e


def inorder(t):
    if t:
        yield inorder(t.left)
        yield t.label
        yield inorder(t.right)

unroll(inorder, t)


A bit more frameworky code, but it's all isolated in one place, which is
much nicer than having to spread it all over the place.
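
For anyone trying this on Python 3, here is a sketch of the same idea with `next(g)` in place of `g.next()`. It differs slightly from the code above in that `unroll` takes a generator object rather than a function plus arguments, and the `Node` type is a hypothetical stand-in for the tree class.

```python
import types
from collections import namedtuple

# Hypothetical tree node; None marks an empty subtree.
Node = namedtuple("Node", "left label right")


def unroll(gen):
    """Flatten a generator that may yield further generators."""
    stack = [gen]
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()                  # innermost generator is exhausted
            continue
        if isinstance(item, types.GeneratorType):
            stack.append(item)           # descend into the yielded generator
        else:
            yield item


def inorder(t):
    if t:
        yield inorder(t.left)            # yields a generator, not its values
        yield t.label
        yield inorder(t.right)


tree = Node(Node(None, 1, None), 2, Node(None, 3, None))
labels = list(unroll(inorder(tree)))
```

Because the nesting lives in the explicit `stack` list, the depth of the tree shows up as list length rather than as nested Python calls.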

Jp
 
Duncan Booth

Jp Calderone said:
This works, but it is even easier. All you need is top-level code to
handle it:


def unroll(f, *a, **kw):
    gstack = [iter(f(*a, **kw))]
    while gstack:
        try:
            e = gstack[-1].next()
        except StopIteration:
            gstack.pop()
        else:
            if isinstance(e, types.GeneratorType):
                gstack.append(e)
            else:
                yield e


def inorder(t):
    if t:
        yield inorder(t.left)
        yield t.label
        yield inorder(t.right)

unroll(inorder, t)


A bit more frameworky code, but it's all isolated in one place, which is
much nicer than having to spread it all over the place.

Nice idea, provided you never want to yield a generator. Also, should it
check for a generator, or just for any iterator?

You can also go for a recursive definition of unroll and use it to unroll
itself which I think reads a bit more clearly.

import types

def unroll(iterator):
    for v in iterator:
        if isinstance(v, types.GeneratorType):
            for inner in unroll(v): yield inner
        else:
            yield v

for node in unroll(inorder(t)):
    ... do whatever ...

I wonder if this is useful enough to go in itertools?
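
To make the "generator or any iterator" question concrete, here is a Python 3 sketch in which the type to expand is a parameter. The `expand` argument and the `source` generator are hypothetical additions for illustration; `types.GeneratorType` and `collections.abc.Iterator` are the real standard-library types.

```python
import types
from collections.abc import Iterator


def unroll(iterator, expand=types.GeneratorType):
    """Recursively flatten values that are instances of `expand`."""
    for v in iterator:
        if isinstance(v, expand):
            yield from unroll(v, expand)
        else:
            yield v


def source():
    yield 1
    yield iter([2, 3])           # a plain list iterator, not a generator
    yield (n for n in (4, 5))    # a generator expression is a generator


only_generators = list(unroll(source()))
any_iterator = list(unroll(source(), expand=Iterator))
```

With the default check the list iterator is passed through untouched as a single item; with `expand=Iterator` it is flattened as well. Either way, the caller can no longer yield such an object as a plain value, which is the caveat raised above.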
 
David Mertz, Ph.D.

|> I was just about to ask if generators allow for a stream like mechanism,
|> but obviously if we're not allowed to change the generator state then it
|> seems quite hard.

|Well, there's always global state.

But the state need not be global, just a mutable object yielded by a
generator. As I thought about this fact, I have come to find Raymond
Hettinger's proposals for enhancing simple generators less urgent (but I
probably still vote +1, though now moot).

>>> def echo():    # function name assumed; the original line is missing
...     message = [None]
...     while message[0] != "EXIT":
...         yield message
...
>>> for mess in echo():
...     if mess[0] is not None: print mess[0]
...     mess[0] = raw_input("Word: ")
...
Word: foo
foo
Word: bar
bar
Word: EXIT

This is a toy example, but the point is that we are perfectly able to
pass data back into a generator without using global state.
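
A Python 3 sketch of the same trick that runs without a keyboard, replacing `raw_input` with a scripted list of words. The generator name `echo` and the `words` and `seen` variables are assumptions made for this illustration.

```python
def echo():
    """Yield the same mutable list each time; the caller writes into it."""
    message = [None]
    while message[0] != "EXIT":
        yield message


words = iter(["foo", "bar", "EXIT"])    # stands in for interactive input
seen = []
for mess in echo():
    if mess[0] is not None:
        seen.append(mess[0])            # the generator's state, as set by the caller
    mess[0] = next(words)               # pass data back in by mutating the list
```

The loop ends when the caller stores "EXIT", because the generator checks the list it shares with the caller before yielding again.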
 
Aaron Watters

Duncan Booth said:
The generator solution may not be appropriate for your task, but it isn't
entirely accurate to say that you can only go one level deep....

The rule to follow is simply: any function which wants to yield, or which
calls a function that wants to yield has to be a generator and has to be
called from a 'for' loop which itself yields....

Yes. I see this would work. But this would then have to be
pervasive throughout my code -- and even in client code that
uses the xsdb code directly (but not from a remote client)....
yuck. No thanks :(. The acceptable options still are threads,
stackless, or punt.

-- Aaron Watters
===
How do zen masters walk through walls?
Doors.
 
Michele Simionato

David Mertz said:
But the state need not be global, just a mutable object yielded by a
generator. As I thought about this fact, I have come to find Raymond
Hettinger's proposals for enhancing simple generators less urgent (but I
probably still vote +1, though now moot).

>>> def echo():    # function name assumed; the original line is missing
...     message = [None]
...     while message[0] != "EXIT":
...         yield message
...
>>> for mess in echo():
...     if mess[0] is not None: print mess[0]
...     mess[0] = raw_input("Word: ")
...
Word: foo
foo
Word: bar
bar
Word: EXIT

This is a toy example, but the point is that we are perfectly able to
pass data back into a generator without using global state.

A more verbose but arguably more elegant way would be to wrap the
generator in a class. Let me repost some code I wrote some time ago.

"""An object-oriented interface to iterators-generators"""

class Iterator(object):
    """__gen__ is automatically called by __init__, so must have signature
    compatible with __init__. Subclasses should not need to override __init__:
    you can do it, but you must do it cooperatively or, at least, ensure that
    __gen__ is called correctly and its value assigned to self.iterator.
    """
    def __init__(self,*args,**kw):
        super(Iterator,self).__init__(*args,**kw)
        self.iterator=self.__gen__(*args,**kw)
    def __gen__(self,*args,**kw):
        "Trivial generator, to be overridden in subclasses"
        yield None
    def __iter__(self):
        return self
    def next(self):
        return self.iterator.next()

class MyIterator(Iterator):
    def __gen__(self):
        self.x=1
        yield self.x # will be changed outside the class
        yield self.x

iterator=MyIterator()

print iterator.next()
iterator.x=5
print iterator.next()

Wrapping the generator in the class, I can pass parameters to it (in
this case x). IOW, here the generator has an explicit "self" rather
than an implicit "__self__" as in the PEP. I am not sure if I like the
PEP; wouldn't it be easier to have a built-in iterator class?
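
For Python 3 readers, a sketch of how the same wrapper might be written today: the iterator protocol method is `__next__` instead of `next`, and the arguments are no longer forwarded to `object.__init__`. This is an adaptation for illustration, not the code as originally posted.

```python
class Iterator:
    """Wrap a generator method so its state lives on an ordinary object."""

    def __init__(self, *args, **kw):
        self.iterator = self.__gen__(*args, **kw)

    def __gen__(self, *args, **kw):
        # Trivial generator, to be overridden in subclasses.
        yield None

    def __iter__(self):
        return self

    def __next__(self):
        return next(self.iterator)


class MyIterator(Iterator):
    def __gen__(self):
        self.x = 1
        yield self.x     # x will be changed from outside before the next step
        yield self.x


iterator = MyIterator()
first = next(iterator)
iterator.x = 5           # pass data in through the instance attribute
second = next(iterator)
```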

Michele
 
