Database vs Data Structure?

erikcw

Hi,

I'm working on a web application where each user will be creating
several "projects" in their account, each with 1,000-50,000 objects.
Each object will consist of a unique name, an id, and some metadata.

The number of objects will grow and shrink as the user works with
their project.

I'm trying to decide whether to store the objects in the database
(each object gets its own row) or to use some sort of data structure
(maybe nested dictionaries or a custom class) and store the pickled
data structure in a single row in the database (then unpickle the data
and query in memory).

A few requirements:
-Fast/scalable (web app)
-able to query objects based on name and id.
-will play nicely with versioning (undo/redo)

Any input on the best way to go?

Thanks!
Erik
 
castironpi

When you change an object, what will you do?

1) Write each change straight to disk.
2) Change it in memory only, and flush to disk later.

Databases keep a binary cache of their data in memory, which I find underrated.
 
I V

use some sort of data-structure (maybe
nested dictionaries or a custom class) and store the pickled
data-structure in a single row in the database (then unpickle the data
and query in memory).

Why would you want to do this? I don't see what you would hope to gain by
doing this, over just using a database.
 
Bruno Desthuilliers

erikcw wrote:
Hi,

I'm working on a web application where each user will be creating
several "projects" in their account, each with 1,000-50,000 objects.
Each object will consist of a unique name, an id, and some metadata.

The number of objects will grow and shrink as the user works with
their project.

I'm trying to decide whether to store the objects in the database
(each object gets its own row) or to use some sort of data structure
(maybe nested dictionaries or a custom class) and store the pickled
data structure in a single row in the database (then unpickle the data
and query in memory).

Yuck.

Fighting against the tool won't buy you much - except for
interoperability and maintenance headaches. Either use your relational
database properly, or switch to an object db - like ZODB or Durus - if
you're OK with the implications (no interoperability, no simple query
language, and possibly poor performance if your app does heavy data
processing).

A few requirements:
-Fast/scalable (web app)
-able to query objects based on name and id.
-will play nicely with versioning (undo/redo)

Versioning is a somewhat orthogonal problem.

Any input on the best way to go?

My very humble opinion - based on several years of working experience
with both the ZODB and many RDBMSs - is quite clear: use an RDBMS and use
it properly.
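
As a rough illustration of what "one row per object" could look like for the
question above, here is a minimal sketch using Python's built-in sqlite3
module. The table name, the column names (project_id, name, meta) and the
helper functions are assumptions made up for this example, not anything from
erikcw's application, and a real web app would more likely sit on a server
database than on SQLite.

```python
import json
import sqlite3


def open_db(path=":memory:"):
    """Open a database and create the (hypothetical) one-row-per-object table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS objects (
            id         INTEGER PRIMARY KEY,
            project_id INTEGER NOT NULL,
            name       TEXT    NOT NULL,
            meta       TEXT    NOT NULL DEFAULT '{}',
            -- names are unique within a project; this also creates an index
            UNIQUE (project_id, name)
        )
        """
    )
    return conn


def add_object(conn, project_id, name, meta=None):
    """Insert one object and return its id."""
    cur = conn.execute(
        "INSERT INTO objects (project_id, name, meta) VALUES (?, ?, ?)",
        (project_id, name, json.dumps(meta or {})),
    )
    conn.commit()
    return cur.lastrowid


def get_by_id(conn, object_id):
    """Look an object up by primary key; returns a row tuple or None."""
    return conn.execute(
        "SELECT id, project_id, name, meta FROM objects WHERE id = ?",
        (object_id,),
    ).fetchone()


def get_by_name(conn, project_id, name):
    """Look an object up by its name inside one project."""
    return conn.execute(
        "SELECT id, project_id, name, meta FROM objects "
        "WHERE project_id = ? AND name = ?",
        (project_id, name),
    ).fetchone()
```

Both lookups go through an index (the primary key and the UNIQUE constraint),
so the database never has to load a whole project into memory just to find
one object.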
 
castironpi

Why would you want to do this? I don't see what you would hope to gain by
doing this, over just using a database.

Are databases truly another language from Python, fundamentally?
 
Aaron Watters

My very humble opinion - based on several years of working experience
with both the ZODB and many RDBMSs - is quite clear: use an RDBMS and use
it properly.

Yes, somewhere down the line you will want
to get a report of all the customers in Ohio,
ordered by county and zip code, who have a
"rabbit cage" project -- and if you just pickle
everything you will end up traversing the entire
database, possibly multiple times, to find it.
A little old-fashioned database design up front
can save you a lot of pain.
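
To make that kind of report concrete, here is a small sketch with Python's
sqlite3 module. The two tables, their columns and the sample rows are all
invented for illustration; the point is only that the filter, the join and
the ordering are stated once in SQL instead of being hand-coded as a
traversal over unpickled objects.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with made-up customers and projects."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            id     INTEGER PRIMARY KEY,
            name   TEXT NOT NULL,
            state  TEXT NOT NULL,
            county TEXT NOT NULL,
            zip    TEXT NOT NULL
        );
        CREATE TABLE projects (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            title       TEXT NOT NULL
        );
        CREATE INDEX idx_customers_state ON customers(state);
        CREATE INDEX idx_projects_title  ON projects(title);
        """
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Alice", "OH", "Franklin", "43004"),
            (2, "Bob", "OH", "Adams", "45101"),
            (3, "Carol", "NY", "Kings", "11201"),
            (4, "Dave", "OH", "Franklin", "43002"),
        ],
    )
    conn.executemany(
        "INSERT INTO projects VALUES (?, ?, ?)",
        [
            (1, 1, "rabbit cage"),
            (2, 2, "rabbit cage"),
            (3, 3, "rabbit cage"),
            (4, 4, "bird house"),
            (5, 4, "rabbit cage"),
        ],
    )
    conn.commit()
    return conn


def ohio_rabbit_cage_report(conn):
    """Customers in Ohio with a 'rabbit cage' project, by county then zip."""
    return conn.execute(
        """
        SELECT DISTINCT c.name, c.county, c.zip
        FROM customers AS c
        JOIN projects  AS p ON p.customer_id = c.id
        WHERE c.state = 'OH' AND p.title = 'rabbit cage'
        ORDER BY c.county, c.zip
        """
    ).fetchall()
```

With the pickled approach, the same report would mean unpickling every
customer's data and filtering and sorting it in application code.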

-- Aaron Watters

===
http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=ouch
 
castironpi

Yes.  A fair amount of study went into them.  Databases are about
information that survives over an extended period of time (months
or years, not hours).

Classic qualities for a database that don't normally apply to Python
(all properties of a "transaction" -- a bundled set of changes):
     * Atomicity:
        A transaction either is fully applied or not applied at all.
     * Consistency:
        Transactions applied to a database with invariants preserve
        those invariants (things like balance sheet totals).
     * Isolation:
        Each transaction happens as if it were happening at its own
        moment in time -- you don't worry about other transactions
        interleaved with your transaction.
     * Durability:
        Once a transaction actually makes it into the database, it stays
        there and doesn't magically fail a long time later.

-Scott David Daniels
(e-mail address removed)

Scott,

Classical qualities for Python that don't normally apply to a database
are:
* Encapsulation
* Modularity

(Social factors aside,) I'd make the case that a database language is
always a better place to start learning than a programming language.
Programming languages have none of the properties of databases.

Hold that databases are slightly less sophisticated than language, and
you hold rates at which data comes from the universe. Note,
information isn't terribly well quantified, but is empirical. Note,
matter comes from the universe too, but information goes to heads and
we care.

I hold it's proper to distinguish, though: you're doing the usual
things to data from the real world, but it seems like things I want to
do to computer screen are hard to do to a computer screen. I'm
sensitive to money (want>0); why doesn't it want to do computers? I'd
rather just animate plastics. Hook some up to it. Build roads and
surfboards. (No snowboards; it's water; or it's way below it.) Plus
get a bunch from the A.C. lines.

Data always come from the universe, i.e. from matter. Just another
way to make money with it. Everyone can make money, what's the
problem with SQL?
 
castironpi

@Acm.Org> declaimed the following in comp.lang.python:

        Hijacking as with the gmail kill filter I had to apply...


        Databases predate Python by decades... Though getting hardware fast
enough to implement the current darling -- relational -- did take a few
years.

        In my college days, database textbooks introduced: Hierarchical (IBM
IMS, I believe was the archetype used, though there were many others);
DBTG (Data Base Task Group) Network (the DBMS on the Xerox Sigma-6 at my
campus was a network model); and then gave Relational as an
experimental/theoretical format. About two years after I graduated, the
revised versions of the textbooks started with Relational, and then
listed hierarchical and network as "historical" formats.

        In hierarchical and network, one had to explicitly code for the way
the data was stored... In simple form: hierarchical required one to
access from a top-level record, which then had "fields" comprising
related data (and could have multiple occurrences).

Invoice:        has customer number, name, address, etc. and a "field" for
line items... The line items were a subtree: item number, description,
quantity, price, extended price...

        Network extended the hierarchical model by allowing access to the
"subtrees" from multiple different types of parent trees.

        Relational started life as a theory of "how to view the data --
independent of how it is stored" -- comprising relations (which are NOT
the links between tables). In relational theory the terms equate as:

Common/Lay      Theory
table           relation
column          attribute (with values drawn from a domain)
row/record      tuple

"Relation" meant that all the data in each tuple was related to the
others. The SQL "relationship operators" that are used to link separate
tables are not where "relational database" comes from.

        SQL started life as a query language -- also independent of how the
data is stored. However, it fit into relational theory easily... Maybe
because it sort of combines relational algebra and relational calculus.






     * Atomicity:

        Well... self-explanatory...

     * Consistency:

        One of the key ones...

        update accounts set
                balance = balance - 100
        where accountnum = 'from account';
        update accounts set
                balance = balance + 100
        where accountnum = 'to account';

        A failure between the two update statements MUST ensure that no
changes were made to the database... Otherwise, one would lose 100 into
the vapor. (This example does link back to the "A" and is more on the
user side -- the code needs to specify that both updates are part of the
same transaction).

     * Isolation:

        Though how various RDBMSs implement this feature gets confusing. One
has everything from locking the entire database (basically meaning that
"losing" transactions don't get applied at all and the code has to
re-execute the transaction logic) down to those that can lock on
individual records -- so overlapping transactions that don't need those
records complete with no failures.

     * Durability:

        Assuming a disk failure is not "magic" and one doesn't have a recent
backup <G>
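
The two-update transfer above can be tried out from Python. This is only a
sketch with the standard sqlite3 module: the accounts table mirrors the one
in the SQL above, while the CHECK constraint, the starting balances and the
function names are assumptions added for the example. It relies on the
documented behaviour that a sqlite3 connection used as a context manager
commits when the block succeeds and rolls back when it raises.

```python
import sqlite3


def make_accounts():
    """Create an in-memory accounts table with two made-up accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE accounts (
            accountnum TEXT PRIMARY KEY,
            balance    INTEGER NOT NULL CHECK (balance >= 0)
        )
        """
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("from account", 150), ("to account", 0)],
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction."""
    # Both updates are committed together, or neither is applied.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE accountnum = ?",
            (amount, src),
        )
        cur = conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE accountnum = ?",
            (amount, dst),
        )
        if cur.rowcount != 1:
            # Raising here rolls back the first update as well.
            raise ValueError("unknown destination account: %r" % (dst,))


def balances(conn):
    """Return {accountnum: balance} for every account."""
    return dict(conn.execute("SELECT accountnum, balance FROM accounts"))
```

If the second update fails, the debit from the first one is undone too, so
the 100 never disappears "into the vapor".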

--
        Wulfraed        Dennis Lee Bieber               KD6MOG
        (e-mail address removed)             (e-mail address removed)
                HTTP://wlfraed.home.netcom.com/
        (Bestiaria Support Staff:               (e-mail address removed))
                HTTP://www.bestiaria.com/

I'm holding the premise that money can be made different ways, also
and as technique is scarce, and exploration in programming is a non-
negative utility. I have a soft-coded script I can show, I'm just not
in the space program.
 
