text file vs. cPickle vs sqlite a design question

D

Dag

I have an application which works with lists of tuples of the form
(id_nr,'text','more text',1 or 0). I'll have maybe 20-50 or so of these
lists containing anywhere from 3 to over 30000 tuples. The actions I
need to do is either append a new tuple to the end of the list, display
all the tuples or display all the tuples where the last element is a 1

Basically what I'm wondering is the best way to store these data stuctures
to disc. As the subject mentioned I've basically got three approaches.
Store each list as a text file, pickle each list to file or shove the
whole thing into a bunch of database tables. I can see pros and cons
with each approach. Does anybody have any advice as to whether any of
these approaches is obviously better than any other? On one hand I like
the text file approach since it lets me append without loading
everything into memory, on the other hand the sqlite approach makes it
easy to select stuff with SELECT * FROM foo WHERE... which could be
handy if ever need to add more advanced filtering.

Dag
 
G

Gabriel Genellina

I have an application which works with lists of tuples of the form
(id_nr,'text','more text',1 or 0). I'll have maybe 20-50 or so of these
lists containing anywhere from 3 to over 30000 tuples. The actions I
need to do is either append a new tuple to the end of the list, display
all the tuples or display all the tuples where the last element is a 1

Basically what I'm wondering is the best way to store these data
stuctures
to disc. As the subject mentioned I've basically got three approaches.
Store each list as a text file, pickle each list to file or shove the
whole thing into a bunch of database tables. I can see pros and cons
with each approach. Does anybody have any advice as to whether any of

From your description, none of these three approaches is obviously better.
Try to isolate the data from its storage, and use the easiest way now
(pickle perhaps?).
This way you can change it later easily - maybe to use sqlite if you need
more difficult queries.
 
B

Bruno Desthuilliers

Dag a écrit :
I have an application which works with lists of tuples of the form
(id_nr,'text','more text',1 or 0). I'll have maybe 20-50 or so of these
lists containing anywhere from 3 to over 30000 tuples. The actions I
need to do is either append a new tuple to the end of the list, display
all the tuples or display all the tuples where the last element is a 1

Basically what I'm wondering is the best way to store these data stuctures
to disc. As the subject mentioned I've basically got three approaches.
Store each list as a text file, pickle each list to file or shove the
whole thing into a bunch of database tables. I can see pros and cons
with each approach. Does anybody have any advice as to whether any of
these approaches is obviously better than any other? On one hand I like
the text file approach since it lets me append without loading
everything into memory, on the other hand the sqlite approach makes it
easy to select stuff with SELECT * FROM foo WHERE... which could be
handy if ever need to add more advanced filtering.

Given your specs, I'd go for SQLite without any hesitation. Your data
structure is obviously relational (a list of tuples is a pretty good
definition of a relation), so a relational DBMS is the obvious solution,
and you'll get lots of other benefits from it (SQL being only one of
them - you can also think about free optimization, scalability, and
interoperability). And if you don't like raw SQL and prefer something
more pythonic, then you have SQLAlchemy and Elixir.

My 2 cents...
 
J

John Machin

Dag a écrit :




s/if/when/


Given your specs, I'd go for SQLite without any hesitation. Your data
structure is obviously relational (a list of tuples is a pretty good
definition of a relation), so a relational DBMS is the obvious solution,
and you'll get lots of other benefits from it (SQL being only one of
them - you can also think about free optimization, scalability, and
interoperability). And if you don't like raw SQL and prefer something
more pythonic, then you have SQLAlchemy and Elixir.

My 2 cents...

.... and a few more cents:

There are *two* relations/tables involved (at least): a "tuple" table
and a "list" table. The 20-50 or so lists need a unique name or number
each, and other attributes of a list are sure to come out of the
woodwork later. Each tuple will need a column containing the ID of the
list it belongs to. It's a bit boggling that (1) each tuple has an
id_nr but there's no requirement to query on it (2) req. only to
"append" new tuples w/o checking id_nr already exists (3) req. to
"display" all of 30,000 tuples ...
 
P

Paddy

I have an application which works with lists of tuples of the form
(id_nr,'text','more text',1 or 0). I'll have maybe 20-50 or so of these
lists containing anywhere from 3 to over 30000 tuples. The actions I
need to do is either append a new tuple to the end of the list, display
all the tuples or display all the tuples where the last element is a 1

Basically what I'm wondering is the best way to store these data stuctures
to disc. As the subject mentioned I've basically got three approaches.
Store each list as a text file, pickle each list to file or shove the
whole thing into a bunch of database tables. I can see pros and cons
with each approach. Does anybody have any advice as to whether any of
these approaches is obviously better than any other? On one hand I like
the text file approach since it lets me append without loading
everything into memory, on the other hand the sqlite approach makes it
easy to select stuff with SELECT * FROM foo WHERE... which could be
handy if ever need to add more advanced filtering.

Dag

If you have enough resources to keep all the lists comfortably in
memory, and you have enough disk space then I would save your data as
python text. Something like:

print "# <What the data is and how it is formatted>"
print "all_lists = []"
for i,l in enumerate(all_lists):
print "all_lists.append( [ #", i
for tpl in l:
print " ", tpl, ","
print " ]) #", i

You would then have your data saved in a format that could easily
be re-used by other programs at a later date, and that can be
examined in any text editor.

- Paddy.
 
B

Bruno Desthuilliers

John Machin a écrit :
(snip)
... and a few more cents:

There are *two* relations/tables involved (at least): a "tuple" table
and a "list" table.

<ot mode="thinking out loud">
Mmm... From a purely technical POV, not necessarily. If there's no need
for anything else than distinguishing between different lists, a single
table with a compound key (list_id, tuple_id) could be enough...
</ot>
 
B

Bruno Desthuilliers

Dag a écrit :
I have an application which works with lists of tuples of the form
(id_nr,'text','more text',1 or 0). I'll have maybe 20-50 or so of these
lists containing anywhere from 3 to over 30000 tuples. The actions I
need to do is either append a new tuple to the end of the list, display
all the tuples or display all the tuples where the last element is a 1

Basically what I'm wondering is the best way to store these data stuctures
to disc. As the subject mentioned I've basically got three approaches.
Store each list as a text file, pickle each list to file or shove the
whole thing into a bunch of database tables. I can see pros and cons
with each approach. Does anybody have any advice as to whether any of
these approaches is obviously better than any other?

Seems that so far, you get as many different opinion as answers - not
sure this will help much :-/
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,995
Messages
2,570,226
Members
46,815
Latest member
treekmostly22

Latest Threads

Top