Is there a maximum size to a Python program?

P

Paul Boddie

I have a program that is generated from a generic process. It's job is to
check to see whether records (replicated from another system) exist in a
local table, and if it doesn't, to add them. I have 1 of these programs for
every table in the database.

Some people have discussed whether you need one or many programs. In
practice, you're going to need to execute many "tasks" to process all
your tables, and potentially this could mean generating many SQL
statements, as I will describe below.

[...]
I am thinking that dynamically generating the programs to run might not be
such a good idea. It would be a shame to drop it because the system needs to
be generic and it runs from an XML file so the resulting code could be
pretty complex, and I am new to Python. The program did generate a pyc so it
was able to compile.
Thoughts anyone?

This problem sounds as if it could be solved adequately using only the
database system, although I accept that sometimes some logic or
additional processing could be done in Python. If I were to receive
updates from a system, presumably as files, I'd want to bulk copy them
into the database and then perform the necessary inserts using SQL. If
the raw updates were not directly compatible with the database,
perhaps because data representations might differ, then I would want
to process them before doing the bulk copy.

In other words, the workflow for a single table would look like this:

1. Obtain update files.
2. Process files in order to make the data compatible with the
database.
3. Create temporary tables for the updates in the database.
4. Bulk copy the files into the temporary tables (using COPY or LOAD
DATA commands).
5. Do the necessary inserts (where you have to either use the non-
standard
INSERT OR UPDATE or an INSERT involving a join between existing
and temporary tables).
6. Drop the temporary tables (if not done automatically).

Now, it's almost certain that the details of the above workflow would
vary from table to table and from update to update. Even in the case
where you don't have to process the files, you still need to copy the
files into the database and to work with a different table each time.
This is where generating "something" for each table is unavoidable,
and one solution might be to have a generic template and to substitute
the table name and update filename into that template as you process
each table and its associated data. Whether you actually generate an
SQL command file for each table, or whether you have a program issue
the commands directly, is a matter of preference.

Anyway, from what you've described, that's how I would approach this
problem. Certainly, it's a lot more straightforward than dealing with
object-relational abstractions in programs generated by other
programs.

Paul
 
P

Paul Hemans

Thanks to everyone for all the suggestions. Here's what I did:
Stuck with sqlAlchemy because I like the idea of ORMs and I like being able
to abstract from the database vendor.
Left the schema definitions to be generated in a separate file, so that it
could be re-used
Dumped xml.dom and went to lxml
Brought the data processing into the main program

Works like a charm and the code is poetry.
Great community.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,997
Messages
2,570,239
Members
46,828
Latest member
LauraCastr

Latest Threads

Top