Question about 'iterable cursors'

Frank Millman

Hi all

I am using a few DB-API adaptors - ceODBC for SQL Server, psycopg2 for
PostgreSQL, and sqlite3 for SQLite.

They all offer the feature that after a cursor executes a SELECT, the cursor
itself can be used as an iterator to fetch one row at a time. I have been
using this feature for a while and it seems like a 'good thing'.

Now I am not so sure. I am using a connection pool to maintain connections
to the database. A principle I am following is that a connection must be
returned quickly, so that it is available for reuse.

I have been happily returning the connection, but keeping the cursor open
while processing the rows selected. I now realise that this is dangerous.
Therefore I have changed my system to execute fetchall() on the cursor
before returning the connection. This obviously loses the benefit of the
iterator.

I would appreciate confirmation that my thinking is correct on this issue.
Or is there any way that I can have my cake and eat it?

Thanks

Frank Millman
 
Alain Ketterlin

Frank Millman said:
I am using a few DB-API adaptors - ceODBC for SQL Server, psycopg2 for
PostgreSQL, and sqlite3 for SQLite.

They all offer the feature that after a cursor executes a SELECT, the
cursor itself can be used as an iterator to fetch one row at a time. I
have been using this feature for a while and it seems like a 'good
thing'.

Now I am not so sure. I am using a connection pool to maintain
connections to the database. A principle I am following is that a
connection must be returned quickly, so that it is available for
reuse.

I have been happily returning the connection, but keeping the cursor
open while processing the rows selected. I now realise that this is
dangerous. Therefore I have changed my system to execute fetchall() on
the cursor before returning the connection. This obviously loses the
benefit of the iterator.

I would appreciate confirmation that my thinking is correct on this
issue. Or is there any way that I can have my cake and eat it?

Your thinking is correct: you need to keep the connection while
processing the cursor. Databases are made to scale; you may well be
processing the first rows of the result before the DBMS has even
finished scanning its tables. View this as a pipe, with the cursor at
one end. The usual setting, fetching one row at a time, lets you
overlap your processing with the network transfers.

Fetching all the data, returning the connection, and then starting to
process only makes sense if the processing takes a lot of time (I mean:
a lot more than fetching the results), which is a rare case. Unless you
are in such an extreme situation, I would suggest leaving the
optimisation to the connection pool, which is there to solve what you
are trying to solve.
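With psycopg2, for instance, a named (server-side) cursor makes this
streaming explicit. A rough sketch - the connection string, the table
and process() are only placeholders:

import psycopg2

conn = psycopg2.connect('dbname=mydb')
cur = conn.cursor(name='stream')    # named cursor: rows stay on the server
cur.itersize = 500                  # rows come over the network in batches

cur.execute('SELECT * FROM big_table')
for row in cur:                     # your processing overlaps the transfers
    process(row)                    # placeholder for the real work

cur.close()
conn.close()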

-- Alain.
 
Frank Millman

Alain Ketterlin said:
Your thinking is correct: you need to keep the connection while
processing the cursor. Databases are made to scale; you may well be
processing the first rows of the result before the DBMS has even
finished scanning its tables. View this as a pipe, with the cursor at
one end. The usual setting, fetching one row at a time, lets you
overlap your processing with the network transfers.

Fetching all the data, returning the connection, and then starting to
process only makes sense if the processing takes a lot of time (I mean:
a lot more than fetching the results), which is a rare case. Unless you
are in such an extreme situation, I would suggest leaving the
optimisation to the connection pool, which is there to solve what you
are trying to solve.

Thank you, Alain. That is very clear.

So my analysis of the problem is correct, but my solution is wrong.

Instead of executing fetchall() and returning the connection, I should
retain the connection until I have exhausted the cursor.

That makes a lot of sense.
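So the pattern becomes something like this - again, the pool with get()
and put() is my own simplified wrapper:

def select_rows(pool, sql, params=()):
    """Yield rows one at a time, keeping the connection out of the pool
    until the cursor has been exhausted (or the caller gives up early)."""
    conn = pool.get()
    try:
        cur = conn.cursor()
        cur.execute(sql, params)
        for row in cur:
            yield row
    finally:
        pool.put(conn)      # runs once the generator is exhausted or closed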

Frank
 
Dennis Lee Bieber

Frank Millman said:
So my analysis of the problem is correct, but my solution is wrong.

Instead of executing fetchall() and returning the connection, I should
retain the connection until I have exhausted the cursor.

That makes a lot of sense.

Especially if all you are processing are read-only activities.

If you have a connection/cursor doing write operations, you may not
be able to commit those writes until all reading cursors have been closed.
(Read the documentation on the SQLite3 locking system -- though the
newest version has added a second locking mode, WAL, which may complicate
the matter. The original/normal scheme has potential readers "outside"
SQLite3 and active readers "inside" SQLite3 -- when a writer is ready to
commit, it blocks the potential readers from entering, but is itself
blocked until all the active readers have exited.)
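A rough sketch of that behaviour with two sqlite3 connections
(rollback-journal mode, not WAL; the file and table names are only for
illustration):

import sqlite3

setup = sqlite3.connect('demo.db')
setup.execute('CREATE TABLE IF NOT EXISTS t (x INTEGER)')
setup.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(1000)])
setup.commit()
setup.close()

reader = sqlite3.connect('demo.db')
writer = sqlite3.connect('demo.db', timeout=0.5)

cur = reader.execute('SELECT x FROM t')
cur.fetchone()                      # cursor left open: the read statement
                                    # and its shared lock are still active

try:
    writer.execute('UPDATE t SET x = x + 1')
    writer.commit()                 # needs an exclusive lock to commit
except sqlite3.OperationalError as exc:
    print('write blocked:', exc)    # typically "database is locked"
    writer.rollback()

cur.fetchall()                      # exhaust the reading cursor
reader.close()                      # shared lock released
writer.execute('UPDATE t SET x = x + 1')
writer.commit()                     # succeeds once the reader has finished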
 
John Nagle

Dennis Lee Bieber said:
Especially if all you are processing are read-only activities.

If you have a connection/cursor doing write operations, you may not
be able to commit those writes until all reading cursors have been closed.
(Read the documentation on the SQLite3 locking system -- though the
newest version has added a second locking mode, WAL, which may complicate
the matter. The original/normal scheme has potential readers "outside"
SQLite3 and active readers "inside" SQLite3 -- when a writer is ready to
commit, it blocks the potential readers from entering, but is itself
blocked until all the active readers have exited.)

Right. The scarce resource is database locks, not connections.
Especially with SQLite, which has, by necessity, a rather brutal
locking strategy.

Realize that SQLite is not a high-performance multi-user database.
You use SQLite to store your browser preferences, not your customer
database.

If you're doing enough transactions from multiple processes that
performance is an issue, you need to move up to MySQL or Postgres.
If almost all transactions are SELECTs, performance may not be
too bad, but if there are INSERT and UPDATE transactions on the
same table, performance will be awful.

John Nagle
 
Lie Ryan

John Nagle said:
Realize that SQLite is not a high-performance multi-user database.
You use SQLite to store your browser preferences, not your customer
database.

I agree that SQLite is not multi-user; I disagree that SQLite is not a
high-performance database. In single-user cases, SQLite should far
outperform a client-server database engine, since it doesn't have the
client-server overhead.
 
