Hi,
I'm trying to find the best way of speeding up a large number of insert
or update statements against a database table. The values to insert or
update come from a file which could contain up to 100,000 records.
My plan was to have multiple child processes do the database calls, fed
by a single process that just reads the data from the file. I believe
using DBI with threads is a bad idea, and to be honest my current view
of Perl threads is that they are not stable enough to be used in
production systems, but I'm happy to be told I'm wrong.
The only other method I can think of is to fork multiple child
processes. Each child will need to create its own database connection,
and the main process will need to feed them the data for the database
calls (rough sketch below).
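Here's a rough sketch of what I had in mind, using the "Safe Pipe
Opens" mentioned in question 2 below (completely untested; the DSN,
file name, and table/column names are all made up for illustration):

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

my $num_children = 4;
my @pipes;

for (1 .. $num_children) {
    # Safe pipe open: forks, parent gets a write handle to child's STDIN.
    my $pid = open(my $to_child, '|-');
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: create its own connection and process records from STDIN.
        my $dbh = DBI->connect('dbi:Oracle:mydb', 'user', 'pass',
                               { RaiseError => 1, AutoCommit => 0 })
            or die "child $$ can't connect: " . DBI->errstr;
        my $sth = $dbh->prepare(
            'INSERT INTO mytable (id, value) VALUES (?, ?)');
        while (my $line = <STDIN>) {
            chomp $line;
            $sth->execute(split /,/, $line);
        }
        $dbh->commit;
        $dbh->disconnect;
        exit 0;
    }
    push @pipes, $to_child;    # parent keeps the write end
}

# Parent: just read the file and deal records out round-robin.
open(my $in, '<', 'records.csv') or die "can't open records.csv: $!";
my $n = 0;
while (my $line = <$in>) {
    print { $pipes[$n++ % $num_children] } $line;
}
close $_ for @pipes;    # EOF on the pipe tells each child to finish
wait for 1 .. $num_children;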
So questions:
1. Is fork my only option, or am I overlooking a much better solution?
2. How do I produce a "queue" for the forked processes to work from?
The only thing I can think of is using some sort of pipe; I was
actually looking at the "Safe Pipe Opens" documented here (and used in
the sketch above):
http://www.fnal.gov/docs/products/perl/pod.new/5.00503/pod/perlipc.html#Safe_Pipe_Opens
Any other ideas?
3. What happens if a child process can't get a database connection for
some reason (e.g. the database refuses the connection)? How can it
tell the master not to send requests, since they cannot be processed?
If there were a single queue to work from this would be fine (the
child simply wouldn't pick any requests up), but with a queue per
child, as I'd have with safe pipes, I think I'm stuck.
4. Do you think I even need to try to make this job concurrent, or
will a single process be quick enough? Does anyone have any stats on
how many insert/update statements can be pushed through DBI per
second? The single-process version I'd be comparing against is
sketched after this list.
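For question 4, this is roughly the single-process version I'd be
benchmarking against (again untested, same made-up DSN and names). It
uses one prepared statement and commits in batches, since I gather
committing every single row is usually the real bottleneck:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Oracle:mydb', 'user', 'pass',
                       { RaiseError => 1, AutoCommit => 0 });
my $sth = $dbh->prepare('INSERT INTO mytable (id, value) VALUES (?, ?)');

open(my $in, '<', 'records.csv') or die "can't open records.csv: $!";
my $count = 0;
while (my $line = <$in>) {
    chomp $line;
    $sth->execute(split /,/, $line);
    $dbh->commit unless ++$count % 1000;    # commit in batches of 1000
}
$dbh->commit;    # flush any final partial batch
$dbh->disconnect;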
Thanks in advance for any help / suggestions.
Cheers
Andy