C
Chris H.
im currently working on an issue for work, in which processing of a pipe
delimited text file consisting of 356,400 lines of data with 19 fields in
each row.
the issue that i am having is not the performance of reading and splitting
this data, but in writing to the database. using dbi/dbd::mysql, using
'localhost' as the server the script connects to i get the following
results:
Updating us_hourly table - 08/25/06 10:19:09
1000 records processed.
real 0m1.096s
user 0m0.424s
sys 0m0.008s
------------------------
using 'servername.tld' from a remote machine, i get the following
performance:
Updating us_hourly table - 08/25/06 10:17:49
1000 records processed.
real 1m11.606s
user 0m0.250s
sys 0m0.034s
------------------------
the issue seems to be with remote connections either through the dbi
module, or the dbd::mysql driver. this also appears with the sybase dbi
driver going from a unix machine to a mssql machine.
are there any remote connection variables or performance enchancing calls
that im missing?
the sql query thats being run is:
update table foo set var1 = 'foo', var2 = 'foo2' where var3 = 'bar' and
var4 = 'bar2';
the table is indexed based on var3 and var4, which did show a huge
increase in performance once it was indexed properly.
this is data that needs to be updated every 15 minutes, so dropping the
data and re-inserting is not an option.
thanks for any help or insight you can provide on this issue.
delimited text file consisting of 356,400 lines of data with 19 fields in
each row.
the issue that i am having is not the performance of reading and splitting
this data, but in writing to the database. using dbi/dbd::mysql, using
'localhost' as the server the script connects to i get the following
results:
Reading input file - 08/25/06 10:19:09time ./ushbh-dev.pl hbh.txt
Updating us_hourly table - 08/25/06 10:19:09
1000 records processed.
real 0m1.096s
user 0m0.424s
sys 0m0.008s
------------------------
using 'servername.tld' from a remote machine, i get the following
performance:
Reading input file - 08/25/06 10:17:49time ./ushbh-dev.pl hbh.txt
Updating us_hourly table - 08/25/06 10:17:49
1000 records processed.
real 1m11.606s
user 0m0.250s
sys 0m0.034s
------------------------
the issue seems to be with remote connections either through the dbi
module, or the dbd::mysql driver. this also appears with the sybase dbi
driver going from a unix machine to a mssql machine.
are there any remote connection variables or performance enchancing calls
that im missing?
the sql query thats being run is:
update table foo set var1 = 'foo', var2 = 'foo2' where var3 = 'bar' and
var4 = 'bar2';
the table is indexed based on var3 and var4, which did show a huge
increase in performance once it was indexed properly.
this is data that needs to be updated every 15 minutes, so dropping the
data and re-inserting is not an option.
thanks for any help or insight you can provide on this issue.