Getting directory sizes on Win32

Jeffrey Ellin

Hi, I am using the following code to get the directory sizes of users'
outboxes on our app servers. This code snippet works, but it is
dreadfully slow. I have also used File::Find, but it doesn't seem any
faster. Any ideas on how to speed it up? Everything is running on
Win2K.



# SQL to get all active users and their last sync date; exclude users
# who are end-dated in the system
$sql = " select n.name, n.APP_SERVER_NAME, max(s.LAST_UPD) as sync_date " .
       " from siebel.s_node n, " .
       " siebel.s_extdata_node e, " .
       " siebel.s_dock_status s " .
       " where n.ROW_ID = s.node_id and " .
       " e.NODE_ID = n.ROW_ID and " .
       " n.node_type_cd = 'REMOTE' and " .
       " s.type = 'SESSION' and " .
       " local_flg = 'N' and " .
       " e.ACTIVE_FLG = 'Y' and " .
       " (n.EFF_END_DT > sysdate or n.EFF_END_DT is null) " .
       " group by n.name, n.APP_SERVER_NAME " .
       " order by sync_date ";

#execute sql
$sth = $dbh->prepare($sql);
$sth->execute;

#delete old report file
unlink 'outboxreport.csv';

# loop through each user in the result set
while ( ($node, $server, $sync) = $sth->fetchrow_array() ) {
    # get name of docking directory
    my $dockloc = substr($server, 6);
    # assemble path statement
    my $path = "//$server/docking$dockloc/$node/outbox";
    #my $path = "//$server/docking/$node/outbox";
    # get directory size
    my $dirsize = -s $path;
    opendir(my ($dh), $path);

    # loop through each file in the directory; skip over dat and uaf
    # since they are part of the new database
    while ( defined( my $filename = readdir $dh ) ) {
        next if $filename eq "." or $filename eq ".."
            or $filename =~ /uaf/ or $filename =~ /dat/;
        $dirsize += -s "$path/$filename";
    }

    # re-open file so it writes as we process
    open REP, ">>outboxreport.csv";
    # convert file size to megabytes
    $dirsize = $dirsize / 1000000;
    # round file size
    $dirsize = sprintf "%.2f", $dirsize;
    # print out report in csv format
    print REP "$node,$server,$sync,$dirsize\n";
}
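
For comparison, a File::Find version of the same walk would look roughly
like this. This is only a sketch of that technique; the path is a
placeholder, and it applies the same uaf/dat exclusion:

use strict;
use warnings;
use File::Find;

my $path    = "//someserver/docking/somenode/outbox";   # placeholder path
my $dirsize = 0;

find(
    sub {
        return unless -f;              # regular files only
        return if /uaf/ or /dat/;      # skip files belonging to the new database
        $dirsize += -s _;              # add this file's size in bytes
    },
    $path
);

$dirsize = sprintf "%.2f", $dirsize / 1000000;   # convert to megabytes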
 
James Willmore

On 3 Sep 2003 14:04:18 -0700, Jeffrey Ellin wrote:

Hi, I am using the following code to get the directory sizes of users'
outboxes on our app servers. This code snippet works, but it is
dreadfully slow. I have also used File::Find, but it doesn't seem any
faster. Any ideas on how to speed it up? Everything is running on
Win2K.

# SQL to get all active users and their last sync date; exclude users
# who are end-dated in the system
$sql = " select n.name, n.APP_SERVER_NAME, max(s.LAST_UPD) as sync_date " .
       " from siebel.s_node n, " .
       " siebel.s_extdata_node e, " .
       " siebel.s_dock_status s " .
       " where n.ROW_ID = s.node_id and " .
       " e.NODE_ID = n.ROW_ID and " .
       " n.node_type_cd = 'REMOTE' and " .
       " s.type = 'SESSION' and " .
       " local_flg = 'N' and " .
       " e.ACTIVE_FLG = 'Y' and " .
       " (n.EFF_END_DT > sysdate or n.EFF_END_DT is null) " .
       " group by n.name, n.APP_SERVER_NAME " .
       " order by sync_date ";

You could use a here doc for this part. Won't do wonders for speed,
but will aid in debugging later.

==untested==
$sql = <<SQL;
select n.name, n.APP_SERVER_NAME, max(s.LAST_UPD) as sync_date
from siebel.s_node n,
     siebel.s_extdata_node e,
     siebel.s_dock_status s
where n.ROW_ID = s.node_id and
      e.NODE_ID = n.ROW_ID and
      n.node_type_cd = 'REMOTE' and
      s.type = 'SESSION' and
      local_flg = 'N' and
      e.ACTIVE_FLG = 'Y' and
      (n.EFF_END_DT > sysdate or n.EFF_END_DT is null)
group by n.name, n.APP_SERVER_NAME
order by sync_date
SQL
++end++
#execute sql
$sth = $dbh->prepare($sql);
$sth->execute;

#delete old report file
unlink 'outboxreport.csv';

I'm thinking that you may fare better if you store the results of the
query in a hash, _then_ iterate through the hash doing stuff with the
files/directories. That's just an opinion and it's unproven. My
thinking is that the longer you have the query open, the more
resources you're using. Of course, storing the information from the
query takes up resources as well. So, pick your poison.
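
Something along these lines, for example (untested, and assuming the same
$sth as above):

==untested==
# pull the whole result set into memory first, then release the cursor
my @rows;
while ( my ($node, $server, $sync) = $sth->fetchrow_array() ) {
    push @rows, [ $node, $server, $sync ];
}
$sth->finish;

# now walk the directories without holding the query open
for my $row (@rows) {
    my ($node, $server, $sync) = @$row;
    # ... existing directory-size code goes here ...
}
++end++
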
# loop through each user in the result set
while ( ($node, $server, $sync) = $sth->fetchrow_array() ) {
    # get name of docking directory
    my $dockloc = substr($server, 6);
    # assemble path statement
    my $path = "//$server/docking$dockloc/$node/outbox";
    #my $path = "//$server/docking/$node/outbox";
    # get directory size
    my $dirsize = -s $path;
    opendir(my ($dh), $path);

    # loop through each file in the directory; skip over dat and uaf
    # since they are part of the new database
    while ( defined( my $filename = readdir $dh ) ) {
        next if $filename eq "." or $filename eq ".."
            or $filename =~ /uaf/ or $filename =~ /dat/;
        $dirsize += -s "$path/$filename";
    }

    # re-open file so it writes as we process
    open REP, ">>outboxreport.csv";
    # convert file size to megabytes
    $dirsize = $dirsize / 1000000;
    # round file size
    $dirsize = sprintf "%.2f", $dirsize;
    # print out report in csv format
    print REP "$node,$server,$sync,$dirsize\n";
}

When you say slow, how slow? And how much data are we talking about?
I mean, if you're talking terabytes, it's going to take some time to
get that information. Plus, consider the platform and how it handles
memory, resources, etc. More memory will mean somewhat better
performance, etc.

Just my zero cents - money back if not satisfied :)
 
Jeffrey Ellin

When you say slow, how slow? And how much data are we talking about?
I mean, if you're talking terabytes, it's going to take some time to
get that information. Plus, consider the platform and how it handles
memory, resources, etc. More memory will mean somewhat better
performance, etc.

I think the slow portion is the actual querying of each file for size,
and the fact that it is occurring over the LAN, albeit a 1-gigabit LAN,
slows it down. We are talking 2000 users with about 100-200 files in
each directory. It took 4 hours to run last night.

On the up side, the requirements have changed so I don't have to
exclude the two file types (dat and uaf), so now I am using the NT
diruse utility to calculate size.

$res = `diruse /m $path`;
@res = split(/\n/,$res);
@dirsize = split(/\s+/,@res[3]);
$dirsize = "@dirsize[1]";

Runs in 5 minutes now.
 
John Bokma

Jeffrey said:
When you say slow, how slow? And how much data are we talking about?
I mean, if you're talking terabytes, it's going to take some time to
get that information. Plus, consider the platform and how it handles
memory, resources, etc. More memory will mean somewhat better
performance, etc.


I think the slow portion is the actual querying of each file for size,
and the fact that it is occurring over the LAN, albeit a 1-gigabit LAN,
slows it down. We are talking 2000 users with about 100-200 files in
each directory. It took 4 hours to run last night.

On the up side, the requirements have changed so I don't have to
exclude the two file types (dat and uaf), so now I am using the NT
diruse utility to calculate size.

$res = `diruse /m $path`;
@res = split(/\n/,$res);
@dirsize = split(/\s+/,@res[3]);
$dirsize = "@dirsize[1]";

Please change the latter to:

$dirsize = $dirsize[1]; # no array slice, no "".

And the @res[3] to $res[3]...

Put use strict; somewhere near the top of your script and use -w, i.e.:
#!....perl -w
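
A minimal sketch of what the top of the script would then look like (the
shebang path here is just an example):

#!/usr/bin/perl -w
use strict;

# under strict every variable must be declared with my(),
# and -w warns about constructs like the one-element slices above
my @res;
my $dirsize;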
 
Jay Tilton

(e-mail address removed) (Jeffrey Ellin) wrote:

: $res = `diruse /m $path`;
: @res = split(/\n/,$res);

If you use the backtick operator in list context, the returned results
will be burst into lines for you.

@res = `diruse /m $path`;

: @dirsize = split(/\s+/,@res[3]);
                         ^^^^^^^
Don't use an array slice to get a single array element. Got warnings
turned on?

@dirsize = split(/\s+/, $res[3]);

: $dirsize = "@dirsize[1]";
             ^           ^
Don't quote variables when you don't need to, and, again, avoid the
one-element array slice.

$dirsize = $dirsize[1];

You could boil it all down into a single statement that makes the
intermediate arrays unnecessary.

$dirsize = ( split /\s+/, (`diruse /m $path`)[3] )[1];
 
