Memory leak with threads

Jon Combe

I have some Perl code that uses threads but found that after running
for a few days the memory usage had increased dramatically. I
simplified the code to the following but find when I run it with a
varying number of threads, the memory usage always increases. The
"countdown" at the end is just there to give me 30 seconds to grab the
memory utilisation before the process ends. I find that it uses approx
70MB for 50,000 threads, 140MB for 100,000 threads and 210MB for
150,000 threads.

I've tried on Perl 5.8.8, 5.10.1 and 5.12.2, all with similar results. Is
this a bug or have I just done something wrong? Any help appreciated!

#!/usr/bin/perl -w

use strict;
use threads;
use threads::shared;

for ( my $i = 0 ; $i < 150000 ; $i++ )
{
    my $thread = threads->create('thread_function');
    $thread -> join();
    if ( $i % 1000 == 0 )
    {
        printf ( "%010i\n" , $i );
    }
}

print "DONE\n";

for ( my $i = 0 ; $i < 30 ; $i++ )
{
    print "Count down " . (30-$i) . "\n";
    sleep(1);
}

sub thread_function
{
}

Thanks.
Jon
 
Dr.Ruud

Joined threads still occupy memory, so that is probably all.

Play around with the below. Also run it with a parameter.


#!/usr/bin/perl -w
use strict;

use Devel::Size qw( total_size );

use threads;
use threads::shared;

my @thread;
for ( 0 .. 2999 ) {

    push @thread, threads->create('thread_function');

    $thread[ -1 ]->join();

    @ARGV and $thread[ -1 ] = undef;

    if ( @thread % 1000 == 0 ) {
        printf "%s %s\n" , 0+ @thread, total_size( \@thread );
    }
}

printf "%s %s\n" , 0+ @thread, total_size( \@thread );

sub thread_function {
    # ...
}


Without the undeffing, it looks like this:

0 56
1000 88160
2000 176256
3000 268448

and with undeffing, like this:

0 56
1000 20144
2000 40240
3000 64432
 
Dr.Ruud

The output in my previous post came from a slightly newer version:

#!/usr/bin/perl -w
use strict;

use Devel::Size qw( total_size );

use threads;
use threads::shared;

use constant NUM => 3000;

my @thread;
printf "%8s %8s\n" , 0+ @thread, total_size( \@thread );

for ( 1 .. NUM ) {

    push @thread, threads->create('thread_function');
    $thread[ -1 ] -> join();

    @ARGV and $thread[ -1 ] = undef;

    if ( ( NUM - @thread ) % 1000 == 0 ) {
        printf "%8s %8s\n" , 0+ @thread, total_size( \@thread );
    }
}

sub thread_function {
    # ...
}

__END__
 
Jon Combe

Thanks for looking into this. I'm a little puzzled about why the thread
needs to be undef'd. My original code stored each thread in "my $thread"
inside the loop, so $thread should go out of scope after each iteration
and be cleaned up automatically. Is that not the case?

Looking at your results (and I get similar) it appears threads still
use memory after they've ended. Is there no way to free up all the
memory a thread occupied once it has ended?

Thanks
Jon
 
Steve C

Is there any way to change your application to have a thread pool
instead of creating and destroying threads?
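
A minimal sketch of the thread-pool idea, assuming the work can be expressed as items on a queue. The pool size, the job count and the "double the number" job are all made up for illustration; Thread::Queue is the standard queue module that ships with threaded perls. A fixed set of workers is created once and reused, so the number of thread creations no longer grows with the amount of work.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $WORKERS = 4;       # assumed pool size, not from the original post
my $JOBS    = 1000;    # assumed number of work items

my $work    = Thread::Queue->new();
my $results = Thread::Queue->new();

# Each worker pulls jobs until it sees an undef stop marker.
sub worker {
    while ( defined( my $job = $work->dequeue() ) ) {
        $results->enqueue( $job * 2 );    # placeholder for real work
    }
    return;
}

# Create the threads once, up front.
my @pool = map { threads->create( \&worker ) } 1 .. $WORKERS;

$work->enqueue($_)    for 1 .. $JOBS;
$work->enqueue(undef) for 1 .. $WORKERS;    # one stop marker per worker

$_->join() for @pool;

my $done = $results->pending();
print "processed $done jobs with $WORKERS threads\n";
```

Whether this avoids the growth seen above depends on the leak being tied to thread creation, which the measurements in this thread suggest but do not prove.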
 
sln

I ran this code and did not find the problem you
have. I'm using ActiveState 5.10.0 for win32 on an
XP platform.

From Task Manager, the memory usage fluctuates around 3.2 MB
(sort of a minimum for the perl interpreter). The number of
threads is never more than 1 - 2, so the thread is being
destroyed fine.

There is no reason "my $thread" should not be reused and its
object dereferenced.
What seems to be the problem is the object it references
is not being freed.

Some things you could try, but probably won't work:

- threads->create(\&thread_function)->join();

- Create just a few threads, then see what these say:
threads->list()
threads->list(threads::all)
threads->list(threads::running)
threads->list(threads::joinable)

Also, there are caveats about various things: stack size,
OS, etc.

-sln
 
Xho Jingleheimerschmidt

Dr.Ruud said:
Joined threads still occupy memory, so that is probably all.

Play around with the below. Also run it with a parameter.


#!/usr/bin/perl -w
use strict;

use Devel::Size qw( total_size );

use threads;
use threads::shared;

my @thread;
for ( 0 .. 2999 ) {

push @thread, threads->create('thread_function');

Why introduce an array when it wasn't there in the original code and it
doesn't do anything meaningful?

Dr.Ruud said:
$thread[ -1 ]->join();

@ARGV and $thread[ -1 ] = undef;

if ( @thread % 1000 == 0 ) {
printf "%s %s\n" , 0+ @thread, total_size( \@thread );
}
}

printf "%s %s\n" , 0+ @thread, total_size( \@thread );

sub thread_function {
# ...
}


Without the undeffing, it looks like this:

0 56
1000 88160
2000 176256
3000 268448

and with undeffing, like this:

0 56
1000 20144
2000 40240
3000 64432


This is just telling us that arrays take up space up to their high water
mark, more when dense and less when sparse, but still space. It has
nothing to do with threads. The leak (which does seem to exist) is not
going to be detected using Devel::Size on a variable that has nothing to
do with the leak.

You need to measure the process space from the OS, not from Perl, which
after all likely cannot be trusted if it is leaking memory.

On Linux:

for ( my $i = 0 ; $i < 150000 ; $i++ )
{
    my $thread = threads->create('thread_function');
    $thread -> join();
    if ( $i % 1000 == 0 )
    {
        printf ( "%010i\t%3\$s" , $i, `ps -p $$ -o rss` );
    }
}
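
A variant of the same measurement that avoids spawning ps for every sample, sketched under the assumption that a Linux-style /proc filesystem is mounted. The helper name rss_kb is made up for this example; it reads the VmRSS line from /proc/self/status and returns the resident set size in kB, or undef if the file or the line is not available.

```perl
use strict;
use warnings;

# Hypothetical helper: resident set size of this process in kB,
# or undef when /proc/self/status cannot be read (e.g. not Linux).
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return;
    while ( my $line = <$fh> ) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

# Print one sample in the same "counter <tab> RSS" layout as above.
my $rss = rss_kb();
printf "%010i\t%s\n", 0, defined $rss ? $rss : 'n/a';
```

Calling rss_kb() inside the loop in place of the backticks should give the same kind of figures, since ps takes its RSS value from the same kernel accounting.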


Xho
 
Jon Combe

sln said:
I ran this code and did not find the problem you
have. I'm using ActiveState 5.10.0 for win32 on an
XP platform.

From Task Manager, the memory usage fluctuates around 3.2 MB
(sort of a minimum for the perl interpreter). The number of
threads is never more than 1 - 2, so the thread is being
destroyed fine.

Apologies, I should have mentioned which OS I was using in my post. I'm
using Linux (CentOS 5.4) and I've also tried Ubuntu Linux (10.2) with
the same result. However, your results from Windows are interesting and
prompted me to try the same code on Windows. I have Strawberry Perl
rather than ActiveState and find that the memory stays constant at
around 3.8MB regardless of the number of threads. So it appears to be
an issue with Perl on Linux rather than Perl in general.

sln said:
There is no reason "my $thread" should not be reused and its
object dereferenced.
What seems to be the problem is the object it references
is not being freed.

Some things you could try, but probably won't work:

- threads->create(\&thread_function)->join();

- Create just a few threads, then see what these say:
  threads->list()
  threads->list(threads::all)
  threads->list(threads::running)
  threads->list(threads::joinable)

Also, there are caveats about various things, stacksize,
OS, etc..


I tried running the following:-

(Note for some reason I had to use threads::list not threads->list
even though the latter, which you suggested, is what it says to use in
the documentation)

#!/usr/bin/perl -w

use threads;
use threads::shared;

my @threads;
for ( my $j = 0 ; $j < 10 ; $j++ )
{
    my $thread = threads->create(\&thread_function);
    push ( @threads , $thread );
}
my @all_threads = threads::list(threads::all);
my @running_threads = threads::list(threads::running);
my @joinable_threads = threads::list(threads::joinable);
print "ALL:- : \n" . join ( "\n" , @all_threads ) . "\n";
print "RUNNING:- : \n" . join ( "\n" , @running_threads ) . "\n";
print "JOINABLE:- : \n" . join ( "\n" , @joinable_threads ) . "\n";

foreach ( @threads )
{
    $_ -> join();
}

@all_threads = threads::list(threads::all);
@running_threads = threads::list(threads::running);
@joinable_threads = threads::list(threads::joinable);
print "ALL:- : \n" . join ( "\n" , @all_threads ) . "\n";
print "RUNNING:- : \n" . join ( "\n" , @running_threads ) . "\n";
print "JOINABLE:- : \n" . join ( "\n" , @joinable_threads ) . "\n";

sub thread_function
{

}

I get the following output:-

ALL:- :
threads::all=SCALAR(0x993fd0c)
threads::all=SCALAR(0x99cc024)
threads::all=SCALAR(0x99cc03c)
threads::all=SCALAR(0x99cc054)
threads::all=SCALAR(0x99cdc50)
threads::all=SCALAR(0x99cdc68)
threads::all=SCALAR(0x99cdc80)
threads::all=SCALAR(0x99cdc98)
threads::all=SCALAR(0x99cdcb0)
threads::all=SCALAR(0x99cdcc8)
RUNNING:- :
threads::running=SCALAR(0x99cc018)
threads::running=SCALAR(0x99cdd4c)
threads::running=SCALAR(0x99cdd64)
threads::running=SCALAR(0x99cdd7c)
threads::running=SCALAR(0x99cdd94)
threads::running=SCALAR(0x99cddac)
threads::running=SCALAR(0x99cddc4)
threads::running=SCALAR(0x99cdddc)
threads::running=SCALAR(0x99cddf4)
threads::running=SCALAR(0x99cde0c)
JOINABLE:- :
threads::joinable=SCALAR(0x99cdcbc)
threads::joinable=SCALAR(0x9da0e58)
threads::joinable=SCALAR(0x9da0e70)
threads::joinable=SCALAR(0x9da0e88)
threads::joinable=SCALAR(0x9da0ea0)
threads::joinable=SCALAR(0x9da0eb8)
threads::joinable=SCALAR(0x9da0ed0)
threads::joinable=SCALAR(0x9da0ee8)
threads::joinable=SCALAR(0x9da0f00)
threads::joinable=SCALAR(0x9da0f18)
ALL:- :

RUNNING:- :

JOINABLE:- :


So it does behave as I expect in that after doing the join no threads
are running.

Jon
 
Jon Combe

Xho said:
You need to measure the process space from the OS, not from Perl, which
after all likely cannot be trusted if it is leaking memory.

On Linux:

for ( my $i = 0 ; $i < 150000 ; $i++ )
{
         my $thread = threads->create('thread_function');
         $thread -> join();
         if ( $i % 1000 == 0 )
         {
                 printf ( "%010i\t%3\$s" , $i, `ps -p $$ -o rss` );
         }

}

Here is a sample of the output I get from this - so it does confirm
for sure it's leaking memory.

0000000000 2872
0000001000 3672
0000002000 4384
0000003000 5092
0000004000 5804
0000005000 6516
0000006000 7228
0000007000 7936
0000008000 8648
0000009000 9356
0000010000 10072
0000011000 10784
0000012000 11496
0000013000 12204
0000014000 12920
0000015000 13628
0000016000 14340
0000017000 15052
0000018000 15764
0000019000 16428
0000020000 17184
0000021000 17900
0000022000 18616
0000023000 19328

Jon
 
Dr.Ruud

Also check perl -V |grep thread

With a perl 5.8.5. and archname=i386-linux-thread-multi, I get:

0000000000 2748
0000001000 2892
0000002000 2892
0000003000 2892
0000004000 2892
0000005000 2892

etc.
 
Jon Combe

I'm wondering if this leak was introduced between 5.8.5 and 5.8.8? The
results I originally posted are for perl 5.8.8 (osname=linux,
osvers=2.6.18-53.el5, archname=i386-linux-thread-multi)

Here are the results for 5.12.2 i686-linux-thread-multi

0000000000 2972
0000001000 4280
0000002000 5568
0000003000 6856
0000004000 8148
0000005000 9436
0000006000 10724
0000007000 12016
0000008000 13304
0000009000 14592
0000010000 15880

I also have tried 5.10.1 (archname=i686-linux-gnu-thread-multi)

0000000000 2896
0000001000 4404
0000002000 5896
0000003000 7388
0000004000 8884
0000005000 10376
0000006000 11868
0000007000 13360
0000008000 14852
0000009000 16344
0000010000 17840
0000011000 19332
0000012000 20824

So all the three versions I have tried exhibit it.

Jon
 
sln

I find it weird that threads->list() doesn't work for the Unix flavor.
I get (almost) the same when using threads::list().
The error is that, most of the time, the threads have already
returned (are not running) when getting the list
with 'threads::list(threads::running)'.
Using threads->list() accurately (mostly) reflects the
right status.

Either way it doesn't matter. I think what matters more
is whether the thread is being detached before it is joined.

I put a test together. See what you get with this.
Try with/without the comment on the sleep() in the thread
function. If it's not detached before it's joined, then there might
be a leak problem.

Another test might be to exit() from main after the threads are
created. This should print some thread messages.

use strict;
use warnings;
use threads;
use threads::shared;

my @threads;

for ( my $j = 1 ; $j <= 10 ; $j++ )
{
    my $thread = threads->create(\&thread_function, $j);
    push ( @threads , $thread );
}
sleep(1);

my @running;
my @joinable;
my @detached;

for ( @threads )
{
    push (@running, $_) if ( $_->is_running() );
    push (@joinable, $_) if ( $_->is_joinable() );
    push (@detached, $_) if ( $_->is_detached() );
}

print "\n";
print "ALL:- : \n" . join ( "\n" , @threads ) . "\n";
print "RUNNING:- : \n" . join ( "\n" , @running ) . "\n";
print "JOINABLE:- : \n" . join ( "\n" , @joinable ) . "\n";
print "DETACHED:- : \n" . join ( "\n" , @detached ) . "\n";

sleep(1);
print "Joining ...\n\n";

for ( @threads )
{
    $_ -> join();
}

@running = ();
@joinable = ();
@detached = ();

for ( @threads )
{
    push (@running, $_) if ( $_->is_running() );
    push (@joinable, $_) if ( $_->is_joinable() );
    push (@detached, $_) if ( $_->is_detached() );
}

print "RUNNING:- : \n" . join ( "\n" , @running ) . "\n";
print "JOINABLE:- : \n" . join ( "\n" , @joinable ) . "\n";
print "DETACHED:- : \n" . join ( "\n" , @detached ) . "\n";


sub thread_function
{
    print "thread $_[0] start\n";
    # sleep(4);
    print "thread $_[0] done\n";
}
__END__

Output:
thread 1 start
thread 1 done
thread 2 start
thread 2 done
thread 3 start
thread 3 done
thread 4 start
thread 4 done
thread 5 start
thread 5 done
thread 6 start
thread 6 done
thread 7 start
thread 7 done
thread 8 start
thread 8 done
thread 9 start
thread 9 done
thread 10 start
thread 10 done

ALL:- :
threads=SCALAR(0x18f7cec)
threads=SCALAR(0x18f7d0c)
threads=SCALAR(0x18f7d2c)
threads=SCALAR(0x18f7d4c)
threads=SCALAR(0x18f7d6c)
threads=SCALAR(0x18f7d8c)
threads=SCALAR(0x18f7dac)
threads=SCALAR(0x18f7dcc)
threads=SCALAR(0x18f7dec)
threads=SCALAR(0x18f7e0c)
RUNNING:- :

JOINABLE:- :
threads=SCALAR(0x18f7cec)
threads=SCALAR(0x18f7d0c)
threads=SCALAR(0x18f7d2c)
threads=SCALAR(0x18f7d4c)
threads=SCALAR(0x18f7d6c)
threads=SCALAR(0x18f7d8c)
threads=SCALAR(0x18f7dac)
threads=SCALAR(0x18f7dcc)
threads=SCALAR(0x18f7dec)
threads=SCALAR(0x18f7e0c)
DETACHED:- :

Joining ...

RUNNING:- :

JOINABLE:- :

DETACHED:- :
 
Ilya Zakharevich

Xho said:
You need to measure the process space from the OS, not from Perl,

Measuring from Perl is OK if Perl was built using "my" malloc().

Xho said:
which after all likely cannot be trusted if it is leaking memory.

With "my" malloc(), trust and leaks become orthogonal issues.

Yours,
Ilya
 
Xho Jingleheimerschmidt

Ilya said:
Measuring from Perl is OK if Perl was built using "my" malloc().


Well, OK. But how do you do it? Devel::Size certainly isn't the way.

Xho
 
Ilya Zakharevich

Xho said:
Well, OK. But how do you do it? Devel::Size certainly isn't the way.

Sure. For a one-off, use PERL_DEBUG_MSTATS=1 (or 2); for a continuous
loop, one should use the interface (IIRC) in Devel::Peek. This should
be in one of perldebug*.pod (do not know where it is split to
nowadays; when I documented it, there was exactly one perldebug...).
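
A minimal sketch of the in-process approach, assuming the interface meant here is the mstat() function exported by Devel::Peek. It only reports real figures when perl was built with its own malloc (usemymalloc=y in perl -V); on other builds it just prints a notice to STDERR. The labels and the test allocation are made up for illustration.

```perl
use strict;
use warnings;
use Devel::Peek qw(mstat);

# Report allocator statistics to STDERR with a label.
mstat("start");

# Allocate something noticeable so the two reports can differ.
my @data = (1) x 100_000;

mstat("after allocating list");
```

For the one-off case, setting PERL_DEBUG_MSTATS=1 in the environment before running the script asks the same kind of build to print its statistics without any code changes.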

Ilya
 
Jon Combe

sln said:
I put a test together. See what you get with this.

Oddly, I get this on 5.8.8.

Can't locate auto/threads/is_running.al in @INC (@INC contains: /usr/
lib/perl5/site_perl/5.8.8/i386-linux-thread-multi /usr/lib/perl5/
site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/
5.8.8/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/
lib/perl5/vendor_perl /usr/lib/perl5/5.8.8/i386-linux-thread-multi /
usr/lib/perl5/5.8.8 .) at thread_test.pl line 24

However it does work on 5.12.2 and I get the same output you get
(other than the memory addresses being different of course). With the
sleep in place I get this output which all looks OK.

thread 1 start
thread 2 start
thread 3 start
thread 4 start
thread 5 start
thread 6 start
thread 7 start
thread 8 start
thread 9 start
thread 10 start

ALL:- :
threads=SCALAR(0x8c7c500)
threads=SCALAR(0x8c7c520)
threads=SCALAR(0x8c7c540)
threads=SCALAR(0x8c7c560)
threads=SCALAR(0x8c7c580)
threads=SCALAR(0x8c7c5a0)
threads=SCALAR(0x8c7c5c0)
threads=SCALAR(0x8c7c5e0)
threads=SCALAR(0x8c7c600)
threads=SCALAR(0x8c7c620)
RUNNING:- :
threads=SCALAR(0x8c7c500)
threads=SCALAR(0x8c7c520)
threads=SCALAR(0x8c7c540)
threads=SCALAR(0x8c7c560)
threads=SCALAR(0x8c7c580)
threads=SCALAR(0x8c7c5a0)
threads=SCALAR(0x8c7c5c0)
threads=SCALAR(0x8c7c5e0)
threads=SCALAR(0x8c7c600)
threads=SCALAR(0x8c7c620)
JOINABLE:- :

DETACHED:- :

thread 1 done
thread 2 done
thread 3 done
thread 4 done
thread 5 done
thread 6 done
thread 7 done
thread 8 done
thread 9 done
thread 10 done
RUNNING:- :

JOINABLE:- :

DETACHED:- :


If I put an exit right under the initial for loop as you said I get
some thread messages:-

Perl exited with active threads:
1 running and unjoined
9 finished and unjoined
0 running and detached

Jon
 