cannot solve a memory leak

  • Thread starter Sébastien Cottalorda

Sébastien Cottalorda

Hi all,
I am developing a program based on a module named 'SHM', which is itself
based on the IPC::Shareable module.
I'm running Fedora Core 9 (kernel 2.6.27.21-78.2.41.fc9.i686) with
Perl 5.10.0.

It works like this:

IPC::Shareable does this when it tries to read something from the shared
memory segment:
--------------------------------------------------------------------------
sub FETCH {
    [snip]
    my $data;
    if ($self->{_lock} || $self->{_iterating}) {
        $self->{_iterating} = ''; # In case we break out
        $data = $self->{_data};
    } else {
        $data = _thaw($self->{_shm});
        $self->{_data} = $data;
    }
    my $val;
    [snip]
    # work with $data according to the $self->{_type} type, then put the result in $val
    return $val;
}
.....
sub _thaw {
    [snip]
    my $s = shift;
    my $ice = $s->shmread;
    my $tag = substr $ice, 0, 14, '';
    if ($tag eq 'IPC::Shareable') {
        my $water = thaw $ice; # imported from Storable module
        defined($water) or do {
            require Carp;
            _debug "Prob into shm segment ", $s->id, ": ", $ice;
            Carp::confess "Munged shared memory segment (size exceeded?) into shm segment ";
        };
        return $water;
    } else {
        return;
    }
}
----------------------------------------------------------------------------------
My module 'SHM', based on IPC::Shareable, does this:
sub get_hash {
    my $this = shift;
    my $type = ref($this) or croak "GET: $this is not an object";
    my $hashref = shift;
    my %datas = ();
    eval {
        %datas = %{$this->{_data}}; # line 537, where I get the "Munged shared memory ..." message
    };
    if ($@) {
        croak "GET: Unable to get ".$hashref->{type}."| segment allKeys : $@";
    }
    else {
        [snip]
    }
}
---------------------------------------------------------------------------------------------------------
My program, based on the 'SHM' module, does this:
[snip]
my $memCmd = 'ps axfv | grep -P "^\s{0,}'.$$.'" | grep -v grep ';
print "MEMSize (BEFORE)=". `$memCmd` ;
if ($SEGMENT->shlock(LOCK_SH)) {
    eval {
        $pano_SHM = $SEGMENT->get_hash({ type => 'panneau', primary => $control });
    };
    if ($@) {
        [snip]
    }
    print "MEMSize (AFTER)=". `$memCmd` ;
}
else {
    print "unable to get a lock on SHM\n";
}

I run the same program several times (5 instances) in parallel, so
there is concurrent access to the memory segment.
On concurrent access, one process gets the error:
"Munged shared memory segment (size exceeded?) into shm segment"
I then notice that the virtual memory grows to 1.8 GB for the process
in which the access problem occurred.
Here is the trace:

MEMSize (BEFORE)= 5080 pts/7 S+ 17:47 0 3 13776 8992 0.2 \_ PK52
"Munged shared memory segment (size exceeded?) into shm segment" at SHM line 537
MEMSize (AFTER)= 5080 pts/7 S+ 17:47 0 3 1849624 9832 0.2 \_ PK52
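
The before/after measurement can also be taken without shelling out to ps and grep. The following is a minimal sketch, assuming a Linux system where /proc/<pid>/status is readable; the helper name vm_size_kb is hypothetical and is not part of SHM or IPC::Shareable.

```perl
use strict;
use warnings;

# Hypothetical helper: return the VmSize of a process in kB,
# or undef if the /proc entry cannot be read (e.g. non-Linux systems).
sub vm_size_kb {
    my ($pid) = @_;
    $pid = $$ unless defined $pid;
    open my $fh, '<', "/proc/$pid/status" or return;
    while (my $line = <$fh>) {
        # The line of interest looks like: "VmSize:    13776 kB"
        return $1 if $line =~ /^VmSize:\s+(\d+)\s+kB/;
    }
    return;
}

my $before = vm_size_kb();
# ... the call under test (get_hash in the post) would go here ...
my $after = vm_size_kb();
printf "VmSize before=%s kB, after=%s kB\n",
    $before // 'n/a', $after // 'n/a';
```

Reading the value in-process avoids matching the wrong line when another PID happens to start with the same digits.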

If someone can help me solve this problem, or tell me where to add some
useful tracing, I would appreciate it.
Thanks in advance.

Sebastien
 

sln

I'm not sure what version you're using. The latest on CPAN seems to
be version 0.60 (from 2001), which does sub-ties on references.

From your usage, it looks like you have derived from IPC::Shareable.
Also, there might be a conflict with Perl 5.10; I don't know.

There is no indication how you instantiated your derived class
but you can't even get an object without doing
tie $variable, 'IPC::Shareable', 'data', \%options;
somewhere in your constructor.

you> I have then concurrent access to the memory segment.

I don't see how that follows from your example. I don't actually
see you using 'tied'-type access to that variable.

Instead, you are reading from a temporary variable in the object's
hash, {_data}, which is used to read from and write to shared memory.

But you are only reading from shared memory with the lock.
You never unlock it, which would write it out:

Locking reads shared memory into _data:
$self->{_data} = _thaw($self->{_shm})

Unlocking writes _data to shared memory:
_freeze($self->{_shm} => $self->{_data})

Unfortunately, by doing this directly, you are circumventing the 'tied' mechanisms
of update with its magic, and sub-tie (_mg_tie).

There is a debug mechanism if you want to use it:

use constant DEBUGGING => ($ENV{SHAREABLE_DEBUG} or 0);
_debug() if DEBUGGING;

Version info:
$VERSION = 0.60;
0.60 Mon Mar 5 15:20:18 EST 2001
Lee Lindley ([email protected]) added the waschanged optimization,
improved the locking functionality, fixed numerous bugs, and generally cleaned
things up; thanks.
This version of IPC::Shareable does not understand the format of shared memory segments
created by versions prior to 0.60. If you try to tie to such segments, you will get an error.

These are just my observations, as it's the first time I've looked at it.

Hope this helps.
PS: If you reply, snip out the content that isn't of interest.

-sln
 

sln

[snip]

But I don't know what version of IPC::Shareable this is using.


http://kobesearch.cpan.org/htdocs/IPC-Shareable/README.html

Known Problems
--------------------
2. Running out of shared memory

make test may fail with the message

Munged shared memory segment (size exceeded?)

This is likely because the tests are exceeding the maximum size of a shared memory
segment (SHMMAX) or the system-wide limit on shared memory size (SHMALL).
The only solution is to increase SHMMAX and/or SHMALL for the system.
Consult your system documentation for how to do this.

This failure could also mean that IPC::Shareable doesn't like your version of Storable
(IPC::Shareable makes some assumptions about the structure of serialized data).
This message would happen, for instance, when version 0.53 of IPC::Shareable was used
in conjunction with 1.0.x versions of Storable. If you're having problems, try using
Storable 1.0.7 which is known to work with IPC::Shareable 0.54.
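
The first explanation in the README (serialized data larger than the segment) can be checked before writing. The following is a minimal sketch, assuming the caller knows the size of its segment; fits_in_segment is a hypothetical helper, and 65536 is used only as an example size. The 14-byte tag mirrors the prefix that the quoted _thaw strips with substr.

```perl
use strict;
use warnings;
use Storable qw(freeze);

# Hypothetical helper: would the serialized data fit in a segment
# of $segment_size bytes? The tag length matches the 14 bytes that
# _thaw removes with substr($ice, 0, 14, '').
sub fits_in_segment {
    my ($data, $segment_size) = @_;
    my $tag    = 'IPC::Shareable';
    my $needed = length($tag) + length(freeze($data));
    return ($needed <= $segment_size, $needed);
}

my %small = (a => 1, b => 2);
my ($ok, $bytes) = fits_in_segment(\%small, 65536);   # example size only
print "needs $bytes bytes: ", ($ok ? "fits" : "too large"), "\n";
```

This only estimates the payload size; it does not inspect the system-wide SHMMAX or SHMALL limits mentioned above.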



-sln
 

Sébastien Cottalorda

> I'm not sure what version you're using. The latest on CPAN seems to
> be version 0.60 (from 2001), which does sub-ties on references.

Correct, I have that version.

> From your usage, it looks like you have derived from IPC::Shareable.
> Also, there might be a conflict with Perl 5.10; I don't know.

Neither do I :)

> There is no indication how you instantiated your derived class
> but you can't even get an object without doing
>   tie $variable, 'IPC::Shareable', 'data', \%options;
> somewhere in your constructor.

That's correct, I instantiate my class like this:
sub new {
    my $class = shift;
    ....
    my $self = {
        _handle   => undef,
        _config   => undef,
        _segments => undef,
        _data     => undef,
    };
    $self->{_handle} = tie %{$self->{_data}}, 'IPC::Shareable', 'SUPE', {%options};
    bless $self, $class;
    return $self;
}

>> I have then concurrent access to the memory segment.
>
> I don't know how that is true from your example. I don't actually
> see you using 'tied' type access to that variable.

I should have said that the get_hash call is part of a while(1)/sleep loop.
I run the same program several times in parallel, and every instance
accesses the same shared segment, so there is concurrent access.
The whole system works well for several hours (even several weeks) until
a conflict occurs.
At that time, no program crashes, because I 'eval' every sensitive
operation.
The entire system keeps running without any problems.
The only thing is that the program that encountered the conflict
sees its virtual memory grow to 1.8 GB.
But everything keeps running, including the entire OS.
In the worst case, I had 4 or 5 programs whose virtual memory grew
to 1.8 GB (I have 4 GB of RAM on my server).

> Instead, you are reading from a temporary variable in the object's
> hash: {_data}, used to read/write to shared memory.

Yes, I've separated read access (get_hash) and write access (set_hash)
to the shared segment for two reasons:
1- I need to share a multidimensional hash (3 levels), and in that
configuration IPC::Shareable creates a lot of shared segments on the OS
(more than 9 segments for a single hash of hash of hash). So I
implemented a kind of Hash::Flatten method: I transform the
multidimensional hash into a one-dimensional hash.
That's why I need to separate read access and write access: I need to
process several things before thawing or freezing.
2- I noticed that it's preferable to copy the shared hash before
processing it, for performance reasons (I later noticed that if I lock
the segment, the IPC::Shareable module does this automatically). Never
mind, I'll remove my 'very-intelligent-pseudo-optimization' later.
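
The flattening described in point 1 could look roughly like the following. This is a minimal sketch, not the actual SHM code: flatten_hash, unflatten_hash and the '|' separator are hypothetical, and it assumes that no key contains the separator and that leaves are plain scalars.

```perl
use strict;
use warnings;

# Hypothetical helpers: collapse a nested hash into a single-level
# hash whose keys are the joined path, and expand it again.
# Assumes no key contains the separator string.
my $SEP = '|';

sub flatten_hash {
    my ($nested, $prefix, $flat) = @_;
    $flat ||= {};
    for my $key (keys %$nested) {
        my $path = defined $prefix ? $prefix . $SEP . $key : $key;
        if (ref($nested->{$key}) eq 'HASH') {
            # Recurse into sub-hashes, extending the key path
            flatten_hash($nested->{$key}, $path, $flat);
        } else {
            $flat->{$path} = $nested->{$key};
        }
    }
    return $flat;
}

sub unflatten_hash {
    my ($flat) = @_;
    my %nested;
    for my $path (keys %$flat) {
        my @parts = split /\Q$SEP\E/, $path;
        my $leaf  = pop @parts;
        my $node  = \%nested;
        # Walk down, creating intermediate hashes as needed
        $node = $node->{$_} ||= {} for @parts;
        $node->{$leaf} = $flat->{$path};
    }
    return \%nested;
}

my $flat = flatten_hash({ panneau => { P1 => { state => 'on' } } });
print join(', ', map { "$_=$flat->{$_}" } sort keys %$flat), "\n";
```

A single-level hash of scalars needs only one tied segment, which is the motivation given above; empty sub-hashes are dropped by this sketch.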

> But you are only reading from shared memory with the lock.
> You never unlock it, which would write it out:
>
>         locking reads shared memory to _data:
>         $self->{_data} = _thaw($self->{_shm})
>
>         Unlocking writes _data to shared memory:
>         _freeze($self->{_shm} => $self->{_data})

Of course I lock/unlock the segment; I do not use _freeze and
_thaw directly.
I forgot to include that in the code example. Here is the correct one:
if ($SEGMENT->shlock(LOCK_SH)) {
    eval {
        $pano_SHM = $SEGMENT->get_hash({ type => 'panneau', primary => $control });
    };
    if ($@) {
        $SEGMENT->shunlock();
        [snip]
    }
    else {
        $SEGMENT->shunlock();
        [snip]
    }
    print "MEMSize (AFTER)=". `$memCmd` ;
}
else {
    print "unable to get a lock on SHM\n";
}
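
The snippet above calls shunlock in both branches. One way to avoid repeating it is a small wrapper that always releases the lock. This is a minimal sketch, assuming only that the segment object offers shlock and shunlock methods as in the post; with_lock and the stand-in FakeSegment class are hypothetical names introduced so the example runs without shared memory.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);   # provides LOCK_SH

# Hypothetical helper: run $code while holding the lock and always
# release it, whether $code returns normally or dies.
# Returns (1, $result) on success or (0, $error) on failure.
sub with_lock {
    my ($segment, $mode, $code) = @_;
    $segment->shlock($mode) or return (0, "unable to get a lock");
    my $result = eval { $code->() };
    my $error  = $@;
    $segment->shunlock();
    return $error ? (0, $error) : (1, $result);
}

# Stand-in lock object (not IPC::Shareable) used only for the demo.
{
    package FakeSegment;
    sub new      { return bless { locked => 0 }, shift }
    sub shlock   { $_[0]{locked} = 1; return 1 }
    sub shunlock { $_[0]{locked} = 0; return 1 }
}

my $seg = FakeSegment->new;
my ($ok, $value) = with_lock($seg, LOCK_SH, sub { die "Munged segment\n" });
print "ok=$ok locked=$seg->{locked} error=$value";
```

With the real object, the code reference would wrap the get_hash call, and the caller would inspect the returned status instead of $@.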

> Unfortunately, by doing this directly, you are circumventing the 'tied' mechanisms
> of update with its magic, and sub-tie (_mg_tie).

That's not the case.

> There is a debug mechanism if you want to use it:
>
>         use constant DEBUGGING     => ($ENV{SHAREABLE_DEBUG} or 0);
>         _debug() if DEBUGGING;

I've already tried that debug mode, but it is very verbose, and the
entire system can run for several days (even weeks) without problems.
I would end up with several gigabytes of log files...

What I do not understand is why the virtual memory grows to 1.8 GB
even though I 'eval' every sensitive operation.
No program crashes.
I always try to use 'my' variables so that the garbage collector can do
its job when they go out of scope.

When I encounter concurrent access, I see that the SHM module fails
like this:
---------------------------------------------------------------------------------------------
my %datas = ();
eval {
    %datas = %{$this->{_data}};
};
if ($@) {
    # ***** CRASH HERE *****
    croak "GET: Unable to get ".$hashref->{type}."|".$hashref->{primary}." segment key=".$hashref->{key}." : $@";
}
else {
    # process the %datas hash
}
undef %datas;
---------------------------------------------------------------------------------------------
Maybe the garbage collector does not reclaim the memory occupied by
%datas in that case?
I don't know.

If the program did not work at all, I would look elsewhere, but I'm
troubled by the fact that the system works perfectly for several days.

Sebastien
 
