Obscure utf8/"panic: malloc" bug

kj

I'm trying to find a workaround for a bug that is causing me a lot of
problems now. This bug surfaces under very rare conditions, but
they happen to apply to me at the moment.

I've traced the bug to the execution of line 49 of DBD/mysql.pm
(v. 3.0002) (though I'm almost certain that the involvement of
DBD/mysql.pm here is purely coincidental):

if ($dsn =~ /([^:;]*)[:;](.*)/) {

At the time of failure, the $dsn variable contains, as far as I
can tell, the string "host=martdb.ebi.ac.uk;port=3306". If, within
the Perl debugger, I turn on tracing when the program reaches the
line shown above, but right before failure, I see that (somehow),
the code being executed is from the function SWASHNEW in
/usr/lib/perl5/5.8.6/utf8_heavy.pl. The execution of the function
reaches the end, and control returns to line 50 of DBD/mysql.pm,
implying that the match succeeded. This is as expected, except
that if I print $1 and $2 I get nonsense, though this nonsense
changes from one execution to the next. Sometimes I get columns of
hex numbers:

DB<1> p $1

1D6C2 1D6DA
1D6DC
DB<2> p $2
D6FA
1D6FC 1D714
1D716 1D734
1D736 1D74E
1D750 1D76E
1D770

...but more typically what I get are strings of gobbledygook (i.e.
the stuff one sees if one tries to print binary data).

BTW, the original string in $dsn appears to be OK:

DB<3> p $dsn
host=martdb.ebi.ac.uk;port=3306


Of course, from this point on all hell breaks loose.


I have tried to come up with a small script that reproduces this
bug, but to no avail. The only way I can reproduce this bug involves
having a lot of other big stuff loaded.
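
For reference, a minimal standalone sketch of the match on line 49, run outside the debugger. The DSN string and the pattern are the ones quoted above; the variable names $first and $rest are made up for illustration. This is not expected to reproduce the bug; it only shows what $1 and $2 should hold when the match behaves normally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# DSN as reported at the time of failure.
my $dsn = 'host=martdb.ebi.ac.uk;port=3306';

# Same pattern as the one quoted from line 49 of DBD/mysql.pm (v. 3.0002).
my ($first, $rest);
if ($dsn =~ /([^:;]*)[:;](.*)/) {
    # Copy the captures right away so later matches cannot clobber them.
    ($first, $rest) = ($1, $2);
    print "\$1 = $first\n";    # host=martdb.ebi.ac.uk
    print "\$2 = $rest\n";     # port=3306
}
else {
    print "no match\n";
}
```

Comparing this output with what the full program prints at the same point (via plain print or warn statements rather than the debugger) may help tell whether the captures are corrupted only under the debugger.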

I'm at wits' end about this. At this point I'm just looking for
a workaround.

Any suggestions would be appreciated!

FWIW, the system is Linux (SuSE). I give the output of uname -ar
and perl -V below. (The only thing that looks to me like a potential
problem is the fact that perl was compiled with
archname=i586-linux-thread-multi, but uname -ar does not mention
i586 at all, only i686 and i386. Our sysadmin assures me that this
mismatch should not be a problem, but I'm not entirely convinced.)
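
A small sketch for putting the two values side by side from within Perl itself, assuming only the core Config and POSIX modules. It merely reports the strings; it says nothing about whether a difference between them actually matters.

```perl
use strict;
use warnings;
use Config;      # build-time configuration of this perl
use POSIX ();    # POSIX::uname() returns (sysname, nodename, release, version, machine)

my $archname = $Config{archname};      # e.g. i586-linux-thread-multi
my $machine  = (POSIX::uname())[4];    # e.g. i686

print "perl archname : $archname\n";
print "uname machine : $machine\n";

# Flag the case where archname does not begin with the machine string.
print "note: archname does not start with the uname machine string\n"
    if index($archname, $machine) != 0;
```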


kj


% uname -ar
Linux luna 2.6.11.4-21.10-smp #1 SMP Tue Nov 29 14:32:49 UTC 2005 i686 i686 i386 GNU/Linux
% perl -V

Summary of my perl5 (revision 5 version 8 subversion 6) configuration:
Platform:
osname=linux, osvers=2.6.9, archname=i586-linux-thread-multi
uname='linux g226 2.6.9 #1 smp tue jun 28 14:58:56 utc 2005 i686 i686 i386 gnulinux '
config_args='-ds -e -Dprefix=/usr -Dvendorprefix=/usr -Dinstallusrbinperl -Dusethreads -Di_db -Di_dbm -Di_ndbm -Di_gdbm -Duseshrplib=true -Doptimize=-O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -g -Wall -pipe'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING -fno-strict-aliasing -pipe -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -g -Wall -pipe',
cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING -fno-strict-aliasing -pipe'
ccversion='', gccversion='3.3.5 20050117 (prerelease) (SUSE Linux)', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='cc', ldflags =''
libpth=/lib /usr/lib /usr/local/lib
libs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
libc=, so=so, useshrplib=true, libperl=libperl.so
gnulibc_version='2.3.4'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.6/i586-linux-thread-multi/CORE'
cccdlflags='-fPIC', lddlflags='-shared'


Characteristics of this binary (from libperl):
Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
Locally applied patches:
SPRINTF0 - fixes for sprintf formatting issues - CVE-2005-3962
Built under linux
Compiled at Dec 17 2005 03:23:29
%ENV:
PERL5LIB="/home/jones/local/lib/perl5"
PERL5_CPANPLUS_CONFIG="/home/jones/.cpanplus/config"
PERL_RL="Perl"
@INC:
/home/jones/local/lib/perl5/5.8.6/i586-linux-thread-multi
/home/jones/local/lib/perl5/5.8.6
/home/jones/local/lib/perl5/i586-linux-thread-multi
/home/jones/local/lib/perl5
/usr/lib/perl5/5.8.6/i586-linux-thread-multi
/usr/lib/perl5/5.8.6
/usr/lib/perl5/site_perl/5.8.6/i586-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.6
/usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.6/i586-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.6
/usr/lib/perl5/vendor_perl
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to kj]

> I'm trying to find a workaround for a bug that is causing me a lot of
> problems now. This bug surfaces under very rare conditions, but
> they happen to apply to me at the moment.
>
> I've traced the bug to the execution of line 49 of DBD/mysql.pm
> (v. 3.0002) (though I'm almost certain that the involvement of
> DBD/mysql.pm here is purely coincidental):
>
> if ($dsn =~ /([^:;]*)[:;](.*)/) {
>
> At the time of failure, the $dsn variable contains, as far as I
> can tell, the string "host=martdb.ebi.ac.uk;port=3306". If, within
> the Perl debugger, I turn on tracing when the program reaches the
> line shown above, but right before failure, I see that (somehow),
> the code being executed is from the function SWASHNEW in
> /usr/lib/perl5/5.8.6/utf8_heavy.pl.

With the current state of Perl internals (non-reentrant REx engine),
debugging into SWASH* functions can only lead to a disaster (although
I did not know this may be malloc()-related). I wrote about this here
about a year ago.

Hope this helps,
Ilya
 
Sisyphus

"kj" <[email protected]>
..
..
> I have tried to come up with a small script that reproduces this
> bug, but to no avail. The only way I can reproduce this bug involves
> having a lot of other big stuff loaded.

If it's *core* stuff, then that's probably ok.

> I'm at wits' end about this. At this point I'm just looking for
> a workaround.

Looks to me that (probably) no-one here will be able to provide that
workaround unless you can provide a basic script that demonstrates the
problem.

My understanding of Ilya's response is that there's not much point in
resorting to the debugger for this particular problem.

Cheers,
Rob
 
kj

Ilya Zakharevich said:

> [A complimentary Cc of this posting was sent to kj]
>
>> I'm trying to find a workaround for a bug that is causing me a lot of
>> problems now. This bug surfaces under very rare conditions, but
>> they happen to apply to me at the moment.
>>
>> I've traced the bug to the execution of line 49 of DBD/mysql.pm
>> (v. 3.0002) (though I'm almost certain that the involvement of
>> DBD/mysql.pm here is purely coincidental):
>>
>> if ($dsn =~ /([^:;]*)[:;](.*)/) {
>>
>> At the time of failure, the $dsn variable contains, as far as I
>> can tell, the string "host=martdb.ebi.ac.uk;port=3306". If, within
>> the Perl debugger, I turn on tracing when the program reaches the
>> line shown above, but right before failure, I see that (somehow),
>> the code being executed is from the function SWASHNEW in
>> /usr/lib/perl5/5.8.6/utf8_heavy.pl.
>
> With the current state of Perl internals (non-reentrant REx engine),
> debugging into SWASH* functions can only lead to a disaster

Thanks for this insight. As it turns out, this bug surfaces *only*
when I run the code in the debugger, which means that I cannot debug
this code using the Perl debugger; this I consider a significant
hindrance. If I understand what you say correctly, this situation
means that any Perl code that (directly or indirectly) invokes
SWASHNEW cannot be debugged using the Perl debugger.

Honestly, I can't understand why utf8 and utf8_heavy are being
loaded at all. I painstakingly traced the moment at which utf8
first appears in %INC to the following line:

/usr/lib/perl5/vendor_perl/5.8.6/i586-linux-thread-multi/DBI.pm:555:

555: $dsn =~ s/^dbi:(\w*?)(?:\((.*?)\))?://i
556: or '' =~ /()/; # ensure $1 etc are empty if matc

...but I don't see why this code causes utf8.pm to be loaded at all!
(FWIW, when this all happens the string in $dsn does not contain
anything other than ASCII characters.) The loading of utf8_heavy.pl
happens under equally mysterious circumstances. It's like voodoo...
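
One way to catch this kind of load in the act, sketched here under assumptions: perl lets you put a code reference into @INC, and calls it with the requested file name before searching the directories that follow it. The names watch_requires and @seen are hypothetical, and whether the load of utf8.pm triggered from inside the regex engine passes through this hook on 5.8.6 is an assumption, not something verified here.

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical log: one [file, stack trace] pair per matching require.
our @seen;

# Install a code ref at the front of @INC. perl calls it with
# (the code ref itself, the requested file name) before it searches
# the directories that follow it in @INC.
sub watch_requires {
    my ($pattern) = @_;
    unshift @INC, sub {
        my (undef, $file) = @_;
        if ($file =~ $pattern) {
            my $trace = Carp::longmess("require of $file");
            push @seen, [ $file, $trace ];
            warn $trace;
        }
        return;    # decline, so the normal @INC search continues
    };
    return;
}

# Report whoever first pulls in utf8.pm or utf8_heavy.pl.
watch_requires(qr/^utf8/);
```

The hook only fires for files not yet in %INC, so it would have to be installed before anything else loads, for example from a BEGIN block at the very top of the main script.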



Is there a way to "go up the call stack" in the debugger (to examine
lexical values in the calling environment)?

Also, is there a way to generate a call tree for a Perl program?

> (although
> I did not know this may be malloc()-related).
In retrospect, I think my mention of malloc was probably premature.
It is true that, in *some* runs (this is somewhat of a heisenbug),
the value of $@ after I run the whole thing within an eval does
say something like "panic: malloc at line /path/to/DBD/mysql.pm
line 49", but it could be that the malloc error is downstream of
the real error.

> I wrote about this here
> about a year ago.

I didn't have much luck when I searched clpm with Google Groups
for your earlier posts on this, but maybe I didn't use the right
search keywords. Can you help me narrow down the search?

Thanks again,

kj
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to kj]

> Thanks for this insight. As it turns out, this bug surfaces *only*
> when I run the code in the debugger, which means that I cannot debug
> this code using the Perl debugger; this I consider a significant
> hindrance. If I understand what you say correctly, this situation
> means that any Perl code that (directly or indirectly) invokes
> SWASHNEW cannot be debugged using the Perl debugger.
>
> Honestly, I can't understand why utf8 and utf8_heavy are being
> loaded at all.

A (silly, IMO) architecture to support RExen on UTF8 strings.

> Is there a way to "go up the call stack" in the debugger (to examine
> lexical values in the calling environment)?

Do not think so. Likewise for local()ized stuff.
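
While the caller's lexicals stay out of reach, the frames themselves can at least be listed with the built-in caller function. A minimal sketch; the subroutine names stack_frames, inner and outer are made up for illustration.

```perl
use strict;
use warnings;

# Walk up the call stack with caller() and describe each frame.
# This yields sub names and call sites only, not the callers' lexicals.
sub stack_frames {
    my @frames;
    my $level = 0;
    while (my ($pkg, $file, $line, $sub) = caller($level++)) {
        push @frames, "$sub called at $file line $line";
    }
    return @frames;
}

sub inner { return stack_frames() }
sub outer { return inner() }

print "$_\n" for outer();
```

Inside the debugger, the T command prints a comparable backtrace.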

> Also, is there a way to generate a call tree for a Perl program?

dprof?

> I didn't have much luck when I searched clpm with Google Groups
> for your earlier posts on this, but maybe I didn't use the right
> search keywords. Can you help me narrow down the search?

REx engine debug OR debugger utf8 OR unicode author:Zakharevich group:*perl*

Hope this helps,
Ilya
 
