This program makes Python segfault - no other does

Juho Saarikko

The program attached to this message makes the Python interpreter segfault
randomly. I have tried both Python 2.2, which came with Debian Stable, and
a self-compiled Python 2.3.3 (the newest I could find on www.python.org),
compiled with default options (./configure && make). I'm using the pyPgSQL
plugin to connect to a PostgreSQL database, and have tried both the Debian
package and the newest self-compiled version of that as well.

I'm running BitTorrent, and that works perfectly well; btlaunchmany.py has
been running for months continuously without any problems. I've also run
the kernel compile test (compiling the Linux kernel nonstop to find any
inadequacies in processor cooling), and couldn't get any errors in 6 hours.

This makes me think I'm hitting some weird bug in the interpreter.
Specifically, I'm wondering if my habit of reusing old variable names in a
function once they are no longer needed might be causing the trouble;
maybe it causes confusion about the variable type?

The program retrieves Usenet News messages from the database (inserted
there by another Python program, which works perfectly and also uses the
pyPgSQL plugin).

So, here's the program. Does anyone know what's wrong with it?


#!/usr/local/bin/python2.3

# Insert message contents into the database, for each message-id already there
#
# Copyright 2004 by Juho Saarikko
# License: GNU General Public License (GPL) version 2
# See www.gnu.org for details

from pyPgSQL import libpq
import nntplib
import sys
import string
import regex
import sha
import imghdr
import binascii
import StringIO
import os

def strip_trailing_dots(n):
    tmp = []
    for i in range(len(n)):
        if n[i][-1] == "," or n[i][-1] == ".":
            tmp.append(n[i][:-1])
        else:
            tmp.append(n[i])
    return tmp

def findmimetype(body, filename):
    what = imghdr.what(StringIO.StringIO(body))
    if what == "gif":
        return "image/gif"
    if what == "png":
        return "image/png"
    if what == "jpeg":
        return "image/jpeg"
    return None


def try_decode_and_insert_uuencoded(conn, id):
    begin = regex.compile("begin [0-9]+ \(.*\)")
    conn.query("BEGIN")
    basedir = "kuvat"
    message = conn.query("SELECT data FROM fragments_bodies WHERE message = " + str(id) + " ORDER BY line")
    print message.ntuples

    keywords = []
    picids = []
    n = 0
    s = ""
    print 'Starting message id ' + str(id)
    while n < message.ntuples:
        # print "length of row " + str(n)
        # print str(message.getlength(n, 0))
        # print "Got length"
        s = str(message.getvalue(n, 0))
        # print "Got s"
        if begin.match(s) > 0:
            # print "Begin matched"
            body = []
            file = begin.group(1)
            # print "Starting to decode, at line " + str(n + 1)
            for k in range(n+1, message.ntuples):
                # print "Decoding row " + str(k)
                s = message.getvalue(k, 0)
                if s[:3] == "end":
                    n = k + 1
                    break
                try:
                    body.append(binascii.a2b_uu(libpq.PgUnquoteBytea(s)))
                except:
                    bytes = (((ord(s[0])-32) & 63) * 4 + 3) / 3
                    body.append(binascii.a2b_uu(s[:bytes]))
            # print "Got to end, at line " + str(n)
            # print "Attempting to join body"
            body = string.join(body, "")
            # print "Attempting to hash body"
            hash = sha.new(body)
            qhash = libpq.PgQuoteBytea(hash.digest())
            # qbody = libpq.PgQuoteBytea(body)
            # print "Attempting to find whether the pic already exists"
            already = conn.query("SELECT id FROM pictures WHERE hash = " + qhash)
            if already.ntuples == 0:
                # print "Attempting to find mimetype"
                mimetype = findmimetype(body, file)
                # print "Found mimetype"
                if mimetype != None:
                    # o = conn.query("INSERT INTO pictures (picture, hash, mimetype) VALUES (" + qbody + ", " + qhash + ", " + libpq.PgQuoteString(mimetype) + ")")
                    # already = conn.query("SELECT id FROM pictures WHERE OID = " + str(o.oidValue()));
                    # already = conn.query("SELECT id FROM pictures WHERE data = " + qbody)
                    # already = conn.query("SELECT id FROM pictures WHERE hash = " + qhash)
                    # print "Attempting to insert hash and mimetype"
                    conn.query("INSERT INTO pictures (hash, mimetype) VALUES (" + qhash + ", " + libpq.PgQuoteString(mimetype) + ")")
                    # print "Attempting to get id"
                    already = conn.query("SELECT id FROM pictures WHERE hash = " + qhash)
                    # print "Attempting to get value"
                    picid = already.getvalue(0, 0)
                    # print "Attempting to OK dir"
                    if os.access(basedir + "/tmp", os.F_OK) != 1:
                        os.mkdir(basedir + "/tmp")
                    fh = open(basedir + "/tmp/" + str(picid), "wb")
                    fh.write(body)
                    fh.close()
                    # print "File ok"
            else:
                picid = already.getvalue(0, 0)
            if already.ntuples == 0:
                # print "already.ntuples == 0, ROLLBACKing"
                conn.query("ROLLBACK")
                return
            # print "Appending picid"
            picids.append(picid)
            # print "Picid appended"
        else:
            tmpkey = strip_trailing_dots(string.split(s))
            if len(tmpkey) > 0:
                for j in range(len(tmpkey)):
                    keywords.append(tmpkey[j])
            # print "Adding 1 to n"
            n = n + 1
    if len(picids) > 0:
        # print "Finding Subject"
        head = conn.query("SELECT contents FROM fragments_header_contents WHERE message = " + str(id) + " AND header = (SELECT id FROM fragments_header_names WHERE header ilike 'Subject')")
        if head.ntuples > 0:
            # print "Splitting Subject"
            blah = head.getvalue(0,0)
            print str(blah)
            blahblah = string.split(str(blah))
            # print "Stripping"
            abctmpkey = strip_trailing_dots(blahblah)
            # print "Stripping done"
            # print "Really"
            tmpkey = abctmpkey
            # print "Subject split"
            if len(tmpkey) > 0:
                for j in range(len(tmpkey)):
                    keywords.append(tmpkey[j])
        o = conn.query("INSERT INTO messages DEFAULT VALUES")
        mid = conn.query("SELECT id FROM messages WHERE OID = " + str(o.oidValue))
        messageid = mid.getvalue(0, 0)
        if len(keywords) > 0:
            for x in range(len(tmpkey)):
                qword = libpq.PgQuoteString(str(keywords[x]))
                tmp = conn.query("SELECT id FROM keywords_words WHERE keyword = " + qword)
                if tmp.ntuples == 0:
                    conn.query("INSERT INTO keywords_words (keyword) VALUES (" + qword + ")")
                    tmp = conn.query("SELECT id FROM keywords_words WHERE keyword = " + qword)
                keyid = str(tmp.getvalue(0, 0))
                for y in range(len(picids)):
                    conn.query("INSERT INTO keywords_glue(word, picture) VALUES (" + keyid + ", " + str(picids[y]) + ")")
        dummyone = "SELECT fragments_header_contents.line, fragments_header_names.header,"
        dummytwo = " fragments_header_contents.contents FROM fragments_header_names, fragments_header_contents"
        dummythree = " WHERE fragments_header_contents.message = " + str(id)
        dummyfour = " AND fragments_header_contents.header = fragments_header_names.id"
        head = conn.query(dummyone + dummytwo + dummythree + dummyfour)
        if head.ntuples > 0:
            for h in range(head.ntuples):
                qhead = libpq.PgQuoteString(str(head.getvalue(h, 1)))
                qcont = libpq.PgQuoteString(str(head.getvalue(h, 2)))
                tmp = conn.query("SELECT id FROM header_names WHERE header = " + qhead)
                if tmp.ntuples == 0:
                    conn.query("INSERT INTO header_names (header) VALUES (" + qhead + ")")
                    tmp = conn.query("SELECT id FROM header_names WHERE header = " + qhead)
                headid = str(tmp.getvalue(0, 0))
                line = str(head.getvalue(0, 0))
                conn.query("INSERT INTO header_contents (header, message, line, contents) VALUES (" + headid + ", " + str(messageid) + ", " + line + ", " + qcont + ")")
        conn.query("DELETE FROM fragments_header_contents WHERE message = " + str(id))
        conn.query("DELETE FROM fragments_bodies WHERE message = " + str(id))
        conn.query("COMMIT")
        tmpdir = basedir + "/tmp/"
        for i in range(len(picids)):
            picid = picids[i]
            if os.access(basedir + "/" + str(picid%1000), os.F_OK) != 1:
                os.mkdir(basedir + "/" + str(picid%1000))
            os.link(tmpdir + str(picid), basedir + "/" + str(picid%1000) + "/" + str(picid))
            os.unlink(tmpdir + str(picid))
    else:
        conn.query("ROLLBACK")
    return


database = libpq.PQconnectdb('dbname = kuvat')
items = database.query("SELECT message FROM whole_attachments")

# try_decode_and_insert_uuencoded(database, 1167)

for i in range(items.ntuples):
    print 'Starting call ' + str(i)
    try_decode_and_insert_uuencoded(database, items.getvalue(items.ntuples - 1 - i,0))
    print ' returned from call ' + str(i)
#    except:
#        print 'Some other error occurred, trying to continue...\n'
 
Terry Reedy

Juho Saarikko said:
> The program attached to this message makes the Python interpreter segfault
> randomly. I have tried both Python 2.2 which came with Debian Stable, and
> self-compiled Python 2.3.3 (newest I could find on www.python.org,

2.3.4 was just released. I believe it fixed two segfault bugs.

> compiled with default options (./configure && make). I'm using the pyPgSQL
> plugin to connect to a PostGreSQL database, and have tried the Debian and
> self-compiled newest versions of that as well.

Based on posts over several years, segfaults most often arise from
1. buggy compilers, especially at 'higher' optimization settings
2. buggy compiled extensions
3. byte code fiddling
and only occasionally from
4. Python interpreter bugs
which the developers consider high priority for squashing.

> Specifically, I'm wondering if my habit of reusing old variable names in a
> function once they are no longer needed might be causing the trouble;
> maybe it causes confusion on the variable type ?

Very dubious to me; at worst you should get an exception with traceback.

> So, here's the program. Does anyone know what's wrong with it ?

Since you don't seem to be fiddling with internals, there is no way I know
to eyeball the code and tell which line is doing the 'impossible'. So I
think it is up to you to determine the offending line by uncommenting print
statements and adding more as needed. Then see if you can reduce the
program and still get segfaults.

Terry J. Reedy
 
Juho Saarikko

Terry Reedy said:
> Based on posts over several years, seq faults most often arise from
> 1. buggy compilers, especially at 'higher' optimization settings

Compiler is gcc 2.95.4, with the debug optimization ("make OPT=-g").

> 2. buggy compiled extensions

There seems to be one potential segfault waiting there, if PostgreSQL
gives a suitably malformed Bytea string, but it has nothing to do with this.

> 3. byte code fiddling
> and only occasionally from
> 4. Python interpreter bugs
> which the developers consider high priority for squashing.

Well, what do you think, should I report this? Because I can't find
anything wrong with the extension code - not that my C skills are that
great, but still.

> Very dubious to me; at worst you should get an exception with traceback.

Yup, wasn't that; there's something fishy happening in Python's memory
management code, apparently.

> Since you don't seem to be fiddling with internals, there is no way I know
> to eyeball the code and tell which line is doing the 'impossible'. So I

The conn.getvalue() lines. Specifically, getvalue() on a Bytea field
seems to be the cause of the problem, triggering the segfault on some
lines.

> think it up to you to determing the offending line by uncommenting print
> statements and adding more as needed. Then see if you can reduce the
> program and still get segfaults.

I did, that's why the print lines are there. I'm sorry I didn't think to
include the info from the start... Anyway, here's the gdb stack trace, if
anyone's interested. The problem is not in unQuoteBytea, which works fine
until it tries to discard some temporary variables, at which point,
kaboom. That would seem to indicate either line #2 or #3, but this is the
first time I've used a debugger, so I might be quite wrong.

#0 0x400c4c1b in free () from /lib/libc.so.6
#1 0x400c4aa3 in free () from /lib/libc.so.6
#2 0x0807ff2e in PyObject_Free (p=0x81d7240) at Objects/obmalloc.c:774
#3 0x0807f5a6 in PyMem_Free (p=0x81d7240) at Objects/object.c:2111
#4 0x4023a2d4 in unQuoteBytea (sin=0x81dd2ec ">nemo wrote:") at libpqmodule.c:417
#5 0x40243c2f in libPQgetvalue (self=0x401dfe00, args=0x401ea74c) at pgresult.c:630
#6 0x080fede0 in PyCFunction_Call (func=0x401eabcc, arg=0x401ea74c, kw=0x0) at Objects/methodobject.c:73
#7 0x080b7d4a in call_function (pp_stack=0xbffff2f8, oparg=2) at Python/ceval.c:3439
#8 0x080b50e0 in eval_frame (f=0x81dcbd4) at Python/ceval.c:2116
#9 0x080b7ff3 in fast_function (func=0x401ed294, pp_stack=0xbffff448, n=2, na=2, nk=0) at Python/ceval.c:3518
#10 0x080b7e2d in call_function (pp_stack=0xbffff448, oparg=2) at Python/ceval.c:3458
#11 0x080b50e0 in eval_frame (f=0x818d16c) at Python/ceval.c:2116
#12 0x080b669d in PyEval_EvalCodeEx (co=0x401bade0, globals=0x4018b79c, locals=0x4018b79c, args=0x0, argcount=0, kws=0x0, kwcount=0,
defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2663
#13 0x080b16f0 in PyEval_EvalCode (co=0x401bade0, globals=0x4018b79c, locals=0x4018b79c) at Python/ceval.c:537
#14 0x080da8e4 in run_node (n=0x40174350, filename=0xbffff83b "decode_uu.py", globals=0x4018b79c, locals=0x4018b79c, flags=0xbffff670)
at Python/pythonrun.c:1265
#15 0x080da880 in run_err_node (n=0x40174350, filename=0xbffff83b "decode_uu.py", globals=0x4018b79c, locals=0x4018b79c, flags=0xbffff670)
at Python/pythonrun.c:1252
#16 0x080da843 in PyRun_FileExFlags (fp=0x8142908, filename=0xbffff83b "decode_uu.py", start=257, globals=0x4018b79c, locals=0x4018b79c,
closeit=1, flags=0xbffff670) at Python/pythonrun.c:1243
#17 0x080d97c6 in PyRun_SimpleFileExFlags (fp=0x8142908, filename=0xbffff83b "decode_uu.py", closeit=1, flags=0xbffff670)
at Python/pythonrun.c:862
#18 0x080d9065 in PyRun_AnyFileExFlags (fp=0x8142908, filename=0xbffff83b "decode_uu.py", closeit=1, flags=0xbffff670)
at Python/pythonrun.c:659
#19 0x08055220 in Py_Main (argc=2, argv=0xbffff734) at Modules/main.c:415
#20 0x080549e6 in main (argc=2, argv=0xbffff734) at Modules/python.c:23
 
Michael Hudson

Juho Saarikko said:
> I did, that's why the print lines are there. I'm sorry I didn't think to
> include the info from the start... Anyway, here's gdb stacktrace, if
> anyone's interested. The problem is not in unQuoteBytea, which works fine
> untill it tries to discard some temporary variables, at which point,
> kaboom. That would seem to indicate either line #2 or #3, but this is the
> first time I've used a debugger, so I might be quite wrong.
>
> #0 0x400c4c1b in free () from /lib/libc.so.6
> #1 0x400c4aa3 in free () from /lib/libc.so.6
> #2 0x0807ff2e in PyObject_Free (p=0x81d7240) at Objects/obmalloc.c:774
> #3 0x0807f5a6 in PyMem_Free (p=0x81d7240) at Objects/object.c:2111
> #4 0x4023a2d4 in unQuoteBytea (sin=0x81dd2ec ">nemo wrote:") at libpqmodule.c:417

Oh look, this is clearly inside the libpq extension module! What
evidence do you have for a bug in Python itself?

Cheers,
mwh
 
Juho Saarikko

Michael Hudson said:
> Oh look, this is clearly inside the libpq extension module! What
> evidence do you have for a bug in Python itself?

The function unQuoteBytea allocates memory with PyMem_Malloc, and frees it
with PyMem_Free. The segfault happens at freeing the memory (as the
backtrace shows). It seems to me that if Python's memory management
routines fail to free an object they've allocated, it must be a bug in
Python. That or some other bug corrupts memory structures, in which case
it's almost impossible to track down. At this point I'm considering either
switching to a different database plugin, or to Java.

I tried the new Python version (2.3.4c1) and got the exact same behaviour.
Aarrgghh.

Here, I'll attach the unQuoteBytea function, it's a short one. Maybe you
can find some problem in it I couldn't:


PyObject *unQuoteBytea(char *sin)
{
    int i, j, slen, byte;
    char *sout;
    PyObject *result;

    slen = strlen(sin);
    sout = (char *)PyMem_Malloc(slen);
    if (sout == (char *)NULL)
        return PyErr_NoMemory();

    for (i = j = 0; i < slen;)
    {
        switch (sin[i])
        {
            case '\\':
                i++;
                if (sin[i] == '\\')
                    sout[j++] = sin[i++];
                else
                {
                    if ((!isdigit(sin[i])) ||
                        (!isdigit(sin[i+1])) ||
                        (!isdigit(sin[i+2])))
                        goto unquote_error;

                    byte = VAL(sin[i++]);
                    byte = (byte << 3) + VAL(sin[i++]);
                    sout[j++] = (byte << 3) + VAL(sin[i++]);
                }
                break;

            default:
                sout[j++] = sin[i++];
        }
    }

    sout[j] = (char)0;

    result = Py_BuildValue("s#", sout, j);
    PyMem_Free(sout);

    return result;

unquote_error:
    PyMem_Free(sout);
    PyErr_SetString(PyExc_ValueError, "Bad input string for type bytea");
    return (PyObject *)NULL;
}
 
Michael Hudson

Juho Saarikko said:
> The function unQuoteBytea allocates memory with PyMem_Malloc, and frees it
> with PyMem_Free. The segfault happens at freeing the memory (as the
> backtrace shows). It seems to me that if Python's memory management
> routines fail to free an object they've allocated, it must be a bug in
> Python.

Um. Almost certainly PyMem_Malloc winds up just calling malloc(), and
you can see above that PyMem_Free is winding up calling free().

Is this a debug build of Python? You might want to try one of them.
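
For readers wondering why a debug build might help: the general idea behind debugging allocators is to surround each block with known marker bytes and check them later, so that an out-of-bounds write is reported instead of silently damaging the heap. The following is only a minimal sketch of that idea in plain C. The names guarded_alloc, guard_intact and terminator_damages_guard, and the marker value, are hypothetical and chosen for this illustration; this is not CPython's actual implementation.

```c
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN 4
#define GUARD_BYTE 0xFB   /* arbitrary marker value chosen for this sketch */

/* Hypothetical helper: allocate n usable bytes followed by GUARD_LEN
 * marker bytes. The caller releases the block with free(). */
static char *guarded_alloc(size_t n)
{
    char *p = malloc(n + GUARD_LEN);
    if (p != NULL)
        memset(p + n, GUARD_BYTE, GUARD_LEN);
    return p;
}

/* Return 1 if the marker bytes after the n usable bytes are untouched,
 * 0 if something wrote past the end of the usable area. */
static int guard_intact(const char *p, size_t n)
{
    size_t k;
    for (k = 0; k < GUARD_LEN; k++)
        if ((unsigned char)p[n + k] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Copy text into a block sized strlen(text), then write a terminator
 * right after the copy. Returns 1 if that write damaged the guard,
 * 0 if not, and -1 if the allocation failed. */
static int terminator_damages_guard(const char *text)
{
    size_t n = strlen(text);
    char *buf = guarded_alloc(n);
    int damaged;

    if (buf == NULL)
        return -1;
    memcpy(buf, text, n);   /* fills exactly the usable area */
    buf[n] = '\0';          /* lands on the first guard byte */
    damaged = !guard_intact(buf, n);
    free(buf);
    return damaged;
}
```

Calling terminator_damages_guard("abc") reports the overrun as a return value. With a plain malloc(3) the same stray write would instead hit whatever the allocator keeps next to the block, and the damage might only show up later, for example inside free().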
> That or some other bug corrupts memory structures, in which case
> it's almost impossible to track down. At this point I'm considering
> either switching to a different database plugin, or to Java.
>
> I tried the new Python version (2.3.4c1) and got the exact same behaviour.
> Aarrgghh.
>
> Here, I'll attach the unQuoteBytea function, it's a short one. Maybe you
> can find some problem in it I couldn't:
>
>
> PyObject *unQuoteBytea(char *sin)
> {
>     int i, j, slen, byte;
>     char *sout;
>     PyObject *result;
>
>     slen = strlen(sin);
>     sout = (char *)PyMem_Malloc(slen);
>     if (sout == (char *)NULL)
>         return PyErr_NoMemory();
>
>     for (i = j = 0; i < slen;)
>     {
>         switch (sin[i])
>         {
>             case '\\':
>                 i++;
>                 if (sin[i] == '\\')
>                     sout[j++] = sin[i++];
>                 else
>                 {
>                     if ((!isdigit(sin[i])) ||
>                         (!isdigit(sin[i+1])) ||
>                         (!isdigit(sin[i+2])))
>                         goto unquote_error;
>
>                     byte = VAL(sin[i++]);
>                     byte = (byte << 3) + VAL(sin[i++]);
>                     sout[j++] = (byte << 3) + VAL(sin[i++]);
>                 }
>                 break;
>
>             default:
>                 sout[j++] = sin[i++];
>         }
>     }
>
>     sout[j] = (char)0;


I think j can equal slen at this point?
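
To spell the concern out: when the input contains no backslash escapes, every input byte is copied, so j ends up equal to slen, and writing the terminator at index j is one byte past a buffer of size slen. Below is a minimal standalone sketch of the same loop with one extra byte allocated. It uses malloc instead of PyMem_Malloc so it can be compiled without Python, the name unquote_bytea_sketch is hypothetical, and the VAL definition is an assumption since the real macro is not shown in this thread. It is not the actual pyPgSQL fix.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Assumed definition; the real VAL macro is not shown in the thread. */
#define VAL(ch) ((ch) - '0')

/* Standalone variant of the unquoting loop. Returns a NUL-terminated
 * buffer that the caller must free(), or NULL on malformed input or
 * allocation failure. *out_len receives the number of decoded bytes. */
static char *unquote_bytea_sketch(const char *sin, size_t *out_len)
{
    size_t i = 0, j = 0;
    size_t slen = strlen(sin);
    char *sout = malloc(slen + 1);   /* +1 leaves room for the terminator when j == slen */
    int byte;

    if (sout == NULL)
        return NULL;

    while (i < slen) {
        if (sin[i] != '\\') {
            sout[j++] = sin[i++];    /* ordinary byte: copy as-is */
            continue;
        }
        i++;                         /* skip the backslash */
        if (sin[i] == '\\') {
            sout[j++] = sin[i++];    /* escaped backslash */
        } else {
            /* Expect exactly three digits; the short-circuit || stops
             * at the input's NUL, so nothing past it is read. */
            if (!isdigit((unsigned char)sin[i]) ||
                !isdigit((unsigned char)sin[i + 1]) ||
                !isdigit((unsigned char)sin[i + 2])) {
                free(sout);
                return NULL;
            }
            byte = VAL(sin[i++]);
            byte = (byte << 3) + VAL(sin[i++]);
            sout[j++] = (char)((byte << 3) + VAL(sin[i++]));
        }
    }

    sout[j] = '\0';                  /* j <= slen, so this stays inside the buffer */
    *out_len = j;
    return sout;
}
```

With an input such as "abc" the loop copies three bytes and the terminator goes to index 3, which only exists because of the extra byte in the allocation.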

Truth in advertising: I googled for libpqmodule.c, got (as I hoped)
the CVS logs from SF, noticed that the most recent log entry said:

09NOV2003 bga Fixed a buffer overrun error in libPQquoteBytea based on a fix
by James Matthew Farrow. [Bug #838317].

I also noticed that this log entry is dated after the most recent release of
pypgsql, and then I looked at the diff.

Maybe you should try building pypgsql from CVS...

Cheers,
mwh
 
Christophe Cavalaria

Juho said:
> The function unQuoteBytea allocates memory with PyMem_Malloc, and frees it
> with PyMem_Free. The segfault happens at freeing the memory (as the
> backtrace shows). It seems to me that if Python's memory management
> routines fail to free an object they've allocated, it must be a bug in
> Python. That or some other bug corrupts memory structures, in which case
> it's almost impossible to track down.

As a rule of thumb, you should assume that malloc and free aren't buggy at
all. It is far easier to crash a free call than to find a bug *in* free.

Example code that will likely segfault in free, at program exit, or at the
next malloc:

free((void*)1);


void * a = malloc(2);
free(a);
free(a);


char * a = malloc(2);
free(a+1);


etc...
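
One way to catch mistakes like the ones above without crashing is to keep a table of live pointers and refuse to free anything that is not in it. The following is only a minimal sketch under that assumption; tracked_malloc and tracked_free are hypothetical names invented for this illustration, and the fixed-size table is a simplification.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_LIVE 64

static void *live[MAX_LIVE];   /* pointers handed out and not yet freed */

/* Hypothetical wrapper: allocate n bytes and remember the pointer.
 * Returns NULL if malloc fails or the table is full. */
static void *tracked_malloc(size_t n)
{
    int k;
    for (k = 0; k < MAX_LIVE; k++) {
        if (live[k] == NULL) {
            live[k] = malloc(n);
            return live[k];
        }
    }
    return NULL;
}

/* Free p only if tracked_malloc returned it and it is still live.
 * Returns 0 on success, -1 for an unknown pointer (for example an
 * interior pointer or a second free of the same block). */
static int tracked_free(void *p)
{
    int k;
    if (p == NULL)
        return 0;              /* freeing NULL is harmless, as with free() */
    for (k = 0; k < MAX_LIVE; k++) {
        if (live[k] == p) {
            live[k] = NULL;
            free(p);
            return 0;
        }
    }
    fprintf(stderr, "tracked_free: %p is not a live allocation\n", p);
    return -1;
}
```

With this wrapper, freeing a + 1 or freeing a twice returns -1 and prints a message instead of corrupting the heap. It does nothing about writes past the end of a block, which is the other common way to make a later free crash.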
 
