Seebs
ruby 1.8.7-p22, OS X 10.4.mumble, PostgreSQL 8.3.1, ruby-pg 2008-03-18.
I get random data corruption when trying to execute queries.
The data corruption comes and goes VERY unpredictably. I've narrowed it
down to a small chunk of the pg.c module, except I don't understand the
ruby interpreter well enough to say much more.
Here's what happens:
I pass a whole bunch of arguments in to a prepared statement. In pg.c,
that leads to a loop through nParams, reproduced here in case it means
anything to someone.
    for(i = 0; i < nParams; i++) {
        param = rb_ary_entry(params, i);
        if (TYPE(param) == T_HASH) {
            param_value_tmp = rb_hash_aref(param, sym_value);
            if(param_value_tmp == Qnil)
                param_value = param_value_tmp;
            else
                param_value = rb_obj_as_string(param_value_tmp);
            param_format = rb_hash_aref(param, sym_format);
        }
        else {
            if(param == Qnil)
                param_value = param;
            else
                param_value = rb_obj_as_string(param);
            param_format = INT2NUM(0);
        }
        if(param_value == Qnil) {
            paramValues[i] = NULL;
            paramLengths[i] = 0;
        }
        else {
            Check_Type(param_value, T_STRING);
            paramValues[i] = StringValuePtr(param_value);
            paramLengths[i] = RSTRING_LEN(param_value);
            fprintf(stderr, "%d: %p -> %s\n",
                i, paramValues[i], paramValues[i]);
        }
        if(param_format == Qnil)
            paramFormats[i] = 0;
        else
            paramFormats[i] = NUM2INT(param_format);
    }
    for(i = 0; i < nParams; i++) {
        if (paramValues[i] && !strcmp(paramValues[i], "+rG")) {
            fprintf(stderr, "got a +rG %p in slot %d\n",
                paramValues[i], i);
            abort();
        }
    }
Obviously, I added the printfs.
Running this, I get an abort:
0: 0x425af0 -> 102
2: 0x4265c0 -> true
3: 0x4265d0 -> 40.48324
4: 0x4265e0 -> -88.09905
5: 0x432820 -> 102
6: 0x422530 -> 65.2579241765071
7: 0x474ab0 -> 2008-06-17
8: 0x4258f0 -> 14:42:36
got a +rG 0x4265c0 in slot 2
So! Somewhere between the 2nd pass (out of 13 or so) through the first
loop, and the next loop, 0x4265c0 has gotten overwritten with garbage.
This is not specific to boolean data; I have also had it happen on strings,
but the boolean data was a bit easier to track down. This is a pure
heisenbug, which moves to new data depending on things like "the contents
of ARGV".
Can anyone give me a hint as to what I should be looking at? I tried turning
down compiler optimizations, to no noticeable effect. (It moved, but it
moves any time anything changes.) The "T_HASH" case is probably irrelevant,
as all 12 arguments are strings. About all I can think of is that, perhaps,
rb_obj_as_string is allocating strings which are getting garbage collected
before the end of the routine?
I'm afraid I can't make this bug report much more useful; I don't really
understand the code. I don't know how the garbage collector works, either.
.... But interestingly, wrapping the call to the API function this wraps
in GC.disable/GC.enable makes the bug go away. I'll annotate my rubyforge
bug, but if anyone here can tell me what I should be doing properly to
tag these things not to be collected until this function is done, I'd love
to know.
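
For anyone who wants to try the same workaround, here is a minimal sketch of the GC.disable/GC.enable wrapping. The helper name without_gc is made up for illustration, and the commented-out call at the bottom (conn, "my_stmt", args) is a placeholder rather than my actual code. This only hides the problem on the Ruby side; it is not a fix for pg.c.

```ruby
# Hypothetical helper (not part of ruby-pg): run a block with the garbage
# collector switched off, then restore the previous GC state even if the
# block raises.
def without_gc
  was_disabled = GC.disable   # GC.disable returns true if GC was already off
  yield
ensure
  # Only turn GC back on if this helper was the one that turned it off.
  GC.enable unless was_disabled
end

# Usage sketch; conn, "my_stmt" and args are placeholders:
#   result = without_gc { conn.exec_prepared("my_stmt", args) }
```

The ensure clause matters here: if the query raises, the collector would otherwise stay disabled for the rest of the process.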