mysterious memory corruption, very confused

Seebs

ruby 1.8.7-p22, OS X 10.4.mumble, PostgreSQL 8.3.1, ruby-pg 2008-03-18.

I get random data corruption when trying to execute queries.

The data corruption comes and goes VERY unpredictably. I've narrowed it
down to a small chunk of the pg.c module, except I don't understand the
ruby interpreter well enough to say much more.

Here's what happens:

I pass a whole bunch of arguments in to a prepared statement. In pg.c,
that leads to a loop through nParams, reproduced here in case it means
anything to someone.

for(i = 0; i < nParams; i++) {
    param = rb_ary_entry(params, i);
    if (TYPE(param) == T_HASH) {
        param_value_tmp = rb_hash_aref(param, sym_value);
        if(param_value_tmp == Qnil)
            param_value = param_value_tmp;
        else
            param_value = rb_obj_as_string(param_value_tmp);
        param_format = rb_hash_aref(param, sym_format);
    }
    else {
        if(param == Qnil)
            param_value = param;
        else
            param_value = rb_obj_as_string(param);
        param_format = INT2NUM(0);
    }
    if(param_value == Qnil) {
        paramValues[i] = NULL;
        paramLengths[i] = 0;
    }
    else {
        Check_Type(param_value, T_STRING);
        paramValues[i] = StringValuePtr(param_value);
        paramLengths[i] = RSTRING_LEN(param_value);
        fprintf(stderr, "%d: %p -> %s\n",
            i, paramValues[i], paramValues[i]);
    }
    if(param_format == Qnil)
        paramFormats[i] = 0;
    else
        paramFormats[i] = NUM2INT(param_format);
}

for(i = 0; i < nParams; i++) {
    if (paramValues[i] && !strcmp(paramValues[i], "+rG")) {
        fprintf(stderr, "got a +rG %p in slot %d\n",
            paramValues[i], i);
        abort();
    }
}

Obviously, I added the printfs.

Running this, I get an abort:

0: 0x425af0 -> 102
2: 0x4265c0 -> true
3: 0x4265d0 -> 40.48324
4: 0x4265e0 -> -88.09905
5: 0x432820 -> 102
6: 0x422530 -> 65.2579241765071
7: 0x474ab0 -> 2008-06-17
8: 0x4258f0 -> 14:42:36
got a +rG 0x4265c0 in slot 2

So! Somewhere between the 2nd pass (out of 13 or so) through the first
loop, and the next loop, 0x4265c0 has gotten overwritten with garbage.

This is not specific to boolean data; I have also had it happen on strings,
but the boolean data was a bit easier to track down. This is a pure
heisenbug, which moves to new data depending on things like "the contents
of ARGV".

Can anyone give me a hint as to what I should be looking at? I tried turning
down compiler optimizations, to no noticeable effect. (It moved, but it
moves any time anything changes.) The "T_HASH" case is probably irrelevant,
as all 12 arguments are strings. About all I can think of is that, perhaps,
rb_obj_as_string is allocating strings which are getting garbage collected
before the end of the routine?

I'm afraid I can't make this bug report much more useful, I don't really
understand the code. I don't know how the garbage collector works, either.

.... But interestingly, wrapping the call to the API function this wraps
in GC.disable/GC.enable makes the bug go away. I'll annotate my rubyforge
bug, but if anyone here can tell me what I should be doing properly to
tag these things not to be collected until this function is done, I'd love
to know.
 
Seebs

Okay, this is almost certainly wrong, but:

I experimentally added an array of N "VALUE" objects. After the first
hunk of code has obtained the "correct" value, I then stash that value
in the Nth item of the array (or store a 0 there), and call
rb_gc_register_address(&param_string_values[i]);

After calling the PostgreSQL function, I loop through calling
rb_gc_unregister_address(&param_string_values[i]);

The program now runs on the whole data set available to me without errors;
that's about 15x as long as it usually made it before.

I'm not saying this is the correct fix, but I think it is pretty good
confirmation that the analysis is right and the garbage collector is the
culprit.
 
Roger Pack

> I get random data corruption when trying to execute queries.

valgrind might tell you if memory is being trampled.
 
Seebs

> valgrind might tell you if memory is being trampled.

It is. There's a loop of
VALUE x;
char **foo = malloc(buncha char *);

for (big list of things) {
    x = rb_obj_as_string(y);
    foo[i] = GetStringValue(x);
}

The idiom of using rb_obj_as_string, and then using the value, is common in
the Ruby source. It works. ... It works *as long as you don't allocate
anything more before you're done with it*. What ends up happening is that,
if enough of the objects in question need a new string allocated by
rb_obj_as_string, sooner or later you end up invoking the garbage collector.
Now, since there's only one x, the garbage collector assumes the current
rb_obj_as_string() return is in use, *and the others aren't*. So it might,
if it wants the space, free one... And then the memory gets reused.

With "big list" being about 12 items, about half of which needed new strings
allocated, this ended up blowing up about once every thousand or two thousand
runs. Unfortunately for me, I had about 80,000 data points. :)

I submitted a more detailed bug report to the ruby-pg project, and I've
adopted a workaround (possibly very inefficient) involving an array of VALUE
objects and rb_gc_{un}register_address. It's ugly but it eliminates the bug.
 
Jeff Davis

> > valgrind might tell you if memory is being trampled.
>
> It is. There's a loop of
> VALUE x;
> char **foo = malloc(buncha char *);
>
> for (big list of things) {
>     x = rb_obj_as_string(y);
>     foo[i] = GetStringValue(x);
> }
>
> The idiom of using rb_obj_as_string, and then using the value, is common in
> the Ruby source. It works. ... It works *as long as you don't allocate
> anything more before you're done with it*. What ends up happening is that,
> if enough of the objects in question need a new string allocated by
> rb_obj_as_string, sooner or later you end up invoking the garbage collector.
> Now, since there's only one x, the garbage collector assumes the current
> rb_obj_as_string() return is in use, *and the others aren't*. So it might,
> if it wants the space, free one... And then the memory gets reused.


Thanks again for the detailed analysis.

To be clear, you're saying that the new object created by
rb_obj_as_string() can be freed as soon as I allocate any new ruby
object?

Is this documented behavior? To be safe, should I always assume any
object that I allocate in C land lives only until the next object is
allocated (unless it's referenced by some other object Ruby knows about,
of course)?

Regards,
Jeff Davis
 
Nobuyoshi Nakada

Hi,

At Mon, 30 Jun 2008 15:47:19 +0900,
Seebs wrote in [ruby-talk:306636]:
> It is. There's a loop of
> VALUE x;
> char **foo = malloc(buncha char *);
>
> for (big list of things) {
>     x = rb_obj_as_string(y);
>     foo[i] = GetStringValue(x);
> }

It's your bug.

> The idiom of using rb_obj_as_string, and then using the value, is common in
> the Ruby source. It works. ... It works *as long as you don't allocate
> anything more before you're done with it*. What ends up happening is that,
> if enough of the objects in question need a new string allocated by
> rb_obj_as_string, sooner or later you end up invoking the garbage collector.
> Now, since there's only one x, the garbage collector assumes the current
> rb_obj_as_string() return is in use, *and the others aren't*. So it might,
> if it wants the space, free one... And then the memory gets reused.

Because you drop the references to the created objects. You
have to keep the objects, not only the pointers.

> I submitted a more detailed bug report to the ruby-pg project, and I've
> adopted a workaround (possibly very inefficient) involving an array of VALUE
> objects and rb_gc_{un}register_address. It's ugly but it eliminates the bug.

VALUE x, array;
char **foo = malloc(buncha char *);

for (big list of things) {
    x = rb_obj_as_string(y);
    rb_ary_push(array, x);
    foo[i] = GetStringValue(x);
}

By keeping the values in an automatic variable `array', they
are marked and won't be freed.
 
Tim Pease

On Jul 6, 2008, at 7:15 PM, Nobuyoshi Nakada wrote:
> Hi,
>
> At Mon, 30 Jun 2008 15:47:19 +0900,
> Seebs wrote in [ruby-talk:306636]:
> > It is. There's a loop of
> > VALUE x;
> > char **foo = malloc(buncha char *);
> >
> > for (big list of things) {
> >     x = rb_obj_as_string(y);
> >     foo[i] = GetStringValue(x);
> > }
>
> It's your bug.
>
> > The idiom of using rb_obj_as_string, and then using the value, is common in
> > the Ruby source. It works. ... It works *as long as you don't allocate
> > anything more before you're done with it*. What ends up happening is that,
> > if enough of the objects in question need a new string allocated by
> > rb_obj_as_string, sooner or later you end up invoking the garbage collector.
> > Now, since there's only one x, the garbage collector assumes the current
> > rb_obj_as_string() return is in use, *and the others aren't*. So it might,
> > if it wants the space, free one... And then the memory gets reused.
>
> Because you drop the references to the created objects. You
> have to keep the objects but not only the pointers.
>
> > I submitted a more detailed bug report to the ruby-pg project, and I've
> > adopted a workaround (possibly very inefficient) involving an array of VALUE
> > objects and rb_gc_{un}register_address. It's ugly but it eliminates the bug.
>
> VALUE x, array;
> char **foo = malloc(buncha char *);
>
> for (big list of things) {
>     x = rb_obj_as_string(y);
>     rb_ary_push(array, x);
>     foo[i] = GetStringValue(x);
> }
>
> By keeping the values in an automatic variable `array', they
> are marked and won't be freed.


I am wondering why the strings (returned from rb_obj_as_string) will
be garbage collected but the array will not be garbage collected? Both
have the same local scope, and they are not referenced by any other
ruby object.

Please explain when you have time.

Blessings,
TwP
 
Joel VanderWerf

Tim said:
> On Jul 6, 2008, at 7:15 PM, Nobuyoshi Nakada wrote: ...
> > VALUE x, array;
> > char **foo = malloc(buncha char *);
> >
> > for (big list of things) {
> >     x = rb_obj_as_string(y);
> >     rb_ary_push(array, x);
> >     foo[i] = GetStringValue(x);
> > }
> >
> > By keeping the values in an automatic variable `array', they
> > are marked and won't be freed.
>
> I am wondering why the strings (returned from rb_obj_as_string) will be
> garbage collected but the array will not be garbage collected? Both have
> the same local scope, and they are not referenced by any other ruby object.


IIUC, 'VALUE array' is a local, hence on stack, hence GC marks it. The
'VALUE x' local only protects the current string.
 
Tim Pease

> Tim said:
> > On Jul 6, 2008, at 7:15 PM, Nobuyoshi Nakada wrote: ...
> > > VALUE x, array;
> > > char **foo = malloc(buncha char *);
> > >
> > > for (big list of things) {
> > >     x = rb_obj_as_string(y);
> > >     rb_ary_push(array, x);
> > >     foo[i] = GetStringValue(x);
> > > }
> > >
> > > By keeping the values in an automatic variable `array', they
> > > are marked and won't be freed.
> >
> > I am wondering why the strings (returned from rb_obj_as_string)
> > will be garbage collected but the array will not be garbage
> > collected? Both have the same local scope, and they are not
> > referenced by any other ruby object.
>
> IIUC, 'VALUE array' is a local, hence on stack, hence GC marks it.
> The 'VALUE x' local only protects the current string.


Thanks, Joel! That makes sense.

Blessings,
TwP
 
Jeff Davis

> I am wondering why the strings (returned from rb_obj_as_string) will
> be garbage collected but the array will not be garbage collected? Both
> have the same local scope, and they are not referenced by any other
> ruby object.

Thanks for mentioning that, I had the same question.

Also, what *exactly* can I do between:
x = rb_obj_as_string(y)
and a statement that makes "x" safe (e.g. stores a reference in some
other value that's safe from collection)?

In other words, what ruby routines might invoke the garbage collector,
and thus possibly destroy any un-saved objects that I might have?

Regards,
Jeff Davis
 
Nobuyoshi Nakada

Hi,

At Tue, 8 Jul 2008 15:02:58 +0900,
Jeff Davis wrote in [ruby-talk:307560]:
> Also, what *exactly* can I do between:
> x = rb_obj_as_string(y)
> and a statement that makes "x" safe (e.g. stores a reference in some
> other value that's safe from collection)?

Basically, "x" is safe as long as it is referred to from an
automatic variable. But if you only use its internal pointer,
e.g., RSTRING_PTR or RARRAY_PTR, the variable may be optimized
out by the compiler. You can prevent this with the RB_GC_GUARD
macro.

RB_GC_GUARD(x) = rb_obj_as_string(y);
 
