Initializing a malloc'ed struct whose fields are const with run-time values

Ben Bacarisse

Shao Miller said:
To Ben: I think the effective type would be 'struct toto', as 'memcpy'
is included in 6.5p6.

Interesting! My recollection was that memcpy would treat the
destination as a char array but would then set the effective type for
subsequent accesses, but the text says it sets the effective type "for
that access and for subsequent accesses".

However, I think the conclusion is still valid. To be more specific, I
think the effective type is a red herring -- I don't think it matters when
determining what can and can't be modified here.

The closest approach to UB that I can find is this (7.1.4 p1):

"Each of the following statements applies unless explicitly stated
otherwise in the detailed descriptions that follow: If an argument to
a function has an invalid value (such as [...] a pointer to
non-modifiable storage when the corresponding parameter is not
const-qualified) ..."

But that does not apply here -- the argument is a pointer to modifiable
storage no matter what its (effective) type is about to become. Of
course, there may be some other reason why it's UB, but I can't see one.
 
Ben Bacarisse

Shao Miller said:
Are you sure about that? 'res' undergoes lvalue conversion and no
longer has qualified type.

Am I sure about what? I think you know that you can't assign to a
struct toto object, so are you asking whether I am sure that returning this
struct type limits what one can do with it?

A pointer to allocated storage is more flexible (because you have
control over the lifetime) and the function as given is almost identical
to a simple expression: (struct toto){i, f, p}.
 
Shao Miller

Am I sure about what? I think you know that you can't assign to a
struct toto object, so are you asking whether I am sure that returning this
struct type limits what one can do with it?

A pointer to allocated storage is more flexible (because you have
control over the lifetime) and the function as given is almost identical
to a simple expression: (struct toto){i, f, p}.

Sorry, Ben. I've approached this in my head in four ways:

1. (Non-preferred; must be wrong because of lvalue conversion)

'res', before lvalue conversion, has a compatible type with the return
type of the function, so "the value is converted as if by assignment to
an object having the return type of the function" does not apply, and
"the value of" 'res' "is returned to the caller as the value of the
function call expression"

2. (More preferred than #1)

'res' undergoes lvalue conversion and yields an unqualified type with
the value. This type is incompatible with the return type of the
function, so "the value is converted as if by assignment to an object
having the return type of the function", but the footnote reminds us
that "The return statement is not an assignment." So there might be one
or more differences.

Looking at 6.5.16p1, it cannot apply as it is, since "an object having
the return type of the function" isn't the same as "a modifiable lvalue
designating an object having the return type of the function." We need
to twist it a bit. There's also no direct discussion of conversion for
the RHS operand... That's "2 strikes". So what if we drop that bit and
move on to 6.5.16.1?

Alternatively to moving on, there's a constraint violation (as you've
mentioned) based on "modifiable". But why pick "modifiable" and not
"lvalue"? :)

Looking at 6.5.16.1p1, the return type of 'foo' and the unqualified type
of rvalue from 'res' satisfy the second bullet. Then p2 does indeed
refer back to 6.5.16p2, but only insofar as the type of the
assignment-expression. The type of the assignment-expression so happens
to be the unqualified type of 'struct toto', which is the same type as
the rvalue from 'res', so there's actually no conversion for the rvalue
from 'res'.

3. (Preferred)

6.7.3p4: "The properties associated with qualified types are meaningful
only for expressions that are lvalues.132)" But "the value of the
function call expression" is never an lvalue, so perhaps the return type
of a function is never qualified.

In that case, the lvalue conversion of 'res' yields an rvalue whose type
is already compatible with the return type, so "the value is converted
as if by assignment to an object having the return type of the function"
does not apply, and "the value of" 'res' "is returned to the caller as
the value of the function call expression".

If #3 is true, the wording could be a bit clearer.

4. (D'oh!)

Unfortunately, I don't think any of #1, #2, #3 is true, due to 6.7.3p9:
"...If the specification of a function type includes any type
qualifiers, the behavior is undefined.136)" (Since at least C90.) I
think that means that 'foo' can't be declared that way.

(This is Ike Naar's 'foo', obviously. Citations relative to N1570.)
 
Ben Bacarisse

Nick Bowler said:
Danger, Will Robinson!

Implementations are allowed to define a function-like macro called
memcpy, and the above line will not work on implementations that do so.
Macro arguments will be split on the commas in the compound literal,
which is almost certainly not desirable.

Good point. I'd forgotten about that entirely. Fortunately the
compiler will tell you what's happened so this should be a short-lived
problem.

Therefore, it is necessary to add parentheses (or to #undef memcpy...),
either around the compound literal to avoid such splitting, or around
memcpy to avoid any macro substitution at all. For example:

if (res != NULL) memcpy(res, (&(struct toto){i, f, p}), sizeof *res);
if (res != NULL) (memcpy)(res, &(struct toto){i, f, p}, sizeof *res);

I'd choose the first (ugly though it is) just because the macro might
well be there because it provides some benefit.

There might have been an argument for counting {} along with ()s when
collecting, well, arguments but it's too late now...
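
The splitting can be reproduced without relying on how any particular library defines memcpy. Here is a minimal sketch, assuming a made-up function-like macro COPY_BYTES standing in for a macro memcpy, and using the corrected struct toto definition from later in the thread. The names COPY_BYTES and toto_new are invented for illustration only.

```c
#include <stdlib.h>
#include <string.h>

struct toto { const int i; const float f; void *const p; };

/* Hypothetical stand-in for an implementation that provides memcpy as a
   function-like macro; COPY_BYTES is an invented name. */
#define COPY_BYTES(dst, src, n) memcpy((dst), (src), (n))

struct toto *toto_new(int i, float f, void *p)
{
    struct toto *res = malloc(sizeof *res);
    if (res != NULL) {
        /* COPY_BYTES(res, &(struct toto){i, f, p}, sizeof *res);
           would not compile: braces do not protect commas, so the
           preprocessor sees five macro arguments instead of three. */
        COPY_BYTES(res, (&(struct toto){i, f, p}), sizeof *res);
    }
    return res;
}
```

The extra parentheses around the compound literal keep it together as a single macro argument, which is the first of the two forms quoted above.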
 
Ben Bacarisse

William Ahern said:
C99 (N1256) 6.5p6:

If a value is copied into an object having no declared type using
memcpy or memmove, or is copied as an array of character type, then
the effective type of the modified object for that access and for
subsequent accesses that do not modify the value is the effective
type of the object from which the value is copied, if it has one.

I guess the effective type arises only after memcpy completes.

I'd say that "for that access and for subsequent accesses" includes the
memcpy itself, though my memory of that paragraph was as you describe --
some sort of byte array copy and only subsequently a new effective type.

Anyway, I suspect the effective type is a red herring. I don't see any
problems arising from this aspect of the code.
 
Shao Miller

I believe that changed with C99. It only applies to objects of static
storage duration. 6.7.8p4 + "exceptio probat regulam in casibus non
exceptis".

Thank you, Mr. William Ahern.

I think it has.

Thank you, Ben.

You think correctly - as of C99, that restriction only applies to
initializers for objects of static storage duration. As of C2011, it
also applies to initializers for objects with thread storage duration
(6.7.9p4); only C90 imposed it on initializers for objects with
automatic storage duration.

Thank you, Mr. James Kuyper.
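
To make the storage-duration distinction concrete, here is a small sketch. The struct pair and the function name are invented for illustration; the rule itself is the one cited above.

```c
struct pair { int a; int b; };

/* Static storage duration: every initializer must be a constant expression. */
static struct pair fixed = { 1, 2 };

int sum_with_runtime_init(int x, int y)
{
    /* Automatic storage duration: since C99 the list may hold run-time
       values. C90 required constant expressions here as well. */
    struct pair local = { x, y };

    /* static struct pair bad = { x, y };
       would be a constraint violation: static storage duration. */

    return local.a + local.b + fixed.a + fixed.b;
}
```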

Shao Miller said:
#include <stdlib.h>
#include <string.h>
struct toto *foo(int i, float f, void *p)
{
    struct toto s = { i, f, p };
    struct toto *res = malloc(sizeof *res);
    if (res != NULL) memcpy(res, &s, sizeof s);
    return res;
}

[...]

2) I'm not sure it is well-defined to memcpy stuff into some const
fields. Does it tickle the cranky UB gods?

No, I think it's fine. You are not modifying anything that is const.
malloced storage has no declared type and the memcpy probably makes the
effective type a byte array.

To Ben: I think the effective type would be 'struct toto', as 'memcpy'
is included in 6.5p6.

Interesting! My recollection was that memcpy would treat the
destination as a char array but would then set the effective type for
subsequent accesses, but the text says it sets the effective type "for
that access and for subsequent accesses".

However, I think the conclusion is still valid. To be more specific, I
think the effective type is a red herring -- I don't think it matters when
determining what can and can't be modified here.

The closest approach to UB that I can find is this (7.1.4 p1):

"Each of the following statements applies unless explicitly stated
otherwise in the detailed descriptions that follow: If an argument to
a function has an invalid value (such as [...] a pointer to
non-modifiable storage when the corresponding parameter is not
const-qualified) ..."

But that does not apply here -- the argument is a pointer to modifiable
storage no matter what its (effective) type is about to become. Of
course, there may be some other reason why it's UB, but I can't see one.

My original note to Noob about effective type was not applicable to
Noob's case, but just a warning about using 'memcpy' into allocated
storage for C >= C99, in general. For example, just before the 'return':

if (res) {
    int * ip = (void *) ((char *) res + offsetof(struct toto, i));
    /* 'ip' not in bullets of 6.5p7 */
    *ip;
}

(But I could be forgetting something else.)
 
Ben Bacarisse

Shao Miller said:
Sorry, Ben. I've approached this in my head in four ways:

1. (Non-preferred; must be wrong because of lvalue conversion)

'res', before lvalue conversion, has a compatible type with the return
type of the function, so "the value is converted as if by assignment
to an object having the return type of the function" does not apply,
and "the value of" 'res' "is returned to the caller as the value of
the function call expression"

2. (More preferred than #1)

'res' undergoes lvalue conversion and yields an unqualified type with
the value. This type is incompatible with the return type of the
function, so "the value is converted as if by assignment to an object
having the return type of the function", but the footnote reminds us
that "The return statement is not an assignment." So there might be
one or more differences.

Looking at 6.5.16p1, it cannot apply as it is, since "an object having
the return type of the function" isn't the same as "a modifiable
lvalue designating an object having the return type of the function."
We need to twist it a bit. There's also no direct discussion of
conversion for the RHS operand... That's "2 strikes". So what if we
drop that bit and move on to 6.5.16.1?

Alternatively to moving on, there's a constraint violation (as you've
mentioned) based on "modifiable". But why pick "modifiable" and not
"lvalue"? :)

Looking at 6.5.16.1p1, the return type of 'foo' and the unqualified
type of rvalue from 'res' satisfy the second bullet. Then p2 does
indeed refer back to 6.5.16p2, but only insofar as the type of the
assignment-expression. The type of the assignment-expression so
happens to be the unqualified type of 'struct toto', which is the same
type as the rvalue from 'res', so there's actually no conversion for
the rvalue from 'res'.

3. (Preferred)

6.7.3p4: "The properties associated with qualified types are
meaningful only for expressions that are lvalues.132)" But "the value
of the function call expression" is never an lvalue, so perhaps the
return type of a function is never qualified.

In that case, the lvalue conversion of 'res' yields an rvalue whose
type is already compatible with the return type, so "the value is
converted as if by assignment to an object having the return type of
the function" does not apply, and "the value of" 'res' "is returned to
the caller as the value of the function call expression".

If #3 is true, the wording could be a bit clearer.

4. (D'oh!)

Unfortunately, I don't think any of #1, #2, #3 is true, due to
6.7.3p9: "...If the specification of a function type includes any type
qualifiers, the behavior is undefined.136)" (Since at least C90.) I
think that means that 'foo' can't be declared that way.

You wrote eight paragraphs of stuff that you don't think is true?

How do you interpret 6.7.3 p9 so that the function above is undefined,
but const char *int_to_str(int) is not?
 
Shao Miller

Shao said:
Because they amount to the same thing, behind the scenes, on a few
implementations? ;) Lots of people still do:

struct tag s;
memset(&s, 0, sizeof s);

instead of:

struct tag s = { 0 };

But the latter seems better (when possible).
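
A short sketch of the difference between the two forms, with an invented member list for struct tag. The initializer gives each member its proper zero value, whereas memset only guarantees that every byte is zero.

```c
#include <string.h>

struct tag { int n; double d; void *p; };   /* members invented for this sketch */

int zeroed_by_initializer(void)
{
    /* Members without an explicit initializer are initialized as if they
       had static storage duration: 0, 0.0 and a null pointer. */
    struct tag s = { 0 };
    return s.n == 0 && s.d == 0.0 && s.p == NULL;
}

int zeroed_by_memset(void)
{
    struct tag s;
    memset(&s, 0, sizeof s);   /* all bytes zero, padding included */
    /* n is certainly 0. Whether all-bits-zero also means 0.0 or a null
       pointer is up to the implementation, though it does on common ones. */
    return s.n == 0;
}
```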


This has the same problem with the initializer list, though.

I would guess that 'malloc' was being used for lifetime considerations,
but obviously only Noob knows.

As I often do, my simplification went a bit too far, so I'll just
describe the actual use-case.
(This involves two threads on a POSIX-compliant platform.)

I have a foo_start function which malloc's space for a "context"
struct, populates the struct according to the function's parameters,
then spawns a new thread which is passed this context. (This is why
dynamic allocation must be used.)

To make matters more complex, I have a flexible array at the end of
the struct.

Basically, struct ctx is defined this way:

struct ctx {
    int file_idx;
    const char *buf;
    sem_t sem;
    char path[];
};

void *foo_run(void *arg) {
    struct ctx *ctx = arg;
    /* do stuff in an infinite loop, according to ctx */
}

void foo_start(int param1, int param2, ...)
{
    struct ctx *ctx = malloc(sizeof *ctx + paramx);
    /* populate the fields of ctx */
    spawn(foo_run, ctx);
}

In my current version, I don't have any const qualifiers in my code.
I like to look at the assembly code generated by the compiler, and I
noticed that every time I need some field from ctx, the compiler has
to reload it, instead of caching the value in a register.

As far as I understand, this is expected: the address of the struct
could be stored anywhere, and any function in a different translation
unit could "pull the rug" from under me. Except that *I* wrote the
code, and I *know* (by design) that most fields in the struct do NOT
change after init.

So I set out to sprinkle a few "const" qualifiers here and there, to
see if that would convince the compiler that some optimizations are
indeed possible (I know this looks like a clear case of premature
optimization, but I figured I might as well learn something new along
the way!)

I don't think I can just const-qualify the entire struct, because
1) the prototype for a thread's entry point is imposed by POSIX
2) const-qualifying a pointer parameter is only a contract between
the user and the function's implementer, saying "my function won't
touch your preciousss struct", it doesn't say that the struct won't
be changed by something else.

Whereas, I was under the impression that a const-qualified field
means "hear, hear, this field SHALL NEVER change!"

Anyway, I'd be happy to hear the word from the regs, and/or take
this to comp.unix.programmer at some point (although I do believe
that the core of my question is a C question, not POSIX).
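
For what it's worth, the memcpy approach discussed earlier in the thread seems to carry over to a struct with a flexible array member. The following is only a sketch under assumptions: the const qualifiers on the fields, the function name ctx_new and its parameters are invented here, and the semaphore and the thread creation are left out so that it stays plain C99.

```c
#include <stdlib.h>
#include <string.h>

struct ctx {
    const int file_idx;        /* const added for this sketch */
    const char *const buf;     /* const added for this sketch */
    char path[];               /* flexible array member, requires C99 */
};

/* Hypothetical helper: allocates a context and fills it in once. */
struct ctx *ctx_new(int file_idx, const char *buf, const char *path)
{
    size_t path_len = strlen(path) + 1;
    struct ctx init = { file_idx, buf };   /* run-time initializers: C99 */
    struct ctx *ctx = malloc(sizeof *ctx + path_len);

    if (ctx != NULL) {
        memcpy(ctx, &init, sizeof init);     /* fixed part first */
        memcpy(ctx->path, path, path_len);   /* then the trailing path */
    }
    return ctx;
}
```

The fixed part is copied before the path on purpose, so that any trailing padding counted in sizeof init cannot overwrite path bytes afterwards.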

Although I gave the wrong answer about C >= C99 aggregate/union
initializer-list, if you'll still permit, I'd like to offer some
thoughts on why using 'offsetof' (as a previous post did) might have
benefit:

1. Your non-constant initializers require C >= C99.

2. The work of initialization and then copying to allocated storage
is more work (but prettier, perhaps) than assigning directly to the
allocated storage, unless working with automatic objects has a special
advantage in terms of speed.

3. For C >= C99, the effective types for the sub-objects in the
allocated storage will remain modifiable, but the 'struct toto *' used
to access any of those will only allow 'const' access; perhaps the
artificial limitation you desire? "Internal" functions could use
'offsetof', if needed.

Using 'memcpy' from another 'struct toto' establishes the effective type
of the allocated storage and then it can be argued that there are
complications from pointing into the storage with a pointer to an
unqualified referenced type, then using such pointers to read the value.
(Although I think that argument would disappear if you use a 'const
int *' to read the 'const int' sub-object of the allocated storage,
instead.)
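
A sketch of that last suggestion: reading the member through a pointer whose referenced type carries the same qualifier as the member. The function name toto_read_i is invented, and the struct definition is the corrected one given later in the thread.

```c
#include <stddef.h>

struct toto { const int i; const float f; void *const p; };

/* Reads member i via offsetof, using 'const int *' rather than 'int *'
   so that the lvalue type matches the member's declared type. */
int toto_read_i(const struct toto *t)
{
    const int *ip = (const void *)((const char *)t + offsetof(struct toto, i));
    return *ip;
}
```

The same function works whether the struct lives in allocated storage or in a declared object.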
 
Tim Rentsch

Noob said:
Is it possible, in standard C89 and C99, to initialize a
struct whose fields are const with values only known at
run-time?

Yes, although it's easier in C99, because of the point you
identify below (point 1).

[snip] The best I could come up with is:

#include <stdlib.h>
#include <string.h>
struct toto *foo(int i, float f, void *p)
{
    struct toto s = { i, f, p };
    struct toto *res = malloc(sizeof *res);
    if (res != NULL) memcpy(res, &s, sizeof s);
    return res;
}

which has several defects:

1) Apparently, C89 does not allow one to use elements "not
computable at load time" in an initialization list. However, it
is allowed both in C99 (right?) and in gnu89.

Right. I recommend -std=c99 in preference to -std=gnu89 (and
of course -pedantic-errors). According to another posting you
want to use flexible array members so apparently you will be
using C99 anyway.

2) I'm not sure it is well-defined to memcpy stuff into some
const fields. [Is the behavior well-defined?]

Yes, well-defined behavior, since the space being copied into
is allocated memory. Moreover copying from a declared struct
of the same type does the right thing in terms of effective
type rules.
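
For the C89 side of the question, where the initializer list cannot hold run-time values, one possibility is the offsetof approach mentioned elsewhere in the thread: store each member directly into the allocated storage through a pointer to the unqualified member type. This is only a sketch, and the function name toto_create_c89 is invented.

```c
#include <stddef.h>
#include <stdlib.h>

struct toto { const int i; const float f; void *const p; };

/* Hypothetical C89-style constructor: no run-time initializer list and
   no compound literal, just stores into malloc'ed storage. */
struct toto *toto_create_c89(int i, float f, void *p)
{
    struct toto *res = malloc(sizeof *res);
    if (res != NULL) {
        char *base = (char *)res;
        /* The storage has no declared type, so nothing defined as const
           is being modified by these stores. */
        *(int *)(base + offsetof(struct toto, i)) = i;
        *(float *)(base + offsetof(struct toto, f)) = f;
        *(void **)(base + offsetof(struct toto, p)) = p;
    }
    return res;
}
```

Callers then see the members only through 'struct toto *', that is, as const.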
 
Tim Rentsch

Ian Collins said:
Ike said:

And how about returning by value?

struct toto foo(int i, float f, void *p)
{
    struct toto res = {i, f, p};
    return res;
}

Is that legal? struct toto has const members, so can it be
returned by value from a function?

Can you name a constraint or a 'shall' provision that it
violates? I can't find one. If there is no such violation,
then the behavior is well-defined (and I believe that is in
fact the case).

Furthermore this result is more useful than it might appear at
first blush, because such functions can be used as part of
initialization:

struct toto somethin = foo( 0, 3.14, malloc( 97 ) );

Of course there isn't much point for this particular
definition of foo(), but it's easy to imagine other
"constructor" functions that would be quite handy to have
around.
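
A compact sketch of such a "constructor" used in an initialization. The names toto_make and demo are invented. Note that, as reported elsewhere in this thread, at least one compiler rejected a return of this struct type, so treat this as what gcc accepts rather than as a guarantee for every compiler.

```c
struct toto { const int i; const float f; void *const p; };

/* Hypothetical constructor returning the struct by value. */
struct toto toto_make(int i, float f, void *p)
{
    struct toto res = { i, f, p };
    return res;
}

int demo(void)
{
    /* Initialization from the returned value, not assignment. */
    struct toto something = toto_make(0, 3.14f, 0);

    /* something = toto_make(1, 2.0f, 0);
       would be rejected: 'something' is not a modifiable lvalue. */

    return something.i == 0 && something.p == 0;
}
```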
 
Shao Miller

You wrote eight paragraphs of stuff that you don't think is true?

Well I was hopeful for #3, but seemed to recall having previously read
some other bit of Standard text contradicting it. When I went looking,
I came across 6.7.3p9 and it looked like it was the antagonist. Perhaps
that wasn't it, after all.

How do you interpret 6.7.3 p9 so that the function above is undefined,
but const char *int_to_str(int) is not?

Well I was thinking along the lines of the qualification of the return
type's type category, but maybe not...

Have you any thoughts about #1, #2, #3?
 
Tim Rentsch

James Kuyper said:
A struct containing a member that is const-qualified is not modifiable
(6.3.2.1p1). It is a constraint violation for an assignment operator to
have a left operand which is not a modifiable lvalue (6.5.16p2).
Arguments passed to a function (6.5.2.2p7), and values returned by a
function (6.8.6.4p3), are both converted "as if by assignment", and I
believe are therefore subject to the same constraints.

Too funny.

First, the constraint violation for assignment operators has
nothing to do with conversions that are done as part of performing
the operation. These aspects appear in different paragraphs and
in different types of subsections.

Second, 'return' only does such an assignment-like conversion
when the type of the return expression differs from the result
type of the function.
 
Tim Rentsch

Ian Collins said:
I just tried this code:

#include <stdio.h>

struct toto { const int i; const float f; const void *p; };

struct toto foo(int i, float f, void *p)
{
    return (struct toto){i, f, p};
}

int main(void)
{
    float f = foo( 1,2,NULL).f;
}

and gcc (-Wall -std=c99 -pedantic) compiles it, but Sun c99 (which
tends to be both strict and conforming) rejects it with one error:

"x.c", line 7: left operand must be modifiable lvalue: op "="

where line 7 is the function return.

I believe gcc has it right here. Even the error message from
the Sun compiler suggests they were thinking of this like a
kind of assignment, which it certainly isn't in this case.
 
Ian Collins

Tim said:
Ian Collins said:
Ike said:
[snip]

And how about returning by value?

struct toto foo(int i, float f, void *p)
{
    struct toto res = {i, f, p};
    return res;
}

Is that legal? struct toto has const members, so can it be
returned by value from a function?

Can you name a constraint or a 'shall' provision that it
violates? I can't find one. If there is no such violation,
then the behavior is well-defined (and I believe that is in
fact the case).

No, I'm not one for remembering where stuff is in the standard... But I
think I (and possibly the Sun compiler writers) took the assignment-like
nature of return too literally.
 
Tim Rentsch

Ian Collins said:
Tim said:
Ian Collins said:
Ike Naar wrote:
[snip]

And how about returning by value?

struct toto foo(int i, float f, void *p)
{
    struct toto res = {i, f, p};
    return res;
}

Is that legal? struct toto has const members, so can it be
returned by value from a function?

Can you name a constraint or a 'shall' provision that it
violates? I can't find one. If there is no such violation,
then the behavior is well-defined (and I believe that is in
fact the case).

No, I'm not one for remembering where stuff is in the
standard...

Sorry, what I said came out sounding accusatory and I
didn't mean it that way. All I meant was, this question
is what should be considered to resolve the issue.

But I think I (and possibly the Sun compiler writers) took
the assignment-like nature of return too literally.

Actually my first reaction was similar to that, even though
I wasn't thinking specifically of assignment. Upon further
consideration though allowing it makes more sense, because
what's happening is more like an initialization than an
assignment, and initializers certainly are allowed for
const-qualified types.
 
Ian Collins

Tim said:
Ian Collins said:
Tim said:
Ike Naar wrote:
[snip]

And how about returning by value?

struct toto foo(int i, float f, void *p)
{
    struct toto res = {i, f, p};
    return res;
}

Is that legal? struct toto has const members, so can it be
returned by value from a function?

Can you name a constraint or a 'shall' provision that it
violates? I can't find one. If there is no such violation,
then the behavior is well-defined (and I believe that is in
fact the case).

No, I'm not one for remembering where stuff is in the
standard...

Sorry, what I said came out sounding accusatory and I
didn't mean it that way. All I meant was, this question
is what should be considered to resolve the issue.

No offense taken.

Actually my first reaction was similar to that, even though
I wasn't thinking specifically of assignment. Upon further
consideration though allowing it makes more sense, because
what's happening is more like an initialization than an
assignment, and initializers certainly are allowed for
const-qualified types.

Now I've taken the time to check 6.8.6.4p3 I think the wording

"the value of the expression is returned to the caller as the value of
the function call expression"

is clear enough. I might ask the now Oracle compiler people why they
consider it an error.
 
Ben Bacarisse

Shao Miller said:
There is probably a better example, since none of the members here are
non-'const'.

Not that it matters much but I just spotted that p is not const. I
don't think it affects much of the discussion though, and the OP may in
fact have intended it to be.

<snip>
 
Noob

Ben said:
Not that it matters much but I just spotted that p is not const. I
don't think it affects much of the discussion though, and the OP may in
fact have intended it to be.

You're right, I meant:

struct toto { const int i; const float f; void *const p; };

And it turns out that all this work was fruitless, as gcc does not
take advantage of the const-ness of the fields, and keeps reloading
the values from memory every time they are needed...

Regards.
 
BartC

Noob said:
You're right, I meant:

struct toto { const int i; const float f; void *const p; };

And it turns out that all this work was fruitless, as gcc does not
take advantage of the const-ness of the fields, and keeps reloading
the values from memory every time they are needed...

Where should it load them from instead?

Perhaps there was no opportunity to keep the values resident in registers.
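
If the aim is just to give the compiler a chance to keep the values in registers, one option that does not depend on const at all is to copy the fields into local variables once, before the loop. This is a sketch with invented names (struct ctx members step and limit, function fill_multiples), and whether it changes the generated code depends on the compiler and the options used.

```c
#include <stddef.h>

struct ctx { const int step; const int limit; };   /* invented members */

/* Stores 0, step, 2*step, ... (while below limit) into out[0..n-1]
   and returns the sum of the stored values. */
long fill_multiples(const struct ctx *ctx, int *out, size_t n)
{
    /* Read each field once. Locals whose address is never taken cannot
       be changed by the stores through 'out', so no reload is needed. */
    const int step = ctx->step;
    const int limit = ctx->limit;
    long total = 0;
    int value = 0;
    size_t k;

    for (k = 0; k < n && value < limit; k++) {
        out[k] = value;
        total += value;
        value += step;
    }
    return total;
}
```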
 
Ben Bacarisse

Tim Rentsch said:
Ian Collins said:
I just tried this code:

#include <stdio.h>

struct toto { const int i; const float f; const void *p; };

struct toto foo(int i, float f, void *p)
{
    return (struct toto){i, f, p};
}

int main(void)
{
    float f = foo( 1,2,NULL).f;
}

and gcc (-Wall -std=c99 -pedantic) compiles it, but Sun c99 (which
tends to be both strict and conforming) rejects it with one error:

"x.c", line 7: left operand must be modifiable lvalue: op "="

where line 7 is the function return.

I believe gcc has it right here. Even the error message from
the Sun compiler suggests they were thinking of this like a
kind of assignment, which it certainly isn't in this case.

As you say elsewhere, the assignment-like nature of the return only
kicks in when the types differ, but even when it does, I'd take "is
converted as if by assignment" to mean that only the conversion and type
constraints of assignment are to be considered. (The constraints being
important because they would, for example, prevent a function with type
'T *' returning a 'const T *' value without a diagnostic).

More curious to me is that gcc (in conforming mode) permits this:

struct toto { const int i; const float f; const void *p; };

struct toto foo(int i, float f, void *p)
{
    return (struct toto){i, f, p};
}

int bar(struct toto p) { return p.i; }

int main(void)
{
    return bar(foo(1, 2, 0));
}

but 6.5.2.2 p2 says:

"Each argument shall have a type such that its value may be assigned
to an object with the unqualified version of the type of its
corresponding parameter."

The phrase "unqualified version of the type" can't surely be intended to
mean the recursive removal of all type qualifiers. Elsewhere it simply
means that the "top level" qualifiers are removed. If that's the right
interpretation, the call to bar should be a constraint violation.

It looks like gcc is borrowing from C++ where argument passing is
defined to be like initialisation (which I've always found to be more
logical than C's assignment-like call semantics). Anyway, either I'm
misreading something or gcc has missed something.
 
