Keyword parameters

Keith Thompson

The second argument to strtod() is such.

No, strtod() isn't variadic. You can write any of the following:
strtod("42", NULL);
strtod("42", (void*)NULL);
strtod("42", 0);
strtod("42", (void*)0);
and the argument will be implicitly converted to char**.
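A minimal sketch of that point, assuming nothing beyond <stdlib.h>: with the prototype of strtod() in scope, every spelling of a null pointer constant is converted to char ** on the way in. The helper names parse_double and parse_ignoring_end are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse a double and report whether any
   characters were consumed. */
static int parse_double(const char *text, double *out)
{
    char *end;
    assert(text != NULL && out != NULL);
    *out = strtod(text, &end);   /* &end already has type char ** */
    return end != text;          /* nonzero if something was parsed */
}

/* Hypothetical helper: all four spellings pass a null char **,
   because the prototype drives the conversion. */
static double parse_ignoring_end(const char *text)
{
    double a = strtod(text, NULL);
    double b = strtod(text, (void *)NULL);
    double c = strtod(text, 0);
    double d = strtod(text, (void *)0);
    assert(a == b && b == c && c == d);
    return a;
}
```

None of this carries over to arguments that fall under a ", ..." in a declaration, where no prototype-driven conversion happens.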
 
James Harris

Ike Naar said:
It is known that the variable part of the argument list has the form

"keyword0", param0, "keyword1", param1, ... , "keywordN", paramN,
SENTINEL

where all "keyword*" entries have type char*, so it would not be
unreasonable to let the sentinel have type char* as well since that
is the type that the function expects for all odd-numbered entries.

I know this subthread was, as Noob mentioned above, going off on a tangent
but the discussion seems to well illustrate that keywords would be better
identified by integers rather than strings. Some reasons:

* Zero could be used as the terminator.
* Non-zeroes could identify keywords.
* Keyword ids could be provided as header constants.
* And/or they could be provided as exported globals.
* Selection by a switch statement would be fast and easy

Maybe something like the following?

va_start(k, last_named);   /* last_named is a placeholder for the final fixed parameter */
while ((key_id = va_arg(k, unsigned)) != 0) {   /* Iterate over keywords */
    switch (key_id) {
    case A:
        /* Use va_arg to fetch value(s) for keyword A */
        break;
    case B:
        /* Use va_arg to fetch value(s) for keyword B */
        break;
    ... etc ...
    default:
        /* Error - invalid key id */
        break;
    }
}
va_end(k);

By contrast strings would require a series of strcmp() operations or
something quite complex like trie matching.

Of course, there is also the struct-based approach that has been mentioned
for later versions of C, and the option of passing the address of a struct.
Which choice turns out to be "best" may depend on a number of factors but
ISTM that there are some good ways in C to support keyword args. Cool!
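A rough sketch of the struct-based approach, assuming C99 designated initializers and compound literals. The names box_args, box_volume_impl and box_volume are invented for the example, and treating 0 as "not supplied" is just one possible convention.

```c
#include <assert.h>

/* Hypothetical argument bundle; a member left out of the initializer is 0. */
struct box_args { int width, height, depth; };

static int box_volume_impl(struct box_args a)
{
    assert(a.width >= 0 && a.height >= 0 && a.depth >= 0);
    /* Treat 0 as "not supplied" and fall back to a default of 1. */
    int w = a.width  ? a.width  : 1;
    int h = a.height ? a.height : 1;
    int d = a.depth  ? a.depth  : 1;
    return w * h * d;
}

/* Wrap the arguments in a compound literal so callers can write
   box_volume(.width = 2, .height = 4). At least one argument is needed. */
#define box_volume(...) box_volume_impl((struct box_args){ __VA_ARGS__ })
```

Here the compiler checks the member names, so a misspelt keyword is a compile-time error rather than a run-time one.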

James
 
James Kuyper

On 06/04/2014 10:40 AM, Noob wrote:
....
OK, this is the part I didn't understand well.

Consider this (bogus) variadic function:

#include <stdarg.h>
int foob(int u, ...)
{
    int res;
    va_list ap;
    va_start(ap, u);
    double *p = va_arg(ap, double *);
    res = p ? (*p > 0) : 42;
    va_end(ap);
    return res;
}

If I understand correctly what you wrote in your previous post,
then calling

foob(42, (void *)0);

has undefined behavior, because the caller passes a (void *) argument,
but the function expected a (double *) through va_arg?

Exactly. void* is not compatible with double*. It's convertible to and
from double*, but it doesn't necessarily have the same representation,
alignment, or even necessarily the same size. There have been real
implementations where pointers to types that are word-aligned are
smaller than pointers to types that are not.
 
BartC

I know this subthread was, as Noob mentioned above, going off on a tangent
but the discussion seems to well illustrate that keywords would be better
identified by integers rather than strings. Some reasons:
* Selection by a switch statement would be fast and easy
By contrast strings would require a series of strcmp() operations or
something quite complex like trie matching.

For short string matching I've used char literals such as 'abcd' or
'abcdefgh'. These can't be compared quite as quickly as compact enum codes,
but it's faster than string matching. Also you don't need to maintain a set
of enums for each function you want to use keyword parameters with; you just
use the parameter name (or a version of it to fit within the 4- or
8-character limit).

But, I think C doesn't like you using multi-char constants like this (trying
it now, gcc doesn't even like more than 4 characters), which limits the use
of this method.
 
Keith Thompson

BartC said:
For short string matching I've used char literals such as 'abcd' or
'abcdefgh'. These can't be compared quite as quickly as compact enum codes,
but it's faster than string matching. Also you don't need to maintain a set
of enums for each function you want to use keyword parameters with; you just
use the parameter name (or a version of it to fit within the 4- or
8-character limit).

But, I think C doesn't like you using multi-char constants like this (trying
it now, gcc doesn't even like more than 4 characters), which limits the use
of this method.

The standard doesn't even guarantee that 'ab' and 'ba' have distinct
values; it only says that their values are implementation-defined
(and of type int). On the other hand, they'll be distinct in any
sane implementation, and assuming that they'll be distinct doesn't
particularly bother me. (Assuming they have some particular value
would be unwise if your code is supposed to be at all portable.)

gcc warns about all multi-character character constants by default
(a reasonable warning IMHO, but not required by the standard).
It issues a different warning for multi-character constants of more
than 4 characters (where sizeof (int) == 4) -- but that warning is
also not required, or even suggested, by the standard.

For this program:

#include <stdio.h>
int main(void) {
    printf("'a' = 0x%x\n", (unsigned)'a');
    printf("'ab' = 0x%x\n", (unsigned)'ab');
    printf("'abc' = 0x%x\n", (unsigned)'abc');
    printf("'abcd' = 0x%x\n", (unsigned)'abcd');
    printf("'abcde' = 0x%x\n", (unsigned)'abcde');
}

gcc produces the following warnings:

c.c: In function 'main':
c.c:4:42: warning: multi-character character constant [-Wmultichar]
c.c:5:42: warning: multi-character character constant [-Wmultichar]
c.c:6:42: warning: multi-character character constant [-Wmultichar]
c.c:7:42: warning: character constant too long for its type [enabled by default]

and this output:

'a' = 0x61
'ab' = 0x6162
'abc' = 0x616263
'abcd' = 0x61626364
'abcde' = 0x62636465

The program is perfectly legal, and any conforming implementation must
accept it. Its output is implementation-defined.

I wouldn't use multi-character constants myself other than to find out
what a compiler does with them.

I've seen them most often in code written by inexperienced programmers
who haven't yet learned the difference between single and double quotes.
 
Keith Thompson

James Kuyper said:
On 06/04/2014 10:40 AM, Noob wrote:
...

Exactly. void* is not compatible with double*. It's convertible to and
from double*, but it doesn't necessarily have the same representation,
alignment, or even necessarily the same size. There have been real
implementations where pointers to types that are word-aligned are
smaller than pointers to types that are not.

And because foob is variadic (more precisely, because (void *)0 is
passed as an argument corresponding to the ", ..." in the declaration),
no conversion takes place. The caller passes an argument of type void*,
and foob *assumes* that it's of type double*.
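As a sketch of the well-defined alternative: the caller has to supply the exact type that va_arg will read. foob() is the function under discussion, made static here so the fragment stands alone; foob_demo is a hypothetical name.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Same shape as the foob() under discussion: it reads exactly one
   double * from the variable arguments. */
static int foob(int u, ...)
{
    int res;
    va_list ap;
    va_start(ap, u);
    double *p = va_arg(ap, double *);
    res = p ? (*p > 0) : 42;
    va_end(ap);
    return res;
}

/* Hypothetical demonstration of calls whose argument type matches. */
static void foob_demo(void)
{
    double d = 1.5;
    assert(foob(42, (double *)NULL) == 42);  /* typed null pointer */
    assert(foob(42, &d) == 1);               /* &d already is a double * */
}
```

The cast does the conversion that a prototype would otherwise have done; without it, nothing does.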

"If you lie to the compiler, it will get its revenge." -- Henry Spencer
 
Keith Thompson

BartC said:
For short string matching I've used char literals such as 'abcd' or
'abcdefgh'. These can't be compared quite as quickly as compact enum codes,
but it's faster than string matching. Also you don't need to maintain a set
of enums for each function you want to use keyword parameters with; you just
use the parameter name (or a version of it to fit within the 4- or
8-character limit).

Something I missed in my previous followup: Why do you think
comparisons of values like 'abcd' would be slower than comparisons
of "compact enum codes"? Non-wide character constants are always
of type int; a literal 'abcd' is likely to be exactly equivalent
(in value and type) to a literal 1633837924 (for example).

A comparison of a single-byte literal like 'a' *might* be optimized to
an 8-bit comparison -- but only if byte comparisons are actually faster
than word comparisons on the target hardware.
 
BartC

Something I missed in my previous followup: Why do you think
comparisons of values like 'abcd' would be slower than comparisons
of "compact enum codes"? Non-wide character constants are always
of type int; a literal 'abcd' is likely to be exactly equivalent
(in value and type) to a literal 1633837924 (for example).

By 'compact' I mean the values needed to represent the set of parameter keys
occupy a narrow range, such as 0 to 13 when there are 14 keys.

Given such a key, a switch can use a jump table to go straight to the
handler for that key.

With multi-char literals, it is necessary to test them one by one. (This can
be optimised a little by testing the most likely ones first, but that's about
it. It won't be worthwhile using binary searches, hashing and such, which
are still not going to be as fast as a switch.)

(There might also be a way to use some compile-time macro to effectively
hash any multi-char value, or even a regular string, into a compact ordinal
value. But that gets complicated and the keyword code gets more cluttered.)
 
Malcolm McLean

With multi-char literals, it is necessary to test them one by one. (This can
be optimised a little by testing the most likely ones first, but that's about
it. It won't be worthwhile using binary searches, hashing and such, which
are still not going to be as fast as a switch.)

(There might also be a way to use some compile-time macro to effectively
hash any multi-char value, or even a regular string, into a compact ordinal
value. But that gets complicated and the keyword code gets more cluttered.)

Realistically you've got to use an if .. else ladder. It's got the same
visual complexity as a switch. But if you're worried about the comparatively
minor overhead, probably you shouldn't be coding named arguments anyway.

Strings aren't efficient. But they have the advantage that you can
always provide something short and humanly meaningful, like "x" (you
can't realistically #define x). They don't pollute namespace, it's
obvious what type they are, and you can print them out for debugging
and get meaningful values. Also, it's hard to inadvertently pass
a wrong value (MYLIB_CURSOR_X as opposed to MYLIB_CURSOR_XPOS in
a related function).
If the practice catches on, then as most named arguments will be passed
as string literals, a compiler can optimise the strcmp() calls to
nothing.
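For comparison, a sketch of the string-keyed version with an if .. else ladder. struct cursor and cursor_set are hypothetical, every value is assumed to be an int, and the list is terminated by (char *)NULL.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

struct cursor { int x, y; };

/* Hypothetical string-keyed setter. The variable part has the form
   "name", int, ..., (char *)NULL. Returns 0 on success, -1 on an
   unrecognised name. */
static int cursor_set(struct cursor *c, ...)
{
    va_list ap;
    const char *key;
    int status = 0;

    assert(c != NULL);
    va_start(ap, c);
    /* Keys are read as char *, the type a string literal decays to. */
    while (status == 0 && (key = va_arg(ap, char *)) != NULL) {
        if (strcmp(key, "x") == 0)
            c->x = va_arg(ap, int);
        else if (strcmp(key, "y") == 0)
            c->y = va_arg(ap, int);
        else
            status = -1;  /* unknown name: only detectable at run time */
    }
    va_end(ap);
    return status;
}
```

A call looks like cursor_set(&c, "y", 5, "x", 2, (char *)NULL). A misspelt name compiles cleanly and is only caught by the final else branch.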
 
Keith Thompson

Malcolm McLean said:
Realistically you've got to use an if .. else ladder. It's got the same
visual complexity as a switch. But if you're worried about the comparatively
minor overhead, probably you shouldn't be coding named arguments anyway.

Strings aren't efficient. But they have the advantage that you can
always provide something short and humanly meaningful, like "x" (you
can't realistically #define x). They don't pollute namespace, it's
obvious what type they are, and you can print them out for debugging
and get meaningful values. Also, it's hard to inadvertently pass
a wrong value (MYLIB_CURSOR_X as opposed to MYLIB_CURSOR_XPOS in
a related function).
If the practice catches on, then as most named arguments will be passed
as string literals, a compiler can optimise the strcmp() calls to
nothing.

They don't pollute namespace because they have no namespace. String
literals make it (nearly) impossible for the compiler to catch typos.
gcc can detect many errors involving printf-style format strings, but
only because that particular syntax is hardwired into the compiler; that
won't happen for any literal values you invent for your own program.

And I don't see how using string literals rather than identifiers makes
it harder to inadvertently pass an incorrect value. In addition to the
existing risk of using the wrong name, they add the risk of passing an
incorrect spelling of the right name.
 
BartC

They don't pollute namespace because they have no namespace. String
literals make it (nearly) impossible for the compiler to catch typos.

That's true. Although you can still inadvertently type the name of a
parameter from a different function (or even something that isn't a
parameter keyword name).

But the point about the namespace is valid. If you have many functions (a
suite of functions in a library for example) that tend to use the same
parameter names (X,Y or Z, or Filename, Caption, Result etc), then you need
to think up different names for each function, or create some scheme where a
bunch of functions can share the same core keyword names.

With strings or multi-char constants, that is not an issue. (Of course, even
better would be to have actual keyword parameters; they're straightforward
to implement, and simpler to do so than designated initialisers. Why would
something so obviously useful be left out of a major new version?)
 
Noob

On 06/04/2014 10:40 AM, Noob wrote:
...

Exactly. void* is not compatible with double*. It's convertible to and
from double*, but it doesn't necessarily have the same representation,
alignment, or even necessarily the same size. There have been real
implementations where pointers to types that are word-aligned are
smaller than pointers to types that are not.

Roger that.

C is a sneaky treacherous little bastard. Every time I think I have
(most of) it nailed down, I get smacked upside the head.

Last time, it was the huge pile of FAIL that is the ctype.h API.
(Which requires one to cast the parameter.)

Regards.
 
Noob

NULL is very often of type int.

FWIW (not much), I would guess that a majority of implementations define
NULL as ((void *)0)
Type compatibility is a more stringent condition than
whatever you're thinking of; see the standard for its definition. Types
that can be converted to each other, either implicitly or explicitly,
are not necessarily compatible. Even types with the same representation
aren't necessarily compatible.

OK, thanks to Ben, James and you for clearing that confusion of mine
(between compatible and convertible).
For example, the POSIX execl() function is declared as:

int execl(const char *path, const char *arg0, ... /*, (char *)0 */);

Funny you should mention execl, because that is specifically the
function I had in mind, and I could have sworn that the docs did
not specify the type for NULL.

I guess it could be considered a defect in the man-pages entry.
http://man7.org/linux/man-pages/man3/exec.3.html

.... because POSIX did indeed specify it:
http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html

I am pretty sure it was not specified in the 2001 document...

Regards.
 
Keith Thompson

Noob said:
FWIW (not much), I would guess that a majority of implementations define
NULL as ((void *)0)

Perhaps; I haven't done a survey. gcc and clang use ((void*)0) (yes,
<stddef.h> is provided by gcc, not by a separate library). Solaris cc
uses 0. But either definition is legal, and it's unwise to write code
that assumes one or the other.

OK, thanks to Ben, James and you for clearing that confusion of mine
(between compatible and convertible).


Funny you should mention execl, because that is specifically the
function I had in mind, and I could have sworn that the docs did
not specify the type for NULL.

I guess it could be considered a defect in the man-pages entry.
http://man7.org/linux/man-pages/man3/exec.3.html

Quoting the page you cited:

The list of arguments must be terminated by a null pointer, and,
since these are variadic functions, this pointer must be cast
(char *) NULL.

... because POSIX did indeed specify it:
http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html

I am pretty sure it was not specified in the 2001 document...

It is in the 2004 version:
http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html
I haven't found the 2001 version online, but I'd be surprised if
it didn't mention the cast somewhere.

Incidentally, POSIX requires NULL to expand to "an integer
constant expression with the value 0 cast to type void *", a
stricter requirement than ISO C, which doesn't require the cast.
Of course 0 is still a null pointer constant.
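To make the execl()-style convention concrete without actually replacing the process, here is a sketch of a variadic function that walks a list terminated by (char *)NULL. count_strings and count_strings_demo are hypothetical names, and callers are assumed to pass char * values such as string literals.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical execl-style helper: counts the strings up to, but not
   including, the terminating null pointer. */
static size_t count_strings(const char *first, ...)
{
    va_list ap;
    size_t n = 0;
    const char *s;

    va_start(ap, first);
    /* Every variable argument, sentinel included, is read as char *. */
    for (s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}

/* The sentinel carries the cast because no prototype converts it. */
static void count_strings_demo(void)
{
    assert(count_strings("ls", "-l", (char *)NULL) == 2);
}
```

A bare NULL as the sentinel may happen to work where NULL is ((void *)0), but where it is plain 0 an int is passed and read as a char *.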
 
Keith Thompson

Noob said:
On 06/04/2014 10:40 AM, Noob wrote: [...]
If I understand correctly what you wrote in your previous post,
then calling

foob(42, (void *)0);

has undefined behavior, because the caller passes a (void *) argument,
but the function expected a (double *) through va_arg?

Exactly. void* is not compatible with double*. It's convertible to and
from double*, but it doesn't necessarily have the same representation,
alignment, or even necessarily the same size. There have been real
implementations where pointers to types that are word-aligned are
smaller than pointers to types that are not.

Roger that.

C is a sneaky treacherous little bastard. Every time I think I have
(most of) it nailed down, I get smacked upside the head.

What, I have to use pointer types consistently? What a sneaky
treacherous little bastard! (Sheesh.)

Last time, it was the huge pile of FAIL that is the ctype.h API.
(Which requires one to cast the parameter.)

I agree that's unfortunate. There are historical reasons for it.
Prior to the introduction of prototypes in C89, it wasn't possible to
specify a parameter of type char, so the is*() and to*() functions
had to take arguments of type int. Combine that with the requirement
to accept EOF as an argument.

I wouldn't call it a "huge pile of FAIL", though, just an annoyance.

In the old days, programmers probably didn't bother with the cast.
On EBCDIC systems, plain char is unsigned and omitting the cast
is harmless. On ASCII systems, the cast matters only for negative
values, and such values would not appear in ordinary text.
 
Ben Bacarisse

Noob said:
Last time, it was the huge pile of FAIL that is the ctype.h API.
(Which requires one to cast the parameter.)

The price of backwards compatibility, unfortunately. This can help if
you have a lot of code that does these tests:

#define IS(what, c) (is##what((unsigned char)(c)))
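A short usage sketch of that macro. count_digits is a hypothetical helper, and the input is assumed to be an ordinary string that may contain bytes with the high bit set.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define IS(what, c) (is##what((unsigned char)(c)))

/* Hypothetical helper: count decimal digits in a string. The macro
   supplies the cast, so a negative plain char never reaches isdigit(). */
static size_t count_digits(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (IS(digit, *s))
            n++;
    return n;
}
```

IS(digit, *s) expands to (isdigit((unsigned char)(*s))), and the same macro covers isalpha, isspace and the rest of the is*() family.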
 
