Using printf in C++


Joshua Maurice

Which, given the density of the x86 instruction set, happens quite often.

With -fno-exceptions, the text is reduced by 1-2%. For a large application,
this is not insignificant. For an operating system or hypervisor, this is very
significant.

I've always wondered why no one implemented exceptions fully
"correctly". From my naive understanding, you should be able to get
basically no additions to the code of the function and be able to
support exceptions. You would need some constraints to prevent certain
kinds of code migration to enable later correct unwinding when an
exception is thrown, but no bookkeeping should be needed. Just consult
the program counter, have a giant lookup table, and you're good. (A
hack may be needed for allocating an array of a type with a throwing
constructor IIRC. I haven't really thought this through fully.) It
seems quite straightforward in principle, though perhaps annoying to
implement. Surely no more annoying than anything else a compiler must
do, right? I suspect a lot of it is backwards binary compatibility
concerns.
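
The table-driven idea above can be sketched as a toy model: a sorted table of code address ranges, searched by program counter. This is only an illustration of the lookup step, not how any real compiler lays out its unwind data; `UnwindEntry`, `find_entry`, and the addresses used with them are hypothetical names invented for the sketch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical table entry: a half-open range of code addresses and an
// identifier for the cleanup that applies while executing inside that range.
struct UnwindEntry {
    std::uintptr_t begin;   // first address covered
    std::uintptr_t end;     // one past the last address covered
    int cleanup_id;         // which cleanup to run; -1 means none
};

// Look up the entry covering `pc`. The table must be sorted by `begin`
// and its ranges must not overlap. Returns nullptr if no range covers pc.
const UnwindEntry* find_entry(const std::vector<UnwindEntry>& table,
                              std::uintptr_t pc) {
    // Find the first entry whose begin is greater than pc, then step back.
    auto it = std::upper_bound(
        table.begin(), table.end(), pc,
        [](std::uintptr_t addr, const UnwindEntry& e) { return addr < e.begin; });
    if (it == table.begin()) return nullptr;
    --it;
    return pc < it->end ? &*it : nullptr;
}
```

Nothing in the functions being described has to execute for this lookup to work; the cost is the size of the table and the binary search at throw time.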

Also, I've always wondered if exceptions were an acceptable way to
handle errors in an OS. This seems to make it perhaps vulnerable to
certain kinds of denial of service attacks from one process. Not an
immediate security concern, but it seems you may be able to bog down
the OS in the exception code path which will be orders of magnitude
slower. I really do like exceptions in user apps, where if an error
occurs then most requirements about performance are out the window.
I'm not sure if that's true in an OS. I'm too ignorant to comment
further.
 

jacob navia

On 21/05/12 22:45, Joshua Maurice wrote:
I've always wondered why no one implemented exceptions fully
"correctly". From my naive understanding, you should be able to get
basically no additions to the code of the function and be able to
support exceptions. You would need some constraints to prevent certain
kinds of code migration to enable later correct unwinding when an
exception is thrown, but no bookkeeping should be needed. Just consult
the program counter, have a giant lookup table, and you're good. (A
hack may be needed for allocating an array of a type with a throwing
constructor IIRC. I haven't really thought this through fully.) It
seems quite straightforward in principle, though perhaps annoying to
implement. Surely no more annoying than anything else a compiler must
do, right? I suspect a lot of it is backwards binary compatibility
concerns.

Well, MSVC and gcc both already agree with you. In 64-bit Windows the
compiler generates a series of byte codes for a stack-unwinding machine
that describe exactly the stack movements at each relevant instruction
to allow the unwinding to be done. This doesn't incur ANY run-time
cost besides the incredibly big tables, but since those tables are
going to be paged out anyway their impact on run-time performance is
OK. The tables are poorly documented but it is possible to figure
them out after some months of work... I did that :-(

In 64 bit Unix (gcc branch) the tables are described roughly in the
DWARF debug information specifications. Obviously gcc will never follow
ANY specs 100% so there are some discrepancies between the docs and
the actual tables. Obviously it *is* possible to figure everything
out if you spend some months working on it.

But yes, in 64 bit Unix/Windows there is a big table that describes the
stack.

Mac OS X is a different beast. I tried to figure it out but it was
completely impossible. I gave up after 3 months of work.
 

Ian Collins

Which, given the density of the x86 instruction set, happens quite often.

With -fno-exceptions, the text is reduced by 1-2%. For a large application,
this is not insignificant. For an operating system or hypervisor, this is very
significant.

So -fno-exceptions saves 1-2%, but how much does the error checking code
that replaces exceptions add?

There really is no such thing as a free lunch.
 

Joshua Maurice

Most functions already have return values, and the
simple if statements required to check for failure (and propagate if necessary)
take little space.

The only place where it becomes cumbersome is vis-a-vis allocation failures from
'new'. This is usually bypassed in operating systems/hypervisors by overloading
new and ensuring that the new will always succeed (by using fixed-sized allocation
pools and checking availability prior to calling new). If the constructor can
fail (which is not generally the case), one uses an init() function instead and calls
it after "new". Init() functions are common anyway when there are multiple constructors.

I think you're the one missing the point. Ian's argument was that a
couple extra instructions here and there that aren't even executed
will have less of an effect on the icache than the error return code
that is actually executed every time without exceptions. You seem most
concerned with a non-branch instruction (right?) that isn't even
executed, but you're perfectly fine with littering the main-path code
with branches.

/end devil's argument position

Not that I'm saying you should use exceptions, but I think this
particular line of argument is not good.
 

Ian Collins

Most functions already have return values, and the
simple if statements required to check for failure (and propagate if necessary)
take little space.

So you have a design that doesn't use exceptions, so disabling them
makes sense. Most of the functions I write are either void or return a
result. To truly compare apples with apples, you have to include the
code that sets and returns the error codes as well as checking them.

I think Joshua Maurice just stated my case better than me!

The only place where it becomes cumbersome is vis-a-vis allocation failures from
'new'. This is usually bypassed in operating systems/hypervisors by overloading
new and ensuring that the new will always succeed (by using fixed-sized allocation
pools and checking availability prior to calling new). If the constructor can
fail (which is not generally the case), one uses an init() function instead and calls
it after "new". Init() functions are common anyway when there are multiple constructors.

Very similar to working in kernel land, where (at least on Solaris)
there's no run time support. Having to check everywhere drives me nuts!
I find all the checks interrupt the flow of the code.
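
For readers who have not seen the init() pattern being discussed, here is a minimal sketch of what it tends to look like without exceptions. The `Device` class and `make_device` helper are hypothetical, and `new (std::nothrow)` stands in for whatever overloaded allocator a kernel would actually use.

```cpp
#include <cstddef>
#include <new>

// Hypothetical object using two-phase initialization: the constructor
// cannot fail, and init() reports failure through its return value.
class Device {
public:
    Device() : buffer_(nullptr), size_(0) {}
    ~Device() { delete[] buffer_; }
    Device(const Device&) = delete;
    Device& operator=(const Device&) = delete;

    // Call once after construction. Returns false instead of throwing
    // if the buffer cannot be allocated.
    bool init(std::size_t size) {
        buffer_ = new (std::nothrow) char[size];
        if (buffer_ == nullptr) return false;
        size_ = size;
        return true;
    }
    std::size_t size() const { return size_; }

private:
    char* buffer_;
    std::size_t size_;
};

// Every step has to be checked by hand, which is the "checks interrupt
// the flow" cost mentioned above.
Device* make_device(std::size_t size) {
    Device* d = new (std::nothrow) Device;
    if (d == nullptr) return nullptr;
    if (!d->init(size)) {
        delete d;
        return nullptr;
    }
    return d;
}
```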
 

Ian Collins

I believe that the code to set and check the return values, where used and
needed, is less than the code to handle exceptions. I base
this on the need for generic exception handlers to differentiate between multiple
exceptions and handle each of them differently. The switch code, for example,
would be pure overhead.

The code may or may not be bigger, but the point we are trying to make
is that the exception-handling code will never be run, or loaded into the cache, unless an exception
is thrown. The error checking code will always be run!
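
To make the two shapes of code concrete, here is a small sketch of the same operation written both ways. The function names are hypothetical, and this shows only where the checks sit in the source; it is not a measurement of code size or cache behaviour, which depends on the compiler and target.

```cpp
#include <stdexcept>

// Style 1: error codes. The check after each call is on the main path
// and executes on every iteration, whether or not anything fails.
int parse_digit_rc(char c, int& out) {
    if (c < '0' || c > '9') return -1;   // failure
    out = c - '0';
    return 0;                            // success
}

int sum_digits_rc(const char* s, int& total) {
    total = 0;
    for (; *s != '\0'; ++s) {
        int d = 0;
        if (parse_digit_rc(*s, d) != 0) return -1;  // propagate by hand
        total += d;
    }
    return 0;
}

// Style 2: exceptions. The caller's loop contains no propagation branch;
// the failure path is reached only through the throw.
int parse_digit_ex(char c) {
    if (c < '0' || c > '9') throw std::invalid_argument("not a digit");
    return c - '0';
}

int sum_digits_ex(const char* s) {
    int total = 0;
    for (; *s != '\0'; ++s) total += parse_digit_ex(*s);
    return total;
}
```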
 

BGB

No, they did not. The only type to have special behavior is
java.lang.String and only with the '+' operator.

it includes more than just String+String, there is also String+int,
String+long, String+float, ... which basically append values of a number
of other types onto the String (as strings).


there is also some internal funkiness involving the StringBuffer class.

what the language lacks, however, is generic user-definable operator
overloading, but I was not claiming here that it had this.


but, anyways, they later added "System.out.printf()", which is at least
cosmetically similar to C's printf.

No, I think it does. Python and Java are both perfectly happy to let
me omit the type specifiers and rely on string conversion to do what I
want; likewise, they can also do the necessary coercion in limited
situations. This is simply impossible in C with mere printf and
friends. This is not a minor difference in the least. You only have
to bother with types in the formatting strings when you actually care
about the types, usually because of how you're formatting that
particular argument.

this is a side issue, and is more related to the implementation than to
the interface design.

it is like claiming that "{0}" in C# is entirely unlike a C format
string (say "%d\n" or "%s\n" or similar), because there is no explicit
type (because C# can figure the type out on its own).

C needs the types, yes, because it can't figure it out on its own, due
to how variable argument lists are implemented in the language.
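
A minimal sketch of why the format string has to carry the types: `va_arg` must be told the type of each argument, and inside the callee the format string is the only place that information can come from. `mini_format` is a hypothetical toy that understands only `%d`, `%s` and `%%`; it is not how any real printf is implemented.

```cpp
#include <cstdarg>
#include <string>

// Hypothetical mini-formatter. Passing an argument whose type does not
// match the letter in the format string is undefined behaviour, exactly
// as with the real printf.
std::string mini_format(const char* fmt, ...) {
    std::string out;
    va_list args;
    va_start(args, fmt);
    for (const char* p = fmt; *p != '\0'; ++p) {
        if (*p != '%') { out += *p; continue; }
        ++p;
        if (*p == '\0') break;                 // stray '%' at end of string
        if (*p == 'd') {
            out += std::to_string(va_arg(args, int));
        } else if (*p == 's') {
            out += va_arg(args, const char*);
        } else {
            out += *p;                         // "%%" and unknown letters are copied
        }
    }
    va_end(args);
    return out;
}
```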


Yes, but reducing the comparisons to such an (ultimately) superficial
thing is rather absurd and shows a serious lack of understanding of
all the interfaces being discussed.

I disagree that reducing it to superficial details is absurd, because it
was the superficial similarities which were being argued about here.

the fact that many of these languages handle the argument lists in ways
considerably different than C, or use different type semantics, isn't
really all that important, since the point was that they copied elements
of the interface *style*, rather than the particular *mechanism*.


I am not claiming here that raw untyped argument lists are a good idea,
by any means, nor that other subsequent languages also use raw untyped
argument lists (nor that it is possible to preserve type-information in
variable argument lists in C, even if this would be rather useful
sometimes...).

all this is beside the point.
 

Adam Skutt

it includes more than just String+String, there is also String+int,
String+long, String+float, ... which basically append values of a number
of other types onto the String (as strings).

That doesn't change what I said in the least. I suggest you go read
the language specification, as the behavior is much simpler and quite
well defined.

there is also some internal funkiness involving the StringBuffer class.

Nope. Again, I suggest you go read the language specification.

what the language lacks, however, is generic user-definable operator
overloading, but I was not claiming here that it had this.

Then your expectation that they provide iostream-like insertion and
extraction operators was unreasonable on its face, and you knew that!

this is a side issue, and is more related to the implementation than to
the interface design.

Utter nonsense. It completely changes how I approach using the
function and writing my code. How can it not? Ergo, it's part of the
interface. This isn't a subjective point to debate.

How I print a custom object will be different in Java / Python / C#
vs. C, even though they all use "formatting" strings.

it is like claiming that "{0}" in C# is entirely unlike a C format
string (say "%d\n" or "%s\n" or similar), because there is no explicit
type (because C# can figure the type out on its own).

There's more to printf than the stupid format string. I'm not sure
why you persist in believing that the format string is the only
important thing.

C needs the types, yes, because it can't figure it out on its own, due
to how variable argument lists are implemented in the language.

And that's a serious problem, one of the problems that iostreams is
explicitly designed to solve. A problem that most other languages solve
too, albeit differently from C++. Unsurprisingly, the solutions to
this problem impact how we use the formatting and I/O functions
provided by the language.

I disagree that reducing it to superficial details is absurd, because it
was the superficial similarities which were being argued about here.

No, it was not merely the superficial similarities being argued about
here. If you believe that, then perhaps your newsreader is broken.

the fact that many of these languages handle the argument lists in ways
considerably different than C, or use different type semantics, isn't
really all that important, since the point was that they copied elements
of the interface *style*, rather than the particular *mechanism*.

But they didn't. The lack of type safety is a fundamental part of C
printf, even if it's only due to language limitations. Likewise, the
enforced type safety in C# / Java / Python is equally important, even
if it's only due to language limitations.

I am not claiming here that raw untyped argument lists are a good idea,
by any means, nor that other subsequent languages also use raw untyped
argument lists (nor that it is possible to preserve type-information in
variable argument lists in C, even if this would be rather useful
sometimes...).

all this is beside the point.

If that's true, then not only do you not have a point, but it's pretty
difficult to believe you ever had one in the first place...

Adam
 

BGB

That doesn't change what I said in the least. I suggest you go read
the language specification, as the behavior is much simpler and quite
well defined.


Nope. Again, I suggest you go read the language specification.

this is how it is implemented, at least going by what I saw before in
javac output, however, this isn't directly visible from the users' POV.


like, "foo "+"bar", produces output which looked more like it had
instead compiled something like:
new StringBuffer("foo ").append("bar").toString();

Then your expectation that they provide iostream-like insertion and
extraction operators was unreasonable on its face, and you knew that!

I never claimed that they did.

you are saying I said stuff that I didn't say.

Utter nonsense. It completely changes how I approach using the
function and writing my code. How can it not? Ergo, it's part of the
interface. This isn't a subjective point to debate.

How I print a custom object will be different in Java / Python / C#
vs. C, even though they all use "formatting" strings.

I disagree on the significance of this point.

yes, there are differences, but they don't much change the interface,
only what it does (and the means and ease by which one prints
user-defined types).


say, in C one writes:
printf("%s\n", SomeObjectToString(obj));

and, in some other language, one can write, say:
printf("%s\n", obj.toString());
or:
printf("%O\n", obj);
(which automatically coerces any type with a toString() method).


yes, it is different, but does this really matter all that much for sake
of the visible interface?...

There's more to printf than the stupid format string. I'm not sure
why you persist in believing that the format string is the only
important thing.

it was the point which was being commented about, all of the other stuff
is there, but is beside the point.

And that's a serious problem, one of the problems that iostreams is
explicitly designed to solve. A problem that most other languages solve
too, albeit differently from C++. Unsurprisingly, the solutions to
this problem impact how we use the formatting and I/O functions
provided by the language.

fair enough.

but, this wasn't what I was arguing about.

No, it was not merely the superficial similarities being argued about
here. If you believe that, then perhaps your newsreader is broken.

this was in a different part of the thread, and was not the part that I
was arguing about.

But they didn't. The lack of type safety is a fundamental part of C
printf, even if it's only due to language limitations. Likewise, the
enforced type safety in C# / Java / Python is equally important, even
if it's only due to language limitations.

I disagree here.

the lack of type-safety is a side issue due to limitations in C, but is
not inherently part of the "interface style" per se, which I define
here as "format-strings followed by arguments" (in contrast to
"insertion and extraction operators", namely, "<<" and ">>", which
characterize the style of iostreams).

rather, it is more of a "limitation" or a "functional hazard".

If that's true, then not only do you not have a point, but it's pretty
difficult to believe you ever had one in the first place...

no, it was a different point, and not directly related to the one you
are arguing about.


I was not going on about type-safety or printing user-defined types, but
rather that format strings have proven far more popular in general in
subsequent languages than the use of << and >> (or some other direct
analogue) as the primary printing interface.

yes, language and compiler designers can design/implement stuff like
this if they wanted, but the thing is, most didn't, and this is itself
probably telling.
 

Adam Skutt

On 5/21/2012 7:53 PM, Adam Skutt wrote:

I never claimed that they did.

But you did: you claimed it was significant that they chose not to go
that route, despite knowing such a route was impossible due to other
language decisions. In the case of Java, these decisions were made
before the formatting functions even existed.

I disagree on the significance of this point.

Good for you, but it's significant to the language and it has the
final say. As such, I don't really care what significance you give
it. The language clearly impacts the design of the functions, the
formatting string language, and their capabilities. All of which
impact how users perceive and use the functions.

yes, there are differences, but they don't much change the interface,
only what it does (and the means and ease by which one prints
user-defined types).

say, in C one writes:

printf("%s\n", SomeObjectToString(obj));

and, in some other language, one can write, say:
printf("%s\n", obj.toString());
or:
printf("%O\n", obj);
(which automatically coerces any type with a toString() method).

Except no one does that in C, because doing that in C results in
either a memory-leak or code that isn't thread-safe. This isn't even
the least bit legitimate.
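
The leak-or-not-thread-safe dilemma can be sketched as follows. `Point` and the three conversion functions are hypothetical; the first two show the choices a C-style `SomeObjectToString` has, and the third shows one way C++ sidesteps both by returning an owning `std::string`.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

struct Point { int x; int y; };   // hypothetical type for illustration

// Option A: static buffer. No leak, but every call overwrites the same
// storage, so it is not thread-safe and cannot be used twice in one
// printf call.
const char* point_to_static(const Point& p) {
    static char buf[64];
    std::snprintf(buf, sizeof buf, "(%d, %d)", p.x, p.y);
    return buf;
}

// Option B: heap allocation. Safe to call repeatedly, but the caller must
// free() the result; passed directly as a printf argument, it leaks.
char* point_to_heap(const Point& p) {
    char* buf = static_cast<char*>(std::malloc(64));
    if (buf != nullptr) std::snprintf(buf, 64, "(%d, %d)", p.x, p.y);
    return buf;
}

// C++ alternative: return an owning std::string. A temporary returned
// here lives until the end of the full expression that uses it.
std::string point_to_string(const Point& p) {
    return "(" + std::to_string(p.x) + ", " + std::to_string(p.y) + ")";
}
```

With the third form, `std::printf("%s\n", point_to_string(p).c_str());` is well defined, because the temporary string outlives the call.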

yes, it is different, but does this really matter all that much for sake
of the visible interface?...

Yes, it does. It means that I can use the formatting function
generically, with any object, safely. I cannot do that in C. It's not
difficult in C, it is _impossible_ in C.

It means I can pass any object to the formatting function, and I don't
have to remember the stupid formatting code (if I don't need special
behavior), nor do I have to remember to treat certain types
specially.

Why you think this is insignificant is beyond me. I also see no point
in discussing this with you further, because you have some arbitrary
and bizarre idea of the interface for a printf function that you're
unwilling or unable to properly define. Should we include FORTRAN
format strings? What about message translation formatting strings
that include special handling for plural nouns and the like? What
about the limited built-in format in Scheme?

I could go on-and-on, but hopefully my point is clear: there's little
significance in the fact that lots of formatting systems use a string
and then some arguments to interpolate into the string. It's even a
viable, reasonable solution for certain problems even when you have
something like C++ iostreams (like date/time conversion to/from
strings).

Adam
 

BGB

In languages without generic templates (such as JavaScript) that works
fine, but with templates you have to provide a mechanism that works with
both built in and user defined types.

This is at the heart of the C++ design philosophy: one should be able to
use a user-defined type as if it were a built-in type.

quick add:
little stops a language from treating every object with a "toString()"
method as if it were a built-in type at least WRT printing as a string
(along with built-in types such as "int" or "float" implicitly having
"toString()" methods, at least "sort of"...).

similarly, in such a language, if "toString()" is valid on every type,
then there is no reason for it not to be usable in generics/templates as
well.


granted, yes, C++ doesn't do it this way.
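
For comparison, the closest C++ analogue of a universal "toString()" is probably a template that leans on `operator<<`. The sketch below is one way to write it; `to_text` and `Version` are hypothetical names, and the approach only works for types that provide an `operator<<` for `std::ostream`.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Works for any type T that has an operator<< for std::ostream,
// whether built-in or user-defined.
template <typename T>
std::string to_text(const T& value) {
    std::ostringstream os;
    os << value;
    return os.str();
}

struct Version { int major_no; int minor_no; };   // hypothetical user type

// Giving the user type an operator<< is what makes it usable with to_text.
std::ostream& operator<<(std::ostream& os, const Version& v) {
    return os << v.major_no << '.' << v.minor_no;
}
```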
 

Ian Collins

quick add:
little stops a language from treating every object with a "toString()"
method as if it were a built-in type at least WRT printing as a string
(along with built-in types such as "int" or "float" implicitly having
"toString()" methods, at least "sort of"...).

similarly, in such a language, if "toString()" is valid on every type,
then there is no reason for it not to be usable in generics/templates as
well.

Well it does half the job. But it fails when the thing you are
streaming to isn't an ostream, such as some form of binary stream.
 

BGB

Well it does half the job. But it fails when the thing you are streaming
to isn't an ostream, such as some form of binary stream.

possibly...


I guess this would be probably about at which point someone throws in a
binary serialization interface, and maybe a language feature like
aspects or partial classes (to allow this to be done more generally).

although, yes, overloaded operators probably do make a cleaner solution
in this case.
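
A small sketch of what the overloaded-operator approach looks like for a non-text stream. `ByteSink` and `Header` are hypothetical types, and little-endian byte order is an arbitrary choice made for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical binary sink: collects bytes instead of formatted text.
struct ByteSink {
    std::vector<std::uint8_t> bytes;
};

// Write a 32-bit value in little-endian byte order.
ByteSink& operator<<(ByteSink& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8)
        out.bytes.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFFu));
    return out;
}

struct Header { std::uint32_t id; std::uint32_t length; };  // hypothetical

// A user-defined type is serialized by composing the existing overloads,
// the same way it would be for a text ostream.
ByteSink& operator<<(ByteSink& out, const Header& h) {
    return out << h.id << h.length;
}
```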
 

BGB

But you did: you claimed it was significant that they chose not to go
that route, despite knowing such a route was impossible due to other
language decisions. In the case of Java, these decisions were made
before the formatting functions even existed.


I only said that it was significant that languages which came after C++
chose not to use an iostream-like interface design (IOW: using "<<" and
">>" operators for stream IO).

many languages could have done something at least cosmetically similar
if they wanted (whether or not the language had things like user-defined
operator overloading, they could hard-code it, or give it special
syntax, or similar). (IOW: as syntax sugar).

not that it would necessarily be functionally equivalent (or even
necessarily user-extensible), but this is beside the point.


the point here is much more about what it looks like than what it does
or how it works.


Good for you, but it's significant to the language and it has the
final say. As such, I don't really care what significance you give
it. The language clearly impacts the design of the functions, the
formatting string language, and their capabilities. All of which
impact how users perceive and use the functions.

potentially, but they use a string and an argument list just the same.

Except no one does that in C, because doing that in C results in
either a memory-leak or code that isn't thread-safe. This isn't even
the least bit legitimate.

it depends...


one option here is basically to have a "rotating allocator" which
just operates as a ring buffer, with any newer allocations
overwriting whatever was there from before.

so long as no one tries to hold onto a pointer into this buffer for any
extended length of time, it all works fairly well (it assumes that code
will be done with the memory before the allocator wraps back around to
this point). (it is not strictly safe, but it tends to work).
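
A minimal sketch of such a rotating allocator, assuming a small fixed number of fixed-size slots. The names and sizes are hypothetical, and as written it is neither thread-safe nor safe if more than `kSlots` results are alive at once, which is exactly the caveat described above.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical rotating pool of temporary string buffers.
constexpr std::size_t kSlots = 4;
constexpr std::size_t kSlotSize = 32;

// Hand out the slots in turn; after kSlots calls the oldest slot is reused.
char* next_temp_buffer() {
    static char slots[kSlots][kSlotSize];
    static std::size_t next = 0;
    char* buf = slots[next];
    next = (next + 1) % kSlots;
    return buf;
}

// Convert an int to a string that stays valid until the pool wraps around.
const char* int_to_temp(int value) {
    char* buf = next_temp_buffer();
    std::snprintf(buf, kSlotSize, "%d", value);
    return buf;
}
```

With this in place, `std::printf("%s %s\n", int_to_temp(1), int_to_temp(2));` works because the two results occupy different slots.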

another option is to use a garbage collector (has its own pros and cons).


I have seen stuff like this done in practice in several programs.


Yes, it does. It means that I can use the formatting function
generically, with any object, safely. I cannot do that in C. It's not
difficult in C, it is _impossible_ in C.

well, except for something:
the generic cases where this would really matter are themselves
effectively (1) impossible in C, so it is a moot point.


1: there are possible edge cases, such as a printf within a macro with
an unknown type argument, but doing anything like this is not common.

in cases where one implements a mechanism by which to effectively deal
with types which may readily change (such as a dynamic type-system),
they have likely also implemented a means by which to convert them to
strings.


for other non-C languages, they may address this matter in other ways
(since these languages need not inherit C's type-system limitations),
making arguing about it a moot point in these cases.

an example is "{0}" and similar in C#, which themselves figure out the
type. otherwise, a language could stick with a C-like "%..." notation,
but then treat the type-letters more as a formatting hint ("present it
as if it were type"), rather than having them actually specify the type.

likewise, the language could allow a general mechanism (such as a
"toString()" operation of some sort) by which to allow any type to be
used in formatted output.
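
In C++ terms, a format string whose placeholders carry no type can be sketched with variadic templates (C++11 or later). `brace_format` and `format_into` are hypothetical helpers using "{}" as the placeholder; this is an illustration of the idea only, with no width or precision support.

```cpp
#include <sstream>
#include <string>

// Base case: no arguments left, so copy the rest of the format string.
inline void format_into(std::ostringstream& os, const char* fmt) {
    os << fmt;
}

// Recursive case: copy text up to the next "{}" and substitute one argument.
// The compiler selects operator<< from each argument's static type, so the
// format string needs no type letters.
template <typename T, typename... Rest>
void format_into(std::ostringstream& os, const char* fmt,
                 const T& first, const Rest&... rest) {
    for (; *fmt != '\0'; ++fmt) {
        if (fmt[0] == '{' && fmt[1] == '}') {
            os << first;
            format_into(os, fmt + 2, rest...);
            return;
        }
        os << *fmt;
    }
    // If there are more arguments than placeholders, the extras are ignored.
}

template <typename... Args>
std::string brace_format(const char* fmt, const Args&... args) {
    std::ostringstream os;
    format_into(os, fmt, args...);
    return os.str();
}
```

Any type with an `operator<<` for `std::ostream` can be passed, which plays the role of the general "toString()" mechanism.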


It means I can pass any object to the formatting function, and I don't
have to remember the stupid formatting code (if I don't need special
behavior), nor do I have to remember to treat certain types
specially.

Why you think this is insignificant is beyond me. I also see no point
in discussing this with you further, because you have some arbitrary
and bizarre idea of the interface for a printf function that you're
unwilling or unable to properly define. Should we include FORTRAN
format strings? What about message translation formatting strings
that include special handling for plural nouns and the like? What
about the limited built-in format in Scheme?

I could go on-and-on, but hopefully my point is clear: there's little
significance in the fact that lots of formatting systems use a string
and then some arguments to interpolate into the string. It's even a
viable, reasonable solution for certain problems even when you have
something like C++ iostreams (like date/time conversion to/from
strings).

well, except FORTRAN came before C++, so would not be included in a list
of languages which came after C++.

likewise for Scheme.

although not exactly the same, both could probably otherwise be included
in this category.



but, anyways, the core definition here is that printing is done in some
form resembling:
someprintf([output,] formatstring, arguments...);

with someprintf either being a function name, or some sort of method call.

output may be seen in function-call varieties, but is normally N/A if
this is part of a method for some sort of output-stream object or
similar (which will itself define the output).

with formatstring consisting of:
characters which will go directly to output;
some means by which to specify arguments and their respective formatting
(width, formatting style, ...), which may also specify types.

and arguments being:
values which will be used in producing the output using the format
string (often, but not always, being used in-sequence).


now, what goes on inside the function, or how one can use the function,
or alternative uses with other stream types, ... are generally outside
the current scope.


granted, this is not the only style of printing interface in common use
("echo" and "BASIC-style print" variations are also floating around, but
I classify them as different styles given that values are given inline).
 

jacob navia

On 22/05/12 05:27, Adam Skutt wrote:
It means I can pass any object to the formatting function, and I don't
have to remember the stupid formatting code (if I don't need special
behavior), nor do I have to remember to treat certain types
specially.

Sure, then the boss says:

"I want rounding to 2 decimals, names flush left in the first
column that must be 16 chars long"

And then you have to remember all the manipulator code and get
it right in C++. What a nightmare. After some tinkering you say:
"Who cares?" and write

printf("%-16s|%10.2g|\n",name,value);

and you are done.

Printf is a run-time interpreter that uses a simple command language
to describe the formatting of values. There is no alternative in
sight (yet).
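
For comparison, here is a sketch of the same row written with iostream manipulators. `format_row` is a hypothetical helper. One detail worth noting: `%.2g` asks for two significant digits, so a fixed two-decimal rendering corresponds to `%10.2f`, and that is what this version reproduces.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Name flush left in a 16-character column, value right-aligned in a
// 10-character column with exactly two digits after the decimal point.
std::string format_row(const std::string& name, double value) {
    std::ostringstream os;
    os << std::left << std::setw(16) << name << '|'
       << std::right << std::setw(10) << std::fixed << std::setprecision(2)
       << value << "|\n";
    return os.str();
}
```

Note that `std::setw` applies only to the next item written, while `std::left`, `std::fixed` and `std::setprecision` stay in effect on the stream, which is part of what makes the manipulator version harder to get right.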
 

Adam Skutt

On 22/05/12 05:27, Adam Skutt wrote:


It means I can pass any object to the formatting function, and I don't
have to remember the stupid formatting code (if I don't need special
behavior), nor do I have to remember to treat certain types
specially.

Sure, then the boss says:

"I want rounding to 2 decimals, names flush left in the first
column that must be 16 chars long"

And then you have to remember all the manipulator code and get
it right in C++. What a nightmare. After some tinkering you say:
"Who cares?" and write

        printf("%-16s|%10.2g|\n",name,value);

and you are done.


I can't remember it in C, either, so I'm going to the documentation
either way. I can't imagine finding a job enjoyable where I'm writing
formatting strings that often.

Adam
 

jacob navia

On 22/05/12 21:33, Adam Skutt wrote:
On 22/05/12 05:27, Adam Skutt wrote:


It means I can pass any object to the formatting function, and I don't
have to remember the stupid formatting code (if I don't need special
behavior), nor do I have to remember to treat certain types
specially.

Sure, then the boss says:

"I want rounding to 2 decimals, names flush left in the first
column that must be 16 chars long"

printf("%-16s|%10.2g|\n",name,value);


And how would it look in C++ please?

Just compare then.
 

Ian Collins

On 22/05/12 21:33, Adam Skutt wrote:
On 22/05/12 05:27, Adam Skutt wrote:



It means I can pass any object to the formatting function, and I don't
have to remember the stupid formatting code (if I don't need special
behavior), nor do I have to remember to treat certain types
specially.

Sure, then the boss says:

"I want rounding to 2 decimals, names flush left in the first
column that must be 16 chars long"

printf("%-16s|%10.2g|\n",name,value);


And how would it look in C++ please?

Just compare then.


sprintf to a buffer, output the buffer.
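
A minimal sketch of that approach, using `snprintf` (the bounded variant) rather than plain `sprintf` so that the buffer cannot overflow. The helper names and the 64-byte buffer are assumptions for the example, and `%10.2f` is used here to get two decimals, where the line quoted above has `%10.2g`.

```cpp
#include <cstdio>
#include <iostream>
#include <string>

// Format one row into a local buffer and return it as a std::string.
std::string format_row_c(const char* name, double value) {
    char buf[64];
    // snprintf writes at most sizeof buf bytes, including the final '\0',
    // and returns the length the full output would have had.
    int n = std::snprintf(buf, sizeof buf, "%-16s|%10.2f|\n", name, value);
    if (n < 0) return std::string();                       // encoding error
    if (static_cast<std::size_t>(n) >= sizeof buf)
        n = static_cast<int>(sizeof buf) - 1;              // output was truncated
    return std::string(buf, static_cast<std::size_t>(n));
}

// Output the buffer through whatever stream the rest of the program uses.
void print_row(std::ostream& os, const char* name, double value) {
    os << format_row_c(name, value);
}
```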
 

jacob navia

On 22/05/12 23:31, Ian Collins wrote:
On 22/05/12 21:33, Adam Skutt wrote:
On 22/05/12 05:27, Adam Skutt wrote:



It means I can pass any object to the formatting function, and I don't
have to remember the stupid formatting code (if I don't need special
behavior), nor do I have to remember to treat certain types
specially.

Sure, then the boss says:

"I want rounding to 2 decimals, names flush left in the first
column that must be 16 chars long"

printf("%-16s|%10.2g|\n",name,value);


And how would it look in C++ please?

Just compare then.


sprintf to a buffer, output the buffer.


Yes!

KISSes!

Simplicity is golden.
 
