Juha said:
Could someone please post a practical example of this mythical "code
bloat" caused by templates, and a better alternative?
(And "practical" above means not artificially contrived to be as
pathological as possible by using completely unconventional code that
no sane programmer would ever write.)
Not to blame templates in general for code bloat (bloated compilation time
is another story; for whatever reason no one here complains about that one), but
the above ostream code (and any iostream code, in my experience) certainly
bloats the code at the call site compared to printf code, as it emits many
extra function calls and object constructions on top of the 2-3 function calls
that printf takes.
Just for my own interest, I composed two roughly equivalent printing functions
from the above example into a little program (see below), compiled it with g++
4.6.3 on 64-bit Linux, optimizing for size:
g++ -Os -o pvocs pvocs.cpp
and measured the functions from the 'objdump -d' dump:
the size of printfOutput() is 0x400f22 - 0x400e74 + 1 = 175 bytes
the size of ostreamOutput() is 0x401146 - 0x400f23 + 5 = 552 bytes
The final addition of 1 or 5 accounts for the length of the last instruction
whose address I have in the dump. It is instructive that printfOutput() ends
with a simple 1-byte retq instruction:
400f22: c3 retq
whereas ostreamOutput() ends with a 5-byte call to the (obviously long-jumping)
C++ stack-unwinding stub
401146: e8 c5 fb ff ff callq 400d10 <_Unwind_Resume@plt>
where the _Unwind_Resume stub strains the CPU pipeline a couple of extra times
and loads a few extra cache lines:
0000000000400d10 <_Unwind_Resume@plt>:
400d10: ff 25 82 13 20 00 jmpq *0x201382(%rip) # 602098 <_GLOBAL_OFFSET_TABLE_+0xb0>
400d16: 68 13 00 00 00 pushq $0x13
400d1b: e9 b0 fe ff ff jmpq 400bd0 <_init+0x20>
(so much for C++ function calls' not having extra run-time cost as compared to C
ones -- probably true only if you use nothrow functions throughout?)
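To make the nothrow remark concrete, here is a minimal sketch, assuming a compiler with C++11 noexcept support. The helper names (emitMayThrow, emitNoThrow and the two callers) are made up for illustration and are not part of pvocs.cpp. It only shows the shape of code where a cleanup path may be needed; whether the call to _Unwind_Resume actually disappears depends on the compiler, inlining and flags, so check with 'objdump -d' rather than taking it on faith.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helpers, not taken from pvocs.cpp.
void emitMayThrow(const char *s) { std::fputs(s, stderr); }
void emitNoThrow(const char *s) noexcept { std::fputs(s, stderr); }

// 'tmp' has a non-trivial destructor. Because the call that follows is
// allowed to throw, the compiler may need to keep a cleanup path that
// destroys 'tmp' and then resumes unwinding.
void callerWithCleanup(const char *text) {
    std::string tmp(text);
    emitMayThrow(tmp.c_str());
}

// Same shape, but the callee promises not to throw, so no cleanup path
// is required on behalf of this particular call.
void callerWithoutCleanup(const char *text) {
    std::string tmp(text);
    emitNoThrow(tmp.c_str());
}

// The noexcept operator reports what the compiler is allowed to assume.
static_assert(!noexcept(emitMayThrow("")), "plain functions may throw");
static_assert(noexcept(emitNoThrow("")), "noexcept functions may not");
```

The stream inserters used in ostreamOutput() are not declared noexcept, so this trick is not available for the iostream version itself.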
As for the original purpose of the exercise, ostreamOutput() takes more than
three times the code size of printfOutput(). Judge for yourself.
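The size arithmetic above can be double-checked by the compiler itself. This is only a sketch: sizeFromDump and the two constant names are invented for illustration, the addresses are the ones read off my 'objdump -d' listing, and constexpr assumes a C++11-capable compiler.

```cpp
#include <cstddef>

// Hypothetical helper: size of a function given the address of its first
// instruction, the address of its last instruction, and the length in
// bytes of that last instruction (all taken from the disassembly).
constexpr std::size_t sizeFromDump(unsigned long first, unsigned long last,
                                   std::size_t lastLen) {
    return static_cast<std::size_t>(last - first) + lastLen;
}

// printfOutput(): 0x400e74 .. 0x400f22, ending in a 1-byte retq.
constexpr std::size_t kPrintfSize = sizeFromDump(0x400e74, 0x400f22, 1);
// ostreamOutput(): 0x400f23 .. 0x401146, ending in a 5-byte callq.
constexpr std::size_t kOstreamSize = sizeFromDump(0x400f23, 0x401146, 5);

static_assert(kPrintfSize == 175, "printfOutput() size");
static_assert(kOstreamSize == 552, "ostreamOutput() size");
static_assert(kOstreamSize > 3 * kPrintfSize, "more than three times larger");
```

If any of the numbers were off, the static_asserts would fail at compile time.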
STANDARD DISCLAIMER: your mileage may vary (not too much, though).
-Pavel
// -------------- pvocs.cpp -- cut here -------------------
#include <iomanip>
#include <iostream>
#include <sstream>
#include <stdio.h>
using namespace std;
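// printf-style version: format the core column into a small buffer,
// then print the whole row with a single printf call.
void
printfOutput(size_t b, void *bAddress, int bEnabled, int bCore) {
    char core[16];
    if (bCore == -1) {
        snprintf(core, sizeof(core), "%s", "All");
    } else {
        snprintf(core, sizeof(core), "%2.2d", bCore);
    }
    // %llx expects an unsigned long long argument
    printf(" %2.2zu %016llx %3.3s %s\n", b, (unsigned long long)bAddress, core,
           bEnabled ? "Yes" : "No");
}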
// iostream version of roughly the same row
void
ostreamOutput(size_t b, void *bAddress, int bEnabled, int bCore) {
    ostringstream core;
    bCore == -1 ? core << "All" : core << setw(3) << bCore;
    clog << setfill(' ') << setw(3) << ' ' // indent
         << setfill('0') << setw(2) << b
         << setfill(' ') << setw(8) << ' '
         << setfill('0') << setw(16)
         << hex << reinterpret_cast<unsigned long>(bAddress)
         << setfill(' ') << setw(2) << ' '
         << core.str().substr(0, 3) << ' '
         << setfill(' ') << setw(3) << ' '
         << (static_cast<bool>(bEnabled) ? "Yes" : "No")
         << '\n';
}
int
main(int, char*[]) {
    size_t b = 2;
    void *bAddress = &b;
    int bEnabled = 0;
    int bCore = -1;
    printfOutput(b, bAddress, bEnabled, bCore);
    ostreamOutput(b, bAddress, bEnabled, bCore);
    return 0;
}
// -------------- pvocs.cpp -- cut here -------------------