Peter van Merkerk
E. Robert Tisdale said:
The typical application here at the Lab is about 2 MBytes.
The typical processing node has 2 GBytes memory --
the application code absorbs about 0.1% of that.
That is all very nice... but how big is your processor cache? An L1 cache
is usually only a few kilobytes, L2 caches are typically a few hundred
kilobytes, and if you are lucky enough to have an L3 cache you might get
up to a few megabytes. Unfortunately, the larger the cache, the slower it
gets (that is why there are multiple cache levels instead of just one
big cache). On today's processors running at high clock frequencies, cache
misses are very expensive performance-wise. In other words, code bloat is
not just about memory usage; it can also hurt performance, because code
that no longer fits in the cache has to be fetched from slower memory.
When writing high-performance code, both memory usage and memory access
patterns are a concern.
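To illustrate what "memory access patterns" means in practice, here is a
minimal sketch. The function names and the row-major layout are my own
assumptions for illustration, not anything from the Lab's code. Both
functions compute the same sum; they differ only in the order in which
they touch memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a rows x cols matrix stored row-major in one contiguous vector.
// The inner loop walks adjacent elements, so every cache line that is
// fetched gets fully used before the next one is needed.
long long sum_row_order(const std::vector<int>& m,
                        std::size_t rows, std::size_t cols)
{
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

// Same arithmetic, but the inner loop jumps cols elements at a time.
// For a matrix much larger than the cache, each access may land on a
// different cache line, which tends to cause far more misses.
long long sum_column_order(const std::vector<int>& m,
                           std::size_t rows, std::size_t cols)
{
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

How large the difference is depends on the matrix size, the cache sizes
and the hardware prefetcher, so measure on your own machine before
drawing conclusions.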
Note that inlining does not necessarily result in more code. For very
small functions the inlined code may be smaller than the code needed to
call a non-inlined function. So inlining is neither always harmful nor
always a win; it should be used judiciously.
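As a hedged example of the "very small function" case: the class below is
hypothetical (Counter and count_to are names I made up). Member functions
defined inside the class body are implicitly inline, but whether the
compiler actually inlines them remains its own decision.

```cpp
#include <cassert>

class Counter {
public:
    explicit Counter(int start) : value_(start) { assert(start >= 0); }

    // Defined in the class body, hence implicitly inline. The body is a
    // single load of value_; an out-of-line call would typically need
    // argument setup, a call instruction and a return on top of that.
    int value() const { return value_; }

    // Likewise tiny: one increment of a data member.
    void increment() { ++value_; }

private:
    int value_;
};

// Calls both accessors in a loop; if they are inlined, the loop body
// contains no function calls at all.
int count_to(int n)
{
    Counter c(0);
    while (c.value() < n)
        c.increment();
    return c.value();
}
```

Comparing the generated assembly with and without optimization (for
example via your compiler's option to emit assembly) is the reliable way
to see which variant is smaller on a given platform.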