On 21/12/10 03:08, Joshua Maurice wrote:
Well, that could have made sense in 1990, but for a modern software
component in 2010 it is more important to have in mind things like:
(1) Robust software that fails gracefully without undefined behavior.
All errors must always produce the same result and must be
anticipated by the library.
(2) Speed is mainly a consequence of the algorithm. Do not risk a
crash to save a few instructions. Remember that an integer
comparison and a (not taken) conditional branch do not disturb
the pipeline and have a cost of 1/(length of the pipeline) cycles.
Do not try to save on software safety! (A minimal sketch of such a
check follows this list.)
(3) Offer simple interfaces with a lot of default values that can be
changed later. Avoid complexity, and use a small brain memory
footprint. Remember that you can buy a RAM extension but brain
memory extensions aren't available. KEEP IT SIMPLE.
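For illustration, here is a minimal sketch (mine, not code from either
library) of the kind of check being argued for: the per-access cost is
one integer comparison plus a conditional branch that is not taken in
the normal, in-range case.

    #include <cstddef>
    #include <stdexcept>

    // Hypothetical bounds-checked accessor. In the common case the
    // branch is not taken, so the extra cost is one compare and one
    // predictable, untaken branch per access.
    inline int checked_get(const int *data, std::size_t size, std::size_t i)
    {
        if (i >= size)
            throw std::out_of_range("index out of range");
        return data[i];
    }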
That's nice. However, in the real world, it's different. For example,
in my company, we sell a product that has to move gigabytes of data at
a time, apply various transformations and cleanup to it, and put it
elsewhere, on a regular, daily basis. The same product is also
used for sub-second response times for SQL queries against this data.
If we coded as you suggested, then we would not meet our performance
criteria. I've done some non-trivial benchmarks of some of our more
critical code paths, written in sane C++ and in sane Java 1.6 on the
Sun JVM, and the sane Java was about 4.5 times slower.
When I switched to using naked Java arrays instead of the container
library's ArrayList, it went down to only ~2.3x slower. And I'm talking
about really simple code, for which the compiler ought to be able to
prove that no polymorphism is going on, yet I still paid the penalty
for using the abstraction. This is fundamentally different from the C++ world
where the implementations will quite likely produce code with
std::vector that is as good as code you wrote by hand to implement a
dynamically resizing array. (One caveat: for types with trivial
construction and no trap values, vector can be slower because it
initializes all of its elements. Generally that's not a big deal.)
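To make that caveat concrete, here is a hedged sketch (my example, not
the benchmark code): constructing a vector of n elements value-initializes
all of them before you overwrite them with real data, while a hand-managed
buffer skips that first pass over the memory.

    #include <vector>
    #include <cstring>
    #include <cstddef>

    // Stand-in for whatever actually produces the data (hypothetical).
    void fill_from_source(char *dst, std::size_t n)
    {
        std::memset(dst, 'x', n);
    }

    void demo(std::size_t n)           // assumes n > 0
    {
        // std::vector<char> v(n) zero-initializes all n elements,
        // an extra pass over memory before the real data arrives.
        std::vector<char> v(n);
        fill_from_source(&v[0], n);

        // A raw buffer leaves the bytes uninitialized until
        // fill_from_source writes them.
        char *raw = new char[n];
        fill_from_source(raw, n);
        delete[] raw;
    }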
Yes, I hesitate to extrapolate too far in a language pissing match
between C++ and Java, but the fact remains that the additional "safety
features" you propose would easily degrade my particular product's
runtime by a factor of 2, and that would drive our customers to
our competitors.
You know how we got a significant performance boost? We changed the
layout of the data from row major to column major. The speedup comes
from the data access patterns and caching.
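A toy illustration of why the layout matters (my sketch, not the
product's code): if the hot loop walks one column at a time, column-major
storage makes those accesses a contiguous sweep, while row-major storage
turns them into large strides that the cache handles poorly.

    #include <vector>
    #include <cstddef>

    // Sum column c of an R x C matrix of doubles, stored two ways.

    // Row-major: element (r, c) lives at r * C + c, so scanning a column
    // jumps through memory with a stride of C elements.
    double sum_column_row_major(const std::vector<double>& m,
                                std::size_t R, std::size_t C, std::size_t c)
    {
        double s = 0.0;
        for (std::size_t r = 0; r < R; ++r)
            s += m[r * C + c];
        return s;
    }

    // Column-major: element (r, c) lives at c * R + r, so the same scan
    // is a contiguous walk that the cache and prefetcher handle well.
    double sum_column_col_major(const std::vector<double>& m,
                                std::size_t R, std::size_t /*C*/, std::size_t c)
    {
        double s = 0.0;
        for (std::size_t r = 0; r < R; ++r)
            s += m[c * R + r];
        return s;
    }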
Big-O is indeed a useful tool, but only when used correctly, and when
you understand its limitations. First, you must be accurately modeling
your problem, and for many years now, desktop and server machines
have not behaved like uniform-cost random-access machines, which
makes the usual Big-O analysis sketchy at best.
This is just not true. My library is as fast as optimized C++ STL
to within a few percent.
This is not surprising since it does essentially the same thing.
Those people do not use the STL either, because they always roll
their own. If your application is so CPU bound that you have to roll
your own list package, you can't afford a vector and you use
a simple array.
No. That's the beauty of the STL. It's almost certainly faster than
what you're going to roll by hand for the common use case. You
generally cannot write a faster dynamically resizing array than
std::vector.
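To make the comparison concrete, here is a sketch (mine, not benchmark
code from this thread) of what a hand-rolled growable array ends up
looking like; it reimplements exactly what std::vector already does,
geometric growth and all, so there is little room left to beat it.

    #include <vector>
    #include <cstring>
    #include <cstddef>

    // Hand-rolled dynamically resizing int array with geometric growth.
    // (Copy constructor/assignment omitted for brevity.)
    struct IntArray {
        int        *data;
        std::size_t size;
        std::size_t cap;

        IntArray() : data(0), size(0), cap(0) {}
        ~IntArray() { delete[] data; }

        void push_back(int v)
        {
            if (size == cap) {                  // grow geometrically
                std::size_t ncap = cap ? cap * 2 : 16;
                int *ndata = new int[ncap];
                std::memcpy(ndata, data, size * sizeof(int));
                delete[] data;
                data = ndata;
                cap  = ncap;
            }
            data[size++] = v;
        }
    };

    // The library version does essentially the same thing, with the
    // option of reserving up front to avoid reallocations entirely.
    void fill(std::vector<int>& v, int n)
    {
        v.reserve(n);
        for (int i = 0; i < n; ++i)
            v.push_back(i);
    }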
Your library is almost certainly measurably slower because you're
maintaining this indexed thingy to track iterator positions so you
can always do constant-time iterator difference. I suspect
you've done some micro benchmarks and completely ignored any
significant cache miss effects from your extra code.
Business projects are interested more in speed than in safety?
You prefer brittle software to software that runs at
essentially the same speed but performs more checks and is safer?
Interesting point of view.
It makes money. That's what a professional programmer is concerned
with at the end of the day. (Well, that and being moral, such as not
killing people, not lying, not stealing. I hope that's implicitly
understood, but just in case it isn't, I'll mention it explicitly.)
This discussion was about whether "mismatch" should receive information
about the length of its second argument and test whether there is a
buffer overflow when the second container is too short. In my opinion
3 more assembler instructions do not make ANY difference.
True, it *could* be that calculating the length between two iterators
is a lengthy operation for some containers. Then my library would
be slower for those containers. I do not design for the worst case;
I just let the worst case be the worst case and try to improve the
average case.
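For concreteness, here is a hedged sketch of the two options being
discussed (my code, not either poster's library; the first form later
became the four-iterator overload of std::mismatch in C++14): either
check against the end of the second range on every step, or pre-compute
its length, which std::distance does in O(1) only for random-access
iterators and in O(n) otherwise.

    #include <utility>
    #include <iterator>

    // Stop at last2 as well, so a too-short second range ends the scan
    // instead of running past its buffer. Extra cost per iteration: one
    // more comparison.
    template <class In1, class In2>
    std::pair<In1, In2> checked_mismatch(In1 first1, In1 last1,
                                         In2 first2, In2 last2)
    {
        while (first1 != last1 && first2 != last2 && *first1 == *first2) {
            ++first1;
            ++first2;
        }
        return std::pair<In1, In2>(first1, first2);
    }

    // Alternative: pre-compute the second range's length. Cheap for
    // random-access iterators; a full walk of the range for list- or
    // tree-style iterators.
    template <class In1, class In2>
    std::pair<In1, In2> checked_mismatch_by_length(In1 first1, In1 last1,
                                                   In2 first2, In2 last2)
    {
        typename std::iterator_traits<In2>::difference_type n =
            std::distance(first2, last2);
        while (first1 != last1 && n > 0 && *first1 == *first2) {
            ++first1;
            ++first2;
            --n;
        }
        return std::pair<In1, In2>(first1, first2);
    }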
It's more than 3 assembly instructions. You're talking about
keeping extra memory for this indexed lookup table to do
constant-time iterator difference, even for iterators into a binary
tree data structure. This is not just 3 assembly instructions. It means more
memory, a lot more instructions for iterator manipulation, insertion,
removal, everything. We're also talking about potentially significant
cache effects, because a simple access now has to touch another
allocated block, potentially far from the first.