Steven Jenkins said:
> It's been a long time since I was involved in one, but I'm
> reasonably confident that we use "standard" benchmarks for large
> procurements.
I think some people have lost sight of what "benchmark" means. For
computer applications, some people have been claiming it's TPS, MIPS,
or whatever form of throughput they are proposing. However, take a
step back and think about "benchmark" in more general terms and you
get a better idea of what a benchmark is. This is what Steven Jenkins
was identifying with his satellite TCP/IP benchmark.
A benchmark is something, anything, by which you can compare.
Typically it is the best of breed at some point or other. Here is an
example:
I play various musical instruments, one of them being the Border
Bagpipe made by Jon Swayne. Jon Swayne is a legend in his own lifetime
to many dancers and many musicians in the UK. For dancers it is
because he is part of Blowzabella, a major musical force in social
dancing throughout the last 25 years. For musicians, and particularly
bagpipers, it is because he took the bagpipe, an instrument known for
not typically being in tune (and if it was, not necessarily in tune
with another bagpipe of the same type, or even by the same maker!),
and created a new standard, a new benchmark if you will, by which
other bagpipes are judged. It's not just Jon Swayne; there are some
other makers too, but between them they changed everyone's perception,
and his pipes are the benchmark by which others are judged (yes, they
really are that good). When you talk to pipers in the UK and mention
his name, there is a respect that is accorded. You don't get that
without good reason. Anyway, I digress.
The benchmark for Steven's satellite test was: did it match the
round-trip criteria? I think Steven's example is absolutely a
benchmark. It's much looser than other benchmarks, but that's not the
point. The point is: did it serve a purpose?
For some people the benchmark will be: does it perform the test
within a given tolerance? For others it may be: how much disk space
does it use? Or: is the latency between packets between X and Y? For
others still it will be: is it faster than X?
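To make the tolerance case concrete, here is a minimal sketch in
Python (purely illustrative; run_case and the two-second limit are
invented for the example) of a benchmark that passes or fails on a
timing tolerance:

    import time

    def within_tolerance(fn, tolerance_s, repeats=5):
        # Time fn several times and take the median, so a single
        # unlucky run doesn't decide the result.
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            fn()
            samples.append(time.perf_counter() - start)
        samples.sort()
        median = samples[len(samples) // 2]
        return median, median <= tolerance_s

    # Hypothetical use: does run_case() complete within 2 seconds?
    #   median, ok = within_tolerance(run_case, tolerance_s=2.0)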
This is where Austin's point comes in: he points out that the latter
test is meaningless because you are comparing apples with oranges,
when you should really be comparing GMO-engineered (optimized) apples
with GMO-engineered (optimized) oranges to get even close to a
meaningful test. Even so, you are still comparing cores to segments,
and it gets a bit messy after that, although they both have pips.
Even so, I once worked for a GIS company (A) that wrote its software
in C with an in-house scripting language. We won the benchmarks when
in competition with other GIS companies. The competition won because
of clever marketing. Their customers lost (*), though, because the
competitor's software was too hard to configure and our marketing
people were not smart enough to identify this and inform the customer
of the problem.
What sort of benchmarks were being tested?
o Time to compute the catchment area of a potential customer base
  within X minutes' drive, given a drive time to the location.
o Time to compute the catchment area of a potential customer base
  within X minutes' drive, given a drive time from the location.
o Time to compute the drive time to a location for a potential
  customer base within X minutes' drive, given a particular post code
  area.
o Time to compute the drive time from a location for a potential
  customer base within X minutes' drive, given a particular post code
  area.
o Think up any other bizarre thing you want.
Times to and from a location may not be the same because of highway
on/off ramps, traffic-light network delay bias, and one-way systems.
Superstores often don't care much about drive time from, but care a
lot about drive time to. For example, drive time from may be 15 mins,
but drive time to may be only 5 mins.
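To show why to and from differ, here is a rough sketch in Python (not
what we used; our stuff was C plus an in-house scripting language, and
the graph layout and names here are invented) of computing both
catchments over a directed road graph, where a one-way street is
simply an edge that exists in only one direction:

    import heapq

    def drive_times(graph, source):
        # Dijkstra over a directed road graph. graph[a] is a list
        # of (b, minutes) edges.
        dist = {source: 0.0}
        heap = [(0.0, source)]
        while heap:
            d, node = heapq.heappop(heap)
            if d > dist.get(node, float("inf")):
                continue
            for nxt, minutes in graph.get(node, []):
                nd = d + minutes
                if nd < dist.get(nxt, float("inf")):
                    dist[nxt] = nd
                    heapq.heappush(heap, (nd, nxt))
        return dist

    def catchments(graph, store, max_minutes):
        # "Drive time to the store" means searching the reversed
        # graph from the store; "drive time from the store" uses
        # the graph as-is. The two sets can and do differ.
        rev = {}
        for a, edges in graph.items():
            for b, minutes in edges:
                rev.setdefault(b, []).append((a, minutes))
        to_store = drive_times(rev, store)
        from_store = drive_times(graph, store)
        return ({n for n, d in to_store.items() if d <= max_minutes},
                {n for n, d in from_store.items() if d <= max_minutes})

Run that over a toy graph with a one-way loop in it and the two
catchment sets come out different, which is exactly the 15-min-from
versus 5-min-to effect above.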
As you can see, the customer requirements are highly subjective, but
the raw input data is hard data - maps and fixed road networks. The
computing time etc., that's also a fixed reality given the hardware.
It's all about perception and need.
I think the "benchmarketing" term is quite apt for most benchmarks.
....and Steven, your story was great. I could really relate to a lot of
that.
Stephen
(*) It's a matter of debate; they also used an in-house language, and
finding non-competitor engineers who used the language was nigh on
impossible, and thus they were very expensive to hire to do the
configuration. Our (A) stuff was not so configurable, but it didn't
need to be.
When were we doing this stuff? 90..94 for me. X11 and Motif were the
cool stuff back then.