(e-mail address removed) (Elijah Bailey) wrote
I want to sort a set of records using STL's sort() function,
but dont see an easy way to do it.
I really do think that qsort() is the way to go.
However, here's what you asked for.
First, you need a strucure to sort.
// Structure needed for sort.
template <int size>struct Record {
char dat[size];
bool operator<(const Record<size>&rhs) const
{ return less_than(dat,rhs.dat); }
};
On some systems (or with some compiler settings!), this might only work
correctly if size is a multiple of some padding value. I leave you to
figure out how to deal with this.
The sort itself is trivially easy.
// The sort itself.
template<int size>void dosort() {
Record<size> *dat = reinterpret_cast<Record<size>*>(data);
std::sort(dat, dat+n);
}
Unfortunately, you can't just call
dosort<m>
because m isn't a compile-time const. This makes sense if you think
about it -- your comments about speed are correct, but what you're doing
is trading speed against code size. The code is expanded in-line, which
means it needs to understand the recordsize before you begin. So, in
order to call it you're going to need a constant. Here's the only way I
can think of to convert a variable into a compile-time const.
// The only way to call it!
void callsort() {
switch (m) {
case 4: dosort< 4>(); break;
case 8: dosort< 8>(); break;
case 12: dosort<12>(); break;
case 16: dosort<16>(); break;
case 20: dosort<20>(); break;
// ... etc...
default: throw "Missing case!";
}
}
You're going to need a case for every possible value of m.
If you test the above code with Microsoft Visual C++ 6.0, you're going to
get an error. This is caused by a compiler bug; if the only place we
instantiate Record<size> is within dosort<size>, the compiler gets it
very wrong (in this case, it will silently always use Record<20>). The
solution is to instantiate Record<size> in callsort, and pass it to
dosort:
template<int size>void dosort(Record<size>unused) {
Record<size> *dat = reinterpret_cast<Record<size>*>(data);
std::sort(dat, dat+n);
}
// The only way to call it!
void callsort() {
switch (m) {
case 4: dosort(Record< 4>()); break;
case 8: dosort(Record< 8>()); break;
case 12: dosort(Record<12>()); break;
case 16: dosort(Record<16>()); break;
case 20: dosort(Record<20>()); break;
// ... etc...
default: throw "Missing case!";
}
}
By explicitly instantiating every Record<n> size, we bypass the Microsoft
compiler bug. Note: I only tested this with values that were a multiple of
4. I don't know if it would work for odd lengths or not (this would be
compiler-dependant anyway).
I think that by now you can see the problem with this approach: code bloat.
(Not talking about source code, but generated code!) All of the in-line
logic to do the sort is being instantiated 5 times. If m can have 200
possible values, the problem is 40 times worse. If m can have 5000 possible
values, you may find that your program is MUCH bigger than the database
you're trying to sort. You mentioned that you were low on space. Even if
you do have enough RAM for this, depending on how your system works,
loading all of this code into memory might actually make your program
seem sluggish.
Have you considered the many virtues of qsort()? It *is* part of C++, and
the specs match your situation so closely, it's as if they custom wrote it
for you.