Phil Carmody
You must be sure that hostile input can't turn the worst case into the
average.
I'm aware of sentences that make sense, and that one doesn't.
I would say "try again", but I wouldn't mean it.
Phil
You must be sure that hostile input can't turn the worst case into the
average.
For what it's worth, if you expand your horizons outside of just Unix
and Linux, things are rather different. For one thing, the problem that
you want to avoid has not "already happened" on systems such as Win32
and OS/2 when the exception that one has to handle is raised. This is
because on such systems the decommitted stack scheme relies upon all
functions performing stack probes. Stack probes happen in the function
prologue, before the meat of the function begins execution, and it is
during those probes that the exceptions are raised, as guard pages and
non-stack areas are referenced by the probing.
My own programs are allowed to assume [...] a correspondingly early
version of POSIX, [...]
There's that narrow horizon right there, in your very own words.
Phil Carmody said:I'm aware of sentences that make sense, and that one doesn't.
I would say "try again", but I wouldn't mean it.
Keith Thompson said:It seemed clear enough to me, though perhaps a bit awkwardly phrased.
Quicksort is O(n log n) in the average case, O(n^2) in the worst
case. But that "average case" assumes random input. If Quicksort
is running in an environment where an attacker can feed it carefully
crafted input, its average case can become O(n^2), where the average
is computed over the set of inputs it actually receives. This can
be used as a denial-of-service attack, which can, in some cases,
constitute a security hole.
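To make that concrete: a textbook Quicksort that always takes the last
element as its pivot hits the worst case on input that is merely sorted,
no clever adversary required. A minimal sketch (the counting harness is
mine, not from any particular post in this thread):

#include <stdio.h>

/* Naive Quicksort, Lomuto partition, last element as pivot. On
   already-sorted input every partition is maximally lopsided, so
   the comparison count grows as n^2 instead of n log n. */
static unsigned long comparisons;

static void naive_qsort(int *a, int lo, int hi)
{
    while (lo < hi) {
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            comparisons++;
            if (a[j] < pivot) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++;
            }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;
        naive_qsort(a, lo, i - 1);  /* recursion depth reaches n here */
        lo = i + 1;
    }
}

int main(void)
{
    enum { N = 5000 };
    static int a[N];
    for (int i = 0; i < N; i++)
        a[i] = i;                   /* sorted input: the bad case */
    naive_qsort(a, 0, N - 1);
    printf("%d elements, %lu comparisons (n*(n-1)/2 = %lu)\n",
           N, comparisons, (unsigned long)N * (N - 1) / 2);
    return 0;
}

On sorted input the printed comparison count matches n*(n-1)/2 exactly;
on random input the same code does roughly n log n comparisons.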
Barry Margolin said:If an attacker wants to make your system spend huge amounts of time
sorting, he can simply feed it arbitrarily large inputs. Regardless of
the algorithm, more input means more CPU time used.
So it's kind of silly to worry about an attacker when designing
something like this, and more useful to design the algorithm around the
expected normal input.
(I don't remember who pointed this out, but yes, in an earlier
article I was sloppy in my use of big-O notation.)
It's probably easier to guard against unreasonably large inputs.
(A slow network connection will do the job nicely.) It's harder
to guard against attacks where a large input causes the program
to consume vastly more resources than another input of exactly the
same size.
If I were designing a program that needs to sort arrays whose
contents are not under my control, I'd avoid using an algorithm
whose run-time behavior can vary tremendously depending on the
nature of the input -- not just because of the possibility of
malicious attacks, but because even random data could conceivably
cause the same problem. Even if the problem is unlikely to occur,
it's good to have one less thing to worry about.
If this were difficult to do, I'd stop and consider whether it's
worth the effort, but it shouldn't be. Though a naive Quicksort
algorithm does suffer from this problem, my guess is that most,
perhaps all, library sorting algorithms avoid it, either by tweaking
Quicksort or by using a different algorithm.
A concrete question: Are there any existing C library implementations
of qsort() that are vulnerable to this kind of problem?
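For what a "tweak" might look like: below is a sketch of one common fix,
choosing the pivot at random, so there is no fixed pivot rule for an
attacker to aim a crafted input at. This is the general technique, not
any particular library's code, and rand() is assumed to have been
seeded; a real library would want a stronger source of randomness.

#include <stdlib.h>

/* Quicksort with a randomly chosen pivot. No single precomputed
   input can force the worst case; an attacker would have to predict
   the whole pivot sequence instead. */
static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

static void random_qsort(int *a, int lo, int hi)
{
    while (lo < hi) {
        /* move a random element into the pivot slot */
        swap_int(&a[hi], &a[lo + rand() % (hi - lo + 1)]);
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot)
                swap_int(&a[i++], &a[j]);
        swap_int(&a[i], &a[hi]);
        random_qsort(a, lo, i - 1);
        lo = i + 1;             /* iterate instead of recursing twice */
    }
}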
Keith Thompson said:It's probably easier to guard against unreasonably large inputs.
(A slow network connection will do the job nicely.)
That's great. You run an Internet service that handles a zillion transactions
per day; if someone figures out how to make an algorithm degenerate to its
worst case, he brings your server farm to permanent 100% CPU usage with a
series of malicious packets that use up only 0.0001% of your network
bandwidth. What's your solution? Host the service on a 9600-baud modem!
That'll stop the attack! And your other users too. But they were an
unreasonably large group.
Did anybody read the Algorithmic Complexity Attacks paper before rushing to
post their ignorant opinions?
[...]
If I were designing a program that needs to sort arrays whose
contents are not under my control, I'd avoid using an algorithm
whose run-time behavior can vary tremendously depending on the
nature of the input -- not just because of the possibility of
malicious attacks, but because even random data could conceivably
cause the same problem. Even if the problem is unlikely to occur,
it's good to have one less thing to worry about.
If this were difficult to do, I'd stop and consider whether it's
worth the effort, but it shouldn't be. Though a naive Quicksort
algorithm does suffer from this problem, my guess is that most,
perhaps all, library sorting algorithms avoid it, either by tweaking
Quicksort or by using a different algorithm.
On re-reading this, I realize that the sentence
If this were difficult to do, I'd stop and consider whether it's
worth the effort, but it shouldn't be.
was unclear. I meant that it shouldn't be difficult to do, not
that it shouldn't be worth the effort.
A concrete question: Are there any existing C library implementations
of qsort() that are vulnerable to this kind of problem?
I took a look at the glibc implementation of qsort(). It does
use Quicksort, but with some tweaks. It picks the pivot using a
median-of-three decision tree, it falls back to insertion sort for
small segments, it uses an explicit stack rather than recursion,
and it always handles the larger partition first. I don't know
whether this completely avoids the worst-case N**2 behavior or not,
but it should make it less likely to happen accidentally.
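For readers who haven't seen it, the median-of-three pivot choice is
only a few comparisons. A sketch of the general technique (not glibc's
actual code):

/* Order the first, middle and last elements, then use the middle
   one as the pivot. Sorted and reverse-sorted inputs now yield a
   perfect pivot instead of the worst possible one. */
static int median_of_three(int *a, int lo, int hi)
{
    int mid = lo + (hi - lo) / 2, t;
    if (a[mid] < a[lo]) { t = a[mid]; a[mid] = a[lo]; a[lo] = t; }
    if (a[hi] < a[lo])  { t = a[hi];  a[hi] = a[lo];  a[lo] = t; }
    if (a[hi] < a[mid]) { t = a[hi];  a[hi] = a[mid]; a[mid] = t; }
    return mid;                 /* a[mid] is the median of the three */
}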
Keith Thompson said:I took a look at the glibc implementation of qsort(). It does
use Quicksort, but with some tweaks. It picks the pivot using a
median-of-three decision tree, it falls back to insertion sort for
small segments, it uses an explicit stack rather than recursion,
and it always handles the larger partition first. I don't know
whether this completely avoids the worst-case N**2 behavior or not,
but it should make it less likely to happen accidentally.
Ben Pfaff said:It also uses merge sort instead of quicksort when enough memory
is available.
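The shape of that strategy is roughly the following sketch, which is my
illustration of the idea rather than glibc's code: merge sort has a
guaranteed O(n log n) worst case but needs an n-element scratch buffer,
so it is only used when that buffer can be allocated.

#include <stdlib.h>

/* Top-down merge sort over a[lo, hi), using tmp as scratch space. */
static void msort(int *a, int *tmp, size_t lo, size_t hi)
{
    if (hi - lo < 2)
        return;
    size_t mid = lo + (hi - lo) / 2;
    msort(a, tmp, lo, mid);
    msort(a, tmp, mid, hi);
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    while (lo < hi) { a[lo] = tmp[lo]; lo++; }
}

/* Dispatch: merge sort if the scratch buffer fits in memory,
   otherwise the caller falls back to an in-place quicksort
   (omitted here). Returns 0 on success, -1 if it must fall back. */
int sort_ints(int *a, size_t n)
{
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    msort(a, tmp, 0, n);
    free(tmp);
    return 0;
}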
It does *not*. Again, I recommend Musser's original Introsort paper
for a quite interesting description of how to generate worst-case
sequences for the common variants of Quicksort. It's actually
remarkably straightforward.
You just shuffle the input. Then it becomes far more likely that the
computer will break than that you hit a worst case or even a bad case.
Where are we going to get the entropy to do a good shuffle?
From the clock.
The attacker can still time his input so that the clock comes up with
the right number, but that's likely to be challenging. Certainly it
will be a lot harder than getting the source for Microsoft Visual C++
qsort(), and engineering an input that takes N^2 iterations to
complete.
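The shuffle itself is short. A sketch, seeded from the clock as
suggested; note that time() ticks once per second, which is precisely
the window the attacker above would try to hit, and rand() with a
modulo has a slight bias that doesn't matter for this purpose:

#include <stdlib.h>
#include <time.h>

/* Fisher-Yates shuffle. After this, the element order no longer
   determines the pivot choices, so a precomputed killer input is
   defused -- unless the attacker can predict the seed. */
static void shuffle(int *a, size_t n)
{
    srand((unsigned)time(NULL));    /* 1-second resolution: weak seed */
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);   /* 0 <= j <= i */
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}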
Ben Bacarisse said:
James Kuyper said:
On May 5, 11:37 am, (e-mail address removed) wrote:
Hello
If a program recurses too deeply there's always a chance that it could
run out of stack space and die with a SIGSEGV. Is there any API that can
tell you how much stack space is left, or some other method to prevent
this in C/C++? I realise I could catch the signal but that's a pretty
damn ugly way to do it.
B2003
The only proper way to avoid stack overflows is to prevent them from
happening in the first place.
And how do you do that? If the stack size is sufficiently limited, even
int main(int argc, char *argv[]) { return 0; }
could overflow the stack.
Hmm... only in the most contrived implementation.
[snip elaboration]
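There is no portable answer in standard C, but on POSIX systems one can
build a rough heuristic from getrlimit(RLIMIT_STACK): record an address
near the stack base at startup and measure how far the stack has grown.
A sketch; the pointer subtraction between distinct objects is formally
undefined behaviour, and the whole thing assumes one contiguous,
downward-growing stack, so treat it as a platform-specific trick, not a
solution:

#include <stdio.h>
#include <sys/resource.h>

static char *stack_base;            /* set near the top of main() */

static size_t stack_used(void)
{
    char here;
    return (size_t)(stack_base - &here);   /* assumes downward growth */
}

static void recurse(int depth, rlim_t limit)
{
    volatile char pad[16384];       /* burn some stack on purpose */
    pad[0] = 0;
    if (stack_used() + 2 * sizeof pad > limit) {
        printf("stopping at depth %d, ~%zu bytes used\n",
               depth, stack_used());
        return;
    }
    recurse(depth + 1, limit);
}

int main(void)
{
    char base;
    struct rlimit lim;
    stack_base = &base;
    if (getrlimit(RLIMIT_STACK, &lim) != 0 || lim.rlim_cur == RLIM_INFINITY)
        return 1;                   /* no usable limit to compare against */
    recurse(0, lim.rlim_cur);
    return 0;
}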
Keith Thompson said:On re-reading this, I realize that the sentence
If this were difficult to do, I'd stop and consider whether it's
worth the effort, but it shouldn't be.
was unclear. I meant that it shouldn't be difficult to do, not
that it shouldn't be worth the effort.
I took a look at the glibc implementation of qsort(). It does
use Quicksort, but with some tweaks. It picks the pivot using a
median-of-three decision tree, it falls back to insertion sort for
small segments, it uses an explicit stack rather than recursion,
and it always handles the larger partition first.
I don't know
whether this completely avoids the worst-case N**2 behavior or not,
but it should make it less likely to happen accidentally.
Odd, in order to reduce actual recursion (and do tail recursion
instead), and to keep locality of reference tight, I'd handle the
smaller part first were I to use an actual quicksort.
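For the record, that version looks something like the sketch below
(mine, with a middle-element pivot for brevity). Because each recursive
call covers the smaller side, it handles at most half the remaining
range, so the stack depth stays within about log2(n) frames even when
the partitioning is badly unbalanced:

/* Quicksort that recurses only into the smaller partition and
   loops on the larger one, bounding stack depth to O(log n). */
static void qsort_small_first(int *a, int lo, int hi)
{
    while (lo < hi) {
        int pivot = a[lo + (hi - lo) / 2];
        int i = lo, j = hi;
        while (i <= j) {                /* Hoare-style partition */
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++; j--;
            }
        }
        if (j - lo < hi - i) {
            qsort_small_first(a, lo, j);    /* smaller side: recurse */
            lo = i;                         /* larger side: iterate */
        } else {
            qsort_small_first(a, i, hi);
            hi = j;
        }
    }
}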