You've quoted me out of context. I wasn't asking for justification for
exceptions in general. There's no doubt that they're useful. We were
specifically talking about NAN == NAN raising an exception rather than
returning False.
It's arguable that NaN itself simply shouldn't exist in Python; if the FPU
ever generates a NaN, Python should raise an exception at that point.
But given that NaNs propagate in almost the same manner as exceptions,
you could "optimise" this by treating a NaN as a special-case
implementation of exceptions, and turn it into a real exception at the
point where you can no longer use a NaN (e.g. when using a comparison
operator).
This would produce the same end result as raising an exception
immediately, but would reduce the number of isnan() tests.
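
As a rough sketch of what I mean (checked_eq is a hypothetical helper, not anything Python provides): the NaN is allowed to flow through the arithmetic, and it only becomes a real exception at the comparison, where it can no longer be carried along.

```python
import math

def checked_eq(a, b):
    """Hypothetical helper: equality that raises if either operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        raise ValueError("NaN reached a comparison")
    return a == b

nan = float("nan")

# The NaN propagates silently through the arithmetic, much like an
# exception that has not yet been delivered.
result = (nan + 1.0) * 2.0

# Only the comparison turns it into a real exception.
try:
    checked_eq(result, 4.0)
    outcome = "no error"
except ValueError as exc:
    outcome = "raised: " + str(exc)

print(outcome)
```

Only the comparison needs the isnan() test; none of the intermediate arithmetic does.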
I'm not sure what "not representable" is supposed to mean,
Consider sqrt(-1). This is defined (as "i" aka "j"), but not representable
as a floating-point "real". Making root/log/trig/etc functions return
complex numbers when necessary would probably be inappropriate for a language
such as Python.
but if by "undefined" you mean "invalid", then correct.
I mean undefined, in the sense that 0/0 is undefined (I note that Python
actually raises an exception for "0.0/0.0").
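
To make the distinction concrete, here is how current CPython treats these cases, as far as I'm aware (describe is just a throwaway helper for this illustration): the "undefined" and "not representable as a real" cases raise, the complex-aware function returns a value, and some other invalid operations still quietly hand back a NaN.

```python
import cmath
import math

def describe(thunk):
    """Throwaway helper: return repr of the result, or the exception name."""
    try:
        return repr(thunk())
    except (ValueError, ZeroDivisionError) as exc:
        return type(exc).__name__

inf = float("inf")

zero_over_zero = describe(lambda: 0.0 / 0.0)      # undefined: raises
real_sqrt = describe(lambda: math.sqrt(-1.0))     # not representable as a real: raises
complex_sqrt = describe(lambda: cmath.sqrt(-1))   # representable as a complex
inf_minus_inf = describe(lambda: inf - inf)       # invalid, but yields a NaN

print(zero_over_zero, real_sqrt, complex_sqrt, inf_minus_inf)
```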
Not necessarily. William Kahan gives an example where passing a NAN to
hypot can justifiably return INF instead of NAN.
Hmm. Is that still true if the NaN signifies "not representable" (e.g.
known but complex) rather than undefined (e.g. unknown value but known to
be real)?
While it's certainly
true that *mostly* any intermediate NAN results in a NAN, that's not a
guarantee or requirement of the standard. A function is allowed to
convert NANs back to non-NANs, if it is appropriate for that function.
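
For illustration, these are the cases I know of where Python's own float operations consume a NaN and return an ordinary value. The hypot case is the one mentioned above; the two power cases follow the C99 conventions.

```python
import math

nan = float("nan")
inf = float("inf")

# An infinite side makes the hypotenuse infinite whatever the other side is.
h = math.hypot(inf, nan)

# x ** 0 is 1.0 for every x, and 1.0 ** y is 1.0 for every y.
p = nan ** 0
q = 1.0 ** nan

print(h, p, q)
```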
Another example is the Kronecker delta:
    def kronecker(x, y):
        if x == y: return 1
        return 0
This will correctly consume NAN arguments. If either x or y is a NAN, it
will return 0. (As an aside, this demonstrates that having NAN != any
NAN, including itself, is useful, as kronecker(x, x) will return 0 if x
is a NAN.)
How is this useful? On the contrary, I'd suggest that the fact that
kronecker(x, x) can return 0 is an argument against the "NaN != NaN" axiom.
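
For reference, this is the behaviour under dispute, run against the function as given (rewritten as a one-liner, but otherwise unchanged):

```python
def kronecker(x, y):
    # Kronecker delta: 1 when the arguments compare equal, otherwise 0.
    return 1 if x == y else 0

nan = float("nan")

same_number = kronecker(2.0, 2.0)   # ordinary floats: 1
same_nan = kronecker(nan, nan)      # the very same NaN object on both sides: 0

print(same_number, same_nan)
```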
A case where the semantics of exceptions differ from those of NaN is:
    def cond(t, x, y):
        if t:
            return x
        else:
            return y
as cond(True, x, nan()) will return x, while cond(True, x, raise()) will
raise an exception.
But this is a specific instance of a more general problem with strict
languages, i.e. strict functions violate referential transparency.
This is why even strict languages (i.e. almost everything except for a
handful of functional languages which value mathematical purity, e.g.
Haskell) have non-strict conditionals. If you remove the conditional from
the function and write it in-line, then:
    if True:
        return x
    else:
        raise()
behaves like NaN.
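
A runnable version of the comparison, with fail() standing in for the pseudo-call raise() (the name is made up for this sketch):

```python
def cond(t, x, y):
    # Strict function: both x and y are evaluated before the call happens.
    return x if t else y

def fail():
    # Stand-in for the "raise()" pseudo-call in the text.
    raise ValueError("invalid calculation")

nan = float("nan")

# A NaN in the untaken argument is simply discarded.
with_nan = cond(True, 1.0, nan)

# An exception in the untaken argument fires while the arguments are
# being evaluated, before cond() ever gets to choose.
try:
    cond(True, 1.0, fail())
    strict = "returned"
except ValueError:
    strict = "raised"

# Written in-line, the conditional is non-strict: fail() is never called.
inline = 1.0 if True else fail()

print(with_nan, strict, inline)
```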
Also, note that the "convenience" of NaN (e.g. not propagating from the
untaken branch of a conditional) is only available for floating-point
types. If it's such a good idea, why don't we have it for other types?
Equality comparison is another such function. There's no need for
NAN == NAN to fail, because the equality operation is perfectly well
defined for NANs.
The definition is entirely arbitrary. You could just as easily define that
(NaN == NaN) is True. You could just as easily define that "1 + NaN" is 27.
Actually, "NaN == NaN" makes more sense than "NaN != NaN", as the former
upholds the equivalence axioms and is consistent with the normal behaviour
of "is" (i.e. "x is y" => "x == y", even if the converse isn't necessarily
true).
If you're going to argue that "NaN == NaN" should be False on the basis
that the values are sentinels for unrepresentable values (which may be
*different* unrepresentable values), it follows that "NaN != NaN" should
also be False for the same reason.
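
For reference, this is what Python actually returns for each comparison operator when both operands are the same NaN. Only "!=" comes out True.

```python
nan = float("nan")

# Every comparison of a NaN with itself, keyed by operator.
results = {
    "==": nan == nan,
    "!=": nan != nan,
    "<": nan < nan,
    "<=": nan <= nan,
    ">": nan > nan,
    ">=": nan >= nan,
}

for op, value in results.items():
    print("nan", op, "nan ->", value)
```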
What relevance does bool have?
The result of comparisons is a bool.
NAN means "this is a sentinel marking that an invalid calculation was
attempted". For the purposes of numeric calculation, it is often useful
to allow those sentinels to propagate through your calculation rather
than to halt the program, perhaps because you hope to find that the
invalid marker ends up not being needed and can be ignored, or because
you can't afford to halt the program.
Does INVALID == INVALID?
Either True or INVALID. You can make a reasonable argument for either.
Making a reasonable argument that it should be False is much harder.
If you can cope with the question "Is an apple equal to a puppy dog?"
It depends upon your definition of equality, but it's not a particularly
hard question. And completely irrelevant here.
So what should NAN == NAN equal? Consider the answer to the apple and
puppy dog comparison. Chances are that anyone asked that will give you a
strange look and say "Of course not, you idiot". (In my experience, and
believe it or not I have actually tried this, some people will ask you to
define equality. But they're a distinct minority.)
If you consider "equal to" to mean "the same as", then the answer is
clear and obvious: apples do not equal puppies,
This is "equality" as opposed to "equivalence", i.e. x and y are equal if
and only if f(x) and f(y) are equal for all f.
and any INVALID sentinel is not equal to any other INVALID.
This does not follow. Unless you explicitly define the sentinel to be
unequal to itself, the strict equality definition holds, as NaN tends to
be a specific bit pattern (multiple bit patterns are interpreted as NaN,
but operations which result in a NaN will use a specific pattern, possibly
modulo the sign bit).
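
A small sketch of the bit-pattern point, assuming IEEE 754 doubles (which is what CPython uses on every platform I know of). The helpers from_bits and to_bits are made up for this example; the two patterns are quiet NaNs differing only in the payload.

```python
import math
import struct

def from_bits(bits):
    """Reinterpret a 64-bit integer as an IEEE 754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

def to_bits(x):
    """Reinterpret a double as its 64-bit integer pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

# Exponent field all ones with a non-zero fraction: both are NaNs.
a = from_bits(0x7FF8000000000000)
b = from_bits(0x7FF8000000000001)

print(math.isnan(a), math.isnan(b))

# The pattern an invalid operation actually produces on this machine.
inf = float("inf")
print(hex(to_bits(inf - inf)))
```

The exact pattern printed on the last line is platform-dependent, which is why I say "possibly modulo the sign bit".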
If you want to argue that "NaN == NaN" should be False, then do so. Simply
asserting that it should be False won't suffice (nor will citing the IEEE
FP standard *unless* you're arguing that "because the standard says so" is
the only reason required).
(Remember, NAN is not a value itself, it's a sentinel representing the
fact that you don't have a valid number.)
I'm aware of that.
So NAN == NAN should return False,
Why?
just like the standard states, and NAN != NAN should return True.
Why?
In both cases, the more obvious result should be some kind of sentinel
indicating that we don't have a valid boolean. Why should this sentinel
propagate through arithmetic operations but not through logical operations?
Yes, that's a consequence of NAN behaviour.
Another consequence:
    >>> x = float("nan")
    >>> x is x
    True
    >>> x == x
    False
Ordinarily, you would consider this behaviour a bug in the class' __eq__
method.
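
A related illustration: the built-in containers check identity before equality, so the same NaN gives different answers depending on whether you ask the float or the list that holds it.

```python
x = float("nan")

direct = x == x                  # False: the float's own __eq__
in_list = x in [x]               # True: list membership tries "is" first
lists_equal = [x] == [x]         # True: element comparison tries "is" first
count = [x].count(x)             # 1, for the same reason

# Two distinct NaN objects get no help from the identity shortcut.
distinct = [float("nan")] == [float("nan")]

print(direct, in_list, lists_equal, count, distinct)
```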
I can *live* with it (not that I have much choice), but that doesn't mean
that it's correct or even anything short of downright stupid.
There is a good, solid reason: it's a *useful* standard
Debatable.
that *works*,
Debatable.
proven in practice,
If anything, it has proven to be a major nuisance. It takes a lot of
effort to create (or even specify) code which does the right thing in the
presence of NaNs.
Turning NaNs into exceptions at their source wouldn't make it
significantly harder to write correct code (there are a handful of cases
where the existing behaviour produces the right answer almost by accident,
far more where it doesn't), and would mean that "simple" code (where NaN
hasn't been explicitly considered) raises an exception rather than
silently producing a wrong answer.
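
One example of "simple" code going quietly wrong: the built-in max() gives a different answer depending on where the NaN sits in the argument list, and never complains.

```python
import math

nan = float("nan")

# max() keeps its current candidate unless a later item compares greater.
first = max(1.0, nan)    # 1.0: "nan > 1.0" is False, so 1.0 is kept
second = max(nan, 1.0)   # nan: "1.0 > nan" is False, so nan is kept

print(first, second)
```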
invented by people who have forgotten more about
floating point than you or I will ever learn, and we dismiss their
conclusions at our peril.
I'm not aware that they made any conclusions about Python. I don't
consider any conclusions about the most appropriate behaviour for hardware
(which may have no choice beyond exactly /which/ bit pattern to put into a
register) to automatically determine what is the most appropriate
behaviour for a high-level language.
A less good reason: it's a standard. Better to stick to a not-very-good
standard than to have the Wild West, where everyone chooses their own
behaviour. You have NAN == NAN raise ValueError, Fred has it return True,
George has it return False, Susan has it return a NAN, Michelle makes it
raise MathError, somebody else returns Maybe ...
This isn't an issue if you have the language deal with it.
Incorrect. NANs are not "unknowns", or missing values.
You're contradicting yourself here.