I would have thought (though as you can all see I am no numerical analyst)
that an "industrial" sqrt function would want to fiddle with the mantissa
and exponent directly, particularly for small positive numbers, because for
these you either have to be sloppy with epsilon (lovely phrase) as I was, or
do a whole lot of floating point ops as you do.
Indeed, taking the number apart (into 1.bbb...b for some series of
bits denoted by the "b"s, and then an exponent "e") is quite helpful.
If e is even, we need only calculate sqrt(1.bbb...b) and set a new
exponent e/2. If e is odd, we can either use (e-1)/2 or (e+1)/2
as the new exponent, and either double or halve the 1.bbb...b number
before taking the square root. In the first case, we need only
compute sqrt(x) for 1 <= x < 2, and in the second, we need only
compute sqrt(x) for either 0.5 <= x < 1 (using e+1) or 2 <= x < 4
(using e-1). These ranges for x tend to behave nicely, and even
if we use the lattermost range -- the (e-1)/2 method for odd
exponents -- the output from the sqrt(x) operation is already a
normalized number.
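To make the idea concrete, here is a sketch in Python using math.frexp and
math.ldexp as stand-ins for the bit fiddling (frexp gives a mantissa in
[0.5, 1) rather than the 1.bbb...b form, but the exponent-halving trick is the
same). The Newton iteration and its seed are my own choices for illustration,
not anyone's production code:

```python
import math

def my_sqrt(x):
    """sqrt by exponent halving: split x = m * 2**e, make e even,
    take the root of m on a narrow range, then put the halved
    exponent back."""
    if x == 0.0:
        return x
    m, e = math.frexp(x)    # 0.5 <= m < 1, x = m * 2**e
    if e % 2:               # odd exponent: borrow one factor of 2
        m *= 2.0            # now 1 <= m < 2
        e -= 1              # e is now even
    # m lies in [0.5, 2), which "behaves nicely": a crude seed plus
    # a few Newton steps converge to full double precision.
    r = (m + 1.0) / 2.0
    for _ in range(5):
        r = 0.5 * (r + m / r)
    return math.ldexp(r, e // 2)
```

Note that when e is odd this uses the (e-1)/2 choice, so the reduced argument
lies in [1, 2) and the result of the root is already normalized.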
Of course, if we do this with an IEEE-style system, we have to
check for zero, infinity, and NaN first; and then we have to handle
denormals by normalizing them (using a wider-range exponent before
halving as above). (Note that sqrt(denorm) is always a normalized
number.)
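The special-case checks might look like this (again just a sketch; in Python,
frexp happens to normalize denormals for you, handing back a normalized
mantissa with a wider-range exponent, so the separate normalization step a
bit-level implementation needs is implicit here):

```python
import math

def checked_sqrt(x):
    # IEEE-style special cases, checked before any bit fiddling:
    if math.isnan(x):
        return x                  # NaN propagates
    if x < 0.0:
        return float("nan")       # sqrt of a negative (incl. -inf) is NaN
    if x == 0.0 or math.isinf(x):
        return x                  # sqrt(+-0) = +-0, sqrt(+inf) = +inf
    # frexp normalizes denormals: m is in [0.5, 1) and e can go
    # below the normal exponent range, e.g. down to -1073.
    m, e = math.frexp(x)
    if e % 2:
        m *= 2.0
        e -= 1
    return math.ldexp(math.sqrt(m), e // 2)
```

Since the smallest denormal is 2^-1074 and its root is about 2^-537, well
inside the normal range, the output really is always a normalized number.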