Rogan Dawes
Hi folks,
I noticed a weird problem in a bit of code that I was writing. I wanted
to display a hierarchy of URLs in a TreeModel. In doing so, I was
adding URLs to a HashSet. While I was testing, I was using invalid
URLs, like "http://abcd/", etc.
What happened was that the calls to set.contains(url) and set.add(url)
were showing delays of up to 4 seconds executing the methods. And this
changed, depending on whether I was using a HashSet or a TreeSet with a
custom Comparator.
What turned out to be the problem is that the URL.hashCode() method was
actually trying to resolve the address of the hostname that was
specified, via the protocol-specific handler. And obviously, the
hostnames ("abcd") I was using did not exist, and the DNS resolution was
taking some time to time out.
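This would also explain the difference between the two kinds of set: a
TreeSet built with a Comparator never calls hashCode() or equals() on
its elements, only the Comparator. Below is a minimal sketch of that
idea, assuming it is acceptable to order URLs purely by their text form.
The class name UrlTreeSetSketch and the helpers newUrlSet() and url()
are made up for illustration and are not from my actual code.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: a URL set that never consults URL.hashCode()/equals().
class UrlTreeSetSketch {
    // Compare URLs by their string form only, so no host lookup is needed.
    static final Comparator<URL> BY_TEXT = Comparator.comparing(URL::toExternalForm);

    // A TreeSet with a Comparator uses only that Comparator for lookups.
    static Set<URL> newUrlSet() {
        return new TreeSet<>(BY_TEXT);
    }

    // Helper that builds a URL from a string without a checked exception.
    static URL url(String text) {
        try {
            return URI.create(text).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        Set<URL> set = newUrlSet();
        set.add(url("http://abcd/"));
        // Membership is decided by string comparison, not by resolving "abcd".
        System.out.println(set.contains(url("http://abcd/"))); // prints true
    }
}
```

Note that with this comparator, two URLs that differ only in the case
of the hostname are treated as different entries.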
Am I the only person to think that this is a completely STUPID design?
I can think of many scenarios where one may want to keep even valid
URLs in a Set, without being able to resolve them to an IP address. For
example, a web scanner that works in a private environment (using a
corporate DNS), where the results may be reviewed on a machine outside
of the environment, with no access to the internal DNS servers.
This problem makes it almost impossible to use this kind of data
structure in an offline environment.
Does anyone have any suggestions on how to get around this issue? At
this point, I am thinking of simply copying the URL class into my own
code, and removing all traces of this idiocy.
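One alternative to copying the class might be to keep java.net.URI
objects in the Set instead, since URI.hashCode() and URI.equals() work
on the components of the identifier and do not perform any name
resolution. The following is only a sketch under that assumption; the
class name UriSetSketch and the helper toSet() are hypothetical.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: store URI objects rather than URL objects in a HashSet.
class UriSetSketch {
    // Parse each address into a URI and collect the results in a HashSet.
    static Set<URI> toSet(String... addresses) {
        Set<URI> set = new HashSet<>();
        for (String address : addresses) {
            set.add(URI.create(address));
        }
        return set;
    }

    public static void main(String[] args) {
        Set<URI> set = toSet("http://abcd/", "http://abcd/", "http://efgh/");
        // The duplicate is collapsed without any attempt to resolve "abcd".
        System.out.println(set.size()); // prints 2
        System.out.println(set.contains(URI.create("http://abcd/"))); // prints true
    }
}
```

A URL can still be obtained from an entry with URI.toURL() at the point
where a connection is actually needed.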
Regards,
Rogan