proxy bafflement

Roedy Green

This is not so much a question about Java as about HTTP.

I wrote a program called Brokenlinks that tests all the links on my
website and tells me about permanent redirects and sites that have
stayed dead for 6+ days. (It filters out temporary outages).
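
The "dead for 6+ days" filter described above can be sketched roughly as follows. This is not the actual Brokenlinks code; the class name DeadLinkTracker, the method record, and the in-memory map are assumptions made purely for illustration. The idea is to remember when a link first failed and report it only once it has failed continuously for the whole grace period.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: suppress temporary outages, report only long-dead links.
public class DeadLinkTracker {
    // Grace period taken from the description above: 6 days.
    private static final Duration GRACE = Duration.ofDays(6);

    // For each URL currently failing, the time of its first observed failure.
    private final Map<String, Instant> firstFailure = new HashMap<>();

    /**
     * Record one probe result. Returns true only when the link has been
     * failing continuously for GRACE or longer.
     */
    public boolean record(String url, boolean alive, Instant now) {
        if (alive) {
            firstFailure.remove(url); // any success resets the clock
            return false;
        }
        Instant since = firstFailure.computeIfAbsent(url, k -> now);
        return Duration.between(since, now).compareTo(GRACE) >= 0;
    }

    public static void main(String[] args) {
        DeadLinkTracker tracker = new DeadLinkTracker();
        Instant start = Instant.now();
        // Simulate a link that fails on day 0 and is still failing on day 6.
        System.out.println(tracker.record("http://ask.com/", false, start));
        System.out.println(tracker.record("http://ask.com/", false, start.plus(GRACE)));
    }
}
```

A real checker would persist the map between runs; keeping it in memory just keeps the sketch short.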

Something strange is happening with http://ask.com

My program, written in Java, says it is always dead. Firefox often
thinks it is alive.

I fired up Wireshark to see if I could figure out what was going on.

The first thing I noticed was this

GET /tbproxy/lh/fixurl?hl=en-US&sd=ca&url=http%3A%2F%2Fask.com%2F&sourceid=chrome&error=connectionfailure HTTP/1.1
Host: linkhelp.clients.google.com

Somehow my request is sometimes going through a Google proxy. But
WHY? What has Google got to do with this? I am using Firefox, not
Chrome.

Further, I notice Chrome often says "resolving proxy" and dithers for
a long time loading a LOCAL file off my hard disk. Why would a web
proxy be involved at all?

Opera has been completely unusable for about two years because it
takes about a minute to load a file off local disk (it is fine on the
web). I thought it had something to do with Google JavaScript fetching
ads and the translate widget slowly, but maybe that too is a proxy problem. The
Opera people ignored me every time I reported the problem.

I always thought proxies just did a bit of caching, but otherwise you
could ignore them. Yet it seems clear that my browser is speaking a
different protocol from usual. It seems to be aware of the proxy's presence.

My questions:

If my IAP is using a proxy, what are the benefits?

Is there a way to sneak around it in case it is screwing things up?

Why Google? Do they host proxies for IAPs? Is ASK owned by Google?

Is my Java program supposed to be proxy-aware and do something
different?
 
Roedy Green

Is there a way to sneak around it in case it is screwing things up?

I posted what I have discovered so far at
http://mindprod.com/jgloss/proxy.html

Another puzzle has come up. Where does Windows get its notion of
what proxy to use? It is not part of the DHCP protocol. The
docs I have found suggest it has to be told by manually configuring an
IP. I most definitely did not do that. I wonder if some Google
product did it.
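
One way to see which proxy, if any, a Java program would be handed for a given URL is to ask the JVM's default ProxySelector. The sketch below is only a diagnostic aid; the class name ShowProxy is made up for illustration. The java.net.useSystemProxies property asks Java to consult the operating system's proxy settings on platforms where that is supported, and it is read only once at startup, so passing -Djava.net.useSystemProxies=true on the command line is the safer route.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

// Hypothetical diagnostic: list the proxies Java would try for one URL.
public class ShowProxy {
    /** Ask the default selector which proxies apply to the given address. */
    static List<Proxy> proxiesFor(String address) {
        return ProxySelector.getDefault().select(URI.create(address));
    }

    public static void main(String[] args) {
        // Best effort only: the property must be set before any networking
        // code runs, which is why the -D command-line form is more reliable.
        System.setProperty("java.net.useSystemProxies", "true");
        for (Proxy p : proxiesFor("http://ask.com/")) {
            // A direct connection shows up as an entry of type DIRECT.
            System.out.println(p);
        }
    }
}
```

If this lists only a direct connection, Java itself is not being pointed at a proxy, and whatever Wireshark shows is coming from somewhere else.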
 
Andreas Leitgeb

Roedy Green said:
My program, written in Java, says it [ask.com] is always dead.
Firefox often thinks it is alive.
I fired up Wireshark to see if I could figure out what was going on.
The first thing I noticed was this
GET /tbproxy/lh/fixurl?hl=en-US&sd=ca&url=http%3A%2F%2Fask.com%2F&sourceid=chrome&error=connectionfailure HTTP/1.1
Host: linkhelp.clients.google.com

The "error=connectionfailure" makes it look like the result of some
other software that gets triggered by failed connect-attempts and
tries to retrieve separate information for the failure. It could
be a browser-addon, but perhaps could also be software installed
at system level (Chrome surely looks quite suspicious for that).

Doesn't Wireshark show some SYN packets going to ask.com before this
line?

Opera has been completely unusable for about two years because it
takes about a minute to load a file off local disk (it is fine on the
web).

Cannot reproduce that with Opera on my machine. (And perhaps the
Opera people weren't able to, either.) Perhaps you use network drives
mounted locally? Or an over-eager virus-scanner that intercepts
local file reads? Or such without the "-scanner"...

Is there a way to sneak around it in case it is screwing things up?

Subscribe to a different one. But your problem doesn't so far look
like it was the IAP's responsibility.
While they could force a more or less transparent proxy upon all your
connections to the outside world, they surely have no means to interfere
with local file accesses, unless something goes wrong in the browser.
 
Roedy Green

Cannot reproduce that with opera on my machine

It would be helpful if you performed this experiment.

Download a typical page from my website, e.g.
http://mindprod.com/jgloss/wireshark.html

and save it on local hard disk.

Then try loading it with Opera.

If it loads fine that suggests there is something weird with my
machine.
If it is very slow to render, that suggests there is something strange
about my pages.
 
Roedy Green

Cannot reproduce that with opera on my machine. (and perhaps the
opera-people weren't able, either.) Perhaps you use network-drives
mounted locally? Or an over-eager virus-scanner, that intercepts
local file reads? Or such without the "-scanner"...

In trying to track this down I turned off dynamic virus checking,
Windows index, Copernic index. I have most of the accelerators that load
at boot time, e.g. Open Office, turned off. This proxy thing looks
like the best lead so far since it is in there without my permission.

There was another anomaly that seems to have gone away. Timeouts in my
Java HTTP GET code sometimes would not time out. They just sat
there forever.
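
One possible cause of a GET that never times out, offered only as a guess about this case: HttpURLConnection has two separate timeouts, connect and read, and both default to 0, which means wait indefinitely. Setting only one of them leaves the other open-ended. A minimal sketch, with the class name TimedGet and the 10 second figure chosen arbitrarily:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical sketch: a GET with both timeouts set explicitly.
public class TimedGet {
    /** Prepare (but do not yet open) a connection with both timeouts applied. */
    static HttpURLConnection open(String address, int timeoutMillis) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(address).toURL().openConnection();
            conn.setConnectTimeout(timeoutMillis); // limit on establishing the connection
            conn.setReadTimeout(timeoutMillis);    // limit on each blocking read after that
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn = open("http://ask.com/", 10_000);
        try (InputStream in = conn.getInputStream()) {
            // Either timeout surfaces as a java.net.SocketTimeoutException.
            System.out.println(conn.getResponseCode() + ", "
                    + in.readAllBytes().length + " bytes");
        } finally {
            conn.disconnect();
        }
    }
}
```

Note that the read timeout applies to each read, not to the whole transfer, so a server that trickles out a few bytes at a time can still hold a connection open much longer than the nominal limit.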

Maybe it is time for another Windows 7 install to scrape off the
barnacles.

Where does your father keep his barnacles?
~ One of Charles Darwin’s children to a school friend.
 
Gene Wirchenko

It would be helpful if you performed this experiment.

Download a typical page from my website, e.g.
http://mindprod.com/jgloss/wireshark.html

and save it on local hard disk.

Then try loading it with Opera.

If it loads fine that suggests there is something weird with my
machine.
If it is very slow to render, that suggests there is something strange
about my pages.

I do not use Opera, but I tried this with Firefox 8.0 under
Windows XP SP 3. The render from file was fine.

Sincerely,

Gene Wirchenko
 
Roedy Green

I do not use Opera, but I tried this with Firefox 8.0 under
Windows XP SP 3. The render from file was fine.

For me Chrome works quite well from local disk. Opera is impossible,
and Firefox is usually ok, but sometimes takes a long time.

The catch is Chrome can't do Applets, so I have to toggle back and
forth between Chrome and Firefox. Firefox has a weird stuttering
problem with Java. All is going fine and then it gets it into its head to
reload the page without the Java. I manually reload the page and it
comes back. Even when it is working fine, Firefox calls my init
methods twice when a page loads.


The browser people seem to be trying as hard as Microsoft to kill
Java.
 
Steven Simpson

GET /tbproxy/lh/fixurl?hl=en-US&sd=ca&url=http%3A%2F%2Fask.com%2F&sourceid=chrome&error=connectionfailure HTTP/1.1
Host: linkhelp.clients.google.com

Somehow my request is sometimes going through a Google proxy. But
WHY? What has Google got to do with this? I am using Firefox, not
Chrome.

Are you sure the request is from your program? Does User-Agent show it
to be Java, at least? You could set it to be more specific, to be sure
it's /your/ Java program.
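
Setting a distinctive User-Agent could look something like the sketch below. The class name TaggedRequest and the agent string are invented for illustration. If no User-Agent is set, HttpURLConnection normally sends one of the form Java/<version>, so even the default should be recognisable in a Wireshark capture.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical sketch: tag requests so they can be picked out of a capture.
public class TaggedRequest {
    // Made-up identifier; anything unlikely to be sent by a browser will do.
    static final String USER_AGENT = "Brokenlinks-test/0.1 (Java)";

    /** Prepare a connection that will send our own User-Agent header. */
    static HttpURLConnection open(String address) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(address).toURL().openConnection();
            conn.setRequestProperty("User-Agent", USER_AGENT);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // No network traffic yet: this only shows the header that would be sent.
        System.out.println(open("http://ask.com/").getRequestProperty("User-Agent"));
    }
}
```

Any captured request that carries a browser's User-Agent instead, or a sourceid=chrome parameter, was presumably not issued by this code.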
 
Andreas Leitgeb

Roedy Green said:
I fired up Wireshark to see if I could figure out what was going on.
The first thing I noticed was this
GET /tbproxy/lh/fixurl?hl=en-US&sd=ca&url=http%3A%2F%2Fask.com%2F&sourceid=chrome&error=connectionfailure HTTP/1.1
Host: linkhelp.clients.google.com

Another thing: I googled for "tbproxy fixurl" and got hits mostly talking
about some google-script to be used for a site's 404-page.
 
Roedy Green

Is my Java program supposed to be proxy-aware and do something
different?

In poking around to figure out what is going on I discovered a proxy
can be configured in the Windows control panel. There is also a place
to configure one in the Java Control panel.

I found a note suggesting that if you ticked "automatically detect
settings" in the Windows control panel, you might end up with a proxy.

There are three Java system properties.
System.setProperty( "proxySet", "true" );
System.setProperty( "http.proxyHost", proxyHostName );
System.setProperty( "http.proxyPort", Integer.toString( proxyHostPort ) );

It seems all you need to hook up a proxy is its DNS name or IP. Even
though there are about 5 different proxy protocols, I gather the
machines sort that out themselves.
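
As an alternative to JVM-wide system properties, a proxy can be chosen per connection, which also gives a way to sidestep any configured proxy from Java by passing Proxy.NO_PROXY. The sketch below assumes a plain HTTP proxy; the class name ProxyChoice and its helper methods are made up for illustration. As far as the documented networking properties go, http.proxyHost and http.proxyPort are the ones that matter; proxySet does not appear to be consulted by current JDKs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

// Hypothetical sketch: pick the proxy (or none) for each connection.
public class ProxyChoice {
    /** Describe an HTTP proxy at the given host and port. */
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    /** Prepare a connection via the given proxy; Proxy.NO_PROXY means connect directly. */
    static HttpURLConnection open(String address, Proxy proxy) {
        try {
            return (HttpURLConnection) URI.create(address).toURL().openConnection(proxy);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Ignore any proxy configured for the JVM and go straight to the site.
        HttpURLConnection direct = open("http://ask.com/", Proxy.NO_PROXY);
        direct.setConnectTimeout(10_000);
        direct.setReadTimeout(10_000);
        System.out.println("direct: " + direct.getResponseCode());
        direct.disconnect();
    }
}
```

This only controls what the Java program itself does. A transparent proxy imposed by the IAP sits in the network path and cannot be bypassed this way.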

I also discovered inside Google Chrome in "under the hood" you can
change proxy settings. However, I think that is just a hook into the
Windows control panel.
 
Roedy Green

Did that, but opened it in a textviewer instead, and saw a couple
of <script ... src="http://..."> in it. So much for "local."

What you saw were google adsense ads and a google translate widget.

The odd thing is when the same page is loaded from a website it works
fine. It is WAY slower when loaded off local hard disk.

This may simply be Google screwing up, or deliberately hosing other
browsers besides Google Chrome which works fairly well.

Yet I think some people don't have trouble with loading from local
hard disk.
 
