Ken Fine
I'm using XMLHTTP to screen-scrape many thousands of pages of content as
part of a data-structuring project.
One issue that I'm running into is that some entities such as curly quotes
and curly apostrophes do not translate properly; they're returned as
question marks indicating an unidentified character.
I'm guessing the usual hack of writing a translate function doesn't work
since the problem lies in the data being pulled down by XMLHTTP.
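
For illustration only, the symptom can be reproduced outside XMLHTTP. The sketch below is Python, not the ASP code in question. It assumes the scraped pages are served as Windows-1252 bytes and that the client decodes them as UTF-8; both are guesses, not something confirmed here. The helper name decode_page is hypothetical.

```python
def decode_page(raw: bytes, charset: str) -> str:
    """Decode raw response bytes, substituting a placeholder for bad bytes."""
    return raw.decode(charset, errors="replace")


# Sample text with a curly apostrophe and curly double quotes.
original = "It\u2019s a \u201cquoted\u201d word"

# Assumption: the server sends the page encoded as Windows-1252.
raw = original.encode("cp1252")

# Decoding with the wrong charset loses the curly characters:
# each one becomes the replacement character U+FFFD.
garbled = decode_page(raw, "utf-8")

# Decoding with the charset the bytes were written in keeps them.
intact = decode_page(raw, "cp1252")

print(garbled)
print(intact)
```

If this is what is happening, the characters are lost at the decoding step, so a translate function applied to the decoded text would have nothing left to translate; the fix would have to work from the raw response bytes instead.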
Is there anything that can be done, short of using a different
screen-scraping component? I initially used something called "ASPTear", but
moved to XMLHTTP since it seems to return fewer errors in production.
Thanks,
-KF