Hpricot Html Parsing

Suja Suchu

Hi,
I'm getting funky characters when parsing HTML with Hpricot.
How can I remove them?

Anyone have a fix / workaround for this?

thanks in advance,
Suja
 
Thibaut Barrère

Hi Suja,

two suggestions:
- Check the encoding used by the page you're hashpricoting (doh, I
think I just invented a verb, or what).
- Run puts $KCODE to see whether you're running in Unicode mode or not.
If you are hashpricoting a page encoded in UTF-8 but KCODE is set to
none (or the page is in latin1 but KCODE is set to U), then you'll have
to convert the text to the matching encoding, using iconv for instance.
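
To illustrate the second suggestion, here is a minimal sketch of the conversion step. It rests on several assumptions: $KCODE and iconv belong to Ruby 1.8, so the sketch swaps in String#force_encoding and String#encode, which do the same job on Ruby 1.9 and later; the page is assumed to be Windows-1252 (a common superset of latin1); the byte 0x97 is a guess at what the funky character is; and to_utf8 is a hypothetical helper, not part of Hpricot.

```ruby
# Raw bytes as they might come off the wire, with no encoding attached.
# Assumption: 0x97 is the Windows-1252 dash that displays as a funky character.
raw = "CHAMPAIGN \x97 Effective today".b

# Hypothetical helper: label the bytes with their real encoding,
# then transcode them to UTF-8.
def to_utf8(bytes, source_encoding)
  bytes.dup.force_encoding(source_encoding).encode("UTF-8")
end

# Treating the bytes as UTF-8 without converting is what produces the garbage:
misread = raw.dup.force_encoding("UTF-8")
puts misread.valid_encoding?   # => false

# Converting from the page's actual encoding gives a clean UTF-8 string.
fixed = to_utf8(raw, "Windows-1252")
puts fixed.valid_encoding?     # => true
```

The converted string can then be handed to the parser. If the page turns out to use a different encoding, only the second argument to to_utf8 changes.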

cheers

Thibaut
 
Lee Jarvis

Suja said:
Hi,
I'm getting funky characters when parsing HTML with Hpricot.
How can I remove them?

Anyone have a fix / workaround for this?

thanks in advance,
Suja

Could you describe these 'funky characters'?
 
Suja JS

Lee said:
Could you describe these 'funky characters'?

Like '�' in this text.
"By Mike Monson CHAMPAIGN � Effective today the city of Champaign is
closing three bridges and posting load limits on three others."
 
