How to set the mechanize page encoding?
Posted by Juan Medín on Stack Overflow
Published on 2009-12-12T03:31:43Z
Indexed on 2010/03/08 8:06 UTC
Hi,
I'm trying to fetch a page encoded in ISO-8859-1 by clicking on a link. The code is similar to this:
page_result = page.link_with( :text => 'link_text' ).click
So far the result comes back with the wrong encoding, so I see characters like:
'T?tulo:' instead of 'Título:'
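The garbling is consistent with ISO-8859-1 bytes being interpreted as some other encoding. Here is a minimal sketch of that mismatch, assuming Ruby 1.9 or later (where strings carry an encoding) and using a hard-coded byte string as a hypothetical stand-in for the page body. It makes no network request and does not touch Mechanize:

```ruby
# Hypothetical stand-in for the bytes of page_result.body:
# "Título:" as it would be encoded in ISO-8859-1 (0xED is "í").
raw = "T\xEDtulo:".dup.force_encoding(Encoding::BINARY)

# Interpreted as UTF-8, the lone 0xED byte is not a valid sequence,
# so a UTF-8 terminal has nothing sensible to display for it.
as_utf8 = raw.dup.force_encoding(Encoding::UTF_8)
utf8_valid = as_utf8.valid_encoding?

# Interpreted as ISO-8859-1, every byte maps to a character.
as_latin1 = raw.dup.force_encoding(Encoding::ISO_8859_1)
latin1_valid = as_latin1.valid_encoding?

puts "valid as UTF-8: #{utf8_valid}"
puts "valid as ISO-8859-1: #{latin1_valid}"
```

If the real body behaves like this stand-in, the bytes themselves are probably intact and only their interpretation is off.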
I've tried several approaches, including:
Stating the encoding in the first request made through the agent:
@page_search = @agent.get( :url => 'http://www.server.com', :headers => { 'Accept-Charset' => 'ISO-8859-1' } )
Stating the encoding on the page itself:
page_result.encoding = 'ISO-8859-1'
But I must be doing something wrong: a simple puts always shows the wrong characters.
Do you know how to state the encoding?
Thanks in advance,
Added: Executable example:
require 'rubygems'
require 'mechanize'
WWW::Mechanize::Util::CODE_DIC[:SJIS] = "ISO-8859-1"
@agent = WWW::Mechanize.new
@page = @agent.get(
  :url => 'http://www.mcu.es/webISBN/tituloSimpleFilter.do?cache=init&layout=busquedaisbn&language=es',
  :headers => { 'Accept-Charset' => 'utf-8' } )
puts @page.body
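One thing that might be worth trying is to transcode the body by hand before printing it. The following is only a sketch under assumptions: it requires Ruby 2.0 or later (for String#b), it assumes the server really sends ISO-8859-1 bytes, and `latin1_to_utf8` is a hypothetical helper, not part of Mechanize. The hard-coded string stands in for `@page.body`:

```ruby
# Hypothetical helper (not a Mechanize API): reinterpret raw bytes as
# ISO-8859-1 and transcode them to UTF-8 for display.
def latin1_to_utf8(bytes)
  bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# Stand-in for @page.body; the real value would come from the request above.
body = "T\xEDtulo:".b
converted = latin1_to_utf8(body)

puts converted
```

With the real page, the call would be `puts latin1_to_utf8(@page.body)`. Whether this is needed depends on what encoding the terminal expects.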
© Stack Overflow or respective owner