Nokogiri and Special Characters
Posted by Moe on Stack Overflow
Published on 2010-04-03T19:28:43Z
Indexed on 2010/04/03 19:33 UTC
I'm using Nokogiri to grab the contents of the title tag on a webpage, but am having trouble with accented characters. What's the best way to deal with these? Here's what I'm doing:
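require 'open-uri'
require 'nokogiri'
# link holds the URL of the page to fetch
# URI.open is used because a bare open(link) no longer fetches URLs on Ruby 3.0+
doc = Nokogiri::HTML(URI.open(link))
title = doc.at_css("title")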
At this point, the title looks like this:
Rag\303\271
Instead of:
Ragù
How can I have Nokogiri return the proper character (e.g. ù in this case)?
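
For what it's worth, here is a small sketch of what the escaped output above may be showing. It assumes the string was viewed with p or inspect on an older Ruby (1.8), which prints non-ASCII bytes as octal escapes. Under that assumption the bytes \303\271 are already the correct UTF-8 encoding of ù, and only the display is escaped. The variable names (raw, bytes, title) are made up for illustration and no Nokogiri call is involved.

```ruby
# The escaped form from the question: \303 and \271 are octal byte escapes.
# String#b gives a binary (ASCII-8BIT) copy, i.e. just the raw bytes.
raw = "Rag\303\271".b

# Octal 303 and 271 are 0xC3 and 0xB9, the two-byte UTF-8 encoding of "ù".
bytes = raw.bytes.last(2)

# Tag the same bytes as UTF-8; force_encoding does not change the bytes.
title = raw.dup.force_encoding(Encoding::UTF_8)

puts title               # puts writes the string itself, not an escaped form
puts title.valid_encoding?
```

If that assumption holds, printing with puts instead of p should show the accented character on a UTF-8 terminal. If the bytes themselves turn out to be wrong, one thing to try is passing the encoding explicitly as the third argument, as in Nokogiri::HTML(html, nil, 'UTF-8'), though whether that helps depends on how the page declares its charset.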
© Stack Overflow or respective owner