Scraping — character (long dash) error in Nokogiri
- by DavidP6
I having trouble scraping a certain long dash that is encoded as ; on the Time magazine site. It looks like this: —. It works fine when this dash is encoded as mdash, but when the problem dash is scraped, it is returned as unknown characters. I am using Nokogiri and am wondering if I have to use some sort of special encoding? The page says it is encoded with UTF-8.