getURL: parsing a website with German special characters
- by Kay
I am using getURL() and htmlParse(). How can I get website content with special characters (here, German umlauts) to display properly?
library(RCurl); library(XML)
script <- getURL("http://www.floraweb.de/pflanzenarten/foto.xsql?suchnr=814")
doc <- htmlParse(script, encoding = "UTF-8")
xpathSApply(doc, "//div[@id='content']//p", xmlValue)[2]
[1] "Bellis perennis L., Gänseblümchen"
# should say:
[1] "Bellis perennis L., Gänseblümchen"
> Sys.getlocale()
[1] "LC_COLLATE=German_Austria.1252;LC_CTYPE=German_Austria.1252;LC_MONETARY=German_Austria.1252;LC_NUMERIC=C;LC_TIME=German_Austria.1252"
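
For what it is worth, the garbled output looks like UTF-8 bytes being shown as if they were Latin-1/Windows-1252. Below is a minimal sketch in base R that reproduces the pattern without touching the network. The variable names (good, bad, fixed) are made up for illustration, and whether this is really what happens inside getURL() on this machine is an assumption, not something confirmed.

```r
# Correct string, written with \u escapes so the encoding of this
# script file does not matter (illustrative value taken from the post).
good <- "G\u00e4nsebl\u00fcmchen"

# Take the UTF-8 bytes of that string and wrongly declare them as latin1.
# Assumption: something similar happens to the downloaded page.
bytes <- charToRaw(enc2utf8(good))
bad <- rawToChar(bytes)
Encoding(bad) <- "latin1"
print(bad)    # garbled, each umlaut shows up as two characters

# Re-declaring the same bytes as UTF-8 restores the text.
fixed <- bad
Encoding(fixed) <- "UTF-8"
print(fixed)
```

If this is indeed the cause, it may be worth trying the .encoding argument of getURL(), e.g. getURL(url, .encoding = "UTF-8"), so that the downloaded string is marked as UTF-8 before htmlParse() sees it. I have not verified that this fixes the output under the German_Austria.1252 locale shown above.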