Converting HTML special characters into their value using Python
Posted
by tipu
on Stack Overflow
Published on 2010-05-19T14:03:10Z
Indexed on
2010/05/19
14:10 UTC
I have a file that's littered with these:
http://www.utexas.edu/learn/html/spchar.html
That link just lists all sorts of HTML entities, such as
&ndash; –
&mdash; —
&iexcl; ¡
and so on. Is it possible in Python to natively convert these entities back into their characters, so that any occurrence of &ndash; will appear as – instead? My current approach is to build a dict mapping the main HTML entities to their UTF-8 values and do a search and replace, but I was wondering whether there is a library that can take care of this for me.
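For illustration, here is a minimal sketch of both routes, assuming Python 3.4 or later, where the standard library provides html.unescape. The helper name unescape_with_table and the sample string are hypothetical and not part of the original question. The helper mimics the dict-based search and replace described above, using the standard library's html.entities.name2codepoint table instead of a hand-written dict.

```python
import html
import re
from html.entities import name2codepoint

# Matches &name;, &#123; and &#x1F; style references (hypothetical helper pattern).
ENTITY_RE = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def unescape_with_table(text):
    """Dict-based search and replace, roughly the manual approach from the question."""
    def replace(match):
        body = match.group(1)
        if body.startswith(("#x", "#X")):
            return chr(int(body[2:], 16))       # hexadecimal numeric reference
        if body.startswith("#"):
            return chr(int(body[1:]))           # decimal numeric reference
        if body in name2codepoint:
            return chr(name2codepoint[body])    # named entity such as &ndash;
        return match.group(0)                   # unknown name: leave it untouched
    return ENTITY_RE.sub(replace, text)


sample = "2009&ndash;2010 &mdash; &iexcl;hola! &#8211;"

# The library route: one call handles named and numeric references.
print(html.unescape(sample))
# The manual route gives the same result for this sample.
print(unescape_with_table(sample))
```

For a whole file, the same call can be applied to the text after reading it, provided the file is opened with the right encoding. The manual version is shown only for comparison; html.unescape also covers cases the sketch ignores, such as references written without a trailing semicolon.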
© Stack Overflow or respective owner