How do browsers/PHP handle characters outside the set characterset?
- by Maarten
I'm looking into how characters are handled that are outside of the set characterset for a page.
In this case the page is set to iso-8859-1, and the previous programmer decided to escape input using htmlentities($string,ENT_COMPAT). This is then stored into Latin1 tables in Mysql.
As the table is set to the same character set as the page, I am wondering if that htmlentities step is needed.
I did some experiments on http://floris.workingweb.nl/experiments/characters.php and it seems that for stuff inside Latin1 some characters are escaped, but for example with a Czech name they are not.
Is this because those characters are outside of Latin1? If so, then the htmlentities can be removed, as it doesn't help for stuff outside of Latin1 anyway, and for within Latin1 it is not needed as far as I can see now...