Servlet receiving data both in ISO-8859-1 and UTF-8. How to URL-decode?

Posted by AJPerez on Stack Overflow
Published on 2010-05-28T11:08:34Z

I have a web application (in fact, it is just a servlet) that receives data from three different sources:

  • Source A is an HTML document written in UTF-8, and sends the data via <form method="get">.
  • Source B is written in ISO-8859-1, and sends the data via <form method="get">, too.
  • Source C is written in ISO-8859-1, and sends the data via <a href="http://my-servlet-url?param=value&param2=value2&etc">.

The servlet receives the request params and URL-decodes them using UTF-8. As you might expect, A works without problems, while B and C fail: bytes that were percent-encoded from ISO-8859-1 text are not valid UTF-8, so decoding them as UTF-8 garbles every non-ASCII character.
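To make the failure concrete, here is a minimal sketch that encodes the same value in both charsets and then decodes it as UTF-8. The class name, the helper `roundTrip`, and the sample value "niño" are made up for illustration, and the `Charset` overloads of `URLEncoder.encode` / `URLDecoder.decode` assume Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Hypothetical helper: percent-encode with one charset, decode with another.
    static String roundTrip(String value, Charset encodeWith, Charset decodeWith) {
        String encoded = URLEncoder.encode(value, encodeWith);
        return URLDecoder.decode(encoded, decodeWith);
    }

    public static void main(String[] args) {
        String value = "ni\u00f1o"; // "niño"

        // The same character produces different escapes in each charset.
        System.out.println(URLEncoder.encode(value, StandardCharsets.UTF_8));      // ni%C3%B1o
        System.out.println(URLEncoder.encode(value, StandardCharsets.ISO_8859_1)); // ni%F1o

        // Source A: encoded as UTF-8, decoded as UTF-8, so the value survives.
        System.out.println(
            roundTrip(value, StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(value));

        // Sources B and C: encoded as ISO-8859-1, decoded as UTF-8, so it does not.
        System.out.println(
            roundTrip(value, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8).equals(value));
    }
}
```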

I can make slight modifications to B and C, but I am not allowed to change them from ISO-8859-1 to UTF-8, which would solve all the problems.

In B, I've been able to solve the problem by adding accept-charset="UTF-8" to the <form>. With that attribute, the <form> sends its data in UTF-8 even though the page itself is ISO-8859-1.

What can I do to fix C?
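One possible direction for C, sketched here under the assumption that the href values can be generated (or pre-computed once and pasted into the page): percent-escapes are plain ASCII, so a link whose parameters are already escaped as UTF-8 can live in an ISO-8859-1 page without the browser re-encoding anything. The class and method names below are hypothetical, and the `Charset` overload of `URLEncoder.encode` assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class Utf8LinkBuilder {

    // Hypothetical helper: builds a GET URL whose names and values are
    // percent-encoded as UTF-8, regardless of the charset of the page
    // the link will be embedded in.
    static String buildLink(String baseUrl, Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8))
               .append('=')
               .append(URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("param", "ni\u00f1o"); // "niño"
        params.put("param2", "value2");
        System.out.println(buildLink("http://my-servlet-url", params));
    }
}
```

The resulting string contains only ASCII characters, so it should reach the servlet byte-for-byte the same whatever encoding the page declares.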

Alternatively, is there any way to determine the charset on the servlet, so I can call URL-decode with the right encoding in each case?
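As far as I can tell, a GET request carries no reliable charset information for its query string, so any detection would be a heuristic. A rough sketch of one such heuristic: percent-decode the raw value to bytes, try a strict UTF-8 decode, and fall back to ISO-8859-1 if the bytes are not valid UTF-8. The class and method names are hypothetical, and the input is assumed to be the raw, still-encoded value (for example taken from `HttpServletRequest.getQueryString()`), not something the container has already decoded.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class FallbackUrlDecoder {

    // Turns %XX escapes into raw bytes and '+' into a space.
    // Assumes the raw value is ASCII, as a query string should be.
    static byte[] percentDecode(String raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '%' && i + 2 < raw.length()) {
                int hi = Character.digit(raw.charAt(i + 1), 16);
                int lo = Character.digit(raw.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            // Malformed escapes are passed through unchanged.
            out.write(c == '+' ? ' ' : c);
        }
        return out.toByteArray();
    }

    // Strict UTF-8 first; ISO-8859-1 only if the bytes are not valid UTF-8.
    static String decode(String raw) {
        byte[] bytes = percentDecode(raw);
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException notUtf8) {
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("ni%C3%B1o")); // sent as UTF-8 (source A)
        System.out.println(decode("ni%F1o"));    // sent as ISO-8859-1 (sources B and C)
    }
}
```

The heuristic is not airtight: an ISO-8859-1 value whose bytes happen to form valid UTF-8 (such as "Ã±", bytes C3 B1) would be misread as UTF-8, although that is rare in ordinary text.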

© Stack Overflow or respective owner