3/9/2010

UTF-8 Encoding In POST And GET Request On Tomcat

Filed under: — By Aviran Mordo @ 4:33 am

I recently had to write a project using Tomcat that takes data from HTML forms and saves it to a database. I thought, hey, this is pretty straightforward. However, while I was expecting the form data to arrive at Tomcat as a UTF-8 string, I was surprised to find that the request encoding was ISO-8859-1.

You might think that because browsers take hints from the page or form encoding and send form data back to the server in the same encoding, the server would know which encoding was used. It does not. Web servers remain unaware of the encoding scheme and typically assume that the request encoding is ISO-8859-1.

So while my application expects a UTF-8 encoded string, Tomcat decodes the request as ISO-8859-1. The result, of course, is that any non-ASCII text becomes mangled.
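To make the mangling concrete, here is a minimal sketch that reproduces it with nothing but the JDK. It is not Tomcat's code; the class and method names (MojibakeDemo, misdecode, repair) are made up for illustration. It encodes a string as UTF-8, decodes the bytes as ISO-8859-1 the way a server with the wrong assumption would, and then reverses the mistake.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what happens when UTF-8 bytes
// are decoded with the wrong charset (ISO-8859-1).
final class MojibakeDemo {

    // Simulate the server side: the browser sent UTF-8 bytes,
    // but they are decoded as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the mistake: ISO-8859-1 maps every byte to exactly one char,
    // so the original bytes can be recovered and decoded as UTF-8.
    static String repair(String mangled) {
        byte[] raw = mangled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";            // "café", 4 characters
        String mangled = misdecode(original);     // "cafÃ©", 5 characters
        System.out.println(original.length() + " -> " + mangled.length());
        System.out.println(repair(mangled).equals(original));
    }
}
```

The repair step works only because ISO-8859-1 is a lossless byte-to-char mapping. It is a way to diagnose the problem, not a substitute for decoding the request correctly in the first place.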

Looking for answers, I found that I can specify URIEncoding="UTF-8" in Tomcat’s connector settings within the server.xml file. Now you might think, hey, that’s pretty straightforward. Well, I thought so too, until I discovered that it only works for GET requests, where the parameters travel in the URI. Tomcat ignores this setting for the body of a POST request.

Now my project had to deal with POST data, and also store the data in a database. So I kept looking until I found a solution. In order for your Servlet to process POST data as UTF-8, you need to explicitly set the character encoding in your Servlet, before any request parameter is read. To do that, all you need to do is put this line at the start of your doPost method (or add a filter to the filter chain and put this line in its doFilter method):

request.setCharacterEncoding("UTF-8");
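To see why the declared encoding matters for a form body, here is a small sketch that decodes the same percent-encoded value under both charsets using java.net.URLDecoder from the JDK. This only illustrates the effect; it is not how Tomcat parses parameters internally, and the class name FormDecodingDemo and its decode helper are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: decodes one form-encoded value
// with two different charsets to show how the result differs.
final class FormDecodingDemo {

    // Thin wrapper that turns the checked exception into an unchecked one.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // A browser submitting "café" from a UTF-8 page sends é as %C3%A9.
        String submitted = "caf%C3%A9";

        String asUtf8 = decode(submitted, "UTF-8");         // 4 characters
        String asLatin1 = decode(submitted, "ISO-8859-1");  // 5 characters

        System.out.println(asUtf8.equals("caf\u00e9"));
        System.out.println(asLatin1.equals("caf\u00c3\u00a9"));
    }
}
```

The bytes on the wire are identical in both cases. Only the charset used to interpret them changes, which is what the setCharacterEncoding call controls for the request body.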

Another trick to get UTF-8 in Tomcat is to tell the JVM to use UTF-8 as its file encoding:

-Dfile.encoding=UTF-8

I know it seems strange that Tomcat does not have a configuration setting to handle UTF-8 encoding in POST requests, but I could not find one. If you know of such a configuration setting, you are welcome to share it in the comments.

 

One Response to “UTF-8 Encoding In POST And GET Request On Tomcat”

  1. David van Enckevort Says:

    Hi,
    See this page for some info on how you can fix that. I haven’t tried it yet, but I stumbled upon it when I was investigating exactly the same issue.

    http://wiki.apache.org/tomcat/FAQ/CharacterEncoding

    David
