Tuesday, January 8, 2013

Page encoding issues

Today I got a bug assigned to me that had to do with how data was being sent in POST requests. On the page a user could enter text in several fields. If they entered ŽŒŸ Š†€ƒaniežœš and submitted it, everything on the page and subsequent pages worked fine. If, however, the user entered these characters: ëïéâäîü, they would be converted to question marks. My dilemma was that only a month ago I had made a whole bunch of changes to get that first set of characters to work, and I didn't want to make all the same changes again to get the new set to work. I thought (more like prayed) it had to be something simple like page encoding.

Long story short, I fixed the problem with about 10 lines of code. I made sure each JSP that was getting hit through the submit process had its character set and page encoding set to UTF-8. Then in each function in the servlet I made sure to set the characterEncoding on the request before it was ever used, and on the response right before it was redirected or written to. That fixed the issue for me. No more question marks and no need to make lots of convoluted changes.
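
To illustrate why the request encoding has to match what the browser sent, here is a small standalone sketch (not the actual servlet code). It uses java.net.URLEncoder and java.net.URLDecoder as stand-ins for the browser and the servlet container. The class and method names are made up for the example, and it assumes Java 10 or later for the Charset overloads. In a servlet, the rough equivalent of choosing the decoding charset is calling request.setCharacterEncoding("UTF-8") before the first parameter is read.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
public class FormEncodingDemo {

    // Stand-in for the browser: percent-encodes a field using the page's charset.
    static String browserEncode(String text, Charset pageCharset) {
        return URLEncoder.encode(text, pageCharset);
    }

    // Stand-in for the container: decodes the body using the request's charset.
    static String serverDecode(String body, Charset requestCharset) {
        return URLDecoder.decode(body, requestCharset);
    }

    public static void main(String[] args) {
        // The accented characters from the post, written as escapes so the
        // source file's own encoding cannot interfere.
        String input = "\u00eb\u00ef\u00e9\u00e2\u00e4\u00ee\u00fc";

        String body = browserEncode(input, StandardCharsets.UTF_8);

        String matched = serverDecode(body, StandardCharsets.UTF_8);
        String mismatched = serverDecode(body, StandardCharsets.ISO_8859_1);

        System.out.println("matched equals input:    " + matched.equals(input));
        System.out.println("mismatched equals input: " + mismatched.equals(input));
    }
}
```

When both sides use UTF-8 the text survives the round trip. When the body is decoded with a different charset than it was encoded with, the result no longer equals the input.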

For future reference, one team member suggested I try the character encoding CP1252, which made progress but didn't fix everything. There is also a chance I hadn't applied it to every page, since I later found a page in the process that I had forgotten about.

Another team member suggested I do the following:
new String(theTextToConvert.getBytes("Windows-1252"), "ISO-8859-1")
That also seemed to make progress but still didn't work fully, and I would have had to make about 500 changes in several files instead of just 10 in two files.
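
To see what that expression does to each set of characters, here is a minimal standalone sketch. The class and method names are hypothetical, and it assumes the JDK provides the windows-1252 charset (it is not one of the charsets Java guarantees, although mainstream JDKs ship it).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
public class ReencodeDemo {

    // Assumption: this charset is available on the running JDK.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // The same round trip as the suggested expression:
    // encode as Windows-1252, then decode those bytes as ISO-8859-1.
    static String reencode(String text) {
        return new String(text.getBytes(CP1252), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String accented = "\u00eb\u00ef\u00e9";   // three of the accented characters
        String cp1252Only = "\u017d\u0152\u20ac"; // three characters from the first set

        System.out.println("accented unchanged:    " + reencode(accented).equals(accented));
        System.out.println("cp1252-only unchanged: " + reencode(cp1252Only).equals(cp1252Only));

        // Show the code points the second string turns into.
        for (char c : reencode(cp1252Only).toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

The accented characters have the same byte values in both encodings, so they come back unchanged. Characters that exist only in Windows-1252 come back as control characters in the U+0080 to U+009F range, so the expression only alters part of the input.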
