Wednesday 2 December 2009 @ 6:45 am

Building a correct UTF-8 web application on LAMP (with Perl) is not easy. There are many traps along the way.

We need to take care of several encodings: that of the source code, of the web form data entered by the user, of MySQL, of the data we display (write), and perhaps also of data read from disk files.

First, we need to take care of the source code. If we want to write something like this in it:

$var = "zażółć gęślą jaźń";

we need to use utf8.

The use utf8 pragma tells the Perl parser to allow UTF-8 in the program text in the current lexical scope.

Do not use this pragma for anything else than telling Perl that your script is written in UTF-8.

use utf8 is not a magic trick that fixes every problem with UTF-8; it is just the beginning of the journey…
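Here is a minimal sketch of what the pragma does and what it leaves alone. It assumes the script is saved as UTF-8 and uses the core Encode module; the names $chars and $bytes are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                   # the program text below is UTF-8
use Encode qw(encode);

my $var = "zażółć gęślą jaźń";

# With "use utf8" the literal is a string of characters, not raw bytes.
my $chars = length $var;

# Encoding to UTF-8 gives the byte sequence as it is stored on disk.
my $bytes = length encode('UTF-8', $var);

# The pragma does not touch I/O, so output has to be encoded separately.
binmode STDOUT, ':encoding(UTF-8)';
print "$var: $chars characters, $bytes bytes\n";
```

On a UTF-8 terminal this should print "zażółć gęślą jaźń: 17 characters, 26 bytes". The difference comes from the nine Polish letters that take two bytes each.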

We must have a good code editor or IDE that understands UTF-8. It is also nice to be able to open files in other encodings and convert them to UTF-8. For Unix/Linux there is, for example, KDevelop, and many other tools. For Windows, there are many editors too. See: http://en.wikipedia.org/wiki/Comparison_of_text_editors#Unicode_and_other_character_encodings and http://www.alanwood.net/unicode/utilities_editors.html

When using use utf8, remember to save the code in UTF-8 encoding!
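If you are not sure how the editor saved a file, a rough check is to try decoding its bytes as strict UTF-8. This is only a sketch: is_utf8_file is a hypothetical helper name, and a file that decodes cleanly may still have been meant as another encoding (pure ASCII passes too).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: true if the file's bytes decode as strict UTF-8.
sub is_utf8_file {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $bytes = do { local $/; <$fh> };   # slurp the whole file as bytes
    close $fh;
    $bytes = '' unless defined $bytes;
    # FB_CROAK makes decode() die on the first malformed sequence.
    return eval { decode('UTF-8', $bytes, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# Check every file named on the command line.
for my $path (@ARGV) {
    print "$path: ", (is_utf8_file($path) ? "valid UTF-8" : "NOT valid UTF-8"), "\n";
}
```

Saved as, say, check_utf8.pl (the file name is an assumption), it could be run as: perl check_utf8.pl script.pl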

To be continued…

