Unicode horror – nice tool to convert
Wednesday 13 January 2010 @ 6:59 am

Here is a nice tool (by Richard Ishida) for converting characters between notations, for example percent-encoding for URIs, 0x notation, decimal code points, etc.:


OK, it is not strictly related to Perl, but I guess it may be handy for people who were interested in the “utf at LAMP horror” cycle.
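For those who would rather stay in Perl, here is a minimal sketch of the same three conversions using only core modules. The sample character and the variable names are my own choices for illustration; this is not how the tool above works internally.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# One character given by code point, so the example does not depend on
# how this file is saved: U+017C, LATIN SMALL LETTER Z WITH DOT ABOVE.
my $char = "\x{17C}";

# Percent-encoding for URIs works on the UTF-8 bytes of the character.
my $percent = join '', map { sprintf '%%%02X', ord($_) } split //, encode_utf8($char);

# 0x notation and the decimal code point come straight from ord().
my $hex     = sprintf '0x%04X', ord($char);
my $decimal = ord($char);

print "$percent $hex $decimal\n";   # prints "%C5%BC 0x017C 380"
```

Note that the percent-encoded form has two bytes for one character, which is the same byte/character distinction the rest of the cycle keeps running into.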

Comments (0) - Posted in work by  

Short script that may replace wine tasters.
Wednesday 6 January 2010 @ 6:52 am

Those fancy wine opinions were created not by a fancy wine taster, but by a Perl script!

Greg Sumner created a script (Perl source: http://www.gmon.com/tech/sillytng.txt) that may replace wine tasters and their head-spinning wine notes.

Something more stupid:

See it for yourself and play with it 🙂



Comments (0) - Posted in fun by  

Utf8 horror at LAMP – accept charset
Wednesday 30 December 2009 @ 6:55 am

Continuing the never-ending saga of the Perl / utf horror:

<form method="post" accept-charset="utf-8" action="…">

Well, I never used it… and my web app works.

Is this accept-charset really needed? Do you know?

Comments (2) - Posted in work by  

Utf8 in web perl application (LAMP) – dbi, mysql
Wednesday 23 December 2009 @ 6:59 am

The horror with utf8 in a LAMP (Perl) web application, continued:

We need to take care of the MySQL connection, so that it is utf8-ready:

if (my $dbh = DBI->connect("DBI:mysql:database=".$name.';host='.$hostname, $user, $password, {
RaiseError        => $raise,
#             AutoCommit        => 1,
mysql_enable_utf8 => 1,   # decode text columns coming from MySQL as utf8
on_connect_do => [ "SET NAMES 'utf8'", "SET CHARACTER SET 'utf8'" ],
})) {

$dbh->{'mysql_enable_utf8'} = 1;   # set again on the handle, just to be sure
return $dbh; # DBI database handler
}

We also must make sure that our tables are unicode:

`id` int(10) unsigned NOT NULL AUTO_INCREMENT,


And remember that utf8 characters may take more space than single-byte ones (up to three bytes each in MySQL's utf8), so prepare longer varchars in tables and be prepared for problems with indexes, like this: Specified key was too long; max key length is 1000 bytes (see http://bugs.mysql.com/bug.php?id=4541).
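To see the size difference from the Perl side, here is a small sketch that compares the character count of a string with its size in UTF-8 bytes. The sample word and variable names are mine. The last line is only worst-case arithmetic, assuming three bytes per character are counted against the 1000-byte key limit from the error message above.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# "zażółć" written with code points, so the file encoding does not matter.
my $text = "za\x{17C}\x{F3}\x{142}\x{107}";

my $chars = length($text);                 # 6 characters
my $bytes = length(encode_utf8($text));    # 10 bytes once encoded as UTF-8

# Worst case for an index: 3 bytes per character against a 1000-byte key.
my $max_indexed_chars = int(1000 / 3);     # 333

print "$chars chars, $bytes bytes, at most $max_indexed_chars chars per key\n";
```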

Comments (0) - Posted in work by  

Utf8 in web perl application (LAMP) – binmode, charset
Wednesday 16 December 2009 @ 6:57 am

After wrestling with Perl encoding, we need to make sure the pages of the website we create are displayed in utf8.
This means we need to have a proper header in the pages, for example:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">



Content-Type: text/html; charset=utf-8

The second step is to set binmode on STDOUT (if we print our dynamically generated web pages):

binmode STDOUT, ":utf8";

to get rid of

Wide character in print at …
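Here is a minimal sketch of both steps together: the Content-Type header plus an encoding layer on the output handle. To keep it self-contained it prints into an in-memory handle instead of STDOUT, and it uses the stricter ":encoding(UTF-8)" layer rather than ":utf8". The helper name render_bytes is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: print a page through a UTF-8 output layer and
# return the bytes that would be sent to the browser.
sub render_bytes {
    my ($html) = @_;
    my $buffer = '';
    open(my $out, '>', \$buffer) or die "Cannot open in-memory handle: $!";
    binmode($out, ':encoding(UTF-8)');   # same call you would make on STDOUT
    print {$out} "Content-Type: text/html; charset=utf-8\n\n";
    print {$out} $html;                  # wide characters, no warning
    close($out);
    return $buffer;
}

# "żółw" written with code points, so the file encoding does not matter.
my $html  = "<p>\x{17C}\x{F3}\x{142}w</p>";
my $bytes = render_bytes($html);

printf "%d characters in, %d bytes out\n", length($html), length($bytes);
```

In a real CGI script the same binmode call would go on STDOUT before the first print.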


Comments (2) - Posted in work by  

Utf8 in web perl application (LAMP) – part 2 – Encode
Wednesday 9 December 2009 @ 6:57 am

The data we get from outside (like STDIN) is usually binary data.
Functions like uc and lc expect text strings.

Here is how to change binary data to text strings:

use Encode;
$foo = Encode::decode_utf8($foo);

You can also convert from other encodings:
$data = decode("iso-8859-2", $data);

To check whether a string has the utf8 flag turned on:

use Encode qw(is_utf8);
print is_utf8($foo) ? "utf8" : "not utf8";
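To show why the decoding matters, here is a small sketch that runs uc on the same data before and after decode_utf8. The sample bytes (the word "żółw" in UTF-8) and the variable names are mine.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 is_utf8);

# Raw UTF-8 bytes, as they would arrive from STDIN or a web form.
my $bytes = "\xC5\xBC\xC3\xB3\xC5\x82w";

# On binary data uc only touches the ASCII letter.
my $upper_bytes = uc($bytes);

# After decoding we have a text string of 4 characters and uc works on all of them.
my $text       = decode_utf8($bytes);
my $upper_text = uc($text);

printf "bytes: %d, characters: %d, flag: %s\n",
    length($bytes), length($text), is_utf8($text) ? "utf8" : "not utf8";
```

The is_utf8 flag is an internal detail; it is handy for debugging like this, but the code should not rely on it to decide whether something was already decoded.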

If you do not know what encoding the data is in, use the Encode::Guess module:
use Encode::Guess;
my $enc = guess_encoding($data, qw/euc-jp shiftjis 7bit-jis/);

You need to tell explicitly which encodings are suspected, because
by default, it checks only ascii, utf8 and UTF-16/32 with BOM.

Encode::Guess->set_suspects(qw/euc-jp shiftjis 7bit-jis/);

But remember that the guessing is not magic, and it is likely to fail.
For example, it may fail to recognize whether the data are in iso-8859-1 or iso-8859-2.

The reason is that Encode::Guess guesses encoding by trial and error. It first splits $data into lines and tries to decode the line for each suspect. It keeps it going until all but one encoding is eliminated out of suspects list. ISO-8859 series is just too successful for most cases (because it fills almost all code points in \x00-\xff).
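A short sketch of both outcomes follows. guess_encoding returns an encoding object on success and a plain error string on failure, so the result has to be checked with ref. The sample bytes and variable names are mine.

```perl
use strict;
use warnings;
use Encode::Guess;

# Case 1: non-ASCII UTF-8 bytes with the default suspects.
my $utf8_bytes = "\xC5\xBC\xC3\xB3\xC5\x82w";
my $guess_ok   = guess_encoding($utf8_bytes);

# Case 2: bytes that are valid in both iso-8859-1 and iso-8859-2,
# so the guesser cannot narrow the suspects down to one.
my $latin_bytes = "\xB1\xE6";
my $guess_bad   = guess_encoding($latin_bytes, qw/iso-8859-1 iso-8859-2/);

for my $result ($guess_ok, $guess_bad) {
    if (ref $result) {
        print "Guessed: ", $result->name, "\n";
    }
    else {
        print "Could not guess: $result\n";   # $result is the error message
    }
}
```

When the guess succeeds, $guess_ok->decode($utf8_bytes) gives the text string.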

Comments (2) - Posted in work by  

Utf8 in web perl application (LAMP)
Wednesday 2 December 2009 @ 6:45 am

Making a correct utf8 web application in LAMP (Perl) is not easy. There are lots of dangerous traps along the way.

We need to take care of the source code encoding, the encoding of web form data (entered by the user), the MySQL encoding, the encoding of the data we display (write), and perhaps also the data read from disk files.

First, we need to take care of the source code. If we want to write something like

$var = "zażółć gęślą jaźń";

there, we need to add use utf8; to the code.

The use utf8 pragma tells the Perl parser to allow UTF-8 in the program text in the current lexical scope.

Do not use this pragma for anything else than telling Perl that your script is written in UTF-8.

use utf8 is not a magical trick to fix all problems with utf8, just a beginning of the journey…

We must have a good code editor or IDE that understands utf8. It is also nice to be able to open files in other encodings and convert them to utf8. For unix/linux there is for example KDevelop, and many other tools. For Windows, there are many editors too. See: http://en.wikipedia.org/wiki/Comparison_of_text_editors#Unicode_and_other_character_encodings and http://www.alanwood.net/unicode/utilities_editors.html

When using use utf8, remember to save the code in utf8 encoding!
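A small sketch of what the pragma changes, assuming this snippet itself is saved as UTF-8: the same literal is measured once outside and once inside the lexical scope of use utf8. The variable names are mine.

```perl
use strict;
use warnings;

# Outside "use utf8" the literal is just the bytes found in the file.
my $length_without = length("zażółć gęślą jaźń");

my $length_with;
{
    use utf8;   # only affects this block (lexical scope)
    # Here the parser decodes the literal into 17 characters.
    $length_with = length("zażółć gęślą jaźń");
}

print "without use utf8: $length_without, with use utf8: $length_with\n";
# prints "without use utf8: 26, with use utf8: 17" when the file is saved as UTF-8
```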

To be continued…

Comments (2) - Posted in work by  

Nice people I met at YAPC::EU – continued
Wednesday 25 November 2009 @ 6:58 am

More nice people I met:

Gábor Szabó: no, not the Hungarian guitarist, nor the football player, but the guy who makes Padre and advertises Perl so it does not die.

Three nice guys from Poland (okay, two, as I travelled with one): Maciej, a Perl developer from booking.com, and a second one I do not know much about 🙂

And lots and lots more, but it's no use writing about them all 🙂

Comments (0) - Posted in yapc by  

Nice people I met at YAPC::EU
Wednesday 18 November 2009 @ 6:55 am

At YAPC::EU this summer I met a few nice people. Here is a short list, just a few from many nice people:

I was honoured to meet Larry Wall, who told me to have an “appropriate amount of fun” during my presentation about WTFs.

A few minutes earlier I met Damian Conway, whom I helped with applying last-minute changes to the conference schedule and putting it on the conference room door, before I realised who he was.

Paul Fenwick, dressed as a Star Trek hero, gave a presentation about Klingon Perl programming. A bit earlier that day, Damian Conway gave a completely different presentation about Klingon Perl programming, making sounds with his voice that soon made him thirsty.

to be continued…

Comments (0) - Posted in yapc by  

Website authorization – my solution
Wednesday 11 November 2009 @ 6:37 am

I wrote earlier that I was wondering how to make a “login” for a dynamic website in Perl. The best solution advised by http://perldesignpatterns.com/?WebAuthentication was to make a temporary token: “cookie with an authorization token. Store the token in the database along with an expiration time separate of the cookie. The token should be random generated and completely seperate from the password but handed out when the password is validated. This is the best case;”, but it was overkill for now, so I settled on this scheme:

When a user registers, the password is stored as an MD5 digest in the database. A salt is generated: a string of eight random letters, numbers etc. I use Crypt::PasswdMD5 qw(unix_md5_crypt);

When the user logs in, the password is checked: it is crypted using the crypted password from the database as the salt:

if ( $cryptedpassword eq unix_md5_crypt($password, $cryptedpassword)) {

and if it is OK, a cookie is stored with the user ID and the crypted password.

The cookie is then checked on every page, to see whether it contains the crypted password from the database.

Well, this is my idea of doing it for now, and it is already implemented, but I feel a bit uneasy about it. What is the point of crypting the password and storing it crypted, if all that really matters is whether the value from the cookie equals the value in the database? It could be stored uncrypted and it would work the same way.

The only advantage is that the plain password is not stored in the cookie, but an attacker does not need it: the digest alone is enough to pretend to be logged in.
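For comparison, here is a rough sketch of the token scheme quoted at the top of this post. It is not what I implemented. Everything in it is an assumption made for illustration: the function names, the in-memory hash standing in for the database table, and reading random bytes from /dev/urandom (so a Unix-like system).

```perl
use strict;
use warnings;

# Stand-in for a database table: token => { user_id, expires }.
my %token_store;

# Random token, unrelated to the password (assumes /dev/urandom exists).
sub random_token {
    my ($nbytes) = @_;
    open(my $fh, '<:raw', '/dev/urandom') or die "Cannot open /dev/urandom: $!";
    my $got = read($fh, my $buf, $nbytes);
    close($fh);
    die "Short read from /dev/urandom" unless defined $got && $got == $nbytes;
    return unpack('H*', $buf);   # hex string, safe to put in a cookie
}

# Call this only after the password has been validated.
sub issue_token {
    my ($user_id, $ttl, $now) = @_;
    my $token = random_token(32);
    $token_store{$token} = { user_id => $user_id, expires => $now + $ttl };
    return $token;
}

# Check the cookie value on every page; expiry lives server-side.
sub user_for_token {
    my ($token, $now) = @_;
    return unless defined $token;
    my $row = $token_store{$token} or return;
    if ($now >= $row->{expires}) {
        delete $token_store{$token};   # expired, forget it
        return;
    }
    return $row->{user_id};
}

my $now   = time;
my $token = issue_token(42, 3600, $now);   # user 42, valid for one hour
my $user  = user_for_token($token, $now);
print "token belongs to user $user\n";
```

With this, a stolen cookie stops working after the expiration time and never reveals anything derived from the password.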

What do you think?

Comments (4) - Posted in cpan,work by  
