

Some Thoughts on Unicode in Perl 6 - lelf
http://rdstar.wordpress.com/2013/07/22/some-thoughts-on-unicode-in-perl-6/

======
octo_t
Even in perl5, the unicode support is incredibly good:

http://stackoverflow.com/questions/6162484/why-does-modern-perl-avoid-utf-8-by-default

~~~
buster
"good"? I always found it extremely annoying and hell to get right, what the
fuck with all the different options? To this day, the last tool i wrote that
i wanted to be unicode-proof still spits out the occasional warning, and i
just don't care about those warnings anymore. Even if perl5 can do unicode,
it's a nightmare to use, imo.

And the first comment (which i did go through for my ventures into perl
unicode) is nothing but proof of how broken it is. So maybe you are just being
sarcastic and i didn't get it.. ;)

~~~
Mithaldu
You don't get it.

First of all you don't NEED most of the stuff shown there. The boilerplate at
the bottom is there to cover EVERY POSSIBLE THING anyone would possibly ever
want to do with unicode in Perl and because tchrist thinks it's funny to do
such things. If all you want is to read a unicode file, change its contents,
then write it back to hdd without breaking encoding, you don't need more than:

    
    
        use utf8;
        use IO::All -utf8;
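
If you'd rather not pull in IO::All, roughly the same read/change/write round
trip can be sketched with core Perl only, using encoding layers on open().
This is a swapped-in technique, not what IO::All does internally, and the
scratch file and sample text are made up for illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for the real one.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

# Write characters; the layer encodes them to UTF-8 bytes on the way out.
open my $out, '>:encoding(UTF-8)', $path or die "open: $!";
print {$out} "na\x{EF}ve\n";              # U+00EF is i with diaeresis
close $out;

# Read back; the layer decodes bytes to characters on the way in.
open my $in, '<:encoding(UTF-8)', $path or die "open: $!";
my $text = <$in>;
close $in;
chomp $text;

# Change the contents and write them back through the same layer.
open $out, '>:encoding(UTF-8)', $path or die "open: $!";
print {$out} uc($text), "\n";
close $out;

my $size = -s $path;
printf "%d characters, %d bytes on disk\n", length($text), $size;
```

The character count and the byte count differ because the one non-ASCII
character takes two bytes in UTF-8.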
    

Secondly, all other languages do it worse than Perl 5.

~~~
buster
Well, that's nothing more than the most basic case of string usage. Of course
that works.

For a rather complex multithreaded client/server program that reads from and
writes to different sources (http, webdav, ldap, file, terminal) with a lot of
dependencies it really was a nightmare, trust me. Maybe i didn't get some
obvious global "do that unicode stuff right" switch and i am just too dumb.
But even then, i've never had such problems in _any_ other language.

Interestingly i did need to write a very similar tool one year later in python
and i had _far_ fewer problems. And that was python 2.x, which is always said
to have bad unicode support. Maybe it's just because in python (afaik) you
only have to take care of some str() or unicode() semantics and not a
gazillion use statements, open() parameters and whatnot.

Maybe Perl5 has perfect unicode support but how to use and achieve that is a
freaking nightmare in a non-trivial program.

~~~
Mithaldu
Quite honestly, there's not even any need for the "use" statements. When
dealing with unicode you only need to do one thing:

When dealing with any input/output, find out whether you read/write bytes
from/to it or characters, and depending on what you find, decode or don't.
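
A minimal sketch of that rule, using the core Encode module. The sample bytes
and variable names are made up for illustration; in a real program the bytes
would come from a socket, file or terminal:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: raw bytes as they would arrive from outside.
my $bytes = "caf\xC3\xA9";                # UTF-8 encoding of "cafe" with e-acute

# Input boundary: bytes -> characters, exactly once.
my $chars = decode('UTF-8', $bytes);

# Inside the program, work on characters only.
my $upper = uc $chars;

# Output boundary: characters -> bytes, exactly once.
my $out = encode('UTF-8', $upper);

# Handling the same data twice: encoding already-encoded bytes corrupts them.
my $twice = encode('UTF-8', $out);

printf "%d bytes in, %d chars, %d bytes out, %d bytes if encoded twice\n",
    length($bytes), length($chars), length($out), length($twice);
```

The last number is larger than the third because every high byte of the
already-encoded string gets treated as a character and encoded again.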

I'm very sure that the only reason you had trouble with your first tool was a
lack of knowledge and thus a lack of pervasive diligence, leading to you
leaving some inputs/outputs unhandled or, even worse, handled twice.

Then you ended up doing that stuff more correctly on your second try, since
python asks you to do the same things.

All the use statements in Perl do is make it so you actually don't need to do
that, because they make certain operations work in unicode mode by default.

In short: Please try to actively separate your own past shortcomings and
growth as a developer from judgements on the capability of a language.

~~~
buster
Mh, no. That's not it, since in the beginning the code didn't use those
statements. I was taking ownership of the code and then had weird warnings
popping up all over the place. The fix for them was always one of those things
from the stackoverflow list.

So, no, i'm not convinced perl has good unicode support in terms of actually
supporting the programmer. Mind you, that was 2 or 3 years ago, so maybe the
situation is different now. To me, it was just a stupid task of creating new
tests, finding a warning and then trying to figure out what the fix is.
Sometimes the fix broke other parts or libraries and then you go to fix #2
(because of course there are a gazillion ways in Perl to do a single task). To
me it was really not a very good experience. And really, just look at that
stackoverflow article. That's consistent and good unicode support? No. Maybe
the current perl is different, though (i think the project ran 5.12 or 5.14).

~~~
Mithaldu
Man, i didn't want to trot this out at the start, but well, it turned out my
initial hunch was entirely correct. You took software that was made by
flailing at it like a monkey, then flailed at it some more. No wonder it
turned out as an unpleasant experience.

Also, as to your question about the SO post: you claim

"The fix for them was always one of those things from the stackoverflow list."

I find myself thoroughly baffled by that, since over half that post busies
itself with promoting warnings into fatal errors that end the program with
stacktraces.

Honestly, i'd love to see your warnings-spitting tool, just so i could
straighten it out some.

------
Mithaldu
Counting graphemes by default is really something all computer languages will
need to adopt at some point in the future if we collectively want to get out
of the morass of encoding errors. Good to see at least one language is making
tiny inroads towards that.
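
For comparison, a small Perl 5 sketch of the difference, where you have to ask
for graphemes explicitly via \X in a regex. The sample string is my own:

```perl
use strict;
use warnings;

# "e" followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived character.
my $str = "e\x{301}";

# length() counts code points.
my $code_points = length $str;

# \X matches one extended grapheme cluster; count the matches.
my $graphemes = () = $str =~ /\X/g;

print "code points: $code_points, graphemes: $graphemes\n";
```

A reader sees one letter there, but length() reports two.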

------
greenlakejake
Perl 6 is the Unicorn of computer languages - a mythical beast.

Yes I know I'll be downvoted to zero karma.

~~~
hibbelig
Duke Nukem Forever! :-)

~~~
greenlakejake
I used to joke that Duke Nukem Forever hadn't shipped because they were using
Perl 6. But Duke shipped first. :)

