

Dealing with Unicode in Go - hermanschaaf
https://coderwall.com/p/k7zvyg

======
tptacek
Here's Rob Pike on this issue:

<https://groups.google.com/d/msg/golang-nuts/1yL7IsqADSw/GS8Fex5BCbkJ>

His take: Go's Unicode support (in 2012 at least) is "less than rudimentary".

In practice, I find Go's Unicode support superior to that of all the other
languages I've shipped with (mostly C/C++, ObjC, Ruby, and Python), largely
because it draws a line between "strings" and "byte vectors" (the way Cocoa
does with NSData/NSString) and has a solid UTF-8 library. Your support for
Unicode does need to be explicit and somewhat hand-rolled, but unlike (say)
Ruby, the language doesn't trick you into thinking you're handling Unicode
when you're not, and unlike Python, handling Unicode doesn't push you into an
alternate type space.

(Conceding rapidly: I've shipped the least code in Python, and there is
probably an elegant Pythonic solution I'm just not aware of.)

------
cthom06
I commented on the article, but I'll say it here as well: when dealing with
Unicode in Go, you should be converting strings to []rune.
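
A minimal sketch of what that conversion buys you, assuming the goal is to
index or reverse a string by code point rather than by byte. The helper name
`reverseRunes` is hypothetical, chosen only for illustration:

```go
package main

import "fmt"

// reverseRunes (hypothetical helper) reverses s one code point at a time.
// Converting to []rune decodes the UTF-8 bytes into code points first.
func reverseRunes(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	s := "h\u00e9llo" // "héllo", with é as the single code point U+00E9

	fmt.Println(len(s))          // 6: len on a string counts bytes
	fmt.Println(len([]rune(s)))  // 5: len on a []rune counts code points
	fmt.Println(reverseRunes(s)) // olléh
}
```

Reversing the raw bytes instead would split the two-byte encoding of é and
produce invalid UTF-8.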

~~~
tsewlliw
Is rune even "right"? I thought a rune was a single code point. How does
that behave in the presence of combining characters or non-normalized
Unicode strings?

<https://twitter.com/glitchr_>

~~~
cthom06
Yes, it is a single code point. Normalization is a whole other issue, and a
much less trivial one.
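
To make the distinction concrete, here is a small sketch using only the
standard library. It shows the same visible character spelled two ways; the
helper `runeCount` is a hypothetical wrapper, and nothing here performs
normalization:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeCount (hypothetical helper) returns the number of code points in s.
func runeCount(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	composed := "\u00e9"    // é as one precomposed code point
	decomposed := "e\u0301" // e followed by COMBINING ACUTE ACCENT

	fmt.Println(runeCount(composed))    // 1
	fmt.Println(runeCount(decomposed))  // 2
	fmt.Println(composed == decomposed) // false: comparison is byte-wise
}
```

So []rune gets you code points, not user-perceived characters; treating the
two spellings as equal needs a separate normalization step.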

