
Concerning Unicode reversal, the author's statements "This has not been done", "there is no genuine real-world need", and "The algorithm doesn't exist" are wrong (see UAX #29 §6.4); "suitable metadata might need to be added" has already been done; and "what a 'string' is considered to be" is well-defined and requires no deliberation or reinterpretation. The text surprises me because I know the author programs in Perl, JS, and Python, so he really should be knowledgeable about this topic. The following code samples, which reverse a string by grapheme cluster, take care of all the edge cases described in the article.

https://docs.raku.org/routine/flip

    $some-str.flip
https://p3rl.org/unicook#℞-32:-Reverse-string-by-grapheme

    join '', reverse $some_str =~ /\X/g
https://php.net/grapheme_extract

    ...
https://github.com/dart-lang/characters

    ...
https://grapheme.readthedocs.io/en/latest/grapheme.html#grap...

    from grapheme import graphemes
    "".join(reversed(list(graphemes(some_str))))
https://npmjs.com/package/grapheme-splitter

    const GraphemeSplitter = require('grapheme-splitter');
    const gs = new GraphemeSplitter();
    gs.splitGraphemes(some_str).reverse().join('');
Rust deserves special mention because the language implementers did the correct thing and put it in the language: https://doc.rust-lang.org/1.3.0/std/str/struct.Graphemes.htm... … only to take it out again for no good reason, thus rendering the programming language not Unicode compliant in the process. Talk about snatching defeat from the jaws of victory! m-(

Other languages: try to find a binding for libpcre or libicu.

http://www.pcre.org/current/doc/html/pcre2pattern.html#Exten...

http://userguide.icu-project.org/boundaryanalysis#TOC-Charac...
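
For anyone wondering why reversing by code point is not enough, here is a minimal stdlib-only Python sketch. It is an approximation, not the UAX #29 algorithm: the hypothetical helper `approx_clusters` only glues combining marks (general category M*) onto the preceding base character, so emoji ZWJ sequences, flags, Hangul jamo and the like are not handled. For real use, reach for one of the libraries above.

```python
import unicodedata


def approx_clusters(text):
    """Split text into approximate grapheme clusters.

    Assumption/simplification: a cluster is one code point plus any
    combining marks (general category Mn, Mc or Me) that follow it.
    This is NOT the full UAX #29 segmentation.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            # Combining mark: attach it to the cluster it modifies.
            clusters[-1] += ch
        else:
            # Anything else starts a new cluster.
            clusters.append(ch)
    return clusters


def reverse_by_cluster(text):
    """Reverse text cluster by cluster instead of code point by code point."""
    return "".join(reversed(approx_clusters(text)))


sample = "cafe\u0301"                # "café" written with a combining acute accent
naive = sample[::-1]                 # code point reversal: the accent is split from its "e"
better = reverse_by_cluster(sample)  # the accent stays attached to its "e"
```

With the naive slice the combining accent ends up at the front of the string, detached from its base letter; the cluster-aware version keeps "e" and U+0301 together.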



