here's another "fun" grep locale oddity:

   $ echo HI | LANG=en_US.utf8 grep '^[a-z]'
   HI
   $ echo HI | LANG=C grep '^[a-z]'
   $
apparently en_{GB,US}.utf8 collates letters as aAbBcC..zZ, so the range [a-z] takes in every uppercase letter except Z:

   $ echo ZI | LANG=en_US.utf8 grep '^[a-z]'
   $
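If the goal is to match ASCII lowercase only, one possible workaround is to pin the locale for that single command. A minimal sketch, assuming a POSIX shell and grep; starts_lower is a made-up helper name, not part of grep:

```shell
# Hypothetical helper: reports whether a string starts with an ASCII lowercase letter.
# LC_ALL=C makes grep compare the [a-z] range by byte value instead of locale collation.
starts_lower() {
  if printf '%s\n' "$1" | LC_ALL=C grep -q '^[a-z]'; then
    echo yes
  else
    echo no
  fi
}

starts_lower HI   # prints "no"
starts_lower hi   # prints "yes"
```

LC_ALL is used here rather than LANG because it overrides LANG and any LC_COLLATE that may already be set in the environment.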



I had the same problem with sort:

  $ sort <<EOF
  > Aa
  > aa
  > Ab
  > ab
  > EOF
  aa
  Aa
  ab
  Ab
I was going crazy because I was getting different results on OS X and Ubuntu. Setting LANG to POSIX fixed it.
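For reference, a small sketch of pinning the collation for sort only, assuming any POSIX sort; byte_sort is a hypothetical wrapper name. In the C/POSIX locale the comparison is by byte value, so uppercase ASCII letters sort before lowercase ones:

```shell
# Hypothetical wrapper: sort by byte value regardless of the caller's locale.
byte_sort() {
  LC_ALL=C sort "$@"
}

printf '%s\n' Aa aa Ab ab | byte_sort
# prints:
# Aa
# Ab
# aa
# ab
```

Setting the variable on the command itself keeps the rest of the session in its normal locale.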


    $ sort --version
    sort (GNU coreutils) 8.14
For what it's worth, this gets me the same results under LANG=C and LANG=en_US.utf8.


This is what I get:

    $ echo HI | LANG=C grep '^[a-z]'
    $ echo HI | LANG=en_US.utf8 grep '^[a-z]'
    $ 
How come?


I was able to reproduce the bug. It could be a version thing.

  ; grep --version
  GNU grep 2.6.3
  ; echo A | LANG=en_US.utf8 grep '[a-z]'
  A


No, I have the same version but get a different result. I also have the en_US.utf8 locale installed.
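One thing that might explain different results with the same grep version: LANG is the lowest-priority locale variable, so a non-empty LC_ALL or LC_COLLATE already in the environment wins over it. A rough sketch of that lookup order, assuming a POSIX shell; effective_collate is a hypothetical helper that only reads the variables and does not check whether the named locale is actually installed:

```shell
# Hypothetical helper: print the locale name that collation would be taken from.
# Precedence: LC_ALL, then LC_COLLATE, then LANG (empty values are ignored).
effective_collate() {
  if [ -n "${LC_ALL:-}" ]; then
    echo "$LC_ALL"
  elif [ -n "${LC_COLLATE:-}" ]; then
    echo "$LC_COLLATE"
  elif [ -n "${LANG:-}" ]; then
    echo "$LANG"
  else
    # Nothing set: the default is implementation-defined, typically C/POSIX.
    echo C
  fi
}

effective_collate
```

Running `locale` with no arguments and `locale -a` gives the system's own view of the same settings, including which locales are available.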



