Oh this is fun! Beating Decades of Optimised C (best of three):

    $ time wc w.txt
     3156098 6312196 380648004 w.txt
    
    real 0m1.358s
    user 0m1.277s
    sys 0m0.066s
... with one line of q (best of three):

    q)\t g:{(sum[1<deltas where x]+sum[not x:max x],0),sum max x 0 1}0x0d0a2009=\:read1 `:w.txt
    783
    q)g
    380648004i
    6312196
    3156098i
783msec: almost twice as fast!



Won't this give an incorrect number of bytes for several consecutive whitespace characters? Why not use # to count the bytes instead?


In k, but with that same bug:

    (b:#a;b++/,/a=/:" ";b+#,/a)


Here is one without the bug:

    b:#a;c:,/a=/:" ";(b;b++/c@&~c&=':c;b+#,/a)

Now to see if I can check the performance.
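For anyone who doesn't read k, here is a rough Python sketch of the difference between the two versions above. It is not a translation: the function names are made up, the input is assumed to be a list of lines with the newlines already stripped, and only the plain space character is treated as a separator. The first function adds one word per space, so a run of spaces inflates the count; the second only counts a space when the previous character was not a space.

```python
def naive_counts(lines):
    """One word per line plus one per space: overcounts on runs of spaces."""
    n_lines = len(lines)
    n_words = n_lines + sum(line.count(" ") for line in lines)
    # Each stripped newline is added back as one byte.
    n_bytes = n_lines + sum(len(line.encode()) for line in lines)
    return n_lines, n_words, n_bytes


def run_aware_counts(lines):
    """Count a space only when it starts a run of spaces."""
    n_lines = len(lines)
    run_starts = 0
    for line in lines:
        prev = ""
        for ch in line:
            if ch == " " and prev != " ":
                run_starts += 1
            prev = ch
    n_bytes = n_lines + sum(len(line.encode()) for line in lines)
    return n_lines, n_lines + run_starts, n_bytes


sample = ["a b", "c  d"]            # second line has two consecutive spaces
print(naive_counts(sample))         # word count is inflated by the double space
print(run_aware_counts(sample))
```

Both sketches still assume no leading or trailing spaces and no empty lines, so they are only meant to show where the double counting comes from.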


We discussed this at length in the shakti mailing list [1]. This version by chrispsn is my favorite:

    {+/["\n"=x],+/[<':~x in "\n\r\t "],#x}
I think it's the one that most clearly expresses the intent. However, this other one by Attila Vrabecz performs better in the current version:

    {(+/x in"\n\n";+/0>':x in" \n\t\r";#x)}@1:"big.txt"
[1] https://groups.google.com/forum/?utm_medium=email&utm_source...
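As I read the one-liners above, the idea is: lines are newline bytes, words are the places where a non-whitespace byte follows whitespace (or the start of the input), and bytes are just the length. Here is a minimal Python sketch of that idea, assuming the same four whitespace bytes; the name `wc_counts` is hypothetical and the loop is written for clarity, not speed.

```python
WHITESPACE = b"\n\r\t "  # same four bytes the k versions test against


def wc_counts(data: bytes):
    """Return (lines, words, bytes) for a bytes object."""
    lines = data.count(b"\n")
    words = 0
    prev_in_word = False  # treat the start of input as whitespace
    for byte in data:
        in_word = byte not in WHITESPACE
        # A word starts wherever we move from whitespace to non-whitespace.
        if in_word and not prev_in_word:
            words += 1
        prev_in_word = in_word
    return lines, words, len(data)


print(wc_counts(b"a  b\nc\td\n"))
```

Because a word is only counted at the whitespace-to-text transition, runs of spaces, tabs, or blank lines don't change the word count.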


Nice, thanks for that!


That is quite beautiful. It may also have awoken Cthulhu, but that's a separate issue. :)



