Hacker News
Chopping substrings (perl.org)
39 points by lizmat 53 days ago | 12 comments



I assumed this was just using the chop function on strings and was wondering why it would be front-page worthy. Then I saw it was Damian Conway solving problems and got excited. Cool stuff! Perl 6 is pretty cool.


He's been doing a lot of these recently. Always an entertaining read.


Every time I see Perl code like this, I think "That's awesome! I've forgotten how expressive Perl is. It would be fun to use Perl for my next project."

Then a few seconds later I snap out of it and realize that there is no way that I would ever be able to understand what "max :all, :by{.chars}, keys [∩] @strings».match(/.+/, :ex)».Str" does after a few months (or hours). The documentation for that single line would need to be nearly as long as the OP's blog post.


The author explained it quite well, I'd say.

@strings».match(/.+/, :ex)».Str

This is actually quite straightforward once you know that » is the vector method call (basically map). The other stuff is just function/operator calls, apart from maybe the :by{.chars}: I could guess what it's doing, but only after seeing the implementation could I tell that it's something like a named parameter with a lambda.

Reading these articles by Damian I almost get jealous of how expressive Perl 6 is.
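For readers who don't know Perl 6, here is a rough Python sketch of how I read the one-liner quoted above. It is an approximation, not a translation: the names substrings, max_all and longest_common_substrings are made up for illustration, and max_all only mimics what the :all and :by arguments appear to do (return every item that ties for the largest key).

```python
from functools import reduce


def substrings(s):
    """Every non-empty contiguous substring of s, as a set."""
    return {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}


def max_all(items, by):
    """All items whose key by(item) is maximal (hypothetical helper)."""
    items = list(items)
    if not items:
        return []
    best = max(by(x) for x in items)
    return [x for x in items if by(x) == best]


def longest_common_substrings(strings):
    if not strings:
        return []
    # Roughly @strings».match(/.+/, :ex)».Str: one set of substrings per input.
    per_string = [substrings(s) for s in strings]
    # Roughly [∩]: fold set intersection over those sets.
    common = reduce(set.intersection, per_string)
    # Roughly max :all, :by{.chars}: keep every longest survivor.
    return sorted(max_all(common, by=len))
```

Calling longest_common_substrings(["ABABC", "BABCA", "ABCBA"]) gives ["ABC"]. The sorted() call is only there to make the order of ties predictable.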


> This is actually quite straightforward once you know that » is the vector method call (basically map). The other stuff is just function/operator calls, apart from maybe the :by{.chars}: I could guess what it's doing, but only after seeing the implementation could I tell that it's something like a named parameter with a lambda.

This doesn't remotely explain what that line of code is doing.


My TXR Lisp session from the last 5 minutes:

  This is the TXR Lisp interactive listener of TXR 221.
  Quit with :quit or Ctrl-D on empty line. Ctrl-X ? for cheatsheet.
  1> (defun subseqs (seq)
       (if (plusp (len seq))
         (cons seq (append (subseqs [seq 0..-1]) (subseqs [seq 1..:])))))
  subseqs
  2> (subseqs "abc")
  ("abc" "ab" "a" "b" "bc" "b" "c")
  3> (defun longest-subseq (. seqs)
       (let ((subseq-sets (mapcar [chain subseqs hash-list] seqs)))
         subseq-sets))
  longest-subseq
  4> (longest-subseq "abc" "cde")
  (#H(() ("a" "a") ("c" "c") ("bc" "bc") ("ab" "ab") ("b" "b") ("abc" "abc"))
   #H(() ("cd" "cd") ("d" "d") ("cde" "cde") ("c" "c") ("e" "e") ("de" "de")))
  5> (defun longest-subseq (. seqs)
       (let* ((subseq-sets (mapcar [chain subseqs hash-list] seqs))
              (isec [reduce-left hash-isec subseq-sets]))
         (car (find-max isec : [chain car len]))))
  longest-subseq
  6> (longest-subseq "ABABC" "BABCA" "ABCBA")
  "ABC"
  7> 
Basically, convert the lists of subsequences into hashes (thus sets). Then just reduce-left over hash-isec to find the intersection of these hashes. Then find-max to get the maximum element. The third argument of find-max is the key function through which elements are projected to get the comparison key. Under find-max, hash elements are considered to be conses. We want the maximum length so we chain together car and len.

The : symbol in a function argument list means "use whatever is the default value for this optional argument"; in this case, it tells find-max to use the default comparison function.
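As a side note on why the conversion to sets matters, here is a small Python mirror of the recursive subseqs function from the session above (a sketch, assuming Python slices behave like the TXR ranges used there). The recursion visits the middle of the string from both sides, so the list contains duplicates, which the set then removes.

```python
def subseqs(seq):
    """Mirror of the TXR subseqs: seq itself, then the substrings of
    seq without its last element, then of seq without its first."""
    if not seq:
        return []
    return [seq] + subseqs(seq[:-1]) + subseqs(seq[1:])


as_list = subseqs("abc")
as_set = set(as_list)
```

Here as_list has 7 entries, with "b" appearing twice as in the REPL output, while as_set has 6. The list doubles with every extra character (2^n - 1 entries for a string of length n), so this form is only reasonable for short inputs.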

(P.S. What we should be doing here is finding the minimum length among the input strings, and then generating substrings of only up to that length, since a common substring cannot be longer than the shortest input.)
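One way to act on that P.S., sketched in Python rather than TXR Lisp: no common substring can be longer than the shortest input, so candidates can be taken from the shortest string alone, longest first, stopping at the first one found in every input. This is a variation on the idea, not the code from the session above, and the function name is made up.

```python
def longest_common_substring(*strings):
    """Return one longest substring shared by all inputs, or "" if none."""
    if not strings:
        return ""
    # Any common substring must fit inside the shortest input.
    shortest = min(strings, key=len)
    # Try candidate lengths from longest to shortest; stop at the first hit.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in strings):
                return candidate
    return ""
```

With the inputs from the session, longest_common_substring("ABABC", "BABCA", "ABCBA") returns "ABC". Unlike the set-based version it returns only the first longest match it finds, not all ties.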


How do people generally input these custom operators/characters such as the vector method call and the set operator used in the article?


You usually don't need to. Perl 6 supports both Unicode and ASCII (a.k.a. "Texas") versions of these operators: https://docs.perl6.org/language/unicode_ascii

To answer the question, though: in my case I use X11's support for compose key sequences (with Caps Lock as my compose key) whenever possible (TODO: figure out a reliable way to add new sequences to fill in the gaps; I've been meaning to be able to type Compose + S + H + R + U + G to be able to instantly pop in that one shrug emote without having to search for it all the time).

Emacs (and other editors presumably) could probably be programmed to replace ASCII operators with Unicode operators on the fly, though I don't know of any packages which do this.


> TODO: figure out a reliable way to add new sequences to fill in the gaps

You can do that with an .XCompose file in your home directory. See https://github.com/kragen/xcompose for more info.


I've tried .XCompose before, but it didn't seem to work at all (hence the "reliable" qualifier in my TODO :) ). Supposedly uim helps with that, but of course I can't seem to actually compile it (and there ain't a SlackBuild for it), so... <compose> <s> <h> <r> <u> <g>


Typically either by setting up a keyboard layout, using XCompose, or using an editor which can insert them (e.g. input methods in Emacs). An alternative is to use the “Texas” versions, which are made out of ASCII but are slightly longer.


Replacing regex with substr is an optimization I've been using for years with excellent results. Fixed-length fields rock.
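To make the comparison concrete, here is a small Python sketch of the same record parsed both ways. The record layout, field names and sample data are invented for illustration, and Python slices stand in for substr. It shows only that the two approaches give the same result; whether slicing is actually faster for a given workload is something to measure (for example with timeit), not something this sketch proves.

```python
import re

# Hypothetical fixed-width layout: 8-char date, 10-char order id, 7-char amount.
FIELDS = {
    "date": slice(0, 8),
    "order_id": slice(8, 18),
    "amount": slice(18, 25),
}
PATTERN = re.compile(r"(.{8})(.{10})(.{7})")

RECORD = "20190812ORD0001234  49.95"


def parse_with_slices(record):
    """Cut each field out by position, substr-style."""
    return {name: record[span].strip() for name, span in FIELDS.items()}


def parse_with_regex(record):
    """Same fields, captured with a regular expression."""
    match = PATTERN.fullmatch(record)
    if match is None:
        raise ValueError("record does not match the fixed-width layout")
    return dict(zip(FIELDS, (group.strip() for group in match.groups())))
```

Both functions return {"date": "20190812", "order_id": "ORD0001234", "amount": "49.95"} for RECORD.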



