Hacker News new | past | comments | ask | show | jobs | submit login
Lispy sets (vicsydev.blogspot.com)
66 points by codr4life on Jan 7, 2017 | hide | past | web | favorite | 19 comments

While I certainly don't want to discourage anyone from hacking on anything they enjoy, if your primary purpose is to use ordered collections, FSet [0] has them, and much, much, much more. It's Quicklisp-loadable.

[0] https://github.com/slburson/fset

No worries; one of the reasons I stepped up in was to shake the tree a bit and see what people are using instead, considering how "lousy" the built-in solution performs. I also think there's room for a couple of steps in between; and a notable difference is that the implementation I posted prefers in place updates, functional use requires cloning.

> functional use requires cloning

If you're suggesting that functional updates require the entire collection to be copied, that's not correct. Only a small amount of data (oproportional to the logarithm of the size of the collection) has to be copied on each update. The result is that functional collections are quite practical for most uses. They have stylistic benefits as well.

Not at all, just noting the difference. I don't expect this to have much of a chance in comparison with more elaborate, mature libraries; like the one you suggested.

A real set implementation is 250x faster than using a list as a set (doesn't mention which operations or include the numbers though..)? One would hope so for large sets. That is what the author is referring to by built-in set functionality, right?

It's not any more real than built-in, it uses a list as a set as well. The benchmark is now included at the bottom of the post, or you can follow the link to the repo if you want to run it yourself.

Right, I saw the benchmarking code in the linked repository already, it would be interesting if you included the results from your machine in the post.

By "real" I meant "not naive."

For what it's worth, I was able to reproduce the results.

  Evaluation took:
    5.930 seconds of real time
    5.923203 seconds of total run time (5.916811 user, 0.006392 system)
    99.88% CPU
    13,045,070,068 processor cycles
    2,392,064 bytes consed

  Evaluation took:
    0.014 seconds of real time
    0.014117 seconds of total run time (0.014106 user, 0.000011 system)
    100.00% CPU
    31,060,638 processor cycles
    3,571,712 bytes consed
The first para is for lists, the second for slists.

Cool, so at least we know I'm not seeing ghosts :)

Will do. I hear you, what's surprising (to me) is how much less naive you have to go to get a workable solution.

I somehow missed previous submissions from this blog, but I've now favorited some of them and subscribed to its feed.

In particular, http://vicsydev.blogspot.com/2017/01/on-tests.html resonates very much with my approach to organizing programs:

1. https://github.com/akkartik/wart/blob/91356d9385/organizatio...

2. http://akkartik.github.io/mu/html/000organization.cc.html (https://github.com/akkartik/mu/blob/61fb1da0b6/000organizati...)

The ability to 'tag' tests into multiple suites is particularly nice. I might steal that at some point, but I'm first going to think about how it might synergize with my Literate Programming approach (http://akkartik.name/post/wart-layers)

Are you the author of the blog? Scrolling the code snippet keeps causing prev/next post page changes. It's really frustrating are you able to fix it?

Ouch, sorry. Unless that's a general problem that more people are having that's outside of my control. I'm delegating snippets to Github gists, do you have the same problem with other gists? Either way, I suggest following one of the links to the repo instead then.

Yes, general problem with all Blogspot blogs. It's been going on for years at this point.

This is getting off-topic, but Blogspot/Blogger is a #$%# shitshow. My general rule of thumb is to use services from a company that treats them as its top priority (like say Wordpress in this case). Google likely doesn't include Blogger in its top 100 priorities.

Np was able to solve by reading in landscape. Really liked the article otherwise!

It would be interesting to see how things compared with set operations on hash table keys and `sort`.

Could you elaborate on that a bit? First I thought you wanted to compare to a straight hash table but then the sorting threw me off :)

Hashtables are unordered by key, so if you want ordering for those keys, which would really be the set elements, you'll have to sort.

I get it, hash table for membership and manual ordering. That's an option. From my experience; hashed sets are more expensive to set up, because of the slots; making small repetitive sets less appealing. And manual ordering is not always an option. I tried adding hashing on top of slists, an array of slists basically; to get away from traversing the whole thing for large sets; but the results were disappointing, basically identical performance.

Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact