Edit: I know this sounds like I am trying to dismiss the problem. That's not my point. I'm trying to say the problem is compounded by the choice to bundle the solutions to several different problems. If people have criticisms I would love to read a reply.
> We ought to have a standard version of the reject! code precisely because it's so difficult to get right.
No, it's easy to get right in any specific case. It's iterating over a dang list, it's the first thing you learn how to do as a programmer. It's only hard to get right if you're trying to solve it for every use case with a single interface.
Essentially: you are taking two different algorithms and trying to switch between them behind a single interface. Meaty if-statements behind flags are an indication this is happening. It's cool if you want that, but put it in a library and give it a GitHub repo.
An example: I don't use a library to merge attributes on objects. I cut and paste one of a few snippets or write from scratch. Sometimes you want to mutate an object, sometimes you want a copy, sometimes you want to shallow copy or keep a reference to the parent. I could write a nasty complicated function that intelligently solved "the problem". But all that does is bundle together a bunch of different solutions to different problems and make it hard for me to understand which one it picked.
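To make the "several different problems" point concrete, here is a minimal Ruby sketch of three distinct merge/copy operations that a single clever function would have to bundle behind flags. The variable names are made up for illustration:

```ruby
# Three different "merge" problems, each clearer as its own call
# than hidden behind one flag-laden interface (illustrative sketch).
defaults  = { retries: 3, timeout: 10 }
overrides = { timeout: 30 }

merged = defaults.merge(overrides)   # copy: defaults is untouched
defaults.merge!(overrides)           # mutate defaults in place

# Shallow copy: nested objects are still shared with the original
nested  = { opts: { verbose: true } }
shallow = nested.dup
shallow[:opts].equal?(nested[:opts]) # => true, same inner hash
```

Each call site tells the reader exactly which problem is being solved; a bundled `smart_merge(a, b, mutate: …, deep: …)` would not.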
Edit 2: I could be persuaded that what I'm saying is just not Rubyish. I'm a JavaScript programmer (after many years of Ruby) so maybe there's some truth in that. I do think that distancing ourselves from the mechanics of list manipulation can be bad for our code quality.
> No, it's easy to get right in any specific case. It's iterating over a dang list, it's the first thing you learn how to do as a programmer. It's only hard to get right if you're trying to solve it for every use case with a single interface.
Isn't any correct implementation of a function that retains only the elements that pass some predicate going to have to do basically the same thing that reject! does, or else be slow (creating a new list instead of mutating in place)?
I guess there's a somewhat simpler implementation that would work for some use cases (those in which order is not important) that just swaps in the last element instead of removing. But if maintaining order is important the complexity is similar.
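That order-insensitive variant can be sketched in a few lines of Ruby (the helper name is made up for this example):

```ruby
# Order-insensitive in-place delete: overwrite each rejected slot
# with the current last element, then shrink. O(n) total work, but
# the surviving elements end up reordered. (Illustrative sketch.)
def swap_reject!(arr)
  i = 0
  while i < arr.length
    if yield(arr[i])
      arr[i] = arr[-1]  # move the last element into the hole
      arr.pop           # shrink by one; don't advance i, because
    else                # the swapped-in element still needs checking
      i += 1
    end
  end
  arr
end

a = [1, 2, 3, 4, 5, 6]
swap_reject!(a) { |x| x.even? }
a.sort  # => [1, 3, 5]
```

The order-preserving version needs the read-pointer/write-pointer compaction instead, which is exactly where the extra care comes in.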
I think that what would happen if Ruby didn't include reject! is that most programmers would just needlessly create new arrays all the time, since that's easier to write.
> [won't] any function that retains ... elements that pass some predicate going to have to do basically the same thing that reject! does ...?
Well, possible things you might want that still fall under the banner of "in-place reject":
- single slice at the end, vs atomic iterations as the article stated
- traversing the list in different directions, or according to priority
- aborting part way through
- any of a million domain concerns
If you roll your own iteration, you have a platform for handling any of these. If you used a library, even the standard library, you need to start over from scratch, and either build your own iteration anyway, or find another library.
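As a sketch of the flexibility being claimed here, a hand-rolled loop can combine two of the bullet points above (reverse traversal plus aborting part way through) in a way no single stock interface offers. The function and names below are invented for illustration:

```ruby
# Hand-rolled in-place reject: traverse backwards so deletions
# don't disturb indices we haven't visited yet, and stop early
# once a removal budget is spent. (Illustrative sketch.)
def prune_latest!(events, budget:)
  removed = 0
  (events.length - 1).downto(0) do |i|
    break if removed >= budget    # abort part way through
    if yield(events[i])
      events.delete_at(i)         # safe: only lower indices remain
      removed += 1
    end
  end
  events
end

log = [1, 2, 3, 4, 5, 6]
prune_latest!(log, budget: 2) { |x| x.even? }
log  # => [1, 2, 3, 5]
```

Note this favors flexibility over speed; `delete_at` shifts the tail each time, unlike a single compaction pass.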
The exception to this is something like Haskell where the abstractions of the abstractions are modelled in the type system, so you never have to start anything over you just bisect your types and reassign responsibilities around the wound.
it might be quite fast, but it's never going to be as fast as a mutable version of the same thing. there are plenty of cases where you have many (millions?) of elements and want to remove only a few. in that case, copying an array of many elements for the sake of immutability is quite wasteful
*edit: i say never, but if you do cool stuff with referencing array ranges so that you don't copy, but reference the unchanged parts of the previous array then sure, but that's more of a low level feature of the language (eg erlang and clojure both do this i thiiiink?) than something you'd want to implement in a library because it can be suuuuper tricky to get right
No one does immutability with copying, that would be a joke. There is also nothing tricky about not copying full lists on append in a userland library; it just takes more advanced data structures.
Yes, it won't be as fast in modifying. But it will be a lot faster in other operations, so it's a trade-off and you choose appropriately for your use case.
Plus, it's a lot easier to read and understand "immutable code".
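The structural-sharing idea being discussed can be demonstrated with a toy immutable cons list in plain Ruby (a sketch only; real persistent vectors, as in Clojure or Erlang, are considerably more involved):

```ruby
# Toy persistent (immutable) linked list: prepending builds one new
# head node and shares the entire existing tail, so nothing is copied.
Node = Struct.new(:value, :rest)

def cons(value, list)
  Node.new(value, list).freeze
end

base     = cons(2, cons(3, nil))
extended = cons(1, base)

extended.rest.equal?(base)  # => true: the tail is shared, not copied
```

Prepending is the easy direction; appending to a structure like this would rebuild the whole spine, which is exactly where the "more advanced data structures" come in.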
I'm going to assume this bottlenecked-by-mem-alloc work isn't Ruby.
The rule of thumb I personally go by is that if you're working in e.g. Ruby or Python it's better to favor immutability over mutability because mem alloc will almost never be the problem.
I say this having worked on a Python (pypy) product where mem alloc WAS the bottleneck in a particular area. So I know it's not always true, but almost always. Probably preaching to the choir here :)
Oof! reject! is not "iterating over a list," but compacting an array. It's a pretty simple operation that should be available in a standard library. There will be some hairy edge cases around nonlocal exits and accesses to the array itself within the reject block, but it's not too hard to create a standard implementation that properly handles most of the edge cases, and documents the ones that aren't handled.
Roughly, you move elements into place as you go, but don't shift all the higher-index elements down right away. Document that the block can only access elements with greater indices than the current element, and you're good to go.
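The scheme described above is the classic read-pointer/write-pointer compaction, with a single slice at the end instead of shifting on every deletion. A sketch (helper name invented; a real `reject!` also has to handle the nonlocal-exit edge cases mentioned earlier):

```ruby
# Order-preserving in-place reject via read/write pointers: each
# kept element is moved down at most once, and the tail is removed
# in one slice at the end. (Illustrative sketch.)
def compact_reject!(arr)
  write = 0
  arr.each_index do |read|
    unless yield(arr[read])
      arr[write] = arr[read]
      write += 1
    end
  end
  arr.pop(arr.length - write)  # single slice at the end
  arr
end

a = [1, 2, 3, 4, 5]
compact_reject!(a) { |x| x.even? }
a  # => [1, 3, 5]
```

This is O(n) overall; the accidentally-quadratic version shifts every higher-index element down on each individual rejection.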
> I don't use a library to merge attributes on objects. I cut and paste one of a few snippets or write from scratch. Sometimes you want to mutate an object, sometimes you want a copy, sometimes you want to shallow copy or keep a reference to the parent. I could write a nasty complicated function that intelligently solved "the problem". But all that does is bundle together a bunch of different solutions to different problems and make it hard for me to understand which one it picked.
Rubyish-ness aside, this just seems like a worse solution than a library.
1. You're cluttering your files with code that isn't essential to your business logic (and although you might be able to recognise a particular snippet as in-place-merge vs shallow-copy-merge, anyone new who's reading your code will need to read each snippet as they come across it to understand what high-level operation you're performing).
2. Especially in JS, more copied code equals more parse time (which can be a real killer as a project grows).
3. If your code does have a bug (like being accidentally quadratic), you now have a harder time finding and updating all of the places that use it.
4. You've spent your own time writing, customising, debugging and maintaining these snippets when you could have just written _.extend() and had short, clear, working code whose behaviour most JS developers already understand.
I'm not saying never write your own snippet — there will be plenty of times when you need some really specific behaviours or performance characteristics — but the default way to do common data-shuffling operations should be a clearly named function call to a shared library.
1. I could be wrong, but I took the original comment to mean he copies and pastes the functions, not that he writes out the algorithm for every instance (I don't think I could possibly argue that's a good idea). An even better solution is to just put it in its own module with an implementation for each.
2. I don't know if I entirely buy more code = more parse time as a real reason here if you're willing to pull in the entire Underscore library just to get _.extend
3. See response to #1
4. I disagree with this, for the reasons you stated with #2 actually - I really can't stand bringing in one gigantic library for a single piece of functionality. It's the exact reason that jQuery lingers in so many projects when it's really just there to provide a coherent XHR request interface.
I think the general direction of smaller reusable modules that node pushed everyone towards is the best solution.
I totally agree: small specialised modules are great. Underscore is only worthwhile if you need some significant fraction of it (but it makes for an easy example).
I think 4 is actually really important — regardless of whether you're using big or small libraries — it doesn't take long before the maintenance cost of any code you write starts to become a drag; any maintenance you can avoid by using open source libraries should be considered an investment in future productivity. Of course, that doesn't apply if the third party code is crap or not suited to your immediate use case.
Regarding 4: if you use ES6+ modules and Webpack 2 (or Rollup), you can import only the stuff you need from lodash.
That said, my experience is that in some cases you still end up with a bunch more code than expected because of internal dependencies, but perhaps that code would've been necessary anyways.
Nearly all bugs in anyone's code come from things you get right 99% of the time, but mess up that 1%. If Ruby's standard library provides a correct implementation of `Array#reject!`, it's eliminating one thing, however insignificant, that you might mess up. That even standard library implementers occasionally mess up iterating over a dang array (`#reject!` was correct but inadvertently suboptimal) is a good example of how, in the large, "just get it right" is much easier said than done.
Another good reason is that the Ruby standard library implements many methods, such as `#reject!`, in C instead of pure Ruby, for vastly better performance. You can do that yourself of course, but that sort of defeats the purpose of using Ruby. Try implementing some `Array` iteration methods in pure Ruby with for loops, then benchmark them against the standard equivalent; they'll usually be dozens to hundreds of times slower.
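A rough way to run the suggested benchmark yourself (a sketch; the function name is invented, and the exact ratio depends heavily on the Ruby implementation and version, so no specific speedup is claimed here):

```ruby
require 'benchmark'

# Pure-Ruby select with an explicit loop vs the built-in C-backed
# Array#select. Timings are printed for comparison only.
def slow_select(arr)
  out = []
  i = 0
  while i < arr.length
    out << arr[i] if yield(arr[i])
    i += 1
  end
  out
end

data = (1..100_000).to_a
Benchmark.bm(12) do |bm|
  bm.report('pure Ruby:') { 20.times { slow_select(data)  { |x| x.even? } } }
  bm.report('built-in:')  { 20.times { data.select        { |x| x.even? } } }
end
```

On CRuby the built-in consistently wins; by how much varies, which is worth measuring before drawing conclusions for your own workload.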
`reject!` seems like a very straightforward case to me, one that makes sense to be in the standard library. How would different use cases change its behavior? What are the "meaty if-statements behind flags" here?