

2 is a code smell - puredanger
http://tech.puredanger.com/2011/10/12/2-is-a-smell/

======
fhars
You might with more justification claim that varargs methods are a code smell,
as they conflate the algebraic structure of the elements and the operations on
them (for example that they form a semigroup with respect to the operation at
hand) with the traversal over a collection of elements, thereby grossly
violating the principle of separation of concerns. After all, many
interesting algebraic structures _are_ defined in terms of binary operators.
If you separate operations between elements and traversal over collections,
you can fold any operation over any traversable entity, with just #operations
+ #traversables functions, instead of #operations * #traversables functions.
(You have a similar structure in languages from the APL family, where
operators are all unary or binary, and traversal is specified by adverbs to
the operators.)

Of course, this does not apply if you use a deliberately restricted language
that has just one type of collection traversal; in that case you are better
off with varargs, since n*1 < n+1.

------
Sharlin
This is pretty much an instance of the good old "zero, one, infinity" rule.

[http://www.catb.org/jargon/html/Z/Zero-One-Infinity-Rule.htm...](http://www.catb.org/jargon/html/Z/Zero-One-Infinity-Rule.html)

~~~
brlewis
This article restricts itself to a context where following "zero, one,
infinity" results in simpler, more concise code. This is a good restriction.
In general, following "zero, one, infinity" will often put you at odds with
YAGNI.

~~~
puredanger
My experience with Clojure has been that sticking to N is so harmonious with
the Clojure way that it is simpler and yields more flexible code.

If the code is _simpler_ then I don't see how YAGNI comes into play. Why would
you do something that is both more complicated and less amenable to future
needs?

~~~
brlewis
By "in general" I meant outside the context the post was restricted to, i.e.
Clojure. What you're saying is the same as what I said in the first part of my
comment, up to "This is a good restriction."

------
Jach
Cache:
[http://webcache.googleusercontent.com/search?q=cache%3Ahttp%...](http://webcache.googleusercontent.com/search?q=cache%3Ahttp%3A%2F%2Ftech.puredanger.com%2F2011%2F10%2F12%2F2-is-a-smell%2F&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a)

tl;dr Clojure is awesome. And use

    
    
        (defn add-vectors [& vs]
          (vec (apply map + vs)))
    

instead of

    
    
        (defn add-vectors [va vb]
          [(+ (first va) (first vb)) (+ (second va) (second vb))])
    

since in Clojure code that operates on 1 or N things is usually better than
some other fixed value. (Like 2.)
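
For readers who don't follow Clojure, the varargs version translates roughly to this Python sketch (the function name is mine, mirroring the Clojure above):

```python
def add_vectors(*vs):
    # Elementwise sum of any number of equal-length vectors.
    # zip(*vs) groups the nth components together, like (map + v1 v2 ...).
    return [sum(components) for components in zip(*vs)]

print(add_vectors([1, 2], [3, 4]))          # [4, 6]
print(add_vectors([1, 2], [3, 4], [5, 6]))  # [9, 12]
```

As in the Clojure version, nothing in the body mentions how many vectors there are, or how long they are.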

------
lyudmil
I try to be very pedantic about having the most terse, yet readable code
possible in my application. This probably means that if, for now, I only need
to know how to add two vectors, I will only include a function that adds two
vectors and no more. I do realize there's hidden structure there waiting to be
exposed, but programming is largely about elaborating the _useful_ structures.
If the author's examples had been in Java instead of Clojure, a lot of us
would've been complaining that he's over-designing.

I do think the author has a point, but to me the logical complexity
progression for a piece of code is:

1. Make it work in the "1" case.

2. If necessary, make it work in the "2" case.

3. If necessary, make it work in the "n" case (n variable).

I find this particularly useful for larger decisions: I can make this work
right now if I add this tiny little wart? No problem. Another thing came up
and a similar wart is required? Ok, I guess. A third? Time to find the
structure that my code is telling me about and refactor to expose it.

~~~
humbledrone
The final (varargs) add-vectors function in the OP is clearly (at least to me)
more terse and readable than the original (2 arg) one. I've only written a
couple thousand lines of Clojure to date, but code like that already just
looks right. The varargs version has fewer possibilities for errors; the 2-arg
version allows for several permutations of first/second to be written
incorrectly, or the wrong va/vb used.

------
KeithMajhor
What about sub-vectors? That smells like a 2 to me...

I'd rather call fold or foldr myself and KNOW how it handled more than 2
arguments. I'd rather do it for add-vectors too, just so it was consistent.
You have to define primitives somewhere. Even if what you call elsewhere is
some fluffy handle-everything function. At the very least, I wouldn't call it
a smell.

~~~
cube13
I agree that it isn't necessarily a smell.

It really depends on the scope of the problem. Is this supposed to be a fully
functional addition function, or is this a throwaway helper function that the
coder wrote because they were doing that specific operation many times, and
they didn't want to type the entire thing out? If it's the first case, then it
may be an issue. If it's the second, it's just poorly named, and not
necessarily an issue at all.

------
heyrhett
If a co-worker told me that my function was "code smell" because it took 2
arguments, I would tell him that he needs more important things to worry
about.

~~~
humbledrone
If a coworker told me the same thing, I'd ask, "Why?" If they gave me the
explanation given in the article, I'd say, "Thank you, that's a nice little
improvement. The code is simpler AND more general." Then, I'd go on working, a
little bit better of a programmer for it. If I repeated this ritual a few
hundred times, I might actually become a halfway-decent programmer. In this
field, details matter.

~~~
erydo
There's a difference between seeing that something can be generalized,
optimized, or improved and spending the time to do it.

Knowing that difference and applying judgement is a major differentiator
between inexperienced programmers and experienced ones. That, I think, is what
heyrhett was getting at.

~~~
humbledrone
I don't get it. In the case of the vector-add function, once you've happened
to notice the more general solution, implementing it will take a couple of
minutes. It's not comparable to an optimization, which would normally increase
the risk of code errors -- it is a simplification, which would decrease the
risk of code errors.

Two minutes to make a minor simplification and learn a generally good way of
approaching things with the tool at hand (Clojure)? Count me in.

------
imurray
Reminds me of zip, zip3, zip4, etc. in Haskell.

~~~
ams6110
Erlang as well, though they don't go to 4 IIRC. A pragmatic compromise of
providing a simpler interface for common use cases.

~~~
nemetroid
Haskell actually goes all the way up to zip7. A generalised zipn function is
quite complicated to write in Haskell, due to the type system. Here's a blog
post about it: [http://paczesiowa.blogspot.com/2010/03/generalized-zipwithn....](http://paczesiowa.blogspot.com/2010/03/generalized-zipwithn.html)
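
By contrast, in a dynamically typed language the "zipn" problem disappears: Python's built-in `zip`, for instance, is variadic because nothing forces the result tuples to carry their arity in a type.

```python
# One zip handles any number of iterables; Haskell needs zip, zip3, ..., zip7
# (or type-level machinery) because each arity has a different result type.
print(list(zip([1, 2], [3, 4])))          # [(1, 3), (2, 4)]
print(list(zip([1, 2], [3, 4], [5, 6])))  # [(1, 3, 5), (2, 4, 6)]
```

So the zip2/zip3/zip7 family is less a design choice than a cost of typing the general case.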

------
kevinlawler
K handles the general case:

    
    
        1 1 1 + 2 2 2
      3 3 3
      
        (1 1 1; 2 2 2; 3 3 3) + (4 4 4; 5 5 5; 6 6 6)
      (5 5 5
       7 7 7
       9 9 9)

~~~
hazexp
As does R :)

    
    
        > c(1, 1, 1) + c(2, 2, 2)
        [1] 3 3 3
        > matrix(c(1, 1, 1, 2, 2, 2, 3, 3, 3),
        +        ncol = 3, byrow = TRUE) +
        + matrix(c(4, 4, 4, 5, 5, 5, 6, 6, 6),
        +        ncol = 3, byrow = TRUE)
             [,1] [,2] [,3]
        [1,]    5    5    5
        [2,]    7    7    7
        [3,]    9    9    9
    

Not quite so concise and elegant but it's pretty straightforward stuff for
most R users.

------
MostAwesomeDude
Since we've already descended into anecdotal territory, lemme drop an
anecdote. I had a dot-product function which turned out to be a hotspot in
some numerical code. The original function looked something like:

    
    
        def dot(v1, v2):
            return sum(i * j for i, j in zip(v1, v2))
    

Very functional, very pretty, very straightforward. But it was slow, both in
CPython and PyPy. A bit of bytecode analysis and profiling later, we realized
that unrolling would be faster by several orders of magnitude. So I checked
this in:

    
    
        def dot2(u, v):
            return u[0] * v[0] + u[1] * v[1]
        def dot3(u, v):
            return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
    

And the bottleneck went away. From the Zen of Python: "Practicality beats
purity."

~~~
puredanger
If you spend any time in the Clojure community, I think you'll come to an
appreciation that practicality and performance are key principles of its
design. The goal is to obtain both high levels of abstraction and performance.

~~~
NickPollard
But performance means different things to different people.

If you're writing code where dot products are not something you do a lot of,
then keeping a clean, idiomatic functional implementation is probably fine.

If you're writing code where dot products happen _a lot_, such as in real-time
three-dimensional games (which happen to be my oeuvre), then the performance
increase of having specific, unrolled implementations is huge.

~~~
Jach
It depends on what the language and compiler do for you; the elegant form may
actually be faster than your supposed uglier-but-faster optimization. As
always, time-test before and (if you've decided it's necessary) after
optimizing. I remember reading a warning somewhere in the Clojure docs about
how using (get x idx) is much faster than (nth x idx), though I don't remember
why.

~~~
NickPollard
This is of course true; ALWAYS profile the code before and after any
optimizations. It is quite common for things not to act the way you'd expect
(either because of the hardware, the compiler, or even other bits of code
you're not aware of, which perhaps rely on certain assumptions).
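
That advice is cheap to follow with the standard library; here is a minimal sketch using `timeit` to compare the two dot-product styles from MostAwesomeDude's anecdote (the timing numbers will depend entirely on your interpreter and hardware, so no result is claimed here).

```python
import timeit

def dot(v1, v2):
    # Generic version: works for vectors of any length.
    return sum(i * j for i, j in zip(v1, v2))

def dot3(u, v):
    # Unrolled version: fixed to length-3 vectors.
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

u, v = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
assert dot(u, v) == dot3(u, v) == 32.0  # same answer either way

# Measure both on your own interpreter before committing to either.
for f in (dot, dot3):
    print(f.__name__, timeit.timeit(lambda: f(u, v), number=100_000))
```

Running this under both CPython and PyPy is the quickest way to see whether the unrolling is still worth its ugliness on your setup.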

