The Grep Test (jamie-wong.com)
284 points by phleet 1590 days ago | 169 comments



So basically I can't commit anything containing the slightest form of metaprogramming? Seems a bit extreme to me. A solution is to go for the more concise and robust metaprogrammed piece of code (the example of the form with the User object is a good one) and add a comment mentioning the methods called in there.

The "smartass" way of programming is sometimes overused but it does have its benefits. When you're metaprogrammatically setting the attributes on the User, you're also avoiding needless and error-prone repetition and making sure that this central piece of code will either crash all the time or work all the time for all attributes. This has tremendous value.
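
For what it's worth, here is roughly what that compromise could look like. This is a Python sketch rather than the article's Ruby, and the `User` class, the field names, and `apply_form` are all made up for illustration:

```python
class User:
    """Hypothetical model; the field names below are illustrative only."""

    def __init__(self):
        self.name = None
        self.email = None
        self.phone = None


# Attributes assigned dynamically by apply_form (listed here for grep):
#   user.name, user.email, user.phone
FORM_FIELDS = ("name", "email", "phone")


def apply_form(user, form):
    """Copy every whitelisted form field onto the user in one place.

    The same code path handles every field, so a bug in it shows up for
    all attributes rather than for just one of them.
    """
    for field in FORM_FIELDS:
        setattr(user, field, form[field])
    return user


user = apply_form(
    User(),
    {"name": "Ada", "email": "ada@example.com", "phone": "555-0100"},
)
print(user.name, user.email, user.phone)
```

The comment above `FORM_FIELDS` is the part that can drift out of sync, which is the objection raised further down the thread.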

So while I understand the point about this article, I might want to add a pinch of salt to the dogma underlying it.


Your example of adding a comment mentioning the methods called in there would indeed pass the Grep Test, and is a reasonable compromise when there is a real call for dynamic declaration.

I think there are definite ways of adding metaprogramming functionality without breaking this test. For instance, in the first JavaScript counterexample, if the iteration was over [{attr: "position", fn: "getPosition"}, {attr: "direction", fn: "getDirection"}] instead, the Grep Test passes, and you get much of the same benefits, with a very minor duplication that I'd argue is worth the cost.
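
A rough Python version of that table-driven idea, assuming a hypothetical `Ship` class and snake_case names in place of the article's `getPosition` and `getDirection`:

```python
class Ship:
    """Hypothetical object with two attributes stored as tuples."""

    def __init__(self, position, direction):
        self.position = position
        self.direction = direction


# Each generated method name is spelled out literally, so
# `grep get_position` lands on this table.
ACCESSORS = [
    {"attr": "position", "fn": "get_position"},
    {"attr": "direction", "fn": "get_direction"},
]


def _make_getter(attr):
    # A factory gives each getter its own `attr`; a lambda defined directly
    # in the loop would see only the last value of the loop variable.
    return lambda self: getattr(self, attr)


for spec in ACCESSORS:
    setattr(Ship, spec["fn"], _make_getter(spec["attr"]))

ship = Ship(position=(1, 2), direction=(0, 1))
print(ship.get_position(), ship.get_direction())
```

The duplication is one string per method, and in exchange every method name has a literal occurrence in the codebase.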


I work in Perl for the most part, and the alternative I've been using lately for functions is to assert that for a given generated sub, the text 'sub routine_name' exists literally in the code somewhere. This takes two forms in my case:

    my %to_generate = (
        'sub first' => { ... },
        'sub second' => { ... }
    );
    generate_from_spec(%to_generate);
generate_from_spec then strips off the 'sub ', and errors out if it is not found.

The second case I have is a function that itself then generates a function, but is trying to look like a sort of normal function itself, in which case I end up with:

    generate_sub routine_name =>
        ... whatever arguments ...
"abusing" Perl's => operator, which functions like a comma except that it forces stringification of the left argument, to once again make the literal, greppable "sub routine_name" appear in the codebase. Here "routine_name" is then just a standard string argument, the alternative being

    generate_sub "routine_name",
which is then harder to grep for. (Still possible, obviously, with a different grep query, but only if you already know up front you need to add the other possibilities.)

Note this actually goes a step beyond what you are proposing in that it makes the declaration site clear; my counterproposal for your JS case would be

    ["function getPosition", "function getDirection"].each(...)
and using string manipulation to do whatever you need to do to get the right info out of the function name.
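
The same trick carries over to other languages. Here is a Python transliteration of the first form, with "def " standing in for "sub "; `generate_from_spec` is my own hypothetical helper, not a copy of the Perl original:

```python
def generate_from_spec(namespace, spec):
    """Install one function per spec entry into `namespace`.

    Keys must look like "def name" so that the literal text `def name`
    appears in the source, which keeps the generated function greppable.
    """
    prefix = "def "
    for key, body in spec.items():
        if not key.startswith(prefix):
            raise ValueError(f"spec key {key!r} must start with {prefix!r}")
        namespace[key[len(prefix):]] = body


generated = {}
generate_from_spec(generated, {
    "def first": lambda: "first called",
    "def second": lambda: "second called",
})
print(generated["first"](), generated["second"]())
```

Erroring out on a missing prefix is what keeps the convention from eroding over time.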


So update your blog post to say that. I had the same reaction as GP.


Done.


I'm not sure I totally disagree with you, but I have to at least disagree with part of your argument.

> Seems a bit extreme to me.

That reasoning has absolutely zero argumentative value behind it, in any context that it's used. It ought to be treated like a logical fallacy.

> add a comment mentioning the methods called in there.

The problem of comments getting out of sync with code is omnipresent. "Never fail the grep test" seems like a much more easily-enforced and -maintained practice (both for yourself, and for teams) than what you're suggesting.


The point here is not that metaprogramming is universally bad, it's that it should be used as a last resort (or second-to-last resort to codegen, depending on your platform/application) and you should feel bad when you need to use it.

Simply using it to save keystrokes is pretty lame since you'll spend much more time maintaining code than typing out the original, and explicitness is valuable when returning to a piece of code. Additionally, if you find that you need a lot of metaprogramming for a lot of things, it's often an indication that you could just refactor your code using static idioms and be better off.

Not to mention that the more you use metaprogramming, the more likely it is that you or someone else will kill some runtime optimizations of the JIT.

IMO, syntax matters way less than people tend to think it does, and the additional implementation complexity and astonishment cute syntax introduces tends to make the trade-off not worth it.


In my experience overuse of meta programming is usually a sign that the architecture of the system is bad and the software should be split up into separate libraries or web services.


That all depends on the language. Lisp (pretty much all of its dialects) heavily emphasizes the use of macros to extend the language and build all kinds of control structures and embedded domain-specific languages (EDSLs).


At the same time the Lisp community is one of the fiercest in supporting the idea that nothing that can be a function should be a macro. Metaprogramming just because it is clever is more abhorred in the Lisp community than any other I've seen.


Of course! Macros are distinctly less powerful than functions for one reason: they are not first class values at runtime and thus cannot be passed to higher-order functions.


This is solved by using an operative lisp - fexprs (or 'vau'/'$vau' in kernel and wat) are non-applicative combiners that execute at runtime, so you can have a first class combiner that chooses when/how/how many times to evaluate its arguments.

I've been having a lot of fun recently writing code in wat-pl (my perl port of Manuel Simoni's wat-js interpreter).


Lisp macros are a lot simpler and easier to understand than what goes on with the types of metaprogramming done in ruby/rails. Expanding a macro is a lot easier for me to wrap my head around than stuff like dynamically defining methods when something is subclassed or when magic happens in method_missing.


The heavy emphasis of macros in Lisp may also be its greatest weakness. I see all sorts of Clojure code that uses macros where they needn't have, and it tremendously impedes my ability to use the code in ways the library author may not have expected. I love the way petehunt put it - metaprogramming is a last resort, you should feel bad when you use it, and language designers should include it but deemphasize it. (Scala and Haskell nail it I think; Clojure perhaps emphasizes macros a bit too much which harms the output of intermediate programmers. But, Clojure is not designed for intermediate programmers.)


Absolutely. One could make the argument, though, that many (including me) believe that expressive power and conceptual beauty is Lisp's core strength, not code maintainability or runtime performance.


Common Lisp was designed for development of large/complex applications ('applications' really means here software that is used in production) with good performance.

For example this CAD system from PTC is based on 6+ million lines of Common Lisp code and under development for two decades:

http://www.ptc.com/product/creo/direct


> expressive power and conceptual beauty is Lisp's core strength, not code maintainability or runtime performance.

Well, Arc is actually very maintainable, and quite fast. After all, HN is powered by Arc, and it serves >100k daily uniques on a single core.


HN is fast because it is feature-barren. That's a design choice pg is free to make (and cutting features is a great way to improve performance), but not a proof of Arc/Lisp's ability to support complex applications with high-performance.


Which features would convince you that Arc is fast?


Just for the record, let's count to 100 million. And add the numbers up while we're at it.

  ~ $ time (echo '(do (= n 0) (for i 1 100000000 (++ n i)) prn.n (quit))' | arc)
  Welcome to Racket v5.3.5.1.
  Use (quit) to quit, (tl) to return here after an interrupt.
  arc> 5000000050000000
  
  real	0m28.561s
  user	0m28.308s
  sys	0m0.249s
  ----
  ~ $ time (echo -e 'n=0 \ni=0 \nwhile (i <= 100000000): \n  n += i \n  i += 1 \n\nprint(n)\n' | python3.3)
  5000000050000000

  real	0m30.244s
  user	0m30.230s
  sys	0m0.013s

It would appear to be competitive with Python on my machine on this particular task. (Also you can make it faster by dropping into Racket.)


A while loop and temporary mutable variables are definitely not the Pythonic way of doing this. More idiomatic:

    $ time python -c 'print sum(xrange(100000000 + 1))'
    5000000050000000
    
    real    0m1.398s
    user    0m1.383s
    sys     0m0.012s
Comparison to baseline:

    $ time (echo -e 'n=0 \ni=0 \nwhile (i <= 100000000): \n  n += i \n  i += 1 \n\nprint(n)\n' | python)
    5000000050000000
    
    real    0m33.140s
    user    0m32.939s
    sys     0m0.023s


I see. That is good to know; I merely chose mutating global variables in a loop because I knew how to do that in both languages. (I am not very familiar with Python.) That is not the idiomatic way to do it in Arc, either. I would normally use a recursive function, like this:

  arc> (time:xloop (i 0 n 0) (if (> i 100000000) n (next (+ i 1) (+ n i))))
  time: 9121 cpu: 9130 gc: 0 mem: 480  ; the times are in msec
  5000000050000000
Or perhaps a "higher-order function":

  arc> (time:sum idfn 1 100000000)
  time: 19889 cpu: 19908 gc: 0 mem: 1224
  5000000050000000
Or use a deforestation macro that I wrote, which is closest to your Python example:

  arc> (time:thunkify:reduce + (range 1 100000000))
  time: 17971 cpu: 17985 gc: 0 mem: 3592
  5000000050000000
Also, here's what you can get by dropping into Racket:

  arc> (time:$:let loop ((i 0) (n 0)) (if (> i 100000000) n (loop (+ i 1) (+ n i))))
  time: 402 cpu: 403 gc: 0 mem: 920
  5000000050000000
I suppose Python has an analogue of that--dropping into C, or at least loading C libraries. Which Racket can do too. Mmm.
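
As a sanity check on the number that every run in this subthread prints, the sum 1 + 2 + ... + N has the closed form N(N+1)/2, so no loop is needed to know the expected answer. A minimal Python 3 sketch:

```python
N = 100_000_000

# Gauss's closed form for 1 + 2 + ... + N; no loop needed.
closed_form = N * (N + 1) // 2
print(closed_form)

# Cross-check the formula against an explicit sum on a small range.
small = 1000
assert sum(range(small + 1)) == small * (small + 1) // 2
```

This says nothing about interpreter speed, of course; it only confirms that the benchmarks above all agree on the correct result.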


Yep! Not saying that it's impossible, just that some of the concepts that Lisp is based on make these things harder to pull off. The reason for this is that Lisp emphasizes flexibility, and the more flexible your runtime environment is, the more ways to do something, which makes it easier to have hard-to-maintain code (and harder to reason about performance, too). But it's not impossible to do with Lisp for sure.


Lisp macros also provide ways to take ugly, hand-optimized code and clean it up to provide a cleaner, friendlier interface. Like all macros, this comes at no runtime cost (since all macros have been fully expanded by this point).


Actually, it's not any harder to write maintainable code in Arc. In many ways, it's much easier, because you write your program as a set of parts which fit together like an arch.


Yep, and it surely depends upon the problem being solved too.


While I mostly agree with what you are saying I think that there is a place for metaprogramming and some programming languages implement metaprogramming very well.

Using metaprogramming should not make you "feel bad", however it should not be used excessively. Metaprogramming gives you a lot of power, it allows you to extend the language in any way you please. With this great power, of course, comes great responsibility but I would rather have the option to use this power than to not have it at all.

I would also argue that metaprogramming can enable your code to be more efficient, if the macros/templates you write are evaluated at compile-time. This is the approach that the Nimrod programming language (http://nimrod-code.org) takes and it works well. This also means that in most cases compilation will fail if the macro is incorrect instead of causing some silent errors at runtime.


There are some strong (I hope a decent number of people here would say "winning") counterarguments: Scala macros and implicits are incredibly useful. (I see lots of complaints about C++ template error messages, but I haven't been doing C++, so I can't make an informed comment.)

I think a better rule is if the tooling can't look at the source, intermediate compiler stages/ AST and bytecode/binary and tell you unambiguously where something came from, then you can characterize it as a "last resort"

And there's the counterargument that for statically typed languages with good type systems (scala, haskell, ocaml, F#), there's REPL prototyping/testing and code generated from template haskell or camlp4 has to type check, so you have multiple safety mechanisms


> The point here is not that metaprogramming is universally bad, it's that it should be used as a last resort ... and you should feel bad

> you could just refactor your code using static idioms

> you or someone else will kill some runtime optimizations of the JIT.

> syntax matters way less than people tend to think ... astonishment [sic] cute syntax

Many intelligent, design-savvy people would strongly disagree with you. Besides the obvious gains in the ability to create DSLs, there is a power and fluidity in metaprogramming that is not possible any other way.

Speed (as, say, provided by runtime optimizations from a JIT compiler) is not always critical. If it was, one probably should be using a language other than Ruby or Javascript; they're meant to be flexible, powerful languages, and their emphasis is not on pure speed.

Syntax is not "cute." Humans are going to read and use this code, and the more it reads like English, the more likely it is to be understood.


Unsure why you [sic]'d astonishment; I was referring to https://en.wikipedia.org/wiki/Principle_of_least_astonishmen..., a well-accepted principle of UX which certainly applies to the UX of languages and APIs.

I am unsure what you mean by the "obvious gains" you get from DSLs. I see many DSLs as a code smell -- that the runtime environment is almost expressive enough to express the syntax construct the DSL creator wants, but not quite.

The implementation of DSLs tends to judiciously use closures, operator overloading, dynamic getters and proxies such that it's non-obvious what's going on under the hood. Sometimes that trade-off is worth it. Usually it isn't, in my experience.

I believe (I could be wrong) that DSLs were coined around 2004. I don't think that is a long enough history for us to even think about simply accepting that they are a good idea in and of themselves -- garbage collection was invented in 1959 and is still being debated!

I see the speed argument all over the internet being presented as a binary argument: either you care about speed or you don't. That's simply false -- I care about orders of magnitude for speed. I personally don't care if I can encode h264 faster in JS than I can another way, but I do care that my event handlers execute within 16ms so I don't drop frames in the browser.

Some syntax is certainly cute. Additionally, making syntax readable like English should be a non-goal IMO; for many people, Lisp is far more readable than SQL, which is much closer to English.


2004 seems really late for the inception (or possibly formalisation) of Domain Specific Languages. I assume that you're talking specifically about embedded (also called internal) DSLs; otherwise SQL at least is from 1986.

As an earlier example of an Embedded DSL, the book PAIP[1] included prolog embedded in common lisp in 1992.

It could be argued that the loop macro in Common Lisp is a DSL for describing iteration. If not loop, then certainly regular expressions are a common EDSL for describing a regular language, and performing operations with those languages against strings?

I found this snippet from Computers in Crisis [2] which seems to describe domain-specific languages in familiar terms from 1975

    Most domain-specific programming languages can be categorized in one
    of two ways; either as a "sugared" general ... of programming, in
    fact the style of problem solving, embedded in and supported by that
    language remains unchanged.
[1] http://www.norvig.com/paip/README.html

[2] http://books.google.co.nz/books?id=QndQAAAAMAAJ&q=%22Embedde...


Yes, you're right, I did mean embedded DSL.


Smalltalk has been doing embedded DSL's since its creation in the 70's; they are not a new phenomenon.


You seem to be suggesting that DSLs are universally accepted as a pinnacle of good design.


Indeed. I, for one, think that over-proliferation of DSL's can be a Bad Thing. To use one example from my own experience: I spend a lot of time coding in a Groovy / Grails environment. Now, I like Grails and Groovy, but Grails introduces its own weird DSLs for things like:

configuring Spring Beans, configuring Log4J, defining URL mappings, configuring Hibernate mappings & queries, etc.

Now some of those are cool, but at least some of them replace other techniques that I already know, and find much more readable and understandable. Configuring Log4J, for example. I'd much rather simply jam a log4j.xml in there and forget about the Grails DSL.

The thing is, none of these DSL's is, in and of itself, necessarily bad in any way... but there's a bit of "cognitive overload" in having to deal with 3 or 4 or 5 new DSLs, on top of the base Groovy stuff. The flip side is, these DSLs mean that your configuration is done mostly in .groovy files and you can use normal groovy syntax for looping and accessing variables, etc. So it's easier to code up more dynamic configurations than if you were using XML files or .ini files.

Anyway, the point of this rant is not to say "DSLs are bad" but just to lend weight to the suggestion that they aren't universally Good either.


Not knowing anything about Grails, does the Log4j DSL not do the equivalent of generating the XML config, and if not, would you have preferred a system where you could choose to write your own config or let the DSL do it for you?


> I’m sure by this point, a few of you have thought “Hey, but Rails fails the Grep Test!”. This is absolutely true, most notable due to the dynamic find_by_ finders, and the dynamic _path URL path generators.

OK, sure, but after you've grokked the first "find_by" usecase...do you really need documentation for all the other kinds of "find_by"'s that you'll use?

In any case, I'd agree that meta-programming is too often abused, but the proposed grep test is far too strict. And, inability to create complete documentation for every single kind of method token is not really the main reason to avoid meta-programming...I'd say performance and the propensity for abuse are better motives.

This pull request re: removing most of Rails' dynamic finders in 4.x covers the topic nicely:

https://github.com/rails/rails/pull/5639
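
For anyone who hasn't seen how this kind of finder works, here is a toy sketch of the mechanism in Python, using `__getattr__` as a stand-in for Ruby's method_missing. The `Users` class and its rows are invented, and this is not how ActiveRecord is actually implemented:

```python
class Users:
    """Toy in-memory table; the rows and column names are invented."""

    def __init__(self, rows):
        self.rows = rows

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        prefix = "find_by_"
        if name.startswith(prefix):
            column = name[len(prefix):]
            return lambda value: [r for r in self.rows if r.get(column) == value]
        raise AttributeError(name)


users = Users([
    {"name": "ada", "city": "London"},
    {"name": "grace", "city": "New York"},
])

# Neither `find_by_name` nor `find_by_city` appears in a `def` anywhere.
print(users.find_by_name("ada"))
print(users.find_by_city("New York"))
```

Once you know the `find_by_` prefix, grepping for the prefix finds the handler, which is more or less the "grok it once" argument above.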


> In any case, I'd agree that meta-programming is too often abused, but the proposed grep test is far too strict.

I agree. In a CMS I'm coding, the core MVC-like mark up tags work by using function tables in Go. Such code would fail the grep test - but it makes a lot of sense in the project and it is actually very readable as well. In fact, in that particular use case, not only is the code more readable, but it's also more efficient.

But this is the age old problem with having rules in languages (both human and programming) - more often than not, there are exceptions that are perfectly legitimate use cases.
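
To make the function-table idea concrete, here is a small sketch in Python rather than Go. The tag names and handlers are invented for illustration and are not taken from the CMS described above:

```python
def render_bold(text):
    return f"<b>{text}</b>"


def render_italic(text):
    return f"<i>{text}</i>"


# The table maps tag names to handlers. Grepping for a handler name finds
# its definition and this table, but no call site spells the name out.
TAG_HANDLERS = {
    "bold": render_bold,
    "italic": render_italic,
}


def render(tag, text):
    """Look the handler up by tag name and call it."""
    try:
        handler = TAG_HANDLERS[tag]
    except KeyError:
        raise ValueError(f"unknown tag: {tag}") from None
    return handler(text)


print(render("bold", "hello"))
```

Whether this counts as failing the test is debatable: the definitions are greppable, only the calls are indirect.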


Yes, grepping for a function and not finding it anywhere in the codebase is very annoying. I've had to maintain this sort of thing before. It's like in bash saying `${FOO_$BAR}` (if such a thing is possible). It gives me shudders.

I also agree that Rails gets a pass. It's a little different when you have a stable (well, sort of :-) API with dozens of books and thousands of blog posts. It was annoying to learn what was going on at first, but that knowledge is more long-lasting so worth a bit more pain.


Agreed, though I would say that if you take away the well known Rails idioms that fail the test, there are still some pretty egregious violations.

For example, try to step through the code that figures out which validation callbacks to trigger for an ActiveRecord model. You'll be led through at least 3 dynamically eval'd methods.

I had to do this recently and probably would have given up had there not been Foo.method(:bar).source_location to point me in the right direction.
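
Python has a rough equivalent of that source_location lifeline: every function's code object records the file and line of the `def` that produced it. A sketch, with `add_validators` and `Account` as made-up names:

```python
def add_validators(cls, fields):
    """Attach a `validate_<field>` method to `cls` for each field name."""
    for field in fields:
        def validator(self, _field=field):
            return getattr(self, _field) is not None
        setattr(cls, f"validate_{field}", validator)


class Account:
    def __init__(self, email):
        self.email = email


add_validators(Account, ["email"])

# `validate_email` cannot be grepped, but its code object remembers where
# the generating `def` lives and what it was originally called.
code = Account.validate_email.__code__
print(code.co_filename, code.co_firstlineno, code.co_name)
```

The reported name is `validator`, not `validate_email`, so you still have to read the generator to see how the final name was built.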


Agreed; although the only thing worse than not finding it is finding it all over the place. I've worked with Python code where the developer liked to use member names like "type" and "id" -- my 'ag' output is useless in that case.

I'm still trying to figure out how Python projects scale to any appreciable size; I suspect maybe they don't. I've worked on several million-line codebases in statically-typed languages, are there any truly large Python projects?


The examples shown in the post aren't wrong because they fail the grep test, they are wrong because they A) capture an uninteresting abstraction that is not likely to see much usage as the program gets more complex, B) mix up domain code with tooling code that if at all gets developed should end up being part of a framework, plugin, base class etc.

The general underlying problem is that a lot of people stick with those "principles" (which are rules of thumb, often weak ones at that) they have read somewhere without developing a real understanding of software design. Slogans like DRY or SRP seem to blind people to often simple foreseeable consequences of their design decisions.

Software design boils down roughly to two abilities: one is imagining a great many ways of structuring a program, the other is the understanding of practical consequences of a given program structure. So a good software designer must ask him- or herself: why is code repetition a bad thing? And the answer is that there is absolutely _nothing_ wrong with repetition in itself. What is wrong is that a logically or structurally common aspect of the problem didn't receive recognition in the form of an abstraction (a method, class, variable, interface ...). Then you waste time by not being able to reuse this abstraction, and when something about this common aspect changes you are forced to go through fifty places and modify the code. But the worst thing is that there is a limit to the amount of detail a programmer can keep in his or her head, and the less you are able to structure your program, the more restrictive the limit on the problem complexity you will be able to tackle. That's the reason software design is important at all.

There are cases, however, when there is no real logical common denominator to two pieces of code and they are similar practically by accident. It is just as wrong to abstract an accidental similarity as it is not to abstract an existing one; soon the requirements change and the abstraction will have to be abandoned, code copied and developed in divergent ways.

Finally, people start attempting metaprogramming way before they have learnt enough about structuring programs using basic means, like breaking things down into classes and methods appropriately, data-driving your programs etc. Yes, this is actually a skill, and unless you rewrite some of your own programs 5-10 times and compare their structure you won't learn it. I also recommend reading some classical books: SICP, Refactoring, Refactoring To Patterns, Effective Java, Programming Pearls.

And the ultimate lessons come from maintaining your own code for a few years.


> The examples shown in the post aren't wrong because they fail the grep test, they are wrong because...

I read it as him saying "hey I found this easily identifiable test which seems to correlate with code that's possibly too clever for its own good".

I also believe this test is worth thinking about because I tend to use grep/Find in Files a lot on my own code. It's often just simpler and faster and equally effective to the sophisticated IDE code comprehension stuff.

> a lot of people stick with those "principles" (which are rules of thumb, often weak ones at that) they have read somewhere without developing a real understanding of software design

We all have to start somewhere. I don't think it's possible to get "a real understanding of software design" without making some mistakes along the way. At best, we can help newcomers avoid the most painful ones and wasting too much time on dead ends.

But (IMHO) all good programmers learn by failure (their own as well as others).


So you're saying that code duplication is only (usually) a symptom of failing to factor out a relevant abstraction, rather than something to be fought in itself? On the first reading, it looks like you say "there's nothing wrong with code repetition" and then immediately list all the things wrong with it; instinctively, "a logically or structurally common aspect of the problem didn't receive the recognition in the form of an abstraction" is synonymous with code repetition.


Talking about code repetition is focusing on the wrong thing. Consider how metaprogramming is used in the article: it mechanically generates the same code that previously was simply in place in the class. It doesn't make anything simpler to understand, it doesn't allow you to modify common logic in one place, and in the future it is likely those common "behaviours" will diverge anyway because the similarity is just very weak and superficial. I would instead advise striving to extract all possible interesting abstractions, and if you pushed yourself really hard to do it and there is some repeated code left at the end, it might just be fine to leave it as it is - the cost of removing duplication might outweigh the benefit.

In the example posted, position and direction are both vectors; recognizing them as such would provide a much better way of checking for "zeroness" and for negation. It's a nice educational aspect of this toy problem that the author seems not to have noticed himself. There is often a similar elegant way out of real-world complex problems that doesn't involve fancy meta-programming but just good abstractions, putting the right data in the right place etc.
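
To illustrate the vector point, here is a minimal Python sketch. The `Vector` and `Entity` names and the two-dimensional layout are my own assumptions, since the article's example does not define them:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    x: float
    y: float

    def is_zero(self):
        return self.x == 0 and self.y == 0

    def __neg__(self):
        return Vector(-self.x, -self.y)


@dataclass
class Entity:
    position: Vector
    direction: Vector


entity = Entity(position=Vector(3, 4), direction=Vector(0, 0))
print(-entity.position, entity.direction.is_zero())
```

Zero checks and negation are written once, on the type that owns them, and nothing is generated from strings, so `is_zero` and `__neg__` are both trivially greppable.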


> The examples shown in the post aren't wrong because they fail the grep test, they are wrong because they A) capture an uninteresting abstraction that is not likely to see much usage as the program gets more complex

My understanding is the point of doing the grep test is to avoid capturing uninteresting abstractions not likely to see much usage as the program gets more complex. If you are a programmer who sometimes loses sight of the program's real goal because you're worrying about code duplication, the grep test might be useful for you.


No, this is wrong, wrong, wrong.

Ok, it's actually right under one false assumption: that we are limited to using our existing text-based tools like text editors, grep and so on. As long as we use these existing tools, this advice is quite valid.

However, we are absolutely not limited to existing tools, we can and should make new ones that augment how we work and allow us to reach more dryness without compromises. You could have tools that let you work with ASTs, that expand code on the fly, show you higher level information, let you debug and step through sections of code effortlessly.

In the end, dryness is good, because it means you have less repetitive work to do when things change (and software is all about changing, unless you don't want to make progress). So it's very well worth investing into being able to achieve dryness more naturally without the downsides.

To quote Bret Victor, stop "blindly manipulating symbols." (You don't have to do so overnight, just as long as it's a goal you work on achieving as time goes on.)


Yes, it's right, right, right. Because your clever AST-savvy tools will fail in the presence of metaprogramming, unless you restrict it to patterns the tool knows. And then you're halfway towards static typing with inference.

Yes, grep is primitive for this task. We have much better tools for static languages, and they are great. It will be great to get them for dynamic languages. But they will fail in pretty much exactly these cases where this primitive test fails. Because halting problem.
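
Both sides of this exchange show up in a small experiment. The sketch below uses Python's standard `ast` module as a slightly smarter grep: it finds names from `def` statements and from `setattr` calls with a literal string, and merely counts the ones it cannot resolve. The `SOURCE` snippet and `defined_names` are invented for illustration:

```python
import ast

SOURCE = '''
class Ship:
    def get_position(self):
        return self.position

setattr(Ship, "get_direction", lambda self: self.direction)

for attr in ("speed",):
    setattr(Ship, "get_" + attr, lambda self, a=attr: getattr(self, a))
'''


def defined_names(source):
    """Collect names from `def` statements and from setattr calls whose
    name argument is a string literal; count computed names separately."""
    found, unresolved = set(), 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            found.add(node.name)
        elif (isinstance(node, ast.Call)
              and isinstance(node.func, ast.Name)
              and node.func.id == "setattr"
              and len(node.args) >= 2):
            name_arg = node.args[1]
            if isinstance(name_arg, ast.Constant) and isinstance(name_arg.value, str):
                found.add(name_arg.value)
            else:
                unresolved += 1
    return found, unresolved


names, unresolved = defined_names(SOURCE)
print(sorted(names), unresolved)
```

A cleverer tool could fold the constant in this particular loop, but the general case, where the name comes from arbitrary runtime data, is exactly where it has to give up.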


If your metaprogramming is happening at compile time, and your AST savvy tools are savvy enough to work on the final (all macros evaluated) form of the code, then I don't see why they should fail?


Smuglispweeny? Unfortunately, a lot of metaprogramming is done in languages that don't even have a "compile time" never mind providing a hook to metaprogram in it.


If you wait until the final form of the code, it would seem very difficult to trace back from function symbol to its generating/binding macro. That is, how do you know which macro bound a specific function? Hope for no ambiguities, I suppose. Maybe it's not a problem, but I'd think the main point of such a grep tool is to point back to the originating code, right? Merely detecting that a specific function symbol is bound is not quite the same.


compile-time metaprogramming is far more tractable, yes.


No, actually, dynamic languages can, theoretically, handle this. At least, I can introspect in a running Lisp instance quite well. By all reports Smalltalk does it quite well too.

The fact that most dynamic languages were hacked together doesn't imply that the better systems also fail.


No, actually, dynamic languages cannot handle this even theoretically. See halting problem: No realistic amount of introspection is going to automatically tell you where negative_position is defined in the second example from TFA, because method_missing can do literally anything to decide what method names it does and does not handle. And Smalltalk's #doesNotUnderstand: works exactly like that.


You are confusing dynamic typing vs static typing with run-time execution vs compile-time execution.

A statically typed language can do arbitrary stuff at run-time too that will not be amenable to static analysis. This has nothing to do with the type system.

Furthermore, a dynamic language can be perfectly amenable to static analysis. For instance, in lisp+slime, slime-who-calls can identify callers of a particular function, even when grep would not, e.g. if the function was invoked via a turing-complete macro. The key thing here is that the AST to be analyzed does exist at compile-time; it just happens to be generated rather than hand-coded.


In practice, you can add a sprinkle of static to your dynamic language, adding a preprocessor that requires declaration of method_missing patterns that are acceptable to the application.
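
A runtime stand-in for that idea, sketched in Python: subclasses have to declare which dynamic names they accept, and anything else is rejected. This is an assumption-laden illustration (the `DeclaredDynamic` base class and `dynamic_patterns` attribute are hypothetical), not a real preprocessor:

```python
import re


class DeclaredDynamic:
    """Base class: subclasses must list the dynamic names they accept."""

    # Regexes for acceptable dynamic method names; empty by default.
    dynamic_patterns = ()

    def __getattr__(self, name):
        for pattern in self.dynamic_patterns:
            if re.fullmatch(pattern, name):
                return self.handle_dynamic(name)
        raise AttributeError(f"{name!r} matches no declared pattern")

    def handle_dynamic(self, name):
        raise NotImplementedError


class Ship(DeclaredDynamic):
    dynamic_patterns = (r"negative_(position|direction)",)

    def __init__(self):
        self.position = 5
        self.direction = -1

    def handle_dynamic(self, name):
        attr = name[len("negative_"):]
        return lambda: -getattr(self, attr)


ship = Ship()
print(ship.negative_position(), ship.negative_direction())
```

Because the patterns are plain literals on the class, a tool (or grep) can read them without running anything, which is the "sprinkle of static" being proposed.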


Theoretically, for constrained cases yes.

On the other hand, the typical Ruby ORM defines methods that corresponds to the current columns of tables in the database it is told to connect to. That fancy AST parsing tool won't be able to do anything in that case without being able to connect to your database and issue queries...

You might consider this "hacked together" but those kinds of APIs are a major part of the draw of Ruby for a lot of people.
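
The database-driven case is easy to reproduce in miniature. This Python sketch builds finder methods from whatever columns an SQLite table currently has; `model_from_table` and the `users` schema are invented, and no real ORM is being quoted here:

```python
import sqlite3


def model_from_table(conn, table):
    """Build a class whose `find_by_<column>` methods mirror the table's
    current columns. `table` is trusted here; never pass user input."""
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

    def make_finder(column):
        def finder(self, value):
            query = f"SELECT * FROM {table} WHERE {column} = ?"
            return conn.execute(query, (value,)).fetchall()
        return finder

    methods = {f"find_by_{c}": make_finder(c) for c in columns}
    return type(table.capitalize(), (), methods)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('ada@example.com')")

Users = model_from_table(conn, "users")
print(Users().find_by_email("ada@example.com"))
```

The set of methods is decided by the schema at connection time, so no amount of source analysis can list them without talking to the database.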


> clever AST savvy tools will fail in the presence of metaprogramming, unless you restrict it to patterns the tool knows. And then you're halfway towards static typing with inference.

Common Lisp has plenty of well-defined macros with simple patterns any tool could learn, but it's definitely dynamically typed. (defstruct foo ..) generates foo-p, make-foo, copy-foo; what parser couldn't understand that? All it takes is a few good macros to really make the DRY work. Either give up like the author and repeat yourself or move past the problem with a macro.

Patterns mean "I have run out of language." — Rich Hickey
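
The defstruct point has a loose analogue outside Lisp. Python's `collections.namedtuple` also generates helpers whose names follow a fixed, documented pattern from a single declaration. It is a library function rather than a macro, so treat this only as an illustration of "patterns a tool could learn":

```python
from collections import namedtuple

# One declaration; the helper names below follow a fixed, documented pattern.
Foo = namedtuple("Foo", ["x", "y"])

foo = Foo._make([1, 2])        # build from an iterable
moved = foo._replace(x=10)     # copy with one field changed
print(foo, moved, foo._asdict())
```

A tool that knows about `namedtuple` can predict `_make`, `_replace` and `_asdict` for every such type, but it knows nothing about generators you write yourself, which is the objection in the reply below.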


> Common Lisp has plenty of well-defined macros with simple patterns any tool could learn

If the tool has to learn (i.e. be written to know about, yes?) these patterns, then it will not know about those you wrote.

However, as someone else has pointed out, the compile-time execution of macros actually does make them accessible to a sufficiently sophisticated "grep", the kind of tool shurcooL asked for.


Yes, it's good and important to keep code DRY. The point here is that we often come up with clever brute force means of keeping code DRY. In most cases where you're generating method names through string concatenation (to use an example from the article), you could achieve DRY code by refactoring.


Totally agree. Demand better. Have your metaprogramming and be able to know where 'getPosition' was dynamically generated.


So... does that mean that it's right? Because I don't think you can prove a point by assuming the existence of things that haven't been invented yet.


Yes yes yes. It's important to have code standards that enable code to be easy to understand and easy to navigate.

I work in C++ all day and loathe when functions are fully implemented in the class declaration. Keeping them separate with full ClassName::FunctionName(...) scoping makes finding functions exceedingly simple and friendly. Keeping implementations separate means the full class declaration is easy to read, parse, and understand. I don't want to scroll through hundreds of lines of code just to see what functions are available.


Aren't there circumstances in C++ when this isn't possible? e.g. some uses of templates


No, you can implement methods outside the class declaration even with templates. It just has to be in the header file rather than the cpp.


I normally do C++, but started doing some TypeScript (JavaScript) the other day. Like Java, this doesn't make external implementation easy.

I found myself using a pattern of declaring a one-off interface for complex classes and then implementing it in the real class. I'm still undecided when/if this is a useful thing to do.


Sorry, don't buy it.

A published & concise interface wins over an undocumented redundant interface. Explicit-only just takes you down the path towards Java and COBOL.


While I agree with this idealistically and wish that all the code I've come across adhered to published interfaces, unfortunately that hasn't been the case.

Published interfaces may also not be possible in certain code bases or frameworks -- think of a Chef recipe, for instance, where most (Ruby) code is written to interact directly with the system rather than provide a service to other parts of the system. Dynamically generating variable names and key/value pairs in Chef recipes is a common pattern I've seen, and it makes grepping and isolating issues difficult if not impossible.


That's sufficient when the code is flawless and you never have to look at it. I.e. pretty much never.

The rest of the time it is still a huge benefit to be able to easily navigate the source. And frankly, a lot of the time the source is a more concise source of truth than decently written documentation.


Also, a working and redundant interface wins over a buggy, broken, and concise interface.

But I don't think the OP argued against either documentation or fixing bugs, so I fear that both of our comments are irrelevant.


Well said. Too bad most coders are horrible at finalizing an interface.


Agreed. The answer to this is documentation; not dogma.


Ehhhhhhhh not really. If you're using metaprogramming or abstract programming of any kind, it's usually for a good reason, and DRY in itself is worth simply not being able to find every single reference to your metaprogrammed item in your codebase.

It's an artefact of the design. You simply need to know how the abstraction is done, and then you look for invocations of that more abstract, general form. It's not that difficult, and if it's being done, it is almost without exception a better way to do the thing.

If metaprogramming is not a better way to do the thing being done, and adds confusion and decreases reusability, then you shouldn't do it, but that's a tautology and doesn't mean we have to be able to grep for everything we ever write.


Honestly, in my professional opinion probably 60% of the usages of metaprogramming seen in the wild fail the "for a good reason" test pretty badly. Most of the time it's just the author being clever to save a few lines.

But I don't see why metaprogramming should inherently fail the grep test anyway. Sure, your property/thingy/whatever (an access method for a field automatically generated from a database, say) might be generic and generated at runtime, but its name should still be written down somewhere authoritative, even if it's in a schema or (if nothing else) a documentation file. That stuff should still be in your tree and greppable, and if it's not your code has discoverability problems.
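A minimal sketch of that idea in Python, assuming a hypothetical `Ray` class and a hand-written `ACCESSORS` table (neither comes from the article): the accessors are still generated in a loop, but every generated name is written out in full somewhere in the tree, so a project-wide grep for `get_position` lands on the table.

```python
# Hypothetical example: the class and the table are illustrative only.
# Each generated method name appears literally, so grep can find it.
ACCESSORS = {
    "get_position": "position",
    "get_direction": "direction",
}


class Ray:
    def __init__(self, position, direction):
        self.position = position
        self.direction = direction


def _make_getter(attr):
    # Bind attr per call so each getter reads its own attribute.
    def getter(self):
        return getattr(self, attr)
    return getter


# The methods are still attached dynamically; only the names are explicit.
for method_name, attr in ACCESSORS.items():
    setattr(Ray, method_name, _make_getter(attr))

ray = Ray((0, 0), (1, 0))
print(ray.get_position())  # (0, 0)
```

The cost is one line of duplication per accessor; the table could just as well live in a schema or documentation file, as long as it is checked in.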


This is a standard case of what I like to call a tradeoff.

Making code that passes the grep test allows many more programmers who are vaguely familiar with the codebase to make changes.

Making code that fails the grep test allows a team of a few highly skilled developers who know the codebase inside and out to do the work of hundreds.

It's like mathematical notation, you generally need experience in that sub-branch of mathematics to understand the notation.


I think "do the work of hundreds" is a little bit generous. It does save a little time, but does it really save so much time as to boost productivity 100x?


Most teams eventually need to onboard new people, though.

Besides that, building teams of generalists is far easier if it's easy to dive into new code bases. Even with small teams highly specialized on individual projects, there are lots of advantages to having people be comfortable digging into other teams' code, which is much easier if it's more discoverable.


This is the second time this week I've seen "counterexample" used to mean "something that is bad because it breaks the rule, thus showing why the rule is good" rather than "something that invalidates the rule, thus showing that the rule is bad". Is this a real trend in usage or have I just been unlucky recently?


I agree that this usage isn't really in line with how it's used in mathematical proofs. Do you have a better concise suggestion I can keep in mind for next time?


anti-pattern?


One thing that can often help with this is to have a good REPL. With it you can dive into and explore the code as it is at runtime. For example, a common way that I figured out Rails was to open the console and use obj.methods to see what methods are available.
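The Python counterpart of that kind of exploration is `dir()` or the standard `inspect` module. A rough sketch, using a made-up `Account` class, of listing the public methods an object actually has at runtime, including one that was attached dynamically and that grep would miss:

```python
import inspect


class Account:
    def deposit(self, amount):
        return amount


# Attach a method at runtime; grepping for "def withdraw" finds nothing.
setattr(Account, "withdraw", lambda self, amount: -amount)

acct = Account()

# Collect every callable member, then drop private and dunder names.
public_methods = sorted(
    name
    for name, _ in inspect.getmembers(acct, callable)
    if not name.startswith("_")
)
print(public_methods)  # ['deposit', 'withdraw']
```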

Other languages in which I've enjoyed REPLs include Python, JavaScript and Haskell. Two languages I have not found a good REPL for are Java and C++. Is anything of this sort available there?


Yes, there is Cling for C++. It's really awesome. http://blog.coldflake.com/posts/2012-08-09-On-the-fly-C%2B%2...


Java and C++ are statically typed, are typically very readable, and have less need for a REPL, not that a REPL won't help.


If you don't mind teaching yourself a bit of Clojure, you can use it as a REPL for Java with little difficulty.


If you think this is good, wait until you find out about statically typed languages and IDEs with code navigation features.


No, you can do this sort of madness in C too via function pointer tables. Give the function pointer field in your struct a commonly-grep-spammed name like "close" and watch your readers go mad trying to figure out what xxx->close() does.


Virtual tables are not quite the same thing.

You can grep for every instantiation of the struct in the code-base.

If anything, macros are more problematic:

  #define VTABLE_INIT(prefix) \
    (struct vtable){ .add = &prefix##_add, .sub = &prefix##_sub /* , and so forth */ }


I've done a bunch of work with statically typed languages and am not about to argue that dynamic languages can be as easily navigated or statically analyzed by IDEs as statically typed ones can.

That said, dynamic languages still have a great deal of value, and a significant portion of the programming population uses them, so I still think it's worthwhile to set down some useful rules of thumb.


> dynamic languages still have a great deal of value

I used to think this too.


If you think that is good, wait until you find out that Lisp has similar code navigation features under SLIME and yet is a dynamically typed language.


An important distinction is between mature (or matur-ish) library code and your consuming code base:

  - library code publishes an interface and can do what it wants
  - your client/application code should be greppable
There isn't a sharp distinction between the two, but it's a pretty big headache if developers are using heavy metaprogramming everywhere in your project.


Agreed. I suppose the more important test for library code is "googleability" rather than "greppability".


I must be pedantic and separate the definitions of "metaprogramming" and "dynamic programming"; this is the latter.

The reason this difference is important is that metaprogramming is structured and explicitly defines the change in the program's semantics.

Thankfully I have never seen code like what was posted in my experience with Python so far. Explicit is better than implicit, a test the code example in this post would fail IMO. Dynamically dispatching to names that don't exist in a class' published interface at call-time is a big no-no in my book.

Although practicality beats purity.

I haven't seen it yet, but that doesn't mean there isn't a practical reason to use this method of dynamic dispatch. If there were one and it gets a problem solved NOW rather than waiting to find a better solution -- it might be worthwhile.

However it's a price you have to pay.

I think the grep-test is at least a good way to test the waters with a bit of code. I don't think it's a universal end-all-discussions rule.


Dynamic programming already has a meaning: https://en.wikipedia.org/wiki/Dynamic_programming


Right, thanks for pointing that out. That makes this what, then? Hash-blob programming? :)


Sorry to be pedantic, but this is not dynamic programming.


No need to be sorry


I totally agree with this. It's one of my pet peeves... dynamically generating method names so you can't actually search for them. Personally, I think code readability is one of the greatest goods in programming, and in general I would tend to place it above DRY as a priority (within reason)... particularly for mature, large, complex projects that require large teams (meaning lots of new people). And for the most part that's what I'm always hoping I'm building.


I don't necessarily disagree with the article, but maybe the solution is a new kind of search. One which understands languages, statically analyzes a source tree, and searches the ASTs for the tokens you're interested in.

This would also have the side benefit of being able to produce better designed search results.

EDIT: I suppose searching the AST wouldn't be enough, you would have to evaluate the code to some extent to be able to search these properties.


Yeah. This is my biggest gripe working with dynamically-typed languages, especially if the project is at all large. If I have SublimeText or Vim, I'll be equally productive in Python as I'd be in Java. But give me Eclipse/IntelliJ, and I can blow the pants off myself in terms of efficiency working in Java compared to working in Python with any Python IDE (that I'm aware of). There are just so few good tools for dynamic languages compared to static languages.

The situation is uniquely bad in JavaScript, since JavaScript doesn't formally have modules, functions can take any number of arguments, there's the funky scoping, and so forth. Not that good tools don't exist, but it makes it difficult for a JS IDE to enable working across files like a Java IDE can.


Agreed, maybe that's why fewer tools (if any) exist to do things like programmatic refactoring in JavaScript (does WebStorm do that at all?)

That being said, I've found the language a pleasure to work with (aside from browser incompatibility issues) and with the proper application of design principles such as modularity (check out require.js and the r.js optimizer) and, yes, DRY, I can be just as productive, if not more so, in JavaScript as I am in any statically-typed language.


Which statically-typed languages have you tried?


I've got varying degrees of experience (some professional, some not) with C++, C, Objective-C, Java, ActionScript 3.

EDIT: Also C# (so many languages, I can't even remember which ones I've used sometimes). I'm also not including TypeScript or Dart (optional static typing), which I've technically "tried", but don't know much about.


The ML-style statically typed languages are very different from all of these (in good ways).

For example, in Haskell you can write code as if you're in a dynamic language without annotating all the types and yet you get all of the benefits of types. It's the best of both worlds.

Concluding that static languages suck after using Java, C++, et al. is a common mistake, and is understandable.


Well, I never said static languages suck :) I really like static typing actually, and have seriously thought about using TypeScript or Dart. I've also got years of professional experience with C++ and AS3 and I love them both. I think Java is a fantastic language. And Objective-C...is growing on me ;)

What I did say was that I can be just as productive in those languages as I am in JavaScript, if not more so. It remains to be seen whether I can gain more productivity with TypeScript or Dart.


When working with dynamic code, it’s an incredible boon to productivity to be able to quickly locate the definition of functions so you can build a complete mental context about what’s going on.

Imagine the 'incredible boon to productivity' you could get from tools (text editors, IDEs) that can deterministically show you all references or definitions of any method, variable, or class throughout your codebase whenever you want. (Hint: this isn't a dream--this is a huge plus of working with statically typed languages.)

For all of the pop-trendy love that dynamic languages seem to get for 'being fast for development,' (read: hacking) it's tragic how much they slow you down when you need to start hunting down where the hell something was magically (or meta-) declared or changed, especially when you're working with someone else's code (read: real life.)

The grep test seems like a great approach if you're stuck with a dynamic language. Of course, we don't always have the luxury of choosing the technologies or platforms that we work with, but if you've got the choice, a statically typed language solves this problem out of the box.


Sorry, my Common Lisp system does this quite well. Other dynlangs should be able to do it too....


Doesn't CL also fail the grep test? You can generate function names on-the-fly, can't you?


Separate compile-time from run-time....

(DESCRIBE #'functionname) from the REPL will give you information about the function; you can also pick up source if you configure it right.

The key idea is that software as written only loosely defines software images as they are live. 'Static' languages attempt to ensure that there is a tight correspondence, but dynamic linking defeats that in part.

Image-based software takes this idea and runs with it: that's why you download Smalltalk images, not Smalltalk source.

Anyway, food for thought. :-)


I think this is a legitimate issue, though I'm not sure I really agree...there just seem to be too many places where it provides too much utility. That said, I tend to change my opinion on the topic fairly frequently, based upon whether I was last writing something where dynamic function declaration was convenient or last looking for where a function was declared.

One thing I think the post might emphasize more thoroughly is that dynamic function invocation (at least when the function is defined in the project scope), is probably far more problematic than dynamic function declaration, particularly if sometimes the function is explicitly invoked, and sometimes dynamically. In this situation, I'll generally try to document that the function is dynamically invoked with the function declaration, but I'm always unsure what exact information I should put there. Simply saying something like "Dynamically invoked -- grep will not find all usages" is a minimum, but I often want to add more than just that.


Agreed. I do comment on dynamic invocation being more problematic when I say "As a broad generalization, I would say dynamic declaration is occasionally worth the tradeoff, but dynamic invocation is almost never worthwhile.", but it perhaps should've been a larger part of the thesis.


Another one of those things you never have to worry about if you use a statically typed language.


That's actually not true. For instance, the game Alpha Centauri generated custom dynamic functions in RAM at runtime to draw user created units.


This is a great way of disagreeing with a position. Polite and backed by a concise fact. Thanks.


How do you know that? I'm not saying you're wrong, I'm just curious. Did you work on the game? Are you a hardcore modder?


"When I first came onboard with Loki, the Alpha Centauri Planetary Pack was my first porting project. I didn't know what to expect from commercial game code, but I sure wasn't expecting what I found in the SMAC codebase. Tens of thousands of lines of assembly code was in SMAC, some of which was self-modifying. I spent most of my time looking at memory in the debugger and flipping bits."


This has nothing to do with static vs dynamic typing. In Java, for example, you can easily invoke a method named by a string that's not defined at compile time (using the reflection API).


But since you can't define methods using reflection, I'd say this isn't used much and isn't really a problem in Java. Are you really having trouble deciphering a Java function relying on home-made dynamic dispatch?

But yeah, I'd say it's more about http://en.wikipedia.org/wiki/Dynamic_dispatch than static/dynamic typing. You can have dynamic dispatch in a statically typed language. I don't know how OP would feel about virtual methods in C++ since he resists using an IDE.


> But since you can't define methods using reflection, I'd say this isn't used much and isn't really a problem in Java.

You can do about anything you want with cglib / Javassist. Dynamically subtyping or byte code re-writing a class is how many Java ORMs work.


Correct. You can also use dynamic proxies for metaprogramming, which are built right into the language.


That's a dynamically typed sub-language within Java.


Counterpoint: Macros. Though you could preprocess the code and grep that, it wouldn't work too well. Eclipse et al. can give you sort-of-works-most-of-the-time navigation features that work with the preprocessed code, which can overcome the macro problem, mostly.


Macros still can cause a lot of problems. See some of the crazy C++ macros or Template Haskell.


Statically typed languages also have worrisome features.

  #define readinto(obj, fd) (isatty(fd) ? readbuffered(obj, fd, findttystate()) : readfromfile(obj, fd))


In a statically typed language you have to worry about many other things though =)


My first impulse is this is stupid. It means your tools aren't smart enough. Use an IDE, use aigrep, some tool that knows what to look for.

A better title would be "Don't use a fastening device if it fails the hammer test."

Are you saying don't use a tool that generates SetFunkyColumnName from a funky_column_name data spec? Or "I can't find the click handler in the .html, quit using jQuery"?

I have been at the bottom of steep learning curves multiple times where I couldn't figure out where stuff was coming from. In most cases I got over it.

Use the appropriate tools that get the job done and make you and your team productive. Tools and ideas evolve at different rates. Not all tools or ideas are implemented properly the first time, and not all tools or ideas are necessarily good/useful. But we'll never get any further if we don't try...


[deleted]


Everything in every language means something funny in some other language.


I was more interested in what you would post under your "HighKarmaAccount" but sadly that user is not recognized.


The "grep test" doesn't just help humans and IDEs. Closure Compiler's ADVANCED_OPTIMIZATIONS mode (https://developers.google.com/closure/compiler/docs/api-tuto... ) only works with JS code that can pass the "grep test". Setting or getting properties using dynamically generated property names won't work (even if Closure Compiler could understand your intent, it would have to know at runtime which minified name the concatenated parts should result in).

So in this case, writing longer, "grep-friendly" JS code can actually reduce the size of the JS payload you serve to your users.


The big elephant in the room is that this isn't really about metaprogramming. It's about the failure of text as a means of manipulating computer programs (e.g. eval on strings, reflection, method_missing, __getattr__, etc). It would be as if we represented and manipulated numbers as text strings.

Much like languages have more successful ways of representing numbers, some languages are more successful at metaprogramming. In particular, Lisp takes the view that the act of metaprogramming is really the act of writing application-specific compiler extensions. The default behavior simply doesn't involve ripping apart strings and concatenating them back together.


There is always going to be code that takes more than grep and a glance to understand. Sometimes you won't fully grok it until you've stepped through it line by line, even without "tricks" like metaprogramming. And more often than not the "tricks" make it more concise, less prone to bugs, and easier to understand.

It's madness to classify all use of metaprogramming as abuse. Perhaps you need to pour a glass of wine and learn to savor the source code, and to appreciate the power of modern programming languages.


Or... we improve the tools, instead of limiting the expressiveness of languages we use just to make grep happy.

(See Yegge's grok project, unfortunately he doesn't blog anymore)


Can't say I agree with this. A decent IDE has no problem tracing usage of dynamically generated methods etc. I know the JetBrains stuff can for Python and Ruby.


I don't want to be dependent on a single IDE to write or maintain code. Besides the annoyance of not getting to use what I want, what happens when the IDE stops being supported, but you still have to maintain the code?


That's a common argument against IDEs, but I think it doesn't reflect reality. The most powerful IDEs like Visual Studio, Eclipse, XCode, and IntelliJ IDEA have all been around for more than a decade and won't be going anywhere. Of course, if you use some less popular product it might get abandoned, but the same can happen to less popular open source projects.

All the TextMate users now seem to be burned and always fear the software they use will die, so they stick to open source tools only. Those are okay for dynamic languages and smaller-scope projects, but I still find IDEs indispensable for large codebases and languages like C# or Java.


Not sure why TextMate users would feel burned. SublimeText is fully compatible with TextMate bundles and cross-platform to boot.


It can trace execution of methods that have never been defined prior to usage? Based on concatenation via name? Would LOVE to see a demo of this.


Why is this hard to believe? There are relatively well-established patterns for big libraries like Rails, and big libraries like Rails are also part of the reason people turn to such IDEs. Such cases could be pretty easily handled with some custom code. It doesn't have to be 100% generic and 100% perfect to be useful.


While RubyMine does an almost unbelievably good job of finding the declarations for dynamically generated methods, it is _far_ from perfect. I might even say it doesn't do it very well, except that what it does do is way more than I would expect it to be able to.


Why not use a system of (unified) comments interspersed in the code to demarcate various elements? A sort of index.

The idea is that programming languages allow too much flexibility in form and syntax for any indexing system to accommodate the full range of possibilities. And "coding style" rules are apparently too restrictive on creativity: they are too difficult to enforce. Hence a solution would do better to focus not on the code, but on the comments. Force programmers to adhere to a uniform commenting system.

The OP mentions ctags. It's not a perfect system, but it's still in the BSD base systems so everyone who has BSD has a copy of the needed programs. That's a start.

What about cxref? Another old system that's probably not perfect, but seems like it was aiming in the right direction.

I've never understood why programmers obsess about things like verbose function names (that make lines go way over 80 chars and bend across the page... with indentation it becomes almost unreadable to my eyes) instead of just providing an index of all functions and including the verbose information in the index, not the code. vi (and no doubt emacs too) allows you to jump around easily so you could look things up in an index quite quickly.

Why do Wikipedia pages have an index of footnotes and references at the bottom? Why not stuff all this information into the words in the body of the article? Why do books have indexes? Why do academic papers use footnotes? I don't know. But I'm accustomed to these conventions.

I also don't know why code uses verbose function names and generally lacks an index or footnotes. But I guess programmers have just become accustomed to these conventions.


"As a broad generalization, I would say dynamic declaration is occasionally worth the tradeoff, but dynamic invocation is almost never worthwhile." <-- huh? What does this even mean?

This entire post is silly.

1. Not being able to grep for code has nothing to do with being DRY.

2. Being able to grep for something doesn't make it correct. Nor does it make it less maintainable if you don't use named functions.

Moving on...


This is a little silly. My game engine dynamically invokes lambdas, and script files dynamically added to Actors at runtime with reflection, and such all the time, and it's well-documented and maintainable. I can't think of a reasonable way to implement this stuff that passes this grep test. Unity development fails it too I guess?


Things are even worse in the PHP world than in the languages described in the article. You get variable variables, and you can use variables as function/class names, so this is totally valid:

  // Named perform() here because do is a reserved word in PHP.
  function perform($class, $action, $param, $value) {
    $method = $action.$param;   // e.g. 'set'.'Name' => 'setName'
    $obj = new $class();
    $obj->$method($value);
    return $obj;
  }

  $my_name = 'Arnor';

  $$my_name = perform('Entity', 'set', 'Name', $my_name);

  // The entity is now stored in the variable $Arnor
Thanks PHP... Don't even get me started on __call.

It's fun to come up with clever solutions and puzzles, but it harms your code base. If you're proud of how new and clever your last 10 lines were, you probably need to refactor it.


The Grep Test is interesting, but it is too extreme for me. I have this thing I call the 'No Lie Principle.'

The idea is that when you look at code, the computations you see are the ones actually executed. Meta-programming may be used but only to add behavior to existing code, not to nullify it.

A good example is an 'execute around' method. The method you see in the code is executed, but some things may happen before it and some things may happen after it. What you can't do is replace the body with something else.
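One way to picture an 'execute around' wrapper that respects this principle, sketched in Python with hypothetical names (`execute_around`, `transfer`, and the `events` log are all illustrative): the decorator adds steps before and after, but the body you read in the source is still the body that runs.

```python
import functools

events = []  # illustrative log of what ran, in order


def execute_around(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        events.append(f"before {func.__name__}")
        try:
            # The original body always runs; it is never replaced.
            return func(*args, **kwargs)
        finally:
            events.append(f"after {func.__name__}")
    return wrapper


@execute_around
def transfer(amount):
    events.append(f"transfer {amount}")
    return amount


result = transfer(10)
print(events)  # ['before transfer', 'transfer 10', 'after transfer']
```

A wrapper that skipped the call to `func` and returned something else would be the "lie" the principle rules out.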

An interesting thing about the 'No Lie Principle' is that it aligns with good practice around inheritance also. It's better to override abstract methods than it is to override concrete ones for a number of reasons.


This is dumb. Most languages support some way of autoloading classes. If you don't understand that then you won't be grepping the right file. As such this is not a valid test for readability because understanding how classes are loaded is assumed.


I have no problem with autoloading classes (though I suppose me including "modules" in the list might've implied that). If a class is autoloaded, I can still find its declaration and implementation using grep.

As for "Not grepping the right file", I meant it should be findable in a project-wide grep, not in the file I assume it to be in.


It's a noble concept, but a bit weak. Really what you want is a static analysis tool that can keep up with your programmers.

For example, Guice fails the Grep Test hard, but it is incredibly helpful.

All data-driven code fails the Grep test. Your browser fails the Grep test (you can't grep for JavaScript content).

You just need to replace Grep with an xref tool that understands your programming language, including your metaprogramming. This may require you to commit your configuration files and standard data objects into your source control, or build an indexer that can read your CMS as well as your code.


what about functions passed as arguments? quite a basic technique and I don't think it passes this test.


functions passed as arguments are fine, because something needs to be passing them. While the function invocation site itself is dynamic, the source of the call isn't.

Basically the important aspect is that even if you have higher order functions, the things passing the functions to these higher order functions will still be greppable.


I have unfortunately created an annoying case of this, in which a bunch of thunks are put in a list and invoked en masse. If any one of them has a problem, it's a bit of a problem to sort out where it was created. This is exacerbated by the fact that functions do not know their own names in Lua.

  Lua 5.1.4  Copyright (C) 1994-2008 Lua.org, PUC-Rio
  > function f(g) g[1]() end
  > function thisFunctionCrashes() undefinedVariable() end
  > f({thisFunctionCrashes})
  stdin:1: attempt to call global 'undefinedVariable' (a nil value)
  stack traceback:
  	stdin:1: in function '?'
  	stdin:1: in function 'f'
  	stdin:1: in main chunk
  	[C]: ?
In the example, the source line is available, but in the genuine article the thunks are for native functions.
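This is neither Lua nor the setup described above, only a sketch in Python of one possible workaround for the "where was it created" problem: wrap each thunk at creation time and record the caller's file and line, so a failure can be traced back to its origin. `LabeledThunk` and `crashes` are hypothetical names.

```python
import traceback


class LabeledThunk:
    """Wraps a zero-argument callable and remembers where it was wrapped."""

    def __init__(self, func):
        self.func = func
        # With limit=2 the first entry is the caller of __init__,
        # i.e. the place where the thunk was created.
        frame = traceback.extract_stack(limit=2)[0]
        self.origin = f"{frame.filename}:{frame.lineno}"

    def __call__(self):
        try:
            return self.func()
        except Exception as exc:
            # Re-raise with the creation site attached to the message.
            raise RuntimeError(f"thunk created at {self.origin} failed") from exc


def crashes():
    raise ValueError("boom")


thunks = [LabeledThunk(lambda: 1), LabeledThunk(crashes)]

errors = []
for thunk in thunks:
    try:
        thunk()
    except RuntimeError as err:
        errors.append(str(err))

print(errors)  # one message naming the file and line of the failing thunk
```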


It has nothing to do with being dynamic, it is about names. The article insists that you need to be able to grep for the identifier, but you don't have the identifier. So you have to grep for the function you are in, find all the places it is called from, and then sort through all the functions passed into it as arguments. The function passed in as an argument is often an anonymous function, so it still won't have an identifier anyways. The entire concept is just plain nonsense.


Functions passed as arguments can be traced the same way you trace values. You find all invocations of the surrounding function and see what function arguments are being passed in.

It's more involved than just grepping for the name, but the same is true of tracing where any other argument value came from.


Come on.. Anonymous functions (lambdas) have a simple rule: use them when you need a function as an argument to a higher-order function (such as filter or map) right here, in this call, because giving it a name and, especially, placing it where you can't see its code is less concise. If it turns out a function should be called more than once, it deserves a name.

The practice of placing any general functionality in lambdas or blocks will make things more messy.


One should not get too dependent on just one tool such as grep, sacrificing some useful features of dynamic languages... In the JavaScript example (and possibly others), it is still possible to easily find where the method was defined: open a debugger, add a breakpoint on console.log(ray.getPosition()); and step into it. There are many other tools to help you; for example, many modern IDEs have powerful inspection features these days.


Now I have a test for the "too abstract" problem. Also, bonus points to the author for pointing out at the end that general practice != absolute truth.


This seems like a great case for solid comments in the code, as long as they're up-to-date and you don't have a huge number of permutations. I know we all like to crap on code comments here, but if you had say 5 possible permutations that this code supported, a programmer could easily throw a comment above it with each of the supported functions, comma-separated. Then grepping would indeed find them.
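
A minimal sketch of what that could look like in Ruby. The Ray class, its attributes, and the use of plain arrays as vectors are assumptions made for illustration, not the article's exact code:

```ruby
class Ray
  attr_reader :position, :direction

  def initialize(position, direction)
    @position = position
    @direction = direction
  end

  # Generated below, listed here so that grep can find them:
  #   position_is_zero?, direction_is_zero?
  %i[position direction].each do |attr|
    define_method("#{attr}_is_zero?") do
      # Assumes each attribute is an array of numbers.
      public_send(attr).all?(&:zero?)
    end
  end
end

ray = Ray.new([0, 0], [1, 0])
puts ray.position_is_zero?
puts ray.direction_is_zero?
```

Grepping for position_is_zero? now lands on the comment directly above the code that generates it. The cost is that the comment has to be kept in sync with the list by hand.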


So, basically, everyone who is capable of producing excellent code is barred from using a powerful technique, largely because some people can't understand the code they produce.

It's a difficult trade-off, and one I've had to make calls on several times. Do you use the full capabilities of incredibly gifted and talented programmers, and then allow code into your codebase that maintenance programmers can't understand?

Tricky.


I think the best solution is to have the better ones mentor the lesser ones. It takes time away from coding in the short-term, but it is a good investment in your people in the long-term. Metaprogramming is something that any good programmer can understand if they are just taught well, just like all the other difficult parts of programming that people routinely like to ban: pointers, goto, lambdas and functional programming, C macros, Lisp macros, etc. Using them correctly really helps a lot, using them incorrectly can hurt a lot.

The solution is to learn how to hit the nail with the hammer, not your thumb.


I think the grep test is inappropriate as a blind requirement for commit acceptance, but I do think it gets at an interesting issue. If your code fails the grep test, pay attention, make sure what you're doing makes sense, and make sure it's documented somewhere that's accessible to those new to the code base (in terms of work flow as well as technical access).


This assumes you don't write any tests. In that case your system has more problems than the scope of this article.

There's nothing wrong with using metaprogramming to generate methods, as long as you are writing tests for those methods.

Grepping the application code is usually much less useful than grepping the tests to see how the system is intended to behave.


I wrote a utility ages ago to find orphaned Ruby methods. It greps (actually, acks) for method names. Naturally this only goes so far with a non-trivial Ruby codebase, which limited its usefulness. https://github.com/built/funk



Erm, what about first-class functions in general? This doesn't just make metaprogramming harder, it also stops you from saving functions in lists or hash tables and any sort of higher-order programming.


This obviously only applies when grep is all you have to find-who-calls and find-definition.


C macros can cause the same issue when used to concat function names using ##.


In a sane language you could just ask the function where it is defined.

    user=> (meta #'leiningen.gnome/uuid)
    {:arglists ([project]), :ns #<Namespace leiningen.gnome>, :name uuid, :column 1, :line 10, :file "leiningen/gnome.clj"}
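
For comparison, Ruby can answer a similar question for methods defined in Ruby code, including generated ones, via source_location. A small sketch; the Ray class and the placeholder method bodies are assumptions for illustration:

```ruby
class Ray
  %i[position direction].each do |attr|
    # Placeholder body; only the definition site matters here.
    define_method("#{attr}_is_zero?") { true }
  end
end

# source_location reports where the method body was defined.
# Only the first two elements (file, line) are used here.
file, line = Ray.instance_method(:position_is_zero?).source_location
puts "#{file}:#{line}"
```

This needs a running interpreter with the class loaded, so it complements grep rather than replacing it.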


Choosing module and function names is an acquired art in Python. I usually grep my entire code base (and ensure nothing matches) before I add a new public name.


I like the idea of the Grep Test, but this is not a great illustration of it. These refactorings are bad not because they hide names from the developer but because they use the wrong abstractions.

The point of DRY isn't to mindlessly remove code duplication. It's to remind us to look for code duplication and keep us mindful of coupling between the various parts of our code.

Where two identical chunks of code that represent the same kind of work are used in multiple places we introduce "algorithmic coupling." That is, whenever the work being done in one location changes, we have to make sure to change the work being done in the other location. Anyone reading this code -- whether the author, the author's future self, or teammates -- has to remember this extra fact, increasing the surface area for bugs.

There's also "name coupling," viz., for every vector _foo_ associated with the Ray we want methods like foo_is_zero?, negative_foo, etc. Here it's important that the naming convention be consistent, so there's coupling there, too. If the names aren't consistent anyone else reading the code would then have to remember that fact, and anyone changing the names would have to remember to update all the other names, too.

The irony of his example is that style of metaprogramming is a great way to get rid of name coupling, but he did nothing to get rid of the much worse algorithmic coupling. Indeed, algorithmic coupling screams for DRY whereas name coupling requires it on a more case-by-case basis.

That is, when all the names are highly localized, e.g., three short methods that share some naming pattern are all defined in succession, it's much less important to remove the duplication. Anyone reading or editing that code will quickly see what the pattern is and why it exists.

Here's a comment I left on the blog:

Hmm. I don't think the lesson here is about to-DRY-or-not-to-DRY your code. Instead, it's about using the appropriate abstractions.

Using the Ray example, both position and direction are vectors, not points. Make them vectors! Then you'd be able to say things like

  ray.position.zero?
  # We'd usually say "the inverse of" not "the negative of"
  ray.position.inverse
Furthermore, if you wanted to define the same methods, well...

  class Ray
    def position_is_zero?
      position.zero?
    end
  
    def direction_is_zero?
      direction.zero?
    end
  end
There's less repetition, now, because you're only repeating names, not logic. This means the code will only break when the names change, i.e., zero? becomes is_zero? or something.

In a world where all of your logic is also duplicated in the Ray class, the code would break when either the names or the logic changed.
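
To make that concrete, here is a rough sketch of the vector-based version. The Vector class below, including its constructor and its zero? and inverse methods, is hypothetical and written only for this example:

```ruby
# Hypothetical Vector class; the name and methods are assumptions.
class Vector
  attr_reader :components

  def initialize(*components)
    @components = components
  end

  # True when every component is zero.
  def zero?
    components.all?(&:zero?)
  end

  # Returns a new Vector with every component negated.
  def inverse
    Vector.new(*components.map { |c| -c })
  end

  def ==(other)
    other.is_a?(Vector) && components == other.components
  end
end

class Ray
  attr_reader :position, :direction

  def initialize(position, direction)
    @position = position
    @direction = direction
  end

  # Only names are repeated here; the logic lives in Vector.
  def position_is_zero?
    position.zero?
  end

  def direction_is_zero?
    direction.zero?
  end
end

ray = Ray.new(Vector.new(0, 0), Vector.new(1, -2))
puts ray.position_is_zero?
puts ray.direction_is_zero?
puts ray.direction.inverse == Vector.new(-1, 2)
```

Every method name appears literally in the source, so this version also passes the Grep Test without any metaprogramming.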


More ways in which code is being made "approachable" so that those who have no business reading code can pretend they have valuable input when managing developers.



