
Can your programming language do this? - arohner
http://arohner.blogspot.com/2009/07/can-your-programming-language-do-this.html
======
pg
Yes, and in fewer tokens:

    
    
        (def entropy (x) 
          (- (sum [* (p _) (log:p _)] x)))
    

I like macros, but I wouldn't use them in this case. Summing is close enough
to mapping that the form above is easy for any Lisp programmer to understand.

I'd also argue that code size is a more meaningful measure of readability than
familiarity. There's nothing intrinsically right about sigma notation. Some
people are just more used to it. But since whatever notation you put in a
language will eventually become familiar to its users, you should factor out
familiarity when making design decisions, or at most use it to break ties. The
notation above is honestly more readable to me, because it's what I'm used to.

~~~
arohner
I'm honored you read my post and replied to it.

> I'd also argue that code size is a more meaningful measure of readability
> than familiarity ... but since whatever notation you put in a language will
> eventually become familiar to its users, you should factor out familiarity
> when making design decisions

An exaggerated, uncharitable interpretation of what you said would be "users
will figure out the notation, regardless of how hard it is, so make them jump
through hoops so you can get a shorter notation". How do you distinguish
between shorter notation that is a net win, and shorter notation that looks
like perl?

My goal in using the macro was to reduce noise. Without the macro, the
equivalent clojure would look like

    
    
        (defn entropy [x]
            (- (Σ #(* (p %) (log (p %)) x))
    

To me, the original version is easier to read because it lacks the #, %
characters. Given that every call to sum would likely use an anonymous
function (and therefore use #, %), it seemed worthwhile. Maybe that's an
argument against clojure, but the arc version is almost as noisy, IMO.

~~~
pg
I think we should distinguish between notations that _allow_ users to write
incomprehensibly terse code and those that _force_ them to. Only the latter
are a problem, unless you're taking the protect-users-from-themselves approach
to language design.

It's an interesting question whether there are any existing language
constructs that force one to write unreadable code. I ended

<http://www.paulgraham.com/power.html>

by asking for examples, and I don't think I ever got any.

~~~
eru
> It's an interesting question whether there are any existing language
> constructs that force one to write unreadable code.

Depends. Suppose we added GoTo, ComeFrom, and assignment to the Haskell
specification, to be implemented by all conforming implementations. The
language has become strictly more powerful. But compiler writers will have a
much harder time proving certain properties about your program e.g. constancy
of variables. The implementation and optimisation techniques that depend on
those properties no longer work. Thus the purely functional parts of your
programmes will become slower, and you will be forced to switch to imperative
style if you want any performance.

(To make the compiler writers job even more difficult, add some of Python's
dynamic capabilities that make it so hard to compile. Plus an eval that works
in the current environment on strings and can change arbitrary variables or
cause other side-effects.)

Of course the language itself does not force you to write unreadable code. But
it may make it harder for the implementors to allow you to write readable
code.

------
Sharlin
Well, I certainly prefer it spelled "sum" -- trivial to understand and
viewable with any font.

In idiomatic Python, with only basic language features needed:

-sum(p(x) * log(p(x)) for x in l)

~~~
dreish
I think the point was that author was able to create a syntax for summation
that had approximately the same shape as the math notation.

I don't think I would argue that this is a very good example of macros (and
yes, I wrote that before reading pg's comment ;), but solutions people have
posted with other languages all have foreign symbols in them -- syntax for
passing around blocks, or language-defined looping or list comprehension
syntax, not ones they have chosen to suit their own needs. In this case,
that's a virtue; you learn the language's syntax, and then you know how to
recognize summation within any program in that language regardless of who
wrote it, but in cases where the problem is more complicated and the native
syntax for what you need to do more awkward, you really do want to create your
own syntax.

Arguably that's a danger inherent in Lisp: it gives more opportunities to
create needless new syntax, not just the kind of syntax that makes a program
clearer and simpler.

If you wanted to extend destructuring in Python (that is, (a, b) = (1, 2)) to
cover dictionaries, how would you do it? In a Lisp, it's done with macros. The
core language usually doesn't start out with any destructuring at all. In
Python, you can beg Guido and wait, or fork the project -- neither option is
practical.

~~~
tome
That facility's already there. Any object is unpacked using its __iter__:

    
    
       >>> d = { 'key1': 'val1', 'key2': 'val2' }
       >>> (x,y) = d
       >>> (x,y)
       ('key2', 'key1')
       >>> (x,y) = d.items()
       >>> (x,y)
       (('key2', 'val2'), ('key1', 'val1'))
       >>> (x,y) = d.values()
       >>> (x,y)
       ('val2', 'val1')

~~~
dreish
Sorry, I should have given an example. I don't see how the above is ever
likely to be useful, since the keys can be in any arbitrary order, but the
fact that you can get something sequential out of a dictionary and revert back
to tuple assignment is trivial.

What I'd like to see you do in Python is this:

    
    
       >>> d = { 'key1': 'val1', 'key2': 'val2', 'key3': 'val3' }
       >>> { 'key1': x, 'key3': y } = d
       >>> (x,y)
       ('val1', 'val3')
    

Nothing like this is built into the handful of special forms that Clojure
starts out with, but roughly halfway through core.clj, it gets added to the
language using the same macro facilities available to any Clojure programmer.
If Python manages to be as powerful a language as Clojure without macros, you
should have no trouble at all.

    
    
       user=> (def d {:key1 "val1", :key2 "val2", :key3 "val3"})
       #'user/d
       user=> (let [{x :key1, y :key3} d]
                [x y])
       ["val1" "val3"]
    

(The order of key-value pairs is reversed to support :as, :or, and :keys,
which let you do additional tricks while destructuring.)

~~~
tome
_but the fact that you can get something sequential out of a dictionary and
revert back to tuple assignment is trivial._

I disagree with this assertion. This functionality is not trivial and it
wasn't in early Python versions -- certainly not before the introduction of
__iter__!

It's existence is testimony to the continuing evolution, and consistency of
the Python language. Whether this is a good thing or not I suppose depends on
your point of view. I imagine Lispers would see it as a black mark that we had
to wait for Guido to implement it. Pythonistas on the other hand probably
value the fact that it was well thought out and discussed via a PEP, and is
universally implemented. Perhaps both are valid points of view.

Your challenge is interesting because it's so easy to solve in Python that I
can't imagine anyone except a Lisp programmer seeing anything in it to object
to. In Python it is trivial to replicate the functionality albeit with a
different syntax -- and syntax seems to be king for many Lisp fans.

    
    
       >>> def unpack(d, keys):
       ...         return (d[key] for key in keys)
       ... 
       >>> d = { 'key1': 'val1', 'key2': 'val2', 'key3': 'val3' }
       >>> 
       >>> (x,y) = unpack(d, ['key1', 'key3'])
       >>> print (x,y)
       ('val1', 'val3')
    

I like the Python version because I have to understand only two concepts to
work out what's happening: Firstly there's a function call. Secondly the
iterable result of the call is unpacked to a tuple.

The merits of the Clojure version I'm not qualified to say, but is it really
better in all regards than the Python equivalent? Even clarity, readability,
comprehensibility?

~~~
dreish
First of all, I certainly wasn't claiming that you couldn't look up more than
one hash key on one line of code, using a function. My point is that you might
reasonably want to extend the concept of destructuring further to cover
dictionaries, and that macros allow you to extend the language in that way.

Clarity, readability, and comprehensibility are three words that, in
programming, mean pretty much the same thing as familiarity. To me, if I
weren't looking at the definition of unpack, the above would look weird
because I'd think of the Perl function unpack(), which does something very
different. unpack(d, keys) above isn't something you'd even write in Clojure
because it would be a synonym for (map d keys). Aside from that, it does seem
very straightforward, if limited. Clojure does win on generality, because the
Clojure version nests, covering sequential data structures at the same time:

    
    
       user=> (def d {:a {:a1 [1 2 3] :a2 4} :b 5})
       #'user/d
       user=> (let [{{[_ _ x] :a1} :a, y :b}  d]
                 [x y])
       [3 5]
    

Hashes are the idiomatic general-purpose object in Clojure, so this sort of
capability is actually useful in taking apart function arguments.

~~~
tome
Ah well that nesting in Clojure is very cool! Python can unpack nested tuples,
but my "unpack" function would be very unwieldy there.

------
tedunangst
"Having a language that lets you declare constructs that look very similar to
their mathematical definition is a huge win for readability."

I'd agree, except every mathematical definition I've seen used infix notation.
Minus a function for summing a series, the C version at least keeps the
symbols in the same order. Anyone with even a passing familiarity of C can
read this and recognize it as the same thing.

    
    
      double entropy(int x) {
        double e = 0.0;
        for (int i = 0; i < x; i++) e += p(i) * log(p(i));
        return -e;
      }

------
Tichy
But how do you type the Σ? Don't do stuff like that... In one project I had to
use a templating language that used "<<" as a single character (I still can't
type it) instead of <% %> like all the other languages in the world. I ended
up using copy and paste for that character whenever I had to edit the
templates. Needless to say I enjoyed that project tremendously.

Really don't do that - try to stick to ASCII in the source code please.

~~~
gruseom
_But how do you type the Σ?_

You simply define a keyboard shortcut.

For very common primitives, a special character can work nicely. We use λ in
our Common Lisp code instead of typing _lambda_. Actually, our λ is a macro
that does a couple of things besides expanding into _lambda_ ; if it were just
the latter we probably wouldn't have gone there. Once we did, though, I was
surprised at the improvement in readability.

~~~
jasonkester
Ugh. If anybody ever has to maintain your code, they'll copy/paste the symbol
like Tichy described while cursing your name. Personally, I'd copy/paste it
into the global find/replace dialog the first time I came across it.

I mean, sure, if it was my job to deal with your unicode code every day from
here on out, I'd bother to learn how to map screwy symbols to keyboard
shortcuts. But really, why would you do this to your code in the first place?
It certainly doesn't make things _more_ readable. Exactly the opposite.

~~~
gruseom
_Personally, I'd copy/paste it into the global find/replace dialog the first
time I came across it._

Without bothering to understand why it was done that way or discussing it with
the team? That wouldn't be a good way to last very long.

 _why would you do this to your code in the first place?_

I was careful to include a phrase which explains this: _for very common
primitives_. Perhaps your code doesn't have those; ours does.

 _It certainly doesn't make things more readable._

Considering that I just finished saying it did, that's quite the claim.

 _screwy symbols_

Most programmers I've observed are closed-minded about deviations from their
comfort zone. Faced with something that doesn't conform to the way they always
do things (a.k.a. the Right Way), they reflexively condemn. Your comment is
remarkable insofar as every single one of its sentences suggests this
mentality. I hope that's just my mistaken impression.

~~~
jasonkester
Single letter procedure names are a bad idea in themself, and wouldn't make it
past a code review in any shop I've ever been associated with. Single letter
procedure names that you can't even read or type are an order of magnitude
worse.

So yeah, I don't think I'm alone in not wanting to dig through your code if
it's written this way. I don't want to see code like:

    
    
      for (int [left-curly-single-quote] = 1; '<10; '++)
    

Meaningful variable and procedure names were one of the big steps forward in
the 70's. And while I'm sure the math textbook pictogram for summation is
meaningful to you, it certainly isn't a good idea to use it in your code.

~~~
gruseom
Uh, right. Because l-a-m-b-d-a says so much more than λ.

------
slava_pestov
Here's a point-free version in Factor. It looks a bit different from the
original formula but once you get used to it its very quick to read; I prefer
it to the Lisp code since there are fewer parentheses to track:

    
    
        : entropy ( n -- e ) [1,b] [ p dup log * ] sigma neg ;
    

Note that [1,b] is a word which makes a numeric range from an integer on the
stack, and [ ... ] is a function object (quotation).

------
judofyr
Works nicely in Ruby 1.9:

    
    
      #!/usr/bin/ruby -Ku
      # encoding: utf-8
      module Enumerable
        def sum
          inject { |acc, val| acc + val }
        end
    
        def Σ(&blk)
          map(&blk).sum
        end
    
        def entropy(&prob)
          - Σ { |i| prob[i] * Math.log(prob[i]) }
        end
      end
    

EDIT: And in 1.8 if you pass -Ku.

~~~
jrockway
This also works fine in Perl, Common Lisp, and Haskell... and probably every
other programming language that was designed after 1970.

------
amix
I am looking forward for Fortress 1), which takes these ideas to the extreme
(basically, the whole language is unicoded and the syntax is mathematical).
Fortress also features some very innovative features such as implicit
parallelism for a imperative language and transactions. One of the creators of
Fortress is Guy Steele (one of the guys behind Scheme, Lisp and Java).

1) <http://projectfortress.sun.com/Projects/Community>

------
keefe
I don't think the readability of implementations is all that important. Java
is widely regarded as having poor syntax, but any client of this type of a
method (assuming it is in a static utility class) will do something like
double myEntropy = InformationTheory.entropy(myseries); The details of a
method like that just aren't all that interesting, it's 10 lines or 200
(likely the latter as there seems to be an optimized version for doing
anything). I like to abstract away mathematically complex(ish) things like
this and leave syntax like I stated above in my code, so that my code tells a
story that is easy to read about how data is being manipulated and really
focus on the high level semantics of what is going on. Maybe I am influenced
from having done clerical work in HS and typing fast enough that a few lines
doesn't matter, but sometimes I prefer longer code, with more interfaces and
more abstraction and more standardization to concise and elegant but more
obscure approaches.

------
ulvund
R:

    
    
      entropy <- function(x)  -sum(p(x)*log(p(x)))

------
jf
This reminds me of the Fortress programming language:

[http://research.sun.com/projects/plrg/Publications/fortress....](http://research.sun.com/projects/plrg/Publications/fortress.1.0.pdf)

(Image from the part of the above PDF that I was thinking of:
<http://skitch.com/joelf/bs99c/sun-research-fortress>)

------
icey
IIRC this will also work with PLT Scheme and (I _think_ ) Common Lisp (I can't
remember if you can use unicode names in CL).

~~~
arb
You can. For instance, SBCL introduced support for it around version 0.8 or so
(years ago).

------
bogdank
APL was capable of doing something similar. It would be nice to have keyboards
with extended set of characters.

~~~
buugs
I don't know if it will be of much use but take a look at
<http://www.jsoftware.com/> it is similar to apl in style

~~~
gchpaco
J took one of APL's chief positives, the only thing that made it possible to
look at absurdly dense APL code and make sense of it, and destroyed it. APL
has a "one character == one primitive" relation, except for user symbols which
all have to be letters. So no matter what jumble of crap you had, you could
tokenize it easily starting from anywhere. J has variable length characters.

~~~
slackenerny
_jumble of crap you had, you could tokenize it_

I don't grok this argument, when would one like to have that?

For me personally J is all about tacit functional programming. Any APL have
this?

~~~
gchpaco
Put another way, if you see sigma in an APL program, you can understand what
that is without knowing its context. Single characters in J do not have this
property; < is different from <. is different from <: (and sometimes
surprisingly different; =. defines a local and =: defines a global), and in a
lot of fonts it's not hard to mistake one for the other.

J, to me, took the worst features of APL (the profusion of symbols that only
loosely resemble anything else you've ever seen) and makes it even worse,
because at least the APL symbols were (usually) visually distinct! From the J
examples page are gems like this

    
    
      var =: ~.@(] , |:&.> , |.&.> , |."1&.>)^:3
    

which in the proportional font they use makes for a real entertaining time
telling , from ,. from ..

------
gregwebs
haskell list comprehension

entropy x = - sum [p i * (log . p) i | i <\- [1..x]]

------
jrockway
Yes, my programming language can do that.

------
hs
yes, lisp can make certain math simpler

consider bounds on a: 0 < a < 1; it's (and (< 0 a 1) (foo)) ... should be easy
in any language

how about 0 < a < b < 1 ? it's (and (< 0 a b 1) (bar)) ... a little ugly in
other languages

0 < a < b < ... < z < 1 ? i won't program that in other languages :p

------
trezor
C#:

    
    
        Func<List<double>, double> entropy = (X) =>
        {
            double e = 0;
            X.ForEach((p) => e += p * Math.Log(p));
            return -e;
        };
    

Compiler gets negative points for not being willing to infer the type-
definition of the function-variable though. I guess that is stuff you have to
live with in statically typed languages ;)

Edit: Just checked. C# allows unicode characters like Σ for functions and
variables, although I would never use it myself.

~~~
snprbob86
Not sure why you are using lambda syntax instead of defining a method _shrug_
, but anyway I don't think the type inference shortcoming is inherent to
static typing.

See this similar post regarding type inference on fields:
[http://blogs.msdn.com/ericlippert/archive/2009/01/26/why-
no-...](http://blogs.msdn.com/ericlippert/archive/2009/01/26/why-no-var-on-
fields.aspx)

I'm sure there is an equally reasonable explanation of costs and implications
for your example. Eric Lippert would probably be happy to share it with you,
why not ask him? :-)

~~~
scott_s
You're right, statically typed languages can use type inferencing; both
Haskell and Ocaml do.

------
TweedHeads
Javascript can do it all:

    
    
      function entropy(x){ return sum(p(x)*log(p(x))); }

------
mitko
Yes, it can. All programing languages are equivalent in power, as you can
write interpreter of language A in language B for, I think, any two
programming languages A and B.

However, some languages will achieve some things much easier. Your language
apparently makes it pretty easy to create new functions such as sum or
product.

~~~
scott_s
Start with the section "The Blub Paradox":
<http://www.paulgraham.com/avg.html>

