Show HN: Visualizing 'Silhouettes' of Programming Languages (lelandbatey.com)
97 points by lelandbatey on June 30, 2017 | 35 comments



Neat. I saw something like this recently in game form: Codeshapes http://teropa.info/codeshapes/


I'm getting the shape of a lot of license headers, from which I'm supposed to guess the language.


This is better than the story link, I think. I found it impossible at first, but after a few rounds there were some clear clues like the block of imports at the top of a Java file or the distinctive shape of functions in functional languages.


Wow, I'm way better at this than I imagined.


Interesting project, but these silhouettes reflect the programmer's coding style, not anything specific to the languages themselves.


The interesting thing here would be seeing some form of aggregation of various projects to see what effects the language does have on style. Would love to see something that shows the effect of e.g. Python's whitespace or Go's boilerplate.


What immediately strikes me is Python's consistently high density compared to the other languages. This is something I've noticed before with well-written Python code; one can write powerful code in very compact blocks/paragraphs/stanzas/insert-metaphor-of-your-choosing.

Of course, take this with a grain of salt. This is a bit of an apples to oranges comparison, given it's just one file from one repo for each language.


It's striking how easily you can see the repetition most languages force on you. PHP's sawtooth silhouette is especially interesting.


These silhouettes reflect the programmer's coding style not anything specific to the languages themselves...


This is fun and interesting, but I don't think it says much about the various languages. Each of these languages can be formatted in a variety of styles.

As one example, Rust, and especially Servo, began with a recommended (or mandated) style with a heavy use of column alignment for things like function arguments. Like this code from rustfmt:

    let mut rewrites = try_opt!(subexpr_list.iter()
                                            .rev()
                                            .map(|e| {
                                                rewrite_chain_expr(e,
                                                                   total_span,
                                                                   context,
                                                                   max_width,
                                                                   indent)
                                            })
                                            .collect::<Option<Vec<_>>>());

But the Rust example in the silhouette page doesn't follow this style at all. Instead, it uses a purely indentation-based style with little or no use of column alignment. From looking at the latest Rust coding style document, it seems that the Rust community may be moving away from this aligned style to an indented style. (Anyone who is more tuned into Rust style, feel free to set me straight.)

If the code above were written in an indented style it might look more like this:

    let mut rewrites = try_opt!(
        subexpr_list
            .iter()
            .rev()
            .map( |e| {
                rewrite_chain_expr( e, total_span, context, max_width, indent )
            })
            .collect::<Option<Vec<_>>>()
    );

Same language, same code, but a very different silhouette.

Similarly, the last time I looked at the Oculus SDK, its C++ and C# code used a column alignment style, and I think some other projects like OpenCV use column alignment too.

Many other C++ and C# projects eschew alignment in favor of indentation, though.

This kind of choice in styles is available in all the languages being compared. Yes, even in Python. The Python example uses column alignment here and there, but it would be just as "Pythonic" to use indentation in those places.

So I don't think this is really comparing programming languages, it's just taking a single example for each language, where other projects in the same language may have very different silhouettes.


> (Anyone who is more tuned into Rust style, feel free to set me straight.)

Style team member here; you're absolutely right. We started with more visual indent but basically have switched entirely to block indent.

> it might look more like this:

This is how rustfmt would format this code, exactly.


I'm working on an auto-formatting tool ... is there anyone here who enforces column alignment or has it in their style guide?


For Rust? Are you aware of rustfmt?


I think this might be more effective if there was some kind of emphasis on lines which contain nothing but tokens used to end control structures, such as } or });

Frankly, as is, I think that javascript looks more crisp and sane than it really is (although, as another commenter points out, jQuery might not be the best choice for the current style norms).

Also, the Python choice, while a very cool project, is very atypical of Python style. I suggest using one of the more complex modules from Twisted.
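
The suggestion above about emphasizing lines that hold nothing but closing tokens could be sketched roughly like this. It is only a sketch, not how the linked page works: `isClosingOnly` and `markClosers` are hypothetical names, and the set of "closing" characters is an assumption that fits brace-style languages only (it would miss something like Ruby's `end`).

```javascript
// Assumption: a "closer" line contains only these punctuation characters.
// Blank lines do not count as closers.
function isClosingOnly(line) {
  const trimmed = line.trim();
  return trimmed.length > 0 && /^[\])};,]+$/.test(trimmed);
}

// Hypothetical helper: describe each line by its indent, its width,
// and whether a renderer should draw it in an emphasized color.
function markClosers(source) {
  return source.split("\n").map((line) => ({
    indent: line.length - line.trimStart().length,
    width: line.trimEnd().length,
    closer: isClosingOnly(line),
  }));
}

const rows = markClosers("if (x) {\n  y();\n}");
console.log(rows.map((r) => r.closer));
```

A renderer could then draw the `closer` rows in a different shade so that runs of `}` and `});` stand out from lines that carry logic.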


When I looked at this, my immediate reaction was how awfully long the files are. A programming language's community should encourage smaller files that do one thing.


i like large files, it makes it so much easier to read the code, figure out what it does, find stuff, and refactor. i like it when the entire program is just one file, except for the modules source. i hate when there are 10000 files all containing 5-10 imports and then just one or two lines of logic. or worse; include files within include files several levels deep that all touch global variables. my idea of good abstractions is to only lift out code that has no couplings and no shared logic with the program, like modules that can be reused by other people in other programs.


The files are long on purpose -- as the article says: The code samples were chosen by finding popular repos for each language and using the longest file in each repository.


One of the examples is the jQuery source code, but the long file shown isn't the actual source. The real source code is broken up into a number of much smaller files, and the jquery-2.2.4.js file shown on the page is built from these smaller files.


Something that bugs me is, in languages that have a wrapping `namespace foo { ... }` sort of construct, the contents of that block being indented.

If, in a typical file, after the preamble at the top, you're giving ALL of the subsequent lines a minimum of 1 indent to the right... then what's the point of that 1 leading indent?

Just hug the left side. You don't need a constant visual reinforcement that you're a good kid who namespaces their declarations.


The fractal dimension of each one would be nice.


For people who don't know about fractal dimensions, might I suggest this[0] video

[0] https://www.youtube.com/watch?v=gB9n2gHsHN4


Fractal dimension? How so?


What's most striking is that there is not a lot of difference between the languages. C and Python are somewhat denser and PHP tends to be more spread out. But that's about it.

I guess that reflects that humans break code in a way that looks the way they're used to.

Purely functional languages probably diverge significantly from these shapes.


It's a _very_ selective dataset, this doesn't show much. The idea's cool though


I would be curious to see some functional language examples, like Scheme, Haskell and OCaml.


Yeah, was thinking the same for Erlang since it incentivizes terse sections of hoisted functions.


Shameless plug: a "Silhouette" (minimap) generator for command line: https://github.com/dpc/text-minimap


@lelandbatey: to see a difference that really pops, I suggest adding a Common Lisp or Scheme silhouette, and a Prolog one. Hell, if you can find a copy of gorilla.bas, do that too.


Wait, Java actually has the shortest silhouette?


It's not as if they generated some "average" code to make an apples to apples comparison, they arbitrarily grabbed files with very different purposes based on length.

So really all this does is show that different code with different purposes has different shapes. AKA, pointless. You would have to do this across huge numbers of files and average them out to have anything actually interesting to look at.
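
For what "average them out" might mean in practice, here is a minimal sketch under some loud assumptions: a silhouette is reduced to the width of each line, every file is squeezed into the same number of vertical buckets so that files of different lengths can be compared, and the buckets are then averaged across files. `lineWidths`, `resample`, and `averageProfile` are made-up names, not anything from the linked page.

```javascript
// Width of each line, ignoring trailing whitespace.
function lineWidths(source) {
  return source.split("\n").map((line) => line.trimEnd().length);
}

// Squeeze a list of widths into `buckets` bins by averaging the lines
// that fall into each bin. Bins that receive no lines are reported as 0.
function resample(widths, buckets) {
  const sums = new Array(buckets).fill(0);
  const counts = new Array(buckets).fill(0);
  widths.forEach((w, i) => {
    const b = Math.min(buckets - 1, Math.floor((i * buckets) / widths.length));
    sums[b] += w;
    counts[b] += 1;
  });
  return sums.map((s, b) => (counts[b] ? s / counts[b] : 0));
}

// Average the resampled profiles of many files (assumes at least one file).
function averageProfile(sources, buckets) {
  const total = new Array(buckets).fill(0);
  for (const src of sources) {
    resample(lineWidths(src), buckets).forEach((w, b) => {
      total[b] += w;
    });
  }
  return total.map((t) => t / sources.length);
}

console.log(averageProfile(["aa\nbbbb", "cccc\ndd"], 2));
```

Averaging line width alone throws away indentation, so a fuller version would probably track the left edge as well. Even this much would show whether a language's shape survives across projects or is just one author's style.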


jQuery is hardly a good benchmark for current js style. You'll find completely different shapes in other libraries.


JS is the new C++. It's several different languages depending on which subset you use: jQuery style, pyramid callbacks, promises, async/await, etc.


always name your anonymous functions. that way it will be easier to break them out of the pyramid, make them testable, and easier to debug.


... and it will also make them not anonymous functions :)


A named "anonymous" function (let's call it a lambda function to make it less confusing) is still a lambda function, but with a name ... It cannot be called from elsewhere. So you get all the benefits of lambda functions! There's no advantage to not naming variables.

  var fs = require("fs");
  // The name readSomeFile shows up in stack traces but is not visible outside the function.
  fs.readFile("somefile.txt", "utf8", function readSomeFile(err, text) {
    if (err) throw err;
    else console.log(text);
  });
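
A small sketch of the scoping claim above, in case it helps: the name of a named function expression is bound inside the function body only, so code elsewhere cannot call it by that name. `withText` is a hypothetical stand-in for an API that takes a callback, used here so the example runs without touching the file system.

```javascript
// Hypothetical helper for illustration: calls back with fake file contents.
function withText(callback) {
  return callback(null, "hello");
}

const result = withText(function readSomeFile(err, text) {
  if (err) throw err;
  // Inside the body, the name refers to the function itself.
  return { text: text, selfName: readSomeFile.name };
});

// Outside the expression the name is not bound; typeof on an
// undeclared identifier yields "undefined" instead of throwing.
const visibleOutside = typeof readSomeFile !== "undefined";

console.log(result.selfName, visibleOutside);
```

The function's `name` property is also what debuggers and stack traces display, which is the debugging benefit mentioned earlier in the thread.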



