Adding Empty Set Literals to Python (philipbjorge.com)
41 points by philipbjorge on March 16, 2013 | 46 comments

"I thought {@} would be a good for an empty set because it arguably looks like the empty set symbol"

I'm just nitpicking, but I don't think this is a good justification: if the @ symbol looks like the empty set symbol, then {@} is the set containing the empty set.

I personally prefer dict() to {} anyway, so I don't really miss having an empty set literal.

Adding a set literal was a slippery slope anyway. Where's my frozenset literal? Where's my namedtuple literal?

(I also don't particularly like the {} syntax for dicts, because there is {'too_much': 'quoting'} which I find rather ugly. I much prefer dict(too_much='quoting') or, even better, Perl's solution: { too_much => 'quoting' }.)
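A minimal sketch of the two spellings compared above, assuming Python 3 (the key names are only illustrative). Both build equal dicts, but the keyword form only works when every key is a valid identifier:

```python
# Literal form: keys are arbitrary expressions, so string keys need quotes.
literal = {'too_much': 'quoting'}

# Keyword form: keys are written as bare identifiers and become strings.
keyword = dict(too_much='quoting')

print(literal == keyword)

# A key that is not a valid identifier (here it contains a dash) cannot be
# passed as a keyword; it needs the literal form or an iterable of pairs.
pairs = dict([('too-much', 'quoting')])
print(pairs)
```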

I also prefer the dict() function as well as JS object literals but it's too bad there's a performance penalty to using dict().

There's a performance penalty to using Python, too. I wouldn't let a minuscule detail affect whether or not you like the code you're reading. (By that logic, you should represent sets as {'element': 1} instead of set('element'), because calling the function "set" is slower than making a dictionary object from the special syntax.)
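One caveat about the example, shown as a quick sketch assuming Python 3: set() takes an iterable, so passing a bare string yields a set of its characters rather than a one-element set.

```python
# set() iterates over its argument, so a string is split into characters.
chars = set('element')
print(sorted(chars))

# To get a set holding the single string, wrap it in a list
# or use a set literal.
single = set(['element'])
print(single == {'element'})
```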

> There's a performance penalty to using Python, too.

I agree, but if it wasn't for the fact that using dict() is six times slower than using literals then it wouldn't make much of a difference arguing code style [1].

> (By that logic, you should represent sets as {'element': 1} instead of set('element'), because calling the function "set" is slower than making a dictionary object from the special syntax.)

Which is why I guess there are now set literals in Python (just not for empty sets :).
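For readers who haven't run into it, a short sketch of the gap being discussed (assuming Python 2.7 or 3.x): braces with elements make a set, but empty braces make a dict, so the empty set has to be spelled set().

```python
# Braces with elements and no colons produce a set.
non_empty = {1, 2, 3}
print(type(non_empty).__name__)     # set

# Empty braces were already taken by the dict literal.
empty_braces = {}
print(type(empty_braces).__name__)  # dict

# The only built-in spelling of the empty set is the constructor call.
empty_set = set()
print(type(empty_set).__name__)     # set
```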

-- [1]: http://doughellmann.com/2012/11/the-performance-impact-of-us...
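Anyone who wants to check the ratio on their own machine can use a rough sketch like the one below. The helper name and iteration count are made up for illustration, and the result will vary with the Python version and hardware, so it may not match the figure from the linked article.

```python
import timeit


def time_empty_dict(number=200_000):
    """Return (literal_seconds, call_seconds) for building an empty dict."""
    literal = timeit.timeit('{}', number=number)
    call = timeit.timeit('dict()', number=number)
    return literal, call


literal_s, call_s = time_empty_dict()
print(f'{{}}: {literal_s:.4f}s  dict(): {call_s:.4f}s  '
      f'ratio: {call_s / literal_s:.1f}x')
```

The literal compiles to a single bytecode instruction, while dict() has to look up a global name and make a call, which is where the difference comes from.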

You're worrying about a constant slowdown (6x) on a constant operation (constructing a dictionary with a fixed number of items). In terms of computational complexity, that amounts to six times a negligible constant, i.e., nothing.

I think the reason for having set literals is not performance but completeness and prettier syntax.

Executive summary: If you don't like part of the language, it is really easy to add your own hack and make it worse ;-)

Really, you should trust the core developers on this one. Empty set literals are missing from Python for a reason.

You're right on trusting the core developers, but "for a reason" type arguments are not the way to do things. I love Python because it's so practical and manages to stop codebases from devolving into overengineered incomprehensible messes, but I hate lots of things about it, and more generally I hate its "worse is better" approach (http://www.jwz.org/doc/worse-is-better.html) to many things (because for Guido, if "worse" is "simpler to explain and implement", he always chooses it, always implementation simplicity over nice "interface" features and also over "interface" consistency... I almost want to shoot the man when I read arguments like http://www.artima.com/weblogs/viewpost.jsp?thread=147358 coming from a very smart person like him). OP's complaint is valid imho, consistency is f important, because small inconsistencies pile up and make a language harder to learn (yeah, you don't notice them once they're in your "muscle memory", but "learnability" is important for a language and this is why I like Python and I hope it doesn't lose this quality by slowly accreting inconsistencies). Hopefully his blogpost is a message that reaches Guido and the core developers and nudges them even a little bit towards a better path!

I just wish someone could find the old discussions...

The main argument I could see against empty set literals was how wrong it felt adding it to the grammar. Even adding 3 lines of code for the feature felt like it was just creating chaos in a zen-like codebase.

In both of these replies, Guido is referring to older discussions. I'm curious now.

Imagine that the "chaos" you are adding to the implementation is actually subtracted from its "interface" and consequently from all code that will be written thereon... sometimes you need to "add complexity"/"destroy order" to create more order at a higher level :)

I think it would be best if the empty dictionary were {:} and the empty set were {}. Of course, it would be insane to make this change.

That ain't orthogonality, this is orthogonality:

    dict() = {}
    set() = <>
    tuple() = ()
    list() = []
(btw, does anyone actually use `<>` as `!=`?)

It seems to me that our real problem here is the dearth of easily-accessible bracket characters on US-standard keyboards:

    x = 【1, 2, 3】
    y = 【】

They finally got rid of <> in Python 3 (together with backticks as an alias for repr).
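This is easy to confirm from a Python 3 prompt. A small sketch, where the helper name compiles is made up for the example:

```python
def compiles(source):
    """Return True if source is a valid expression for this interpreter."""
    try:
        compile(source, '<example>', 'eval')
    except SyntaxError:
        return False
    return True


print(compiles('1 != 2'))  # the surviving spelling of "not equal"
print(compiles('1 <> 2'))  # the Python 2 spelling
print(compiles('`1`'))     # the Python 2 backtick shorthand for repr(1)
```

On Python 3 only the first expression compiles; on Python 2.7 all three do.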

And they're not going to add new syntax to Python 2.7 (and the core team has been empathetic that there would not be a 2.8), so `<>` could easily be repurposed for "empty set".

Of course `<1, 2, 3, 4>` would then have to be an alternative set literal.

    <pedant type="confused">emphatic?</pedant>

This would conflict with greater/smaller than operators, which are also used on sets to express sub/superset relations. If it wouldn't cause trouble in the formal grammar, then at least in terms of readability.
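For reference, a short sketch of the existing comparison operators on sets that the readability concern is about (the variable names are illustrative):

```python
small = {1, 2}
big = {1, 2, 3}

print(small < big)     # True: small is a proper subset of big
print(big > small)     # True: big is a proper superset of small
print(small <= small)  # True: every set is a subset of itself
print(small < small)   # False: but not a proper subset

# Sets are only partially ordered: neither of these contains the other.
print({1} < {2}, {1} > {2})  # False False
```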

> This would conflict with greater/smaller than operators, which are also used on sets to express sub/superset relations.

Not any more than `[` being used for both literal lists and indexing.

> If it wouldn't cause trouble in the formal grammar, then at least in terms of readability.

It wouldn't cause the slightest trouble in the grammar, and I don't think it'd cause much if any trouble for readability either.

In terms of readability the angular brackets arguably don't nest well: <<<0, 1>, 2>, 3>.

Also, having braces for sets happens to coincide with mathematical notation. Angular brackets would be more appropriate for tuples, in this regard--although the brackets used for tuples in mathematical notation are less acute: \langle and \rangle in LaTeX.

I think it's pretty unfortunate that for practical purposes we're forced to make do with what's available in the ASCII character set, which is pretty much an accident of history. My favorite solution to the OP's problem would be to use the Unicode empty set symbol. People who for some reason aren't able to type it could fall back to set(), and in terms of reading it will be immediately obvious to everyone that the empty set is intended, which doesn't hold for all the ad-hoc solutions suggested until now.
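A side note, sketched under the assumption of Python 3: the symbol cannot be faked today by binding a variable named ∅, because it is classified as a math symbol and is not valid in identifiers. Supporting it would therefore need a grammar change, just like the ASCII proposals.

```python
import unicodedata

EMPTY_SET_SYMBOL = '\u2205'

print(unicodedata.name(EMPTY_SET_SYMBOL))      # EMPTY SET
print(unicodedata.category(EMPTY_SET_SYMBOL))  # Sm (Symbol, math)

# Python 3 identifiers are limited to letter-like characters, digits and
# underscore, so a math symbol is rejected as a name.
print(EMPTY_SET_SYMBOL.isidentifier())         # False
```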

Maybe Guido doesn't like it because it looks weird when compared to the rest of Python's syntax? Still, I think it's cool. Having sets available without importing anything is one of the things I love about Python.

set = <> is still available.

> set = <> is still available

Did this get misformatted? I don't know what you mean.

I believe sciencerobot is saying that "<>" could be used as the empty set literal.

Oh I see, thanks. "s=<>" is less ugly than "s={@}" but, obviously, is inconsistent with "s={1, 2, 3}".

Then fix that: s = <1,2,3>

I think that's better than using the same brace-type for sets and dicts anyway, and it can't be as confusing as () == tuple(), (x) == x, (x,) == tuple([x]).
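The three tuple cases mentioned above, spelled out as a runnable sketch (x is an arbitrary illustrative value):

```python
x = 42

print(() == tuple())       # True: empty parentheses are the empty tuple
print((x) == x)            # True: parentheses alone only group
print((x,) == tuple([x]))  # True: the trailing comma makes the tuple

print(type((x)).__name__, type((x,)).__name__)  # int tuple
```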

Yes, that was the idea. I don't think it would be confusing to the parser.

set = <> looks like the other empty container literals, but it's not consistent with the literal notation for sets with one or more elements: my_set = {1,2,3}.

What about supporting <1, 2, 3> as well, and eventually deprecating the existing way? The end result would be much nicer.

You don't import sets; set() is available right away.

The webfont doesn't load for me because I have Ghostery installed (it blocks remote tracking scripts) so the page is all blank except for the code snippets (because Adobe Typekit is blocked). You should consider adding a fallback font.

Thanks and will do!

I just added a 1 second timeout that puts a default font on if the webfont hasn't loaded. Just tested with Ghostery and it should be good to go.

I vaguely recall seeing a mailing list thread where `{-}` was proposed. I think it was rejected based on parsing ambiguity, i.e. `{-1}` vs `{-}`. (Or it's possible I might be losing my memory)
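I can't vouch for the mailing list history, but the overlap itself is easy to see with the ast module. A sketch assuming stock Python 3: the minus sign inside braces is already the start of a unary expression, so a parser that has read `{-` cannot yet tell the two forms apart.

```python
import ast

# {-1} is a set display whose single element is unary minus applied to 1.
node = ast.parse('{-1}', mode='eval').body
print(type(node).__name__)          # Set
print(type(node.elts[0]).__name__)  # UnaryOp

# {-} is rejected by the stock grammar: the minus has no operand.
try:
    ast.parse('{-}', mode='eval')
    parsed = True
except SyntaxError:
    parsed = False
print(parsed)                       # False
```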

Wouldn't it lead to the same problem of a lack of consistency?

    dict: {} 
    list: []
    tuple: ()
    set: {-}

I originally implemented it that way and had that exact same issue (it's one of the revisions on github). Similarly, {/} would also be a problem.

How about {,} or s{} ?

I think {,} is intuitive because it reminds me of syntax for one-element tuples, e.g. (1,) -- but it may make code harder to read, which is not Pythonic.

I didn't think to do {,} which I like. s{} looks weird to me, but I couldn't tell you why.

The s{} alternative is based on the strange r'' or u'' syntax I've gotten used to.

> What I learned: It’s not always hard to make interesting changes to large projects.

Should be: it's not always hard to make interesting changes to very well-designed and well-architected projects.

    There should be one-- and preferably only one --obvious way to do it.
    Although that way may not be obvious at first unless you're Dutch.

Which is more Pythonic?

list() or []

dict() or {}

set() or ????

I would argue that the left is more pythonic and expressive.

Don't forget tuple() and ().

I don't think any are more pythonic. They are all valid python.

Is it just me, or is it a little ... obsessive to be this annoyed with having to use `set()` instead of `{@}`?

What, you've never gone and hacked an interpreter because you were mildly annoyed by something? I don't think he actually means for it to be used in production.
