
Glom – Restructured Data for Python - mhashemi
https://sedimental.org/glom_restructured_data.html
======
nerdponx
It's a nice idea, but I never like writing what amounts to a DSL in strings in
my code (yes, that applies to in-code SQL as well, although that's often
unavoidable).

I prefer the `get_in()` method from Toolz:
[http://toolz.readthedocs.io/en/latest/api.html#toolz.dicttoo...](http://toolz.readthedocs.io/en/latest/api.html#toolz.dicttoolz.get_in)
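
For readers who have not used it, here is a rough standard-library approximation of what a `get_in`-style helper does. It is only a sketch that mirrors the `(keys, coll, default)` argument order, not Toolz's actual implementation, and the `target` data is made up for illustration:

```python
from functools import reduce
import operator


def get_in(keys, coll, default=None):
    """Follow keys into nested dicts/lists, returning default on any miss."""
    try:
        return reduce(operator.getitem, keys, coll)
    except (KeyError, IndexError, TypeError):
        return default


# Illustrative data only.
target = {"system": {"planets": [{"name": "earth", "moons": 1},
                                 {"name": "jupiter", "moons": 69}]}}

print(get_in(["system", "planets", -1, "name"], target))       # jupiter
print(get_in(["system", "comets", 0], target, default="n/a"))  # n/a
```

The path is an ordinary list of keys and indexes rather than a string, which is the property being praised here.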

~~~
jeremiahwv
I agree, I don't like the magic string approach (even if it is mostly just
dot-notation attribute lookup). However, there is some good stuff here, and
nested data lookup when value existence is unknown is a pain point for me.

In addition to the string based lookup, it looks like there is an attempt at a
pythonic approach:

    
    
      from glom import glom, T
      spec = T['system']['planets'][-1].values()
      glom(target, spec)
      # ['jupiter', 69]
    

For me though, while I can understand what is going on, it doesn't feel
pythonic.

Here's what I would love to see:

    
    
      from glom import nested
      nested(target)['system']['planets'][-1].values()
    

And I would love (perhaps debatably) for that to be effectively equivalent to:

    
    
      nested(target).system.planets[-1].values()
    

Possible?

\--- edit: Ignore the above idea. I thought about this a bit more and the
issue that your T object solves is that in my version:

    
    
      nested(target)['system']
    

the result is ambiguous. Is this the end of the path query and should return
original non-defaulting dict, or the middle of the path query and should
return a defaulting dict? Unknown.

The T object is a good solution for this.

~~~
bilboa
The other problem that T solves and nested doesn't is being able to reuse a
spec. Once you have a spec, whether created with T or directly, you can call
glom multiple times with the same spec, pass the spec to another function, etc.
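
A toy illustration of both points, assuming a hypothetical `Path` recorder class (this is not glom's `T`, just a sketch of the pattern). Every access returns another recorder, so there is no ambiguity about where the query ends, and nothing touches real data until `apply` is called:

```python
class Path:
    """Records item/attribute accesses without touching any data.

    Hypothetical stand-in for the pattern discussed above, not glom's T.
    """

    def __init__(self, steps=()):
        self._steps = tuple(steps)

    def __getitem__(self, key):
        return Path(self._steps + (("item", key),))

    def __getattr__(self, name):
        # Only reached for names not found normally; skip private names.
        if name.startswith("_"):
            raise AttributeError(name)
        return Path(self._steps + (("attr", name),))

    def apply(self, target):
        """Replay the recorded accesses against a concrete target."""
        for kind, key in self._steps:
            target = target[key] if kind == "item" else getattr(target, key)
        return target


# Build the spec once...
last_planet = Path()["system"]["planets"][-1]

# ...then reuse it on as many targets as needed.
a = {"system": {"planets": [{"name": "earth"}, {"name": "jupiter"}]}}
b = {"system": {"planets": [{"name": "mars"}]}}
print(last_planet.apply(a))  # {'name': 'jupiter'}
print(last_planet.apply(b))  # {'name': 'mars'}
```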

~~~
heavenlyblue
That’s called defining a function.

------
hessammehr
This might have been unintentional, but I suspect "Spectre of Structure" and
"Python's missing piece" refer to Nathan Marz's specter library for clojure
[1], similarly touted as clojure's missing piece. I tend to agree in the case
of specter, given the mind-boggling types of transformations that are easily
(and simply) expressed in it (and often run faster than idiomatic clojure as
well). Highly recommended if you ever need to work with deeply nested data
structures.

[1]
[https://github.com/nathanmarz/specter](https://github.com/nathanmarz/specter)

~~~
mhashemi
Total coincidence! Reading the README, Nathan and I are definitely on the same
wavelength though. When I get a chance I'll add it to the analogies doc:
[http://glom.readthedocs.io/en/latest/by_analogy.html](http://glom.readthedocs.io/en/latest/by_analogy.html)
:)

~~~
mhashemi
What's really interesting is that we're approaching the same ideal state from
different directions. Specter goes from Clojure's immutability to something
more practical, while glom goes from Python's super dynamic system to
something more declarative and immutable.

------
faitswulff
This is really cool. Did you ever consider an API to do the reverse - to
insert a value at a particular point in the data?

My interest stems from this issue[0] on the Ruby issue tracker to make a
symmetrical method to Hash#dig (which does something similar to, but more
limited than glom) called Hash#bury. The problem in the issue was that
inserting a value at a given index in an array proved difficult and unnatural
in Ruby, so I was wondering if there were other solutions out there.

Another question occurs to me - does glom only support string keys?

[0]: [https://bugs.ruby-lang.org/issues/11747](https://bugs.ruby-lang.org/issues/11747)

~~~
mhashemi
glom not only supports more than string keys, it also supports assigning to
non-dictionary objects. That's a part of the API we're working on right now,
actually.

As for the data insertion, mutation may be in the future, but for now glom
only transforms and returns new objects. Definitely something to think about
though, bookmarked! :)

~~~
hiccuphippo
Check the updateIn, mergeIn, mergeDeepIn methods in immutable.js. Maybe even
asMutable, asImmutable for mutation.

------
jbritton
There is also this Python lens library:
[https://github.com/ingolemo/python-lenses](https://github.com/ingolemo/python-lenses)

I can't say how they compare, but they have some overlapping features.

------
derefr
My favourite approach to this so far, that I would like other libraries to
copy, is Elixir’s Access protocol, which gives you e.g.:

    
    
        foo = %{key: [[1, 2], [3, 4], [5, 6]]}
    
        path = [:key, Access.all, Access.at(0)]
    
        get_in foo, path
        # => [1, 3, 5]
    
        update_in foo, path, &(&1 * 10)
        # => %{key: [[10, 2], [30, 4], [50, 6]]}
    
        foo
        |> put_in([:key, Access.all], "foo")
        |> put_in([:new_key], "bar")
        # => %{key: ["foo", "foo", "foo"], new_key: "bar"}
    

That third form is essentially the equivalent of building up a complex object
through a series of mutations—but entirely functional.
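
For comparison, a rough Python sketch of the same non-mutating `update_in` idea. The `ALL` marker and the `update_in` function are hypothetical names invented for this example and do not come from glom or any other library:

```python
ALL = object()  # hypothetical marker meaning "every element of this list"


def update_in(data, path, fn):
    """Return a copy of data with fn applied at path; the input is not mutated."""
    if not path:
        return fn(data)
    head, rest = path[0], path[1:]
    if head is ALL:
        return [update_in(item, rest, fn) for item in data]
    if isinstance(data, list):
        copy = list(data)
        copy[head] = update_in(data[head], rest, fn)
        return copy
    # Otherwise treat data as a dict and rebuild it with one key replaced.
    return {**data, head: update_in(data[head], rest, fn)}


foo = {"key": [[1, 2], [3, 4], [5, 6]]}

print(update_in(foo, ["key", ALL, 0], lambda x: x * 10))
# {'key': [[10, 2], [30, 4], [50, 6]]}
print(foo)
# {'key': [[1, 2], [3, 4], [5, 6]]}
```

Only the containers along the path are copied; everything else is shared between the old and new structures.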

------
tathougies
This seems like lenses for python... neat! I often use python to mess around
with things, and almost always miss Haskell's lenses when doing so. This seems
like an interesting solution.

------
amock
I haven't played with this yet, but it looks really handy. I deal with much
more JSON on the command line than I'd like, so I think having both a single
library and command line tool to reshape that data will make that much easier.
I've used jq a few times, but when I want to move a little beyond what it does
I usually end up writing a Python script. Hopefully this will make that
transition smoother.

~~~
mhashemi
Haha, I'm all for console usage, but let me tell you, there's nothing quite
like that feeling of moving a working spec into a dedicated application with
exception handling, logging, etc. :)

------
spedru
I'm not really versed in the idioms/social mores of Python, so please take the
following with a grain of salt:

This seems like it usefully solves a problem, but the invocation pattern is
suspect to me -- Instead of "glom" taking the target for picking-apart plus a
magic little bit of DSL, what if "glom" took a single parameter, the
aforementioned DSL, and returned a function that would perform the
corresponding search when called on a target? Even if Python or this package
optimises away repeatedly searching (by the same spec|in the same manner), the
convention the package prescribes is odd to me, right after the first few
paragraphs of intro.

~~~
seanc
Python regex library does this, optionally
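
For anyone unfamiliar, the standard `re` module offers both call shapes: a one-shot function that takes pattern and target together, and a precompiled object that is built once and reused. The strings below are just illustrative:

```python
import re

# One-shot form: pattern and target are passed together.
print(re.search(r"\d+", "jupiter has 69 moons").group())  # 69

# Precompiled form: build the pattern once, reuse it on many targets.
digits = re.compile(r"\d+")
counts = [digits.search(s).group() for s in ("1 moon", "69 moons")]
print(counts)  # ['1', '69']
```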

~~~
daturkel
Similarly, statistical distributions in SciPy can be used in "frozen" form
(pre-parameterized) or in a more general form where you supply the parameters
at the same time you are requesting some attribute of the distribution. Seems
to me to be a situation where one is useful if you expect reuse, and the other
is useful if you don't.

------
agf
It seems to me like the advantage to focus on here is the improved error /
`None` handling, which will speed debugging and make handling expected edge
cases easier. I've seen a lot of inexperienced developers tripped up entirely
by this kind of data access, and seen plenty of experienced developers waste
time debugging it because of the exact error cases the announcement
references.
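
To make the pain point concrete, here is a small sketch contrasting plain chained access with a helper that reports where the lookup failed. `NestedLookupError` and `get_path` are hypothetical names for this example and are not glom's API:

```python
class NestedLookupError(LookupError):
    """Hypothetical error type that remembers where the lookup failed."""


def get_path(target, path):
    """Walk path through target, raising an error that names the failing step."""
    current = target
    for depth, key in enumerate(path):
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError) as exc:
            raise NestedLookupError(
                f"could not access {key!r} at step {depth} of {path!r}: {exc!r}"
            ) from exc
    return current


data = {"a": {"b": None}}

# Plain access: the error says nothing about which level was None.
try:
    data["a"]["b"]["c"]
except TypeError as exc:
    print("plain:", exc)

# Path-aware access: the message names the key and its position in the path.
try:
    get_path(data, ["a", "b", "c"])
except NestedLookupError as exc:
    print("path-aware:", exc)
```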

The `T` object, which the article describes as its most powerful, can be a
useful pattern in some situations, but it's worth pointing out it isn't new or
unique to this project.

The author says in another thread here that he first started working on the
"stuff leading up to glom" in 2013. One older example, which is virtually
identical though less complete, is this Stack Overflow answer I posted in
2012:
[https://stackoverflow.com/a/9920723/500584](https://stackoverflow.com/a/9920723/500584)

I'd seen the general pattern even before that post, if not the Pythonic
syntax. I don't think that it's much of an improvement over defining a
`lambda`, so again I would say the thing to focus on is the improved
debugability and the simpler, dot-notation-as-generic-attribute-or-item-
accessor syntax. I think `T` is largely a distraction, or should be reserved
for advanced users.

~~~
heavenlyblue
I would like to see the author debugging an application with 10 levels of
object wrapping that had one of the middle object’s name misspelled.

Libraries like these shine only if they have brilliant tracing and debugging
capabilities; otherwise they are too easy to reduce to literally a single function.

~~~
doublereedkurt
[http://glom.readthedocs.io/en/latest/api.html#debugging](http://glom.readthedocs.io/en/latest/api.html#debugging)

There are affordances to add tracing prints, or drop into pdb at any level:

The Inspect specifier type provides a way to get visibility into glom’s
evaluation of a specification, enabling debugging of those tricky problems
that may arise with unexpected data.

Inspect can be inserted into an existing spec in one of two ways. First, as a
wrapper around the spec in question, or second, as an argument-less
placeholder wherever a spec could be.

Inspect supports several modes, controlled by keyword arguments. Its default,
no-argument mode, simply echoes the state of the glom at the point where it
appears:

------
tincholio
It looks quite similar in spirit to Clojure's Specter library
([https://github.com/nathanmarz/specter](https://github.com/nathanmarz/specter)),
and even seems to have a nod to it (The Spectre of Structure).

~~~
mhashemi
Oh nice! Total coincidence, I assure you. Still, I should read on this and add
it to the analogy doc:
[http://glom.readthedocs.io/en/latest/by_analogy.html](http://glom.readthedocs.io/en/latest/by_analogy.html)

Declarative data transformation generates a lot of comparisons (almost all of
them great, though!).

------
lapnitnelav
Looks really neat.

Striking a balance between ease of use / simplicity and powerful features is a
tough exercise but you did well.

I can foresee the CLI being quite useful to do away with the run-of-the-mill
sed / awk / grep [...] mess. Specifically for the less CLI inclined people out
there.

------
wbolster
in a similar spirit, i wrote "sanest", sane nested objects, tailored
specifically for json formats:
[https://sanest.readthedocs.io/](https://sanest.readthedocs.io/)

it does not have the exact same feature set though. my focus was mostly on
both reading and modifying nested structures in a type safe way.

------
bluemanshoe2
Compare/contrast with pstar:
[https://github.com/iansf/pstar](https://github.com/iansf/pstar)

------
icebraining
Can it be used bidirectionally, without having to repeat the work?

I have a need to transform between pairs of structures, in both directions,
and ever since I found JsonGrammar
([https://github.com/MedeaMelana/JsonGrammar2](https://github.com/MedeaMelana/JsonGrammar2))
I've been pining for a Python version.

~~~
mhashemi
It depends on the complexity of the spec, but we've already done some
programmatic building of glomspecs, so for many cases I think the answer is
yes! Once we feel out the patterns I think glom will gain some utilities for
this purpose.

~~~
icebraining
Thanks, that's awesome!

------
sixdimensional
I had a quick look, but I didn't see filtering expressions, only shaping
expressions. It seems like glom is more of a result shaper/mapper. Can you
filter with glom (maybe with lambdas or something)? I could see the two going
together quite well if you were "glomming" a big Python object.

~~~
mhashemi
Filtering is supported through the OMIT value:
[http://glom.readthedocs.io/en/latest/api.html#glom.OMIT](http://glom.readthedocs.io/en/latest/api.html#glom.OMIT)
(another example:
[http://glom.readthedocs.io/en/latest/snippets.html#filtered-...](http://glom.readthedocs.io/en/latest/snippets.html#filtered-iteration))

Lambdas and functions are always a safe fallback, but glom does its best to
keep your specs readable and roundtrippable (gotta love a nice repr()).
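
As a rough picture of the sentinel idea, here is a standard-library sketch. The `OMIT` object and `transform_each` function below are hypothetical stand-ins written for this example; glom ships its own `OMIT` with its own semantics:

```python
OMIT = object()  # hypothetical sentinel, standing in for the real glom.OMIT


def transform_each(items, fn):
    """Apply fn to each item, dropping any result that is the OMIT sentinel."""
    results = (fn(item) for item in items)
    return [r for r in results if r is not OMIT]


# Illustrative data only.
planets = [
    {"name": "earth", "moons": 1},
    {"name": "venus", "moons": 0},
    {"name": "jupiter", "moons": 69},
]

names_with_moons = transform_each(
    planets, lambda p: p["name"] if p["moons"] else OMIT
)
print(names_with_moons)  # ['earth', 'jupiter']
```

Returning a sentinel lets a single per-item function both reshape and filter, instead of needing a separate filtering pass.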

~~~
sixdimensional
That's cool! Thank you for pointing it out!

------
pdobsan
There is already a well established Gnome project with the same name:
[http://www.glom.org](http://www.glom.org) It is a GTK+ front-end to
PostgreSQL, similar to Microsoft Access.

------
harel
This reminds me a little of the excellent dpath lib:
[https://github.com/akesterson/dpath-python](https://github.com/akesterson/dpath-python)

------
jasonpeacock
I'm probably being dense, but I don't see a good description of the input data
types supported - the CLI says "json or python".

It would be great to have clarification if this is JSON only, or supports
other data structures, or parsers could be plugged in?

~~~
mhashemi
From within Python, all objects are supported by default. If you can parse it,
you can glom it. You can even register additional behaviors for specific types
to keep your specs tight:
[http://glom.readthedocs.io/en/latest/api.html#setup-and-regi...](http://glom.readthedocs.io/en/latest/api.html#setup-and-registration)
(example:
[http://glom.readthedocs.io/en/latest/snippets.html#automatic...](http://glom.readthedocs.io/en/latest/snippets.html#automatic-django-orm-type-handling))

The CLI is in a pretty preliminary state, usable but not as robust as it will
be in a few weeks. It only supports built-in parsers (JSON and Python
literals). What formats are you thinking? YAML?

~~~
jasonpeacock
Aha. Thanks! I understand better now, I didn't realize it was so general-
purpose :)

------
codezero
This is pretty neat and is something I've been thinking about for a current
project.

Does anyone know if something similar exists in Java/Scala land?

~~~
zaptheimpaler
I think this is similar to lenses in FP languages - check out Monocle for
Scala.

------
staticautomatic
Aside: are there any libraries for JSON or dict-like formats with xpath-style
querying that are as quick under the hood as lxml is for xml?

------
heavenlyblue
I think this project defines the first step at transitioning pip to a js-like
repository of single-function modules. Hurray for kool-aid.

------
loop0
I'm curious about how the author replaced DRF with glom

~~~
mhashemi
DRF still takes care of negotiating formats, etc., but it replaces the
serializers. I'll see about getting an example in the repo, stay tuned.

------
wildleaf
Shameless plug:
[https://www.npmjs.com/package/safely-nested](https://www.npmjs.com/package/safely-nested)

Nothing special, glom just reminded me of it.

------
boringg
Looks slick!

------
zestyping
The writing style is just insufferable. Even the API documentation is littered
with hyperbole and self-congratulation. We get it, you're proud of your work
and _extremely_ proud of yourself.

> "as simple and powerful as glom"

> "big things come in small packages"

> "small API with big functionality"

> "power is only surpassed by its intuitiveness"

> "simplicity is only surpassed by its utility"

> "shortest-named feature may be its most powerful"

For heaven's sake, give it a rest!

It's a big red flag about your priorities that when I go looking for a precise
specification, I can't find answers to simple questions and instead end up
wading through incessant marketing phrases. I tried, and I finally gave up
halfway through the API doc. It might even be the case that glom is a good
idea—but you're making it really hard to trust you as a source of objective
information about it.

Show, don't tell. My advice to you: you'll generate more interest if you
delete every congratulatory word on those pages and focus entirely on helping
your readers understand what glom does instead of trying to sell it to them.

~~~
mhashemi
Hey Ka-Ping! Maybe I did get carried away :)

How I wish one could publish a dry document and expect people to read all the
way to the bottom. I've published enough libraries to know that's not the
case. glom's free software so it's all there, as "shown" as can be.

But referring you to the code wouldn't be very considerate either. Instead,
here's this literate code version that I prepared in advance. Hopefully this
will be of more help to you:
[http://glom.readthedocs.io/en/latest/faq.html#how-does-glom-...](http://glom.readthedocs.io/en/latest/faq.html#how-does-glom-work)

~~~
liteye
In
[http://glom.readthedocs.io/en/latest/tutorial.html#access-gr...](http://glom.readthedocs.io/en/latest/tutorial.html#access-granted),

> After years of research and countless iterations, the glom team landed on
> this simple construct:

'years of research', 'countless iterations', 'glom _team_ ', really?

~~~
mhashemi
You're yellin at a tutorial man. You gotta let some flavor text slide. :)

That said, I'm no liar. Kurt and I (as a team), really did write stuff leading
up to glom in 2013 (years ago), and have written stuff like it enough times
that I've lost count (countless :P). If this isn't research, I don't know what
is. Heck, I'm even getting a fun little peer review!

~~~
mlevental
don't sweat it dude. I've noticed a lot of people on hn are crabby assholes
for absolutely no good reason. like on the post linking to Google's codelabs
(where there are hundreds of tuts about all sorts of things in the Google
ecosystem) there were only two comments and they were complaints. and recall
that every time an electron app is posted almost every comment is whining
about the performance. and every time a rust article is posted there's whining
about how it's more complicated than js. and every time there's a js article
posted there's whining about how it's not type safe like rust. and every time
someone posts a personal page someone has to point out how it's "garbage on
mobile" as if they're doing people a favor pointing out flaws (as if they
don't understand that mobile is the most heterogeneous platform out there). I
swear people don't know how to be grateful for free shit or just keep their
mouths shut when something doesn't tickle their own particular fancy. I wager
it's a defense mechanism because they themselves aren't making anything and so
they need to assert their superiority in some way (because people that are busy
doing stuff don't have time to complain about things irrelevant to their own
work). kudos to you and Kurt for releasing a library to the community that's
different and interesting and fuck the haters.

~~~
bmarkovic
This needs to become a copy pasta + a version where we s/hn/reddit/g and every
friggin salty post needs it shoved to them.

------
rodrigoalviani
Cool... it looks like... errr... javascript :D

