As someone who deals with geospatial processes like this daily, I have 2 notes.
1. STAC implementations are already complicated. Not everyone has good catalogues, or ones that work uniformly well for all types of queries.
2. Using STAC geoparquet on top, and then another layer on top of that, would mean we have to self-host yet another catalog, essentially a new standard.
In short, even though I believe STAC adoption is what we should aim for, in reality we usually end up building workarounds.
Totally agree. When I worked at Radiant Earth the most frequent question we got was "I have this data I want to share, how do I create a good STAC for it?"
It was a totally valid question but one that's practically impossible to answer. As a result there's just so much variation between STACs and you never really know what you're going to get.
Hi, Sid here, one of the authors of the blog. STAC being open and extensible makes it a double-edged sword, yes. Quite refreshing to hear this from someone who was at Radiant, because it shows we still haven't found a great way of sharing data.
What do you think of attempts like Source Cooperative?
There's clearly a need for Source Cooperative given the overwhelmingly positive feedback we received during the beta. However, Source Cooperative is entirely dependent on Microsoft and Amazon subsidizing all of the S3 / Azure Blob Storage costs. They could pull the rug out at any moment, like we've seen with Planetary Computer, and Source Cooperative would no longer be sustainable.
Disclaimer: I built Source Cooperative and left Radiant 2 months ago.
Agree on both fronts!
STAC is pretty complex. My attempt here was to make raw data access easy and fast, not to solve STAC. I believe stac-geoparquet is basically an attempt to fix that (it makes it columnar and hence faster to query at scale).
And yes, having Parquet files adds the overhead of needing some form of catalog. But I believe we are very close to having Iceberg with native geo types be that catalog. At the same time, it opens another can of worms (Databricks and other catalogs, etc.).
The silver lining is that Parquet (GeoParquet) brings geo data closer to regular data.
Sure! Glad you shared your honest opinion. But I just want to reiterate that the blog, and the library that will be released after it, are not meant to create a new standard. Throughout the blog it's been clearly stated that this is "a new approach", not "new standard" or "new format" or even "better standard".