
> Parsing arbitrary unstructured text is not good for composition and requires making something bespoke for each and every tool.

Right, but therein lies the flexibility of using unstructured text as the exchange format. Each tool doesn't need to be aware of the format it needs to consume or output; it just cares about what makes sense for itself and the user. It's up to the user to determine the best way of composing each tool within a specific pipeline. This might seem cumbersome, but there is a limited set of ways tools can be composed, and generic helper tools exist for a variety of use cases (xargs, sed, awk, grep, cut, paste, etc.). If one tool breaks, then it's easy to either replace it or fix the pipeline to support that specific tool.

In contrast, by making a strict contract of the exchange format, each tool needs to support a specific contract, and any deviation or update of it means that all tools need to be updated as well, while also maintaining backwards compatibility. This might be fine for a monolithic environment where a single project maintains all tools, but it's unsustainable if one wants to build an open ecosystem of tools that work independently, but can still be composed in any pipeline.



Except that means tools aren't doing "one thing" as in the Unix philosophy. Or at least I know of very few Unix tools that actually do that. At a minimum they need to bundle an argument parser (or more often reinvent their own). They also need to do the same for generating output, often in multiple different display formats depending on the precise flags used.

And I think you're overstating the contract argument. If `ls` changed its output format in any way (unless behind a flag) it'd break a heck of a lot of bash scripts. With a well-structured output format it's better able to maintain backwards compatibility while adding new features.
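
A rough sketch of that compatibility point, in Python. The two record formats and the function names below are made up for illustration; this is not the output of `ls` or of any real tool. It assumes a consumer that reads a file size either by column position or by key, and a producer that later adds an "owner" field.

```python
import json

def size_from_text(line: str) -> int:
    # Positional parsing: assumes the size is the 2nd whitespace-separated column.
    return int(line.split()[1])

def size_from_json(record: str) -> int:
    # Keyed parsing: looks the size up by name, wherever it sits in the record.
    return json.loads(record)["size"]

# Hypothetical "version 1" output of some tool.
old_text = "notes.txt 120"
old_json = '{"name": "notes.txt", "size": 120}'

# Hypothetical "version 2" output: an owner field was added before the size.
new_text = "notes.txt alice 120"
new_json = '{"name": "notes.txt", "owner": "alice", "size": 120}'

print(size_from_text(old_text), size_from_json(old_json))

# The keyed consumer is unaffected by the new field.
print(size_from_json(new_json))

# The positional consumer now tries to convert "alice" to an integer.
try:
    size_from_text(new_text)
except ValueError as exc:
    print("positional parser broke:", exc)
```

The keyed reader only breaks if an existing field is renamed or removed, which is the narrower contract being argued for here.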


> If `ls` changed its output format in any way (unless behind a flag) it'd break a heck of a lot of bash scripts.

The `jc` ("CLI tool and python library that converts the output of popular command-line tools, file-types, and common strings to JSON") documentation[1] & parser[2] for `ls` also demonstrate that reliable & cross-platform parsing of even the current output can be non-trivial, depending on the filenames encountered, flags used & host platform--which means there's probably a non-zero number of bash scripts that are already broken without their authors knowing it.

[1] https://kellyjonbrazil.github.io/jc/docs/parsers/ls

[2] https://github.com/kellyjonbrazil/jc/blob/4cd721be8595db52b6...
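
One way to see the filename problem without touching a real filesystem is the Python sketch below. It does not call `ls` or `jc`; the listing is simulated, and the filenames are hypothetical. It assumes a naive consumer that treats each line of a listing as one filename, and compares that with a JSON array of the same names.

```python
import json

# Two hypothetical filenames; the second contains a newline, which is legal
# on typical Unix filesystems.
names = ["report.txt", "draft\nfinal.txt"]

# Simulated one-name-per-line listing, as a line-oriented script would see it.
text_listing = "\n".join(names)
parsed_from_text = text_listing.split("\n")

# The same names carried as a JSON array, where the newline is escaped.
json_listing = json.dumps(names)
parsed_from_json = json.loads(json_listing)

print(len(names), "files on disk")
print(len(parsed_from_text), "entries seen by the line-based parser")
print(len(parsed_from_json), "entries seen by the JSON parser")
```

The line-based consumer sees one entry more than actually exists and never raises an error, which is the "broken but they don't know it" case.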



