It doesn't have to be this black and white, in my opinion. The most common data structures I've interacted with on the command line are _newline-separated_ or _JSON_. What if your shell allowed parsing of JSON natively?
npm list --json | ({ json }) => json.somePackage.author.name
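For comparison, the closest you can get today is piping through jq. A sketch, assuming the JSON really nested author data under each package the way this example pretends (real npm list --json output is shaped differently):

# hypothetical path, mirroring the example above
npm list --json | jq -r '.somePackage.author.name'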
What if it allowed parsing of XML natively, allowing instant scrapers to be written in a single command?
curl www.example.com | ({ xml }) => xml.find('.importantInformation') > scraped_info.txt
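Something close is already possible by chaining existing tools, just more awkwardly. A sketch using xmllint from libxml2, assuming the page survives its forgiving HTML parser and that the class name actually appears in the markup:

# --html tolerates real-world markup; stderr silenced to drop parser warnings
curl -s www.example.com | xmllint --html --xpath '//*[contains(@class, "importantInformation")]' - 2>/dev/null > scraped_info.txt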
Yeah, this is probably it right here. Just being able to parse various common formats would give a lot of the structure people are looking for. The builtins can then operate on that. Maybe it could optionally detect the formats too. If such a shell became popular, it would also incentivize developers to add these common formats as output options to more programs.
Newline-separated what? Basically all Linux config files are newline-separated, but that alone won't get you the equals-separated data pairs, indented blocks, pretend-XML sections, ad-hoc boolean representations, INI-style sections, or ad-hoc lists of values out of them.
> What if it allowed parsing of XML natively, allowing instant scrapers to be written in a single command?
PowerShell's Invoke-RestMethod parses XML natively into structured output; it's useful, but not enough to make XML fun or trivial to work with beyond simple root.node.foo.title scraping.
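For those shallow cases it really is that direct. A sketch, with a hypothetical URL and element names:

# Invoke-RestMethod returns an [xml] document for XML responses,
# so dotted access walks the tree
$doc = Invoke-RestMethod 'https://example.com/data.xml'
$doc.root.node.foo.title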
What I meant by this is that the newline character is the primary delimiter of the data, but I take your point. IMO, the best way to tackle the myriad ways of structuring data is to write parsers for them, not to rewrite the programs themselves to work with your shell's idea of structured data. That's a losing battle.
Interesting note: newlines are perfectly valid characters in file names. In fact, the only bytes that cannot appear in a file name are '/'[0] and NUL, which means you're asking for trouble trying to parse the in-band signaling you have to do with a stream that's only pretending to be structured.
[0] With a hex editor it is possible to create a file system entry using '/' in its name. Linux does not handle this situation with grace.
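A quick bash demo of the ambiguity:

# one file whose name contains a newline ($'...' is bash quoting)
touch $'one\ntwo'
ls | wc -l    # in an otherwise empty directory, this reports 2
# NUL-delimited streams sidestep the problem entirely
find . -maxdepth 1 -type f -print0 | xargs -0 ls -ld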
This is how I felt PowerShell ought to go on Linux: parsers from every config format to a common intermediate representation and back, plus argument completers mapping every common shell command onto a standard intermediate representation.
I think that's a losing battle as well because of its brittle nature: any tweak to any command's output could break something, meaning endless maintenance work on a large scale.
Rewriting the concepts into new commands at least brings consistency and a chance of shrinking the maintenance work as it settles on a nice design; and if not, the maintenance at least won't be endless rebuilds of the same parsers.
jq makes handling JSON in existing bash scripts simple. It can even do some simple data wrangling via built-in functions like map. It's not a universally standard command yet, but it's available on most modern systems.
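For example, doubling every element of a JSON array:

echo '[1, 2, 3]' | jq -c 'map(. * 2)'    # prints [2,4,6]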
I'm not sure about similar XML CLI tools, but they probably exist.
Point is, both of these should be CLI tools, not shell built-ins.