It doesn't have to be this black and white, in my opinion. The most common data structures I've interacted with on the command line are _newline-separated_ or _JSON_. What if your shell allowed parsing of JSON natively?
npm list --json | ({ json }) => json.somePackage.author.name
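For comparison, the closest you can get today is piping through jq. A sketch, assuming the JSON really nested author data under each package the way this example pretends (real npm list --json output is shaped differently):

# hypothetical path, mirroring the example above
npm list --json | jq -r '.somePackage.author.name'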
What if it allowed parsing of XML natively, allowing instant scrapers to be written in a single command?
curl www.example.com | ({ xml }) => xml.find('.importantInformation') > scraped_info.txt
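Something close is already possible by chaining existing tools, just more awkwardly. A sketch using xmllint from libxml2, assuming the page survives its forgiving HTML parser and that the class name actually appears in the markup:

# --html tolerates real-world markup; stderr silenced to drop parser warnings
curl -s www.example.com | xmllint --html --xpath '//*[contains(@class, "importantInformation")]' - 2>/dev/null > scraped_info.txt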
Yeah, this is probably it right here. Just being able to parse various common formats would give a lot of the structure people are looking for. The builtins can then operate on that. Maybe it could optionally detect the formats too. If such a shell became popular, it would also incentivize developers to add these common formats as output options to more programs.
Newline-separated what? Basically all Linux config files are newline-separated, but that alone won't get you the equals-separated data pairs, indented blocks, pretend-XML sections, ad-hoc boolean representations, INI-style sections, or ad-hoc lists of values out of them.
> What if it allowed parsing of XML natively, allowing instant scrapers to be written in a single command?
PowerShell's Invoke-RestMethod parses XML natively into structured output; it's useful, but not enough to make XML fun or trivial to work with beyond simple root.node.foo.title scraping.
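For those shallow cases it really is that direct. A sketch, with a hypothetical URL and element names:

# Invoke-RestMethod returns an [xml] document for XML responses,
# so dotted access walks the tree
$doc = Invoke-RestMethod 'https://example.com/data.xml'
$doc.root.node.foo.title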
What I meant by this is that the newline character is the primary delimiter of the data, but I take your point. IMO, the best way to tackle the myriad ways of structuring data is to write parsers for them, not to rewrite the programs themselves to work with your shell's idea of structured data. That's a losing battle.
Interesting note: newlines are perfectly valid characters in file names. In fact, the only bytes that cannot appear in a file name are '/'[0] and NUL, which means you're asking for trouble trying to parse the in-band signaling you have to do with a stream that's only pretending to be structured.
[0] With a hex editor it is possible to create a file system entry using '/' in its name. Linux does not handle this situation with grace.
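A quick bash demo of the ambiguity:

# one file whose name contains a newline ($'...' is bash quoting)
touch $'one\ntwo'
ls | wc -l    # in an otherwise empty directory, this reports 2
# NUL-delimited streams sidestep the problem entirely
find . -maxdepth 1 -type f -print0 | xargs -0 ls -ld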
This is how I felt PowerShell ought to go on Linux: parsers from every config format to a common intermediate representation and back, plus argument completers mapping every common shell command onto a standard intermediate representation.
I think that's a losing battle as well because of its brittle nature: any tweak to any command's output could break something, meaning endless maintenance work on a large scale.
Rewriting the concepts into new commands at least brings consistency and a chance of shrinking the maintenance work as it settles on a nice design; and if not, the maintenance at least won't be endless rebuilds of the same parsers.
jq makes handling JSON in existing bash scripts simple. It can even do some simple data wrangling via built-in functions like map. It's not a universally standard command yet, but it's available on most modern systems.
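For example, doubling every element of a JSON array:

echo '[1, 2, 3]' | jq -c 'map(. * 2)'    # prints [2,4,6]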
I'm not sure about similar XML CLI tools, but they probably exist.
Point is, both of these should be CLI tools, not shell built-ins.