
Show HN: Bo, the Swiss army knife of data examination and manipulation - kstenerud
https://github.com/kstenerud/bo
======
natch
Example usage from the readme below. There is no explanation whatsoever
provided as to what is going on with all the different formats.

    
    
        Build up test data for low level code.
    
        $ bo -n oh1l2 Pc if4b 1.5 1.25 ii2b 1000 2000 3000 ih1 ff fe 7a ib1 10001011
        0x3f, 0xc0, 0x00, 0x00, 0x3f, 0xa0, 0x00, 0x00, 0x03, 0xe8, 0x07, 0xd0, 0x0b, 0xb8, 0xff, 0xfe, 0x7a, 0x8b
    

What?

What kind of value starts with a lower case ‘o’ or a ‘P’? Or are those
commands? If so, why don’t they start with hyphens? Are they file names? If
so, why aren’t they preceded by -i or -o to match the docs? Is some of it
output?

If I used this, how would I explain my work to my coworkers? A tool should
solve problems, not give me new ones.

My approach is usually to make a tool friendly enough that a newbie can glance
at a command typed in by anyone and immediately see what it is supposed to be
doing, without reading any help file. But in this case even the help file
doesn’t help.
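
For anyone puzzling over the same example: going by the readings offered further down this thread (`if4b` as 4-byte big-endian floats, `ii2b` as 2-byte big-endian ints, `ih1` as hex bytes, `ib1` as a binary byte, `Pc` as a C-style preset), the output line can be reproduced in Python. This is only a sketch of what the tokens appear to mean, not a description of how bo works internally. The variable names are made up for illustration.

```python
import struct

# Assumed reading of the readme example, based on comments in this thread:
#   if4b 1.5 1.25        -> two 4-byte big-endian floats
#   ii2b 1000 2000 3000  -> three 2-byte big-endian integers
#   ih1 ff fe 7a         -> three bytes written in hex
#   ib1 10001011         -> one byte written in binary
data = (
    struct.pack(">2f", 1.5, 1.25)
    + struct.pack(">3h", 1000, 2000, 3000)
    + bytes.fromhex("fffe7a")
    + bytes([int("10001011", 2)])
)

# Assumed reading of "oh1l2 Pc": one byte at a time, two hex digits each,
# formatted like a C array initializer.
c_style = ", ".join(f"0x{b:02x}" for b in data)
print(c_style)
```

Under those assumptions the printed line matches the output quoted from the readme, which suggests the command is a list of typed input values followed through one output format.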

~~~
kstenerud
The HN crowd are my UX testers. Whatever people don't understand is easily
fixed with a git push!

~~~
mmastrac
I'll attempt to possibly convert the grandparent comment into something more
constructive. I'd note that your tool is probably the appropriate complexity
level compared to utilities of the same sort (e.g. sed, awk, dd).

But let's say we want to improve the UX of the tool for new users. As a
strawman, you could offer longhand forms of the mode switching flags for them.
For example:

bo -n oh1l2 Pc if4b 1.5 1.25 ii2b 1000 2000 3000 ih1 ff fe 7a ib1 10001011

Could be written long-form as:

    
    
      bo -n --output=type=hex,width=1,endian=l,width=2 --preset=c --input=type=float,width=4,endian=b 1.5 1.25 --input=type=int,width=2,endian=b 1000 ....
    

Each of your commands could have an advanced (oh1l2), medium (-o h1l2 or
--output h1l2) or beginner (--output=type=hex,width=1,endian=l,width=2)
syntax.

You could even print the shorthand version to stderr whenever one of the
simpler syntaxes was used.
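
A rough sketch of that last idea, in Python for brevity: translate the strawman long-form options into the shorthand tokens so they could be echoed to stderr. None of these long options exist in bo; the function name `to_shorthand` and both letter tables are hypothetical, inferred only from the examples in this thread.

```python
import sys

# Hypothetical tables, inferred from the examples in this thread.
KIND = {"output": "o", "input": "i", "preset": "P"}
TYPE = {"hex": "h", "float": "f", "int": "i", "binary": "b"}


def to_shorthand(option: str) -> str:
    """Turn e.g. '--input=type=float,width=4,endian=b' into 'if4b'."""
    name, _, spec = option.lstrip("-").partition("=")
    token = KIND[name]
    if name == "preset":
        return token + spec
    # Fields are kept in order; 'width' may appear twice, so no dict here.
    for field in spec.split(","):
        key, _, value = field.partition("=")
        token += TYPE[value] if key == "type" else value
    return token


if __name__ == "__main__":
    long_form = [
        "--output=type=hex,width=1,endian=l,width=2",
        "--preset=c",
        "--input=type=float,width=4,endian=b",
    ]
    # Echo the equivalent shorthand so new users can learn it over time.
    print(" ".join(to_shorthand(opt) for opt in long_form), file=sys.stderr)
```

Because the fields are positional in the shorthand, the sketch relies on the long form listing them in the same order.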

~~~
kstenerud
That's actually another part that fell between the cracks and I'm attempting
to address... Aside from the -n, everything is parsed as a whitespace-
delimited stream. They just happen to be passed as command line arguments, but
they're really just stream data. So all the input, output, formatting, and
data commands can be passed as one big string argument:

    
    
        bo -n "oh1l2 Pc if4b 1.5 1.25 ii2b 1000 2000 3000 ih1 ff fe 7a ib1 10001011"
    

Or as multiple arguments (no point here, just that you can do it):

    
    
        bo -n "oh1l2 Pc if4b 1.5 1.25" "ii2b 1000 2000 3000" "ih1 ff fe 7a ib1 10001011"
    

Or as a file from disk:

    
    
        echo "oh1l2 Pc if4b 1.5 1.25 ii2b 1000 2000 3000 ih1 ff fe 7a ib1 10001011" >mycommands.txt
        bo -n -i mycommands.txt
    

Or as multiple files from disk:

    
    
        echo "oh1l2 Pc if4b 1.5 1.25 ii2b 1000 2000" >mycommands1.txt
        echo "3000 ih1 ff fe 7a ib1 10001011" >mycommands2.txt
        bo -n -i mycommands1.txt -i mycommands2.txt
    

Or as stdin:

    
    
        echo "oh1l2 Pc if4b 1.5 1.25 ii2b 1000 2000 3000 ih1 ff fe 7a ib1 10001011" | bo -n -i -
    

Or as a crazy mix of all of that:

    
    
        echo "if4b 1.5 1.25" >float-32s.txt
        echo "ii2b 1000 2000 3000" >int-16s.txt
        echo "ih1 ff fe 7a" >hex-bytes.txt
        echo "ib1 10001011" >binary-bytes.txt
        bo -n -i float-32s.txt -i int-16s.txt -i hex-bytes.txt -i binary-bytes.txt oh1l2 Pc
    

It's designed to fit in with the unix file and pipe paradigm, and also be
composable so that you can have saved data chunks to use in multiple cases,
which by necessity makes it a bit opaque for someone approaching it for the
first time, but also makes it incredibly powerful when mixed with other unix
tools.

~~~
oliverevans96
You might be interested in docopt ([http://docopt.org/](http://docopt.org/)),
which is a multi-language command line parser that removes almost all of the
boilerplate. Whereas the versions of docopt for Python, etc. just require an
import, the C version emits C code to do the parsing. It may be worth a look!

------
JosephRedfern
It looks like a useful tool, thank you for taking the time to write it.

I realise that further down the page you explain what the various type
definitions mean, but it might be nice to make that clear in the initial
examples you give.

For instance, say:

    
    
        See the per-byte layout (_oh1_) of a larger integer in big (_ih4b_) or little (_ih4l_) endian format, separating each byte with a space (_Ps_).
    
        $ bo -n oh1 Ps ih4b 12345678
        12 34 56 78
        
        $ bo -n oh1 Ps ih4l 12345678
        78 56 34 12
    

Otherwise it looks a little bit like voodoo to the untrained eye!
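
For comparison, the same byte layouts can be checked in plain Python. This is an illustrative sketch only; `byte_layout` is a made-up helper, and its `b`/`l` argument simply mirrors the endian letters in the example above.

```python
def byte_layout(hex_value, width, endian):
    """Show a hex integer as space-separated bytes in the given byte order."""
    value = int(hex_value, 16)
    raw = value.to_bytes(width, "big" if endian == "b" else "little")
    return " ".join(f"{b:02x}" for b in raw)


print(byte_layout("12345678", 4, "b"))  # big endian, like ih4b
print(byte_layout("12345678", 4, "l"))  # little endian, like ih4l
```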

~~~
kstenerud
Yes, I was a bit torn between showing a bunch of what it can do, and getting
so verbose that people stop reading. This tool is surprisingly difficult to
explain concisely!

~~~
socialentp
That seems to be a reality of the “Swiss Army Knife for X” products. Tons of
features are a blessing and a curse. You should listen to this Masters of
Scale episode for an in-depth discussion of how to overcome the inherent
marketing challenges:
[https://mastersofscale.com/diane-greene-look-sideways/](https://mastersofscale.com/diane-greene-look-sideways/)

------
tehmoon
A really needed project! I started something similar last year because I felt
I needed it:
[https://github.com/tehmoon/cryptocli](https://github.com/tehmoon/cryptocli)
which can do a bunch of things. Perhaps you can get ideas!

Needless to say that I use it every day :p

------
doodhwala
Simple tool, elegant interface!

In the context of large files, do you think you would like to extend it to a
multi-threaded variant to fully exploit additional cores or will the overhead
due to threading and memory bottlenecks not provide any advantage?

~~~
kstenerud
It's fully reentrant and thread safe atm. I haven't thought about
multithreaded parsing & printing, though. Mostly it was intended as a quick
tool to get data into the right format for testing or debugging or writing
code.

------
mmastrac
This is pretty useful - my shell history is littered with invocations of node
(8333..toString(16)) or python (hex(ord('/'))) doing this by hand.

Would love to see a homebrew formula for it.

------
Angostura
You mean, when under pressure, and used in the wrong way it will suddenly fold
up and take a chunk out of its owner?

~~~
Razengan
Thanks for instilling a “fear” of that in me to watch out for when using Swiss
knives from now on.

~~~
Angostura
You'll be OK as long as you don't push down on the tip of the blade (imagine
trying to use the point to drill a hole in some leather). The blades don't
lock, so they can fold unexpectedly.

Other than that, lovely things. I was making a weak joke, which received the
downvotes it deserved :)

~~~
phillc73
It's a real threat and happened to me 25 years ago. I still have a scar across
the knuckle of my first finger and haven't used a Swiss Army knife since.
Leatherman or Gerber only for multi-tools now.

------
gaius
Presumably named for being a selector?

~~~
_ZeD_
Binary Output

