Was going to say I agreed with the sentiment. I didn't know about `struct` or `pack`, but Elixir (& Erlang) have a similar "special form" for binary structures.
Though reading both `pack` and `struct` the formats are about as obtuse as a regex. And the lack of individual bit support in `struct` is limiting.
Come to think of it, as other comments have already said, this could be a handy CLI streaming tool. Adjusting binary output on the CLI turns out to be a pain...
I think it might be more intuitive as an explicit stream modifier: hex, decimal, etc. No reason to turn anything “off”, which is actually the activation of a different mode.
0 for octal was a huge mistake. Most sane languages use 0o now, or just forget octal. How often do you use octal anyway? Like, once a year for file permissions?
Was the first thing that stood out to me as well. Toggled modes like this are also quite annoying when shuffling around sections of the file. Copy one hex too many and suddenly your numbers in another part are all wrong. And due to the toggle you'll have a hard time finding the place where it all went wrong.
As far as I could tell it's been more or less the only mode. At that point I'd probably get rid of it, because global state (which modes are) has to be remembered at all times, and in this case it's invisible until you see the output. The suggestion to use different operators or modifiers for single numbers was a good one in that direction IMO.
In addition to "pack" and "struct", for prior art (or inspiration), look at any number of structured fuzzing frameworks (Peach might be a good example); constructing arbitrary binary formats from a recipe is how you build a structured fuzzer. I built one a few years ago that pursued a sort of DOM model, with id's and classes and path strings.
In the other direction, given the ability to spit out one binary character from the shell, you can bootstrap all the rest of this in pure shell script. I cheated and gave myself int{8..64} and did a whole ASN.1/DER in shell script. The shell has better control flow than a custom language (but the custom language is a little simpler).
I did a KSH DER codec once too! It started out as an attempt to decode OIDs that were all over a codebase, and we had no tools to decode them, and I wanted to see if they were correct (indeed, they were not all correct). Then I extended that script a bit to other types. 'Twas a fun little side project.
With an FFI (which Bash has) one could do this much more easily than it was then (about 15 years ago, I think).
Length-prefixed strings would be useful for some file formats. You'd need to be able to pick a size for the length-prefix, such as u8,u16,etc.
Also some ability to length-prefix the output of a macro might be useful for tag-length-value-like formats.
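A minimal Python sketch of the length-prefix idea, using the standard `struct` module. The helper names `length_prefixed` and `tlv` are made up for illustration and are not part of the tool; the prefix width is picked with a `struct` format code ("B" for u8, "H" for u16, "I" for u32), and big-endian is only an assumed default.

```python
import struct

def length_prefixed(payload: bytes, prefix: str = "B", byteorder: str = ">") -> bytes:
    """Hypothetical helper: emit len(payload) in the chosen width, then the payload."""
    # struct.pack raises struct.error if the length does not fit the prefix width.
    return struct.pack(byteorder + prefix, len(payload)) + payload

def tlv(tag: int, value: bytes, prefix: str = "B") -> bytes:
    """Hypothetical helper: one-byte tag, length prefix, then the value."""
    return bytes([tag]) + length_prefixed(value, prefix)

print(length_prefixed(b"hello").hex())       # 0568656c6c6f
print(length_prefixed(b"hello", "H").hex())  # 000568656c6c6f
print(tlv(0x01, b"hi").hex())                # 01026869
```

Wrapping the output of a macro would amount to the same thing: render the macro to bytes first, then pass the result through `length_prefixed`.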
For Unicode, ability to write out UTF-16 or UTF-32 would be useful, also CESU-8 and Java's Modified UTF-8.
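For reference, a rough Python sketch of what those encodings produce. UTF-16 and UTF-32 come from the standard codecs; Python ships no CESU-8 or Modified UTF-8 codec, so `cesu8` and `java_modified_utf8` below are hand-rolled, hypothetical helpers that assume the input contains no lone surrogates.

```python
def cesu8(text: str) -> bytes:
    """Hypothetical helper: encode each UTF-16 code unit as if it were its own code point."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            # Split into a UTF-16 surrogate pair; each half becomes 3 bytes.
            cp -= 0x10000
            high = 0xD800 | (cp >> 10)
            low = 0xDC00 | (cp & 0x3FF)
            out += chr(high).encode("utf-8", "surrogatepass")
            out += chr(low).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8")
    return bytes(out)

def java_modified_utf8(text: str) -> bytes:
    """Hypothetical helper: CESU-8, except U+0000 is written as C0 80."""
    return cesu8(text).replace(b"\x00", b"\xc0\x80")

print("é".encode("utf-16-le").hex())   # e900
print("é".encode("utf-16-be").hex())   # 00e9
print("é".encode("utf-32-le").hex())   # e9000000
print("😀".encode("utf-8").hex())      # f09f9880
print(cesu8("😀").hex())               # eda0bdedb880
print(java_modified_utf8("a\x00").hex())  # 61c080
```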
Also, for a challenge, think about how you could write macros to generate ASN.1 formats such as BER. You'd probably need some more smarts in your macro language to handle such a task gracefully.
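To show why BER/DER needs more than plain substitution, here is a minimal Python sketch of the DER definite-length rule. It assumes single-byte (low-tag-number) tags only, and `der_length` and `der_tlv` are hypothetical names. The length field itself varies in size, so a macro would have to render its content before it could emit its own header.

```python
def der_length(n: int) -> bytes:
    """DER definite length: short form below 128, otherwise long form."""
    if n < 0:
        raise ValueError("length must be non-negative")
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # First byte: high bit set, low bits give the count of length bytes.
    return bytes([0x80 | len(body)]) + body

def der_tlv(tag: int, content: bytes) -> bytes:
    """Hypothetical helper: single-byte tag, DER length, then content."""
    return bytes([tag]) + der_length(len(content)) + content

integer_5 = der_tlv(0x02, b"\x05")      # INTEGER 5
sequence = der_tlv(0x30, integer_5)     # SEQUENCE { INTEGER 5 }
print(sequence.hex())                   # 3003020105
print(der_length(300).hex())            # 82012c
```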
Also, how does one control the endianness of output? Maybe you need u16le and u16be, or some kind of general endianness modifier.
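For comparison, this is roughly how Python's `struct` handles it: the byte order is a one-character prefix on the format string. The `u16le` and `u16be` functions below are just hypothetical wrappers named after the suggestion above.

```python
import struct

def u16le(value: int) -> bytes:
    """Hypothetical wrapper: unsigned 16-bit, little-endian."""
    return struct.pack("<H", value)

def u16be(value: int) -> bytes:
    """Hypothetical wrapper: unsigned 16-bit, big-endian."""
    return struct.pack(">H", value)

print(u16le(0x1234).hex())  # 3412
print(u16be(0x1234).hex())  # 1234
```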
Output of IEEE floating point might also be useful in some applications.
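As a point of reference, a small Python sketch of what IEEE 754 output looks like via `struct` ("f" is binary32, "d" is binary64). The function names are made up for illustration.

```python
import struct

def f32be(value: float) -> bytes:
    """Hypothetical wrapper: IEEE 754 binary32, big-endian."""
    return struct.pack(">f", value)

def f32le(value: float) -> bytes:
    """Hypothetical wrapper: IEEE 754 binary32, little-endian."""
    return struct.pack("<f", value)

def f64be(value: float) -> bytes:
    """Hypothetical wrapper: IEEE 754 binary64, big-endian."""
    return struct.pack(">d", value)

print(f32be(1.0).hex())  # 3f800000
print(f32le(1.0).hex())  # 0000803f
print(f64be(1.0).hex())  # 3ff0000000000000
```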
Cool. That patch used weird language, though ("a greater-than sign"). I think the people who don't already know how to write to a file might appreciate knowing what this is actually called (output redirection) so they can look it up.
I use fasm for this very use case. It has a wonderful macro language that can be used to programmatically generate binary data on top of being able to emit instructions. It's really cool to see other languages for generating binary files as it's an interesting niche to fill.
FlatAssembler is amazing.
I remember people making fractal images using only preprocessor [1], JVM bytecode compilers and a lot of other cool stuff.
Also it is a powerful editor: you can abuse the 'file', 'load' and 'store' directives to modify existing files.
Not sure if I will use it, but I like it. This is as close to raw bytes as it gets despite some HLL like features.
This to me is real hacking, saying what the heck and writing a tool for oneself instead of looking around and getting lost in the multitude of 'mature' options that exist. I am sure hacking this was quicker, and more fun, than browsing all the available options, picking one, installing the tools, learning to use them and getting the desired result.
Now, how long till someone decides to bootstrap this - use it to hack a binary that compiles it?
Oh definitely, this was a fun one to write. I got so caught up in it that I completely forgot the original project I wanted to use it for, which was a small VM.
Neat! I could have used this for generating mixed binary/ASCII payloads for network protocol testing. One issue, though -- it seems like trailing whitespace in the input isn't handled correctly. It seems to be picked up by the command processor and treated as a duplicate of the previous command. Amusingly, I was able to diagnose this using t2b by piping its output back through itself:
The problem with this kind of site and all those Awesome-XYZ lists is that it could take years to review each item.
I'm not saying building this kind of list is a bad thing, but I haven't found an efficient way to make use of lists like this and avoid diverting my focus.
I doubt that, given it took me weeks to find, review, and put up half those links. It would take some time, though. If nothing else, check out projectoberon.com and/or the amber slides:
I really wanted something like this when embedding serial numbers and keys in ROMs. At the time I was mostly working in C and PHP, so I was imagining something closer to a template language, but something like this would have given me a good intermediate step where I could have used PHP to render this text format and then run that to produce the necessary bits on disk.
It would be great if I could specify the number of bits per field like in Elixir/Erlang/C++ bit fields and then also specify what the extra bits should be set to if the total number of bits isn't divisible by 8. I know that document strings are helpful but writing it into the code would be much more robust.
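A minimal Python sketch of what that could look like, assuming fields are packed most-significant-bit first and the caller chooses the fill bit. `pack_bits` is a hypothetical helper, not something the tool provides.

```python
def pack_bits(fields, pad_bit=0):
    """Hypothetical helper: pack (value, width) pairs MSB-first, then pad to a byte boundary."""
    acc = 0
    nbits = 0
    for value, width in fields:
        if value < 0 or value >> width:
            raise ValueError(f"{value} does not fit in {width} bits")
        acc = (acc << width) | value
        nbits += width
    pad = -nbits % 8                       # bits needed to reach a multiple of 8
    fill = (1 << pad) - 1 if pad_bit else 0
    acc = (acc << pad) | fill
    return acc.to_bytes((nbits + pad) // 8, "big")

# A 3-bit field and a 2-bit field leave 3 bits of padding.
print(pack_bits([(0b101, 3), (0b11, 2)]).hex())             # b8
print(pack_bits([(0b101, 3), (0b11, 2)], pad_bit=1).hex())  # bf
print(pack_bits([(0xF, 4), (0xA, 4), (1, 1)]).hex())        # fa80
```

Making the padding an explicit argument keeps the choice in the code rather than in a comment next to it.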
It's shared project configuration. If you're using the same IDE as the author (CLion in this case) you get all the code style settings, run/debug configurations and shared project config/settings without any setup required. If you're not using CLion just ignore the files, they don't do any harm.
Yep. It helps me a lot, because sometimes I switch between my Mac and Windows boxes. Especially in this case where I had to verify/release the build on two platforms.
The only downside is that not everyone uses the same editor, so in projects with more contributors, it can quickly create bloat.
- the hex modifier turns the mode both on and off, which will be confusing in a longer file
- it would be good if prefixes such as 0x or bx were handled to temporarily override the current setting
- using "get var" to actually output something is weird, I'd use "put"
- add a way to handle endianness