This is nice! I like the colorization a lot -- it's like with source code, you don't realize how much you rely on the colors until you get a new computer and need to download syntax definitions for your editor again. Only feature I really think is missing is line numbers.
My first thought too. Then I looked and saw it was nice, and then I noticed it's by sharkdp. He really has a knack for improving on things you thought were fine (fd and bat being other examples).
edit: Added in this commit: https://github.com/sharkdp/hexyl/commit/91a119f4537f746045c8...
I wonder though, whether I can colour code 'blocks' to differentiate them? One of the things I seemed to do a lot in hex editors was to check out the differentiation in, for example, 72 byte blocks of data, so to be able to delineate 72 byte blocks in different colours in the hex editor would make it easier to see where each block starts/ends.
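A rough sketch of what block colouring could look like, assuming a fixed block size and 16 bytes per output line. This is not something hexyl is known to offer; the names `block_colour` and `dump` and the two-colour palette are made up for illustration.

```python
# Hypothetical sketch: alternate ANSI colours per fixed-size block of bytes.
BLOCK_SIZE = 72          # block length taken from the comment above
BYTES_PER_LINE = 16      # assumed display width
COLOURS = ["\x1b[36m", "\x1b[33m"]  # cyan, yellow (arbitrary choice)
RESET = "\x1b[0m"


def block_colour(offset, block_size=BLOCK_SIZE):
    """Return the colour escape for the block that contains `offset`."""
    return COLOURS[(offset // block_size) % len(COLOURS)]


def dump(data, block_size=BLOCK_SIZE):
    """Render `data` as hex lines, colouring each byte by its block."""
    lines = []
    for start in range(0, len(data), BYTES_PER_LINE):
        chunk = data[start:start + BYTES_PER_LINE]
        cells = [
            f"{block_colour(start + i, block_size)}{b:02x}{RESET}"
            for i, b in enumerate(chunk)
        ]
        lines.append(f"{start:08x}: " + " ".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    # Colour changes at bytes 72 and 144, partway through a line.
    print(dump(bytes(range(160))))
```

Because 72 is not a multiple of 16, the colour switches partway through a line, which is exactly where the block boundary falls.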
00006980: 0000 0000 0000 0000 0000 0000 0000 0000
line number = floor(X/(bytes/line)) + 1
column offset = X % (bytes/line) + 1
Or it could just print byte numbers instead of line numbers.
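The two formulas above fit in one `divmod` call. A minimal sketch, assuming 16 bytes per line as in the xxd output; `locate` is just an illustrative name.

```python
def locate(offset, bytes_per_line=16):
    """Map a byte offset to a 1-based (line number, column offset) pair."""
    line, column = divmod(offset, bytes_per_line)
    return line + 1, column + 1


if __name__ == "__main__":
    # The address from the xxd line above, 0x6980, starts a new line.
    print(locate(0x6980))
```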
I've never thought of xxd as being anything but a "Vim thing".
That it had its own history (1990-1998) before being pulled into Vim is a TIL for me.
Here you go:
There might still be other versions of this out there, but I suspect most OS distros pick up xxd by way of packaging Vim.
So if you work on some other xxd, you're probably in a dead fork.
The man page is also from Vim:
My first patch to xxd, if I were to work on it, would be to fix the man page to indicate clearly that it's the xxd bundled with Vim.
This also solves the problem of moving the cursor and drawing a screen in general.
If anything, I've found doing this works better than trying to muck around with ncurses, which often gets actively misled by incorrect terminfo entries.
I get the idea behind your question, but it irks me a bit since it implies a sort of responsibility to contribute to an existing piece of software. This is a net positive of course, but the beauty of open source is the freedom to work on the things you want to work on.
The border lines don't help much either and seem more like decoration; addresses to the left and an indexing header on the top, like the traditional "canonical" hexdump format, would be far more useful.
I did use it religiously for a long, long time; then tried without for a week and never went back.
It's not like telling the difference between string literals and keywords was ever a major issue for me. I guess it could help short term when learning a new language, but I'm pretty sure it slows down overall progress.
Slow compiles have the same effect for me, I get pushed out of flow and into reactive mode.
The only way to write good code is to be proactive, and any features that interfere with that process are worse than whatever "problems" they "solve".
Hexyl would look like:
.. which is apparently also called "Hexyl". Granted, I was just looking for a short word that starts with "Hex" and I was never good at organic chemistry :-)
xxd `which hexyl` > /dev/null 0.12s user 0.09s system 99% cpu 0.219 total
hexdump `which hexyl` > /dev/null 0.19s user 0.05s system 91% cpu 0.256 total
hexyl `which hexyl` > /dev/null 1.69s user 0.39s system 95% cpu 2.175 total
hyperfine './target/release/hexyl ./target/release/hexyl' 'xxd ./target/release/hexyl' 'hexdump ./target/release/hexyl'
Benchmark #1: ./target/release/hexyl ./target/release/hexyl
Time (mean ± σ): 1.529 s ± 0.028 s [User: 1.476 s, System: 0.050 s]
Range (min … max): 1.491 s … 1.581 s 10 runs
Benchmark #2: xxd ./target/release/hexyl
Time (mean ± σ): 70.5 ms ± 0.5 ms [User: 68.0 ms, System: 1.2 ms]
Range (min … max): 69.5 ms … 72.3 ms 41 runs
Benchmark #3: hexdump ./target/release/hexyl
Time (mean ± σ): 262.4 ms ± 2.8 ms [User: 260.1 ms, System: 1.5 ms]
Range (min … max): 259.8 ms … 268.8 ms 11 runs
'xxd ./target/release/hexyl' ran
3.72 ± 0.05 times faster than 'hexdump ./target/release/hexyl'
21.70 ± 0.43 times faster than './target/release/hexyl ./target/release/hexyl'
Yes, it's a shame. But I don't think there is too much we can do about it. We have to print much more to the console due to the ANSI escape codes and we also have to do some conditional checks ON EACH BYTE in order to colorize them correctly. Surely there are some ways to speed everything up a little bit, but in the end I don't think it's a real issue. Nobody is going to look at 1MB dumps in a console hex viewer (that's 60,000 lines of output!) without restricting it to some region. And if somebody really wants to, he can probably spare 1.5 seconds to wait for the output :-)
A few extra comparisons and output for each byte shouldn't be that much slower; fortunately the function of this program is extremely well-defined, so we can calculate some estimates. Assuming a billion instructions per second, taking ~1.5s to hexdump ~1 million bytes means each byte is consuming ~1500 instructions to process. In reality the time above is probably on a faster CPU, so that number may be 2-3x more. That is a shockingly high number just to split a byte into two nybbles (expected to be 1-3 instructions), convert the nybbles into ASCII (~3 instructions), and decide on the colour (let's be very generous and say ~100 instructions.)
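The estimate can be redone as a few lines of arithmetic. The inputs are the rough figures from the comment (1.5 s, about a million bytes, an assumed billion instructions per second), not measurements, and the function name is only for illustration.

```python
def instructions_per_byte(elapsed_s, n_bytes, instructions_per_s=1e9):
    """Back-of-envelope cost per byte at an assumed instruction rate."""
    return elapsed_s * instructions_per_s / n_bytes


# Generous per-byte budget from the comment: split into nybbles (up to 3),
# convert to ASCII (about 3), choose a colour (about 100).
BUDGET = 3 + 3 + 100

if __name__ == "__main__":
    observed = instructions_per_byte(1.5, 1_000_000)
    print(f"observed: about {observed:.0f} instructions per byte")
    print(f"budget:   about {BUDGET} instructions per byte")
    print(f"ratio:    about {observed / BUDGET:.0f}x over budget")
```

Even with the generous colour budget, the observed figure comes out more than ten times higher, which is the gap the comment is pointing at.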
The fact that the binary itself is >1MB is also rather surprising, especially given that the source (not familiar with Rust, but still understandable) seems quite small and straightforward.
Did Rust become less dependent on allocator performance, or did system allocators improve enough? IIRC glibc malloc has improved a lot over the last few years, particularly for multithreaded use, but I don't know about Windows / macOS.
So, now that we have a stable way to let you use jemalloc, the right default for a systems language is to use the system allocator. If jemalloc makes sense for you, you can still use it, but if not, you save a not-insignificant amount of binary size, which matters to a lot of people. See the parent I originally replied to for an example of a very common response when looking at Rust binary sizes.
It's really more about letting you choose the tradeoff than it is about specific improvements between the allocators.
(If it's spending a lot of time in Rust's format function you could also use a (or the same) lookup table to convert to hex/dec/oct.)
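A sketch of the lookup-table idea, written in Python rather than Rust purely for illustration: the 256 hex strings are built once, and each byte then costs one index operation instead of a call to a general-purpose formatter. `HEX_TABLE` and `hex_line` are hypothetical names, and this is not how hexyl itself is written.

```python
# Precompute the two-digit hex string for every possible byte value.
HEX_TABLE = [f"{value:02x}" for value in range(256)]

# The same approach works for other bases, e.g. zero-padded octal.
OCT_TABLE = [f"{value:03o}" for value in range(256)]


def hex_line(chunk):
    """Format a chunk of bytes using table lookups only."""
    return " ".join(HEX_TABLE[b] for b in chunk)


if __name__ == "__main__":
    print(hex_line(b"hexyl"))
```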
Edit: Turns out to be about 22% overhead, see https://github.com/sharkdp/hexyl/pull/23. Also it was 2 strings per byte, not 1.
Benchmark #1: hexyl $(which hexyl)
Time (mean ± σ): 169.8 ms ± 8.2 ms [User: 152.5 ms, System: 17.1 ms]
Range (min … max): 162.2 ms … 189.1 ms 16 runs
Benchmark #2: hexdump -C $(which hexyl)
Time (mean ± σ): 188.5 ms ± 4.4 ms [User: 186.2 ms, System: 2.2 ms]
Range (min … max): 184.1 ms … 198.2 ms 14 runs
Benchmark #3: xxd $(which hexyl)
Time (mean ± σ): 72.8 ms ± 2.7 ms [User: 71.9 ms, System: 1.1 ms]
Range (min … max): 71.0 ms … 87.8 ms 40 runs
Most of the improvement came from not using printf, fputs, and putchar in favour of operating directly on an array for the line that can be fwritten in one call.
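The comment describes a C change (filling an array and calling fwrite once per line). A loose Python analogue of the same idea, under the assumption of 16 bytes per line and an xxd-like layout: each line is assembled in a `bytearray` and handed to the output stream in a single write. The function names are made up for this sketch.

```python
import sys

HEX_DIGITS = b"0123456789abcdef"
BYTES_PER_LINE = 16  # assumed width


def format_line(offset, chunk):
    """Build one complete output line in a buffer, without printing."""
    line = bytearray(b"%08x: " % offset)
    for b in chunk:
        line.append(HEX_DIGITS[b >> 4])     # high nybble
        line.append(HEX_DIGITS[b & 0x0F])   # low nybble
        line.append(0x20)                   # separator space
    line[-1] = 0x0A  # turn the trailing space into a newline
    return bytes(line)


def dump_lines(data):
    """Yield one finished line per BYTES_PER_LINE bytes of input."""
    for start in range(0, len(data), BYTES_PER_LINE):
        yield format_line(start, data[start:start + BYTES_PER_LINE])


if __name__ == "__main__":
    out = sys.stdout.buffer
    for line in dump_lines(bytes(range(48))):
        out.write(line)  # one write call per line, not one per byte
```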
He also authors https://github.com/sharkdp/bat, which is also worth checking out.
As someone who is considering learning Rust, this may just have pushed me over the edge.
There's also a vim mode for hex, which I use less often:
Many moons ago I wrote a very bare bones ncurses hex viewer/editor, after not really being able to find one that fit my needs. I still use it pretty regularly:
This little program should fit nicely into that category. It's an area I would like to explore, but I have little to no incentive to do so.
Good work on these tools though. You make me want to learn Rust :)
I removed my fork...
Afterwards you will probably want to run experiments with a hex editor, but I found that xxd + vim was better for reading than doing so directly in a hex editor at the time. If I did it again I'd likely start with this because of the color.
For other things, I've used it for examining network packets (binary protocols), mystery files, and proprietary data formats. I've also used a hex editor to hack executables (most notably hacking in a dark theme into Unity3D).
This is freaking awesome. Been looking for something like this for a while!
Or the Joe editor in -hex mode: https://joe-editor.sourceforge.io/
i do seem to have 256 colors in urxvt
my $TERM is rxvt-256color and i have rxvt-unicode-256color installed