fqing finally. It always seemed strange that there wasn’t any central database of binary parsers that everyone could contribute to. Nearly every file format is fully documented, but none of the docs are programmatic.
I was trying to rip some sounds from a wii game called Rhythm Heaven, and it’s ridiculous how primitive the tech is. By that I mean the programming community’s tech. If you want to extract some assets, you’d better be running Windows, and you’ll need to download some random exe from mediafire made by Jared, a 13yo that coded the extractor in C in his spare time. This is only a very slight exaggeration; Windows being a requirement isn’t.
Hopefully projects like this will standardize all binary formats once and for all.
Actually, this is a good opportunity to ask: how would one contribute a binary parser to fq? If I wrote one for wii sound files, can I just submit a PR or is there some other process?
EDIT: https://github.com/wader/fq/blob/master/doc/dev.md documents the development side of things. I’m more interested in the project itself — if someone puts in the work to make a decoder for an obscure binary format, will it get merged (assuming it’s high quality) or is this only for popular formats?
There is the 010 Editor, at heart a cross-platform scriptable hex editor with a template language [1]. It has a central template repository [2] as well as templates around the internet (e.g. 3, 4).
But it being a paid tool means there are fewer template contributions from 13-year-olds, who, if we are all honest, account for the majority of unpaid open source contributions: they simply have more spare time.
Do 13 year olds care about licenses and pay for the software they use? I certainly didn't when I was 13. And 010 editor is about as well protected as WinRAR.
I have a hard time imagining a 13 year old using that particular tool buying it. In fact there is a good chance it is used by the crackers themselves.
010editor's templating language looks very interesting (and something I had been trying to accomplish with my own mixed-bag of tools when reverse-engineering). I suppose as a hobbyist, the price is a hard one to get over...
Being a long-time personal friend of the author, I can assure you: the more obscure the better :-) His interest in esoteric things and solutions is "well documented" if you browse around his GitHub repos.
Fun fact: fq is kind of the spiritual successor of https://github.com/wader/flac.tcl, which was used to prototype some things; you can see traces of it in flac_frame.go in fq :)
Kaitai Struct is the better way to go for an ecosystem solution, but the tooling could certainly use improvements. In addition to 010 Editor, there's also KDE's Okteta. It does not have a lot of good OSDs and the OSD format/scripting for specifying formats is a little anemic (I'd like to help improve it if I can find time...) but it's very serviceable and a decent open source alternative to what 010 has. Shameless plug, I made a decent Windows EXE/PE OSD for Okteta. (It's even got a bit of support for NE16 executables, just for fun.)
This entire genre of tools has been a long time point of interest for me. In addition to making a couple OSDs and contributing some tiny improvements to Kaitai, I also have my own binary schema library for Go, restruct, which, biased or not, remains my favorite way to poke at arbitrary formats, since it's really easy to sketch stuff out and read and write to files quickly. It's basically Go's encoding/binary but with struct tags for more advanced things.
> This entire genre of tools has been a long time point of interest for me.
Same for me! It turned into a long journey and I am working on a solution that I am very happy about.
Somewhere mid-journey I learned about Kaitai Struct and lost a bit of steam seeing it was similar. But I think my offering is superior, with a simpler template format, less programming required and a nice CLI app.
I have yet to announce it publicly, though I have been meaning to for a long time.
If you would like to check it out I would be happy!
You can use it to map and view content from any format there is a template for. A lot of common formats are already included, and you can extend it using your own templates.
As someone who had a great time with Kaitai, may I suggest that you write an interface so that fq can be used with any format that Kaitai understands (and any that people add in the future)?
The functionality that I am personally interested in from a binary parsing framework like Kaitai is generating an encoder implementation in addition to a decoder one. In other words, given a description of a binary format, I would like to be able to construct an instance of a class whose memory layout matches the format. For instance, if the format has an int n, then an array `a` of size `n`, and then a double `d`, it would be awesome to be able to construct a corresponding object with fields `n`, `a` and `d` and when I change `n`, then the size of `a` changes accordingly. And then, if I pass a pointer to this object to the decoder, it would be able to parse it correctly, as if the memory representation of the object came from some external buffer.
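The example format in that comment (an int `n`, an array `a` of `n` elements, then a double `d`) can be sketched as a round trip in a few lines. This is only an illustration: the `Record` class, the little-endian 32-bit integer layout and the zero-padding rule are assumptions made for the example, it serializes to bytes rather than sharing a memory layout, and it is not how Kaitai or fq work.

```python
import struct


class Record:
    """Hypothetical format: int32 n, then n int32 values, then a float64 d.

    All fields are little-endian with no padding (an assumption for this sketch).
    """

    def __init__(self, a=(), d=0.0):
        self.a = list(a)
        self.d = d

    @property
    def n(self):
        # n is derived from the array, so the two can never disagree.
        return len(self.a)

    @n.setter
    def n(self, value):
        # Changing n truncates the array or pads it with zeros.
        self.a = self.a[:value] + [0] * max(0, value - len(self.a))

    def encode(self):
        # Layout: n, then the n array elements, then d.
        return struct.pack(f"<i{self.n}id", self.n, *self.a, self.d)

    @classmethod
    def decode(cls, buf):
        (n,) = struct.unpack_from("<i", buf, 0)
        a = struct.unpack_from(f"<{n}i", buf, 4)
        (d,) = struct.unpack_from("<d", buf, 4 + 4 * n)
        return cls(a, d)


record = Record([1, 2, 3], 1.5)
record.n = 5                      # the array grows to match
restored = Record.decode(record.encode())
```

Deriving `n` from the array instead of storing it separately is one way to keep the two consistent; a generated encoder would need a similar rule for every dependent field, which is part of why serialization is harder than parsing.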
Kaitai support for serialization has been a long time issue. It's obviously non-trivial given that it has at least one case that doesn't exist today (instantiating a structure without loading any existing data.)
It seems strange, because it's not reality! Forensic tools like FTK and Autopsy have had a plug-in framework for these forever, speaking as a former contributor to the former. There's also Kaitai Struct.
I'm sure other communities have popped up that I haven't heard of, too. There's lots of interest in unifying forensic parsing under open work.
I'm working on an open template format for binary file formats. It is usable today as a universal file extractor, with some bugs and limitations.
> It always seemed strange that there wasn’t any central database of binary parsers that everyone could contribute to.
Fully agree on the need for such a database. The problem is that it's been tried, but lacked traction. The latest one I've seen is Kaitai format files[1], which can be used in visualizers or to auto-generate parsers.
I used to be part of the romhacking community back in the 2000s, and because everyone used Windows, open source/FOSS wasn't even known to most people. The culture of Windows programmers is way more focused on freeware/binaries.
Still waiting to this day for FuSoYa to release the source code of Lunar Magic.
About a central database of binary parsers, I've been wanting this for ages too. The closest I ever found was augeas, but that's for configuration files.
> About a central database of binary parsers, I've been wanting this for ages too. The closest I ever found was augeas, but that's for configuration files.
I'm working on an open template format for binary file formats. It is usable today as a universal file extractor, with some bugs and limitations.
Yeah that’s a mysterious one. Such an incredible achievement, and it enabled so, so much creativity, and free (as in beer) to all as far as I know. I hope FuSoYa does open source it someday.
> I was trying to rip some sounds from a wii game called Rhythm Heaven, and it’s ridiculous how primitive the tech is.
This seems like a pretty niche need with only some hobbyists motivated enough to work on it. Is there a broader application than your use case? Otherwise I think that's why the existing software for this isn't great.
Yeah, I didn’t mean to sound entitled. I only meant I was excited for projects like fq to shake things up. When I was writing a parser for the sound file I was thinking “hmm… this really feels like duplicate work.”
On the other hand, it’s surprising to me that “grab sounds from a wii game” is so niche! My gut felt like it would be slightly more complicated than unzipping a tarball, but my gut didn’t expect it to be a programming challenge worthy of a small competition.
Seriously. Nintendo is an evil corporation. I was about to write “close to,” but they crossed the line when they sent someone to prison and garnished his wages for the rest of his life.
Not just someone: the dude was not ripping or cracking stuff for fun. He made a business out of it and, what's worse, he added ransomware to scam his “customers”.
If he added ransomware, that’s a bit different. Making a business out of it isn’t that bad (think about it in terms of people going to prison for assault vs merely making some money), but the ransomware would be.
Still, garnishing someone’s wages the rest of his life seems out of proportion. But I admit it’s harder to defend someone that made a livelihood out of holding peoples’ data hostage.
I really don't think protecting IP produced at company expense is "evil." That's their prerogative, and people knowingly violating agreements/ToS are playing with fire.
I miss the ResEdit days of the Macintosh (i.e. Mac OS 7, 8 and 9). You could see/steal/modify/hack the visual assets of most binaries. You could remap keyboard shortcuts, modify menus, etc.
Hi, sorry for the delay, on vacation. There is no process really, other than convincing me :) and I think I will accept any decoder that is for a format used in public, standardized or proprietary.
I do want to add some kind of runtime format support, and I'm working on adding Kaitai support, but it's not ready yet. It's not an easy thing to do :) but I've made very good progress. Ideally it will be something like: fq -d format.ksy <query> file
Try unblob sometime, it’s a more modern, maintained alternative (not a fork). A company called OneKey that does some firmware security stuff maintains it, and generally it’s pretty good.
That's also my experience. People releasing their tools on obscure forums, usually without source code and with no version control. Almost entire communities that haven't heard of GitHub. Though usually the tools work in Wine... but are next to useless. GUI only and very clumsy to use. So many clicks for each single file to extract, no batch operations, no command line interface. A big WTF. How can you do any modding with those tools?
It's hard to not read this as elitist and entitled, to be honest. There are many people who know that GitHub exists but don't want to use it, and they also don't care for you having their source code. It's a choice they make. This entitlement is interesting because the solution obviously is to just make your own tools and not be a lazy library/tool hunter all the time.
If they do less work, i.e. not write a GUI, but just provide a command line tool you can suddenly automate things and it all gets much easier to use.
Anyway, I then wrote my own tools and put them on GitHub. And documented the reverse engineered file formats. And then got pull requests from people that want to help!
I was just voicing my bewilderment at this other culture. Don't understand why one would want to live like that.
I love that this includes a section in the README about other tools that are similar or related to fq. Every open source project should list its competitors.
It would be good if some form of externally pluggable binary format specification were doable in the future. As far as I can see, if the binary format is not supported out of the box, you can't use this tool.
I second poke. It's an amazing tool for debugging in general.
It's relatively rare to look into standardized binary formats (you'll likely look directly into a library at that point), unless you're writing a writer/parser/decoder yourself and need to double-check the output.
When developing with general binary data in mind, poke is much more useful.
Hi, I'm working on runtime Kaitai support and have made good progress, but it's quite a big task. Keep an eye on https://github.com/wader/fq/issues/627 if you're interested.
This is what I like about PowerShell. It passes objects via the pipeline, and if you need to query or filter something, you don't need to learn millions of different tools (jq, xmlstarlet, etc.): just use programming language features for everything.
Hi, fq author here, and also an acquaintance of the GNU poke author. One difference is that fq is focused on decoding and querying, while poke might be more focused on editing and runtime modeling of formats. But we have both inspired each other's thinking.
I've written a parser of Java class files (which works in any JVM language as they all compile to the same bytecode format). It was surprisingly easy! Maybe that could be useful to analyse class files in jq??!
Did I miss something or is there no Ubuntu or Debian installer?
I certainly know how to download a file and add it to my path (or put it in my personal bin directory) but sure would be nice to have a super simple installer.
LOL - please do not make a snap or whatever the hell the "cool kids" use. I certainly wouldn't want to advocate continued use of that pattern for utility functions.
Since this is written in Go, it's almost trivial to use fpm [1] to generate a variety of packages. Alternatively you can use nfpm [2] if you don't want to have to deal with installing Ruby & a gem.
Do you have access to the program that reads the data? If so, you can use a debugger to step through the parser for the file, even if symbols are stripped [1]. You can breakpoint on syscalls, such as when the file gets opened [2] and then step through and look around memory for the decrypted version. If you have an idea of what the file should contain you can probably identify patterns this way.
I'm not an expert on this topic at all though.
[1] Of course you then have less information but it's still possible to see the assembly while the file gets parsed. See for example,
For this kind of task, using low-level debugger tools is probably better. Rizin[1][2]/Cutter[3][4] could help. We also have a GSoC participant this year who is working hard on improving debuginfo and debugging support[5]. I personally also like Binary Ninja; they recently made their debugger stable enough[6].
Commenting to follow because I’m curious what alternatives you mean. I thought a lot of people liked jq and I only just finally got around to installing it, so if there’s a much better way I’d like to hear it.
I prefer a SQL-like format. It’s not as complete, but it covers most of the day-to-day use cases. Take a look at https://github.com/dcmoura/spyql (I am the author). Congrats on fq!
Thanks! I actually experimented a bit with an SQL-like interface for a while: dump things into SQLite and use that as the query engine. The problem usually was that file formats tend to be a mix of array and tree structures more than relational ones, and at least standard SQL is not great for that. Maybe some graph-SQL dialect could work?
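To make the tree-versus-relational mismatch concrete, here is a minimal sketch that stores a small decoded tree in SQLite and then has to use a recursive CTE just to recover each leaf's path. The `node` table, its columns and the sample FLAC-like fields are hypothetical and chosen only for illustration; this is not how fq stores or queries anything.

```python
import sqlite3

# Hypothetical decoded tree, flattened to rows of (id, parent, name, value).
rows = [
    (1, None, "file", None),
    (2, 1, "header", None),
    (3, 2, "magic", "fLaC"),
    (4, 1, "frames", None),
    (5, 4, "0", None),          # array index stored as a name
    (6, 5, "sample_rate", "44100"),
]

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT, value TEXT)"
)
db.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", rows)

# Rebuilding the dotted path of every leaf requires walking the tree
# from the root with a recursive common table expression.
query = """
WITH RECURSIVE walk(id, path, value) AS (
    SELECT id, name, value FROM node WHERE parent IS NULL
    UNION ALL
    SELECT node.id, walk.path || '.' || node.name, node.value
    FROM node JOIN walk ON node.parent = walk.id
)
SELECT path, value FROM walk WHERE value IS NOT NULL ORDER BY id
"""
leaves = db.execute(query).fetchall()
for path, value in leaves:
    print(path, "=", value)
```

In a jq-style language the same lookup is a path expression, which is roughly the ergonomic gap the comment describes.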
Hi, author here. There is no stable API at the moment, and it depends a bit on what you want to do. If you mean writing your own private decoders, it is possible, but I can't guarantee the API won't change; see this old Twitter thread https://twitter.com/TimMattison/status/1600871136027627521
If you mean using existing decoders and accessing the result, I think you probably can in theory. That is kind of how the interp package in fq is implemented: it implements a jq interface plus some fq bells and whistles using the decode value structure.
Looks like not really. The proto support is pretty basic. It can’t print floats and doubles and doesn’t parse groups or packed fields. It doesn’t use a descriptor database so it can only print the field number, not its name, and it can only differentiate nested messages if the user calls ‘|protobuf’ on what is otherwise considered a string.
Hi, author here. Yep, it currently only decodes the "wire" format. Actually, it can decode using a schema, but not one parsed from a proto file; this is used by other formats that have protobuf as a subformat, which pass a schema internally using Go types. But proper proto schema support would be nice.
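For readers wondering what decoding only the "wire" format gives you, here is a rough schema-less decoder sketch. It follows the publicly documented protobuf encoding (varint keys, wire types 0, 1, 2 and 5), but the function names are made up for this example and it is not fq's implementation. Without a schema, all it can report is the field number, the wire type and the raw value, so a length-delimited field might be a string, bytes, a packed field or a nested message.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:      # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def decode_wire(buf):
    """Return a list of (field_number, wire_type, raw_value) tuples."""
    pos = 0
    fields = []
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:       # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:     # fixed 64-bit, kept as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:     # length-delimited, kept as raw bytes
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:     # fixed 32-bit, kept as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


# Field 1 = varint 150, field 2 = the bytes of "testing".
message = bytes.fromhex("089601120774657374696e67")
fields = decode_wire(message)
```

Whether field 2 here is a string or a nested message cannot be told from the bytes alone, which is why a schema (or an explicit hint from the user) is needed to go further.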
the name kind of sucks, why 'f' q? F for ... 'FU' ('teenage snigger'). It should be 'bq', binary query, after 'jq' json query.
Cool project nonetheless. The comment about 'programmatic documentation' of binary formats is very interesting; maybe some kind of 'binary description markup' could be part of this?