
File format wiki - chei0aiV
http://fileformats.archiveteam.org/
======
scrollaway
I worked in video games reverse engineering for several years, so I'll take
the occasion to quote a recent tweet from a friend/colleague of mine:

> _At HandmadeCon @mike_acton brought up the game data format problem. I've
> reverse engineered hundreds of proprietary formats from games, and he's
> right that all I would need to make things simple is a basic description of
> the file data. Just struct declarations are enough. No encryption has
> stopped me, I always find the data I need, and your data all has the same
> general architecture, so there's no reason a game developer should be afraid
> of the public knowing how their game's data is laid out. This zeitgeist of
> closely guarding that information is doing you more harm than good —
> moddability is a key feature of many successful games, and all modders want
> is docs._

--

There are a lot of other industries that could stand to learn from this.
Reinventing the wheel sucks. Doing your own, proprietary thing in your corner,
taking on extra maintenance work, extra work to modify existing tools to make
them compatible with your own image/archive/whatever format... it all sucks
and nobody wins.
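
To make the "struct declarations are enough" point concrete, here is a minimal sketch of how far one declaration gets you. The header layout, the `ARCH` magic, and the field names are all made up for illustration and are not taken from any real game; the only real API used is Python's standard `struct` module.

```python
import struct

# Hypothetical archive header, invented for this example:
#   char     magic[4];      // "ARCH"
#   uint16_t version;
#   uint16_t flags;
#   uint32_t entry_count;
# "<" means little-endian with no implicit padding.
HEADER = struct.Struct("<4sHHI")


def parse_header(data: bytes) -> dict:
    """Decode the hypothetical 12-byte header from the start of `data`."""
    magic, version, flags, entry_count = HEADER.unpack_from(data, 0)
    if magic != b"ARCH":
        raise ValueError(f"unexpected magic: {magic!r}")
    return {"version": version, "flags": flags, "entry_count": entry_count}


if __name__ == "__main__":
    sample = HEADER.pack(b"ARCH", 2, 0, 17)
    print(parse_header(sample))
```

With the declaration published, the parser is a handful of lines; without it, someone has to recover those four fields by hand.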

~~~
TheCapn
Any good sources you'd use to get off the ground running with file format
reverse engineering?

I have a couple small projects on the backburner at work that require me to
reverse engineer file formats. I've been able to decode a few basic details
out of the files by looking for patterns in a hex editor but the finer details
are still escaping me.

~~~
scrollaway
I'm self-taught so I won't be much help there. The usual path starts with
learning assembly. If you have skills in reverse-engineering the actual
executables, reversing the file formats is as simple as tracing back what
happens when the files are read.

It gets tougher when there are anti-debugging protections in place but that
generally happens only in the video games industry.

If the format is simple enough though, you will be fine with a good hex editor
and some pattern recognition.
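
As a rough illustration of the "pattern recognition" step, the sketch below lists runs of printable ASCII together with their offsets, which is often the first thing to look for in an unknown file (magic numbers, chunk tags, embedded names). The function name `ascii_runs` and the sample bytes are made up for the example; it is one possible starting point, not a general method.

```python
import re

# Four or more consecutive printable ASCII bytes (space through tilde).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def ascii_runs(data: bytes) -> list:
    """Return (offset, bytes) pairs for each printable ASCII run in `data`."""
    return [(m.start(), m.group()) for m in PRINTABLE_RUN.finditer(data)]


if __name__ == "__main__":
    # Made-up sample: two tags separated by binary bytes.
    blob = b"\x00\x01RIFF\x10\x00\x00\x00WAVEfmt \x00"
    for offset, text in ascii_runs(blob):
        print(f"0x{offset:04x}  {text!r}")
```

Tags that recur at regular offsets usually hint at a chunked layout, and the bytes right after each tag are a good candidate for a length field.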

~~~
a_t48
Along with learning assembly, find an active project looking for help. I cut
my teeth on deconstructing/modding Oni (see
[http://wiki.oni2.net/ONCC](http://wiki.oni2.net/ONCC) for example) and it was
great to have other people to bounce ideas off of.

------
johnlbevan2
FileFormat.info probably deserves a mention, given that it has pretty much the
same objective:
[http://www.fileformat.info/format/a.htm](http://www.fileformat.info/format/a.htm)

~~~
hackuser
Doesn't FileFormat.info already 'solve the problem'? What does the new wiki
offer that FileFormat.info lacks, and which couldn't be added to
FileFormat.info?

~~~
mynewtb
_You may not publish, copy, display, distribute, transmit, perform, modify,
create derivative works from, or sell any Materials, information, products, or
services obtained from this Site, except as otherwise expressly permitted
under applicable law and as described in these Terms of Use. FileFormat.Info
retains all right, title, and interest to the Materials._

------
ddorian43
You can also help the Archive Team by running the warrior which archives
websites that are closing:
[http://www.archiveteam.org/index.php?title=ArchiveTeam_Warri...](http://www.archiveteam.org/index.php?title=ArchiveTeam_Warrior)

------
xomateix
A bit off topic, but I would like to know why they chose MediaWiki as a
platform.

In my limited experience working with Wikipedia and Wikidata, I've found it's
not the best option for a) storing structured data or b) editing the pages
(markdown is, imho, much better for that).

~~~
scrollaway
Mediawiki is honestly the only option when it comes to _user-friendly_ wiki
platforms.

[Last time I said that, someone recommended Moin. I hope this won't happen
this time - Moin is a UX joke]

~~~
hackuser
DokuWiki's UI is fine. Also, what about TWiki/Foswiki?

~~~
db48x
I think it's hosted somewhere that has automated tools for installing common
software, and Mediawiki is what you get.

------
gobusto
This reminds me of the Xentax wiki:
[http://wiki.xentax.com/](http://wiki.xentax.com/)

------
ChrisArchitect
Good share, but why is this Show HN? Did you create it? It's not new...... etc

------
yoodenvranx
Has anyone an idea what happened to wotsit.org?

------
numberwhun
To be honest, I expected a bit more from the site, considering its name. It is
good, don't get me wrong, covering a lot of file formats, but you have limited
it to things on your computer. For me, if it were an all inclusive wiki, you
should expand it to include things like financial file formats. Banks tend to
use formats like ACH and EDI (for example). Those formats are extensive and
can get quite complicated, but their file specs are published and available.
The site is a great start, but I would love to see it expanded to include
anything file format related. Even having the most obscure formats will make
the site more appealing to people as a reference. Sort of 'the place to go'
for file formatting. Just my .02.

~~~
textfiles
Then get editing.

------
g1n016399
Like many ambitious wiki-based projects, this site seems to suffer from lack
of focus. Look at these pages:

[http://fileformats.archiveteam.org/wiki/Dendrochronology](http://fileformats.archiveteam.org/wiki/Dendrochronology)

[http://fileformats.archiveteam.org/wiki/Quantum_computer](http://fileformats.archiveteam.org/wiki/Quantum_computer)

[http://fileformats.archiveteam.org/wiki/TLD_.mobi](http://fileformats.archiveteam.org/wiki/TLD_.mobi)

[http://fileformats.archiveteam.org/wiki/Endianness](http://fileformats.archiveteam.org/wiki/Endianness)

How is any of these a file format? Because the definition of a file
format in the FAQ
([http://fileformats.archiveteam.org/wiki/FAQ:File_Format](http://fileformats.archiveteam.org/wiki/FAQ:File_Format))
is so broad that it can be made to encompass basically everything. The
manifesto that started this site is a rambling sermon that doesn't clarify
anything in this respect: it just repeats "let's solve the problem" without
even properly defining what the problem is.

There are also other issues. The classification scheme is often Procrustean
and confused. Error messages were until recently mixed up with error detection
codes, which conflates two different meanings of "error". Similarly FUSE
shares a category with HFS+, even though the former is an API, and the latter
a disk format; distinct things which just happen to share the name of "file
system". The pages are rather short and consist mostly of lists of links.
Given the above-mentioned lack of clearly defined scope, I suspect many pages
are created about topics just because they're in the news and/or just
to have a place to put a link to a "neat" blog post: see for example
[http://fileformats.archiveteam.org/wiki/Facebook#Links](http://fileformats.archiveteam.org/wiki/Facebook#Links).

Last but not least, the whole site is rather ugly, and the logo is awfully
non-descriptive of what it's supposed to contain; what am I supposed to do
with this thumb, stick it up my arse?

It's a shame, really, because documenting file formats is a hard and valuable
endeavour. But I don't think these people are going to do a good job of it.

~~~
TillE
Endianness is extremely relevant to file formats. The others certainly seem
like irrelevant nonsense; "organic file formats"?
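
A quick sketch of why endianness matters when reading a file: the same four bytes decode to different integers depending on the byte order assumed. The byte string is an arbitrary example; `int.from_bytes` is from the Python standard library.

```python
# The same four bytes read as a 32-bit unsigned integer in both byte orders.
raw = bytes([0x00, 0x00, 0x01, 0x00])

as_little = int.from_bytes(raw, byteorder="little")  # least significant byte first
as_big = int.from_bytes(raw, byteorder="big")        # most significant byte first

if __name__ == "__main__":
    print(as_little, as_big)
```

A length or offset field read with the wrong byte order typically points far outside the file, which is one way to guess the right order when the format is undocumented.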

~~~
db48x
Archive Team is backing up DNA, just in case.

------
jrgoj
Thanks for this. I write file parsers for language translation at my day job,
so this could be a nice resource to have on hand.

~~~
scrollaway
Out of curiosity, are you familiar with Translate Toolkit?
[https://github.com/translate/](https://github.com/translate/)

If you're not and are interested in it or in contributing to it, you should
email me - I'll put you in touch.

~~~
jrgoj
I have not worked with the Translate Toolkit yet, but I'm a little familiar
with a few of the tools. I'm always interested in learning or contributing
where I can.

I currently work on a proprietary l10n/i18n system, so a lot of what I do is
under an NDA. However, I'm slowly gaining traction in moving us, at least in
part, to a more open model.

If you let me know your email, I will reach out.

~~~
scrollaway
My email is on [https://leclan.ch](https://leclan.ch) - Translate was my
previous job.

------
hackuser
What is the name of your wiki?

