
"I think you will all appreciate this person's commenting style" - ahalan
http://jwz.livejournal.com/1774883.html
======
saurik
PSD was never intended to be a data interchange format: it is the
serialization format of a single program that has more individual unrelated
features that actual people rely on than almost any other piece of software
and has maintained striking amounts of backwards compatibility and almost
unbroken forwards compatibility during its over two decades of existence. This
product's "file format" needs to be critiqued in this context, along with
similar mega-programs like Office.

I am therefore having a difficult time fathoming why anyone would think that a
PSD file is going to be some well-organized file format that they should
easily be able to parse from their own application; that is just naively
wishful thinking. Even other products from Adobe have limitations while
opening these files; to truly manipulate these files you really need to be
highly compatible with Photoshop's particular editing model (hence the
conceptual difference between these two classes of file format).

~~~
yason
If a wise programmer decided he needs a serialization format, would he
deliberately include in that format all the crap so vividly pointed to by the
article?

No.

He will think of the "serialization format" as an interchange format between
_two different instances_ of his program. One process first writes the data
file and another process later will read it. He also knows that sooner or
later the "serialization format" needs to talk with _different versions_ of
his program, not just different running instances.
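
A minimal sketch of that idea, assuming a made-up format: the `DEMO` magic, the version byte, and the record tags below are all hypothetical and not taken from any real program. Every record carries its own tag and length, so an older reader can skip records that a newer version added instead of choking on them.

```python
import struct

MAGIC = b"DEMO"  # hypothetical magic number, not from any real format


def write_doc(version, records):
    """Serialize (tag, payload) records behind a magic number and version byte."""
    out = [MAGIC, struct.pack(">B", version)]
    for tag, payload in records:
        # 4-byte tag, 4-byte big-endian payload length, then the payload.
        out.append(struct.pack(">4sI", tag, len(payload)))
        out.append(payload)
    return b"".join(out)


def read_doc(data, known_tags):
    """Return (version, {tag: payload}), skipping tags this reader doesn't know."""
    if data[:4] != MAGIC:
        raise ValueError("not a DEMO file")
    version = data[4]
    pos, found = 5, {}
    while pos < len(data):
        tag, length = struct.unpack_from(">4sI", data, pos)
        pos += 8
        if tag in known_tags:
            found[tag] = data[pos:pos + length]
        pos += length  # unknown records are skipped, not fatal
    return version, found
```

A reader built for version 1 that only knows `TEXT` can still open a version 2 file containing a newer `NEWF` record; it just ignores what it doesn't understand.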

AFAIK, the Word .doc also started (and unfortunately continued) as basically a
not-so-designed memory dump of the in-memory OLE data model. It's a format
that more often than not has infamously stumped its own implementation as
well. (Over time, OpenOffice has saved quite a lot of .doc files of Office
users.)

~~~
SideburnsOfDoom
> AFAIK, the Word .doc also started (and unfortunately continued) as basically
> a not-so-designed memory dump

This may be true, but it's not the whole story. It's the reason why the MS
Office team bit the bullet and replaced .doc with .docx about 5 years ago:
<http://en.wikipedia.org/wiki/Office_Open_XML>

Docx is basically XML in a zip file. It's a beast and has lots of compromises
for backward compatibility, but as a design starting point, "zipped XML" is
far far better than a binary dump of the in-memory data.
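
To make "XML in a zip file" concrete, here is a rough sketch using Python's standard `zipfile` module. The stand-in archive built by `make_fake_docx` is an assumption for illustration only: its XML is not valid WordprocessingML, and a real .docx holds many more parts. `word/document.xml` is the usual location of the main body in Word files.

```python
import io
import zipfile


def list_parts(docx_bytes):
    """Return the names of the parts stored in a .docx-style zip archive."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        return zf.namelist()


def read_part(docx_bytes, name="word/document.xml"):
    """Return one part of the archive as text."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        return zf.read(name).decode("utf-8")


def make_fake_docx():
    """Build a tiny stand-in archive; a real .docx has many more parts."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("word/document.xml", "<document><body>hi</body></document>")
    return buf.getvalue()
```

Pointing `list_parts` at the bytes of a real .docx should show the same kind of listing, only longer. No custom binary parser is needed just to get at the parts, which is the advantage over a memory dump.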

~~~
lucian1900
It's still worse than ODT (which itself isn't exactly pretty), for no good
reason. That's sad.

~~~
SideburnsOfDoom
ODT is also XML-based, so Docx's problems compared to ODT can't be blamed on
XML.

~~~
lucian1900
I never said it has anything to do with XML. OOXML is extremely complex for
little reason. Even though it is also quite complex, ODT is much, much
simpler.

~~~
makomk
There are actually reasons for some of OOXML's weirdness, just not good ones.
For instance, it appears the reason why OOXML is pretty much the only XML-
based document format which doesn't use a mixed content model is because
there's a huge amount of prior art that'd have made it impossible to patent if
they had. (Microsoft tried anyway though.)

------
gjm11
Has been on HN before (<http://news.ycombinator.com/item?id=575122>) but it
was years ago. I mention this just in case others are having the same feeling
of deja vu as me.

~~~
rjzzleep
well i'm glad it was posted again

------
greggman
I'm pretty sure the PSD format chunks are based on the IFF spec from 1985:

<http://www.martinreddy.net/gfx/2d/IFF.txt>

Things were padded to 4-byte boundaries because the 68000 processor would crash
if you read an unaligned 32-bit value. So the size field of each chunk holds
the length of the actual data, but each chunk is padded. That way you didn't
have to work around the 68000 quirks and read a byte at a time.
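
A rough sketch of reading chunks laid out that way. The layout here is an assumption for illustration (4-byte ID, 4-byte big-endian size, payload, zero padding), and the alignment is a parameter because chunked formats differ on it; the default of 4 simply follows the description above. `make_chunk` is a hypothetical helper for building test data.

```python
import struct


def read_chunks(data, align=4):
    """Yield (chunk_id, payload) pairs from a run of padded chunks.

    Assumed layout per chunk: 4-byte ID, 4-byte big-endian size,
    `size` bytes of payload, then zero padding up to `align`.
    """
    pos = 0
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from(">4sI", data, pos)
        pos += 8
        if pos + size > len(data):
            raise ValueError("chunk runs past end of data")
        yield chunk_id, data[pos:pos + size]
        # The size field counts only the real payload; skip the padding too.
        pos += (size + align - 1) // align * align


def make_chunk(chunk_id, payload, align=4):
    """Build one padded chunk (hypothetical helper for trying read_chunks)."""
    pad = -len(payload) % align
    return struct.pack(">4sI", chunk_id, len(payload)) + payload + b"\0" * pad
```

The point is the last line of `read_chunks`: the reader advances by the padded length, not the stored length, so every header stays aligned.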

I wrote a PSD reader in '93. It wasn't that hard and still works today. Maybe I
chose an easy subset. It only reads the original result (merged layers) that
gets saved when you choose to save backwards-compatible files in Photoshop.

[http://elibs.svn.sourceforge.net/viewvc/elibs/trunk/elibs/li...](http://elibs.svn.sourceforge.net/viewvc/elibs/trunk/elibs/lib/echidna/photoshp.c?revision=97&view=markup)
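
For a sense of how small the "easy subset" starts out, here is a sketch of parsing the fixed 26-byte file header. The field layout is my reading of Adobe's published file format documentation and should be treated as an assumption; it is not taken from the linked photoshp.c, and the function name is made up.

```python
import struct

# Signature, version, 6 reserved bytes, channels, height, width, depth, mode.
HEADER = struct.Struct(">4sH6xHIIHH")  # 26 bytes, big-endian


def read_psd_header(data):
    """Parse the fixed-size header at the start of a PSD file."""
    sig, version, channels, height, width, depth, mode = HEADER.unpack_from(data, 0)
    if sig != b"8BPS":
        raise ValueError("not a PSD file")  # sanity check
    return {"version": version, "channels": channels, "height": height,
            "width": width, "depth": depth, "color_mode": mode}
```

Everything after the header (color mode data, image resources, layers, merged image) is where the real work lives; this only gets you the dimensions.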

~~~
to3m
I wrote one a few years ago as well. It read layers, summary image, and some
layer metadata that I needed (blend mode, layer name, visibility flag, etc.).
There's documentation for the format on the Adobe site, I think (wherever it
came from at the time - autumn 2007 - no fax was required), so it was actually
fairly straightforward. An artist made me a bunch of PSD files with the stuff
in that they wanted to use, and I sat there comparing the results of my code
to what Photoshop did.

The only oddity I can recall is that Photoshop does something odd with the
alpha channel - I think it was the alpha channel? - by sometimes storing it
with the summary image rather than the layer to which it's related. (Don't ask
me for more details than that - I don't remember.) I thought at the time that
this looked like somebody's attempt to make newer data work tolerably with
older revisions. That part WAS annoying, because the documentation didn't
mention that, and it took about a week before somebody managed to create a
photoshop file that was arranged this way.

The file format overall bore many of the hallmarks of one that had grown
rather than being planned, but it looks like they'd started to clamp down on
things at some point because the newer data chunks looked a lot better-
designed than the old ones. These things happen. It could be worse. BMP is
worse. TGA is worse. They aren't even chunk-based.

------
runn1ng
John Nack replied to this 3 years ago on his blog.

[http://blogs.adobe.com/jnack/2009/05/some_thoughts_about_the...](http://blogs.adobe.com/jnack/2009/05/some_thoughts_about_the_psd_format.html)

------
hcarvalhoalves
I appreciate the first code comment more after the introduction:

    if(sign!='8BIM') break; // sanity check

"Sanity check" as in "let's make sure it's really a PSD before we go insane".

------
drivingmenuts
So, I guess embedding a PSD in a DOC file is like putting a Bag of Holding in
a Portable Hole?

------
bitwize
And yet to be considered a non-toy image editor, you must support 100% of this
format perfectly.

------
simula67
Many more for your viewing pleasure:
[http://stackoverflow.com/questions/184618/what-is-the-best-c...](http://stackoverflow.com/questions/184618/what-is-the-best-comment-in-source-code-you-have-ever-encountered)

~~~
mmariani
Thanks for this link! It's filled with great laughs. Like this one:

    #define TRUE FALSE //Happy debugging suckers

I imagine what the guy who wrote it must've been through... :-P

PS: I wish Jeff hadn't shut down the thread.

------
felipc
One of my favorite blog posts from Joel Spolsky talks about this, basically
explaining how these formats come to be. For mega-programs like those, the
source code is the de facto file spec:
www.joelonsoftware.com/items/2008/02/19.html

------
smosher
This reminded me of just how nice the Doom WAD format is:
<http://doomwiki.org/wiki/WAD>

When a friend complained that he had a hard time figuring out which maps were
present in a given WAD, I enjoyed myself while writing a utility to organize
them into directories with map numbers. I kept thinking: this is how you
serialize data. Looking back on the code now, it's still easy to understand.
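
Not the original utility, just a sketch of the core of such a tool, going by the layout described on the wiki page linked above: a 12-byte header (magic, lump count, directory offset, little-endian) followed somewhere by 16-byte directory entries. The map-name pattern (ExMy or MAPxx) and the `make_wad` helper are assumptions for illustration.

```python
import re
import struct

MAP_NAME = re.compile(rb"^(E\dM\d|MAP\d\d)$")


def list_maps(wad):
    """Return map marker names found in a WAD's directory, in file order."""
    magic, num_lumps, dir_offset = struct.unpack_from("<4sii", wad, 0)
    if magic not in (b"IWAD", b"PWAD"):
        raise ValueError("not a WAD file")
    maps = []
    for i in range(num_lumps):
        # Each directory entry: lump offset, lump size, 8-byte NUL-padded name.
        _pos, _size, raw = struct.unpack_from("<ii8s", wad, dir_offset + 16 * i)
        name = raw.rstrip(b"\0")
        if MAP_NAME.match(name):
            maps.append(name.decode("ascii"))
    return maps


def make_wad(names):
    """Build a PWAD whose lumps are all empty, just for trying list_maps."""
    header = struct.pack("<4sii", b"PWAD", len(names), 12)
    entries = b"".join(struct.pack("<ii8s", 12, 0, n) for n in names)
    return header + entries
```

Two `struct` format strings cover the whole container, which is more or less the argument being made here.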

------
brendandahl
If he thinks PSD is bad, he should try PDF, which is really about 30
inconsistent formats all packaged into one inconsistent format.

------
dschiptsov
This is a much better reason to hire a person than 10 resumes.)

------
flebron
I like the 'sanity check' at the bottom. :)

~~~
new299
should clearly just return false. ^^

------
drp4929
Is this a comment or a rant?

~~~
mbetter
Is this a comment or a question?

------
unix-dude
lol'd hard.

------
joshka
Whilst I enjoy jwz's writings, please follow the Hacker News guidelines, which
can be found at <http://ycombinator.com/newsguidelines.html>

In particular: Please submit the original source. If a blog post reports on
something they found on another site, submit the latter. The original source is
[https://code.google.com/p/xee/source/browse/XeePhotoshopLoad...](https://code.google.com/p/xee/source/browse/XeePhotoshopLoader.m#102)

Also: Please use the original title, unless it is misleading or linkbait.

~~~
pseut
Later on the list of guidelines:

"Don't abuse the text field in the submission form to add commentary to links.
The text field is for starting discussions. If you're submitting a link, put
it in the url field. If you want to add initial commentary on the link, write
a blog post about it and submit that instead."

