Reverse Engineering Apple's typedstream Format (chrissardegna.com)
127 points by css 10 days ago | 25 comments





This is perfectly timed, as I wanted to find a way to programmatically modify my Mac's AppleScript display settings/theme. For whatever reason, they're stored as typedstream format, embedded in a plist in base64. Found an old implementation/header from 1999 from Mac OS X Server v1.2, signed by Bertrand Serlet, and was going to dig in when I found the time. Now I can dig into this.

The plist is probably a binary plist (header bytes `bplist00`) generated by NSKeyedArchiver, and then the specific data you need is encoded inside. Edited iMessages are stored in the exact same way. Luckily the plist itself is not that complex–but typedstream is pesky to work with.
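For anyone who wants to check this on their own file, here is a minimal Python sketch. It only uses the standard library `plistlib`; the helper names `load_plist` and `find_data_values` are made up for illustration, and it assumes the blobs you care about show up as plain data values somewhere in the tree.

```python
import plistlib


def load_plist(raw: bytes):
    """Return (kind, parsed object) for raw plist bytes."""
    # Binary plists start with the magic bytes `bplist00`.
    kind = "binary" if raw.startswith(b"bplist00") else "xml-or-other"
    # plistlib.loads detects the format on its own.
    return kind, plistlib.loads(raw)


def find_data_values(obj, path=()):
    """Yield (path, bytes) for every data value nested in the plist."""
    if isinstance(obj, (bytes, bytearray)):
        yield path, bytes(obj)
    elif isinstance(obj, dict):
        for key, value in obj.items():
            yield from find_data_values(value, path + (key,))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            yield from find_data_values(value, path + (index,))
```

Each yielded path tells you where a blob lives, so you can decide which ones to hand to a typedstream parser.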

The plist is no issue, but it's the values therein where you run into typedstreams. For every Script Editor formatting setting, there is a separate dictionary with an NSColor and an NSFont key, each set to a data value. The data is a base64-encoded `streamtyped` file. Passing it through base64 decode and running `file` on the output gives back `NeXT/Apple typedstream data, little endian, version 4, system 1000`, just as in the OP.
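The decode-and-identify step can be roughed out in Python without shelling out to `file`. This is a sketch, not a parser: `sniff_typedstream` is a hypothetical helper, and the header layout it assumes (version byte, length byte 11, the signature, then the system version as a tagged integer where `0x81` means a 16-bit value follows) is my reading of the linked article rather than any official spec.

```python
import base64


def sniff_typedstream(data: bytes):
    """Return (version, byte_order, system_version) or None if not a typedstream."""
    # Assumed layout: [version][0x0b]["streamtyped" | "typedstream"][system version]
    if len(data) < 13 or data[1] != 11:
        return None
    signature = data[2:13]
    if signature == b"streamtyped":
        order = "little"
    elif signature == b"typedstream":
        order = "big"
    else:
        return None
    version = data[0]
    system = None  # left as None when the encoding is not one we recognize
    if len(data) >= 16 and data[13] == 0x81:
        # Assumption: tag 0x81 introduces a 16-bit integer in stream byte order.
        system = int.from_bytes(data[14:16], order, signed=True)
    elif len(data) > 13 and data[13] < 0x80:
        # Assumption: small values are stored directly in one byte.
        system = data[13]
    return version, order, system


def sniff_base64(text: str):
    """Decode a base64 string (as copied from an XML plist) and sniff it."""
    return sniff_typedstream(base64.b64decode(text))
```

If you load the plist with `plistlib`, the data values are already raw bytes, so you can skip the base64 step and call `sniff_typedstream` directly.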

The only reason I want to do this is because I wipe a Mac nearly weekly, and need it set up more or less the same way again. I could probably just drop the .plist in that directory and Bob's your uncle, but I would also change the fonts Script Editor uses to a third-party font that isn't installed yet, so I don't want to have to worry about weird order-of-operations BS. I also want a way to set it to any arbitrary font, as I often change out the "fixed width" font I use in all the editors for that week (I have favorites, not just a favorite, gotta keep it fresh, ya know).

I figured this was because Script Editor and the AppleScript components of macOS are so old and creaky: forgotten leftovers from the Yellow Box that no one bothered to fix. I had no idea typedstreams were still being used in modern Apple software.


Question from a relatively uninformed sysadmin/freelance I.T. provider: will these new iMessage functions allow 3rd-party applications (e.g., CRMs, client support platforms, etc.) to read and/or work with incoming iMessages from my iCloud account? The only thing I really miss since coming from Android is the ability to consolidate all of my client communications, many of whom send text messages first and foremost (which I prefer).

Apple provides Messages for Business [0], but if you have a machine that can read the iMessages as they come in, you could use the library [1] that powers `imessage-exporter` as a bridge.

[0]: https://register.apple.com/messages

[1]: https://docs.rs/imessage-database/latest/


I have a product that does exactly this. E-mail me at ben AT theengine DOT co, I'd love to show it to you and see if it would help.

Don’t know if it helps, but I know iMessage stores message data on MacBooks inside a SQLite file; I was scanning through it previously because I was trying to do a bulk search.
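A bulk search like that can be sketched with Python's built-in `sqlite3`. Treat the schema as an assumption: this expects a `message` table with `ROWID` and `text` columns, and `search_messages` is a made-up helper name. Rows whose `text` is NULL are skipped, so any message whose content is only stored as a typedstream blob will not match.

```python
import sqlite3


def search_messages(conn: sqlite3.Connection, needle: str, limit: int = 20):
    """Return (ROWID, text) rows whose text contains `needle` (ASCII case-insensitive)."""
    # Assumed schema: table `message` with columns `ROWID` and `text`.
    query = (
        "SELECT ROWID, text FROM message "
        "WHERE text IS NOT NULL AND instr(lower(text), lower(?)) > 0 "
        "ORDER BY ROWID LIMIT ?"
    )
    return conn.execute(query, (needle, limit)).fetchall()
```

To try it on a real machine you would open the database read-only, for example `sqlite3.connect("file:/path/to/chat.db?mode=ro", uri=True)`, with the path replaced by wherever the file lives on your system.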

The grandfather of protobuf. Lost in the tales of time.

Grandfather of Protobuf is ASN.1

Very much so. Pretty much all of these protocols are simplifications of ASN.1, and in some cases (like protobuf) a handful of things got lost because the wire formats left them out as unneeded. The lack of a schema indicator is the single biggest flaw in protobuf.

Why is the lack of a schema indicator the biggest flaw of protobuf?

It makes it impossible to write a general-purpose dissector that takes captured messages or bytes and figures out how to parse them.

All they needed was a varint at the head of any marshaled form to at least provide some scoping clue.


If you parse a serialized protobuf byte array without having a .proto file, you have no way to distinguish a byte string field from a nested message field. Thus you have no way to know how deep your parser should go.
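To make the ambiguity concrete, here is a small schema-less decoder in Python. It is a sketch covering only wire types 0, 1, 2 and 5 (groups are rejected), and the function names are made up for illustration. The point is that a length-delimited payload comes back as opaque bytes, and trying to parse those bytes as a nested message often succeeds even when they are really text.

```python
def read_varint(buf: bytes, pos: int):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift > 63:
            raise ValueError("varint too long")


def parse_fields(buf: bytes):
    """Split a buffer into (field_number, wire_type, value) without a schema."""
    fields, pos = [], 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 7
        if number == 0:
            raise ValueError("field number 0 is invalid")
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type in (1, 5):  # fixed 64-bit / fixed 32-bit
            size = 8 if wire_type == 1 else 4
            value, pos = buf[pos:pos + size], pos + size
        elif wire_type == 2:  # length-delimited: string, bytes, or nested message
            size, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + size], pos + size
        else:
            raise ValueError("unsupported wire type")
        if pos > len(buf):
            raise ValueError("truncated field")
        fields.append((number, wire_type, value))
    return fields


def plausible_message(payload: bytes) -> bool:
    """True if payload happens to parse as a message; this proves nothing."""
    try:
        parse_fields(payload)
        return True
    except ValueError:
        return False
```

For example, the ASCII bytes `hi` parse cleanly as "field 13, varint 105", so `plausible_message(b"hi")` is true even though the payload is obviously a string. Without the .proto file the decoder can only guess.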

Semi-related, one of the `imessage-exporter` contributors provided a great write-up on reverse engineering the handwritten and digital touch message protobufs [0]. The reconstructed proto files are [1] [2].

[0]: https://github.com/trymoose/handwriting2svg/blob/0eb56cf4582...

[1]: https://github.com/ReagentX/imessage-exporter/blob/beeb853b2...

[2]: https://github.com/ReagentX/imessage-exporter/blob/beeb853b2...


One usually has two grandfathers, so it still works out.

The telco industry, including GSM and its successors, uses ASN.1 widely.

iMessage uses a very strange amalgamation of typedstream (message content), keyed archives (app messages, sticker data), and protobufs (Digital Touch, handwriting) for different features. I wonder what motivated all of those design decisions.

This stuff is such a PIA to parse. I assume it's just different teams doing different features over the years, and being alternately repulsed/seduced by each format. Probably features are implemented as libraries so there isn't any master oversight: they aren't trying to make iMessage's internal formats follow a consistent plan, just letting all the libs coexist...

Maybe they should be repulsed, considering all of the journalists that are getting persecuted and/or murdered because they are getting pwned through iMessage serialization bugs :)

As someone who used to work on that team, it’s so interesting hearing thoughts from the external public on the team.

I would love to hear your thoughts as an insider.

"Those who don't understand ASN.1 are doomed to reinvent it, poorly."

That said, it could be much worse --- JSON, or XML.


Nice writeup! I wonder if GNUstep's NSUnarchiver could be augmented for full compatibility with Foundation?

I was curious how ChatGPT would analyze this given some general instructions: https://chatgpt.com/share/67a102b0-b3e4-8003-974d-2ef73a738a...


