Hacker News new | past | comments | ask | show | jobs | submit login

Yeah, Swift-most-everything is pretty slow, but particularly parsing/generating. Pre-Swift Foundation serialisation code was already...majestic, and in the Swift conversion they've typically managed to slow things down even further. Which didn't seem possible, but they managed.

I have given a bunch of talks[1] on this topic, there's also a chapter in my iOS/macOS performance book[2], which I really recommend if you want to understand this particular topic. I did really fast XML[3][4], CSV[5] and binary plist parsers[6] for Cocoa and also a fast JSON serialiser[7]. All of these are usually around an order of magnitude faster than their Apple equivalents.

Sadly, I haven't gotten around to doing a JSON parser. One reason for this is that parsing the JSON at character level is actually the smaller problem, performance-wise, same as for XML. Performance tends to be largely determined by what you create as a result. If you crate generic Foundation/Swift dictionaries/arrays/etc. you have already lost. The overhead of these generic data structure completely overwhelms the cost of scanning a few bytes.

So you need something more akin to a steaming interface, and if you create objects you must create them directly, without generic temporary objects. This is where XML is easier, because it has an opening tag that you can use to determine what object to create. With JSON, you get "{" so basically you have to know what structure level corresponds to what objects.

Maybe I should write that parser...

[1] https://www.google.com/search?hl=en&q=marcel%20weiher%20perf...

[2] https://www.amazon.com/gp/product/0321842847/

[3] https://github.com/mpw/Objective-XML

[4] https://blog.metaobject.com/2010/05/xml-performance-revisite...

[5] https://github.com/mpw/MPWFoundation/blob/master/Collections...

[6] https://github.com/mpw/MPWFoundation/blob/master/Collections...

[7] https://github.com/mpw/MPWFoundation/blob/master/Streams.sub...




That resonates well with my conclusions that led to the Replicated Object Notation project. [1]. If the parser creates an AST tree or some number of dictionaries or some other bullshit... "now you have two problems", that's it.

I settled on a tabular-log format, which is streamed and immediately consumed most of the time, no intermediate object structures.

Then, that "text vs binary" distinction became mostly moot. The binary is slightly more efficient, but grossly less readable, so no big gain, unless at grand scale.

[1] http://replicated.cc




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: