
They are also very similar (basically the Go version was a direct port of the Rust version) so the performance should be very comparable.

Sure, but different approaches are going to be optimal for different languages.

I assume by zero-copy you mean that identifiers in the AST are slices of the input file instead of copies?

Yes. From the README:

zero-copy: if a parser returns a subset of its input data, it will return a slice of that input, without copying

Geal also claims that nom is faster than hand-written C parsers.
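To make the zero-copy point concrete, here is a minimal sketch in plain Rust (not nom's actual API; `take_identifier` is a hypothetical helper that only handles ASCII identifiers). The returned identifier and the remaining input are both slices of the original buffer, so nothing is copied:

```rust
/// Splits a leading ASCII identifier off `input`.
/// Both returned slices borrow from `input`; no bytes are copied.
/// Hypothetical helper for illustration, not part of nom.
fn take_identifier(input: &str) -> Option<(&str, &str)> {
    let mut end = 0;
    for (i, c) in input.char_indices() {
        let ok = if i == 0 {
            // First character: letter, '_' or '$'.
            c.is_ascii_alphabetic() || c == '_' || c == '$'
        } else {
            // Later characters may also be digits.
            c.is_ascii_alphanumeric() || c == '_' || c == '$'
        };
        if !ok {
            break;
        }
        end = i + c.len_utf8();
    }
    if end == 0 {
        None
    } else {
        Some((&input[..end], &input[end..]))
    }
}
```

Calling `take_identifier("foo = 1")` yields a slice whose pointer is the same as the start of the input string, which is all "zero-copy" means here.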

It's somewhat complicated because some JavaScript identifiers can technically have escape sequences (e.g. "\u0061bc" is the identifier "abc"), which require dynamic memory allocation anyway.

Nom comes with 'escaped' and 'escaped_transform' combinators. In theory it should be possible, with relative ease, to return a slice if there are no escape characters and an allocated string if expansion is required. Presumably you'd have to use a Cow<str> though.
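A rough sketch of that Cow idea, written in plain Rust rather than with nom's combinators. `decode_identifier` is a made-up name, and it assumes only the four-digit `\uXXXX` form (no `\u{...}`, no surrogate pairs). It borrows when there is nothing to expand and allocates only when an escape is present:

```rust
use std::borrow::Cow;

/// Expands `\uXXXX` escapes in an identifier.
/// Hypothetical helper: returns a borrowed slice when no escape is present,
/// an owned String otherwise, and None for a malformed escape.
fn decode_identifier(raw: &str) -> Option<Cow<'_, str>> {
    if !raw.contains('\\') {
        // Fast path: no escapes, so hand back the input slice untouched.
        return Some(Cow::Borrowed(raw));
    }
    let mut out = String::with_capacity(raw.len());
    let mut rest = raw;
    while let Some(pos) = rest.find('\\') {
        out.push_str(&rest[..pos]);
        let after = &rest[pos + 1..];
        // Expect 'u' followed by exactly four hex digits.
        let hex = after.strip_prefix('u')?.get(..4)?;
        if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let code = u32::from_str_radix(hex, 16).ok()?;
        // char::from_u32 rejects surrogate code points, so those yield None.
        out.push(char::from_u32(code)?);
        rest = &after[5..];
    }
    out.push_str(rest);
    Some(Cow::Owned(out))
}
```

With this, `abc` comes back as `Cow::Borrowed` and `\u0061bc` comes back as `Cow::Owned("abc")`, so the common case stays allocation-free.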

Note that strings aren't slices of the input file because JavaScript strings are UTF-16, not UTF-8, and can have unpaired surrogates. So I represent string contents as arrays of 16-bit integers instead of 8-bit slices (in both Go and Rust).
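For anyone wondering why a UTF-8 slice can't hold these, here is a small sketch using only the Rust standard library (the helper names are mine, not from either parser). A `Vec<u16>` can carry an unpaired surrogate, while a strict conversion to `String` has to reject it:

```rust
/// Encodes a Rust string as UTF-16 code units, the representation
/// described above for JavaScript string contents.
fn to_utf16(s: &str) -> Vec<u16> {
    s.encode_utf16().collect()
}

/// Strict conversion back to UTF-8.
/// Returns None if the code units contain an unpaired surrogate.
fn try_to_utf8(units: &[u16]) -> Option<String> {
    String::from_utf16(units).ok()
}
```

If you push a lone `0xD800` onto the result of `to_utf16("hi")`, `try_to_utf8` returns `None`, and `String::from_utf16_lossy` can only substitute U+FFFD, which changes the string's contents.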

Of course it is. My opinion (which is worth what you've paid for it) is that I'd just go for UTF-8 support. I can't remember the last time I've seen UTF-16 in the wild (thankfully).

Performance-wise, the other thing that I'd keep in mind with Rust is that string handling is painfully slow in debug mode.

Edit: here's the URL for nom: https://github.com/Geal/nom



