peterohler's comments

I've been writing Lisp code off and on since the 80s. The standard for Common Lisp has to be SBCL, but its REPL is pretty minimal, and the available packages tend to be more limited than in Go, which I've been using a lot lately. I found a way to have a more functional REPL and also have access to all the Go packages by writing SLIP (https://github.com/ohler55/slip). Yes, I know this is a plug for SLIP, and if that offends anyone I apologize. The reasons mentioned for developing it are valid though, and I've managed to use Lisp for almost all my data mining and processing tasks.

Over the last 2 or 3 years I've been building a Common LISP implementation in Go so that Go packages can be utilized by LISP code. Building a REPL with lots of interactive features was rewarding as was taking up the challenges of the object systems (CLOS and Flavors) and generics. Just open sourced it at the start of January. https://github.com/ohler55/slip


Thanks for the help.


After several years of development, here is a mostly complete Common LISP implementation written in Go with a REPL, CLOS, generics, Flavors, and much more.


If you prefer JSONPath as a query language, oj from https://github.com/ohler55/ojg provides that functionality. It can also be installed with brew. (disclaimer, I'm the author of OjG)


JSONPath is also supported by Postgres!

Helpful when querying JSON API responses that are parsed and persisted for normal, relational uses. Sometimes you want to query data you weren't initially parsing, or find records matching a fix so they can be reprocessed.


speaking of classic databases: can anyone explain to me, a dummy, why any syntax like this or even GraphQL is preferable to "select a.name, a.age from friends a where a.city = 'New York' order by a.age asc"?


I'm clearly biased but oj which uses JSONPath is my preferred JSON manipulator. It can be installed with brew. It may not be for everyone but some of you might like it.


My experience is quite a bit different. Of course, the examples I would use are more like what you might expect in real code. The comparison should be against code that calls a function that either returns an error which is then checked, or one that panics and recovers. The overhead of returning the extra error and then the conditional used to check it is more than a panic on error with a recover somewhere up the stack. This was not true in the early days of Go, but it is true today.

It really depends on the code being written. Try one approach, then the other, and see which works better in your situation. For the example in the article there is really no need for an error check in the idiomatic case, so why compare that to using panic? If there were an error to check, the result would be much different.


Oj author here. While it's flattering to have Oj be the standard to beat, I'd like to point out that most of the issues with Oj revolve around the JSON gem and Rails doing a monkey-patch dance, with Oj trying to keep pace with the changes. Oj.mimic_JSON attempts to replace the JSON gem, and only replaces the monkey patches made by that gem. The preferred approach for Oj, outside of trying to mimic the JSON gem, is to never monkey patch. That approach is used in all modes that are not mimicking the JSON gem or Rails. I should point out that those other Oj modes perform much better than the JSON gem and Rails modes.


> I should point out that other Oj modes perform much better than the JSON gem

Which modes are those? https://github.com/ohler55/oj/blob/develop/pages/Modes.md#oj...

I tried:

    Oj.dump(obj, mode: :strict)
and a few others and none seemed faster than `json 2.9.1` on the benchmarks I use.

Edit:

Also, most of these modes simply aren't correct in my opinion:

    >> Oj.dump(999.9999999999999, { mode: :compat })
    => "999.9999999999999"
    >> Oj.dump(999.9999999999999, { mode: :strict })
    => "1000"
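That truncation is easy to check against the core json gem, which round-trips the same value losslessly (a stdlib-only sketch):

```ruby
require 'json'

value = 999.9999999999999 # 16 significant digits, at the edge of Float precision

encoded = JSON.generate(value)
decoded = JSON.parse(encoded)

# Core json uses Ruby's shortest round-trip float formatting,
# so the decoded value is identical to the original.
puts encoded          # => "999.9999999999999"
puts decoded == value # => true
```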


Using the benchmarks in the Oj test directory, Oj has a slight advantage over the core json for dumping, but not enough to make much difference. The comparison for Oj strict parsing against the core json is more substantial, at 1.37 times faster. The benchmarks use a hash of mixed types including some nested elements.

The callback parsers (Saj and Scp) also show a performance advantage as does the most recent Oj::Parser.

As for the dumping of floats that are at the edge of precision (16 places), Oj does round to 15 places if the last four digits of a 16-digit float are "0001" or "9999", unless the float precision is set to zero. That is intentional. If that is not the desired behavior and the Ruby conversion is preferred, then setting the float precision to zero will disable the rounding. You picked the wrong options for your example.

I would like to say that the core json has come a very long way since Oj was created and is now outstanding. If the JSON gem had started out where it is now, I doubt I would have bothered writing Oj.


> Using the benchmarks in the Oj test directory

I'm sorry, but I've looked for a while now, and I can't seem to identify the benchmark you are mentioning. I suspect it's the one John took for his benchmark suite? [0]

> Oj has a slight advantage over the core json for dumping but not enough to make much difference

I'd be curious to see which benchmark you are using, because on the various ones included in ruby/json, Oj is slightly slower on about all of them: https://gist.github.com/byroot/b13d78e37b5c0ac88031dff763b3b..., except for scanning strings with lots of multi-byte characters, but I have a branch I need to finish that should fix that.

> The comparison for Oj strict parsing compared to the core json is more substantial, at 1.37 times faster

Here too I'd be curious to see your benchmark suite because that doesn't match mine: https://gist.github.com/byroot/dd4d4391d45307a47446addeb7774...

> The callback parsers (Saj and Scp) also show a performance advantage as does the most recent Oj::Parser.

Yeah, callback parsing isn't something I plan to support, at least not for now. As for Oj::Parser, `ruby/json` got quite close to it, but then @tenderlove pointed out to me that the API I was trying to match wasn't thread safe, hence it wasn't a fair comparison. So now I still bench against it, but with a new instance every time: https://github.com/ruby/json/pull/703.

> You picked the wrong options for your example.

No, I picked them deliberately. That's the sort of behavior users don't expect and can be bitten by. As a matter of fact, I discovered this behavior because one of the benchmark payloads (canada.json) doesn't round-trip cleanly with Oj's default mode; that's why I benchmark against the `:compat` mode. IMO truncating data for speed isn't an acceptable default config.
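A round-trip check of that sort can be sketched with core json alone (the helper name and sample payload are hypothetical, for illustration):

```ruby
require 'json'

# Hypothetical helper: does a value survive generate -> parse unchanged?
def roundtrips?(obj)
  JSON.parse(JSON.generate(obj)) == obj
end

# canada.json-style payload: nested arrays of high-precision floats.
payload = { "coordinates" => [[-65.613616999999977, 43.420273000000009]] }

puts roundtrips?(payload) # => true with core json
```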

[0] https://github.com/jhawthorn/rapidjson-ruby/blob/518818e6768...


The strict mode benchmarks for Oj are in test/perf_strict.rb. Others are in the other perf_*.rb files.

If callback parsing is not supported, that's fine. Oj does support callback parsing, as it allows elements in a JSON document to be ignored. That saves memory, reduces GC pressure, and improves performance. Your choice of course, just as including callback parsers was a choice for Oj.

Ok, so you picked options that you knew would fail. Again, your choice, but there are certainly others who would trade away 16+ significant digits for a slight improvement in performance. It's a choice. You are certainly entitled to your opinion, but that doesn't mean everyone will share it.

I'm not sure what platform you are testing on, but I'm sure there will be variations depending on the OS and the hardware. I tested on macOS on an M1.


> If callback parsing is not supported that's fine.

Yes, as mentioned in part 1 of the series, my goal for ruby/json, given it is part of Ruby's stdlib, is to be good enough that the vast majority of users don't need to look elsewhere, but it isn't to support every possible use case or to make a specific gem obsolete. For the minority of users that need things like event parsing, they can reach for Oj.

> but that doesn't mean everyone will share them.

Of course. When I was a fairly junior developer, I heard someone say: "Performance should take a backseat to correctness", and that still resonates with me. That's why I wouldn't consider such truncation as a default.

> I'm sure there will be variations depending on the OS and the hardware. I tested on macOS M1.

I suspect so too. I'd like to get my hands on an x86_64/Linux machine to make sure performance is comparable there, but I haven't gotten to it yet. All my comparisons so far have been on M3/macOS.

> It looks like a lot of time and effort went into the analysis.

It was roughly two weeks full time, minus some bug fixes and such. I think in the end I'll have spent more time writing the blog series than on the actual project, but that probably says more about my writing skill :p

Anyway, thanks for the pointers, I'll have a look to see if there's some more performance that needs to be squeezed.


If you would like to discuss separately on a call or chats I'd be up for that. Maybe kick around a few ideas.


I missed responding to your assertion that Oj::Parser is not thread safe. An individual Oj::Parser instance is not thread safe, just like other Ruby objects such as a Hash, but multiple Oj::Parser instances can be created in as many threads as desired. The reason each individual Oj::Parser is not thread safe is that it stores the parser state.


Yes that's what I meant. The benchmark suite I took from rapidjson was benchmarking against:

    Oj::Parser.usual.parse(string)
That is what isn't thread safe. And yes, you can implement a parser pool, or simply do something like:

    parser = (Thread.current[:my_parser] ||= Oj::Parser.new(:usual))
But that didn't really feel right for a benchmark suite, because of the many different ways you could implement it in a real-world app. So it's unclear what the real-world overhead would be to make this API usable in a given application.

> is that it stores the parser state.

And also a bunch of parsing caches, which makes it perform very well when parsing the same document over and over, or documents with a similar structure, but not as well when parsing many different documents. But I'll touch on that in a future post when I start talking about the parsing side.


Just so you know, I am impressed by the depth you've delved into with JSON parsing and dumping. It looks like a lot of time and effort went into the analysis.


Ah, I figured out why, on the Oj side, `ruby/json` appeared slower: https://github.com/ohler55/oj/pull/949


Merged. Didn't seem to make much difference though. Results for the original Oj parser are pretty close to the core json now. I'll have to update the README for Oj. It's a bit stale. The new Oj::Parser is still much faster if not restricted to the current Rails environment.


Out of curiosity, I'm looking at the JSON spec. This mildly horrifies me: "This specification allows implementations to set limits on the range and precision of numbers accepted."

The spec doesn't specify a precision or range limit anywhere (it just suggests that IEEE 754 might be a reasonable target for interoperability, but that supports up to 64-bit floats, and it looks like Oj is dropping to 32-bit floats?).

Python and Go don't go and change the precision of floating point numbers in their implementations, but according to the standard, they're entirely entitled to, and so is Oj.

I don't see anything in https://github.com/ohler55/oj/blob/develop/pages/Modes.md#oj... specifying that Strict will force floating points to specific precision vs other implementations


Yes, JSON as a format is very much underspecified; a lot of these sorts of things are basically implementation defined.

In general libraries do what make sense in the context of their host language, or sometimes what makes sense in the context of JavaScript.

For ruby/json, I consider that if something can be round-tripped from Ruby to JSON and back, it should be, which means not reducing float precision, nor integer precision, e.g.:

    >> JSON.generate(2**128)
    => "340282366920938463463374607431768211456"
But other libraries may consider that JSON implies JavaScript, hence the lack of big integers, so such a number should be dumped as a decimal string or as a floating point number.
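The parse side restores full precision as well (a stdlib-only sketch):

```ruby
require 'json'

big = 2**128
json = JSON.generate(big)

# Core json parses arbitrarily large integers back to Integer,
# rather than truncating to a Float.
restored = JSON.parse(json)
puts restored == big # => true
puts restored.class  # => Integer
```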

> I don't see anything in [...] specifying that Strict will force floating points to specific precision vs other implementations

Yes, and that's my problem with it. As you said, Oj is free to do so by the JSON spec, but I'd bet 99% of users don't know it does that, and some of them may have had data truncation in production without realizing it.

So in terms of matching other libraries' performance: if another library is significantly faster on a given benchmark, I treat it as a bug, unless it's the result of the alternative trading what I consider correctness for speed.


There are a few more tolerant versions of JSON. In OjG I called the format SEN https://github.com/ohler55/ojg/blob/develop/sen.md


Very cool. I don't know if it's too much of an ask, but could you adapt that to also work with OjG, which uses JSONPath instead of the jq syntax? I'd be glad to help if you are up for it. My apologies if I am out of line.

