Just had a look (https://github.com/duckdb/duckdb/issues/9399). Yeah, it's worrying that such a trivial query returned incorrect results - but credit to the devs for getting it fixed quickly.
To my knowledge the only databases that can be described as "military-grade" in terms of testing are SQLite and Postgres.
Apparently DuckDB requires your real-life name to file an online bug report, bucking every norm of online handles for communication, as well as enabling doxxers and stalkers to find and trace people in real life.
It seems to be about wanting to know who you're talking to when providing free support for an open-source project, and whether the person submitting an issue is using the project for personal use or within an organization.
> If I don’t know who you are, am I enabling you to build the new Turkish censorship infrastructure, or helping you implement [Russian internet blocking] more efficiently? These are two examples that actually happened by the way.
It's the same with a lot of open source projects; they ask contributors for real names for legal and copyright reasons.
If you're afraid of doxxing and / or stalking though, at least you have the choice to not contribute. You can still post somewhere else and ask someone else to make the report for you if need be.
Query engines are (un)surprisingly complex software products. Add to that the constant (and aggressive, due to the competition in the field) evolution and addition of new features that can interact with every existing feature in any existing context, and you have a perfect environment for bugs to appear.
Both the Brier score and log loss are proper scoring rules (i.e. optimized when the predicted probabilities are the true outcome probabilities), and the choice between the two seems to have minimal impact on the conclusions that can be drawn (https://pubsonline.informs.org/doi/abs/10.1287/deca.2013.028...). I covered the Brier score in the post as I thought it would be easier to digest for a general audience.
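To make the "proper scoring rule" point concrete, here is a minimal sketch (not from the post; the function names and the grid search are my own illustration). It assumes a single binary event with true probability 0.7 and checks which forecast on a 0.01 grid minimizes the expected score under each rule.

```python
import math

def brier(q, y):
    """Squared error between forecast probability q and outcome y (0 or 1)."""
    return (q - y) ** 2

def log_loss(q, y):
    """Negative log-likelihood of outcome y under forecast probability q."""
    return -math.log(q if y == 1 else 1.0 - q)

def expected_score(score, q, p_true):
    """Average score of forecasting q when the event occurs with probability p_true."""
    return p_true * score(q, 1) + (1.0 - p_true) * score(q, 0)

def best_forecast(score, p_true):
    """Forecast on a 0.01 grid (endpoints excluded) with the lowest expected score."""
    grid = [i / 100 for i in range(1, 100)]
    return min(grid, key=lambda q: expected_score(score, q, p_true))

p_true = 0.7  # assumed true outcome probability, for illustration only
best_brier = best_forecast(brier, p_true)
best_log = best_forecast(log_loss, p_true)
print(best_brier, best_log)
```

Under both rules the minimizing forecast should be the true probability itself, which is what "proper" means here. The two rules differ in how harshly they punish confident misses, not in where the optimum sits.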
As Frank Harrell wrote on his blog (https://www.fharrell.com/post/class-damage/), one advantage of the Brier score could be its interpretability and the ability to decompose it into discrimination and calibration components.
Indeed. Note though that proper scoring rules form a large class and it can matter which one you choose.
For example, for logistic regression, things become a lot simpler if one chooses log loss (equivalently KL divergence) because one ends up with a convex minimization problem. Had one chosen the Brier score here, the problem would no longer be convex, and where one starts the training iteration would determine where the updates converge to. Sometimes this indeterminacy is a problem: if I am getting poor results, is it because the data has changed, or because my initial seed has changed and the updates have converged to a worse solution?
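A rough numerical check of the convexity claim, as a sketch under simplifying assumptions: a single training example with feature x = 1 and label y = 1, and a one-parameter model p = sigmoid(w). The helper names are hypothetical. A convex function has non-negative second differences everywhere, so a negative one is enough to show non-convexity.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss_at(w):
    """Log loss for the single example (x=1, y=1): -log(sigmoid(w))."""
    return math.log1p(math.exp(-w))

def brier_at(w):
    """Brier score for the same example: (1 - sigmoid(w))^2."""
    return (1.0 - sigmoid(w)) ** 2

def min_second_difference(loss, lo=-6.0, hi=6.0, step=0.5):
    """Smallest value of loss(w+h) - 2*loss(w) + loss(w-h) over a grid of w."""
    n = int((hi - lo) / step) + 1
    return min(
        loss(w + step) - 2.0 * loss(w) + loss(w - step)
        for w in (lo + k * step for k in range(n))
    )

log_loss_min = min_second_difference(log_loss_at)
brier_min = min_second_difference(brier_at)
print("log loss:", log_loss_min)   # stays positive on this grid
print("brier:   ", brier_min)      # goes negative, so not convex in w
```

For this toy case the Brier curve flattens out where the prediction is confidently wrong (sigmoid(w) below 1/3), which is where its curvature turns negative. With real data the loss is a sum of such terms, so this only illustrates the mechanism rather than proving anything about a particular dataset.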
Agreed. It would be great to hear your views on some of the key gaps in modern data science curricula that could be covered in the blog - would you be able to drop me a line at datarecipes@pm.me? Thanks!
I agree that the post lacks depth, but it was intended to be a gentle article accessible to a general audience, so they can start applying it in practice in their day to day lives. I would, however, really love to hear your views on what might be a more rigorous treatment of similar topics that can be introduced in an accessible way - would you be able to drop me a line at datarecipes@pm.me? Thanks!