
> Why would you recommend that? [..] It's a bunch of work for no benefit.

nanobind/pybind11 (co-)author here. The space of python bindings is extremely diverse and on the whole probably looks very different from your use case. nanobind/pybind11 target the 'really fancy' case you mention specifically for codebases that are "at home" in C++, but which want natural Pythonic bindings. There is near-zero overlap with Cython.
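
For readers unfamiliar with it, here is a minimal sketch of the kind of binding pybind11/nanobind target (the Vec2 class and module name are made up for illustration):

    // example.cpp -- build as a Python extension module
    #include <pybind11/pybind11.h>
    #include <cmath>

    namespace py = pybind11;

    struct Vec2 {                      // an ordinary C++ type...
        float x, y;
        float norm() const { return std::sqrt(x * x + y * y); }
    };

    PYBIND11_MODULE(example, m) {      // ...exposed with Pythonic semantics
        py::class_<Vec2>(m, "Vec2")
            .def(py::init<float, float>())
            .def_readwrite("x", &Vec2::x)
            .def_readwrite("y", &Vec2::y)
            .def("norm", &Vec2::norm);
    }

After compiling, `import example; example.Vec2(3, 4).norm()` behaves like a native Python class.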


Yes, I assumed everyone who cares about performance (or who writes large programs) is also a C++ or CUDA programmer. Don't tell me that is not the case :-)


There is also IINA, which somehow seems snappier and better integrated on macOS.


IINA is basically a front-end for mpv. I use it. It's got nice features like force-touch support, but in my experience it's a memory hog—way worse than base mpv somehow.


Caution: if you rely on pybind11 or a project using pybind11 (many projects do, like NumPy/SciPy/Tensorflow/PyTorch..), hold off on upgrading to Python 3.9.0 for now.

A change in Python 3.9.0 introduces undefined behavior in combination with pybind11 (rarely occurring crashes, but could be arbitrarily bad). We will work around it in an upcoming version of pybind11, and Python will separately also fix this problem in 3.9.1 slated for release in December.

Details available here: https://pybind11.readthedocs.io/en/latest and https://github.com/python/cpython/pull/22670.


I was about to ask why they would do a release if they know it's broken and blowing up their flagship libraries. It's worrying when the interpreter starts to favor shipping quickly over being functional.

But never mind, the 3.9 release was 15 days ago and the bug was identified 8 days ago. I guess there has to be a release before people start using it and finding bugs/regressions.

There's a saying to always wait for the .1 release; this is a good illustration of why.


I don't know if I would call the SciPy ecosystem the flagship libraries of Python. I know it is very popular, probably in the top 5 use cases, and I use some of those libraries every day, but in my experience they've always been off in their own world relative to the rest of the Python community. Most of the people I've met who use this stuff are either scientists or finance people who treat it as just another tool alongside MATLAB, R, and Excel. Which is what they should be doing, but in their minds they're not writing Python code; they're using Pandas, Jupyter, and Matplotlib.

But yes, never ever jump on a new release of anything, at least in production. Always stand back and wait for someone more optimistic to find the bugs.


> I don't know if I would call the SciPy ecosystem the flagship libraries of Python.

I mean, sure, NumPy is integral to Python finance and scientific computing... but it's also right at the heart of the popular Python roguelike tutorial.

There's Python code that doesn't use NumPy, sure (I've got some in production right now), but I can't personally think of another Python library that is depended on as deeply across as diverse a range of use cases.


What makes pybind a "flagship library"? If it was very widely used this bug probably would've been caught by people testing the alpha/beta/rc versions of 3.9.


> What makes pybind a "flagship library"?

NumPy, SciPy, and friends are, individually and collectively, flagship libraries, and they depend on pybind11.

> If it was very widely used this bug probably would've been caught by people testing the alpha/beta/rc versions of 3.9.

They are very widely used, but “probably” means “do enough releases and you will find exceptions”. Welcome to an exception.


People using NumPy are not on the fringe. If this wasn't caught during the prereleases, then either it doesn't always cause problems or there is an issue with the diversity of testers.


The parent was probably talking about NumPy/SciPy/Tensorflow/PyTorch, which I would also label as flagship libraries...


I personally don't like Python (I prefer Julia/Ruby/Rust), but all the new advances in machine learning are implemented in Tensorflow and PyTorch, so I came back to Python just because of these libraries and infrastructure on top of them.

I was just using the expit function from scipy yesterday by the way, because that's where it was implemented.
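
(For reference, scipy's expit is the logistic sigmoid, expit(x) = 1/(1 + exp(-x)), implemented in scipy.special; the name comes from it being the inverse of the logit function.)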


> If it was very widely used this bug probably would've been caught

As the top-level comment says, it rarely causes crashes.


Holding off on upgrading to a new major version is always a best practice, unless you really need a specific feature in the new release.


Or you feel the call to adventure!


What's wrong with red-black trees? :) Since you mention C++: you do realize that std::map will typically be implemented with one?


You want more fanout to use the entire cache line or even the entire virtual memory page. RB trees assume access to any address in memory has equal cost, which is extremely incorrect.
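
Schematically (node layouts are illustrative, assuming 64-bit pointers and 64-byte cache lines):

    #include <cstdint>

    struct RBNode {
        RBNode *left, *right, *parent;  // 24 bytes of links around...
        bool red;
        uint64_t key;                   // ...a single key: each step of a
    };                                  // lookup is likely a fresh cache miss

    struct BTreeNode {
        uint64_t keys[7];               // 56 bytes of keys fit in one line,
        BTreeNode *children[8];         // so a single miss buys up to
        int count;                      // seven key comparisons
    };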


The red-black tree implementation typically used in *BSD development, <sys/tree.h>, is an intrusive data structure; tree nodes are embedded within the object you must dereference for key comparison. There's no need to optimize fanout because you've entirely removed that extra indirection.

As a general purpose data structure implementation for systems development, <sys/tree.h> is quite nice. However, for any particular task you can always optimize the data structure, or choose an entirely different data structure, to better fit the problem. Which is perhaps why the Linux kernel has so many different tree and hash implementations.
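
Roughly the layout difference (a sketch of the idea, not the actual <sys/tree.h> macros):

    #include <cstdint>

    // Non-intrusive: the container allocates nodes that point at your
    // objects, so every key comparison costs an extra dereference.
    struct ext_node { ext_node *left, *right, *parent; struct vm_object *obj; };

    // Intrusive: the links live inside the object itself; reaching a
    // node already puts the key in hand.
    struct vm_object {
        struct { vm_object *left, *right, *parent; } rb_entry;
        uint64_t key;
    };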


std::map and std::set are famously terrible, for that reason: https://youtu.be/fHNmRkzxHWs?t=2695


Indeed. C++ needs some better Standard containers. As it is, you need 3rd party library containers when you need optimal performance on ordered collections. Abseil and boost have alternatives.

The great advantage C++ has is that there is no temptation to open-code them. Library implementations can absorb immense optimization and testing effort, amortized over all uses, and delivered without compromise.
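
For example (a sketch; both are close to drop-in replacements for std::map, modulo iterator-invalidation and pointer-stability guarantees):

    #include <absl/container/btree_map.h>    // Abseil's B-tree ordered map
    #include <boost/container/flat_map.hpp>  // Boost's sorted-vector map
    #include <string>

    void demo() {
        absl::btree_map<int, std::string> a;  // packs many keys per node
        a[1] = "cache-friendly ordered lookups";

        boost::container::flat_map<int, std::string> b;  // contiguous storage
        b[2] = "binary search over a sorted array";
    }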


Horrible cache utilization.


Compared to what?


If you're not interleaving insertions and lookups, then a sorted vector can be really good.

If you need to interleave insertions and lookups, but don't need to traverse in sorted order, then a good hash table (not std::unordered_map) is normally the best option.

You only need a tree if you need traversal in sorted order and interleaved insertions and lookups, which is pretty uncommon. Even then you are almost always better off with a B-tree than a red-black tree.
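
For the first case, the entire "container" can be a plain vector (a minimal sketch):

    #include <algorithm>
    #include <vector>

    // Bulk-load, sort once, then answer lookups by binary search over
    // contiguous memory -- no pointer chasing at all.
    void example() {
        std::vector<int> keys = {42, 7, 19, 3};
        std::sort(keys.begin(), keys.end());  // one up-front O(n log n) pass
        bool found = std::binary_search(keys.begin(), keys.end(), 19);
        (void)found;                          // O(log n), cache-friendly lookups
    }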


Even with interleaved insertions and lookups, there are common scenarios that make a sorted vector still significantly more performant. I wrote about it when we published our open source SortedList<T> implementation for .NET and .NET Core [0], specifically comparing it to AVL and Red-Black trees.

[0]: https://neosmart.net/blog/2019/sorted-list-vs-binary-search-...


That's a good point, and a really interesting blog post.


Seems like these criteria miss the use case for mm_rb, one of the central red-black trees in Linux.

If you need to be able to store intervals (i.e. virtual memory areas) and do lookups based on any address in the interval, not just the base address, I don't think a hash map is the best option.
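
An ordered structure answers that query directly; e.g. with a map keyed by start address (a sketch of the lookup idea, not how the kernel actually implements it):

    #include <cstdint>
    #include <map>

    struct Interval { uint64_t start, end; };  // [start, end)

    // Find the interval containing addr, if any.
    const Interval *find(const std::map<uint64_t, Interval> &m, uint64_t addr) {
        auto it = m.upper_bound(addr);  // first interval with start > addr
        if (it == m.begin()) return nullptr;
        --it;                           // candidate: greatest start <= addr
        return addr < it->second.end ? &it->second : nullptr;
    }

A hash table has no equivalent of that upper_bound step, which is exactly the sorted-order requirement mentioned above.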


IMO "lookups based on any address in the interval" requires "traversal in sorted order", although I could have probably been more precise in my terms.

I would be curious if anyone has ever profiled the impact of changing mm_rb to a B-tree. It might be very difficult if existing code that uses mm_rb depends on pointer stability, though.


I believe Splay trees are usually mentioned as an alternative.


Aren't splay trees terrible because they turn all reads into writes? (Which means the cache lines bounce between readers, instead of being shared as they would be with pure reads.)


Yeah, having reads rebalance the tree in a multithreaded subsystem is probably not optimal. Might be that splay trees have outstayed their welcome :)

FreeBSD’s tree.h used to have, iirc, both RB- and Splay-trees.


A B-tree set, for example.


std::set too, which really is just a map where the key and value are the same.


Here are links to more concrete studies concerning this effect.

Action spectrum for melatonin regulation in humans: evidence for a novel circadian photoreceptor. Brainard et al. https://www.jneurosci.org/content/jneuro/21/16/6405.full.pdf

Phototransduction by retinal ganglion cells that set the circadian clock. Berson et al. https://science.sciencemag.org/content/sci/295/5557/1070.ful...


Note that this is only for OpenGL ES!


The extra-painful part about this whole situation is that it's apparently only due to Apple blocking NVIDIA from signing and publishing their drivers (for whatever bizarre reason they may have for doing that).


I know it stinks for the people who are accustomed to Nvidia's workflow, but it is quite hard to feel bad for Nvidia. A bigger fish is doing to them what they do to smaller players.

Nvidia blocks the Nouveau drivers from working properly on Maxwell2 or newer cards: https://www.phoronix.com/scan.php?page=news_item&px=Nouveau-...



I think there’s more to the story than Nvidia is letting on. It wouldn’t surprise me a single bit if Apple is blocking Nvidia due to something like QA issues — Nvidia’s web drivers are notoriously buggy under macOS and Nvidia seems completely unrepentant about it. Pair this with Nvidia’s unwillingness to share source and collaborate with other companies on development and you’ve got problems.

I currently run an EVGA 980Ti Classified in my hackintosh. It’s great hardware, but with the frequency of glitches that the drivers bring I’m increasingly inclined to sell it and replace it with an AMD card.


The maintainers of SCons have long argued that the perceived performance and scalability issues do not exist (https://github.com/scons/scons/wiki/WhySconsIsNotSlow).

I've long been really excited about SCons but eventually decided to move away because it became unbearably sluggish for a large-ish codebase with >180K lines of C++ code split into many files. Another issue is cross-platform builds: SCons breaks every time there is a new version of Visual Studio, and it takes many months until an updated version restores compatibility.


That blog post is an impressive logical leap on the author's part.

I realise that the post is not recent, but the benchmarks are on seriously outdated software and hardware; those specs are over a decade old at this stage. His conclusions are also misleading: all of his graphs show that Make is 2-10x faster than SCons, and that Make scales better than SCons. He seems hell-bent on proving there's no quadratic complexity, despite there being an order-of-magnitude difference!

On the other hand, I don't think it's reasonable to assume every build system will immediately support every compiler/IDE. At the end of the day, SCons is an open-source project, and if that's the only issue stopping you from using it, I'm sure they'd be happy to accept patches adding support rather than have you wait months.


Scons is almost 20 years old at this point. There are projects like waf that tried to fix it but ended up incompatible (and faster and more usable).

Scons has great ideas, but something either about the implementation or about reality fails to deliver.


As is CMake. Make is about to hit a mid-life crisis. Other projects like Ninja are faster and more lightweight, projects like WAF/Premake use a scripting language instead of a DSL, and others like FASTbuild claim to support distributed builds. At a certain point, software has to be accepted for what it is, not what it claims to be.


I agree it's a peculiar system to use for benchmarking in 2018, but for comparison purposes it shouldn't matter much as long as the same software/hardware are used for both Make and SCons.

There's definitely some mental gymnastics and careful massaging going on to hide the performance problem. Would be a great example for a "How to lie with charts and graphs" article.


I think it is still invalid to use for a benchmark. We don't judge a race track based on how quickly 10-year-old cars can navigate it.

In the last decade we have gone from "everyone needs an antivirus" to "Windows 10 is good enough for most people". We've had Spectre and Meltdown; SSDs have become viable and been superseded by NVMe SSDs; we've had a decade of OS development, filesystem development, and hardware improvements. We've also had huge improvements in build systems; Ninja was only released in 2012!

And yet, even after all of that, the TL;DR from the article is "SCons is an order of magnitude slower than Make, but my hardware was the bottleneck before I could show that it scales quadratically, therefore SCons is not slow".


> On the other hand, I don't think it's reasonable to assume every build system will immediately support every compiler/IDE.

If your API is stable, why wouldn't it be reasonable to expect that it should work out of the box?


I don't think Visual Studio has ever claimed to have a stable API. 2019/2017/2015 have been good, but there were definitely some teething issues migrating to their current situation. That doesn't necessitate months of waiting; there is nothing stopping me or you from adding support for an unstable API.


Try WAF. It was created about twelve years ago as a fork of SCons because the author was dissatisfied with its performance problems. WAF is in my opinion a very capable build and configuration system, but the main drawback is its small community. The person developing WAF isn't keen on "marketing" it.


The only place I have encountered WAF was at Bloomberg, where it was used to build the C++ foundation library (BDE) that Bloomberg uses in older projects in place of Boost and Std, both of which it predates.

WAF worked, and didn't seem especially slow, unlike Scons, but I frequently needed to code Python to make it do what seemed like pretty ordinary things.


The alternative I've enjoyed is Bazel.


Meson is basically SCons done right.


Personally I see Meson as an attempt to do CMake right. A big difference between Meson and SCons is that SCons handles the execution of the build graph, while Meson delegates execution to Ninja. The nice thing about the former is that while the graph is being walked, new nodes can be added. The nice thing about the latter is that it's incredibly efficient.


Why not both? :-) While the architecture is more similar to CMake, Meson's language reminds me of SCons, but it is a custom DSL and not Python, which makes things less surprising. For example, in SCons a single-element list can be replaced with the element itself, but that is a feature of SCons's builtins: if you write custom Python code, you can get confusing results when you place a string where a list of strings is required.

Also, SCons is Turing-complete, which makes it a bit of a stretch to call the language declarative; Meson keeps simple loops but is not Turing-complete, and the maintainers are okay with plans/pull requests that make Meson even more declarative. In Meson there is generally "only one way to do it"; in SCons much less so, which is ironic given that it uses Python as the language.


All these systems annoy me with their dependencies... the only one that I kinda don't mind is Premake, because it consists of a single self-contained executable. However, even with that you need to have Premake available. I'd prefer it if these programs took a page out of Autotools' book and generated configuration scripts that relied only on whatever the target OS has available (plus the compiler), without the need to install a separate build system; after a while you end up with a bunch of different build systems, since everyone prefers theirs.


Meson's only dependency is Python; no external packages are needed, only the standard library.

The only annoying part of Meson is that it supports just a few compilers (GCC, Clang, ICC, MSVC) and, unlike Autoconf, requires porting for new compilers.


Meson is very different to SCons. Meson is more like a better CMake.


It's not slow?

It takes 30s, on a decent-size codebase, just to tell you "Nothing to do." There is an order-of-magnitude difference between it and Make. It's a complete travesty!


I've never heard of half these build systems. Are they mostly used for C++-heavy repos?


Probably the biggest SCons project is Blender (unless they've switched to something else since last I worked on it).


Yes.


Lack of NVIDIA support is a deal-breaker. The AMD ecosystem is just so far behind when it comes to frameworks like CUDA, OptiX, cuDNN, etc. Why can't Apple open up kernel-level support by cooperating more with NVIDIA? This state of things seems completely bizarre to me.


I know this is an Apple/Mac discussion, but from a Linux/OSS perspective, it looks very different. If it weren't for CUDA, Tensorflow and so on, I wouldn't even consider buying NVIDIA (anymore).

Currently, I have a laptop with NVIDIA graphics and a desktop PC with AMD graphics. The AMD stuff just works out of the box (OSS, good performance, happy user). But NVIDIA either comes with nouveau (OSS, poor performance) or the proprietary NVIDIA binary driver, which has all sorts of weird issues (e.g. always-on fans, animations running at different speeds, etc.).

Sure, this doesn't say anything about how a Mac would run with NVIDIA drivers, but it hints that NVIDIA has its own weak spots. But the reason why Apple decided to skip NVIDIA in this case is probably rooted deeper within their strategy.


There was an anecdote on a semi-recent (sometime this year) episode of ATP where an inside source related to them that nVidia were a terrible organization to try and work with. They apparently screwed Apple over, and Apple has a long memory for that sort of thing.


The AMD ecosystem will always remain behind if application developers continue to choose single-vendor proprietary frameworks rather than standard APIs.


This is an odd complaint considering the Metal requirement on MacOS.


I'm not happy about Metal either. There is at least MoltenVK…


Maybe it's NVIDIA blocking the way? If Linus is correct, then they are not a nice company to work with.


A member of NVIDIA's CUDA Product Management wrote in October 2018:

"Apple fully control drivers for Mac OS. But if Apple allows, our engineers are ready and eager to help Apple deliver great drivers for Mac OS 10.14 (Mojave)." -- https://devtalk.nvidia.com/default/topic/1042279/cuda-setup-...


Well, that is sad then... Maybe they want to push the Metal platform? idk.


macOS Catalina (10.15) is built entirely on top of Metal. So, yes.


To hear Nvidia tell it, it's Apple's fault. Weird how everybody who works with Nvidia has issues, though.


There's been a long-simmering hostility between the two companies. Broken down somewhat here: https://appleinsider.com/articles/19/01/18/apples-management...


There was some sleight of hand in the benchmarking for me too: they compared GPU compute rendering engine speeds to "Nvidia Quadros" when no one doing CUDA rendering on Windows uses Quadro cards; they all use GTX because the price-to-performance is ridiculous on Octane Render/Redshift/etc.


The material database provides broad coverage of common and more specialized materials, including isotropic and anisotropic BRDFs of metals, paper, car paints, organic samples, fabrics, etc. Each material is available in a spectral version that covers the 360 nm (UV) - 1000 nm (NIR) range with ~4 nm sample spacing, as well as an RGB version for compatibility with renderers that do not support color spectra. The representation is extremely compact, requiring approximately 16 KiB and 544 KiB per channel for isotropic and anisotropic materials, respectively. Furthermore, it provides a natural importance sampling operation that does not require numerical fits or complex additional data structures. Technical details on the underlying parameterization and measurement methodology are available in the paper:

An Adaptive Parameterization for Efficient Material Acquisition and Rendering (http://rgl.epfl.ch/publications/Dupuy2018Adaptive)

