
Libcu++: Nvidia C++ Standard Library - andrew3726
https://github.com/NVIDIA/libcudacxx
======
fanf2
“ _Whenever a new major CUDA Compute Capability is released, the ABI is
broken. A new NVIDIA C++ Standard Library ABI version is introduced and
becomes the default and support for all older ABI versions is dropped._ ”

[https://github.com/NVIDIA/libcudacxx/blob/main/docs/releases...](https://github.com/NVIDIA/libcudacxx/blob/main/docs/releases/versioning.md)
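Per that versioning doc, a translation unit can pin an older ABI version explicitly before including any libcu++ header (the macro name is taken from the linked doc; treat this as a sketch, since the set of supported versions changes by release):

```cuda
// Sketch based on the linked versioning doc: request a specific
// libcu++ ABI version before including any of its headers.
#define _LIBCUDACXX_CUDA_ABI_VERSION 2   // pin ABI v2 instead of the current default
#include <cuda/std/atomic>
```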

~~~
MichaelZuo
It's interesting that they use the word "broken" to describe incompatible machine code. If the code is recompiled for each new version, then it's different from the old machine code by definition. Does any major software vendor support older versions of the ABI or machine code?

~~~
londons_explore
Famously Microsoft does with Windows. That's how an exe file from 25 years ago
can still run today.

~~~
moonchild
Yes, but GPU architecture changes very frequently.

Shaders from 15 years ago still work, but they're compiled on the fly to a GPU-dependent format. I expect you don't want to have to recompile an entire C++ stdlib every time you recompile your own code.

~~~
blelbach
> I expect you don't want to have to recompile an entire c++ stdlib every time
> you recompile your own code.

That's basically our current model; I discussed this on Twitter recently.

[https://twitter.com/blelbach/status/1307396914057326592](https://twitter.com/blelbach/status/1307396914057326592)

------
lionkor
> Promising long-term ABI stability would prevent us from fixing mistakes and
> providing best in class performance. So, we make no such promises.

Wait, NVIDIA actually gets it? Neat!

~~~
matheusmoreira
This is an awesome quote... Same argument used by the Linux kernel developers.

------
lars
It really is a tiny subset of the C++ standard library, but I'm happy to see
they're continuing to expand it:
[https://nvidia.github.io/libcudacxx/api.html](https://nvidia.github.io/libcudacxx/api.html)

~~~
roel_v
Yeah, really tiny... At first I thought 'wow this is a game changer', but then
I looked at your link and thought 'what's the point?'. Can someone explain
what real problems you can solve with just the headers in the link above?

~~~
happyweasel
It runs on the GPU?

~~~
roel_v
What runs on the gpu?

~~~
jcelerier
this library

------
RcouF1uZ4gsC
For everyone wondering where all the data structures and algorithms are: vector and several algorithms are implemented by Thrust.
[https://docs.nvidia.com/cuda/thrust/index.html](https://docs.nvidia.com/cuda/thrust/index.html)

Seems the big addition of libcu++ over Thrust would be synchronization.
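For example, the `std::vector`/`<algorithm>` analogues look like this (a hedged sketch; assumes a CUDA toolkit with Thrust, compiled with `nvcc`):

```cuda
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <cstdio>

int main() {
    int host[] = {5, 1, 4, 2, 3};
    thrust::device_vector<int> v(host, host + 5);  // std::vector analogue, GPU-resident
    thrust::sort(v.begin(), v.end());              // std::sort analogue, runs on the GPU
    int sum = thrust::reduce(v.begin(), v.end());  // std::accumulate analogue
    std::printf("sum = %d\n", sum);                // 15
}
```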

~~~
blelbach
Yep, that's correct. My team develops Thrust, CUB, and libcu++.

------
davvid
Here's a somewhat related talk from CppCon '19: "The One-Decade Task: Putting
std::atomic in CUDA"

[https://www.youtube.com/watch?v=VogqOscJYvk](https://www.youtube.com/watch?v=VogqOscJYvk)
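For a sense of what that decade of work enables, a sketch (assumes libcu++ under nvcc on a Volta-or-newer GPU, with managed memory so the host can read the result after synchronizing):

```cuda
#include <cuda/std/atomic>
#include <cstdio>
#include <new>

__global__ void count_threads(cuda::std::atomic<int>* counter) {
    // The same fetch_add you would write in host-side C++.
    counter->fetch_add(1, cuda::std::memory_order_relaxed);
}

int main() {
    cuda::std::atomic<int>* counter;
    cudaMallocManaged(&counter, sizeof(*counter));
    new (counter) cuda::std::atomic<int>(0);  // construct in managed memory
    count_threads<<<4, 256>>>(counter);
    cudaDeviceSynchronize();
    std::printf("%d\n", counter->load());  // 4 * 256 = 1024
    cudaFree(counter);
}
```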

------
jlebar
This is super-cool.

For those of us who can't adopt it right away, note that you can compile your CUDA code with `--expt-relaxed-constexpr` and call any constexpr function from device code. That includes all the constexpr functions in the standard library!

This gets you quite a bit, but not e.g. std::atomic, which is one of the big
things in here.
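A sketch of the tip above (compile with something like `nvcc -std=c++14 --expt-relaxed-constexpr`); the function below carries no `__device__` annotation, yet the kernel may call it because it is constexpr:

```cuda
#include <algorithm>
#include <cstdio>

constexpr int clamped_square(int x, int lo, int hi) {
    // std::min/std::max are constexpr since C++14, so this works in device code.
    return std::min(std::max(x * x, lo), hi);
}

__global__ void kernel(int* out) {
    out[threadIdx.x] = clamped_square(threadIdx.x, 0, 20);
}

int main() {
    int* out;
    cudaMallocManaged(&out, 8 * sizeof(int));
    kernel<<<1, 8>>>(out);
    cudaDeviceSynchronize();
    for (int i = 0; i < 8; ++i) std::printf("%d ", out[i]);  // 0 1 4 9 16 20 20 20
    std::printf("\n");
    cudaFree(out);
}
```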

------
BoppreH
Unfortunate name: "cu" is the most well-known slang for "anus" in Brazil (population: 200+ million). "Libcu++" is sure to cause snickering.

~~~
NullPrefix
This only affects developers. Limited scope.

Wasn't there something related about Microsoft Lumia phones?

~~~
kitd
cf. the Vauxhall Nova car

"No va" means "doesn't go" in Spanish.

~~~
andrepd
Also Hyundai Kona, "cona" means "cunt" or "pussy" in Portuguese.

~~~
FridgeSeal
Wow, Kona Bikes [0] must have a fun time in Portugal then...

[0] [https://konaworld.com/](https://konaworld.com/)

------
einpoklum
1. How do we know what parts of the library are usable on CUDA devices, and which are only usable in host-side code?

2. How compatible is this with libstdc++ and/or libc++, when used independently?

I'm somewhat suspicious of the presumption of us using NVIDIA's version of the
standard library for our host-side work.

Finally, I'm not sure that, for device-side work, libc++ is a better base to start off of than, say, EASTL (which I used for my tuple class: [https://github.com/eyalroz/cuda-kat/blob/master/src/kat/tuple.hpp](https://github.com/eyalroz/cuda-kat/blob/master/src/kat/tuple.hpp) ).

...

partial self-answer to (1.):
[https://nvidia.github.io/libcudacxx/api.html](https://nvidia.github.io/libcudacxx/api.html)
apparently only a small bit of the library is actually implemented.

~~~
blelbach
> apparently only a small bit of the library is actually implemented.

Yep. It's an incremental project. But stay tuned.

> I'm somewhat suspicious of the presumption of us using NVIDIA's version of
> the standard library for our host-side work.

Today, when using libcu++ with NVCC, it's opt-in and doesn't interfere with
your host standard library.

I get your concern, but a lot of the restrictions of today's GPU toolchains come from the desire to continue using your host toolchain of choice.

Our other compiler, NVC++, is a unified stack; there is no host compiler. Yes,
that takes away some user control, but it lets us build things we couldn't
build otherwise. The same logic applies for the standard library.

[https://developer.nvidia.com/blog/accelerating-standard-c-with-gpus-using-stdpar](https://developer.nvidia.com/blog/accelerating-standard-c-with-gpus-using-stdpar)
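The stdpar model from that post is plain ISO C++; a minimal sketch (assuming compilation with `nvc++ -stdpar`, which may offload parallel algorithms like this to the GPU):

```cuda
#include <algorithm>
#include <execution>
#include <vector>
#include <cstdio>

int main() {
    std::vector<float> v(1 << 20, 1.0f);
    // Standard C++17 parallel algorithm; no CUDA-specific code at all.
    std::transform(std::execution::par_unseq, v.begin(), v.end(), v.begin(),
                   [](float x) { return x * 2.0f; });
    std::printf("%.1f\n", v.front());  // 2.0
}
```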

> Finally, I'm not sure that, for device-side work, libc++ is a better base to start off of than, say, EASTL (which I used for my tuple class: [https://github.com/eyalroz/cuda-kat/blob/master/src/kat/tupl](https://github.com/eyalroz/cuda-kat/blob/master/src/kat/tupl)... ).

We wanted an implementation that intended to conform to the standard and had
deployment experience with a major C++ implementation. EASTL doesn't have
that, so it never entered our consideration; perhaps we should have looked at
it, though.

At the time we started this project, Microsoft's Standard Library wasn't open
source. Our choices were libstdc++ or libc++. We immediately ruled libstdc++
out; GPL licensing wouldn't work for us, especially as we knew this project
had to exchange code with some of our other existing libraries that are under
Apache- or MIT-style licenses (Thrust, CUB, RAPIDS).

So, our options were pretty clear: build it from scratch, or use libc++. I have a strict policy of strategic laziness, so we went with libc++.

~~~
justicezyx
How does this library work?

There appears to be an LLVM libc++ bundled as part of the repo. What's the purpose of that libc++?

~~~
blelbach
That involves a few diagrams, but essentially, we have two layers:

- the libcu++ layer, which has some of our extensions and implementations specific to our platform.

- the libc++ layer, which is a modified upstream libc++.

A header in the libcu++ layer defines the libc++ internal macros in a certain way, and then includes the applicable libc++ header.

This is the current architecture, but we're moving towards a more integrated approach where almost everything is in the libc++ layer.
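Roughly, the pattern looks like this (every macro and path name below is made up for illustration; the real internals differ):

```cuda
// A libcu++-layer header, hypothetically:
#define _INNER_EXEC_SPACE __host__ __device__  // hypothetical: tell the inner layer how
#define _INNER_HAS_NO_THREADS 1                // to annotate/configure its declarations
#include "../../libcxx/include/atomic"         // then include the bundled libc++ header
```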

------
Mr_lavos
Does this mean you can do operations on structs that live on the GPU hardware?

~~~
shaklee3
You have been able to do that for a long time with UVA.

~~~
blelbach
Since Unified Memory. UVA, or Unified Virtual Addressing, just ensured that a
GPU-private object wouldn't have the same address as a CPU-private object.
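Concretely, with Unified Memory a struct lives in one allocation that both sides can dereference (a sketch; assumes a Pascal-or-newer GPU):

```cuda
#include <cstdio>

struct Particle { float x, y, vx, vy; };

__global__ void step(Particle* p, int n, float dt) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        p[i].x += p[i].vx * dt;  // the GPU mutates the same structs...
        p[i].y += p[i].vy * dt;
    }
}

int main() {
    Particle* p;
    cudaMallocManaged(&p, 256 * sizeof(Particle));  // one allocation, visible to host and device
    for (int i = 0; i < 256; ++i) p[i] = {0.f, 0.f, 1.f, 2.f};
    step<<<1, 256>>>(p, 256, 0.5f);
    cudaDeviceSynchronize();
    std::printf("%.1f %.1f\n", p[0].x, p[0].y);  // ...that the host then reads: 0.5 1.0
    cudaFree(p);
}
```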

~~~
shaklee3
You're right, sorry. Mixing up terms.

~~~
blelbach
Not your fault, we don't make it easy. The acronyms are terrible! That's why I typically spell out the full term.

My first week at NVIDIA:

Me, to very senior engineer: _something something_ UVM.

Very senior engineer: What's UVM?

Me: Unified Virtual Memory.

Very senior engineer: Don't call it that, call it Unified Memory, no
abbreviation. TLAs are evil.

Me: What's TLA?

Very senior engineer: Three letter acronym.

------
gj_78
I really do not understand why a (very good) hardware provider is willing to create/direct/hint at custom software for its users.

Isn't this exactly what GPU firmware is expected to do? Why do they need to run software in the same memory space as my mail reader?

~~~
dahart
What do you mean about running in the same memory space? Your operating system doesn't allow that. Is your concern about using host memory? This open source library doesn't automatically use host memory; users of the library can write code that uses host memory if they choose to.

How would firmware help me write heterogeneous bits of C++ code that can run on either CPU or GPU?

~~~
gj_78
IMHO, the question is not whether we need code to run on CPUs and GPUs; we do need that. The question is whether the GPU seller has to control both sides. Until I buy a CPU from NVIDIA, I want to keep some kind of independence.

When will we be able to use a future RISC-V 64-bit CPU with an NVIDIA GPU? Will we leave the answer to NVIDIA?

~~~
dahart
You _can_ use this library to write code that runs on both risc-v and a GPU!
You seem to be pretty confused about what this library is. It’s not exerting
any control. It’s open source! It’s strictly optional, and it only allows
developers to do something they actually want, to write code that will compile
for any type of processor that a modern c++ compiler can target.

~~~
gj_78
Again, I see what you mean. I am even against NVIDIA advising developers to use such and such a C++ library (be it GNU). It is not their role to do that. We need smarter and shinier GPUs from NVIDIA, not software.

I would say... the hardware must be sold independently of the software... but it is a bit too complex, I know.

~~~
dahart
I'm not understanding your point at all. You don't think developers should be
able to write C++ code for the GPU?

What do you even mean by 'it is not their role to do that' and 'hardware must be sold independently of the software'? Why are you saying this? Software interfaces are critical for all GPUs and all CPUs; just ask AMD & Intel. There is no such thing as CPU or GPU hardware independent of software. Plus, the specific library here _is_ being sold independently of the hardware; it is doing exactly what you say you want: it's separate and doesn't require having any other NVIDIA hardware or software. (I can't think of any good reasons to use it without having some NVIDIA hardware, but it is technically independent, as you wish.)

~~~
gj_78
> You don't think developers should be able to write C++ code for the GPU?

To be clear, I don't think NVIDIA-paid developers should be able to write C++ code for an NVIDIA-sold GPU. The world will be better if any developer (paid by NVIDIA or not) is able to write code for any GPU (sold by NVIDIA or not). It is not NVIDIA's role to say how or when software will be written. Their hardware is good, and that's more than OK.

AI/CUDA code written specifically for NVIDIA is useless/deprecated in the long term. A lot of brain waste.

~~~
jki275
That doesn’t make any sense.

You’re free to write whatever you want. This is Nvidia providing interfaces to
their hardware for those of us who don’t want to write them for ourselves.

It’s a gift. Take it or don’t. How in the world you can say Nvidia shouldn’t
be allowed to write software for their GPUs makes no sense at all. Should the
government stop them? Any developer can write anything they want - but Nvidia
is obviously going to support their own hardware. How does it make any sense
otherwise?

All code is “deprecated in the long term” for a long enough “long term”. That
doesn’t equal useless. Your comment is nonsensical.

~~~
blelbach
> It’s a gift.

I wouldn't say it's a gift, though; it's part of what you pay for when you buy
one of our products.

Sure, it's not listed as a spec on the box, but users expect that we're going
to provide them with a good software stack and support it.

------
scott31
A pathetic attempt to lock developers into their hardware.

~~~
jpz
They seem to be pushing the barrier of innovation in GPU compute. It seems a little unfair to call that pathetic, whatever strategic reasons they have for finding OpenCL unappetising (which, in truth, simply enables their sole competitor).

Their decision-making seems rational; of course it's not ideal if you're a consumer. We would like the ability to play NVIDIA off against AMD Radeon.

Convergence to a standard has to be driven by the market, but it's impossible
to drive NVidia there because they are the dominant player and it is 100% not
in their interests.

It doesn't mean they're a bad company. They are rational actors.

~~~
my123
With nvc++, they are converging towards a standard at the source-code level:
[https://developer.nvidia.com/blog/accelerating-standard-c-with-gpus-using-stdpar/](https://developer.nvidia.com/blog/accelerating-standard-c-with-gpus-using-stdpar/)

However, this notably doesn't cover binaries, which are GPU-vendor-specific in that case, so AMD, for example, would have to provide a C++ compiler implementing stdpar for GPUs targeting their hardware.

