I too would be interested in understanding this better.
Let's say we're building a medical segmentation model, which takes a patient image and outlines a tumour (or some other feature that's unique to them). I am not sure this matters here, but let's say the model is a basic 2D U-net. Image pixels in, binary pixel labels out (cancer/non-cancer).
At a high level, how would a differentially-private setup work for training such a model across multiple institutions without pooling their patient data?