You can use relaxation for discrete variables (e.g. by using convex simplex), replacing them with differentiable variables, and then just discretize after the very end of the computation. A common trick for variational autoencoders that are another way to do generative models.