Curious what value you've seen from these clusters. In my experience k-means clustering was very lackluster, and having to define the number of clusters up front was a big pain point too.
You almost certainly want a graph-like structure (overlapping communities rather than clusters).

But unsupervised clustering was almost entirely ineffective for every use case I had :/
I only got the clustering working this morning, so aside from playing around with it a bit I've not had any results that have convinced me it's a tool I should throw at lots of different problems.
I mainly like it as another example of the kind of things you can use embeddings for.
There are iterative methods for optimizing the number of clusters in k-means (silhouette and knee/elbow are common), but in practice I prefer density-based methods like HDBSCAN and OPTICS. There's a very basic visual comparison at https://scikit-learn.org/stable/auto_examples/cluster/plot_c....
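To make that concrete, here's a minimal sketch of both approaches, with toy blobs standing in for embedding vectors (sklearn's built-in HDBSCAN needs scikit-learn >= 1.3; the standalone hdbscan package has the same fit_predict interface; min_cluster_size and the k range here are arbitrary, not recommendations):

    import numpy as np
    from sklearn.cluster import KMeans, HDBSCAN  # HDBSCAN: scikit-learn >= 1.3
    from sklearn.datasets import make_blobs
    from sklearn.metrics import silhouette_score

    X, _ = make_blobs(n_samples=500, centers=4, random_state=0)  # stand-in for embeddings

    # Iterative k selection: fit k-means for a range of k and keep the
    # k with the best silhouette score.
    scores = {k: silhouette_score(X, KMeans(n_clusters=k, n_init=10,
                                            random_state=0).fit_predict(X))
              for k in range(2, 10)}
    best_k = max(scores, key=scores.get)
    print(f"silhouette picks k={best_k}")

    # Density-based alternative: no k at all, and label -1 marks noise.
    labels = HDBSCAN(min_cluster_size=10).fit_predict(X)
    print(f"HDBSCAN found {labels.max() + 1} clusters, "
          f"{(labels == -1).sum()} noise points")

The noise label is the big practical win with embeddings: points that don't belong anywhere stay unassigned instead of being forced into the nearest centroid.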
You could also use a Bayesian version of k-means. It puts a Dirichlet process prior over an infinite (in practice truncated) set of clusters, so the most probable number of clusters k is found automatically.
I found one implementation here: https://github.com/vsmolyakov/DP_means
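The DP-means algorithm itself (Kulis & Jordan, 2012) is a small tweak on Lloyd's iteration: any point whose squared distance to every existing center exceeds a penalty lambda spawns a new cluster. A rough NumPy sketch, not the linked repo's code (lam and max_iter are illustrative and need tuning to your distance scale):

    import numpy as np

    def dp_means(X, lam, max_iter=100):
        # DP-means: like k-means, but a point further than sqrt(lam) from
        # every center becomes a new center, so lam chooses k instead of you.
        centers = X.mean(axis=0, keepdims=True)  # start from one global cluster
        for _ in range(max_iter):
            # Assignment step, spawning new centers for far-away points.
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            labels = d2.argmin(axis=1)
            for i in np.where(d2.min(axis=1) > lam)[0]:
                d2_i = ((X[i] - centers) ** 2).sum(-1)
                if d2_i.min() > lam:          # still far after earlier spawns
                    centers = np.vstack([centers, X[i]])
                    labels[i] = len(centers) - 1
                else:
                    labels[i] = d2_i.argmin()
            # Update step: move each center to the mean of its points,
            # keeping the old position if a cluster went empty.
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(len(centers))
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        return centers, labels

With embeddings, a reasonable starting point for lam is some quantile of the pairwise squared distances, since the right penalty depends entirely on the scale of the space.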
Alternatively, there is a Bayesian GMM in sklearn. If you restrict it to diagonal covariance matrices, it should be fine in high dimensions.
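That's sklearn.mixture.BayesianGaussianMixture. A minimal sketch (the n_components cap, the weight cutoff, and the toy data are arbitrary choices, not recommendations):

    import numpy as np
    from sklearn.datasets import make_blobs
    from sklearn.mixture import BayesianGaussianMixture

    X, _ = make_blobs(n_samples=500, centers=4, n_features=50, random_state=0)

    # Truncated Dirichlet-process mixture: n_components is only an upper
    # bound; the DP prior shrinks the weights of components it doesn't need.
    bgmm = BayesianGaussianMixture(
        n_components=20,                      # upper bound, not the answer
        weight_concentration_prior_type="dirichlet_process",
        covariance_type="diag",               # diagonal covariances for high dims
        max_iter=500,
        random_state=0,
    )
    labels = bgmm.fit_predict(X)

    # Components keeping non-negligible weight are the effective clusters.
    effective_k = int(np.sum(bgmm.weights_ > 1e-2))
    print(f"effective clusters: {effective_k}")

The diagonal restriction matters because a full covariance matrix has O(d^2) parameters per component, which is hopeless to estimate at typical embedding dimensions.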