Don't do this. Run two K8s clusters instead. Even if the network were reliable, you might still have issues stretching the overlay network across geographically separate sites.
If you _really_ need to manage them as a unit for whatever reason, federate them (keeping in mind that federation is still not GA). But keep each control plane local.
Then set up the data flows as if K8s weren't in the picture at all.
Where have you been suffering from this?
I don't want to have to restart the whole thing on each site every time it happens. I'd like a deployment/orchestration system that can work in such scenarios, showing a node as unreachable but then back online when it gets network back.
Isn't that exactly what happens with K8s worker nodes? They will show as "not ready" but will be back once connectivity is restored.
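To make the "not ready, then back" behaviour concrete, here is a rough sketch of the idea in Python. It is a toy simulation, not the actual Kubernetes controller logic: the `Node` class, the `node_status` function, and the 40-second grace period are all assumptions made up for illustration, and your cluster's real timeout depends on how it is configured.

```python
from dataclasses import dataclass

# Illustrative value only; not necessarily what your cluster uses.
GRACE_PERIOD_S = 40.0


@dataclass
class Node:
    """Hypothetical stand-in for a worker node as seen by the control plane."""
    name: str
    last_heartbeat: float  # time of the last heartbeat received, in seconds


def node_status(node: Node, now: float, grace: float = GRACE_PERIOD_S) -> str:
    """Return 'Ready' if a heartbeat arrived within the grace period, else 'NotReady'."""
    return "Ready" if now - node.last_heartbeat <= grace else "NotReady"


edge = Node("edge-1", last_heartbeat=0.0)
timeline = []

# Heartbeats arrive at t=0 and t=10, the link drops, and it recovers at t=120.
for now, heartbeat_received in [(0, True), (10, True), (60, False), (100, False), (120, True)]:
    if heartbeat_received:
        edge.last_heartbeat = float(now)
    timeline.append((now, node_status(edge, float(now))))

for t, status in timeline:
    print(f"t={t:>3}s  {edge.name}: {status}")
```

The node flips to NotReady once heartbeats have been missing for longer than the grace period and returns to Ready on the first heartbeat after the link recovers, with no restart involved. What happens to the pods on that node in the meantime is a separate question that this sketch does not model.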
EDIT: Just saw that the intention is to have a single K8s cluster with some nodes in a DC and some at the edge, with an unreliable network in between. No idea how badly the cluster would react to this.