You're ultimately right, though, that true HD is only going to come from the raw film content. What the neural network gives us are essentially plausible higher-res hallucinations.
Edit: as per the other comment, if the original exists only on video and not film, perhaps this is the best we're going to get.
I don't think that's quite right; at least, it doesn't jibe with what the DS9Doc people have been doing (which consists partly of remastering pieces of DS9 scenes):
I think the footage really was on film, but the issue was that it was composited with low-quality CGI effects, or something like that. So you can rescan the film, but you have to redo all the compositing (and probably with your own models because I'm guessing the original CGI didn't look that good). That's why a DS9 remaster is so expensive.
The main difference here is that the interpolation algorithm on your TV is online. It's handling 30 frames per second, over 9 million pixels per second. Doing the interpolation offline (ahead of time), you can take as long as you want, look at multiple frames to try to make better guesses, try multiple things and use some fitness measure to pick a winner, even a frame or a pixel at a time.
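To make the "try multiple things and use some fitness measure to pick a winner" idea concrete, here's a rough sketch on a 1-D row of samples rather than real video. Everything in it is my own assumption for illustration: the two toy upscalers, the hold-out scoring (hide every other sample, upscale, see how well the guesses match), and all the names. It's not what any actual TV or upscaling project does.

```python
# Toy 1-D stand-in for a row of pixels. All names here are hypothetical.

def upscale_nearest(samples):
    """Double the sample count by repeating each sample."""
    out = []
    for s in samples:
        out.extend([s, s])
    return out

def upscale_linear(samples):
    """Double the sample count by inserting the midpoint between neighbours."""
    out = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.extend([s, (s + nxt) / 2])
    return out

def holdout_error(upscaler, signal):
    """Assumed fitness measure: hide every other sample, upscale the rest,
    and return the mean squared error of the guesses for the hidden ones."""
    kept = signal[0::2]
    hidden = signal[1::2]
    guesses = upscaler(kept)[1::2]
    return sum((g - h) ** 2 for g, h in zip(guesses, hidden)) / len(hidden)

def pick_winner(candidates, signal):
    """Offline we can afford to run every candidate and keep the best one."""
    return min(candidates, key=lambda name: holdout_error(candidates[name], signal))

signal = [0, 1, 2, 3, 4, 5, 6, 7]
candidates = {"nearest": upscale_nearest, "linear": upscale_linear}
winner = pick_winner(candidates, signal)
print(winner)
```

An online upscaler has to commit to one method within its per-frame time budget; offline, nothing stops you from running this kind of selection per frame or even per region.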
It's still interpolation.
In this case they're using machine learning to add information about textures that isn't in the broadcast footage. They can add frames by interpolation, but the ML texturising and detailing is not interpolation.
Starting with a blob, if you interpolate you get a smoother blob; with this process you get a more structured figure.
It can still look nicer than naive upscaling though.
> ... interpolation is a method of constructing new data points within the range of a discrete set of known data points
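A quick sketch of what that definition means in practice, using plain linear interpolation on made-up numbers (the function names and sample values are mine, just for illustration). The new points always land between their known neighbours, so you get a smoother version of what you had but no detail that wasn't already implied by the data:

```python
def lerp(a, b, t):
    """Linear blend between a and b for t in [0, 1]."""
    return a + (b - a) * t

def interpolate(points, factor):
    """Insert factor - 1 evenly spaced values between each pair of known points."""
    out = []
    for a, b in zip(points, points[1:]):
        for k in range(factor):
            out.append(lerp(a, b, k / factor))
    out.append(points[-1])
    return out

known = [10, 50, 20]           # hypothetical known samples
dense = interpolate(known, 4)  # 4x as many steps between neighbours

# Every new value stays within the range of the known data points.
print(min(known) <= min(dense), max(dense) <= max(known))
```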
There are entire catalogues of overlay comparisons of different releases, encodings, etc.