Please read the wiki article the OP linked.

> A subset video bitstream is derived by dropping packets from the larger video to reduce the bandwidth required for the subset bitstream. The subset bitstream can represent a lower spatial resolution (smaller screen), lower temporal resolution (lower frame rate), or lower quality video signal.

Then we are not talking about an image format but instead a network protocol.

Inventing a new protocol when the problem can be solved by serving the proper markup seems misguided.
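
To make "serving the proper markup" concrete, here is a rough sketch of what I have in mind: the server pre-renders a few sizes and emits an `<img>` with `srcset`/`sizes`, so the browser picks the file that fits. The helper name `responsive_img` and the `photo-480w.jpg` file naming scheme are my own assumptions for illustration, not anything from the linked article.

```python
from html import escape


def responsive_img(base_url: str, widths: list[int], alt: str, sizes: str = "100vw") -> str:
    """Build an <img> tag whose srcset lists one pre-rendered file per width.

    Assumes (hypothetically) that the server stores each variant as
    <stem>-<width>w.<ext>, e.g. photo.jpg -> photo-480w.jpg.
    """
    stem, _, ext = base_url.rpartition(".")
    ordered = sorted(widths)
    # One srcset candidate per width, using the "w" width descriptor.
    srcset = ", ".join(f"{stem}-{w}w.{ext} {w}w" for w in ordered)
    # The smallest variant doubles as the fallback for browsers without srcset support.
    fallback = f"{stem}-{ordered[0]}w.{ext}"
    return (
        f'<img src="{escape(fallback)}" '
        f'srcset="{escape(srcset)}" '
        f'sizes="{escape(sizes)}" '
        f'alt="{escape(alt)}">'
    )


print(responsive_img("photo.jpg", [480, 960, 1920], "A photo"))
```

The browser does the selection on the client side, so nothing new is needed at the protocol level.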

