Video streaming hasn't been "a file" in a long time. HLS et al download little snippets at a time, adjusting to current bandwidth circumstances, typically with video and audio separate, etc. Even without DRM, the average user couldn't "download a file" from Netflix.
Note that --format is only required (to get the best available version) because yt-dlp appears to rely only on metadata present in the supplied .m3u8 file to determine which stream is "best", and no such metadata appears for the audio streams in this example.
For details on what yt-dlp knows about each candidate stream when attempting to choose the best, see
This would presumably not be a problem for our hypothetical "Netflix DLP.app", which could rely on whatever convention Netflix uses to indicate stream quality to its own clients when choosing streams, rather than falling back to sensible defaults for arbitrary HLS input.