Yeah, we need to give people some credit... If a dedicated stop button is the only thing the legacy of tape machines and VCRs has left us, then it makes sense to question that, too. Stopping the video by clicking on it is logical when no other options are given, and everyone I've seen try this took about 3 seconds to figure it out. It gets even more logical if the video was originally started by clicking it.
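Fitts's Law tells us directly how efficient the "video as button" approach is: if we presume that a hypothetical start/stop button is 1/10th of the width of the whole video (which would be one large button!), then for the same travel distance the index of difficulty goes up by as much as log2(10), or about 3.3 bits. So reaching the small button takes noticeably longer, though not 10 times longer, because the law is logarithmic in distance over width. (Testing this now with Hacker News' minuscule "reply" button.)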
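To put rough numbers on that, here is a minimal sketch using the Shannon formulation of Fitts's Law, MT = a + b * log2(D/W + 1). The constants a and b, the 640 px video width and the 400 px travel distance are all made up for illustration; real values depend on the device and the user, so only the relative comparison means anything.

```python
import math


def index_of_difficulty(distance, width):
    """Shannon formulation of Fitts's index of difficulty, in bits."""
    return math.log2(distance / width + 1)


def movement_time(distance, width, a=0.1, b=0.15):
    """Predicted time to hit a target, in seconds.

    a (seconds) and b (seconds per bit) are hypothetical constants,
    not measured values.
    """
    return a + b * index_of_difficulty(distance, width)


# Assumed layout: a 640 px wide video, a button 1/10th as wide,
# and the pointer starting 400 px from the target centre.
video_width = 640.0
button_width = video_width / 10
distance = 400.0

id_video = index_of_difficulty(distance, video_width)
id_button = index_of_difficulty(distance, button_width)
t_video = movement_time(distance, video_width)
t_button = movement_time(distance, button_width)

print(f"video as button: ID = {id_video:.2f} bits, MT = {t_video:.2f} s")
print(f"small button:    ID = {id_button:.2f} bits, MT = {t_button:.2f} s")
print(f"slowdown factor: {t_button / t_video:.2f}x")
```

With these assumed numbers the small button comes out slower by a low single-digit factor rather than 10x, and the gap in difficulty stays below log2(10) bits no matter which distance you plug in.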
With touchscreens becoming more and more common, this will probably become the standard anyway, as people are starting to associate an interaction with the object itself, not with a separate button.