In the video (and almost anywhere else you look) you'll note he talks about recursion on the concept of "computer" as a way to scale, because that way the parts are as powerful as the whole.
If you split that concept into the sub-notions of "procedures" and "data", you no longer have that recursive principle, and "functions" in this context are sufficiently equivalent to procedures.
Going the other way, functions don't really scale up: many, if not most, things in computing are not functions to be computed. One could argue that computers and "computer science" are misnamed, as computation is more the exception than the rule. "Data-around-shufflers" would be more appropriate.
I didn't see the exact monad references, but they look like ways of working around the fact that the "function" primitive is inappropriate. The function primitive is really nice, mind you, and has wonderful properties. Kind of like circles as the basis for astronomical orbits: they are "perfect", but the world isn't perfect in the same way, so trying to model the imperfect world with these perfect primitives leads to having to introduce epicycles/monads.
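To make the "epicycle" point a bit more concrete, here is a minimal Python sketch (mine, not from the video; `Account`, `step` and `run` are made-up names for illustration). It shows what "advancing state" looks like when the only primitive is a pure function: the state has to be threaded through by hand as an explicit argument and return value. As I understand it, a state monad is roughly a way of packaging up that plumbing.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable


@dataclass(frozen=True)
class Account:
    """Hypothetical piece of system state; immutable on purpose."""
    balance: int


def step(state: Account, event: int) -> Account:
    """Pure function: old state and one event in, new state out."""
    return Account(state.balance + event)


def run(initial: Account, events: Iterable[int]) -> Account:
    """Advance the system over time by folding step across the events."""
    return reduce(step, events, initial)


final = run(Account(0), [10, -3, 5])
```

Nothing here is ever mutated; "time" only exists as the order in which `reduce` feeds events to `step`.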
My 2 €¢
Edit: “Functions are nice, but you need to advance the state of the system over time” — https://www.youtube.com/watch?v=fhOHn9TClXY&feature=youtu.be...
In French we have "calculateur", which is the direct translation of "computer" but is commonly used only in the context of scientific computing, and "ordinateur", which is the common name for computers... The meaning of "ordinateur" is pretty close to "data-around-shufflers", from Latin ordino (“to order, to organize”) - https://en.wiktionary.org/wiki/ordinateur: "in its application to computing, [ordinateur] was coined by the professor of philology Jacques Perret in a letter dated 16 April 1955, in response to a request from IBM France, who believed the word calculateur was too restrictive in light of the possibilities of these machines (this is a very rare example of the creation of a neologism authenticated by dated letter)"