Isn't this only true in a type system with subtyping? If it does apply to a type system without subtyping I would genuinely like to know if you have any references for it. I was under the impression that variance, as a concept, was immaterial for anything else.
Additionally, why would having only invariant functions mean leaving out 'first-class' functions? Even assuming subtyping relations between types (simple or complex), invariant functions would still be 'first-class'; they just would not take the subtyping relation into account.
Parametric polymorphism gets you subtyping for function types even if you don't have it in general. `forall a . a -> a` is a subtype of `T -> T` for any particular `T`, for instance. And if you don't have subtyping or polymorphism, then talking about "leaving out" co/contravariance doesn't really make sense: there's nothing left to vary.
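To make that concrete, here is a minimal Haskell sketch of the instantiation point. The names `identity`, `applyToInt` and `succInt` are made up for illustration, and it assumes a plain Haskell 2010 compiler with no extensions. A value of type `forall a . a -> a` is accepted wherever `Int -> Int` is wanted, but not the other way round, which is the sense in which the polymorphic type behaves like a subtype.

```haskell
module Main where

-- Hypothetical name: a fully polymorphic function, i.e. forall a . a -> a.
identity :: a -> a
identity x = x

-- Hypothetical name: a monomorphic function of type Int -> Int.
succInt :: Int -> Int
succInt n = n + 1

-- A consumer that demands the monomorphic type Int -> Int.
applyToInt :: (Int -> Int) -> Int -> Int
applyToInt f n = f n

-- The reverse direction does not typecheck: Int -> Int cannot be used
-- where forall a . a -> a is required.
-- bad :: a -> a
-- bad = succInt   -- rejected: cannot match the type variable a with Int

main :: IO ()
main = do
  -- identity is used at Int -> Int with no explicit conversion.
  print (applyToInt identity 41)           -- prints 41
  -- The same value can also be used at Bool -> Bool.
  print ((identity :: Bool -> Bool) True)  -- prints True
```

Nothing here relies on a subtyping rule between ordinary types like `Int` and `Bool`; the only "more general than" relation comes from instantiating the type variable.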