The Galerkin method is the underlying concept that finite element analysis is built on. Suppose you want to approximate the solution to a PDE with a finite sum of basis functions.
The solution usually lives in an infinite-dimensional Hilbert space, whereas our space of possible approximate solutions is a finite-dimensional subspace.
To find the "best" solution in the finite-dimensional basis-function (or trial-function) space, we simply choose the coefficients of the basis functions so that the residual is orthogonal to every function in that space, i.e. to each of the basis functions.
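For concreteness, here's a minimal numpy sketch of that recipe (my own illustration, not from the above; it assumes the model problem -u'' = f on (0, 1) with zero boundary values and a sine basis). Requiring the residual to be orthogonal to every basis function, after integration by parts, reduces to a small linear system for the coefficients:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Minimal Galerkin sketch (illustrative; assumed problem and basis):
# solve -u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0.
# Weak form: \int u' v' dx = \int f v dx for every test function v.
# With u_h = sum_k c_k phi_k and v ranging over the same basis, requiring the
# residual to be orthogonal to every phi_j gives the linear system A c = b.

N = 8
f = lambda x: np.ones_like(x)                 # example right-hand side

# Gauss-Legendre quadrature nodes/weights mapped from [-1, 1] to [0, 1]
xg, wg = leggauss(50)
xg, wg = 0.5 * (xg + 1.0), 0.5 * wg

phi  = lambda k, x: np.sin(k * np.pi * x)     # basis functions (vanish at 0 and 1)
dphi = lambda k, x: k * np.pi * np.cos(k * np.pi * x)

A = np.array([[np.sum(wg * dphi(j, xg) * dphi(k, xg)) for k in range(1, N + 1)]
              for j in range(1, N + 1)])      # A[j, k] = \int phi_j' phi_k' dx
b = np.array([np.sum(wg * f(xg) * phi(j, xg)) for j in range(1, N + 1)])

c = np.linalg.solve(A, b)                     # coefficients of the Galerkin solution

u_h = lambda x: sum(c[k - 1] * phi(k, x) for k in range(1, N + 1))
print(u_h(np.array([0.25, 0.5, 0.75])))       # exact solution here is x*(1 - x)/2
```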
I like this, but I'll also contend that it's part of a broader question in applied math: What does it mean to be zero?
One way to define this is to say that a vector is zero when it's orthogonal to every vector in some space. This gives rise to Galerkin and Petrov-Galerkin methods for solving differential equations. However, it also gives rise to linear system solvers such as conjugate gradient (CG).
Another way to define zero is to say that a vector is zero when its norm is zero. This gives rise to least-squares finite element methods for solving differential equations. It also gives rise to linear system solvers such as GMRES.
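Here's a small numpy sketch contrasting the two conditions on a toy problem (my own illustration with an assumed SPD test matrix; it is not a real CG or GMRES implementation, just the two subspace conditions those methods enforce):

```python
import numpy as np

# Two notions of "zero residual" for A x = b over a Krylov subspace
# K_k = span{b, Ab, ..., A^{k-1} b}:
#   Galerkin / CG-style:         pick x in K_k so the residual is orthogonal to K_k.
#   Least-squares / GMRES-style: pick x in K_k minimizing ||b - A x||_2.

rng = np.random.default_rng(0)
n, k = 50, 10
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)            # assumed SPD test matrix
b = rng.standard_normal(n)

# Build an orthonormal basis Q for K_k by Gram-Schmidt on the Krylov vectors
Q = np.zeros((n, k))
Q[:, 0] = b / np.linalg.norm(b)
for j in range(1, k):
    v = A @ Q[:, j - 1]
    v -= Q[:, :j] @ (Q[:, :j].T @ v)   # orthogonalize against earlier vectors
    Q[:, j] = v / np.linalg.norm(v)

# Galerkin condition: Q^T (b - A x) = 0 with x = Q y  =>  (Q^T A Q) y = Q^T b
x_gal = Q @ np.linalg.solve(Q.T @ A @ Q, Q.T @ b)

# Least-squares condition: minimize ||b - A Q y||_2 over y
y_ls, *_ = np.linalg.lstsq(A @ Q, b, rcond=None)
x_ls = Q @ y_ls

print("Galerkin residual projected onto K_k:", np.linalg.norm(Q.T @ (b - A @ x_gal)))
print("residual norms (Galerkin, least-squares):",
      np.linalg.norm(b - A @ x_gal), np.linalg.norm(b - A @ x_ls))
```

In exact arithmetic the first choice is what CG computes at step k (for SPD A) and the second is what GMRES computes, which is why the projected Galerkin residual comes out at machine precision while the least-squares residual has the smaller 2-norm.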
Anyway, I agree with you, but wanted to add that it's one instance of a broader strategy that gives rise to a huge number of good algorithms.