The optimal block size is obviously 0: empty blocks propagate through the network the fastest, which lowers the orphan rate to nearly 0, and they require the fewest resources (bandwidth, hard disk space, CPU cycles) from full node operators.
... see what I did there?