It's not the requirement to partition the drive that kills the idea. Setting aside the first small chunk of the drive for the firmware to read is how it's always been done. The problem is that the partitioning would have to be done below the wear-leveling layer, which slightly reduces the effectiveness of the wear leveling and means you can only change your bootloader settings about a thousand times before the drive is dead.
That's not really what kills it. You can use a small ring buffer to boost that to 100,000 (for example, rotating writes across about 100 slots at a thousand cycles each), and/or use the flash in a more reliable way in that region (like SLC mode, or devoting 2/3 of the bits to ECC).
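To make the ring buffer idea concrete, here is a rough Python simulation, not real firmware. Everything in it is hypothetical: the class name `RingSettingsStore`, the slot count, and the per-slot endurance are illustrative assumptions, and a real implementation would also need checksums and power-loss handling. Each write goes to the slot after the newest one, and a reader finds the current settings by picking the highest sequence number.

```python
class WornOut(Exception):
    """Raised when the target slot has used up its assumed erase budget."""


class RingSettingsStore:
    """Toy model of a settings area that rotates writes across several slots.

    n_slots and endurance are assumed values for illustration only.
    """

    def __init__(self, n_slots=100, endurance=1000):
        self.endurance = endurance
        self.slots = [None] * n_slots      # each entry is (sequence, payload)
        self.erase_counts = [0] * n_slots  # erase cycles consumed per slot
        self.next_seq = 0

    def _latest_index(self):
        # Scan every slot and return the index holding the highest sequence
        # number, or None if nothing has been written yet.
        best = None
        for i, slot in enumerate(self.slots):
            if slot is None:
                continue
            if best is None or slot[0] > self.slots[best][0]:
                best = i
        return best

    def write(self, payload):
        # Write to the slot after the newest one, wrapping around at the end.
        latest = self._latest_index()
        target = 0 if latest is None else (latest + 1) % len(self.slots)
        if self.erase_counts[target] >= self.endurance:
            raise WornOut(f"slot {target} has reached {self.endurance} erases")
        self.erase_counts[target] += 1
        self.slots[target] = (self.next_seq, payload)
        self.next_seq += 1

    def read(self):
        # The current settings are whatever sits in the newest slot.
        latest = self._latest_index()
        return None if latest is None else self.slots[latest][1]


if __name__ == "__main__":
    # Small numbers so the demo runs instantly: 4 slots x 3 erases each.
    store = RingSettingsStore(n_slots=4, endurance=3)
    writes = 0
    try:
        while True:
            store.write(f"settings v{writes}")
            writes += 1
    except WornOut:
        pass
    print(writes, store.read())
```

The total number of writes the area survives is the slot count times the per-slot endurance, so with the assumed defaults of 100 slots and 1,000 cycles you get the 100,000 figure. The last good settings stay readable even after the area is worn out.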