If Android is anything to go by, migrating build systems is a risky endeavour. Miso claims compatibility with Ninja, so I’m guessing this route was deemed the easier way to make incremental improvements.
Tangentially related, Jony Ive of Apple fame recently did an interview on BBC's Desert Island Discs where he discussed his father's involvement in bringing Design Technology to UK schools: https://www.bbc.co.uk/sounds/play/m00289vf
Nvidia Canvas existed before text-to-image models, but it didn't gain as much popularity with the masses.
The other part is the training data - there are masses of (text description, image) pairs, whilst if you want to do something more novel you may struggle to find a big enough dataset.
But it is correct that the article does not reiterate the technical details of the exploit.