There are some things that are more complicated: for example, you need to bring your own libraries for font support, input methods, rasterization and so on, and you need to handle hotplugged input devices and such. But protocol-wise it’s not really more complicated, no.
But I'm pretty sure it's more complicated than I think. :-)