Is the high-priced 256-thread part really that interesting for rendering? You could get four of the 64-thread parts on separate boards, and each one would have its own 8-channel DDR instead of having to share that bandwidth. Total performance would be higher for the same money or less. Power draw would be higher, but that would only cost a couple of dollars a day at most. That said, I haven't been involved with a cluster for some time, so I'm not really sure what is done these days.
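
To put rough numbers on that, here is a back-of-envelope sketch. The thread and channel counts come from the comparison above. The extra power draw (500 W) and the electricity price ($0.15/kWh) are placeholder assumptions, not measured figures, so swap in your own. The function names are made up for illustration.

```python
def threads_per_channel(threads: int, channels: int) -> float:
    """How many hardware threads compete for each memory channel."""
    return threads / channels


def daily_power_cost(extra_watts: float, price_per_kwh: float) -> float:
    """Cost in dollars of drawing extra_watts continuously for 24 hours."""
    return extra_watts / 1000 * 24 * price_per_kwh


# One 256-thread part sharing a single 8-channel memory system.
single_box = threads_per_channel(256, 8)

# Four 64-thread parts, each on its own board with its own 8 channels.
per_board = threads_per_channel(64, 8)
total_channels = 4 * 8

# ASSUMPTION: the four boards together draw about 500 W more than the
# single box, and electricity costs $0.15 per kWh. Both are guesses.
extra_cost = daily_power_cost(extra_watts=500, price_per_kwh=0.15)

print(f"single box: {single_box:.0f} threads per channel")
print(f"four boards: {per_board:.0f} threads per channel, "
      f"{total_channels} channels in total")
print(f"extra power cost: about ${extra_cost:.2f} per day")
```

With those placeholder inputs the four-board setup has a quarter as many threads contending for each memory channel. Whether that matters depends on how bandwidth-bound the renderer is, and the sketch ignores the cost of the extra boards, memory, and networking.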