Package that up in a self-contained Vagrant box / Docker image / whatever with all of the skeleton required to run a full answer to completion. Then, take out the full answer. Consider leaving a reduced test suite so that candidates can see whether they're making progress.
Now write the rubric by which you'll assess candidate answers. Generally, you'll want to pick ~5 areas that matter to you and write prose describing what each score, from 0 points through 4 or 5 points, looks like in each of those areas. Then, determine what a passing score is, perhaps calibrating it by having existing engineers at the company take the test. And then (this is the brutally difficult part): convince your organization that the rubric now makes hiring decisions.
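To make that concrete, here is a minimal sketch of a rubric as data plus a scoring function. The area names, the 0 to 4 scale, and the calibration rule (set the bar at the lowest total any existing engineer scored) are all assumptions for illustration; pick whatever areas and calibration actually fit your team.

```python
# Hypothetical rubric areas: pick the ~5 that matter to your organization.
AREAS = ("correctness", "testing", "readability", "design", "communication")
MAX_POINTS = 4  # assumed scale: each area is scored 0 through 4


def total_score(scores):
    """Sum the per-area scores, rejecting incomplete or out-of-range cards."""
    missing = [area for area in AREAS if area not in scores]
    if missing:
        raise ValueError(f"missing areas: {missing}")
    for area in AREAS:
        if not 0 <= scores[area] <= MAX_POINTS:
            raise ValueError(f"{area} must be between 0 and {MAX_POINTS}")
    return sum(scores[area] for area in AREAS)


def calibrate_passing_score(engineer_scorecards):
    """One possible calibration (an assumption, not the only option):
    the bar is the lowest total among existing engineers who took the test."""
    return min(total_score(card) for card in engineer_scorecards)


def passes(scores, passing_score):
    """The rubric, not a gut feeling, makes the call."""
    return total_score(scores) >= passing_score


if __name__ == "__main__":
    # Made-up scorecards from two existing engineers, used only to set the bar.
    engineers = [
        {area: 3 for area in AREAS},
        {area: 2 for area in AREAS},
    ]
    bar = calibrate_passing_score(engineers)
    candidate = {"correctness": 4, "testing": 2, "readability": 3,
                 "design": 2, "communication": 3}
    print(bar, total_score(candidate), passes(candidate, bar))
```

The point of writing it down this way is that the passing score is fixed before any candidate is graded, so nobody can renegotiate the bar per candidate.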
Then, create a way for people to submit their answers into your hiring process. This might be as simple as "Create a private gist of this file and email me the link" or as involved as a purpose-built submission tool.
(This answer may or may not be exactly the same as Thomas' answer, but we were co-founders at a company which was quite related to this problem.)
It’s amazing how often hiring processes are built and not tested.