Even basic image classifiers tend to be 100x faster on a GPU or TPU...
TensorFlow has some known inefficiencies.
What CPU is being used?
Assuming the benchmark was run on something like an EC2 C5 instance, the results in this post are quite slow: somewhere around 14x slower than benchmarks from a year ago on EC2 C5 instances.
Source: https://dawn.cs.stanford.edu/benchmark/ImageNet/inference.ht..., using the c5.2xlarge benchmark and assuming linear scaling.
As an aside, I took into account the resource allocation in the parent comment. The c5.2xlarge has 8 vCPUs and 16 GiB of RAM, and does a single fp32 inference in ~17ms. If we cut that down to 4 cores and assume linear scaling, we can expect ResNet-50 to run in ~35ms, compared to the ~500ms achieved here. I'd recommend comparing against a known baseline rather than a "vanilla setup" to make sure you aren't missing any simple changes that could dramatically improve performance.
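For anyone who wants to check the arithmetic, here is a rough sketch of the estimate. It assumes latency scales linearly with core count, which is an optimistic simplification, and the function name is just something I made up for illustration. The 17ms and 500ms figures are the ones quoted above.

```python
# Back-of-the-envelope estimate, assuming latency scales linearly with core count.
# scaled_latency_ms is a hypothetical helper, not part of any library.
def scaled_latency_ms(baseline_ms, baseline_cores, target_cores):
    """Estimate latency on target_cores from a measurement on baseline_cores."""
    return baseline_ms * baseline_cores / target_cores

baseline_ms = 17.0   # single fp32 inference on c5.2xlarge (8 cores), as quoted above
observed_ms = 500.0  # approximate latency reported in the post

expected_ms = scaled_latency_ms(baseline_ms, baseline_cores=8, target_cores=4)
slowdown = observed_ms / expected_ms

print(f"expected ~{expected_ms:.0f} ms, observed ~{observed_ms:.0f} ms, "
      f"~{slowdown:.1f}x slower")
```

This gives 34 ms expected and a slowdown a bit over 14x, which is where the rounded "~35ms" and "around 14x" figures come from. Real scaling is rarely perfectly linear, so treat it as a sanity check rather than a prediction.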
import grpc
from tensorflow_serving.apis.prediction_service_pb2_grpc import PredictionServiceStub
channel = grpc.insecure_channel('0.0.0.0:9000')
stub = PredictionServiceStub(channel)