Models are developed in our data environment: notebooks, plus PySpark on EMR clusters for analytical workloads and offline models, and TensorFlow on EC2 P2 instances for online models.
Jobs are scheduled (via Azkaban) for reruns/re-training, and outputs are pushed from the data environment to the feature/model store in the live environment (Cassandra). Online models are exported to the SavedModel format, which can be loaded on any TF platform, e.g. Java backends.
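A minimal sketch of a SavedModel export/reload round trip. The `Doubler` module, export path, and input signature are illustrative, not taken from our actual pipeline:

```python
import tensorflow as tf

# Toy stand-in for a trained online model (hypothetical example).
class Doubler(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return x * 2.0

export_dir = "/tmp/doubler/1"  # version subdirectory, as TF Serving expects
tf.saved_model.save(Doubler(), export_dir)

# Any TF platform (Python here, Java via the TF Java API) can reload it.
loaded = tf.saved_model.load(export_dir)
print(loaded(tf.constant([1.0, 2.0])).numpy())  # [2. 4.]
```

The versioned directory layout (`.../model_name/1/`) is what lets TF Serving pick up new re-trained versions without a restart.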
Online inference is handled by TF Serving; clients query models via gRPC.
Many of our models come down to nearest-neighbor lookups over learned embeddings; we use Annoy to index those.