
Show HN: Deploy and Retrain Cutting Edge ML Image Recognition via REST API - hsikka
https://modeldepot.io/percept
======
mikeshi42
Hey Everyone!

Harsh and I built ModelDepot with the goal of empowering everyday developers
to use Machine Learning simply and quickly.

After thousands of engineers used ModelDepot, we learned that many teams
need a simple and effective way to deploy our models and use them for both
inference and further training. Deploying should be easy, fast, and on your
terms, and training should be effective even with little data.

With ModelDepot Percept, you can deploy cutting-edge pretrained image
classification models with just one line, on any infrastructure you choose.
The model can be used immediately for predictions, or it can be effectively
trained on your own data with just a few training samples. Interacting with
Percept is easy via a REST API, and it's as cheap as the infrastructure you
choose to deploy it on.
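
To give a rough idea of the shape of the workflow, here's a sketch in Python
(the endpoint paths and payload fields below are illustrative placeholders,
not our documented API):

    # Rough sketch of the Percept workflow; endpoint paths and payload
    # fields are illustrative placeholders, not the documented API.
    import base64
    import requests

    # The container runs on your own infrastructure (started with a single
    # docker-run-style command); assume it listens locally on port 8080.
    PERCEPT_URL = "http://localhost:8080"

    with open("cat.jpg", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    # Inference: send an image, get back class predictions.
    pred = requests.post(PERCEPT_URL + "/predict", json={"image": image_b64})
    print(pred.json())

    # Training: a handful of labeled samples is enough to start fine-tuning.
    requests.post(PERCEPT_URL + "/train",
                  json={"image": image_b64, "label": "cat"})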

We’ve launched ModelDepot Percept with state-of-the-art image classification,
and we're quickly moving to include other use cases as well.

Let us know if you have any questions or comments! We're more than happy to
chat!

If you want more free training and prediction credits, reply here or tweet us
your username @ModelDepot :)

------
mlthoughts2018
I really think these types of services miss the big picture. The hard part of
creating these services in-house is not wrapping things up in a single
discrete unit of deployment like a Docker container, and it's not even
following research or tutorials for how to actually train a fine-tuned or
transfer-learned modification to a popular pre-trained model.

Those are the easy parts, and they're not the interesting things to pay
someone else to do.

The hard parts are always

(1) your own data cleaning, pre-treatment and ingestion pipeline to get the
data ready in the first place and into a conformable and reproducible state
for tracking trained models and experiments; and

(2) [most important] writing the acceptance testing and model checking
plumbing code to validate that trained models are working as expected on
custom use cases with custom KPIs or metrics that business stakeholders want
(e.g. basic accuracy metrics built into most modeling tools only matter to
the team training the model -- other people could not care less; they want
summarization in 'business' terms and demonstrations of accuracy that reflect
exact customer or stakeholder use cases).
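
To make (2) concrete, here's a toy sketch of that plumbing; the KPI ("% of
revenue misrouted") and the 2% budget are stand-ins for whatever the business
actually cares about:

    # Toy acceptance check: translate raw classifier output into a
    # stakeholder-facing KPI before signing off on a trained model.
    # The KPI and the 2% budget are stand-ins, not a real spec.
    def acceptance_check(model, holdout_cases):
        """holdout_cases: (image, expected_label, revenue_at_stake) triples
        drawn from real customer scenarios, not the training set."""
        misrouted = total = 0.0
        for image, expected_label, revenue in holdout_cases:
            total += revenue
            if model.predict(image) != expected_label:
                misrouted += revenue
        kpi = misrouted / total
        # Stakeholders want this number, not top-1 accuracy on a test split.
        assert kpi < 0.02, f"{kpi:.1%} of revenue misrouted, over the 2% budget"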

Because of (1) and especially (2), you still need to employ expensive in-house
machine learning staff to understand how to evaluate your models, even if
those models are consumed from a third-party service like this one, or like
Google's AutoML or AWS Rekognition.

It's why this idea of "machine learning for people who don't know machine
learning" seems so silly and cost-ineffective to me. You still have to pay the
salary cost to have in-house modeling experts who can understand diagnostics,
bugs, training errors, etc., for these third-party models and who can
understand any methodological issues with data preparation and pre-treatment,
and who can translate machine learning diagnostic jargon into meaningful
metrics and demos for specific stakeholder use cases.

Finally, you also still need some mix of machine learning infrastructure
expertise to understand if the cost-per-request of paying for these third
party services is actually worth it, especially when adjusted for the level of
accuracy you need for your specific application (something that is always
suspiciously hard to get information about on the pricing pages of these
services).

Oftentimes you don't actually need to offload for the sake of autoscaling
features, and you still need to verify that the latency and throughput of the
third-party service are acceptable for the performance constraints of your
specific problem, as well as what overall cost it amounts to for your
expected traffic to the service.

Basically, it seems like the value add these services want to advertise is
that you can outsource the problem of developing a machine learning model,
even when you need to provide your own data for fine-tuning.

But you can only outsource the _tiny_ amount of work that would have been the
implementation of some training code and the execution of some training
regimen to arrive at the deployable model artifact (e.g. a trained model
wrapped in a Docker container).

This is probably only 10% of the work at best, and it's also the "fun part"
that the machine learning engineers you still have to employ will be grumpy
about outsourcing rather than keeping their skills sharp and tailoring the
model more closely to your specific business problem.

The other 90% of the work, particularly understanding model diagnostics in an
acceptance testing sort of way that ties directly to a real use case, is still
there and fundamentally cannot be outsourced unless you're prepared to just
fully pay for an entire third party consultant to do the entire project.

Either way, I just don't see why this type of service would be valuable to
companies. They might get fooled into thinking that if they have a precocious
general engineer who can sort of figure some stuff out, they can offload the
remaining effort and save a bunch of money on headcount for expensive machine
learning engineers.

But I think they'll quickly get burnt by that idea and end up in a weird
situation: they thought they paid for a service like this for scalability
purposes, but the lack of in-house machine learning expertise is what really
limits scalability, and they're left with a very nasty form of vendor
lock-in.

~~~
mikeshi42
Hey! I think all of those points are 100% valid and I understand where you're
coming from.

I'd like to preface this by saying that the product you see today isn't the
end goal we're trying to reach; it's very much just the beginning of the
vision of easy-to-use ML we want to realize, and we hope to address the rest
of the pains you've outlined in the future.

1. Data cleaning is the most important and painful thing we experience, and
all the companies we talk to experience it as well. "Data cleaning" is often a
very broad term and can mean a lot of different things across different
companies. It's something we absolutely want to tackle, but it's not on the
immediate roadmap (it's still a bit unclear how to bring all the different
problems companies face together).

2. This is something we hope to move into in the short term. We understand
that production does not mean "ML runs on a server and returns responses that
look right". We know it means that the model scores well on offline validation
data and on online data, and that it can notify you appropriately of shifting
distributions in the input data. We hope to have tools in place that allow
teams to understand the performance of the model in both an offline and a
continual online setting, and to be alerted when the model is failing due to
unseen feedback loops or drastic differences between training and inference
distributions.
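
As a minimal illustration of the kind of drift check we mean (this isn't our
implementation, just the general shape of such a check):

    # Minimal sketch of a training-vs-inference drift alert on one feature
    # (e.g. an embedding dimension or a simple image statistic), using a
    # two-sample Kolmogorov-Smirnov test. Not our implementation; just the
    # general shape of the check.
    import numpy as np
    from scipy.stats import ks_2samp

    def drifted(train_values: np.ndarray, live_values: np.ndarray,
                alpha: float = 0.01) -> bool:
        """True if live inputs look distributionally different from training."""
        _, p_value = ks_2samp(train_values, live_values)
        return p_value < alpha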

Where we think this stands today is that any engineer on any team can get
started more quickly with validating an ML PoC and get it deployed into
production faster. While I understand there's a ton of nuance to "production",
sometimes "good enough" is fine for the short term (and there are a lot of
tools that can support teams with the problems outlined above as well!), while
we work out how to support customers even better in the long term.

As for accuracy, one of our core values is transparency; you can read about
the technical workings here:
https://medium.com/modeldepot/percept-whats-inside-the-ml-container-1e71f5ee2747
This should arm a team with the appropriate terms to search for, and help
them understand whether our product is the right fit for their use case. We
hope to keep expanding the capabilities and options customers can tune if
this exact solution won't work for them out of the box. Additionally, users
are free to experiment with how to understand the accuracy of the model
(beyond a single accuracy metric); we even go into this a bit at the end of
our guide here:
https://medium.com/modeldepot/apples-oranges-a-machine-learning-classifier-baf549451502

As for performance, since it's deployed on your own infrastructure, you can
scale horizontally as much as you'd like without hitting arbitrary rate
limits. In our own internal benchmarks we get around 6 sustained QPS on a
c5.4xlarge and about 12 sustained QPS on a c5.9xlarge. If you'd like to learn
more, we can go into more detail about p90s and the like across various
concurrent-connection settings.
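
If you want to reproduce that kind of number on your own hardware, a crude
concurrent load test along these lines is enough (the endpoint and payload
are placeholders, as in the sketch above):

    # Crude sustained-QPS measurement; endpoint and payload are placeholders.
    import time
    from concurrent.futures import ThreadPoolExecutor

    import requests

    URL = "http://localhost:8080/predict"
    PAYLOAD = {"image": "<base64-encoded image>"}
    N_REQUESTS, CONCURRENCY = 300, 8

    def one_request(_):
        requests.post(URL, json=PAYLOAD).raise_for_status()

    start = time.time()
    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        list(pool.map(one_request, range(N_REQUESTS)))
    print(f"sustained QPS: {N_REQUESTS / (time.time() - start):.1f}")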

I hope you have a chance to check out the product and leave us more feedback.
We're striving to be different from other MLaaS offerings out there, with more
transparency in what we do, to empower engineers to leverage ML in a more
meaningful way.

~~~
chatmasta
How do you enforce request limits when the code is not hosted on your server?
And why should I pay tiered pricing based on the number of requests if I’m the
one paying the infrastructure cost anyway?

(Sorry if I’m misunderstanding your pricing model, I only had a quick glance.)

~~~
mikeshi42
The pricing model helps support our continual development costs for the
platform: it takes real time to develop the tech, continually test it with
our users, and refine the experience so that it's easy to use out of the box
for you. We have a pay-as-you-go structure to make it easier for users who
aren't sure about their usage yet to try it out at a very low cost and decide
if it'll work for them; we didn't want a steep initial pricing structure to
be a barrier to entry for people exploring ML.

The requests are loosely enforced through our central key server (though
there is no request rate limit). While absolutely no training/inference data
leaves the container, we do occasionally send back usage metrics to help keep
track of usage. If you're interested in a totally isolated solution, we can
talk about having on-prem deployed key servers so that your cluster can be
completely isolated.

I hope that clarifies everything, let me know if you have any feedback or
further questions :)

~~~
chatmasta
Usage-based pricing seems very misplaced for a self-hosted solution.
Questions of enforceability aside (all I need to do is remove the telemetry
code to get it for free), the issue is that usage-based pricing is meant to
scale your revenue alongside your costs. When your customer is hosting the
infrastructure, you have the same costs regardless of how many requests they
make, so as a customer, it doesn’t feel right to pay you for each request
that _my own servers_ are serving.

A competitor can easily undercut you here. I would also be interested in
hearing of any other company that charges per request for self-hosted
software, because it’s certainly not a model I’ve heard of. Typically the way
to approach this is a licensing fee for running the software self-hosted.

~~~
mikeshi42
I don't think usage-based pricing is esoteric; if you look at products in
the enterprise space (GitLab, Splunk, Mongo, etc.), they're all priced on a
usage metric of some sort (in our case, API requests). We're making on-prem
ML accessible at SaaS prices (try negotiating an on-prem contract with other
MLaaS providers). Our costs are continual as we keep improving the product
over time, and those improvements are passed down to you as a user.

If you're interested in running the product on a licensing fee instead of
pay-as-you-go, feel free to shoot us an email at hi@modeldepot.io :) We found
a licensed pricing model to be more restrictive for new ML users and that it
increased barriers to entry. If you're not interested in using our paid
product, you can check out our primary pre-trained ML platform at
https://modeldepot.io/browse and hopefully find an ML solution that works
for you without paying a cent.

