
Ask HN: How would you describe the market of Computer Vision? - grizzly_bear
I'm running a Computer Vision consulting company quite successfully. Basically, my job is to implement "Where's Waldo" IRL for my customers (B2B).

However, I'm surprised at how difficult it seems to focus on ONE specific kind of customer: most of my customers come from different backgrounds, and the stakeholders are quite different, which makes the sales process sometimes slow and complex, with a high CAC.

On my path to something more streamlined (read: moving from a service to a more general product), I struggle to find common traits among Computer Vision needs. Some customers have a programming background, some do not. Some come from marketing, others from manufacturing/operations, others run SMBs.

The do-all-do-nothing approach of platforms like Sagemaker or Azure Custom Vision overwhelms them. These platforms usually address people who:

- already have a data science background

- integrate into a broader and generally complex environment

Easier-to-use platforms like Clarifai do not seem to be very common (not even mentioning the lay-offs), and they end up in a vendor lock-in situation where you HAVE to contract them if you want to go further with model tuning.

What is your take on the computer vision market? Do you think there's a need for an in-between platform (the ease of use of Clarifai, but with Sagemaker's flexibility and tweakability)?

If so, can you imagine an obvious and ever-present use case?

And as a side question: if you compared the current Computer Vision market value chain to another industry, what would it be?

Thanks :)
======
m_ke
Hardest part of building computer vision models these days is collecting and
annotating data. A self serve tool like sagemaker doesn't solve that problem.

As a computer vision company, your best bet is to focus on a specific industry
or a narrow problem and build fully featured products. This could be medical
imaging, satellite imagery, face recognition, fashion, adult content
filtering, etc.

We're doing this for food. We built a consumer food logging app that's a great
source of data to keep improving our recognition model. At this point we have
millions of labeled images, something that would take most of our API
customers years to do. To make integration easier we couple the recognition
model with a food "knowledge graph" that includes nutrition facts, ingredients
and dietary categorizations. It turns out that this is useful to a lot of
healthcare, fitness, nutrition and CPG companies.

~~~
codingslave
Yeah, I'm working on a computer vision product, and the data collection is a
nightmare. Not just collecting a lot, but getting enough samples for each
class and each type of situation that might arise. It makes me want to pull my
hair out.
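(A quick way to see where coverage is thin is to tally annotations per class and flag anything under a project-specific floor. A minimal sketch, with a hypothetical `annotations` list standing in for whatever label store you actually use:)

```python
from collections import Counter

# Hypothetical annotations: (image_id, class_label) pairs.
annotations = [
    ("img_001", "cat"), ("img_002", "cat"), ("img_003", "dog"),
    ("img_004", "cat"), ("img_005", "bird"),
]

MIN_PER_CLASS = 2  # threshold is project-specific, not a universal rule

counts = Counter(label for _, label in annotations)
underrepresented = {c: n for c, n in counts.items() if n < MIN_PER_CLASS}
print(underrepresented)  # classes that still need more samples
```

(Same idea extends to "situations": tag each image with conditions like lighting or viewpoint and tally those too.)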

~~~
grizzly_bear
Tricky question: how do you define "enough"? :) (I usually answer my clients
as a joke: "Think about the MAXIMUM number of images you can provide me.
Well, 'enough', for us, is twice that!")

~~~
codingslave
When I've tuned my models a million ways and they still overfit? That's when
I suspect I don't have enough.

~~~
grizzly_bear
I mean, the tricky part is to have an estimation of the number of images
_before_ you even train your model :) (usually for budgeting reasons)
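(One common way to get a pre-budget estimate, if you already have *some* data, is to train on nested subsets, fit a power-law learning curve to the validation errors, and extrapolate to the target error. A minimal sketch with made-up numbers; the `sizes` and `errors` values are illustrative, not real measurements:)

```python
import math

# Hypothetical validation errors measured after training on nested subsets.
sizes = [100, 200, 400, 800]
errors = [0.40, 0.32, 0.26, 0.21]

# Fit err ~= a * n^(-b) via least squares in log-log space.
xs = [math.log(n) for n in sizes]
ys = [math.log(e) for e in errors]
k = len(xs)
mx, my = sum(xs) / k, sum(ys) / k
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b = -slope
a = math.exp(my - slope * mx)

def images_needed(target_err):
    """Extrapolated dataset size to reach target_err, per the fitted curve."""
    return (a / target_err) ** (1 / b)

print(round(images_needed(0.10)))  # rough figure to put in the budget
```

(The extrapolation is only as good as the power-law assumption, so it's a budgeting heuristic rather than a guarantee, but it beats guessing blind.)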

------
PaulHoule
I think you're right that product-market fit isn't there yet, except for the
organization that hires a real CV expert. So far, vendors have engaged in
wishful thinking about what makes a "minimum viable" platform.

~~~
grizzly_bear
Yes, my feeling is that we're in an "in between" situation:

\- the promise of AWS et al. is that deep learning is available to every
developer; this is true on a purely technical basis, but without appropriate
knowledge and experience, constituting a curated database is _really_ hard.

\- CV expertise is a little bit overrated nowadays, as I believe the key
success factor for a CV project is not the technical expertise of the
algorithm wizard, but rather the rigor of the project management (data
science is 20%, the rest is 80%).

I may be wrong, but both approaches lack the same thing: a way to curate the
image dataset — and get that 80% of the work done without hassle. Once that
80% of data management is done, the main difference between the "CV amateur"
and the "CV expert" is the ability to push the performance of the algorithm.

But maybe we just lack a "Postgres for images" that would make that 80%
easier, after all :)

