
Ask HN: I built a pipeline for in-video search at a hackathon, what're the uses? - muedzi
So I attended a hackathon recently and wanted to play around with the GCP stack. I had a lot of fun with it; I ended up using their Vision API and several other smaller services to build a pipeline where you could upload a video and index a whole bunch of objects in it so that you could search them. I even built a 'voice skill' where you could say something like 'find all occurrences of people' or 'start playing when baboon appears'.

It was really fun, but I suspect there might actually be a use for it somewhere out there.

I'm not a business or startup guy, but I'm willing to explore it as a side project. Any ideas where this might be useful?

The one thing that came to mind was analyzing security footage, but beyond that I'm blank.
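For reference, the search half of a pipeline like this boils down to an inverted index from detected labels to timestamps. This is a minimal sketch, not the actual hackathon code; the flattened `(label, timestamp)` annotation format and the helper names are assumptions:

```python
from collections import defaultdict

def build_label_index(annotations):
    """Build an inverted index: label -> sorted timestamps (seconds).

    `annotations` is assumed to be a list of (label, start_seconds)
    pairs, e.g. flattened from label-detection results.
    """
    index = defaultdict(list)
    for label, ts in annotations:
        index[label.lower()].append(ts)
    for times in index.values():
        times.sort()
    return index

def find_occurrences(index, query):
    """'find all occurrences of people' -> every timestamp for a label."""
    return index.get(query.lower(), [])

def first_appearance(index, query):
    """'start playing when baboon appears' -> seek point, or None."""
    hits = index.get(query.lower())
    return hits[0] if hits else None

# Hypothetical annotations:
ann = [("person", 12.0), ("baboon", 40.5), ("person", 3.2)]
idx = build_label_index(ann)
print(find_occurrences(idx, "person"))   # [3.2, 12.0]
print(first_appearance(idx, "baboon"))   # 40.5
```

The voice skill then just maps an intent plus a label onto one of these lookups and seeks the player to the returned timestamp.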
======
hos234
Probably in sports broadcasting/sports analytics. They do a lot of manual
lookups for replays, identifying important moments, filtering out wrong camera
angles, etc. There may be some simple cases that can be automated. Looking at
what Hawk-Eye, PlaySight SmartCourt, etc. do might give you some ideas.

------
throw394812
Hey, send me an email, I'm interested.

------
verdverm
Sales call analytics

------
aisafetyceo
I went down that path as conceptual learning for building a voice-controlled
website-building system.

- You can't count on making much money just reselling the software on Larry
Page's computers, because he is counting on making much money reselling
software on his computers ...
[https://teachablemachine.withgoogle.com/](https://teachablemachine.withgoogle.com/)

Also, more seriously, the round-trip latency to the Google servers and back
puts serious applications out of reach.

For example, combine the concept with this
[https://comma.ai/shop](https://comma.ai/shop) and there's a genuine billion
dollars waiting to be thought out.

Another example: [https://www.indus.ai](https://www.indus.ai), but powered with many
cheap [https://amzn.to/2Xg4JkB](https://amzn.to/2Xg4JkB).

The path forward is to write a learning algorithm and recognize your work as
that of one who creates virtual computers. Screenless computing is a big deal;
Microsoft Surface Earbuds will have basic Office integration, but imagine if
your pipeline could communicate the contents of PDFs, websites, etc. without
a screen around.

There are the obvious consumer products, but also imagine being able to work
with a computer in new environments. There's also the non-obvious: imagine one
smartphone camera pointed at a room, where every person in that room can access
their virtual computer using the camera or voice. You could sell kits that
transform Raspberry Pis into useful robots.

- 3D reconstruction is cleaner and more powerful than deep learning
- stitching together screenshots of websites
- learning as environment mirroring

It's also useful to think about the practical economics of distributing the
technology ... Android phones range from about $70 to $1000, but they
essentially run the same free, open-source software.

So there aren't many dollars that any individual can extract from the
distribution of "AI" software.

What I would do is delve into the technical software reasons behind Google's
acquisition of Fitbit.

Spoiler: how does Fitbit allow developers to build with JavaScript and CSS
without running a "browser"?

The point here is that Larry Page is eyeing the virtual-software-developer
opportunity, and so are you! In order to extract value from the vision
recognition opportunity, you must spend a lot of energy writing rules.

It would be efficient if the end user could program their camera on the fly
using natural language, without an internet connection, on any device.

This north star is the maker's definition of an AGI ... and as an independent
business you can't afford to work on only a subset of the problem.

