Show HN: Deep learning visual search and data analytics

aub3bhat · on April 27, 2017

Deep Video Analytics is an open source visual data analytics platform. The platform uses deep-learning-based indexing, detection and recognition models for visual search. Using Deep Video Analytics, users can quickly load, annotate, and index images and videos. They can detect and recognize objects (such as faces) and seamlessly import and share processed datasets using Visual Data Network. Deep Video Analytics is developed using Django, Postgres, TensorFlow and Docker to enable flexible deployment. You can find more information here: https://deepvideoanalytics.com/

garysieling · on April 27, 2017

When you say video, are you talking just images, or also audio? I built a video search site (https://www.findlectures.com) and I'm investigating options like this to feed more ranking factors into the system.

aub3bhat · on April 27, 2017

Hi, the system is under active development. I am planning to add support for audio processing using:

1. https://projects.csail.mit.edu/soundnet/

2. https://github.com/aalireza/SimpleAudioIndexer (using PocketSphinx, NOT Watson)

garysieling · on April 27, 2017

Awesome, thanks!

Omnipresent · on April 27, 2017

This is phenomenal. Is the primary use case for this to:

- Find similar looking frames?

Question:

- Does it perform object detection on the frame? Similar to the video demo on Clarifai - https://clarifai.com/demo ?

aub3bhat · on April 27, 2017

We have Visual Search as the primary interface. However, the goal is to build an application-agnostic visual data analytics platform. Similar to a relational database, we have high-level concepts of indexers (convert an image/bounding box into a feature vector), clusterers (cluster feature vectors) and retrievers (retrieve similar images/objects/annotated regions).

To answer the second question: we also detect objects (VOC, YOLO 9000, faces, etc.), and detected objects are indexed and retrieved when performing visual search. Further, you can cluster this set of "indexing" vectors for things such as fast retrieval and quick labeling/annotation. We use Flickr's LOPQ to implement approximate nearest neighbor (ANN) search, but like everything else you can swap in a custom algorithm. I am working on adding indexing over any set of annotations/detections/frames.
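To make the clustering-for-fast-retrieval idea concrete, here is a minimal sketch, not the project's code and not LOPQ. It files each vector under its nearest centroid and searches only the query's bucket. The `CoarseIndex` class, the hand-picked centroids and the keys are all hypothetical; a real setup would learn centroids from the feature vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def nearest_centroid(vec, centroids):
    """Index of the centroid most similar to vec."""
    return max(range(len(centroids)), key=lambda i: cosine(vec, centroids[i]))


class CoarseIndex:
    """Toy approximate index (hypothetical): one bucket per centroid."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = {}

    def add(self, key, vec):
        bucket_id = nearest_centroid(vec, self.centroids)
        self.buckets.setdefault(bucket_id, []).append((key, vec))

    def search(self, query, k=3):
        # Only the query's bucket is scanned, so results are approximate:
        # a true neighbor filed under another centroid is missed.
        bucket = self.buckets.get(nearest_centroid(query, self.centroids), [])
        ranked = sorted(bucket, key=lambda kv: cosine(query, kv[1]), reverse=True)
        return [key for key, _ in ranked[:k]]


index = CoarseIndex(centroids=[[1.0, 0.0], [0.0, 1.0]])
index.add("frame_1", [0.9, 0.1])
index.add("frame_2", [0.8, 0.3])
index.add("face_7", [0.1, 0.95])
print(index.search([1.0, 0.05], k=2))
```

The trade-off is the usual one for ANN: scanning one bucket is much cheaper than scanning everything, at the cost of occasionally missing a neighbor that sits just across a bucket boundary.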

You can find more information about the design goals and vision behind the project in the presentation at https://deepvideoanalytics.com/

thedatamonger · on April 27, 2017

This looks awesome. I look forward to the usual Hacker News banter of "yes this is awesome but this is awesomer, see xyz" :) Yes, awesomer is a word.

dk8996 · on April 28, 2017

Is there a way to find other things besides faces, for example a blue car or a logo?

aub3bhat · on April 28, 2017

All frames and detected objects are indexed using Inception pool3 features, which serve as a general-purpose indexer. So yes, you can search for arbitrary objects such as a blue car, a logo, or a particular "scene". At the same time, when the definition of "similarity" is more object-specific/fine-grained (as in the case of faces), you can use a custom network (such as FaceNet for faces) to generate the embedding/indexing vector.

To summarize: yes, we provide two indexers out of the box, a general-purpose Inception v3 one and a FaceNet one. We plan to add more indexers soon, e.g. trained on Open Images or other domain-specific datasets.

Jayakumark · on April 27, 2017

Exactly what's needed now, given the growing number of datasets and deep learning models.

aub3bhat · on April 27, 2017

Yeah, I am actively working on improving Visual Data Network. Essentially, the goal is to make sharing, downloading and configuring datasets a single-click operation.

Sharing visual data opens up a whole new set of opportunities for both businesses and researchers.

Omnipresent · on April 27, 2017

Can you please explain some real world use cases you're thinking about?

r0lisz · on April 27, 2017

Could this be used to power something similar to Google Photos?

aub3bhat · on April 28, 2017

Yes, we have an entire pipeline (detection -> embedding -> clustering) for faces, as well as the ability to extract text tags using models such as those trained on Open Images.

salilpn12 · on April 27, 2017

Is there a way to create an API that returns these features?

aub3bhat · on April 27, 2017

The features are stored in .npy files. Currently there is a REST API, but it's only for Django models (using the amazing Django REST framework). For search and feature retrieval, creating an API is straightforward, and I will add one soon.
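Until such an API exists, the .npy files can be read directly with NumPy. The following is only a rough sketch under assumptions: the file name `features.npy`, the one-row-per-frame layout and the random stand-in data are made up for the demo, and `top_k_similar` is a hypothetical helper, not part of the project.

```python
import os
import tempfile

import numpy as np


def top_k_similar(features, query, k=5):
    """Indices of the k rows of `features` most cosine-similar to `query`.

    Assumes `features` is a 2-D array with one non-zero vector per row.
    """
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = feats @ q
    return np.argsort(-scores)[:k]


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in data: 100 random 2048-d vectors instead of real features.
    path = os.path.join(tmp, "features.npy")  # hypothetical file name
    rng = np.random.default_rng(0)
    np.save(path, rng.normal(size=(100, 2048)).astype(np.float32))

    features = np.load(path)
    hits = top_k_similar(features, features[42], k=3)
    print(hits)  # row 42 ranks first, since it is the query itself
```

Wrapping `top_k_similar` in a view that returns the indices as JSON would be one way to expose it, but that part depends on how the project maps rows back to frames and detections.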