Ask HN: Computer Vision Project Ideas?
7 points by tempcvstudent 9 days ago | 4 comments
I'm an undergraduate taking a deep learning/computer vision class this term, and our final assignment is an open-ended computer vision project. I'm looking for ideas.

Basically, I want something a little novel (perhaps even impressive) that's a departure from the standard image recognition and classification projects that are usually completed as "exercises" for this final project - things like license plate recognition or food classification that tend to be done every year.

For what it's worth, I have a pretty solid background in my CS fundamentals. I'm absolutely willing to consider any project - the ideation process has been pretty tricky thus far. The stack for the class is Python, TensorFlow, and OpenCV.

I've got 2 months to work on this project, and I'm willing to dedicate as much time as needed, so please don't limit ideas to a severe time constraint. I don't have any extra money though, so ideas that need a lot of cloud processing might be out of scope.

I think THE problem in computer vision, and it's really an AI problem, may be video prediction.

Basically output the next N frames, given an input sample. It's trivial to estimate the error as you just split the video in two equal pieces. And try to guess the second half.
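To make the split-in-half evaluation concrete, here is a minimal sketch, assuming grayscale frames stored as nested lists of floats and mean squared error as the metric. The function names and the "repeat the last frame" baseline are illustrative assumptions, not something from the thread; for a real project you would probably load frames with OpenCV and compare against a learned model.

```python
def split_half(frames):
    """Split a frame sequence into a context half and an equally long target half.

    With an odd number of frames, the final frame is dropped so both halves match.
    """
    mid = len(frames) // 2
    return frames[:mid], frames[mid:2 * mid]


def mse(frame_a, frame_b):
    """Mean squared error between two equally sized grayscale frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


def repeat_last_frame(context, n):
    """Trivial baseline (an assumption for illustration): predict that nothing moves."""
    return [context[-1]] * n


def evaluate(predict, frames):
    """Average per-frame MSE of predict(context, n) against the held-out second half."""
    context, target = split_half(frames)
    predicted = predict(context, len(target))
    return sum(mse(p, t) for p, t in zip(predicted, target)) / len(target)


def make_frame(x, width=8):
    """Toy 1 x width frame with a single bright pixel at column x."""
    return [[255.0 if i == x else 0.0 for i in range(width)]]


# Toy "video": one bright pixel moving right by one column per frame.
video = [make_frame(x) for x in range(8)]
baseline_error = evaluate(repeat_last_frame, video)
```

Any predictor worth keeping should beat `baseline_error`. Pixel-wise MSE is only one possible metric, and it tends to reward blurry predictions, so treat it as a starting point.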

And it's so difficult because it's not a matter of mere RGB probability. It requires scene understanding, including the physical trajectories of the bodies in motion!

I had the idea that you could take a really basic video game such as Pong, Snake, Space Invaders or Tetris, and not only predict the next frames but also infer physical quantities from the laws of motion.
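As a rough sketch of what "inferring physical quantities" could look like for Pong: assuming you already have the ball's pixel position in each frame (the `track` list below is made-up data standing in for a hypothetical tracker's output) and assuming constant velocity between bounces, a least-squares line fit per axis gives the velocity, which you can then extrapolate.

```python
def fit_line(ts, xs):
    """Least-squares fit of x = x0 + v * t; returns (x0, v)."""
    n = len(ts)
    t_mean = sum(ts) / n
    x_mean = sum(xs) / n
    cov = sum((t - t_mean) * (x - x_mean) for t, x in zip(ts, xs))
    var = sum((t - t_mean) ** 2 for t in ts)
    v = cov / var
    return x_mean - v * t_mean, v


def estimate_motion(positions, fps):
    """Estimate the start position (pixels) and velocity (pixels/second) of a tracked object.

    Assumes constant velocity over the whole track, so it is only valid between bounces.
    """
    ts = [i / fps for i in range(len(positions))]
    x0, vx = fit_line(ts, [p[0] for p in positions])
    y0, vy = fit_line(ts, [p[1] for p in positions])
    return (x0, y0), (vx, vy)


def extrapolate(origin, velocity, t):
    """Predict the position at time t (seconds) under constant velocity."""
    return (origin[0] + velocity[0] * t, origin[1] + velocity[1] * t)


# Made-up track: the ball moves +3 px in x and -2 px in y per frame at 60 fps.
track = [(10.0, 50.0), (13.0, 48.0), (16.0, 46.0), (19.0, 44.0)]
origin, velocity = estimate_motion(track, fps=60)
next_position = extrapolate(origin, velocity, 4 / 60)
```

The same fit with a quadratic term would recover an acceleration such as gravity. The interesting part of the project is getting a network to discover these quantities from raw frames instead of from a hand-built tracker.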

Being able to apply it to a bullet hell style shmup would be very, very interesting, as some projectiles follow sine waves and others follow RNG-driven random-walk paths ;)

"Basically output the next N frames, given an input sample. It's trivial to estimate the error as you just split the video in two equal pieces. And try to guess the second half."

I actually did an experiment with something very close to this in the '90s, using simple morphing. For example, I took what looked like a video of a panning scene but was actually a series of still photos, and morphed between them so the animation became more of a smooth pan instead of the seizure-porn strobe effect of flipping pictures.

It didn't work as well as I thought it would, but I think what came out of it was possibly a form of art. Everything looked almost normal, but there was a really frenetic halo that accompanied anything that moved on the screen - especially if it moved in any direction where parallax plays a part in understanding what went where. The wriggly halo artifacts looked cool, but maybe only for a rock video. There was none of the smoothness I expected or hoped for.

How about something to do with race finish lines? RFID works fine if you can funnel the finishers through a small gateway, but I know from my hobby of flatwater kayaking that rivers and equipment can make this very difficult. Rowing and sailing will probably have the same issue. Your local running club may also struggle with identifying and ordering finishers...

- "Looking at me" detection, so your device can tell when you are looking at it.
- Pole vault trainer: detect and track the pole in pole vaulting competitions. Either pick out interesting measures (like the curvature of the pole, or acceleration of the tip) or try to directly predict the success of the jump.
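For the pole curvature measure, one possible sketch, assuming you can already locate three points along the pole in image coordinates (say the grip, the midpoint, and the tip; the detection step is the hard part and is not shown), is the Menger curvature: the reciprocal of the radius of the circle through the three points. The function name and the sample points are illustrative.

```python
import math


def menger_curvature(a, b, c):
    """Curvature (1 / radius) of the circle through three 2D points.

    Returns 0.0 for collinear points, which corresponds to a straight pole.
    Raises ValueError if any two points coincide.
    """
    # Twice the signed triangle area, from the 2D cross product of (b - a) and (c - a).
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    area = abs(cross) / 2.0
    side_product = math.dist(a, b) * math.dist(b, c) * math.dist(c, a)
    if side_product == 0.0:
        raise ValueError("points must be distinct")
    return 4.0 * area / side_product


# Three points on a circle of radius 5, so the curvature should be close to 1 / 5.
bent = menger_curvature((5.0, 0.0), (0.0, 5.0), (-5.0, 0.0))
# Three collinear points: a straight pole has zero curvature.
straight = menger_curvature((0.0, 0.0), (1.0, 1.0), (2.0, 2.0))
```

The result is in units of 1/pixels, so you would need a scale reference (the known pole length, for instance) to compare jumps filmed from different distances.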
