
Launch HN: Piccolo (YC W18) – Camera for controlling your home with gestures - marlonmisra
Hi HN, we’re Marlon and Neil, founders of Piccolo (https://www.piccololabs.com/).
Piccolo is a smart camera that lets you control your TV, lamps, fans, speakers,
and other devices with simple gestures. For example, you can point at your
lamps with your hand to turn them on or off.

The two of us have had an interest in computer vision for a long time and were
in Udacity’s first self-driving car nanodegree cohort in 2016. We started this
as a side project to control one lamp and soon had our entire house connected.
For some actions, we found gestures to be much faster and more intuitive. For
example, pointing at a lamp to turn it on is way more natural than saying “Hey
Alexa, can you turn on my left living room lamp?”

To set up Piccolo, you can place it anywhere (near the TV is usually best), and
then on the app you can indicate with bounding boxes where the devices are.
After that, you connect those same devices (Chromecast, Hue lights, smart
plugs, etc.), and you’re good to go. Some processing happens on-device, but the
more complicated models are run in the cloud. Since we’re not a security
camera, there’s no need to store video and so no image/video data is ever
stored.

We’re excited about the experiences you can build when you have a camera and
apply computer vision techniques. With recent progress in human pose
estimation, object classification, and object tracking, there’s really a lot
you can do. We’re starting out with gestures, but our goal is to build a
platform that lets anyone create and deploy vision apps. Here's a few things
we're excited about:

- New apps. For example an app that detects medical emergencies (like an
elderly person falling). We'd also love an app that can tell you where you left
your phone and keys.

- App integrations. For example, letting Netflix know which people are in the
room to get tailored recommendations for everyone vs. just the person signed
in.

- Smarter hardware. For example, an espresso machine that, with one click,
makes your favorite drink because it knows who pressed the button.

- Voice-vision fusion. You should be able to trigger Alexa just by gazing at
the Alexa device instead of saying "Alexa". You should also be able to hold
something and say "Order 5 more of these".

We're giving away 20 pre-release units next month to anyone that joins the
waitlist. We’re happy to answer any questions and look forward to your
feedback. If you want to follow up, our emails are marlon@piccololabs.com and
neil@piccololabs.com.
======
JamesCoyne
From the Hitchhiker's Guide:

A loud clatter of gunk music flooded through the Heart of Gold cabin as Zaphod
searched the sub-etha radio wave bands for news of himself. The machine was
rather difficult to operate. For years radios had been operated by means of
pressing buttons and turning dials; then as the technology became more
sophisticated the controls were made touch-sensitive--you merely had to brush
the panels with your fingers; now all you had to do was wave your hand in the
general direction of the components and hope. It saved a lot of muscular
expenditure, of course, but meant that you had to sit infuriatingly still if
you wanted to keep listening to the same program.

Zaphod waved a hand and the channel switched again.

~~~
lifeisstillgood
I wanted to post the exact same quote.

To understand the problem faced here it's worth trying to just watch people.
Guess what people are saying to each other.

------
brwsr
Definitely cool technology, only not for me. I'll never place a camera in my
home that is connected to some cloud service or company database. That data is
simply too valuable for me to send out to a "trusted" party.

~~~
neilraina
That's a fair concern. We are very serious about privacy. Existing cameras
today focus on security so they have to store your data. Piccolo on the other
hand is for real-time interactions so images are never stored.

~~~
maxerickson
What do you plan to do with other sorts of data?

Like does the cloud just trigger events and that's it, or does it keep a log of
where people are located in the scene and whether they have their arms
crossed, etc?

------
huangc10
The demo on the front page looks super cool. Signing up and looking forward to
more progress and updates in the future. I wonder how accurate the finger
recognition is? Is it accurate enough to pick up numbers so your fingers can
be used to change the channel for example?

I did my final year engineering project with the Kinect (years ago!) and
controlled a robotic arm to pick up an egg. Didn't know how else to apply
this, but it seems like you guys really thought outside the box.

Only thing is the name...not sure about naming after the green alien from DBZ.

~~~
marlonmisra
It's accurate enough to pick up the number of fingers from a fairly long
distance (~5m), and it works well in most lighting conditions, including
complete darkness.

~~~
huangc10
Actually the lighting was the other question I had in mind. Impressive. When I
worked on the Kinect code, lighting was never really an issue we had or wanted
to tackle.

I think something that people don't realize until they start using motion
sensing technology is how intuitive it can really be. It might look awkward at
first, but once it's seamlessly integrated (i.e. stepping into the sensor's
view), it's really just like magic. A good enough sensor can pick up and
understand very specific motions.

Anyway, good luck guys.

------
epberry
Very cool. I shudder to think of the GPU costs to run these models though.
Perhaps they're using TPUs to be as efficient as possible. If you imagine a
room occupied in the evening by people for several hours and you have any
decent framerate, you're running your pose estimation network on each frame
for several hours. And these models are big as far as I have seen. So that
pretty much means you have one cloud GPU per camera allocated every evening. I
suppose another option is they are running pieces of the network on the device
but I think that's unlikely.

Of course these models are getting smaller over time and I'm incredibly
impressed that these guys have put together the hardware, computer vision, and
cloud setup. I also think they've nailed the MVP - not too easy but not too
complicated either assuming they have decent models.

I'm signing up!

~~~
pavlov
I think you're overestimating the processing requirements. The original 2010
Kinect did fundamentally similar processing (multi-person tracking and
skeletal mapping) on the Xbox 360 which had a PowerPC CPU from 2005.

~~~
ReverseCold
Running neural networks is usually much easier than training them
(computationally).

Was the Kinect even a neural network? I don't think it was.

------
crolek
First off, signed up. This looks like a pretty sweet tool for home automation.

I'm slowly adding home automation to my house, but I'm a little more privacy
oriented. I totally recognize I might not be your target audience with some of
my constraints. A couple of questions: if the service is paid (monthly or a
one-time app purchase), how much is getting published to your servers? Because
I like having control over something as intrusive as cameras in my house, I'd
prefer to self-host as much as I can.

Does it have to be the camera you provide/sell or could a high enough quality
camera do that?

------
hashkb
> but the more complicated models are run in the cloud

This is a massive problem with home automation tools today. I need to rely on
Comcast to stay alive to be able to turn lights on and off in my house? To me,
it's an absolute dealbreaker. I've spent more time than I feel like I should
have had to in order to control devices I own without any data leaving my LAN.
Am I the only one who feels like it's a totally absurd situation?

~~~
marlonmisra
Our first prototype ran locally end-to-end. To make that work we used a Jetson
TX2 ($600 computer), but the performance was abysmal. In a few years, it might
be possible to do what we're doing locally and at a reasonable price.

~~~
hashkb
I just think it's unfortunate that every player in the space has to ship their
own hardware (and, obviously, force the user into a walled garden so that
competitors' products are inconvenient to use.)

Give me (a power user) something I can run in a docker container locally, or
on any of my local Macs or PCs.

Or, assume as a constraint that smart homes need to be able to run themselves.
I would appreciate if everyone in the space made an effort at detente while
the technology matured to a point where a massive privacy and security hole
wasn't required in order to have the thing work at all.

------
warent
Super cool idea, although the privacy implications are a little bit
discomforting.

Anyway, I'd be interested to know how you're dealing with the depth problem.
For example, if there's a light behind you and a light above you, pointing up
looks like you could be pointing at either light.
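
To make the ambiguity concrete, here is a minimal sketch, assuming a 2D pose model that gives elbow and wrist pixel coordinates and devices marked as bounding boxes as in the setup described in the post. The function names, coordinates, and the ray test are all hypothetical illustration, not Piccolo's actual method.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def ray_hits_box(origin: Point, direction: Point, box: Box) -> bool:
    """Slab test: does the 2D ray origin + t * direction (t >= 0) cross the box?"""
    t_near, t_far = 0.0, float("inf")
    axes = (
        (origin[0], direction[0], box[0], box[2]),
        (origin[1], direction[1], box[1], box[3]),
    )
    for o, d, lo, hi in axes:
        if abs(d) < 1e-9:
            # Ray is parallel to this axis, so it must already be inside the slab.
            if o < lo or o > hi:
                return False
            continue
        t1, t2 = (lo - o) / d, (hi - o) / d
        t_near = max(t_near, min(t1, t2))
        t_far = min(t_far, max(t1, t2))
        if t_near > t_far:
            return False
    return True


def pointed_devices(elbow: Point, wrist: Point, devices: Dict[str, Box]) -> List[str]:
    """Return every device whose box lies along the forearm ray in the image."""
    direction = (wrist[0] - elbow[0], wrist[1] - elbow[1])
    return [name for name, box in devices.items() if ray_hits_box(wrist, direction, box)]


# Hypothetical scene: image y grows downward, so "pointing up" is negative y.
devices = {
    "ceiling_light": (300, 0, 340, 40),
    "back_light": (300, 100, 340, 140),
    "tv": (0, 200, 100, 260),
}
hits = pointed_devices(elbow=(320, 300), wrist=(320, 250), devices=devices)
print(hits)
```

Both lights come back as hits because the image plane has no depth, so the ray alone can't separate a light above you from a light behind you.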

~~~
neilraina
This is one of the most challenging parts of this problem. We’re in the middle
of transitioning to a 3D model that will be able to tell the difference really
well.

------
frakkingcylons
Wow, that short clip on your homepage is crazy effective. I understood it
immediately. Super cool product. I would buy this in a heartbeat if it came in
a self-contained hardware package that included the camera and a module that
does the CV processing in a way such that no video of my home touches a 3rd-
party server.

~~~
marlonmisra
Thanks. I think in a few years we'll see devices that can do everything
locally, and cost a reasonable amount. For now, like others have said, some of
the models for pose are too complex.

------
pavlov
Is it basically a smaller Kinect without the Xbox platform tie-in?

Looks very neat. If you open the API, I’d love to play with it!

~~~
neilraina
That's right, Piccolo is not tied to any console or platform. We're also very
excited about opening this up. We think there is potential to build really
powerful applications with the information available here. Stay tuned!

------
thedangler
SmartThings integration would be nice. I mean I guess it's possible to do it
yourself, but out of the box would be sweet because all my lights and motion
run off it.

~~~
neilraina
That’s a great suggestion. We are working on connecting with a few different
smart home hubs including SmartThings.

------
prawn
Brilliant home page video. Explains the premise quickly and clearly. I suspect
the power and convenience will evaporate privacy concerns for many.

It could recognise a couple in a dancing pose and randomly play something
appropriate. It could recognise heroic arms aloft and play applause. So much
potential.

There are countless things we do daily that have loads of room to be
simplified. Even getting an Apple TV where the remote can switch on the TV and
the device, and auto-change the TV's source to the device input felt like a
game-changer, but this could ramp that up significantly.

------
jkravitz61
I had a similar thought on the OpenPose library. How did your team end up with
such a robust model without using OpenPose? As far as I know, OpenPose is only
licensed for Academic use and was created using the Panoptic dataset which I
believe is open. The problem is, creating a system that reaches the parity
level of OpenPose with simple 2D image rec could be a startup in itself. I'm a
bit cynical...

EDIT: you answered my question with DensePose. Had no idea that exists!

------
didip
Your demo on the front page is super spot on. I get what you do instantly.

Wish you guys all the best. This is so cool, I am on the waiting list!

------
ibdf
This looks interesting... No pricing listed, but I'm guessing it will be
between $200 to $400. My main issue with "smart" devices is the price vs the
longevity.

(Cameras + bulbs + switches + doors + sensors) * how many rooms you have =
this automated home thing gets expensive really fast.

And then you end up with a not so smart home. Not to mention the struggle of
figuring out which devices talk to which system and the many apps...
everyone wants to make their own app, and then you need a middleman app to
talk to other apps. All of this work, and in 5 years you will probably need to
replace everything. Just thinking about it, a smart home today will probably
devalue the price of your home in 10 years.

------
iltaiuti
I think this is a great idea, it reminds me a lot of the OpenPose library. I
found very interesting how you found each other via Udacity's course. Could
you expand on how you went about actually teaming up and starting out? Thanks
and good luck with the final pitch!

~~~
marlonmisra
Yes, OpenPose gave us a lot of inspiration. DensePose is similarly impressive
and came out recently.

We did the course together but met many years before, in high school in Canada
in '07.

------
sergiotapia
Do you see the entire video and process it, or do you only see the skeleton +
target areas?

~~~
marlonmisra
We do some preprocessing on-device and run our more complex models in the
cloud. It’s similar to Alexa which always listens for the trigger word, but
only streams to servers for the few seconds afterwards. While some image data
is processed in the cloud, it’s never stored.
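
(Illustration only, not Piccolo's actual code.) The trigger-then-stream pattern described above can be sketched as a small generator. The `is_trigger` callback stands in for a hypothetical cheap on-device detector, and the `window` size is an arbitrary assumption.

```python
from typing import Callable, Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def frames_to_upload(
    frames: Iterable[Frame],
    is_trigger: Callable[[Frame], bool],
    window: int,
) -> Iterator[Frame]:
    """Yield only the trigger frame and the frames right after it.

    Each trigger opens a window of `window` frames (the trigger frame included).
    Frames outside a window are dropped without being stored or sent.
    """
    remaining = 0
    for frame in frames:
        if is_trigger(frame):
            remaining = window  # a new trigger restarts the window
        if remaining > 0:
            yield frame
            remaining -= 1


# Toy usage: integers stand in for camera frames; frame 3 is the "trigger".
uploaded = list(frames_to_upload(range(10), lambda f: f == 3, window=3))
print(uploaded)
```

Nothing is buffered here: a frame is either yielded for upload or discarded as soon as the loop moves on.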

------
xauronx
I've wanted to do this but with the Nest cameras I already have in my house.
It's too bad that Nest has made the decision to completely discourage anyone
using the video from these devices.

------
casenjo
I love home automation and I love this idea but I'm deadset against having a
cloud service connected for my home automation. Would it be possible to run a
server locally, using my GPU for the processing?

------
ngokevin
Awesome! Can't wait to no longer have to yell "ALEXA. HEY ALEXA" -> "sorry I
could not find birdroom lights".

------
hopfog
Very cool!

I got a 502 and "Oops! Something went wrong while submitting the form"
initially though but it worked on the second try.

------
loteck
_We're giving away 20 pre-release units next month to anyone that joins the
waitlist._

Are you sure that wording is accurate?

------
ransom1538
I wonder if this could be an iPhone app, so we can just use an old iPhone. Do
we need a special camera at all?

~~~
marlonmisra
You could but the downsides are: 1/ It doesn't have the ideal sensors (stereo
cameras, or other depth sensor), so you'd have limited functionality 2/ Most
people wouldn't want to run their iPhone consistently day after day.

------
dboscochoi
Looks really cool. Your landing page does a really great job at demonstrating
what it does. Great work!

------
HammadB
Congrats, this is very impressive! I'm curious: are you computing the 2D or 3D
pose of the person?

~~~
marlonmisra
The demo video on our website was done with 2D pose. We’re in the middle of
transitioning to 3D.

------
eaenki
Few things come to mind:

1: Amazement 2: DBZ's character 3: Google's Project Soli

------
faitswulff
As an aside, what was your background before doing the self-driving
nanodegree?

~~~
marlonmisra
I taught myself programming a few years ago and did the ML nanodegree before.
My cofounder Neil studied software at Waterloo and was an engineer at
Pinterest.

------
chanfest22
This looks AWESOME! Can't wait to run around the house like Iron Man.

------
briandear
Does it work with Apple HomeKit?

------
garrettlangley
congrats guys!

------
kiambogo
Will Piccolo help you not hit me with your car again Neil?

------
alwaysNever
It’s funny because Y Combinator loudly boasts that, ideally, it seeks to fund
billion dollar ideas, and yet I so very much do not want a thing like this.

Computer vision is this incredible thing, until you realize it means you’ll be
in front of a camera you cannot control, and that camera will require
broadband internet, if not for a one-time activation, then perhaps
continuously and indefinitely.

I’ve never seen a high-end electronic device produced after 2010 restrict
itself from the internet. No company seems to have the self control necessary
to consider even the idea. No internet? Impossible!

What kind of echo chamber would shout down such an idea? Do people understand
why I might worry about a camera with unrestricted internet access? Why must I
trust anything so deeply?

I don’t want to hear about how I should just give up this fight, because I’ve
already lost anyway. I don’t care about what other devices already do. I don’t
care about how different this time is.

