Hacker News

For all the reasons that this might not take off, what a thrill that people are trying something new--and it looks really nicely designed too.

I think this is easy to dismiss at first glance, but I genuinely believe they're trying to think about a new mode of interaction. The idea that "the computer will disappear" is probably accurate in the long term. Except for content delivery (reading, photos, movies), most tasks we achieve via computers and phones do not strictly require a screen. It would probably be a good thing if computers did a better job of getting out of the way and stopped so loudly disrupting human interactions.

Whether this will be the solution is unclear; the privacy/creepiness angle is still real with an outwards-facing camera. Latency and battery life limitations might be too significant. The cost will be a non-starter for many (it is for me).

But I'm still impressed because there was a vision here. The conversational interface has never worked before for many reasons, but that does not mean it cannot work in principle, or that the ideal implementation would not be spellbinding. I'm glad they're trying. Also, the laser display is neat!




First, I’m really excited people are trying new things, but I won’t be buying this just based on the demo.

> The conversational interface has never worked before for many reasons, but that does not mean it cannot work in principle, …. I'm glad they're trying. Also, the laser display is neat!

So I did a lot of work over the years to research voice UI/UX and I’m very skeptical about this, even with the LLM stuff. I think an LLM was what the Siri/Alexa era was missing to transform it from “audio cli” to “chat interface”, but there are a few other reasons it didn’t catch on.

The information density and linearity of chat, voice especially, is a big problem.

When you look at a screen, your eyes can move in 2 dimensions. You can have sidebars, you can have text fields organized in paragraphs and buttons and bars etc. Not so with chatting - when you add linearity (you can only listen to or read one thing at a time, conversation can only present one list at a time) it becomes really slow to navigate any sort of decision or menu trees. Mobile-first designs have simplified this of course, but it’s not enough. With TTS reading things out, it becomes even slower to find the info you care about. Voice has found a place for simple controls (smarthome, media, timers, etc) and simple information retrieval (weather, announce doorbell, read last text). Then there’s the obvious problem of talking out loud in public, false response recognition etc, which are necessary evils of a voice UI.

I think the best hope for a voice device like this is to (as they’ve done) focus on simple experiences like “what did I miss recently” and hope an AI can do a good enough job.

The laser display might help with presenting a full menu at once (media controls being an easy example), but it probably will end up being a pain to use (eg like a worse smartwatch).

Honestly though, my biggest hesitation (which could end up great) is the “pin” design. It’s novel, especially with the projector, but how heavy is it and how will that impact the comfort of my clothes? What about when wearing a jacket or scarf? Will this flop around while walking? Etc.


There is also a lack of serendipity or explorability with voice: How do you know what's possible? There is a reason a GUI menu is called a menu. It not only gives you access to multiple options but also, at a glance, an overview of what options are there, like a restaurant menu.


Discoverability is the term; e.g., "What Can I Say? Effects of Discoverability in VUIs on Task Performance and User Experience" https://dl.acm.org/doi/10.1145/3405755.3406119


It'll flop everywhere, not just while walking. Boom boom.

But yeah I've been thinking that too. "Oh, put my coat on - better spend 30 seconds messing around with my pin" [...] "Ahhh back in the office. There goes another thirty seconds moving the pin so it can film me looking at a screen for four hours"

And yeah, I feel like the weight would definitely pull my jumper or t-shirt out of shape, and make things like my collar/neckline look out of whack. Maybe they'll bring out a range of clothes suitable for it, or suggest you wear a coat indoors like the woman in the video is doing.


Linear conversation is a big problem for anything beyond simple, casual usage. It is the reason that YouTube is a terrible research platform. Is the information you want inside that 3-hour video? Possibly, but with text I can search an article for content or skim sections to determine if it's worth a deeper read.

Let's not forget the value of non-linear input. Good search terms are often constructed rather than spilled forth. Sometimes I enter search terms, read them back, realize that they're likely to return unrelated results, and need to modify them. By the time I realize this while speaking to an AI, it's already spitting out the wrong information.

This leads to a need for altered interfaces that allow these scenarios to be accommodated. This is v1.0. Let's see where it goes.


>Will this flop around while walking?

If a science fiction author was writing it, the need for stiffer fabrics to support chest cameras would synergize with a neo-Victorianism in generation alpha. (Formal button-up shirts and higher necklines for enforced modesty)


IMO, with LLMs we won't really need information density except for certain classes of people.

Even now - clicking through some insurance company's website hierarchy to find something out is insanely painful.

But even for researching things that we should probably care about enough to do it ourselves, correlating different sources of information or working through abstract/ambiguous problems... the vast majority of ordinary people will 100% take the easy way out and let LLMs do most of the thinking for them. Even with free GPT-3, people are unflinchingly having LLMs solve problems they don't want to think about too deeply. What they pay for, with occasional inaccuracy, is more than offset by convenience.


> IMO, with LLMs we won't really need information density except for certain classes of people.

Maybe, but I don’t know if that day is here yet. I think “most people” do actually consume information. Like reading an insurance company’s website is pretty rare compared to things like using the Amazon App. Like it’d be hard to consume a list of 5+ push notifications via voice if you had to listen to them 1 by 1 instead of skimming them in a list next to their icons.

Even simple things like scrolling through a list of songs become painful. I have like 10k songs in my (streaming) library. Sometimes I randomly scroll through it to find old music. That sounds impossible on voice. I’d be stuck with “shuffle” mode.

Being able to summarize and search text conversations via voice queries from their demo would be nice, but today that’s a task that you need a screen for.

The demo video shows the man buying a book online via voice after holding it up to the camera. How often is that the online shopping experience? I can’t imagine shopping without a screen 95% of the time.


>we won't really need information density

We may not need it, but we certainly prefer it. People went completely voluntarily from voice calling to texting, and within texting to ever terser forms, to the point where an entire website was built around a short character limit.

Except for people with disabilities, I have not really seen a single case where that tendency towards compactness is reversed in communication.


> most tasks we achieve via computers and phones do not strictly require a screen.

X (doubt). There are unfortunately only 5 senses through which our brains can interact with the outside world, and visual ways are the most information dense and the easiest to utilize. The screen isn't going away anytime soon.

Projectors, to me, are the same as screens - they've been around for as long, too.

Though I do look forward to direct computer-brain interface, like introducing a 6th sense.


Projectors really love flat, non-moving surfaces. Will be interesting to see how they've coped with a wiggly hand wiggling around or in motion.


I highly doubt this thing is even usable outdoors. You would need pretty insane brightness levels for this to work in the sunlight. Companies have been trying to make projectors with touch input for years, and nobody has gotten close to anything resembling a consumer product. I highly doubt they achieved it here.


I agree, even though I'll reserve judgement until trying it. But it remains that limitations in power are unavoidable, and projecting a laser image onto a hand in daylight is going to use an awful lot of juice, particularly given the projector is so tiny. I've no idea how this can be done so it's functional, never mind "insanely great". Same goes for their claim that the speakers inside this tiny device are capable of getting sound to the ears while skateboarding outdoors. You can have all the Head Related Transfer Functions in the world but again, you need speakers and amplifiers on the order of several watts to get the sound up to the ears. My iPhone Pro Max sounds great and loud in a quiet room, but take it onto the street to play music and it's barely audible. Also not sure how the device will know what kind of HRTF to use given its placement is going to vary so much.


The 5 senses thing is long-disproven rubbish. Humans have hundreds of senses.


Would love to hear what the rest of those are, please be specific.


Well, for instance, what is commonly referred to as "touch" is actually a whole bundle of senses. There's the actual sensation of pressure, but also texture, temperature, surface finish, the physical position of your various body parts, your sense of balance, etc etc.


Okay, but unless you're suggesting a computer interface based on proprioception, I'm not sure that that's relevant to the topic at hand.

I too would be interested to see an enumerated list of over 100 senses.


Isn't that what all the various VR glove type controllers are?


No, but points for a solid attempt. Senses are input (to the body), not output. Glove controllers are just output via movement, just like keyboards and touchscreens.

True, part of what makes them cool is that your proprioception more or less agrees with the virtual hand that you see in your headset, but that's just window dressing. The computer has no way to control that.


Can't help but notice that you again didn't answer the question. I will third the request.


not the parent but I posted an honest attempt at such a list on the sister comment if you're interested :)


Rather pushy demand for well after the East Coast has gone to sleep.


Ok, 100 senses could be too many for you to type. Maybe could you list 20 human senses?


I'm no biology expert but had to study some of this for my robotics degree not so long ago.

"Sight" is split into rods for brightness sensitivity, and cones, each of which is dedicated to one out of red, green, and blue. Green covers a wider gamut of color than the others because there is a lot of green in nature. These sensors are fully independent of each other for the most part, although there is minor overlap between cones, which is what we call other colors (yellow etc).

"Taste" is again split into different specialised papillae sensors. I don't remember so well, but it's something like foliate for sour sensing, fungiform for salty, and vallate for bitter/poison. There is also sweet, for which I don't remember the name, and some argue for umami.

"Touch" There are an ungodly number of very distinct senses that go into touch. From more abstract ones like pain, heat/cold, moisture (not evenly distributed around the body; for example, you have to touch things to your lips to distinguish cold from wet), to proprioception for joints (arguably an independent sense for each joint, or at least each "kind" of joint, because the biological mechanism is different for ball joints, saddle joints etc, as well as specialised proprioception for eyeballs, tongue etc).

Then in actual touch touch there are Ruffini corpuscles, which sense skin stretching and slippage of objects past the skin

Merkel discs, which sense pressure applied to the skin and low frequency vibration

Meissner's corpuscles, which sense vibrations in the middle range. They are very sensitive and allow very slight sensing of tiny impulses such as picking up an insect's wing

Pacinian corpuscles, which sense extremely fast vibration, which among other things allows the distinction between "rough" and "smooth" surfaces (by mechanical movement causing vibration)

There are also free nerve endings sensing stuff like itching and bruising.

Hair follicles also sense movement and stretching of the hair they are attached to, which provides more touch data. Incidentally, this mechanism is also used for balance and hearing via really complicated interactions of tiny hairs in the ear.

"Smell" Smell is fiendishly complex, it actually is more akin to the way antibodies in the body are made in the sense it consists of thousands (and millions) of specialised sensors made to "fit" and attach to individual compounds, so there are almost limitless individual senses of smell

There is also a whole lot of internal sensor data for things like breathing (you know when you are short of breath), digestion (you know when you are full, or when you are craving one of a number of things, sweet, salty etc), and bladder control.

This is mostly off the top of my head, and I'm certain I'm misremembering some of the subtleties and missing a whole bunch more senses, both obscure ones and ones immediately recognisable to any owner of a human body.


This is super interesting, and I appreciate the level of detail and thought that went into your response. Some I'm willing to accept, like hot/cold being distinct from pressure being distinct from pain. (Spinal cord injury, for instance, can impair pressure perception in a particular part of the body without affecting hot/cold. And lumping joint pain in with "touch" is just silly.)

On the other hand, in the context of the discussion, it's hard to support the argument that you can count each colour channel separately just because the biological mechanics differ. You can't actually triple the amount of human-perceptible information by going from a monochrome to full colour display.

The point remains that we've plucked the low-hanging fruit when it comes to high-bandwidth human senses (or meta-senses if you insist on being pedantic). No one will buy a PUI (pain user interface).


Absolutely! Sight is an amazingly high-bandwidth sense, as is hearing.

Other types of interfaces do exist. For example, I've worked with vibration motor arrays placed on the skin for various purposes, such as assisting in guiding the arm of a patient to target a specific point (vibrate on the side closest to the target). We also worked with pads of electrical patches that pass small currents through the skin to produce a distinct sensation, like pain but barely at the threshold of being noticeable. These were used for first responders, placed along the side of the torso underneath the clothes with a flat profile, allowing them to have hands-free, silent, low-bandwidth communication. Something like "up up left left" would be pre-agreed to mean "leave the structure now", etc. Another fun one I wanted to mention is in-mouth joysticks controlled with the tongue, which allow quadriplegic patients to move a wheelchair or robot arm and regain some small independence. (It might seem like it would be uselessly hard to achieve anything with an arm controlled that way, but the emotional impact of independence can't be overstated for such people; even a simple task can be very meaningful.)

They won't be as good as screens or audio unfortunately. But they can exist. Even braille screens and keyboards exist as a nice product and are reasonably high bandwidth.


> On the other hand, in the context of the discussion, it's hard to support the argument that you can count each colour channel separately just because the biological mechanics differ. You can't actually triple the amount of human-perceptible information by going from a monochrome to full colour display.

You absolutely LOSE perceptible information when you lose one of the channels, like in color blindness.


I mean you can separate a screen out into a bunch of pixels as well, or specifically blue, red, green pixels.

Even on the vision front, we have rods and cones that work differently to generate ONE vision.

This is entirely semantics.


Not sure there are hundreds, but just to add one example beyond “the five”: Proprioception [1]

[1] https://en.wikipedia.org/wiki/Proprioception


> Whether this will be the solution is unclear; the privacy/creepiness angle is still real with an outwards-facing camera.

I don't think you're wrong, but it's funny that we aren't as concerned about everyone walking around with outwards-facing phone cameras.


Or microphones being present absolutely everywhere.

I myself never felt like taping my camera, I feel like if someone pwned my system I would be much more worried about the leaked audio.


It's funny, I see people cover up the webcam on their laptops all the time, but not their phones. They forget that there's a camera on both sides of the phone.


Webcams in laptops are shitty cameras, and for most people, they're useless anyway (even in the post-pandemic era, hardly anyone does conference calls, video or otherwise). Meanwhile, the "selfie camera" is like literally the main purpose of the phone for a large chunk of the population.


tbf those are usually in a pocket or facing down, with filming being an explicit and purposeful action


Plenty of people are walking around, or sitting in a public place with their phone cameras facing out to the world.


I think what the parent comment was saying is that when being held in a normal manner, the phone is facing about 45 degrees below the horizon, so it can't see much except people's legs. To film people's faces and such, you'd have to tilt the phone up much higher than you would if you were just writing a text message / email or browsing the web. If you try writing a text on a phone that's angled up to the horizon like that, it's harder to type and harder to read the screen.


The wide angle lens on my iPhone can capture a pretty good portion of my current room even with my phone angled 45 degrees down.


True. I suppose the social conventions around overt vs covert use of smartphone cameras evolved before wide-angle cameras on phones became common, since wide-angle cameras on phones are a pretty new thing.


And especially since we can now make cameras small enough that you'd never know they were there. Even OVM6948 is commercially available, the size of a "grain of sand".

I've always said that privacy is an illusion, the usual example I give is: "You're lying in bed with the curtains drawn, you see a shadow fall across the curtains that looks like a person standing outside. Do you, or do you not have privacy?"

If the shadow turns out to be a person peeking through the curtains, then you don't. If the shadow turns out to be primal brain + tree shadow then you do. Schrodinger style.

Privacy is probably best described (as it sometimes is) as a "sense" of privacy I guess.


Well said.


Yeah, I expect that this will die a horrible death in the market, but it's definitely interesting with its Star Trek vibe. :)

The next generation of devices that incorporate some of these features might be more successful.


I imagine if this company is successful, it will become quite the enterprise.


This doesn't feel like the right product for a lot of reasons. (Wait...do I have to pin it to the outside of my coat when I put that on? What's the battery life outside a coat in winter? Will it catch on my seatbelt?) Lots of practical problems for a lot of people. Still, LOTS of interesting ideas here.


> It's probably a good thing if computers did a better job of getting out of the way, and stop so loudly disrupting human interactions.

And that is not this. Talking out loud every few moments with verbal commands to a device is way more annoying than someone looking at and typing on a phone.

That said, I agree with you at a glance it's neat. I think in reality though it's a poor idea given how often people need to give a verbal command.


The talking out loud I agree is problematic. The bluetooth functionality and increasing quality audio pass through give me hope for a simple earphone in one ear, and eventually... this: https://x.com/ruohanzhang76/status/1720525179028406492

Also bullish on hand gesture control. Maybe most stuff will eventually become jutsu level fancy hand movements lol. What a time to be alive. It is easy to remain grateful in this age of rapid progress.


The problem is the voice-based approach: it won't work reliably in loud environments, it won't be usable in a doctor's waiting room, libraries and other quiet environments, and some people simply don't like voice UIs.

If you want the computer to disappear, why not a better smartwatch? Or glasses, this time without the sci-fi gadget look? Both could support the exact same featureset but with a screen.


> The idea that "the computer will disappear" is probably accurate in the long term.

Why though? A computer requires attention, which pretty much rules out doing something else while using it, except perhaps when passively listening to a podcast (which doesn't really qualify as computer use). Even though we may see new mediums, the mode of interaction will remain similar to that of a book.


I agree. This looks like a gadget, which means I probably won't rush to buy one, but I'm glad people are trying to push the envelope.



