
Stack:

Frontend: NextJS, Tailwind, NextUI, Rust (via wasm-bindgen), hosted on Vercel

Backend: Rust, Axum server, LMDB, hosted on Alwyzon; Cloudflare for CDN caching and SSL.
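
For anyone curious how small the backend side ends up being, it's roughly this shape (a minimal sketch assuming axum 0.7 and tokio; the route name is illustrative only, and the real app would wire in the LMDB environment as shared state):

    // Minimal Axum sketch (assuming axum 0.7 / tokio).
    // The "/tools" route is hypothetical, for illustration only.
    use axum::{routing::get, Router};

    #[tokio::main]
    async fn main() {
        let app = Router::new().route("/tools", get(|| async { "tool listing" }));

        let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
        axum::serve(listener, app).await.unwrap();
    }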

https://www.arible.co: a growing directory of useful productivity tools, accessible without multiple subscriptions or registrations.


I'm working on a tool directory that lets you run them directly without multiple subscriptions.

https://www.arible.co


I can't totally agree with your counter-counterexample. Most non-trivial problems are time-bound, deadlines exist, and no matter how well ingrained you are in first-principles thinking, you won't be useful if it takes months to come up with a solution.


The fundamental issue with this tech is that they're trying to pair high-density information with a low-bandwidth input interface (less than a handful of finger gestures) and voice input, which is flawed: you can't pause or backtrack to correct mistakes, and it's highly dependent on external context.


I think voice and image are pretty high density, and that's the future of most/all consumer interfaces with AI, no?


Voice is very high density as an input modality, but that's a double-edged sword.

It's fantastic for expressing complex requests that would otherwise take a lot of effort and time on a less-information-dense interface.

But voice has a higher "floor" for ease of use. It's easier to tap a button to confirm than it is to speak "Yes, ok" out loud. It's also more socially appropriate in more circumstances.

The other problem is that voice is very low density as an output modality compared to the status quo, which is high-resolution screens. The amount of time it would take to express even the most basic information (imagine speaking the HN home page out loud vs. reading it!) is pretty extreme.

Where this forms a bad combination is in tasks where it's not realistic for the user to utter the full complex request at once, where the user must consult intermediate outputs to determine the next action. In that case you're in a really vicious scenario: the density of voice input is not really necessary, while the low density of voice output slows the task down dramatically.

For example, think of a use case where you're booking a flight: "I want to see flights from San Francisco to New York".

It's not really possible for the user to fully define a booking in their initial voice input - the user would reasonably want to review choices, pick seats, etc, necessitating that the task be multi-step. Now imagine if voice was the exclusive modality - the UX would be positively painful.

> "that's the future of most/all consumer interfaces with AI, no?"

And this is why I disagree with this statement. I think the idea that voice is the dominant user interface is far from obvious, especially when it comes to AI systems.

LLM hype men tend to hand-wave around the complexities of this: "the AI will automatically pick the best possible flight for you! Why would you even want to review your choices?!" - which conveniently dismisses every area of weakness for this UX with "the AI will do it for you, trust it"... Time will tell, but I suspect it will not work out that way.


The ideal form of the flight booking interaction is that it's basically a computerized travel agent. So it could theoretically ask questions or listen to your objections and refine the request that way.

That could be fine, it's probably more comfortable than using a complicated website at least for some people, but again this is a feature you stick in a phone app, not a dedicated hardware device.


> "The ideal form of the flight booking interaction is that it's basically a computerized travel agent. So it could theoretically ask questions or listen to your objections and refine the request that way."

And that is startlingly worse as a UX than the status quo.

An agent can apply some contextual filtering to improve the initial choices offered to the user - but they can do that in a GUI as well, and I would strongly argue that's better served in a GUI than via voice.

"Given your check-in time and transit from the airport, I think the following 3 flights make sense... [painstakingly list all 3 flights verbally]."

"Oh ok uh can you say the second one again? Was that out of JFK or Newark?"

... etc. Whereas a GUI is easy to parse and presents choices side-by-side in a way that's easy to compare.

This is the dissembling I'm talking about when it comes to some LLM proponents - the idea that the user having to perceive, compare, and analyze information just goes away, poof, because the agent will just... make it no longer necessary.

It's a fundamental misunderstanding of these domains and use cases.

I'll generalize my prediction a bit more: an intelligent agent applying its contextual knowledge to a GUI is likely going to be overwhelmingly better as a UX than an intelligent agent that largely interacts verbally.


Image is the highest-density method of communication we have (a huge portion of our brains is dedicated to just this), but interfacing with, responding to, and reacting to this information via a crippled gesture system and voice is a huge bottleneck.

So this can be avoided by intelligently reducing the dimensionality of the incoming data (potentially with AI) and/or by increasing input bandwidth.


IMO, it's just not something that can be improved no matter how much effort is put into it. Speech I/O is unreliable for anything serious, not to mention awkward in public; it's also easily attenuated in very trivial situations.


The ReactJS + TypeScript + Rust + LMDB combo just works so well for me.
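
The glue between the Rust and the React/TypeScript side is basically just wasm-bindgen exports (a minimal sketch; the function name below is hypothetical), and wasm-pack generates the TypeScript definitions so the calls stay fully typed:

    // Minimal wasm-bindgen sketch: expose a Rust function to the
    // React/TypeScript frontend. The function name is illustrative only.
    use wasm_bindgen::prelude::*;

    #[wasm_bindgen]
    pub fn normalize_query(input: &str) -> String {
        // CPU-bound work stays in Rust; JS just calls the generated binding.
        input.trim().to_lowercase()
    }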


I'm really interested in this. Closest thing we've got is https://www.kialo.com/do-aliens-exist-1258

