Hacker News | the_king's comments

I think good names and a good file structure are the most important things to get right here.


I think Aqua v1 had two problems:

1. The models weren't ready.

2. The interactions were often strained. Not every edit/change is easy to articulate with your voice.

If 1 had been our only problem, we might have had a hit. In reality, I think focusing on model errors allowed us to ignore some fundamental awkwardness in the experience. We've tried to rectify this with v2 by putting less emphasis on streaming for every interaction and less emphasis on commands, replacing them with context.

Hopefully it can become a tool in the toolbox.


Looking forward to giving it another try!


you've got nothing to worry about here.

if firebase studio can make a todo app, i'd be surprised. this is the worst "vibe coding" tool i've ever used.


> i've ever used

There are tools people complain about and there are tools nobody uses.


We can do that, with some help from the community.


I'm a paying customer and I signed onto Aqua Voice shortly after your demo on HN.

My experience with it has been overall positive but mixed. I enjoy using it for dictation, but I found that issuing editing commands and having them recognized/executed often took a lot longer than making an edit myself (which I can't do while in dictation mode).

But as a paying customer, seeing you go in this direction is somewhat sad/frustrating. You're abandoning the product I use, and you're saying that if I want to see my platform supported, I or someone from the community has to provide it, for a fully proprietary paid application.

I understand that I'm a minority user, but it's a bit disappointing to read this.


Totally understand, thanks for being a customer. I'm sorry we weren't able to make the web version as smooth as we wanted to.

We do plan to support Linux. This was probably a bit of a blind spot for us: not realizing that anyone running a Linux desktop doesn't even have a system voice tool to fall back on.


What support do you need?


thanks!

I share the same sentiment. I remember thinking in college how annoying it was that I was reading low-resolution, marked-up, skewed, b&w scans of a book using Adobe Acrobat while CS concentrators were doing everything in VS Code (then brand new).

but we do think voice is actually great with Cursor. It’s also really useful in the terminal for certain things. Checking out or creating branches, for example.


Aqua is in another league when it comes to accuracy. I just ran them side by side on a simple question to ChatGPT and here were the results...

Aqua Voice

  What is the first recorded eclipse in human history? I'm not asking when the first one occurred, but the first written record we have of an eclipse.

Windows Voice Typing (v11 24H2, Dell XPS 13 9340)

  What is the first recorded eclipse in human history i'm not asking 1 like the first occurred but the first ridden record we have of an eclipse

Windows' mistakes were:

- "1" should be "when"

- "ridden" should be "written"

- No punctuation
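
For anyone who wants to put a rough number on a comparison like this, here is a minimal sketch of a word error rate calculation. It assumes the Aqua transcript is the correct reference (the actual spoken audio isn't available here), and it ignores case and punctuation, so the missing punctuation is not counted. The function names are just illustrative.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation
    (apostrophes are kept so "I'm" stays one word)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    # Standard Levenshtein distance, computed over words instead of characters.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                            # reference word dropped
                cur[j - 1] + 1,                         # extra word inserted
                prev[j - 1] + (ref_word != hyp_word),   # match or substitution
            ))
        prev = cur
    return prev[-1], len(ref)

# Assumption: the Aqua output is treated as the ground truth.
aqua = ("What is the first recorded eclipse in human history? I'm not asking "
        "when the first one occurred, but the first written record we have "
        "of an eclipse.")
windows = ("What is the first recorded eclipse in human history i'm not asking "
           "1 like the first occurred but the first ridden record we have "
           "of an eclipse")

errors, total = word_errors(aqua, windows)
print(f"{errors} word errors over {total} reference words "
      f"(WER {errors / total:.1%})")
```

A single sentence is far too small a sample to generalize from, but the same function could be run over a longer set of recordings.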


Thanks! We're working on iOS, but it's tough to get the ergonomics right given all of Apple's restrictions and neglected APIs.


Android app please!


I was excited to try this out because I've had a lot of trouble getting the Supabase integrations to work on Lovable and Bolt.new.

Sorry to say that Firebase Studio did an awful job. It did not successfully build even the first view of the app I asked for. It feels like I'm stepping back to release day of GPT-4.

Am I missing a switch to use the good Gemini 2.5 somewhere? I could tell from their response speed that I was not using a thinking model.


We're building something different, but there is some overlap. Aqua is built for max speed, while keeping accuracy high. To achieve that, inference runs in a datacenter (for now).

You can customize Aqua using custom instructions, similar to ChatGPT custom instructions, and get some Talon functionality from it:

In mine, I have:

1. Break paragraphs every three or four sentences.

2. Don't start a sentence with "and".

3. Use lowercase in Slack and iMessage.

4. Here are some common terminal commands...


Thanks!

We're faster, more accurate, and have a streaming option. Aqua can go from key-up to paste in as little as 450ms. Flow was closer to 1000ms in our tests.

Overall, you'll notice we make a few more tweaks to the output than Wispr Flow.

For example, Aqua + Cursor is very powerful: we syntax highlight your transcript. The easiest way to see this is to use streaming mode (double press Fn) + deep context + Cursor and try asking it to change something.

This also works in other "context rich" environments.


Hey, love the what you are building in this category. I've been using a competing product which you know very well about. They advertised about how you can improve your work per minute by dictation, which was the main draw for me because I do a founder. There's a lot of managerial work that I'm doing.

It has been a godsend in terms of increasing my productivity because I no longer have to type. I think your product's accuracy and latency shortening just make this even better. I often use it and then find out, "Hey, I need to make some changes," and I need to re-edit some of the stuff, which reduces the WPM productivity amount. So I think accuracy is definitely key here. Key metric to differentiate a product.

I am pushing this to other colleagues to get them to adopt. One challenge people are saying is that. One is that some people may not be as organized (you know, they might be a lot more organizationally structured in their mind). So for them, they're having trouble - they'd like to write things out, and by the time if things go out of their mouth, you know it's already formulated logical thought. Whereas you know people like me are a lot more verbal vomit type of person. For me it's huge because I say a lot of um like in all the other things I just dump stuff out and then organize it later.

Whereas other people organize stuff in their brain and then dump the information out. So people who do a lot of coordination and just you know so I feel like this could be two different segments to take into account.

Another one that's been fantastic is that we have multilingual colleagues who are speaking in Mandarin or something else and then they speak it and then ask Flow to be sent to translate it to a different language. That part I think has been fantastic.

I think the ability to edit what you wrote with AI is going to be the next key feature. Providing the context in the window is all wiithin the conversation right? For example, you just ask after because what you write out is not the final and you need to do a lot of editing and formatting. Sometimes when you say too much stuff, it's just like a huge jumble paragraph with a lot of fluff words. Make it clear, concise, trim non-effective words. I think those are a key feature because it's not about your productivity, it's about other people being able to ingest your information efficiently. At least that's what I look at from a managerial perspective.

To give you an example, everything I laid out above came from dictation. You can see how this is inefficient. There's a lot of inefficiencies here.


A feature that would be great is snippets, similar to what other tools offer, where you can say "calendar" or "cal" and it gives you the link. If this is possible, I think it would make this fantastic.

Another feature that would be great is being able to have a conversation with an AI model first, refine the output iteratively until you're ready, and then pipe all of that over. Being able to do this through chat, or entirely through voice, would be very good.


Thanks! I'm tempted to try Speech to Text + Cursor/Copilot for development. It's probably the future since most people can speak faster than they can type.


This use case is great, even for people who haven't been interested in dictation before.

