On Firefox 4 Day, Chrome 11 Hits Beta With The Ability To Talk To Your Computer (techcrunch.com)
61 points by bkudria on March 23, 2011 | 33 comments



Sadly, my computer still doesn't respond when I say "Computer, make me a sandwich." (Yes, I tried sudo.)


Making a sandwich is hard for a computer; I think you should ask for a glass of water.


Upvoted because you gave me my next project. A robot that makes sandwiches, and serves me glasses of water.


Please, please, please name it Alfred.


I think the ability to automatically update people's browsers is an extremely valuable weapon of Google's. Compared to the slow, manual releases of FF and IE, Google surely has the advantage that new features will be seen by the public earlier on Chrome than elsewhere.

Does anyone know of Mozilla or Microsoft having any intention of switching to a more continuous release cycle?

(I believe Google uses http://code.google.com/p/omaha/ to push their updates out - quite a fascinating piece of software.)


There was a thread (AMA) with the Firefox guys on reddit last week, and a continuous/fast release cycle is one of the things they want to focus on now. They want to start using it this year.


Mozilla are moving to a more Chrome-like schedule, and I believe they’re going to implement autoupdates as well. They’ve currently got 3 more releases pencilled in for this year.


Has anyone found a way to either A) Activate the mic control from javascript or B) Enable continuous recording once the user has initiated recording?

I understand the security implications of A. However, B would be an acceptable alternative since the user initiates it. I'm imagining a vocal interface that requires minimal user keyboard interaction.

I found an attribute called "continue", but it doesn't seem to do anything special when added to the input element.


If you looked at the image you'd notice it's simply an HTML tag. I'd assume the data wouldn't be accessible with javascript.

That's the way it should be; just imagine the vulnerabilities that JS-enabled microphones would open up.
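
For anyone who wants to poke at the tag being discussed, here is a minimal sketch of the markup, assuming the attribute is the vendor-prefixed `x-webkit-speech` (that is how I remember it from the Chrome 11 beta; the helper name `speechInputMarkup` and the field name are made up for illustration). It only builds the HTML string, so it does not touch the microphone itself.

```typescript
// Sketch only: builds the markup for a speech-enabled text field.
// ASSUMPTION: `x-webkit-speech` is the attribute name used by the Chrome 11 beta.
// `escapeAttr` and `speechInputMarkup` are hypothetical helpers, not a browser API.

// Escape characters that would break out of a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Return an <input> tag that asks the browser to show its microphone button.
function speechInputMarkup(fieldName: string): string {
  return `<input type="text" name="${escapeAttr(fieldName)}" x-webkit-speech>`;
}

console.log(speechInputMarkup("q"));
// prints: <input type="text" name="q" x-webkit-speech>
```

As far as I can tell, the page never sees the raw audio; the recognized text just ends up in the field's value like typed input, and the user still has to click the microphone icon to start recording.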


I talk to my computer all the time, never needed Chrome.


I tend to curse my computer a lot for reading my programs wrongly.


Actually kinda cool.

Does anyone actually use Voice Control on an iPhone, or any other speech recognition tech? I've never seen it used in real life except in IVR systems.


The time picker widget on stock Android is pretty horrid, so I regularly use voice recognition to set alarms. It's much easier to long-press a single button and then say "Set alarm for 8:30 PM."


I've tried to use it on my Android phone, but it hates my English accent. If I talk to it in a borderline-racist American accent, it does better, but not enough for me to actually get anything done.


As an American, I don't think I've ever seen anyone offended by an attempt at an American accent. I actually find it hilarious to hear British people attempting an American accent, because it's one of the rare times you can actually get a sense of what your own accent sounds like. The minor differences are accentuated when someone tries hard, and it's much easier to notice them.

Back on topic, didn't Google roll something about Voice Command customization out a while ago? I'd been using it when I can so it would "learn", but I guess that's pointless if it doesn't. It does seem to have gotten better for me since the start, but I don't know if that's personalization or just Google improving it. I mostly use it for searching or dialing, so I can't speak for its accuracy at longer commands like texting, however.


I'm curious which American accent sounds "borderline-racist".


Yes. I use it to make calls and control music without taking my gloves off on a cold day.


I love voice control with my iphone. I wear a bluetooth headset a lot, it's nice to be able to just touch the headset, say "call <person>" and it'll call them.


Oh, you're one of Them.


I use it all the time. Long press the search key from any app, say "Call Papa John". Does a GPS fix, search, lookup and call all in about 3 seconds.

I didn't know you could use it for alarms. That's brilliant and would be great for one-off alarms.

I also use it for short text messages if I don't have a second hand free or can't type it out for whatever reason. It works reasonably well, but I've been letting it adapt to my voice pattern so maybe it works better for me than others.


I created a Voice Search Chrome extension back when Chrome first received speech input functionality that may interest some of you. It's on the Chrome Web Store at https://chrome.google.com/webstore/detail/hhfkcobomkalfdlmko...


hey i tried this out quickly. pretty good. two thoughts:

1) is it me, or does it take two clicks to go from the plugin icon to actually accepting voice input?

2) android's voice search has both visual and audible cues to indicate to the user that it's actually picking up your speech, finished, and is now thinking about it. that'd be excellent.


1. To initiate speech input, there needs to be direct user interaction on that microphone icon. I wish there was a way for extensions to initiate it manually.

2. This is a bug in Chrome: the extension renders in a popup bubble, and the usual speech indicator is also a popup bubble, so the indicator won't render because only one popup bubble can be displayed at a time.


>If you’re running Chrome 11, you can try it out here. It works very well. You speak, and the browser is able to transcribe what you say. No Flash, no plug-in. Yep. Awesome.

Is this still making requests to Google's servers to do the transcription? Personally I think I'll take the resource-hungry plugin over a web API.

Well, on a computer. On a phone, cloud voice input is fantastic.


Does Chrome 11 support h.264 <video>?


11.0.696.16 dev played back an H.264 video. Yes.


Does the speech recognition computation happen in the browser code, or do they send the audio to Google's servers, which translate it into text and then send it back to the browser?

The latter is how voice recognition works on Android, and why it only works when you have an active internet connection. It seems a little weird to me to build an html5 standard that requires server side computation. How is a non-profit open source browser going to fund the massive server load needed, not to mention the R&D needed to develop a voice-to-text translator? Google’s stuff is proprietary.


Seems rather random to me. Amusingly, when I said "Munich" it understood "New York" - so perhaps somehow it already realized it is a city :-)

I have had a Nexus One for over a year now, and all the speech recognition does is annoy me if I accidentally hit the microphone icon and have to cancel it.

To be fair, I am not a native speaker of English, maybe it works better for native speakers.


I have been a hardcore Firefox user and fan but Chrome was something I was really looking for. A browser with no confusing colors, hundreds of extensions and more.

I downloaded Firefox 4 and I liked it, but after a couple of minutes I couldn't resist going back to Chrome.


I've been working on a plugin using the new sidebar extensions API, and I'm pretty sure this update broke the API :( Planning to file a ticket in the Chromium bug tracker soon.


I made this quick demo of using it for wikipedia search using a bunch of different public APIs: http://bodytag.org/perch/


Oh! IE is just 9. Firefox is just 4 and we have Chrome ELEVEN.


In this day and age, I feel weird talking to my computer alone in my office.



