
Right, if people want to talk about how they are worried about a future with superintelligent AI, that's something I think almost everyone can agree is a worthy topic, maybe to different degrees, but that's not the issue in my mind.

What I think I see a lot of are people who, because of their fear of a future with superintelligent AI, try to deny the significance of the event, if only because they don't _want_ to wrestle with the implications.

I think it's very important we don't do that. Let's take this future seriously, so we can align ourselves on a better path forward... I fear a future where we have years of bickering in the public forums over the veracity or significance of claims, if only because this subset of the public is incapable of mentally wrestling with the wild fucking shit we are walking into.

If not this, what is your personal line in the sand? I'm not specifically talking to any person when I say this. I just can't help but feel like I'm going crazy, seeing people deny what is right in front of their eyes.


It basically comes down to the fact that the AI that exists now will enrich and empower corporations and the government, but won't do much for anybody else.

The pro-AI astroturfers are building the popular consensus of acceptance for what those in power will use AI for: disenfranchisement and oppression. And they are correct because the capabilities of AI right now will enable that, as stated above.

The AI denialists are correct as well: current AI isn't what it is popularly billed as. The CEOs are falling over themselves claiming they can cut headcount to zero because of their visionary implementation of AI. It's the old "prototype/demo but not the real system" snow job in software sales, all over again.

Any actual benefit of AI to the common man comes with the current state of how tech companies "benefit" a consumer: with absurd degrees of privacy invasion, weaponized psychological algorithms, attention destruction, etc.

Here's a fun startup that is invariably in the works: the omnipresent employee-monitoring AI. Every click you make, every shit you take, every coffee you sip, and every meeting you tune out. Your facial expressions analyzed, your actual "passion" measured, etc. Amazon is already doing 80% of this without AI in its warehouses.

The only saving grace here is Covid and WFH, where they don't have the right to intrude on your workspace. So next time you hear about the return to office, remember what is coming...


Reading the paper, they mention the robotic control is handled by RT-1:

> The low-level policies are from RT-1 (Brohan et al., 2022), a transformer model that takes RGB image and natural language instruction, and outputs end-effector control commands.

For those that don't know, RT-1 (Robotics Transformer) is previous work from the team that converts natural language instructions (plus camera images) into low-level robot control commands.
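To make the interface concrete, here's a toy Python sketch of the kind of policy signature the quoted sentence describes: an RGB image plus a language instruction in, an end-effector command out. The names, fields, and placeholder logic here are purely illustrative and are not the actual RT-1 API (the real model decodes discretized action tokens with a transformer).

```python
from dataclasses import dataclass


@dataclass
class EndEffectorCommand:
    """One control step: position/rotation deltas plus gripper state.
    Field layout is a hypothetical simplification for illustration."""
    dx: float
    dy: float
    dz: float
    droll: float
    dpitch: float
    dyaw: float
    gripper_closed: bool


def rt1_style_policy(rgb_image, instruction: str) -> EndEffectorCommand:
    """Toy stand-in for a transformer policy. A real model would
    tokenize the image and instruction and decode action tokens;
    here we just return a fixed command so the sketch runs."""
    return EndEffectorCommand(
        dx=0.0, dy=0.0, dz=-0.01,
        droll=0.0, dpitch=0.0, dyaw=0.0,
        # Trivial placeholder: close the gripper when the
        # instruction mentions picking something up.
        gripper_closed="pick" in instruction.lower(),
    )


cmd = rt1_style_policy(rgb_image=None, instruction="Pick up the sponge")
print(cmd.gripper_closed)
```

The point of the sketch is only the shape of the interface: the policy is conditioned on both perception and language at every step, which is what makes it natural to sit underneath a higher-level planner.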

You can read more about RT-1 here:

https://ai.googleblog.com/2022/12/rt-1-robotics-transformer-...

Maybe I'm missing something, but this sounds quite generalizable.


RT-1 is trained on a large amount of very specific, mostly pick-and-place data. That's the domain it is an expert in. Unfortunately, there are only so many things you can build with go-to/pick-up/place-level instructions. Anything beyond that, into the fine-manipulation domain you'd need in a real kitchen, is still absent.

This was a general disappointment in the robotics community: they had a LOT of funds for collecting robotics data, and they spent it on somewhat trivial tasks that we had almost solved already with much smaller and more principled models, instead of getting human demonstration data for more complicated tasks.


Right, but is there any reason this architecture won't work with more diverse data? Fundamentally, it seems like their research is benchmarked around pick-and-place, so it makes sense to me that they would want to prove out that transformer models could work in this space. Knowing transformers, it's probably safe to say that with more diverse training data, it will be able to scale to more complex controls in more complex embodied robots.

While I would love to see them working on robot chefs, I can appreciate that they want to start small. And regardless, it doesn't seem like there are any constraints outside of data for this architecture to work in more and more domains.


> Knowing transformers, it's probably safe to say that with more diverse training data, it will be able to scale to more complex controls in more complex embodied robots. While I would love to see them working on robot chefs, I can appreciate that they want to start small.

Except they are (were?) one of the biggest spenders in the entire field. While it may seem easy to hand-wave "just add more data!", robot data is far more expensive to collect than language/image data, since people don't generate it naturally as they browse the internet. If their current operation was already too expensive for Google to keep running (as evidenced by Google shutting down this research arm), imagine what would happen if they proposed "let's spend a couple more orders of magnitude to get data!"

All I am saying is that they are going for an unambitious project that looks cool to outsiders, with buzzwords like LLM. There is work by others [1] that is much more impressive in terms of manipulation diversity, and that would probably be a much better bet for adding more data into, since it shows a much more promising route to the future.

[1] https://twitter.com/chenwang_j/status/1628792565385564160?t=...


I can't really comment on the expenditure or the general strategy of Google's robotics teams. I know they closed down Everyday Robots, but that seemed to be specifically about the hardware.

Their software efforts still seem to be going strong, and I don't really think it's fair to say that demonstrating transfer learning, in an embodied model no less, is unambitious; nor does that seem to be reflected in the results. But to be honest, I'm more or less a layman when it comes to this, so I'll defer to you and keep what you're saying in mind.

To that end, if you're still feeling up to it, maybe you could tell me what your thoughts are with efforts like this from Google:

https://diffusion-rosie.github.io/

It seems to compare (somewhat) to MimicPlay, in that it attempts to create more data for "cheap", even if it's not "real" control data.


I'm sorry, I like riffusion as much as the next guy, but in what world is any riffusion example better than the literal first example on this demo page?


Might be personal preference, but I don't think those examples are good at all. The only thing I am impressed with is the story mode and conditioning. I typically use riffusion to generate swing/electropop music, could be I'm too biased.


This isn't a very compelling argument. First of all, they aren't a "mish-mash" in any real way; it's not like snippets of images exist inside the model. Second of all, this is entirely subjective. Third of all, it's entirely inconsequential: if these models create 80% of the video we end up seeing, is it going to matter whether you think it's a tasteful endeavour?


But there have already been quite a few scientific papers that used discoveries from AlphaFold. Many scientists who had been stuck for years are suddenly past their previous bottlenecks. What gives you the impression that it hasn't helped us?


I am not saying that AlphaFold won't help scientists publish papers. I am just skeptical (though still hopeful) about it doing anything to improve the human condition by actually making human existence better. Publishing papers can be of neutral or negative utility in that realm.

