AI Demos (meta.com)
312 points by saikatsg 36 days ago | 124 comments



It's a tool box of demos with the following:

Segment Anything 2: Create video cutouts and other fun visual effects with a few clicks.

Seamless Translation: Hear what you sound like in another language.

Animated Drawings: Bring hand-drawn sketches to life with animations.

Audiobox: Create an audio story with AI-generated voices and sounds.


> This research demo is not open to residents of, or those accessing the demo from, the States of Illinois or Texas.

Not accessible if you're in Illinois or Texas.

They must have anti-AI laws, probably aimed at voice conversion more so than image segmentation or cartoon animation.

Hopefully the lawmakers see beneficial use cases and fix their laws to target abuse instead of a blanket coarse-grained GenAI restriction.


Illinois has laws against biometrics, which basically can be interpreted as broadly as anything that even looks for a face as a binary classifier. The translation demo uses video, intended to be your face.

Knowing meta they save all of it.


"knowing meta" - as if any company working on AI isn't saving all the training data they can.


Anthropic claims otherwise


Texas sounds reasonable in general. I've written license terms which exclude Texas. That's home of the patent trolls.

Heartland v. Kraft Foods is worth a read.


Texas has the “Capture or Use of Biometric Identifiers” act. It’s very similar to the Illinois act that requires consent etc. Although it’s been on the books for a long time, Texas AG Paxton really only started enforcing it in 2022, 14 years after the law first appeared. The first target was Meta.

In this case it’s not the patent trolls, but the biometric collection acts shared by Illinois and Texas.

Aside - if you use Clear for airport security in those states, you get an additional consent screen. It seems like about 50% of the time the Clear employee clicks through the consent screen before you can read it. I imagine this does not fulfill the legal requirements when that happens.


The provisions here don't seem like an unreasonable ask, really:

https://statutes.capitol.texas.gov/Docs/BC/htm/BC.503.htm


This provision is going to throw a monkey wrench into Texas's new Electronic Genital Verification system at the Dallas Forth Worth Airport. I guess they're going to have to go back to the manual genital verification system, and hire thousands of Genital Enforcement Inspectors to hang out in the bathrooms, addressing the biggest problem that threatens society today, by diligently saving the children and protecting the rights of women from trans people and drag queens who need to take a dump and would prefer not to go outside on the lawn or on the windshield of a CyberTruck (so easily confused with a prison toilet). At least that will create many new jobs for ex-cons and unskilled American citizens who can't find work elsewhere because immigrant terrorists took their jobs.

https://s.hdnux.com/photos/01/47/20/26/27067779/3/ratio3x2_9...

SECURITY NOTICE

Electronic Genital Verification (EGV)

Your genitalia may be photographed electronically during your use of this facility as part of the Electronic Genital Verification (EGV) pilot program at the direction of the Office of the Lieutenant Governor. In the future, EGV will help keep Texans safe while protecting your privacy by screening for potentially improper restroom access using machine vision and Artificial Intelligence (AI) in lieu of traditional genital inspections.

At this time images collected will be used solely for model training purposes and will not be used for law enforcement or shared with other entities except as pursuant to a subpoena, court order or as otherwise compelled by legal process.

Your participation in this program is voluntary. You have the right to request the removal of your data by calling the EGV program office at (512) 463-0001 during normal operating hours (Mon-Fri 8AM-5PM).


"I saw it on the internet, so it must be true!"


They don't. The complication -- in both directions -- is "record of hand or face geometry"

If I take a photo of you, that's a record of face geometry.

The Meta FAIR demos turned on my webcam because I didn't notice that was enabled when allowing audio. They grabbed a photo of me without my permission, with no purpose as far as I can tell. That should be illegal.

However, posting a photo of a public space in a news article? That seems to fall under the same provision.


I'm in Nebraska — but I think, due to my ISP, I appear to be in the Chicago area. Oh well.


Sounds like your ISP needs to update their IRR and RIR records


For mobile data, there might not even be an Internet gateway in every state, so this entire idea seems a bit ridiculous. iCloud Private Relay also regularly "crosses state lines" that way.

Of course, if this trend of state-specific restrictions continues, networks might want to invest in actually having per-state IP ranges.
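If networks did publish per-state ranges, mapping a request to a state would just be a prefix lookup. A minimal sketch using Python's stdlib `ipaddress` module; the prefixes and state assignments below are made up for illustration, not real allocations:

```python
from typing import Optional
import ipaddress

# Hypothetical per-state prefixes -- illustrative only, not real allocations.
STATE_RANGES = {
    ipaddress.ip_network("203.0.113.0/25"): "TX",
    ipaddress.ip_network("203.0.113.128/25"): "NE",
    ipaddress.ip_network("198.51.100.0/24"): "IL",
}

def state_for_ip(addr: str) -> Optional[str]:
    """Return the state for an IP, or None if it falls outside known ranges."""
    ip = ipaddress.ip_address(addr)
    for network, state in STATE_RANGES.items():
        if ip in network:
            return state
    return None

print(state_for_ip("203.0.113.200"))  # a Nebraska-range address in this toy table
```

The lookup is the easy part; the hard part is keeping the registry (IRR/RIR) data accurate enough that geolocation providers actually pick it up.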


Seamless translation is... Pretty incredible.

I speak English and Spanish, so I recorded some English sentences and listened to the Spanish output it generated. It came damn close to my own Spanish (although I have more Castilianisms in mine, which of course I wouldn't expect it to know)


A real test here would be to give it to my friend from Mendoza, Argentina.

I'm bilingual and still can't understand him. I'm not even sure half the things he says are actual words.


I tried it and it sounded nothing like me at all - just some random "generic" male voice that translated what I said into German. My wife put it as "that's shit - sounds nothing like you". Nuff said.


Same for me.

I also tried speaking German and translating it to English and when I said "Hallo ich wollte das nur mal ausprobieren" (Hello I just wanted to try this out) it translated it to "Hi, how are you? Do you know anyone who quit smoking?".

I feel gaslit.


I don't recommend using gas lighting near your lit cigarette.


I translated from French to English and vice versa and the voice sounded nothing like me in either case. The English to French translation also made me sound about 90 years old.


Same for me. I'm a man with a relatively deep voice. The translation was read out by some generic female AI voice.


I think you clicked the wrong recording. The generic female AI voice is the translation of what you said.


which is good. do you really want a deep-fake that no one can distinguish?


If that’s how it’s being advertised, and that’s the reason people are giving it a shot based on that advertising, then I certainly do! And so, I imagine, did the people who have left feedback so far!


Being good would be bad, therefore being bad is actually good.


Did it _sound_ like you though? It doesn't sound remotely like me.


It didn't really, the first time. I recorded a second one and enunciated really strongly (and said more) -- that yielded the positive results.


Whether "we're there yet" on translation technology is still debated, but at some point we'll consider it "good enough" for most practical use cases, truly removing the linguistic barrier. This is actually both terrifying and exciting, because then it'll definitely start influencing spoken language to at least some degree.


It depends how much tolerance you have for mistakes. For a waiter or asking directions or things like that, 100% this works great. For a diplomatic discussion where nuance is very important however... It also doesn't work great for translating works of art where the translation itself is open-ended and can be done in a bunch of different ways and requires a lot of editorial/artistic decisions from the translator.


Unfortunate that the examples they provide were absolutely terrible and robotic.

It put me off from actually trying it, but I might reconsider.


Is this subject purposely spelled Aidemos somewhere like the HN title says instead of AI Demos?


HN automatically recapitalizes words in submission titles so I think it’s possible this could have been submitted as “AIDemos by Meta”.


Ahh I see. Thanks for the info!


At least it's not AI Demons


Aidemos... the Greek god... of intelligence...?


Fixed.


The seamless translation demo is fantastic. The translated voice is passable for my own native voice. It will be incredible when we can achieve this in real time.


We can! At Kyutai, we released a real-time, on-device speech translation demo last week. For now, it is working only for French to English translation, on an iPhone 16 Pro: https://x.com/neilzegh/status/1887498102455869775

We released inference code and weights, you can check our github here: https://github.com/kyutai-labs/hibiki


Good work. The delay seems to be around 5 seconds. This is a step in the right direction; I'm wondering how much more real-time we can push it.
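Roughly, the floor on that delay is lookahead plus compute plus buffering: the model has to hear enough source audio to commit to a translation before it can start speaking. A back-of-the-envelope sketch with made-up numbers, just to show where a ~5 second figure could come from:

```python
def streaming_latency(lookahead_s: float, inference_s: float, buffer_s: float) -> float:
    """End-to-end delay for chunked speech translation:
    audio the model waits to hear, plus time to translate it,
    plus audio buffered before playback starts."""
    return lookahead_s + inference_s + buffer_s

# Illustrative numbers: 3 s of lookahead context, 1.5 s inference, 0.5 s buffer.
total = streaming_latency(3.0, 1.5, 0.5)
print(f"{total:.1f} s")  # 5.0 s
```

Shrinking the lookahead term is the hard part: translate too eagerly and the model commits before the source sentence has disambiguated itself.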


Damn, this is pretty amazing. Feels like we’re not far off from the babel fish.


What is Meta’s angle with AI? They seem to be doing a lot of research but what is the end goal? Google and MSFT I understand, Meta not so much.


Meta believes the dollars at the end of the AI race will be in walled gardens and prop data, not data centers and models.

They are going to do everything they can to make sure no one uses the window in which models and data centers are the limiting factors to disrupt them.

In the same way google demonetized the application layer of the web to prevent walled gardens from blocking search.

If models and hardware become commoditized at the end of the race meta will have a complete psychographic profile of people on an individual and group level to study, and serve incredibly targeted content to.

Their only real competition in that would be someone developing a 'her' like app that takes people out of social media and into their own individual silo'ed worlds. In a lot of ways discord is the alternative world to meta's ecosystem. hyper focused invite only small communities.


> Their only real competition in that would be someone developing a 'her' like app that takes people out of social media and into their own individual silo'ed worlds

I take it you have not tried the new Gemini models on ai studio? It does real time streaming video input and conversation you can genuinely ask it questions about what you are looking at in a conversational audio in-out way. This is basically "her"-level technology in an unpolished form, right here today.


Her is about a lot more than just asking questions in pure audio. ChatGPT has also had this for a little while.


Not really. Toss a scheduler in and some RAG to remember conversational stuff and that's about it.


ChatGPT has been doing this for ages. Is the Gemini version drastically different or something?


Gemini is capable of video - you can point your phone camera and talk about something you show Gemini in real world. My ChatGPT app can do just audio conversation.


OpenAI had demoed it some half a year ago, but access since then was limited. I got access to it just last week (via ChatGPT app). Since I'm in Poland, I kind of assumed US users had this for at least a month, but maybe they roll out their features by different criteria than just geography.


Chatgpt can do that as well.


> Meta believes the dollars at the end of the AI race will be in walled gardens

Will those walls keep AI-generated content out, or will they keep the people outside from accessing the AI-generated content in the garden?

If it's the first, somebody should tell them the slop's already up to their navels and they probably shouldn't be helping people generate more of it.

If it's the second, then the models that supply the content to the garden must have some kind of uniqueness/value, because otherwise you could get identical content from anywhere.

This is a genuine question, because I don't understand the logic here.

(I had assumed it was more like hardware companies funding open source way back when - Commoditize Your Complement).


> If it's the first, somebody should tell them the slop's already up to their navels and they probably shouldn't be helping people generate more of it.

One would imagine Meta can readily quantify how much AI-generated content is consumed across its properties.

Meta's play is simple: more engagement means more money for Meta, and this can be done by "slop" as you called it, or alternatively expanding the audience of high quality human-generated content, say via translation. A funny video in Albanian is probably still very funny after being translated to English.


> walled gardens

Apple tried that and it’s crumbling. Meta/Zuckerfuck is always behind the curve.

- AR (failed)

- “metaverse” (failed)

The only thing that has kept them above water is social media and selling off user data, and that’s crumbling as well. Smaller players have been eating their lunch and the user base is aging out.


Yeah, their stock is WAY over inflated. I know their data wells are drying up fast. The long bets aren't working out. The AI stuff is neat, and certainly disruptive, but it isn't a paying bet.

The writing is on the wall, and his "falling in line" with the political climate speaks volumes about his effort to keep Meta afloat.


I was the biggest Meta naysayer, given they've never released an original product to date. But there's no denying that they have a money tree in the ads business.


What are you talking about? Insane YoY rev growth. Still on a hockey stick growth curve. Lower PE than Apple. Best FCF in the biz. Well positioned to take over VR if it becomes a thing. WhatsApp is ripe for monetization.

Talk to any staff+ eng at Meta in Ads and they will tell you there's a lot of low hanging fruit left. Sure the music will stop eventually (it always does) but there's no evidence that's soon.

People need to separate their hatred of Meta/Zuck from an objective analysis of the company. Meta has been and continues to be an amazing stock to own.


> Well positioned to take over VR if it becomes a thing.

This is an incredibly generous way to admit that Meta failed their pivot to VR and they will probably never recoup the tens of billions of dollars that was spent on it.


I actually did.

They were instead talking about the implications of GDPR, how they are switching to secure multiparty computation to try and sidestep restrictions, the looming threat of other data restrictions coming onto the scene soon internationally, the aging userbase, the concern that they can't trace who is buying what via ads anymore (i.e. did that sneaker ad result in a Nike purchase), etc. They either didn't have any low hanging fruit left, or were certainly tight-lipped about it.


so in other words, "better targeting"? that's it?


Better targeting

Better moderation (to the extent they still care)

Generation of AI slop for the sheep to feed on

Use of AI is really core to their business, so understandable they want to build it themselves, but not so clear why they want to "open source" (weights) it other than to harm companies like OpenAI


Is something like automated personalized content creation (for ads) better targeting? Or is it qualitatively different?

I personally think that the population scale surveillance and behavioral manipulation infrastructure built by meta is unethical and incredibly dangerous.


In the same way that an atomic bomb is “just” a better bomb.

I keep telling parents that Meta et al are spending the inflation-adjusted equivalent of the Manhattan project — not to defeat Japan — but to addict their child.


If you know how these algorithms work, and you can be intentional about what you want, seeding with a few well-thought-out examples and curating recommendations, they can be quite useful.

I think atomic power or even better drugs / medicines is actually a good analogy considering the dual use nature of the stuff that they are building. Can improve quality of life if used prudently and responsibly, or cause devastation if not.


Has Meta done anything else?


https://gwern.net/complement

Joel Spolsky in 2002 identified a major pattern in technology business & economics: the pattern of “commoditizing your complement”, an alternative to vertical integration, where companies seek to secure a chokepoint or quasi-monopoly in products composed of many necessary & sufficient layers by dominating one layer while fostering so much competition in another layer above or below its layer that no competing monopolist can emerge, prices are driven down to marginal costs elsewhere in the stack, total price drops & increases demand, and the majority of the consumer surplus of the final product can be diverted to the quasi-monopolist. No matter how valuable the original may be and how much one could charge for it, it can be more valuable to make it free if it increases profits elsewhere. A classic example is the commodification of PC hardware by the Microsoft OS monopoly, to the detriment of IBM & benefit of MS.

This pattern explains many otherwise odd or apparently self-sabotaging ventures by large tech companies into apparently irrelevant fields, such as the high rate of releasing open-source contributions by many Internet companies or the intrusion of advertising companies into smartphone manufacturing & web browser development & statistical software & fiber-optic networks & municipal WiFi & radio spectrum auctions & DNS (Google): they are pre-emptive attempts to commodify another company elsewhere in the stack, or defenses against it being done to them.


Great question, I was wondering about that. I think it's mostly in the discovery phase right now, similar to how they dabbled in crypto before, and the by-now largely finished "metaverse" experiment (yes, this dabbling sometimes involves a ton of money). These demos actually show what they might end up using AI for. But whether it's truly game-changing for their business, and whether it will be good for regular users, is still an open question: their UIs in both FB and even Instagram are grossly obsolete, haven't changed in over a decade despite 70,000 people working there, and are nowadays mostly focused on violently shoving in more ads over actual usefulness.

If their business remains a shitty declining buggy 20-year-old Facebook and a 10+year-old Instagram app, but they contribute to advancing open source models similar to how they did with React, I'll consider that a net win though.


After the 'metaverse' stuff flopped, desperate to spend their money on some other thing that might be The Future(TM)?

Arguably this would be kind of rational behaviour for them even if they thought that LLM stuff had a low chance of being the next thing; they have lots and lots of money, and lots of revenue, so one strategy would be just to latch on to every new fad, and then if one is a real thing they don't get left behind (and if it's not, well, they can afford it).

My suspicion is that this is where most Big Tech interest in LLMs comes from; it's essentially risk management.


Paraphrasing from someone who is involved in this - their angle in AI is better targeting of Ads - better classification, clustering, better "recommendations" for the advertiser, including visuals, wording, video etc.

These and others are just side benefits or some form of "greenwashing". Meta's main (and only) business is advertisement. They failed to capitalize on everything else.


Enabling experiences with AI that will drive people sharing content with each other, communicating online, and which can be utilized in AR/VR, where they have a lead position. In-house AI improvements have also helped ad placement and ad generation for clients

People who think Meta's main business focus is Facebook and Instagram don't pay attention.


What makes you think that more artificial stuff is going to reinvigorate the business? Metaverse was supposed to be such savior, but this time they didn’t even rename the company…


Just as an example, there's some pretty funny AI-assisted memes people pass around. The Harry Potter Balenciaga fashion one was a while ago and an example I remember.

Also, the business doesn't need to be reinvigorated. It is booming and they are investing in places to stack more gains down the road & cement current status, which investors like to see. Many big techs right now are flailing and trying to artificially keep profits up by slashing costs only and not increasing revenue, which dents innovation. Meta is managing to sink money into AR/VR AND AI while seeing big revenue growth.


Let's not pretend that AR wearables aren't the future of personal computing.


It’s possible of course, but will it be Meta? Who knows.


Money and manipulation? Was that a real question?


Yes, that's a real question, even for the money and manipulation use case, how does this help, especially the money part?


all math leads to cryptography; all media leads to ads (?)


You forgot "fucking over the competition".

Not that I'm complaining about their open-weights model releases destroying openai's moat... but still.


AI make stock go up.

I think this is it. I'm kicking myself for not going harder, but I was very much into LLMs/ML back in 2019, had I not given up I might have a startup right now.

I'd need like 70k and a minimum of 6 months, but I still have a few ideas for AI driven startups.


Generated content is my assumption. Both, by users but also fully automated.


I don’t think anyone wants generated content in their IG/FB feed, so not sure how this will play out in the long run


Correction: Nobody wants content that they can tell is AI generated.


Can’t wait until my inactive instagram account starts posting AI photos of my kids!


Somewhere, a hopeful startup founder scribbles furiously in a moleskine notebook.


People say that, yet how many likes and reshares does said generated content get?


My assumption is that 90%+ of those come from 1. bots 2. old people 3. third world. I don’t think this is the target audience most valuable advertisers are going for, and this type of slop probably makes other audiences want to leave the platform. So in the short term maybe it’s great for engagement metrics and stuff like that, but I don’t think it’s financially sustainable.


Sadly, I don't think they care much about what "everyone wants" because with a userbase this size they will figure out a way to forcefully shove whatever they come up with into people's faces.


What is MSFT and Google's reason?


Both do search, devices, OS and browsers - very natural verticals to integrate with AI, and both have cloud platforms where they can sell it to developers.

With Meta I can’t think of a single existing vertical where AI would be desirable. Maybe Quest


Meta is aggressively pushing open source AI so as to not get annihilated by closed sourced AI that is being researched by MSFT and Google


Advertising?


I'm pretty impressed with the segment anything[0] demo, is this integrated into an actual product anywhere? I do some simple video editing for friends as a hobby and can see some of this be pretty useful.

[0]https://sam2.metademolab.com/


Photoroom [0] is from Y Combinator and their product is essentially SAM plus a lot of polish along with a good user experience. I'm not sure if they're using it, but if they're not, I think they should be.

[0] https://www.photoroom.com/


SwarmUI, a front-end for image generation models, has integrated SAM2 as a quick way to mask parts of an image for things like inpainting. It's wonderful.


It probably is, but you won't hear it advertised as such.


If anyone else is wondering, Meta FAIR stands for "Facebook Artificial Intelligence Research" and has since been renamed to "Meta AI"[1].

[1]: https://en.wikipedia.org/wiki/Meta_AI


It's not exhaustive. For example, it's missing the Meta Motivo demo at https://metamotivo.metademolab.com/ (humanoid control model).


Meta deeply comprehends the impact of GPT-3 vs ChatGPT. The model is a starting point, and the UX of what you do with the model showcases intelligence. This is especially pronounced in visual models. Telling me SAM2 can "see anything" is neat. Clicking the soccer ball and watching the model track it seamlessly across the video even when occluded is incredible.
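That click-to-track interaction is, at the control-flow level, a prompt-then-propagate loop. The sketch below uses a stub in place of the real predictor (the actual API lives in Meta's `sam2` repo and differs); only the shape of the interaction is meant to be accurate:

```python
from dataclasses import dataclass

@dataclass
class Click:
    x: int
    y: int
    positive: bool = True  # positive click = "this pixel is the object"

class StubPredictor:
    """Stands in for a SAM2-style video predictor: prompt one frame,
    then propagate the resulting mask through the rest of the clip."""

    def add_point_prompt(self, frame_idx: int, click: Click) -> set:
        # Real model: encode the frame and decode a mask around the click.
        return {(click.x, click.y)}

    def propagate(self, mask: set, num_frames: int) -> list:
        # Real model: memory attention carries the object through occlusions.
        return [mask for _ in range(num_frames)]

predictor = StubPredictor()
mask = predictor.add_point_prompt(frame_idx=0, click=Click(120, 80))
masks = predictor.propagate(mask, num_frames=90)  # one mask per frame of a 3 s clip
print(len(masks))  # 90
```

The demo's magic is entirely in the second call: one click on one frame is enough, and the model does the per-frame work for you.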


“ Our site is not available in your region at this time.”


Companies have to be very careful with AI products in international markets, and even in some US states, because there's a patchwork of different AI legislation that needs to be checked.

This is why cutting edge models are delayed in certain regions.

The work to verify and document all of the compliance isn’t worth it for various small demos, so they probably marked it as only allowed in the US and certain regions.
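The gating itself is presumably trivial once a region has been resolved for the request; a hypothetical edge check might look like this (not Meta's actual implementation):

```python
BLOCKED_STATES = {"IL", "TX"}  # states whose biometric laws the demo opts out of

def demo_available(region_code: str) -> bool:
    """Hypothetical edge check: deny requests geolocated to a blocked US state.
    `region_code` is whatever the GeoIP lookup returned, e.g. 'TX' or 'CA'."""
    return region_code.upper() not in BLOCKED_STATES

print(demo_available("TX"), demo_available("CA"))  # False True
```

Which is also why a VPN exit in another state trivially defeats it: the check only ever sees the resolved region, not where you actually are.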


Getting this from the US


I get

"Allow the use of cookies from Meta on this browser? We use cookies and similar technologies to help provide and improve content on . We also use them to provide a safer experience by using information we receive from cookies on and off Meta Quest, and to provide and improve Meta Products for people who have an account.

• Essential cookies: These cookies are required to use Meta Products and are necessary for our sites to work as intended.

• Cookies from other companies: We use these cookies to show you ads off of Meta Products and to provide features like maps and videos on Meta Products. These cookies are optional.

You have control over the optional cookies we use. Learn more about cookies and how we use them, and review or change your choices at any time in our . "

should I click on accept?


Same. Texas.


I was getting this from inside the US, however setting my VPN to LA worked to get around it. I assume this is because that's where the Meta engineers are ¯\_(ツ)_/¯

EDIT: Once accessed there is this note:

> This research demo is not open to residents of, or those accessing the demo from, the States of Illinois or Texas.

and I'm in TX


Oh wow, thanks for finding this. I am also in TX. I was going crazy thinking it might be my iCloud Private Relay


I think Texas has some recent law that could be interpreted as being against twinning tech / deep fakes like the voice cloning. ¯\_(ツ)_/¯ seems like a good time to "ask the lawyers" and "not make a not political statement"

Even at a passing glance it would be immediately clear that it's not a real risk of any sort.


Where are all the links to models?



Neat, but I wish Meta would just say what this really is - "please give us some In the Wild data to further train our models on".

I did the same technique years ago for estimating ages. Person uploads an image, helps align 10% of our facial landmark points, and run the estimator. If we were wrong, ask for correction and refine.

It's still cool and all, but meh based on my prior experience.


I expected a lot more.


We can add these to the pile of completely useless AI shit the world built in the last two years. Are people under some kind of spell that forces them to be in awe? Looking at a lawnmower magazine is more interesting than these in terms of utility and interesting tech.


> Our site is not available in your region at this time.

What the shit is this?


Blame your regulators.


I blame meta doing sketchy illegal stuff :D

Like when some junk food isn't available in my country, I think it's probably for the best.


These demos are nothing near sketchy illegal stuff :D


These are all half-baked at best. They are spending so much money on undergraduate-level work. But to be fair, who in their right mind would work for Meta in 2025 if they have the talent?


Of the big companies doing significant work in AI, I'd say Meta is one of the top ones to work at. Even if you're just looking at it from a 'who are the good guys' standpoint.


I’ve never heard anyone say Meta is the good guys. It ranks worse than Oracle in my book.


They are probably referring to open sourcing llama.


It's nice to see indulgences making a comeback. Open-source a few things and the techpriests will turn a blind eye to everything else you're doing.


And it's not even open source at all :D


But… it's not open source.


Meta is easily in the top 5 places to work at in the world, especially if you have the talent.


And have no ethics.


I really wish to see an undergraduate doing this kind of work :P


What kind of undergraduate trains 70B models?



