One interesting feature that open weights enable is adding new capabilities (tasks) to these editing models. They generalize quite well from a small number of samples (~30). We talk about it here: https://blog.fal.ai/announcing-flux-1-kontext-dev-inference-...
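To make the "~30 samples" point concrete, here is a rough sketch of the paired data you would assemble to teach an editing model a new task. Everything here (directory layout, file names, the EditTaskDataset class) is hypothetical and only illustrates the shape of the data; the actual fine-tuning recipe is not shown.

    # Hypothetical sketch: ~30 (source image, instruction, target image) triplets
    # is often enough for a new editing task. Names and layout are illustrative only.
    import json
    from pathlib import Path

    from PIL import Image
    from torch.utils.data import Dataset

    class EditTaskDataset(Dataset):
        """Yields (source image, edit instruction, edited target image)."""

        def __init__(self, root: str):
            self.root = Path(root)
            # pairs.json: [{"source": "a_src.png", "target": "a_dst.png",
            #               "instruction": "turn the sketch into a watercolor"}, ...]
            self.samples = json.loads((self.root / "pairs.json").read_text())

        def __len__(self) -> int:
            return len(self.samples)

        def __getitem__(self, idx: int):
            s = self.samples[idx]
            return (
                Image.open(self.root / s["source"]).convert("RGB"),
                s["instruction"],
                Image.open(self.root / s["target"]).convert("RGB"),
            )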
Absolutely. This is the version of Kontext that everyone has been waiting for. It's far more useful now. This is the first of the new generation of imagegens that allows training. Can't do that with Gemini, GPT, MJ etc.
The main question is going to be the software stack. NVIDIA is already shipping NVFP4 kernels and perf is looking good. It took a really long time after the MI300X for the FP8 kernels to become merely OK (not even good, compared to the almost perfect FP8 support on the NVIDIA side of things).
I doubt that they will be able to reach 60-70% of peak FLOPs in the majority of workloads (unless they hand-craft and tune a specific GEMM kernel for their benchmark shape). But I would be happy to be proven wrong, and go buy a bunch of them.
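For what it's worth, the "what fraction of peak FLOPs do you actually get" question is easy to sanity-check per GEMM shape. A minimal PyTorch sketch, where the peak_tflops figure and the shapes are placeholders you fill in from the datasheet (and "cuda" also maps to ROCm on AMD builds of PyTorch):

    # Measure achieved GEMM throughput so the "% of peak FLOPs" claim can be
    # checked per shape. peak_tflops and the shapes below are placeholders.
    import time
    import torch

    def achieved_tflops(m, n, k, dtype=torch.float16, iters=50):
        a = torch.randn(m, k, dtype=dtype, device="cuda")
        b = torch.randn(k, n, dtype=dtype, device="cuda")
        for _ in range(5):          # warm-up
            a @ b
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            a @ b
        torch.cuda.synchronize()
        elapsed = (time.perf_counter() - start) / iters
        return 2 * m * n * k / elapsed / 1e12   # 2*M*N*K FLOPs per GEMM

    peak_tflops = 1000.0            # placeholder: datasheet number for your part
    for shape in [(4096, 4096, 4096), (8192, 8192, 8192), (16384, 4096, 4096)]:
        t = achieved_tflops(*shape)
        print(f"{shape}: {t:.0f} TFLOPs = {100 * t / peak_tflops:.0f}% of assumed peak")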
"We've been negotiating a $2M contract to get AMD on MLPerf, but one of the sticking points has been confidentiality. Perhaps posting the deliverables on X will help legal to get in the spirit of open source!"
"Contract is signed! No confidentiality, AMD has leadership that's capable of acting. Let's make this training run happen, we work in public on our Discord.
It still amazes me that George/Tinycorp somehow seems to get AMD on board every time, while being blissfully unaware that they are a very small player. See for example the top comment here [0].
Don't get me wrong, I think it's impressive what he achieved so far, and I hope tiny can stay competitive in this market.
That top comment doesn't seem to have engaged completely with the context here. AMD fumbled trillions of dollars of value creation by misidentifying what their hardware was for. Or perhaps it is more correct to say: by being too dogmatic about what their hardware was for. They weren't in a position to be picky. They had a choice - they could continue making trillion-dollar mistakes until their board got sacked and the exec team replaced, or they could maybe listen to some of the people who were technically correct, regardless of their size in the market.
George is just some dude and I doubt AMD paid him much attention anywhere through this saga, but AMD had screwed up to the point where he could give some precise commentary about how they'd managed to duck and weave to avoid the overwhelming torrent of money trying to rush in and buy graphics hardware. They should make some time in their busy schedules to talk with people like that.
People get on board with George Hotz because they share the frustration of using ROCm on consumer GPUs, where the experience has been insultingly dreadful to the point where I decided to postpone buying new AMD GPUs for at least a decade.
I'm not quite sure why he decided to pivot to datacenter GPUs where AMD has shown at least some commitment to ROCm. The intersection between users of tinygrad and people who use MI350s should essentially be George himself and no one else.
Most of those willing to work with AMD are very small players (with some notable exceptions). They are likely hopeful that the small players will grow.
fal | Growth Engineer | San Francisco (on site 5 days/wk)
Help us scale generative‑media infra: hack demos in the AM, pitch partners over coffee.
You’ll build quick client libs & microsites, run data A/Bs, write content that drives sign‑ups, and hand‑hold new devs.
Need: Python, JS/React/Next.js, SQL; speed, ownership, love for gen‑AI.
Get: strong salary + equity, platinum health, unlimited “build‑something” stipend and most importantly a seat at a rocketship.
Shoot a link to something you’ve built to careers@fal.ai
The GH200 is nowhere near the $343,000 number. You can get a single-server order for around $45k (with an Inception discount). If you are buying in bulk, it goes down to sub-$30k. That comes with an H100's performance and an insane amount of high-bandwidth memory.
For traditional LLMs this might be true (especially large MoEs at bs=1), but I strongly disagree with the "multi-modal models" phrase, since most models that output in other modalities are generally compute-bound. That means fewer FLOPs make the experience much worse (imagine waiting a couple of minutes for an image and hours for a video).
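A back-of-the-envelope sketch of why that is (every number below is a made-up placeholder; only the scaling matters): bs=1 LLM decoding is limited by how fast you can stream the weights, while diffusion-style image/video generation is limited by raw FLOPs, so cutting FLOPs leaves the former alone and directly stretches the latter.

    # Placeholder numbers only; the shape of the argument is what matters.
    achieved_flops = 400e12          # sustained FLOP/s of the accelerator
    mem_bandwidth = 3e12             # bytes/s of HBM bandwidth

    # Memory-bound: bs=1 decoding streams the weights once per token.
    weight_bytes = 70e9              # e.g. a ~70 GB (quantized) model
    tokens_per_s = mem_bandwidth / weight_bytes

    # Compute-bound: a diffusion video model burns a fixed FLOP budget per clip.
    flops_per_clip = 1e17            # placeholder budget for one short clip
    clip_seconds = flops_per_clip / achieved_flops

    print(f"LLM decode: ~{tokens_per_s:.0f} tokens/s (bandwidth-limited)")
    print(f"Video clip: ~{clip_seconds:.0f} s of pure compute (FLOP-limited)")
    # Halving FLOPs leaves tokens/s unchanged but doubles the clip time.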
For anyone who wants to test the original (non-distilled) HunyuanVideo (which is an amazing model), we have a 580p version taking under a minute and a 720p version taking around 2.5-3 minutes in our playground: https://fal.ai/models/fal-ai/hunyuan-video (it requires GitHub login and is pay-per-use, but new accounts get some free credits).
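If you would rather hit it from code than the playground, something like this should work with the fal Python client (pip install fal-client, FAL_KEY set in the environment). Treat the argument names as an assumption and check the model page for the current schema:

    # Minimal sketch with the fal Python client; argument names are assumptions,
    # the model page documents the authoritative schema.
    import fal_client

    result = fal_client.subscribe(
        "fal-ai/hunyuan-video",
        arguments={
            "prompt": "a slow dolly shot through a neon-lit rainy street at night",
        },
    )
    print(result)   # typically includes a URL to the generated video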
Open source video models are going to beat closed source. Ecosystem and tools matter.
Midjourney has name recognition, but nobody talks about Dall-E anymore. The same will happen to Sora. Flux and Stable Diffusion won images, and Hunyuan and similar will win video.
Hunyuan, LTX-1, Mochi-1, and all the other open models from non-leading foundation model companies will eventually leapfrog Sora and Veo. Because you can program against them and run them locally or in your own cloud. You can fine tune them to do whatever you want. You can build audio reactive models, controllable models, interactive art walls, you name it.
Sora and Veo just aren't interesting. They're at one end of the quality spectrum, and open models will quickly close that gap and then some.
> Open source video models are going to beat closed source. Ecosystem and tools matter.
> Midjourney has name recognition, but nobody talks about Dall-E anymore. The same will happen to Sora. Flux and Stable Diffusion won images, and Hunyuan and similar will win video.
Neither Flux (except the distilled Flux Schnell model) nor Stable Diffusion has openly licensed weights: Stable Diffusion and Flux Dev are weights-available under limited, non-open licenses, and Flux Pro is hosted-only.
Just because the OSI doesn't like Open RAIL doesn't make it not open source, unless you're strictly talking about the OSD. The OSI can't even figure out where the boundaries of open models lie - data, training code, weights, etc.
The RAIL licenses do have usage restrictions (e.g., against harming minors, use in defamation, etc.), but they're completely unenforced.
> Just because the OSI doesn’t like Open RAIL doesn’t make it not open source unless you’re strictly talking about the OSD.
If you aren’t talking about the OSD, you end up reducing “open source” to a semantically null buzzword. But, in any case, I intentionally didn’t mention “open source”. The weights are under a use-restrictive license, not an open license, even leaving out the debates over what “source” is. And that’s just SD1.x, SD2.x, and SDXL, which have the CreativeML OpenRAIL-M license (SD1.x) or CreativeML OpenRAIL++M license (SD2.x/SDXL). SD3.x has a far more restrictive license, as does Flux Dev.
> Flux Schnell is Apache.
Huh. It’s almost like I should have explicitly excepted Flux Schnell from the other Stable Diffusion and Flux models when I said they didn’t have open licenses.
Oh, I did.
> LTX-1 is Apache.
Yes, it is. LTX-1 is “neither Flux (except the distilled Flux Schnell model) nor Stable Diffusion”. AuraFlow (an image model) is also Apache, and while it’s behind Flux – Dev or Schnell – or SDXL in current mindshare, it got picked – largely for licensing reasons – as the basis for the next version of Pony Diffusion, a popular community model series (known largely, though not exclusively, for its NSFW capabilities) whose previous versions were based on SD1.5 and SDXL, which gives it a good chance of becoming a major player.
Statements that begin like this are nearly always rhetorical attempts to subvert the standard usage of the terminology.
> but they're completely unenforced
Utterly irrelevant from a legal perspective. Also entirely circumstantial in that it depends entirely on the license holder and can easily vary between end users.
I'm also rather confused how RAIL entered into this to begin with. Unless I've missed something significant, most variants (or at least high end variants) of Stable Diffusion [0] and Flux [1] are under non-commercial licenses.
Not that I take issue with that. I've no delusion that a company is going to spend hundreds of thousands of dollars on compute and then open the floor to competitors who literally clone their data.
Easily Gimp, and Krita for painting (you can buy the latter on Steam, if you want to support open source).
Photoshop is a well-rounded and mature product, but since I don't do any print work, I can do everything with Gimp (perhaps you can do print too, I have no experience there).
Creative Cloud, or whatever it is called today, is a non-starter for me. Also, I can integrate Gimp into image pipelines more easily (a rough sketch of that is below). I also use Blender for modelling.
Maybe I am not entirely up to date, but today you can use these tools to make things that were just not possible a few years ago, at a quality that is competitive with high-end media productions.
For me it is a hobby, and I understand the advantage, in a professional environment, of using the same tools that fit into long and complicated pipelines. But if you just want to create high-quality art, the tooling is readily available.
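On the pipeline point above: GIMP can be driven headless with Script-Fu batch commands, so a "load, resize, flatten, export" step drops into a scripted pipeline. A rough sketch (file names are placeholders, and the exact Script-Fu calls may differ between GIMP versions):

    # Drive GIMP headless: -i = no UI, -b = evaluate a Script-Fu batch command.
    # File names are placeholders; calls follow the GIMP 2.x Script-Fu API.
    import subprocess

    script = (
        '(let* ((image (car (gimp-file-load RUN-NONINTERACTIVE "in.png" "in.png"))))'
        ' (gimp-image-scale image 1024 1024)'
        ' (gimp-image-flatten image)'
        ' (file-png-save RUN-NONINTERACTIVE image'
        '   (car (gimp-image-get-active-drawable image))'
        '   "out.png" "out.png" 0 9 1 1 1 1 1))'
    )

    subprocess.run(["gimp", "-i", "-b", script, "-b", "(gimp-quit 0)"], check=True)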
It’s not comparable because GIMP has never had the effort put into it to compete with Photoshop’s most basic features. 15-20 years ago they were arguing that adjustment layers were not needed, and they only managed to ship some form of them this year.
Blender vs commercial 3D software is a better example.
Hunyuan at other providers like fal.ai is cheaper than Sora for the same resolution (for 720p, 5-second clips, $20 gets you ~15 videos on Sora vs almost 50 videos at fal). It is slower than Sora (~3 minutes for a 720p video) but faster than Replicate's Hunyuan (by 6-7x for the same settings).
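Spelling out the arithmetic implied by those numbers (prices as quoted, rounded):

    # Per-video cost implied by the figures above (720p, 5-second clips).
    sora_videos_per_20usd = 15
    fal_videos_per_20usd = 50        # "almost 50"

    sora_per_video = 20 / sora_videos_per_20usd   # ~$1.33
    fal_per_video = 20 / fal_videos_per_20usd     # $0.40

    print(f"Sora: ~${sora_per_video:.2f}/video, fal: ~${fal_per_video:.2f}/video, "
          f"roughly {sora_per_video / fal_per_video:.1f}x cheaper at fal")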