Hacker News
Adobe's new image rotation tool is one of the most impressive AI tools seen (creativebloq.com)
912 points by ralusek 3 months ago | 268 comments



I'm making some big assumptions about Adobe's product ideation process, but: This seems like the "right" way to approach developing AI products: Find a user need that can't easily be solved with traditional methods and algorithms, decide that AI is appropriate for that thing, and then build an AI system to solve it.

Rather than what many BigTech companies are currently doing: "Wall Street says we need to 'Use AI Somehow'. Let's invest in AI and Find Things To Do with AI. Later, we'll worry about somehow matching these things with user needs."


I would interpret it that they're getting the same push from Wall Street and the same investor-hype-driven product leadership as every other tech firm, but this time they have the good fortune to specialize in one of the few verticals (image editing) where generative AI currently has superhuman performance.

This is a testable claim: where were Adobe in previous hype cycles? Googles "Adobe Blockchain"...looks like they were all about blockchains in 2018 [0], then NFTs and "more sustainable blockchains" in 2022 [1].

[0] https://blog.adobe.com/en/publish/2018/09/27/blockchain-and-...

[1] https://www.ledgerinsights.com/adobe-moves-to-sustainable-bl...


The article says clearly there's no guarantee this feature will be released.

Which I'm reading as "Demo-ready, but far from production-ready."

Somewhat relevant: my experience with Photoshop's Generative Fill has been underwhelming. Sometimes it's wrong, often it's comically wrong. I haven't had many easy wins with it.

IMO this is a company that doodles with code for its own entertainment, not a company that innovates robust and highly useful production-ready features for the benefit of users.

So we'll see if Mr Spinny Dragon makes it to production, and is as useful as billed in the demo.


you don't need to release to production for real value. I'm under intense pressure to scope out frothy AI features because just discussing them with prospects has a material impact on the costs of the sales funnel.


> just discussing them with prospects [...] sales funnel

I'll admit I have no idea what % of Adobe licensees/subscribers are individuals and small visual/graphic design firms (who choose Adobe for personal reasons) compared to larger companies (news agencies, web-design body-shops, etc) where employees use the tools given to them despite any personal preferences for rivals like Procreate, etc - and the rest: students, hobbyist photographers, etc.

...but none of the aforementioned market-segments seem like they'd make "AI" (whatever that means) any part of their purchasing-decision. Buzzwords only help sales when the audience is ignorant and/or impressionable; and when your audience are well-informed, seasoned (and cynical) professionals then buzzwords have the opposite effect and damage a company's credibility.

...so I'm not sure who, exactly, Adobe is trying to message with their press-copy for Adobe Firefly (their "generative AI for business" product); perhaps it's just a charade meant only for their shareholders? I'm glad they aren't copying Microsoft and shoving AI branding where it really doesn't belong and compromising the user-experience (...at least not to the same extent).


Execs love genai & execs make purchasing decisions.


Yup, this. I've recently interacted with someone whose board pushed for a company-wide coding assistant rollout, with the explicit goal of reducing development staff, or rather costs. The developers weren't really asking for it, but leadership assumes that they wouldn't, if it could make them redundant.

Seems like getting decisions made at that level can be extremely valuable, and at the same time lets you get away with building something that just seems like a useful product - because the people you're selling to won't use it. And furthermore, they will already go into this assuming resistance from the actual users, so they're unlikely to even listen to their feedback.

Of course it's not a long term strategy, but it seems like a potent short term money maker.


“I consider lying cause trick sales look like real value to me”. Not judging, but sales alone are not real value.


This sounds closer to fake value.


I disagree with your analysis. I think this is a novel use of AI in a commercial art product. Is there any AI feature that Adobe could release that you would not view as "pushed from Wall Street"?


I think you're being a bit too generous with Adobe here :-). I shared this before, but it's worth resharing [1]. It covers the experience of a professional artist using Adobe tools.

The gist is that once a company has a captive audience with no alternatives, investors come first. Flashy (no pun intended :-p), cool features to impress investors become more important than the everyday user experience—and this feature does look super cool!

--

1: https://www.youtube.com/watch?v=lthVYUB8JLs


I don’t think those ideas are mutually exclusive. I heavily dislike Adobe and think they’re a rotten company with predatory practices. I also think “AI art” can be harmful to artists and more often than not produces uninteresting flawed garbage at an unacceptable energy cost.

Still, when I first heard of Adobe Firefly, my initial reaction was “smart business move, by exclusively using images they have the rights to”. Now seeing Turntable my reaction is “interesting tool which could be truly useful to many illustrators”.

Adobe can be a bad and opportunistic company in general but still do genuinely interesting things. As much as they deserve the criticism, the way in which they’re using AI does seem to be thought out and meant to address real user needs while minimising harm to artists.¹ I see Apple’s approach with Apple Intelligence a bit in the same vein, starting with the user experience and working backwards to the technology, as it should be.²

Worth noting that I fortunately have distanced myself from Adobe for many years now, so my view may be outdated.

¹ Which I don’t believe for a second is out of the goodness of their hearts, it just makes business sense.

² However, in that case the results seem to be subpar and I don’t think I’d use it even if I could.


Whether they avail of it, or not, Adobe have the possibility of accessing feedback and iterating on it for a lot of core design markets. I have a similar view to yours, but there is a segment of the AI community who feel that they are disrupting Adobe as much as other companies. In most cases, these companies have access to the domain experience which will enable AI and it won't work the other way around.

All of this is orthogonal to Adobe's business practices. You should expect them to operate the way they do given their market share and the limited number of alternatives. I personally have almost moved completely to Affinity products, but I expect that Adobe should be better placed to execute products and for Affinity to be playing catchup to some extent.


> I also think “AI art” can be harmful to artists and more often than not produces uninteresting flawed garbage at an unacceptable energy cost.

What do you think about Midjourney? The (2D) results are pretty incredible.


> What do you think about Midjourney

I think that’s irrelevant to the argument and that it only leads to an uninteresting derailment of the discussion, as demonstrated by the two flagged replies that also focused on that purely contextual piece of information.

Someone else nailed it:

https://news.ycombinator.com/item?id=41872284

If we ever meet, I’d be glad to answer that question in detail and compare notes. Right now, I think it’d be another distraction that does nothing to advance the conversation.


That is where opinions actually diverge between pro-AI and anti-AI clusters - they look gorgeous and human-indistinguishable if you aren't trained with tons of images, or extremely disturbing and obvious if you were. It's like how CGIs and special effects from the past would look terrible today.

The big genAI flamewar actually has very little to do with copyright or would-be-lost jobs. It's mostly about quality and emotions encoded in the images (deep rage). Lots of tech-inclined people miss this point.


[flagged]


What’s the goal of your comment? You’re making a straw man argument which in no way relates to my point and ridicules the opinions of people not on this thread. That makes for uninteresting and needlessly divisive conversation.

The HN guidelines rightfully urge us to make substantive comments that advance the discussion and avoid shallow dismissals.

https://news.ycombinator.com/newsguidelines.html


I think they are actually agreeing with you. Just, in a somewhat unpleasant and sarcastic manner. They aren’t strawmanning your argument, right? They are strawmanning the argument against it.


> They aren’t strawmanning your argument, right? They are strawmanning the argument against it.

Yes, that’s the impression I got out of it too. I disapprove either way. I’d sooner defend a good argument against my point than a bad argument in favour of it.

I come to HN for reasoned, thoughtful, curious discussion.


I think what has happened (and I’ve been hit by this in the past, it is very annoying) is: You included the bit in the beginning about being generally skeptical of AI art in some forms to signal that you are somebody with a nuanced opinion, who believes that the thing can be bad at times. Then, you go on to describe that this isn’t one of those times.

Unfortunately, this gets you some comments that want to disagree with that less specific, initial aside. I’m not sure if people just read the first paragraph and respond entirely based on that, without realizing that it is not the main point of the rest of the post. Or if they just don’t want to give up the ground that you did in the beginning, at all, so they knowingly ignore the rest of the post.

I don’t really know what to do about this sort of thing. It seems like… generally nice to be able to start a post with something that says basically: look I’ve thought about this and it isn’t an uninformed reflexive take. But I’m trying to give up on that sort of thing. It isn’t really logically part of the argument, and it ends with people arguing in a direction that I’m not really interested in defending against in this context.

But it does seem a shame, because, while it isn’t logically part of the argument, it is nice to know beforehand how firm somebody’s stance is.


I think this is a great comment and that you absolutely nailed it. It’s a shame that it’s now buried under a flagged response, but still I wanted to make sure you knew (since it was directed at me) that I read it and appreciated it.


I think GP's behavior comes from a weird assumption among Internet troll-y people that the strong negativity shown by online drawing communities wrt AI _literally_ has nothing to do with the output quality of generated data.

This is clearly incorrect to some, not to others. This point being unclear to some leads those people to assume that the commonly observed strong negativity is a generalized response to all shapes and forms of new technology, rather than a specific emotional reaction to the current generation of still somewhat Lovecraftian generative AI outputs.

A bit like if a non-vision super LLM were to characterize anti-genAI sentiment and create a "techno-luddite artist" persona. But there's a cross-modal component to it that they don't capture, so that falls flat.


That's three comments so far (now four) discussing whether the comment in question adequately adds to the discussion. If you ask me, hyperbole and sarcasm have a place in nearly any exchange of ideas, but maybe I just haven't drunk the right kool-aid for this space.

I think another, perhaps more relevant reference could be the replacement of hand-painted cels with computer-generated frames for animation. It replaced one kind of artist with another. Nobody got all that worked up about it, in the long run.


There are plenty of sarcastic, hyperbolic, dismissive, etc comments on this site. I don’t think you need to gulp down the koolaid or anything.

But the discussion is a little better if we take a little sip every now and then, perhaps even slightly performatively. The “ground-state” of big open Internet discussion sites like this is dismissive and cheap, so it is good to have active pushback occasionally.


[flagged]


> I think the keyboard can be harmful to scribes

I like this reasoning. If something is new then it must be the future of humanity. People scoffed at Concorde for being “wasteful” and “flawed” but look at the company today


You’re focusing on an irrelevant part of the comment and making a straw man out of it. Your account has very little content so you may be unfamiliar with the HN guidelines, in which case I urge you to refer to them before proceeding.

Discussion should assume good faith and responses should become more substantive, not less, as the conversation goes on.

https://news.ycombinator.com/newsguidelines.html


You can have both!

Cool features that excite users (and that they ultimately end up using), and that get investors excited.

(i.e. Adobe mentioned in the day 1 keynote that Generative Fill, released last year and powered by Adobe Firefly is not one of the top 5 used features in Photoshop).

The features we make, and how we use gen ai is based on a lot of discussions and back and forth with the community (both public and private)

I guess Adobe could make features that look cool, but no one wants to use, but that doesn't seem to really make any sense.

(I work for Adobe)


> is not one of the top 5 used features in Photoshop

I mean, is there any Photoshop feature that’s come to dominate people’s workflows so quickly?

People (e.g. photographers) who use Photoshop “in anger” for professional use-cases, and who already know how to fix a flaw in an image region without generative fill, aren’t necessarily going to adopt it right out of the gate. They’re going to tinker with it a bit, but time-box that tinkering, otherwise sticking with what they can guarantee from experience will get a “satisfactory” result, even if it takes longer and might not have as high a ceiling for how perfectly the image is altered.

And that’s just people who repair flaws in images. Which I’m guessing aren’t even the majority of Photoshop users. Is the clone brush even in the top 5 Photoshop features by usage?


You're super wrong. Pro here working with this stuff for decades.

There was a brief moment in time where FreeHand was just a better and faster drawing tool than Illustrator (which is what is shown here) but from there on psp, ill & indesign have pretty much killed all competition out there.

The formats they use are singularly stupid and arcane for legacy reasons, they are all mem hogs and inefficient to the extreme - but nothing beats that unholy trifecta and it is use it or die.

Now to get the point: generative fill is one of the absolute killer features of psp - in an instant it does what could take multiple hours to do previously with 5-10 sec of watching a loader.

There are many more gamechangers and this really looks like another.


> psp, ill & indesign

paint shop pro?


That should read "is NOW one of the top 5 used features in Photoshop".


Moreover, when one looks at the chronology with which features were rolled out, all the computationally hard things which would save sufficient time/effort that folks would be willing to pay for them (and which competitors were unlikely to be able to implement) were held back until Adobe rolled out its subscription pricing model --- then and only then did the _really_ good stuff start trickling out, at a pace to ensure that companies kept up their monthly payments.


Is there no alternative to Photoshop? Affinity or Pixelator don't cut it?


I think Krita is the best I’ve found now, though it’s not a 1-on-1 comparison.


Gimp? Although the UX feels so outdated.

https://github.com/KenneyNL/Adobe-Alternatives#photoshop


If you ever worked professionally with PS, you get tired of recommendations of tools that are capable of like 20% of PS.


Interesting! What are some unique features not typically found in other apps? (don't say generative fill lol)

Creating a specification for minimum viable features of a raster editor able to replace Photoshop would be fantastic.

Could serve as a roadmap for alternatives and for bragging rights: "Painter2900: implements 99% of the Rasterflames Liberation Standard, and more!".


Mask precision, plugin compatibility, scripts and actions, color profiles management, brush options, camera raw compatibility, liquify tool, 3D, font options... I just named a few that I use regularly.

Beside functions, Gimp's UI looks outdated.


I haven't used Photoshop in about 15 years. Gimp was never close to Photoshop

Krita on the other hand is basically everything I remember about old Photoshop. Even the keyboard shortcuts are pretty much the same. The price is quite right too.

What Adobe has done with generative AI has been really impressive though. I am going to probably have to give PS a try just to see what I am missing.


Besides the UI, GIMP's UX is what breaks my heart.


Unfortunately Affinity does not run on linux natively.


pixelMator


My company has decided to update its hr page to use AI for reasons unknown.

So instead of the old workflow:

"visit HR page" → "click link that for whatever reason doesn't give you a permanent link you can bookmark for later"

it's now:

"visit HR page" → "do AI search for the same link which is suggested as the first option" → "wait 10-60 seconds for it to finally return something" → "click link that for whatever reason doesn't give you a permanent link you can bookmark for later"


Nvidia needs to continue selling chips like crazy, all companies in the US need to do their fair share to contribute!...


Bubbles require constant maintenance


You joke, but it's literally in the interest of many companies to prop up the SP500 et al. by wasting money on M7 products, isn't it?


Somebody's putting "AI expert" on their resume


Your company really needs those AI acceleration chips we're being shoveled!!

You could make your search that you didn't need 10x faster!


Mine has as well, but it's pretty useful. It's really just a search engine, but it's indexed Confluence and all other internal sites and I've found it pretty useful for everything.


"click link that for whatever reason doesn't give you a permanent link you can bookmark for later"

Sounds like engagement hacking?


I think this is a weird SAML pattern I've seen before where e.g. Okta generates a URL that's like https://somevendor.com/SAML/somesignedbase64payload to do SSO, which is sort of the inverse of the more common approach of the page you're logging into sending you to the Auth provider after seeing your email domain.


This is just to make it an IdP-initiated flow (instead of an SP-initiated flow), and it's to prevent the extra hop back and forth between Okta/IdP and the application.
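To make the "extra hop" concrete, here is a toy model of the two flows in Python. It is only a sketch: the `Hop` type, the step descriptions, and the `cross_site_hops` helper are made up for illustration, and real SAML exchanges pass through the browser and carry more detail than this. It just counts how many times control crosses between the identity provider and the application in each flow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One step of a simplified SSO exchange (illustrative, not a SAML library type)."""
    sender: str
    receiver: str
    message: str


def sp_initiated_flow() -> list[Hop]:
    # The user starts at the application (service provider, "sp").
    return [
        Hop("browser", "sp", "GET protected page, no session yet"),
        Hop("sp", "idp", "redirect carrying a SAML AuthnRequest"),
        Hop("idp", "sp", "POST signed SAMLResponse to the SP's ACS endpoint"),
    ]


def idp_initiated_flow() -> list[Hop]:
    # The user starts at the identity provider's portal (e.g. an app tile).
    return [
        Hop("browser", "idp", "click the application tile in the IdP portal"),
        Hop("idp", "sp", "POST unsolicited signed SAMLResponse to the SP's ACS endpoint"),
    ]


def cross_site_hops(flow: list[Hop]) -> int:
    """Count steps where control passes between the IdP and the SP."""
    return sum(1 for hop in flow if {hop.sender, hop.receiver} == {"sp", "idp"})


if __name__ == "__main__":
    print("SP-initiated cross-site hops: ", cross_site_hops(sp_initiated_flow()))
    print("IdP-initiated cross-site hops:", cross_site_hops(idp_initiated_flow()))
```

In this simplified model the SP-initiated flow crosses between the two sites twice and the IdP-initiated flow once. That would also fit the link not being bookmarkable, assuming the signed payload is short-lived, as SAML assertions typically are.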


My assumption would be clumsy session tracking.


This feels extremely ungenerous to the Big Tech companies.

What's wrong with trying out 100 different AI features across your product suite, and then seeing which ones "stick"? You figure out the 10 that users find really valuable, another 10 that will be super-valuable with improvement, and eventually drop the other 80.

Especially when if Microsoft tries something and Google doesn't, that suddenly gives Microsoft a huge lead in a particular product, and Google is left behind because they didn't experiment enough. Because you're right -- Google investors wouldn't like that, and would be totally justified.

The fact is, it's often hard to tell which features users will find valuable in advance. And when being 6 or 12 months late to the party can be the difference between your product maintaining its competitive lead vs. going the way of WordPerfect or Lotus 123 -- then the smart, rational, strategic thing to do is to build as many features as possible around the technology, and then see what works.

I would suggest that if Adobe is being slower with rolling out AI features, it might be more because of their extreme monopoly position in a lot of their products, thanks to the stickiness of their file formats. That they simply don't need to compete as much, which is bad.


> What's wrong with trying out 100 different AI features across your product suite, and then seeing which ones "stick"?

For users? Almost everything is wrong with that.

There are no users looking for wild churn in their user interface, no users crossing their fingers that the feature that stuck for them gets pruned because it didn't hit adoption targets overall, no users hoping for popups and nags interrupting their workflow to promote some new garbage that was rushed out and barely considered.

Users want to know what their tool does, learn how to use it, and get back to their own business. They can welcome compelling new features, of course, but they generally want them to be introduced in a coherent way, they want to be able to rely on the feature being there for as long as their own use of those features persists, and they want to be able to step into and explore these new features on their own pace and without disturbance to their practiced workflow.


There are multiple different types of users.

The users of https://notebooklm.google/ aren't the same as the users of Google Docs.


Think about the other side though -- if the tool you've learned and rely on goes out of business because they didn't innovate fast enough, it's a whole lot worse for you now that you have to learn an entirely new tool.

And I haven't seen any "wild churn" at all -- like I said in another comment, a few informative popups and a magic wand icon in a toolbar? It's not exactly high on the list of disruptions. I can still continue to use my software the exact same way I have been -- it's not replacing workflows.

But it's way worse if the product you rely on gets discontinued.


The presence or absence of some subtle new magic wand icon that shows up in the toolbar is neither making nor breaking anyone's business. And even if it comes to be a compelling feature in my competitor's product, I've got plenty of time to update my product with something comparable. At least if I've done a good job building something useful for my customers in the first place.

Generative ML technologies may dramatically change a lot of our products over time, but there's no great hole they're filling and there's basically no moat besides capital requirements that keeps competitors from catching up with each other as features prove themselves out. They just open a few new doors that people will gradually explore.

Anxiously spamming features simply betrays a lack of confidence in one's own product as it stands, directly frustrates professional users, and soaks up tons of capital that almost certainly has other places it could be going.


> The presence or absence of some subtle new magic wand icon that shows up in the toolbar is neither making nor breaking anyone's business.

Sounds like famous last words to me.

The corporate landscape is filled with the corpses of companies that thought they didn't need to rush to adapt to new technologies. That they'd have time to react if something really did take off in the end.

Just think of how Kodak bided its time to see if newfangled digital photography would actually take off and when... and then it was too late.


You're comparing being 3 months behind on a supplementary software feature that's tucked among dozens of icons on the toolbar with making a hard decision about pivoting your entire megalithic industrial, research, sales, and distribution infrastructure to a radically new technology.

The discussion you started is about spamming features to see what sticks, as set against making deliberate, selective product decisions as you confidently observe your market.

It's possible that a company that ideologically sets itself against delivering any generative AI features ever might miss where the industry is going over the next 10 or 20 years. But we were never talking about that, were we?


Digital photography started out as a supplementary toy as well. And we are starting to witness a gigantic computational infrastructure pivot with GPU's and NPU's and whatnot. Google and Amazon are literally signing nuclear power plant agreements to power it. AI is a radically new technology.

Do you remember two years ago when ChatGPT came out, and people here on HN were confidently declaring it was the end of Google Search, unless Google proved they could respond immediately? And Google released Gemini less than six months later to demonstrate that Search wasn't going to go the way of Kodak, and it still took people a while to calm down after that?

And the AI revolution is moving a lot faster than the digital photography revolution. We're not talking about "the next 10 or 20 years". You seem to be severely underestimating the power of competition and technological progress, and the ability for it to put you out of business.

You're suggesting the correct approach is "deliberate, selective product decisions as you confidently observe your market." What happens when your deliberation is too slow, your selectivity turns out to be wrong, and your confidence is ill-founded? Well, the company that was willing to experiment with a lot more features is more likely to build the winning features and take over the market while you were busy deliberating.

I'm surprised to be having this conversation on HN, where the start-up ethos reigns supreme. The whole idea of the tech world is to try new things and fail fast, because it's better for everyone in the long run. That's what the big corporations are doing with AI features. Isn't that the kind of thing that tech entrepreneurs are supposed to celebrate?


> I'm surprised to be having this conversation on HN, where the start-up ethos reigns supreme.

Many of us are sick to death of the startup ethos. We want tools that work consistently and aren't constantly changing because someone at the company got bored.


Startups during their earliest stages are encouraged to throw spaghetti at the wall specifically because they don't yet have customers to offend. They have nothing to lose from failing fast.

In the 2010's more mature companies explored adopting this same model, especially those that had themselves been founded the decade prior. What came out of it was a lot of spaghetti making a mess all over the walls, and the floors, and the ceiling. There were half-baked ideas everywhere, and a few genuine revolutions, but the quality of pretty much everything tanked.

Optimistically, we now seem to be at the start of a pendulum swing back from that, but with little time to scrape off all the spaghetti that continues to drag everything down.


> What came out of it was a lot of spaghetti making a mess all over the walls, and the floors, and the ceiling.

I just don't know what you're talking about. What mess? What quality tanked? The picture you're painting is quite simply not what I see. Not at all, not even close.

The tools I use continue to work just fine. And I can point to tons of useful feature improvements and upgrades since the 2010's, that make a meaningful positive difference to both my productivity and my leisure. So I don't want to see companies suddenly become super-conservative in terms of releasing features. I want them to keep doing what they're doing.


Back in the olden days (10 years ago), when you bought software, you could actually keep using it indefinitely. Doesn’t matter if the company went bankrupt, if you like using Logic Pro 7 and it works with your equipment you can keep using it. I know people who only recently moved off of OS 9 - they were using creative software for over 25 years, it did what they needed it to do so they kept using it. I still know at least one person who uses Office for Mac 98 to this day on an iMac G3; it’s their only computer, but it still works and they have backups of their important documents, so why pay money to switch to an unfamiliar computer, OS, software?

This modern idea of “you’ll own nothing and you’ll like it” ruins that of course, but if someone bought CS6 they can still be using it today. If Adobe went bankrupt 5 years ago they could still be legally using it today (they’d have to bypass the license checks if the servers go down, which might be illegal in the US, though). If Adobe goes bankrupt tomorrow and I have a CC subscription, I can’t legally keep using Photoshop after the subscription runs out.


You cannot even work offline in properly licensed CC.

I wonder when Adobe will implement an AI check on export. Then it will be impossible to export "wrong" files. It already started with scans of money; soon it will target CSAM, and later politically incorrect topics, hate speech, disinformation, and age verification for nudity.


No, it's way worse if the product I rely on does as you suggest and keeps adding new features just to see what will stick. I hate that sort of behavior with a passion and it is the sort of thing which will make me never do business with a company again.


LLMs aren't profitable. There's no significant threat of a product getting discontinued because it didn't jump high enough over the AI shark.


> What's wrong with trying out 100 different AI features across your product suite, and then seeing which ones "stick"?

Even the biggest tech companies have limited engineering bandwidth to allocate to projects. What's wrong with those 100 experiments is the opportunity cost: they suck all the oxygen out of the room and could be shifting the company's focus away from fixing real user problems. There are many other problems that don't require AI to solve, and companies are starving these problems in favor of AI experiments.

It would be better to sort each potential project by ROI, or customer need, or profit, or some other meaningful metric, and do the highest ranked ones. Instead, we're sorting first by "does it use AI" and focusing on those.
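A small Python sketch of the difference between the two orderings. Everything in it is invented for illustration: the project names, the `estimated_roi` scores, and the two-project budget are assumptions, not real data from any company.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str
    uses_ai: bool
    estimated_roi: float  # hypothetical score; higher is better


# Made-up backlog, purely illustrative.
BACKLOG = [
    Project("fix export crash", False, 4.0),
    Project("AI search box", True, 1.5),
    Project("faster file open", False, 3.0),
    Project("AI fill tool", True, 5.0),
]


def rank_by_roi(projects: list[Project]) -> list[Project]:
    """Rank purely on the chosen metric, ignoring whether a project uses AI."""
    return sorted(projects, key=lambda p: p.estimated_roi, reverse=True)


def rank_ai_first(projects: list[Project]) -> list[Project]:
    """Rank by "does it use AI" first, and only then by the metric."""
    return sorted(projects, key=lambda p: (p.uses_ai, p.estimated_roi), reverse=True)


if __name__ == "__main__":
    budget = 2  # assume there is bandwidth for only two projects
    print("By ROI:  ", [p.name for p in rank_by_roi(BACKLOG)[:budget]])
    print("AI first:", [p.name for p in rank_ai_first(BACKLOG)[:budget]])
```

With these invented numbers, both orderings fund the strong AI project, but the AI-first ordering spends the second slot on a weak AI project while the crash fix goes unfunded, which is the opportunity cost being described.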


What you describe, I don't see happening.

If you look at all the recent Google Docs features rolled out, only a small minority are AI-related:

https://workspaceupdates.googleblog.com/search/label/Google%...

There are a few relating to Gemini in additional languages and supporting additional document types, but the vast majority is non-AI.

Seems like the companies are presumably sorting on ROI just fine. But, of course, AI is expected to have a large return, so it's in there too.


Force-feeding 100s of different AI features (90% of which are useless at best) to users is what's wrong with the approach.


Why?

It's not "force-feeding". You usually get a little popup highlighting the new feature that you close and never see again.

It's not that hard to ignore a new "magic wand" button in the toolbar or something.

I personally hardly use any of the features, but neither do I feel "force-fed" in the slightest. Aside from the introductory popups (which are interesting), they don't get in my way at all.


It's popups. It's emails. It's constant nudges towards changes in workflows. Most importantly, it's accelerated the slurping of data and aggressive terms of service by an order of magnitude. Sure, in theory everyone wanted your data before, but now everyone wants all your data all the time. They want to upload it to their servers. They want to train products on it. And they want to ban you from using their product if you don't agree.


Depends on the product. Notion, for example, has a "press Space to AI" prompt in the editor. I think it can only be disabled for enterprise accounts.


So it's ok for all of us to become lab rats for these companies?


Every consumer is a "lab rat" for every company at all times, if that's how you want to think about it.

Each of our decisions to buy or not buy a product, to use or not use a feature, influences the future design of our products.

And thank goodness, because that's the process by which products improve. It's capitalism at work.

Mature technologies don't need as much experimentation because they're mature. But whenever you get new technologies, yes all these new applications battle each other out in the market in a kind of survival-of-the-fittest. If you want to call consumers "lab rats", I guess that's your choice.

But the point is -- yes, it's not only OK -- it's something to be celebrated!


You might be ok with being a lab rat, but most people are not. People buy products to satisfy their needs, not to participate in somebody else's experiment. Given the option (in the absence of monopoly) they will search for another company that treats them correctly.


> People buy products to satisfy their needs

People buy products for the novelty all the time. Sometimes they are disappointed with what they got, sometimes they discover new things. Take this very feature being discussed. How many people need it if Adobe released it today? How many would like what they see and decide to buy or renew?

> Given the option (in the absence of monopoly) they will search for another company that treats them correctly.

Are we still talking about product features?


I agree, people not only "buy products" for novelty, people crave novelty in general, from products to relationships.


This is certainly a great immediately useful tool but also a relatively small ROI, both the return and the investment. Big tech is aiming for a much bigger return on a clearly bigger investment. That’s going to potentially look like a lot of useless stuff in the meantime. Also, if it wasn’t for big tech and big investments, there wouldn’t even be these tools / models at this level of sophistication for others to be using for applications like this one.


While the press lumps it all together as "AI", you have to differentiate LLMs (driven by big tech and big money) from unrelated image/video types of generative models and approaches like diffusion, NeRF, Gaussian splatting, etc, which have their roots in academia.


LLMs don't have their roots in academia?


Not anymore.


Not at all - Transformer was invented by a bunch of former Google employees (while at Google), primarily Jakob Uszkoreit and Noam Shazeer. Of course as with anything it builds on what had gone before, but it's really quite a novel architecture.


The scientific impact of the transformer paper is large, but in my opinion the novelty is vastly overstated. The primary novelty is adapting the (already existing) dot-product attention mechanism to be multi-headed. And frankly, the single-head -> multi-head evolution wasn't particularly novel -- it's the same trick the computer vision community applied to convolutions 5 years earlier, yielding the widely-adopted grouped convolution. The lasting contribution of the Transformer paper is really just ordering the existing architectural primitives (attention layers, feedforward layers, normalization, residuals) in a nice, reusable block. In my opinion, the most impactful contributions in the lineage of modern attention-based LLMs are the introduction of dot-product attention (Bahdanau et al, 2015) and the first attention-based sequence-to-sequence model (Graves, 2013). Both of these are from academic labs.

As a side note, a similar phenomenon occurred with the Adam optimizer, where the ratio of public/scientific attribution to novelty is disproportionately large (the Adam optimizer is a very minor modification of the RMSProp + momentum optimization algorithm presented in the same Graves, 2013 paper mentioned above).
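For anyone who wants to see the single-head -> multi-head step concretely, here is a rough NumPy sketch. It is only an illustration under my own assumptions, not code from the paper: the function names, the random weight matrices and the toy sizes are all made up, and real implementations add masking, batching and learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_attention(q, k, v, scale=True):
    # q: (n_q, d), k: (n_k, d), v: (n_k, d_v)
    scores = q @ k.T
    if scale:
        # The 1/sqrt(d) scaling factor used in scaled dot-product attention.
        scores = scores / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    # Hypothetical helper: split the projected channels into groups ("heads"),
    # run ordinary dot-product attention per group, then concatenate and mix.
    n, d_model = x.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    heads = []
    for h in range(num_heads):
        sl = slice(h * d_head, (h + 1) * d_head)
        out, _ = dot_product_attention(q[:, sl], k[:, sl], v[:, sl])
        heads.append(out)
    return np.concatenate(heads, axis=-1) @ w_o

rng = np.random.default_rng(0)
n, d_model = 5, 8  # toy sizes, chosen arbitrarily
x = rng.normal(size=(n, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))

one_head = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads=1)
four_heads = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads=4)
```

With `num_heads=1` this collapses to plain scaled dot-product attention followed by an output projection, which is the sense in which multi-head is "the same thing, grouped".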


I think the most novel part of it, and where a lot of the power comes from, is in the key-based attention, which then operationally gives rise to the emergence of induction heads (whereby a pair of adjacent layers coordinate to provide a powerful context lookup and copy mechanism).

The reusable/stackable block is of course a key part of the design since the key insight was that language is as much hierarchical as sequential, and can therefore be processed in parallel (not in sequence) with a hierarchical stack of layers that each use the key-based lookup mechanism to access other tokens whether based on position or not.

In any case, if you look at the seq2seq architectures that preceded it, it's hard to claim that the Transformer is really based-on/evolved-from any of them (especially prevailing recurrent approaches), notwithstanding that it obviously leveraged the concept of attention.

I find the developmental history of the Transformer interesting, and wish more had been documented about it. It seems from an interview with Uszkoreit that the idea of parallel language processing based on a hierarchical design using self-attention was his, but that he was personally unable to realize this idea in a way that beat other contemporary approaches. Noam Shazeer was the one who then took the idea and realized it in the form that would eventually become the Transformer, but it seems there was some degree of throwing the kitchen sink at it and then a later ablation process to minimize the design. What would be interesting to know would be an honest assessment of how much of the final design was inspiration and how much experimentation. It's hard to imagine that Shazeer anticipated the emergence of induction heads when this model was trained at sufficient scale, so the architecture does seem to at least partly be an accidental discovery, and more than the next generation seq2seq model that it seems to have been conceived as.


Key-based attention is not attributable to the Transformer paper. First paper I can find where keys, queries, and values are distinct matrices is https://arxiv.org/abs/1703.03906, described at the end of section 2. The authors of the Transformer paper are very clear in how they describe their contribution to the attention formulation, writing "Dot-product attention is identical to our algorithm, except for the scaling factor". I think it's fair to state that multi-head is the paper's only substantial contribution to the design of attention mechanisms.

I think you're overestimating the degree to which this type of research is motivated by big-picture, top-down thinking. In reality, it's a bunch of empirically-driven, in-the-weeds experiments that guide a very local search in an intractably large search space. I can just about guarantee the process went something like this:

- The authors begin with an architecture similar to the current SOTA, which was a mix of recurrent layers and attention

- The authors realize that they can replace some of the recurrent layers with attention layers, and performance is equal or better. It's also way faster, so they try to replace as many recurrent layers as possible.

- They realize that if they remove all the recurrent layers, the model sucks. They're smart people and they quickly realize this is because the attention-only model is invariant to sequence order. They add positional encodings to compensate for this.

- They keep iterating on the architecture design, incorporating best-practices from the computer vision community such as normalization and residual connections, resulting in the now-famous Transformer block.

At no point is any stroke of genius required to get from the prior SOTA to the Transformer. It's the type of discovery that follows so naturally from an empirically-driven approach to research that it feels all but inevitable.
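The order-blindness in the third step is easy to check for yourself. Below is a minimal NumPy sketch, assuming a single unmasked self-attention layer with random weights; the names and sizes are mine, not from any paper. Strictly speaking the layer is permutation-equivariant (shuffle the input tokens and the outputs shuffle the same way), which is what makes a stack of them blind to order until position information is added.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    # Single-head, unmasked self-attention over a sequence x of shape (n, d).
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def sinusoidal_positions(n, d):
    # Fixed sinusoidal position encodings; d is assumed to be even.
    pos = np.arange(n)[:, None]
    i = np.arange(d // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d))
    pe = np.zeros((n, d))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

rng = np.random.default_rng(0)
n, d = 6, 8  # toy sizes, chosen arbitrarily
x = rng.normal(size=(n, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
perm = np.array([3, 0, 5, 1, 4, 2])  # a fixed, non-trivial shuffle of the tokens

# Without positions: shuffling the tokens just shuffles the outputs.
out = self_attention(x, w_q, w_k, w_v)
out_shuffled = self_attention(x[perm], w_q, w_k, w_v)
order_blind = np.allclose(out[perm], out_shuffled)

# With positions added to the input, the same shuffle changes the result.
pe = sinusoidal_positions(n, d)
out_pe = self_attention(x + pe, w_q, w_k, w_v)
out_pe_shuffled = self_attention(x[perm] + pe, w_q, w_k, w_v)
order_aware = not np.allclose(out_pe[perm], out_pe_shuffled)

print(order_blind, order_aware)
```

A recurrent layer gets order for free from the recurrence, so this only becomes a problem once the last recurrent layer is removed.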


This makes no sense. A thing's roots don't change, either it did start there or it didn't.


It didn't.

At least, the Transformer didn't. The abstract idea of a language model goes way back though within the field of linguistics, and people were building simplistic "N-gram" models before ever using neural nets, then using other types of neural net such as LSTMs and CNNs(!) before Google invented the Transformer (primarily with the goal of fully utilizing the parallelism available from GPUs - which couldn't be done with a recurrent model like LSTM).


On the plus side, for Adobe, is that they have a fairly stable & predictable SaaS revenue stream so as long as their R&D and product hosting costs don't exceed their subscription base, they're ok. This is wildly different from -- for example -- the hyperscalers, who have to build and invest far in advance of a market [for new services especially].


I don't think it's a Big Tech problem. Big Tech can come up with moronic ideas and be fine because they have unlimited cash. It's the smaller companies that need to count pennies who decide to flush the money down the AI Boondoggle Toilet.

"But Google does it. If we do it, we will be like Google".


> "But Google does it. If we do it, we will be like Google".

Were you in my meeting about 40 minutes ago? Because that's almost exactly what was said.

If the big tech companies wanted to be really evil, they could invent a nonsense tech that doesn't work, then watch as all the small upstart competitors bankrupt themselves to replicate it.


Isn't this what AI is all about? Don't kid yourself, most companies, even some big ones, will bankrupt themselves chasing AI and the few remaining will get the spoils.


It seems that’s just the way things go with disruptive technologies. It’s a gold rush and you don’t want to be left behind.


Wait, is that why we all have microservices now?


That's exactly right. This appeared before on HN but that's what I wrote about a couple of years back: https://renegadeotter.com/2023/09/10/death-by-a-thousand-mic...


Sounds a bit like trying to roll and support your own k8s platform


You mean like React, right? Right?


It is the 'make something for the user/client' vs. 'make something to sell' mindset.

The latter is what overwhelmingly more companies (not only BigTech, not at all!) have adopted nowadays.

And Boeing. ;)


If the lore is to be believed, Southwest (an airline that has built its business solely on the 737) saw the A320neo and basically told Boeing "give us a new 737 or we go to Airbus." They did what the client wanted, to their detriment.

"If I asked people what they wanted they would've said faster horses," or whatever Henry Ford is falsely accused of saying.


And Boeing?


>Rather than what many BigTech companies are currently doing

This BigTech strategy, or perhaps also Big Money strategy, is a common feature of all hype bubbles; see XML as an example.

Perhaps caused by the people with the big money not knowing anything about the technology but having a feeling that this will produce some valuable things, therefore throw a bunch of money at everything, those things that survive will reward us, the other stuff we write off.


That approach makes sense for very specific domain-tethered technologies. But for AI I think letting it loose and allowing people to find their own use cases is an appropriate way to go. I've found valuable use cases with ChatGPT in the first months of its public release that I honestly think we still wouldn't have if it went through a traditional product cycle.


Focusing on solving customer problems, not buzz words, typically is the right path.


Adobe has a powerful moat. That's why companies like MSFT can find a better way to integrate AI into workflows than AI-only companies without any moat.


Counterpoint, the pandering to the market has better stock price appreciation :)

Also I am sure Adobe is doing both. They released an OpenAI competitor recently


Been doing both. Just look at their asset store as of late. Complete mess if you work professionally.

At the same time, apparently their generative autofill is top notch. It's just a shame the industry decided to mix together ML tools with generative art, so that it's hard to tell which is which at a casual glance.


Yeah I much prefer this approach to the current standard of just putting a chat bot somewhere on the page and calling it a day.


Precisely. There are many such use cases too! It's disappointing to see the industry go all in on chatbot wrappers.


Yeah but sometimes, they just f it up. Like the PS crop tool was aok then they introduced the move the background instead of the crop rectangle way of cropping which is still to this day a terrible experience.

Also, Lightroom is one of the worst camera tools out there. It's only known because ADOBE...


More like "ship some half baked bullshit wrapper for ChatGPT or llama and call it revolutionary."


The source (Adobe MAX) demos a full range of incredible scenarios:

https://www.youtube.com/watch?v=gfct0aH2COw


The video is much better than the linked page. The video shows the dynamic multi-angle character rotation and other object rotations. https://www.youtube.com/watch?t=63


That event has the enthusiasm of old Apple demos


Weird that YouTube's default error page (xnx's post) is a 1 minute video titled "YouTube is not currently available on this device.".


Perhaps the OP meant to time stamp it, but the url was incomplete. The time stamp at 63 secs..

https://youtu.be/gfct0aH2COw?si=a8DGrbtjAzu1R6Yz&t=63


And a video from 7 years ago on their channel. So confusing


Maybe you missed the video in the linked article? it's the same demo.


Cut out the middle-man.


Not everyone likes watching videos.


Not bad, but what's up with the audience? Is there an Adobe cult or something?


Regardless of whether these are adobe employees or not, I’d argue that a feature like this warrants such a response.

It makes me miss Apple’s old keynote style that they’ve abandoned in favor of the bland, sanitized, over-polished and pre-recorded video keynotes.

I’m honestly over so much of the corporate cynicism and Blind-indification that’s turned what was once a necessary precautionary stance to this demonization or ridicule of people who happen to love their work and where they do it.


The audience is creative community members who use Adobe tools and are attending max (around 11,000 for this event).


They're attending and celebrating the demise of their own professions?


Their profession will be fine. They will continue to understand colours, composition and design principles better than me.

We will both have AI at our disposal, and they will make better designs quicker than me.


This is Adobe Max, which is a huge event held by Adobe for creators.

This session is "Sneaks" which is held every year, and has a fun, casual atmosphere. i.e. it has a theme, has a celebrity co-host, lots of jokes, food and drink served, etc...

It's basically a bunch of people who are creative, and are having fun nerding out on the tech...

It's a lot of fun.

(I work for Adobe)


Of course there is. Just like there's an Apple Cult, Android Cult, Facebook Cult, Sportsball team cult, blah blah blah. Any group that is large enough to attract that many users/followers/fans will naturally have a subset that is more gungho than the rest.


Yes. Forsyth's book titled Group Dynamics is a great one if one wants to know more about this kind of thing.


Also the "fun" stage setting. I once quit a big-ish tech company when they started pulling stuff like that.


> Not bad, but what's up with the audience? Is there an Adobe cult or something?

There are conferences for Adobe customers to teach them how to use Adobe tools. I think there was recently an Adobe Max conference in Los Angeles. It could have been filmed there.


I'm not sure about this particular event, but companies often have employees who worked on the products in the crowd during launches to provide even more crowd noise.


It's an immense feature. Illustrators love it. Of course they're enthusiastic about it. Man, there's nothing like this. You draw 2D vector art and then rotate it in 3D space. What the heck, that's freaking crazy. I'd be hooting and hollering. How can you not be losing your mind over this. It would accelerate so many processes.

My wife's an artist and says they had a shitty version of this but this is crazy.


There’s millions of Adobe product users out there. Almost all the design tool developers have keynote events.

As an aside, I hate that people like yourself describe fans of anything they don’t personally understand as cults. It’s an antagonistic framing of a question designed to remove any good faith discussion.


I think cult is really fitting for a massive group of customers who are locked in by a monopolist. Maybe even worse than a cult, because there's just the one.


Come on.. it is literally a dream come true for an artist whose nightmare is indecisive, confused clients.

You have to read some of the YouTube comments to understand that some of those hoots and claps could be for real.


Finally more AI tools for vectors!

With bitmaps you get a blob of pixels, but vectors can be edited and refined much more easily.


Now someone do this with a human object, and wait for the media to have a collective freakout


I'm a bit disappointed they didn't have an artist drawing something live and then rotating that.


This will work well about one out of every hundred times you try it. Enough to find good demo examples though.


Get an AI to generate 100 examples then


I find that Adobe is really pulling away from open source software with all this AI stuff. A few years ago it could be argued that GIMP, Inkscape, and Darktable could do almost everything that Photoshop, Illustrator, and Lightroom could, albeit with a jankier user interface.

But now none of the open source software can compete with AI generative fill, AI denoising, and now AI rotation.


With all due respect there’s never been a time when that could have legitimately been argued unless someone was doing relatively basic things with those apps or was a hobbyist.

There’s always been a significant gap in capabilities once you looked past the surface.

I find this sentiment is common among FOSS advocates who don’t actually professionally use those tools.

I am definitely an advocate for free tools closing that gap, but I both design content professionally and contribute to OSS projects to close that gap. So I feel quite confident in saying that gap has always been large when compared to the Adobe suite.


Even if the open source option is only slightly worse, why would a professional allow that to impact how they earn a living?


Let’s assume the open source software suite was exactly as capable and performant as Adobe’s.

Unless everything is exactly the same (even the default file format) you’re still going to be better off with “what everyone uses” because you have to share content, and one mistake of sending a GIMP file instead of a PSD isn’t really worth the savings.

Where the open source stuff shines is where you have 10,000 employees who need minor image editing once a month or less; then you can save millions by using GIMP et al.

It’s possible to have a complete open source image setup - https://www.peppercarrot.com/en/about/index.html has done it, for example (and the issues he experiences are worth reading).


It's not "slightly" worse. It's a 10x or 100x drop in the artist's performance/pipeline. No serious graphic designer I know uses FOSS vs Adobe products.

Plus they are industry standards so people expect those files at the end of the project.


Yeah precisely. The Adobe suite is affordable if you’re actively making money off using it. It’s also why there’s not as much investment in competing open source projects.

$80 CAD/mo for the whole suite minus the substance stuff (prices are regional). For the average freelancer in Canada, that’s not a consequential barrier to entry. That’s <$1k for a year for everything.

If I charge a rate of something like $40/hr, that’s two hours in a month? <2% of revenue. Am I going to risk spending that much extra time fussing with something else for 2% more $$?
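Spelling that arithmetic out as a quick sketch: the $80/mo and $40/hr figures are from the comment above, but the 120 billable hours per month is purely my assumption to make the percentage concrete.

```python
# Figures quoted above (CAD).
monthly_subscription = 80.0
hourly_rate = 40.0

# Assumption for illustration only: billable hours in a typical month.
billable_hours_per_month = 120

hours_to_cover = monthly_subscription / hourly_rate      # 2.0 hours of work per month
annual_cost = 12 * monthly_subscription                   # 960.0, i.e. under $1k a year
monthly_revenue = hourly_rate * billable_hours_per_month
share_of_revenue = monthly_subscription / monthly_revenue

# Billable hours needed per month for the subscription to stay under 2% of revenue.
breakeven_hours = hours_to_cover / 0.02

print(f"{hours_to_cover:.1f} h/month, {annual_cost:.0f}/year, "
      f"{share_of_revenue:.2%} of revenue, break-even at {breakeven_hours:.0f} h/month")
```

So the "<2% of revenue" figure holds for anyone billing more than roughly 100 hours a month at that rate; below that, the share creeps above 2%.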

Meanwhile Blender gets a lot of investment because the competition is much more expensive. CA$305/mo for Maya and I need to augment it with an adobe subscription for any non-3D work.


Sure as long as adobe doesn't decide to go back and do the whole "we own the rights to everything ever opened in one of our apps" again lol


Why wouldn't they, if they do not want to use them for professional deontology reasons? What impact size are we talking about?


> I find this sentiment is common among FOSS advocates who don’t actually professionally use those tools.

It's wishful thinking at best and delusion at worst. I see it constantly on hn on many threads. I suspect hn incents that kind of language because people on hn like being pandered to.


In some ways, having followed the open source image generation scene for a while, it feels a little bit like it's the opposite?

Most of the AI image generation stuff I've seen from Adobe feels late to the party in terms of what you can do with open source tools. Where they do compete, however, is with tight integration, and I guess that's what matters the most to users in the end.

There are plugins for gimp that let you do image generation, inpainting and other things.

As far as what the post shows, it looks very much like current models that generate novel viewpoints of an object, but for illustrations. It might be doable to fine tune this for illustrations and simply vectorise the new viewpoint again. Though this will destroy any structure previously held in the object.

All I'm saying is that we have the tech to do even more than what adobe is doing, we just haven't put it nicely together yet.


I think your last paragraph sums it up pretty nicely: users need a good UX to get to these tools.

So I would love if GIMP started shipping these awesome plugins by default to pick up the pace!


The more I spend time as a software developer, the more strongly I believe that UX is 80% of what makes a tool good, and that a lot of programmers really just don’t get that.


There are also the programmers that do get that, but just don't have the ability to change it. I'm no artist, but I can tell you when something looks bad. I'm constantly playing with CSS to learn new things to make things look better. I'm now in that category of "it looks like someone tried but just didn't achieve, but better than most" level of design.

Programmers making things for other programmers will always be forgiven as long as it works. Programmers making things for the general population will not be forgiven to the same extent if at all. As soon as someone releases something that is polished, it will be used even if it doesn't work as well.


Yup, I suck at graphics, but I know when something is awful, and oftentimes I have an idea as to how it could be better, but then again, I am terribly incompetent at graphics.


Thank you!! I pay for Office 365 to use desktop Word even though I'm a very basic user of it. I'm well aware that LibreOffice exists and serves most of my needs, and that Word Online and Google Docs could serve my needs. But they're all so horribly inferior to the classic Word interface that I choose to pay for it.

And as we all know, this is why the iPhone has been so successful despite bringing Android-like features years after they were launched.


IMHO Krita has really become the cross platform open source darling for graphic editors. There are some things that are unintuitive but it's leagues better than GIMP.


Krita is amazing but isn't it specialized for digital painting rather than general purpose image manipulation?


GIMP does not fully support non-destructive editing yet.

That, by itself, would be a complete deal breaker for professional work.

There's plenty more deal breakers remaining.


> A few years ago it could be argued that GIMP, Inkscape, and Darktable

To a Linux user, yes. To a professional, it was always a cruel joke, it was never close, even a few years ago. It's like saying Notepad++ is a functional IDE, or Kdenlive is a functional replacement for DaVinci Resolve.

I cannot stress this enough: Actual professionals do not think GIMP is a viable replacement, in any way, and never have.


I would also like to add (as a separate comment though, this will be controversial):

Some would say that GIMP, Inkscape, and Darktable aren't really competitive yet because they haven't had enough investment. If we invested in them enough, and managed them well, they could be like Blender.

GIMP has been in development since 1995. Photopea was initially released in 2013, has been solely developed by one person, and is a far-and-away better Photoshop competitor. The projects themselves are mismanaged. GIMP should (frankly) be abandoned and completely reset, in my opinion, as being a failed attempt at salvaging old code forever. Wisdom is knowing when to keep pushing - and when to give up.


GIMP did spawn one kinda good thing, GTK was made because Peter Mattis disliked Motif and wrote a replacement and called it the GIMP toolkit


Krita destroyed Gimp years ago. This debate is always by people who obviously don't even use these tools.

No one who has any idea what they are talking about would compare GIMP to Photoshop in 2024.


They'll probably be better able to compete once Adobe ups prices to reflect the actual cost of all that processing.


Photoshop is £30 a month. NASDAQ.com reports their net profit to be 40% and elsewhere they're reported to gross $20B revenue.

I think they can afford the ML based content generation costs without increasing prices.


They might do it anyway though. I have the "all apps" subscription but it's not actually everything they make any more, all their "Substance 3D" tools are another $50/mo. I can easily see this feature getting most of its functionality locked behind that extra subscription the way Illustrator's new 3d tools just give you a tiny handful of materials without that.


Oh for sure, their implementation is slick and they're somewhat of a monopolist. The whole "free to educational institutions" really worked well for them and MS.

I don't doubt they will put prices up, just noting they don't need to.


Adobe had 41% profit margin in 2020, but otherwise in the 25% to 30% range since 2018.

https://www.macrotrends.net/stocks/charts/ADBE/adobe/profit-...


>A few years ago it could be argued that GIMP, Inkscape, and Darktable could do almost everything that Photoshop, Illustrator, and Lightroom could, albeit with a jankier user interface.

It really couldn't


I was just thinking similarly. I don't need any of these AI features and I'm certainly not about to start giving Adobe money, but I'd be lying if I said I wasn't jealous.


"But now none of the open source software can compete with AI generative fill, AI denoising, and now AI rotation."

This is a common pattern across many fields. The truly top-end companies are always running ahead of open source.

But that doesn't mean it's a permanent situation. It just means you're looking at it from a point in time where the commercials got there, and open source hasn't yet. Open source will get there, and then Adobe will be ahead on something else.

I've played a bit with "comfyui" over the past few days, a bizarre name for an AI image generation power tool. (And other things, but I have no experience there to know how good it is at those.) It drips with power. The open source world is not generally behind on raw capability. As is often the case, open source's deficiency for generative fill for instance is that A: it offers too much control, too many knobs (e.g., "which of several dozen models would you like to start with?"), and while that's awesome if you know what you're doing, it is not yet at the "circle this and click 'remove'" stage, and B: the motivation and firepower to integrate this all into a slick package is not there. I can definitely do an AI generative fill with open source software, but I'll be exporting an image into comfyui, either building my own generative fill program or grabbing some rando's program online who may or may not be using compatible models or require me to install additional bespoke functionality into comfyui, doing my work, and re-exporting it. The job is done, but it's much more complicated, and most people don't care about the other extra capabilities this workflow yields so for them it's just cost.

It's a very normal pattern in the open source world. Nothing about the current situation particularly gives me cause to worry specially about it.

To be concrete, here's a YouTube video that's to the more advanced side of what you can do in the open source world, which is probably still ultimately simplistic compared to what some people do: https://www.youtube.com/watch?v=ijqXnW_9gzc That entire series is worth a look, and there's more it doesn't cover. You can get incredible control over these tools in the open source world, but it involves listening to some guy on YouTube trying to explain why you might sometimes want to use a thing called "dpmpp_2m_sde_gpu"... not exactly normie-friendly.


I mean, we've been able to do generative fill and denoising better in open source for a while; it's just not as easy (except for video, really).

What Adobe does is wrap those things in an easy to use app, and then charge for it, and hopefully not change their licensing again to grab everyone's shit again.

Regarding the scheduler (dpmpp): sure, Adobe doesn't tell you those things, but that's because they found one that worked, removed the options and packaged it up with a bow. Comfy, a111, forge, etc. are more complex because they give you EVERYTHING and let you have at it. There are frontends that wipe all that away, but they aren't successful because, like the Linux world, people in open source want to be able to tinker with all the internals and shit, which is why open source tends to see some groundbreaking optimizations, like taking the Flux model from requiring 30+gb of vram to running on 6gb of vram lol


Almost all the AI models come from open research. Yes, open software hasn't caught up because they don't have the billions of research money to implement these models, but that doesn't mean it won't happen at some point.


Not yet, but I imagine soon they will. Closed source is moving to video and open source is catching up to static images at an incredible pace. I won't be surprised if not only GIMP integrates something like a couple of general stable diffusion models, but pirated copies of Photoshop find a way to hook up a local generative model instead of the online stuff.


Can't speak for PS vs GIMP but I used to use Illustrator a fair bit and Inkscape was nowhere near it in terms of both features and usability. Now that was 15 years ago, so it may have caught up.


You are correct even today. Inkscape is great but it’s a fraction of the utility that Illustrator offers.

The only people who would actually equate them are people not professionally using these tools everyday.

Even paid apps like affinity designer are a fraction of the functionality of Illustrator.

Again, a great product but people are just dead wrong if they compare them as an absolute.


> The only people who would actually equate them are people not professionally using these tools everyday.

While someone who just needs a functional vector graphics editor and is not being paid to use adobe software might prefer inkscape. Speaking from personal experience, there.


Sure, if they fit your need, nobody is saying you should pay for more.

This is a common response when someone says something is better. It doesn’t mean they’re saying you need the better stuff or judging folks who don’t need it.


I'm not convinced. The flows are a little less convenient right now, but that's basically it.

Ex - I can absolutely get exactly this same rotation feature using open toolchains, they just haven't been nicely consolidated into a pretty package yet.

So to recreate the same thing adobe is doing here I currently have to:

1. Use the 3d-pack in comfy-ui to get stack orbit camera poses for my char (see: https://github.com/MrForExample/ComfyUI-3D-Pack scroll down to stack orbit in the readme)

2. Import those images back into the open source tool manually.

Is it as convenient? Nope - it requires a lot more setup and knowledge.

Is it hard to imagine this getting implemented in open source? Also nope. It's going to happen, it just won't be quite as quick.


Do you know if there is an AI tool to “explode” an image, such as a character, into individual parts for a texture atlas?

See example: https://user-images.githubusercontent.com/9606161/53874756-0...


They can, but the user experience is abysmal, useless and nerve-racking.


I spent so many hours trying to do rotations with a pirated copy of Flash as a kid, and I never really got the hang of it, and it always bothered me how deceptively hard rotation was; when I would show my parents my work, they would do their very best to try and act excited but I could tell that they weren't really impressed with the effort because it doesn't seem that hard, at least to a lot of people.

This makes me irrationally happy.


Yeah, this is one of those things that seems trivial until you try to do it, and then it's impossible.


If you rmb-click on the video and select "show controls", you will not only be able to seek, but you'll also be able to unmute it.

I don't know why it was embedded with the controls hidden.


This is the true power of generative AI, enabling new functionality for the user with simple UX while doing all the heavy lifting in the background. Prompting as a UX should be abstracted away from the user.


This probably isn't backed by an LLM but instead some kind of geometric shape model.


How do you explain a horse's 2 legs becoming 4 legs when rotated, assuming they only drew 2 legs in the side view?


The second L in LLM stands for "language". Nothing of what you're describing has to do with language modeling.

They could be using transformers, sure. But plenty of transformers-based models are not LLMs.


They are probably looking for LGMs - Large Generative Models which encapsulate vision & multi-modal models.


The model need only recognize from the shape that it is a horse, and would know to extrapolate from there. It would presumably have some text encoding as residual from training, but it doesn't need to be fed text from the text encoder side to know that. Think of the CLIP encoder used in stable diffusion.


Ok that's VERY impressive, now give me the possibility of exporting it as an .stl to 3D print and then we'll be talking. Just imagine drawing something in 2D and be able to print it as a fully 3D object, it gives me chills just by thinking about it.


This is here now. It's possible to set up a flow in a node based ai tool to go from text prompt to stl in a single shot.

The current results using open-license models aren't incredible in their fidelity, but it's literally here today.
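For anyone curious what the last hop of such a flow looks like, here is a minimal sketch, assuming the upstream model has already handed you a triangle mesh as plain vertex tuples. The names `facet_normal` and `to_ascii_stl` and the tetrahedron are made up for illustration and are not taken from any of the tools in this thread; only the ASCII STL layout itself is standard.

```python
def facet_normal(a, b, c):
    """Unit normal of triangle (a, b, c) from the cross product of two edges."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = (nx * nx + ny * ny + nz * nz) ** 0.5
    if length == 0.0:
        # Degenerate triangle: fall back to a zero normal.
        return (0.0, 0.0, 0.0)
    return (nx / length, ny / length, nz / length)


def to_ascii_stl(triangles, name="sketch"):
    """Serialize a list of (a, b, c) vertex triples as an ASCII STL string."""
    lines = [f"solid {name}"]
    for a, b, c in triangles:
        n = facet_normal(a, b, c)
        lines.append(f"  facet normal {n[0]:e} {n[1]:e} {n[2]:e}")
        lines.append("    outer loop")
        for v in (a, b, c):
            lines.append(f"      vertex {v[0]:e} {v[1]:e} {v[2]:e}")
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


# Hypothetical stand-in for a generated mesh: a unit tetrahedron,
# wound counter-clockwise when seen from outside.
o, x, y, z = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
tetra = [(o, y, x), (o, x, z), (o, z, y), (x, y, z)]
stl_text = to_ascii_stl(tetra)
```

Writing `stl_text` to a `.stl` file gives something a slicer can open. Real generated meshes usually need the cleanup pass first, since holes and flipped faces tend to upset slicers.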


Links please?


https://github.com/MrForExample/ComfyUI-3D-Pack

Edit to add - the meshes it generates aren't the cleanest, so consider a follow up with something like:

https://www.meshlab.net/

Note - if 3d printing isn't your jam, you can also do things like generate game assets. Ex - take the output mesh and throw it into something like https://www.mixamo.com/#/ for auto rigging.

Again - the open source tooling is a bit fiddly, but it's workable (you'll need to be familiar with python packaging to get the 3d pack up and running in comfyui). Quality of the meshes will vary a lot based on input.

You can also find (somewhat) up to date info on new models and papers here: https://github.com/ActiveVisionLab/Awesome-LLM-3D?tab=readme...

If you don't want to run it yourself - there are also several commercial offerings right now:

https://www.meshy.ai/

https://www.3daistudio.com/

https://3dfy.ai/

etc


As someone who currently works in GenAI and analytics but paid their way through college doing design (for print media) and still keeps around old copies of Illustrator and Fireworks (running under Wine) as well as using Affinity Suite, this is STUPEFYINGLY more impressive than any LLM.

Still not enough to make me pay for Adobe Creative Suite (I just dabble these days), but the target demographic will be all over it.


I don't think quite the same kind of tech, but this kinda reminds me of the "3D" pixel art sprite editor thing in Smack Studio

https://youtu.be/sM3ss-lY1zU?t=10


I was just about to post the same link. It is really cool tech and Smack Studio is when I first fell in love with the concept. What Adobe has is cool, but I've seen it already.


In summary, Smack Studio is a fighting game with a character editor that can rotate any 2D pixel art sprite as if it were 3D by extrapolating a depth map. You can export the resulting sprites for use outside of the game.

The game was released on 2024-07-31.
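To make the depth-map idea concrete, here is a rough sketch, assuming a depth value per pixel is already available. This is not Smack Studio's actual implementation; `rotate_sprite` and the tiny sprite are hypothetical. Each opaque pixel is treated as a point at its depth, rotated about the sprite's vertical centre line, and written back with a z-buffer so nearer pixels win.

```python
import math


def rotate_sprite(sprite, depth, angle_deg):
    """Rotate a sprite about its vertical centre line using per-pixel depth.

    sprite: rows of pixel values, with None meaning transparent.
    depth:  same shape as sprite; larger values are nearer the viewer.
    """
    height, width = len(sprite), len(sprite[0])
    cx = (width - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[None] * width for _ in range(height)]
    zbuf = [[-math.inf] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            pixel = sprite[row][col]
            if pixel is None:
                continue
            x, z = col - cx, depth[row][col]
            # Rotate (x, z) about the vertical axis; rows stay where they are.
            x_rot = x * cos_t + z * sin_t
            z_rot = -x * sin_t + z * cos_t
            new_col = round(x_rot + cx)
            # Keep only the nearest pixel that lands in each output cell.
            if 0 <= new_col < width and z_rot > zbuf[row][new_col]:
                zbuf[row][new_col] = z_rot
                out[row][new_col] = pixel
    return out


# Hypothetical one-row sprite whose middle pixel bulges toward the viewer.
sprite = [["a", "b", "c"]]
depth = [[0.0, 1.0, 0.0]]
turned = rotate_sprite(sprite, depth, 30.0)
```

Forward-mapping pixels like this leaves holes at larger angles, so a usable tool would have to fill those in somehow. That filling-in is presumably where the "extrapolating" part earns its keep.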


Incredible, but a shame you'll have to use Adobe to get it.


Yes. I absolutely despise Adobe, and I will not be using this.

They were double charging me for Photoshop for two years. I caught them and it took 60 minutes on the phone to get them to do something about it.

They have an entire cancellation department. (!)


probably not, there's a new and growing market of image editing software with a wide variety of tricks, i wouldn't be surprised to know this is already implemented on some niche website


Came here assuming they were using AI for "rotate 90°" ready to drop a rant, but this was actually impressive.


I had a similar negative reaction to the grandiose title but in this case it was totally deserved and I am pretty blown away.


Looks like Adobe finally found a way to cut down on piracy.

None of these new AI features will work on a pirated copy because it's all server-side processing.


Good idea, but such a frustrating company to do business with as a consumer


There are actually multiple open source ML models for 2d to 3d which is clearly what they are doing. The difference with most of them is that this is vectors.

There might actually be a similar open source model already.

But I think to create it you would build it from a database of 3d assets that you could render from many angles. Probably quite similar to the way the 2d to 3d works. I don't know maybe the typical 2d to 3d models will work out of the box or with some kind of smoothing or parameterization. Maybe if you have a large database of parameterized 3d models then you combine that with rendering in 2d from different angles then you can basically use the existing 2d to 3d model.

https://replicate.com/collections/3d-models
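The "render from many angles" step in the comment above could start from something like this: a minimal sketch that only computes where the cameras sit on an orbit around an asset at the origin. The function name, the y-up convention, and the default radius and elevation are assumptions for illustration; the actual rendering would be done by whatever 3D tool you already use.

```python
import math


def orbit_camera_positions(num_views, radius=2.0, elevation_deg=15.0):
    """Evenly spaced camera positions on a circle around the origin (y is up)."""
    elev = math.radians(elevation_deg)
    positions = []
    for i in range(num_views):
        azim = 2.0 * math.pi * i / num_views
        x = radius * math.cos(elev) * math.sin(azim)
        y = radius * math.sin(elev)
        z = radius * math.cos(elev) * math.cos(azim)
        positions.append((x, y, z))
    return positions


# Eight views around a hypothetical asset, each camera looking at the origin.
views = orbit_camera_positions(8)
```

Pairing each rendered view with its azimuth gives (image, angle) training examples, which is roughly the kind of dataset the comment describes.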


Are you sure that’s what they’re doing? In the demo, they show that the vector sections have been preserved, so there’s clearly more to the story. Maybe 2D -> 3D, map path to vertex, rotate, project path back into 2D?


It's 2d to 3d back to 2d and converted to vectors.
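The "lift, rotate, project back" pipeline guessed at in this subthread can be sketched in a few lines. To be clear, this is not Adobe's method, which nobody here has seen. The per-point depths are a hypothetical stand-in for whatever a model would infer, and `rotate_path_y` is a made-up name.

```python
import math


def rotate_path_y(points_2d, depths, angle_deg):
    """Lift 2D path points to 3D with per-point depths, rotate about the
    vertical (y) axis, then orthographically project back to 2D."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    projected = []
    for (x, y), z in zip(points_2d, depths):
        # Rotation about the y axis changes x (and z); y is untouched.
        x_rot = x * cos_t + z * sin_t
        # Orthographic projection: drop the rotated z, keep (x, y).
        projected.append((x_rot, y))
    return projected


# A square path with hypothetical depths of zero, i.e. a flat card.
square = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
flat = [0.0, 0.0, 0.0, 0.0]
turned = rotate_path_y(square, flat, 60.0)
```

Because the output is the same list of path points in the same order, the vector structure survives the round trip, which would be consistent with the preserved sections seen in the demo. With all depths at zero the card just gets narrower as it turns, so everything interesting is in how the depths get estimated.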


I think it's not the same but similar.


I thought this was one of those sarcastic headlines, highlighting the overuse of AI for basic processes.


SIGGRAPH from over a decade ago has entered the chat...

https://www.youtube.com/watch?v=Oie1ZXWceqM

It may not be AI, but this single video blew my mind back in *2013* and I find myself thinking about it often.


The video you shared very much looks deserving of the 'AI' label to me.

Perhaps you mean it doesn't use some of the techniques driving the current AI boom, like LLMs or diffusion models.


This is great -- I'm always amazed how effective classical algorithms are at doing so-called "neural tasks". What's strange is how little SIGGRAPH tech ever makes it out as a consumer product.


> SIGGRAPH from over a decade ago has entered the chat...
>
> https://www.youtube.com/watch?v=Oie1ZXWceqM

A version of this was available in Photoshop for a long time, but I think the feature was deprecated and removed completely this year. I had used it for a few things here and there, but dedicated 3D tools were much better if you were working in that space.


Looks time consuming



This looks very cool. I really hope the results are not overly cherry-picked like Adobe's first version of the text-to-vector generation that only worked particularly well for the showcased art styles.


I won't be excited until it's live in an app; company demos are always exciting.


This captures the essence of what "modern AI" is great at! Relieving the tedium of a highly constrained task.

Great demo. This will really help animators and artists.


How very strange, my partner was mocking up a room for our home just a few hours ago, and I asked whether an AI tool existed to rotate the incorrect angle of a sofa in a photo being used within the mock-up - and here it is on Hacker News just an hour later, just that tool.

Edit: apparently I misunderstood; it's only possible with vectors. Getting close to the reality mentioned, though!


preserving the vector art after transforming is really cool, anyone know the relevant papers? or was this original research done by Adobe?


I found the Project Turntable page on Adobe's site more interesting (with embedded video) on mobile than the linked CreativeBloq site:

https://www.adobe.com/max/2024/sessions/project-turntable-gs...


Amazing, this will give ancient GIFs a facelift.


so, pacman will have 3 D characters now ?


Yes, Ms. Pacman has the DDs, which is why PacMan himself gives her the D.

3 Ds.


mr and ms pacman shop at DDs Discounts ? 3Ds


> Adobe's Brian Domingo told Creative Bloq that like other Adobe Innovation projects, there's still no guarantee that this feature will be released commercially.

Well, I confess I got a little bit confused here :/ . What's the purpose then for such an innovative solution if not commercialized?!


It means it’s a tech demo that still has issues which they think they can solve but aren’t sure.

Anything could happen and they need to be sure they don’t run afoul of securities law by “promising” something they don’t end up delivering.


I've seen a lot of cool shit from Adobe, but it's mostly rehashed stuff that's been cleaned up from public workflows we've seen done in ComfyUI and other Flux/Stable Diffusion based expansion workflows... like the IC-Light style relighting they demoed...

But this... this is really fuckin cool


I am sure this is the right time for hobbyists to make their own movies and animations.

I personally started programming, in part, to make simple animations like the ones you see in Scratch, and it’s incredible how accessible the tools are today for anyone looking to bring their ideas to life.


It looks cool and convenient for people like designers and other non-technical content creators. One natural follow-up would be: can we find many other similar operations that creative people use every day and tackle them under a unified framework?


Makes me think of

https://lookingglassfactory.com/looking-glass-go-spatial-pho...

which needs multiple views of your image from different angles and tries to make them up with AI.


Does anybody know a DIY solution to get a similar result? I am asking because $300 seems like a lot of money for this.


Well, when is some big bad company going to bully us into using their tools to convert 3D sculpts into flawlessly animatable models? I'll submit to their abuse and surrender my lunch money to them. Though not if it is Adobe, I still have some self-love.


haven't been in the loop for a while, stupid question: why do people hate adobe


Not a graphic designer, so I can't speak for their reasons, only for mine. First, I had a Photoshop subscription and when I cancelled they wanted to fine me for cancelling. Then they bought the Substance suite and made it subscription only and very expensive (unless you buy the Steam version, whose price they doubled). That also hurt me, when I could barely afford those tools.

I'm better off now, but I have a long memory and prefer to vote with my wallet by paying multiples to any competitor...which generally speaking is better for me and everybody else, since competition is the mother of innovation.

Apropos, Marmoset Toolbag 5 is out; it comes with a permanent license, it has a huge materials library, and the interface is very snappy and doesn't feel like it was programmed using Electron. You don't need to pay for Substance Painter this year.

Ah, and Adobe's latest exploit was a confusing TOS that more or less stated they would use your work that you edited locally with their software to train their AI models. I think they walked that one back when the wave of outrage hit them.


I do like AI stuff but isn't it simpler to just introduce 3d clipart essentially? Sure models could then be generated, but traditionally made models could be used too


It is a lot easier to make a 2d drawing than it is to make a 3d model.


I want the actual 3D models.

This looks like the perfect tech for a cel shaded game!


As someone who otherwise hates genAI, I must admit, this is actually a very cool demo and a very sensible application of AI.


Pretty incredible


Completely agree. I thought this was going to be some underwhelming nonsense, but that is legit impressive and something even a non-artist could benefit from.


Arguably non-artists benefit the most. This is a time saver for skilled artists but a whole new ability unlock for the unskilled ones.


This is basically like taking your 2D drawing to an artist and saying "draw this for me from different angles." Only now the artist is a computer, and probably costs you a lot less than paying a real artist every time you want to do this.

Animators are even more out of a job I guess, but really have been for quite some time I think, almost no animation is entirely hand-drawn anymore.


A large amount of animation made in Japan is still initially animated by hand on paper, actually! The anime industry is remarkably conservative, technologically, which makes it all the more impressive that its animation production output dwarfs that of most other places, including ones that have largely switched over to 3D or puppet rigging for animation productions...


I was going to write how this would be cool in a kids' drawing app, but then had the thought that they might never feel the need to draw something from a different angle. I wonder what other activities have been lost to time and technology.


> what other activities have been lost to time and technology

- flintknapping

- the distaff activities: carding, spinning, weaving, etc.

- "teamster" as a very highly skilled occupation

EDIT: compare https://www.youtube.com/watch?v=JD2ua6q8FFA&t=475s with https://www.youtube.com/watch?v=gjZX6L5cnUg&t=11s


Socrates was against the invention of writing because it meant people lost the skill to memorize and recite

https://www.historyofinformation.com/detail.php?id=3439


It really has been destructive. He anticipated the day in which you could change people's memories by editing the internet.


Depends on whether the kid wants to learn to draw, or just wants to create drawings.


Why might a kid not want to draw something from a different angle? In my introduction to drawing course I was asked to draw my non-dominant hand every day for a week, each time from a slightly different angle.


Because instead of doing that, they could have the computer rotate their drawing to a new angle.


I'd love to see what this tool does with bad drawings, heh.


It took me a while to understand that the second picture is actually a muted video with hidden controls.


One thing is you can't be lazy when drawing the initial vector. With a car, for example, you can't just draw it from the top and expect it to generate a side shot after rotating. You need to draw maybe an isometric version first.


Has anyone actually used this tool? I wonder how cherry-picked the example is.


Very nice! If only Adobe did Linux builds, that would be great.


pretty much how I always assumed AI-powered tools'd work. still, mind blowing to see in action


NeRF or gaussian splatting?


I don't think either of those would work with a single 2D vector image.



I'm pretty tired of seeing AI slapped on everything but holy shit this is impressive.


Yeah, nothing Adobe does is impressive enough to be worth dealing with them.


Is there another source? None of the images loaded for me.


Edit: The article embeds this video, but apparently it embeds without sound. Here's a direct link.

https://cdn.mos.cms.futurecdn.net/RnibKWuooxUZ7xptBz4UHd/Pro...


People have been using 3D models for 2D graphics for at least a decade. 3D models rotate, by default.

This demo shows generating a 3D model from a simple 2D shape. It'll fall flat on its face trying to 3D model anything non-trivial which begs the question - who cares?

Also, you'll want to animate the 3D model - which this doesn't do, so you'll soon be back to your usual 3D toolkit anyway.


The difference is that you don't need a 3D model for this.

You start with 2D vector graphics that is significantly easier to create.


Yes, and the OP did acknowledge as much.


This is in a wholly 2D program. The input is implied to be one completely flat vector drawing, which Illustrator turns into a 3d model, and renders back into flat vectors at multiple rotations, with no further work on the part of the artist.

(I say "implied" because that's all they're showing in the video presentation, there may be additional setup involved that they're skipping. This is inside Illustrator though, which has a long history of 3d extensions being very awkwardly shoved into a corner of its toolset.)



