Show HN: Made some progress on character consistency for AI storytelling (artflow.ai)
77 points by tim_artflow 9 months ago | 24 comments
It has been a major challenge for all AI storytellers to create images of a character with a consistent face, hair, outfit, and body type across different scenes. We took a stab at this problem at Artflow, and we'd like to show it to you to gather some early feedback.

Please note that this is still an early version and we fully admit it's not perfect.

See a tutorial/sample here: https://app.artflow.ai/releases#release%203.5.1%202023-11-29


There’s an example here: https://app.artflow.ai/gallery/story/video/44e85eaef3c541629...

Judge for yourself.

My take: it doesn't seem remarkably different in consistency from other options, and the animation is massively inferior to other documented techniques (e.g. what Corridor Crew did in https://www.youtube.com/watch?v=FQ6z90MuURM several months ago).


First, you are right: Corridor's video is far more dynamic and its consistency is great. However, it was made with real human acting, video-to-video style transfer, days of trial and error, and post-editing, whereas we are building a tool that drastically lowers the barrier to creation, with AI actors and AI scenes, at several orders of magnitude lower cost. Not really in the same league.


It's essentially a static image with a mouth moving when a sound plays. There's no real connection between what's being said and what shapes the mouth is making, and the lack of any movement on the rest of the face is just incredibly awkward to watch. It might be a first step towards something great, but it seems far too rough and early to be showing off.


It's also just the face that is moving; the hair and the body are still. Really awkward and artificial.


It looks like Clutch Cargo!


Seems really disingenuous to compare this to Corridor Crew's video, which required filming real actors for each scene they wanted to create.


It's a face IP-Adapter and still the abysmal 128px wav2lip model. The training code is open; at least do yourself the favor and train it at a higher resolution instead of being one of X companies doing the same as everyone else.
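If you're curious what the face-adapter half of that stack looks like in practice, here's a minimal sketch using the diffusers library and the publicly released IP-Adapter weights; the prompt, scale, and file names are just illustrative, and I'm obviously not claiming this is Artflow's exact setup:

    # pip install diffusers transformers accelerate (a recent diffusers assumed)
    import torch
    from diffusers import AutoPipelineForText2Image
    from diffusers.utils import load_image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Load one of the published face-focused IP-Adapter weight files.
    pipe.load_ip_adapter(
        "h94/IP-Adapter", subfolder="models",
        weight_name="ip-adapter-full-face_sd15.bin",
    )
    pipe.set_ip_adapter_scale(0.6)  # how strongly the reference face is enforced

    face = load_image("reference_face.png")  # the character's reference shot
    image = pipe(
        "the same woman walking through a rainy market at night",
        ip_adapter_image=face,
        num_inference_steps=30,
    ).images[0]
    image.save("scene.png")

Reusing the same ip_adapter_image for every shot is what keeps the face close between scenes; it says nothing about clothing, which would explain the outfit drift people describe elsewhere in this thread.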


As a noob in generative AI, could you go into more detail about the face IP-Adapter and wav2lip models? How were you able to figure out that OP was using them?


The generated people have the typical face-adapter bugs; you can learn about them here [0]. wav2lip [1] is an old GAN-based model, and its output has a slight green tint. The rest of the animation is just AnimateDiff.

[0] https://www.youtube.com/watch?v=t2OBzV3UHv4

[1] https://github.com/Rudrabha/Wav2Lip
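For the animation part, the stock AnimateDiff recipe from the diffusers docs looks like this (again, this is the public reference pipeline, not necessarily what they actually run):

    # Stock AnimateDiff text-to-video recipe, per the diffusers docs.
    import torch
    from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter
    from diffusers.utils import export_to_gif

    adapter = MotionAdapter.from_pretrained(
        "guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16
    )
    pipe = AnimateDiffPipeline.from_pretrained(
        "emilianJR/epiCRealism",  # any SD1.5 checkpoint works here
        motion_adapter=adapter, torch_dtype=torch.float16,
    ).to("cuda")
    pipe.scheduler = DDIMScheduler.from_pretrained(
        "emilianJR/epiCRealism", subfolder="scheduler",
        clip_sample=False, timestep_spacing="linspace",
        beta_schedule="linear", steps_offset=1,
    )

    frames = pipe(
        "a woman smiling, gentle camera pan",
        num_frames=16, num_inference_steps=25, guidance_scale=7.5,
    ).frames[0]
    export_to_gif(frames, "animation.gif")

Sixteen frames of gentle motion over a mostly static composition gives exactly the "postcard with a bit of morph" look mentioned elsewhere in the thread.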


That sounds super cool, but to be frank, without some details about how you did it, or at least some evidence that you have, I don't see how this belongs on HN or what sort of discussion we're meant to have about it. Currently this link drops you into the UI for your product, which puts the burden of connecting those dots onto the community before any kind of discussion can happen.


Good point. I just updated the description to include a link to a sample/tutorial (https://app.artflow.ai/releases#release%203.5.1%202023-11-29).


Sorry if this is ignorant, but it would be really amazing if you could export an STL of your characters...


That isn't currently technically feasible.


I feel like the proper use for a story someone writes using AI is to have my AI read it and post a review and never tell me about it.


Eh, is the costume really "consistent" if the chest plate of a character's armor changes in each shot? Or if the tassels on a sweatshirt are suddenly chrome?

I don't think this would pass for good consistency in a regular film. It still looks like AI to me; I'm just being honest, as a totally non-expert media consumer. Consistency problems are a huge annoyance to me in movies; they're the kind of detail you should get right at the very least if you want to make a serious work.

But if I were making a serious work, I don't think I could trust this, based on the images in the linked page's masthead, and in that case this isn't a solved problem at all. I wish you had not claimed it was solved; it would be fine to say you've made a dent in the problem! But now I just feel like you baited me into trying your app.


These are valid points. I agree consistency is not 100% preserved (it's a super challenging task, and full consistency is the goal we are striving for). At the same time, we've made major progress in getting to about 80% consistency, and we'll keep improving, so hopefully it'll get there soon. Appreciate the candid feedback!


I'm happy to admit it's very strong progress. I can see how the characters relate between shots now, since they wear (at least) the same color and style of clothing.


Consistency is stellar. Do you plan to work on animation next? All the examples on your site look more like a postcard with a bit of morph applied. Annoying Orange is more expressive.


Right now you can get lip motion, and we do plan to tackle larger-scale animation down the road. Regarding "Annoying Orange", could you help me understand what it refers to?


It's an old meme/animation.

https://www.youtube.com/watch?v=DD5UKQggXTc

The title card really tells you all you need to know.


The screenshots look super cool! The video doesn't play on iOS Safari; I'm gonna sign up when I get to my desktop. But this is super cool! I really thought the character consistency problem wouldn't be solved in the near future.


Thanks and please do give it a try! Curious to hear your thoughts :)


How does it compare to SadTalker?





