Deep Angel: AI that erases objects from images (mit.edu)
147 points by kumaranvpl 79 days ago | 57 comments



Tried to remove an elephant in a hard instance (what to do with the woman on top?), and a bicycle in an easy instance (lone bicycle), both found from Google Image Search.

http://deepangel.media.mit.edu/showcase/Works%20terribly

http://deepangel.media.mit.edu/showcase/Another%20greyish%20...

Terrible results in both. It looks like it just works on their examples and doesn't do any better than object detection + paint a solid rectangle on others (arguably, it does worse than that).


Unfortunately, in my experience that's the case for most open-source AI projects out there: the showcase results are hand-picked, or the algorithm was trained and tuned to solve that specific image.


It makes me suspect that the overfitting concerns about deep learning are correct.


I doubt this AI gets perfect performance even on the training set. Deep generative models are known to underfit rather than overfit, i.e. they can't even do a good job on the full training set, let alone the test set. The cherry-picked examples you see are just statistical outliers corresponding to VERY easy examples.


Statistically-random excellent performance in complex tasks is very unlikely. More likely is that examples are in-sample from a small training set or very similar. Big NNs can memorize anything.


They can memorize any supervised learning task, but so far, we haven't been able to see any deep generative model successfully memorize something more complex than MNIST.


I think a 3 year old would do a better job than this AI. Not sure why they bothered with such a fancy website for this to be honest.


This happened with my image as well. AI just put a big gray rectangle on top of the persons and called it a day.

https://s3.amazonaws.com/deepangelai/results/person/thomasah...


Damn, this is embarrassing considering how much PR and marketing MIT puts out. That image is almost comical; it's as if somebody just got tired and holds a loose definition of "removing an object from a picture".


This system developed by Japanese researchers precedes this and seems to have better results:

https://news.developer.nvidia.com/automatic-object-removal-a...

MIT Media Lab has much better PR, of course.


If you're interested in seeing the top of the line inpainting results, you should check out DeepFill and DeepFill2.

http://jiahuiyu.com/deepfill2/

Deep Angel is based on an architecture combining Mask R-CNN (for object detection and instance segmentation) and DeepFill for image inpainting.

And here's the paper behind DeepFill

https://arxiv.org/pdf/1806.03589.pdf

If you look at the papers, you'll see which one has the best inpainting results.


Can you help us understand why deep angel is giving gray blobs if these two pieces seem to work well independently?


That's a super fair question. Depending on the image, Deep Angel can produce a quite plausible background, but sometimes it just fills in the object with a "gray blob." The gray blob issue can arise when (1) the object is too big relative to the rest of the image. For example, consider an image in which the object you wish to remove makes up 67% of it. In this case, DeepFill doesn't have enough context to fill in the pixels in a plausible manner. (2) The object is on the side of the image. The further the object is from the center, the less information DeepFill has to plausibly inpaint. (3) The training data is quite different from the test data.

The gray blob is a collapse of the pixels to the mean of the colors and textures around the removed portion of the photo.

The AI isn't yet perfect, and as you use the AI, you'll start to see which kind of photographs work really well and which do not.
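
To make the gray-blob explanation above concrete, here is a minimal sketch in plain Python. It is not Deep Angel's or DeepFill's code: the function names, the toy grayscale grid, and the choice to average all unmasked pixels (rather than only those near the hole) are assumptions made for illustration.

```python
from statistics import mean

def mask_fraction(mask):
    """Fraction of pixels covered by the object mask (True = pixel to remove)."""
    total = sum(len(row) for row in mask)
    covered = sum(1 for row in mask for m in row if m)
    return covered / total

def mean_fill(image, mask):
    """Fill masked pixels with the mean of the unmasked pixels.

    This imitates the 'gray blob' outcome: with too little usable context,
    the fill degenerates to an average of the surrounding colors.
    Assumes at least one pixel is left unmasked.
    """
    context = [p for row, mrow in zip(image, mask)
               for p, m in zip(row, mrow) if not m]
    fill = mean(context)
    return [[fill if m else p for p, m in zip(row, mrow)]
            for row, mrow in zip(image, mask)]

# Toy 3x3 grayscale image; the object occupies the centre pixel.
image = [[10, 20, 30],
         [40, 255, 60],
         [70, 80, 90]]
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]

filled = mean_fill(image, mask)
print(mask_fraction(mask), filled[1][1])
```

A large `mask_fraction` (the 67% case mentioned above) means the average is computed from very few context pixels, which is one way to think about why the result looks like a flat blob.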


This is one of their example images: https://i.imgur.com/ZOU0vWL.png

This is what Deep Angel makes of it: https://i.imgur.com/JPvq7P7.png

This is what Krita makes of it, after I manually erase the bottle and drag the smart patch tool over it: https://i.imgur.com/nGa5e0L.png

Deep Angel is better, but it's underwhelming.


Gimp Resynthesize (all default settings): https://i.imgur.com/KvAQO7H.jpg


Gimp Resynthesize can do way better: https://i.imgur.com/prR6I8d.png


Using GIMP "smart remove selection", which uses Resynthesize, with default settings, I got this: https://imgur.com/a/9vEizbj


Look at the dish and how well it predicted its shape. Also just from a tiny amount of blue it concluded that there should be another pan behind the bottle. That's pretty impressive imo.


Someone, please do Photoshop content-aware fill.


Photoshop 19.1.1 content-aware fill: https://i.imgur.com/Y0OHzav.jpg


> This is what Deep Angel makes of it: https://i.imgur.com/JPvq7P7.png

So it turns objects into ghost versions.


In the case of the first example I looked at.

It does better with this dog: https://i.imgur.com/mkP5tnb.png https://i.imgur.com/3F7pPj0.png


That's an extremely easy test case, and I've seen better results several years ago.


It does OK with the dog. But it's still noticeable, and the only thing it needs to do is fill with the ground texture.

If it can't seamlessly handle a picture where the entire replacement is a single texture, then... nah, I'm not impressed.


When I press Ctrl-A on the homepage, I see hidden text:

"the first axiom of spam: if you don't see spam around you, that means that everything around you is spam. enjoy the apophatic palimpsests."

But I don't understand what it means.


Talking about the website itself: it's one of the worst ones (to put it mildly) I've seen in a while.


That hidden text has a link to http://spam.church/ .


And spam church is supposedly built by this fake corp - http://kendallcorp.mit.edu/

-------------------- Text on website (un-zalgofied) says

The Kendall Corporation has been providing infrastructure for worship since 1964. With over a hundred funded fads, religions and spiritual movements across the five continents, Kendall Corporation is the leading powerhouse for seeding creeds.

1. Do you have an early-stage development of a conducive mesostructure?

2. Are you good at building platforms to enlighten, seduce and draw crowds in?

3. Is a network of your making growing at superlinear speed?

Please reach out to us; we may be able to provide angel funding for your startup idea.

https://www.dropbox.com/s/6v4t2vegand2v1l/Spam%20Church_Stor...

(removed address and phone number -- even though fake, I don't like linking people's info)

---------------------

Looks like an MIT Media Lab graduate's painfully pretentious arg/art project (i love it) :D


> Looks like an MIT Media Lab graduate's painfully pretentious arg/art project (i love it) :D

Then it leads here: http://isthisabook.club/


In 1992, Michael Crichton's _Rising Sun_ had manipulation of security camera footage as its crucial plot device. It was painstaking, extraordinarily expensive work, reserved for covering up murder.

With this and several other recent developments, we're reaching a point where it can be done automatically. Which means cheap or free.

Hmm.


This is why more content needs to be digitally signed. Imagine if you had access to archive.org and ran an AI fact-changer over it. How would you and I know?
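
A minimal sketch of how tamper detection on archived content could work, using only Python's standard library. Note the assumptions: this uses an HMAC, which is a shared-key message authentication code rather than a true public-key digital signature (a real scheme would use something like Ed25519 from a third-party library), and the key and page contents are hypothetical placeholders.

```python
import hashlib
import hmac

def sign(content: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for the content."""
    return hmac.new(key, content, hashlib.sha256).hexdigest()

def verify(content: bytes, key: bytes, tag: str) -> bool:
    """Check the tag with a constant-time comparison."""
    return hmac.compare_digest(sign(content, key), tag)

key = b"hypothetical-archive-key"          # placeholder secret
original = b"archived page, as captured"   # placeholder content
tag = sign(original, key)

tampered = b"archived page, as rewritten"
print(verify(original, key, tag), verify(tampered, key, tag))
```

Anyone holding the key can check that a copy still matches its tag; an edited copy fails the check unless the editor also has the key.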


Do you think this from hands-on use, or is this an assumption based on headlines?


I've often wondered whether certain internet memes are actually state actors practicing their craft. https://www.youtube.com/watch?v=q3SFXQfE4kk


You can do that without AI.

See PatchMatch algorithm.

Paper here: http://gfx.cs.princeton.edu/pubs/Barnes_2009_PAR/patchmatch....


I'd say even seam-carving removal+addition could probably do a lot of the same without the complexity of even that algorithm (much less an AI).
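
For readers unfamiliar with it, the seam-carving removal idea can be sketched roughly as follows. This is an illustrative toy, not any particular tool's implementation: the function names and the `reward` bias are assumptions, the energy map is taken as an input (real implementations usually derive it from image gradients), and only the removal half is shown, not the seam re-insertion that would restore the original width.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the minimum-energy vertical seam."""
    rows, cols = len(energy), len(energy[0])
    cost = [list(energy[0])]
    for r in range(1, rows):
        prev = cost[-1]
        row = []
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            row.append(energy[r][c] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack from the cheapest cell in the bottom row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam

def remove_seam(grid, seam):
    """Drop one cell per row, at the column given by the seam."""
    return [row[:c] + row[c + 1:] for row, c in zip(grid, seam)]

def erase_object(image, energy, mask, reward=-1000):
    """Remove seams until no masked pixel is left.

    Masked pixels get a strongly negative energy so seams are drawn
    through the object. Assumes ordinary energies are small compared
    with abs(reward).
    """
    while any(any(row) for row in mask) and len(image[0]) > 1:
        biased = [[reward if m else e for e, m in zip(er, mr)]
                  for er, mr in zip(energy, mask)]
        seam = find_vertical_seam(biased)
        image = remove_seam(image, seam)
        energy = remove_seam(energy, seam)
        mask = remove_seam(mask, seam)
    return image

# Toy example: the "object" is the column of 9s.
image = [[1, 2, 9, 4] for _ in range(3)]
energy = [[1, 1, 1, 1] for _ in range(3)]
mask = [[False, False, True, False] for _ in range(3)]
result = erase_object(image, energy, mask)
print(result)
```

The image gets narrower by one column per removed seam, which is why the comment above pairs removal with addition.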


The website is hard to use repeatedly and the design is unpleasant to interact with.

I tried erasing people from a random Instagram account's images, and the algorithm does a blurry, but sufficient job of inserting an empty, grey square over all of the faces in the image. I then looked at some examples and realized that the algorithm was supposed to erase the chosen object seamlessly. I'm impressed with the dog example, but the dog example has the benefit of having a dog in the middle of a homogeneous texture.


Back buttons are so overrated.


It's not working for me. I get broken images no matter which combination I try.


Very interesting, I really like how the objects melt away. I only tried a couple of combinations though, because the website doesn't have routing set up properly, so when you press back you go back to HN.


I get `AccessDenied` on most example images.

With the images I can see, I like how they included images that work great (like erasing the dog) and lots of images that show the limitations of the algorithm (like the elephants).


The website is a usability disaster.


While ugly, I found the UX easy enough to follow on my desktop (I'd hate to see it on a phone / tablet), but then once I made my selections the final images were broken (404s).


Tapping images on mobile doesn’t do anything for me. Can’t even see the demo...


But it breaks all browser default behaviours.


If it is possible you should always test these projects on your own images, because you can't be sure if the example images were in the training data set.


Reminds me of the Black Mirror episode called Arkangel [1]. Based on the name it seems that the researchers know that series.

[1]: https://en.m.wikipedia.org/wiki/Arkangel_(Black_Mirror)


Black Mirror tends to reuse a lot of tech themes across many episodes. The same theme is in the White Christmas[0] episode.

[0] https://en.wikipedia.org/wiki/White_Christmas_(Black_Mirror)


The idea of the "mutability of the past" has been around since at least "1984".



I was thinking it resembled the Weeping Angels from Doctor Who. [1] Once they touch people they disappear forever, seemingly apropos.

1: https://en.m.wikipedia.org/wiki/Weeping_Angel


Nit: They are not making people disappear but they send them back in time. But I also had to immediately think of them when I saw the title.


For me, it properly identified where the object is in the photo but did a poor job erasing it.


I have tried a few images and the results were distorted even in the background, and the people were replaced by grey rectangles.



Sweet. I'll use it to erase the misplaced power lines from all my pictures taken in beautiful places.

/tech win


You've been able to do that for years using Photoshop's content-aware fill or Pixelmator's repair tool. GIMP and Krita ought to have a similar tool. IIUC the main difference in UX here is that you don't have to select around the object to be removed but merely point at it.

I recently used Pixelmator to remove empty bottles and cigarette butts on skateboarding photos to great effect.


Pretty cool.



