Hacker News
Facebook's '10 Year Challenge' Is Just a Harmless Meme–Right? (wired.com)
73 points by ColinWright 27 days ago | 71 comments

Facebook already has access to a larger repository of photos going back over a decade and all the years in between, along with decent face recognition, so it can create a much bigger dataset than it would get by resorting to a hashtag challenge. But I guess that wouldn't be newsworthy.

This was addressed in the article:

In various versions of the meme, people were instructed to post their first profile picture alongside their current profile picture, or a picture from 10 years ago alongside their current profile picture. So, yes: These profile pictures exist, they’ve got upload time stamps, many people have a lot of them, and for the most part they’re publicly accessible.

But let's play out this idea.

Imagine that you wanted to train a facial recognition algorithm on age-related characteristics and, more specifically, on age progression (e.g., how people are likely to look as they get older). Ideally, you'd want a broad and rigorous dataset with lots of people's pictures. It would help if you knew they were taken a fixed number of years apart—say, 10 years.


In other words, it would help if you had a clean, simple, helpfully labeled set of then-and-now photos.

It tries and fails to address it. I could barely count the number of zeros I would have to put in front of the 1 in the percentage of Facebook's photo data this meme covers. And a good portion of their dataset will have EXIF timestamps. Training an algorithm on the meme only would be an insane waste of their dataset.

> their dataset will have EXIF timestamps

It used to be the case (still is?) that the dates on uploaded photos weren't applied to the photo album. I remember having to go through holiday snaps and change the date on each from the upload date to the actual date. The images were also resized down from what was uploaded.

So, if they have the originals with the full EXIF data, I'd like to be able to use that for my old photos!

Chances are they have and you won't be able to use them.

Why would that be so? Because it profits Facebook, and profit is the only reason Facebook exists. It profits them to keep the originals with EXIF for data mining (you gave them permission to do so when you gave them the data), and it profits them not to make the originals available to you, which saves on bandwidth and processing costs.

Eh, I think not. There is a reason why they resize the photos: to save space. Even at Facebook scale, the amount of space they save by doing this must be enormous.

Additionally, images used in AI are usually scaled down a lot more, to 224x224 for something like ResNet-50, which means that they do not need your high-quality original and the smaller ones they generated are fine.
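
To put a rough number on the 224x224 point above, here is a minimal sketch of the arithmetic. The upload resolutions are hypothetical examples picked for illustration, not anything Facebook has published, and `pixel_reduction` is just a helper name made up for this comment.

```python
def pixel_reduction(src_w, src_h, dst_w=224, dst_h=224):
    """Return how many source pixels correspond to one pixel of the downscaled image."""
    return (src_w * src_h) / (dst_w * dst_h)


# Hypothetical upload sizes, chosen only for illustration.
examples = {
    "2 MP phone photo": (1600, 1200),
    "12 MP camera photo": (4000, 3000),
}

for name, (width, height) in examples.items():
    factor = pixel_reduction(width, height)
    print(f"{name}: about {factor:.0f}x fewer pixels at 224x224")
```

Even the small phone photo loses well over 95% of its pixels at that input size, which is consistent with the claim that a resized copy is enough for this kind of model.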

They don't need everyone to train their AI to detect aging. They don't even need a substantial sample size. All humans age in relatively similar ways.

Why would they even bother with this when they already have much better data, though?

perhaps they forgot to split a test set :)

While I'm sure they already have a great dataset going back a long way, most of it will cover the small fragment of the world that already had internet 10 years ago. In the meantime, a lot of people in other parts of the world have come online and may have old pictures of themselves.

Lots of people have said this, but Facebook has upload directly from phones to their server, right? So I would be shocked if they aren't trusting the time tags from direct uploads and metadata set by the camera (which they strip before displaying, but surely they ingest it) more than they trust upload times.

Also, people tag themselves all the time, and if you upload a photo with people, you can tell quickly that they already identify people with high accuracy, because they suggest tags for you.

Seems you're missing one thing here: rewind 10 years and the iPhone 3G was released at the end of that year, so not many people were using phones to take pictures, and even fewer were uploading them, since their data plans did not allow it even when coverage did.

You're really reaching now. People were regularly using phone cameras long before smartphones. By the time smartphones were a few generations in, it was ultra common, and a big benefit of them (that even non-technical people could understand) was that you could use wifi whenever possible and avoid data charges (which really weren't much higher then than now in most markets; "unlimited" plans were more common too). Also, the 3G has a 2 MP camera, so we're talking about pictures that are 2-3 MB at most. The suggestion that people were shy about uploading relatively small photos to Facebook circa 2008, and that this supports this flimsy story in any way, is sheer nonsense. I'm no fan of Facebook (I'm a long-term outright refusenik, actually), but the conspiracy theories are getting out of hand. There is zero substance to this article; it's wildly speculative clickbait.

In 2008 Facebook itself was in its infancy. Orkut was still the largest social network, only toppled around 2011. The majority of the world was definitely not uploading any phone pictures anywhere.

> In 2008 Facebook itself was in its infancy

That's quite an overstatement, to put it very mildly. Facebook was allowing open non-.edu signups by 2006, and the buzz around it from its school success was immense. By 2008 it certainly wasn't seeing a critical mass of boomers and other late(st) adopters, but it was still huge by any measure: 100 million users, and growing with unprecedented speed.

People were absolutely already uploading phone pictures to FB and other sites by then; I think there may even have been Facebook apps shipping on non-smartphones by that time, it was one of the earliest things carriers used to flog data plans. I agree that "the majority of the world" wasn't uploading phone pictures anywhere by then, but then I'd be surprised if that rather high bar has been reached today either.

The matter at hand is collecting user pictures for mass-scale machine learning. Neither I nor anyone I know joined FB until 2011-12. 100M is nothing compared to the current 2.2B user base that is posting annotated 10-year-old pictures of themselves. This is to counter the parent comment that “they already have this data anyway”, not an absolute statement on FB growth.

It seems, at least, Microsoft is way beyond needing a 10-year challenge training dataset.

Their "how old" facial analysis app has been around a few years and is remarkably accurate.


It is actually not. I've been labelled anything between 22 and 65. It is not only inaccurate but also imprecise (large variance).

As has been pointed out, people identifying themselves makes it easier to figure out which 10-year-old and current photos are actually:

* themselves

* 10 years old

And of course no one would lie about it!

The percentage of people lying about it, at least before the stage where people start piling on with jokes, is probably much lower than the general noise in the data set if you were to try to find the photos you want without help.

That too is data. Those outliers just get sent to the not-to-be-trusted watch list.

Joke's on you, Facebook is actually training their AI for deep fakes!

That is assuming it's Facebook training the algorithm

Just say what you're saying. It's Feds. And they're the biggest social influencers around.

Facebook denies involvement. But Facebook is no longer worthy of blind trust.

This is a circular argument.

Did you read the article? That's even addressed.

Access to what?

a larger repository of photos going back over a decade



and rebuttal.

I know that Facebook's actions mean that it no longer deserves the benefit of the doubt but this seems like a non story that someone really wants to be a story.

Thinking about it, let's say 10,000 people respond: is that even enough data to move the needle? Which photo do you use for the old vs. recent? There is a lot of cleanup that needs to be done manually for this to be a decent dataset. Basic common sense says this is a non-story.

I did my post undergrad research in 2000 in neural nets and the data sets were our biggest limiting factor, second was computation time. 10,000 data points was a huge set back then and still wasn't enough for most tasks.

10,000?

It's probably many, many times that, by about three orders of magnitude.

It might be a bit of a noisy dataset now that it's a meme.


You continue to post unsubstantive and/or uncivil comments after we've asked you many times to stop. So, we've banned the account. We're happy to unban accounts if you email us at hn@ycombinator.com and we believe this will change.

I don't think pile is the correct collective noun.

Why do they need to be Indian? Sure the data would reflect that a plurality of global low wage workers are from India, but there are also plenty of low wage workers who are not ethnically Indian or geographically based in India (such as all of the Americans working on r/beermoney).

Because it is assumed that the lowest white collar wages are in India.

By using Indian workers you are helping to provide a living wage for people who really need it, not just for “beer money”.

I 1million% agree with giving the money to the worker in need, and I myself would opt for the Indian as well if quality is the same.

But, my suspicion is that everyone on beermoney actually has no other income :(


Did you really just belittle my 60+ hour work weeks and 45 minute commutes as some frivolity?

Do I have a choice about whether or not I have to hold down a job? No. No I don’t.

Do I have a choice about the level of competition I must bring to the table to simply tread water? No. No I don’t.

If I had a choice, I’d never program another piece of code again, but hey. Yeah, beer money. Like I have a life whereupon I enjoy it enough to drink beer.

I did not interpret matte_black's comment as belittling at all, the post being responded to literally talked about people on the r/beermoney subreddit.

My guess is you can just use EXIF metadata and/or picture resolution/sharpness to easily sort old vs. new. Cameras have come a long way in ten years.

As has internet privacy. Any social site keeping EXIF intact will pay big in their bug bounty to hear about it, because that's a GDPR nightmare.

Facebook most certainly keeps EXIF intact. Just because they don't publish it doesn't mean they don't retain it as metadata.

I don't know what you think GDPR is, but it's not a magic wand that prevents companies from gathering data.

My response was in the context of a third party using the data
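
For what it's worth, the EXIF-based sorting suggested above could look something like the sketch below. It assumes you already have, for a single person, a list of photo IDs with their EXIF `DateTimeOriginal` strings (which use the `YYYY:MM:DD HH:MM:SS` layout). The function names, the photo IDs, and the 180-day tolerance are all made up for illustration; nothing here reflects how Facebook or anyone else actually does it.

```python
from datetime import datetime

# Layout of the EXIF DateTimeOriginal string, e.g. "2009:01:15 10:00:00".
EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"


def parse_exif_time(value):
    """Parse an EXIF-style timestamp string into a datetime."""
    return datetime.strptime(value, EXIF_FORMAT)


def pairs_about_ten_years_apart(photos, tolerance_days=180):
    """Return (older_id, newer_id) pairs taken roughly ten years apart.

    photos: list of (photo_id, exif_timestamp_string) for one person.
    tolerance_days: hypothetical slack around the ten-year gap.
    """
    target_days = 3652  # roughly ten years, allowing for leap days
    dated = sorted((parse_exif_time(ts), pid) for pid, ts in photos)
    pairs = []
    for i, (old_time, old_id) in enumerate(dated):
        for new_time, new_id in dated[i + 1:]:
            gap = (new_time - old_time).days
            if abs(gap - target_days) <= tolerance_days:
                pairs.append((old_id, new_id))
    return pairs


# Hypothetical photo IDs and timestamps.
sample = [
    ("a", "2009:01:15 10:00:00"),
    ("b", "2014:06:01 12:00:00"),
    ("c", "2019:01:20 09:30:00"),
]
print(pairs_about_ten_years_apart(sample))
```

This only works when the camera clock was set correctly and the metadata survived the upload, which is exactly the part people in this thread disagree about.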

This is just stupid. Even if Facebook wanted to do something like this, they have the data. Millions and millions of photos are already time stamped.

Not stupid, just good critical thinking. Most people are too naive when using Facebook, as the past shows.


In a very poor way which still leaves the question open.

"My flippant tweet began to pick up traction. My intent wasn't to claim that the meme is inherently dangerous."

He spun an offhand comment into full-on opinion manipulation, because we're now reading his article on the topic that not only has a clickbait headline, it seems to imply there's something more to the story, when in fact there isn't.

So now we've got all this digital ink spilled on the (entirely hypothetical) topic, and plenty of eyeballs buying it with their attention. But all of it is vapor, even at the admission of the authors.

That's the issue with the internet nowadays, to be honest. As soon as you get a bit of traction with a tweet or a blog post, people try to milk it to market themselves or serve other narcissistic interests.

It's not the internet. It's media in general. They need to create an artificial story to make a profit. Now that the internet is a profit-making center, the media and its tactics have found a welcome home on the internet.

Hence the "arctic blast" about to "ravage the east coast". Or as someone who has lived in the northeast, just winter. Or any other superlative clickbait. Everything is a crisis, everything is a disaster.

You are replying to an article written by someone named "Kate O'Neill", in which they include a photograph of themselves via Twitter, and you are referring to them as "he"?

Right, so there's a letter missing from a pronoun. Noted. Any thoughts on the commentary itself? No?

Facebook and Google's facial recognition software is so advanced that they have no real use for photos of people explicitly tagged 10 years apart.

Google Photos has been able to track my goddaughter from literally her first photo (when she looked like an alien) to now (5 years later), with about 2 photos per year.

The subtle point here is that people have become so suspicious of these platforms that everything they do is observed with a sharper eye on privacy, speaking of which... I wonder how Portal is doing?

A funny trend I've been seeing is posts from /r/conspiracy being mined by journalists for their stories. This exact idea was posted a few days ago to Reddit, and is not this tech writer's idea.

Reporters combing Reddit for story ideas has been going on for years and years. Heck, it might be worth an experiment to see how quickly someone can start from scratch and put together a portfolio of clips this way. The pitches practically come prewritten.

I don't believe that for a second, it totally sounds like a conspiracy.

If you were going for a joke, nice one!

If serious, I can confirm that I saw this on reddit a few days ago too.

I saw it as well. Honestly, I think journalists (the kind that are closer to glorified bloggers) have been doing this for quite a while though. I can't remember for sure what site it was, maybe Cracked, but I remember looking at the recent articles and it looked like a tl;dr of some of the top Reddit posts of the week.

I've definitely noticed that with Cracked.

This research has been in existence for over a decade[1]. Clearly this would be a valuable dataset, but as others have mentioned, Facebook probably already has the most valuable dataset. Realistically, the biggest hurdle in modelling aging is children. The bones/muscles/everything are so elastic that it is difficult to accurately predict how they will look. The primary use for this tech has been catching high-value individuals who have gone into hiding or finding children kidnapped into human trafficking (hence the focus on modeling child face growth).

This entire thing sounds more like someone made a joke about "Big Brother always watching" and people without a real understanding of what's possible freaked out when they realized it is.

[1] https://www.washingtonpost.com/national/health-science/can-y...

Imagine the irony if that wasn't the intent, but articles like this have now placed the idea in the heads of Facebook engineers.

This is idle speculation from somebody who has no idea what they're talking about. Because it was published on Wired it has gone viral.

It's just technical enough that most people who don't have a clue think that it might be right so they spread it.

Anyone who knows about data processing, programming, or AI knows that it's a very stupid idea due to easy-to-implement fault tolerance (such as random dropout) in machine learning models.

This seems more likely to be a marketing move to me than a covert request for AI training data. They’re struggling with engagement, so they seeded the 10 Year Challenge causing users to invoke the powerful emotion of nostalgia, made easy thanks to Facebook keeping all of your photos safe ;)

The number of non-stories about Facebook is getting boring.

Can you think of other challenges from the past that might be good for training neural networks?

Also, is it hard to figure out the origins of a meme? Lots of them are categorized and researched pretty well already.

This is saying a lot about what people think of Facebook these days. I don’t believe that FB wants to gather data here. This is probably an idea coming from their marketing department. But hey, why would you not think they’re evil after everything they’ve done?

People above 18 don’t have very different face biometrics between 18 and 118 years old.

Maybe it's not the data in the meme but studying how memes spread and how well you can predict it?

Exactly what I thought when I first saw this. Glad to see somebody wrote about it.

What I like about this tweet is that society starts changing and realizing what possible things could be done with information that is shared. Good to see more critical thinking evolve when it comes to social media.

How about when you have 1.3+ billion subjects?
