Tell HN: Every photo in Facebook is somewhat publicly accessible
43 points by skbohra123 2631 days ago | 88 comments
Facebook generates a static URL for each photo you upload. Regardless of your privacy settings, anyone can access that photo if they know its URL. I don't think this is ideal behaviour. I tried changing my privacy settings so that photos I am tagged in are visible to 'only me', but anyone who knows the URL can still see the photo. I think this is a big privacy leak, or am I missing something?

No, they aren't.

Every photo in Facebook is accessible if you know its secret photo ID, which is an unguessably large random number.
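For readers wondering what an "unguessably large random number" looks like in practice, here is a minimal sketch. It assumes IDs are drawn from the operating system's CSPRNG and uses the 136-bit width quoted elsewhere in this thread; the function name is hypothetical and this is not a description of Facebook's actual code.

```python
import secrets

ID_BYTES = 17  # 17 bytes = 136 bits, the width quoted in this thread


def new_photo_id() -> str:
    """Return a fresh hex-encoded ID drawn from the OS CSPRNG."""
    return secrets.token_hex(ID_BYTES)


if __name__ == "__main__":
    photo_id = new_photo_id()
    # Each hex character carries 4 bits.
    print(photo_id, len(photo_id) * 4, "bits")
```

The point of using `secrets` rather than `random` is that the former is intended for security-sensitive values, while the latter is a predictable simulation-grade generator.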

It's not completely optimal that Facebook embeds secrets in URLs this way (for example, if you browse directly to a photo via its fbcdn URL, you'll have planted the secret in your browser history). But it's a common industry practice, and since it's used on URLs that shouldn't normally end up in your browser history, it's hard to see the major problem with it.

So yes, you're missing a major point, and you haven't found a big privacy leak. Sorry.

Sounds like security by obscurity to me, which in my book is bad.

On their way through the web (unencrypted, mind you), the urls are visible to anyone. Any proxy server can start farming image urls. And what happens if someone reverse-engineers the number generation algorithm?

This is industry practice because it's cheap and the risk for exploitation is low. That doesn't mean it's secure or good. The OP is not missing a point.

If you can "reverse-engineer" (you mean "break") any common cryptographically secure random generator --- such as is provided by every mainstream operating system on the planet --- you can do far worse things than see Facebook photos. CSPRNGs are the font from which all real crypto keys spring. Viable attacks on RNGs are devastating to real cryptosystems. So relying on a CSPRNG isn't going out on a limb.

As for the rest of it: if you can capture the ID, you can capture the photo. See other comments on this thread for why that is and why it matters.

It's not really "security by obscurity", because the security mechanism is known to all. There's an authentication token that controls access to the image.

There are only two unusual factors. The token is contained in the URL instead of a cookie (which actually reduces the obscurity, but has no other effect). Also, the token is per-resource as opposed to per-user (which has both advantages and disadvantages).

It does have the important disadvantage that you can not revoke access to anything. If someone has seen it once, they can see it again.

On the other hand, who cares... it's also in their browser cache

As they say, you can't un-ring a bell. Once someone has seen a file, it could be in their browser cache, they could have saved it, and they might just remember its contents.

This no more prohibits revocation than right-click traps do. It was impossible to stop to begin with.

(Of course, I think you understand this, as you mentioned the browser cache. I'm mentioning this mostly for the benefit of later readers)

On their way through the web the images themselves are visible to "anyone" (any nodes or shared ethernets the packets go through), so if your photos are incriminating you should find a host you're willing to trust that will serve them https-only, or host them yourself.

Those of us having grown up bringing film to the drugstore aren't too concerned about URLs in our ISP's proxy server logs. Nobody cares about your facebook photos enough to risk their job at the ISP to steal them.

A quick google image search to the domain http://sphotos.ak.fbcdn.net/photos-ak-snc1/v2010 gives me a large number of images hosted there http://bit.ly/9Lp3r5 . This isn't the behaviour I would like.

This is presumably happening because people are deliberately taking images people gave them access to and embedding them on web pages Google is indexing.

The difference between this and "People of Facebook", the site where you can upload images your friends show you on Facebook, is that Facebook can detect, track, and disable these images, but can't do anything technical about images uploaded to "People of Facebook".

It is just silly to suggest that Facebook should take steps to prevent your friends from copying images. Nothing they do will work, and anything they do will create a false expectation of privacy on behalf of their users.

I found something interesting: there are four options in Facebook for sharing photos: 1. everyone (on Facebook, I assumed), 2. friends (on Facebook, of course), 3. friends of friends (again, on Facebook), 4. only me (:)). So I see no reason to give access to anyone who doesn't have a Facebook account, and thereby take control of the content away from the user.

You are stuck on this notion of publishing the link being "taking control of the content in ways the publisher didn't want" because you feel like you found this FBCDN link thing and it's captured your attention.

But in fact you lost control of the content the moment you published it to your friends (or whomever) on Facebook. We're just talking about 2 numbers, one 136 bits long and the other 73,720 bits long. I don't think the difference between these two numbers is worth arguing about.

Nice find. This is why it's 'not optimal' as Thomas puts it. Does anyone want to test what happens at different privacy settings? Sorry, I don't really use facebook.

Click on any of those images and the sidebar will tell you what web site Google found it on. This isn't a Facebook security hole. It's people publishing photos on the public web.

FWIW, the random number wasn't always large enough to be unguessable:


Which, ironically, makes me more confident in this implementation, because someone deliberately made them harder to guess; a lot of times, things look random and long but really aren't.

But couldn't a better checkpoint be designed with some overhead?

Say all urls containing /static/.. don't refer to an actual physical resource but go to a controller which checks the access and then serves the file. As such, the URL if given to someone else will fail as the controller won't authorize the file transfer.
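A framework-free sketch of that controller idea, under stated assumptions: the ACL table, the function name, and the `/protected/` internal location are all hypothetical. The one real detail is that nginx's hand-off header is `X-Accel-Redirect` (Apache's mod_xsendfile is the one that uses `X-Sendfile`).

```python
from typing import Dict, Set, Tuple

# Hypothetical in-memory ACL: photo ID -> user IDs allowed to view it.
# A real application would look this up in its database.
ACL: Dict[str, Set[str]] = {"photo123": {"alice", "bob"}}


def serve_static(user_id: str, photo_id: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a request to /static/<photo_id>.jpg."""
    allowed = ACL.get(photo_id, set())
    if user_id not in allowed:
        # Unknown photo or unauthorized user: refuse without touching the file.
        return 403, {}
    # Authorized: ask the front-end server to stream the file from an
    # internal-only location, so the application never reads the bytes itself.
    return 200, {"X-Accel-Redirect": f"/protected/{photo_id}.jpg"}
```

With this shape a copied URL is useless to anyone who is not in the ACL, at the cost of one application request per image.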

I did it once with nginx at the front and django at the backend using X-Sendfile. I don't have the code around but it was similar to what is proposed in this discussion: http://groups.google.com/group/django-users/browse_thread/th...

Of course, this can't be done for the CDNs.

When you say it's a common industry practice, is the practice there because of CDNs?

I understand this isn't a big privacy leak. It looked awkward to me at first glance. I am trying to find out what the general industry practice is here and what the ideal solution could be, as we are trying to build some image hosting for our product. Your answer is very insightful indeed.

I typically appreciate tptacek's security remarks, and agree that once you trust someone to view a photo, you're trusting them to not repost it.

However, I disagree that "everyone is doing it this way" is a good reason to do it this way.

Flickr, as I noted separately in this thread, doesn't do it this way. Changing privacy settings changes the URL, so someone using the URL (a common use case for images) can't use it any more. That's the "user expected" behavior. Facebook's behavior is a generally unwelcome surprise.

Additionally, since you're considering building image hosting: "token protected" URLs are well understood and often applied as a best effort solution to this problem. A token protected URL typically has an expiry, and in typical use is only valid briefly when issued for the visitor requesting the (authenticated) page containing embedded photos.

See: http://aws.typepad.com/aws/2009/11/new-amazon-cloudfront-fea...

Of course, the CDN edge cache servers have to cache the object yet respect the tokens, which requires a bit more intelligence from the cache, which is why "general industry practice" is to take the easier shortcut of not protecting the URL and image object at all.

Disclaimer: We offer token protected video and image CDN delivery, calling it "deep link protection". Our DLP, and token schemes in general, protect the link from misuse, not the asset.

Do people generally serve content directly out of S3 to the public? That costs money. We don't; we use S3 as a cache, and to serve files to customers.

Point being, S3 solves a different problem than Facebook does.

That link isn't about S3. That link is about CloudFront, the CDN edge caching layer above S3. S3 storage has always offered token protected links. CloudFront CDN introduced protected links only recently.

Customers and clients of CloudFront CDN (which uses S3 storage as its origin) wanted protected URLs, and AWS went to the time and expense to provide them. Content owners large enough to want or need CloudFront edge caching believe there are legitimate business cases for single use, expiring, IP restricted, or other classes of protected URLs for content.

As for Facebook:

Facebook operates web servers generating authenticated and authorized web pages. These pages are dynamic, generated per user, based on current privacy settings. These privacy-managed pages contain links to assets considered, by users, to be just as private as the page.

When the user changes privacy settings for the page, the linked assets' privacy could easily be kept in line, as demonstrated by CloudFront CDN being able to support private content links.

Facebook's fault is that the privacy managed page links to public (non-privacy managed) assets, using links that do not respect the containing page's privacy settings.

To say the image shouldn't have privacy settings is to say the page shouldn't have privacy settings "because anyone could save it and repost it". (Which people do, via screenshots.) That's expected and accepted.

But once they change their privacy settings, users believe access permission changes. Access permission does change for the container page, but not the linked assets. That's a broken model.

Now that you know the answer, maybe you should edit your HN post so that it isn't broadcasting "Facebook is insecure" on the HN RSS feed.

done. :)

Facebook allows very sophisticated privacy settings, including allowing access to photos for a specific subset of users only.

I wonder what those privacy settings mean then, if the authorization checks are not happening when the photo is accessed?

Presumably it means that anyone you show a photo to intentionally can in turn show it to people you don't intend to see the photo. Which, of course, must be true, no matter what Facebook does to protect photos.

Yes, but in addition they also assume that:

- The only way you know the photo ID is by having access to the photo.

- If you had access to the photo at some point, you have access to it forever (even if it is revoked later on).

It may be industry wide practice as you have noted but the bottom line is that the privacy settings are not explicitly checked on every photo access. Makes you wonder where else they are using similar logic.

Not to lighten the issue, I think this is a security flaw, but the same is true if they simply save it to their computer. You have to trust your audience to begin with or you're hosed. There's no real way of stopping them from copying or disseminating content.

The authorization happens at the time you request the photo url, not the photo itself. So the security is on finding out what the url is, not the actual photo request. It'd be the same as someone being able to login if they know your password. That URL is the password.

A secret ID that every user you have shared the photo with knows, and hence can publish? Uh.

Can someone explain to me what on Earth is going on in this comment thread? This response doesn't make any sense. Obviously, if you share your photo with someone, they can publish your photo! Welcome to the world of digital media! Why is this modded up, and the grandparent modded down? (Edit: It's more sane now, when I posted this the above was at 7 and tptacek was at -1.)

Someone needs to invent something to fix this problem. Some type of "rights management" tool.

Rather than the karma, I am more interested in people believing that this is actually an issue. Come on. I understand the complications in making things the way they should be, but the right behaviour is the right behaviour.

The photograph itself is just a larger and less random number that can also be published.

Your "friend" can download the photo and publish it somewhere else, open for everyone.

Why, if you can simply hotlink to the Facebook CDN?

Leaking photos by publishing fbcdn links is worse for attackers on every axis than simply stealing and reposting the photo:

* The leaker and the viewers are more traceable, since they're hitting Facebook's servers

* Facebook can cut off access to the photos by reassigning the IDs

* To get the actual link, you have to dig into the Facebook page source; to get the photo, you just have to right-click on it.

This is a stupid, silly threat to worry about. Unless you find a way to predict fbcdn URLs, there's nothing overtly wrong with what Facebook is doing. Plenty of sites rely on the same technique to protect significantly more sensitive information.

You're talking about a leak while the photo is actively posted. I think that's obvious to users. It's less obvious that users can change privacy settings or delete the photo, and yet it's still accessible.

The interface says access is now (going forward) changed, but access doesn't change.

What's overtly wrong with what Facebook is doing shows up in practice in the news every time someone becomes an unexpected celebrity. The person promptly and maybe even preemptively changes their privacy settings, but their images remain available.


Even if it's technically reasonable, it's not user expected behavior.

I don't understand what you're trying to say. It sounds like you're saying, "sure, there's a totally obvious and simple way that people on Facebook can take and republish your pictures that Facebook can't do anything about, but did you also know that there's also this really convoluted way they could also do that, and Facebook could fix that?"

If you publish images to the public on Facebook, all bets are simply off. It is a bad idea for Facebook to give people the mistaken idea that any settings change on Facebook could ever take back anything posted to "Everyone" on the site.

Not talking about "Everyone". Talking about, for example, your best friend. Then you turn out to be a Russian spy named Anna. Suddenly your friend, who didn't care to download your photos before, now does want to download them.

More to the point, talking about changing privacy settings, and having the change work, where work is not defined as "defeat tptacek" but "defeat casual users".

The logical conclusion to "It is a bad idea for Facebook to give people the mistaken idea that any settings change on Facebook could ever take back anything" is for Facebook to remove any ability to set privacy more restrictive, ever. I doubt that would be popular.

While "you can't take it back" may be technically correct, most photos aren't being downloaded to repost, merely viewed inline online. There's no reason a user changing privacy to be more restrictive shouldn't expect that change to apply going forward.

  > To get the actual link, you have to dig into the Facebook page source
In some browsers, you can just right-click on the image and select properties to view its URL.

Flickr changes a photo's URLs any time you adjust the privacy settings.

Downside: toggling privacy breaks links to the photo.

Upside: toggling privacy breaks links to the photo.
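A minimal sketch of that trade-off, assuming the URL embeds a per-photo secret that is regenerated whenever the privacy setting changes. The class, field names, and the `cdn.example.invalid` host are hypothetical; this is not Flickr's actual implementation.

```python
import secrets
from dataclasses import dataclass, field


def _new_secret() -> str:
    # 17 random bytes (136 bits), hex-encoded.
    return secrets.token_hex(17)


@dataclass
class Photo:
    photo_id: str
    privacy: str = "friends"
    secret: str = field(default_factory=_new_secret)

    def url(self) -> str:
        """Static URL that is only as secret as the embedded token."""
        return f"https://cdn.example.invalid/{self.photo_id}_{self.secret}.jpg"

    def set_privacy(self, privacy: str) -> None:
        """Change the setting and rotate the secret, breaking old links."""
        if privacy != self.privacy:
            self.privacy = privacy
            self.secret = _new_secret()
```

Anyone holding the old URL gets a miss after the change, which is both the downside and the upside described above. Copies already saved elsewhere are of course unaffected.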

I have found another security hole: When viewing a photo, you can take a screenshot of it and upload it to anywhere else on the internet, where any arbitrary person can see it.

This violates all of my expectations of privacy.

The same thing happens in Picasa Web Albums. Photos in private albums are still accessible if the URL to the jpg image is known. The situation in Picasa is quite particular, since you get to choose between three visibility options when an album is created:

- Public

- Anyone with the link

- Private

The "Anyone with the link" option means that Picasa generates a "difficult to guess" URL for the whole album. That's fine and well documented, but the "Private" option might imply that anyone with the link won't be able to access individual photos, which is not the case. I'm afraid users might feel that their photos are hidden as much as emails are in GMail, for example.

Another fun fact:

Even if you delete a photo from an album, Facebook doesn't get rid of it immediately. I know this because I was able to retrieve a photo that I'd seen in someone's album, a day after it had been removed from that album, by retrieving the image url in my browser history.

I have a house in the woods and never lock the door because nobody knows where it is anyway! What's the problem?

This is definitely not an ideal setup, but it's hard to improve. The only solution I can think of is having an application server serve the pictures after authenticating the request. The picture servers would need to be connected to some sort of database to do that. Serving a single image would incur multiple times the current cost.

> The only solution I can think of is having an application server serve the pictures after authenticating the request.

Or, make the URL fbcdn.com/p/timestamp/hash of internal salt + timestamp/unique ID.jpg. If the timestamp isn't from within the past 2 minutes, don't serve the image. Almost every static server offers this as a plugin, and it doesn't need to touch a database. AWS offers something similar as a time-limited "access key".
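A sketch of that URL scheme, with assumptions labeled: the secret value, the function names, and the exact path layout are hypothetical, and HMAC-SHA256 is swapped in for the plain "hash of salt + timestamp" because it is the standard way to key a hash. The signature here also covers the photo ID, so a valid token for one photo cannot be reused for another.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"internal-salt-change-me"  # hypothetical server-side secret
MAX_AGE = 120  # seconds; the "past 2 minutes" window from the comment


def sign(photo_id: str, ts: int) -> str:
    """HMAC over timestamp and photo ID, hex-encoded."""
    msg = f"{ts}/{photo_id}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def make_path(photo_id: str, now: Optional[int] = None) -> str:
    """Build /p/<timestamp>/<signature>/<photo_id>.jpg for the current time."""
    ts = int(time.time()) if now is None else now
    return f"/p/{ts}/{sign(photo_id, ts)}/{photo_id}.jpg"


def is_valid(path: str, now: Optional[int] = None) -> bool:
    """Check freshness and signature without touching any database."""
    now = int(time.time()) if now is None else now
    try:
        _, prefix, ts_str, sig, filename = path.split("/")
        ts = int(ts_str)
    except ValueError:
        return False  # wrong number of segments or non-numeric timestamp
    if prefix != "p" or not filename.endswith(".jpg"):
        return False
    photo_id = filename[: -len(".jpg")]
    fresh = 0 <= now - ts <= MAX_AGE
    expected = sign(photo_id, ts)
    # Constant-time comparison on bytes avoids timing leaks.
    return fresh and hmac.compare_digest(sig.encode(), expected.encode())
```

The page renderer would call `make_path` for each image it embeds for an authorized viewer, and the static server would call `is_valid` before serving. A leaked link then stops working after two minutes, though the image bytes themselves can still be saved.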

Nice idea.

Right Click, Save Image As...

Post to imgur. How would this change be any different? And a better question is, why bother?

Not sure why you are hating on ideas to mitigate this from occurring.

But people can see your house in the woods. I think it's more like: see that forest out there? In one of those trees, I drilled a hole in the trunk and stuck a rolled-up $100 bill in it. Even though I told you it's out there, good luck finding it, unless I tell you which tree.

Unless I rent a bloodhound for an afternoon for $50...

All right, I'll give you that.

But what if your "reward" was not $100 but a photo of an unknown subject. There's a small chance that photo is something you find valuable (newsworthy, embarrassing to the subject, etc.), but there's a far better chance it's a scene from a toddler's birthday party taken on a bad cell phone camera. That's closer to what Facebook is doing. I wouldn't rent a bloodhound for that.

If you want to see photos of someone with a private account, just find the name of someone on their friends list who has an open Facebook account. Look through their wall (or whatever) and if you see them commenting on the person's photo, just click that. You will see not only that photo but all the photos in the album it is housed in.

Much faster than trying to figure out URLs.

YMMV over this; sometimes it works sometimes it doesn't.

If the album owner is really locked down you can only see the photo commented on (I think it depends on whether you allow sharing with "friends of friends" or just "friends").

Random thought: you could make use of this as a "free" image CDN (probably against the TOS but useful nonetheless :))

I imagine Facebook photos get so much traffic they wouldn't even notice.

Another nice idea.

A related issue here is what happens when a user deletes a photo -- there is a lag time between the delete action and the CDN being updated. Deleted photos are still available via the direct image URL while the CDN is out of sync. As of July 2009 [1], the lag time was at least 2 months. So if the URL has been shared, access to the image may not be easily revoked.

[1] http://arstechnica.com/web/news/2009/07/are-those-photos-rea...

Many people are aware of this. It's been convenient so many times when needing to show a picture of some cute girl to my friends!

I think it's pretty safe as long as no one without the permissions can find (or guess/extrapolate) that URL. The images are probably just hosted by a CDN and serving up the files with authentication might slow it down or complicate the setup.

So, this is the ideal way to host static files? I think this is something like pseudo-security, isn't it? What we would call the 'ostrich' approach to security. There must be some better way to do this.

Requiring attackers to guess a 128 bit random number (actually, fbcdn seems to use 136 bit random numbers) isn't "pseudo" security.

The problem is that the random number is static. As Aegan says, anyone allowed to see a photo can leak its URL. It would be better if the URL would depend on the user to whom the picture is served. But that'd be expensive.

Anyone allowed to see a photo can leak the photo. Why waste time protecting the ID?

Sure, in practice there's not much of a difference, I'm nitpicking, but it's not exactly the same thing.

People might leak the URL unintentionally, for instance by sending the link to others, not knowing that the photo owner doesn't want them to see the photo.

One can also use URLs in places where images are not allowed, for example when submitting them to community websites, without first uploading the photo somewhere and leaving a trail.

Thomas, I agree with you on all points you've posted _except_ the difficulty of attacking the system. It cannot be a truly random number of "X" bits because it is used as a permanent unique identifier. If it was actually a random number, then you have the potential of overwriting files and/or reusing URLs/UIDs.

For every UID discovered, the entropy shrinks and the system becomes easier to attack. Of course, I'm not saying the shrinking entropy would make it feasible to attack the system, just easier. ;)
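To put rough numbers on both worries (accidental collisions among random IDs, and the ID space "filling up"), here is a back-of-the-envelope calculation. The photo count is a purely hypothetical figure chosen for illustration; only the 136-bit width comes from this thread.

```python
from fractions import Fraction

BITS = 136            # ID width quoted in this thread
N_PHOTOS = 10**11     # hypothetical number of stored photos

space = 2**BITS

# Union bound on the probability that any two of the N random IDs collide:
# P(collision) <= C(N, 2) / 2^BITS.
collision_bound = Fraction(N_PHOTOS * (N_PHOTOS - 1), 2 * space)

# Chance that a single blind guess lands on some existing photo.
hit_per_guess = Fraction(N_PHOTOS, space)

if __name__ == "__main__":
    print(f"collision bound: {float(collision_bound):.3e}")
    print(f"hit per guess:   {float(hit_per_guess):.3e}")
```

Under these assumptions both quantities are vanishingly small, which supports the caveat above that a shrinking search space makes the attack easier in principle without making it feasible.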

This is a great noodley point, but note that they can just increase the width of the identifier to maintain a constant threshold level of entropy. They won't, of course. But they could.

We can rest assured that facebook simply doesn't care about this non-issue of image access, so increasing the bit width is not worth their time, but assuming someone actually had this need, doing it right is a lot more complicated than it seems at first glance. --Again, this is admittedly another annoying "noodley point" but it could be important for someone.

ASSUMING: The systems are most likely run in virtual machines, and the UID's are most likely the keys for a key-value store. Whether or not the UID are specific to just images, or are also used for other things (wall posts or whatever) is unknown.

The first question is whether or not your key-value store can handle the increase in key length (UID). Since this is fixable, the safe assumption is "yes" albeit there might be undesirable (or even unsurmountable) performance penalties.

A good PRNG is fed by a system entropy pool, and the pool is populated by various system entropy sources. Some believe virtualization can make the system entropy pool more predictable, but some believe virtualization can make the system entropy pool less predictable. It doesn't really matter which is true. The important point is if you can exhaust the system entropy pool, one of two things will happen; (1) the system will get bogged down waiting on the pool, or (2) a poorly written PRNG will start giving you less random numbers.

Of course, "less random" is the most accurate way to phrase it, but in some cases, the actual result is the (faulty) PRNG gives you "predictable" numbers if the system entropy pool is exhausted.

To do it right, you'd have to evaluate the frequency/pressure on the system entropy pool before increasing the key/UID width or you might suffer some highly undesirable effects. There are essentially two places where you might need to add hardware; (1) performance penalties of increased key width in the key-value store, and (2) performance penalties from exhaustion of the system entropy pool (with a well written PRNG). With the former you'd be paying for more servers, and with the latter you'd be paying for more entropy sources to feed the pools. I'm not sure how one would add specialized entropy source hardware in a virtualized environment, but with enough cost and effort, it might be doable (device mapping?). Needless to say, in either case or both, the costs could be prohibitive.

The admittedly "noodley" point here is correctly maintaining a constant threshold level of entropy is occasionally easier said than done. Of course, in other cases, it might be brain dead simple to maintain a constant threshold. The only way to figure it out is proper testing.

The non-power-of-two 136 bit length you mentioned seems interesting. I might be (incorrectly) reading more into it than is actually there, but I believe it hints at a far more cost effective solution, namely, just add more k-v stores. For example, the key/UID could only be 128 bits wide, and the extra eight bits indicate a particular k-v store. With the k-v stores being located "close" to the user, this could also improve bandwidth costs, usability/latency, and load balancing, in addition to the benefits of the CDN.

I don't have a facebook account, so I can't check, but if there's some degree of predictability in a single byte (leading, trailing, middle, ?) of the UID, it would be further support of the above theory of operation. It doesn't need to be perfectly consistent, since at each location there are probably multiple k-v stores.

Even on a collision of a generated key/uid in a particular k-v store, the UID might not be thrown out unless it was already used in _all_ of the local k-v stores. This would further reduce the pressure on the system entropy pool, but due to the need for replication/redundancy/backup, I seriously doubt this is the case.

Though going from 128 bit to 136 bit (or similar) _is_ an increase of the width of the identifier, it probably has nothing to do with maintaining a constant threshold of entropy. It's probably just a smart business decision to reduce hardware/bandwidth costs.

PLEASE NOTE: The above is only mindless speculation about how the 136 bit UIDs/keys might work, and (hopefully) no one at facebook will confirm or deny it.

Are passwords pseudo security too? They are easier to guess than a random url. That one is entered in a form field and the other in the url bar doesn't really matter does it?

It sort of does, since your browser knows that it needs to be careful with things you plug into password fields, and doesn't know that it needs to be careful with random URLs.

But that's less of an issue with a hidden IMG tag.

People are unlikely to share passwords - they are highly likely to share links via email or IM.

I am surprised this is standard behavior. So if I allow user X to see my photos it means that user has a right to publish all my photo urls. It doesn't look good to me.

How is that any different from being able to save the photo as a file and upload it to a free image hosting site? From a security point of view?

Since it can be easily circumvented anyway, disallowing sharing static photo URLs would be the real "pseudo security", in my opinion.

It's different because in this case if someone dumps all your photo links Facebook can in theory just assign them new random secret IDs, whereas if someone uploads your photos to a hosting site there's nothing they can do about it. In other words, from a security perspective, it is in no way worse.

Clearly this is something that makes this issue even more dangerous.

It's actually worse than that. If you know the URL of the photo you can regenerate the url of the page it is on (and possibly see the gallery it is in) and the url of the profile it came from.

Also worth mentioning that the image name / URL isn't entirely random: it contains the profile ID of the owner. This means the reverse situation to the one mentioned in the OP is possible, where a linked photo can be traced back to the person's profile.

This is pretty well known but I haven't seen it mentioned in this thread yet.

I've had a few instances of a mate sending me a link to photos of people I'm not friends with. So I agree that this is a concern.

They could just copy the photo and email that you. No amount of privacy code on FB's servers are going to be able to prevent that.

Well, then they should only display it through a Flash app.


Once it's been transmitted the end user has full control. (Which is why DRM constantly fails)

Taking a screenshot, then cropping it, is a fundamentally different process.

As one of the other commentors mentioned, the static url was a good way for him to share with his other mates, via chat, pictures of pretty girls.

Taking a screenshot, cropping it, saving it locally, then uploading it to some host, and then finally emailing or IMing the link. That is not a really easy process to share with his mates.

For clarification, my previous comment was directed at statictype's reply to djhworld:

"No amount of privacy code on FB's servers are going to be able to prevent that."

statictype referring to "that" as "They could just copy the photo and email that you."

By displaying the image in Flash, it eliminates the possibility of simply sending a URL, and instead, requires the process mentioned above.

No more sending your mate URLs of pretty girls.

I think photos are accessible in http://youropenbook.org/

This is for convenience, so people can easily share private photos. It is a usability trade-off.

hm, not quite.

It's done this way because running code to authenticate+authorize the user before serving up an image incurs a much higher performance hit than just shoveling the bits directly off a disk. Especially when those bits can come from a content delivery network.

Facebook handles a ridiculously large number of photos - probably an order of magnitude more than any other site - so this trade off, I would guess, is pretty crucial to keeping the site up.

Equivalent headline:

"Tell HN: Every encrypted message is somewhat publicly accessible*

*if you guess the key/password"

The difference is that keys and passwords are managed in a more careful way (think encrypted key stores), whereas URLs are not handled with the same care.

Apart from that, I agree with you.

Every photo is publicly accessible.

It's just not easily discoverable.

I agree, though, that this is mostly a non issue.

When I noticed it, I was freaked out for a while. I thought the same would be true of Facebook videos, but it turned out it said, "This video either had been removed from Facebook or is not visible due to privacy settings." I guess they should do the same with photos too.
