

Posted Items on Facebook - sangguine

Hi. I have a technical question about how Facebook does something using PHP. On Facebook, if you enter a link on Posted Items, Facebook automatically grabs the title, the thumbnail and the first few sentences. Do you know what they use? cURL maybe? If they do use cURL, do you think they would store this info in a database or pull this info every time?
======
nertzy
Here's how to find out:

Set up a page on a server you control that has a URL you know that Facebook
has never seen.

Start making a posted item on Facebook with that URL.

Check your logs and see what User-Agent Facebook is using, and try to get your
best guess as to what is going on.

Beyond the User-Agent (which might just be something like "Facebook") you
could always dive deeper by investigating the actual packets sent and
comparing them to those generated by something like wget or cURL.
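
As a rough illustration of the steps above, here is a minimal sketch of a throwaway test server that records the User-Agent of whoever fetches the page. It uses only Python's standard library; the names `seen_agents`, `LoggingHandler` and `make_server`, the port, and the page body are all made up for the example, and nothing here reflects what Facebook actually sends.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_agents = []  # User-Agent strings observed so far (hypothetical helper list)


class LoggingHandler(BaseHTTPRequestHandler):
    """Serves one tiny HTML page and records each request's User-Agent."""

    def do_GET(self):
        agent = self.headers.get("User-Agent", "(none)")
        seen_agents.append(agent)
        print(f"{self.client_address[0]} {self.path} User-Agent: {agent}")

        # Placeholder page with a title and a first paragraph to be scraped.
        body = (b"<html><head><title>Test page</title></head>"
                b"<body><p>Hello.</p></body></html>")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default access log; do_GET prints its own line


def make_server(host="", port=8000):
    # host="" listens on all interfaces; the port is an arbitrary choice.
    return HTTPServer((host, port), LoggingHandler)
```

Calling `make_server().serve_forever()` on a publicly reachable machine and then pasting its URL into a posted item should print one line per fetch, which is enough to see the User-Agent before digging into packet captures.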

------
wave
The easiest way will be to use Alexa Site Thumbnail from AWS
<http://tinyurl.com/57zbp2> , which charges $0.0002/thumbnail.

------
bkrausz
Title and first sentence are probably grabbed directly (probably with cURL);
it's trivial to snag a <title> tag and the first <p> tag. They definitely cache
them somewhere, since pulling them every time would be unreliable and would do
terrible things to load time.

~~~
swombat
Facebook cache absolutely everything they can.

------
ComputerGuru
cURL can't grab the thumbnails - you need to run the HTML through a rendering
engine to do that.

