I Tried to Virtually Stalk Mark Zuckerberg (alexkras.com)
349 points by akras14 on April 4, 2016 | 89 comments

If I had the resources zuckerberg has, and the access to information and data that he does, my posts would be optimized to show asynchronously by region, and to show first in the region most likely to respond positively to my post. That way it sets a tone of positivity for the rest of the commenters. At least in theory.

forget regions. they can show it to positively-disposed individuals.

That's what the newsfeed algorithm is doing with all the Facebook posts.

No, it is optimizing for engagement (to be like a drug), not for positiveness.

You, I like you. They do social experiments anyways, so you never know, maybe they do some targeted notifications based on some set of metadata.

Clever. I suspect that would eventually be noticed by a researcher, but it may take a while.

There might even be something to that. I.e., if I were Mark, I would not release in India first, with all the anti-"Free Basics" sentiment.

Nice write up! I would assume in a large distributed back-end like facebook that notifications are not sent immediately and definitely not sent to everyone at the same time. This probably means that it is impossible to rely on them to get a first post comment.

Yes, I am sure you are right.

It did seem like some people were able to compose well thought out responses in a very short time. Which most likely was due to them getting a head start thanks to an advanced notification.

Or they have a better automated system...

I highly doubt it. At least the meaningful ones that I saw. If they have automated systems that good, they probably would be very rich by now.

> most likely was due to them getting a head start

This can be tested with proxies and multiple accounts.

While the analysis was interesting, I can't help but wonder why so many people write pointless comments to what is probably some marketing person impersonating Mark.

Because said comments have a reach of millions, and said comment has a picture of yourself and a link to your personal profile.

You can get a lot of subscribers that way.

It's an ego boost. Nothing more. You associate yourself with a famous person. It feels good. Even if it is totally pointless. Maybe a friend notices. And sometimes your comment gets many likes, but then you're really lucky.

So it's entirely pointless then. Good to know.

There's a pretty big difference between being dismissive and being right.

Fair comment. My comment was snarky, I apologise for that.

I think a lot of them share same hopes/sentiment that I did. A lot of them write as if they think there is high chance that Mark will read their comments. Even though I am pretty certain at this point that he won't.

Zuckerberg has replied to random people before so he probably does read some of them.

How do you know it wasn't the same marketing person mentioned in the GP writing those responses?

same reason you're replying to this post

I would probably go with an auto-reply as soon as he posts something, receive the text message, and then edit the reply to something meaningful. Fun read! :)

Ah, great idea. I did not think of that :)

well, are you going to try it?

Probably not, I am pretty convinced at this point that Mark ignores those comments anyway.

Well, I know this wasn't your intention, but you could always "sell" that top spot and edit in the content from a top marketer, keeping it subtle enough that it looks legit.

Also, Mark is just one person. Imagine if you opened this script up for use targeting the fans of any celebrity / influencer. Companies would be willing to pay a LOT of money to be able to reach those audiences.

Can't believe I'm advocating comment spam, but sometimes the opportunities jump out at me. :)

It's a valid point, but I personally wouldn't want to build a business built on spamming.

The idea was just great akras14...sadly you're not interested in effecting that again. Would've loved the outcome.

eric001... Perfect, just perfect logic

This is why I love this industry. He started with a somewhat "bad" idea- even an automated reply would have a really hard chance being the first comment. But, he was able to experiment and tweak the idea to do something pretty cool with his work while learning. Well done!

Thank you :)

There is an amazing amount of Interesting™ stuff you can do with social media. I've been actively looking for and following people in the "social media influencer" game to see just how they pull it off (things like getting apps to the top of the app store, building gigantic and legit Twitter and Instagram accounts, etc.)

On this particular story, doesn't Facebook let you create notifications in any way? I get instant notifications from certain best friends for some reason. Maybe you could create a burner FB account and make Zuckerberg the only friend. I'd also consider trying this on someone like Robert Scoble or a tech journalist - someone who has a gigantic following, but relatively low comment velocity.

This article is interesting (there's a discussion on HN somewhere): http://www.bloomberg.com/features/2016-how-to-hack-an-electi...

Anecdotally, through ~2009/11 you could use keyword density as a significant ranking variable on the Google Play store. I was amazed when I discovered this, as web SEO had long since left it behind. I had an app coming up 3rd in searches for things like FB and YT.

Great point. I was actually thinking to say something along those lines in the conclusion.

While I would say my approach was mostly useless with a celebrity of Mark's scale. It could work with less popular accounts that do not get as much attention.

Love seeing someone using Selenium. I had no idea you could use anything other than the Chrome driver. What's a headless browser?! I automated whole sections of my old sales rep job with Selenium; it was so awesome.

Basically just a browser without a GUI. You can run a normal browser like Chrome or Firefox with a dummy display server or a real headless browser like PhantomJS or SlimerJS.

Care to share further what you automated?

I was working a sales development type role, which involved some pretty fierce competition to "claim" inbound leads (contact form submissions basically) before anyone else did. Also there were a series of clicks and repetitive tasks you had to perform on each of these leads after you claimed it.

So I automated getting a list of leads, opening them in multiple tabs and sequentially performing all the repetitive tasks so when I eye balled them, they were at the right point in the process. I reckon I save myself about 1.5 hours per day doing this.

I'm really curious why browsers can't handle 18mb of comment data.

I've observed the problem myself when trying to load up old fb conversations to search for some detail I knew was there.

What's 18mb when my machine has 12gb ram?

I doubt it's the data itself that your browser is struggling with. It's more likely the parsing of the comment content, generation of 20,000 templates, keeping track of probably a quarter million DOM nodes, listening to a hundred thousand events, and making sure all of that is laid out and aligned sensibly in real time that your computer struggles with.

Here's hoping it's only one event as opposed to a hundred thousand, at the very least.

In jQuery, for instance, you'd do something like

$(parentOfComments).on('eventName', '.commentClass', function() { /* do stuff */ });

The event propagates up to the body or whatever parent element you use.

Good point, although 18 MB of text is a lot of text.

If I might digress a bit… most text editors will choke on an 18 MB text file as well (and if it’s an 18 MB text file with just one line (like a base64 string with no line breaks)… good luck getting anything other than a hex editor to open that with any speed). The only text editor I’ve found that doesn’t bog down too much when opening huge text files (even some of the ones with no line breaks!) is the one that comes with OS X: TextEdit¹.


¹ — https://en.wikipedia.org/wiki/TextEdit

I open pretty decent size sql files (200m-1g) all the time in vi without much issue.

> I Tried to Virtually Stalk Mark Zuckerberg

Isn't this just called "following"?

At first I also thought that stalking would be summarising several sources to find his physical location or something. So like you I was also initially a bit disappointed by the non-malice in his approach.

But given his goal of "becoming a friend" by automatically following Mark, I ended up concluding that stalking wasn't hyperbole.

I can't tell if saying 'Mark Zuckerberg – the Bill Gates of our time' was a joke or not. I mean, isn't Gates very much still going strong? I get the analogy I suppose, but doesn't that imply Gates is all washed up?

We're just getting old my friend. There are kids programming right now that were born after 9/11.

Bill's big asset growth period was the mid 1980s to late 1990s, when MSFT stock doubled every other year. It's only doubled once in the entire 21st century.

Zucky is in his high personal asset growth stage now.

I thought it seemed odd that Python supported the AND operator on lists, but it doesn't.

You do need to convert to a set first, e.g.

  [1,2,3] & [1,2]
gives a TypeError, while

  set([1,2,3]) & set([1,2])
gives set([1,2]) as expected.

You can also write sets in a more compact notation in python as:

    {1,2,3} & {1,2}
Which gives {1,2} as a result.

Thank you for catching this! I did in fact call set on the comments outside of that function. I've updated the blog post.

This is great! You should also post it on /r/Dataisbeautiful to get more insight. Kudos.

Already did, thank you though :)

can you figure out a posting schedule for this Mark Zuckerberg guy, maybe there are certain times he is more likely to post. Then you can increase your rate of monitoring during those times, without such a high chance of being labelled a spammer by Facebook?
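
A minimal sketch of the scheduling idea above, assuming you have already collected the timestamps of past posts as ISO 8601 strings. The `busiest_hours` helper and the sample times are hypothetical and only illustrate the counting step; they are not real data and not the author's code.

```python
from collections import Counter
from datetime import datetime


def busiest_hours(timestamps, top=3):
    """Return the `top` hours of day (0-23) with the most posts."""
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return [hour for hour, _ in counts.most_common(top)]


# Hypothetical post times, made up for illustration.
sample_posts = [
    "2016-01-04T09:15:00",
    "2016-01-11T09:40:00",
    "2016-01-19T17:05:00",
    "2016-02-02T09:55:00",
    "2016-02-10T17:30:00",
    "2016-03-01T13:20:00",
]

print(busiest_hours(sample_posts, top=2))
```

A monitoring script could then poll more often during the returned hours and back off the rest of the day.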

It takes a while for a Facebook post to be processed. When posted it shows up in the user's news stream. However, once the page is refreshed it might be a few minutes before the post is seen again. I watched a video where some developers were talking about the problems with getting feedback to the posting user and pushing the posts to different server farms around the world. A lot of posts take minutes to propagate.

That would mean that you should use vpn and get an American IP address, close to a FB datacenter.

To hack Facebook maybe first we need to understand how they do things.[1]

[1] https://www.facebook.com/notes/facebook-engineering/scaling-...

That sounds like another interesting project to try :)

I should try this

Looking at the graph of comment text, it's remarkable how many people are convinced that Mark Zuckerberg should be giving them money for some reason.

I agree that it was difficult parsing data on a post about a specific topic, Christmas.

However, most comments will be in reply to a post which usually has a central theme.

True. Looking back though, it might have been more fun to pick a more controversial subject.

Enjoyed the article. For once, this article showed some flaws in the original process/idea, and showed very nicely how an original seed idea turned into something different, and more involved/interesting.

Clearly a lot of work and sweat went into getting the results you did, and the final output looks very polished.

Congrats for having the courage to post this.

does Mark respond to any of the posts? if so, what kind. Do some types of posts always generate more "buzz" among other commenters? Those are some of the questions that would be interesting to answer.

In this case the process was far more interesting than the findings, thanks for citing the book and videos.

I don't remember seeing any responses in my data set. I think I have seen him respond to comments before, hence where I got my inspiration.

I also know that he has people who actually are connected to him (he friended them back) and their comments appear to show up differently from the regular followers.

He does reply to those comments a lot more, intensifying the illusion that if you write to him, he will read it.

Lovely write-up, made me smile. I'll be reading your other posts on related subjects.

Great to read a walk through of a "directed trial and error" approach. Out of curiosity, how did you select NLTK and a graphing approach? Did you consider other techniques for ploughing through the data?

It would just be much easier to use the Facebook graph api, there is an official Python module and is well documented, and would be less likely to hit rate limits or other blocks - ironically that was one of the reasons that the author used scraping instead of the api.

Incidentally, the Graph API wouldn't work for this use case, since the Graph API will not let you get any data from users unless that user is authorized.

Inspiring use of scraping through social media - just what I was looking for as my next project.

Alex, where's my money $4.5m? :) Nice to read an article with a smile.

Haha :)

There is a point at which social becomes anti-social, even online.

Cool experiment!

Btw, is there any similarity measure that can be calculated with less complexity? (e.g. without the need to compare every pair of comments?)
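
One possible way to avoid comparing every pair, as a rough sketch: reduce each comment to a normalized fingerprint and bucket comments by it in a single pass. This assumes that near-identical comments (same words, different case, order, or punctuation) are what you want to find; `fingerprint`, `group_near_duplicates`, and the sample comments are hypothetical names and data, not the approach used in the article.

```python
import re
from collections import defaultdict


def fingerprint(comment):
    """Lowercase, drop punctuation, and sort the unique words."""
    words = re.findall(r"[a-z0-9']+", comment.lower())
    return " ".join(sorted(set(words)))


def group_near_duplicates(comments):
    """Bucket comments by fingerprint in one pass over the data."""
    buckets = defaultdict(list)
    for comment in comments:
        buckets[fingerprint(comment)].append(comment)
    # Keep only buckets that actually contain repeats.
    return [group for group in buckets.values() if len(group) > 1]


# Made-up comments for illustration.
sample = [
    "Merry Christmas, Mark!",
    "merry christmas mark",
    "Mark, Merry Christmas!!!",
    "Please send me money",
]

groups = group_near_duplicates(sample)
print(groups)
```

This only catches comments that share the exact same set of words. Fuzzier matches would need something like locality-sensitive hashing, which is a different technique.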

What a fun project, thanks for the share!

Awesome work, love the concept :)

Could you reply to his post by SMS?

I am sure I could, but I generally have a fast internet connection, so it wouldn't be any faster.


Haha, you win. I am not 100% sure what you mean, but I am glad you liked it?

Mark is not bill! Mark got everything from mommy and daddy! Bill started in his garage.

Bill Gates was also from an affluent family, which among other things helped him to log a lot of hours on a computer at the time when it was not available to the masses.

I think there is some luck at play for both of them. But I also think they deserve credit for their accomplishments.

I give both of them credit for their successful businesses. It's the same credit I give to Donald Trump. They are all fine businessmen. In my world, I want moral businessmen.

They all made a lot of money.

There's a part of me that cringes whenever Gates talks about giving away his money. Yes, he made the money legally. We paid for his philanthropy? (To Gates's and his wife's non-profit: stop giving third world farmers Monsanto seeds. They are only getting one crop. Listen to Buffett's son.)

As to Mark. Yes he's a fine businessman. I don't think he has had an original thought since he stole the idea from the twins. I've never worshiped, nor liked that guy. I take that back--loved the tee shirts, and jeans. Honestly--I've never understood the whole tie thing.

And to be completely honest, I will love the internet again when/if FB is displaced by the next big, new, whiz-bang app. I am really tired of FB.

Go ahead call me a Troll. Call me a Hater. I like a lot of Founders, and their companies. I have never liked these two.

Who are some of the lesser-known founders that you do like, out of curiosity?

Just because your parents give you a few bucks/access doesn't mean that you instantly become billionaire status!

To get to the top you need a lot of luck, a lot of ability, a huge drive and every advantage you can get. This doesn't detract from the amazing accomplishment of both men.

If Facebook is so important to you, why not just get the necessary skills for the interview and join the company?

I think the article is very tongue in cheek! I'm pretty sure the author was trying to freak Mark out; I'm not certain this work would help to get him a job at Facebook either!

Haha, spot on.
