
I found myself scrolling through GitHub’s “trending” repos, looking for some coding inspiration. Within the hour, I stumbled across something called The Sherlock Project. Interesting: it had over 35k stars, so it must be pretty popular.

I quickly cloned the repo and started toying around with it. It didn’t take me long to realize the power of this tool. All I had to do was insert a username, and voila! I was looking at every social media website that was associated with the username. Not only that, but I also got direct links to the accounts.

I immediately wanted to turn this into a web app so that everyone could use it. My first challenge was that this was a CLI tool, so I got to work. The Sherlock project makes about 400 requests to various sites to check if your username exists. This was going to be tough... I noticed they were using FuturesSession from requests-futures to run the requests concurrently.
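
A minimal sketch of that kind of fan-out, for illustration only. It is not Sherlock's code: it uses the standard library's ThreadPoolExecutor instead of FuturesSession, and the SITES table and the check callback are hypothetical stand-ins (a real check would make an HTTP request and inspect the response).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterator, Tuple

# Hypothetical URL templates; the real tool ships a much longer list.
SITES: Dict[str, str] = {
    "GitHub": "https://github.com/{}",
    "Hacker News": "https://news.ycombinator.com/user?id={}",
}


def check_all(
    username: str,
    check: Callable[[str], bool],
    sites: Dict[str, str] = SITES,
    max_workers: int = 20,
) -> Iterator[Tuple[str, str, bool]]:
    """Yield (site, url, exists) as each check finishes, not in site order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the site it belongs to.
        futures = {}
        for site, template in sites.items():
            url = template.format(username)
            futures[pool.submit(check, url)] = (site, url)
        for future in as_completed(futures):
            site, url = futures[future]
            try:
                exists = future.result()
            except Exception:
                # Assumption for this sketch: a failed check counts as "not found".
                exists = False
            yield site, url, exists
```

Because results are yielded as they complete, a caller can push each one to the frontend immediately instead of waiting for all the checks to finish.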

I decided to use a multithreaded WebSocket to continuously report data to the frontend. After a lot of trial and error, I finally got something working. The issue now, though, was that it wouldn't run in production due to a multiprocessing error: Daemonic processes are not allowed to have children.

Eventually I learned that you can't use the standard multiprocessing library for this kind of thing; you have to use billiard. Bam! It worked. I quickly hacked together a simple frontend, configured the WebSocket, and results were pouring in.
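
Some background on that error, as a hedged sketch: the standard multiprocessing module refuses to start a child process from a process whose daemon flag is set, which is typically the case for pool workers. The snippet below is not the billiard fix described above. It shows a different workaround, assuming the work is I/O-bound: check the flag and use threads, which a daemonic process is still allowed to create. The helper names are made up for illustration.

```python
import multiprocessing
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def in_daemonic_process() -> bool:
    """True if the current process is daemonic and so cannot spawn children."""
    return multiprocessing.current_process().daemon


def run_in_threads(func: Callable[[T], R], items: Iterable[T], workers: int = 8) -> List[R]:
    """Run func over items using threads; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))
```

Threads suit this case because the work is mostly waiting on network responses; CPU-heavy work would not see the same benefit.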

Turns out, the WebSocket is considered a "long running request" as it makes 400 external requests. Maybe I could use Celery to offload this process to a worker and queue it up. I started working on it and realized this was a little out of my skill range.
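
The general shape of "offload to a worker and queue it up" can be sketched with the standard library alone. This is a stand-in for the idea, not Celery's API: JobQueue and its methods are hypothetical names, and everything runs in one process with a single background thread.

```python
import queue
import threading
import uuid
from typing import Callable, Dict, List, Optional


class JobQueue:
    """Accept jobs quickly, run them in a background worker, store results by id."""

    def __init__(self, handler: Callable[[str], List[str]]) -> None:
        self._handler = handler
        self._jobs: "queue.Queue[tuple]" = queue.Queue()
        self._results: Dict[str, List[str]] = {}
        self._lock = threading.Lock()
        # Daemon thread so the worker does not block interpreter shutdown.
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, username: str) -> str:
        """Enqueue a lookup and return a job id right away."""
        job_id = uuid.uuid4().hex
        self._jobs.put((job_id, username))
        return job_id

    def result(self, job_id: str) -> Optional[List[str]]:
        """Return the stored result, or None if the job is unknown or unfinished."""
        with self._lock:
            return self._results.get(job_id)

    def join(self) -> None:
        """Block until every submitted job has been processed."""
        self._jobs.join()

    def _run(self) -> None:
        while True:
            job_id, username = self._jobs.get()
            try:
                outcome = self._handler(username)
            except Exception as exc:
                outcome = [f"error: {exc}"]
            with self._lock:
                self._results[job_id] = outcome
            self._jobs.task_done()
```

The web request only has to call submit and return the job id; the client can then poll result (or be notified over the socket) while the slow lookups run elsewhere.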

I then decided to take a look at the logs where I hosted the code, and what do I find? CPU, memory, and bandwidth all at a staggering 100% usage. I was using the free tier of Render, which only allowed for one instance of my app... duh. I did some rework of my codebase and it started running a little faster.

Needless to say, I learned to take it slow, build tests for my code, and be patient with results.

What do you guys think? Any hard lessons learned in coding? What were your takeaways?

Here is also a link to the repo: https://github.com/bnkc/handlefinder




You need a privacy policy (or at least a one-liner statement) that gives potential users some assurance you aren't harvesting their username / IP / etc or the results for some other purpose or piping it to advertisers.


Yep. The wrapper around the tool is neat and looks well done based on what I could see, but I hesitated based on just that consideration.

edit: I thought I should make my feedback less generic. In this case, by neat I mean: no fluff, no useless stuff on the landing page, straight to the point. I appreciate that.


If they were going to harvest those usernames by posting it on hackernews, wouldn't it be easier to just scrape hackernews for usernames in the first place?


Haven't really thought of that. I'll take a look.


I think it would be useful to show the networks that the username cannot be found on.


That's a good thought. I was considering it but was worried about "cluttering" the site.


I echo this request. It was my expectation actually!

I searched for my username and was shocked it was used on literally every website you checked. Then I tried a less common variant and was similarly shocked, until I realized that you were showing me fewer websites the second time around. Only then did I realize you only showed me sites where it was already claimed…


Could be useful for people who like to “claim” their common handle.


What I find annoying is that (other than HN) most places won't let you claim a username if someone has already signed up with it, even if they never posted (or even logged in) for years, or if the account was deleted.

When I signed up to HN, I had a different username. I reached out to the admins; they looked at the other account that had been created and let me have the name because the person hadn't logged in since. Oh! I think GitHub did as well. (shit, I wonder if I have mixed up GitHub and HN... I'm pretty sure they're the two that did actually let me have my handle... :x)


That sounds like a bad idea.

Why would it be a good idea to recycle usernames like that?


Because some people sign up for a site early and then never use it. I can't get my desired handle on a particular social network because some dude registered 15 years ago, posted one item, and then never used it again.

I mean, obviously I'll live but it's a silly situation.


Because over time we can wear down the idea that a public-facing username is some kind of unique identifier.


I really like how Discord handles it. Username + 4 digit number = unique handle, but if nobody else has the same username in a server then you can just use the username to refer to them
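
A rough sketch of that lookup rule, based only on the description above and not on Discord's actual implementation. The Handle type and the resolve function are hypothetical.

```python
from typing import List, Optional, Tuple

# (username, four-digit number); the pair is the globally unique handle.
Handle = Tuple[str, str]


def resolve(name: str, members: List[Handle]) -> Optional[Handle]:
    """Resolve 'name' or 'name#1234' against the members of one server."""
    if "#" in name:
        username, tag = name.rsplit("#", 1)
        return (username, tag) if (username, tag) in members else None
    # A bare username only resolves if exactly one member uses it.
    matches = [member for member in members if member[0] == name]
    return matches[0] if len(matches) == 1 else None
```

With members sam#0001, sam#4242 and alex#1337, "alex" resolves on its own, while "sam" is ambiguous and needs the number.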


Do you have any alternative solution in mind? ICQ numbers?


Content


Same reason we recycle domains.


Which is a bad idea also. We should have some domain name space which we let people register forever.


That results in the clownpenis.fart problem.



What problem is that?


Eventually all good names will be taken.


> I think GitHub did as well.

I can confirm that GitHub definitely does that.


.. echo'd as well.


this reply is really only quasi-related to your first sentence.

since you mentioned the "trending" repositories on github, i wanted to give the https://github.com/nschloe/github-trends project a shout out.

it's not the same thing at all but also kind of the same thing, although it's actually, at the same time, kinda also not. alright, i'll get real. the most important thing regarding the linked project is the fact that it's got graphs. with lines. in various colors. lines that generally rise upwards, towards the right-hand side of your screen(s). lines that, more often than not, have slopes which vary in intensity and length. lines that are part of, if i may take this chance to kindly reiterate, graphs.

everybody loves graphs, right? i know i do. almost as much as i love search results linking to 37 minute youtube how-to videos for reminders/instructions on how to fix a 37 second problem with absolutely zero transcript in the video description's text area.

anyways. here's a great example of a proper how-to video. https://www.youtube.com/watch?v=py3QKC_OTvI

if you're having a bad day, ignore everything i've said, and just watch the 1st 6 seconds of the above how-to video. :)



