Show HN: Turn GitHub Usernames into Emails (github.com)
51 points by kwl 436 days ago | hide | past | web | 38 comments | favorite

I would recommend that everyone create a separate email address for their open-source work. After ten years on SourceForge, Google Code and now GitHub, I get 200 spam messages a day.

Definitely. When I first started programming I felt weirdly happy about getting spam related to GitHub (because it meant at least one other person thought I was a 'real' programmer), but that quickly turned into annoyance at having such a cluttered inbox.

The main source of spam in my inbox is open source DLs (distribution lists), because apparently they are based on either Google Groups or Mailman and they almost always require registration. Sigh.

If you just want a list of contributors' emails, you can just run `git log --pretty=format:'%ae' | sort | uniq`

Even better, run those results through something like Mailgun's email validation REST service, and you've got valid emails: https://documentation.mailgun.com/api-email-validation.html


> Mailgun’s email validation service is available for free to all Mailgun customers. It is intended to validate email addresses submitted through forms like newsletters, online registrations and shopping carts. It is not intended to be used for bulk email list scrubbing and we reserve the right to disable your account if we see it being used as such.

Install Flanker [1] and you can do it locally. Just make sure your ISP/firewall allows outgoing SMTP connections on port 25, and ideally have Redis running for caching [2].

[1] https://github.com/mailgun/flanker/

[2] https://github.com/mailgun/flanker/blob/master/docs/User%20M...

What is mail verification in 2016? SMTP VRFY is normally disabled these days.

Or `git shortlog -se`

this is much faster.

`sort -u` and you can drop the `uniq`.
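A minimal sketch of the deduplication step discussed above, assuming the addresses arrive one per line on standard input. The function name `unique_emails` is made up for illustration and is not part of git or of the linked project.

```shell
#!/usr/bin/env bash
# unique_emails: hypothetical helper (name is an assumption).
# Reads one address per line on stdin and prints each distinct
# address once, in sorted order. `sort -u` does the work of
# `sort | uniq` in a single process.
unique_emails() {
    sort -u
}

# Typical use from inside a repository:
#   git log --pretty=format:'%ae' | unique_emails
```

Note that this treats addresses as plain strings, so `A@example.com` and `a@example.com` count as two different entries.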


I hate how people refer to email addresses and email accounts as "emails". Half the time, I don't know if they're referring to addresses, accounts, or messages. I'm not looking forward to when these people start referring to phone numbers as "phones".

On that note, I hate how Hamiltonian, Jacobian, and Lagrangian are nouns while Brownian, Newtonian, and Freudian are adjectives. It's about time we standardize our language and namespace conventions so these things are clearer.

forgot Orwellian: "1984" has a strong theme of standardizing namespace conventions to make things clearer. They call it "Newspeak". A terrifying idea.

I think the terrifying part of Newspeak is more in brainwashing people by removing certain words from the vocabulary. Standardized namespace conventions aren't scary, and many human languages are close to standardized. English is just a mess. Spanish is a lot better in this respect (e.g. all verbs end in -er, -ir, -ar, period).

>I'm not looking forward to when these people start referring to phone numbers as "phones".

In my native language (Russian) people already do (and have done long before email existed). In fact, I don't see why that would be unusual. When faxes were used, everyone used to refer to fax numbers as "fax", right?

In bash, in case anyone is interested in something they can throw in their .profile:

    gitspam () {
        # Require exactly one argument that looks like an http(s) URL to a .git repo
        if [ "$#" -ne 1 ] || ! printf '%s\n' "$1" | grep -Eq 'https?://.*\.git'; then
            echo "usage: gitspam http://somedomain.tld/path/to/repo.git"
            return 1
        fi
        tmpdir=$(mktemp -d)
        git clone --bare "$1" "$tmpdir" &>/dev/null
        # Run in a subshell so the caller's working directory is left alone
        (cd "$tmpdir" && git log --pretty=format:'%ae' | sort -u)
        rm -rf "$tmpdir"
    }

I hit my noprocrast before I could edit this, but I believe that this:

    git shortlog -se |& sed 's/^.*\(<\(.*\)>\)/\2/g' | sort -u

is a quicker replacement for

    git log --pretty=format:'%ae' | sort -u

by about 50%.
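For anyone who wants the sed step spelled out, here is a small sketch, assuming input in the `count<TAB>Name <address>` shape that `git shortlog -se` prints. The function name `emails_from_shortlog` is hypothetical, and the stricter pattern is my own variation rather than the one-liner above.

```shell
#!/usr/bin/env bash
# emails_from_shortlog: hypothetical helper (name is an assumption).
# Reads `git shortlog -se`-style lines on stdin and prints the text
# between the final pair of angle brackets on each line, deduplicated.
emails_from_shortlog() {
    # -n plus the trailing p prints only lines where the pattern matched,
    # so lines without an <address> are dropped instead of passed through.
    sed -n 's/^.*<\([^<>]*\)>[[:space:]]*$/\1/p' | sort -u
}

# Typical use from inside a repository:
#   git shortlog -se HEAD | emails_from_shortlog
```

Passing `HEAD` explicitly matters in scripts: when standard input is not a terminal and no revision is given, `git shortlog` reads a log from stdin instead of walking the current branch.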

This is why I commit with username@users.noreply.github.com

GitHub's documentation:

  # Set your email address
  git config --global user.email "username@users.noreply.github.com"
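A rough sketch of how one might check that a configured address really is the noreply form before committing. The helper name `is_noreply_email` is invented for illustration. It only does a string match on the `users.noreply.github.com` suffix quoted above.

```shell
#!/usr/bin/env bash
# is_noreply_email: hypothetical helper (name is an assumption).
# Succeeds (exit status 0) when the argument has a non-empty local part
# followed by @users.noreply.github.com, and fails otherwise.
is_noreply_email() {
    case "$1" in
        ?*@users.noreply.github.com) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use from inside a repository:
#   is_noreply_email "$(git config user.email)" || echo "warning: real address in use" >&2
```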

Looks like a great way for recruiters to harvest email addresses.

Not surprisingly, they've been doing it for years.

That was exactly my thought as well. Who else is just going to send e-mails to random people from Github repositories?

Ooooooh this is gonna be controversial

Every few months someone discovers that either a) there are e-mail addresses in git commits!!! Public!!! or b) You can put ANY e-mail address there and commit code as SOMEONE ELSE! And it seems the general reaction always is: "Yeah, that's how it works." That stuff is just too easy and obvious to be that interesting...

Now what are appropriate ways to use that data, that's an interesting question.

Related: https://github.com/ghtorrent/ghtorrent.org/issues/32 (GHTorrent archives the public timeline of GitHub)

a) is fine (which is what this "Gitmail" service provides), but shouldn't b) be preventable if you tied email addresses to the public keys uploaded by the user? So if I tried to push commits to GitHub with an email address that belongs to Linus Torvalds, GitHub should be able to reject the push, because I authenticated with a key pair that is not among Torvalds' keys.

(Perhaps I don't understand the way in which git pushes work well enough?)

You have to be able to push commits other people have made. E.g. if you work together in a different repo and then publish the result to github, one person pushes all the commits.

Maybe GitHub should indicate more precisely who pushed the commit, but on the other hand that's often unnecessary noise as well.

If you want to be able to trust the data in a commit, it has to be signed (which nearly nobody does, and AFAIK GitHub doesn't display)

> You have to be able to push commits other people have made. If you work together in a different repo and then publish the result to github, one person pushes all the commits.

I completely missed that use case. Thanks for the explanation!

You're right in that no one signs commits (unfortunately, including myself).

Someone should build a tool that autosigns your commits for you if you have proper SSH keys and emails.

There's an interesting discussion on Stack Exchange about whether it's useful to sign every commit:


Signing commits bumps the size of the repo substantially. Not to mention that you then have to maintain your WoT connection in order to make sure people can verify that your key is actually yours (although you can use keybase for that).

It's not so much about who pushes the branch to a repo as about verifying who made the commits within a push, since a single push could contain commits from multiple people. For verifying commit authors we already have commit signing[1]. It's just a pity nobody uses it.

[1]: https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work
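On the "autosign" wish upthread: git can already sign every commit once it is configured to. A minimal sketch, assuming you already have a GPG key; the wrapper name `enable_commit_signing` is made up here, and the key ID you pass is whatever your own key is.

```shell
#!/usr/bin/env bash
# enable_commit_signing: hypothetical wrapper (name is an assumption)
# around two real git settings. Takes a GPG key ID as its only argument.
enable_commit_signing() {
    key_id=$1
    if [ -z "$key_id" ]; then
        echo "usage: enable_commit_signing KEY_ID" >&2
        return 1
    fi
    # Tell git which key to sign with.
    git config --global user.signingkey "$key_id"
    # Sign every commit by default, as if -S were always passed.
    git config --global commit.gpgsign true
}

# Afterwards, signatures can be inspected with:
#   git log --show-signature -1
```

Without the global setting, a single commit can still be signed with `git commit -S`.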

Shouldn't be. Data's already there.

I wonder if this is similar to what spamming companies are using:

"We've seen your contributions to github and think you'd be an excellent fit for our startup ... PS Please move to Paris".

> "temp_email" already exists

That could be solved by using this, I believe: http://ruby-doc.org/stdlib-2.0.0/libdoc/tmpdir/rdoc/Dir.html

Yup! It's been updated to use tmpdir so that isn't an issue anymore

Yeah, this is why we have spam filters. Although, to be fair, I got my job at SUSE because of my free software contributions. So there is a benefit of including a real email (one that you check) in your commits.
