Linus Torvalds did not commit this (github.com/amoffat)
301 points by daenz on Aug 4, 2015 | 184 comments

Linus actually commented on this lack of identity handling in 2012 in his famous explanation of why he doesn't use GitHub: https://github.com/torvalds/linux/pull/17

"since github identities are random, I expect the pull request to be a signed tag, so that I can verify the identity of the person in question."


"github throws away all the relevant information, like having even a valid email address for the person asking me to pull."

Yeah, the GPG signing support in Git would be the best solution to this. Unfortunately, GitHub doesn't offer integration for that.

Maybe this is where Keybase.io could come in?

He called github "braindamaged". Isn't that against the CoC?

> He called github "braindamaged". Isn't that against the CoC?

So, you think he violated a Code of Conduct that did not exist at the time, in the course of explaining why he was not a user of the service (so that any Code of Conduct for that service would not apply to him even if it existed)?

He's making a sarcastic reference to this recent incident: https://b0wie.s3.amazonaws.com/second-response-full.jpg.

I wonder what would happen if I push some electromagnetic calculations to Github (I'm a physicist): https://en.wikipedia.org/wiki/Retarded_potential

You can't win an argument against people who gave up on logic.

Making the violation before it's formalized does not necessarily excuse it. Perhaps proper action is to remove the full post or specific parts.

Who knows, but further down, where he calls someone a moron and starts a little rant, he surely breaks any form of code of conduct.

Kinda hard to not come up with it when setting up your credentials is the first thing git wants you to do before you can commit anything. By the way, you can also override them with command-line switches per commit instead of setting environment variables.

I guess it's worth noting here that you can sign your commits with GPG: https://git-scm.com/book/tr/v2/Git-Tools-Signing-Your-Work

Similarly, nothing stops you altering the time claimed in the commit. Or -- for that matter -- from taking someone's diff and claiming credit for it.
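
To make that concrete: the author name, email, and timestamp are just one line of text inside the commit object. Here is a minimal Python sketch of how such a line is laid out. The helper name `identity_line` and the example identity are made up for illustration; nothing here talks to git itself.

```python
from datetime import datetime, timedelta, timezone


def identity_line(role, name, email, when):
    """Format an author/committer header as plain text: name, email, epoch seconds, UTC offset."""
    offset = when.strftime("%z") or "+0000"
    return f"{role} {name} <{email}> {int(when.timestamp())} {offset}"


# The moment the commit is really being made (fixed here so the output is stable).
real_now = datetime(2015, 8, 4, 12, 0, tzinfo=timezone.utc)

# Nothing checks the claimed values, so the time can be backdated and
# the name can belong to anyone. Both are hypothetical examples.
claimed = real_now - timedelta(minutes=5)
line = identity_line("author", "Someone Else", "someone@example.com", claimed)
print(line)
```

Git takes the same fields from its configuration or from the `GIT_AUTHOR_NAME`, `GIT_AUTHOR_EMAIL`, and `GIT_AUTHOR_DATE` environment variables, and it does not verify any of them.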

For that reason, I jokingly created `git-upstage`, which streamlines the process of abusing commit edits and plagiarizing code! It squashes a branch, backdates it 5 minutes, and claims you wrote it.


Edit: Looks like my last commit left the important stuff commented out, and I can't fix it at the moment. Ah well, you're going to use the tool to rip it off anyway ;-)

I love this. Had a project in college that was supposed to be time limited... based on repository times. Oops. Big mistake prof. We rolled back the times on our repo and laughed maniacally about our free 6 hour extension.

Should professors really be spending their time locking down all the ways students may try to cheat? At Caltech, proctoring exams (for example) is not allowed by institute policy. A student's honor that he didn't cheat is considered good enough.

That's not the whole story. The student's honor is considered good enough that we shouldn't spend time locking things down, but at least when I interacted more with undergraduates here, Caltech had an investigation process for undergraduate academic dishonesty that was student-run and utterly dysfunctional, apparently involving scenes that would bring to mind the Spanish Inquisition, complete with 2am interrogations and insistence on confessions.

I was told of severe disciplinary actions for accusations that were ridiculous when students would not confess, and an administration that stood behind the decisions of students who judged other students more on their opinions than anything else. One student, for example, was apparently expelled for an instance of claimed cheating that would have involved him running back and forth on campus at the speed of a competitive athlete. He quietly returned a short time later, and was also admitted here as a graduate student; there were rumors of legal threats and a settlement.

At least when I was hearing more about such things, essentially, Caltech placed tremendous trust in students who were well-liked by particular people, and treated those who were disliked very poorly.

Why wouldn't you proctor exams? The time spent is small, less than 10 hours a semester, and the proctors can answer student questions or make corrections and clarifications to test questions. That it's a small disincentive to cheat is nice too, though in my experience only the most blatant of cheating would be caught. I say all this as someone who proctors exams.

> Why wouldn't you proctor exams? The time spent is small, less than 10 hours a semester, and the proctors can answer student questions or make corrections and clarifications to test questions.

Because Caltech faculty (and, for undergraduate student exams, grad students) have better things to do with their time than proctor exams (and, perhaps more to the point, because Caltech wants to attract faculty and grad students that feel that they have better uses for their time than baby-sitting exams).

And, frankly, like many aspects of the trust extended to Caltech students, it's a recruitment policy -- Caltech is an extremely selective, extremely small school that is competing with other elite institutions to attract the best students.

There's more to it than that. For example, I didn't feel any need to lock my dorm room door when going down the hall to the bathroom, and often never bothered to even close it. I never had anything stolen, nor did anyone else. (Not totally true, there were a couple instances where an outsider came in the unlocked dormitory doors and tried to boost something, but the other students gave chase and caught them.)

It's just nicer to live that way.

A friend of mine at UT had his room sacked the first week.

Honor codes and not proctoring exams (rather, the students doing it themselves) are a fairly common feature of engineering colleges, it isn't really something that contributes to Caltech standing out.

> Honor codes and not proctoring exams (rather, the students doing it themselves) are a fairly common feature of engineering colleges, it isn't really something that contributes to Caltech standing out.

I'm not saying Caltech is unique in doing that, I'm saying that in the universe Caltech operates in it would conflict with their recruiting interests -- both for faculty and students -- to operate in a different way.

Having proctored my fair share of exams...you're there primarily to answer questions, not to enforce anything. The student who wants to cheat will find a way to cheat.

The reason was to emphasize that the students were trusted. Sometimes a professor would sit outside in the hall to answer questions, but he would not go in the room.

Most of the exams were take-home anyway, and included instructions giving a time limit and what reference material was allowed to be used.

To me this sounds like admin trying to save money. I found some more discussion in a couple places:



That's the first I've heard that it had anything to do with saving money, and I spent 4 years there. It does, however, make life easier for professors and students when you can trust each other.

I don't know what Caltech is like today. I attended in the '70s, and the honor system was considered sacred by the students. If there were cheaters, they never bragged about it, and I don't know of any. I know one who fell asleep during his take-home exam, woke up and finished it, and so exceeded the time limit. He noted this on the exam. The professor replied back that he was very sorry and was forced to give him an F. The student repeated the (required) class next year.

The number of students who did poorly on exams argues that cheating was not widespread.

If the culture has changed in the intervening years, that makes me very sad.

> The professor replied back that he was very sorry and was forced to give him an F. The student repeated the (required) class next year.

Wow, way to prize process over people. Why not let the student drop the class and just re-take the test next time around?

Is Caltech designed to teach science, or train paper-pushers?

I don't know about Caltech, but many universities replace an F grade if you retake the class.

In my day (ca. 2000) there was no "forced to give an F" and in fact it was very common for exam-takers to draw a line, write "everything below this line I did after the time limit", and get partial credit for it.

Not that I recall. I don't think it's quite fair to do that, as it then becomes an infinite time exam.

But also consider that the midterm and the final were the entire grade. No credit was given for homework, showing up for class, etc. The rules about the exams were pretty clear.

However, if you had a borderline exam grade, but had done the homework diligently, the prof would use that as a tie breaker.

His fellow students thought the F was a bit harsh, but he conceded that it was fair and took his lumps with equanimity. I quite admired him for it. In the end, it didn't hurt him because he graduated and went on to a very successful career.

You could at least ask the student how much time they took actively working on the exam, less the part where they fell asleep, and compare that to the time limit.

The awake time spent on the test was under the time limit. The wall time was a few hours over.

> I don't know what Caltech is like today. I attended in the '70s, and the honor system was considered sacred by the students. If there were cheaters, they never bragged about it, and I don't know of any.

I (briefly) attended in the early 1990s, and it was the same.

> A student's honor that he didn't cheat is considered good enough.

Individual students certainly can have this integrity, but the demographic as a whole is demonstrably susceptible to cheating.

It also offends basic scientific process, in that it suggests that you don't need to bother with trying to make sure your results are robust; instead it relies on someone's word. Why bother having exams, then? Just ask "Do you think you understand the course material properly"?

Many times I honestly believed I understood the material, only to fall way short on the exam :-) Scientists can honestly believe their (wrong) results are correct, because it's easy to have unintended sources of error. This is why others try to replicate results, it's not necessarily about catching cheaters.

Think of it like running a marathon. Is there any satisfaction from thinking "I honestly believe I can complete a marathon!"? I don't think so, but there's a helluva lot from actually completing one. Same for a tough degree program.

I agree with you, but the analogue is not scientists having their work replicated, it's doing the work in the first place. We don't accept 'trust me on this' in a scientific paper.

Regarding marathons, some people do cheat them - they're more interested in the social rewards than the personal growth. Some people flat-out lie about doing them at all. Different folks have different motivations.

> Individual students certainly can have this integrity, but the demographic as a whole is demonstrably susceptible to cheating.

Certainly demonstrably true of a significant portion of the demographic of "college students".

Caltech would probably argue that Caltech students are not a representative sample of college students, and that generalization from the more general class to the more specific here is a textbook example of the fallacy of division.

That's a 'begging the question' fallacy, where the conclusion ('Caltech students are more honourable than that') is used as the premise (ditto).

Even if we all agree that Caltech students are not representative of college students in general, it doesn't automatically follow that there is no significant degree of cheating.

> That's a 'begging the question' fallacy, where the conclusion ('Caltech students are more honourable than that') is used as the premise (ditto).

No, it's not. Rejecting the validity of an argument for p is not the same as making an argument for not-p.

But that's exactly how you presented their supposed argument: "Caltech students don't cheat because they're not representative of the general student population", with the vague assumption that Caltech students are more honour-bound.

Not being representative of a population doesn't give any information about the makeup of the subsample, unless you have more information to add.

> But that's exactly how you presented their supposed argument: "Caltech students don't cheat because they're not representative of the general student population"

No, it's not, which is why you had to present a "quote" that isn't one to advance that story.

I presented how they would reject an argument from the general population, not a positive argument for the absence of cheating at Caltech.

What are you talking about? You presented the argument as a potential rebuttal to the claim that students cheat. It's inherently implied that it's an argument for the absence of cheating at Caltech... otherwise it wouldn't be a rebuttal at all, and instead is a non-sequitur fallacy.

Yes, if you strip away the context of the literal words you said, you're correct. But in context, you're not.

An interesting question is does Caltech's admissions process select (unwittingly or otherwise) people who are likely to follow the honor system, or do people tend to rise to an expectation of integrity? I suspect the latter is more likely.

I also suspect that a university with strong anti-cheating measures is expecting students to cheat, and students will naturally fulfill that expectation.

> An interesting question is does Caltech's admissions process select (unwittingly or otherwise) people who are likely to follow the honor system, or do people tend to rise to an expectation of integrity?

I think the latter is definitely true, the former is probably true, and perhaps more importantly, Caltech's admissions process selects for people who are likely to view a system where rules are enforced primarily by monitoring as a challenge, making the alternative to trust being an arms race that consumes resources on both sides that could be more productively employed.

Even if it selects for that trait, it doesn't mean that there isn't a significant level of cheating. You can reduce the incidence of X and still have problematic levels of X.

Edit: An example: the US homicide rate has fallen in recent years. That's good news. But the US still has a serious problem with homicide, as its homicide rate is an outlier at 4-5 times that of all other first-world nations. Americans are getting less murderous, but murder is still a serious problem there.

At my school many, many students cheat. This devalues the education that I'm getting and is frustrating to students who actually put in the time to study.

It devalues your degree, but not your education. Having the confidence that you actually mastered the material is worth a great deal. Stay strong, dude.

True. You're right, it only devalues the degree but that's a huge part of the reason that I go to school at all.

Much of my education is self taught, and hour-for-hour knowledge-wise I believe my time could be better spent in self-directed learning, but that degree does have value and people who cheat make it worth less to employers and myself. I think it makes sense to try and catch people who abuse that.

I understand your feelings on this, but after you've been working for 3 years or so, nobody is going to give a damn about your degree or where you got it. They'll value what you can do, and that's where your education will pay off.

I'm actually working, and for almost 3 years :)

I definitely agree with what you're saying, especially in startups it really isn't that relevant (part of why I like the startup community so much). But I do spend a lot of time outside startups as well, and that glass ceiling is definitely present in large companies and academia (especially academia).

I wasn't clear. Having a degree is important and not having one is often a blocker. For example, you can't get a mechanical engineering job without a degree. (There are legal reasons for that as well.) Where it is from does not matter, nor does your GPA.

Your comment basically reads, "I cheated and got away with it, and I'm proud of it too. I'm so awesome! Take that, stupid professor!"

Don't hate the programmer, hate the tools. ;)

Or how about: don't hate the tools; hate the cheaters?

Oh, you sweet summer child.

and what is wrong with that?

...then procrastinated for 6 hours. Or is that just me?

Oh, it appears that git 2.3.2 only allows going back as far as Dec 2014.

That is a pity. I have already prepared a README.md claiming that I have created the internet, but then I failed while trying to backdate it to 1969 (PDT). I was this close to becoming very rich and very evil. Oh well.

Don't give up so easily - you have the source code! ;)

I got to my Linux workstation, cloned the git repo and dug into the code. Could not figure out how the limitation worked though. Turned out, it did not - the problem was specific to my Mac's git. So yes, you can author files from 1970 - it is not a big deal. Thought it was kind of a weird limitation.

I guess the real takeaway is that commits should really be signed by default (e.g., by encouraging signing in the tools & ecosystem).

Really, this only becomes a problem when services like GitHub link the name up, making it look more legitimate than it is. If they enforced authentication as that user before providing the linking it would be better (perhaps allowing approval of the linking if posted by a different user). Currently it's trivial to make it look like any GitHub user is an actual committer to some sort of egregious or controversial project and users unaware of how GitHub maps this may easily be confused.

EDIT: This gets even more disturbing when you realize that GitHub is a site many people list on their resumes and if this association applies from their user page too this could get very bad.

The simple solution seems to be github just distinguishing between signed and non-signed commits.

Apparently Github doesn't list those faked commits in your profile. That is a problem anyway, because you could be "committing" to who knows what project without being notified.

I disagree, the real problem here is people thinking that somehow any github repository can be a trusted source. If you work on Linux, you know that the one and only source of truth is the repo that comes from Linus himself, not anything coming from github.

Linus doesn't distribute anything himself; you can either have the copy hosted by Github, or the one hosted by a Californian nonprofit called the Linux Kernel Organization, Inc.

It doesn't seem to link in reverse to the user page; at least, I can't find anything on Linus' page related to this repo.

There's a Github help page explaining when commits show up in your profile (though worded as an answer for people wondering why one didn't): https://help.github.com/articles/why-are-my-contributions-no...

The criterion not fulfilled for this one is that you have to have interacted with the repository on Github in some way. Any one of these would suffice: you're a "collaborator" on the repository, you're a "member" of the organization that owns the repository, you have forked or starred the repository, you have opened a pull request or issue in the repository.

We would love support for signed commits in GitLab itself so you can check the signature from the web-interface http://feedback.gitlab.com/forums/176466-general/suggestions...

Signing my commits doesn’t prove that I didn’t do an unsigned one.

It's a github 'social' issue, not really a git issue. It might also be a legal issue (identity theft). And I'm a bit surprised github lets people impersonate others through their 'social' features.

It is really not possible for github to track the origin of commits.

For example, consider a fork. If you pull some commits from the original and then push them to your fork then that would look just like this.

The only thing I can think of is that it may be possible to track commits if everyone would sign them as they were created, but that would require all users to change, so I don't see how that can happen.

A checkbox that requires push commits to be signed (you don't have to sign each one, but the top level of any push would have to be signed) would be nice. So would any indication that commits are signed in the GitHub UI. I've started signing my commits to GitHub, but verifying even that a signature is present (let alone correct) requires you to clone the repo and investigate it with commands you'd probably have to Google, as far as I know. No UI visibility whatsoever.

If a user could upload their public GPG key to their account (as they can their public SSH key), GitHub could easily verify signed commits, which git itself allows with `git commit --gpg-sign[=<keyid>] ...`.

It would be trivial for them to shell out to `git verify-commit <commit>` in order to verify that the claimed originator has signed his commits with a key tracked by github (by email address, which can belong to only one github user).

An idea that doesn't require signing is tracking commits internally.

If you push a commit made by you, Github can tag that as "trusted" (since you, the GH user, are vouching for your own commits). Then anyone who pulls them into their repo (even offline) and pushes it to their GH account would still have those commits tagged, since GH could match the hash with the ones it already knew about.
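
A rough Python sketch of that bookkeeping, to show how little state it needs. Everything here is hypothetical: the `TrustLedger` class, the account names, and the short hashes are invented for illustration and say nothing about how GitHub actually stores pushes.

```python
class TrustLedger:
    """Hypothetical server-side record of commits vouched for by their own author."""

    def __init__(self, emails_by_account):
        # account name -> set of email addresses verified for that account
        self.emails_by_account = emails_by_account
        self.trusted = set()  # commit hashes pushed by the account that authored them

    def record_push(self, account, commits):
        """commits: iterable of (commit_hash, author_email) pairs in the push."""
        own_emails = self.emails_by_account.get(account, set())
        for commit_hash, author_email in commits:
            # Only the author's own push can mark a commit as trusted.
            if author_email in own_emails:
                self.trusted.add(commit_hash)

    def is_trusted(self, commit_hash):
        # Matching by hash means the tag survives forks and re-pushes by others.
        return commit_hash in self.trusted


ledger = TrustLedger({
    "alice": {"alice@example.com"},
    "mallory": {"mallory@example.com"},
})
ledger.record_push("alice", [("a1", "alice@example.com")])
# Mallory re-pushes Alice's real commit plus one forged in Alice's name.
ledger.record_push("mallory", [("a1", "alice@example.com"), ("f9", "alice@example.com")])
print(ledger.is_trusted("a1"), ledger.is_trusted("f9"))
```

The genuine commit stays tagged after Mallory's push, while the forged one never gets the tag because its claimed author did not push it.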

For the most part, this would solve the problem, since people usually upload the commits to their own GH fork and then issue a PR.

How about just showing who actually made the HTTPS or SSH connection to Github to push the changeset?

It is a Git issue. In any Git repo, you can spoof anybody's name or email. All Github did here is show his Github account instead of e-mail. That doesn't really make it worse.

It being a "Git issue" implies it should be fixed by modifying the Git software. I'm not entirely sure how someone would go about doing that, assuming you can't assume that each machine generating contributions will only be used by a single contributor and never shared. (Tokens such as thumb drives can be stolen and copied, too, and an application can't be sure it's reading off a thumb drive anyway.)

By appending "unverified" and "unsigned" after the email addresses? Just like email. This problem has been solved long ago, there is no reason why Git can't fix this. It already has PGP signatures, there is no reason to not add a verification layer.

> [Git] already has PGP signatures, there is no reason to not add a verification layer.

It already does.

man 1 git-verify-commit

man 1 git-verify-tag

This is a Github problem.

You can run those commands by cloning that repo. The issue here is that emails can still be spoofed while looking at the commit, without running the command.

Why do you think that seeing a username on Github is any different than seeing an unverified email in any other hosted Git repo? You need to git-verify in both cases. It is your own misunderstanding that seeing a username on Github implies it's verified. It's not. Just as an email address shown in Git is not.

> Why do you think that seeing a username on Github is any different...

I don't.

> It is your own misunderstanding that seeing a username on Github implies it's verified.

I'm not confused. Others appear to be. My comment was in reply to a poster who seemed to be indicating that git lacked the ability to verify signed commits and tags.

I was informing him that git does indeed have that ability, and that its absence from GitHub is a GitHub problem, not a git problem.

You've leapt to an entirely unsupportable conclusion about my familiarity with git and commit/tag signing. :)

My informal, entirely unscientific survey of folks who use GitHub leads me to believe that they are -on average- less proficient with git, git concepts, and the notion of cryptographic signing than the average person who uses the git CLI.

This [0] appears to be the closest that the GH documentation gets to saying "anyone can commit with anyone else's email address". Adding support and graphics for tag and commit signature verification -along with support for tag/commit signing in the GitHub client- might be a nice thing to do for users who are not so familiar with git.

Oh, also:

> You need to git-verify [to verify unverified email addresses attached to git commits] in both cases.

git-verify-* is actually only useful on signed commits/tags. If a commit/tag hasn't been signed, it exits with a non-zero exit code and does nothing else. Given that the vast majority of commits one will run into will not be signed, git-verify-* typically won't help you to determine the validity of the authorship of a commit/tag. Reading the -very short- man page of either command would have caused you to understand this. :)
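
For anyone curious what "signed" means at the object level: a signed commit carries a `gpgsig` header in the raw commit object, and an unsigned one simply has none. A minimal Python sketch that checks for presence only (it does not validate anything). The function name `has_signature` and the sample commits are made up, and the signature body is a placeholder, not real PGP data.

```python
def has_signature(raw_commit: str) -> bool:
    """Return True if the raw commit text has a gpgsig header (presence check only)."""
    # Headers end at the first empty line; the commit message follows it.
    headers, _, _message = raw_commit.partition("\n\n")
    return any(line.startswith("gpgsig ") for line in headers.split("\n"))


UNSIGNED = (
    "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    "author A U Thor <author@example.com> 1438689600 +0000\n"
    "committer A U Thor <author@example.com> 1438689600 +0000\n"
    "\n"
    "gpgsig is only mentioned in this message\n"
)

SIGNED = (
    "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    "author A U Thor <author@example.com> 1438689600 +0000\n"
    "committer A U Thor <author@example.com> 1438689600 +0000\n"
    "gpgsig -----BEGIN PGP SIGNATURE-----\n"
    " \n"  # continuation lines of a multi-line header start with a space
    " (placeholder, not a real signature)\n"
    " -----END PGP SIGNATURE-----\n"
    "\n"
    "Add feature\n"
)

print(has_signature(UNSIGNED), has_signature(SIGNED))
```

Whether the signature is valid, and whether the key belongs to the claimed author, is a separate question that only real verification against a trusted keyring can answer.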

[0] https://help.github.com/articles/why-are-my-commits-linked-t...

TBH we "exploit" this when accepting PRs for an open source project I work on. It's not really feasible for us to expect / force each PR author to have a clean commit history, so we basically do some squashing, then commit the "single" change as the original author before merging.

So, when submitting to your project it can happen that I'm afterwards blamed for things I didn't do? (Or praised)

I'm not sure I follow? All of your work is intact and committed as you, it's just done as a single (squashed) commit instead of N commits. The only difficulty that potentially arises (that we've encountered so far) is a lack of granularity for commit messages, which is why we try to keep PRs very small and focused.

I think what johannes1234321 is concerned about is that while you say the work is intact, he can't actually vouch for that because he didn't do it. You did. Which means that he could potentially be blamed (or praised) for something he didn't do in the event that you don't keep his work intact (intentionally or otherwise).

Ah ok, thanks for the clarification. Yes, that is technically a possibility, but to be fair, a big takeaway from this whole thing is that signed commits are the only way to really have guarantees of authorship. If you're knowledgeable enough to do that, you probably have a reasonably clean commit history, which we generally don't squash before merging.

If someone is concerned about getting blamed or praised for his work, he can sign his commits.

I don't think that could happen:

1. The content of the PR will stay the same. In this case, the only thing changing is the number of commits. Unless you expect someone to blame (or praise) you for the number of commits?
2. Git will track the Committer and Author separately.
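
On the second point, the two identities are separate header lines in the commit object, so a squashed commit can name the contributor as author and the maintainer as committer. A small Python sketch that pulls both out of raw commit text. The `identities` helper and the sample names are hypothetical.

```python
import re

# One header line: role, name, <email>, epoch seconds, UTC offset.
IDENTITY = re.compile(r"^(author|committer) (.*) <(.*)> (\d+) ([+-]\d{4})$")


def identities(raw_commit):
    """Return {'author': {...}, 'committer': {...}} parsed from raw commit text."""
    headers = raw_commit.split("\n\n", 1)[0]
    found = {}
    for line in headers.split("\n"):
        match = IDENTITY.match(line)
        if match:
            role, name, email, seconds, offset = match.groups()
            found[role] = {
                "name": name,
                "email": email,
                "timestamp": int(seconds),
                "offset": offset,
            }
    return found


# Made-up example of a squashed commit: written by one person, applied by another.
SQUASHED = (
    "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    "author Original Contributor <contrib@example.com> 1438600000 +0200\n"
    "committer Project Maintainer <maint@example.com> 1438689600 +0000\n"
    "\n"
    "Squash of a small PR\n"
)

who = identities(SQUASHED)
print(who["author"]["name"], "/", who["committer"]["name"])
```

Both fields are still unauthenticated text, so this separates credit from application but proves neither.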

That's the problem with all these guys obsessed with clean commit history.

What's wrong with having a clean commit history?

I think it's something people take too lightly.

Nothing. Just like there's nothing wrong with not having it clean.

Depends on the project. If I ran the Linux kernel, I'd insist on clear commits too.

Oh no no no. If you are handling something as important as the Linux kernel, you will absolutely want traceability over anything as trivial as clean history. You will impose signed commits and signed merges only... none of this FF stuff. If you want clean history on top of that, you will enforce that on the original pull request, not after the fact.

I'm responding to, "there's nothing wrong with not having it clean."

Yes, there often is. And I do enforce it on the original pull request.

You need that level of traceability on each commit, especially on the Linux kernel. For all our little projects, sure, destroy the history; it's fine.

> there's nothing wrong with not having it clean.


Well, as this submission demonstrates... you lose integrity and accountability

(which you never had in the first place, if you don't sign your commits)

I don't think you need to do this? I may be wrong but I thought if you squash the author's commits git will still give the author credit for the commit.

You are correct, the Author will stay the same and the Committer will be the person who squashed.

You’re doing a rebase, which is not the same thing as purposely editing the commit author credentials.

Negative, we're actually editing the commit author credentials when we create the final commit message, which adds a Changelog message and closes the original PR.


Why not just do a rebase? I work on a big open-source project on Github and we do rewrite/squash commits to have a clean history, but it only involves a rebase, not editing the author credentials.

The issue at hand here is interesting: Git _commit_ integrity is not guaranteed over all operations, even if you sign commits.

The problem is that Git often changes commit details. If you, for example, rebase or cherry-pick a commit, the identity changes - as the commit includes a reference to the parent commit(s). This means that once you do any of those (standard) operations, the signature becomes invalid.
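
A short Python sketch of why the identity changes: a commit's id is the SHA-1 of a small header plus the commit text, and the parent id is part of that text. The `commit_id` helper and the identities are made up for illustration, and the parent ids are placeholders rather than real commits.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # git's well-known empty tree id
WHO = "A U Thor <author@example.com> 1438689600 +0000"   # hypothetical identity


def commit_id(tree, parent, message):
    """Hash a single-parent commit the way git does: sha1('commit <size>\\0' + body)."""
    body = (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {WHO}\n"
        f"committer {WHO}\n"
        f"\n"
        f"{message}\n"
    ).encode()
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


# Same tree, same author, same message; only the parent differs,
# as it would after a rebase or cherry-pick onto another base.
original = commit_id(EMPTY_TREE, "1" * 40, "Fix the bug")
rebased = commit_id(EMPTY_TREE, "2" * 40, "Fix the bug")
print(original)
print(rebased)
```

The two ids differ even though the change itself is identical, and a signature made over the first commit's text does not carry over to the second.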

Signing only makes sense on a tag level or in repositories that keep all commits and never change them (e.g. by explicitly merging and adding merge commits).

There are systems that preserve commit integrity during all of those operations, e.g. DARCS.

IMHO git does it right. If the signature were only at the commit level, you could mount a variant of a replay attack.

For example you could cherry-pick all commits except security fixes and create a malicious version of the software that has all commits still signed by the original author.

I was not saying that the signature is _just_ at a commit level ("patches" being the DARCS lingo).

DARCS, just as git, has tags for that. A tag depends on all previous patches, so a signed tag guarantees just the same as a signed git tag does.

In that regard, git is strictly less powerful.

You can use github's api to identify who pushed a commit:

Currently the event in question is on page 3 so you can use: https://api.github.com/repos/amoffat/masquerade/events?page=...

It will roll to further pages as new events come in.

Github comment with formatted json event output: https://github.com/amoffat/masquerade/commit/9b0562595cc479a...

The login for the push is, of course, identified as the owner of the repo, 'amoffat'.
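
A hedged Python sketch of reading such an event once you have the JSON. The field names (`type`, `actor.login`, `payload.commits[].sha`, `payload.commits[].author`) are my recollection of the push event shape and should be checked against the API docs. The sample data, including the sha and both names, is entirely made up, and `pusher_of` is a hypothetical helper.

```python
import json

# Invented sample in the assumed shape of an events response.
SAMPLE = json.loads("""
[
  {"type": "WatchEvent", "actor": {"login": "passerby"}, "payload": {}},
  {"type": "PushEvent",
   "actor": {"login": "repo-owner"},
   "payload": {"commits": [
     {"sha": "0123456789abcdef0123456789abcdef01234567",
      "author": {"name": "Famous Maintainer", "email": "famous@example.com"},
      "message": "Totally legitimate commit"}
   ]}}
]
""")


def pusher_of(events, sha_prefix):
    """Return (pushing account, claimed author name) for the first matching commit."""
    for event in events:
        if event.get("type") != "PushEvent":
            continue
        for commit in event.get("payload", {}).get("commits", []):
            if commit.get("sha", "").startswith(sha_prefix):
                return event["actor"]["login"], commit["author"]["name"]
    return None


result = pusher_of(SAMPLE, "0123456")
print(result)
```

The account that pushed and the name written into the commit are two different fields, which is exactly the gap this thread is about.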

Yeah this is known, and you can get yourself an awesome list of contributors if you want: https://github.com/zixan/uberfareestimator/graphs/contributo...

I think github should allow me as a user to confirm contributions made out of the system, at least the first time per repo.

That's a great solution. People should tweet it at Github and see if they implement it.

I second that too.

Also you may send email from any address you want if you put it in the “MAIL FROM” SMTP transaction.

Thankfully the Linux kernel project accepts only signed commits.

> Thankfully the Linux kernel project accepts only signed commits.

This is false, FWIW.

Well, the signoff line that Linux requires is a legal signature, not a cryptographic one. It is meaningless to us, but it's good enough for lawyers.

Indeed. Thanks for the correction! It requires signed patches, alas not with a cryptographic signature.

I think this is pretty well known. You could always sign your commits if you're really worried about someone sticking your email address in their git config.

While it's probably well known from a command line perspective, I doubt it's well known from a web service (GitHub) perspective. I have a healthy distrust of git logs but trusting a photo and username on GitHub is a pattern reinforced by every other social app.

Not only is it not well known, but signed commits actually break some of these services:

some years ago I reported this issue on gitorious, which yielded an HTTP500 when the repository contained a signed commit:

https://issues.gitorious.org/issues/193 (it's now down, but hopefully it'll reappear on archive.org)

and it'll prevent Launchpad from automatically mirroring git repositories (which is paramount to be able to automatically update PPA with recipes)


It does give a strong impression of authority that said user did indeed author a commit. Especially being a link to that user's profile.

I've seen this demonstrated on github before w/ defunkt's account/email

I feel as though a GIT demo somewhere has already shown me how to impersonate Linus or anyone else.

Yeah, I've definitely seen something very similar, as well.

I can imagine a scenario where a malicious employee is intentionally injecting malicious code (backdoor, whatever) and wants that commit tied back to someone else (their enemy, boss).

If they have push access, would github log who actually pushed it (i.e. ssh key)?

Yes, Github's authenticated push logs will tell you that. However, Github won't make those public, because they consider user identification private info.

(Or at least they won't provide those push logs to the Apache Software Foundation, which is how I know this.)

They won't even give it to the owner of the repo in question?

They will not. For instance, the ASF cannot get the push logs for all the repositories under <https://github.com/apache>.

So this is a Github feature, which might be useful in some cases, I get that. The good news here is that it only goes one level deep, i.e., it does not automagically show up on Linus' account in any way.

This also has been an issue with Git for a long time. In any repository, you can see the email of the person who committed it, and it can be spoofed, because there are no checks. So while here we see it linked to Linus' account, the problem with plausible identity theft has always been a part of Git.
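A rough sketch of why there is nothing to check: in an unsigned commit object, the author and committer are just header lines, and the commit id is a SHA-1 over that text. The snippet below builds such an object by hand against the empty tree; the names, emails, and timestamp are placeholders.

```python
import hashlib

# Well-known id of git's empty tree object.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

def commit_id(author: str, committer: str, message: str,
              when: int = 1438646400, tz: str = "+0000") -> str:
    """Compute the id git would assign to a parentless commit with these fields."""
    body = (
        f"tree {EMPTY_TREE}\n"
        f"author {author} {when} {tz}\n"
        f"committer {committer} {when} {tz}\n"
        f"\n{message}\n"
    ).encode()
    header = f"commit {len(body)}\0".encode()  # object type, byte length, NUL
    return hashlib.sha1(header + body).hexdigest()

# Any "Name <email>" string is accepted; nothing ties it to a real person.
real = commit_id("Me <me@example.com>", "Me <me@example.com>", "initial commit")
forged = commit_id("Someone Else <someone@example.com>", "Me <me@example.com>", "initial commit")
```

Both ids are equally valid as far as the object format is concerned; the hash protects integrity of the content, not the truth of the author line.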

There are ways to solve it. Git can just put "<unsigned>" next to non PGP commits. Github can also put "<unverified>" when commits are made outside of Github realm (not using their keys or https auth), or are unsigned.

Just be careful while merging pull requests, which one has to anyway. And because one has to, these issues never seem to get a fix.

I don't get it, I never really looked at github comments before. Why are these comments so low-quality? Aren't developers who commit to github the only readership?

I had a relatively sane comment (IMO) about whether github should instead consider allowing users to opt-in to "show unsigned commits which claim to be from this address as unknown."

Oddly enough my comment itself was altered to be a mindless obnoxious comment! (does github use git commits to track the comments themselves?)

Repository authors can edit anyone’s comment.

You can just use this and not have to overwrite your name/email:

    git commit -m "message" --author=<author>

You can even pass a different commit date with --date=<date>.

It's actually a really useful feature when you're migrating a project from a different SCM system to git.
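For the migration case, a sketch of how a script might build that command for each imported revision. The helper name and the sample values are hypothetical; only the `--author` and `--date` options come from the comment above. Note that these options set the author fields only, so the committer stays whoever runs the script.

```python
from typing import List, Optional

def commit_argv(message: str, author_name: str, author_email: str,
                date: Optional[str] = None) -> List[str]:
    """Build a `git commit` argument list that records the original author."""
    argv = ["git", "commit", "-m", message,
            f"--author={author_name} <{author_email}>"]
    if date is not None:
        argv.append(f"--date={date}")  # original author date from the old SCM
    return argv

# Hypothetical revision carried over from another SCM.
argv = commit_argv("Import r42: fix parser", "Old Author", "old@example.com",
                   date="2009-03-01T12:00:00+00:00")
```

Inside a repository with staged changes this could be run with `subprocess.run(argv, check=True)`; passing a list avoids shell quoting problems with names and messages.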

Another example of why email addresses cannot serve as "identities" like so many people, somehow magically, assume.

If you know it, everything about email and its protocols would make you assume the exact opposite.

Now PGP signatures on the other hand … but hey, nobody wants those, right?! Far too complicated! /rant

This has nothing to do with emails serving as identities and everything to do with not believing user's input without verification.

If github was linking to user's profile based on first/last name rather than email the issue would be the exact same.

There is a huge UX problem with validating the legitimacy of anything online. I have to know that credentials are available, and I have to know that it's possible to validate them. How do I even know if a particular set of credentials is legit? I'd have to know where to find validation for them. That's a whole other ball of wax in itself.

And we default to not requiring such authentication because the means we have are either completely useless (i.e. passwords) or so onerous (I always have to Google when I set up cryptographic credentials on anything, and we expect lay users to do this?) that they ruin adoption rates for software and services.

State-of-the-art secure credentialing should be as easy as passwords. Easier, even. Yet they're currently as "easy" as configuring Apache.

Any information you can get online or over the phone can be forged. (Passwords can be discovered, as can private keys, fingerprints, and the results of genetic tests.) Certain kinds of physical evidence, such as dead skin cells with usable genetic material in them, are to my knowledge effectively impossible to forge, but they can be "accidentally" contaminated beyond usability. Doing things in-person face-to-face is only an improvement if you knew the person before anyone had any incentive to fool you on the person's identity, which is hard; even then, allegiances can be bought, sold, and changed for other reasons.

My point is that fixing this issue is out-of-scope for a DVCS. It could, however, be improved a bit.

There's also the issue that securing these things is tricky. How do I secure and sync a gpg key? Should I load a private key on my work PC? My phone?

So it seems the author has identified a real issue here, but I will go meta on this and identify issues with his demonstration. In my organization this would count as a bug report, so I wondered why this issue was not communicated privately to the operators of Github so they can have a chance to fix it before some un-educated person does some damage. Then I realized this issue might affect other git content hosters, so going public might alert them as well as forcing Github to fix it. Regardless, would the best approach not be to communicate privately first and allow Github to fix it before going public? If this was raised privately and not acted upon, then why are Github's internal processes so slow? So many questions, so little time...

I reported this to Github privately about a year ago – specifically, I asked why there isn't some visual indication when Git's `user.email` fails to match any of the Github account's verified e-mail addresses. If you commit with a `user.email` that doesn't match _anyone_, you get a little question mark; it seemed like they could do a similar thing when you commit using a `user.email` that matches someone-who-isn't-you. Even just showing which Github user made the HTTP or SSH connection to push the changeset would be an improvement.

The tech told me that the current behavior was by design, and then pretty much said I didn't know how git worked and didn't understand Github's team/sharing/trust philosophy. I was pretty disappointed by their response, all told.

The problem is that "it's not a bug, it's a feature". Look at the Linux kernel mirror for example. All those commits come from different users around the internet, but when their emails show up, GitHub can link their usernames to their profiles.

What to do about this though... that's a good question. Perhaps just not linking profiles when pushed in this way and/or labeling them "unverified" would be sufficient. GPG signing would be nice, but would likely annoy some users.

It's pretty much the same as forging email. Most won't notice. Both can be signed with pgp, but few check and it's clumsy for the mainstream.

In other words, a lot of people haven't the slightest clue how Git works. Luckily, Github records push history separately.

This is not just about Git, it's also that Github is implicitly trusting user data, by linking to the user profile.

In Git, it's clear that there's no authentication, a user is just a tuple of strings, but on Github the same doesn't apply, a user is actually a well defined entity which is secured by one or even two factors of authentication, so the expectations are different.

I found that github's automatic identity resolution is actually helpful in some cases. I migrated a bzr repo into github recently and was pleasantly surprised to see my contributors matched up to their github accounts. I understand that many people might not want this, but it is a feature that can be useful.

This has been around for quite a while now; I made a post on my blog about it back in 2013: http://www.jayhuang.org/blog/pushing-code-to-github-as-linus...

Of course this doesn't actually give you access to the person's account, but UX wise, it's incredibly misleading for someone to click a commit in my repository by "torvalds" and have it actually go to his profile. My issue is very much with the social implications of this as opposed to it being an actual security issue (see: signed commits).

There should be some indication at the very least that a commit is not signed.

Why wouldn't it? I'd say it's "supported by design".

It was a test to test the design.

Unfortunately so far the same is possible for gitlab too.

Isn't it the general problem that each user is authenticated as the "git" user and the rest (all the git commands) is just "user.name" and "user.email" fields in prefs?

They could identify users by their auth and store that data with the commits instead of relying on the user email. You either have to provide username/password, API token, or ssh key to push to a github repo. All of these identify you as you.

Edit: It is important to note that this would not replace git authorship info as it is very possible that you pushed something that someone else committed. It would allow you to have a UI trail to who pushed it which could help if you theorize that someone is impersonating someone else.
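A sketch of what such a record could look like, keeping the git author and the authenticated pusher side by side. This is not how Github works internally; the function, the status labels, and the email-to-login table are all hypothetical.

```python
from typing import Dict, Optional

def annotate_push(pusher_login: str, author_email: str,
                  verified_emails: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Pair the claimed commit author with the account that actually pushed."""
    claimed = verified_emails.get(author_email.strip().lower())
    if claimed is None:
        status = "unknown-author"              # email matches no account
    elif claimed == pusher_login:
        status = "author-is-pusher"
    else:
        status = "author-differs-from-pusher"  # legitimate or not, worth showing
    return {"author_login": claimed, "pushed_by": pusher_login, "status": status}

# Hypothetical table of verified emails (lowercased) to account logins.
verified = {"alice@example.com": "alice", "bob@example.com": "bob"}
record = annotate_push("alice", "bob@example.com", verified)
```

The third status is deliberately neutral: pushing someone else's commits is normal, so the record only exposes the trail without declaring it forged.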

Interesting, there must be some level of validation because it does not show up in his public activity feed[0]

[0] https://github.com/torvalds?tab=activity

That's because there's a difference between a GitHub account (usually authenticated by a key) and a git committer (only identified by the name and email). It would be pretty annoying if hundreds of commits popped up in his feed every time somebody forked Linux or git.

GitHub issues become ridiculous places full of mindless stupidity whenever an issue reaches a certain level of popularity. I'd love to see a mechanism to improve signal to noise ratios; it's not a place for Redditisms.

I don't like github's dependency on some git EMAIL variables and stuff.

First of all, there is of course the problem that putting your email on the internet is asking for spam.

But a bigger problem is, if I'm on a machine that doesn't happen to have those variables configured, and I push something to github, even if I use my github username and password then, it does not show me as author. Very annoying.

EDIT: I don't like git itself's email dependency either. But at least github could have done something with the fact that you login with a username... :)

You can put anything you like in the email field. I use fake addresses such as tom@tmbp (me on my laptop), tom@tw7 (me on my Windows 7 PC), and so on.

Add these addresses to your email addresses list on the settings page of your github account if you want your github avatar, etc., to be shown against commits using these addresses.

That’s because GitHub is just some nice sugar on top of Git. Git needs your email, so does GitHub.

One workaround would be to remove any public email address you commit under from your Github profile. Github uses these email addresses to tie back to your user profile so if you don't want commits pointing at your profile, don't tie email addresses to your profile.

Perhaps some mechanism is called for here for approving tying back to your user on new repositories or making any repository not owned by you or your organizations require an explicit opt-in to tie back to your profile.

Edit: This is kind of the equivalent of spoofing the From address in an email.

My github profile email is different from my git author email. The latter is even more accessible. Just run git-log on any repository an author has committed to.

Yes but if you remove that git author email from your list of emails on Github, any commits done under that email will no longer link back to your profile.

Yeah, this is one of those "feature not a bug" things that's been known for some time. As others have pointed out, sign your commits if this bothers you. It's even possible to get your fellow open source contributors to sign commits to your project, as we do with the Metasploit Framework: https://github.com/rapid7/metasploit-framework/wiki/Landing-...

Another good article on git-signing: http://mikegerwitz.com/papers/git-horror-story

Yes, one has to be careful when merging and signing. But otherwise it works really well.

We do not even know if he REALLY did not commit this ;)

To guarantee the origin of the commits in a Git changeset, you need either PGP-signed commits or the authenticated push logs for a centralized server.
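For the PGP-signed case, a sketch of where the signature lives: a signed commit carries a `gpgsig` header whose continuation lines start with a single space, and the signature covers the commit text with that header removed. The snippet only separates the two parts; actual verification would still need GPG and the signer's public key. The sample commit and its signature body are placeholders.

```python
from typing import Optional, Tuple

def split_signed_commit(raw: str) -> Tuple[Optional[str], str]:
    """Split a raw commit into (armored signature or None, signed payload)."""
    headers, _, message = raw.partition("\n\n")
    kept, sig_lines, in_sig = [], [], False
    for line in headers.split("\n"):
        if line.startswith("gpgsig "):
            in_sig = True
            sig_lines.append(line[len("gpgsig "):])
        elif in_sig and line.startswith(" "):
            sig_lines.append(line[1:])  # continuation line, drop the leading space
        else:
            in_sig = False
            kept.append(line)
    payload = "\n".join(kept) + "\n\n" + message
    signature = "\n".join(sig_lines) if sig_lines else None
    return signature, payload

# Placeholder commit text; the signature body is not a real signature.
raw_commit = (
    "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    "author Me <me@example.com> 1438646400 +0000\n"
    "committer Me <me@example.com> 1438646400 +0000\n"
    "gpgsig -----BEGIN PGP SIGNATURE-----\n"
    " \n"
    " AAAA\n"
    " -----END PGP SIGNATURE-----\n"
    "\n"
    "signed commit\n"
)

signature, payload = split_signed_commit(raw_commit)
```

Because the author line is inside the signed payload, changing it after the fact would invalidate the signature, which is what an unsigned commit cannot offer.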


If I recall correctly, it's abuse of this that got the farcical/satirical "c plus equality" project drummed out of nearly every code hosting site. The name and email pull in a Gravatar, the same one probably used on many other sites, with the result that it amounts to a really convincing forgery.


Makes it so that when you "git pull" it automatically checks a digital signature against a certain public key and refuses to apply the patch unless it has a valid signature.

We use signed commits in the Cryptech project and it works really well. So much so, in fact, that I now sign all commits for all git repos I work on. It would be nice if Github displayed the signature though.

git should really make it possible to use openssh/openssl keys for signing - so that I have one key both to push to github and sign my commits. I know that they can be converted from one form to another ... but it's really inconvenient, especially on Windows, etc.

The GPG key is designed for an email based collaboration flow - that Linus uses. But most of us use Github or Bitbucket's UI to collaborate.

Github/Bitbucket should use this key in their UI as well - to show verified users.

I bought a YubiKey[0] a while back and was able to get it to do exactly what you're talking about--even on Windows, which I use most often. It wasn't necessarily easy to set up, but it has been working pretty consistently. It would have probably been easier if I had known more than just the basics of GPG.

I have since switched to using my YubiKey and GPG for SSH authentication on pretty much everything, as well as using it to sign my tags in my public git repositories. I don't think I would want to go back to moving keys between devices or setting up unique keys on each device now that I've got my YubiKey set up. Worth the investment, in my opinion.

[0] https://www.yubico.com/products/yubikey-hardware/

Wow the comments there are on par with 4chan.

Isn't github just a mirror? I doubt serious kernel development happens there. tl;dr who cares?

It isn't just about Linux. You can masquerade as anyone on any public project and cause confusion.

This is more of a problem with GitHub than git.

Remember, it's not that hard to host your own git repo. There's no need to use GitHub.

This has nothing to do with GitHub. The commit author field in Git is like the “From” field of an email; you can put whatever you want in it.

This has everything to do with GitHub.

Git provides mechanisms to sign both commits and tags, and to verify those signatures. GitHub fails to make use of those mechanisms.

I think it does. Github is an open system, whereas if you host your own Git server you can ensure that only trusted people have commit access to it.

No, not related.
