Hacker News
Breaking GitHub Private Pages for $35k (robertchen.cc)
847 points by SuperSandro2000 16 days ago | 106 comments

Hi from the Pages team! This was such a great find and can't thank you enough for helping ensure Private Pages is as secure as possible.

This report helped uncover:

- A bug in OpenResty where `ngx.redirect` didn't handle unsafe characters [1]. While the fix is now in the latest version of OpenResty, a quick patch was to build the URL safely before using it in the redirect.

- You should check for case sensitivity when reading `__Host`-prefixed cookies, and verify the values against your expected format. It's possible for both `__HOST-Foo` and `__Host-Foo` cookies to exist, and only the exact `__Host-` prefix carries the browser-enforced requirements (the `Secure` attribute, `Path=/`, and no `Domain` attribute) [2]. In our case we strip all cookies at the edge using Varnish (VCL) to ensure no user-supplied cookies make it to our origin, and now we also ignore any "Secure" cookies that don't appear to have been set by our servers.
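
To make the second bullet concrete, here is a minimal Python sketch of the idea, not the Pages team's actual VCL. The cookie name `__Host-session` and the value format are made up for illustration. It matches the cookie name exactly (case-sensitive) and drops any value that doesn't fit the expected format:

```python
import re
from typing import List, Optional, Tuple

# Hypothetical cookie name and value format, for illustration only.
EXPECTED_NAME = "__Host-session"
EXPECTED_VALUE = re.compile(r"[A-Za-z0-9_-]{16,128}")


def parse_cookie_header(header: str) -> List[Tuple[str, str]]:
    """Split a Cookie request header into (name, value) pairs, preserving case."""
    pairs = []
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:
            pairs.append((name, value))
    return pairs


def trusted_cookie(header: str) -> Optional[str]:
    """Return the expected cookie's value, or None if it is absent or malformed.

    The name comparison is exact and case-sensitive, so a differently-cased
    name such as "__HOST-session" is never accepted.
    """
    matches = [v for n, v in parse_cookie_header(header) if n == EXPECTED_NAME]
    # Treat duplicates as suspicious and reject them as well.
    if len(matches) != 1:
        return None
    value = matches[0]
    return value if EXPECTED_VALUE.fullmatch(value) else None
```

The design choice here is to fail closed: anything that isn't exactly one well-formed cookie under the exact expected name is treated as if no cookie was sent.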

[1]: https://github.com/openresty/lua-nginx-module/pull/1654

[2]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Se...
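
For the `ngx.redirect` bullet, "build the URL safely" could look roughly like the sketch below. This is Python rather than the Lua the Pages code presumably uses, and `build_redirect_url` and its parameters are hypothetical names. The point is only that user-controlled parts get percent-encoded, and CR/LF never reach a `Location` header:

```python
from typing import Mapping
from urllib.parse import quote, urlencode


def build_redirect_url(base: str, path: str, params: Mapping[str, str]) -> str:
    """Percent-encode user-controlled parts before they reach a Location header.

    `base` is assumed to be a trusted, server-chosen origin.
    """
    safe_path = quote(path, safe="/")  # encodes CR, LF, spaces, ':' and so on
    query = urlencode(params)          # encodes both keys and values
    url = f"{base}{safe_path}"
    if query:
        url += "?" + query
    # Defensive check: a header value must never contain CR or LF.
    if "\r" in url or "\n" in url:
        raise ValueError("refusing to build a redirect URL containing CR/LF")
    return url
```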

I'm not a security expert, but in node.js (specifically express.js) there's a concept of "signedCookies".

Usually you can just set a "httpOnly" flag to make sure client-side javascript can't mess with the cookie. But if you also sign the cookie, it further enforces this for any client tampering with the cookie manually too. Because only the server knows the secret for creating a new signature, if the client sends back a cookie that is modified in any way (including case sensitivity), it will be discarded. It should prevent the whole class of bugs caused by "unexpected format".
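
A rough sketch of the idea in Python, using an HMAC over the cookie value. This is not Express's actual wire format, and the secret and the `value.signature` layout are assumptions for illustration:

```python
import base64
import hashlib
import hmac
from typing import Optional

# Hypothetical server-side secret; in practice load it from configuration.
SECRET = b"replace-with-a-long-random-server-side-secret"


def sign(value: str, secret: bytes = SECRET) -> str:
    """Return "value.signature", where the signature is an HMAC-SHA256 of value."""
    mac = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    sig = base64.urlsafe_b64encode(mac).rstrip(b"=").decode("ascii")
    return f"{value}.{sig}"


def unsign(signed: str, secret: bytes = SECRET) -> Optional[str]:
    """Return the original value if the signature checks out, else None."""
    value, sep, _sig = signed.rpartition(".")
    if not sep:
        return None
    expected = sign(value, secret)
    # Constant-time comparison to avoid leaking how many characters matched.
    if hmac.compare_digest(expected.encode("utf-8"), signed.encode("utf-8")):
        return value
    return None
```

Any change to the value or the signature, including a change of case, makes `unsign` return `None`, so the server can discard the cookie.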

It's also a signed cookie, and we do the signature verification you mention if the format is correct, but if the format doesn't match, that cookie is discarded at the edge, before it even gets sent down to our app servers.

That's very similar to how JWT works. It's a signed message with some keys attached to it.

It's conceptually the same, but JWT implies a specific data format standard which might be a more involved change architecturally.

I'm mentioning the cookie signature stuff because it can be added in an almost blackbox way at the web framework level. Whenever you send back something with set-cookie, also set a signature. Whenever receiving a request, check for that signature too. Though I guess it's not a good idea to try to "roll your own" if your web framework doesn't already support this out of the box.

(same disclaimer, I am not a security expert)

Well, colour me impressed. The dude's in high school and finds stuff like this? Hopefully it won't go to his head, we've got enough "rock star" devs :)

> The dude's in high school and finds stuff like this?

Man, you should have seen the European demoscene in the 90s: half of the geniuses had the same main obstacle, homework from school.

Despite what it may seem, IT is still in its infancy, which means that we work in a field where pure affinity for the subject and raw curiosity-driven brain power (or in some cases, dedication to the task), as well as a couple hundred euros' worth of hardware, is enough to get into the most complex parts and sometimes redefine them.

Access to all the papers and education you need is insanely open on the internet (unlike other fields), as are trillions of lines of code for you to explore. The only two limits are yourself and time.

Which is not to say that training and experience have no role, but IT as a field is extremely large and in many parts we're still in an early uncharted phase, so discovery is made by explorers and dedicated people[1].

Still, most devs would gain a lot by going through a good algorithms course.

[1] While talking about dedication, special mention to the Linux-everywhere guys. I will never not smile at the idea that some of the best IT protections in the world, costing millions upon millions, from the Xbox to HD-DVD to whatever, were brought down because someone wanted to run Linux on it.

Thanks for mentioning the demoscene. While I was never really involved, I learnt about self modifying i386 assembly code from a 15 year old. I was roughly the same age back then. And we were both writing asm at that time, so actually employed the technique, not just knew what it was. Compare that to what gets done in school, and you see how far away from the real thing our education system is.

I think the consensus is that it is better for the whole of society to make sure the bottom 5% makes it through than for the top 5% of the class to fulfill their potential.

It's not that I don't understand what you are saying. I was at the top in school and also as a parent I notice how removed education is from... beauty? It's just that economy dictates the rules. Also, teaching is hard.

I wholly disagree, it's just that people see the two groups differently. The bottom-5% kid who has trouble tying their shoes is an easy target for sympathy and you can throw money at special-ed programs. The top-5% kid who's bored off their ass and could be excelling elsewhere, not so much.

My explanation is that Americans consider ourselves individually to be exceptional, and have trouble recognizing when others are. To the detriment of everyone.

Why do those two 5% have to be in the same class?

Lots of reasons. To start with it is positive for the bottom 5%. School is also not just about academic results but also integration and socialization. I think that ideally you would have one class and several teachers.

If integration and socialization are important, perhaps we should outlaw private schools and redistribute all students across nearby schools randomly, so the rich kids get to understand what it means to be poor from an early age.

Dragging down the top 5% to the lowest common denominator is not a benefit to society overall. Would you recommend cancelling AP classes?

This is a big subject; none of us are in a position to do anything other than philosophize. But I do think that when your goal is "socialization" you don't need to force everyone to study the same thing. I think it's a moral tragedy that we hold back our brightest students.

Could you perhaps tell me what AP means? Thanks!

I’m guessing he means “Advanced Placement” here.

Sorry should not have used an abbreviation: "Advanced Placement" -- courses meant for excelling students.

> courses meant for excelling students

I took a couple of them and dropped them as fast as I could. I found the classes mostly just included more homework than was necessary. I thought it’d be more like university and focus more on lecture, notes, and reading. Writing more essays and more projects was a big NOPE for me.

How do you identify the students which belong in the separate classes, and at what age do you do it?

It's not uncommon for students to switch between top 5% and bottom 5%, yet people act as if it's static. Education and child development are ridiculously complex. It's not possible to have a solution that solves for everyone.

That's a good question that has several reasonable answers (also depending on context), so we should run some proper experiments to find which answers are best.

Though from a policy point of view, we don't need to know those answers: different schools should try different answers (including sending all kids to the same class), parents should send their kids to the schools they deem the best overall package.

As long as you permit or even enforce periodic rotation between the "tracks" when appropriate given the performance/speed of learning, you might as well start early.

The former is a lot harder.

It's like what you learned with the i386 was much more in line with a trade school level of learning vs the more theoretical learning of a university. You learned the down and dirty and were immediately able to do something with it. It didn't teach you the hows & whys and blah blah. It's that 30,000' view vs 10,000' view vs down-in-the-trenches view. Sometimes you need all of the views, but some tasks don't need the 30,000' theoretical lesson.

Curiosity, time and an open mind is all you need. Teens have plenty of that.

"Time" is something I rarely had as a teenager, and that was more than 15 years ago at this point. On top of all the normal high school-age pressures, I had tennis, martial arts, drum lessons, weekend volunteering, AND an unquenchable passion for computers. And with good grades and AP classes, all that got me nowhere higher than state universities. Whatever it would have taken for me to get better than that, it wouldn't have been worth it.

I can't imagine what it's like for kids now. It must be hellishly difficult trying to find time for themselves. Gen-Z already is aware that they will not fare much better than me, at best. It must be tough being a kid today.

There are different "paths" with different environments (parents).

I had plenty of time, and my parents were rather open regarding hobbies/extracurricular activities.

But they soon had "given up" on making me do things, caused by me quitting everything that wasn't novel and dismissing their "inputs". My father is also a generalist, and was never keen on getting focused on single skills (he has some mad skills in some fields though, e.g. swimming, and that took quite some time and effort).

And although we were not poor, my father thought that hobbies don't have to be expensive, quite the opposite for kid's hobbies in fact (but we had our fair share of expensive hobbies and gear: snowboarding and bikes and unicycles and sailing and what not).

But most prominently, being a kid of the eighties, and due to my parents working with "IT-People" in their job, they thought computers were not at all a desirable pastime for a kid/teenager (despite my very urgent interest). A waste of time, either toy or nerd-tool. Somewhat what people previously thought of books.

So my biggest interest (programming) was always something I was myself dismissing and even avoiding for a few very important years, and I spent most of my time as a teenager in front of a TV, gaming or partying/meeting with friends outdoors.

I could have spent that time... differently, and I am not blaming anyone, not my parents and not me. I had a good time, but I might have appreciated a little guidance/company.

I just want to say I know why I was not a mad hacker at 17 - small details in paths are waiting everywhere, and there are more exceptions than rules.

My parents were the same way. Computers are a waste of time and expensive diversion! They steered me into a much more important (to them) career, that was somewhat less my interest, but at least I was good at it: Music.

I'm not kidding.

It took me until less than a decade ago to revert the brainwashing and pursue my actual passion (computers and programming). I immediately doubled both my income and my joy-on-the-job.

I think most parents acted the same, my mother would always tell me to get off the computer and stop playing "video games".

While I was indeed playing games... I also started reverse engineering the games at the age of 14 so I could cheat.

That started around 2006 so it was a bit more acceptable to be all day on your pc but friends/family never really understood until way later in life.

My uncle raised his kids similarly, "they have to always be doing something!" with the side effect being that they sunk thousands of hours into learning skills that served no intellectual/real life purpose (beyond team building), and took time away from their actual interests. One was interested in building drones, but spent his time hitting balls with a stick, instead. Anytime I need someone to hit things with sticks, I call him. I haven't had to yet, but one of these days!

Every success story, intellectual or not, involves passion and time. I'll never comprehend the mindset that completely prevents a kid from experiencing both.

I would add to that list (35 years ago) - girlfriends.

This takes a lot of time too - when you are with them, and then also when you are not, but your brain is still with them anyway.

Technology and development is an area of constant growth, pretty much by definition. A technology can't be the latest and greatest technology without pushing the boundaries. Companies who market themselves as innovators have to, well, push innovations. Frameworks and developer tools keep appearing and being added to because they exist to help others innovate. Individual developers who want to make sure that their skills are up to date have to keep learning new skills as new ones appear. There are so many different bits of hardware and software out there now, and so many possible ways and benefits to integrating with them or supporting them, that even just working on a given project for a while with broad enough vision might see you having to learn lots of different things.

this is more like front-end dev than IT

front-end dev squarely falls into the realm of IT. What's your motivation to separate or "gatekeep" these fields?

IT is often used as a term referring to the more practical side of things - infrastructure management, sysadmin, and helpdesk type work. It's not really a limitation that I subscribe to but it's hardly uncommon. Maybe they were using that definition.

That is what I was getting at. I worked with someone in IT and their main focus was on preventing systems from being exploited and helping people with their credentials or other problems that arose. I don't recall any coding work such as the actual functioning of the site. Their focus was more about keeping the servers running.

One of the companies that I've worked at in the past had software developers in three main different departments.

There were developers in the classic engineering department - creating the product that gets sold. That's an easy one.

There were developers in the customer support department. This group was known as 'sustaining engineering' and developed the tools for the rest of the customer support department to use to diagnose and troubleshoot issues... along with identifying bugs that one customer has in the software (and then creating patch releases for those customers). They had some really interesting problems solved there like "how do you set up a remote debugger to look at a core file that is greater than 2gb in size?"

There were also developers in the IT department. Along with the sysadmins, DBAs, and helpdesk... the "Internal Business Applications" team was the one that maintained the corporate site, or did projects for other departments that didn't have the headcount for a full time developer of their own (sales department and the configuration tool). Developers also were the ones that did updates to the ERP and CRM systems. The customer support website was maintained as part of this.

So... the point is that IT can certainly encompass software developers. You'll often see this on various job sites - all of the computer technology jobs tend to get categorized under IT unless the company has a technology product of their own (and then separates it by engineering and IT).

> the "Internal Business Applications" team was the one that maintained the corporate site, or did projects for other departments that didn't have the headcount for a full time developer of their own (sales department and the configuration tool). Developers also were the ones that did updates to the ERP and CRM systems. The customer support website was maintained as part of this.

We usually call this "IS" (Information Systems). This usually goes hand in hand with IT as IT/IS.

In USA. Yes. But America only accounts for 4.1% of world population. So.

Information Technology is technology that involves information manipulation and storage. Development and software engineering are subsets of IT, infrastructure maintenance is another.

Computer Science has nothing to do with computers, or science. It is generally a mistake to read the name of something as defining that thing; it has almost certainly shifted meaning since its origin, especially if someone is arguing about it.

I disagree with you there. In Computer Science, you learn the science (determined truth) of computation (the act of mathematical calculation). The names of most degree programs are pretty accurate when you really parse the jargon.

> The dude's in high school and finds stuff like this?

As someone who once got a $500 bug bounty from Facebook in high school, I'll have to say that this is not so surprising. Finding these bugs is 10% knowing about attack vectors (CSRF, XSS etc.) and 90% spending time trying them out in various forms in all the places you can think of. In school (especially if you're already doing well there) you often have lots of spare time and not so many other ways of earning money. And with just a few years of experience with programming/computers there are far quicker ways to earn money.

It's still super impressive work by the author of course!! They are extremely talented! I'm not trying to dismiss what they accomplished; just commenting on the strange dynamic of software security research.

Also it's worth considering that all these applications get updated like every week, so something that was ruled out in the past can be exploitable now.

This is not just finding bugs but coding and dev work. I too found bugs in fakebook; they never paid me, but they did fix them.


Just to clarify: you exploited said hole to access private photos of other people without their consent or knowledge?

What agreement they had with facebook I am unaware of. But facebook seemed to treat them as public.

Facebook and access have always been confusing, and this is set up on purpose. Some photos show up in feeds; some are available to other group members. Photos you think are private are private until someone else you know is tagged in them.

I not only have an ethical issue with what you just said, I also question the wisdom in disclosing your illegal activities on a public forum.

I wouldn't assume illegal activities occurred. Did you know facebook had an easy to access search feature for key employees to view anyone's private profile? All very legal.

As for the ethical issues, I'm not sure I buy in. At one point typing profilename/photos showed you everyone's albums, private or not, with a small picture. You needed access to go further. That's more of a facebook design decision. At least one private photo and any album name were purposely not private.

Finding a bug and not reporting it is not illegal. The rest is assumptions.

Plus just being curious. I once found an exploit in a large IAM vendor's multifactor auth flow by just playing around with the Inspect tool in my browser. I told my customer success manager and they fixed it. Too bad there was no bounty :(

The bounty was you not having to deal with someone else exploiting it and your org having to deal with the fallout.

I worked at a very large bug bounty program for a short spell a few years back. I’ve been in the infosec world since the 90’s, and the number of ridiculously talented teenagers around the world that were sending in very good reports on sophisticated attacks was pretty mind boggling.

Yes there was a lot of garbage too, but that was to be expected. I came away thinking that I hope these kids remain engaged and invested in society lol.

I knew some teenagers in the warez scene in the 90’s and early aughts. Insanely talented folks. One even runs a bank now...

So, a life of crime?...

It's only a crime if you get caught and convicted in a court of law.

Innocent until and unless proven guilty.

I think that was just a joke towards their career in banking.

I was trying to extend the joke---what banks do might be shady by some standards, but they have good enough lawyers and accountants that it's seldom outright illegal enough to get individual bankers criminally liable.

And judging by the blog post he's a talented technical writer as well.

I came to the comments to say the same thing. Being able to write clearly while holding together a narrative is a gift.

The CTF part confused me though. Was this bug bounty part of some CTF?

Was the second part of your comment really necessary?

That's a huge pro of this industry: all you need to enter is a cheap computer and a willingness to put in effort, and you can make a huge career.

You don't even need to waste time on college because of the information availability, especially when you're already this good before even being able to attend.

> You don't even need to waste time on college

Well that completely depends on where you live, and the culture there.

You could be a rock-star 10x dev in every sense of the word, but in certain places in the world, if you don't have a degree or decades of industry experience, you will almost certainly never even get an interview.

You will need to seek out, and network your way into positions where you can actually impress some human being, and not get stonewalled by resume-screening software.

For every whizkid without a degree that managed to finesse their way into some hot startup or Big N company, there are likely ten that are wasting their talents on some irrelevant dead-end job doing data entry or whatever.

Yes this is me, I am self taught and from a small and poor town. Other than freelance work, I was 27 when I got my first high paying, full time programming job (in the Bay Area), while the other couple of self taught people from my home town work in factories (and the one I know who went to school for it works at Boeing). It was definitely not easy and I’d be further ahead had I gone to school, but I probably wouldn’t work in startups and I don’t think I’d be anywhere near as good (and the people I’ve interviewed with seem to concur).

100%. Companies don't hire based on capability, they hire based on reputation and the labor supply/demand of the area. If you can do neat stuff but have never proven your capacity to do so, then it's unlikely you'd be hired.

> If you can do neat stuff but have never proven your capacity to do so, then it's unlikely you'd be hired.

And as far as this goes, this feels fair.

There are tons of opportunities to show your chops, whether a code jam or CTF or hobby project or whatever (and yes, I know hobby projects aren't for everyone, so find something that is). A skill is only valuable on the job market once you've proved it actually exists.

The larger problem that you're touching on is that some companies have streamlined their application process in such a way that any way of demonstrating your capability that isn't a university degree or prior corporate experience is disqualified without consideration, and that I would agree is a serious problem.

It's the same reason some teenagers find exploits in online games: finding exploits is "easy". Programs nowadays are so complex that the attack surface is pretty open.

On the other side, those rock star devs will get their stuff broken because of the above.

I was thinking the same thing, makes me wonder wtf I'm doing with my life.

yeah I know. I thought learning about infinite series in HS was impressive but this blows that away. A solid 6-figure salary awaits him. So much for that media narrative about America dumbing down and lowered academic standards. America still produces among the best talent in the world.

Lowered academic standards are a concern for the median and lower quartile, not the extreme high end of people who probably didn't learn their exceptional talents in school.

You undermine your own point by pointing out that this person is a 1%er.

> "That’s because *.github.io is not on the Public Suffix List."

I'm confused because I remember github.io always mentioned in explanations of the public suffix list and as rationale why the list exists. Looking at the list it sure enough is there. What am I missing?

I think the public suffix list only works exactly one level down (otherwise you wouldn't be able to share cookies with sub-levels that _should_ be able to share cookies).

Thus, with `github.io' on the list, everything on `*.github.io' can't share with each other, but everything on `*.a.github.io' _can_. The author is sharing between `private-org.github.io' and `private-page.private-org.github.io', which is allowed because `private-org.github.io' (or the more general `*.github.io') isn't on the list.

I think you're missing that github.io is a public suffix but microsoft.github.io or yourcorp.github.io isn't. He finds a publicproj.microsoft.github.io and abuses the fact it shares cookies with privateproj.microsoft.github.io.

I think him mentioning {anything}.github.io not being on the public suffix list is a slight misunderstanding. While true, it's expected. The same is true for {anything}.com.

I thought it might have been added after the incident, but a `git blame` says otherwise:

    7b7f575f public_suffix_list.dat                  (Simone Carletti                   2013-04-23 11:51:10 +0100 11950) github.io

github.io is on the list. *.github.io is not. They are different. The rule only goes one level down.

The github.io rule means foo.github.io cannot share with bar.github.io.

However foo.private-org.github.io can share with bar.private-org.github.io. The *.github.io rule would prevent that.
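
Here's a toy Python sketch of that rule, if it helps. It uses a tiny hand-picked rule set instead of the real list, ignores exception rules, and treats "can share cookies" as "has the same registrable domain", which is a simplification. All function names are made up:

```python
from typing import FrozenSet, Optional

# Tiny hand-picked rule set for illustration; the real list is far larger.
RULES: FrozenSet[str] = frozenset({"com", "io", "github.io"})


def public_suffix(host: str, rules: FrozenSet[str] = RULES) -> str:
    """Return the longest matching public suffix, supporting "*.x" wildcard rules."""
    labels = host.lower().split(".")
    for i in range(len(labels)):  # longest candidate first
        candidate = ".".join(labels[i:])
        wildcard = "*." + ".".join(labels[i + 1:])
        if candidate in rules or wildcard in rules:
            return candidate
    return labels[-1]  # implicit default rule: the last label is a suffix


def registrable_domain(host: str, rules: FrozenSet[str] = RULES) -> Optional[str]:
    """Return the public suffix plus one label, or None if host is itself a suffix."""
    host = host.lower()
    suffix = public_suffix(host, rules)
    if host == suffix:
        return None
    keep = suffix.count(".") + 2  # labels in the suffix, plus one more
    return ".".join(host.split(".")[-keep:])


def can_share_cookies(a: str, b: str, rules: FrozenSet[str] = RULES) -> bool:
    """True if both hosts fall under the same registrable domain."""
    ra, rb = registrable_domain(a, rules), registrable_domain(b, rules)
    return ra is not None and ra == rb
```

With only `github.io` in the rules, `can_share_cookies("foo.github.io", "bar.github.io")` is false but `can_share_cookies("foo.private-org.github.io", "bar.private-org.github.io")` is true. Passing `RULES | {"*.github.io"}` as the rule set makes the second call false as well.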

Oh yes, that makes sense. Thanks!

Jeez. I know I'll never be on this guy's level, but where can I go about learning these topics?

You can start with the Web Application Hacker's Handbook https://portswigger.net/web-security/web-application-hackers...

Also, the LiveOverflow YouTube channel has some great, accessible (yet very technical) videos on security!

hacker101.com and join the community Discord. There's a ton of Bug Bounty hunting content on the internet. Plenty of room to explore and find your niche.

Very good find, a thorough report, and a $35k payout for a “High” severity bug, all from one person in high school.

Great job.

The write-up mentions that it was reported with another person. Still a very good find.

Duplicate: https://news.ycombinator.com/item?id=26692304 Post from the real Philip (@ginkoid) at GitHub; the author also mentions him in the article.

This is impressive!! People might call it luck but it isn't. If you are brute forcing, maybe it is luck, but brute forcing won't take you anywhere unless the application is pretty amateur. Using the right stuff at the right places and strategically chaining bugs just makes things much more interesting.

If this article were hosted on GitHub Pages we would have been able to open it.

Is there a technical reason why Microsoft doesn’t use IdentityServer4, with its implementation of the OpenID Connect and OAuth2 technologies (https://github.com/IdentityServer/IdentityServer4)? They seem to be pushing it quite hard in Visual Studio and it appears to be designed for this very use case (cross domain authentication). Or do even they think it’s overkill for GitHub ;)

Because GitHub Pages isn't written in .NET. However, the general implementation is heavily inspired by this and other similar projects.

GitHub Pages doesn't have to be written in .NET to use IdentityServer4 or any other implementation of OpenID Connect and OAuth2 whatever language it may be written in. Anyway, I'm not a fan of OpenID Connect because I personally think that it is overkill for most projects I work on. I was just wondering why Microsoft doesn't use the tech if they are pushing it so hard.

> "the general implementation is heavily inspired by this" Inspired but flawed because they "rolled their own"

I wish I could work for this guy.

Archived here: https://archive.is/b6TYj

How in the world does a high schooler have this depth of skill and knowledge? I admit it, I'm jealous, I am.

Congratulations mate. Stay liquid :))

Great writeup!

So, who else wants to hire Robert now?

Not to be "that guy" but why would you downvote somebody wanting to hire somebody?

Please consider hosting your blog in a static form as it is being hugged to death.

Maybe on github pages?

It's fine for me.


Considering the reemergence of static sites is a relatively recent counter-trend to the prevalence of bulky dynamic sites, you'd probably call dynamic sites so 1989.

That was my attempt at sarcasm :) Should've added a /s

Time is a flat circle.

Punchline: GitHub accepts arbitrary user input and spits it back out into page HTML without verifying it.

> Punchline: GitHub accepts arbitrary user input and spits it back out into page HTML without verifying it.

Because of a bug related to parsing integers, the behavior you described can happen in the GitHub pages code. However, it is a far cry from "GitHub accepts arbitrary user input and spits it back out into page HTML without verifying it". That's an exceedingly antagonistic way of putting it - and smacks of "I want to make GitHub look dumb by misstating the problem."
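
For anyone wondering how a "parsing integers" bug can lead to reflected input in general, here is an illustrative Python sketch. It is not GitHub's code, and the function names are made up. A lenient parser can accept a string that isn't purely digits, so code that checks "does this parse as an integer?" and then reuses the raw string ends up echoing attacker-controlled characters:

```python
import re
from typing import Optional

# ASCII digits only, with a length cap; fullmatch rejects any extra characters.
_DIGITS = re.compile(r"[0-9]{1,18}")


def parse_id_lenient(raw: str) -> int:
    """Python's int() tolerates surrounding whitespace, signs, and underscores,
    so " 12\n" parses as 12 even though the raw string is not "12"."""
    return int(raw)


def parse_id_strict(raw: str) -> Optional[int]:
    """Accept only a plain run of ASCII digits; return None for anything else."""
    return int(raw) if _DIGITS.fullmatch(raw) else None
```

The safer pattern is to validate strictly and then use the parsed integer (or its canonical string form) from that point on, never the original input.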

> bug related to parsing integers,

Right, that's not verifying the result.

"alert()" is pretty far cry from an "integer"!

How do you even create an "integer" this bad?
