MD5 password hashing:
More hard coded secrets:
This configuration is my favourite:
And of course, RSA keys which they use for all of their RSA encryption: https://github.com/swituo/openbilibili-go-common/blob/8866d1...
... their problem is not that the source code is all public over the internet now... their problem is the engineering team. If source code leaks, the worst outcome should be some IP leakage, not a compromised live system. That can and should be easily avoided by not having everything in your source code, especially when you are such a big company with so many employees...
Here are some interesting things I noticed:
- GitHub gets a lot of DMCA notices each month, and going through them it seems that the reported repos are all taken down by GitHub. In this case, though, the entire source code is still online, despite being posted here on HN for hours now and GitHub having been notified.
- None of the other DMCAs (some of them really interesting) have ever trended on HN
- The above linked repo has been forked more than 5k times, which is far more than any other DMCA-reported repo has ever been forked, from what I could see
- The repo with the source code put a link to https://996.icu in the description
- The person who posted the DMCA here on HN seems to be a new user who has only posted or commented on topics related to 996. Potentially the person/group has also gamed HN to get this link to the front page
There is no proof, but it feels like there is a very coordinated and deliberate attempt to harm Bilibili which is kind of sad.
Either way, it's now spread too far, and they need to take action to protect their users.
I can forgive the use of MD5, because they probably just don't know their hashing/crypto - but secrets? It's literally in the name.
There is so much material in your 5 links alone, that anyone who desires could utterly own their infrastructure, and then some.
Man, I have tons of auth data for services like AWS just in environment variables. But pushing your RSA key to GitHub must have happened on a bad Monday.
I do often have auth info in code, plainly because of time constraints. You just have to remember to strip it before pushing anything to GitHub.
But aside from that, is it possible to file a DMCA for anything that has been forked if it was published under a license that permitted that action?
Do you read commit history looking for, say, relocated secrets? Do you go through the pain of rewriting said history, regardless of whether your current workflow avoids merges or not? For me, that's too many risky and involved things to do. This advice only works if you're only going to export squashed commits from the private repo to the public one once in a while.
As for the aside, I imagine (with no background knowledge here) that as "owner" if you accidentally published something within that licensed code that does not belong or isn't covered by that license, you should probably have the right to remove it
I'm not sure the answer is, or should be, as simple as "derp - delete immediately, this was never meant to be Open Source".
That said, if the code was stolen, and then published under an OS licence - it's not OS. The "publisher" never had the rights to make it OS, so the "contract" is null and void...
How you'd solve this, logistically in an example like yours (Linux kernel) I have no idea, but I presume by the time it's been made public long enough to be cloned/forked (a la the company in question in OP) all of your efforts should be focused on damage control.
In some industries it’s also an audit or legal requirement that developers not have access to production credentials, so there’s no other way to reasonably handle that.
Edit: also, rebuilding your source because an API key changed... no thanks.
1. The API key is publicly readable in a configuration file you ship.
2. The API key is compiled into the binary you ship.
Either way, there is only obfuscation. Then again, API keys are not security keys.
Even including an API key in the shipped binary is avoidable. Add an OAuth-style negotiation to fetch the key as the first step at startup.
Start digging deeper and there are fewer and fewer reasons.
How does this possibly work? You must have some “bootstrap key” that you would use to fetch the API key. You’re going to ship something in the app that says “hey, I’m really your app” or else you’re going to allow anyone to fetch your API key. All you can do is obfuscate the process of getting the API key. You cannot actually keep it secret when you need clients to have access to it.
Even if your app doesn’t require any login there’s no reason you shouldn’t go through the same process. Every device gets its own key and then you apply limits to it...
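A minimal sketch of that per-device scheme in Go (all names are invented; a real version would persist keys and enforce per-key rate limits server-side):

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// keyStore issues one random key per device and remembers who owns it,
// so a single misbehaving device can be revoked without touching the rest.
type keyStore struct {
	owner map[string]string // key -> device ID (illustrative in-memory store)
}

func newKeyStore() *keyStore { return &keyStore{owner: map[string]string{}} }

// issue mints a fresh 128-bit random key for the given device.
func (s *keyStore) issue(deviceID string) string {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		panic(err) // crypto/rand failing is unrecoverable here
	}
	key := hex.EncodeToString(buf)
	s.owner[key] = deviceID
	return key
}

func (s *keyStore) valid(key string) bool { _, ok := s.owner[key]; return ok }

func (s *keyStore) revoke(key string) { delete(s.owner, key) }

func main() {
	s := newKeyStore()
	k := s.issue("device-123")
	fmt.Println(s.valid(k)) // true
	s.revoke(k)
	fmt.Println(s.valid(k)) // false
}
```

Note this doesn't contradict the "bootstrap key" objection above: the issue endpoint itself still has to decide who gets a key, but per-device keys at least bound the damage of any single leak.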
However, the term API key may also refer to an app key, intended to identify all users of a given app. The intention there is to enable widespread revocation or throttling in case an app is misbehaving or compromised. You cannot keep this sort of API key secret, because it must be shared and must be available.
In the non-user device scenario you mentioned, I’m pretty sure you would actually be better off simply having the device generate a random GUID instead of going through a pointless negotiation to hand out a random GUID.
4. You use a secrets service that the service machine has public-key or otherwise secure access to, and retrieve secure configs/settings from it.
You open up the code, find all of the secrets (e.g. using a high-entropy substring search), replace them with reads of a global variable, and populate that variable at startup from a file whose path is passed on the command line.
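That high-entropy search can be surprisingly simple; here's a sketch in Go (the length and entropy thresholds are made up and would need tuning against real code):

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// shannonEntropy returns the Shannon entropy of s in bits per character.
// Random-looking strings (keys, tokens) score high; prose scores low.
func shannonEntropy(s string) float64 {
	if s == "" {
		return 0
	}
	freq := map[rune]int{}
	for _, r := range s {
		freq[r]++
	}
	n := float64(len([]rune(s)))
	var h float64
	for _, c := range freq {
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// flagSuspectTokens returns tokens in a source line that look like
// embedded secrets: long enough and high-entropy.
func flagSuspectTokens(line string, minLen int, minEntropy float64) []string {
	var hits []string
	tokens := strings.FieldsFunc(line, func(r rune) bool {
		return strings.ContainsRune(` "'=,()`, r)
	})
	for _, tok := range tokens {
		if len(tok) >= minLen && shannonEntropy(tok) >= minEntropy {
			hits = append(hits, tok)
		}
	}
	return hits
}

func main() {
	line := `apiKey := "9f8a7b6c5d4e3f2a1b0c9d8e7f6a5b4c"`
	fmt.Println(flagSuspectTokens(line, 20, 3.0)) // [9f8a7b6c5d4e3f2a1b0c9d8e7f6a5b4c]
}
```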
Huh? Amazingly you make it sound so easy when it's anything but. Where is this file? How is it deployed? How is it rotated? If I have 10,000 VMs, how do I deliver that file to those machines? If my VMs aren't persistent, how do I ensure new VMs are deployed with this file?
Or were you thinking that they would just ssh into production and scp the file into there? To me you glossed over so many details that you hardly solved the problem at all. Replacing the secrets in source code is the _easy_ part. Deploying secrets is the hard part.
There are tools out there to make this easy, but (1) they almost always make things harder for developers and (2) they almost always require a large infrastructure change. Sure, if your deployment is small enough where you could just SSH into a single prod server and scp a file - then you are likely miles ahead - in terms of both security and culture, but changing the culture is harder than it looks.
I don't want to sound like I am making excuses for them - but I only want to show how shortcuts when you are small can snowball into a culture where you have a gaping wound that may be difficult to fix.
Managers always want compromises for cost or speed; it's up to us to make them understand which compromises they really don't want to make. If they want to make them anyway, get proof that it was done on their orders.
You can take arguments or env vars or config files (not added to Git) for your secrets. If you begin with a system of not putting the secrets in the code, ever, it's fairly straightforward to not make this mistake.
A few minutes of setup on a repository, and the mindfulness not to commit any new secret files that come into use (add them to the .gitignore), is a great start before moving up to secret management à la Vault.
Even Apple has released code doing "dumb" things - the "goto fail" bug, for example. That was a simple mistake, easily caught using proper code review techniques and tools, and yet it still happened. They could have prevented someone from making that mistake if they had invested the time and energy to do things properly. And this is Apple. We aren't even talking about mistakes from Microsoft or Amazon or other major software companies.
And these people aren't dumb. Facebook isn't filled with "dumb" people, and yet, they've done far worse than Bilibili here with just their 'mistakes'.
The reality is, smart people do dumb things all the time because they are trying to get things done.
I'm not going to pass judgement on what other people did. I'm going to assume they did the best they could, and while they made mistakes, it doesn't make them dumb. Maybe someone got lazy, maybe someone was under pressure, and things just piled up.
"It's bad, but we'll get to it later when we have the time."
No one plans to have their code shared out to the public. I wonder how many of us could honestly come out clean for code they've written along the way. To not have someone say "Oh, you are using an older library there that's got a security bug" or "You shouldn't have done this" and what not.
What should I do with those secrets though? I'm not sure how to store them securely. So far I've been considering putting them in the server configuration so they can be read from environment variables, but that seems inconvenient for me and other developers and also not that much more secure.
You can hardcode secrets to test stuff, but the first time you push the code to the repo should be the time you change it to read from config. And add the config to .gitignore, because even if you don't stage the particular lines with the secrets in them, there will come a time when you're rushing, or have had too long a day, and you'll push those secrets by accident. If you've got a public repo, then it's over. On a private repo you may not notice, or may not remember to remove it with a force push.
The point in time when you get tired of juggling config files manually between dev and prod is the point to explore a system for secret management and automated build/deployment, as clearly your project has become useful/popular enough.
Those are my IMO and what I use as thresholds. Of course, if your environment is more relaxed there's no limit on further improving this practice.
Again, the LEAST you should do is use environment variables and keep the actual keys out of your code. .env files are a developer convenience measure and easy enough to use as a side channel. I go a step further and ensure a fallback that might be the dev environment, but that is never the same as any higher environment.
So the question is who leaked it and why? Just a disgruntled employee or the effect of 996?
This is not some small-time shop; as far as social media companies go, Bilibili is one of the more established companies out there.
Correct me if I'm wrong, but theoretically an overhyped two-man operation running at a financial loss could have the same market cap as a much larger company with massive profits? As I understand it, the only somewhat tangible factor is the actual money in the company, which again can be inflated by overeager investors.
I'm not being facetious, I'm genuinely curious about the rationale.
As a first approximation, this is how much money it would cost to buy all the shares. You’d pay $SHARE_PRICE for each share and then own the entire company. Therefore the concept is a decent measure for what the market has decided the company as a whole is worth.
A company with 10x the market cap of a competitor is considered 10x bigger, because it would take about 10x as many dollars to acquire.
I say “first approximation” because if you actually tried to buy all the shares on the open market, then increased demand would drive the price up, and not everyone would want to sell right away. In an acquisition, the acquirer offers a deal where all shareholders get, say, 1.25x the current share price, but only if all shareholders sell all their stock. And the board of directors of the company being acquired can compel all shareholders to do so, if that’s in the best interest of the shareholders.
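With invented numbers, the arithmetic behind that comparison is just:

```latex
\text{market cap} = \text{shares outstanding} \times \text{share price}
                  = 2 \times 10^9 \times \$15 = \$30\,\text{billion}
```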
After some googling it seems that my intuition about the relevance of stock prices is mostly right:
> If the stock price falls, these investors lose money, not the company.
The stock price is entirely speculative and detached from the company's actual performance. At best it's informed by a perception of the company's performance and an expectation of how the stock price might change in the future in reaction to the company's future performance.
While shares will be worthless if the company goes bankrupt, the company won't be directly affected if the stock market plummets -- except in situations where (additional) stock can be used as a currency in lieu of actual cash, like buying out competitors.
So, to answer my own question: market cap (but mostly share price, really) is only a measure of company size insofar as it indicates how much money the company could raise by selling additional shares. It doesn't indicate how well the company is doing financially, how many employees it has, how much market share it serves, or any other measure of size - BUT generally people are willing to pay more for shares of companies that are likely to grow, or at least outperform their competitors, in the short term.
EDIT: In other words, yes, an overhyped two man operation running at a financial loss could end up with a massive market cap but in practice it's unlikely to happen because hype rarely works that well.
You can think of this as equivalent to the size of the company because it would be the amount of money you'd need (roughly speaking) to buy the entire company outright.
"I don't know if these are embarrassing... Anyone troubled by morality can head out the door, turn right, and go pay attention to 996.icu"
The original repo was taken down, so I don't think you can attribute that message to the leaker.
It literally has their name in the path/package name, so it isn't crazy (unless you know more?) to believe this was stolen/reverse-engineered Symantec code, no? At the very least it's a confusing and misleading directory tree for a random repository.
Files below src/ should be sources, not a 3rd party Symantec lib or whatever, so I'd give them the benefit of the doubt here.
This seems to have been the main factor - com/symantec/mobilesecurity - in the code.
FWIW, the report you link in your other comment has a screenshot of a conversation where someone claims that the code was leaked by an intern from Nankai University who didn't know how to use git.  That they're identified by their university makes me suspect that it's a rumor (edit: making fun of the university), though.
flv.js is open sourced by Bilibili; it has 14,668 stars on GitHub. Bilibili paid the smart & hardworking programmer who single-handedly started the project and made it popular $700 USD per month. There is a very long zhihu.com thread on this matter with 4 million views and almost 400 detailed responses. $700 is about 10% of the fair market rate in China for skills like that.
Sorry, but I am not going to take the moral high ground and defend Bilibili's rights any time soon. It is a company violating the rights of its programmers on an hourly basis.
Shame on you BiliBili.
I think defending corporate legal rights is mostly not about morality, which is why talking about morals in this story is important. Just as in the legal system it's not what's right or wrong, or the truth, that matters but legal justice, we need morality to play a part in making sure it doesn't get out of hand.
So you are suggesting $7000 for skills like that? There are still countless PHP/Golang/Rails jobs going for under $2K. While I agree $700 is insanely low even for some Tier 3 city, I don't think the 10% figure paints an accurate picture of the current state of programmer pay in China.
Apple does this with Mac OS X. The System Management Controller contains a key, and the "Dont Steal Mac OS X" kernel extension (which checks for that key) contains a poem that must be present for Mac OS X to run.
The level of creativity needed for copyright is minimal. A key pair is generated by machine, but at the request of a human according to parameters selected by the human. That is likely enough.
OTOH, a passphrase of substantial creativity  may be protected by copyright. Crucially (for any takedown), this would cover transformations by key derivation functions.
And what matters for a takedown is not the number but the actual document being published. GitHub is not hosting the random number; it is hosting that number in the context of a larger document, and it is that document that is subject to the takedown request. Things might be different if GitHub hosted only the number, without the associated labels and code.
I have no idea what it means, but I like it.
Most of these probably also appear in World of Warcraft, though I cannot say for sure.
As to why this file is in the top directory of the repo, your guess is as good as mine.
Also, there are a lot more heroes in the Dota map for Warcraft 3 than on that short list.
This keeps my secrets out of the source code.
go build \
  -ldflags "-X main.programVersion=$(git describe) -X main.username=$USERNAME"
It's fine for our internal go apps. I'm not sure what I would do if the secrets were for connecting to public cloud infrastructure though.
Perhaps encrypt them with a separate key per customer, then feed in the key via an env variable?
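For context, the Go side of that -X trick is just package-level string variables; the linker overwrites them at build time, so the real values never appear in the source (variable names here match my hypothetical setup, not anything standard):

```go
package main

import "fmt"

// These defaults apply to a plain `go build`; a release build overrides
// them at link time, e.g.:
//   go build -ldflags "-X main.programVersion=$(git describe) -X main.username=$USERNAME"
// Note -X only works on package-level string variables like these.
var (
	programVersion = "dev"
	username       = "unknown"
)

func buildInfo() string {
	return fmt.Sprintf("version=%s user=%s", programVersion, username)
}

func main() {
	fmt.Println(buildInfo()) // without -ldflags: version=dev user=unknown
}
```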
Ownership of the code could belong to the leaker.
It's not a simple case either. But it feels a bit strange that there aren't any links to this kind of DMCA takedown. It seems strange that a company like BitBucket would even have this kind of information without the DMCA notice.
Or maybe I'm just a cynic.