Tencent WeChat is now a GitHub secret scanning partner (github.blog)
154 points by 5amdotis on Dec 20, 2022 | 141 comments



Lame corporate partnership announcements aren't on topic for HN, and the wording here looks to have been a boilerplate malfunction: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que....

Poor functionary creates political incident with humble template...sounds like a Gogol short story. "but it worked great for redirect.pizza!"

Btw I assume this recent thread was about the same feature:

Secret scanning is now available for free on public repositories - https://news.ycombinator.com/item?id=34007637 - Dec 2022 (70 comments)


This is simultaneously epic clickbait and a very accurate representation of a reality that is very boring and not shocking at all. Congrats to whoever wrote the line.


Brilliant title for the article.

Even though I'm a paid github customer, I had no idea they had a program called "secret scanning" and that it's actually beneficial.

So I obviously assumed they're letting China scan my private repos.

They really need to work on wording.


>, I had no idea they had a program called "secret scanning" and that it's actually beneficial.

Fyi... this feature was also previously mentioned in the news for public repos: https://techcrunch.com/2022/12/15/github-brings-free-secret-...

>So I obviously assumed they're letting China scan my private repos.

To clarify, it's Microsoft/Github doing the scanning of private repos on behalf of the partners. They're just forwarding the tokens that match the partners' regexp.


Yeah I read the article and the comments on HN so I know what it's about now. I still think they (not HN) should change the title to include what secret scanning means.

Edit: how about dropping the corporatese and title it "github will now scan public repos for secret WeChat tokens"?


Yea, but you don't get panic clickthru with this message.


hmm it's a pity github blog didn't have any advertisement...


Actually there's an even better term below in the thread. Use just "for WeChat credentials". No mention of secret scanning.


This is 100x better than the original!


Like .* ?


>Like .* ?

Assuming your question is not a joke...

The partner has to email the regex to secret-scanning@github.com for their approval. See the steps at: https://docs.github.com/en/developers/overview/secret-scanni...

Once it's in the scanning system, the partner receives JSON alert messages such as:

  [
    {
      "token":"NMIfyYncKcRALEXAMPLE",
      "type":"mycompany_api_token",
      "url":"https://github.com/octocat/Hello-World/blob/12345600b9cbe38a219f39a9941c9319b600c002/foo/bar.txt",
      "source":"content"
    } 
  ]

So instead of ""token":"NMIfyYncKcRALEXAMPLE"," the private repo owners would worry about a '.*' regex leaking full source code rather than API credentials, e.g. ""token":"#include <stdio.h>\nmain(){\nprintf("hello world");\n}","

The above scenario requires believing the following:

- Microsoft/Github is technically incompetent and an employee and/or their internal regex sanity checking tool will blindly accept open-ended regex like '.*'

- MS/Github will then allow that unbounded regex to leak petabytes of private source code out to China partners via the JSON "token:" response. (Github says they have 18+ petabytes of data and most of that is private repos: https://twitter.com/github/status/1569852682239623173)

If one believes their entire private repo source code is at risk of being leaked to Tencent via the '.*' threat because the above scenario seems realistic, I assume the answer is to delete the repo.
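
For what it's worth, the kind of regex sanity check mentioned above is easy to sketch. GitHub's actual review process is not public, so everything here is an assumption: the function name `is_too_broad`, the benign sample lines, and the `mycompany_` token format are all made up for illustration. The idea is just that a pattern which matches ordinary source text (or the empty string) gets rejected.

```python
import re

# Hypothetical sample of ordinary text that no credential pattern should ever match.
BENIGN_SAMPLES = [
    "",
    "#include <stdio.h>",
    'printf("hello world");',
    "def main():",
    "The quick brown fox",
]

def is_too_broad(pattern: str, samples=BENIGN_SAMPLES) -> bool:
    """Return True if the pattern finds a match anywhere in any benign sample."""
    compiled = re.compile(pattern)
    return any(compiled.search(sample) is not None for sample in samples)

# '.*' matches even the empty string, so it is rejected.
print(is_too_broad(r".*"))
# A prefixed, fixed-length pattern (made-up format) matches none of the samples.
print(is_too_broad(r"mycompany_[A-Za-z0-9]{20}"))
```

A real check would presumably also look at length bounds and false-positive rates on a large corpus; this only shows why '.*' would fail even the most basic test.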


https://docs.github.com/en/code-security/secret-scanning/abo... is pretty damn clear that secret scanning for private repos only alerts owners; only the public repo scans alert partners (for instant revocation).


I don’t think GitHub will send back the matching string, just the name of the repos


That seems bad? Look for /winnie/i in all private repos. The repo name includes the owner. Then go and arrest them.


Devils advocate: I read recently that GitHub is being used to circumvent censorship in China. Does this system of allowing them to provide regexes allow China to automatically obtain lists of users who are mentioning certain words or phrases? Or is that nonsense?


> Or is that nonsense?

Yes, that is nonsense.

1) secret scanning can be disabled (not even sure it's enabled by default). 2) the regexes are fairly specific, length limited, etc. 3) github is obviously reviewing regexes that are accepted.

Check the list of stuff supported: https://docs.github.com/en/code-security/secret-scanning/sec...

A bit sad, they don't publish the list of regexes, etc.

--------------

I added a similar thing to the package manager for Dart / Flutter, because we saw users accidentally publishing secrets. That code is public, it relies on regexes and entropy estimation:

https://github.com/dart-lang/pub/blob/eb8ee21a089ebe0f2c2dd8...

It was heavily inspired by the researchers in: https://www.ndss-symposium.org/wp-content/uploads/2019/02/nd...

Worth a read, and certainly provides motivation for Github to do this kind of work :D

(disclosure: I work for Google. The opinions stated here are my own)
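
To make "regexes and entropy estimation" concrete, here is a minimal sketch of the general idea. It is not the code from the Dart package manager linked above; the function names, the made-up `mycompany_` style pattern and the 3.0 bits-per-character threshold are all assumptions chosen for illustration. A candidate has to match the shape of a token and also look random enough.

```python
import math
import re
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    total = len(s)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical token shape: exactly 20 alphanumeric characters.
CANDIDATE = re.compile(r"[A-Za-z0-9]{20}")

def looks_like_secret(text: str, threshold: float = 3.0) -> bool:
    """True if text has the token shape and its entropy exceeds the threshold."""
    return CANDIDATE.fullmatch(text) is not None and shannon_entropy(text) > threshold

print(looks_like_secret("NMIfyYncKcRALEXAMPLE"))   # token from the example alert upthread
print(looks_like_secret("aaaaaaaaaaaaaaaaaaaa"))   # right shape, but no randomness
```

The entropy filter is what keeps placeholder strings and repeated characters from triggering an alert even when they match the regex.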


Once again[1][2], scanning alerts on private repos are only sent to owners. Whereas public repos are, you know, public.

It's really tiring that people correct other people's misinformation when they themselves haven't read the bold bullet points in "Learn more about secret scanning"[3] and end up totally missing the point.

[1] https://news.ycombinator.com/item?id=34067335

[2] https://news.ycombinator.com/item?id=34067625

[3] https://docs.github.com/en/code-security/secret-scanning/abo...


Ah yeah, that's a good point.

Honestly, I'm just very happy GitHub is doing this, because we've all made these mistakes. And it's so easy for them to hide in git revision history, only to be found when someone scans for the secrets.


I had the same reaction. This seems like the plan of scanning of pictures on iPhones for CSAM; it would not be hard to add extra patterns that match materials beyond the original intent.

Are the secret patterns all publicly available? Or are the secret scanning patterns themselves secret? Without public review, we cannot know what secrets they will obtain.

I for one do not trust GitHub/Microsoft to act in the interest of the average user. Their past actions disqualify them from receiving any benefit of the doubt.


Yeah, you have to read the article to realize that "secret" is a noun in this case, not an adjective...


Seems unbelievable how they'd fumble the title when they could just as easily have called it "secrets scanning" and it would be OK.


Or maybe “secret detection”

Though perhaps that’s just my own bias on the subtle differences in the meanings of those words.


Maybe it's intentional, to generate traffic.


I think the name of the service is a bit ambiguous; they could've called it "Access Key Scanning" or even just "Secret*s* Scanning". Even capitalizing it would set it apart as a service instead of regular words in a sentence.


Credential scanning.

It's not scanning that they're doing in secret. Credential scanning removes the ambiguity


Scanning repos for secrets has been a thing for a while now. But seeing Tencent might put people on edge.


Secret scanning is a thing.

But this is an excellent next step where they build an integration with these partners where, as soon as a secret is scanned, they can notify tencent/AWS/other providers automatically to instantly invalidate those keys before they’re abused.

That’s what’s novel here.


I wasn’t commenting about whether it was novel or standard practice. OP seemed to have gotten the heebie-jeebies from the submission title.


China doesn’t even have to scan now, Github is going to sort it out and send it all for us. Sounds bad.


There's never been a better time to migrate your projects away from corporate control


Actually the best time expired several years ago. Also prevention is better than cure.


Not not true, but if one migrates their repos now, they will create new projects on their new home which will start to break the cycle.


Hmm, 2022 (2023 soon!) and people are jumping and screaming on their first and incorrect reaction to some headline (not you, but scroll down for a comment doing just that). God Bless The Internet! /s


To everyone portraying this as harmless and as Wechat just looking for security breaches: Tencent itself is the security breach. Not only can Chinese ppl not sign up without providing a phone number, just to get a SIM card they now take your government ID, a picture of your face and a fingerprint! Xi is making absolutely sure that every single internet user is IDed and has their conversations tracked on apps like Wechat. Whatsapp, Signal & co are banned.

These "leaked" secrets GitHub forwards might be dissidents getting access without being tracked. It might not be a WeChat secret at all who knows? They're not a trustworthy partner, nothing should be shared with this company.

And to the folks saying it's public information and they already have it: That makes no sense, then they don't need GitHubs help. Obviously GitHub is supporting their scanning efforts here.


> And to the folks saying it's public information and they already have it: That makes no sense, then they don't need GitHubs help.

GitHub has a global stream API for all public events,[1] but it is delayed by five minutes, precisely so that sensitive actions like revoking leaked tokens can be performed before the world sees them. That’s what the secret scanning program is about, and you would have known if you spent 1/3 of the time of your rant learning about it.

Edit: Additionally, for private repos, secret scanning is opt-in and only alerts owners.

[1] https://docs.github.com/en/rest/activity/events?apiVersion=2...
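
To show how little effort reading that stream takes, here is a rough sketch. The endpoint is the public events API linked above; the helper names and the `User-Agent` string are my own, and I only rely on the `type` and `repo.name` fields of each event. Unauthenticated requests are rate limited, so treat this as an illustration rather than a crawler.

```python
import json
import urllib.request

EVENTS_URL = "https://api.github.com/events"

def fetch_public_events(url: str = EVENTS_URL) -> list:
    """Fetch one page of recent public events (unauthenticated, rate limited)."""
    request = urllib.request.Request(
        url,
        headers={"Accept": "application/vnd.github+json", "User-Agent": "events-sketch"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)

def pushed_repos(events: list) -> list:
    """Return owner/name for every push event, in stream order, without duplicates."""
    seen = []
    for event in events:
        if event.get("type") != "PushEvent":
            continue
        name = event.get("repo", {}).get("name")
        if name and name not in seen:
            seen.append(name)
    return seen

# Usage (performs a network request):
#   for repo in pushed_repos(fetch_public_events()):
#       print(repo)
```

Anyone polling this gets a list of freshly pushed public repos to scan, which is exactly why a head start for the token issuer matters.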


Wait a second, requiring a government ID to get a SIM card is kinda standard practice in multiple countries. Also, when it comes to privacy, US-based companies should be the last ones to talk, as if China is the only bad guy who infringes upon people's right to privacy. China is dangerous, but it's not the only dangerous thing in the room. Also, your comment doesn't make sense. If you are committing your credentials publicly while dissenting against the government, you are doing it wrong. Also, any publicly committed credentials are literally tracked by thousands of bots within minutes. It's not like China can't scan them without Github telling them they found something, if they really want to.


You may have misunderstood. There is no way to anonymously access Weixin from China unless you have hacked credentials. You need a phone number. Note that local Weixin and foreign Wechat are not the same. Last time my Mainland friend bought a SIM card the vendor had a government app on his phone, snapped a picture of my friend's face, scanned the ID (身份证) and had him take a fingerprint with a reader he also had connected to his phone. All this data gets uploaded directly to the Chinese government.

There isn't a country in the world which does this. But the details are also not the main point, it's how extremely restricted and controlled simple access to information or forums of free expression is for people in China. Tencent has party officials working within the company. This isn't a regular business as Westerners might imagine it, it's an extended part of the CCP just like any other large corporation under Xi.

Again, people are saying it's no big deal but why would GitHub help them at all? It's not a good cause.


Github here isn't supporting the China govt, they're partnering with companies that want to provide a regex for their credentials. And I don't know where you hail from, but I'm from India and I have a mandatory govt-issued ID card that has multiple biometrics and my photo associated with it. And to get a SIM card I need to provide that ID and authenticate with my fingerprints. Also tell me which US company isn't drooling over China contracts and, by extension, other local hostile activities. There is literally a recent story where Madison Square Garden used facial recognition to deny entry to an attorney who was related to a company that is in litigation with its parent company. But yeah, China Bad.


> There isn't a country in the world which does this.

My government requires me to have ID, which contains a photo and finger prints and you cannot get a SIM without ID. That's Germany and it's true for many, many countries.


> There isn't a country in the world which does this

Does what? The thing extra is the fingerprint but literally every modern country requires ID registration and more. My government also knows this IP belongs exactly to me. Stop spouting nonsense.

Plus this is completely unrelated.


This is offtopic (as it has nothing to do with the linked blogpost) but it's even worse. At the tail end of 2019 I went to China for a few weeks, I created a WeChat account at home without any problems. As soon as I stepped into China it got locked and I needed someone with a WeChat account to verify me. They can only verify (I think?) 3 new accounts per year, and 6 accounts that got locked out for whatever reason. This is (from my perspective) even worse than requiring ID for SIM etc. It links people together and I'm sure it brings some repercussions to the people that verified you if you make trouble down the line.

It was very fascinating to see, a near total domination of WeChat everywhere and relatively very hard onboarding for new accounts. Contrary to the west where most of services seek to streamline onboarding as much as possible - I guess that becomes an anti feature when you have total monopoly and _everyone_ has a WeChat account. I think it's a very effective (and very dystopian) form of control. P.S: Signal worked without any problems for me, even on a Chinese SIM (one "trick" to go around most of the GFW was buying a HK SIM in HK. Works across china and has a lot less blocks, but for various reasons I got a China SIM too).


This is a service running on public repos, anyone can scrape this which is the problem. GitHub does the scanning and all that is forwarded is the "secret" matching their regex. Tencent then identifies the account owner and informs them about the public secret. That's all.

GitHub is available in China, why shouldn't they protect their Chinese users?

And the SIM card requirements have nothing to do with Tencent, have you tried getting a SIM in Germany? Impossible without government ID and an address. And there are a lot of services which you can't sign up for without German ID / address. As a foreigner I also can't easily open a bank account in the US.


Why do they notify tencent instead of the repo owner?


Once a co-worker accidentally pushed an AWS key pair to his public dotfiles repo. About 30 seconds later AWS disabled the key and notified the account admin about the possibility of an account breach.


This is my question too… why not just let the owner of the repo know, why notify Tencent at all?


Answered elsewhere: https://news.ycombinator.com/item?id=34067625.

Instead of repeatedly having a question in an HN thread, next time try to read the source article.


Without taking away from your first paragraph at all, if any dissidents are publishing their access codes to GitHub repos, they are 1) doing it completely wrong and 2) are already screwed.

The threat here, in the worst case, is associating a GitHub ID with a WeChat ID.


Quoted from the blog post:

> We have partnered with Tencent WeChat to scan for their tokens and help secure our mutual users on all public repositories and private repositories with GitHub Advanced Security.

This is GitHub scanning private repos and telling WeChat about them.

WeChat can already scan public repos.

They are not already screwed if they’re publishing something to a private repo, it might be the wrong way to do it, but it doesn’t mean they’re already screwed.

If you don’t trust GitHub’s private repo security then why are you using it in the first place?



Obviously you’re wrong or the article is wrong… I’m gonna lean on you being wrong as the article is coming from GitHub and you’re not GitHub.


For private repos it is opt-in requiring the Advanced Security license: https://docs.github.com/en/get-started/learning-about-github...


Imagine if someone protested against Finnish–Russian cooperation on search-and-rescue operations near their border because the evil Russian government could be searching for political dissidents to imprison. That’s what your comment sounds like.

This is about preventing things like API keys from being published to code. That’s not a dissident use-case…


While Tencent and Wechat sound absolutely dystopian, the "you need a Government ID and a picture of your face" is often a requirement for creating a Facebook account or retaining your old one as well. Twitter also used to require a phone number to retain an active account; and Google frequently locks people out of old accounts unless they provide a phone.

Is this whataboutism? Possibly – but what I'd actually like to happen is US-based companies are charged company-hurting fines for mismanaging PII like this (Twitter, for example, is currently openly planning to sell user phone data [1] that they previously gathered for security purposes).

All this to say, we can't reasonably call out other dystopian companies if the ones we use everyday are doing the exact same thing. So we should call out secret scanning from Meta [2] and (if it ever happens) Twitter as well.

----------------------------------------

[1] https://www.businessinsider.com/twitter-plans-to-force-users...

[2] https://developers.facebook.com/blog/post/2021/11/09/meta-jo...


> These "leaked" secrets GitHub forwards might be dissidents getting access without being tracked.

"Leaked" here means "made public", i.e. "published such that literally anyone can use them", for example when burned into a commit of a public repo. Even for a dissident, publishing an API key or other credential where literally anyone can find it to use it, is almost assuredly a mistake. Because external scrapers can also find it there, such that the key will be inevitably picked up and fed into a botnet to abuse — at which point the ops staff at the service will notice the abuse and revoke the key, thus "burning" it as useful from the dissident's perspective.

If you store a secret on Github somewhere that only you and people you trust have access to, rather than everyone having access to it, then this is not considered a "leak", and so Github does not detect this as a "leaked secret." For example, commit data of private repos is not scanned for secrets (if it was, GitOps as a concept would be impossible!); nor is a repo's formal Actions Secrets store (part of a repo's configuration readable only by triggered Github Actions CI jobs).

Github's own secret-scanning here, is trying to catch the cases where a user has done something stupid by accident. Whether or not they reported secrets to third parties, they'd still be doing leaked-secret scanning of their own Github API keys, to ensure that people aren't accidentally trying to configure Github Actions by burning their Github Actions CI API key into the workflow itself. If they find such keys, they revoke them.

The point of Github's secret-scanning partner program, is that because Github is doing this leaked-secret scanning for their own purposes anyway, you (the partner) can sign up to be told when API keys of yours are accidentally made public as well.

> That makes no sense, then they don't need GitHubs help.

Ignoring for a moment that Github is a website, and so anyone can just crawl it—

Did you know? Github pushes the commit data of all public repos to BigQuery as a public research dataset: https://codelabs.developers.google.com/codelabs/bigquery-git.... Literally anyone can do their own "secret scanning" with a simple BigQuery query. It costs about $500 to run such a query, because the Github dataset is pretty large. It's not a price most SMEs would pay. But it's definitely a price attackers could be willing to pay. It's a lot cheaper than running your own web-spider infrastructure!

The difference with Github's own secret scanning is that it happens synchronously, on push of commits; whereas the ETL of commit data to BigQuery et al happens asynchronously, some time after commits happen. Tencent, like every other secret-scanning partner, depends on Github to stay ahead of any third-party attackers trying to scrape leaked credentials for use in botnets et al.

Also, FYI, you yourself can sign up to be a Github secret-scanning partner. You just need 1. a regex that uniquely identifies your secrets, so that Github can recognize them on push, and 2. a webhook URL to report them to. (https://docs.github.com/en/developers/overview/secret-scanni...)

And by the way, this isn't a hypothetical nice-to-have. I run an API SaaS — and not one that's even very large, in relative terms. But my own customers' accidentally-leaked secrets have been scraped from their Github repos and used by botnets already! Signing up as a Github secret-scanning partner is on my to-do list.
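
For the curious, the receiving end of that webhook can be very small. This is only a sketch under assumptions: the payload shape is the JSON example quoted elsewhere in this thread, `handle_alert` and the `revoke` callback are hypothetical names, and any request authentication the real program requires is left out entirely.

```python
import json

def handle_alert(body: bytes, revoke) -> list:
    """Parse an alert body (a JSON list of matches) and revoke each reported token.

    `revoke` is a hypothetical callback taking (token, url); in a real service it
    would invalidate the key and notify the customer who owns it.
    """
    alerts = json.loads(body)
    handled = []
    for alert in alerts:
        token = alert.get("token")
        if not token:
            continue  # skip malformed entries rather than guessing
        revoke(token, alert.get("url", ""))
        handled.append(token)
    return handled

# Usage with the example payload shape from this thread:
example = b'''[
  {
    "token": "NMIfyYncKcRALEXAMPLE",
    "type": "mycompany_api_token",
    "url": "https://github.com/octocat/Hello-World/blob/12345600b9cbe38a219f39a9941c9319b600c002/foo/bar.txt",
    "source": "content"
  }
]'''
revoked = []
handle_alert(example, lambda token, url: revoked.append((token, url)))
print(revoked)
```

The point is that the partner only ever sees strings that already matched its own token format, plus where they were found.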


This is part of https://docs.github.com/en/developers/overview/secret-scanni...

It lets WeChat revoke tokens that GitHub finds in public repositories.


It lets WeChat see tokens that GitHub forwards to them. What they do with it is up to them, but the intent is that they mitigate the issue.

“GitHub will forward access tokens found in public repositories to Tencent WeChat, who will notify affected users.”


Why did you edit the full quote?

Here’s what I just copied from the blog post without modification:

> We have partnered with Tencent WeChat to scan for their tokens and help secure our mutual users on all public repositories and private repositories with GitHub Advanced Security.

It’s not just public repos, it’s private repos too.


A comment above says that for private repos only the repo owner will be notified, vs sending the secret to the partner for public repos


That is insane. They just leak data from your private repos to a hostile foreign govt agency. Unbelievable.

Edit: apparently they notify you for private repos, not Tencent. Still not thrilled.


Optics of this article could be improved.

However, this is already a well established and useful thing. When you publish your AWS (for example) secrets to your public repo, it will scan it and stop it leaking before damage can be done. This is just the same for another service.


Why could the optics be improved?


It would be nice of Github if they could publish a transparency repo with all the partners and all the regex along with this initiative. I see a lot of people in this thread worried that "China gets their data" and this transparency repo could alleviate some of that.


Why do people worry about China so much? There is barely any cooperation between Chinese intelligence and the rest of the world.

If I was forced to pick one government to share my secrets with, it would be the Chinese, because there's nothing they can do about it. My own government and its allies are infinitely more dangerous to me than such a foreign one.


> because there's nothing they can do about it

Are you talking about the China that bought huge areas in ports around the world? The same one that has secret police stations as well?


TECHNICALLY they only supply loans and construction assistance to countries that don't have the infrastructure themselves yet, but, yeah.

Not like the US is much better though; US based investors and shell companies own property all across the world, and US intelligence agencies infiltrated / owned European based secure communications companies / products (https://en.wikipedia.org/wiki/Crypto_AG).

Also, the US has public / non-secret army bases in 100+ countries, and secret police stations in loads of places elsewhere (50 prisons in 28 countries: https://en.wikipedia.org/wiki/CIA_black_sites)


So, basically... nothing? You should go into China to get a real threat. On the other hand, there is global cooperation with intelligence agencies of USA and many European countries.


China spies on Uyghurs in other countries, puts pressure on them and has been known to run secret police stations as well.

China spying on Uyghurs in Sweden: https://www.rferl.org/a/Sweden_Jails_Uyghur_Chinese_Man_For_...

China controls dissidents abroad through relatives back home: https://www.reuters.com/article/us-china-uighurs-idUSKBN0UD1...

China's secret police stations in Europe: https://www.spiegel.de/international/world/beijing-s-long-ar...

>On the other hand, there is global cooperation with intelligence agencies of USA and many European countries.

Two wrongs don't make a right.


The key point was not in acquiring information... but in the authority to do something real with that information without real consequences.


> If I was forced to pick one government to share my secrets with, it would be the Chinese, because there's nothing they can do about it.

Unless you live or visit there. Weren't there reports of China having concentration camps?


Obviously if someone would pick China it was because they were not planning to visit there


Because if you want to influence the voting patterns of a population, knowing as much as you can is useful. Search history + TikTok + FB would give you unbeatable datasets that you could use for the lifetime of the person. Take a dataset now, add a decade of AI progress and I’m pretty sure a nefarious actor would be leading many people around by the nose. Not all, but 5% would move many elections.

They wouldn’t even need to learn all that much about you as an individual. Just enough to match you with a cluster from their own population that they have infinite data on.


In other words, your data in their hands is dangerous to you when combined with that of others.


> If I was forced to pick one government to share my secrets with, it would be the Chinese, because there’s nothing they can do about it.

What makes you so sure about that?

I worry about China because there’s no internal checks to prevent them from doing anything.

Western governments and allies have a long culture of court systems and thinking about balancing constituent needs. That is eroding and becoming more dangerous to the extent western leaders are envious of dictatorial powers and trying to emulate Chinese totalitarianism, but there is a lot of institutional and cultural bulwark against it.

Any powerful totalitarian country should worry people. People underestimate the level of covert aggression in all facets of foreign involvement in regimes with no internal accountability.


> What makes you so sure about that?

I think the commenter means that China can't do much about them while living where they do (e.g. the US), whereas the US based intelligence agencies can black bag them in most of the western world.

Given the Snowden revelations, I don't believe there's much accountability to what the NSA and co collect and analyze. Secret courts, CIA dark sites, Gitmo, all kinds of things that might be legal according to the letter of the law, but feel immoral and unjust (war crimes level of unjust).


The fact that we’re able to talk openly about the unaccountability of the NSA, CIA, FISA courts and Gitmo is a form of accountability that does not exist in China.

I do not condone the actions of all of the US institutions and believe many are largely unaccountable, but there is still far more accountability in our least accountable institutions than any in China. Culture within institutions, the populace at large from which institutions hire from, and the history of institutions is a very important but subtle factor that westerners take for granted and is not identical between countries.

The fact that we even call US actions war crimes is another form of accountability China lacks. China forcibly annexed Tibet. China has concentration camps in inner Mongolia. China massacred its own people during the Mao era and is run by the same party today. China releases dams upstream of cities and drowns its own people. China committed war crimes in its support of North Korea during the Korean War, and Vietnam in the Vietnam War. China turned hundreds of its college students into mincemeat in Tiananmen Square.

These are not on the radar of most people in China the same way Gitmo and Iraq and Vietnam are in the US because the CCP has a level of unaccountability westerners do not understand. Their stranglehold on media and public perception is extreme, and they have been exerting influence on western social media like reddit, youtube, and facebook, as well as google search. They do so through virtually all corporations that want access to Chinese labor and the Chinese market.

The ramifications of this are severe. The lockdown policies that have destroyed a lot of the global economy were pushed on the world by pressure from China (and our own fear and incompetence), despite existing pandemic plans in western nations that were far more sane, more humane, and less destructive.

They were able to get us to discard our own pandemic plans from our own institutions because of the influence they have been gaining. China exerted influence over the WHO recommendations, and was very aggressive/in PR panic mode when the virus first escaped. Our own internal division is largely our own doing, but they were extremely aggressive in the early phases of the pandemic online and greatly exacerbated western problems precisely because they knew what talking points to copy and paste for maximum division and lack of consensus.

Giving them your data and more ability to influence you and people that fit your demographics online is incredibly naive and foolish, no matter how much you may dislike your own government or how valid that dislike may be.


Eh, never stopped USA


> because there's nothing they can do about it

What makes you think this?


https://amp.cnn.com/cnn/2022/12/04/world/china-overseas-poli...

Yep totally harmless.

> China operating over 100 police stations across the world with the help of some host nations, report claims


I agree. For most people here, it is their own government (or allied governments) which are the ones to be most cautious about. They're the ones most likely to ruin your life if they don't agree with what you're up to - elected or not.


Because American propaganda.


Some people are Chinese.


You may not have noticed this, but China is seeking to extend its influence. By whatever means required, including deadly force. There is a picture you might want to look at. Search "Tank Man". That is why people worry about China so much.


oh, you mean like the US and its global reaching government?


There is a list of partners [0], I thought I had seen regexes at some point but I can no longer find them.

[0]: https://docs.github.com/en/enterprise-cloud@latest/code-secu...


I don't think they would release the regex used to validate the API keys since it would help people automate scanning for API keys of all supported providers on public repos on any other site using the regex given by the provider itself.


Is this blog post not the transparency part?

I assume that the regex is `TC:[a-z0-9]{20}` or something uninteresting like that.
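
If you wanted to check your own working tree against a pattern like that, a rough sketch follows. To be clear, `TC:[a-z0-9]{20}` is only my guess above, not the real WeChat token format (which does not appear to be published), and `scan_tree` is a made-up helper. It also only looks at files currently on disk, not at git history.

```python
import re
from pathlib import Path

# Guessed pattern from the comment above; NOT the real WeChat token format.
GUESSED_PATTERN = re.compile(r"TC:[a-z0-9]{20}")

def scan_tree(root, pattern=GUESSED_PATTERN) -> list:
    """Return (path, line number, match) for every pattern hit under root."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the .git metadata folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in pattern.finditer(line):
                hits.append((str(path), lineno, match.group(0)))
    return hits

# Usage:
#   for path, lineno, token in scan_tree("."):
#       print(f"{path}:{lineno}: {token}")
```

Swap in whatever pattern your own provider documents; the scanning part stays the same.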


I guess because some lazy people would use those regexes on all other public git hosts and search engines


Wait, what?

So any string (which Github deems an access token) is forwarded to Tencent?

Or will Tencent share all their current access tokens with github?


Any string that matches access token regexp provided by Tencent (see https://docs.github.com/en/developers/overview/secret-scanni...).


For public repositories only though. For private repos it's optional, and when enabled the repo admins get an alert to handle it themselves without it going to the vendor.
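
A rough sketch of that routing, purely as an illustration: the pattern is the guess from elsewhere in this thread (not the real WeChat format), `route_matches` is a made-up name, and the private-repo branch assumes scanning has been enabled.

```python
import re

# Hypothetical pattern, taken from a guess in this thread.
# The real partner pattern is not published here.
PARTNER_PATTERN = re.compile(r"TC:[a-z0-9]{20}")


def route_matches(text, is_public):
    """Return (destination, matches) for one blob of repository text.

    Only the matching strings are routed, never the surrounding content.
    """
    matches = PARTNER_PATTERN.findall(text)
    if not matches:
        return ("nobody", [])
    # Public leak: the partner is told so it can revoke the token.
    # Private repo (scanning assumed enabled): only the repo admins are alerted.
    destination = "partner" if is_public else "repo_admins"
    return (destination, matches)
```

Anything that does not match the pattern goes nowhere in this sketch, which is the whole point of the pattern needing to be narrow.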


    .*
;-)


So it is just one bad regexp away from sending them other companies' secrets


I don't see what your comment is trying to point out.

The same could be said for all the other Secret Scanning partners GitHub has, like AWS and so on.

That being said, it's impossible that a "bad regexp" is gonna make its way to the GitHub codebase.


You can already do the former by using GitHub Events API. This simply helps with the accidental leak of tokens into the public, so Tencent / Repo owner can revoke it before it gets abused. https://docs.github.com/en/rest/activity/events?apiVersion=2...


They had to have titled it like this on purpose. I almost spat out my tea.


What does a WeChat token look like? As in, can I scan my repo to check that I'm not leaking anything unwanted to WeChat?

That said, could one also generate tokens and essentially DDoS the WeChat org by having them inform their customers unnecessarily?
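
On the first question, a local check could look roughly like the sketch below. The real WeChat token format isn't given anywhere in this thread, so `TOKEN_PATTERN` is a placeholder to swap for the documented format, and `scan_text` / `scan_tree` are names invented for the example.

```python
import os
import re

# Placeholder pattern (an assumption); replace with the real token format.
TOKEN_PATTERN = re.compile(r"TC:[a-z0-9]{20}")


def scan_text(text):
    """Return a list of (line_number, matched_string) for one file's text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in TOKEN_PATTERN.finditer(line):
            hits.append((number, match.group(0)))
    return hits


def scan_tree(root):
    """Walk a checkout and map each file path to its list of hits."""
    results = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own object store; only the working tree is checked.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    hits = scan_text(handle.read())
            except OSError:
                continue  # unreadable file, move on
            if hits:
                results[path] = hits
    return results
```

This only looks at files currently checked out. A token that was committed and later deleted would still be in the history, so it is not a full audit.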


Isn't this information already public?


Which information?

Your WeChat tokens? No, those should never be public, which is why this feature exists.

That GitHub reports when you leak your WeChat tokens? That was announced just recently, hence the post.

That GitHub is giving WeChat your secrets? No, that is not what this is about, although the article title would make you think so.


GitHub is indeed giving WeChat our information, but only when it looks just like WeChat secrets, and only once it's already publicly leaked (any leaked via private repo instead goes to the repo admin).

So technically the answer to GP is 'yes'.


I don't think you can claim an API key is your information. It is quite by definition information created by WeChat, and GitHub is sharing only that with them a few minutes before it shares it with bad actors.


It's not necessarily created by WeChat. It just needs to follow the same pattern as a key generated by WeChat (thus all the .* regex jokes here). It could very well be anyone's information, but information about to be public anyway (making "yes" the answer to the question "Isn't this information already public?").


They should scan for Bitcoin seeds too.



1. If their regex matches my company's token, will it be sent to them? 2. Can WeChat update the token regex to collect tokens from a competitor company? 3. Can Tencent collect information about applications that use WeChat?


Why is everyone upset? This is a good thing.

Where are you seeing a privacy or security risk?


It’s a combination of missing hyphens (it should be “secret-scanning partners” to avoid adjective ambiguity) and people’s inability to open links and read anything past the title. Sprinkle a bit of Sinophobia and we’re golden.


Tencent provides a list of regexps, and anything matching those regexps is passed to them. As far as I can tell, we don't get to know what those regexps are (and presumably they can be changed at Tencent's whim). Can you not see the issue?


We do not know if Tencent can use arbitrary regex to find, I don't know, anti-Chinese sentiment content or just preapproved ones like "tencentToken=([a-zA-Z\d]{15})". Also, it's just for public repositories!

In any case, this announcement changes nothing. If you trusted GitHub with something before that you wouldn't trust them with now, your mental model is wrong. GitHub might allow any kind of partner (customer?) to scan their private or public repos in any way they want without making it public. In other words, if you are someone this announcement is problematic to, you shouldn't have anything on GitHub in the first place.


I cannot see the issue because the regexes are pre-approved by GitHub. And even then, the service will only return the string, not who wrote it. Unless GitHub approves /John Doe said:.*/ there is no issue whatsoever.


> I cannot see the issue because the regex are pre-approved by GitHub.

GitHub is a private company with one dual obligation, to prolong its existence and keep increasing its profit margin.

It is not any sort of arbiter for morality - morality being an externality to its central obligation - so it cannot be relied upon to “do the right thing”.

So it is not in any position of authority that would enable it to “approve”, in the moral sense of the word. They can only “allow” for the regex to be run and the results sent off.

For example, the “right thing” for GH would be to increase profit, while for another entity might instead be to uphold its users’ privacy.

(You may think that it’s only for public repos, so they’re already made public, but isn’t GH here facilitating an aggressive collection and summation of information, that would otherwise be much more difficult and error-prone?)

The power of approval would rather come from an elected entity that would also determine who may request that such searches are executed, and which reasons would be valid.

Otherwise, we get a William Gibson-esque megacorp cyberspace future with clear but corporate Orwellian overtones.

Isn’t this obvious?

(I’m not being snarky at all - I’m genuinely asking: isn’t this glaringly and terrifyingly obvious?)


Generally when you come up with something from first principles that appears to be "terrifyingly obvious", that thing is false.

Microsoft's mission in life is to do whatever its directors want, insofar as its shareholders don't get /too/ annoyed about this, and to not break the law. They have no actual "obligations" or "fiduciary duties" to keep increasing profits or anything like that.


So they have no duty towards the general public either, until the public’s too irritated with them and calls for oversight. How’s that better, or any different at all?


I don't understand what you're trying to say here.


They’re not your friends, they’re out to profit from anything they can grab from you, so on principle, DON’T TRUST THEM.


They’re not out to profit from anything they can get though. They’re out to do Microsoft things.

This is pretty obvious; most companies stay in their industry even if it becomes irrelevant. Furniture companies rarely switch to software and software companies rarely just start selling heroin. (Except for gacha games.)


I guess I just have a lot less faith in the ability of companies to design perfect processes, and the ability of humans to perfectly carry them out, than you do.


Yeah because it is oh-so-easy to ensure your regexp matches only your company's tokens and not 10000 other companies' tokens /s


And? What are you going to do with a singular token that you don’t know what company it belongs to? But obviously those devs at GitHub don’t know what they’re talking about so they’ll gladly notify two companies at once.

It’s super easy too: take a look at GitHub’s tokens, they all start with gh.
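
To make the prefix point concrete, here is a small sketch comparing a bare "40 hex characters" pattern with a prefixed one. The `ghp_`/`gho_`/`ghu_`/`ghs_`/`ghr_` prefixes are GitHub's documented ones; the fixed length of 36 after the prefix and the `matching_patterns` helper are assumptions made for the example.

```python
import re

# A prefix-less pattern: any 40 lowercase hex characters.
NAIVE = re.compile(r"\b[0-9a-f]{40}\b")

# A prefixed pattern in the style of GitHub's tokens.
# The {36} length is an assumption for illustration.
PREFIXED = re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36}\b")


def matching_patterns(candidate):
    """Return the names of the patterns that match the candidate string."""
    names = []
    if NAIVE.search(candidate):
        names.append("naive")
    if PREFIXED.search(candidate):
        names.append("prefixed")
    return names


# A SHA-1 commit hash looks exactly like a prefix-less hex token...
commit_hash = "da39a3ee5e6b4b0d3255bfef95601890afd80709"
# ...while a made-up prefixed token only matches the prefixed pattern.
fake_token = "ghp_" + "A" * 36
```

Here `matching_patterns(commit_hash)` gives `["naive"]` and `matching_patterns(fake_token)` gives `["prefixed"]`. A distinctive prefix is what keeps one vendor's pattern from picking up everyone else's strings.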


> Where are you seeing a privacy or security risk?

Well...

> Tencent

Here.

It's really the combination Tencent and Partnership that I find a problem. These things tend to lead to closer collaboration and WeChat is a huge surveillance tool.

Sure they have access to public info anyway because everyone does. Just let them scan it themselves then.

And yes I'd feel almost the same if it was Facebook.


It’s absolutely shocking to observe how hostile HN is to Chinese affairs, while in real life many here must have collaborated A LOT with Chinese engineers & managers. Are you worried about bias bleeding into real life? I’m indeed worried, as a Chinese immigrant working in tech


It's mostly because of the government. Don't take it personally. I have quite strong opinions of China, but I don't let it influence my relationships with my Chinese colleagues.


It is simply xenophobia, as it is with Russians or Russia-related issues. This is how the internet works in the Western world. When you point it out, they simply downvote you; if your account is new, they will claim it is a bot or that you are an agent/collaborator, and so on.


Just make sure your secret doesn’t look like a WeChat secret


Is this a joke? I'm not laughing.


No joke. «GitHub will forward access tokens found in public repositories to Tencent WeChat, who will notify affected users.» In other words, Tencent now has access to all of your public repositories.

Also, GitHub now has code to recognise Tencent access tokens.


> In other words, Tencent now has access to all of your public repositories.

They already did. That's what public means. This is just an optimization to make it harder for WeChat access tokens to be inadvertently compromised without getting noticed.

If you're worried about the Chinese government having inappropriate influence over or access to various things outside China, that's in general a valid concern indeed, but facilitating credential scanning in public repositories really doesn't seem worrying.


I'm shocked by the number of respondents who felt the need to point out what public means.


It's because this site is dead and more than half the comments are written by bots. On the average HN article, like 75% of the comments make absolutely no sense.


Tencent already had access to all your public repositories? They're public.


That's one thing, but it's not like they have access to my public Instagram photos, tweets or anything like that (/s?)


> We have partnered with Tencent WeChat to scan for *THEIR* tokens


No one owns random strings. They can claim whatever they want to be "their" tokens


Tencent isn't claiming ownership of these strings, it's claiming that strings with a particular format have special meaning wrt. Tencent's APIs. It has told Github about that format.

This is fundamentally similar to how UPS, DHL etc. document how to recognise their tracking numbers.

"Their" and genitive in general doesn't necessarily mean ownership. It's often used for various sorts of connection. For example, "my address" doesn't claim ownership of either the street or the house, "my age", "my wife", all connected to me somehow but not owned by me.


Not your random string not your coins. Not your random string not your monkey picture.


Just a reminder that Git is a decentralized protocol and Github is merely a (poor) implementation of it. Microsoft-Github have been increasingly introducing antifeatures, just one of which is sending repository contents to China automatically.

For the last few years I've been running Git off my own servers with a cgit [0] frontend, and couldn't be happier.

[0] https://git.zx2c4.com/cgit/about/


Repository contents aren’t “sent” to China, companies like Tencent specify the shape of tokens to GitHub and GitHub does the scanning and then notifies Tencent to revoke the token if one is found.


How is GitHub a poor implementation of Git? Because it’s centralized?


I believe it's roughly a quote from Linus Torvalds, the creator of git, who has many issues with GitHub's decisions. See https://www.wired.com/2012/05/torvalds-github/ for a start; his opinion hasn't improved over the decade.


His objections seem to be nitpicking, and he says it is fine for hosting?


The issue with the way they handle commits is a fairly fundamental disagreement, one that ensures he will never use GitHub for development unless it is changed.


Yeah, well, he says the web interface sucks and that "pull requests" do too. Like, all their value add.

I have never tried Github's web interface but I guess not being able to write tab is the first thing that pissed Linus off.



