
There is a third possibility I haven't seen discussed yet: That DeepSeek, illegally, got their hands on an OpenAI model via a breach of OpenAI's systems. It's easy to laugh at OpenAI and say "you reap what you sow", and I'm 100% in that camp, but given the lengths other Chinese entities have gone to when it comes to replicating Western technology; we should not discount this.

That being said, breaching OAI's systems, re-training a better model on top of their closed source model, then open sourcing it: That's more Robin Hood than villain, I'd say.




The reason you’re not seeing that being discussed is it’s totally unsupported by any evidence that’s in the public domain. Unless you have some actual evidence of such a breach, you may as well introduce the possibility that DeepSeek was reverse engineered from data found at an alien crash site.


Why stop there.... DeepSeek is actually an alien intelligence sent via sophons to destroy all of particle physics!


Definitely would make a lot more sense, if the leaderships are just secretly wallfacers.


There's no public evidence to that effect but the speculation makes a lot more sense than you make it sound.

The Chinese Communist party very much sees itself in a global rivalry over "new productive forces". That's official policy. And US leadership basically agrees.

The US is playing dirty by essentially embargoing China over big AI - why wouldn't it occur to them to retaliate by playing dirtier?

I mean we probably won't know for sure, but it's much less far fetched than a lot of other speculation in this area.

E.g., R1's cold start training could probably have benefited quite a bit from having access to OpenAI's chain of thought data for training. The paper is a bit light on detail on how it was made.


> The Chinese Communist party very much sees itself in a global rivalry over "new productive forces".

interestingly, that actually makes the CCP the largest political party pursuing state capitalism.

there won't be any competition between China and the US if the CCP is indeed a communist party as we all know full well that communism doesn't work at all.


What a ridiculous thing to say. Equating a non-US nation-state backed organization hacking a Western organization with data found at an alien crash site is bananas.

Edit: added clarity to geographical perspective


DeepSeek is basically a startup, not a "foreign nation-state backed organization". They were forced to pivot to AI when their original business model (quant hedge fund) was stomped on by the Chinese government.

Of course this is China so the government can and does intervene at will, but alleging that this required CIA level state espionage to pull off is alien crash levels of implausible. They open sourced the entire thing and published incredibly detailed papers on how they did it!


You don’t need a CIA level agent to get someone with a fraudulent job at OpenAI for a few months, load some files on a thumb drive, and catch a plane to Shanghai.


You may be unaware, but CCP has far more control over private companies than you might think: https://www.cna.org/our-media/indepth/2024/09/fused-together...

This is not America. Your ideas do not apply the same way.


The naivety of some folks here is astounding… The CCP has golden shares in anything that could possibly be important at some point in the next hundred years, and yes, golden shares are either really that or they’re a euphemism; the point is it doesn’t even matter.


China has tens of millions of companies. The government can't, doesn't and isn't even interested in micromanaging all of them.


It doesn’t have to micromanage. It doesn’t care about most. It is only interested in the politically important ones, but it needs the optionality if something becomes worthwhile.


You're suggesting that DeepSeek was a Chinese government operation that gained access to OpenAI's proprietary data, and then you're justifying that by saying that the government effectively controls every important company. You're even chiding people who don't believe this as naive.

I think you have a cartoonish view of China. A huge amount goes on that the government has no idea about. Now that DeepSeek has made a huge media splash, the Chinese government will certainly pay attention to them, but then again, so will the US government.


I never suggested anything of the sort.

I’m suggesting it will be happening now and any past efforts will be retroactively analyzed by the appropriate CCP apparatus since everyone is aware of the scale of success as of Monday. It has become a political success, thus it is imperative the CCP partakes in it.


This is the argument we're discussing:

> DeepSeek, illegally, got their hands on an OpenAI model via a breach of OpenAI's systems. [...] given the lengths other Chinese entities have gone to when it comes to replicating Western technology; we should not discount this.

Above, teractiveodular said that "DeepSeek is basically a startup, not a 'foreign nation-state backed organization'". You called teractiveodular naive for saying that. So forgive me if I take the obvious implication that you think DeepSeek is actually a state-backed actor enabled by government hacking of OpenAI.


You took a major leap. No one made any such argument.


> foreign nation-state backed organization

I'm European, are you talking about Microsoft, Google, or OpenAI?


They’re referring to an organization (like a hacking group) backed by a country (like china, North Korea).


So, which of those three?


You're missing the point that for a much larger portion of the world, all "tech" is a foreign entity to them


Until recently treating the US and China on the same geopolitical level for allied countries would have been insanely uncharitable and impossible to do honestly and in good faith.

But now we have a bully in the White House who seems to want to literally steal neighboring land, or is throwing shit everywhere to distract from the looting and oligarchy being formed. So I suddenly have more empathy for that position.


I notice that your geographical perspective doesn’t stretch to any actual evidence that such a thing took place. So it really has exactly the same amount of supporting evidence as my alien crash reverse engineering scenario at present.


The surrounding facts matter a lot here. For example, there are plenty of instances of governments hacking companies of their competing nations. Motives are incredibly easy to come by as well, be they political or economical. We also have no proof that aliens exist at all, so you've not only conjured them into existence, but also their motive and their skills.

Are you trolling me?


Ok so to be clear: your surrounding facts are that they may have a motive and that nation states hack people. I don’t disagree with those, but there really are no facts that support the idea that there was a hack in this case, and the null hypothesis is that researchers all around the world (not just in the US) are working on this, so not all breakthroughs are going to be made in the US. That could change if facts come to light, but at the moment it’s not really useful to speculate on something that is in essence entirely made up.

No I’m not trolling you.


Are you a Chinese military troll? The fact that China engages in industrial espionage is well known. So I’m surprised at your resistance to that possibility.


This thread reads like sour grapes to me. When people can’t compete and instead start throwing unfounded allegations, it’s not a good look.

Even OpenAI itself hasn’t resorted to these wild conspiracy theories.

Unless you’re an insider in these companies, you’re just like the rest of us, you know nothing.


Are you saying Chinese industrial espionage is not a well established fact?


Industrial espionage isn't magic. Airbus once stole basically everything Boeing had, but that doesn't mean Airbus could magically build a better 737 tomorrow.

China steals a lot of documentation from the US but in a tech forum you of all people should be very familiar with how little actual progress a bunch of documentation is towards a finished unit.

The Comac C919 still uses American engines despite all the industrial espionage in the world, because most actual engineering is still a brute-force affair of finding how things fail and fixing that. That's one of the main advantages SpaceX has proven out with their "eh fuck it, just launch and we will see what breaks" methodology.

Even fraud filled Chinese research makes genuine advancements.

Believing that China, a wealthy nation of over a billion people, with immense unity, nationality, and a regime able to explicitly write blank checks could only possibly beat the US at something by cheating is like, infinite hubris. It's hilarious actually.

I don't know if DeepSeek is actually just a clone of something or a shenanigan, that's possible and China certainly has done those kinds of things before, but to think it's the MOST LIKELY outcome, or to over rely on it in any way is a death sentence. OpenAI claims to have evidence, why do they not show it?


>>>Believing that China, a wealthy nation of over a billion people, with immense unity, nationality, and a regime able to explicitly write blank checks could only possibly beat the US at something by cheating is like, infinite hubris. It's hilarious actually

So this is the first time I’ve heard the Chinese regime being described in such flowery terms on HN - lol. But ok - haha


> exactly the same amount of supporting evidence

The evidence supporting offensive hacking is abundant in recent history; the number of things which have been learned from alien crash data is surely smaller by comparison to the number of things which have been learned from offensive hacking.


More to the point, offensive hacking is something that all governments do, including the US, on a regular basis.

However, there is no evidence this is how the data was obtained. Zero, zilch.

So it's a useless statement which only plays on people's bias against their hated nation state du jour.


That would require stealing the model weights and the code as OpenAI has been hiding what they are doing. Running models properly is still quite artistic.

Meanwhile, they have access to Meta models and Qwen. And Meta models are very easy to run and there's plenty of published work on them. Occam's Razor.


How hard is it, if you have someone inside with access to the code? If you have 100s of people with full access, it's not hard to find someone who is willing to sell it or do some industrial espionage...


Lots of ifs here. They need specific US employee contacts at a company that's quickly growing, and one of those needs to be willing to breach their contract to share it. That contact also needs to trust that DeepSeek can properly utilize such code and completely undercut their own work.

That's a lot of hoops when there are simply other models publicly available to use.


How big are the weights for the full model? If it's on the scale of a large operating system image then it might be easy to sneak, but if it's an entire data lake, not so much.
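For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes dense weights stored at a fixed width per parameter; the parameter counts are hypothetical and chosen only for illustration, since OpenAI has not published the size of its models.

```python
def weights_size_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate size of raw model weights in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Hypothetical parameter counts, not figures for any real OpenAI model.
for params in (7e9, 70e9, 700e9):
    for label, width in (("16-bit", 2), ("8-bit", 1)):
        size = weights_size_gb(params, width)
        print(f"{params / 1e9:.0f}B params at {label}: ~{size:,.0f} GB")
```

Under these assumptions a 70B-parameter model at 16 bits per parameter comes to roughly 140 GB, which is much larger than an operating system image but far smaller than a data lake.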


Devil's advocate says that we know that foreign (hell, even national) intelligence agencies attempt to infiltrate agents by having them become employees at any company they are interested in. So the idea isn't just pulled from thin air as a concept. I do agree that it is a big if with no corroborating evidence for the specific claim.


I doubt that many people have full access to OpenAI's code. Their team is pretty small.


Do you have ANY reason to believe this might be true, or is this 100% pure speculation based on absolutely nothing?


I discount this because OpenAI is pumping the whole internet for money, and Zuckerberg torrented LibGen for its AI. We cannot blame the Chinese anymore. They went through the crappy "Made in China" phase in the 80s/90s, but they mastered the art of improving stuff instead of mere cloning, and it makes the big companies angry which is a nice bonus.

IMHO the whole world is becoming crazy for a lot of reasons, and pissing off billionaires makes me laugh.


DeepSeek v2 and v2.5 were still very good but not on par with frontier models. How would you explain that?


I don't think you need to steal a model - you need training samples generated from the original, which you can get simply by buying access to perform API calls. This is similar to TinyStories (https://arxiv.org/abs/2305.07759), except here they're training something even better than the original model for a fraction of the price.
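As a minimal sketch of what "training samples generated from the original" means in practice: the snippet below pairs prompts with a teacher model's completions and serializes them as JSONL for later supervised fine-tuning. `query_teacher` is a hypothetical stub standing in for a paid API call, not a real client, and nothing here is taken from DeepSeek's actual pipeline.

```python
import json
from typing import Callable, Dict, Iterable, List


def query_teacher(prompt: str) -> str:
    """Hypothetical stand-in for an API call to the teacher model."""
    return f"[teacher answer to: {prompt}]"


def build_distillation_set(
    prompts: Iterable[str], teacher: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Pair each prompt with the teacher's completion."""
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize records one JSON object per line, a common fine-tuning format."""
    return "\n".join(json.dumps(r) for r in records)


prompts = ["Explain why the sky is blue.", "Sum the integers from 1 to 10."]
dataset = build_distillation_set(prompts, query_teacher)
print(to_jsonl(dataset))
```

The student model would then be fine-tuned on these pairs; that step is omitted here because it depends entirely on the training stack.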


I don't think we should discount it as such, but given there's no evidence for it, yet plenty of evidence that they trained this themselves surely we can't seriously entertain it?


Given the openness of their model, that should be pretty easy to detect. If it were even a small possibility, wouldn’t OpenAI be talking about it very very loudly?


I think people overestimate the amount of secret sauce needed to train these models. The reason AI has come this far since AlexNet is that most of the foundational techniques are easy to share and implement, and that companies have been surprisingly willing to share their tricks openly, at least until OpenAI decided to become evil hoarders.


We shouldn't discount a thing for which there is absolutely zero evidence? Sorry that's not how it works.


I really doubt it. If that's the case the US GOV is in serious shit. They have a contract with OpenAI to chuck all their secret data in there... In all likelihood they just distilled. It's a start up company that is publishing all of their actual advances in the open, with proof. I think a lot of people run to "espionage" super fast, when reality is, the US probably sucks at what we call AI. Don't read that wrong, they are a world leader obviously. However, there is a ton of stuff they have yet to figure out.

Cheapening a series of fact-checkable innovations because of the country of origin, when so far all that they have shown are signs of good faith, is paranoid at best and propaganda to support the billionaire tech lords saving face for their own arrogance at worst.


If the US government is "chucking all their secret data" into OpenAI servers/models, frankly they deserve everything they get for that level of stupidity.


https://openai.com/global-affairs/introducing-chatgpt-gov/

And don't forget the billions in partnerships...


ChatGPT, please complete a memo that starts with: "Our 5 year plan for military deployments in southeast Asia are..."



Can't wait for gpt gov to hallucinate my PII!


Probably more like specialized tools to help spy on and forecast civilian activities more than anything else. Definitely with hallucinations, but that's not really important. Facts don't matter much these days...


But remember: we cannot fire anyone over this because then we're riding with Hitler /s

I can see why people refuse to pay taxes.


Can you explain at a technical level how you view this as necessary for the observed result?


I'd be perfectly fine with China stealing all "our" shit if they just shared it.

The word "our" does a lot of heavy lifting in politics[0]. America is not a commune, it's a country club, one which we used to own but have been bought out of, and whose new owners view us as moochers but can't actually kick us out (yet). It is in competition with another, worse country club that purports to be a commune. We owe neither country club our loyalty, so when one bloodies the other's nose, I smile.

[0] Some languages have a notion of an "exclusive we". If English had such a concept, this would be an exclusive our.


This comment made me realize we don’t have a pronoun for n-our or x-nour


[flagged]


> based purely on racial prejudices

I don't think that's what the parent was getting at. The US and China are in an ongoing "cyber war". Both sides of that conflict actively use their computers to send messages/signals to other computers, hoping that the exploits contained in those messages/signals can be used to exfiltrate data from and/or gain control of the computer receiving the message. It would really be weird to flatly discount the possibility that some OpenAI data was leaked, however closely guarded it may be.


I flatly discount the possibility because OpenAI can't produce evidence of a breach. At best, they'd rather hide the truth than admit a compromise. At worst they show incompetence that they couldn't detect such a breach. Not a good look either way.


> It would really be weird to flatly discount the possibility that some OpenAI data was leaked, however closely guarded it may be.

It’s even weirder to raise it as a possibility when there is literally nothing suggesting that was even remotely the case.

So if there is no evidence nor even formal speculation, then the only other reason to suggest this as a possibility would be because of one’s own opinions regarding Chinese companies. Hence my previous comment.


> Because that would be jumping to conclusions based purely on racial prejudices.

Not purely. There may be some prejudice, but look at Nortel[1] as a famous example of a situation where technological espionage from Chinese firms wreaked havoc on a company's fortunes and technology.

I too would want to see the evidence and forensics of such a breach to believe this is more than sour grapes from OpenAI.

[1] https://financialpost.com/technology/nortel-hacked-to-pieces


This is ahistorical.

Nortel survived the fucking Great Depression. But a bunch of outright fraudulent activity by its C-suite to bump stock prices led to them vastly overstating, overplanning, and over-committing resources to a market that was much, much smaller than they were claiming. Nortel spent billions and billions on completely absurd acquisitions while they were making no money, explicitly to boost their stock price.

That was all laid bare when the telecom bust happened. Then the great recession culled some of the dead wood in the economy.

Huawei stealing tech from them did not kill them. This was a company so rotten that the people put in charge right after this huge scandal put the investigative lights on them IMMEDIATELY turned around and pulled another scam! China could have been completely removed from history and Nortel would have died the same. They were killed by the same disease that killed and nearly killed a lot of stuff in 2008, and are still trying to kill us: Line MUST go up.


Nobody is accusing them, just stating it’s a possibility, which would also be true if they were an American or European company. Corporate espionage is just more common in China.


I can't? I am going to make that accusation if we're talking about the govt of China.


> based purely on racial prejudices.

At some point these straw men start to look like ignorance or even reverse racism. As if (presumably non-Han Chinese) Americans are incapable of tolerance.

There are plenty of Han Chinese who are citizens of democratic nations. China is not the only nation with Han Chinese.

America, for instance, has a large number of Asian citizens, including a large number of Han Chinese. The number of white, non-Hispanic Americans is decreasing, while the number of Asian Americans is increasing at a rate 3x the decrease in whites. America is a melting pot and deals with race relations issues far more than ethnically uniform populations. The conversations we have about race are because we're so exposed to it -- so racially and culturally diverse. If anything, we're equipped to have these conversations gracefully because they're a part of our everyday lived experience.

At the end of the day, this is 100% a geopolitical argument. Pulling out the race card any time China is criticized is arguing in bad faith. You don't see the same criticisms leveled against South Korea, Vietnam, Taiwan, or Singapore precisely because this is a geopolitical issue.

As further evidence you can recall the conversations we had in the 90's when we were afraid Japan would take over. All the newspapers wrote about was "Japan, Japan, Japan" and the American businesses they were buying up and taking over. It was 100% geopolitical fear. You'll note that we no longer fill the zeitgeist with these discussions today save for a recent and rather limited conversation about US Steel. And that was just a whimper.

These conversations about China are going to increase as the US continues to decouple from Chinese trade. It's not racism, it's just competition.


That’s a lot of mental gymnastics you’ve pulled to try and justify baseless accusations.


It's pretty clear he wasn't defending the accusations and simply stating the other comment was clearly a strawman.


This is cultural prejudice, not racial.


[flagged]


I got a kick out of this headline yesterday:

"Meta is reportedly scrambling ‘war rooms’ of engineers to figure out how DeepSeek’s AI is beating everyone else at a fraction of the price"

https://fortune.com/2025/01/27/mark-zuckerberg-meta-llama-as...


If it doesn't work, there's no need to even defend against it. Idc if someone wants to call me racist.


[flagged]


It's not good to talk about other HN users that way, and anyway I don't think it's the case this time


There are users and there are trolls. There is nothing racist in calling a government of a superpower interested and involved in the most revolutionary tech since the Internet.


Agree about the last part, but that doesn't make someone a troll


It does for me. Not sure what your definition of troll is.


It used to mean someone who's trying to enrage people by baiting ("trolling"), and now it can also mean someone arguing in bad faith. And Chinese troll I guess means someone doing this on behalf of the Chinese govt.


Yup we agree then. Claiming an argument to be racist is a bad faith attempt at guilt tripping Americans; a form of FUD and whataboutism. It is not done by normal users, they don’t need it.


Or it can just be a normal user who's wrong this time. He looks like a normal user. In theory it could all be a cover, but that'd be ridiculous effort just for HN boards. Throwing those accusations around will make this place more like Twitter or Reddit.


There’s ordinary xkcd wrong on the internet and there’s repeating foreign nation state propaganda lines. Doing it in good faith does not make it less bad.


No reason why you were downvoted. This is completely valid.


There’s no evidence.

We can talk about hypotheticals all we want, but who wants to do that?


There's no evidence for almost any of this, and even when there is, we won't see it. Just like 95% of posts on here.


Belief that the CCP is behaving poorly isn’t racial prejudice, it’s a factual statement backed by a mountain of evidence across many areas including an ongoing genocide.

Extending that to a new bad behavior we don’t have evidence for is pure speculation, but it need not be based on race.


Yea, but I think the OP's point is something along the following lines. Not everything you buy from China, or every person you interact with from China, is part of a clandestine CCP operation. People buy stuff every day from Alibaba and it's not a CCP scheme to sell portable fans or phone chargers. A big chunk of the factories over there are US funded after all... Just like how it's not a CCP scheme to write a scientific paper, or create an ML model.

Similarly, I see no evidence (yet) that DeepSeek is a CCP-operated company, any more than any given AI startup in the US is a three-letter agency's direct handiwork or a US political party directive. The US has also supported genocides and a bunch of crazy stuff, but that doesn't mean any company in YC is part of a US government plot.

I know of people who immigrated to China, I know people who immigrated from China, I went to school with people who were on visas from China. Maybe some of them were CCP assets or something, but mostly they appeared to me to be people who were doing what they wanted for themselves.

If you believe both sides are up to no-goodery, that flies in the face of the OP's statement. If you think it's just one, and the enemy is in complete control of all of its people doing all of their commerce, then I think the OP may have a point.


Absolutism (“Every person”, “CCP operated”, etc) isn’t a useful methodology to analyze anything.

Implying that because something isn’t clandestine it can’t be part of a scheme ignores open manipulation which is often economy wide. Playing with exchange rates or electricity subsidies can turn every bit of international trade into part of a scheme.

In the other direction some economic activity is meaningfully different. The billions in LLM R&D is a very tempting target for clandestine activities in a way that a cheap fan design isn’t.

I wouldn’t be surprised if DeepSeek’s results were independent and the CCP was doing clandestine activities to get data from OpenAI. Reality doesn’t need to conform to narrative conventions; it can be really odd.


I completely agree with you and apologize for cheapening both the nuance and complexity where I did.

My personal take is this. What DeepSeek is offering is table scraps for the CCP's actual ambitions with what we call AI. China's economy is huge on industrial automation, and they care a lot more about raw materials and manufacturing efficiently than, say, the US does.


It’s downvoting blatant propaganda.


Basically, without some kind of shred of evidence, it is completely chauvinist to make this accusation.



