There are lots of good answers in this thread. In particular, someone took the time to look at the code, and that person wrote that it only took them minutes to find similarities that make my opinion difficult to dismiss.
About the lack of documentation, consider that only a tiny fraction of the kernel’s internals surface in the DDK. All our internal headers have a .w extension and are littered with directives like begin_ntddk, end_ntddk (and many others), so a large number of structures and many fields inside existing structures are hidden or replaced by a generic reserved field.
Macro names, parameters, etc. never appear in the compiled code. It is highly improbable (I studied math, “impossible” is not a word lol), almost surely impossible that a clean-room reimplementation ends up using macros for the same things, let alone macros with the same or similar names.
Also, consider that optimizing compilers emit code that is extremely hard to follow even for us. Heck, the debugger, together with the PRIVATE symbols that have the names of all private symbols, and the actual source code, has a hard time syncing the disassembly with the code when stepping through. While it is possible to write code that behaves similarly seen from outside, it is again implausible that the expression of this behavior results in code that looks nearly identical to the original. Consider that the names of local variables are never part of the binaries, only public symbols are. Also, consider that the compiler aggressively optimizes out variables to reduce memory accesses and holds values in CPU registers as much as possible, so those variables, while conceptually present, don’t really exist as such in the disassembled code. How to explain that a reimplementation comes up with the same variables, declared in the same order, when those variables are optimized out by the compiler?
What about in-line functions, both explicit and auto-inlined by the compiler? How do you even know there was a function in the first place, and how do you invent a name that is identical to the name of said function in the original code?
Funnily, I had a conversation with a very seasoned kernel engineer (I report directly to him) about ReactOS and my Quora reply. He told me the team looked into ReactOS some time ago and reached the exact same conclusions: impossible.
In particular, this person distinctly remembers a hack he implemented (I’m not going to reveal any details, but suffice to say it was in response to some assertion by some 3rd parties that something Microsoft declared in a court of law as very difficult). He explained the hack to me in full detail and, boy, hacky that was, and they found the same hack in ReactOS’s code, except that the presumed authors of that “clean room” implementation probably have no idea regarding why the hack was there.
Finally, “clean room” takes another sense when one knows that Alex (yes) worked for Microsoft until spring 2019 at least, as a contractor for a company called Cloudbase Solutions SRL. His Microsoft email address was email@example.com. I don’t know if he had access to the ntos code, or NTFS or anything else, but very close to home he was, for sure.
So to those who want to take my opinion to court, I say it’s a “careful what you wish for” type of thing, but again I’m not a lawyer, and in other regards ReactOS aligns with a very old version of the NT kernel. It is possible (my opinion only) that Microsoft does not care?
"Never" and "only public" are wrong in the statement above, because non public symbols were indeed released by Microsoft.
I guess you are young enough not to know that Microsoft accidentally did release some NT builds with the names of the internal variables, and such builds were intentionally made with less compiler optimizations, allowing for easier reversing. Such events of releasing the internal names resulted in some very interesting stories and statements:
"_NSAKEY was a variable name discovered in Windows NT 4 Service Pack 5 (which had been released unstripped of its symbolic debugging data) in August 1999 by Andrew D. Fernandes of Cryptonym Corporation."
Private symbols are not the only way of gleaning more information; other examples I can think of are:
* Checked builds (prior to Win10). These builds shipped de-optimized kernels (e.g. no inlines) typically with copious debug strings which gave away important details. For example I gleaned a lot of knowledge of ALPC MSRPC from the checked build of rpcrt4.dll from Windows 8.
* SDK/DDK headers, especially in the brave new world of insider previews with preview SDK/DDKs, there is sometimes information present which should not have been released, including "private" information. Again, a bit of a grey area.
* The private symbols MS do ship. For example a significant proportion of the COM runtime has private symbols, intentionally. You can extract from those a surprising amount of system call structure information.
I'd recommend watching Alex Ionescu's talk at OffensiveCon about how he does reverse engineering on Windows to see many of these things in action. https://www.youtube.com/watch?v=2D9ExVc0G10
I'm not saying any of this would make it a clean-room re-implementation but to say ReactOS cannot possibly have been reverse engineered without just up and copying source isn't true.
It is very possible that some private symbols were part of some leak, but stolen data does not qualify as “shipping” :)
Again, I stand behind my opinion. I eyeballed some of the code side-by-side and there were portions where I could literally see a line-by-line correlation, which I can hardly explain.
Then if reversing the kernel is so doable using legitimate means, why is ReactOS still largely stuck in the early 2000’s, coincidentally where the major leaks happened?
However you seem to want to claim the only place those symbols can come from is being stolen. Of course in this case you use leak as a synonym for stolen, but leak can just as much mean they were released accidentally by the owner; MS can't steal their own private symbols and release them on the web. I'm sure there are some symbol files traded in private scenarios which were actually taken through non-public means, but there have been actual incidents of public release of private symbols.
I'm not trying to claim that ReactOS is clean, I have no skin in the game from a project or user perspective. For all I know it might have lifted significant portions of its code from stolen source code or the WRK (which isn't stolen in so much as used without permission, which I'd regard as a totally different thing). I do however take exception to the typical software engineer's view that there are some things which cannot be reverse engineered into an almost similar form.
As to why ReactOS is stuck in the early 2000s, it could be because of all the source code which was stolen and put wholesale into the project. Although if that was the case I'd have expected MS would have sued the living shit out of the project by now. It could also be because Windows was and is a very complex OS with many layers which, if you're trying to re-implement with a team of 10s to 100s versus 1000s, is going to take a lot of time. It seems unlikely that the project would spend the millions of man hours to create the abomination that is UWP.
Perhaps the best way to determine if ReactOS is unclean is for MS to open source the Windows Kernel, hell why would you even need ReactOS then :-)
“Windows NT 4 Service Pack 5 (which had been released unstripped of its symbolic debugging data)”
I’ve seen private symbols for sql server with the guid to switch editions published on the public symbol server for at least 6 months before they were pulled.
Full releases and service packs typically are stripped very well but if you are saying that no private symbols have been published to the public symbol servers then you are incorrect.
The only product that has been effective at stripping symbols traditionally has been office, they were always stripped if you could even get hold of them which was unlikely.
Don’t forget also you could download the checked windows builds which were very open.
Because they don't care about or need the newer MS stuff and also don't have the resources either.
Also they can exploit Microsoft's good record of backward compatibility: once you have good enough lower API compatibility, you can just install a lot of newer MS tech directly on top of it.
Even if that is the case, it's an incredibly poor idea to use them, so that the code ends up with spurious similarities in spite of being (otherwise) cleanly developed.
Because the former is fairly well documented and the latter doesn't seem right in the context of this discussion. If MS themselves messed up and published the symbols through an official channel that's fair game IMO. Although obviously IANAL etc... I'm talking from an ethical perspective, not a legal one.
I don't know much about ReactOS or the NT kernel but we have this type of controversy regularly in the emulation scene and while sometimes it's true that people reuse docs they shouldn't have, a lot of the time people underestimate the skill and cunning of reverse-engineers to figure out how things work without having access to any restricted information.
I don't really subscribe to a belief of absolute morality, but in the context of the discussion, I think that no matter how you got access to that code, if you say that you are reverse engineering it, then copying an implementation is not doing what you are saying you are doing (as well as being copyright infringement).
I think he's saying that even having access to leaked or accidentally released originals is not implicit permission to use it freely. Otherwise any piece of software that was ever legitimately released would be fair game, just throw it at a decompiler and profit.
If you're making a clean room design having so many similarities to the original is unlikely to happen accidentally.
Anybody implementing a clean room design should theoretically have no prior knowledge of the original's inner workings. The specs are written by one person, checked to not include any of the original material by a second one, before being passed to a third to be implemented.
From far enough a piece of wire and an isolation transformer do the same thing. The secret sauce is in that isolation, you can't just shunt it and pretend it's the same.
There IS public documentation for the NT kernel internals. It is called the Windows Internals Book (https://amzn.to/2xCQla5) and has survived into its 7th Edition.
But still, in 2006 ReactOS already had its 0.3.0 version and 8 years of existence ^) And the book Microsoft Windows Internals, Fourth Edition had already been released in 2004.
Now, Anders has been there a while already...
Looking at how slowly that project is moving though I think your conclusion is right: They don't care. In some other comment you mentioned how the project seems to be stuck in the 2000/XP area. How relevant does that make this project today? But whether it's based on stolen code or not, it seems like the right choice to focus on getting that completed before moving on to everything that came after. And after all, Windows isn't just the kernel. They have to implement a lot of user space too, because software out there might assume it's available and behaving in very certain ways. Just an explorer clone sounds like a mammoth task on its own.
I still find the project very interesting and overly ambitious, even if code was stolen. I think at their current pace, reactos won't become usable before operating systems became largely irrelevant and everything runs in the cloud. Everyone still working on it today is probably just doing it for the challenge.
Of course it’s not a proper clean-room copy. This is immediately clear to anyone who’s worked on NTOS or who has a source code license.
It’s been obvious for years. The only possible conclusion is that MSFT doesn’t care.
You are the blackest pot in the history of computer software.
Not that microsoft has never crossed any lines - even the naming of the Windows operating system seems morally dubious - but in this case I don't see any issue with what they've done.
Not that I work for MS, but I used to work on NT graphics drivers and had a copy of the leaked NT kernel source for a while (strictly for personal interest!). This is so very like what I remember.
IMHO such blatant theft shouldn't really go unpunished.
You make a theoretically compelling argument like the "hack" you describe above and then leave out the only part that really matters where you prove it. It is entirely likely that you are simultaneously entirely correct and entirely unable to prove it due to company policy but this leads us to the next logical question.
If you knew you couldn't prove it why did you open your mouth? It seems the only result will be negative PR.
I have no intention of reading through millions of lines of code from a legally questionable source to make your point for you.
How about you do so and post an in depth analysis?
I'm thinking they'd disapprove. That's just a guess, maybe they think it's fine and great. However, it may be worth at least considering another possibility.
Does your company want random technical employees, making non-technical statements, about how people are "ripping off" MS?
It's your job to raise the flag internally and help people understand up and down the chain, and across functional lines the context and perceived severity of the issue.
It's your job to write a blog post about how great your containers are architected. Not be a lone wolf, self appointed spokesperson using inflammatory language on issues with potential legal implications.
Please don't take this comment in the vein of you're just an engineer shut up and code, in fact you have a critical role as someone with domain expertise.
The point is when you stumble onto a high visibility cross-functional issue, I've found on sensitive topics many organizations seem to appreciate when someone reacts by coordinating, discussing, and facilitating a cross functional, unified response with one designated voice, that acts in the best interests of the company.
Secondly you object to me presuming I know what his responsibilities are. I don't claim to know what he codes. However I'm able to make a pretty good guess about what the scope of a MSFT senior developer's job is and this kind of thing is not even close, unless it's a special case worked out in advance with others. It wasn't worked out in advance in this case because he says that he made the comments before getting feedback from more senior people on his team.
Finally I assume some of the down votes have mistakenly conflated my comments with trying to hide information behind corporate walls.
I advocate hiding nothing - and fully support the new generation of companies who believe in transparency and ethics.
Being transparent and ethical has nothing to do whatsoever with letting developers try and make legal decisions when it's not their area of expertise. It also doesn't mean that a company shouldn't work together across departments to try to decide the right thing to do.
Good management and coordination between roles don't preclude transparency or doing the right thing in any way, shape or form.
> First off I'm not policing anything, because, how would I have the authority to do that?
Maybe policing isn't the right word, but your earlier post was fairly condescending to axelriet. "I think they'd disapprove... It's your job to do this... It's your job to do that..."
Moreover, I don't believe in such a philosophy between any two people, so to the extent it is I regret the tone of the comment.
If it makes any difference, the true source of the tone was based on thinking similar to, holy crap that person is backing up into a stove someone speak loudly and quickly... There is no evidence for you to believe that, but I hope you will, somehow, decide the intent was nothing more.
Appeal to authority may be a classical logical fallacy, but it's a highly effective one and is used by many people to infer the merit of an opinion rightly or wrongly so. Yeah, I know it's weird, that people would associate a comment one person posts with the voice of a trillion dollar company. Just happens sometimes.
Moreover I have nothing to gain by trying to "make this into" anything.
Believe it or not, I actually hoped it could be of some help to you as another data point - as far as all the things you take into consideration when choosing how to exercise your bill of rights free speech, which has nothing to do with your corporate free-speech. The latter of which unfortunately has cost many people dearly.
With respect to Microsoft, I'm certainly not making any comment about them as a company one way or another. That wasn't the point because, this concept applies to most large similar tech companies.
Things do change rapidly at such companies. Fwiw, I have done your job, your boss's job, been the janitor, and a few other things there. Things may be different now, but I'm not making the most possible uninformed guesses just to mess with you.
I sincerely hope the truth is known and that there are no negative repercussions for you personally.
Look at my HN profile and contact me, so I can send you an email and ask you something. Or, put a way to contact you in your HN profile.
Shutting down discussion behind lawyers and bland public statements should be discouraged.
It's simply impossible to say, "no but this openness is good for the company" in an informed way. But we can certainly hope the company comes to this conclusion and encourages openness.
If you have good people in those positions in your organization, they will come together and make good decisions.
If you don't have the right people in those positions, then that's not going to be a good thing irrespective of my advice.
Transparency doesn't require anarchy and anarchy doesn't buy transparency.