
Kaspersky: Duqu Trojan uses 'unknown programming language' - Slimy
http://www.zdnet.com/blog/security/kaspersky-duqu-trojan-uses-unknown-programming-language/10625
======
scottdw2
The payload could have been modified (to obfuscate its origin / source
language) using a product named codesurfer/x86.

[http://www.grammatech.com/research/products/CodeSurferx86.ht...](http://www.grammatech.com/research/products/CodeSurferx86.html)

If it has access to source code, it can instrument the build process and
obtain disassembly of high enough quality to support rewriting. Using its
Scheme API you can modify the CFG of each procedure directly, serialize
the rewritten parts out as nasm, and even relink with the object files you
don't have source for.

It works with any build system, and supports gcc / as / ld and cl / link.

So it may not have actually been written using a custom pl.

~~~
muyuu
As a part-time Schemer this does not surprise me... Schemers have a tendency
to craft their own languages. It's only natural.

~~~
bhrgunatha
But surely the compiler (assuming a compiler is used) would convert the new
high level language into regular Scheme primitives - I think it's unlikely
that the result wouldn't be identifiable.

~~~
scottdw2
No... the product allows you to write scripts in Scheme to manipulate its
machine code IR database, then spit out the machine code as nasm assembly,
assemble it, and run the appropriate linker in the same way that was
used to produce the original exe. Scheme is used as a macro language. So you
use Scheme to say: change the code at EA 0xdeadbeef from a mov to a jmp. You
can reorder functions, insert and remove code, etc. It works because it has
very high quality disassembly based on observing compiler and linker
invocations and introspecting the artifacts involved.

~~~
bhrgunatha
Ahh, that makes more sense, I thought it meant simply creating a higher level
language from Scheme rather than manipulating the last stage(s) of producing
the binary.

------
anigbrowl
_The company has named it the Duqu Framework_

I am confident that within a week there will be 3 front page posts on HN along
the lines of 'Why I use Duqu and you should too'.

~~~
rmason
Within a month there will be a Dice job listing asking for five years of Duqu
experience.

~~~
est
TFA

> Duqu and Stuxnet components date to 2007

~~~
getsat
That's the joke, e.g., "Ruby on Rails expert with 10 years of experience
wanted"

~~~
tlrobinson
Amusing joke, yes, but the parent was pointing out that 5 years experience in
Duqu framework might not be impossible. 2012 - 2007 = 5

~~~
getsat
Ah, I misunderstood. Sorry.

------
runn1ng
I will be repeating a notion I read on YCombinator elsewhere - but I, too,
find it incredibly cool that we live in a time when wars are fought online
like that.

We have online revolutionaries, anarchists and REAL nationwide revolutions
started on online networks (talking about the Arab Spring here); we have FBI
agents looking through IP addresses on IRC networks to catch a small group of
bragging attackers; we have an invisible army of Chinese hackers that no one
can identify, only that they are _really good_; some unknown entity making
amazingly well-made and well-thought-out trojans like Stuxnet and now Duqu,
which seem to come right from the pages of some hyperbolic comic book; and,
last but not least, Russian mafia lords employing Zeus trojans and whatnot to
build botnets that mine bitcoin, a purely digital currency.

It's an amazing world we live in. Can't wait to see what the future will bring.

~~~
adriand
You seem to be pretty enthusiastic about rather worrying and even disturbing
developments. This is not a science fiction novel, this is real life. One day
it is Israeli hackers destroying Iranian centrifuges, perhaps the next day it
will be nuclear reactor facilities that are sent into meltdowns.

~~~
runn1ng
As I think about it, you are probably right.

And yet, I can't help myself but watch in fascination as all this happens.
Maybe it's because this time, the war is fought with means and tools I
understand (if only a little)? Maybe.

Maybe it has something to do with the morbid fascination people have with
anything destructive - World War II books and movies still sell like
hotcakes, while no one actually wants to repeat the world war.

~~~
firefoxman1
I, for one, appreciate your enthusiasm. I mean, it's just amazing to look at
how far man has advanced from just 100 years ago.

And for me, it's definitely not about the destruction. I just think it's
awesome to realize just what humans are capable of. Now, humans are also
capable of some ridiculously destructive things. But the problem with
@adriand's point of view is that it's as if there's something to be lost from
all of Man's destruction. When really, what significance does Earth have in
the Universe anyway?

~~~
archangel_one
It's pretty significant to us, because we're here :) Seriously, from our point
of view, including adriand's, there is something to be lost from Mankind's
destruction because it would be the end of our entire species. From the
perspective of the universe as a whole, it may not be significant, but it
seems eminently sensible for humans to be concerned about it.

Also, it is possible that Earth _is_ significant to the universe because it's
the only place where intelligent life has arisen. I don't expect that this is
actually the case, but so far we have no evidence to the contrary, and if it
were true it would be tragic if we wiped ourselves out by playing with viruses
and nuclear technology.

~~~
JVIDEL
500 years ago all human settlements on Earth had always been agrarian
economies with different levels of development but primitive nonetheless,
until the industrial revolution started.

What if it had never happened? Now we know there were previous attempts at
industry, like steam engines and chemistry, but there was always a war, a
drought or some other disaster that destroyed the framework where said
developments were made, and sometimes even killed the people making them
(Archimedes for example).

What if the universe is just like that? What if we're the most advanced
species and all the aliens out there are either animals or still haven't even
figured out how to build clocks or engines?

If that's the case then all the knowledge that ever existed would die with us.

------
viraptor
Unless I'm mistaken it looks like a very dynamic language. The screenshot
they're showing seems to point at initialisation of a new object, which
actually copies function pointers for each of its methods. That's not needed
for static languages which would just point to vtables. It looks like it
doesn't use real GC though - object's destructor is called right away on a
failed allocation. And the destructor is possible to change too...

So something like compiled javascript sans GC really. Or maybe like
precompiled python.

Doesn't seem very obfuscated either imho - there's a bunch of static data
copied in a series of moves. If someone really wanted to obfuscate those, this
looks like a fairly low hanging fruit: grab a list of 5+ mov-s of constants
and change them into xor+copy of a memory range to confuse pointer detection.

Can you see any more characteristics in that fragment?

~~~
shmageggy
The characteristics listed in the actual article (posted by computerbob:
[http://www.securelist.com/en/blog/667/The_Mystery_of_the_Duq...](http://www.securelist.com/en/blog/667/The_Mystery_of_the_Duqu_Framework))
also support this conclusion.

    
    
      -Everything is wrapped into objects
      -Function table is placed directly into the class instance and can be
       modified after construction
      -There is no distinction between utility classes (linked lists, hashes) and
       user-written code
      -Objects communicate using method calls, deferred execution queues and
       event-driven callbacks
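
A minimal sketch of the second point, written in Python rather than in whatever the framework actually uses: the constructor copies the method table into each instance, so one object's methods can be swapped without touching another's. The names `Node`, `slots` and `call` are hypothetical and not taken from the Duqu code.

```python
# Illustrative only: models "function table placed directly into the
# class instance" with a per-object dict of callables.

def default_describe(obj):
    return f"node {obj.value}"

def default_dtor(obj):
    obj.alive = False


class Node:
    def __init__(self, value):
        self.value = value
        self.alive = True
        # The constructor copies each function reference into the instance,
        # instead of pointing at one shared, read-only vtable.
        self.slots = {"describe": default_describe, "dtor": default_dtor}

    def call(self, name, *args):
        # Dispatch goes through the instance's own table.
        return self.slots[name](self, *args)


a = Node(1)
b = Node(2)

# Replace a method on one instance after construction.
a.slots["describe"] = lambda obj: f"patched {obj.value}"

print(a.call("describe"))
print(b.call("describe"))
```

Only `a` sees the replacement; `b` keeps the function it was constructed with, which is the behaviour a shared static vtable would not give you.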

~~~
LeafStorm
Reminds me of Smalltalk.

~~~
schwa571
Smalltalk objects typically don't have a per-object table of function pointers
(some dialects do, such as Self).

------
apaprocki
To me this just seems like someone wrote their own little OO system in C,
similar to how GObject works. The book Object-Oriented Programming with ANSI-C
by Axel-Tobias Schreiner[1] even has example types which use the nomenclature
'ctor' and 'dtor' as in the snippet of code they show (See section 2.5, page
17). It isn't hard to write a little class generator that writes out all this
boilerplate code[2] from a C++/C# like input file. The benefit is, of course,
the resulting code size and avoiding any linkage to the std C++ library.

[1] <http://www.planetpdf.com/codecuts/pdfs/ooc.pdf>

[2] <http://www.jirka.org/gob.html>
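
As a rough sketch of the class-generator idea (this is not gob itself, and not anyone's actual tooling): a few lines of Python can emit the C boilerplate for a struct that carries its own function pointers, plus a `ctor` that fills them in. The function `emit_class` and the output layout are assumptions made up for illustration.

```python
def emit_class(name, fields, methods):
    """Return C source for a struct whose instances carry their own
    function pointers, plus a ctor that fills them in.

    fields:  list of (c_type, field_name)
    methods: list of method names; each becomes a function-pointer slot
    """
    lines = [f"struct {name} {{"]
    for c_type, field in fields:
        lines.append(f"    {c_type} {field};")
    for m in methods:
        lines.append(f"    void (*{m})(struct {name} *self);")
    lines.append("};")
    lines.append("")
    # Forward declarations of the functions the user is expected to write.
    for m in methods:
        lines.append(f"void {name}_{m}(struct {name} *self);")
    lines.append("")
    # The generated constructor copies each function into the instance.
    lines.append(f"void {name}_ctor(struct {name} *self) {{")
    for m in methods:
        lines.append(f"    self->{m} = {name}_{m};")
    lines.append("}")
    return "\n".join(lines)


source = emit_class("Queue", [("int", "length")], ["dtor", "push"])
print(source)
```

The generated text is plain C with no dependency on a C++ runtime, which is the code-size and linkage benefit mentioned above.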

~~~
stianan
The inconsistent placement of the "this" argument in function calls seems to
support this being C. The vtable moving around would indicate that each class
layout is hand-written, though.

~~~
apaprocki
Yeah, the author made a point of noting "this" could be in a register or the
stack, but that to me just says "C". The functions moving around wouldn't
necessarily mean it is written by hand, though. There just needs to be some
rules governing the system and we don't know what those rules are (yet).

I would just be very surprised if this is anything other than some convention
developed on top of C.

~~~
dfox
Differing calling conventions can point to a combination of a hand-crafted
object system in C and a custom code generator that takes some high-level input
and produces machine code directly, without C in between. When you generate
machine code that does not directly interface with system libraries, it is often
useful to ignore the platform ABI calling conventions and make up your own.

~~~
stianan
Perhaps they use some kind of right-to-left fastcall convention. Or maybe they
are just unconventional, putting "this" at the end of the parameter list,
hence ending up in different registers or the stack depending on the number of
arguments?
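
To make that second guess concrete, here is a toy model in Python. It assumes a hypothetical convention (not a documented ABI, and not confirmed for Duqu) in which the first two arguments travel in ECX and EDX and the rest go on the stack. If "this" is appended as the last parameter, its location changes with the argument count.

```python
REGISTERS = ["ecx", "edx"]  # assumed register-passed slots (hypothetical)

def locate_args(args):
    """Map each argument name to a register or a stack slot, in order."""
    placement = {}
    for index, name in enumerate(args):
        if index < len(REGISTERS):
            placement[name] = REGISTERS[index]
        else:
            placement[name] = f"stack[{index - len(REGISTERS)}]"
    return placement

def where_is_this(explicit_args):
    # "this" is appended after the explicit parameters.
    return locate_args(list(explicit_args) + ["this"])["this"]

for n in range(4):
    params = [f"arg{i}" for i in range(n)]
    print(n, where_is_this(params))
```

With zero or one explicit argument "this" lands in a register, and with two or more it spills to the stack, which would look inconsistent to someone expecting a fixed `thiscall`-style register.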

------
computerbob
The actual blog post that ZDNet summarized, which is far more
interesting:

[http://www.securelist.com/en/blog/667/The_Mystery_of_the_Duq...](http://www.securelist.com/en/blog/667/The_Mystery_of_the_Duqu_Framework)

~~~
p0ss
and from the comments on that blog:

> The code your referring to .. the unknown c++ looks like the older IBM
> compilers found in OS400 SYS38 and the oldest sys36.

> The C++ code was used to write the tcp/ip stack for the operating system and
> all of the communications. The protocols used were the following x.21(async)
> all modes, Sync SDLC, x.25 Vbiss5 10 15 and 25. CICS. RSR232. This was a
> very small and powerful communications framework. The IBM system 36 had only
> 300MB hard drive and one megabyte of memory,the operating system came on
> diskettes.

> This would be very useful in this virus. It can track and monitor all types
> of communications. It can connect to everything and anything.

~~~
cag_ii
But this comment never mentions specifically what makes his suggestion
"look like" the given examples. I find it highly unlikely, given all the
networking/comm libraries available, that old, proprietary IBM code
would be used. Maybe there's something to it, but he certainly didn't mention
anything convincing.

More unusual (to me) is that there are two separate comments suggesting it may
be RPG (an OS400/iSeries language), which is very unlikely due to it not being
an OOP language therefore not having constructor/destructor functionality, and
otherwise a very high level language.

I'd guess some high level assembly, though this suggestion does look
interesting.

[http://www.securelist.com/en/blog/667/The_Mystery_of_the_Duq...](http://www.securelist.com/en/blog/667/The_Mystery_of_the_Duqu_Framework#c15313)

------
jd
Writing an unpolished programming language isn't that much work in comparison
to writing a complex virus. Especially low level languages where instructions
map pretty closely to the CPU instructions are easy to create.

I think it makes a lot of sense to write a custom programming
language/compiler because virus scanners tend to use fingerprints to recognize
dangerous pieces of code. So you want a compiler that deliberately obfuscates
the code it writes and also outputs instructions in such a way that it avoids
triggering known virus scanner fingerprints.

~~~
RodgerTheGreat
Agreed. Writing compilers is easy; The "hard" aspects of creating a new
language usually boil down to issues like tooling, documentation and support
libraries. In the case of a virus, the only users of the language are the
virus authors and the language can be highly tailored to the domain.

------
nivertech
They're doing it wrong.

Instead of trying to compile code examples in every candidate PL, they should:

1\. Crawl x86 binaries from the Internet / download sites / code archives.

2\. Write an M/R job which will disassemble them and look for the patterns
they discovered.

3\. Once patterns are found - investigate the source of the binary (i.e. who
uploaded it to the download site, maybe it was on a university FTP server or
maybe it's part of a commercial driver released by company XYZ).
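
Step 2 could look roughly like the following Python sketch: a byte signature with wildcards is matched against each binary (the map step) and hit counts are collected per file (the reduce step). The signature below is only an illustrative pair of `mov dword ptr [esi+N], imm32` encodings, not a pattern published by Kaspersky, and a real job would disassemble rather than match raw bytes. The file names and helper names are made up.

```python
import re

def compile_signature(signature):
    """Convert 'C7 06 ?? ?? ?? ??' into a bytes regex; '??' matches any byte."""
    parts = []
    for token in signature.split():
        if token == "??":
            parts.append(b".")
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    # DOTALL so that the wildcard also matches the 0x0A byte.
    return re.compile(b"".join(parts), re.DOTALL)

def scan_corpus(corpus, signature):
    """corpus: dict of file name -> raw bytes.
    Returns file name -> hit count for files with at least one match."""
    pattern = compile_signature(signature)
    hits = {}
    for name, blob in corpus.items():
        count = sum(1 for _ in pattern.finditer(blob))
        if count:
            hits[name] = count
    return hits

# Hypothetical signature: two consecutive stores of constants into [esi], [esi+4].
SIGNATURE = "C7 06 ?? ?? ?? ?? C7 46 04 ?? ?? ?? ??"

corpus = {
    "sample_a.bin": bytes.fromhex("90 C7 06 00 10 40 00 C7 46 04 20 10 40 00 C3"),
    "sample_b.bin": bytes.fromhex("55 8B EC 5D C3"),
}
print(scan_corpus(corpus, SIGNATURE))
```

Files that come back with hits are the ones worth tracing to their origin in step 3.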

------
jgrahamc
It might well be a macro language and not compiled. For example, HLA
(<http://en.wikipedia.org/wiki/High_Level_Assembly>) has many of the features
that are present here. It has its own library functions, objects/classes, and
produces code that looks a bit like it was compiled.

------
beza1e1
Looking at the assembly I see two things: (1) no name mangling. So either this
was lost in the decompilation/deciphering phase by the Kaspersky guys or the
mysterious language does not support method overloading. (2) the assembly
looks reasonably tight and optimized, so a solid code generation backend
(GCC, LLVM, MSVC, ...) was used.

Especially because of the name mangling I was thinking of Vala [0]. However,
Vala relies on GObject and probably does not work on Windows. Anyway, I guess
it's an OO language compiled to C in an intermediate step. This would explain
(2).

------
samstave
This is one of the most interesting netsec (if not THE most) questions of our
time.

We have what is effectively an alien virus, given how advanced it was, its
construction and spawning of Duqu, and its being written in an unknown language.

This is serious awesome cyberpunk stuff - but scary as hell at the same time.

With the revelation of Stuxnet and Duqu, NOBODY should think anything they
do/say online is safe.

~~~
driverdan
Not necessarily. Occam's Razor. It's more likely that they used something
obscure to compile or obfuscate the code or wrote a tool to do so. Creating a
new language just to write this seems highly unlikely.

~~~
joezydeco
Is writing something like this in pure assembly beyond the realm of
possibility for some reason? There are still a few dedicated people that code
on the metal for high level operating systems (Steve Gibson comes to mind
immediately)

~~~
ori_b
Writing something like this in assembly isn't impossible (or that difficult),
but the patterns look like something that you wouldn't be using if you were
writing in assembly.

My bet is that it's just C with a hand-rolled OO framework.

------
andrewcooke
is it dumb to suggest that someone who understands c++ (or other compiled oo
language) would/could write assembler in this way? just from reading the
description (things like variable locations of method tables, various
registers for "this", and lack of memory management) it sounds like it could
be handwritten, but structured in a similar way to c++.

~~~
dfox
The disassembly snippet looks like a typical handcrafted C/assembly object
system without real classes, but responses by the Kaspersky guys in the blog
comments seem to imply that it uses these objects even in places where it is
highly impractical when writing code manually. So it's possibly C/assembly
written by a typical hardcore Java programmer, but I find that highly unlikely.

------
RodgerTheGreat
If I read that properly, it sounds like Objects have their own function
tables- this would seem to indicate an object oriented language based on
prototypal inheritance.

------
tlrobinson
I've seen the theory that Stuxnet/Duqu was developed by a state thrown around
a lot, but what's the actual evidence?

It's not particularly hard to write a simple programming language. Worms are
very specialized pieces of code. It doesn't seem that crazy that someone would
create a language tailored for worm development.

~~~
JanezStupar
Considering the facts that are known (e.g.:
[http://www.digitalbond.com/2012/01/31/langners-stuxnet-deep-...](http://www.digitalbond.com/2012/01/31/langners-stuxnet-deep-dive-s4-video/))
it is nigh impossible for Stuxnet to be commissioned by a
nongovernmental entity.

The sheer complexity is off the charts. Stuxnet's sophistication and
complexity make it a cybernetic equivalent of the Manhattan Project. E.g.: as a
domain expert on industrial automation of this kind, Langner states that whoever
created Stuxnet _had to have a testing facility_. How many hackers do you know
who build uranium enrichment centrifuges to test their cyber attack tools?

Yeah thought so.

Edit: TL;DR: There are two pieces of evidence. No.1: Motive, No.2:
Sophistication and complexity.

------
nikcub
They learnt from watching all the research firms reverse engineer Stuxnet and
eventually stop it. What they are doing is obfuscating the output. If you look
at a default DLL or EXE build from VS it is amazing how much information is
included that helps you attach a debugger and work out how it works.

The authors learnt from the Stuxnet experience and I wouldn't be surprised if
they are testing their own worm using black-box reverse engineering tools
to figure out what the research guys will work out when they eventually find
it in the wild.

This has worked so well that Kaspersky think that the authors actually
invented a new language, when it is likely still just C++, some machine
generated code, some obfuscator tools (game makers have been using them for
years to stop crackers) and likely manually changing the outputted assembler.

~~~
jberryman
> The authors learnt from the Stuxnet experience and I wouldn't be surprised
> if they are testing their own worm using black-box reverse engineering
> tools to figure out what the research guys will work out when they
> eventually find it in the wild.

Don't they mention that these components were floating around in 2007?

~~~
nikcub
Where does it say that? All the references are to the 'Duqu Framework', which
they recently found. I may have missed something.

They also completely rule out C++, C etc. when what they should be ruling out
is C++ or C compiled with a standard VS compiler (or an easily recognizable
compiler). It is silly to completely rule out C++ and C just because they
don't immediately recognize the output and because it doesn't reference
anything else.

~~~
jberryman
bottom of TFA:

"Duqu was first detected in September 2011, but Kaspersky Lab believes it has
seen the first pieces of Duqu-related malware dating back to August 2007"

------
prajesh
MASM32

<http://www.infernodevelopment.com/introduction-masm32>

[http://www.opensc.ws/asm-sources/11120-masm-simple-memory-co...](http://www.opensc.ws/asm-sources/11120-masm-simple-memory-code-injection.html)

------
Yxven
Why would creating your own programming language for a virus be a good thing?
If viruses are the only thing written in this language, wouldn't the language
make it easier for the anti-virus companies to detect it without having to
worry as much about false positives?

~~~
cag_ii
It may not have been done for that reason alone. NIHS is pretty common among
hackers:

<http://en.wikipedia.org/wiki/Not_invented_here#In_computing>

It also seems to me that a new language for this exact purpose is unlikely;
however, it could very well be a proprietary or otherwise unknown language
that may have been built for another purpose (internal company,
domain-specific development, etc) and not often seen in this context.

------
Cieplak
Haskell or Lisp, perhaps?

~~~
sek
I really had to laugh. Cmon guys, this humor is ok on HN.

------
meatsock
are properly written code obfuscators able to throw off detection of their
originating language?

------
alan_cx
Please forgive, and correct, me if I have this wildly wrong:

This is referred to as Stuxnet 2. And the original was "proven" to have been
made to attack Iranian nuclear labs, and what not. Conclusion being that it
was made by some government agency. I suppose foil hat theory would point
fingers at CIA/NSA type people.

Assuming the above is correct, or correct enough, is it not surprising to see
what might be a new language for this virus, if it has a nation state's
resources behind it? If that is the case, what chance is there that anyone
will be able to crack this mystery?

~~~
yuvadam
The best guess, at least in Israel, has been for a while now that Stuxnet was
a joint Israeli/US effort.

I haven't heard any similar speculation about Duqu.

------
daeken
This makes fairly little sense to me. Why wouldn't one write such a virus
using straight ASM, or possibly write a VM in ASM and write the payload in the
VM's bytecode (this option makes it particularly easy to do metamorphic code,
though doing it with well-written ASM is also very possible)? It seems like
creating a custom language -- or hacking up compiled C++ or whatnot -- is a
bit of overkill considering that the basic tenets of virus writing are: keep
it simple, don't get caught; this wouldn't aid in either of those.

~~~
maaku
Perhaps because this is no simple virus? Perhaps because Stuxnet/Duqu is one
of if not the most sophisticated examples of cyber warfare* we have public
examples of? Perhaps because its compiled binary is a whopping half-megabyte
in size?

The amount of effort that went into Stuxnet is truly massive. It'd be no
surprise if a custom framework or even a domain-specific language were written
to support it.

* Usually I cringe at just hearing the phrase cyber warfare and at the ridiculous way in which it is used by media and the government. But looking at what Stuxnet accomplished ( _physically_ damaging or disabling critical components of the Iranian nuclear enrichment program), the term applies completely.

------
damiankennedy
Maybe they just embedded something like RTOS-32. The company that makes it
(www.On-Time.com) says it can be integrated with Visual Studio and used
alongside Visual C++.

------
darxius
What driverdan said makes sense. It is much more likely that they obfuscated
or masked their code instead of inventing an "unknown" programming language.
Even so, the obfuscation method used I'm sure would be pretty cunning (seeing
as these folk mean business).

It's interesting to see Kaspersky suggest that the state is behind this solely
based on some unknown code. Does anyone know why that would be a likely
conclusion on their part?

~~~
cag_ii
From what I gathered from the article, this doesn't appear to be obfuscated
since the purpose/method is transparent in the decompiled code, it simply
doesn't match the constructs you would expect to see in common programming
languages when decompiled.

------
vessenes
Apparently I am the only one who thought immediately "Goldman Sachs." My money
is on this being SLANG.

~~~
gaius
Slang is not compiled, and isn't very useful without SecDB behind it.

------
spullara
I like the idea that it is an OO language that forgoes its own runtime library
and instead uses the Win32 API as its native library. This would be a great
language to write viruses in -- perfect for just glueing together APIs and
doing some very small scale business logic without having to learn C++.

~~~
DaveMebs
Anyone writing a virus already knows C++.

~~~
spullara
Sorry, anyone who is writing the "business logic" for a virus. Someone else
might write the exploit.

------
wololo
related: an anti-reverse engineering tool at microsoft called "warbird"
(<http://forum.doom9.org/archive/index.php/t-130166.html>, search for warbird)

------
Marwan
To me it seems just Kaspersky are advertising their product, nothing more.

------
flabberghast
Well, given that Unit 8200 is up, and the U.S. Cyber Command is active this
looks exactly like what they'd do.

Just wait until they let the A.I. make the language so that it is not
human-decipherable.

~~~
GeogreHost
Relevant link:

<http://www.kurzweilai.net/ai-designs-its-own-video-game>

Pretty much, let the machine make the code.

------
ekm2
Someone in an earlier post had claimed it is MASM32

------
logn
ahh, that's assembly

------
petermcd
In my best Commander Tigh voice: "Looks like exactly the kind of code a Cylon
would write, to me."

