
Oracle plans to dump risky Java serialization - s-macke
https://www.infoworld.com/article/3275924/java/oracle-plans-to-dump-risky-java-serialization.html
======
xg15
I applaud the general decision, but I wonder what will happen to existing
blobs of serialized data. Will there be any migration tools provided?

Apart from the horrible security, what annoyed me the most with serialization
is the lack of control you have over the process. There doesn't seem to be a
way to access serialized data as a simple parse tree or record sequence - you
_have_ to construct objects of the actual classes. If only one class is not
available or has breaking changes, there is no (built-in) way to access
_anything_ inside the blob.

This is particularly fun if you want to refactor things. Suddenly package
names, class names and names of private fields (!) are part of your public
interface.
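
A quick way to see this for yourself, as a minimal sketch: serialize an object and look for the names in the raw bytes. `Account` and `secretBalance` are made-up names for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StreamNamesDemo {
    // Hypothetical class; only its names matter for this demonstration.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private int secretBalance = 100;
    }

    // Serializes a value with the built-in mechanism and returns the raw stream bytes.
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // True if the given name appears as plain text somewhere in the stream.
    static boolean streamMentions(byte[] stream, String name) {
        return new String(stream, StandardCharsets.ISO_8859_1).contains(name);
    }

    public static void main(String[] args) {
        byte[] stream = serialize(new Account());
        System.out.println("class name in stream:    "
                + streamMentions(stream, Account.class.getName()));
        System.out.println("private field in stream: "
                + streamMentions(stream, "secretBalance"));
    }
}
```

Renaming either the class or the private field changes what a reader of old blobs has to match against.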

So if we could drop reflection/unsafe-based serialization and instead just got
a simple parser/writer for java's binary object graph format, I'd be very
happy.

~~~
dnomad
There's really no such thing as a "binary object graph format." Objects are
not data. Code is not data. It's why serializing objects is so very
complicated and dangerous. Serialized "data" is a _program_ that can do
everything a normal Java program can do. In the 90s we called this feature
"mobile code" and thought it might be the future of distributed computing.
Today we call it a "code injection attack" and recognize how enormously
dangerous it is.

If you want data then define a schema and write/read your data. Serialization
should really never be used and certainly not for long-lived data storage.
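
As a rough sketch of what "define a schema" can look like with nothing but the JDK: a version byte followed by fixed, explicitly written fields. The `Quote` record and its layout are invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class ExplicitSchemaDemo {
    // Schema v1 (hypothetical): 1 byte version, UTF string symbol, 8 byte price in cents.
    static final int FORMAT_VERSION = 1;

    static final class Quote {
        final String symbol;
        final long priceCents;

        Quote(String symbol, long priceCents) {
            this.symbol = symbol;
            this.priceCents = priceCents;
        }
    }

    // Writes only the fields named in the schema, in a fixed order.
    static byte[] write(Quote quote) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(FORMAT_VERSION);
            out.writeUTF(quote.symbol);
            out.writeLong(quote.priceCents);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the same fields back; no class names are looked up and no hooks run.
    static Quote read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int version = in.readUnsignedByte();
            if (version != FORMAT_VERSION) {
                throw new IllegalArgumentException("unsupported format version " + version);
            }
            String symbol = in.readUTF();
            long priceCents = in.readLong();
            return new Quote(symbol, priceCents);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Quote copy = read(write(new Quote("ACME", 12345L)));
        System.out.println(copy.symbol + " " + copy.priceCents);
    }
}
```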

There is one place where serialization becomes useful and that is for storing
objects off-heap, IPC, and for short-term storage (ie snapshots like Android's
parcelable). For these cases I'd like to see the JVM embrace not just
immutable value objects but full-fledged structs that have a well-defined
memory layout. You can do this today using off-heap Buffers and interfaces but
language support is always good so there's a universal standard that everybody
can build upon. Once that's in place there'd be no need to ever use
serialization.
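
A minimal sketch of the off-heap Buffer approach, assuming a made-up 20-byte "order" record with fixed field offsets. `OrderView` is a hypothetical flyweight that is pointed at different records instead of allocating one object per record.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class OffHeapOrderDemo {
    // Hypothetical record layout, chosen only for this example.
    static final int ID_OFFSET = 0;        // long, 8 bytes
    static final int QUANTITY_OFFSET = 8;  // int, 4 bytes
    static final int PRICE_OFFSET = 12;    // double, 8 bytes
    static final int RECORD_SIZE = 20;

    // Allocates zero-filled off-heap storage for the given number of records.
    static ByteBuffer allocate(int records) {
        return ByteBuffer.allocateDirect(records * RECORD_SIZE).order(ByteOrder.LITTLE_ENDIAN);
    }

    // A reusable view over one record at a time; it holds no data of its own.
    static final class OrderView {
        private final ByteBuffer buffer;
        private int base;

        OrderView(ByteBuffer buffer) {
            this.buffer = buffer;
        }

        OrderView moveTo(int recordIndex) {
            base = recordIndex * RECORD_SIZE;
            return this;
        }

        void set(long id, int quantity, double price) {
            buffer.putLong(base + ID_OFFSET, id);
            buffer.putInt(base + QUANTITY_OFFSET, quantity);
            buffer.putDouble(base + PRICE_OFFSET, price);
        }

        long id() { return buffer.getLong(base + ID_OFFSET); }
        int quantity() { return buffer.getInt(base + QUANTITY_OFFSET); }
        double price() { return buffer.getDouble(base + PRICE_OFFSET); }
    }

    public static void main(String[] args) {
        OrderView view = new OrderView(allocate(2));
        view.moveTo(1).set(7L, 3, 9.5);
        System.out.println(view.id() + " " + view.quantity() + " " + view.price());
    }
}
```

Because the layout is just bytes at known offsets, the same buffer can be handed to another process or written to disk without any object graph being involved.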

That said I can't imagine Oracle will simply remove support for object
serialization. It may be kicked out of the "core" JDK and become an optional
module. The classes may be deprecated. But the functionality likely isn't
going anywhere in the next ten years.

Even if they did remove it nobody should be using the standard object
serialization anyways. If you're going to use serialization (and you
shouldn't) then you should absolutely be using FST [0].

[0] [https://github.com/RuedigerMoeller/fast-serialization](https://github.com/RuedigerMoeller/fast-serialization)

~~~
resource0x
There's no such thing as "data". Every piece of "data" eventually gets
interpreted, thus becoming a "command" in some level of abstraction. The core
of every security vulnerability is the belief that "this is just a piece of
data".

~~~
ableal
A parable for that viewpoint:
[https://en.wikipedia.org/wiki/BLIT_(short_story)](https://en.wikipedia.org/wiki/BLIT_\(short_story\))

------
rad_gruchalski
There are so many great alternatives available. It's a no-brainer. I applaud.

[edit] so why the downvote? We have YAML, JSON, protobuf, Thrift, Avro. Yup,
these serialize "contents" rather than "structure + contents" but one gets
interop with other technologies for free. Every tech mentioned above is so
simple to use that removing Java serialization is a no-brainer.

~~~
MichaelMoser123
That would also kill RMI (at least JRMP), and that too would be a good thing,
RMI doesn't even pass through a router (unless you do RMI-IIOP - and that is
very different from regular RMI - JRMP)

For java backwards compatibility used to be very important, so this is big
news for the platform.

I think this madness of remoting bytecode along with serialised objects was
once a very important part of RMI/serialisation. Back in the thin client java
days this was supposed to be the way to distribute code across a network link,
and security was not the very first priority in the nineties (beats me why
they made JRMP non-routable).

~~~
parasubvert
If you mean that JRMP leaks internal IP addresses in the protocol itself for
callbacks and thus can't be NATted, that can be fixed with a couple properties
to force the use of outer-IP DNS:

    java.rmi.server.hostname=myhostname.com
    java.rmi.server.useLocalHostname=true

You also can tunnel JRMP through HTTP - there was a CGI script called java-rmi
dating back to the late 1990's that I think was still distributed through Java
8 (!), and also an RMI Servlet Handler which was a bit more robust/performant.
Spring also still has the RmiServiceExporter and HttpInvokerServiceExporter.

I remember building Java applets and servers that did fixed income quotes &
bond trading systems via streamed encrypted serialized Java objects circa
1999-2000. What a security nightmare, but no one knew better.

I feel old.

~~~
MichaelMoser123
If I remember correctly, you still need to open a port in the router so that
the server can call back the client. Now good luck with persuading anybody to
do this kind of insanity.

------
kpcyrd
In case anybody is wondering, this is the attack vector that was used for the
equifax hack and is also used in a fair number of remote code execution
exploits for java servers.

~~~
latchkey
Through the fact that Struts did not implement things correctly and equifax
did not upgrade their systems. It also sounds like it was multiple
vulnerabilities. Like this one as well:
[https://www.cvedetails.com/cve/CVE-2017-5638/](https://www.cvedetails.com/cve/CVE-2017-5638/)

------
Hupriene
Having used java serialization a few times for POC level work, I'll be sorry
to see it go.

I wish they would just rename it something sufficiently ominous sounding that
people wouldn't think about using it on untrusted data sources.

Maybe ArbitraryCodeAndDataSerialization

~~~
ta5244626777
Yes, it was a nice/quick/dirty way to persist an object graph, and I bet it's
been abused richly in a lot of codebases out there. I'll be curious to see how
this is handled in terms of breakage, i.e. the JDK that introduces this might
be the Python 2/3 showdown (if that's still a thing).

I'd be curious to do a GitHub-wide grep for ObjectOutputStream or something
similar and see what it's like in open source land.

~~~
gmueckl
This is made more exciting by the fact that Java rarely introduced big
breaking changes in the past. Especially deprecated classes
(java.util.Date...) never were actually removed. Starting this now is going to
wreak havoc.

~~~
djsumdog
We're already starting to see the effects in Java 9. Our team attempted to use
it on our current project and many of our dependencies broke because they were
using deprecated parts of the API that were removed. We ended up having to
drop back to Java 8.

------
needusername
Does anybody have a source for this? The publication doesn't exactly have the
best reputation.

A lot of things are currently built on top of serialization, from JMX to
almost everything in Java EE including servlet sessions.

------
scarface74
I was trying to understand if C# has the same issue. From what I can tell, as
long as you use the default serialization, it seems to be safe. But I can't
really tell.

[https://www.alphabot.com/security/blog/2017/net/How-to-confi...](https://www.alphabot.com/security/blog/2017/net/How-to-configure-Json.NET-to-create-a-vulnerable-web-API.html)

~~~
dwaite
I'm rusty on C#, but I believe the equivalent would probably be [Serializable]
types being read from a Stream using a BinaryFormatter or SoapFormatter. A
malicious stream could include any known types in the system marked as
[Serializable], and as they are deserialized any associated static
constructor/no argument constructor/property setters could be called.

In the JSON case linked, I presume there is a root type given when you are
attempting to deserialize a document. However, if one of the properties of
that type is ambiguous (say System.Object), and the deserialization algorithm
looks for a 'type' property in the JSON with a class name to determine what is
instantiated, then there can be all sorts of unintentional types that might be
built by the processing of that malicious JSON.

~~~
Const-me
They moved away from that. Many newer parts of the .NET framework (e.g. SOAP
1.2 implementation in WCF) use data contract serializers by default. With
them, a complete list of known types must be provided to deserializer, it's
typically done with [KnownType] attribute.

------
rukuu001
It's less useful now than it used to be. In an age of ubiquitous JSON there's
much less need for sending Java objects over the wire.

~~~
paulddraper
And even more seamless...protobuf.

~~~
rukuu001
Blushing because this is the first I've heard of it - I owe you one :)

~~~
echelon
Protos are amazing and have so many advantages over json. You'll love them!

~~~
skocznymroczny
Wouldn't they be a pain to debug, given that they are binary? I wouldn't want
to use a hex editor to verify if my data serializes properly.

~~~
pure-awesome
I am not sure exactly what you mean by "verifying" that your data serializes
properly.

Do you mean that you don't trust the protobuf implementation itself to write
the correct bytes, or are you worried you may have written the proto file
wrong?

If it's the first case, protobuf is a widely used format that has been
rigorously tested in the field. Provided you use it for a major language, you
should be fine.

If it's the second case - could you not just serialize and then deserialize,
and check that the objects that pop out again are the same as the ones that
you sent in?

------
gmueckl
I would hate to see serialization go completely. It has its uses. Maybe Java
should copy the attributes that the C# data contract serializer uses to mark
the subset of classes that may appear in serialized form. It could be
retrofitted and would be a less drastic change at the same time.

~~~
tannhaeuser
Java has had the "transient" keyword for that since 1.0.

~~~
pvg
It also didn't have serialization in 1.0 neatly avoiding the entire problem.

~~~
tannhaeuser
You're right, it was introduced in 1.1 (February 1997)

------
jeffdavis
Can someone briefly explain the problems and how general they are to other
serialization interfaces?

~~~
mbfg
The problem is that any serializable class on the target machine's classpath
can be used as a serialized object. So if I want to screw with a server, I can
create a payload with any of those classes and send it to the server. The
server will load that class when it sees it in the payload and will execute
any code in the class's static initializer before trying to instantiate an
object. All of this happens before the host application regains control of the
deserialization and realizes that the objects it's deserializing are not what
it expected to be sent. So there have been a bunch of problems where classes
no one expected to be used in serialization were exploited, because it's
really simple for a class to be marked serializable through inheritance (be it
from a class or an interface).
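
A minimal sketch of that ordering problem. It uses a private `readObject` hook rather than a static initializer, only because the hook is easy to observe inside a single JVM; `Gadget` and `sideEffects` are made-up names, and the counter stands in for whatever a real gadget would do.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class DeserializationSideEffectDemo {
    // Stands in for the harmful action a real gadget class might perform.
    static int sideEffects = 0;

    // Hypothetical gadget: any Serializable class on the classpath with a readObject hook.
    static class Gadget implements Serializable {
        private static final long serialVersionUID = 1L;

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            sideEffects++; // runs inside ObjectInputStream.readObject(), before the caller sees anything
        }
    }

    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // The application expects a String, but the type is only checked by the cast,
    // which happens after readObject() has already rebuilt the whole object.
    static String readExpectedString(byte[] payload) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            return (String) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = serialize(new Gadget());
        try {
            readExpectedString(payload);
        } catch (ClassCastException e) {
            System.out.println("payload rejected, side effects so far: " + sideEffects);
        }
    }
}
```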

~~~
chii
> create a payload with any of those classes, and send it to the server.

But on a deeper level, if a programmer doesn't know anything about security,
then this sort of hole will continue to happen even if java serialization is
disabled (just a bit harder to screw up). I'm not that big of a fan of making
security decisions without programmer's input, since you'd assume a
professional programmer should know better anyway.

~~~
dwaite
The security hole is in the design of java serialization. Any class marked
serializable (in your code, your included libraries, or the JVM itself) is a
potential security vulnerability as soon as you use this feature. Mark
Reinhold estimated that over half of JRE/JDK vulnerabilities have been due to
Serialization.

Recent releases added an opt-in feature to filter which classes are allowed to
be deserialized, but there's still a horrible amount of open, unauthenticated
network ports that take in serialized java.
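
For reference, a minimal sketch of that opt-in filter using `java.io.ObjectInputFilter` (available since Java 9). The `Expected` and `Unexpected` classes are invented for the example; a real allow-list would name whatever types the application actually exchanges.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class AllowListFilterDemo {
    // Hypothetical payload types for the demonstration.
    static class Expected implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 42;
    }

    static class Unexpected implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Allows exactly one class and rejects every other class before it is instantiated.
    static final ObjectInputFilter ALLOW_LIST = info -> {
        Class<?> type = info.serialClass();
        if (type == null) {
            // Checks on depth, references or stream size carry no class.
            return ObjectInputFilter.Status.UNDECIDED;
        }
        return type == Expected.class
                ? ObjectInputFilter.Status.ALLOWED
                : ObjectInputFilter.Status.REJECTED;
    };

    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns true if the payload passes the filter, false if the filter rejects it.
    static boolean isAccepted(byte[] payload) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            in.setObjectInputFilter(ALLOW_LIST); // must be set before the first read
            in.readObject();
            return true;
        } catch (InvalidClassException rejected) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Expected accepted:   " + isAccepted(serialize(new Expected())));
        System.out.println("Unexpected accepted: " + isAccepted(serialize(new Unexpected())));
    }
}
```

The filter only helps code that installs it, which is why unfiltered endpoints remain exposed.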

The programmer trying to get the same failures in a post-serialization world
would presumably have to find or build a new system with the same design
issues.

~~~
usrusr
> Any class marked serializable (in [...] included libraries [...])

This is the big one: as soon as you deserialize incoming data, any library on
the classpath becomes a potential source of remote-callable snippets. And when
a vulnerability resides in a library, exploits will tend to be compatible
across applications, which makes them far more likely to actually hit you than
any custom weaknesses.

------
gaius
_Oracle has received many reports about application servers running on the
network with unprotected ports taking serialization streams_

I’m just not seeing how this is a problem with the language. Leave anything on
an unprotected port taking unsanitised input and it’s vulnerable no matter
what it is written in.

~~~
wepple
> Leave anything on an unprotected port taking unsanitised input and it’s
> vulnerable no matter what it is written in

I don’t think you’re comparing apples with apples.

Even in a default configuration, the majority of services that show up on a
network don’t give you instant code exec. Maybe tomcat servers with default
passwords, and a few other things.

But (de)serialization isn’t securable at all. You can’t add auth, you can’t
WAF it, you can’t fix the underlying vulnerabilities.

There is absolutely nothing which has that level of vulnerability and lack of
security on a modern network.

~~~
gaius
_There is absolutely nothing which has that level of vulnerability and lack of
security on a modern network._

OK, let's say I overwrite part of your system with something evil, knowing
that your app will Class.forName() it. Is that a problem with the ClassLoader
or is the problem that your perimeter is already compromised anyway?

~~~
wepple
I don’t think it matters, at all.

Java has serialization, and everyone uses it, and it’s a security nightmare.

As uncomfortable as it may feel to me, I kinda agree with Oracle's approach
here: kill serialization, make things more secure. If they have to change core
language/libraries, so be it.

You were trying to say that anything left on a network socket will be
compromised, and that’s simply not true. Most software is pretty solid.
Anything in C will have more than its fair share of upcoming patches, but
unlike Java serialization, you can’t fix a single lib to fix 99% of bugs.. or
we would’ve surely done that.

------
jdeca568
If I understand correctly, the problem of Java deserialization is 1) gadgets
for code execution and maybe 2) DoS. And this is bad because even if it looks
secure now, a gadget could be discovered later.

The default Java serialization is one of the easiest ways to serialize object
instances, but there are many other ways, and many other risky ways among
them.

It seems to be always the same problem: ClassLoader access. Couldn't there be
a way to let the deserializers use a specific ClassLoader?

I mean some sort of (Sandboxed)ObjectInputStream that uses a specific
ClassLoader defined in the JRE config. The sandboxed contexts could be defined
in something like java.security, .policy, to define what it is supposed to
know and when/where it is supposed to be used.
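
Something close to this can already be hand-rolled, as a rough sketch: `ObjectInputStream.resolveClass` is a protected hook, so a subclass can route every class lookup through one chosen loader. `LoaderBoundObjectInputStream` and `AllowListClassLoader` are hypothetical names, not JDK classes, and nothing here reads its configuration from java.security or a policy file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.Set;

class SandboxedStreamDemo {
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        int x = 1;
        int y = 2;
    }

    static class Other implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Resolves only allow-listed names and delegates the actual loading to its parent.
    static final class AllowListClassLoader extends ClassLoader {
        private final Set<String> allowedNames;

        AllowListClassLoader(ClassLoader parent, Set<String> allowedNames) {
            super(parent);
            this.allowedNames = allowedNames;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!allowedNames.contains(name)) {
                throw new ClassNotFoundException("not allowed in this sandbox: " + name);
            }
            return super.loadClass(name, resolve);
        }
    }

    // Routes every class descriptor in the stream through the supplied loader.
    static final class LoaderBoundObjectInputStream extends ObjectInputStream {
        private final ClassLoader loader;

        LoaderBoundObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
            super(in);
            this.loader = loader;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc) throws ClassNotFoundException {
            return Class.forName(desc.getName(), false, loader);
        }
    }

    static ClassLoader pointOnlyLoader() {
        return new AllowListClassLoader(SandboxedStreamDemo.class.getClassLoader(),
                Collections.singleton(Point.class.getName()));
    }

    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // True if the payload could be read with the given loader, false if a class was blocked.
    static boolean canRead(byte[] payload, ClassLoader loader) {
        try (ObjectInputStream in =
                 new LoaderBoundObjectInputStream(new ByteArrayInputStream(payload), loader)) {
            in.readObject();
            return true;
        } catch (ClassNotFoundException blocked) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader sandbox = pointOnlyLoader();
        System.out.println("Point readable: " + canRead(serialize(new Point()), sandbox));
        System.out.println("Other readable: " + canRead(serialize(new Other()), sandbox));
    }
}
```

Note the limit of this approach: classes the loader does allow still run their own deserialization hooks, so the allow-list has to be kept small.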

------
sewercake
Won't this have serious implications for Spark, Hadoop, and other frameworks
that distribute workloads across multiple JVM instances?

~~~
zeroxfe
AFAIK, Hadoop uses protocol buffers for message passing, not Java
serialization.

~~~
yzmtf2008
Hadoop uses Avro actually, but the point still stands :)

~~~
erik_seaberg
Hadoop expects your keys and values to implement Writable and serialize
themselves (a lot of these are actually hand-written expecting the instance to
get reused for each input tuple). There's optional and fairly clumsy glue that
makes Avro work in a key or value.
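
A rough sketch of that pattern. The `SelfSerializing` interface is a local stand-in modeled on Hadoop's `Writable` (`write(DataOutput)` / `readFields(DataInput)`) so that the example runs without Hadoop on the classpath; `PageView` is a made-up value type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WritableStyleDemo {
    // Local stand-in for org.apache.hadoop.io.Writable (an assumption for this sketch).
    interface SelfSerializing {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hand-written value type: it knows its own wire format and is mutable on purpose.
    static final class PageView implements SelfSerializing {
        String url = "";
        long count;

        void set(String url, long count) {
            this.url = url;
            this.count = count;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(url);
            out.writeLong(count);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            url = in.readUTF();   // overwrites the previous tuple's state
            count = in.readLong();
        }
    }

    // Writes all tuples back to back, reusing a single PageView instance.
    static byte[] encode(String[] urls, long[] counts) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PageView reused = new PageView();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int i = 0; i < urls.length; i++) {
                reused.set(urls[i], counts[i]);
                reused.write(out);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads every tuple into the same instance and sums the counts.
    static long sumCounts(byte[] data, int records) {
        PageView reused = new PageView();
        long total = 0;
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < records; i++) {
                reused.readFields(in);
                total += reused.count;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = encode(new String[] {"/a", "/b"}, new long[] {2, 3});
        System.out.println(sumCounts(data, 2));
    }
}
```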

------
mbfg
All web applications use serialization to do session replication. Now this is
safe as long as both ends of the stream are controlled by the developer; only
if they are stupid (instead of malicious) will there be problems. Still, if
java serialization goes away, this will be the most widespread change in the
java ecosphere with respect to how serialization works.

~~~
koolba
Using java serialization for sessions is a _terrible_ idea. It’s mildly
convenient to start but bites you as soon as you want anything non-java to be
able to read your session store.

------
ddtaylor
In theory this can be solved by using a security policy, but I don't think
most are doing anything like that for deserialization.

------
exabrial
There are some valid uses for it, but the interface is unsafe by default. How
about instead, requiring a MAC key in the serialization protocol?

~~~
dwaite
You can package up the data so that there is authentication to evaluate before
you hit the serialization layer, and integrity behind that authentication.

This might be your approach if say your session cookie is based on serialized
Java.

(However, most people give up on this approach - java serialization is also
very inefficient space-wise, and the cookie will get too big for the browser
to honor)
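
A minimal sketch of the "authenticate first, deserialize second" packaging, assuming HMAC-SHA256 and a made-up layout of tag followed by body. Key management is out of scope here, and the names are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class AuthenticatedBlobDemo {
    private static final String ALGORITHM = "HmacSHA256";
    private static final int TAG_LENGTH = 32; // HMAC-SHA256 output size in bytes

    // Computes the MAC tag over the serialized body.
    static byte[] tag(byte[] key, byte[] body) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(key, ALGORITHM));
            return mac.doFinal(body);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Produces tag || serialized body (hypothetical layout for this sketch).
    static byte[] seal(byte[] key, Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] body = bytes.toByteArray();
        byte[] sealed = new byte[TAG_LENGTH + body.length];
        System.arraycopy(tag(key, body), 0, sealed, 0, TAG_LENGTH);
        System.arraycopy(body, 0, sealed, TAG_LENGTH, body.length);
        return sealed;
    }

    // Verifies the tag first; ObjectInputStream is only created for authentic bodies.
    static Object open(byte[] key, byte[] sealed) {
        if (sealed.length < TAG_LENGTH) {
            throw new SecurityException("blob too short");
        }
        byte[] claimed = Arrays.copyOfRange(sealed, 0, TAG_LENGTH);
        byte[] body = Arrays.copyOfRange(sealed, TAG_LENGTH, sealed.length);
        if (!MessageDigest.isEqual(claimed, tag(key, body))) { // constant-time comparison
            throw new SecurityException("MAC mismatch, refusing to deserialize");
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(body))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "demo-secret-key".getBytes();
        byte[] sealed = seal(key, "hello");
        System.out.println(open(key, sealed));
    }
}
```

This only protects against parties who do not hold the key; anyone who can produce valid tags can still send arbitrary object graphs.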

------
hprotagonist
cool. can we kill pickle next?

~~~
fpoling
One of the attack vectors in Java was classes that store native pointers as
integer fields and call free on those pointers in the finalizer. So the moment
one can force deserialization of such classes, one ends up with a corrupted
heap and trivially weaponized exploits. Does pickle in Python suffer from the
same problem?

~~~
icebraining
Arbitrary code execution:
[https://www2.cs.uic.edu/~s/musings/pickle/](https://www2.cs.uic.edu/~s/musings/pickle/)

~~~
bitL
That only works because Python is an interpreter. Won't cause a thing on Java.

~~~
dwaite
The JRE has had arbitrary code execution attacks on serialization. The leaked
classes eventually invoke a class loader and instantiate your binary code as a
new java class.

~~~
bitL
Does the serialization work with bytecode though? Doesn't it stream just data
members, not method implementations?

------
eternalban
> Serialization was a “horrible mistake”

RMI required it. Mr. Reinhold has apparently forgotten the scene in late 90s.
CORBA anyone?

------
boobsbr
Would this affect network class loading?

------
lolive
I really hope they choose something like RDF. Which is the only serialization
format I know of that handles graph and types natively.

------
darkhorn
Use Lisp: data and code are the same thing.

------
ataturk
It's funny that serialization is noted as a "horrible mistake" from 1997 but
no mention of RMI, the even worse mistake. I guess Java has a lot of bad
mistakes. I wish Oracle would end Java so I could move on to something else.

Or maybe I'll just move on to something else anyways as I'm really rather sick
of writing these syntactically crippled lambdas. The streaming stuff is almost
good.

~~~
adrianmsmith
Could you point me to some information on why RMI is a bad mistake? (Genuine
question.)

~~~
jnwatson
RMI (and similar technologies like CORBA) is a “mistake” in today’s world
because it makes every public method an attack surface. It also hides
important details that the client really can’t not deal with, namely network
issues.

I put mistake in quotes because there are situations where RMI (and Java
serialization) work fine: trusted, reliable networks like cluster or grid
computing.

