

Java.sun.com is down again - breaking bad apps across the land - zorlem

This is a notice to fellow HN readers.

This morning around 8:00 UTC Nagios alerted me that the thread count on a
number of Apache Tomcat servers I support for a client had started to rise
dramatically. I discovered that a library that is part of the application was
trying to fetch DTDs like http://java.sun.com/dtd/properties.dtd from
java.sun.com. The servers are unreachable, so each request thread was taking
30+ seconds to time out. I worked around the problem by adding an iptables
rule for 192.9.162.55 port 80 that rejects the request immediately, and told
the client to look for a permanent solution.

In case the library can't be fixed (I don't know who developed it), I'm
planning on putting the relevant DTD files on a local HTTP server and
redirecting all requests for java.sun.com to that virtual host (e.g. through
relevant entries in the hosts file).

Check your application servers and the software you use if it processes XML
documents. One way to check:

    $ netstat -ant | grep -E '192.9.162.55:80[[:space:]]+SYN_SENT' | wc -l
    184
    $

SYN_SENT is because 192.9.162.55 (java.sun.com) is not responding ATM. If it
becomes reachable again, just s/[[:space:]]+SYN_SENT// .

Relevant URLs for more information:

http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic/

http://www.oasis-open.org/committees/entity/spec-2001-08-06.html

And a previous discussion here about the same problem:

http://news.ycombinator.com/item?id=3094075
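
Where the application code can be changed, one possible alternative to the hosts-file redirect is to answer the DTD lookup inside the JVM. The sketch below is an illustration, not the fix used on the servers above: it registers a SAX `EntityResolver` that maps the java.sun.com system ID to a local copy. The class name `LocalDtdResolver`, the sample document, and the inlined DTD text (a stand-in for a file on local disk) are all assumptions made for the example.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper: serves known DTDs from local copies instead of the network.
public class LocalDtdResolver implements EntityResolver {
    static final String PROPERTIES_DTD_URL = "http://java.sun.com/dtd/properties.dtd";

    // Stand-in for a DTD file kept on local disk; inlined so the sketch runs as-is.
    static final String PROPERTIES_DTD =
        "<!ELEMENT properties (comment?, entry*)>\n"
        + "<!ATTLIST properties version CDATA #FIXED \"1.0\">\n"
        + "<!ELEMENT comment (#PCDATA)>\n"
        + "<!ELEMENT entry (#PCDATA)>\n"
        + "<!ATTLIST entry key CDATA #REQUIRED>\n";

    // Sample document that points at the remote DTD, as the affected library's input did.
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE properties SYSTEM \"" + PROPERTIES_DTD_URL + "\">\n"
        + "<properties><entry key=\"greeting\">hello</entry></properties>";

    private final Map<String, String> localCopies;

    public LocalDtdResolver(Map<String, String> localCopies) {
        this.localCopies = localCopies;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String dtd = localCopies.get(systemId);
        // Unknown entities resolve to empty content rather than a network fetch.
        InputSource source = new InputSource(new StringReader(dtd == null ? "" : dtd));
        source.setSystemId(systemId);
        return source;
    }

    // Parses the document with the resolver installed and returns the first <entry> value.
    static String firstEntryValue(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(
                new LocalDtdResolver(Map.of(PROPERTIES_DTD_URL, PROPERTIES_DTD)));
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("entry").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstEntryValue(SAMPLE_XML));
    }
}
```

Because the resolver hands the parser a character stream, no connection to java.sun.com is attempted, so the parse does not depend on that host being reachable.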
======
ShabbyDoo
More generally, it ought to be easier to constrain apps running on the JVM to
declared sandboxes. I once looked at the Java security model and found it to
be totally inadequate for such purposes as it seemed to have been designed for
ensuring that desktops could not be compromised by rogue applets running in
browsers. Specifically, I was surprised by the coarse-grainedness of the
security settings. Want to limit access to the network by disallowing use of
the Socket class? Done. Want to only allow access to a whitelist of hosts? No
dice. No filesystem access at all? Easy. Limit the app to only reading and/or
writing to certain directories? Not a chance.

I want to define whitelists for each environment in which my app will run --
development, QA, production, etc. To which hosts may it connect? Where may it
access files? What else might I wish to constrain as a way of avoiding
inadvertent dependencies? Particular queues/topics on messaging buses?
Database schemas within a particular server (network restrictions are too
coarse for this)? When asking this question, I'm not trying to protect myself
from rogue developers with malevolent intentions -- I just want to avoid a
scenario like the one described by the OP.
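
As a rough illustration of the per-environment whitelist idea, here is a minimal application-level sketch. It is not part of the Java security model and it only helps for connections the application routes through it. The class name `HostWhitelist`, the environment names, and every host name are hypothetical.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical application-level guard: each environment declares the
// host:port pairs the app may talk to, and everything else is refused.
public class HostWhitelist {
    // All environments and host names below are made up for illustration.
    private static final Map<String, Set<String>> BY_ENVIRONMENT = Map.of(
        "qa", Set.of("db.qa.example.com:5432", "audit.qa.example.com:514"),
        "production", Set.of("db.prod.example.com:5432", "audit.prod.example.com:514"));

    private final String environment;
    private final Set<String> allowed;

    private HostWhitelist(String environment, Set<String> allowed) {
        this.environment = environment;
        this.allowed = allowed;
    }

    public static HostWhitelist forEnvironment(String environment) {
        Set<String> allowed = BY_ENVIRONMENT.get(environment);
        if (allowed == null) {
            throw new IllegalArgumentException("unknown environment: " + environment);
        }
        return new HostWhitelist(environment, allowed);
    }

    public boolean isAllowed(String host, int port) {
        return allowed.contains(host.toLowerCase(Locale.ROOT) + ":" + port);
    }

    // Intended to be called before opening a connection, so that an
    // undeclared dependency fails fast instead of hanging on a timeout.
    public void check(String host, int port) {
        if (!isAllowed(host, port)) {
            throw new SecurityException(
                host + ":" + port + " is not whitelisted for " + environment);
        }
    }

    public static void main(String[] args) {
        HostWhitelist production = HostWhitelist.forEnvironment("production");
        System.out.println(production.isAllowed("db.prod.example.com", 5432));
        System.out.println(production.isAllowed("java.sun.com", 80));
    }
}
```

A check like this turns an undeclared dependency into an immediate, logged failure, which is the behavior wanted here, but it does nothing about libraries that open their own sockets.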

Recently, I started up the Java app upon which I am currently working and
watched its network behavior via Microsoft's Wireshark-esque network
monitoring tool. It turns out that EHCache now asks one of Terracotta's
servers for the most recent EHCache version number so that it can spit out an
out-of-date warning in the logs. Benign and useful, but I still had to spend a
few minutes in the EHCache source to make sure that, if Terracotta's servers
were down, our app would still start up.

Should one do this at the OS level (jails, perhaps)? I'm not limiting this
idea to just Java apps, but I'm really only an expert in the Java space.

I also argue that the whitelist would help codify inter-app dependencies in
large IT environments. A few years ago, the large IT shop for which I worked
did a disaster recovery drill where they literally deployed tens of apps in an
IBM-provided datacenter as a dry run. One thing they learned was that a
particular production app was erroneously configured to log certain audit
events to a server in a QA environment (which was not part of the disaster
recovery plan for obvious reasons). Whitelists would have prevented this
issue.

~~~
zorlem
On several occasions I've used the Tomcat Security Manager and it has not
given me any problems. I find it fine-grained enough for my purposes, although
not as fine-grained as you wish. I've used it to limit the JVM's access to a
specific set of hosts and TCP ports, restrict (RW and RO) access to files with
and without wildcards, restrict access to specific methods, properties,
classes and what-not. One can't use it to restrict the access to specific
databases, or topics in the message queues as you suggest, but I don't think
it's necessary. I think it's out of scope for the VM to restrict access to a
specific DB and this should be the responsibility of the specific DBMS.
Otherwise I'd imagine the overhead would be quite serious. I haven't heard of
any VM manager that would provide such a thorough and deep access control, do
you know of any?

If you still need this functionality in a Java security manager I believe you
could build it using the existing hooks, they look quite powerful and
flexible.
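
To give a feel for the granularity described above, the sketch below uses the JDK's `java.net.SocketPermission` and `java.io.FilePermission` classes, the permission types that policy file entries (such as those in Tomcat's `catalina.policy`) are written in terms of. It only calls `implies` to show what a grant would cover and does not install a security manager. The class name `PermissionGranularity`, the address 192.0.2.10, and the path `/var/app/data` are assumptions for the example.

```java
import java.io.FilePermission;
import java.net.SocketPermission;

// Illustration only: shows what a single grant would and would not cover,
// without installing a security manager.
public class PermissionGranularity {
    // Hypothetical grant: connect to one database host on one port.
    static final SocketPermission DB_ONLY =
        new SocketPermission("192.0.2.10:5432", "connect");

    // Hypothetical grant: read-only access to everything under /var/app/data
    // ("-" means all files below the directory, recursively).
    static final FilePermission DATA_READ_ONLY =
        new FilePermission("/var/app/data/-", "read");

    static boolean mayConnect(String hostAndPort) {
        return DB_ONLY.implies(new SocketPermission(hostAndPort, "connect"));
    }

    static boolean mayAccessFile(String path, String actions) {
        return DATA_READ_ONLY.implies(new FilePermission(path, actions));
    }

    public static void main(String[] args) {
        System.out.println("db host:       " + mayConnect("192.0.2.10:5432"));
        System.out.println("java.sun.com:  " + mayConnect("192.9.162.55:80"));
        System.out.println("read data:     " + mayAccessFile("/var/app/data/reports/q1.csv", "read"));
        System.out.println("write data:    " + mayAccessFile("/var/app/data/reports/q1.csv", "write"));
        System.out.println("read /etc:     " + mayAccessFile("/etc/passwd", "read"));
    }
}
```

IP literals are used on purpose so that the comparison needs no DNS lookup.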

Now, the real pain I've had with the Security Manager in Tomcat 5.5 was
writing the rules for a pre-canned application, not written with SM in mind.
It was quite a tedious process, but all MAC systems are tedious to set up
initially. That's life.

~~~
ShabbyDoo
I hadn't known about Tomcat's security manager stuff. Interesting, and proof
that Java's security manager stuff can be extended for practical purposes.

I had thought about using AspectJ to wrap interesting points in various APIs
and then do "stuff". The obvious behavior is to restrict usage based on
whitelists. However, it might also be interesting to run one's app in an
access logging mode, especially when trying to wrap some controls around a
previously unrestricted production application.

------
blinkingled
As a best practice, applications should reference DTDs from the local
filesystem. Most sane data centers would have outbound (App->Internet) access
locked down: only the needed hosts/ports are allowed, after the application
developer specifically requests them.

~~~
neild
Sadly, if you use Python's batteries-included XML tools, this is virtually
impossible to do. See <http://bugs.python.org/issue2124> for some discussion.

~~~
lucian1900
Those tools suck.

lxml is better.

------
ShabbyDoo
Around 2005, I was semi-forced to use Xalan/Xerces (the Apache reference
implementation of SAX, DOM, XPath, XSLT, etc.) for a project. These libraries
were included in the JDK [edited from orig post]

To make sure that these libraries did not attempt to talk to servers outside my
company's control, I had to dig through the code and implement "neutered"
forms of schema look-up interfaces, etc. I can't recall exact details. The
default behavior was promiscuity and presumption, and making sure that these
libraries didn't strike-up conversations with random servers was not trivial
or terribly well documented. So, I'm not surprised by the current state of
affairs.
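
The exact interfaces involved back then are not recalled above, so the following is only a guess at what a "neutered" look-up can look like with the JAXP parser bundled in the JDK: an `EntityResolver` that answers every external entity request with empty content. The class name `OfflineXmlParser`, the sample document, and the `unreachable.invalid` URL are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical wrapper around the JDK's DOM parser that never goes to the network.
public class OfflineXmlParser {
    // Sample document whose DOCTYPE points at a host that cannot be reached.
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE web-app SYSTEM \"http://unreachable.invalid/dtd/web-app.dtd\">\n"
        + "<web-app><display-name>demo</display-name></web-app>";

    public static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // "Neutered" resolver: every external DTD or entity resolves to
            // empty content, so the parser has nothing to fetch.
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE_XML);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

The trade-off is that anything the DTD would have contributed (entity definitions, default attribute values) is silently missing, so this only suits documents that do not rely on it.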

------
sxtxixtxcxh
you can pass -c to grep to get a count, you don't need to pipe it to `wc`

~~~
zorlem
thanks :)

the "| wc -l" was tacked on for the submission :)

------
rshm
virtualbox.org is down as well.

~~~
kia
For virtualbox.org it's planned maintenance from April 27th to April 30th. The
announcement was on the main page.

~~~
eagsalazar
Planned downtime is handled like this? It isn't hard to put up a temporary
page. Just taking it down, even with notice, is really poor form.

~~~
bdunbar
At times like this I think of Lily Tomlin's Ernestine character: "We don't
care. We don't have to. We're Oracle!"

(I've been dealing with Oracle for a few years. It started with just database
stuff, but they kept buying applications I supported, now they own Solaris ...
anyway.)

~~~
re_todd
I got a job where we don't deal with Oracle at all, life is so much better!
I'd recommend it to anyone. Eat your veggies, exercise regularly, and work in
an Oracle-free workplace .... this is the secret to happiness!

~~~
bdunbar
_work in an Oracle-free workplace .... this is the secret to happiness!_

In Big Enterprise the alternative [1] to Oracle is Microsoft.

You're darned if you do and darned if you don't.

[1] Don't even mention open source. Not going to fly at BE, in my experience.

------
ryandvm
They're probably busy switching everything over to an Oracle stack...

------
soc88
I like how Oracle educates developers about the proper handling of DTDs.
(They didn't break it by accident for the third time already, right? RIGHT?!)

~~~
HaakonKL
Of course not. Oracle would never do anything bad to a developer community.

To be honest though, if this has already happened TWICE do people really have
any excuse for using a server that goes down a lot for something important?

Why would you NOT just download the stuff on some separate server and at most
run some cronjob to keep it up to date?

Or am I just being stupid?

~~~
zorlem
See the link to the W3C - several times they've started delivering 503 HTTP
error codes, hoping that the applications would start to break. It didn't have
a big effect, either because the applications didn't actually use the DTDs
they retrieved for anything or because they broke in a non-obvious manner
(as on the servers I'm administering). Had the outage been shorter, or had I
not been monitoring the Tomcat JVMs, it could've stayed under the radar.
That's one of the reasons I've made this submission.

