

Clojure's Default Reader is Unsafe - llambda
http://www.learningclojure.com/2013/02/clojures-reader-is-unsafe.html#post-body-2898830171141471587

======
fogus
Clojure's reader is designed to provide the types of tasks useful for Lisps;
specifically to take a string and convert it to data structures. However, like
most Lisps Clojure has some special syntax that does various tasks at read-
time like the #= tag that might not be safe to use as a webform processor, but
it's critical to the way that Clojure works.

In Clojure 1.5 there are EDN functions that handle a subset of the Clojure
syntax and parse strings into data types like numbers, maps and vectors (and
others). It's meant to deal with data only and not "reader forms". I wrote a
Ring middleware to handle EDN data and recently ported it to use the new 1.5
EDN reader.

[https://github.com/fogus/ring-
edn/blob/master/src/ring/middl...](https://github.com/fogus/ring-
edn/blob/master/src/ring/middleware/edn.clj#L15)

It looks almost exactly like the old clojure.core/read-string except it will
not execute any dangerous code.

~~~
stouset
Anything that can turn strings into symbols is dangerous code. Symbols can't
be garbage collected, and an attacker can conduct a denial of service by
forcing you to load unlimited numbers of uniquely-named symbols.

~~~
technomancy
Symbols can be GCed as of 1.2.

------
puredanger
I think this is an excellent write-up of the state of the world with Clojure
1.5: <http://clojuredocs.org/clojure_core/clojure.core/read>

In short, don't use read-eval at all if you want safe reading. Use the
clojure.edn alternatives to read data safely in those circumstances.

~~~
willismichael

      ;; The particular issue of executing arbitrary Java constructors used
      ;; in the examples above no longer works in Clojure 1.5 when
      ;; *read-eval* is false.  Even so, you SHOULD NEVER USE
      ;; clojure.core/read or clojure.core/read-string for reading untrusted
      ;; data.  Use an edn reader or a different data serialization format.
    

From that comment it seems like the Java constructor loophole has been closed.
With that gone, I don't understand why the reader is still dangerous. What am
I missing?

~~~
mjw
I was wondering this too, and deliberately tried to break something using the
clojure 1.5 reader with _read-eval_ false. (Makes a change from building
things all day!).

Result:

    
    
        user=> (binding [*read-eval* false] (read-string "``````````x"))
        OutOfMemoryError Java heap space  clojure.lang.RT.cons (RT.java:570)

------
matthavener

      ;; especially since read-string will be fixed in clojure 1.5 so that
      ;; binding *read-eval* to false around it will make it safe, as far as
    

In clojure 1.5 with _read-eval_ bound to false, the reader is still unsafe for
malicious input. Since the Clojure reader must be able to construct arbitrary
Java objects via their constructors, any constructor can be called from the
reader even with _read-eval_ bound to false.

~~~
fogus
So you wouldn't want to use the reader if safety is needed. Instead you would
use the EDN data functions.

<http://clojure.github.com/clojure/#clojure.edn>

------
lucian1900
In short, use an explicit serialisation format (edn) instead of read. Which
mirrors the advice for JavaScript of using a JSON parser instead of eval.

------
drcode
Glad I ran across this this week- I would have most likely ended up with a
security hole in an app otherwise. (I had thought _read-eval_ =false was an
adequate safeguard...)

------
baq
executing code you didn't write is unsecure. news at 11.

...am i missing something? isn't this the same problem as with evaling json in
javascript?

~~~
djpowell
Lisps separate the notion of reading strings into data structures, from
evaluating those data structures. So it would be easy to assume that read-
string was safe to use, but there were actually some lesser-known features in
the Clojure reader that allowed code execution.

------
pfraze
This is basically what got Ruby with its YAML reader, isn't it?

EDIT: I'm not familiar enough with clojure; is read-line supposed to be safe?

~~~
stonemetal
Not really, read-line is more or less equivalent to python's input method. It
reads raw input and tries to evaluate it. Passing false makes it safer for
reading data structures, but it will still try to execute Java constructors so
that your data is correctly initialized. To be safe you need to use other
methods i.e. edn(extensible data notation).

edn limits what is allowed to a safe subset of what read-line will allow, all
of the base clojure types and collection types (maps, lists, sets, vectors)
are allowed, Java isn't.

~~~
mjw
> It reads raw input and tries to evaluate it

Not quite. It reads raw input and tries to build a syntax tree from it.
Unfortunately reader macros in clojure allow all sorts of objects to form part
of said syntax tree and arbitrary-ish code to run at parse time to instantiate
them. But the main job of the reader isn't to evaluate code.

------
pjungwir
I'm feeling vindicated for this comment now:

    
    
        https://news.ycombinator.com/item?id=5258993
    

As more people start adopting Lispy languages, it will be important to build
up a cultural awareness of what's dangerous. Things like not evaling code is
obvious enough (I'd hope), but avoiding read less so. That seems less like
"avoid eval" and more like "avoid gets".

~~~
KirinDave
You feel vindicated over this vacuous comment article? Let me sum it up for
you: "It turns out if I set a safety variable, then unset it in a way that
most of my audience won't recognize as assignment, I get votes up on
hackernews."

Cultural awareness of read-time evaluation is as frustratingly absent now as
it has always been; same as strcpy vs. strncpy. The only difference now is
that we have increasingly more programmers with very little experience in
charge of large software projects.

~~~
pjungwir
The people on HN are a big part of those defining the future of Clojure and
other languages, so I'm glad to see an article like this raise awareness of
something that will bite many as Clojure grows. Lines like these suggest the
set-then-unset bit is part of a broader issue:

    
    
        ;; By a weird coincidence this week I wanted to read a file of data
        ;; coming from a web app, and I was about to use Clojure's reader to
        ;; do it.
    

It's disappointing to see that some Clojure practitioners are more interested
in downvoting my comment and minimizing that issue rather than educating
people about it.

~~~
KirinDave
I didn't downvote your comment, but I hold your sentiment in low esteem.

We don't stress over this for the same reason we don't stress over strlen-
style errors. They're considered part of the basic education for the language.
If someone does do this, they are operating below the expected standard for
the language.

There is only so much a language and environment can be expected to do for
you. Your understanding of _trivial_ concepts like, "Loading code from
untrusted sources is dangerous" is not an unreasonable bar to set. It is only
an "issue" if you are fundamentally misinformed about writing software in
interpreted environments.

~~~
pjungwir
It's bizarre that you can think something is "the expected standard for the
language" but also not worth pointing out. How are people supposed to learn
it?

There are incompetent programmers that no effort can reach, sure, but there
are also junior programmers who are just innocent and haven't yet learned all
the implications of what they're doing. I think it's important to teach these
people, whether re Ruby or C or Clojure or whatever. That people treat such
teaching with disdain makes me think they are misled by their defensiveness.
It's not an insult of Clojure to say that (as in any language) there is stuff
worth warning people about.

~~~
KirinDave
> How are people supposed to learn it?

By understanding the halting problem and the implications thereof? This is not
Clojure-specific, it's part of any dynamic eval tool.

> It's not an insult of Clojure to say that (as in any language) there is
> stuff worth warning people about.

This argument might hold some water if the presentation of the information was
not couched in a different way. It is clearly not that. And you are clearly
more concerned with being right than teaching people, from the words you are
choosing to use.

------
yawnt
this looks like an interesting alternative
<https://github.com/clojure/tools.reader>

