
Software Security Is a Programming Languages Issue (2018) - azhenley
http://www.pl-enthusiast.net/2018/08/13/security-programming-languages-issue/
======
tptacek
Here's the counterargument:

Regardless of language, all software has bugs; that's such a banal argument it
barely even needs to be made. Offensive software security is the practice of
refining and chaining bugs to accomplish attacker goals. You can make it
easier or harder to do that, but in the long run, more bug classes are going
to be primitives for attackers than not.

Rust is a fine language to teach people. But what Rust accomplishes for
security is table stakes. People building software in Java in 2002 had largely
the same promises from their language; Rust just makes those promises viable
in a larger class of systems applications. But Java programs are riddled with
security vulnerabilities, too! So are Golang programs, and, when Rust gets
more mainstream, so too will Rust programs.

Aside: I'm really frustrated by the advice to _validate all input_. That's not
useful advice and people should stop teaching it. It begs the question:
validate _for what_? If you know every malicious interaction that can occur,
you know where all the bugs are. And you don't know where all the bugs are.

~~~
loup-vaillant
> _Aside: I'm really frustrated by the advice to validate all input. That's
> not useful advice and people should stop teaching it._

Do you realise this reads as "don't bother validating your inputs"? Surely
that's not what you meant?

> _It begs the question: validate for what?_

All inputs necessarily follow some protocol. Command line options and
configuration files have syntax. Network packets have sizes & fields. Binary
files have some format. So you just make sure your input correctly follows the
protocol, and reject anything that doesn't.

Now the protocol itself should also be designed such that implementations are
easily correct and secure. In practice, this generally means your protocol
must be as simple as possible.

And it's probably a good idea to make sure your parser is cleanly separated
from the rest of your program, either by the OS (qmail splits itself in
several processes), or by the programming language itself (with type/memory
safety).
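
To make that concrete: a strict parser for a made-up length-prefixed format
might look roughly like this in Python (the field sizes and the limit are just
illustrative assumptions, not any real protocol):

    import struct

    MAX_PAYLOAD = 1024  # protocol-imposed upper bound (assumed here)

    def parse_message(data: bytes) -> str:
        """Return the payload, or raise ValueError for anything off-protocol."""
        if len(data) < 2:
            raise ValueError("truncated header")
        (length,) = struct.unpack(">H", data[:2])  # 2-byte big-endian length
        if length > MAX_PAYLOAD:
            raise ValueError("declared length exceeds protocol maximum")
        if len(data) != 2 + length:
            raise ValueError("declared length does not match received data")
        try:
            return data[2:].decode("utf-8")
        except UnicodeDecodeError:
            raise ValueError("payload is not valid UTF-8")

    # The rest of the program only ever sees the validated result:
    # message = parse_message(untrusted_bytes)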

~~~
tptacek
That's a silly way to read what I said.

~~~
loup-vaillant
Of course it is. But no matter how I read that sentence, I cannot parse it
into something someone of your calibre could possibly have meant.

Unless I missed something, you couldn't have meant that input validation is
useless. You most probably didn't mean that input validation is superseded by
some other technique, and therefore best ignored. You couldn't possibly think
that everyone already does input validation, and therefore doesn't need the
advice.

But then I wonder: _why_ is saying "validate all your inputs" not useful? What
would you say instead?

~~~
tptacek
Why would I engage in an argument premised on how unreasonable the argument
is? You said: "you couldn't have meant input validation is useless". You're
right.

~~~
carty76ers
Loup is right. No need to be super defensive... Your initial post reads very
wrong. I was also very surprised. It would help if you clarified.

~~~
tptacek
I'm not feeling defensive so much as aware that I'm talking to someone whose
goal isn't to understand what I was trying to say, and I'm not especially
interested in trying to clarify to them.

~~~
loup-vaillant
Look, I know you and I have some history. But I assure you, I'm genuinely
trying to understand.

Besides, it's not just about us: other people (including @carty76ers,
apparently) would like to know what you would advise instead of "please
validate all inputs".

~~~
tptacek
This is a weird and kind of creepy message. I'm reacting to the comments you
wrote on this thread, not some personal history you think we have. You keep
writing things like "surely this is not what you mean" (and, of course, it
isn't) and then continue to argue against the argument you imagine I must, or
must not, be making. This doesn't seem productive and I'm not interested in
continuing. Sorry.

~~~
loup-vaillant
Sorry, that was uncalled for. I guess that for you, it was Tuesday.

The reason I kept guessing was that you kept _not_ telling. Until this:
[https://news.ycombinator.com/item?id=21719065](https://news.ycombinator.com/item?id=21719065)

Finally something I can argue with.

------
zer00eyz
Two malicious Python libraries caught stealing SSH and GPG keys:
[https://news.ycombinator.com/item?id=21701488](https://news.ycombinator.com/item?id=21701488)

Raise your hand if you are executing code in your project that was written
OUTSIDE your organization, and has never been reviewed by people within your
organization.

This is the ham sandwich problem.

If a stranger walked up to you on the street and handed you a ham sandwich
would you eat it? I would venture to guess that most of you would NOT. However
many of us are all too happy to grab some random chunk of code off the
internet and shove it into production without a second thought.

Personal and social information of 1.2B people discovered in data leak
[https://news.ycombinator.com/item?id=21606415](https://news.ycombinator.com/item?id=21606415)

Is this another ham sandwich from a stranger that has been eaten? How often is
the "bug" implicit trust, poor design, or default behavior?

What about Spectre? A hardware-level exploit was bound to come up again; we
have had hardware bugs before (the 1994 Pentium FDIV bug) but exploitable ones
are "fairly rare".

------------

The old mantra against "security through obscurity" may be true, but it runs
into serious validity issues when there aren't enough eyeballs on the software
we are already running, and we're generating more at a rate where people can
NOT keep up! (And this doesn't even address the fact that we are now building
opaque boxes with ML that NO ONE accurately understands.)

~~~
raesene9
My favourite comparison for all the 3rd party library use is: if you walked in
to the CEO of your company and said

"we want to hire some people with no background checks, no interviews and no
idea where they live to write code for our critical line of business
application. We're then going to put that code into production without
reviewing it, and regularly update it without reviewing the updates. Oh also
we won't have any form of enforceable contract to fall back on if something goes
wrong."

You'd, at best, be laughed out of the room.

Yet that's exactly what pretty much every company does with 3rd party libs.

~~~
kod
On the contrary, you'd be asked what the cost-benefit analysis was versus
doing it in-house, then told to continue.

CEOs outsource critical things without meaningful oversight all the time.

~~~
raesene9
The point I was flagging up was the dissonance of corporate hiring policies
relative to their use of third party code.

For hiring, no chance you'd get that policy past corporate HR in any large
company. Try hiring a developer sight unseen to work remotely with no contract
in any large organization and see how well that goes.

Yet companies effectively do just that with 3rd party library use. The reason
the CEO doesn't do anything isn't likely to be because they've made an
informed risk decision on the topic, it's because no-one is telling them the
risks :)

~~~
kod
I work for a 10,000+ person public company that just outsourced critical
business functions via a 5 year contract that doesn't have any meaningfully
enforceable description of the work to be done.

If you're trying to talk about hiring for full-time employment as opposed to
contract work... what happens there is that as long as people are able to get
through whatever idiosyncratic hazing process was involved in hiring, they're
going to be at the company for at least a year. It's perceived as hard / risky
to fire people, even if they can't program their way out of a wet paper bag.

This stuff happens all the time; it's not that different from evaluation and
use of third party code. "This project has 500 stars on github, it must be
good." "This guy used to work for Google, he must be good." Now you're stuck.

~~~
raesene9
In the first case you still have a contract, and therefore contract law in your
country applies. A blatant breach like "they wrote code that stole all our SSH
private keys and then they deployed cryptocoin mining software to our systems"
would be covered, regardless of how bad the contract is.

Same with hiring: the person may or may not be able to code, but active malice
would likely result in firing, and the code they write should be subject to
review before being put into production.

Of course you can argue "hey where I work hiring is trash, we write bad
contracts and have no internal standards, so this 3rd party stuff isn't much
worse" but I'd suggest that's not an argument most companies would make
publicly about their processes.

~~~
kod
Whether they'd make the argument publicly or not, it's still the reality, and
the company I currently work for is better than a lot I've seen.

Point taken about active malice... but I've also seen companies with mostly
in-house code cover up instances of rootkits on production servers and
malfeasance related to credit cards.

I'd rather companies use third party open source crypto, for instance, even if
it sometimes gets compromised, because it's a lot more likely to come to
light.

------
bearer_token
Security issues, like other emergent system properties, can arise at any layer
of the stack.

While code-level issues should absolutely be a focus in the SDLC, it's common
to see security issues crop up from:

* Hardware, kernel, OS, package, and library vulnerabilities

* Component integration / API contract misunderstandings

* Transitive trust between services and third parties

* Accumulation of access over time

* Demos, hotfixes, and workarounds that are somehow now mission critical

~~~
AmericanChopper
Even poorly designed business rules create a huge number of security issues.
The whole stack could be perfectly bug-free and you’d still get those.

~~~
AnimalMuppet
It can get worse. How about _deliberately designed features_ that are security
bugs? I'm looking at Microsoft's "sure, we'll execute any email attachment
that the user clicks on, because that's more convenient!". Implementation
language wasn't going to save you there...

------
m463
I loved the Perl taint mode. In this mode, Perl understands what data
originates outside the program and puts restrictions on how you use it.

Perl also has strict mode, to tighten up your programming. Without it you
don't have to declare variables; with it on, variables must be declared
before use.

More languages should help programmers like this - sort of a ladder to go
beyond just the low-hanging fruit.
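
Perl does this inside the interpreter; as a very rough sketch of the same idea
in Python (hand-rolled, nothing like Perl's actual mechanism), you'd mark
external data and refuse to use it at sensitive sinks until it has been
explicitly validated:

    import re

    class Tainted(str):
        """A string that originated outside the program."""

    def read_untrusted(raw: str) -> Tainted:
        return Tainted(raw)

    def untaint(value: str, pattern: str) -> str:
        # The only way to launder external data: match an explicit pattern.
        if isinstance(value, Tainted) and not re.fullmatch(pattern, value):
            raise ValueError("untrusted data failed validation")
        return str(value)

    def run_command(arg: str) -> None:
        # A sensitive sink refuses tainted input outright.
        if isinstance(arg, Tainted):
            raise TypeError("refusing to use unvalidated external input")
        print("would run a command with", arg)

    user_input = read_untrusted("some; rm -rf /")
    run_command(untaint(user_input, r"[A-Za-z0-9_./-]+"))  # raises ValueError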

~~~
claytonjy
Python now has MyPy, a static type-checker which can be tuned to be more or
less strict. Pretty cool to see what can be bolted on to a dynamic language.
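
A tiny example of the kind of thing it catches before anything runs (checking
the file with mypy --strict; the error text is roughly what mypy reports):

    from typing import Optional

    def find_user(user_id: int) -> Optional[str]:
        users = {1: "alice", 2: "bob"}
        return users.get(user_id)

    name = find_user(1)
    # mypy --strict flags the next line:
    #   Item "None" of "Optional[str]" has no attribute "upper"
    print(name.upper())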

~~~
hnick
Perl has various levels of bolted-on type-strictness too, AFAIK; I don't have
much experience with it due to being stuck on an older version professionally.

I personally think it's a great idea. Loose types can get a POC or even early
production models up and running quickly while you're changing your opinions
on the data every hour; then, once it grows to a size that's hard to reason
about and static analysis can really pay off, start tightening it up.

------
fmavituna
We need to kill the cliché that language/framework doesn't matter, but we also
need to understand that it still won't solve all problems.

I wrote about a similar topic from a web application security point of view:
Why Framework Choice Matters in Web Application Security* (
[https://www.netsparker.com/blog/web-security/why-
framework-c...](https://www.netsparker.com/blog/web-security/why-framework-
choice-matters-in-web-application-security/) )

Also today there is enough data in the industry to prove this argument beyond
any doubt for web applications.

* The original article was written about 11 years ago or so; this is a
republished version.

------
wglb
I disagree with the premise that Software Security is a language issue.

There are a lot of basics in this post.

However, this is really only a small fraction of the issue in application
security.

Just thinking back to some major breaches recently in the headlines, we have:

  * Failure to update (Equifax)

  * Unencrypted backup files (the Adobe breach of long ago)

  * A long-ago root-level compromise of 90 servers at a giant bank

  * The Target breach: vendor access to the network

  * Numerous breaches related to improper setup of AWS

Also, two interesting SSL/TLS vulnerabilities, goto fail and Heartbleed, had
nothing to do with anything a language design can address. In fact, someone
illustrated how to make the same error in Rust (and promptly got downvoted).

A good view of front-line security problems is presented in
[https://www.youtube.com/watch?v=_4vSurKPl6I](https://www.youtube.com/watch?v=_4vSurKPl6I)
(Attack Oriented Defense.)

I have audited applications in many languages, including C, Clojure, .NET,
Java, Perl, and Ruby. The vulnerabilities found did not relate to langsec at
all.

------
sinuhe69
To me, the headline is misleading. Security is part of the daily programming
job, not just the security specialists'. But as a programming language
feature, I'm not so sure. Certainly some languages present a larger attack
surface than others (e.g. generic pointers or pointer arithmetic), but in
general, security is a product of process, not a feature per se. It's naive to
believe that when I program in a certain language, my program will
automatically be secure (or at least more secure than programs in other
languages).

~~~
andrewflnr
The programming language is part of your process. To put a finer point on it,
your type checker is collaborating with you on a process that results in
better security than you would have otherwise.

The article starts out talking about a programming languages course, then
tries to justify including security therein. I think they're really just
hammering home the PL-specific case of your point that "security is part of
the daily programming job".

------
vemv
Agreed that post-hoc security cannot work very well. Hope it becomes more and
more obvious industry-wide.

One thing I do to practice "continuous security" is to accompany each code
review with an additional review focused solely on security. Otherwise it's
too many balls to juggle when doing a general-purpose code review.

Do you know of similar simple yet effective techniques?

------
austincheney
> Why teach security in a programming languages course? Doesn’t it belong in,
> well, a security course?

Because most software developers have no idea what security is. By most I mean
almost all. This point is easily proven. Ask any software developer what
security is and compare their answer against the standard answer. All security
courses and certifications I have seen define security in exactly the same
way.

If people wanted to take security seriously in software they would train their
developers on security or require that they be security certified.

> I believe that if we are to solve our security problems, then we must build
> software with security in mind right from the start.

Yes, but clearly most organizations don't take security seriously. Instead
they bolt it on at the end, just enough to appease the corporate attorneys,
the same way they do for accessibility or any other _necessary requirement_
whose absence results in class-action lawsuits.

------
cryptica
> It turns out the defense against many of these vulnerabilities is the same,
> at a high level: validate any untrusted input before using it

It should be noted that type safety itself does not solve this problem. For
external input, you need explicit schema validation or the type needs to be
enforced at the protocol level using something like Protocol Buffers with
implicit schema validation.
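
A perfectly well-typed object full of hostile values will sail straight past
the type checker, so you still need an explicit check at the boundary.
Roughly (field names and limits here are made up for illustration):

    def parse_transfer_request(payload: object) -> dict:
        # Explicit schema validation of external input, independent of the
        # language's own type system.
        if not isinstance(payload, dict):
            raise ValueError("expected an object")
        amount = payload.get("amount")
        if not isinstance(amount, int) or isinstance(amount, bool):
            raise ValueError("amount must be an integer")
        if not 0 < amount <= 10_000:
            raise ValueError("amount out of range")
        account = payload.get("account")
        if not isinstance(account, str) or not account.isalnum():
            raise ValueError("account must be an alphanumeric string")
        # Only validated, normalized data crosses this boundary.
        return {"amount": amount, "account": account}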

I think that security has nothing to do with the programming language and
everything to do with the developers who are writing the code.

------
gatestone
This excellent presentation was referenced:

Secure Design: A Better Bug Repellent, Christoph Kern, IEEE SecDev '17

[https://s3.amazonaws.com/cybersec-prod/secdev/wp-
content/upl...](https://s3.amazonaws.com/cybersec-prod/secdev/wp-
content/uploads/2017/06/25180810/Keynote-Secure-Design-A-Better-Bug-
Repellent1.pdf)

------
kephasp
It's ironic that the post would say security is a PL issue, but not address
capability-safe PLs like E, Pony or Caja…

------
paulie_a
While programming languages could definitely be more secure, programmers need
to put security first. Currently it's like giving a five-year-old a gun and
saying "go play".

------
Ace17
> Software Security Is a Programming Languages Issue

Then, how could a programming language help me prevent high-level security
bugs like Shellshock? Is it even possible/practical?

~~~
fulafel
Shellshock was a bug in the language implementation, so this is an interesting
question. It'd need some background on how the bug came to be.

------
ngneer
Security is the study of control. LangSec and PL have their role to play, but,
as others have noted, cannot be expected to guarantee security.

~~~
tptacek
This post isn't really about "LangSec", and I'd ask, in the years and years
we've had "LangSec", what have its major contributions been? I find it hard to
pin down, and not especially instructive, but I'm prepared to be schooled.

~~~
ngneer
Mainly its contribution has been the linguistic study of bugs, and the
recognition of the importance of well-defined recognizers as opposed to
shotgun parsers. That is the contribution to theory. In practice, people still
write shotgun parsers, because the incentives to avoid them are not there. The
market does not reward security. Also, if a programmer has an incentive to
alloc() for their program to function but no incentive to check that all
control paths free only once, you are left with memory bugs. I think that PL
can help with that, but LangSec cannot.

~~~
tptacek
I don't follow. Parsers for what? I'm never clear what they're talking about.
Parsers for programming languages? You don't think modern languages are
competently parsed? Parsers for file formats?

~~~
ngneer
Yes, parsers for incoming packets and file formats, for example. If they are
poorly written, you end up with something like Heartbleed. I should maybe add
that the LangSec folks try to provide formal footing for what an adversary can
achieve with a given set of primitives. They refer to it as programming a
weird machine. The reason that is important is that, in their view, access
to computational capability amounts to privilege. If you have a packet coming
in from the untrusted outside, you should not allow it to be parsed by a
parser with unbounded computational complexity; rather, you should opt for a
computationally limited parser.

~~~
tptacek
Could you be more specific about how Heartbleed was a _parsing_ problem?

I'm familiar with the LangSec lingo, and the concept of a "weird machine",
much as I hate the term itself, has value. But it's not a product of LangSec so
much as a name for a concept we've had for decades.

~~~
ngneer
[https://xkcd.com/1354/](https://xkcd.com/1354/)

In Heartbleed the parser parsed an incoming field specifying length, but
failed to correlate it with the rest of the request. Had the parser been
written to a stricter specification, that would not have happened. Typically
that is what is meant by bounding computational complexity. Or at least that
is my understanding of it.
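
In toy-model terms (a simplification of the heartbeat message, not the real
TLS record layout), the missing check was roughly:

    import struct

    def build_heartbeat_response(request: bytes) -> bytes:
        if len(request) < 3:
            raise ValueError("truncated heartbeat request")
        msg_type, declared_len = struct.unpack(">BH", request[:3])
        payload = request[3:]
        # The Heartbleed-class mistake is trusting declared_len and echoing
        # that many bytes back from whatever memory follows the request.
        # A stricter parser cross-checks it against what actually arrived:
        if declared_len != len(payload):
            raise ValueError("declared length does not match received data")
        return struct.pack(">BH", 2, declared_len) + payload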

~~~
tptacek
I'm familiar with the bug, but not with the parser-theoretic response to the
bug. Would LangSec somehow do away with the on-the-wire length encoding? Or
would it simply say "your parser should check the length of incoming data"?
Isn't that about as useful an insight as "validate user input"?

~~~
ngneer
You raise a good point. The thing to realize is that the incoming packet data,
like all data, is code. "Code written for which machine?" you might ask. Well,
think of that data as code that is executed by the parser. If you think of the
parser as the machine (yes, it is weird), then it becomes easier to see how
the adversary tries to program it using data as code. Just like we advise
people never to eval() untrusted data, the LangSec response is not just to
validate user input, but to constrain the programmability of the machine. By
lowering its ability to compute, you are effectively lowering the privilege
given to the adversary. So, the LangSec answer would be to design the protocol
so as to not require a parser that can become so easily confused by
conflicting information. Hope that helps. FWIW, I think you raise a great
point about the ease of use and practical utility of LangSec, and agree that
its potential is thus capped.

~~~
ngneer
As another example, you should never unpickle untrusted data. The machine for
parsing pickles is too powerful.

[https://stackoverflow.com/questions/25353753/python-can-i-
sa...](https://stackoverflow.com/questions/25353753/python-can-i-safely-
unpickle-untrusted-data)
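
The standard demonstration of why (along the lines of what the linked
question covers):

    import pickle

    class Exploit:
        def __reduce__(self):
            import os
            # Anything unpickling this object will call os.system(...).
            return (os.system, ("echo arbitrary code runs here",))

    malicious = pickle.dumps(Exploit())
    pickle.loads(malicious)  # executes the command: the "parser" is a full VM

    # For untrusted input, prefer a deliberately weak format instead:
    import json
    json.loads('{"safe": true}')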

