Hacker News new | past | comments | ask | show | jobs | submit login

> Can't we extend this argument to eliminating basically all static typing?

No, because static typing exists in all sorts of places. This argument is primarily about cases where you're exchanging data, which is a very specific use case.




To elaborate on your point:

Static type systems in programming languages are designed to break at compilation-time. The reason this works is because all users are within the same “program unit”, on the same version.

In other words, static typing allows more validation to be automated, and removes the need for multiple simultaneous versions, but assumes that the developer has access and ability to change all other users at the same “time” of their own change.

I find this whole topic fascinating. It seems like programmers are limited to an implicit understanding of these differences but it’s never formalized (or even properly conceptualized). Thus, our intuition often fails with complex systems (eg multiple simultaneous versions, etc). Case in point: even mighty Google distinguished engineers made this “billion-dollar mistake” with required fields, even though they had near-perfect up-front knowledge of their planned use-cases.


It's actually the opposite. The billion dollar mistake is to have pervasive implicit nullability, not to have the concept of optionality in your type system. Encoding optionality in the type system and making things required by default is usually given as the fix for the billion dollar mistake.


Huh? Did you read the link, from the guy who was there during the major failure at Google that led to proto3 being redesigned without that flaw?

The whole lesson is that you can’t apply the lessons from static type systems in PLs when you have multiple versions and fragmented validation across different subsystems. Counter-intuitively! Everyone thought it was a good idea, and it turned out to be a disaster.


I did read the link and I was at Google at the time people started arguing for that. With respect, I think the argument was and still is incorrect, that the wrong lessons were drawn and that proto3 is worse than proto2.


Alright, fair enough. Apologies for the dismissive tone. Could you elaborate (or point to) these wrong lessons or an alternative?


OK, what do you do when a message comes in missing a field? Crash the server?


you reject the message in the framework? and if the client is aware it’s required they fail to send?

the bigger challenge with proto3 is that people use it both for rpc and storage, in some cases directly serializing rpc payloads. Disregarding how awful a choice that is, you likely want to trade off flexible deserialization of old data at the expense of rigidity, and conformance.


It remains a big asterisk to me, why was some random middleware validating an end-to-end message between two systems, instead of treating it as just an opaque message.

Why are we not having this debate about "everything must be optional" for Internet Packets (IP) for example? Because it's just binary load. If you want to ensure integrity you checksum the binary load.


Things like distributed tracing, auth data, metrics, error logging messages and other “meta-subsystems” is certainly typical use cases. Reverse proxies and other http middleware do exactly this with http headers all the time.


No one has near-perfect up-front knowledge of a software system designed to change and expand. The solution space is too large and the efficient delivery methods are a search thru this space.


I may have phrased it poorly. What I should have said is that Google absolutely could have “anticipated” that many of their subsystems would deal with partial messages and multiple versions, because they most certainly already did. The designers would have maintained, developed and debugged exactly such systems for years.


Makes sense: they knew arbitrary mutability was a requirement but did not think it thru for the required keyword.


Static types are a partial application/reduction when certain mutable or unknown variables become constants (i.e. "I for sure only need integers between 0-255 here").

I'm not rejecting static types entirely, and yes I was discussing exchanging data here, as Alan Kay's OOP is inherently distributed. It's much closer to Erlang than it is to Java.


> I'm not rejecting static types entirely, and yes I was discussing exchanging data here

OK I guess I'm having a hard time reconciling that with:

> basically all static typing


I'm not the person you're responding to, but I interpreted their comment as, "doesn't the argument against having protobuf check for required fields also apply to all of protobuf's other checks?"

From the linked article the post: "The right answer is for applications to do validation as-needed in application-level code. If you want to detect when a client fails to set a particular field, give the field an invalid default value and then check for that value on the server. Low-level infrastructure that doesn’t care about message content should not validate it at all."

(I agree that "static typing" isn't exactly the right term here. But protobuf dynamic validation allows the programmer to then rely on static types, vs having to dynamically check those properties with hand-written code, so I can see why someone might use that term.)


Sorry, I see how I'm vague. The idea is you have no "pre-burned" static types, but dynamic types. And static types then become a disposable optimization compiled out of more dynamic code, in the same way JIT works in V8 and JVM for example (where type specialization is in fact part of the optimization strategy).


You're describing dynamic types


But with the benefit of static types, and without the drawbacks of static types.


No. "Types only known at runtime" are dynamic types. "And also you can optimize by examining the types at runtime" is just dynamic types. And it does not have the benefit of static types because it is dynamic types.


This is devolving into a "word definition war" so I'll leave aside what you call static types and dynamic types and get down to specifics. Type info is available in these flavors, relative to runtime:

1. Type info which is available before runtime, but not at runtime (compiled away).

2. Type info which is available at runtime, but not at compile time (input, statistics, etc.).

3. Type info which is available both at compile time and runtime (say like a Java class).

When you have a JIT optimizer that can turn [3] and [2] into [1], there's no longer a reason to have [1], except if you're micro-optimizing embedded code for some device with 64kb RAM or whatever. We've carried through legacy practices, and we don't even question them, and try to push them way out of their league into large-scale distributed software.

When I say we don't need [1], this doesn't mean I deny [3], which is still statically analyzable type information. It's static types, but without throwing away flexibility and data at runtime, that doesn't need to be thrown away.


Short of time travel one can not turn (3) or (2) into (1). I'm not sure where the confusion here is or what you're advocating for because this isn't making sense to me.

> there's no longer a reason to have [1]

I guess if you're assuming the value of static types is just performance? But it's not, not by a long shot - hence 'mypy', a static typechecker that in no way impacts runtime.

I think this conversation is a bit too confusing for me so I'm gonna respectfully walk away :)


The confusion is to assume "runtime" is statically defined. JIT generates code which omits type information that's determined not to be needed in the context of the compiled method/trace/class/module. That code still "runs" it's "runtime".


Yes, the types that JIT omits are dynamic types.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: