
Just-In-Time Static Type Checking for Dynamic Languages - pron
https://arxiv.org/abs/1604.03641
======
junke
Fine, but surprisingly the related work makes no mention of Lisp, whose
specification defines a COMPILE function. In a language where the
compiler is part of your runtime, type checking is equally available at
different execution times. I mean, when I connect to a running instance of
SBCL (Common Lisp) to compile my programs, that's exactly what happens:
static-type analysis at runtime. Let's say I define a function which builds a
function and compiles it:

    
    
          CL-USER> (defun foo (x) (compile () `(lambda (u) (aref ,x u))))
          FOO
    

Call it with an array:

    
    
          CL-USER> (foo #(2 3 2))
          #<FUNCTION (LAMBDA (U)) {100B3C610B}>
          NIL
          NIL
    

The resulting closure gives access to the elements of that array:

    
    
          CL-USER> (funcall * 0)
          2
    

Call FOO again, with a bad input:

    
    
          CL-USER> (foo 332)
          ; in: LAMBDA (U)
          ;     (AREF 332 U)
          ; 
          ; note: deleting unreachable code
          ; 
          ; caught WARNING:
          ;   Constant 332 conflicts with its asserted type ARRAY.
          ;   See also:
          ;     The SBCL Manual, Node "Handling of Types"
          ; 
          ; compilation unit finished
          ;   caught 1 WARNING condition
          ;   printed 1 note
          #<FUNCTION (LAMBDA (U)) {100B41E16B}>
          T
          T
    

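COMPILE reports the type error as a WARNING condition (one that I could handle
if needed); the two T values it returns indicate that warnings were signaled
and that compilation failed.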

~~~
ufo
Common LISP is a bit weird because its type system was originally designed
primarily as a way to speed up the executables (by letting the compiler
aggressively optimize memory representation and remove runtime checks).

Nowadays that kind of efficiency isn't as big a deal as it used to be, and
people lean more towards using the type system for error detection. Still, I
can't shake the feeling that CL's type system is a bit off compared to other
modern type systems.

(this is no excuse to not at least cite Common LISP though)

~~~
rurban
No it's not. That's exactly the same use-case those slow dynamic languages
have now. This kind of efficiency is a big deal, in languages designed by
people with the same mindset.

An error is the same error whether it shows up at compile time, in the test
suite, or at the customer's site, though catching it at compile time is of
course preferred. In Lisp you have all three cases. In static languages only
the first.

These guys here solve their type-checking performance problem by deferring the
check to dynamic code. Checks are expensive if your type checking code is
slow, or the source code is a dynamic mess. We are talking about Ruby on Rails
here, which is the biggest dynamic mess one can think of. It's similar to the
problem the JIT guys have: deferring compilation to only the code that is
actually invoked, and skipping dead branches.

------
MustardTiger
This isn't static type checking, though: if it happens at run time then it is
by definition dynamic. And these languages already do that. What problem is
this solving? I already get a runtime error today, so how is this runtime error
better?

~~~
munificent
It's somewhere between the two. Consider this program (in C-ish pseudocode):

    
    
        main() {
          foo();
        }
    
        foo() {
          if (false) {
            int i = "not an int";
          }
        }
    
        bar() {
          int j = "not an int";
        }
    

A statically typed language would report two type errors. A dynamically
checked language would report none. Hummingbird would report one.

It gives some of the benefits of static typing: it reports errors in code
paths not hit through imperative branching. But it doesn't give all of them,
since it doesn't report errors in uncalled functions.

However, what it gives you in return is the ability to handle functions that
don't even exist at program startup and are instead defined at runtime later
in the program's execution through metaprogramming.
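
A rough Python sketch of the behavior described above. This is not Hummingbird's actual implementation (Hummingbird targets Ruby); the names `load`, `check_body`, and `StaticTypeError` are made up for illustration, and the "type checker" only understands a string literal assigned to a variable annotated `int`. The point is when the check runs and how much of the body it covers:

```python
import ast


class StaticTypeError(Exception):
    """Raised by the toy checker, before the offending function runs."""


def check_body(func_node):
    # Walk every statement in the function, including unreachable branches.
    for node in ast.walk(func_node):
        if (isinstance(node, ast.AnnAssign)
                and isinstance(node.annotation, ast.Name)
                and node.annotation.id == "int"
                and isinstance(node.value, ast.Constant)
                and not isinstance(node.value.value, int)):
            raise StaticTypeError(
                f"line {node.lineno}: {node.value.value!r} is not an int")


SOURCE = '''
def foo():
    if False:
        i: int = "not an int"

def bar():
    j: int = "not an int"
'''


def load(source):
    """Define the functions in `source`; check each one on its first call."""
    tree = ast.parse(source)
    namespace = {}
    exec(compile(tree, "<demo>", "exec"), namespace)
    nodes = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    checked = set()

    def call(name, *args):
        if name not in checked:       # analyze the whole body once, at entry
            check_body(nodes[name])
            checked.add(name)
        return namespace[name](*args)

    return call


call = load(SOURCE)
try:
    call("foo")                       # error is in a dead branch of foo
except StaticTypeError as err:
    print("rejected before running foo:", err)
# bar is never called, so its error is never reported.
```

Plain Python runs both `foo` and `bar` without complaint, which corresponds to the "dynamically checked" column; the sketch reports the error in `foo` only, which corresponds to the "Hummingbird would report one" case.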

~~~
MustardTiger
>Hummingbird would report one.

When?

>However, what it gives you in return is the ability to handle functions that
don't even exist at program startup and are instead defined at runtime later
in the program's execution through metaprogramming.

Those are already dynamically "type checked", how is this adding anything?

~~~
munificent
> When?

It would report the error in foo() when foo() is invoked, even though the code
path containing the error is never executed.

> Those are already dynamically "type checked", how is this adding anything?

Only code paths that actually get executed are dynamically checked. This
checks all code paths in the method.

~~~
MustardTiger
>It would report the error in foo() when foo() is invoked

That's "dynamic typing".

>Only code paths that actually get executed are dynamically checked. This
checks all code paths in the method.

Ok, so it is extended dynamic typing. Why on earth is it being described as
having anything to do with static typing?

~~~
junke
Because it uses static analysis techniques, that's all. You type variables and
consider all execution paths at once, instead of executing code and checking
the types of values.

In particular, the code that was generated won't get executed if it can be
proved beforehand that there is a type error.

I agree that in effect you won't get an error message until execution, but
that does not mean that this is dynamic typing. Execution time and compile
time are generally interleaved with metaprogramming so it can make sense to
consider static-analysis for code at runtime.

What I find strange is that it happens "Just-In-Time", which to me
means at the last possible moment. That runs counter to what static
typing aims to do. I would prefer "ASAP static-typing" (the code could be
analyzed long before it is run).

That's why I still find the COMPILE approach of Lisp better, since we have
precise control over when analysis is done.

------
raould42
vs. "¿How can a dynamically typed language not actively prevent static
checking?" [http://lambda-the-ultimate.org/node/5320](http://lambda-the-ultimate.org/node/5320)

------
eridius
Just reading the abstract it's not entirely clear what's going on. My
impression is that it instruments the app at runtime to gather type signatures
for methods, and then.. it saves those type signatures somewhere that can be
used to statically type-check the program later?

Two things come to mind when considering this approach:

1\. How does it handle the type signatures changing as the code is modified?
Either it must detect when the code is modified and invalidate its saved
signatures for that (although dynamic metaprogramming would make that hard),
or you'll just have to live with static type-checking temporarily failing
until you've run the program again to let it gather a new type signature.

2\. This can only be accurate if you've actually tested all of your code
paths, because any code path you don't test may call some method using a type
signature that Hummingbird hasn't seen before. Granted, the scripting language
community already focuses heavily on 100% code coverage (though 100% code
coverage doesn't necessarily mean all the code paths have been tested), but it
still seems like an important limitation to be aware of.

~~~
scott_s
Point 1 is explained, at a high level, in the introduction:

"In Hummingbird, user-provided type annotations actually execute at _run-time_,
adding types to an environment that is maintained during execution. As
metaprogramming code creates new methods, it, too, executes type annotations
to assign types to dynamically created methods. Then whenever a method m is
called, Hummingbird _statically_ type checks m’s body in the current dynamic
type environment. More precisely, Hummingbird checks that m calls methods at
their types as annotated, and that m itself matches its annotated type.
Moreover, Hummingbird caches the type check so that it need not recheck m at
the next call unless the dynamic type environment has changed in a way that
affects m."

Personally, I think they're pushing the bounds of what "statically" means, but
hey, they do say it's "just-in-time static". It does indeed do the checking
before executing the function, it just happens to be _immediately_ before
executing the function. A paragraph later:

"To ensure our approach to type checking is correct, we formalize Hummingbird
using a core, Ruby-like language in which method creation and method type
annotation can occur at arbitrary points during execution. We provide a
flow-sensitive type checking system and a dynamic semantics that invokes the
type system at method entry, caching the resulting typing proof. Portions of
the cache may be invalidated as new methods are defined or type annotations
are changed. We prove soundness for our type system."

Based on how their approach works, I don't think Point 2 stands.
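
To make the quoted mechanism concrete, here is a toy Python sketch of "check at method entry, cache the result, invalidate when the type environment changes". It is only an illustration under heavy assumptions: `JitChecker` and all of its methods are hypothetical names, each method has a single annotated parameter type, and the checker only looks at literal arguments in direct calls. The paper's system is for Ruby and is far more complete.

```python
import ast


class StaticTypeError(Exception):
    pass


class JitChecker:
    def __init__(self):
        self.param_types = {}  # method name -> annotated parameter type
        self.trees = {}        # method name -> parsed definition
        self.funcs = {}        # shared namespace the methods run in
        self.deps = {}         # cache: checked method -> callees it relied on
        self.checks_run = 0

    def annotate(self, name, param_type):
        """Runs at run time; changes the dynamic type environment."""
        self.param_types[name] = param_type
        self._invalidate(name)

    def define(self, source):
        """Create methods at run time, as metaprogramming would."""
        tree = ast.parse(source)
        exec(compile(tree, "<runtime>", "exec"), self.funcs)
        for node in tree.body:
            if isinstance(node, ast.FunctionDef):
                self.trees[node.name] = node
                self._invalidate(node.name)

    def _invalidate(self, name):
        # Drop cached checks for `name` and for every method that calls it.
        stale = [m for m, used in self.deps.items() if m == name or name in used]
        for method in stale:
            del self.deps[method]

    def _check(self, name):
        self.checks_run += 1
        used = set()
        for node in ast.walk(self.trees[name]):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callee = node.func.id
                used.add(callee)
                expected = self.param_types.get(callee)
                for arg in node.args:
                    if (expected is not None
                            and isinstance(arg, ast.Constant)
                            and not isinstance(arg.value, expected)):
                        raise StaticTypeError(
                            f"{name} calls {callee} with {arg.value!r}, "
                            f"expected {expected.__name__}")
        self.deps[name] = used  # cache the successful check

    def call(self, name, *args):
        if name not in self.deps:  # not checked yet, or invalidated
            self._check(name)
        return self.funcs[name](*args)


jit = JitChecker()
jit.annotate("double", int)
jit.define("def double(n):\n    return n * 2\n")
jit.define("def main():\n    return double(21)\n")
jit.call("main")              # first call: body of main is checked
jit.call("main")              # second call: cache hit, no recheck
jit.annotate("double", str)   # environment change invalidates main's check
```

After the last line, the next `jit.call("main")` rechecks `main` against the new annotation and raises `StaticTypeError` before `main` runs, which is the cache-invalidation behavior the quoted passage describes.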

~~~
eridius
Oh ok. So it's not really static at all, they're just calling it that because
it sounds better. It's simply dynamic type-checking that checks the types
immediately prior to calling the function instead of checking them at the top
of the function body. Offhand, it seems that the only benefit to this approach
(checking prior to calling the function instead of checking inside the
function) is the ability to cache the results and skip checking the same
callsite in the future if nothing's changed.

Being able to significantly optimize type-checking for scripting languages
like this to the point where you can actually run this in production seems
quite valuable. But calling it "static type-checking" is simply wrong.
Granted, they're calling it "just-in-time static", but that phrase is
contradictory; it would be like saying "dynamic static type checking".

------
IshKebab
Doesn't sound like it solves my main objections to dynamic typing:

1\. No compile-time type checking.

2\. No autocomplete / intellisense in IDEs (you can try but it's never very
good).

~~~
riyadparvez
My main objection is always readability. Sure, it is nice when I am writing
code and know the type of each variable off the top of my head, but when I am
reading code, the lack of type information becomes a nightmare.

~~~
B1FF_PSUVM
> My main objection is always readability.

Strange, that's my objection to type declarations.

~~~
ufo
I'm not sure I would use the word "readability" there. I'd prefer saying that
types aid in documentation.

