Hacker News new | past | comments | ask | show | jobs | submit login

You can argue that YouTube is less dynamic than Twitter. When a user posts a message on Twitter, that message has to be pushed to all the users that follow that user and so every user has a personalized stream of incoming tweets that has to be updated, not in realtime, but with small latency nonetheless.

Also, Twitter doesn't have Google's infrastructure.

In regards to "static compilation", that's not the important bit, but rather the performance of the virtual machine. The JVM, at its core, is not working with static types. The bytecode itself is free of static types, except for when you want to invoke a method, in which case you need a concrete name for the type for which you invoke that method ... this is because the actual method that gets called is not known, being dispatched based on "this", so you need some kind of lookup strategy and therefore the conventions that Java uses for that lookup are hardcoded in the bytecode. However, invokeDynamic from JDK 7 allows the developer to override that lookup strategy, allowing one complete dynamic freedom at the bytecode level, with good performance characteristics.

The real issue is the JVM versus the reference implementations of Ruby/Python. The JVM is the most advanced mainstream VM (for server-side loads at least).

Unfortunately for Facebook, they didn't have a Charles Oliver Nutter to implement a kickass PHP implementation on top of the JVM - not that it's something feasible, because PHP as a language depends a lot on a multitude of C-based extensions. The more pure a language is (in common usage), the easier it is to port to other platforms. Alternative Python implementations (read Jython, IronPython) have failed because if you want to port Python, you also have to port popular libraries such as NumPy. Which is why the PyPy project is allocating good resources towards that, because otherwise nobody would use it.




> The JVM, at its core, is not working with static types. The bytecode itself is free of static types, except for when you want to invoke a method, in which case you need a concrete name for the type for which you invoke that method ...

Not sure what gave you this impression, as the majority of Java bytecode instructions are typed. For example, the add instruction comes in typed variants: iadd (for ints), ladd (for longs), dadd (for doubles), fadd (floats), etc.

The same is true for most other instructions: the other arithmetic instructions (div, sub, etc.), the comparison instructions (*cmp), pushing constants on the stack, setting and loading local variables, returning values from methods, etc.

http://en.wikipedia.org/wiki/Java_bytecode_instruction_listi...

InvokeDynamic, as you point it out, was added to make implementing dynamic languages on the JVM easier, because the JVM was too statically typed at its core.


Arithmetic operations on numbers are not polymorphic, but polymorphism has nothing to do with static typing per se. You're being confused here by the special treatment the JVM gives to primitives, special treatment that was needed to avoid the boxing/unboxing costs, but that's a separate discussion and note that hidden boxing/unboxing costs can also happen in Scala, which treats numbers as Objects.

Disregarding primitives, the JVM doesn't give a crap about what types you push/pop the stack or what values you return.

invokeDynamic is nothing more than an invokeVirtual or maybe an invokeInterface, with the difference that the actual method lookup logic (specific to the Java language) is overridden by your own logic, otherwise it's subject to the same optimizations that the JVM is able to perform on virtual method calls, like code inlining:

http://cr.openjdk.java.net/~jrose/pres/200910-VMIL.pdf

> ... because the JVM was too statically typed at its core

Nice hand-waving of an argument, by throwing a useless Wikipedia link in there as some kind of appeal to authority.

I can do that too ... the core of the JVM (the HotSpot introduced in 1.4) is actually based on Strongtalk, a Smaltalk implementation that used optional-typing for type-safety, but not for performance:

http://strongtalk.org/ + http://en.wikipedia.org/wiki/HotSpot#History


> Nice hand-waving of an argument, by throwing a useless Wikipedia link in there as some kind of appeal to authority

No need to get agressive over this :) I disagreed with your first comment regarding the dynamic nature of the JVM, and replied trying to explain why.

I posted the wikipedia link not as kind of an "appeal to authority", but to give the readers a full listing of bytecode instructions, so that they can check what I was saying for themselves.

> Disregarding primitives, the JVM doesn't give a crap about what types you push/pop the stack or what values you return.

It depends how you see things: the JVM can't possibly provide instructions for every possible user type, so apart from primitives, the other object types are passed around as pointers or references, but whenever you try to do something other than storing/loading them on the stack, the type checking kicks in, ensuring that the reference being manipulated has the right types.

For instance, the putfield instruction doesn't just take the field name where the top of the stack is going to get stored. It also takes the type of the field as a parameter, to ensure that the types are compatible.

Constrast this to Python's bytecode, where the equivalent STORE_NAME (or the other variants) doesn't ask you to provide type informations.

But then again, we might be splitting hairs here: since this type checking is happening at runtime (when the JVM is running your code), you could indeed question calling it "static typing", which is usually performed at compile time (an is partially performed by the java compiler for example).


I don't think pushing tweets to every user is a scalable approach. It would seem more sensible to read rather than push tweets associated with a user. I don't know about Twitter's infrastructure, but storing more than 1 copy of the same data wouldn't make that much sense.


> I don't think pushing tweets to every user is a scalable approach. It would seem more sensible to read rather than push tweets associated with a user. I don't know about Twitter's infrastructure, but storing more than 1 copy of the same data wouldn't make that much sense.

Suppose I have 5000 followers. How do you propose to get the timeline? Since you said "single copy of data", I assume you would do something like this:

https://github.com/mitsuhiko/flask/blob/master/examples/mini...

At twitter scale, that query will be a major bottleneck.


When a user posts a video it gets published in the subscription feed of every user subscribed(maybe not with the ajax like twitter does); plus they tell you which videos have you already watched. Plus the search feature is probably more used than in Twitter.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: