A couple of years ago I wrote a simple tool (https://github.com/aerofs/openjdk-trim) that allows you to filter out what you don't need. We were able to get the size of OpenJDK from 100MB down to around 10MB.
Note that the work of determining which classes you need is entirely manual. In our case I used strace to check which classes were being loaded.
Already today you can build a custom JDK in the early access release.
0 - http://zulu.org/zulu-9-pre-release-downloads/
It'd be a short hop from here to a tool that basically does for JDK-platform apps what Erlang's releases do for the ERTS platform: builds a new JRE (as a portable executable, not an installer) that actually contains the app and its deps in the JRE's stdlib, such that you just end up with a dir containing "a JRE", plus a runtime "boot config" file that tells the JRE what class it should run when you run the JRE executable.
With such a setup, your Java program could actually ship as an executable binary, rather than a jar and an instruction to install Java. Nobody would have to know Java's involved! :)
We fixed this whole class of issues by doing exactly what you suggest: bundling the JRE and writing our own launcher binary.
Not only is it a short hop, it already exists :P
In that way the JVM can be "heavy".
It is quite easy to know which features those are when they require a flag named -XX:+UnlockCommercialFeatures; you just don't use them by mistake.
In the future, we'll hopefully have the Substrate VM for Java and other Truffle-supported languages. It's embeddable and also does reachability analysis to exclude unused library code. For now, it seems to be closed source.
That's the difference between an industrial-strength platform like Erlang, and a dev-centric deployment nightmare like Python. Java is normally on the enterprise side of the spectrum, but unfortunately it didn't get deployment right for quite a long time, even though it appears it's getting there lately.
Also, Proguard's config is pretty complicated and the results are hard to understand. Our approach (openjdk-trim) is dumb simple: unpack the java runtime jars, use rsync to filter out entire directories we don't need, pack it back.
It's a simple, brute-force approach compared to Proguard's advanced static analysis, but in this case it gives better results. Maybe a good example of "worse is better".
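For anyone who wants to see the "filter out entire directories" idea in code, here is a rough sketch. It is not how openjdk-trim itself works (that unpacks the jars and uses rsync); it is an assumed equivalent that copies a jar while skipping every entry under a list of directory prefixes. The class name `JarTrimSketch` and its methods are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of openjdk-trim (which uses rsync on unpacked jars).
class JarTrimSketch {

    // True when the entry lives under one of the directories to drop.
    static boolean isExcluded(String entryName, List<String> excludedDirs) {
        for (String dir : excludedDirs) {
            if (entryName.startsWith(dir)) {
                return true;
            }
        }
        return false;
    }

    // Copies every non-excluded entry from inJar to outJar; returns the number of entries kept.
    static int trim(String inJar, String outJar, List<String> excludedDirs) throws IOException {
        int kept = 0;
        try (ZipFile zip = new ZipFile(inJar);
             OutputStream raw = Files.newOutputStream(Paths.get(outJar));
             ZipOutputStream out = new ZipOutputStream(raw)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            byte[] buffer = new byte[8192];
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (isExcluded(entry.getName(), excludedDirs)) {
                    continue; // drop the whole directory, no static analysis involved
                }
                out.putNextEntry(new ZipEntry(entry.getName()));
                if (!entry.isDirectory()) {
                    try (InputStream in = zip.getInputStream(entry)) {
                        int n;
                        while ((n = in.read(buffer)) != -1) {
                            out.write(buffer, 0, n);
                        }
                    }
                }
                out.closeEntry();
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.out.println("usage: JarTrimSketch <in.jar> <out.jar> <dir/> [<dir/> ...]");
            return;
        }
        List<String> excluded = Arrays.asList(args).subList(2, args.length);
        System.out.println("kept " + trim(args[0], args[1], excluded) + " entries");
    }
}
```

Which directories are safe to drop is still the manual part: something like `javax/swing/` might be fine for a headless service and fatal for a desktop app.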
I think once there is an official way to trim the JDK down, making deployment super easy and fast (a single executable file), Java will pick up steam again.
The only problem and hesitation we have...is Oracle.
Why is it so big? What do we gain?
I don't have a system with 10MB of cache, so I imagine Java can't run faster than memory...
* Who said the 10MB is all used at once?
* I don't know your hardware, but there is a very, very good chance you are actually quite wrong about <10MB of cache. These days, most magnetic disks have more cache than that, and if you are using SSDs, there's a boatload more cache than that in there.
* If you were referring strictly to CPU cache, then I'm even more confused, because the entire existence of that stuff is predicated on it being faster than memory, so... (and even still, if your total CPU cache isn't 10MB, it likely isn't that much smaller).
* It's not like the whole package would sit in RAM the whole time anyway. By your same assertion, I could say that one of my CPU registers is only 64-bits wide, so I imagine all programs larger than 64-bits can't run faster than L3 cache...
I'm not sure why you'd say it is too big. The article page is 1.4 MB alone... and it still needs to leverage a general purpose runtime/JIT that is orders of magnitude larger to do its single fixed purpose.
The parent was suggesting that this was all that was actually needed out of the 100mb or so downloadable. If you think the JVM is smaller, how small is it exactly?
> If you were referring strictly to CPU cache, then I'm even more confused, because the entire existence of that stuff is predicated on it being faster than memory, so... (and even still, if your total CPU cache isn't 10MB, it likely isn't that much smaller).
I don't have anything with 10MB cache.
> It's not like the whole package would sit in RAM the whole time anyway. By your same assertion, I could say that one of my CPU registers is only 64-bits wide, so I imagine all programs larger than 64-bits can't run faster than L3 cache...
If you get into L1, you get about 1000x faster.
> I'm not sure why you'd say it is too big.
Maybe I have a different perspective? If a 600kb runtime is 1000x faster, I want to know what I get by being 10x bigger. I'm quite surprised that there are so many responders defending it given that these benchmarks were just on Hacker News a few days ago.
You could easily see that your assumption is wrong by observing that a typical C application is not 1000 times faster than a typical Java application.
Cache fills optimize for linear scans, and have nothing to do with eviction.
> You could easily see that your assumption is wrong by observing that a typical C application is not 1000 times faster than a typical Java application.
What assumption are you talking about?
Where do you find your typical applications? Spark is supposed to be one of the fastest Java implementations of a database system, and it's 1000x slower than the fastest C-implementation database systems, but this is clearly a problem limited by memory.
What about problems that are just CPU-bound? C is at least 3x faster than Java for those, so just by being "a little bit faster" (if 3x is a "little" faster) then as soon as we introduce latency (like memory, or network, or disk, and so on) this problem magnifies quickly.
Wow.. so much wrong, I'm not sure how to unpack it all.
a) Spark is Scala, not Java, though both do use the JVM, so I'll give you that.
b) Spark is not a database system, though it is a framework for manipulating data
c) Spark is generally considered to be much faster than Hadoop, and does its job well, but I'm not sure it qualifies as the fastest anything.
d) By any reasonable interpretation, the fastest Java database system is definitely not Spark. You will find that benchmarks of Java database systems generally don't even include Spark (as an example https://github.com/lmdbjava/benchmarks/blob/master/results/2...)
e) Fast is an ambiguous term... usually you are looking at things like latency, throughput, efficiency, etc. I'm not sure which you mean here.
f) If you know anything at all about runtimes, you'd know that if you've found a Java based system that is 1000x slower than a C based system, either your benchmark is extremely specialized, broken, or you are comparing apples & oranges.
Look, Java certainly has some overhead to it, and sometimes it significantly impacts performance. Before you get too excited about attributing it to runtime size, you might want to look at the size of glibc...
What database would you recommend for solving the taxi problem using the JVM?
> Spark is Scala, not Java, though both do use the JVM, so I'll give you that.
What does JVM stand for? I was under the impression that we were talking about its size (10mb v. 100mb).
> You will find that benchmarks of Java database systems generally don't even include Spark
And? What are we talking about here?
> If you know anything at all about runtimes, you'd know that if you've found a Java based system that is 1000x slower than a C based system, either your benchmark is extremely specialized, broken, or you are comparing apples & oranges.
We're talking about business problems, not about microbenchmarks.
If this is a business problem, and I solve it in 1/1000th the time, for roughly the same cost, then what exactly is your complaint?
> Fast is an ambiguous term... usually you are looking at things like latency, throughput, efficiency, etc. I'm not sure which you mean here.
It's not ambiguous. I'm pointing to the timings for a specific, and realistic business problem.
> Look, Java certainly has some overhead to it, and sometimes it significantly impacts performance. Before you get too excited about attributing it to runtime size, you might want to look at the size of glibc...
Does Java include glibc?
What exactly is your point here?
You have me at a disadvantage here... The only taxi problem that comes to mind is a probability problem that I'd not likely use a database for at all...
> If this is a business problem, and I solve it in 1/1000th the time, for roughly the same cost, then what exactly is your complaint?
If you came to the conclusion that your business problem runs 1000x faster because of differences in the runtime... you've made a mistake. It is far more likely your benchmark is flawed, or there are significant differences in the compared solutions beyond just the runtimes.
Seriously, I've spent a career dealing with situations exactly like that: "hey, this is 1000x slower than what we were doing before... can you fix that?". Once you are dealing with optimized runtimes, while there can be important differences between them, there just isn't that much room left for improvement.
> It's not ambiguous. I'm pointing to the timings for a specific, and realistic business problem.
The problem is perhaps not ambiguous to you, but you haven't described it in terribly specific terms. More importantly though, you haven't described what you mean by "faster"? That's the ambiguity.
> Does Java include glibc?
> What exactly is your point here?
C programs do. Lots of very efficient, high performance C programs.
It's the problem that I linked to previously.
Finding good benchmarks is hard: Business problems are a good one because these are the ways experts will solve problems using these tools, and we can discuss the choice of tooling, whether this is the right way to solve the problem, and even what the best tools for this problem are -- in this case, GPU beats CPU, but what's amazing is just how close a CPU-powered solution gets by turning it into a memory-streaming problem (which the GPU needs to do anyway).
> If you came to the conclusion that your business problem runs 1000x faster because of differences in the runtime...
I haven't come to any conclusion.
There are a lot of differences between a JVM-powered business solution and a KDB-powered business solution, however one striking difference is the cache-effect.
However the question remains: What exactly do we get by having a big runtime? That we get to write loops?
Yes, it turns out the algorithmic approach you use to solve the problem tends to dwarf other factors.
> There are a lot of differences between a JVM-powered business solution and a KDB-powered business solution, however one striking difference is the cache-effect.
Wait, you looked at those benchmarks and came to the conclusion that the language runtimes were the key to the differences?
> However the question remains: What exactly do we get by having a big runtime? That we get to write loops?
There is absolutely no intrinsic value in a big runtime.
Now, one can trivially make a <1KB read-eval-print runtime. So I'll answer your question with a question: why do people not use <1KB runtimes?
At the risk of repeating myself: I don't have any conclusions.
> There is absolutely no intrinsic value in a big runtime.
And yet there is cost. It is unclear if that cost is a factor.
> Now, one can trivially make a <1KB read-eval-print runtime. So I'll answer your question with a question: why do people not use <1KB runtimes?
Because they are not useful.
We are looking at a business problem, think about the ways people can solve that problem, and cross-comparing the tooling used by those different solutions.
Is there really nothing to be gained here?
The memory-central approach clearly wins out so heavily (and the fact we can map-reduce across cores or machines as our problem gets bigger) is a huge advantage in the KDB-powered solution. It's also the obvious implementation for a KDB-powered solution.
Is this Spark-based solution not the typical way Spark is implemented?
Could a 10mb solution do the same if it can't get into L1? Is it worth trying to figure out how to make Spark work correctly if the JVM has a size limit? Is that a size limit?
There are a lot of questions here that require more experiments to answer, but one thing stands out to me: Why bother?
If I've got a faster tool, that encourages the correct approach, why should I bother trying to figure these things out? Or put perhaps more clearly: What do I gain with that 10mb?
That CUDA solution is exciting... There is stuff to think about there.
For someone who doesn't have any conclusions, you're making a lot of assertions that don't jibe with reality.
> And yet there is cost. It is unclear if that cost is a factor.
It's a factor... just not the factor you think it is.
> Because they are not useful.
I think you grokked it.
> The memory-central approach clearly wins out so heavily (and the fact we can map-reduce across cores or machines as our problem gets bigger) is a huge advantage in the KDB-powered solution. It's also the obvious implementation for a KDB-powered solution.
KDB is a great tool, but you are sadly mistaken if you think the trick to its success is the runtime. That its runtime is so small is impressive, and a reflection of its craftsmanship, but it isn't why it is efficient. For most data problems, the runtime is dwarfed by the data, so the efficiency with which the runtime organizes and manipulates the data dominates other factors, like the size of the runtime. This should be obvious, as this is a central purpose of a database.
> There are a lot of questions here that require more experiments to answer, but one thing stands out to me: Why bother?
Yes, you almost certainly shouldn't bother.
Spark/Hadoop/etc. are intended for massively distributed compute jobs, where the runtime overhead on an individual machine is comparatively trivial to inefficiencies you might encounter from failing to orchestrate the work efficiently. They're designed to tolerate cheap heterogeneous hardware that fails regularly, so they make a lot of trade-offs that hamper getting to anything resembling peak hardware efficiency. You're talking about a runtime fitting in L1, but these are distributed systems that orchestrate work over a network... Your compute might run in L1, but the orchestration sure as heck doesn't. Consequently, they're not terribly efficient for smaller jobs. There is a tendency for people to use them for tasks that are better addressed in other ways. It is unfortunate and frustrating.
Until you are dealing with such a problem, they're actually quite inefficient for the job... but that inefficiency is not a function of JVM.
Measuring the JVM's efficiency with Spark is like measuring C++'s efficiency with Firefox.
> If I've got a faster tool, that encourages the correct approach, why should I bother trying to figure these things out? Or put perhaps more clearly: What do I gain with that 10mb?
If you read the documentation, the gains should be clear. If you are asking the question, likely the gains are irrelevant to your problem. I would, however, caution you to worry less about the runtime size and more about the runtime efficiency. The two are often at best tenuously related.
The link you provided was to three distinct models of i7 processors... all with 8MB of L3 cache. I would argue that 8MB isn't much smaller than 10MB, but I will understand if you disagree. However, even the slowest of those processors also has 1MB of L2 cache and 256KB of L1 cache, not to mention other "cache-like" memory in the form of renamed registers, completion queues, etc. At most, we're talking <800KB shy of 10MB in cache.
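The cache arithmetic above, spelled out as a tiny sketch. The figures (8MB L3, 1MB L2, 256KB L1) are the ones quoted in this comment; the class and method names are just for illustration.

```java
// Adds up the cache sizes quoted above and compares them to the 10MB runtime.
class CacheBudget {

    // Total on-chip cache in KB.
    static long totalKb(long l3Kb, long l2Kb, long l1Kb) {
        return l3Kb + l2Kb + l1Kb;
    }

    // How far the cache falls short of the runtime size, in KB.
    static long shortfallKb(long runtimeKb, long cacheKb) {
        return runtimeKb - cacheKb;
    }

    public static void main(String[] args) {
        long cache = totalKb(8 * 1024, 1024, 256); // 8MB L3 + 1MB L2 + 256KB L1
        long shortfall = shortfallKb(10 * 1024, cache);
        System.out.println("total cache: " + cache + " KB");
        System.out.println("shy of 10MB by: " + shortfall + " KB");
    }
}
```

That works out to 9472 KB of cache, 768 KB short of 10MB, which is where the "<800KB" figure comes from.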
> If you get into L1, you get about 1000x faster.
I think you are making my point for me.
> Maybe I have a different perspective? If a 600kb runtime is 1000x faster, I want to know what I get by being 10x bigger.
You are assuming that at all times all of that 10MB must be touched by the processor at once. You can have a 10MB runtime where most of the cycles are being spent on a hotspot <4KB of data.... Having a hot spot that is orders of magnitude smaller than the full runtime is totally unsurprising. It's particularly true when your runtime has a JIT in it. With a JIT, most of the time, the bytes that are being executed aren't part of that 10MB, but rather are generated by it. Are you going to penalize your 600KB runtime for the size of the source code? ;-)
Still your point holds.
1000x slower doesn't sound like a huge win to me; it sounds like a huge cost, so my question is what do we gain by making our programs 1000x slower?
For comparison, a C++ wxWidgets 3.0 application isn't going to be much smaller than 10MB in release mode if you statically link it. Much as I hate to admit it, 10MB just isn't that big in an age of terabyte SSDs and systems with 32GB of RAM.
It's connected to your CPU by a serial communications interface so access is not uniform or timely, and if the CPU needs any of it, it stops what it's doing while it waits.
The "cache ram" (L1 and to a lesser extent L2) actually acts like the RAM that we learn about in Knuth, so that when we discuss algorithms in terms of memory/time costs, this is the number we should be thinking about. Algorithms that are performant on disk/drum are modern solutions for what you're calling "RAM".
Can't tell if sarcastic or... o_O.
In the unlikely case you're actually serious, you really need to rethink your perception of memory costs in 2017.
KDB is about 1000x faster than Spark, and is only about 600kb (and most of that is shared library dynamic linker stuff that makes interfacing with the rest of the OS easier). A big part of why it's fast is because it's small -- once you're inside cache memory everything gets faster.
That's the real cost of memory in 2017. So what did we gain for paying it?
You're comparing KDB running on 4x Intel Xeon Phi 7210 CPUs, totaling 256 physical CPUs.
Compared to the best result for Java/Spark, which was running on 11x m3.xlarge instances on AWS. That's only 44 CPUs, plus it's running on AWS, not 100% dedicated hardware, so it's tough to tell what sort of an impact the virtualization + EBS has on performance. Plus, from the AWS page: "Each vCPU is a hyperthread of an Intel Xeon core except for T2 and m3.medium", which does not do anything good for the results.
Yes, technically, KDB was 199.80x faster (not 1000!) than Java/Spark, when it was given vastly superior, dedicated hardware without virtualization, and when tackling a problem that the hardware setup is optimized for. Note that the author calls this out by saying "This isn't dissimilar to using graphics cards" when talking about the setup he was using for the KDB benchmarks.
To get a sensible idea of the relative difference in performance, you would have to compare KDB and Java/Spark both running on the Xeon Phis, and/or running both on 11x m3.xlarge AWS instances - and even then, if Java/Spark does poorly on the Xeon Phi test, that might just mean that the Java/Spark developers haven't optimized for that particular setup.
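To put rough numbers on the hardware gap, here is a back-of-the-envelope sketch. It assumes 64 cores per Xeon Phi 7210 and 4 vCPUs per m3.xlarge (which is where 256 and 44 come from), and it assumes performance scales linearly with core count, which is certainly not true in practice. Treat the result as an illustration of how much the raw 199.80x figure depends on hardware, not as a measurement.

```java
// Crude per-core normalization of the benchmark ratio discussed above.
// Assumes linear scaling with core count, which real systems do not deliver.
class CoreNormalization {

    // How many times more cores setup A has than setup B.
    static double coreRatio(int coresA, int coresB) {
        return (double) coresA / coresB;
    }

    // Raw speedup divided by the core-count advantage.
    static double perCoreSpeedup(double rawSpeedup, double coreRatio) {
        return rawSpeedup / coreRatio;
    }

    public static void main(String[] args) {
        int phiCores = 4 * 64;  // 4x Xeon Phi 7210, 64 cores each (assumption stated above)
        int awsVcpus = 11 * 4;  // 11x m3.xlarge, 4 vCPUs each (assumption stated above)
        double ratio = coreRatio(phiCores, awsVcpus);
        double normalized = perCoreSpeedup(199.80, ratio);
        System.out.println("core ratio: " + ratio);
        System.out.println("speedup per core: " + normalized);
    }
}
```

Under those assumptions the KDB setup has about 5.8x the cores, and the per-core gap shrinks to roughly 34x. It still ignores virtualization, EBS, hyperthreads versus physical cores, and clock speed, so the same-hardware comparison is the only thing that would really settle it.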
Then argue with the point you think I could be making instead of the point that you think I'm making
> you would have to compare KDB and Java/Spark both running on the Xeon Phis, and/or running both on 11x m3.xlarge AWS instances - and even then, if Java/Spark does poorly on the Xeon Phi test...
If Spark can solve the business problem in less real-time in another way, I think that would be worth talking about, but it's my understanding that a bunch of mid/large machines connected to shared storage is the typical Spark deployment, and the hardware costs are similar to the Phi solution.
So my larger question still stands: What is the value in this approach, if it's not faster or cheaper?
- people don't want to write C (or K, or whatever yields a small binary)
- the cost of switching languages is not worth the speed-up
- it's already fast enough
I don't think you're wrong, overall, that, specifically, kdb can be much faster than an equivalently sized Spark cluster, but simply being faster does not invalidate other approaches, which is what you seem to be arguing for.
It sounds like you're suggesting we get:
* Not having to write in SQL (note KDB supports SQL92)
Maybe something else? I'm not sure I understand.
BTW, libruby-2.3 is 2.5M, just the shared object file, and it tries to use all aforementioned stuff from the underlying UNIX.
So in my situation, the JVM is heavier by every single measure listed, and for each by a considerable margin.
This is the easy trap to fall into though. What if you aggressively rewrote the Java apps from crappy legacy frameworks to well developed Java apps?
A rewrite ALMOST always is faster. So the new language seems faster. Except if you would then rewrite the rewrite back in the original language... you could even still be faster.
Very hard to split apart what is faster because the rewrite got rid of lots of bloat, and what is faster because it is legit faster. Java is legit fast when it is written well. Also very easy to make a bloat fest.
DropWizard is modern, but it isn't fast. Go and even Node.js are significantly faster. If you want performance, you cut layers out of the stack - check out the numbers for raw servlets or even just straight Jersey annotations in that benchmark. If I were doing JSON-over-HTTP microservices in Java, I'd likely use straight Jersey + Jackson, or if performance was really a problem, Boon over straight servlets.
What framework did your Go rewrite use? The standard libs?
Call me crazy, but I like my dropwizard with Spring DI for (singleton) resource setup, a micro-ORM to get work done, and HikariCP datasources at runtime.
Entity framework has include and active record has includes which do the same thing. The qt ORM also has something similar.
The only ORM I have seen that lacks this critical feature is odb. It doesn't allow setting the fetching strategy on a per query basis. You have to either always use eager loading or lazy loading which basically makes it useless for my purposes.
Any benchmarks to provide in order to support this wild claim?
The main advantages that Go has over Java are that the standard library is brilliant - thus obviating the need for folks to create monstrous frameworks (and losing performance) - and that Go has better memory utilization because of value types (structs) and because it is AOT compiled. Unfortunately Java JIT as designed by the JVM devs takes a lot of memory.
In raw performance, I would still give the edge to Java over Golang though.
It's a typical honeymoon phase with very little regard for 1-2-5 years in the future. The cost of having picked Go will be fully apparent then.
I really feel like that's one of the big issues we as programmers want to get a better handle on, but there isn't a lot to go off that isn't based on opinions (which can be hard to validate).
Also, when you do the rewrite you have already solved the domain problem that you did not fully understand when implementing it the first time.
But deployment, gc pauses and startup time (jvm vs go) are orthogonal to program quality. I would also expect go to have less memory usage.
> deployment was simpler and faster, memory usage was slashed..., request/response times were much more consistent, startup was practically instant
At the end of the day, despite Go's failings, it's a good (maybe the best?) language for large projects and teams because it compiles fast, is easy for anyone to run anywhere, tests run quickly, programs execute quickly and there is already good tooling/editor support.
Nothing beats efficient workflow for improving velocity.
PHP is a great velocity language, provided you have a small(er) team or are willing to commit to additional controls on how you write your PHP (document types/structure of arguments mainly) to ensure that your PHP code is able to be read quickly by other developers.
Personally I prefer Go here because it enforces good readability by default and therefore scales better with team size.
And in PHP7.x you have even more type hinting than before and with an IDE like PHPStorm refactoring is a breeze.
And with the release of PHP7, PHP is future proof. The community will continue to improve it with major features, as they have shown. Interest in the language has increased. More RFCs are contributed to the language than before. https://wiki.php.net/rfc
Multiple teams on a large code base is not really a problem in modern PHP. I do it every day. We follow modern design patterns, code reviews, code coverage over 80% of the system (old as new code). New code is probably over 95% coverage. Deploys regularly multiple times every week.
Almost all (>95%) of my problems stem from design decisions made in the past, not the language itself.
I'm not saying that you should not use Go (or Java). Both are fine languages. Use the right tool for the job. If you don't do a realtime stock trading system or some embedded system, but some web stack, I can't really see that the majority of the problems stem from language choice (whatever you choose). It is in the team, the culture, the understanding of the domain. There should be your focus.
Personally, the two most important things I look for in a language/platform are tooling and community.
For my usages it's a reasonable language.
The power of Java is that there is more than one JVM and that can really save you a lot of money/developer time if the world changes under your ass ;) i.e. had a JVM based graph database, ran it on Hotspot -> big GC pauses, moved to Zing no more pauses.
All we needed to do was run a different VM and the problem went away (the new problem was of course that Zing costs money, but not much; also, now with Shenandoah coming for free we could probably have moved to that).
With Go you can't do that yet. If your app is not latency bound but throughput bound, there is no place to switch to other than a rewrite.
That flexibility of deployment on JVM tech gives us a lot of insurance at no cost, until we need it.
We hired some Node maintainer(s) a long time ago, rumor has it, who got us on the Node train.
Unfortunately it seems difficult to use (to me at least), but frameworks like netty are built on top of it to provide incredible performance.
However, the fact that Java provides real threading means that a blocking io is not a performance problem if you use the correct patterns.
The tooling, especially for runtime operations, is so much superior to the golang options it's night and day. I have much more success modeling complex business models in java with its better type system, and for doing low latency work it's much easier to do on the jvm due to the availability of better libraries (which may get better in go) and the concurrency options are miles better on the jvm.
Go's stack allocation and gc defaults make for easy management in most of my default cases. The ease of adding http endpoints to things is phenomenal. Being able to write easy cli applications in the same language I write daemons in is great.
All told, I think for simple daemons and cli's I'd go golang, for more complex systems I'd go jvm.
I, personally, think the binary deployment thing is overblown. I've never had any problems deploying jvm applications and the automation to do either seems essentially the same to me.
As for the relative "heaviness" I think golang definitely feels lighter, but that is largely because golang apps do less. Once you start having them do more they start to "feel" just as heavy as java apps (for whatever "feel" means).
*  called golang heavier meant lighter
I also run these on cloud platforms that auto scale. The golang processes spin up very quickly, the java ones not so much.
In these two respects the JVM is heavy compared to golang for my very common scenarios. The heaviness also causes me to spend more money for the JVM solution.
I have an app that people were complaining took too much memory. A quick look with VisualVM showed that its actual heap usage when idling was only 50 mb but because we hadn't set any heap size limit, it was reserving hundreds of megs from the OS. The idea is that it can run faster if it does that. The fix was simply to use the -Xmx option to tell it to use less memory and GC more often.
In other words, JVM deployments need a lot more tuning than Go and they will generally need a lot more memory as well. But you're right, not setting -Xmx at all will make the JVM look worse than it really is.
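The gap between "heap actually used" and "memory reserved from the OS" described above can also be seen from inside the process, without VisualVM. A minimal sketch using only `java.lang.Runtime`; the class name is made up, and note that this reports heap only, so the process RSS will be larger.

```java
// Prints used vs. committed vs. maximum heap for the running JVM.
// Heap only: thread stacks, metaspace, JIT code cache etc. are not included.
class HeapReport {

    // Returns {used, committed, max} in bytes.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory();        // heap currently reserved from the OS
        long used = committed - rt.freeMemory();  // part of that actually occupied by objects
        long max = rt.maxMemory();                // ceiling, e.g. as set by -Xmx
        return new long[] {used, committed, max};
    }

    static long toMb(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.println("used:      " + toMb(s[0]) + " MB");
        System.out.println("committed: " + toMb(s[1]) + " MB");
        System.out.println("max:       " + toMb(s[2]) + " MB");
    }
}
```

Running it once with no flags and once with something like `-Xmx64m` should show the "max" line change while "used" stays about the same, which is the point being made above.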
$ ps -eo rss,cmd,user | grep jenkins
4928228 /usr/bin/java -Djava.awt.he jenkins
$ ps -eo rss,cmd,user | grep drone
12940 /drone agent root
19924 /drone server root
P.S. The Drone server and agent are running within docker containers.
Of course "Go was Faster". It's because you started with a clean slate!
I think the perception of Java suffers a lot because it will consume all the RAM on your machine by default if you let it (but not immediately). It's a very poor default because even though there are technical arguments for doing that (goes faster), they aren't well known and people tend to assume "more memory usage == worse design".
There are a lot of myths about the JVM out there. We can see on this thread the idea that it takes 1.5 seconds to start being repeated multiple times, each time someone else points out that it's actually more like tens of milliseconds to start.
I second that. I have deployed a medium traffic web-server written in Scala backed by a postgresql DB on 128MB VPS, back in 2009!
> I think the perception of Java suffers a lot because it will consume all the RAM on your machine by default if you let it (but not immediately).
I don't think that is true. The default heap size for Oracle and OpenJDK VMs has been bounded as far as I remember. In fact, I would like it if the VM, by default, allowed the heap size to grow up to available RAM when GC pressure increases, but that doesn't seem to be the case as of now.
Edit: Did you mean non-heap VM arenas grow indefinitely? If so, I am not aware of them.
Edit: do you have a twitter or Reddit account? I'll ping you when I have code examples if you want.
I wonder if the Oracle JDK 8 docs are plain wrong about the maximum heap size:
"Smaller of 1/4th of the physical memory or 1GB. Before Java SE 5.0, the default maximum heap size was 64MB. You can override this default using the -Xmx command-line option."
Also Oracle has chosen correct defaults because it took Java long time to shed its reputation of being dog slow and if they optimize for memory it will start looking worse in performance.
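The rule quoted above is easy to sanity-check. A small sketch, assuming the quoted wording ("smaller of 1/4th of the physical memory or 1GB") is the whole story; real JVMs vary by version and by client/server class, so the value the running VM reports may not match. The class and method names are made up.

```java
// Computes the default max heap according to the rule quoted above,
// and prints what the running JVM actually reports for comparison.
class DefaultHeapSketch {

    static final long GIB = 1024L * 1024 * 1024;

    // "Smaller of 1/4th of the physical memory or 1GB", taken literally.
    static long quotedDefaultMaxHeap(long physicalBytes) {
        return Math.min(physicalBytes / 4, GIB);
    }

    public static void main(String[] args) {
        long[] machines = {2 * GIB, 4 * GIB, 16 * GIB};
        for (long physical : machines) {
            System.out.println((physical / GIB) + " GB RAM -> "
                    + (quotedDefaultMaxHeap(physical) / (1024 * 1024)) + " MB default max heap");
        }
        // What this JVM reports; may differ from the quoted rule.
        System.out.println("this JVM reports max heap of "
                + (Runtime.getRuntime().maxMemory() / (1024 * 1024)) + " MB");
    }
}
```

By the quoted rule, a 2GB machine gets 512MB and anything with 4GB or more is capped at 1GB. If the last line prints something well above 1GB on your machine, that would support the suspicion that the docs don't describe the VM you are running.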
I can get it to start around 30-50mb, but as soon as you hit it with traffic the memory usage jumps up.
Have you reported this to the Go devs? Sounds like an interesting use case.
0 - https://github.com/golang/go/issues/18602
A lot of this has to do with another unmentioned, terrifically annoying property of the JVM: pre-launch min/max heap allocation. Standard operating procedure is to go with the default, and overbump it if your needs exceed it. I can't possibly imagine how many petabytes of memory are unnecessarily assigned to JVMs throughout the world as I type, apps consuming 79MB with a 256MB/512MB heap size.
(I'm sure a chunk of the difference is due to a better understanding of the program during rewrites.)
Please refrain from making statements like this unless you have a reproducible quantifiable analysis.
If you really wanted to demonstrate the effect you describe you'd need to have the same team rewrite the application twice, once Java->Java, once Java->go, making sure to align the program structure as much as possible (making exceptions to take advantage of lang specific features of course).
If you were to do that, then that would be interesting! No one does that of course because it's expensive and wasteful from a business perspective, but it's the only way to determine anything useful.
* a small and fast CLR (JVM)
* a class library that defaults to almost nothing but primitive classes
* proper and standardized version, platform and package management (NuGet)
* open source and MIT license
* a patent promise
* arguably the best dev IDE available (Visual Studio) and one of the best up-and-coming dev text editors (VS Code)
* Native ORM, templating, MVC, web server so there is one way to do things
* open source middleware standard (OWIN)
* they left out, for now, attempting the hard ugly stuff like x-platform GUI
* all platforms are equal citizens, they acquired Xamarin for dev tools and release their own Docker containers.
* it's already getting good distribution (on RedHat) even tho it's only 6 months out from a 1.0 release.
Java may have missed the window for fixing some of these issues in their platform - I feel that if Android were being developed today, they'd almost certainly take .NET Core as the runtime.
I've yet to commit to using .NET Core anywhere, but from what I know about it so far it is impressive.
This may be true for the Core CLR specifically, but it's not true of real .NET apps that are being built today. The vast, vast majority are strongly tied to the Windows platform, especially because of the lack of a cross-platform GUI like you mention. As a Wine developer, it's a huge thorn in our side because we either have to run the entire .NET virtual machine, which is hard, or depend on Mono, which is by design not completely compatible. This has really soured my opinion of .NET and .NET applications compared with win32 applications, which do tend to work quite well in Wine.
> arguably the best dev IDE available (Visual Studio) and one of the best up-and-coming dev text editors (VS Code)
Refactoring, Coding assistance, Navigation & search sections being most important.
On the other hand I've been using Eclipse and IntelliJ for the past year. Eclipse is not even worth talking about but even IntelliJ does not come close to vanilla VS in terms of usability. Again, my opinion.
If Android were being developed today, rather than in a race against time (against Apple), Google would have written their own runtime and everything.
Of course they did. It's no secret they designed it as a Java clone when the courts ruled they couldn't embrace the original one.
However, they missed something: cross-platformness. So essentially you get a Windows-only Java platform. That's why not everybody finds it impressive or is looking forward to committing to it everywhere (they wouldn't be able to, though).
What is the status of this? Will MS be bringing WPF (XAML) to all platforms?
1. Startup times. Not so much the JVM itself, which takes just 1.5 secs, but the startup time of your application, which gets higher if you have a lot of classes on the classpath. I guess it's the classpath scanning that takes a lot of time (?).
2. Memory usage of Java objects is quite heavy. See this article: http://www.ibm.com/developerworks/library/j-codetoheap/index...
3. The heaviness of the ecosystem in terms of the magnitude of concepts and tools being used and the enterprisy-ness of libraries.
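On point 1, a cheap way to test the guess is to count how many classes the JVM has actually loaded by the time your code runs, since classes are loaded lazily rather than all at once from the classpath. A minimal sketch using the standard management API (the class name `LoadedClasses` is made up):

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration.
final class LoadedClasses {
    // Number of classes currently loaded in this JVM.
    static int loadedNow() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println("classes loaded when main starts: " + loadedNow());
    }
}
```

Calling `loadedNow()` once at the top of your real main and again after framework initialization shows how many classes the application itself pulled in, which is a better hint than the raw size of the classpath.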
Where do you get these numbers from? On my five year old MacBook Pro with default JVM options parsing a 20 MB file:
> 2. Memory usage of Java objects is quite heavy.
That's IBM's enterprise VM that uses three-word headers. HotSpot is actually better. If you compare that with other "lightweight" programming languages it is really, really light.
A quarter of a second to start up the VM, run some code, and exit again is actually pretty steep compared to typical interpreted and compiled languages. Among other things, this means that you can't really call Java executables from a loop in a shell script.
For comparison purposes, both Ruby and Rust will show between "0.00 elapsed" and "0.02 elapsed" for a simple "Hello, world" program on my laptop.
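If you want to separate the VM's own startup from process launch and shutdown, the JVM can report how long it has been up when main is reached. A rough sketch (the class name `StartupCost` is made up; note that loading the management classes adds a little time of its own, so the figure is slightly inflated):

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration.
final class StartupCost {
    // Milliseconds since the JVM recorded its own start.
    static long millisSinceJvmStart() {
        return ManagementFactory.getRuntimeMXBean().getUptime();
    }

    public static void main(String[] args) {
        System.out.println("ms from JVM start to main: " + millisSinceJvmStart());
    }
}
```

Comparing this with the wall-clock figure from `time java StartupCost` gives an idea of how much of the elapsed time is spent outside the part the JVM itself accounts for.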
To do a fair comparison, with your example, I just compiled and ran Hello World in Java on my machine and got this:
1. Startup time being addressed by precompiling the standard library (or your own library). See "JEP 295: Ahead-of-Time Compilation": http://openjdk.java.net/jeps/295. Also addressed by modularisation of the standard library, "JEP 220: Modular Run-Time Images".
2. Memory usage (and less garbage collection overhead) using value types. See "JEP 169: Value Objects": http://openjdk.java.net/jeps/169.
You don't have to use the enterprisey libraries though. Using Dropwizard, for example, gives you a tight and performant set of libraries that have a fairly minimal learning curve and require relatively little boilerplate.
I think it was originally designed in a Lisp library called CLOS, which incidentally stands for Common Lisp Object System.
How to implement OOP in Lisp is very nicely explained in a book called "The Art of the Metaobject Protocol".
Users of Lisp based languages should think twice before criticizing OOP.
Well that's true regardless of the language. If you're not making the decisions on the codebase, there can be all kinds of gnarly dependencies and practices that you have to adhere to. I agree that big legacy corps tend to have over cumbersome setups, but hey, at least it's not cobol. My advice is not to work for big legacy corps.
Believe me that I know where you're coming from -- I have a real aversion to the big enterprise side of the Java world. There's a lot of interesting development in Java open source though, and it'd be a shame to throw the baby out with the bathwater.
And I'm a big fan of Clojure, but it's not because Clojure is cool that Java becomes de-facto a big pile of poo. People have been drilled by so much FUD about Javaland that they simply can't stand to try it correctly without preconceptions.
This makes me shiver in terror.
There's still a ton of "enterprise-grade" shit in Spring, you just aren't forced to use it if you don't want - but it's always there, lurking behind the scenes.
$ time java -jar target/uberjar/clojure.jar
1.29s user 0.08s system 181% cpu 0.755 total
$ /usr/sbin/system_profiler -detailLevel full
Model Name: MacBook Pro
Model Identifier: MacBookPro11,3
Processor Name: Intel Core i7
Processor Speed: 2.3 GHz
Number of Processors: 1
Total Number of Cores: 4
L2 Cache (per Core): 256 KB
L3 Cache: 6 MB
Memory: 16 GB
This is the kind of lazy generalization that causes people to make poor technology decisions.
4. Garbage collection and lack of value typed records. As far as I know there is currently no way around going full SOA (structures of arrays (of primitive types)) for large data collections.
Object overhead (memory usage) and GC overhead are the reason why only SOA will work (and it's a pain because the language doesn't make it convenient) if you have like >10^7 objects. (That's my personal experience from a 2-month project, and I normally don't use Java).
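For anyone who hasn't seen it, this is roughly what the SOA layout looks like next to the usual one-object-per-record layout. A minimal sketch: the class names are made up, and the byte counts in the comments are estimates that assume a 64-bit HotSpot with compressed references (12-byte header, 8-byte alignment); other VMs will differ.

```java
// Usual layout: one heap object per point.
// Assumed cost per point: 12-byte header + 2 * 8 bytes of fields, padded to 32,
// plus a 4-byte reference in the containing array = about 36 bytes.
final class Point {
    double x, y;
}

// SOA layout: one primitive array per field, no per-point object at all.
// Cost per point: 2 * 8 = 16 bytes, and the GC sees two arrays instead of n objects.
final class PointsSoA {
    final double[] x;
    final double[] y;

    PointsSoA(int n) {
        x = new double[n];
        y = new double[n];
    }

    void set(int i, double px, double py) {
        x[i] = px;
        y[i] = py;
    }

    double sumX() {
        double s = 0;
        for (double v : x) s += v;
        return s;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        PointsSoA pts = new PointsSoA(n);
        for (int i = 0; i < n; i++) pts.set(i, i, -i);
        System.out.println("sum of x: " + pts.sumX());
    }
}
```

The inconvenience mentioned above is visible even here: a "point" is now just an index, so every method that would have taken a `Point` has to take the container plus an `int`.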
There are if you use language extensions like Packed Objects on the IBM JVM or Object Layouts on Azul.
So just like C, you have C and then GCC C, clang C, ....
Eventually Java 10 will fix this, but for those that like to live on the edge there are already snapshots available.
https://objectlayout.github.io/ObjectLayout/ does not save you any headers. It just allows you to control where your objects are in memory and compiler optimizations based on this. It does not help you with memory footprint.
Also I'm not sure if it's really implemented on Zing considering that from the outside the project seems dead.
> Eventually Java 10 will fix this
I would not be so sure. The challenges especially regarding primitive generics are not to be underestimated. See
It is already better than what you get on Hotspot.
> The challenges especially regarding primitive generics are not to be underestimated. See
The challenge here is due to how the Java designers chose to build them in the first place.
Modula-3 and Eiffel are two examples of languages with proper generics, value types and toolchains that do AOT compilation to native code.
So I am still hopeful.
However, like everything, some challenges are technical and some are political.
$ time java HelloWorld
That is a linux vm running in a mba (first run).
In any case, the point that I wanted to make in the parent comment was that the JVM startup time itself was basically fine.
I just checked on my Macbook and a HelloWorld class gives me .13 secs real.
There are workarounds for this (things that reuse JVMs and such), but until that is overcome the JVM is largely not appropriate for CLI tools that start/stop.
But for other kinds of interactive programs, things with long running sessions and such, it is pretty easy to a) lower that startup time and b) do things that mitigate it to the user.
Picture a Clojure macro library just for writing CLI driver programs, where you could call all your Clojure code like normal, and where some of the subcommand-defining methods of the driver program could be annotated with something like "@inline".
The un-annotated subcommands, as a simpler case, would translate into calls to spawn a JRE and feed your ARGV over to it once it gets running. These would be the slow-startup calls, so you'd just use them for the things that need the full "horsepower" of the JVM.
The @inline subcommands, on the other hand, would grab your app, its deps, and the JRE, do a whole-program dead-code-elimination process over them to trim them down to just what that subcommand needs, and then would transpile that whole resulting blob to bash code and shove it into a bash function. (So, something like Emscripten with a different frontend + backend.)
I boot the JVM once and iterate endlessly in the same process. Same for ClojureScript in the browser or node.js. Lisp is by far the most interactive language there is with the fastest iteration times (AFAIK).
1.5 seconds would be huge if you had to constantly restart your application like you do everywhere outside Lisp. Iterating in Clojure is literally instant.
I wrote applications in dozens of languages, and none come remotely close to Clojure's iteration speed or joy of use.
That's Forth. Lisp comes next.
This was probably true in the 80s, but hasn't been in a while. Many languages have this, either built-in or as a tool. In the case of the JVM, there's spring-loaded, which works in Java, Groovy, etc.
Certainly the JVM startup always feels slow, in my experience.
$ time java Hello
0.04user 0.01system 0:00.12elapsed 43%CPU (0avgtext+0avgdata 15436maxresident)k
While 120ms elapsed is not stellar, it's rarely a problem with how the JVM ecosystem looks.
Startup time of a simple Java application and therefore also whole JVM is 0.4s (in the linked article).
1.2s is for the implementation in Clojure, which includes its additional, quite heavy runtime.
A Clojure interpreter written in C, if written the same way as the one for Java, would run just as slowly, given the way it builds Clojure every time the application starts.
20:43:44,578 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 10.1.0.Final (WildFly Core 2.2.0.Final) started in 6551ms - Started 331 of 577 services (393 services are lazy, passive or on-demand)
Apparently, Square was first built out on Ruby with the mindset that the JVM is an old clunker.
Fast-forward a few years they switched to the JVM because it was faster and the language (I know, not related) provided compile-time safety.
But I have to agree w/ others that after using golang, where an equivalent web app would run in <50Mi of RAM with far better tail latencies, the memory cost of the JVM feels very large.
Ah yes, this would be the same industry where people lead their teams to switch from Java to Go because they believe it will improve the productivity of development.
For Clojure, starting `lein repl`, takes 16 seconds on my 2012 Macbook and 9 seconds on my similarly-aged Dell laptop, both with SSDs and i7 quads.
Regarding memory usage examples, the base memory usage of a Google App Engine instance running the most trivial Hello World Java program is around 140MB. Given that the default F1 instance has a soft memory limit of 128MB, it becomes clear that the JVM is working against you in both cost effectiveness (the price to spin up new instances as your existing ones are already above the soft limit) and latency (since spinning up instances is slow). Add Clojure on top and the problem certainly doesn't get any better. As an added annoyance, which is specific to App Engine but a result of using the JVM, it's impossible to specify JAVA_OPTS, and so any of the -X flags, without switching to the Flexible environment.
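When you can't control the launch command, it at least helps to know which flags the JVM was actually started with. A minimal sketch using the standard management API (`LaunchFlags` and `xFlags` are made-up names, and whether this call is permitted inside a particular hosted sandbox is an assumption I have not verified):

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration.
final class LaunchFlags {
    // Keep only the non-standard options (-X... and -XX:...), e.g. -Xmx512m.
    static List<String> xFlags(List<String> jvmArgs) {
        List<String> out = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-X")) out.add(arg);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("all JVM arguments: " + jvmArgs);
        System.out.println("-X flags only:     " + xFlags(jvmArgs));
    }
}
```

An empty list means the JVM is running on its defaults, which is when the default heap sizing discussed earlier in the thread applies.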
As a result of both of the above, choosing Clojure for developing on App Engine, as my specific example, has had the serious downfall of slow development tools and memory issues out of the gate on my instances, causing me to pay more for a beefier instance class. The REPL is really hard to beat, but the combination of JVM and Clojure are the biggest pain in the ass, with this stack.
JVM deployments tend to assume nothing else happens on the same machine, in my experience.
I haven't used App Engine either but I suspect that the flexible environment is more expensive.
I run a fairly complex Clojure app and it uses 1.5GB of RAM all told. Factor in cron jobs, caching, DB flushes and other periodic spikes on the machine and whoops! we're over 2GB used. To prevent thrashing and OOMEs, I sized up to a 4GB VM, doubling my monthly costs (there are several of these boxes).
Now yes, I can go through and swap out libraries to slim the beast down. But that would mean rewriting large chunks of it, since so few Clojure libraries are interoperable and most simply wrap "enterprisey" (and heavy) Java libraries in sexier syntax. And if I'm rewriting it, I might as well avoid the whole mess and pick an ecosystem with a more streamlined standard library.
It's tough. I adore Clojure, but the combination of Clojure+JVM has made deployment and management less fun and more expensive than necessary. The JVM is awesome, but, just as it's not the hog so many claim it is, it's also not sleek.
On the desktop, laptop, phone, or embedded environment, the JVM is heavy. It starts up slow, jars carry around ridiculous amounts of dead dependencies, garbage collectors require immense amounts of tuning, etc. And we shouldn't really expect otherwise. If you can't even keep your VM in cache, how are you supposed to have fast application code?
Specialty closed source JVM vendors have done wonders in terms of improving this problem...but it's still an uphill battle. AOT native compilation down to machine code is becoming more popular because of the proliferation of resource-constrained environments, and it will take time for new languages/compilers to take over, but take over they will.
This comes back time and time again. AOT native results in slower runtimes for applications with one simple exception: startup time.
In every other case a modern JIT compiler like the JVM will win due to gathering information and layered compilation.
Where AOT really makes sense is for an interactive app on a mobile device, where you don't care about the last millisecond of performance but do care about startup times and, even more importantly, energy expenditure. (That's why Google AOT-compiling the apps on device startup is quite sensible.)
Most funnily Microsoft was heavily advertising AOT with .net framework 1.0 but in general switched to dynamic profiling and optimization in later versions of .net. (System assemblies that everybody uses during startup are AOT compiled using ngen, however)
It is just not "one size fits all", depending on your platform and requirements you'd have completely different requirements to your virtual machine. What you want is different between interactive use and "batch processing" and between energy starved devices and big iron.
Java - while ironically advertised for "applets" years ago - is optimized for the latter case and there it really shines. On a server with long running processes AOT makes no sense at all.
So what AOT will take over is phones. IoT devices. Everything where energy is at a premium and startup times need to be quick. Layered JIT compilation takes over where you want to squeeze out the last bit of total performance. (Even interactively, looking at you, Google V8 and Chakra)
> AOT native results in slower runtimes for applications with one simple exception: startup time. In every other case a modern JIT compiler like the JVM will win due to gathering information and layered compilation.
JIT compilers work very well on untyped languages like Smalltalk because the compiler can discover the type information at runtime, and then pre-compile the types that it sees most often. But that's not really that useful on the JVM, because Java is typed, as are most other JVM languages, with the exception of Clojure.
> Most funnily Microsoft was heavily advertising AOT with .net framework 1.0 but in general switched to dynamic profiling and optimization in later versions of .net. (System assemblies that everybody uses during startup are AOT compiled using ngen, however)
What I personally don't understand: Why don't we cache JIT results between runs? That might be a worthwhile optimization and even possible in the face of class loading.
It probably would be like running ngen on .net, just WITH performance statistics of a program. (Enabling specialization of calls for types commonly passed or eliminating constant expressions while keeping the generic version of a function around - that's hard in AOT as you need profiling information. I think Sun's C/C++ compiler was able to do that for AOT, resulting in large speedups. But maybe it only used it for branch prediction).
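To make "specialization of calls for types commonly passed" concrete, here is the kind of call site I have in mind. It is only an illustration with made-up class names: statically, `s.area()` could dispatch to any `Shape`, but at runtime the profile only ever sees `Square`, which is the information a profiling JIT can exploit while keeping the generic path as a fallback. The sketch does not prove that a given JVM actually does so for this code.

```java
interface Shape {
    double area();
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

final class Circle implements Shape {
    final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

// Hypothetical demo class for illustration.
final class CallSiteDemo {
    // The declared type is Shape, so the compiler cannot know the receiver statically.
    static double totalArea(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) sum += s.area();
        return sum;
    }

    public static void main(String[] args) {
        Shape[] shapes = new Shape[1000];
        java.util.Arrays.fill(shapes, new Square(2));
        double total = 0;
        // Repeat the call so the method becomes hot enough to be compiled.
        for (int i = 0; i < 10_000; i++) total = totalArea(shapes);
        System.out.println("total area: " + total);
    }
}
```

On HotSpot, running it with -XX:+PrintCompilation at least shows when `totalArea` gets compiled; an AOT compiler without a profile only has the declared `Shape` type to go on.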
Edit: What I forgot to add - I like the way you could always AOT things in .NET with ngen but also use a JIT where possible. Now that Java turned out to be owned by the evil empire and .NET the one by the company committed to open source - imagine reading that 10 years ago - I'm really curious in which way things will develop. And with all the new contenders as well. JVM (and .net) is not dead, but a lot of interesting alternatives are getting traction now.
This is configurable, Hotspot can JIT right away when application starts, but then be prepared to wait a bit.
> What I personally don't understand: Why don't we cache JIT results between runs?
They do, just not in OpenJDK, which is the only one many people care about.
All commercial JDKs support code caches between executions and AOT compilation.
But I'd argue that there are very few benefits of JIT that can't be achieved by AOT + PGO. A sound static type system nullifies the need for most of those benefits (like speculative type optimizations and deoptimizations). But it might have the upper hand in cases where profiling can't capture all of the possible optimizable workloads that the binary would see. Databases or other large programs that continuously specialize over the lifecycle of the process. But that is far more niche than most people realize.
I'll half-agree with respect to phones and embedded environments since those are wildly variable and may include extremely low-specification platforms.
But a desktop or laptop? The JVM launches in milliseconds on my desktop and laptop. The monstrous Eclipse IDE launches in about six seconds on my desktop, and about five seconds of that time is Eclipse loading various plugins and what-not, in what looks like a single-threaded manner.
My desktop and laptop can both spin up Undertow and fire up a web-app from a Jar in about two seconds.
I'm fairly sure Eclipse is just using an old clunky CMS garbage collector. I've never tuned it on my desktop or laptop. Maybe Neon is using G1 now. I don't know and don't care because it runs just fine.
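For what it's worth, a JVM will tell you which collectors it is running with, so there is no need to guess. A minimal sketch (the class name `WhichGc` is made up; it reports on the JVM it runs in, so to check Eclipse itself you would have to run equivalent code inside that process or attach a monitoring tool to it):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration.
final class WhichGc {
    // Names of the collectors active in this JVM, as reported by the JVM itself.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": " + gc.getCollectionCount() + " collections, "
                    + gc.getCollectionTime() + " ms total");
        }
    }
}
```

The exact names depend on the VM and the flags; on HotSpot the output typically makes it obvious whether you are on G1, CMS or the parallel collector.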
Maybe you've done something different with the JVM on desktops and laptops, but in my experience, on desktops and laptops, the JVM behaves more or less the same as it does on servers.
Kids today! Sit down over here, and Grandpa will tell you about the days when a few hundred megabytes was more than your average server's entire storage capacity. Now, in those days you tied an onion to your servers, which was the style at the time...
• the JVM on smart cards, e.g. EMV (chip) credit cards, or GSM cellular SIM cards
• the JVM embedded into the Intel Management Engine coprocessor
First time I've heard that one. Got a source?
The numbers we're talking about here just aren't a practical consideration any more.
But the CPU time spent by the dynamic linker resolving thousands upon thousands of symbols? That's actually rather painful.
The Avian JVM can statically link an entire program and widget toolkit with itself and produce a 1mb binary.
It's not that big a deal. The space gets taken up by all the libraries. But then you'd want to compare a JVM against e.g. /usr/lib on a fresh Linux install ...
 - https://en.wikipedia.org/wiki/LibreSSL#Code_removal
On the bad side, the JVM and assorted Java tools are second-class citizens of the Unix world. The command arguments are all messed up, much like a Windows tool ported to Unix, and the interaction with the rest of the Unix stack (sockets, files) is solipsistic and off, which leaves an ill stink on everything it touches.
One of the things I find funny about Java is the once much-touted security model. Fast forward a couple of years to the advent of Android: it uses the Unix security model and none of the Java stuff.
This is exactly the kind of development culture that produces heavyweight, unresponsive tools.
If it wasn't for the pressure of game developers, the NDK wouldn't even exist.
Remember Brillo? It was supposed to be like Android, but using C++ frameworks instead, as presented at Linux Embedded 2015 conference.
Guess what, when it got recently re-branded as Android Things, it switched to the Java Frameworks instead and it doesn't even allow for the NDK, with the user space device drivers being written in Java.
Is this cognitive dissonance? Dishonesty? I don't understand.
Some people don't realize that those massive 16/32GB MBPs etc. with SSDs are not available to everyone.
You should ensure both memory modules are same size (say, 4 GB or 8 GB), otherwise performance can suffer noticeably.
I think my main point was about treating luxury as basic just because even more luxurious stuff exists.
- Rails (just about the heaviest web framework ever made for Ruby, despite its wide appeal)
- Ember (I absolutely love ember, but it is by far the heaviest modern JS framework... I don't include things like ExtJS)
Also, I routinely use the heavy/light distinction, but it seems in a completely different way. I almost don't care how heavy something that runs on the server side of a web application is; on the back end, "heavy" generally translates to "contains complexity I'm not willing to deal with". "Heavy" on the front end for me means both footprint and complexity.
Given how often Android evicts apps that's something I'm not comfortable with shipping. Would love to use Clojure but it was definitely a show-stopper for us.