From the code samples it's hard to tell whether or not this has to do with de-serialization though. It would have been fun to see profiling results for tests such as these.
That's nice - I'd encourage you to play around with attaching e.g. JMC [1] to the process to better understand why things are as they are.
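For anyone who wants to try this without attaching JMC interactively: on JDK 11+ a Flight Recording can also be started programmatically and dumped to a file that JMC will open. A minimal sketch - the workload in main is just a placeholder for the code under test:

```java
import jdk.jfr.Recording;
import java.nio.file.Files;
import java.nio.file.Path;

public class JfrSketch {
    // Records whatever runs between start() and stop(), then dumps a .jfr
    // file that can be opened in JMC for method profiling / allocation views.
    public static Path record(Runnable benchmark) throws Exception {
        try (Recording recording = new Recording()) {
            recording.start();
            benchmark.run();               // the code under test goes here
            recording.stop();
            Path out = Files.createTempFile("benchmark", ".jfr");
            recording.dump(out);           // open this file in JMC
            return out;
        }
    }

    public static void main(String[] args) throws Exception {
        Path jfr = record(() -> {
            // placeholder workload
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) sb.append(i);
        });
        System.out.println("Recording written to " + jfr);
    }
}
```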
I tried recreating your DataInputStream + BufferedInputStream setup (wrote the 1brc data to separate output files and read it with your code - I had to guess at the ResultObserver implementation, though). On my machine it ran in roughly the same time frame as yours - ~1 min.
According to Flight Recorder:
- ~49% of the time is spent in reading the strings (city names). Almost all of it in the DataInputStream.readUTF/readFully methods.
- ~5% of the time is spent reading temperature (readShort)
- ~41% of the time is spent doing hashmap look-ups for computeIfAbsent()
- About 50 GB of memory is allocated - 99.9% of it for Strings (and the byte[] arrays wrapped inside them). This likely causes quite a bit of GC pressure.
Hash-map lookups are not de-serialization, yet the lookups likely affected the benchmarks quite a bit. The rest of the time is mostly spent reading and allocating strings. I would guess that holds for some of the other implementations in the original post as well.
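For context on where that string time goes: writeUTF emits a 2-byte length prefix followed by modified-UTF-8 bytes, and readUTF parses the prefix, calls readFully for the payload, and allocates a fresh String on every call. A small round-trip - the record layout (name plus a short temperature) mirrors the benchmark format as I understand it:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class ReadUtfDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            out.writeUTF("Stockholm"); // 2-byte length prefix + 9 ASCII bytes
            out.writeShort(1234);      // temperature * 100
        }
        byte[] record = bos.toByteArray(); // 2 + 9 + 2 = 13 bytes

        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(record))) {
            String city = in.readUTF(); // a new String per record -> GC pressure
            double temp = in.readShort() / 100.;
            System.out.println(city + ": " + temp); // prints "Stockholm: 12.34"
        }
    }
}
```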
JMC is indeed a valuable tool, though what you see in any Java profiler is to be taken with a grain of salt. The string parsing and hash lookups are present in most of the implementations, yet some of them are up to 10 times faster than the DataInputStream + BufferedInputStream code.
It can't be true that 90% of the time is spent in string parsing and hash lookups if the same operations take 10% of the time when reading from a FileChannel and ByteBuffer:
var buffer = ByteBuffer.allocate(4096);
try (var fc = (FileChannel) Files.newByteChannel(tempFile,
        StandardOpenOption.READ)) {
    buffer.flip(); // start empty so the first iteration triggers a fill
    for (int i = 0; i < records; i++) {
        if (buffer.remaining() < 32) { // refill; assumes a record fits in 32 bytes
            buffer.compact();
            fc.read(buffer);
            buffer.flip();
        }
        int len = buffer.get();
        byte[] cityBytes = new byte[len];
        buffer.get(cityBytes);
        String city = new String(cityBytes);
        int temperature = buffer.getShort();
        stats.computeIfAbsent(city, k -> new ResultsObserver())
             .observe(temperature / 100.);
    }
}
My bad - I got confused, as the original DIS+BIS took ~60s on my machine. I reproduced the Custom 1 implementation locally (before seeing your repo) and it took ~48s on the same machine. JFR (which you honestly can trust most of the time) says that the HashMap lookup is now ~50% of the time, with the String constructor call at ~35%.
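If the String constructor is ~35% of the remaining time, one further (hypothetical) step is to stop allocating a String per record at all and key the map on the raw city bytes instead, materializing an owned copy only the first time a station is seen. A sketch - ByteKey and the probe/copy scheme are my invention, not part of the benchmark:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class ByteKeyedStats {
    // A map key over a slice of a byte array. One mutable "probe" instance
    // is reused for look-ups; a copied key is allocated only on insertion.
    static final class ByteKey {
        byte[] buf;
        int len;

        ByteKey(byte[] buf, int len) { this.buf = buf; this.len = len; }

        @Override public boolean equals(Object o) {
            ByteKey other = (ByteKey) o;
            return Arrays.equals(buf, 0, len, other.buf, 0, other.len);
        }

        @Override public int hashCode() {
            int h = 1;
            for (int i = 0; i < len; i++) h = 31 * h + buf[i];
            return h;
        }
    }

    final Map<ByteKey, double[]> stats = new HashMap<>(); // value: [count, sum]
    final ByteKey probe = new ByteKey(new byte[0], 0);

    void observe(byte[] cityBytes, int len, double temperature) {
        probe.buf = cityBytes;
        probe.len = len;
        double[] acc = stats.get(probe);   // no allocation on the hot path
        if (acc == null) {                 // first sighting: copy the key
            acc = new double[2];
            stats.put(new ByteKey(Arrays.copyOf(cityBytes, len), len), acc);
        }
        acc[0] += 1;
        acc[1] += temperature;
    }

    public static void main(String[] args) {
        ByteKeyedStats s = new ByteKeyedStats();
        byte[] oslo = "Oslo".getBytes(StandardCharsets.UTF_8);
        s.observe(oslo, oslo.length, 2.5);
        s.observe(oslo, oslo.length, 3.5);
        System.out.println(s.stats.size() + " station(s)"); // prints "1 station(s)"
    }
}
```

Whether this beats computeIfAbsent on interned Strings would have to be measured - it trades the String allocation for an extra equals/hashCode over the byte range.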
Please add https://github.com/apache/fury to the benchmark. It claims to be a drop-in replacement for the built-in serialization mechanism so it should be easy to try.
I don't actively use it, unfortunately. The main inspiration was to sort inputs faster for the https://github.com/BurntSushi/fst crate, which I in turn used to try to build a search library.
This got me thinking - the salinity of the water in the Bothnian Bay is very low (about 1/10th that of ocean water, it seems). Wouldn't that affect electrolysis?
This assumes a clear interface - which assumes you got the interfaces right. But what's the chance of that if the code needs rewriting?
Most substantial rewrites cross module boundaries. With microservices, changing a module boundary is harder than in a monolith, where it can be done in a single commit/deploy.
Most times you won't need a well-formed (REST) API to start with, and many times you will never need it. Most small projects should start without one and create one when need arises.
Sure, you won't "need" them. But it's very likely that you'll have resources that lend themselves well to REST-like operations - users come to mind, for things like a user profile page, user settings, user detail, etc.
A REST endpoint is likely handy to use across your application without having to replicate the same serialization over and over again.
Also, many frameworks remove all the boilerplate for those operations (e.g. Django class-based views), so creating one is a really good starting point.
Elasticsearch is decent at querying on non-text criteria, provided that:
1. The criteria are in the same document (a document in ES is a JSON object)
2. You have indices for them. ES (and Lucene) supports indices on raw text values as well as on numbers.
ES does not do well with relations (joins). You can de-normalize data to deal with that, but that makes data consistency harder.
An alternative if you want a bit more help with charting, without client side JS, is to use d3-shape (https://github.com/d3/d3-shape) to server-side render SVGs.
It is quite common that you only need to optimize very small parts of a program to this level. The rest of the program can be written in more conventional styles.
You could of course FFI into e.g. C for those parts, but that is usually harder to maintain than a few well optimized java classes.
> It is quite common that you only need to optimize very small parts of a program to this level.
It's a quite common myth developers believe about performance. Hotspots do happen sometimes, but once they have been optimized you quickly end up with a flat profile and an "everything is slow" problem. And in some types of apps, the majority of the code is performance critical.
It depends - even when you run into an "everything is slow" problem, it might be that it's like 1 endpoint out of 2000 that causes performance issues. In this case, you might need to focus very much on performance for that endpoint, but maybe not for other endpoints. Profilers can help you figure out what code to focus on.
If the majority of code is performance critical, the tradeoffs are of course different.
Depends on the application. In areas like compilers, databases, CAD, game engines, simulation, distributed analytics, and machine learning, the only code that isn't on the critical path is some configuration / control-plane / UI code - which is a minority of the code.
Certainly not in the entirety of those areas. I've worked in a couple of them and functionality was a higher priority than performance and we picked our tools accordingly.