Hacker News new | comments | show | ask | jobs | submit login
Fixing high GC pressure when MySQL driver uses server-side prepared statements (github.com)
71 points by johnou 10 months ago | hide | past | web | favorite | 35 comments

Saw the commit, its nice you improved the code quality too, wherever you touched by adding diamond brackets and making fields final.

Refactoring for code quality improvements is a good thing but it shouldn't be bundled with changes like this. It makes it harder to see what actually changed. In this case there are 20 lines modified but only one (or four if you count deletions) is part of the core change.

Unless the "final" qualifiers are required as part of this improvement, say to better inform the GC of what can be changed, then I'd suggest separating them out into separate commits. That way the lines that improves the hashCode() method and switches to using CompoundCacheKey stand out as the real changes.

Thanks for this comment. I've had this discussion over and over with other developers. Separate commits for separate things no matter how small. Doesn't matter if the code change is 1 line. Conflating multiple issues (even reformatting) with a bugfix brings a lot of noise to the diff and increases the review process.

In this case I feel changing the LRUCache to support generics was an integral part of the changeset to prevent possible regressions with cache usage (as I was changing the cache key from String to another type). Without it, it would be possible to use the incorrect object type in one of the many code paths, by adding generics we benefit from compile time checks.

I find that in practice, these types of changes tend to happen a lot less if they are forced in separate pull requests.

Separate commits is sufficient.

Indeed - thanks for taking the time, we really appreciate that! (I'm from MySQL's connectors team, while not working on Java)

I added generics to ensure type safety, it is not technically diamond brackets (they came in jdk7 but mysql connector targets jdk6).

Thank you to the people that make these kinds of contribution to open source. :)

Yeah, it is pretty easy to shoot yourself in the foot by using buffer-bound types in Java. Another example is java.io.ByteArrayOutputStream and arbitrary data sizes. Array.copy can be dangerous.

It's amazing how performance can be much improved by profiling and fixing little things, such as optimizing a Java hashCode() method.

I had a similar experience helping with the H2 database engine.

I didn't benchmark the difference between cache usage with precalculating the hash code in a constructor vs calculation in the hash code itself but string concatenation then taking the hash code of that was the real performance killer.

Hi johnou,

thanks for trying to report the issue. Do you have a bug ID for the item you're referring to? Searching for the title doesn't show me an issue. In general we try to react to all reports, unfortunately sometimes some are missed or wrongly categorized due to the big overall inflow.

johannes (MySQL Connectors team, but not working on Java and without deeper Java knowledge)

The link is right down the bottom but the GitHub page also contains the benchmark code and results which seemed more interesting https://bugs.mysql.com/bug.php?id=88021

Hey johnou, thanks I saw this, my comment originally had been a response to https://news.ycombinator.com/item?id=15449672, which I interpreted as referring to a different bug, and was then moved and edited by an admin ... anyways thanks for your contribution and my colleagues will review and probably (as said: I don't know enough Java and thus can't judge it) apply it :)

I'd appreciate it if you could nag one of your colleagues to pick it up ;)

There is no need to nag me. :) We always welcome contributions, especially those like this one. Thanks!

dang is the administrator of this site, and his comment was referring to the submission title on this site, which he changed.

Sorry for the misunderstanding! I'm just the janitor here, I clean up titles and so on. The person you want is johnou, who posted the article, so I've edited your comment to s/dang/johnou/ and moved it to the root level so it will be a reply to him instead of to https://news.ycombinator.com/item?id=15449672.

No problem, just wanted to make sure we're not missing anything :-D

edit: Well actually the recent edit makes this comment now more confusing. The github page is associated with a MySQL bug, which will be handled. This comment, originally, was under a different thread, which, to me, sounded like there was another bug report by dang we missed ...

Please don't try looking at MySQL's python connector... from a quick inspection there are quiet a few memory copies going on there... just in Python it does not end in GC, but needless wasted CPU / memory cycles (and less performance, depending on what yo do).

The submitted bug, for context - https://bugs.mysql.com/bug.php?id=88021

Ok, we've attempted to do that. Submitted title was "Up to 498% throughput increase in the mysql connector cache implementation". If anyone can suggest a better title we can change it again.

Slightly OT; I'm guessing that for every developer that goes through that horrible "sign our contribution agreement" there are 99 of us that would just fork+patch and get on with our lives.

But then you are stuck with the maintenance burden..

Often getting that CLA signed involves getting at least the legal department involved, especially in larger, more corporate organizations. In those cases, indefinitely maintaining a patch may sound like a good alternative.

I get paid for that

It's also about giving back to the open source community.

Well that I believe in. It's also about having a github of green.

(for Java)

(when using server-side prepared statements, which are disabled by default)

And 250% throughput increase with their cache when using their compound cache key which is not strictly related to server-side prepared statements.

And even so, kudos to the author for making a real contribution.

the equals and hashcodes are kind of inconsistent as well. hashcode supports null for the second component (explicitly in patch, implicitly previously) and equals doesn't check for null in the second component. i guess second component can never be null and probably both can never be null and none of the checks should be there.

Sounds like the classic ad selling something that is actually not as fast as it claims

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact