> Every user and organization on GitHub.com with Git LFS enabled will begin with 1 GB of free file storage and a monthly bandwidth quota of 1 GB.
Does this mean that with the free tier I can upload a 1GB file which can be downloaded at most once a month?
Even a small 10MB file, which fits comfortably in a git repo, could be downloaded only 100 times a month. Maybe they meant 1TB bandwidth?
I suspect the point is rather that you have a bunch of megabyte-range files that you rarely update and don't have to sync often. But for most of the workflows this feature seems targeted at, the free tier seems insufficient.
Yes, that is the free tier. If you want to use it seriously, it will cost some money. Or you can use the tool with your own self-hosted file server, for free.
I don't think that's a problem. They've created an open-source tool, they provide a place to try it out, and a paid service if you like it and don't want to self-host. Seems like a fair offer.
If you use a fast hash function you still have to do a lookup in a hash table, so the memory access is still there. With a clever probing strategy you might get down to one cache-line access per lookup on average, while most perfect hash functions, which are based on the MWHC scheme, do 3 random accesses.
However, these 3 random accesses are independent, so the CPU can mostly pipeline them, and the perceived latency is close to that of a single random access.
cmph is not the fastest library for MPHFs (minimal perfect hash functions), mostly because it uses a slow ranking function to turn a PHF into an MPHF.
I wrote a small library some time ago that performs much faster, both for in-memory construction and for lookups [1; see the linked paper for benchmarks]. Unfortunately I have no time to maintain it, but I recently found out that some projects are using it, so it wasn't all wasted time :)
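For context, the ranking step mentioned above is conceptually simple: run the keys through the PHF, mark which slots of its [0, n) range are actually used, and define the minimal function as the rank (count of used slots before) of the PHF value. A naive sketch (my own illustration — real libraries store a compressed rank/select structure instead of a full prefix table, and that structure's query cost is exactly where implementations differ):

```python
def minimize(phf, keys, n):
    """Turn a PHF with range [0, n) into an MPHF onto [0, len(keys)).
    Naive version: precomputes a full prefix-count table of size n;
    succinct rank structures achieve the same in o(n) extra bits."""
    used = [False] * n
    for k in keys:
        used[phf(k)] = True
    rank = [0] * n   # rank[p] = number of used slots strictly before p
    count = 0
    for p in range(n):
        rank[p] = count
        count += used[p]
    return lambda key: rank[phf(key)]
```

Since `rank` is monotone over the used slots, the resulting MPHF is a bijection from the keys onto 0..m-1, and each lookup costs one PHF evaluation plus one extra memory access into the rank structure.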