Hacker News
Show HN: Uint1Array – JavaScript Bit Array (npmjs.com)
24 points by 19eightyfour 180 days ago | 7 comments

Damn I really wish I had this 6 months ago when I was re-implementing a way to convert ImageData into a 1-bit BMP image directly in the browser to easily interface with a legacy system.

Have you tested the performance of this? Some slowdown is expected at larger array sizes, but are we talking orders of magnitude slower or just 2-3x slower?

I see most of the "public" methods are just using the `.toArray()` method of the private class, doesn't that just murder the performance and negate the whole reason for using this?

We eventually basically implemented a version of this ourselves, but we ended up doing it a little differently. We would pull a full uint8 from the array and break it out into the 8 bits, then just let the iterator return/set those one at a time until it's done; then it would write the whole 8 bits back to the "main" array at once, which massively sped things up for us.
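A minimal sketch of that byte-at-a-time approach (illustrative names only, not the actual Enterprise code): read each uint8 once, serve its 8 bits from a local variable, so the backing array is touched once per 8 bits instead of once per bit.

```javascript
// Byte-at-a-time bit iterator over a Uint8Array.
// Each byte is read from the backing array exactly once,
// then its 8 bits are yielded LSB-first from a local.
function* bits(u8) {
  for (let i = 0; i < u8.length; i++) {
    const byte = u8[i];        // one array read per 8 bits
    for (let b = 0; b < 8; b++) {
      yield (byte >> b) & 1;   // serve bits from the local copy
    }
  }
}

const data = new Uint8Array([0b00000101]); // bits 0 and 2 set
console.log([...bits(data)]); // [1, 0, 1, 0, 0, 0, 0, 0]
```

A setter would do the mirror image: accumulate 8 incoming bits in a local, then store the assembled byte back in a single write.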

Also, your getter to access the private section is adding a lot of overhead to accessing it. It's probably better to just forego the "absolute safety" of being unable to overwrite your private inner class and just rely on the symbol to prevent access to the private bits.

Thanks a lot for the feedback and info on your implementation! I'll do some perf tests and see which of your ideas it makes sense to include.

Is your code online?

Also, I didn't know that getters add more overhead than just a property slot. I'll research this.

I don't understand your criticism of `toArray()` because to me it's the best it can be. To explain: `toArray()` returns a cached array of the unpacked bits, unless a bit has been changed, then it lazily recomputes it. So that's pretty much as good as it can get. All the methods that use `toArray()` require an iteration over the bits, so the alternative to a call to `toArray()` would be to inline its code into every caller. Two reasons I didn't do that were DRY and because 1 more call in the stack won't have a huge overhead, IMHO. Still, more testing is required to know more perf details.
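The caching scheme described above can be sketched roughly like this (hypothetical names, not the module's actual internals): every write sets a dirty flag, and `toArray()` only re-unpacks the bits when that flag is set, otherwise returning the cached array.

```javascript
// Sketch of a lazily recomputed bit cache over a Uint8Array.
class BitCache {
  constructor(bytes) {
    this.buffer = Uint8Array.from(bytes);
    this.dirty = true;    // cache invalid until first unpack
    this.cache = null;
  }
  setBit(i, v) {
    const byte = i >> 3, off = i & 7;
    if (v) this.buffer[byte] |= 1 << off;
    else   this.buffer[byte] &= ~(1 << off);
    this.dirty = true;    // any write invalidates the cache
  }
  toArray() {
    if (this.dirty) {     // recompute only when a bit changed
      this.cache = [];
      for (let i = 0; i < this.buffer.length * 8; i++) {
        this.cache.push((this.buffer[i >> 3] >> (i & 7)) & 1);
      }
      this.dirty = false;
    }
    return this.cache;    // repeat calls reuse the same array
  }
}
```

Under this scheme, back-to-back reads pay the unpack cost once, but any read that follows a write pays the full O(n) unpack again, which is the scenario the parent comment is worried about.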

A possible optimization is factoring out the cache check from `toArray` and inlining that check into each method calling `toArray`, but that really only saves on a method call, so to me, it's negligible.

The design reason I've done things the way I did was because the aim of this class was convenience, simple code and good performance. I didn't factor it to be about absolute best performance. I think, if we're going for absolute best performance, an implementation in wasm or asm.js would work better, and I'm not interested in doing that right now.

So that's my point of view and I don't understand your criticism of `toArray()`. What am I missing in this perspective and what ideas do you have to create improvements?

I can't edit my comment any more so I'll just throw another one here.

My criticism of `toArray()` is that it's making the rest of the code mostly useless. If you are just going to keep a traditional array around in a cache at all times, and you need to convert your underlying storage to that array any time you iterate then why not just use an array directly? It would be faster, have lower memory usage, and be simpler.

The whole point of using a typed array is to avoid the overhead of a normal array, and in your implementation you not only are using a normal array (and all of the downsides of it, larger memory footprint, slower iteration, the JIT still needs to box and check types, etc...), but you are also keeping a copy of that data in a buffer, as well as converting between the 2 quite frequently and inefficiently (your method to build the array in `toArray()` reads each word from the buffer 8 times and just shifts it around a bit each time).

It would be easier, faster, have lower memory usage, and simpler to understand to just replace this whole module with `const bitArray = [0, 1, 1, 0, 1, 1, 0, 0]` at any size, since that's basically what it's doing under the hood but with a lot more complexity and overhead.

Sadly no, the code is in an Enterprise application and that part of the codebase is extremely unlikely to be open sourced.

And getters add overhead because accessing one invokes a function that returns the actual value. Honestly it's not that much, as most JS engines are really good at optimizing that kind of stuff, but when it comes to typed arrays every bit counts, as their whole purpose is the fast path of code.
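As an illustration of the trade-off being suggested (sketch only, not the module's actual code): a plain property keyed by a Symbol is a direct slot read and still hidden from casual enumeration, while an accessor property routes every access through a function call.

```javascript
// Plain property behind a Symbol key: direct slot access,
// and the field doesn't show up in Object.keys / JSON.
const PRIVATE = Symbol('private');

class WithSymbol {
  constructor() { this[PRIVATE] = new Uint8Array(8); }
  firstByte() { return this[PRIVATE][0]; }   // direct read
}

// Accessor property: every access invokes the getter function.
class WithGetter {
  constructor() { this._buf = new Uint8Array(8); }
  get buffer() { return this._buf; }         // extra call per access
  firstByte() { return this.buffer[0]; }
}
```

The Symbol version doesn't stop a determined caller (Object.getOwnPropertySymbols still exposes it), which is the "absolute safety" being traded away.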

And simpler is almost always better.

Ech, is that right? I don't understand `wordSizeShift`.

I have always done bitset tests as follows:

  this.set = function(bit, value) {
    const index = Math.floor(bit / BitsPerElement);
    const offset = bit - (index * BitsPerElement);

    let element = this.buffer[index];
    if (value)
      element |=  (1 << offset);
    else
      element &= ~(1 << offset);

    this.buffer[index] = element;
  };

  this.get = function(bit) {
    const index = Math.floor(bit / BitsPerElement);
    const offset = bit - (index * BitsPerElement);

    const element = this.buffer[index];
    return (element & (1 << offset));
  };
I could have the code wrong or overly complex.

Thank you for posting your code. I'll see what it makes sense to include.

That certainly looks tight / good, what you've got there!

A reason I wrote it the way I did was that I wanted to get rid of the conditional testing whether value is zero or not.

Also `wordSizeShift` is just the number of bits to right shift by to divide by the word size, to save a division.
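Concretely (assuming 8-bit words, in which case `wordSizeShift` would be 3): the shift replaces the division, and a mask can likewise replace the subtraction used to get the in-word offset.

```javascript
// Division/subtraction form vs. shift/mask form for locating a bit.
// For 8-bit words, wordSizeShift is 3 because 2 ** 3 === 8.
const wordSizeShift = 3;
const wordSizeMask  = (1 << wordSizeShift) - 1; // 0b111

function locate(bit) {
  return {
    index:  bit >>> wordSizeShift,  // same as Math.floor(bit / 8)
    offset: bit & wordSizeMask,     // same as bit % 8
  };
}

console.log(locate(13)); // { index: 1, offset: 5 }
```

This only works because the word size is a power of two; for arbitrary word sizes the division form in the parent comment is the general one.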

There were a few comments you deleted before I could reply, so I just wanted to put this here.

I don't mean any harm with my criticisms here, this was an area that I explored a lot with specific goals in mind and I was just trying to talk about why you went with the direction you did.

I didn't mean any harm!
