
> a mapping of a String (could also be called a word) of arbitrary length to a fixed length String

Right. It takes a longer sequence of bits and reduces it to a shorter sequence of bits, with a resulting chance for collision.
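To make the collision point concrete, here is a minimal sketch in Python. `tiny_hash` is a made-up toy function, not anything V8 uses. It squeezes strings of any length into 8 bits, so 257 distinct inputs cannot all receive distinct outputs.

```python
def tiny_hash(text: str) -> int:
    """Toy hash (hypothetical): reduce a string of any length to 8 bits (0..255)."""
    h = 0
    for byte in text.encode("utf-8"):
        h = (h * 31 + byte) % 256
    return h


# 257 distinct inputs, but only 256 possible outputs: by the pigeonhole
# principle at least two inputs must share a hash code.
inputs = [f"key{i}" for i in range(257)]
hashes = [tiny_hash(s) for s in inputs]

print(len(set(inputs)), "distinct inputs")
print(len(set(hashes)), "distinct hash codes")
```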

> since the identity is a virtual concept it replaces the identity with a randomly generated number

It's not replacing anything. Here's another way to think of it. V8 needs this function:

    int getHash(Object object) { ... }

Given a reference to some object, it returns a hash code for it, based on the identity of the object. The requirements are:

1. Returned values should be nicely distributed over the number space.

2. If you pass the same object in, you must get the same hash code out, even if the object is mutated between calls.

This sounds like a reasonable definition of a hash function, right?

If all this function could see were the object's properties, and it had to be pure (it couldn't modify anything), then it would actually be impossible to implement: the properties can change between calls, and two distinct objects can have exactly the same properties. (This, strangely enough, touches on fundamental ideas around mutability, identity, and equality.)
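A small Python sketch of why the pure version fails. `Point` and `pure_hash` are hypothetical names invented for illustration, and the sketch assumes integer-valued fields. A hash computed only from the current properties changes when the object is mutated, which breaks requirement 2, and it cannot tell apart two distinct objects that happen to hold the same values.

```python
class Point:
    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y


def pure_hash(obj) -> int:
    """Hypothetical pure hash: reads only the object's current (integer) fields."""
    h = 17
    for _name, value in sorted(vars(obj).items()):
        h = (h * 31 + value) & 0xFFFFFFFF
    return h


p = Point(1, 2)
before = pure_hash(p)
p.x = 99                      # mutate the same object
after = pure_hash(p)
print(before == after)        # False: same object, different hash code

q, r = Point(1, 2), Point(1, 2)
print(q is r)                       # False: two distinct objects
print(pure_hash(q) == pure_hash(r)) # True: identity is invisible to pure_hash
```

Returning a constant would satisfy requirement 2, but then requirement 1 (a nice distribution) is lost.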

So V8 relaxes the requirement that the function be pure. getHash() then generates a random number for the object, stuffs it in some hidden chunk of memory inside the object (what the linked article is about) and declares by fiat that that number is now the hash code for that object's otherwise-intangible identity.




> This sounds like a reasonable definition of a hash function, right?

Right.

> So V8 relaxes the requirement that the function be pure. getHash() then generates a random number for the object, stuffs it in some hidden chunk of memory inside the object (what the linked article is about) and declares by fiat that that number is now the hash code for that object's otherwise-intangible identity.

Thanks for writing it out like that. That's exactly how I understood it, and it is good to have that confirmed.

I understand why that is called a hash in V8 now, but I feel exactly like ninkendo writes in a comment above:

> You’d just as well call it an “equality checking code”, or name it after any other use case you could think of for a unique identifier. Ruby calls its equivalent construct “object_id” and that makes so much more sense.

But your position is not wrong. This is only about what one expects each word to stand for, and since I agree that the definition of a hash fits this scheme, the difference comes down to connotations. Thinking of this hash as an object_id makes it clearer to me why this is done and how it could be used (I'd absolutely use an object_id as input to a hash function, so the usage need not differ, though that is not what is done here).


> You’d just as well call it an “equality checking code”, or name it after any other use case you could think of for a unique identifier. Ruby calls its equivalent construct “object_id” and that makes so much more sense.

The number V8 generates for each object is not guaranteed to be unique. Like any hash code, collisions between distinct objects may happen. That's another reason why I think it's useful to consider it a hash and not an "equality checking number" or "ID".
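Here is a hedged Python sketch of what that implies for a table keyed on such codes. `IdentityMap` and `Node` are hypothetical, and the hash space is deliberately shrunk to 2 bits so that five objects are guaranteed to collide. Because codes can repeat, the lookup still has to compare identities within a bucket; the code only narrows the search.

```python
import random


class IdentityMap:
    """Toy map keyed by object identity, using a per-object stored hash code."""

    def __init__(self, hash_bits: int = 2):
        self._hash_bits = hash_bits
        self._buckets = {}  # hash code -> list of (key, value) pairs

    def _get_hash(self, key) -> int:
        code = key.__dict__.get("_identity_hash")
        if code is None:
            code = random.getrandbits(self._hash_bits)
            key.__dict__["_identity_hash"] = code
        return code

    def set(self, key, value) -> None:
        bucket = self._buckets.setdefault(self._get_hash(key), [])
        for i, (existing, _old) in enumerate(bucket):
            if existing is key:            # identity check, not hash equality
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        for existing, value in self._buckets.get(self._get_hash(key), []):
            if existing is key:
                return value
        raise KeyError(key)


class Node:
    pass


nodes = [Node() for _ in range(5)]
table = IdentityMap(hash_bits=2)           # only 4 possible codes for 5 objects
for i, node in enumerate(nodes):
    table.set(node, i)

codes = [node._identity_hash for node in nodes]
print(len(set(codes)) < len(nodes))        # True: at least two nodes share a code
print([table.get(n) for n in nodes])       # each node still maps to its own value
```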


Good point!



