Ask Matz: why the id of nil is 4? (bigbinary.com)
58 points by potomak on Feb 2, 2012 | 18 comments



Fixnum values (integers that fit in one native machine word minus 1 bit) are stored directly in the object pointer. When you call object.id, it returns the raw value stored in that pointer.

For Ruby to know that the value in a pointer is a Fixnum (and not a pointer to an address), it will tag the first bit w/ 1 and shift the integer value by one bit. So storing the value 7 will be done like this:

  (7 << 1) + 1 # => 14
This is why 7.id == 14.

You can see where this is implemented right here: https://github.com/ruby/ruby/blob/trunk/include/ruby/ruby.h#...

true, false and nil are also stored right in the object pointer, using special values defined here: https://github.com/ruby/ruby/blob/trunk/include/ruby/ruby.h#.... You'll notice that in binary all 3 values (false is 0, true is 2, nil is 4) end with a 0 bit. Lowest bit set to 1 means Fixnum; lowest bit 0 with one of these small values means a special value: true, false or nil.


Sorry for nitpicking, but you meant 15, not 14. Other than that, it seems you are 100% right.


Oops! Indeed

  (7 << 1) + 1 # => 15
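
The tagging described above can be sketched with plain integer arithmetic. The helper names below (tag_fixnum, fixnum_word?, untag_fixnum) are hypothetical and only mimic what MRI does in C; they are not part of Ruby.

```ruby
# Hypothetical helpers (not part of Ruby) that mimic MRI's Fixnum tagging.

# Shift the integer left by one bit and set the lowest bit as the tag.
def tag_fixnum(n)
  (n << 1) | 1
end

# A tagged word holds a Fixnum when its lowest bit is 1.
def fixnum_word?(word)
  (word & 1) == 1
end

# An arithmetic right shift drops the tag bit and restores the integer,
# negative values included.
def untag_fixnum(word)
  word >> 1
end

tag_fixnum(7)     # => 15
untag_fixnum(15)  # => 7
tag_fixnum(-3)    # => -5
fixnum_word?(4)   # => false (4 is the word used for nil)
```

Every tagged Fixnum comes out odd, which is why even words stay free for real pointers and for the special values.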


Might also want to use 'last' instead of 'first' in this part of your explanation

> tag the first bit w/ 1

I was confused by your explanation until I went and redid the binary on the original, then I got what you meant. Sorry if it's nitpicky. Ignore if you got it.



I found it very odd that he would place false before true, but then I realized that's probably just the C mindset leaking through.


Since in Ruby false and nil are the only "falsish" values and everything else is true, their values have been chosen to make a boolean test as efficient as possible. See: https://github.com/ruby/ruby/blob/trunk/include/ruby/ruby.h#...
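
A rough sketch of why those values make the test cheap, assuming the MRI 1.9-era constants (false stored as 0, true as 2, nil as 4). The rtest method is a hypothetical Ruby port of a mask-and-compare truthiness check, not an existing Ruby API; other versions and implementations may use different values.

```ruby
# Assumed MRI 1.9-era special values; not guaranteed for other versions.
QFALSE = 0
QTRUE  = 2
QNIL   = 4

# Hypothetical port of a C-style truthiness test: clear nil's bit, then
# check whether anything is left. Only 0 (false) and 4 (nil) mask to 0.
def rtest(word)
  (word & ~QNIL) != 0
end

rtest(QFALSE)  # => false
rtest(QNIL)    # => false
rtest(QTRUE)   # => true
rtest(15)      # => true (15 is the tagged Fixnum 7)
```

With false at 0 and nil differing from it by a single bit, one AND and one comparison against zero cover both falsish values at once.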


wow I didn't know you can multi-line highlight code like that in GitHub. Thanks for the tip!

(click on the second github link and you'll note multiple lines are highlighted. check the url for how it's done)


You may find Gudeman's "Representing Type Information in Dynamically Typed Languages" (http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.43...) interesting - it's an overview of type tagging strategies used in many dynamically typed languages. It's not specific to Ruby (it's from 1993), but gives more context to the methods Ruby uses.


Mark Zuckerberg's Facebook user ID is 4 (https://graph.facebook.com/4), so it's always fun when you have a nil bug and then you think Zuckerberg is using your site.



Much less relevant today with Ruby 1.9.x, but still very interesting.


Less relevant how? nil.object_id is still 4 in MRI 1.9.2.

The article noted that all integers have odd ids, but not why: the LSB is a tag bit (http://c2.com/cgi/wiki?TagBit) in MRI. I would hope other implementations are permitted to choose different schemes; it would suck if user code began assuming nobody found a use for more than one tag bit.


This situation was most often encountered by Rails developers trying to access the ID of an ActiveRecord object. Since the method was renamed to object_id in Ruby 1.9, calling id on a nil object should just give a NoMethodError instead of that slightly cryptic warning.
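
A minimal sketch of that failure mode on Ruby 1.9 or later. The id_or_error helper and the nil record are hypothetical stand-ins for a failed ActiveRecord lookup; no Rails code is involved.

```ruby
# Hypothetical helper: call id on whatever a lookup returned and report
# whether Ruby raised NoMethodError.
def id_or_error(record)
  record.id
rescue NoMethodError
  :no_method_error
end

# Stand-in for a lookup that found nothing.
missing_record = nil

id_or_error(missing_record)              # => :no_method_error
missing_record.respond_to?(:object_id)   # => true
```

On 1.8 the same call would have returned 4 with a warning, which is how the nil id could end up treated as a real record ID.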


Ah, that was a mistake. ActiveRecord should have avoided :id until it was renamed in core Ruby, or returned some sentinel (other than nil) whose :id method fails. Assuming Ruby didn't critically depend on :id succeeding for every object, that is.


Yes, you're right but, as you observed, my question is driven by curiosity.


Hm, I think it's better style to refer to true and false as true and false (their keywords) instead of TRUE and FALSE (constants, could go away in the future?). :)

  irb(main):001:0> TRUE
  => true
  irb(main):002:0> FALSE
  => false


Interesting. I had always assumed it had something to do with word size being 4.



