[LRUG] An interesting Ruby hash semantics gotcha

Daniel Tenner daniel.ruby at tenner.org
Wed May 7 08:03:46 PDT 2008


Agreed.

This is quite blog-worthy...anyone willing to post it up for posterity/ 
non-londoners? :-)

Daniel

On 7 May 2008, at 15:117 May 2008, James Adam wrote:

> Really nice post Matthew - it's good to know the quirks of our
> rarely-fickle mistress :)
>
> On Wed, May 7, 2008 at 2:50 PM, Matthew Willson <matthew at playlouder.com 
> > wrote:
>> Thought this might amuse some on here (or be useful to know - it's  
>> been the
>> source of a couple of hard-to-track-down bugs in the past).
>>
>>>> {{} => true}[{}]
>> => nil
>>
>>>> {{} => true, {} => false}
>> => {{}=>true, {}=>false}
>>
>> but yet,
>>
>>>> {} == {}
>> => true
>>
>> What's going on here?
>>
>> Ruby's Hashes behave very strangely when you try to use a Hash  
>> itself as a
>> key of a Hash.
>>
>> This acts as a subtle gotcha when you try to memoize a function  
>> which takes
>> hash arguments - and so a tricky-to-address bug in libraries like  
>> this:
>> http://raa.ruby-lang.org/project/memoize/
>>
>>
>> Why?
>>
>> Ruby calls Object#hash on each key of a Hash, using that numeric hash
>> (small h) to allocate that object to a bucket of the hash table data
>> structure. Equality, when it comes to Hash lookups and unique keys  
>> of a
>> Hash, will only work if the keys generate the same numeric hash as  
>> a result
>> of their hash methods.
>>
>> For most ruby data structures, x.hash == y.hash  is implied by   x  
>> == y,
>> and everything works fine.
>> But, not for Hashes themselves!
>>
>> (NB. this also affects data structures like Arrays which themselves  
>> contain
>> a Hash, since Array#hash must call hash recursively on its contents).
>>
>> (Interestingly, for things like 1.0 == 1, x.hash == y.hash also  
>> fails.
>> Note, x.hash == y.hash is always implied by x.eql?(y), but this  
>> equality
>> isn't a desperately useful one, and seems to have been constructed
>> artificially as an equality for use with Hash which is consistent  
>> with
>> .hash)
>>
>>
>>
>> Why might it have been implemented this way?
>>
>> Hashes are insensitive to the order of their keys - so, for  
>> example, we
>> have:
>> {:a => true, :b => true} == {:b => true, :a => true}
>>
>> When you're actually being given two concrete objects to compare,  
>> you can
>> just check that each key from the one has an equal corresponding  
>> value in
>> the other, and vice versa.
>>
>> But, when you're asked to generate a numeric hash which uniquely  
>> identifies
>> the equivalence class, you'd have to do something to ensure the  
>> hashing
>> isn't order-sensitive. Like ordering the key/value pairs by their  
>> individual
>> hashes before feeding into the hash function.
>>
>>
>> Some attempts at an answer in the form of an overridden Hash#hash:
>>
>> (1) sort key/value pairs by the numeric hash of the pair first:
>>
>> class Hash
>>  def hash
>>    sort_by {|pair| pair.hash}.hash
>>  end
>>
>>  def eql?(other)
>>    self == other   # hash is now consistent with ==, so we need to  
>> make
>> eql? consistent too
>>  end
>> end
>>
>> (2) Use an XOR of the hashes of the key/value pairs (XOR is
>> order-insensitive)
>>
>> class Hash
>>  def hash
>>    inject(0) {|hash,pair| hash ^ pair.hash}
>>  end
>>
>>  def eql?(other)
>>    self == other
>>  end
>> end
>>
>> These then fix, eg:
>>>> {}.hash == {}.hash
>> => true
>>
>>>> {{} => true}[{}]
>> => true
>>
>>>> {{} => true, {} => false}
>> => {{}=>false}
>>
>> (Note, overriding eql? is required to make the last two work)
>>
>> Now, I'm sure there's a reason Matz didn't do it this way - perhaps a
>> performance reason, perhaps a gotcha that I haven't noticed with my
>> approach. Perhaps it'll be fixed in 1.9.
>> But at any rate, it's useful to be aware of the issue.
>>
>> -Matt
>>
>> _______________________________________________
>> Chat mailing list
>> Chat at lists.lrug.org
>> http://lists.lrug.org/listinfo.cgi/chat-lrug.org
>>
>
>
>
> -- 
> * J *
> ~
> _______________________________________________
> Chat mailing list
> Chat at lists.lrug.org
> http://lists.lrug.org/listinfo.cgi/chat-lrug.org




More information about the Chat mailing list