
Given an array of Ruby hashes like this:

[{"lib1"=>"30"}, {"lib2"=>"30"}, {"lib9"=>"31"}, {"lib2"=>"31"}, {"lib3"=>"31"}, {"lib1"=>"32"}, {"lib2"=>"32"}, {"lib1"=>"33"}, {"lib3"=>"36"}, {"lib2"=>"36"}, {"lib1"=>"37"}]

How do I get a hash like this:

{"lib1"=>[30,32,33,37], "lib2"=>[30,31,32,36], "lib3"=>[31,36], "lib9"=>[31]}

5 Answers

3
a = [{"lib1"=>"30"}, {"lib2"=>"30"}, {"lib9"=>"31"}, {"lib2"=>"31"}, {"lib3"=>"31"}, {"lib1"=>"32"}, {"lib2"=>"32"}, {"lib1"=>"33"}, {"lib3"=>"36"}, {"lib2"=>"36"}, {"lib1"=>"37"}]

a.map(&:to_a).flatten(1).each_with_object({}) do |(k, v), h|
  h[k] ||= []
  h[k] << v
end
#=> {"lib1"=>["30", "32", "33", "37"],
#    "lib2"=>["30", "31", "32", "36"],
#    "lib9"=>["31"],
#    "lib3"=>["31", "36"]}

Alternatively:

Hash[a.map(&:to_a).flatten(1).group_by(&:first).map { |k, v| [k, v.map(&:last)] }]
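Both versions above leave the values as strings, while the question's expected output shows integers. A minimal sketch of one way to convert them, assuming flat_map is available (Ruby 1.9.2 or later) and using a shortened copy of the sample data; the names pairs and grouped are only illustrative:

```ruby
a = [{"lib1"=>"30"}, {"lib2"=>"30"}, {"lib9"=>"31"}, {"lib2"=>"31"}]

# flat_map(&:to_a) yields [key, value] pairs, like map(&:to_a).flatten(1)
pairs = a.flat_map(&:to_a)

grouped = pairs.each_with_object({}) do |(k, v), h|
  (h[k] ||= []) << v.to_i   # to_i turns "30" into 30
end

p grouped
```

This is the same grouping as before with one fewer intermediate call; whether it reads better is a matter of taste.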

If you're willing to use Facets then this becomes absurdly simple with collate:

a.inject(:collate)

3 Comments

Perfect. I spent 2 days on this (learning ruby) and to see it explained so succinctly is sublime.
@cymorg I've added an alternative solution as well. I still think there must be a shorter way, but no matter what I do I can't escape the map(&:to_a).flatten(1).
FYI, the OP's example result shows the values coerced to Fixnums. Ordinary nested iteration makes only one pass and generates fewer intermediate objects.
2
t = [{"lib1"=>"30"}, {"lib2"=>"30"}, {"lib9"=>"31"}, {"lib2"=>"31"},
  {"lib3"=>"31"}, {"lib1"=>"32"}, {"lib2"=>"32"}, {"lib1"=>"33"},
  {"lib3"=>"36"}, {"lib2"=>"36"}, {"lib1"=>"37"}]
result = {}

t.group_by { |x| x.keys.first }.each_pair do |k, v|
  result[k] = v.map { |e| e.values.first }
end

Or, for a more purely functional version that fits, more or less, on the all-important single line (it is Ruby, after all):

Hash[t.group_by { |x| x.keys[0] }.map { |k, v| [k, v.map { |e| e.values[0] }]}]
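On Ruby 2.4 or later (an assumption, since this thread predates it), Hash#transform_values can stand in for the Hash[...map...] wrapper. A rough sketch that also converts the values to integers, as the question's expected output suggests:

```ruby
t = [{"lib1"=>"30"}, {"lib2"=>"30"}, {"lib9"=>"31"}, {"lib2"=>"31"},
  {"lib3"=>"31"}, {"lib1"=>"32"}, {"lib2"=>"32"}, {"lib1"=>"33"},
  {"lib3"=>"36"}, {"lib2"=>"36"}, {"lib1"=>"37"}]

# group_by keeps the whole single-pair hashes; transform_values then
# reduces each group to its integer values.
result = t.group_by { |x| x.keys.first }
          .transform_values { |hashes| hashes.map { |e| e.values.first.to_i } }

p result
```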

1 Comment

+1. Don't know why I didn't bother to try the group_by first.
2

An alternative to Andrew's answer that avoids flatten and to_a and uses plain nested iteration. It also gathers every key when an element hash in the source array holds more than one.

a.each_with_object({}) do |element,result|
  element.each do |k,v|
    (result[k] ||= []) << v.to_i
  end
end

Golfed to one-line:

a.each_with_object({}) {|e,r| e.each {|k,v| (r[k] ||= []) << v.to_i } }
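To illustrate the multiple-key point, here is a small sketch with made-up input (the mixed array is hypothetical, not the OP's data). It compares the nested iteration with a group_by on keys.first, which only looks at the first pair of each hash:

```ruby
# Hypothetical input: the first element carries two keys.
mixed = [{"lib1"=>"30", "lib2"=>"30"}, {"lib2"=>"31"}, {"lib1"=>"32"}]

nested = mixed.each_with_object({}) do |element, result|
  element.each { |k, v| (result[k] ||= []) << v.to_i }
end

first_key_only = Hash[
  mixed.group_by { |h| h.keys.first }
       .map { |k, v| [k, v.map { |e| e.values.first.to_i }] }
]

p nested          # every pair is collected
p first_key_only  # the "lib2"=>"30" pair is lost
```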

I would note that this version examines each source element only once, while the to_a/flatten and group_by answers involve multiple iterations over the source or transformations of the source.

Andrew makes a good point that constant factors in big-O algorithm complexity are often a wash in reality. I put together a quick benchmark of the answers supplied so far (correcting them all to cast values to Fixnum, as the OP's example implies). My nested iteration approach does turn out to be somewhat (23-45%) faster with the OP's example source data:

ruby 1.9.2p318 (2012-02-14 revision 34678) [x86_64-linux]
Rehearsal ------------------------------------------------------------
to_a_flat                  3.100000   0.000000   3.100000 (  3.105873)
to_a_flat_construct        4.060000   0.000000   4.060000 (  4.076938)
group_by_each              3.010000   0.000000   3.010000 (  3.015367)
group_by_each_construct    3.040000   0.000000   3.040000 (  3.050500)
nested_iter                2.300000   0.000000   2.300000 (  2.307776)
-------------------------------------------------- total: 15.510000sec

                               user     system      total        real
to_a_flat                  3.080000   0.000000   3.080000 (  3.096301)
to_a_flat_construct        4.050000   0.000000   4.050000 (  4.059409)
group_by_each              2.980000   0.000000   2.980000 (  2.997074)
group_by_each_construct    3.050000   0.000000   3.050000 (  3.057770)
nested_iter                2.300000   0.000000   2.300000 (  2.311855)
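The benchmark script itself is not shown, so the following is only a sketch of how such a comparison could be set up with Benchmark.bmbm from the standard library (the "Rehearsal" header in the output above is what bmbm prints). The iteration count N, the choice of three candidates, and the mapping of labels to snippets are all assumptions:

```ruby
require 'benchmark'

SOURCE = [{"lib1"=>"30"}, {"lib2"=>"30"}, {"lib9"=>"31"}, {"lib2"=>"31"},
  {"lib3"=>"31"}, {"lib1"=>"32"}, {"lib2"=>"32"}, {"lib1"=>"33"},
  {"lib3"=>"36"}, {"lib2"=>"36"}, {"lib1"=>"37"}]

N = 1_000 # hypothetical iteration count; the original figure is unknown

# Labels borrowed from the output above; which snippet each one
# measured originally is a guess.
CANDIDATES = {
  "to_a_flat" => lambda { |a|
    a.map(&:to_a).flatten(1).each_with_object({}) { |(k, v), h| (h[k] ||= []) << v.to_i }
  },
  "group_by_each_construct" => lambda { |a|
    Hash[a.group_by { |x| x.keys[0] }.map { |k, v| [k, v.map { |e| e.values[0].to_i }] }]
  },
  "nested_iter" => lambda { |a|
    a.each_with_object({}) { |e, r| e.each { |k, v| (r[k] ||= []) << v.to_i } }
  }
}

# bmbm runs a rehearsal pass first, then the measured pass.
Benchmark.bmbm do |bm|
  CANDIDATES.each do |label, candidate|
    bm.report(label) { N.times { candidate.call(SOURCE) } }
  end
end
```

Timings will differ by Ruby version and machine, so the numbers above should not be expected to reproduce exactly.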

3 Comments

However, it's worth noting that the number of iterations has no effect on time complexity. I say this only because you seem to imply that iterating only once is more efficient (it may be, but only real benchmarks can show that).
@AndrewMarshall just because I'm procrastinating on real work, I tossed together a little benchmark for ya.
+1 for real benchmarks :) (and I added the facets one to it to see how it fared… and boy is it slow, and now I'm working on making it faster)
0

Just to add to the spectrum: you can avoid each_with_object by defining the resulting Hash up front and using a regular .each loop. I find this easier to read than the each_with_object approach.

grouped = {}
hashes.each do |hash|
  hash.each do |key, value|
    (grouped[key] ||= []) << value.to_i
  end
end
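A related variation, sketched here as an option rather than a recommendation: giving the hash a default block removes the need for ||=. The hashes variable is the same illustrative name used above, filled with a shortened copy of the sample data:

```ruby
hashes = [{"lib1"=>"30"}, {"lib2"=>"30"}, {"lib9"=>"31"}, {"lib2"=>"31"},
  {"lib1"=>"32"}]

# The block runs on the first lookup of a missing key and stores an
# empty array under it, so << always has something to append to.
grouped = Hash.new { |hash, key| hash[key] = [] }

hashes.each do |hash|
  hash.each { |key, value| grouped[key] << value.to_i }
end

p grouped
```

One caveat with this design: merely reading a missing key (for example grouped["nope"]) also creates an entry, which the ||= version does not do.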


-1

Similar to DigitalRoss's, but I would do it in place:

array.group_by{|h| h.keys.first}.each{|_, a| a.map!{|h| h.values.first}}
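A short sketch of what that line evaluates to, with to_i added to match the integers in the question and a shortened sample array. Note that map! mutates the arrays that group_by created, not the source array:

```ruby
array = [{"lib1"=>"30"}, {"lib2"=>"30"}, {"lib1"=>"32"}]

# Hash#each returns the hash itself, so the chain evaluates to the
# grouped hash after each group has been rewritten in place.
result = array.group_by { |h| h.keys.first }
              .each { |_, group| group.map! { |h| h.values.first.to_i } }

p result
p array # the source array is left untouched
```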
