2

Given this array (generated from a file)

["Yonkers", "DM1210", "70.00 USD"], ["Yonkers", "DM1182", "19.68 AUD"], 
["Nashua", "DM1182", "58.58 AUD"], ["Scranton", "DM1210", "68.76 USD"], 
["Camden", "DM1182", "54.64 USD"]]

I convert it to a hash indexed by the second element (the sku) with the code below:

result = Hash.new([])
trans_data.each do |arr|
  result[arr[1]].empty? ? result[arr[1]] = [[arr[0], arr[2]]] : result[arr[1]] << [arr[0], arr[2]] 
end
result

This outputs the hash in the format I want it:

{"DM1210"=>[["Yonkers", "70.00 USD"], ["Scranton", "68.76 USD"]], "DM1182"=>[["Yonkers", "19.68 AUD"], ["Nashua", "58.58 AUD"], ["Camden", "54.64 USD"]]}

I don't feel like my code is... clean. Is there a better way of accomplishing this?

EDIT: So far I was able to replace it with: (result[arr[1]] ||= []) << [arr[0], arr[2]]

With no default value for the hash

4
  • Is this an office reference? DM = Dunder Mifflin? :) Commented May 31, 2013 at 7:37
  • A perfect question, giving input and expected output and what you tried so far. Well done! Commented May 31, 2013 at 9:36
  • @ThomasKlemm Actually, the expression at the beginning is not a valid Ruby expression. And it is not explained what trans_data is. And I don't know what sku means. Commented May 31, 2013 at 10:08
  • @sawa That might be awkward for some people but certainly it wouldn't stop you from helping, guessing what he's looking for and pointing him in the right direction. I'm sure that you'll answer this question just as marvelously no matter what. Commented May 31, 2013 at 13:18

6 Answers

7

Looks like people need to learn about group_by:

ary = [
  ["Yonkers", "DM1210", "70.00 USD"], ["Yonkers", "DM1182", "19.68 AUD"],
  ["Nashua", "DM1182", "58.58 AUD"], ["Scranton", "DM1210", "68.76 USD"],
  ["Camden", "DM1182", "54.64 USD"]
]
hash = ary.group_by{ |a| a.slice!(1) }

Which results in:

=> {"DM1210"=>[["Yonkers", "70.00 USD"], ["Scranton", "68.76 USD"]], "DM1182"=>[["Yonkers", "19.68 AUD"], ["Nashua", "58.58 AUD"], ["Camden", "54.64 USD"]]}
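To make the mechanics a little more concrete, here is a small sketch (the variable names rows, copies, by_sku and stripped are made up for illustration). It assumes nothing beyond core Ruby: group_by uses the block's return value as the key and collects the elements themselves, and slice!(1) returns the same value as row[1] while also removing it from the row.

```ruby
rows = [
  ["Yonkers", "DM1210", "70.00 USD"],
  ["Nashua", "DM1182", "58.58 AUD"],
  ["Scranton", "DM1210", "68.76 USD"]
]

# group_by calls the block once per element, uses the block's return value
# as the hash key, and appends the element itself to that key's array.
by_sku = rows.group_by { |row| row[1] }
# by_sku["DM1210"] still holds the full three-element rows.

# slice!(1) returns the same value as row[1], but it also deletes that
# element from the row, so the grouped rows end up with two elements.
# Work on copies here so that `rows` is left intact.
copies = rows.map(&:dup)
stripped = copies.group_by { |row| row.slice!(1) }
```

The side effect is the thing to watch: after the slice! version runs, the arrays that were grouped no longer contain the sku.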

It's possible to write this fairly succinctly without slice!, allowing ary to remain unchanged, and without the need to pull in any extra classes or modules:

irb(main):036:0> Hash[ary.group_by{ |a| a[1] }.map{ |k, v| [k, v.map{ |a,b,c| [a,c] } ] }]
=> {"DM1210"=>[["Yonkers", "70.00 USD"], ["Scranton", "68.76 USD"]], "DM1182"=>[["Yonkers", "19.68 AUD"], ["Nashua", "58.58 AUD"], ["Camden", "54.64 USD"]]}
irb(main):037:0> ary
=> [["Yonkers", "DM1210", "70.00 USD"], ["Yonkers", "DM1182", "19.68 AUD"], ["Nashua", "DM1182", "58.58 AUD"], ["Scranton", "DM1210", "68.76 USD"], ["Camden", "DM1182", "54.64 USD"]]
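For readers on Ruby 2.4 or later, the same non-mutating idea can probably be written without Hash[...] by using Hash#transform_values. This is only a sketch of an alternative; the block parameter names are chosen here for illustration.

```ruby
ary = [
  ["Yonkers", "DM1210", "70.00 USD"], ["Yonkers", "DM1182", "19.68 AUD"],
  ["Nashua", "DM1182", "58.58 AUD"], ["Scranton", "DM1210", "68.76 USD"],
  ["Camden", "DM1182", "54.64 USD"]
]

# Group the full rows by sku, then rewrite each group's rows
# to keep only the name and the price. ary itself is not modified.
hash = ary
  .group_by { |_, sku, _| sku }
  .transform_values { |rows| rows.map { |name, _, price| [name, price] } }
```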

Several other answers are using each_with_object, which removes the need to coerce the returned array to a hash using Hash[...]. Here's how I'd use each_with_object to avoid a bunch of line-noise inside the block as they try to initialize unknown keys:

ary.each_with_object(Hash.new{ |h,k| h[k] = [] }) { |(a, b, c), h| 
  h[b] << [a, c] 
}
=> {"DM1210"=>[["Yonkers", "70.00 USD"], ["Scranton", "68.76 USD"]], "DM1182"=>[["Yonkers", "19.68 AUD"], ["Nashua", "58.58 AUD"], ["Camden", "54.64 USD"]]}

This takes advantage of Hash.new taking an initialization block that gets called when a key hasn't been previously defined.
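The difference from Hash.new([]) is worth spelling out. A short sketch, using made-up keys, of why the block form is the one that behaves as expected when appending:

```ruby
# Hash.new([]) hands back the SAME default array for every missing key
# and never stores the key in the hash.
shared = Hash.new([])
shared[:a] << 1
shared[:b] << 2
shared_keys = shared.keys        # no keys were ever assigned
shared_default = shared[:other]  # the one shared array, now holding both values

# The block form runs once per missing key and stores a fresh array.
per_key = Hash.new { |h, k| h[k] = [] }
per_key[:a] << 1
per_key[:b] << 2
```

With the shared default, the hash looks empty even though values were appended, which is why the code in the question had to assign explicitly on first use.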



10 Comments

No, it's all Ruby; It's a great language.
Can you explain exactly how this works?? I mean it looks like it sets the key as a[1] (which is cut with slice! and removes it from the array) and has the values set to whatever is left over. I looked at the documentation here: ruby-doc.org/core-2.0/Enumerable.html#method-i-group_by But I still don't understand how it works
I will go with this answer. +1 to you.
@Senjai, that's what Ruby is doing.
You have ary = [a1, a2, a3], where a1 = [1,2,3]. Then a1.slice!(1) returns 2 (the same value as a1[1]) and leaves a1 == [1,3].
4

Functional approach using the abstraction Enumerable#map_by from Facets:

require 'facets'
records.map_by { |name, key, price| [key, [name, price]] }
#=> {"DM1210"=>[["Yonkers", "70.00 USD"], ... }

It's a pity that Ruby does not ship map_by in the core. It's a very useful (though little-known) variation of Enumerable#group_by in which you choose both the grouping key and the value to accumulate.
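For anyone who would rather not pull in Facets, the idea can be approximated in plain Ruby. The helper below is a hypothetical sketch (the name pairs_to_grouped_hash is invented here, and this is not the Facets implementation): the block returns a [key, value] pair, and the values are accumulated per key.

```ruby
# Hypothetical helper, NOT the Facets implementation of map_by.
# The block must return a two-element [key, value] array.
def pairs_to_grouped_hash(enum)
  enum.each_with_object({}) do |item, acc|
    key, value = yield(item)
    (acc[key] ||= []) << value
  end
end

records = [
  ["Yonkers", "DM1210", "70.00 USD"], ["Yonkers", "DM1182", "19.68 AUD"],
  ["Nashua", "DM1182", "58.58 AUD"], ["Scranton", "DM1210", "68.76 USD"],
  ["Camden", "DM1182", "54.64 USD"]
]

grouped = pairs_to_grouped_hash(records) { |name, key, price| [key, [name, price]] }
```

Because the pair is always destructured, a falsy value such as false is accumulated like any other value.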

3 Comments

I just checked map_by. I think it is tricky to use, particularly when the second element of an array is falsy. For example, [[:a, true], [:b, false]].map_by{|v1, v2| [v1, v2]} gives { :a => [true], :b => [[ :b, false ]] }, which is counter-intuitive.
@sawa, you are right, but that's not a problem of the abstraction per se, it's a bug in the implementation: it should do a more thorough check of the returned value or drop the "act like group_by" 'feature' altogether (it makes no sense, if I want something to act like group_by... I use group_by). Of course it should return {:a=>[true], :b=>[false]}. I'll open a ticket.
Facets has some very usable code in it, similar to Rails. I'd like to see the core team do some cherry picking and pull in more of the utility methods from both. Active Support has some great tools I often pull in though I seldom work in Rails.
3

What about

result = trans_data.each_with_object({}) do |arr, hash|
  (hash[arr[1]] ||= []) << [arr[0], arr[2]]
end

4 Comments

I didn't even know each_with_object existed. So does that pass in an empty hash on every iteration, or is it persistent?
Let's say it keeps track of the object's object_id you passed. So you don't need to repass the object each time like in inject.
Ahh, I understand. Excellent. Thanks for teaching me something new and taking your time out to help others :)
Aha! I didn't know about each_with_object either. Updating!
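The contrast with inject mentioned in the comments can be sketched as follows. The variable names are illustrative, and trans_data is assumed to be the array from the question. With inject the block has to return the accumulator on every iteration, while each_with_object keeps passing the same object and returns it at the end.

```ruby
trans_data = [
  ["Yonkers", "DM1210", "70.00 USD"], ["Yonkers", "DM1182", "19.68 AUD"],
  ["Nashua", "DM1182", "58.58 AUD"], ["Scranton", "DM1210", "68.76 USD"],
  ["Camden", "DM1182", "54.64 USD"]
]

# inject: the accumulator comes first, and the block's return value
# becomes the accumulator for the next iteration.
with_inject = trans_data.inject({}) do |hash, (city, sku, price)|
  (hash[sku] ||= []) << [city, price]
  hash # must be returned explicitly
end

# each_with_object: the element comes first, and the same hash object
# is passed to every iteration regardless of what the block returns.
with_object = trans_data.each_with_object({}) do |(city, sku, price), hash|
  (hash[sku] ||= []) << [city, price]
end
```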
2

Note: Accepted answer is best answer, but I'm really happy with the weird awesomeness I use and how I explain it:

arr = [["Yonkers", "DM1210", "70.00 USD"], ["Yonkers", "DM1182", "19.68 AUD"],
       ["Nashua", "DM1182", "58.58 AUD"], ["Scranton", "DM1210", "68.76 USD"],
       ["Camden", "DM1182", "54.64 USD"]]
arr.each_with_object({}) { |(a, b, c), hash| (hash[b] || hash[b] = []).push [a, c] }

Props to Older God for each_with_object!

Explanation: There are two wacky things going on here. The first, the (a, b, c) magic, I think it works like this:

( 

  #This bit:
  arr.collect{|(a,b,c)| "#{a}#{b}#{c}"}

) - (

  #Is equivalent to this bit:
  (0...arr.size).collect {|i|
    (a,b,c) = arr[i] #=> (a,b,c) = ["Yonkers", "DM1210", "70.00 USD"]
    "#{a}#{b}#{c}"
  }

  #as you can see, they generate identical arrays:
) == []

Note that you can treat the parens as implicit in certain circumstances: arr.collect{|a, b, c| [a, b, c]} == arr
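A small sketch of when the parentheses matter, using a shortened copy of the data (the names rows and labels are made up here). When the block receives a second argument, as with each_with_index.map, the parens are what tell Ruby to unpack the first argument.

```ruby
rows = [["Yonkers", "DM1210", "70.00 USD"], ["Nashua", "DM1182", "58.58 AUD"]]

# The block is yielded two arguments: the row and the index.
# The parens destructure the row; i receives the index.
labels = rows.each_with_index.map do |(city, sku, price), i|
  "#{i}: #{city} #{sku} #{price}"
end

# The same unpacking written as an ordinary multiple assignment.
city, sku, price = rows[0]
```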

The second wacky thing:

(hash[b] || hash[b]=[]).push(...)

Remember that everything in Ruby is both an expression and a reference.

[

 (hash[:a] || "foo") == (nil || "foo"),
 (hash[:b]=[]) == [],
 (hash[:b]=[]) === hash[:b],
 (hash[:b] || "foo") == ([] || "foo"),

] == [true, true, true, true]

hash[b], when the key does not exist, evaluates to nil (which is falsey), so we evaluate and return the second half: hash[b]=[]. That returns the value of the assignment, which is the array now referenced by hash[b], so we can push on to it, and hash[b] will still reference the updated array.
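As a quick check of that reasoning, here is a sketch with made-up keys. It also shows the `||=` spelling from the question's edit, which should behave the same way for this purpose.

```ruby
hash = {}

# First call: hash[:x] is nil, so the assignment runs and returns the new array.
first = (hash[:x] || hash[:x] = [])
first.push(1)

# Second call: hash[:x] is already an array (truthy), so no assignment happens.
(hash[:x] || hash[:x] = []).push(2)

# The conventional shorthand for the same initialize-or-reuse step.
(hash[:y] ||= []).push(3)

same_object = first.equal?(hash[:x]) # the pushed-to array is the stored one
```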

:D

PS - This is, I think, the first Ruby question I've ever answered, and it's the first time I've ever even thought of, let alone be able to, turn the comments into code, and oh my do I like it. Thank you for the puzzle!

4 Comments

I love your use of short circuit eval here! Brilliant
What you are calling "wacky" aren't really wacky at all. They might have surprised you, but they're logical uses of the language. We used the || trick for assignment, or initialization then assignment, in Perl years ago. It's probably borrowed from shell scripting with a bit different syntax, or C.
It's totally inherited from Perl (I did Perl before I did Ruby), and one of the things that I love about Ruby is that it has fewer "things" and those things are more similar: everything is an expression, everything is a reference. These kinds of tricks are less readable when you first start programming, but as you start gaining the concepts, they're /absolutely/ natural.
(Well, and I think Perl probably got it from Lisp, which got it from... and gave it to...)
0

try this out

arr = [["Yonkers", "DM1210", "70.00 USD"], ["Yonkers", "DM1182", "19.68 AUD"], ["Nashua", "DM1182", "58.58 AUD"], ["Scranton", "DM1210", "68.76 USD"], ["Camden", "DM1182", "54.64 USD"]] 

hash = Hash.new{|h,k| h[k] = []}

arr.each{|a| hash[a[1]].push([a[0],a[2]])}

hash # => {"DM1210"=>[["Yonkers", "70.00 USD"], ["Scranton", "68.76 USD"]], "DM1182"=>[["Yonkers", "19.68 AUD"], ["Nashua", "58.58 AUD"], ["Camden", "54.64 USD"]]}

1 Comment

Ahh, so that's the right way to set the default value. Previously my hash would be empty unless I indexed it.
0

More or less extracted from the facets library tokland suggests:

ary = [["Yonkers", "DM1210", "70.00 USD"], ["Yonkers", "DM1182", "19.68 AUD"], ["Nashua", "DM1182", "58.58 AUD"], ["Scranton", "DM1210", "68.76 USD"], ["Camden", "DM1182", "54.64 USD"]]

hash = {}
ary.each{ |a,b,c| (hash[b] ||= []) << [a,c] }

hash
# => {"DM1210"=>[["Yonkers", "70.00 USD"], ["Scranton", "68.76 USD"]], "DM1182"=>[["Yonkers", "19.68 AUD"], ["Nashua", "58.58 AUD"], ["Camden", "54.64 USD"]]}
