Sort an array according to the elements of another array

Question

I have an array of ids

a1 = [1, 2, 3, 4, 5]

and I have another array of objects with ids in random order

a2 = [(obj_with_id_5), (obj_with_id_2), (obj_with_id_1), (obj_with_id_3), (obj_with_id_4)]

Now I need to sort a2 according to the order of ids in a1. So a2 should now become:

[(obj_with_id_1), (id_2), (id_3), (id_4), (id_5)]

a1 might be [3, 2, 5, 4, 1] or in any order but a2 should correspond to the order of ids in a1.

This is what I do at the moment:

a1.each_with_index do |id, idx|
  # Search a2 (not a1) for the object carrying this id, then swap it into place.
  found_idx = a2.find_index { |c| c.id == id }
  replace_elem = a2[found_idx]
  a2[found_idx] = a2[idx]
  a2[idx] = replace_elem
end

But this can still take O(n^2) time if the order of the elements in a2 is exactly the reverse of a1. Can someone please tell me the most efficient way to sort a2?

pguardiario · Accepted Answer · 2012-08-15 05:04:43Z

88

I'll be surprised if anything is much faster than the obvious way:

a2.sort_by{|x| a1.index x.id}
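As a rough sketch of how this plays out, the snippet below runs the one-liner on sample data and adds a variant that precomputes each id's position in a hash. The `Item` struct and the method names `sort_by_index` and `sort_by_rank` are made up for illustration and are not part of the answer. Since `Array#index` is a linear scan, the first form does O(n) work per element; the second does an O(1) hash lookup per element and leaves the cost dominated by the sort itself.

```ruby
# Hypothetical stand-in for the objects in a2; only #id matters here.
Item = Struct.new(:id)

# The answer's approach: one linear Array#index scan per element.
def sort_by_index(objects, ids)
  objects.sort_by { |obj| ids.index(obj.id) }
end

# Variant (an assumption, not from the answer): build {id => position} once,
# then look each id up in constant time.
def sort_by_rank(objects, ids)
  rank = ids.each_with_index.to_h
  objects.sort_by { |obj| rank.fetch(obj.id) }
end

a1 = [3, 2, 5, 4, 1]
a2 = [5, 2, 1, 3, 4].map { |id| Item.new(id) }

p sort_by_index(a2, a1).map(&:id)  # => [3, 2, 5, 4, 1]
p sort_by_rank(a2, a1).map(&:id)   # => [3, 2, 5, 4, 1]
```

Both forms assume every id in a2 also appears in a1; `rank.fetch` raises a KeyError otherwise, which is usually easier to debug than a failed nil comparison inside `sort_by`.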

answered Aug 15, 2012 at 5:04

pguardiario


8 Comments

Jato Over a year ago

Assuming a1 is sorted (and it is, per the problem statement) and the container you're using for a1 can take advantage of that, then I agree this would be faster than O(n^2).

pguardiario Over a year ago

No, a1 being sorted is not an advantage; I'm not sure why you would think that. This way is fast because it's built in. Trying to beat the built-in sort_by seems like a waste of time to me.

Jato Over a year ago

a1 being sorted is an advantage. If it is sorted then the index operation should run in O(log n) time (assuming binary search) and if it is not sorted the index will run in O(n) time.

Ari53nN3o Over a year ago

This method is blazing fast too, but using hashes is like travelling faster than the speed of light. I ran a test with both methods on 10,000 numbers (just for the sake of testing). Your method took 1.3 s on average, but with hashes it took 0.009 s on average.

isqad Over a year ago

-1 for this method. For example, with x an unsorted array of 1000 elements and x2 an array of the same elements, sorted:

Benchmark.bm { |t|
  t.report('test1') {
    x.index_by { |c| c }.values_at(*x2).compact
  }
  t.report('test2') {
    x.sort_by { |v| x2.index v }
  }
}

test1 real: 0.000709
test2 real: 0.048563
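For anyone who wants to rerun this comparison without ActiveSupport, here is a minimal sketch in plain Ruby. It assumes Ruby 2.6 or later (for `Array#to_h` with a block), and the method names `via_hash` and `via_sort_by` are invented for this example. Timings depend on the machine and Ruby version, so none are shown.

```ruby
require 'benchmark'

# Hash-based reordering: build {value => value}, then read it out in the target order.
def via_hash(values, order)
  values.to_h { |v| [v, v] }.values_at(*order).compact
end

# sort_by with a linear Array#index lookup for every element.
def via_sort_by(values, order)
  values.sort_by { |v| order.index(v) }
end

x  = (1..1000).to_a.shuffle  # unsorted input
x2 = x.sort                  # desired order

Benchmark.bm(8) do |t|
  t.report('hash')    { via_hash(x, x2) }
  t.report('sort_by') { via_sort_by(x, x2) }
end
```

The two methods should return the same array, so it is worth asserting that before trusting any timing difference.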


megas · Accepted Answer · 2012-08-14 22:46:53Z

27

hash_object = objects.each_with_object({}) do |obj, hash| 
  hash[obj.object_id] = obj
end

[1, 2, 3, 4, 5].map { |index| hash_object[index] }
#=> array of objects in id's order

I believe the running time will be O(n).

edited Aug 14, 2012 at 22:46

answered Aug 14, 2012 at 22:34

megas


7 Comments

Jato Over a year ago

I believe this would be O(n^2). The actual sort is O(n), but the preparation step would make it n^2.

megas Over a year ago

I don't agree. Building the hash table requires O(n); see en.wikipedia.org/wiki/Hash_table

Jato Over a year ago

Yes, building the hash table is O(n) time. And the sort is O(n) time. So you have 2 x O(n)... hmmm... that would be less than n^2. I stand corrected. Good catch!

ozzyaaron Over a year ago

The first step seems the same as using hash_object = objects.index_by(&:object_id)

johncip Over a year ago

@kamal: It's O(n), but doesn't do what's asked -- it'll return [nil, nil, nil, nil, nil] unless the object_ids happen to be the numbers 1 through 5. To make it work, you need to get the object_ids and sort them, which won't be any better than objects.index_by(&:object_id). Also, it isn't necessary to explain the O(n) claim here, but note that the O(n log n) lower bound only applies to comparison sorts.
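To illustrate the point about `object_id`, here is a hedged sketch of the same hash-lookup idea keyed on the objects' own `id` attribute instead. The `Record` struct and the method name `order_by_ids` are hypothetical; the thread does not say what class the objects in a2 are, only that they respond to `id`.

```ruby
# Hypothetical stand-in for the question's objects.
Record = Struct.new(:id, :name)

# Build {id => object} in one pass, then read the objects out in the order of ids.
# Both steps are linear, so the whole thing is O(n) on average.
def order_by_ids(objects, ids)
  by_id = objects.each_with_object({}) { |obj, hash| hash[obj.id] = obj }
  ids.map { |id| by_id[id] }
end

a1 = [1, 2, 3, 4, 5]
a2 = [5, 2, 1, 3, 4].map { |id| Record.new(id, "obj_with_id_#{id}") }

p order_by_ids(a2, a1).map(&:id)  # => [1, 2, 3, 4, 5]
```

An id that appears in a1 but not in a2 comes back as nil in the result, so callers may want to `compact` the output.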


Community · Accepted Answer · 2017-05-23 11:55:13Z

20

I like the accepted answer, but ActiveSupport has index_by, which makes creating the initial hash even easier. See "Cleanest way to create a Hash from an Array".

In fact, you can do this in one line, since ActiveSupport adds index_by to Enumerable as well:

a2.index_by(&:id).values_at(*a1)

edited May 23, 2017 at 11:55

CommunityBot


answered Aug 13, 2014 at 22:51

Eric Woodruff


1 Comment

Josh Over a year ago

This only works if you don't have any duplicates in your original list. index_by will overwrite any duplicate ids. This may or may not be an issue for you.
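The overwrite behaviour is easy to see with a small example. The sketch below uses plain Ruby's `Array#to_h` with a block (Ruby 2.6+) as a stand-in for ActiveSupport's index_by, on the assumption that both keep the last object seen for a given key. The `Entry` struct and the `index_by_id` helper are invented for this illustration.

```ruby
# Hypothetical objects; two of them share id 1.
Entry = Struct.new(:id, :label)

# Plain-Ruby stand-in for index_by(&:id): later duplicates replace earlier ones.
def index_by_id(objects)
  objects.to_h { |obj| [obj.id, obj] }
end

a1 = [1, 2]
a2 = [Entry.new(2, 'x'), Entry.new(1, 'first'), Entry.new(1, 'second')]

result = index_by_id(a2).values_at(*a1)
p result.map(&:label)  # => ["second", "x"]
```

Three objects go in but only two come out: the entry labelled 'first' is silently dropped because it shares an id with 'second'.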

Community · Accepted Answer · 2017-05-23 10:30:49Z

7

Inspired by Eric Woodruff's Answer, I came up with the following vanilla Ruby solution:

a2.group_by(&:id).values_at(*a1).flatten(1)

Method documentation: Enumerable#group_by, Hash#values_at, Array#flatten

edited May 23, 2017 at 10:30

CommunityBot


answered Nov 7, 2014 at 21:06

Ajedi32


1 Comment

Cary Swoveland Over a year ago

I like this solution best. It's efficient (and I suspect more efficient than @pguardiario's solution) and, importantly, permits two elements of a2 to have the same value for "id". The question does not state that the ids are unique, yet some answers, including the one selected, depend on the ids being unique.
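A short sketch of the duplicate-friendly behaviour described here, assuming the objects respond to `id`. The `Row` struct and the method name `order_with_duplicates` are hypothetical, and the extra `compact` is my own addition to drop ids from a1 that have no matching object in a2.

```ruby
# Hypothetical objects; two of them share id 1.
Row = Struct.new(:id, :label)

# Group objects by id, pull the groups out in the order given by ids,
# drop ids with no match, and flatten one level back into a single array.
def order_with_duplicates(objects, ids)
  objects.group_by(&:id).values_at(*ids).compact.flatten(1)
end

a1 = [3, 1, 2]
a2 = [Row.new(1, 'a'), Row.new(3, 'b'), Row.new(1, 'c')]

p order_with_duplicates(a2, a1).map(&:label)  # => ["b", "a", "c"]
```

Objects sharing an id keep their relative order from a2, since group_by appends to each group in encounter order.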
