Ruby delete_if an array subset

Question

I have an array, and I want to perform a delete_if on a subset (any n items) of that array, modifying the array in place.

With the full array I can do

array.delete_if do |item|
  should_be_deleted?(item)
end

If I want to restrict this to the first n items, the following won't work:

array.take(n).delete_if do |item|
  should_be_deleted?(item)
end

because take creates a new array, and delete_if is then performed on that new array.

Is there an alternative, like a take_and_delete_if, that would look only at the first n items and delete those for which the block returns true?

EDIT :

I want to process items from arrays a, b, and c in chunks of 3 (deleting each item from its array once the operation has been performed).

by_batch_of(3, until: (proc { a.empty? })) do 
  # This sets an instance variable @by = 3, and will iterate as long as `a` has any item
  process_from_a # Will move @by items in a to either array b or c or fail
  process_from_b # Will move @by items in b to c or fail
  process_from_c # Should move items or fail and put back in a
end

Sample processing method

def process_from_a(by: @by)
  a.take_and_delete_if(by: by) do |item| # take_and_delete_if is the method I need
    b << item if reason1
    c << item if reason2
    reason1 || reason2 # Delete the item if it was moved away
  end
end

Performance is what I am looking for

Example

a = [1,2,3,4,5,6,7,8,9]
b = []
c = []

1st batch of 3

process_from_a(by: 3)

a = [3,4,5,6,7,8,9] # 3 failed so delete_if returned false, it remains in the array (order doesn't matter)
b = [1] # 1 moved to b
c = [2] # 2 moved to c

process_from_b

a = [3,4,5,6,7,8,9]
b = []
c = [1,2] # 1 moved to c

process_from_c

a = [3,4,5,6,7,8,9,1] # 1 was rejected and put back in a
b = []
c = [] # 1,2 processed from c

The next iteration would for example process [3,4,5] from a, etc.

Performance

Suppose my array is very big (10k or 100k items) and I want to process items in batches of 10. I don't want an expensive solution that runs delete_if over the whole array just to filter the first 10 items with index < 10.

An example with expected output would be helpful here.

– Sagar Pandya, Commented Mar 3, 2017 at 10:45
@sagarpandya82 Yes just added that, thanks

– Cyril Duchon-Doris, Commented Mar 3, 2017 at 13:25
Your example looks like a producer/consumer problem.

– Stefan, Commented Mar 3, 2017 at 17:07

score 1 · Accepted Answer · 2017-03-09 17:32:00Z

1

It should be possible to do an in-place replacement of a subset with its filtered elements:

a = (0..10000).to_a;
a[0, 100] = a[0, 100].delete_if(&:odd?)
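
The slice assignment above could be wrapped in a small helper along the lines of what the question asks for. This is only a sketch: take_and_delete_if here is a hypothetical free-standing method, not part of Ruby's Array API. The block runs at most n times, although Ruby presumably still has to move the tail of the array when the slice shrinks.

```ruby
# Hypothetical helper (not part of Ruby's Array API): applies the
# delete_if logic to the first n items only and writes the survivors
# back in place.
def take_and_delete_if(array, n)
  head = array[0, n]                        # copy of the first n items (fewer if the array is shorter)
  kept = head.reject { |item| yield item }  # the block is evaluated only for these items
  array[0, head.size] = kept                # splice the survivors over the original slice
  array
end

a = (1..9).to_a
take_and_delete_if(a, 3, &:odd?)
p a # prints [2, 4, 5, 6, 7, 8, 9]
```

Only 1, 2, and 3 are tested; 1 and 3 are removed, and the rest of the array is left untouched.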

A benchmark:

require 'benchmark/ips'

Benchmark.ips do |x|
  x.report("with_index")  { (0..10000).to_a.delete_if.with_index { |k, i| k.odd? && i < 100 } }
  x.report("slice") { a = (0..10000).to_a; a[0, 100] = a[0, 100].delete_if(&:odd?) }

  x.compare!
end

Gives these results on MRI Ruby 2.4.0p0:

Warming up --------------------------------------
          with_index    58.000  i/100ms
               slice   273.000  i/100ms
Calculating -------------------------------------
          with_index    602.354  (± 6.6%) i/s -      3.016k in   5.033200s
               slice      2.775k (±10.0%) i/s -     13.923k in   5.075605s

Comparison:
               slice:     2774.9 i/s
          with_index:      602.4 i/s - 4.61x  slower

edited Mar 9, 2017 at 17:32

answered Mar 3, 2017 at 23:53

user1895144

2 Comments

Cyril Duchon-Doris Over a year ago

Hey, do you have any idea of the performance gain vs. other methods? That sounds like exactly what I wanted.

user1895144 Over a year ago

I updated the answer to include a benchmark against the previously suggested version.

Sergio Tulentsev · Accepted Answer · 2017-03-03 10:45:32Z

1

Maybe you need something like this?

[1,2,3,4,5,6,7,8].delete_if.with_index{|e,i| i<3} # => [4, 5, 6, 7, 8]

Items with indexes in the range 0..2 were deleted.

edited Mar 3, 2017 at 10:45

Sergio Tulentsev

answered Mar 3, 2017 at 10:44

Dmitry Cat

3 Comments

Stefan Over a year ago

To complete your answer, you should incorporate the should_be_deleted?(item) part from the OP's question as well.

Cyril Duchon-Doris Over a year ago

Does the with_index method iterate over the whole array to select a subset? See my comment on the previous answer. I'm dealing with big arrays and I want to avoid looping through all items.

Dmitry Cat Over a year ago

yep, btw you can use with_index in map, select, each and other enumerators
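
To make that "yep" concrete, here is a small sketch (the counter variables are illustrative only) that counts how often the block runs with delete_if.with_index, compared with filtering just a 10-item slice:

```ruby
# Count block invocations for delete_if.with_index on a 10k array.
calls = 0
a = (0...10_000).to_a
a.delete_if.with_index do |item, index|
  calls += 1
  index < 10 && item.odd?
end
p calls  # prints 10000: the block runs for every element
p a.size # prints 9995

# Same filter restricted to a slice of the first 10 items.
slice_calls = 0
b = (0...10_000).to_a
b[0, 10] = b[0, 10].reject { |item| slice_calls += 1; item.odd? }
p slice_calls # prints 10
p b.size      # prints 9995
```

Both versions end up with the same array, but the index-based one evaluates the condition on all 10,000 items.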

Oleksandr Holubenko · Accepted Answer · 2017-03-03 10:47:57Z

1

You can use method #shift to remove first n elements, for example:

> a = [1, 2, 3, 4, 5]
 => [1, 2, 3, 4, 5] 
 > a.shift(3)
 => [1, 2, 3] 
 > a
 => [4, 5]

answered Mar 3, 2017 at 10:47

Oleksandr Holubenko

2 Comments

Stefan Over a year ago

I think the OP wants to remove the first n items only if should_be_deleted?(item) for the corresponding item returns true.

Cyril Duchon-Doris Over a year ago

@Stefan Exactly. What can be done (and what I have implemented in the meantime) is to shift the first n items off, call delete_if on the shifted-off array, and put the remainder back in a.
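
A minimal sketch of the workaround described in this comment, assuming that order does not matter (as the question states). shift_and_delete_if is a hypothetical name, and the block conditions below simply mimic the first batch of the question's example:

```ruby
# Hypothetical helper: shift the first n items off the array, let the
# block decide which ones are deleted, and append the rest back.
# The block is evaluated only for the shifted items; order is not preserved.
def shift_and_delete_if(array, n)
  batch = array.shift(n)                       # removes and returns the first n items
  remnant = batch.reject { |item| yield item } # items the block did not delete
  array.concat(remnant)                        # put the remnant back at the end
end

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = []
c = []
shift_and_delete_if(a, 3) do |item|
  b << item if item == 1
  c << item if item == 2
  item == 1 || item == 2 # delete the item if it was moved away
end
p a # prints [4, 5, 6, 7, 8, 9, 3]
p b # prints [1]
p c # prints [2]
```

Because the remnant is appended at the end, the next batch starts with items that have not been tried yet.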

Community · Accepted Answer · 2020-06-20 09:12:55Z

1

The logic might look as follows:

array.delete_if do |item|
  next if should_be_skipped?(item)
  should_be_deleted?(item)
end

Example:

a = [1,2,3,4,5,6]
a.delete_if do |item|
  next if item == 2 # would skip 2 because we want so
  item % 2 == 0     # would remove all even numbers (except for 2)
end
#=> [1, 2, 3, 5]

Just to clarify: this answer is rather general; it only shows the OP the idea of how such cases might be handled.

edit

For the given case, to skip the first 4 elements you'd go with:

a = [1,2,3,4,5,6,7,8]
a.delete_if.with_index do |item, index|
  index > 3 && item.even?
end
#=> [1, 2, 3, 4, 5, 7]

edited Jun 20, 2020 at 9:12

CommunityBot

answered Mar 3, 2017 at 10:43

Andrey Deineko

5 Comments

Stefan Over a year ago

"first n items" requires some sort of index.

Andrey Deineko Over a year ago

@Stefan I think the general solution is better in terms of showing how it could work. Whether it would be based on index or on anything else depends on case

Stefan Over a year ago

Sure, but how could should_be_skipped? determine the item's position?

Andrey Deineko Over a year ago

@Stefan ok, I'll maybe edit the answer with something indexing

Cyril Duchon-Doris Over a year ago

Hi, sorry, actually I don't care whether it's the first n or any n; I just edited the question. Also, the problem with your solution is that it iterates over the full array. Imagine my array has 10k items and I just want to process them in groups of 10: your condition would be evaluated on 10k objects instead of just 10.

Cary Swoveland · Accepted Answer · 2017-03-08 07:17:29Z

0

def skip_then_test(arr, nbr_to_skip)
  arr.delete_if { |item| (nbr_to_skip -= 1) < 0 && item.even? }
end

skip_then_test [2,3,4,5,6,7,8], 0
  #=> [3, 5, 7] 
skip_then_test [2,3,4,5,6,7,8], 1
  #=> [2, 3, 5, 7] 
skip_then_test [2,3,4,5,6,7,8], 2
  #=> [2, 3, 5, 7] 
skip_then_test [2,3,4,5,6,7,8], 3
  #=> [2, 3, 4, 5, 7]

arr = [2,3,4,5,6,7,8]  
skip_then_test arr, 4
  #=> [2, 3, 4, 5, 7]
arr 
  #=> [2, 3, 4, 5, 7]

Another way follows.

def skip_then_test(arr, nbr_to_skip)
  arr.replace(arr[0, nbr_to_skip] + arr[nbr_to_skip..-1].delete_if(&:even?))
end

arr = [2,3,4,5,6,7,8]
skip_then_test arr, 3
  #=> [2, 3, 4, 5, 7] 
arr
  #=> [2, 3, 4, 5, 7]

edited Mar 8, 2017 at 7:17

answered Mar 8, 2017 at 7:06

Cary Swoveland
