
I have an array, and I want to perform a delete_if on a subset of it (any n items), modifying the array in place.

With the full array I can do

array.delete_if do |item|
  should_be_deleted?(item)
end

If I want to restrict this to the first n items, the following won't work:

array.take(n).delete_if do |item|
  should_be_deleted?(item)
end

This is because take creates a new array, so delete_if runs on that new array instead of the original.

Is there an alternative, something like a take_and_delete_if, that would consider only the first n items and delete those for which the block returns true?

EDIT:

I want to process items from arrays a, b and c in chunks of 3 (deleting them from the array once the operation has been performed).

by_batch_of(3, until: (proc { a.empty? })) do 
  # This sets an instance variable @by = 3, and will iterate as long as `a` has any item
  process_from_a # Will move @by items in a to either array b or c or fail
  process_from_b # Will move @by items in b to c or fail
  process_from_c # Should move items or fail and put back in a
end

Sample processing method

def process_from_a(by: @by)
  # take_and_delete_if is the method I need (it does not exist yet)
  a.take_and_delete_if(by: by) do |item|
    b << item if reason1
    c << item if reason2
    reason1 || reason2 # Delete the item if it was moved away
  end
end

Performance is what I am looking for

Example

a = [1,2,3,4,5,6,7,8,9]
b = []
c = []

1st batch of 3

  • process_from_a(by: 3)

    a = [3,4,5,6,7,8,9] # 3 failed, so the block returned false and it stays in the array (order doesn't matter)
    b = [1] # 1 moved to b
    c = [2] # 2 moved to c
    
  • process_from_b

    a = [3,4,5,6,7,8,9]
    b = []
    c = [1,2] # 1 moved to c
    
  • process_from_c

    a = [3,4,5,6,7,8,9,1] # 1 was rejected and put back in a
    b = []
    c = [] # 1,2 processed from c
    

The next iteration would for example process [3,4,5] from a, etc.

Performance

Suppose my array is very big (10k or 100k items) and I want to process items in batches of 10. I don't want an expensive solution that runs delete_if over the whole array with an index < 10 check just to filter the first 10 items.

  • An example with expected output would be helpful here. Commented Mar 3, 2017 at 10:45
  • @sagarpandya82 Yes just added that, thanks Commented Mar 3, 2017 at 13:25
  • Your example looks like a producer/consumer problem. Commented Mar 3, 2017 at 17:07

5 Answers

1

It should be possible to replace a slice of the array in place with the filtered elements of that slice:

a = (0..10000).to_a;
a[0, 100] = a[0, 100].delete_if(&:odd?)
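
The slice assignment above can be wrapped in a small helper so that it reads like the take_and_delete_if the question asks for. This is only a sketch: the method name and its signature are hypothetical (Ruby's Array has no such method), and item.odd? stands in for the real "was the item moved?" check. The block runs on at most n items, although the splice itself may still have to move the tail of the array when the slice shrinks.

```ruby
# Hypothetical helper (not part of Ruby's Array API): runs delete_if on the
# first n items only and writes the survivors back into the same array.
def take_and_delete_if(array, n, &block)
  head = array[0, n]      # copy of the first n items (fewer if the array is shorter)
  head.delete_if(&block)  # the block is called at most n times
  array[0, n] = head      # splice the survivors back in place
  array
end

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = []
take_and_delete_if(a, 3) do |item|
  moved = item.odd?       # stand-in for the real check
  b << item if moved
  moved                   # true means "delete from a"
end
# a is now [2, 4, 5, 6, 7, 8, 9] and b is [1, 3]
```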

A benchmark:

require 'benchmark/ips'

Benchmark.ips do |x|
  x.report("with_index")  { (0..10000).to_a.delete_if.with_index { |k, i| k.odd? && i < 100 } }
  x.report("slice") { a = (0..10000).to_a; a[0, 100] = a[0, 100].delete_if(&:odd?) }

  x.compare!
end

Gives these results on MRI Ruby 2.4.0p0:

Warming up --------------------------------------
          with_index    58.000  i/100ms
               slice   273.000  i/100ms
Calculating -------------------------------------
          with_index    602.354  (± 6.6%) i/s -      3.016k in   5.033200s
               slice      2.775k (±10.0%) i/s -     13.923k in   5.075605s

Comparison:
               slice:     2774.9 i/s
          with_index:      602.4 i/s - 4.61x  slower

2 Comments

Hey, do you have any idea of the performance gain VS other methods ? That sounds like exactly what I wanted.
I updated the answer to include a benchmark against the previously suggested version.
1

Maybe you need something like this?

[1,2,3,4,5,6,7,8].delete_if.with_index{|e,i| i<3} # => [4, 5, 6, 7, 8] 

The items with indexes in the range 0..2 were deleted.

3 Comments

To complete your answer, you should incorporate the should_be_deleted?(item) part from the OP's question as well.
Does the with_index method iterate on the whole array to select a subset ? See my comment on the previous answer. I'm dealing with big arrays and I want to avoid looping through all items.
yep, btw you can use with_index in map, select, each and other enumerators
1

You can use the #shift method to remove the first n elements, for example:

> a = [1, 2, 3, 4, 5]
 => [1, 2, 3, 4, 5] 
 > a.shift(3)
 => [1, 2, 3] 
 > a
 => [4, 5]

2 Comments

I think the OP wants to remove the first n items only if should_be_deleted?(item) for the corresponding item returns true.
@Stefan Exactly. What can be done (and what I have implemented in the meantime) is to shift n items off the array, run delete_if on them, and put the remainder back in a.
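
The shift-based workaround described in the last comment could look roughly like the following. This is a sketch under assumptions: the helper name is hypothetical, and survivors are appended to the back of the array, which relies on the question's statement that order doesn't matter (use array.unshift(*batch) instead to keep them at the front).

```ruby
# Hypothetical helper: shift a batch off the front, filter it, and put the
# survivors back into the same array.
def shift_and_delete_if(array, n, &block)
  batch = array.shift(n)   # removes and returns up to n items from the front
  batch.delete_if(&block)  # the block runs only on the batch
  array.concat(batch)      # survivors go to the back of the original array
end

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
shift_and_delete_if(a, 3) { |item| item != 3 }  # 1 and 2 are deleted, 3 "fails"
# a is now [4, 5, 6, 7, 8, 9, 3]
```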
1

The logic might look as follows:

array.delete_if do |item|
  next if should_be_skipped?(item)
  should_be_deleted?(item)
end

Example:

a = [1,2,3,4,5,6]
a.delete_if do |item|
  next if item == 2 # would skip 2 because we want so
  item % 2 == 0     # would remove all even numbers (except for 2)
end
#=> [1, 2, 3, 5]

Just to clarify: this answer is rather general, and is only meant to show the OP how such cases might be handled.

Edit

For the given case, to skip the first 4 elements you'd go with:

a = [1,2,3,4,5,6,7,8]
a.delete_if.with_index do |item, index|
  index > 3 && item.even?
end
#=> [1, 2, 3, 4, 5, 7]

5 Comments

"first n items" requires some sort of index.
@Stefan I think the general solution is better in terms of showing how it could work. Whether it would be based on index or on anything else depends on case
Sure, but how could should_be_skipped? determine the item's position?
@Stefan ok, I'll maybe edit the answer with something indexing
Hi, sorry, actually I don't care whether it's the first n or any n; I just edited the question. Also, the problem with your solution is that it iterates over the full array. Imagine my array has 10k items and I just want to process them in groups of 10: your condition would be evaluated on 10k objects instead of just 10.
0
def skip_then_test(arr, nbr_to_skip)
  arr.delete_if { |item| (nbr_to_skip -= 1) < 0 && item.even? }
end

skip_then_test [2,3,4,5,6,7,8], 0
  #=> [3, 5, 7] 
skip_then_test [2,3,4,5,6,7,8], 1
  #=> [2, 3, 5, 7] 
skip_then_test [2,3,4,5,6,7,8], 2
  #=> [2, 3, 5, 7] 
skip_then_test [2,3,4,5,6,7,8], 3
  #=> [2, 3, 4, 5, 7]

arr = [2,3,4,5,6,7,8]  
skip_then_test arr, 4
  #=> [2, 3, 4, 5, 7]
arr 
  #=> [2, 3, 4, 5, 7]

Another way follows.

def skip_then_test(arr, nbr_to_skip)
  arr.replace(arr[0, nbr_to_skip] + arr[nbr_to_skip..-1].delete_if(&:even?))
end

arr = [2,3,4,5,6,7,8]
skip_then_test arr, 3
  #=> [2, 3, 4, 5, 7] 
arr
  #=> [2, 3, 4, 5, 7] 
