Given an array of strings,

array1 = ["abcdwillbegoneabcccc","cdefwilbegokkkabcdc"]

and another array of strings containing patterns, e.g. ["abcd","beg[o|p]n","bcc","cdef","h*gxwy"]

the task is to remove the substrings that match any of the patterns. For example, the output for this case should be:

["willbegonea","wilbegokkk"]

because we have removed the substrings (prematch or postmatch as is appropriate depending on the position of occurrence) that matched one of the patterns. Assume that the one or two matches will always occur at the beginning or towards the end of each string in array1.

Any ideas for an elegant solution to the above in Ruby?

1
  • 2
    Can you explain how you arrived at those results? IMHO, the result should be ['willeacc', 'wilbegokkkc']. Also, your specification is incomplete: what should the result of ['beabcdefgonx'] be, given your patterns? ['bdefgonx'], ['beabgonx'], ['begonx'] or ['x']? All of these are valid under some interpretation of your rules. As a general hint: whenever someone asks such a question on SO, they should supply an acceptance test suite with it. That makes it much easier to answer and shows that the poster actually put some genuine effort into the question. Commented Feb 5, 2010 at 12:44

3 Answers

7

How about building a single Regex?

array1 = ["abcdwillbegoneabcccc","cdefwilbegokkkabcdc"]

to_remove = ["abcd","beg[o|p]n","bcc","cdef","h*gxwy"]

reg = Regexp.new(to_remove.map{ |s| "(#{s})" }.join('|'))
#=> /(abcd)|(beg[o|p]n)|(bcc)|(cdef)|(h*gxwy)/

array1.map{ |s| s.gsub(reg, '') }
#=>  ["willeacc", "wilbegokkkc"]

Note that my result is different from yours:

["willbegonea","wilbegokkk"]

but I think mine is correct: it removes "abcd", "begon" and "bcc" from the first string, which seems to be what's wanted.
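
As a possible variation on the combined pattern above, here is a minimal sketch that uses Regexp.union instead of joining strings by hand. It assumes each pattern string is first converted with Regexp.new, because Regexp.union escapes plain strings and would treat them literally. The names union_reg, cleaned and pipe_index are illustrative only. The last line also shows that [o|p] is a character class matching "o", "|" or "p", so [op] may be what was intended; that reading of the intent is an assumption.

```ruby
array1    = ["abcdwillbegoneabcccc", "cdefwilbegokkkabcdc"]
to_remove = ["abcd", "beg[o|p]n", "bcc", "cdef", "h*gxwy"]

# Convert each string to a Regexp first; Regexp.union escapes plain strings.
union_reg = Regexp.union(to_remove.map { |s| Regexp.new(s) })

# Remove every match of any pattern from each string (non-destructively).
cleaned = array1.map { |s| s.gsub(union_reg, '') }

# Inside a character class, | is a literal character, not alternation,
# so "beg|n" matches at index 0.
pipe_index = "beg|n" =~ /beg[o|p]n/
```

With the sample data this gives the same result as the hand-built pattern, ["willeacc", "wilbegokkkc"].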

3 Comments

How can you easily get the coordinates of the resulting string in relation to the initial string using this approach?
@George - I'm not sure I understand the question! Can you expand on it? What would you hope to see?
Is it also possible to get the match offsets, i.e. the positions at which each pattern match starts and ends?
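
One way the offsets asked about in the comments might be collected is to pass a block to gsub and read Regexp.last_match inside it. This is only a sketch: remove_with_offsets is a hypothetical helper, not part of the answer, and the offsets are [start, end) index pairs measured against the original string.

```ruby
array1    = ["abcdwillbegoneabcccc", "cdefwilbegokkkabcdc"]
to_remove = ["abcd", "beg[o|p]n", "bcc", "cdef", "h*gxwy"]
reg = Regexp.new(to_remove.map { |s| "(#{s})" }.join('|'))

# Hypothetical helper: returns the cleaned string together with the
# [start, end) offsets of every match, relative to the original string.
def remove_with_offsets(str, reg)
  offsets = []
  cleaned = str.gsub(reg) do
    # Regexp.last_match is the MatchData for the match being replaced.
    offsets << Regexp.last_match.offset(0)
    ''
  end
  [cleaned, offsets]
end

results = array1.map { |s| remove_with_offsets(s, reg) }
```

For the first sample string the recorded offsets are [[0, 4], [8, 13], [15, 18]], corresponding to "abcd", "begon" and "bcc".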
2

I can see some potential gotchas here, in that if you change the order of the pattern strings, you could get a different result; and also, the second pattern might leave the string in a state that would have matched the first one, only it's too late now.

Assuming those are givens, I would go with Yoann's answer. The only way I can slightly improve it is to make the patterns regexen rather than strings, like this:

# string_to_test is assumed to be one of the (mutable) strings from array1
[/abcd/, /beg[o|p]n/, /bcc/, /cdef/, /h*gxwy/].each do |pattern|
  string_to_test.gsub!(pattern, '')
end

But of course if the patterns are coming from somewhere else, maybe they have to be strings.
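
The two gotchas mentioned at the start of this answer can be seen in a small sketch. The helper remove_in_order and the sample strings "abcdef" and "abbcccd" are made up for illustration; they are not from the question.

```ruby
# Hypothetical helper: applies each pattern in turn, without mutating str.
def remove_in_order(str, patterns)
  patterns.reduce(str) { |acc, pattern| acc.gsub(pattern, '') }
end

# Overlapping patterns: whichever pattern runs first wins.
first  = remove_in_order("abcdef", [/abcd/, /cdef/])
second = remove_in_order("abcdef", [/cdef/, /abcd/])

# Removing "bcc" leaves "abcd" behind, but /abcd/ has already had its turn.
leftover = remove_in_order("abbcccd", [/abcd/, /bcc/])
```

Here first is "ef" while second is "ab", and leftover is "abcd" even though /abcd/ is in the pattern list. Whether that matters depends on what the patterns are supposed to mean, which the question leaves open.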

1

I think something like this should work:

def gimme_the_substring(string_to_test)
  ["abcd","beg[o|p]n","bcc","cdef","h*gxwy"].each do |pattern|
    # gsub! modifies string_to_test in place
    string_to_test.gsub!(/#{pattern}/,'')
  end
  string_to_test
end

array1.map!{|s| gimme_the_substring(s)}
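
Because gsub! and map! modify the strings and the array in place, the code above would fail on frozen strings and overwrites the original data. If that is a concern, a non-destructive variant could look like the sketch below; without_patterns is a hypothetical name, and the patterns are passed in rather than hard-coded.

```ruby
# Hypothetical non-destructive variant: gsub returns a new string each time,
# so the caller's string is never modified.
def without_patterns(string_to_test, patterns)
  patterns.reduce(string_to_test) do |acc, pattern|
    acc.gsub(Regexp.new(pattern), '')
  end
end

patterns = ["abcd", "beg[o|p]n", "bcc", "cdef", "h*gxwy"]

# Frozen inputs demonstrate that nothing is mutated.
array1  = ["abcdwillbegoneabcccc", "cdefwilbegokkkabcdc"].map(&:freeze)
cleaned = array1.map { |s| without_patterns(s, patterns) }
```

After this runs, array1 still holds the original strings and cleaned holds the stripped copies.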
