Check if an element of an array partly exists in a given string

Question

I have a line of text

this is the line

and I want to return true if one of the elements in that array:

['hey', 'format', 'qouting', 'this']

is a part of the string given above.

So for the line above it should return true.

For this line hello my name is martin it should not.

I know include? but I don't know how to use it here if it helps at all.

Michael Kohl · Accepted Answer · 2011-03-30 06:52:04Z

24

>> s = "this is the line"
=> "this is the line"
>> ['hey', 'format', 'qouting', 'this'].any? { |w| s =~ /#{w}/ }
=> true
>> ['hey', 'format', 'qouting', 'that'].any? { |w| s =~ /#{w}/ }
=> false
>> s2 = 'hello my name is martin'
=> "hello my name is martin"
>> ['hey', 'format', 'qouting', 'this'].any? { |w| s2 =~ /#{w}/ }
=> false

answered Mar 30, 2011 at 6:52

9 Comments

Martin Klepsch Over a year ago

Well, that was fast, thanks. Would you mind explaining what happens here s =~ /#{w}/ ?

Michael Kohl Over a year ago

It's a regular expression match. Since regex literals in Ruby support string interpolation, I use that to build one from each string in the array.

Michael Kohl Over a year ago

I should add that this will also return true if a word is part of a longer word in the string, so if you don't want that, you'll have to match with word boundaries: /\b#{w}\b/.

rubyprince Over a year ago

Why don't you use include? instead of a regex? It would be more readable (I don't know about performance), like this: ['hey', 'format', 'qouting', 'this'].any? { |w| s.include? w }

Michael Kohl Over a year ago

@rubyprince: I find regex very readable, especially simple ones like this. And when someone talks about matching something in a string, I'll almost always go down the regex route, because that's what they are there for.
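The word-boundary suggestion from the comments can be sketched roughly as follows. The helper name any_whole_word? is hypothetical (it is not from the answer), and the sketch assumes Ruby 2.4 or later for String#match?. Regexp.escape is added so that words containing regex metacharacters are matched literally; note that \b only behaves as expected when each word starts and ends with a word character.

```ruby
# Hypothetical helper (not part of the original answer): returns true if
# any of the given words occurs in the text as a whole word.
def any_whole_word?(text, words)
  words.any? do |w|
    # Regexp.escape keeps characters such as "." or "+" literal.
    text.match?(/\b#{Regexp.escape(w)}\b/)
  end
end

words = ['hey', 'format', 'qouting', 'this']

any_whole_word?('this is the line', words)        # => true
any_whole_word?('formatting is the line', words)  # => false ("format" only occurs inside "formatting")
```

Without the \b anchors the second call would return true, which is the substring behaviour the original answer has.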

the Tin Man · Accepted Answer · 2011-03-30 09:53:41Z

16

The simplest way I know to test for inclusion of one string inside another is:

text = 'this is the line'
words = ['hey', 'format', 'qouting', 'this']

words.any? { |w| text[w] }  #=> true

No need for regex, or anything complicated.
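For readers unsure why text[w] works as a test: String#[] with a String argument returns the matching substring when it is present and nil otherwise, so the result is truthy or falsy rather than a strict boolean. A small sketch of the difference (the variable names found and missing are only for illustration):

```ruby
text = 'this is the line'

found   = text['this']           # => "this" (a String, which is truthy)
missing = text['that']           # => nil    (falsy)

text.include?('this')            # => true   (a real boolean)
text.index('line')               # => 12     (the starting offset, if that is what you need)

# any? only cares about truthiness, so both forms give the same answer here.
words = ['hey', 'format', 'qouting', 'this']
words.any? { |w| text[w] }            # => true
words.any? { |w| text.include?(w) }   # => true
```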

require 'benchmark'

n = 200_000
Benchmark.bm(3) do |x|
  x.report("1:") { n.times { words.any? { |w| text =~ /#{w}/ } } }
  x.report("2:") { n.times { text.split(" ").find { |item| words.include? item } } }
  x.report("3:") { n.times { text.split(' ') & words } }
  x.report("4:") { n.times { words.any? { |w| text[w] } } }
  x.report("5:") { n.times { words.any? { |w| text.include?(w) } } }
end

>>          user     system      total        real
>> 1:   4.170000   0.160000   4.330000 (  4.495925)
>> 2:   0.500000   0.010000   0.510000 (  0.567667)
>> 3:   0.780000   0.030000   0.810000 (  0.869931)
>> 4:   0.480000   0.020000   0.500000 (  0.534697)
>> 5:   0.390000   0.010000   0.400000 (  0.476251)

edited Mar 30, 2011 at 9:53

answered Mar 30, 2011 at 9:21

4 Comments

the Tin Man Over a year ago

I don't find include? to be more readable than text[w]. It is a bit faster though.

rubyprince Over a year ago

text.include? w suggests that it returns a boolean value whether w is included in text. text[w] at first glance may be interpreted as giving the starting value of w in text.

Gus Shortz Over a year ago

This will also return true if a word in words is part of the string text. Unlike the regex solution, there is no way to make it only match whole words - that I can see anyway :)

the Tin Man Over a year ago

A regex pattern is the only way to test for complete word matches; however, that solution can make the test run extremely slowly unless the pattern is written carefully. If patterns are anchored to the start or end of the string, the engine can do an extremely fast search. If anchoring isn't possible, the engine slows down significantly and a simple substring match will beat it. And the more complex the pattern is, the slower it will run; trying to use look-ahead/behind makes it worse. The smart programmer will test with benchmarks to figure out the fastest route.
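One way to follow that advice, as a sketch only: build a single whole-word pattern once with Regexp.union instead of compiling a regex per word on every call, then benchmark it against the plain substring test on your own data. The names pattern, union: and include: are illustrative, the iteration count is arbitrary, and String#match? assumes Ruby 2.4 or later. No timings are shown because they depend on the Ruby version and the input; also keep in mind that the two tests answer slightly different questions (whole words versus substrings).

```ruby
require 'benchmark'

words = ['hey', 'format', 'qouting', 'this']

# Regexp.union escapes plain strings and joins them with "|".
# The pattern is built once, outside the loop being measured.
pattern = /\b(?:#{Regexp.union(words).source})\b/

text = 'this is the line'

pattern.match?(text)                        # => true
pattern.match?('hello my name is martin')   # => false

n = 50_000
Benchmark.bm(8) do |x|
  x.report('union:')   { n.times { pattern.match?(text) } }
  x.report('include:') { n.times { words.any? { |w| text.include?(w) } } }
end
```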

jaredonline · Accepted Answer · 2011-03-30 06:58:44Z

6

You could split the string into an array and check for the intersection between your array and the newly split array, like so.

This is handy because it gives you more than a true/false: it gives you the matched strings.

> "this is the line".split(' ') & ["hey", "format", "quoting", "this"]
=> ["this"]

If you needed a true / false you could easily do:

> s = "this is the line"
=> "this is the line" 
> intersection = s.split(' ') & ["hey", "format", "quoting", "this"]
=> ["this"] 
> intersection.any?
=> true

answered Mar 30, 2011 at 6:58

1 Comment

James Chevalier Over a year ago

I love this approach for single word checking (e.g. 'hey'). Things fall apart if you need phrase checking, though (e.g. 'hey you').
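To illustrate the phrase limitation mentioned in the comment above, here is a small sketch (the sample text and the phrases array are made up for the example). Splitting on spaces yields single words, so a two-word phrase can never appear in the intersection, while a substring test still finds it.

```ruby
text    = 'well hey you over there'
phrases = ['hey you', 'format']

# The split array is ["well", "hey", "you", "over", "there"];
# "hey you" is not one of its elements.
by_intersection = text.split(' ') & phrases                      # => []

# A substring test has no such restriction.
by_include = phrases.any? { |phrase| text.include?(phrase) }     # => true
```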

RameshVel · Accepted Answer · 2011-03-30 06:54:28Z

1

> arr = ['hey', 'format', 'qouting', 'this']
=> ["hey", "format", "qouting", "this"]
> str = "this is the line"
=> "this is the line"
> str.split(" ").find {|item| arr.include? item }
=> "this"
> str.split(" ").any? {|item| arr.include? item }
=> true

answered Mar 30, 2011 at 6:54

2 Comments

rubyprince Over a year ago

you can go the other way also arr.any? { |item| str.include? item }

RameshVel Over a year ago

yep, thats neat.. :). i just started ruby... still doing ruby in c# way :(
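Worth noting that the two directions are not quite equivalent. The split version compares whole space-separated tokens, while the include? version matches substrings, which is what the question's "partly exists" asks for. A quick sketch with a made-up input string where they disagree:

```ruby
arr = ['hey', 'format', 'qouting', 'this']
str = 'formatting the line'

# Token comparison: "formatting" is not equal to "format".
by_tokens = str.split(' ').any? { |item| arr.include?(item) }   # => false

# Substring comparison: "format" occurs inside "formatting".
by_substring = arr.any? { |item| str.include?(item) }           # => true
```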
