0

I'm trying to read a text file and store its individual words in an array, but I can't find a way to split it into words.

text_file = []

File.open(file, "r") do |f|
  f.each_line do |line|
    text_file << line.split.map(&:to_s)
  end
end

The code above creates an array of arrays: each inner array holds the words of one line.

Is there a way to make text_file a single flat array of all the words?

1
  • If the file is not huge, you could "gulp" it into a string, then apply String#scan: File.read(FName).scan(/\w+/). To try it, first create a file: File.write("temp", "Now is\nthe time\nto rejoice.") #=> 27. Then test: File.read("temp").scan(/\w+/) #=> ["Now", "is", "the", "time", "to", "rejoice"] . Commented Mar 22, 2016 at 3:11

4 Answers

2

Yes. Either do:

text_file.push(*line.split.map(&:to_s))

or:

text_file.concat(line.split.map(&:to_s))
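As an alternative to appending inside the loop, the flat array can also be built in one expression. This is only a minimal sketch, assuming the file fits comfortably in memory; the helper name words_in and the sample file are made up for illustration.

```ruby
require "tempfile"

# Hypothetical helper: returns every whitespace-separated word in the
# file at `path` as one flat array.
def words_in(path)
  # File.foreach without a block returns an enumerator over the lines;
  # flat_map splits each line and concatenates the resulting arrays.
  File.foreach(path).flat_map(&:split)
end

# Sample file, for illustration only.
sample = Tempfile.new("words")
sample.write("Now is\nthe time\nto rejoice.")
sample.close

p words_in(sample.path)
#=> ["Now", "is", "the", "time", "to", "rejoice."]
```

Note that a plain split keeps punctuation attached to the word ("rejoice." above).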

3 Comments

It seems odd converting strings to strings.
@tadman Right. map(&:to_s) is redundant. But that is what the OP gave, and even though it does not make sense, it still works, so I left it.
@tadman you are right, it's redundant so I'm removing it now. Thanks
1

If you want all of the words, unique and sorted:

text_file = [ ]

File.open(file, "r") do |f|
  f.each_line do |line|
    text_file += line.split
  end
end

text_file.uniq!
text_file.sort!

This is not the most efficient implementation (+= builds a new array for every line), but it works. For real-world text you will probably want String#scan to pull out words more precisely, so you don't get tripped up by things like punctuation or hyphens.
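A rough sketch of the String#scan idea mentioned above. The pattern and the method name unique_sorted_words are assumptions chosen for illustration, not the only reasonable definition of a "word": here a word is a run of letters, optionally joined by single apostrophes or hyphens.

```ruby
# Hypothetical helper: extracts words with String#scan, then removes
# duplicates and sorts them.
def unique_sorted_words(text)
  # Runs of letters, optionally joined by one apostrophe or hyphen,
  # so "it's" and "Well-known" stay whole while ":" and "," are dropped.
  pattern = /[[:alpha:]]+(?:['-][[:alpha:]]+)*/
  text.scan(pattern).uniq.sort
end

text = "Well-known facts: it's time, time to rejoice."
p unique_sorted_words(text)
#=> ["Well-known", "facts", "it's", "rejoice", "time", "to"]
```

The default sort compares strings byte by byte, so capitalized words come before lowercase ones. To apply this to a file, pass File.read(file) as the text.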


0

Modifying your code, this will do the trick:

text_file = []

File.open('document.rb', "r") do |f|
  f.each_line do |line|
    arr = line.split(' ')
    arr.each do |word|
      text_file << word
    end
  end
end


0

The following reads the contents of file, splits it into words on spaces and newlines, freezes the resulting array, and assigns it to a constant called WORDS.

WORDS = File.read(file).split(/[ \n]/).freeze

To treat tabs as delimiters as well as spaces and newlines, use the following:

WORDS = File.read(file).split(/[ \n\t]/).freeze
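One caveat with a single-character class like the ones above: consecutive delimiters (a double space, a blank line) produce empty strings in the result. The sketch below compares it with String#split called without arguments, which splits on runs of whitespace; the sample text is made up for illustration.

```ruby
# Sample text with a double space, a blank line and a tab.
text = "Now  is\n\nthe\ttime"

# Splitting on a single delimiter character leaves empty strings
# wherever two delimiters are adjacent.
by_class = text.split(/[ \n\t]/)

# split with no argument treats any run of whitespace as one delimiter
# and ignores leading whitespace.
by_default = text.split

p by_class   #=> ["Now", "", "is", "", "the", "time"]
p by_default #=> ["Now", "is", "the", "time"]
```

If that behaviour suits your input, File.read(file).split.freeze gives the same kind of frozen word list without the empty entries.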
