I have a number of text files and need to extract the first instance of some single lines, some consecutive lines, and some text between lines:

Document 1

Title of the document
(TOD)

Release 3
Version 2

Authors

Thomas E. Thomas, John L. John,
Fred A. Fred, Sandra K. Sandra

Company A Address

More Authors

Page 3

From this example I need "Title of the document (TOD)", 3, 2, and all the text between Authors and Page 3, not inclusive. I'm slowly learning, so I have some code snippets, but they don't go far enough. I can get a match, but I need only the first instance, and in some cases the matching line plus the line after it:

File.open("sample.txt").each do |line|
  if line[/Document/]
    puts line
  end
end

I've tried to get intervening text but it's not quite right:

File.open("sample.txt").each do |line|
while gets
  print if [/Authors/../Page/]
end

If you feel this is too much help to ask for I'd appreciate hints/pointers.

  • Couldn't you just output lines until you hit the "Page 3" line? Commented Nov 3, 2011 at 1:41
  • You need to use a state-based approach. Keep track of what 'state' you're in, and for each line use a case statement to decide what to look for and what to do with what you find, and to change the state variable to a new value (probably a symbol) when it sees something else. Commented Nov 3, 2011 at 4:54
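
The state-based approach described in the comment above could look roughly like the following. This is only a sketch: the method name `extract_fields`, the state symbols, and the assumption that every file follows the same Document / Release / Version / Authors / Page order as the example are all hypothetical, not something the question confirms.

```ruby
# Sample input copied from the question so the sketch runs on its own.
SAMPLE_DOC = <<~TEXT
  Document 1

  Title of the document
  (TOD)

  Release 3
  Version 2

  Authors

  Thomas E. Thomas, John L. John,
  Fred A. Fred, Sandra K. Sandra

  Company A Address

  More Authors

  Page 3
TEXT

# Hypothetical helper: walks the lines once and keeps a state symbol that
# decides what to look for next and what to do with the current line.
def extract_fields(lines)
  state  = :start
  result = { title: [], release: nil, version: nil, authors: [] }

  lines.each do |raw|
    line = raw.strip
    case state
    when :start
      state = :title if line.start_with?("Document")
    when :title
      if (m = line.match(/\ARelease\s+(\d+)/))
        result[:release] = m[1]
        state = :version
      elsif !line.empty?
        result[:title] << line            # title may span several lines
      end
    when :version
      if (m = line.match(/\AVersion\s+(\d+)/))
        result[:version] = m[1]
        state = :before_authors
      end
    when :before_authors
      state = :authors if line == "Authors"
    when :authors
      break if line.match?(/\APage\s+\d+/) # first "Page N" ends the block
      result[:authors] << line unless line.empty?
    end
  end

  result[:title] = result[:title].join(" ")
  result
end

fields = extract_fields(SAMPLE_DOC.each_line)
puts fields[:title]
puts fields[:release]
puts fields[:version]
puts fields[:authors]
```

For a real file, `extract_fields(File.foreach("sample.txt"))` should work the same way, since `File.foreach` without a block returns an enumerator over the lines. Because each state is left as soon as its field is found, only the first instance of each item is captured.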

1 Answer

Rather than reading the file line by line, I think it would be easier to read in the whole thing and then search through it with a regex. Something like:

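# read the whole file into one string
text = File.read("sample.txt")

# everything between the first "Document" and the first "Authors"
# (.*? is non-greedy, so it stops at the first match rather than the last)
m1 = text.match(/Document(.*?)Authors/m)

# everything between the first "Authors" and the first "Page"
m2 = text.match(/Authors(.*?)Page/m)

# match returns nil when nothing is found, so guard before using the capture
puts m1[1].strip if m1
puts m2[1].strip if m2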
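
To pull out each of the items the question asks for with the whole-file regex approach, something like the sketch below might do. The method name `extract_with_regex` and the exact patterns are assumptions based only on the sample document in the question; real files may need looser patterns.

```ruby
# Sample input copied from the question so the sketch runs on its own.
SAMPLE_TEXT = <<~TEXT
  Document 1

  Title of the document
  (TOD)

  Release 3
  Version 2

  Authors

  Thomas E. Thomas, John L. John,
  Fred A. Fred, Sandra K. Sandra

  Company A Address

  More Authors

  Page 3
TEXT

# Hypothetical helper: String#[] with a regex and a group index returns
# that capture from the FIRST match, or nil when there is no match.
def extract_with_regex(text)
  # Lines after "Document ..." up to the first line starting with "Release".
  title   = text[/^Document.*?\n(.*?)^Release/m, 1]
  release = text[/^Release\s+(\d+)/, 1]
  version = text[/^Version\s+(\d+)/, 1]
  # ^ anchors to a line start, so "More Authors" is not taken as the header.
  authors = text[/^Authors\s*\n(.*?)^Page\s+\d+/m, 1]

  {
    title:   title && title.split.join(" "), # collapse line breaks to spaces
    release: release,
    version: version,
    authors: authors && authors.strip
  }
end

info = extract_with_regex(SAMPLE_TEXT)
puts info[:title]
puts info[:release]
puts info[:version]
puts info[:authors]
```

With a file on disk this would be called as `extract_with_regex(File.read("sample.txt"))`. The non-greedy `.*?` is what restricts each capture to the first closing marker instead of the last one in the file.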

1 Comment

You'll want to use the m regex modifier for this (it makes . match newline characters). w is not a valid regex option in Ruby.
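
A small illustration of what the `m` modifier changes, using a made-up three-line string (`DEMO_TEXT` is a hypothetical name, not from the question):

```ruby
DEMO_TEXT = "Authors\nThomas E. Thomas\nPage 3\n"

# Without /m, "." does not match "\n", so the pattern cannot span lines.
without_m = DEMO_TEXT.match(/Authors(.*)Page/)

# With /m, "." also matches "\n", so the capture can cross line breaks.
with_m = DEMO_TEXT.match(/Authors(.*)Page/m)

p without_m   # nil
p with_m[1]   # "\nThomas E. Thomas\n"
```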
