I have a text file that looks like this:

STUFF UP HERE

APEXED NUMBER : 123456789

1234567   Bob,Hope E.                   123.12              
1234567   TOM ROGERS JR III             123.18                
1234567   NICE, JOHNATH               4,450.00                 
1234567   PERDOND, DELLA              4,762.00               
1234567   ERICCY, PHIL                4,552.00               


  STUFF IN BETWEEN




APEXED NUMBER :

1234567   RICHARDSON,FELICIA D          632.00     
1234567   EARLEY, RICKY L               140.00     

STUFF ON THE BOTTOM

I want to read the file and find the text "APEXED NUMBER :", then determine whether any digits follow the colon. For example, after the first "APEXED NUMBER :" the number 123456789 appears, and I want to save it. Next, I want to skip a line and read the records that follow, assigning each field (ID, name, amount) to its own variable.

I then want to continue through the file line by line until I find another "APEXED NUMBER :" and check whether digits follow it. If they do not, I want to assign that APEXED NUMBER a value of "unknown" and move on.

Finally, I want to store everything found in an array and print each record with its fields separated by commas.

Here is my current attempt:

def is_numeric?(object)  # used to determine if a number is a number
  true if Float(object) rescue false
end

def is_apexed_line?(object)   # check if text has "APEXED NUMBER :"
  true if object == "APEXED NUMBER :" rescue false
end

def load_file
  raw_records = []
  infile = File.open("test.txt", "r")
  while line = infile.gets
    possible_apexed_line = line[2,15]

    if is_apexed_line?(possible_apexed_line)
      apexed_line = line[2,15]
      possible_apexed_number_present = line[18,9]

      if is_numeric?(possible_apexed_number_present)
        abc_apexed_number = line[18,9]
      else
        abc_apexed_number = "unknown"
      end  # end of if

      record = [apexed_line, abc_apexed_number]
      raw_records << record
    end  # end of if
  end

  puts raw_records.map {|record| record*','}

  infile.close
end

load_file

This produces:

APEXED NUMBER :, 123456789
APEXED NUMBER :, unknown

But this is as far as my learning thus far will take me. The result I am looking for is this:

1234567, BOB, HOPE E., 123.12, APEXED NUMBER :, 123456789
1234567, TOM ROGERS JR III, 123.18 , APEXED NUMBER :, 123456789              
1234567, NICE, JOHNATH,  4450.00  ,APEXED NUMBER :, 123456789               
1234567, PERDOND, DELLA, 4762.00 , APEXED NUMBER :, 123456789              
1234567, ERICCY, PHIL, 4552.00, APEXED NUMBER :, 123456789
1234567,   RICHARDSON,FELICIA D, 632.00 ,  APEXED NUMBER :, unknown  
1234567,   EARLEY, RICKY L, 140.00 , APEXED NUMBER :, unknown

Any suggestions or help to point me in the right direction will be appreciated. I am not wedded to this approach, so if there are other ways to do it, please suggest them. I am learning Ruby, so I would prefer Ruby suggestions.

Thanks

  • @paguardiario's answer helps a lot. However, the STUFF IN BETWEEN has numbers too, and these get picked up by the regex solution. I do not know regex well. Does anyone also have a non-regex way of doing it? Commented May 12, 2012 at 23:31
  • you might need to add beginning/end markers: /^(\d+)\s{2,}(.*?)\s{2,}([\d,.]+)\s*$/ or fine tune the regex depending on the stuff between Commented May 13, 2012 at 0:28
  • @paguardiario - Thank you. I have ordered a regex book and will have to dig into it to see how to make it work better. Commented May 14, 2012 at 22:10
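
For the non-regex route asked about in the comments, here is a minimal sketch that uses only plain String methods. It assumes a record line is "all-digit ID, name, amount" separated by whitespace, and that the amount contains only digits, commas, and periods; the names parse_records and all_chars_in? are made up for illustration. Whether this rejects every STUFF IN BETWEEN line depends on what those lines really look like, so the checks may need tightening.

```ruby
MARKER = 'APEXED NUMBER :'
DIGITS = ('0'..'9').to_a

# True when text is non-empty and every character is in the allowed list.
def all_chars_in?(text, allowed)
  !text.empty? && text.each_char.all? { |c| allowed.include?(c) }
end

# Hypothetical helper: turns an array of lines into arrays of fields.
def parse_records(lines)
  apexed_number = 'unknown'
  records = []
  lines.each do |line|
    text = line.strip
    if text.start_with?(MARKER)
      # Whatever follows the colon is the number, if it is all digits.
      candidate = text[MARKER.length..-1].strip
      apexed_number = all_chars_in?(candidate, DIGITS) ? candidate : 'unknown'
      next
    end
    tokens = text.split              # split on runs of whitespace
    next if tokens.length < 3        # need at least ID, name, amount
    id = tokens.first
    amount = tokens.last
    next unless all_chars_in?(id, DIGITS)
    next unless all_chars_in?(amount, DIGITS + [',', '.'])
    name = tokens[1..-2].join(' ')   # everything between ID and amount
    # Assumption: thousands separators are dropped, as in the desired output.
    records << [id, name, amount.delete(','), MARKER, apexed_number]
  end
  records
end
```

Something like parse_records(File.readlines("test.txt")).each { |r| puts r.join(', ') } would then print one comma-separated record per line. Note that joining the name tokens with a single space collapses any repeated spaces inside a name.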

1 Answer

Here's mine:

File.open(filename).each_line do |line|
    @apexed_number = ('' == $1) ? 'unknown' : $1 if line =~ /APEXED NUMBER :\s*(\d*)/
    puts [$1,$2,$3,@apexed_number].join(', ') if line =~ /(\d+)\s{2,}(.*?)\s{2,}([\d,.]+)/
end

2 Comments

Thank you for your suggestion. I see you are using regex, and it seems much more concise than what I have. Can you explain it a bit so I can get a grasp of what is going on?
Sure. The first regex matches the text "APEXED NUMBER :" followed by optional whitespace and captures any digits that come next (possibly none, which is why an empty capture becomes 'unknown'). The second one captures a sequence of digits, then requires 2 or more whitespace chars, captures any non-greedy sequence of chars, requires 2 or more whitespace chars again, and finally captures a sequence of digits/commas/periods.
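
To make that explanation concrete, here is a sketch of the same two patterns pulled out into named constants and wrapped in a method. It also folds in the beginning/end anchors suggested in the comments on the question, plus a leading \s* to tolerate indented lines, which is an assumption about the real file. The names APEXED_RE, RECORD_RE, and extract_rows are illustrative and not part of the original answer.

```ruby
# "APEXED NUMBER :" + optional whitespace + capture of zero or more digits.
APEXED_RE = /APEXED NUMBER :\s*(\d*)/

# ^\s*       optional leading whitespace (assumption)
# (\d+)      capture 1: the ID
# \s{2,}     at least two whitespace chars
# (.*?)      capture 2: the name, as short as possible
# \s{2,}     at least two whitespace chars
# ([\d,.]+)  capture 3: the amount
# \s*$       nothing but whitespace up to the end of the line
RECORD_RE = /^\s*(\d+)\s{2,}(.*?)\s{2,}([\d,.]+)\s*$/

# Hypothetical helper: returns one comma-separated string per record line.
def extract_rows(lines)
  apexed_number = 'unknown'
  rows = []
  lines.each do |line|
    if (m = APEXED_RE.match(line))
      # An empty capture means no digits followed the colon.
      apexed_number = m[1].empty? ? 'unknown' : m[1]
    elsif (m = RECORD_RE.match(line))
      rows << [m[1], m[2], m[3], 'APEXED NUMBER :', apexed_number].join(', ')
    end
  end
  rows
end
```

Using a local variable and MatchData objects instead of @apexed_number and $1 keeps the state inside the method, and the anchors mean a line only counts as a record when the whole line fits the pattern. Unlike the desired output in the question, this version leaves the thousands separator in the amount.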
