I have multiple files in the following format.

Example file: sample.txt

id = class\234ha, class\poi23, class\opiuj, cap\7y6t5
dept = sub\6985de, ret\oiu87, class\234ha
cko = cyr\hui87

I'm finding a string and removing it from multiple files. For example, find and remove the string class\234ha.

My code works and removes all the intended strings, but when the deleted string was the last item on a line, a trailing comma is left at the end of that line.

Example: sample.txt after removing the string class\234ha

id = class\poi23, class\opiuj, cap\7y6t5
dept = sub\6985de, ret\oiu87,
cko = cyr\hui87

I want to remove only that last comma, the one after ret\oiu87, and the fix should work the same way for multiple files. I'm not sure whether there is a newline character or a space after the comma. How can I make this work? Thanks in advance.

Code:

pool = ''
svn_files = Dir.glob("E:\work'*-access.txt")

value=File.open('E:\nando\list.txt').read
value.each_line do |line|
    line.chomp!
    print "VALUE: #{line}\n"
    svn_files.each do |file_name|
      text = File.read(file_name)
      replace = text.gsub( /#{Regexp.escape(line)}\,\s/, '').gsub( /#{Regexp.escape(line)}/, '' )

      unless text == replace

        text.each_line do |li|
          if li.match(/#{Regexp.escape(line)}/) then
           #puts "Its matching"
           pool = li.split(" ")[0] 
          end
        end 
        File.open('E:\Removed_users.txt', 'a') { |log| 
        log.puts "Removed from: #{file_name}"
        log.puts "Removed user : #{line}"
        log.puts "Removed from row :#{pool}"
        log.puts "*" * 50 
        }
        File.open(file_name, "w") { |file| file.puts replace }
      end
    end
end

1 Answer

Complex regular expressions are evil. Don't use them unless your application domain truly requires them. Instead, do multiple passes. I'm not really sure what your intended substitutions are, but structurally this is what you want to do:

# Create an interim string using your existing substitutions. For example,
# the corpus you currently have after substitutions contains:
tmp_str = <<~'EOF'
  id = class\poi23, class\opiuj, cap\7y6t5
  dept = sub\6985de, ret\oiu87, 
  cko = cyr\hui87
EOF

# Remove trailing commas (and any whitespace after them) at the end of each line.
final_str = tmp_str.gsub(/,\s*$/, '')

puts final_str

This will send the following output to the screen:

id = class\poi23, class\opiuj, cap\7y6t5
dept = sub\6985de, ret\oiu87
cko = cyr\hui87

With this approach, it doesn't matter if you work line by line, or on a multiline string. Either way, you're just stripping commas and trailing space at the end of each line. Simple!
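To make the multi-pass idea concrete, here is a minimal sketch that folds the removal and the trailing-comma cleanup into one helper. The name `remove_token` is hypothetical and not from the question's code, and I've used `[ \t]*` rather than `\s*` so the pattern can never consume a newline; that is my own variation, not a requirement.

```ruby
# Hypothetical helper (illustration only): remove one token from
# comma-separated "key = a, b, c" lines using several simple passes.
def remove_token(text, token)
  escaped = Regexp.escape(token)
  text
    .gsub(/#{escaped},[ \t]*/, '') # token followed by a comma
    .gsub(/#{escaped}/, '')        # token that was last on its line
    .gsub(/,[ \t]*$/, '')          # comma left dangling at end of line
end

# Sample data copied from the question.
sample = <<~'TEXT'
  id = class\234ha, class\poi23, class\opiuj, cap\7y6t5
  dept = sub\6985de, ret\oiu87, class\234ha
  cko = cyr\hui87
TEXT

cleaned = remove_token(sample, 'class\234ha')
puts cleaned
```

In your loop you could presumably call something like this in place of the chained `gsub` on `text`. Note that it matches the token as a plain substring, so a longer value that merely starts with `class\234ha` would also be affected.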


2 Comments

I had tried what you suggested earlier, but it didn't work. I have now added my entire code to the question. Thanks very much for looking into my query.
I have modified the code; your suggestion to include the * in the regex was the key. Thanks a lot. Sorry, I can't upvote because I don't have enough reputation.
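One plausible reading of why the `*` mattered (the modified code was not posted, so this is an assumption): a pattern that requires exactly one whitespace character after the comma only works when a space really is there, while `\s*` also covers the case where the comma is immediately followed by the newline. A small sketch with made-up variable names:

```ruby
# Two variants of the same line: comma followed by a space, and comma
# followed directly by the newline.
line_with_space = "dept = sub\\6985de, ret\\oiu87, \ncko = cyr\\hui87\n"
line_without    = "dept = sub\\6985de, ret\\oiu87,\ncko = cyr\\hui87\n"

one_ws = /,\s$/   # exactly one whitespace character, then end of line
any_ws = /,\s*$/  # zero or more whitespace characters, then end of line

with_space_fixed    = line_with_space.gsub(any_ws, '')
without_space_fixed = line_without.gsub(any_ws, '')
without_space_one   = line_without.gsub(one_ws, '')

puts with_space_fixed
puts without_space_fixed
puts without_space_one # the trailing comma is still there
```

With `one_ws`, the only whitespace available after the comma in `line_without` is the newline itself, and once that is consumed the match position is no longer at the end of a line, so nothing is replaced.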
