I am trying to replace a specific set of characters in a file using Perl, but it does not seem to work. Here is my code:

my $file = shift;
open(FILE, "$file") or die "File not found";
while (<FILE>){
   $data .=$_
}
$data =~ s/[^A-CEGHJ-PR-TW-Z]{1}[A-CEGHJ-NPR-TW-Z]{1}\s?[0-9]{2}\s?[0-9]{2}\s[0-9]{2}\s?[A-DEM]{0,1}$/XX012345X/g;

I know that my pattern finds the set of characters, but I am not entirely sure the replacement works. My main concern, however, is the Perl code: the file remains untouched after I run it.

Sample File.

AB123456C Ab12345678 DG657465 GH123456FG
  • Can you post a sample file in your question? Commented Jan 29, 2013 at 10:49
  • 1. You don't write to that file, you just read the data from it, so why should the file be changed? 2. Your regex uses anchors to match the start and the end of the string, you read multiple lines, probably you need the m modifier to change that behaviour? Commented Jan 29, 2013 at 10:51
  • Perhaps you should mention what it is that you hope your code will do. This code doesn't do anything unless you print $data. Also, in the first part of your regex, you have Z{1}, which looks like a typo. Commented Jan 29, 2013 at 10:51
  • Oh, and also "it does not work" is a horribly bad way to describe your problem. It doesn't really say anything, does it? Commented Jan 29, 2013 at 10:53
  • "The file remains untouched after I run it." Answers that. Edits made. My intentions are clear in the first line but for clarity, I am trying to open a file, do a replace regex on the entire file. Thanks Commented Jan 29, 2013 at 10:54

2 Answers

The code does not alter the file because you don't tell it to. You open the file for reading, not writing, plus you do not print anything.

If you want a quick way to handle this, just put your regex substitution in a file and use it as a source file. Like this:

Content of regex.pl:

s/[^A-CEGHJ-PR-TW-Z]{1}[A-CEGHJ-NPR-TW-Z]{1}\s?[0-9]{2}\s?[0-9]{2}\s[0-9]{2}\s?[A-DEM]{0,1}$/XX012345X/g;

One-liner:

perl -p regex.pl inputfile.txt > output.txt

This way you can quickly check the output. You can also pipe the output to a pager, or drop the redirection altogether and let it print to the terminal.
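If you would rather keep this as a script than a one-liner, here is a minimal sketch of the same idea: read the file, apply the substitution, and print the result. The helper name `redact` is made up for illustration, and the pattern is the unanchored version of the simplified regex from the comments below, so adjust it to whatever you actually need to match.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the substitution to a string and returns the result.
# The pattern is an assumption (the simplified regex without ^ and $ anchors).
sub redact {
    my ($text) = @_;
    $text =~ s/[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z][0-9]{6}[A-DFM]?/XX012345X/g;
    return $text;
}

# Only touch the filesystem when a file name was given on the command line.
if (@ARGV) {
    my $file = shift @ARGV;
    open my $in, '<', $file or die "Cannot open $file: $!";
    my $data = do { local $/; <$in> };    # slurp the whole file at once
    close $in;
    $data = '' unless defined $data;      # an empty file yields undef

    # Print the modified text; the input file itself is not changed.
    print redact($data);
}
```

As with the one-liner, this only prints to standard output, so redirect it to a new file if you want to keep the result.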

4 Comments

Okay, thanks for this. I like the idea of doing it in one line. I have slightly modified my regex. It is now /^[A-CEGHJ-PR-TW-Z]{1}[A-CEGHJ-NPR-TW-Z]{1}[0-9]{6}[A-DFM]{0,1}$/ which works perfectly for AB123456C when matching in regex tester websites. When I change it to a replace by adding /XX01234X/g it does not seem to work. Any ideas?
This now seems to work after removing the ^ and $. Is there any way I can avoid piping into another file and just modify the original?
Yes, you can use the -i switch, which edits the file in place. It is recommended to keep a backup, e.g. -i.bak (the backup of input.txt is saved as input.txt.bak). So: perl -pi.bak regex.pl input.txt
I usually do not recommend the -i switch to beginners because it is somewhat dangerous. The changes are irreversible, and even if you use backups, you can overwrite your original by running the script twice (file.txt.bak gets overwritten).

The file you are opening is opened read-only. So you need to open a temporary second file (File::Temp) where you write the $data variable, close it, remove the first file (unlink) and rename the temporary file to the desired name.

This SO question may be helpful.
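The steps above could look roughly like the sketch below. The name `rewrite_file` and its callback argument are invented for illustration and are not part of any module. It also skips the explicit unlink, on the assumption that rename replaces the existing file, which is the case on POSIX systems.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: rewrite $file by way of a temporary file.
# $transform is a code ref that takes the old content and returns the new content.
sub rewrite_file {
    my ($file, $transform) = @_;

    open my $in, '<', $file or die "Cannot open $file for reading: $!";
    my $data = do { local $/; <$in> };    # slurp the whole file
    close $in;
    $data = '' unless defined $data;      # an empty file yields undef

    # Create the temp file next to the original so rename() stays on one filesystem.
    my ($out, $tmpname) = tempfile(DIR => dirname($file));
    print {$out} $transform->($data) or die "Cannot write to $tmpname: $!";
    close $out or die "Cannot close $tmpname: $!";

    # Assumption: rename() replaces the existing file (true on POSIX systems).
    rename $tmpname, $file or die "Cannot rename $tmpname to $file: $!";
    return;
}
```

A call might look like `rewrite_file('input.txt', sub { my ($t) = @_; $t =~ s/foo/bar/g; return $t });`. Note that File::Temp creates the temporary file with restrictive permissions, so the rewritten file may end up with a different mode than the original.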

Off-topic note: please use the modern Perl approach to handle your files, with a lexical filehandle and the three-argument form of open. For example:

open my $fh, "<", $filename or die "Cannot open file $filename: $!";

See also this SO question. Avoid the use of package-global typeglob filehandles.
