
I would like to find and replace dates in a CSV file under the following conditions:

1) The first two columns are blank, like "","",

1a) $case[1] should not match because of the text in the first two columns

2) Each of the next 6 columns may contain dates, like in $case[0] below

2a) $case[2] should not match since all 6 columns are blank

my @case = (
'"","","","1/2/2012","","","","",="12345678"',
'"Add","New","1/1/2012","1/2/2012","","","",""="0987654"',
'"","","","","","","","",="91234567"'
); 

I have used the following code, but it incorrectly matches $case[2] and impacts the script's efficiency:

my $argFind = (qr/^"","",("[\d\/]*",){6}(.*)/);

$replace = '"","","","","","","","",'; 

if (grep(/$argFind/, @case)) {
    s/$argFind/$replace$2/ for @case;
    # write file
}

The end result should be like:

$case = [
'"","","","","","","","",="12345678"',
'"Add","New","1/1/2012","1/2/2012","","","",""="0987654"',
'"","","","","","","","",="91234567"'
]; 
  • Requirement 2: any of the 6 columns can contain dates, but at least one of them should. I would like to correct all of these examples:
    "","","1/1/11","","","","",""
    "","","","1/2/13","","4/3/2010","",""
    "","","","","","","","1/3/2012"
    Commented Mar 17, 2012 at 23:33
  • I was writing an equivalent program using Text::CSV but it has drawn my attention to your data being strange CSV. The end of the strings is either "",="12345678" or "",""="0987654". Is either of these correct? Please explain. Commented Mar 18, 2012 at 14:29
  • I realise now that you already have a working solution of your own except that you are worried about the cost of performing the substitution on records where all of the first eight fields are already empty. Please forget these misgivings. Not only does it break the rule that you should write purely to optimise clarity in your code until you have found that your solution runs too slowly, but I am sure it will also provide a negligible improvement in performance - especially if you are reading your data from a disk file. Commented Mar 18, 2012 at 14:35
  • @Borodin You're exactly right. I do have a working but not optimal solution. However, there is a noticeable impact to efficiency, given a large amount of data processed on remote file shares. Commented Mar 18, 2012 at 15:35
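
The requirement clarified in the comments (first two fields empty, at least one of the next six holding a date) can be put into the pattern itself with a negative lookahead, so rows that are already blank never match and never trigger a rewrite. The following is only a sketch under assumptions: the two-to-four-digit year, the names $find, $blank, @rows and $changed, and the sample rows are illustrative, and every field is assumed to be quoted and followed by a comma.

```perl
use strict;
use warnings;

# Negative lookahead: refuse rows whose first eight fields are already empty.
# Then require two empty fields followed by six fields that are each either
# empty or date-like (assumed form: d/m/yy up to dd/mm/yyyy).
my $find = qr[ ^ (?! (?: "", ){8} ) (?: "", ){2} (?: " (?: \d\d?/\d\d?/\d{2,4} )? ", ){6} ]x;

my $blank = '"",' x 8;

my @rows = (
    '"","","","1/2/2012","","","","",="12345678"',
    '"","","1/1/11","","","","","",="12345678"',
    '"Add","New","1/1/2012","1/2/2012","","","","",="0987654"',
    '"","","","","","","","",="91234567"',
);

# Count the rows that were actually modified, so the caller can skip
# writing the file when nothing changed.
my $changed = 0;
for my $row (@rows) {
    $changed++ if $row =~ s/$find/$blank/;
}

print "$_\n" for @rows;
print "rows changed: $changed\n";
```

Because the pattern only consumes the first eight fields, the rest of the line stays in place without a trailing capture group, and $changed doubles as the "write the file" flag.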

3 Answers

2

I believe you should really use Text::CSV to retrieve a list of data values from the CSV record. Then you can examine the fields individually to check whether they match your requirements.
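
As a rough illustration of examining the fields individually, the sketch below does not use Text::CSV, because the sample lines end in a non-standard ="..." field that a strict CSV parser may reject. Instead it peels the first eight quoted fields off the line with core Perl only. The helper name blank_dates, the $date pattern, and the assumption that fields never contain embedded quotes are all illustrative, not part of the original answer.

```perl
use strict;
use warnings;

# Assumed date shape: d/m/yy up to dd/mm/yyyy.
my $date = qr{\A \d\d? / \d\d? / \d{2,4} \z}x;

# Hypothetical helper: return the line with its first eight fields blanked
# when fields 1-2 are empty and fields 3-8 hold at least one date and
# nothing but dates or empty strings. Otherwise return the line unchanged.
sub blank_dates {
    my ($line) = @_;

    # Split off the first eight quoted, comma-terminated fields;
    # everything after them is kept verbatim.
    my ($head, $tail) = $line =~ /^((?:"[^"]*",){8})(.*)$/s
        or return $line;
    my @fields = $head =~ /"([^"]*)",/g;

    # Condition 1: the first two columns must be blank.
    return $line if grep { length($_) } @fields[0, 1];

    # Condition 2: at least one date, and no non-date text, in columns 3-8.
    my @dated = grep { $_ =~ $date } @fields[2 .. 7];
    my @other = grep { length($_) && $_ !~ $date } @fields[2 .. 7];
    return $line if !@dated || @other;

    return ('"",' x 8) . $tail;
}

print blank_dates('"","","","1/2/2012","","","","",="12345678"'), "\n";
```

Checking each field separately is more verbose than a single regex, but each requirement from the question maps to one line of code, which makes later changes to the rules easier to follow.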

But as long as the data is produced automatically and remains well-behaved you could try

qr[ ^ (?: "", ){2} (?: " (?: \d\d?/\d\d?/\d\d\d\d )? ", ){6} ]x;

which finds two empty fields followed by six fields that are either empty or contain something that looks like a date. This program demonstrates it:

use strict;
use warnings;

my @case = (
  '"","","","1/2/2012","","","","",="12345678"',
  '"Add","New","1/1/2012","1/2/2012","","","",""="0987654"',
  '"","","","","","","","",="91234567"'
);

my $argFind = qr[ ^ (?: "", ){2} (?: " (?: \d\d?/\d\d?/\d\d\d\d )? ", ){6} ]x;

my $replace = '"",' x 8;

for (@case) {
  print "$_\n" if s/$argFind/$replace/;
}

OUTPUT

"","","","","","","","",="12345678"
"","","","","","","","",="91234567"


2 Comments

+1 to this answer. When dealing with CSV files using a module is a must, they are a pain and easy to break.
@alfa64: except that this doesn't seem to be any sort of standard CSV. See my comment on the question.
0

Well, I got your end result with this:

 qr/^(?:"",){3}"\d\d?\/\d\d?\/\d{4}",(?:"",){4}/;
  • Your regex was off because there are 3 blank columns, not 2.
  • Also note that you don't need the generic capture at the end just to paste it back into place.
  • Since the 4th column was not just any number of digits and slashes, I look for a more specific pattern.


0

Thanks for all of your input. I'm going to go with something like this as a solution:

use strict;
use warnings;
use Data::Dumper;

my @case = (
        '"","","","1/2/2012","","","","",="12345678"',
        '"","","1/1/11","","","","","",="12345678"',
        '"","","","1/2/13","","4/3/2010","","",="987654"',
        '"","","","","","","","1/3/2012",="567890"',
        '"Add","New","1/1/2012","1/2/2012","","","","",="0987654"',
        '"","","","","","","","",="91234567"'
); 

my $argFind = (qr/^"","",("[\d\/]*",){6}/);

my $replace = '"",' x 8; 

for (@case) {
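        # Skip rows whose first eight fields are already blank; the pattern
        # is anchored so it cannot match eight empty fields later in the line
        unless (m/^\Q$replace\E/) {
                s/$argFind/$replace/;
                # Set flag to write file after loop
        }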
}

warn Dumper \@case;
