Ahoy!
This is an interesting problem! I don't think it can be solved easily with awk or sed, because you have to read the whole file into memory and iterate through each row and column of the data. The awk and sed solutions above are hardcoded for the example input and will not work if the input file changes.
I was able to solve it in Perl using the following steps:
- Put the data in a two-dimensional array, @rows (also known as a 2D matrix)
- Starting with the first column $c, check column $c and $c+1
- Find any columns where $c and $c+1 have a dash (-) value for all rows
- Set these columns to 1 (TRUE) in @columnToRemove
- Print all columns except those with a 1 (TRUE) value in @columnToRemove
Here is the code...
#!/usr/bin/perl
use strict;
use warnings;

my (@rows, @columnToRemove);
while (<>) {
    next if /^$/;              # skip blank lines
    chomp;                     # remove newline
    s/\|//g;                   # remove separator
    push @rows, [ split // ];  # put data in two-dimensional array
}
# index of the last row and of the last column
my ($rowLength, $columnLength) = ($#rows, $#{ $rows[0] });

# find columns to delete: two consecutive columns with "-" in every row
COLUMN: for my $c (0 .. $columnLength) {
    # assume column can be deleted until a non-dash is found in $c or $c+1
    $columnToRemove[$c] = 1;
    for my $r (0 .. $rowLength) {
        # $c+1 is undef past the last column, so default it to ""
        if ( !($rows[$r][$c] eq "-" && ($rows[$r][$c+1] // "") eq "-") ) {
            # this column has data and cannot be deleted;
            # mark false and move to next column
            $columnToRemove[$c] = 0;
            next COLUMN;
        }
    }
}

# put "|" separator back around data
# do not print columns marked to remove in @columnToRemove
for my $r (0 .. $rowLength) {
    print "|";
    for my $c (0 .. $columnLength) {
        print $rows[$r][$c] if !$columnToRemove[$c];
    }
    print "|\n";
}
Output looks like this...
$ cat condense.txt
|------------------|
|------------------|
|-0----------------|
|-2-----2-----2----|
|-------2----------|
|-------------0----|
$ perl condense.pl condense.txt
|-------|
|-------|
|-0-----|
|-2-2-2-|
|---2---|
|-----0-|
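To see why seven columns survive in the output above, it can help to print which columns the rule marks for removal. The following is only a small sketch of the same "column $c and $c+1 are all dashes" test; column_mask is a made-up helper name and is not part of condense.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# column_mask is a hypothetical helper, not part of condense.pl.
# It takes rows without the "|" separators and returns one character per
# column: "x" for a column that would be removed, "." for one that is kept.
sub column_mask {
    my @rows = map { [ split // ] } @_;
    my $lastColumn = $#{ $rows[0] };
    my $mask = "";
    for my $c (0 .. $lastColumn) {
        # removable only if columns $c and $c+1 are "-" in every row
        my $remove = 1;
        for my $row (@rows) {
            my $here = $row->[$c]     // "";
            my $next = $row->[$c + 1] // "";   # undef past the last column
            if ($here ne "-" || $next ne "-") {
                $remove = 0;
                last;
            }
        }
        $mask .= $remove ? "x" : ".";
    }
    return $mask;
}

# the example input from above, without the "|" separators
my @sample = (
    "------------------",
    "------------------",
    "-0----------------",
    "-2-----2-----2----",
    "-------2----------",
    "-------------0----",
);
print "$_\n" for @sample;
print column_mask(@sample), "\n";
```

For the example input the last line printed is `..xxxx..xxxx..xxx.`, so each run of blank columns keeps exactly one dash, and the final column is always kept because there is no column after it to compare against.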
I wanted to check to make sure it worked, so I created some test files in the same format using the following script.
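#!/usr/bin/perl -w
use strict;
# index of the last row and of the last column (11 rows, 18 columns)
my ($rowLength, $columnLength) = (10, 17);
for (0 .. $rowLength) {
    print "|";
    for (0 .. $columnLength) {
        # roughly 1 cell in 20 gets a digit, so some columns stay blank and can be deleted
        my $n = int(rand 200);
        if ($n < 10) {
            print "$n";
        } else {
            print "-";
        }
    }
    print "|\n";
}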
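Here is the command to generate a file to condense, print the generated file to the terminal (tee /dev/tty), then condense it using the above scripts. The first 11 lines of the output are the generated file and the last 11 lines are the condensed result.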
$ perl create.condense.file.pl | tee /dev/tty | perl condense.pl
|-----3-----------0|
|------------------|
|------------------|
|--1----4----------|
|----------------8-|
|---3--------------|
|------------------|
|------------------|
|------------------|
|------------------|
|----------8-------|
|----3------0|
|------------|
|------------|
|-1----4-----|
|----------8-|
|--3---------|
|------------|
|------------|
|------------|
|------------|
|--------8---|
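Instead of comparing a run like the one above by eye, the result can also be checked mechanically. This is only a sketch under my own assumptions: check_condensed is a made-up helper, it expects the rows without the "|" separators, and it tests three properties that should hold for this approach (every row keeps its digits in order, all rows have the same length, and no two neighbouring columns are both blank).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# check_condensed is a hypothetical helper, not part of the scripts above.
# It takes two array refs of rows (original and condensed) and returns a
# list of problems; an empty list means every check passed.
sub check_condensed {
    my ($before, $after) = @_;
    return ("row count changed") if @$before != @$after;
    return () unless @$after;

    my @problems;
    my $width = length $after->[0];
    for my $r (0 .. $#$after) {
        push @problems, "row $r has a different length"
            if length($after->[$r]) != $width;
        # non-dash characters must survive in the same order
        (my $old = $before->[$r]) =~ tr/-//d;
        (my $new = $after->[$r])  =~ tr/-//d;
        push @problems, "row $r lost or changed data" if $old ne $new;
    }
    return @problems if @problems;   # the column check needs rectangular data

    # no two neighbouring columns may both consist only of dashes
    my $prevBlank = 0;
    for my $c (0 .. $width - 1) {
        my $blank = !grep { substr($_, $c, 1) ne "-" } @$after;
        push @problems, "columns " . ($c - 1) . " and $c are both blank"
            if $blank && $prevBlank;
        $prevBlank = $blank;
    }
    return @problems;
}

# the example input and output from earlier in this answer
my @before = (
    "------------------", "------------------", "-0----------------",
    "-2-----2-----2----", "-------2----------", "-------------0----",
);
my @after = (
    "-------", "-------", "-0-----",
    "-2-2-2-", "---2---", "-----0-",
);
my @problems = check_condensed(\@before, \@after);
print @problems ? join("\n", @problems) . "\n" : "ok\n";
```

For the example data this prints `ok`. To use it on generated files, the rows would first need the "|" characters stripped, the same way condense.pl does it.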
To create a new file with the output, try the following command...
$ perl condense.pl condense.txt > updated.condense.txt
I think that is what you were looking for! Give it a try and let me know how it works or if something needs to be changed.
Good Luck!
x and two backslashes? I'm not sure how that is two columns, can you edit the question to clarify? And if it's not literally xx<backslash><backslash>, can you add an example of what the data actually would be?
-x----, where x is either the number of the fret or another -. Removing some of the dashes from that would be simple, one could do that by just looking at one line at a time. But the problems come if the same file contains tablatures with different spacings (different numbers of dashes between the notes). Especially since we can't tell from the first line how many time units there are...