Using an if block in awk

Question

I'm processing a file in awk.

I want to pass along the rows in the file that have blanks in column positions 25 through 34, and I want to do work on the rows that have blanks in column positions 10 through 19. Specifically, I want to replace the blanks in column positions 10 through 19 with 0s. That way the output file will have the original rows with blanks in 25-34 untouched, and the rows with blanks in 10-19 will have had them replaced with '0's. So the output file will be the same as the input file, only with zeros in the relevant rows in positions 10-19. The file looks like this:

###########################################
#########          ########################
###########################################
###########################################
###########################################
###########################################
###########################################
###########################################
#########          #####          #########
###########################################
###########################################
###########################################

I know I have to use an if block, but I've never used one before in awk. The syntax below is what I think I need, but please help me with the details, specifically with what I'm using to specify 'blanks' in the if statements.

I apologize ahead of time for the bad syntax. This is my first time using an If block in awk. I know the syntax doesn't work, which is one of the reasons I'm posting this.

cat scr2 | awk 'BEGIN {
    pos1=substr($0,25,10); 
    pos2=substr($0,10,10);

      if (pos1 = ^[[:blank:]]$) 
         printf $0 
      else if (pos2 == ^[[:blank:]]$)
         {val=substr($0,25,10)} 
         gsub(/ /,0,val){$0=substr($0,1,24) val substr($0,35)} 1}'`

The sample output would be :

###########################################
#########0000000000########################
###########################################
###########################################
###########################################
###########################################
###########################################
###########################################
#########          #####          #########
###########################################
###########################################
###########################################

So the row with blanks only at positions 10-19 gets changed, and the row with blanks at both 10-19 and 25-34 gets left alone.

How do you expect anyone to read that script? For pity's sake, use newlines; use newlines liberally. You seem to be missing /…/ delimiters around some regexes, and to be using == where you need ~ (the regex match operator). Or you're missing double quotes around strings. There is nothing in $0 inside the BEGIN block of an Awk script: nothing has been read when the BEGIN block is executed.
– Jonathan Leffler, Commented Jul 15, 2021 at 16:55
Follow-up question: stackoverflow.com/questions/68412741/…
– tripleee, Commented Jul 16, 2021 at 19:54
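
The two points in the first comment can be seen in a minimal sketch. The `line` variable and the messages printed are illustrative assumptions, not part of the original script; the sample line is copied from the question's data.

```shell
# One sample line from the question: 9 '#', 10 spaces, 24 '#'.
line='#########          ########################'

printf '%s\n' "$line" | awk '
BEGIN { printf "BEGIN sees [%s]\n", $0 }     # nothing has been read yet, so $0 is empty
{
    # A main rule (bare braces) runs once per input line.
    # ~ is the regex match operator; the regex goes between slashes.
    if (substr($0, 10, 10) ~ /^ +$/)
        print "columns 10-19 are blank"
    else
        print "columns 10-19 are not blank"
}'
```

This should print `BEGIN sees []` followed by `columns 10-19 are blank`, which is why per-line tests belong in a main rule rather than in BEGIN.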

RavinderSingh13 · Accepted Answer · 2021-07-15 18:54 (edited by Jonathan Leffler, 2021-07-15 19:05:37Z)

3

With your shown samples, please try the following awk code. It was written and tested in GNU awk and should work in any awk.

awk '
substr($0,10,10) ~ /^ +$/ && substr($0,20) !~ / / {
  $0=substr($0,1,9) "0000000000" substr($0,20)
}
1
' Input_file

Explanation: The main rule checks two conditions. First, positions 10 through 19 must contain only spaces; second, the rest of the line (from position 20 onward) must not contain any spaces. If both hold, the spaces are replaced with zeros. The trailing 1 then prints every line, edited or not.
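
Since the question asked about an if block specifically, here is a sketch of the same logic with the condition moved inside an explicit `if`. The function name `zero_fill` and the variable names `left` and `rest` are illustrative assumptions; the behavior is intended to match the pattern-action form above.

```shell
# zero_fill is a hypothetical wrapper name, used here only for illustration.
zero_fill() {
    awk '
    {
        left = substr($0, 10, 10)   # columns 10-19
        rest = substr($0, 20)       # column 20 to the end of the line
        if (left ~ /^ +$/ && rest !~ / /) {
            # Rebuild the line: columns 1-9, ten zeros, columns 20 onward.
            $0 = substr($0, 1, 9) "0000000000" substr($0, 20)
        }
        print                       # print every line, changed or not
    }' "$@"
}

# Two sample lines from the question: only the first should change.
printf '%s\n' \
    '#########          ########################' \
    '#########          #####          #########' | zero_fill
```

In awk, a rule of the form `condition { action }` is already an implicit if, so the two forms are interchangeable; the explicit `if` is mostly useful when an `else` branch is needed.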

edited Jul 15, 2021 at 19:05

Jonathan Leffler

answered Jul 15, 2021 at 18:54

RavinderSingh13

10 Comments

RavinderSingh13 Over a year ago

@DavidC.Rankin, Thank you sir.

The fourth bird Over a year ago

Always an interesting read, awk is getting less of a riddle to me now :-)

RavinderSingh13 Over a year ago

@Thefourthbird, you're welcome. You are a champ at regex, and I have the same feeling about your answers too. Cheers.

Carbon Over a year ago

Hi Guys, the answers are all great. I see some of the answers are 'seeing' both blocks of space and only taking action on the first. I think I should have been more specific. The data I'm working with is bank data, so I couldn't put actual numbers down. I thought excluding blanks anywhere would be fine. In my case, on the lines we don't want, most of the columns are blank, so I tried to just use '#' for the 'generic' data. In reality the rows we want to screen out are mostly blank. They contain macro info. I'll add the new example rows in the main section.

RavinderSingh13 Over a year ago

@Carbon, Request you to please revert your question's latest update to previous one as many users had replied as per your previous question and it will be waste of their efforts as well as it will confuse users. You could open a fresh question for same and it could be discussed there.

glenn jackman · Accepted Answer · 2021-07-15 19:09:26Z

3

I'd use sed here:

sed -E 's/^(.{9}) {10}(.{5}[^ ]{10})/\10000000000\2/' file
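
The one-liner above can be read piece by piece. The sketch below wraps the same command in a function (the name `zero_fill_sed` is an illustrative assumption) and annotates each part of the pattern; it assumes a sed that accepts `-E` for extended regular expressions, as GNU and BSD sed do.

```shell
# zero_fill_sed is a hypothetical wrapper name, used here only for illustration.
zero_fill_sed() {
    # ^(.{9})          group 1: columns 1-9, whatever they contain
    #  {10}            exactly ten spaces in columns 10-19
    # (.{5}[^ ]{10})   group 2: columns 20-24, then ten non-space
    #                  characters in columns 25-34
    # Replacement:     group 1, ten literal zeros, group 2
    sed -E 's/^(.{9}) {10}(.{5}[^ ]{10})/\10000000000\2/' "$@"
}

# Two sample lines from the question: only the first should change.
printf '%s\n' \
    '#########          ########################' \
    '#########          #####          #########' | zero_fill_sed
```

The second sample line has spaces in columns 25-34, so `[^ ]{10}` cannot match there and the line passes through unchanged.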

answered Jul 15, 2021 at 19:09

glenn jackman

1 Comment

David C. Rankin Over a year ago

Very nicely done. (and much shorter than mine) I'll give it a nod.

David C. Rankin · Accepted Answer · 2021-07-15 19:17:33Z

Another option is using match() to fill the RSTART built-in variable specifying the start of the block of spaces. You can then use substr() in a regex comparison to verify the remainder of the line is comprised only of '#' characters. For example:

awk '{if (match($0,/[ ]{10}/) && RSTART == 10 && substr($0,20) ~ /^#*$/) sub(/[ ]{10}/,"0000000000")}1' file

The above uses match() to find lines whose first run of 10 spaces begins at column 10 and is followed only by '#' characters, and replaces those 10 spaces with 10 '0's.
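
To see what match() reports, here is a small sketch that prints RSTART and RLENGTH for a line with spaces and a line without. The function name `describe_spaces` and the message wording are illustrative assumptions; the regex `/ +/` is used instead of `{10}` only to keep the example independent of interval-expression support.

```shell
# describe_spaces is a hypothetical helper name, used here only for illustration.
describe_spaces() {
    awk '{
        # match() returns the 1-based position of the first match, or 0 if
        # there is none, and sets RSTART and RLENGTH as a side effect.
        if (match($0, / +/))
            printf "run of %d spaces starting at column %d\n", RLENGTH, RSTART
        else
            printf "no spaces (RSTART=%d, RLENGTH=%d)\n", RSTART, RLENGTH
    }' "$@"
}

printf '%s\n' \
    '#########          ########################' \
    '###########################################' | describe_spaces
```

For the first line this should report a run of 10 spaces starting at column 10; for the second, awk sets RSTART to 0 and RLENGTH to -1 because nothing matched.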

Example Use/Output

With your input in a file named lines, you would have:

$ awk '{if (match($0,/[ ]{10}/) && RSTART == 10 && substr($0,20) ~ /^#*$/) sub(/[ ]{10}/,"0000000000")}1' lines
###########################################
#########0000000000########################
###########################################
###########################################
###########################################
###########################################
###########################################
###########################################
#########          #####          #########
###########################################
###########################################
###########################################

Pierre François · Accepted Answer · 2021-07-15 19:37:19Z

2

You can do it in awk without any if statement:

awk '{print gensub(/^(.{9}) {10}([^ ]{24})/, "\\10000000000\\2", "g")}' file

This replaces the 10 blanks in positions 10 to 19 with 10 zeros, but only on lines that have no blanks in positions 20 to 43, which is what you want, I guess.

edited Jul 15, 2021 at 19:37

answered Jul 15, 2021 at 19:25

Pierre François

1 Comment

David C. Rankin Over a year ago

Only note would be that gensub() is gawk and may not be available in other awks.
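
For an awk without gensub(), the same conditions can be sketched with substr(), length() and index() instead of a regex with backreferences. This is one possible portable rewrite, not the answer author's code; the function name `zero_fill_portable` is an illustrative assumption.

```shell
# zero_fill_portable is a hypothetical wrapper name, used here only for illustration.
zero_fill_portable() {
    awk '
    # Same conditions as the gensub() regex, spelled out:
    #   columns 10-19 are ten spaces,
    #   the line reaches at least column 43,
    #   columns 20-43 contain no space.
    substr($0, 10, 10) == sprintf("%10s", "") &&
    length($0) >= 43 &&
    index(substr($0, 20, 24), " ") == 0 {
        $0 = substr($0, 1, 9) "0000000000" substr($0, 20)
    }
    { print }' "$@"
}

# Two sample lines from the question: only the first should change.
printf '%s\n' \
    '#########          ########################' \
    '#########          #####          #########' | zero_fill_portable
```

`sprintf("%10s", "")` builds the ten-space string so the count does not have to be typed by hand.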
