How can I cut the leading zeros in the third field so it will only be 6 characters?

 xxx,aaa,00000000cc
 rrr,ttt,0000000yhh

desired output

  xxx,aaa,0000cc
  rrr,ttt,000yhh

  • How are you deciding how many zeros to cut off? Commented Mar 4, 2015 at 1:16
  • Are you assuming you are always cutting off the same number of characters? Commented Mar 4, 2015 at 1:20

3 Answers

3

Here's a solution using awk:

 echo " xxx,aaa,00000000cc
 rrr,ttt,0000000yhh"|awk -F, -v OFS=, '{sub(/^0000/, "", $3)}1'

output

 xxx,aaa,0000cc
 rrr,ttt,000yhh

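awk uses -F (or the FS variable) to set the input field separator, and you must also set OFS (the output field separator) so that the line is rejoined with commas once a field has been modified.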

sub(/srchtarget/, "replacementstring", stringToFix) uses a regular expression to look for four 0s at the front (^) of the third field ($3) and replaces them with the empty string.

The 1 is shorthand for the print statement. A longhand version of the script would be:

echo " xxx,aaa,00000000cc
 rrr,ttt,0000000yhh"|awk -F, -v OFS=, '{sub(/^0000/, "", $3);print}'
 # ---------------------------------------------------------^^^^^^

It's all related to awk's /pattern/{action} idiom.
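For readers more at home in Python, the anchored substitution above can be sketched roughly as follows. This only illustrates what sub(/^0000/, "", $3) does and is not part of the awk solution; the function name strip_four_zeros is made up for the example, and it assumes plain comma-separated input with no quoted fields.

```python
import re


def strip_four_zeros(line, sep=","):
    """Remove exactly four leading zeros from the third field, if present."""
    fields = line.split(sep)
    if len(fields) >= 3:
        # ^ anchors the match to the start of the field, like /^0000/ in awk.
        fields[2] = re.sub(r"^0000", "", fields[2], count=1)
    # Rejoining with the same separator plays the role of OFS.
    return sep.join(fields)


if __name__ == "__main__":
    for sample in ["xxx,aaa,00000000cc", "rrr,ttt,0000000yhh"]:
        print(strip_four_zeros(sample))
```

As in the awk version, a third field that does not start with four zeros is left untouched.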

IHTH


5 Comments

Much better answer than mine. Would you mind explaining what that 1 at the end does?
@AndrewMagee : Better, well, shorter, anyway. Thanks. I've added a short explanation. If you read awk postings for a week or so, you'll find fuller definitions of how the /pattern/{action} works. Good luck to all.
@AndrewMagee: 1 is shorthand for: print the (potentially modified) line at hand unconditionally. Technically, 1 serves as a pattern that always evaluates to true (non-negative numbers in awk are considered true in a Boolean context). awk programs come in pattern-action pairs: if the pattern matches, the associated action ({...}) is executed. Patterns that do not have an associated action default to printing the line at hand. In awk, much is about what's left unsaid; the cleverly designed default behavior allows for very terse programs.
@mklement0 : Well put! Thanks for that. Good luck to all. Wow! Love your answer to stackoverflow.com/questions/12882611/… .
@shellter: thanks - almost: of course, I meant to say: nonzero numbers in awk are considered true in a Boolean context -- and thanks for the compliment on the linked answer.
1

If you can assume there are always three fields and you want to strip off the first four zeros in the third field you could use a monstrosity like this:

$ cat data
xxx,0000aaa,00000000cc
rrr,0000ttt,0000000yhh

$ cat data |sed 's/\([^,]\+\),\([^,]\+\),0000\([^,]\+\)/\1,\2,\3/'
xxx,0000aaa,0000cc
rrr,0000ttt,000yhh

Another more flexible solution if you don't mind piping into Python:

cat data | python -c '
import sys
for line in sys.stdin:
  print(",".join([f[4:] if i == 2 else f for i, f in enumerate(line.strip().split(","))]))
'

This says "remove the first four characters of the third field but leave all other fields unchanged".
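If the column number or the number of zeros may vary, the one-liner could be turned into a small function along these lines. This is a sketch under assumptions, not a drop-in replacement: the name drop_prefix_zeros and its parameters are invented for illustration, columns are counted from 1, and, unlike the slice above, it removes only characters that really are zeros.

```python
def drop_prefix_zeros(line, column=3, count=4, sep=","):
    """Strip up to `count` leading zeros from the 1-based `column` of `line`."""
    fields = line.rstrip("\n").split(sep)
    idx = column - 1
    if 0 <= idx < len(fields):
        field = fields[idx]
        # Walk past at most `count` zeros at the start of the field.
        stripped = 0
        while stripped < count and stripped < len(field) and field[stripped] == "0":
            stripped += 1
        fields[idx] = field[stripped:]
    return sep.join(fields)


if __name__ == "__main__":
    for sample in ["xxx,0000aaa,00000000cc", "rrr,0000ttt,0000000yhh"]:
        print(drop_prefix_zeros(sample))
```

Because the column is a parameter, targeting a later column (say the 15th) would only mean passing column=15; all other fields pass through unchanged.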

4 Comments

What if I also have 4 leading zeros in another column that I don't need to cut? I just need the 3rd column. Thanks.
I may consider this one. But what if I have more than 15 columns and only want to remove the leading zeros in column 3? Isn't it tedious?
It is a little bit tedious. Though it doesn't matter how many columns you have after the one you want to modify as you can just match a .* at the end.
Actually it will already work regardless of how many columns you have after the one you want to modify; they'll just remain unchanged. If you wanted to modify the 15th column, though, this solution would be a bit ridiculous.
0

Using awk's substr should also work:

awk -F, -v OFS=, '{$3=substr($3,5,6)}1' file
xxx,aaa,0000cc
rrr,ttt,000yhh

It just takes 6 characters starting at position 5 of field 3 and assigns them back to field 3.
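Note that substr($3,5,6) gives the desired result here because both sample fields are 10 characters long. If the goal is simply "keep the last 6 characters", which is one reading of the question, a slice counted from the end avoids that assumption. A minimal Python sketch of that idea follows; keep_last_six is a made-up name, and comma-separated input without quoting is assumed.

```python
def keep_last_six(line, sep=","):
    """Keep only the last six characters of the third field."""
    fields = line.split(sep)
    if len(fields) >= 3:
        # A negative slice counts from the end, so the field length does not matter.
        fields[2] = fields[2][-6:]
    return sep.join(fields)


if __name__ == "__main__":
    for sample in ["xxx,aaa,00000000cc", "rrr,ttt,0000000yhh"]:
        print(keep_last_six(sample))
```

Fields that are already six characters or shorter come back unchanged, but this variant would also cut non-zero characters from a longer field, so it only fits if the prefix is known to be padding.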
