
I need the 7th column of a CSV file converted from a zero-padded float to a plain integer. It's a huge file, and I don't want to loop over it with while read. Are there any shortcuts with awk?

Input:

"xx","x","xxxxxx","xxx","xx","xx"," 00000001.0000"  
"xx","x","xxxxxx","xxx","xx","xx"," 00000002.0000"  
"xx","x","xxxxxx","xxx","xx","xx"," 00000005.0000"  
"xx","x","xxxxxx","xxx","xx","xx"," 00000011.0000"  

Output:

"xx","x","xxxxxx","xxx","xx","xx","1"  
"xx","x","xxxxxx","xxx","xx","xx","2"  
"xx","x","xxxxxx","xxx","xx","xx","5"   
"xx","x","xxxxxx","xxx","xx","xx","11" 

I tried the following two commands and they work, but is there anything simpler?

awk 'BEGIN {FS=OFS="\",\""} {$7 = sprintf("%.0f", $7)} 1' $test > $test1
awk '{printf("%s\"\n", $0)}' $test1

3 Answers


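With your shown samples, please try the following awk program.

awk -v s1="\"" -F'," ' -v OFS="," '{$NF = s1 ($NF + 0) s1} 1' Input_file

Explanation: The field separator is set to comma, double quote, space, which in the shown samples occurs only before the last field, so this relies on that leading space being present. OFS is set to a comma. In the main block, $NF + 0 reduces the last field to its numeric value, which is then wrapped in double quotes (s1). Assigning to $NF rebuilds the record with OFS, and the trailing 1 prints every line, edited or not.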


7 Comments

There is a closing parenthesis missing. Can you explain what $NF does?
@GD, sorry, that was a typo. It is fixed now and should work.
@GD, NF is the awk variable holding the number of fields in the current record. Since fields are numbered from one, NF is also the number of the last field. So $NF is the contents of the last field.
@glennjackman I get output like this, with an extra comma added:
"xx","x","xxxxxx","xxx","xx","xx",",1"
"xx","x","xxxxxx","xxx","xx","xx",",2"
@Ravinder, the first awk body only needs {$NF = s1 ($NF + 0) s1} -- need to add the leading quote. Also, assigning $NF already rewrites $0, don't need $1=$1
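
The points made in the comments above about NF, $NF, and field assignment can be checked with a small sketch. The input line 'a b c' is made up for illustration and is not from the question; the variable names are likewise hypothetical.

```shell
# Hypothetical sample line, made up for illustration.
line='a b c'

# NF is the number of fields, so $NF is the last field.
last_field=$(printf '%s\n' "$line" | awk '{print NF ": " $NF}')
echo "$last_field"    # 3: c

# Setting OFS alone does not change the record when no field is assigned.
unchanged=$(printf '%s\n' "$line" | awk -v OFS="," '1')
echo "$unchanged"     # a b c

# Assigning to any field rebuilds $0, joining the fields with OFS.
rebuilt=$(printf '%s\n' "$line" | awk -v OFS="," '{$NF = "X"} 1')
echo "$rebuilt"       # a,b,X
```

This is why an extra $1=$1 is unnecessary once $NF has been assigned: the record has already been rebuilt.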

Another simple awk solution:

awk 'BEGIN {FS=OFS="\",\""} {$NF = $NF+0 "\""} 1' file

"xx","x","xxxxxx","xxx","xx","xx","1"
"xx","x","xxxxxx","xxx","xx","xx","2"
"xx","x","xxxxxx","xxx","xx","xx","5"
"xx","x","xxxxxx","xxx","xx","xx","11"
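
To see why only the closing quote has to be re-added here, it may help to print the fields that this separator produces. The following is a rough sketch using the first sample line from the question; the loop and the bracketed output format are only for inspection and are not part of the answer.

```shell
# First sample line from the question, including its trailing spaces.
line='"xx","x","xxxxxx","xxx","xx","xx"," 00000001.0000"  '

# Split on the three-character separator "," and show each field in brackets.
fields=$(printf '%s\n' "$line" |
  awk 'BEGIN {FS="\",\""} {for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i}')
printf '%s\n' "$fields"
```

The first field keeps its opening quote and the last field keeps its closing quote plus the trailing spaces. When the record is rebuilt, OFS supplies the quote that opens the last field, so adding 0 to the last field and appending one quote is enough.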

awk 'BEGIN{FS=OFS=","} {gsub(/"/, "", $7); $7="\"" $7+0 "\""; print}' file

Output:

"xx","x","xxxxxx","xxx","xx","xx","1"
"xx","x","xxxxxx","xxx","xx","xx","2"
"xx","x","xxxxxx","xxx","xx","xx","5"
"xx","x","xxxxxx","xxx","xx","xx","11"

gsub(/"/, "", $7): removes all " from $7

$7+0: forces numeric conversion of $7, which drops the leading zeros and the trailing .0000
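
As a quick check of what adding 0 does, here is a minimal sketch. The helper name to_num and the value 2.75 are assumptions for illustration; only the zero-padded values mimic the question's column.

```shell
# Hypothetical helper: print the numeric value awk assigns to a string.
to_num() { awk -v v="$1" 'BEGIN {print v + 0}'; }

a=$(to_num ' 00000001.0000')   # leading space and zeros are ignored
b=$(to_num '00000011.0000')
c=$(to_num '2.75')             # a fractional part is kept, not rounded

# For comparison, the sprintf("%.0f") from the question rounds to an integer.
d=$(awk 'BEGIN {print sprintf("%.0f", 2.75)}')

echo "$a $b $c $d"             # 1 11 2.75 3
```

For the sample data both approaches give the same result. They differ only if the column can hold a non-zero fractional part, in which case adding 0 keeps the fraction and %.0f rounds it away.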

1 Comment

Worked like a charm. Just what I needed. Thanks for the detailed explanation too
