Problems with traversing Array in csv file in bash script

Question

So what I'm trying to do in my code is basically read in a spreadsheet that has this format

username,   lastname,   firstname,    x1,      x2,       x3,      x4
user1,       dudette,    mary,         7,       2,                 4
user2,       dude,       john,         6,       2,        4,
user3,       dudest,     rad,
user4,       dudaa,      pad,          3,       3,        5,       9

Basically, it has usernames, the names those usernames correspond to, and values for each x. What I want to do is read this in from a csv file, find all of the blank cells, and fill them in with 5s. My approach was to read the whole file into an array and then substitute all empty cells with 0s. This is the code so far...

#!/bin/bash

while IFS=$'\t' read -r -a myarray
do
echo $myarray
done < something.csv

for e in ${myarray[@]
do
echo 'Can you see me #1?'
if [[-z $e]]
echo 'Can you see me #2?'
sed 's//0'
fi
done

The code isn't really changing my csv file at all. EDITED NOTE: the data is all comma separated.

What I've figured out so far:

Okay, the 'Can you see me' and the echo myarray are test code. I wanted to see if the whole csv file was being read in from echo myarray (which according to the output of the code seems to be the case). It doesn't seem, however, that the code is running through the for loop at all...which I can't seem to understand.

Help is much appreciated! :)

Why are there commas after x1, x2, and x3, but none elsewhere?
– Emmet, Commented Mar 10, 2014 at 4:09
So is it comma separated, tab separated, or whitespace separated? That makes a big difference for a shell implementation.
– Emmet, Commented Mar 10, 2014 at 4:18
Originally my code had while IFS=, read -a line in it...and I think that's the right answer ultimately...but when it outputted the array in the test code, only the column was outputted. After some googling, I found IFS = '/t' while read -r -a and tried that. It seemed to work for the output of the array...but nothing happened to my csv file.
– user3399613, Commented Mar 10, 2014 at 4:21

John B · Accepted Answer · 2014-03-10 18:13:14Z

1

The format of your .csv file is not comma separated, it's left aligned with a non-constant number of whitespace characters separating each field. This makes it difficult to be accurate when trying to find and replace empty columns which are followed by non-empty columns.

Here is a Bash only solution that would be entirely accurate if the fields were comma separated.

#!/bin/bash

n=5   # default value for empty cells
# IFS=, splits each line on commas; -r keeps backslashes literal.
while IFS=, read -r username lastname firstname x1 x2 x3 x4; do
    [[ -z $x1 ]] && x1=$n
    [[ -z $x2 ]] && x2=$n
    [[ -z $x3 ]] && x3=$n
    [[ -z $x4 ]] && x4=$n
    echo "$username,$lastname,$firstname,$x1,$x2,$x3,$x4"
done < something.csv > newfile.csv && mv newfile.csv something.csv

Output:

username,lastname,firstname,x1,x2,x3,x4
user1,dudette,mary,7,2,5,4
user2,dude,john,6,2,4,5
user3,dudest,rad,5,5,5,5
user4,dudaa,pad,3,3,5,9
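
If the number of columns varies, or the fields are padded with whitespace, the same idea can be applied with an array instead of named variables. The following is only a sketch under assumptions that are not in the question: `fill_blanks` is a hypothetical function name, the first (header) line is taken to define the row width, and whitespace inside a field is assumed to be insignificant and is removed.

```shell
#!/bin/bash
# Sketch: fill empty or missing cells with a default value.
# Usage: fill_blanks DEFAULT FIRST_INDEX < input > output
#   DEFAULT      value written into empty cells
#   FIRST_INDEX  zero-based index of the first column to check
# Assumption: the first non-blank line sets the expected number of columns.
fill_blanks() {
    local default=$1 first=$2 width=0 line i
    local -a cells
    # "|| [[ -n $line ]]" also handles a last line without a newline.
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line//[[:space:]]/}           # drop padding whitespace
        [[ -z $line ]] && continue           # skip blank lines
        IFS=, read -r -a cells <<< "$line"   # split on commas
        (( width == 0 )) && width=${#cells[@]}
        # Fill cells that are empty or missing (short rows) up to the width.
        for (( i = first; i < width; i++ )); do
            [[ -z ${cells[i]:-} ]] && cells[i]=$default
        done
        # Join the cells with commas in a subshell so IFS is not changed here.
        (IFS=,; printf '%s\n' "${cells[*]}")
    done
}
```

It could be called as `fill_blanks 5 3 < something.csv > newfile.csv && mv newfile.csv something.csv`, which mirrors the redirection used in the script above.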

edited Mar 10, 2014 at 18:13

answered Mar 10, 2014 at 3:14

John B


3 Comments

user3399613 Over a year ago

Thank you for your response! As with my code, for some bizarre reason, running your code on my computer doesn't seem to change the csv file at all. I don't know...maybe there's something up with my terminal? I've tried all sorts of approaches to this but they don't seem to work. Does it change your csv file?

OnlineCop Over a year ago

Works fine for me: it outputs to newfile.csv, however. If it didn't, then it would clobber the input file something.csv as it is being read in, which is a bad thing. You could always output to newfile.csv and, after that entire process is finished, mv newfile.csv something.csv.

user3399613 Over a year ago

Would you happen to know how I could modify this code for a variable number of columns?

Beel · Accepted Answer · 2014-03-10 01:00:10Z

0

I realize you asked for bash, but if you don't mind perl in lieu of bash, perl is a great tool for record-oriented files.

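#!/usr/bin/perl
use strict;
use warnings;

open(my $in,  '<', 'something.csv') or die "Cannot open something.csv: $!";
open(my $out, '>', 'outdata.txt')   or die "Cannot open outdata.txt: $!";
while (my $line = <$in>) {
        chomp $line;
        next if $line eq "";    # skip blank lines
        my ($username,$lastname,$firstname,$x1,$x2,$x3,$x4) = split(/\t/, $line);
        # Fill missing or empty cells with the default value 5.
        for my $x ($x1, $x2, $x3, $x4) {
                $x = 5 if !defined $x || $x eq "";
        }
        print $out "$username\t$lastname\t$firstname\t$x1\t$x2\t$x3\t$x4\n";
}
close($in);
close($out);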

This reads your infile, something.csv which is assumed to have tab-separated fields, and writes a new file outdata.txt with the re-written records.

answered Mar 10, 2014 at 1:00

Beel


6 Comments

user3399613 Over a year ago

Thank you so much for your response! I really appreciate it. Unfortunately, my project specifications state that it has to be shell script. Would you know how I translate this to bash?

Emmet Over a year ago

You've said it has to be Bash, not Perl, but you've used sed. What are you allowed and not allowed use? I could see an easy awk solution, for example, but I don't know whether awk is allowed (like sed) or not (like Perl).

user3399613 Over a year ago

awk is allowed. But it has to be bash script.

user3399613 Over a year ago

Also, unless if I'm horribly mistaken, I believe sed is also in bash. At least my terminal doesn't seem to be complaining about it...

Emmet Over a year ago

Sed isn't “in bash”. It's a separate executable for its own language (a very compact stream-oriented editing language).
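
One way to check this distinction from a bash prompt is the `type` builtin. This is a small illustration rather than part of the original comment, and the result for `sed` assumes it is installed on the PATH and not shadowed by an alias or function.

```shell
#!/bin/bash
# type -t reports how bash would resolve a command name.
type -t echo   # "builtin": implemented inside bash itself
type -t sed    # "file": a separate executable found on PATH
```

Either kind of command can be called from a bash script; the difference only matters when a task must use the shell alone.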


Emmet · Accepted Answer · 2014-03-10 19:03:35Z

0

I'm sure there's a better or more idiomatic solution, but this works:

#!/bin/bash

infile=bashcsv.csv     # Input filename
declare -i i           # Iteration variable
declare -i defval=5    # Default value for missing cells
declare -i n_cells=7   # Total number of cells per line
declare -i i_start=3   # Starting index for numeric cells
declare -a cells       # Array variable for cells

# We'd usually save/restore the old value of IFS, but there's no need here:
IFS=','

# Convenience function to bail/bug out on error:
bail () {
    echo "$@" >&2
    exit 1
}

# Strip whitespace and replace empty cells with `$defval`:
sed 's/[[:space:]]//g' "$infile" | while read -r -a cells; do

    # Skip empty/malformed lines:
    if [ ${#cells[*]} -lt $i_start ]; then
        continue
    fi

    # If there are fewer cells than $n_cells, pad to $n_cells
    # with $defval; if there are more, bail:
    if [ ${#cells[*]} -lt $n_cells ]; then
        for ((i=${#cells[*]}; $i<$n_cells; i++)); do
            cells[$i]=$defval
        done
    elif [ ${#cells[*]} -gt $n_cells ]; then
        bail "Too many cells."
    fi

    # Replace empty cells with default value:
    for ((i=$i_start; $i<$n_cells; i++)); do
        if [ -z "${cells[$i]}" ]; then
            cells[$i]=$defval
        fi
    done

    # Print out whole line, interpolating commas back in:
    echo "${cells[*]}"
done

Here's a gratuitous awk one-liner that gets the job done:

awk -F'[[:space:]]*,[[:space:]]*' 'BEGIN{OFS=","} /,/ {NF=7; for(i=4;i<=7;i++) if($i=="") $i=5; print}' infile.csv

edited Mar 10, 2014 at 19:03

answered Mar 10, 2014 at 14:37

Emmet


2 Comments

user3399613 Over a year ago

Emmet, thank you so much! The first solution seems to result in some odd compiler errors. The second solution with awk works great though! Do you know how I can get the awk output to simply replace the input file, so that I wind up with a csv with the 5s filled in? I tried simply piping it in with > but the output is very weird. It seems to put ,,,0,0,0,0,0,0,0 below each row...

Emmet Over a year ago

If you have a very recent GNU awk, it can do inplace edits, but even then it's probably better to use output redirection to write to a temporary file and then copy that over the original (be careful not to use a static file name if you can have more than one copy of your script running at once). As a general rule, you can't redirect to and from the same file. Unfortunately, there are a few limited cases where it works (more by luck than by design), and sometimes people will stumble on one of these and think it works in general, but it's far better to avoid it completely.
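
The temporary-file approach described in this comment might look roughly like the sketch below. `replace_file` is a hypothetical helper name, not something from the thread, and it uses `mv` rather than a copy; it assumes `mktemp` is available and accepts a template ending in `XXXXXX`. Note that the resulting file gets the permissions of the temporary file, not those of the original.

```shell
#!/bin/bash
# Sketch: rewrite a file "in place" through a uniquely named temporary file.
# Usage: replace_file FILE COMMAND [ARGS...]
# COMMAND reads FILE on stdin; FILE is replaced only if COMMAND succeeds.
replace_file() {
    local file=$1 tmp
    shift
    # Create the temp file next to the target so mv stays on one filesystem.
    # mktemp picks a unique name, so concurrent runs do not collide.
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if "$@" < "$file" > "$tmp"; then
        mv -- "$tmp" "$file"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

With the awk one-liner from the answer, this could be used as `replace_file infile.csv awk -F'[[:space:]]*,[[:space:]]*' 'BEGIN{OFS=","} /,/ {NF=7; for(i=4;i<=7;i++) if($i=="") $i=5; print}'`.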
