I have a file containing two columns, name and ID. Using Bash, I would like to read through every row of the file and assign each column to a variable, $name and $ID. The columns are separated by whitespace, and I want to be sure to include the last row in the file. Can someone help?
2 Answers
To ensure that the loop runs even after parsing an invalid line (such as one with no trailing newline at the end of the file), you can use an alternate test to indicate success if any data is read:
while read -r name id _ || [[ $name ]]; do
  printf 'Read name: %s, and id: %s\n' "$name" "$id"
done <input
Here, if read reports failure, we check whether $name is non-empty and, if it is, still run the body of the loop. The _ variable "soaks up" columns 3 and onward, so that only the first two are assigned to name and id.
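As a rough illustration of the difference, the sketch below runs both loop forms over the same data. The temporary file and its contents (three columns, no newline after the last row) are made up for the example, and the rows array and plain counter are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: three columns, and NO newline after the last row.
input=$(mktemp)
printf 'alice 101 extra\nbob 202' >"$input"

# Plain loop: read fails on the unterminated last row, so its body is skipped.
plain=0
while read -r name id _; do
  plain=$((plain+1))
done <"$input"

# Loop with the extra test: the last row is still processed.
rows=()
while read -r name id _ || [[ $name ]]; do
  rows+=("$name:$id")
done <"$input"
rm -f -- "$input"

echo "plain loop saw $plain row(s)"
printf '%s\n' "${rows[@]}"
```

With this input the plain loop handles one row, while the second loop collects alice:101 and bob:202; the third column of the first row ends up in _ and is ignored.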
Because this approach doesn't depend on a pipeline, you can be assured that variables set in the loop are persisted past its exit:
name_count=0
while read -r name id _ || [[ $name ]]; do
  name_count=$((name_count+1))
  printf 'Read name: %s, and id: %s\n' "$name" "$id"
done <input
echo "Read a total of $name_count names"
Doing otherwise may run afoul of BashFAQ #24 ("I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?").
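To see the pitfall concretely, here is a small sketch, assuming Bash with default settings (lastpipe not enabled). The piped and kept counters are hypothetical names. Using process substitution for the second loop is one common way to feed command output to the loop without a pipeline; it is not something the answer above prescribes.

```shell
#!/usr/bin/env bash
# Piped loop: the while loop runs in a subshell, so its counter is lost.
piped=0
printf 'a 1\nb 2\n' | while read -r name id _; do
  piped=$((piped+1))
done
echo "piped: $piped"   # still 0 here unless lastpipe is in effect

# Process substitution keeps the loop in the current shell,
# so the counter survives after the loop exits.
kept=0
while read -r name id _ || [[ $name ]]; do
  kept=$((kept+1))
done < <(printf 'a 1\nb 2')
echo "kept: $kept"     # 2: both rows counted, including the unterminated one
```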
Your problem is probably that the last line is not processed by a while read loop if it doesn't end in a newline:
printf '%s %s\n%s %s' a b c d | while read x y ; do
  printf '%s, %s\n' $x $y
done
You can add the missing newline with, for example, the following Perl one-liner:
perl -pe '$_ .= "\n" unless /\n/'
It appends a newline to any line that doesn't have one, which can only be the last line.
Check:
printf '%s %s\n%s %s' a b c d \
  | perl -pe '$_ .= "\n" unless /\n/' \
  | while read x y ; do
      printf '%s, %s\n' $x $y
    done
If you're processing several files, run them through the same filter first to add a newline to the last line of each:
perl -pe '$_ .= "\n" unless /\n/' -- files* | while read x y ; do ...
2 Comments
Piping into while read means that variables set there aren't persisted -- as you well know. You should also quote appropriately. I'd strongly suggest || [[ $x ]] to execute the loop even after a read with a nonzero exit status if it succeeded in populating at least one variable, as an alternative to modifying the input stream.
(... read was using the inherited IFS we know that x won't contain any characters in same, but y can, and there's also potential for globs to be present).