
I'm writing a bash script to read a set of files line by line and perform some edits. To begin with, I'm simply trying to move the files to backup locations and write them out as-is, to test the script is working. However, it is failing to copy the last line of each file. Here is the snippet:

    while IFS= read -r line
    do
            echo "Line is ***$line***"
            echo "$line" >> $POM
    done < $POM.backup

I obviously want to preserve whitespace when I copy the files, which is why I have set the IFS to null. I can see from the output that the last line of each file is being read, but it never appears in the output.

I've also tried an alternative variation, which does print the last line, but adds a newline to it:

    while IFS= read -r line || [ -n "$line" ]
    do
            echo "Line is ***$line***"
            echo "$line" >> $POM
    done < $POM.backup

What is the best way to do this read-write operation, so that the files are written out exactly as they are, with the correct whitespace and no newlines added?

  • I can see that the last line is being read, as it is output by the echo command. However it does not appear in the new file. Commented Feb 2, 2015 at 17:46
  • Then $POM.backup might have a \r before the \n (see the inspection sketch after these comments). Commented Feb 2, 2015 at 17:47
  • How would that affect writing to the new file? Commented Feb 2, 2015 at 17:48
  • the POSIX definition of a line is: a sequence of zero or more non-<newline> characters plus a terminating <newline> character. If the file doesn't end with a newline character, the last line is called an incomplete line. Text-processing tools are generally not good at handling incomplete lines, since a file containing one is not, strictly speaking, a text file. Commented Feb 2, 2015 at 17:54
  • What's wrong with cp $POM.backup $POM? And when you actually start editing the data, something like sed '<some_commands>' $POM.backup > $POM...? Commented Feb 2, 2015 at 21:05
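To check both suggestions above (a stray carriage return and a missing final newline), the backup file can be inspected directly, e.g.:

    # dump the last few bytes: any \r shows up explicitly, and it is also
    # visible whether or not the file ends with \n
    tail -c 16 "$POM.backup" | od -c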

3 Answers


The command that is adding the line feed (LF) is not the read command, but the echo command. read does not return the line with the delimiter still attached to it; rather, it strips the delimiter off (that is, it strips it if one was present, in other words if it just read a complete line).

So, to solve the problem, you have to use echo -n to avoid adding back the delimiter, but only when you have an incomplete line.

Secondly, when you give read a NAME (line in your case), it trims leading and trailing whitespace under the default IFS, which I don't think you want. This can be avoided either by keeping IFS= as you already do, or by not providing a NAME at all and using the default variable REPLY, which preserves leading and trailing whitespace regardless of IFS.
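To see both effects in isolation, here is a small illustration (the sample text is arbitrary); note how read reports failure on the unterminated line but still fills REPLY, and how a NAME is trimmed under the default IFS while REPLY is not:

# one complete line with padding, then one line with no trailing newline
printf '  padded  \nno newline at end' | {
    while true; do
        read -r; rc=$?
        [[ $rc -ne 0 && -z $REPLY ]] && break
        echo "rc=$rc REPLY=\"$REPLY\""
    done
}
# prints:  rc=0 REPLY="  padded  "          (whitespace kept in REPLY)
#          rc=1 REPLY="no newline at end"   (incomplete line still captured)

printf '  padded  \n' | { read -r line; echo "line=\"$line\""; }
# prints:  line="padded"                    (NAME + default IFS trims the padding)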

So, this should work:

#!/bin/bash

inFile=in;
outFile=out;

rm -f "$outFile";

rc=0;
while [[ $rc -eq 0 ]]; do
    read -r;
    rc=$?;
    if [[ $rc -eq 0 ]]; then ## complete line
        echo "complete=\"$REPLY\"";
        echo "$REPLY" >>"$outFile";
    elif [[ -n "$REPLY" ]]; then ## incomplete line
        echo "incomplete=\"$REPLY\"";
        echo -n "$REPLY" >>"$outFile";
    fi;
done <"$inFile";

exit 0;

Edit: Wow! Three excellent suggestions from Charles Duffy; here's an updated script:

#!/bin/bash

inFile=in;
outFile=out;

while { read -r; rc=$?; [[ $rc -eq 0 || -n "$REPLY" ]]; }; do
    if [[ $rc -eq 0 ]]; then ## complete line
        echo "complete=\"$REPLY\"";
        printf '%s\n' "$REPLY" >&3;
    else ## incomplete line
        echo "incomplete=\"$REPLY\"";
        printf '%s' "$REPLY" >&3;
    fi;
done <"$inFile" 3>"$outFile";

exit 0;
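As a quick check (using the in/out names from the script above), cmp confirms the copy is byte-for-byte identical, whether or not the input ends with a newline:

cmp in out && echo "files are byte-identical"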

3 Comments

This works, but it's a bit hard to read. Using a compound command in the while's conditional might help on that count, perhaps?
Also, see the "APPLICATION USAGE" section of pubs.opengroup.org/onlinepubs/009604599/utilities/echo.html for notes straight from the POSIX spec on echo's portability limitations. It's safer to use printf '%s\n' "$REPLY" (or printf '%s' "$REPLY" when no newline is desired) if you want this to work on systems with plain POSIX echo, XSI-extended echo, or GNU's implementation (which conforms to neither standard); see the short illustration after these comments.
Also, more efficient to open your output file only once rather than reopening it every time you want to append another line to the end. Just put 3>"$outFile" on the end of your loop, and redirect >&3 every time you want to add a line; not only is this more efficient, but it also means you don't need the rm -f.
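On the printf versus echo point, a tiny illustration (the sample value is arbitrary): bash's builtin echo swallows input that happens to look like one of its options, while printf reproduces it verbatim:

line='-n'               # legitimate data that looks like an echo option
echo "$line"            # bash's echo treats -n as a flag and prints nothing
printf '%s\n' "$line"   # prints the two literal characters: -n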

After review, I wonder whether the following:

{
line=
while IFS= read -r line
do
    echo "$line"
    line=
done
echo -n "$line"
} <"$INFILE" >"$OUTFILE"

isn't already enough...
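This works because read still assigns whatever it managed to read before hitting end-of-file, even though it reports failure, so an incomplete last line is left in $line after the loop; for example:

printf 'complete\nincomplete' | {
    while IFS= read -r line; do
        echo "loop saw: ***$line***"
    done
    echo "left over in \$line: ***$line***"
}
# loop saw: ***complete***
# left over in $line: ***incomplete***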

Here is my initial proposal:

#!/bin/bash

INFILE=$1

if [[ -z $INFILE ]]
then
    echo "[ERROR] missing input file" >&2
    exit 2
fi

OUTFILE=$INFILE.processed

# a way to know whether the last line is complete or not:
lastline=$(tail -n 1 "$INFILE" | wc -l)

if [[ $lastline == 0 ]]
then
    echo "[WARNING] last line is incomplete -" >&2
fi

# append a newline unconditionally; if the last line was already complete,
# the extra newline just shows up as a final empty "line"
echo | cat "$INFILE" - | {
    first=1
    while IFS= read -r line
    do
        if [[ $first == 1 ]]
        then
        echo "First Line is ***$line***" >&2
        first=0
        else
        echo "Next Line is ***$line***" >&2
        echo
        fi
        echo -n "$line" 
    done
} > "$OUTFILE"

if diff "$OUTFILE" "$INFILE"
then
    echo "[OK]"
    exit 0
else
    echo "[KO] processed file differs from input"
    exit 1
fi

The idea is to always add a newline at the end of the file and to print newlines only BETWEEN the lines that are read.

This should work for practically all text files, provided they contain no NUL (\0) bytes; any NUL bytes would be lost.

The initial test can be used to decide whether an incomplete text file is acceptable or not.
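To illustrate that completeness test on its own: tail -n 1 emits the last line exactly as stored, so wc -l reports 0 when the final newline is missing and 1 when it is present:

printf 'a\nb'   | tail -n 1 | wc -l    # 0 -> last line is incomplete
printf 'a\nb\n' | tail -n 1 | wc -l    # 1 -> last line ends with a newline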



Add the newline back yourself, but only when the line read was complete (i.e. read actually consumed a newline). Like this:

eof=
while [[ -z $eof ]]
do
    IFS= read -r line || eof=1
    [[ -n $eof && -z $line ]] && break   # clean EOF: nothing left to write
    echo "Line is ***$line***"
    printf '%s' "$line" >&3
    if [[ -z $eof ]]                     # complete line: restore its newline
    then
        printf '\n' >&3
    fi
done < "$POM.backup" 3>"$POM"

2 Comments

echo "\n" will echo the literal characters \ and n on several systems. printf '\n' would be the safer approach. Likewise, printf '%s\n' "$line" will handle content where echo "$line" will (on many systems) mess things up -- like a line containing the literal contents -n.
Also, as I commented on the other answer, reopening the output file for every line is a substantial unneeded performance penalty, rather than just opening it once and reusing the file descriptor.
