2

I'm trying to get a multiline output from a CSV into one line in Bash.

My CSV file looks like this:

hi,bye
hello,goodbye

The end goal is for it to look like this:

"hi/bye", "hello/goodbye"

This is currently where I'm at:

INPUT=mycsvfile.csv
while IFS=, read col1 col2 || [ -n "$col1" ]
do
  source=$(awk '{print;}' | sed -e 's/,/\//g' )
  echo "$source";
done < $INPUT

The output is on every line and I'm able to change the , to a / but I'm not sure how to put the output on one line with quotes around it.

I've tried BEGIN:

source=$(awk 'BEGIN { ORS=", " }; {print;}'| sed -e 's/,/\//g' )

But this only outputs the last line, and omits the first hi/bye:

hello/goodbye

Would anyone be able to help me?

1
  • 1
    First, understand why the first line is omitted. The while loop reads the first line, and awk reads the second line. Why are you letting awk read from the same source as the while loop? Commented Apr 19, 2017 at 0:45

7 Answers 7

1

Just do the whole thing (mostly) in awk. The final sed is just here to trim some trailing cruft and inject a newline at the end:

< mycsvfile.csv awk '{print "\""$1, $2"\""}' FS=, OFS=/ ORS=", " | sed 's/, $//'
Sign up to request clarification or add additional context in comments.

1 Comment

That awk command will produce output that doesn't have a terminating newline and so is not a POSIX "text file" and so YMMV with piping it to any other command (sed in this case) as it'll be relying on undefined behavior (per POSIX) at that point.
1

If you're willing to install trl, a utility of mine, the command can be simplified as follows:

input=mycsvfile.csv
trl -R '| ' < "$input" | tr ',|' '/,'
  • trl transforms multiline input into double-quoted single-line output separated by ,<space> by default.

    • -R '| ' (temporarily) uses |<space> as the separator instead; this assumes that your data doesn't contain | instances, but you can choose any char. that you know not be part of your data.
  • tr ',|' '/,' then translates all , instances (field-internal to the input lines) into / instances, and all | instances (the temporary separator) into , instances, yielding the overall result as desired.


Installation of trl from the npm registry (Linux and macOS)

Note: Even if you don't use Node.js, npm, its package manager, works across platforms and is easy to install; try
curl -L https://git.io/n-install | bash

With Node.js installed, install as follows:

[sudo] npm install trl -g

Note:

  • Whether you need sudo depends on how you installed Node.js and whether you've changed permissions later; if you get an EACCES error, try again with sudo.
  • The -g ensures global installation and is needed to put trl in your system's $PATH.

Manual installation (any Unix platform with bash)

  • Download this bash script as trl.
  • Make it executable with chmod +x trl.
  • Move it or symlink it to a folder in your $PATH, such as /usr/local/bin (macOS) or /usr/bin (Linux).

Comments

1
$ awk -F, -v OFS='/' -v ORS='"' '{$1=s ORS $1; s=", "; print} END{printf RS}' file
"hi/bye", "hello/goodbye"

Comments

0

There is no need for a bash loop, which is invariably slow.

sed and tr can do this more efficiently:

input=mycsvfile.csv
sed 's/,/\//g; s/.*/"&", /; $s/, $//' "$input" | tr -d '\n'
  • s/,/\//g uses replaces all (g) , instances with / instances (escaped as \/ here).

  • s/.*/"&", / encloses the resulting line in "...", followed by ,<space>:

    • regex .* matches the entire pattern space (the potentially modified input line)
    • & in the replacement string represent that match.
  • $s/, $// removes the undesired trailing ,<space> from the final line ($)

  • tr -d '\n' then simply removes the newlines (\n) from the result, because sed invariably outputs each line with a trailing newline.

Note that the above command's single-line output will not have a trailing newline; simply append ; printf '\n' if it is needed.

Comments

0

In awk:

$ awk '{sub(/,/,"/");gsub(/^|$/,"\"");b=b (NR==1?"":", ")$0}END{print b}' file
"hi/bye", "hello/goodbye"

Explained:

$ awk '
{
    sub(/,/,"/")            # replace comma
    gsub(/^|$/,"\"")        # add quotes
    b=b (NR==1?"":", ") $0  # buffer to add delimiters
}
END { print b }             # output
' file

Comments

0

I'm assuming you just have 2 lines in your file? If you have alternating 2 line pairs, let me know in comments and I will expand for that general case. Here is a one-line awk conversion for you:

# NOTE: I am using the octal ascii code for the 
# double quote char (\42=") in my printf statement
$ awk '{gsub(/,/,"/")}NR==1{printf("\42%s\42, ",$0)}NR==2{printf("\42%s\42\n",$0)}' file

output:

"hi/bye", "hello/goodbye"

Comments

0

Here is my attempt in awk:

awk 'BEGIN{ ORS = " " }{ a++; gsub(/,/, "/"); gsub(/[a-z]+\/[a-z]+/, "\"&\""); print $0; if (a == 1){ print "," }}{ if (a==2){ printf "\n"; a = 0 } }'

Works also if your Input has more than two lines.If you need some explanation feel free to ask :)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.