Multiline CSV: output on a single line, with double-quoted input lines, using a different separator

Question

I'm trying to get a multiline output from a CSV into one line in Bash.

My CSV file looks like this:

hi,bye
hello,goodbye

The end goal is for it to look like this:

"hi/bye", "hello/goodbye"

This is currently where I'm at:

INPUT=mycsvfile.csv
while IFS=, read col1 col2 || [ -n "$col1" ]
do
  source=$(awk '{print;}' | sed -e 's/,/\//g' )
  echo "$source";
done < $INPUT

The output is on every line and I'm able to change the , to a / but I'm not sure how to put the output on one line with quotes around it.

I've tried BEGIN:

source=$(awk 'BEGIN { ORS=", " }; {print;}'| sed -e 's/,/\//g' )

But this only outputs the last line, and omits the first hi/bye:

hello/goodbye

Would anyone be able to help me?

First, understand why the first line is omitted. The while loop reads the first line, and awk reads the second line. Why are you letting awk read from the same source as the while loop? — William Pursell
– William Pursell, Commented Apr 19, 2017 at 0:45

William Pursell · Accepted Answer · 2017-04-19 00:50:11Z

1

Just do the whole thing (mostly) in awk. The final sed is just here to trim some trailing cruft and inject a newline at the end:

< mycsvfile.csv awk '{print "\""$1, $2"\""}' FS=, OFS=/ ORS=", " | sed 's/, $//'

answered Apr 19, 2017 at 0:50

William Pursell

214k49 gold badges279 silver badges317 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Ed Morton Over a year ago

That awk command will produce output that doesn't have a terminating newline and so is not a POSIX "text file" and so YMMV with piping it to any other command (sed in this case) as it'll be relying on undefined behavior (per POSIX) at that point.

mklement0 · Accepted Answer · 2017-04-19 01:35:42Z

If you're willing to install trl, a utility of mine, the command can be simplified as follows:

input=mycsvfile.csv
trl -R '| ' < "$input" | tr ',|' '/,'

trl transforms multiline input into double-quoted single-line output separated by ,<space> by default.
- -R '| ' (temporarily) uses |<space> as the separator instead; this assumes that your data doesn't contain | instances, but you can choose any char. that you know not be part of your data.
tr ',|' '/,' then translates all , instances (field-internal to the input lines) into / instances, and all | instances (the temporary separator) into , instances, yielding the overall result as desired.

Installation of `trl` from the npm registry (Linux and macOS)

^{Note: Even if you don't use Node.js, npm, its package manager, works across platforms and is easy to install; try

curl -L https://git.io/n-install | bash}

With Node.js installed, install as follows:

[sudo] npm install trl -g

Note:

Whether you need sudo depends on how you installed Node.js and whether you've changed permissions later; if you get an EACCES error, try again with sudo.
The -g ensures global installation and is needed to put trl in your system's $PATH.

Manual installation (any Unix platform with `bash`)

Download this bash script as trl.
Make it executable with chmod +x trl.
Move it or symlink it to a folder in your $PATH, such as /usr/local/bin (macOS) or /usr/bin (Linux).

Ed Morton · Accepted Answer · 2017-04-20 00:59:05Z

1

$ awk -F, -v OFS='/' -v ORS='"' '{$1=s ORS $1; s=", "; print} END{printf RS}' file
"hi/bye", "hello/goodbye"

answered Apr 20, 2017 at 0:59

Ed Morton

209k18 gold badges90 silver badges212 bronze badges

Comments

mklement0 · Accepted Answer · 2017-04-19 01:03:12Z

There is no need for a bash loop, which is invariably slow.

sed and tr can do this more efficiently:

input=mycsvfile.csv
sed 's/,/\//g; s/.*/"&", /; $s/, $//' "$input" | tr -d '\n'

s/,/\//g uses replaces all (g) , instances with / instances (escaped as \/ here).
s/.*/"&", / encloses the resulting line in "...", followed by ,<space>:
- regex .* matches the entire pattern space (the potentially modified input line)
- & in the replacement string represent that match.
$s/, $// removes the undesired trailing ,<space> from the final line ($)
tr -d '\n' then simply removes the newlines (\n) from the result, because sed invariably outputs each line with a trailing newline.

Note that the above command's single-line output will not have a trailing newline; simply append ; printf '\n' if it is needed.

James Brown · Accepted Answer · 2017-04-19 04:45:25Z

0

In awk:

$ awk '{sub(/,/,"/");gsub(/^|$/,"\"");b=b (NR==1?"":", ")$0}END{print b}' file
"hi/bye", "hello/goodbye"

Explained:

$ awk '
{
    sub(/,/,"/")            # replace comma
    gsub(/^|$/,"\"")        # add quotes
    b=b (NR==1?"":", ") $0  # buffer to add delimiters
}
END { print b }             # output
' file

answered Apr 19, 2017 at 4:45

James Brown

37.7k8 gold badges52 silver badges64 bronze badges

Comments

isosceleswheel · Accepted Answer · 2017-04-19 06:16:49Z

0

I'm assuming you just have 2 lines in your file? If you have alternating 2 line pairs, let me know in comments and I will expand for that general case. Here is a one-line awk conversion for you:

# NOTE: I am using the octal ascii code for the 
# double quote char (\42=") in my printf statement
$ awk '{gsub(/,/,"/")}NR==1{printf("\42%s\42, ",$0)}NR==2{printf("\42%s\42\n",$0)}' file

output:

"hi/bye", "hello/goodbye"

answered Apr 19, 2017 at 6:16

isosceleswheel

1,5461 gold badge13 silver badges21 bronze badges

Comments

JFS31 · Accepted Answer · 2017-04-19 13:48:49Z

0

Here is my attempt in awk:

awk 'BEGIN{ ORS = " " }{ a++; gsub(/,/, "/"); gsub(/[a-z]+\/[a-z]+/, "\"&\""); print $0; if (a == 1){ print "," }}{ if (a==2){ printf "\n"; a = 0 } }'

Works also if your Input has more than two lines.If you need some explanation feel free to ask :)

answered Apr 19, 2017 at 13:48

JFS31

5185 silver badges14 bronze badges

Collectives™ on Stack Overflow

Multiline CSV: output on a single line, with double-quoted input lines, using a different separator

7 Answers 7

1 Comment

Installation of `trl` from the npm registry (Linux and macOS)

Manual installation (any Unix platform with `bash`)

Comments

Comments

Comments

Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

7 Answers 7

1 Comment

Installation of trl from the npm registry (Linux and macOS)

Manual installation (any Unix platform with bash)

Comments

Comments

Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related

Installation of `trl` from the npm registry (Linux and macOS)

Manual installation (any Unix platform with `bash`)