Use Unix's sed to turn a csv into a javascript object

Question

Sorry to post such a rudimentary question, but I'm getting confused by all the different tutorials and examples (and slashes and hyphens and back-ticks oh my) so I figured I would get someone's experienced input.

I have a .csv file, obviously comma-separated, with several hundred lines that look like this:

abcd-3096,62#,,100,,,25,,75,3, and it should be formatted like so:

{name: 'abcd-3096', weight : 62, some-field1: null, class: 100, some-field2: null, some-field3: null, unit-weight : 25, some-field4 : null, capacity : 75,   }

I know you'll either want to use awk or sed in order to replace it, and I'm more than fine with doing the formatting in several commands.

I don't expect anyone to format the whole line for me, but I'm hoping someone can show me how to prepend a column with some text. I can't seem to find a reliable explanation of the command anywhere online.

No, we can assume that commas only delineate the fields or columns.
– Csteele5, Commented Oct 20, 2015 at 23:40

Community · Accepted Answer · 2017-05-23 12:14:41Z

You can use negated character classes like [^,] for this:

sed -r 's/^([^,]*),([^,]*),([^,]*)/{ name: "\1", weight: "\2", somefield1: "\3" }/' file.csv

The example uses only 3 groups for simplicity ... but you get the idea.
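
For the sample line in the question, the same idea can be stretched to all nine named fields. sed only offers the back-references \1 through \9, so nine is also the ceiling for this approach. What follows is a sketch rather than part of the original answer. It assumes a sed that accepts -E (current GNU and BSD sed do). The field names, the stripping of the trailing # from the weight, and the mapping of empty values to null are assumptions taken from the desired output in the question.

```shell
# Sample line from the question; the tenth field and trailing comma are dropped.
line='abcd-3096,62#,,100,,,25,,75,3,'

# First expression: capture nine comma-separated fields, discard the rest (.*$).
#   ([^,#]*)#? captures the weight without its optional trailing '#'.
# Second expression: rewrite every empty quoted value ("") as null.
full=$(printf '%s\n' "$line" | sed -E \
  -e 's/^([^,]*),([^,#]*)#?,([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*).*$/{ name: "\1", weight: "\2", somefield1: "\3", class: "\4", somefield2: "\5", somefield3: "\6", unitweight: "\7", somefield4: "\8", capacity: "\9" }/' \
  -e 's/""/null/g')

echo "$full"
```

Note the trailing .*$ in the pattern: without it, sed replaces only the matched prefix and leaves the remaining fields sitting after the closing brace.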

If your system does not support sed -r (extended regex syntax), you need to use \(group\) instead of (group):

sed 's/^\([^,]*\),\([^,]*\),\([^,]*\)/{ name: "\1", weight: "\2", somefield1: "\3" }/' file.csv

In case you don't need to use sed, you can also use bash directly:

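# IFS=',' applies only to the read command, so IFS needs no reset afterwards.
# The variable list follows the column order in the question; "rest" collects any extra fields.
while IFS=',' read -r name weight somefield1 class somefield2 somefield3 unitweight somefield4 capacity rest
do
    echo "{ name: \"$name\", weight: \"$weight\", somefield1: \"$somefield1\","
    echo " class: \"$class\", somefield2: \"$somefield2\", somefield3: \"$somefield3\","
    echo " unitweight: \"$unitweight\", somefield4: \"$somefield4\", capacity: \"$capacity\" }"
done < file.csv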

(taken from this answer by koola)
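
The question also mentions awk, which the answer does not show. As a rough sketch under the same no-embedded-commas assumption, awk can split on commas with -F and print selected fields. The helper name val and the choice to print empty fields as null are illustrative assumptions, and only the first four columns are handled to keep it short.

```shell
# Sample line from the question.
line='abcd-3096,62#,,100,,,25,,75,3,'

obj=$(printf '%s\n' "$line" | awk -F',' '
  # Hypothetical helper: print empty fields as null.
  function val(s) { return (s == "") ? "null" : s }
  {
    sub(/#$/, "", $2)   # drop the trailing # from the weight
    printf "{ name: \"%s\", weight: %s, somefield1: %s, class: %s }\n", $1, val($2), val($3), val($4)
  }')

echo "$obj"
```

Unlike the sed version, awk addresses columns by number ($1, $2, ...), so there is no limit of nine fields and no need to repeat the pattern per column.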

edited May 23, 2017 at 12:14

CommunityBot

answered Oct 20, 2015 at 23:53

Leon Adler


5 Comments

Leon Adler Over a year ago

(All of these solutions assume that you have no commas in your data, as you stated in your comment on your question.)

Csteele5 Over a year ago

This is an excellent answer. For your first example, it looks like you are trying to negate the same thing 3 times. Is that in compensation for the triple comma part of my data?

Leon Adler Over a year ago

([^,]*) means "capture 0 or more characters that are not a comma". So for 2 values the pattern is ([^,]*),([^,]*), matching "a value, then a comma, then a value". For each additional group, you'd add ,([^,]*).
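
A quick way to see this, sketched here with a made-up three-field line and assuming a sed that accepts -E: because * allows zero characters, an empty field between two consecutive commas still gets its own (empty) group.

```shell
# Three fields, the middle one empty: [^,]* matches zero characters there.
groups=$(printf 'a,,c\n' | sed -E 's/^([^,]*),([^,]*),([^,]*)$/[\1][\2][\3]/')

echo "$groups"   # prints [a][][c]
```

So runs of commas in the data need no special handling; each group is simply repeated once per column.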

Csteele5 Over a year ago

So if I wanted to target just a single column using sed, how would the following pseudocode look: sed -r /regex field#/'string' field#/?

Leon Adler Over a year ago

sed -r 's/([^,]*).*/{ "name": "\1" }/'. The basic sed substitution syntax is sed 's/pattern/replacement/'. For regular expressions specifically, I can recommend reading regular-expressions.info.
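
To prepend text to one particular column while leaving the rest of the line untouched, one possible approach (a sketch, not from the thread) is to capture the columns in front of the target and put them back with \1. The labels "name: " and "class: " are only illustrative, and -E is assumed to be available.

```shell
# Sample line from the question.
line='abcd-3096,62#,,100,,,25,,75,3,'

# First column: anchor at the start of the line and insert the text there.
first=$(printf '%s\n' "$line" | sed 's/^/name: /')

# Fourth column: skip exactly three "field plus comma" units, keep them (\1),
# then insert the text in front of whatever follows.
fourth=$(printf '%s\n' "$line" | sed -E 's/^(([^,]*,){3})/\1class: /')

echo "$first"    # prints name: abcd-3096,62#,,100,,,25,,75,3,
echo "$fourth"   # prints abcd-3096,62#,,class: 100,,,25,,75,3,
```

Changing the {3} to N-1 targets column N in the same way.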
