5

I want to parse the following string:

3693,"Toxic Avenger, The (1985)",Comedy|Horror

to

3693,
"Toxic Avenger, The (1985)",
Comedy|Horror.

Similarly, the following string

161944,The Last Brickmaker in America (2001),Drama

should be parsed to

161944

The Last Brickmaker in America (2001)

Drama

I can't simply split on commas, because a quoted field such as "Toxic Avenger, The (1985)" contains a comma itself.
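To make the problem concrete, here is a minimal sketch of what a plain split on commas does to the first row. The variable names `row` and `naiveParts` are only for illustration.

```javascript
// Illustration only: naive splitting cuts the quoted title in two.
const row = '3693,"Toxic Avenger, The (1985)",Comedy|Horror';
const naiveParts = row.split(',');

console.log(naiveParts.length); // 4, not the 3 fields we want
console.log(naiveParts);
```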

The solution that worked: LS05 suggested using "substring", so I tried it and it worked perfectly. Here it is:

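    // Drop everything up to and including the first comma.
    // (For the three-field rows shown above, skip this step.)
    let pos1 = line.indexOf(',');
    line = line.substring(pos1 + 1);

    // The first and last commas of what remains delimit the middle field,
    // so any commas inside that field are left untouched.
    pos1 = line.indexOf(',');
    const pos2 = line.lastIndexOf(',');

    let movie_id = line.substring(0, pos1);
    let movie_tag = line.substring(pos1 + 1, pos2);
    let movie_timespan = line.substring(pos2 + 1);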

Thanks to LS05 :)
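For the three-field rows shown at the top of the question, the same first-comma/last-comma idea can be sketched as a small function. `splitAtOuterCommas` is a hypothetical name, and the sketch assumes that only the middle field can contain commas; it keeps any surrounding quotes as they are.

```javascript
// Hypothetical helper: splits a three-field row at its first and last commas.
// Assumes only the middle field can contain commas.
function splitAtOuterCommas(line) {
  const first = line.indexOf(',');
  const last = line.lastIndexOf(',');
  if (first === -1 || first === last) {
    throw new Error('Expected at least two commas: ' + line);
  }
  return [
    line.substring(0, first),        // id
    line.substring(first + 1, last), // title (quotes, if any, are kept)
    line.substring(last + 1),        // genres
  ];
}

const rows = [
  '3693,"Toxic Avenger, The (1985)",Comedy|Horror',
  '161944,The Last Brickmaker in America (2001),Drama',
];
for (const row of rows) {
  console.log(splitAtOuterCommas(row).join('\n'));
}
```

If the first or last field could also contain a comma, this approach no longer works and a real CSV parser is the safer choice.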

6
  • What type of data is this ? Commented Feb 22, 2017 at 9:33
  • 1
    Maybe you can take substrings for the first and the last part, so that the title is what remains Commented Feb 22, 2017 at 9:34
  • @alim oh ok, I've based my comment on your example data :) Commented Feb 22, 2017 at 13:59
  • @LS05 actually your idea turned out to be the best, worked great. thanks! :) Commented Feb 23, 2017 at 2:37
  • @alim Good! Maybe you can post the code (or the part that uses this strategy), just for reference on how you reached the solution :) Commented Feb 23, 2017 at 10:51

2 Answers

7

You can use a regex to parse your string; it ignores commas that are inside quotes:

    var str = '3693,"Toxic Avenger, The (1985)",Comedy|Horror';
    console.log(str.match(/(".*?"|[^",\s]+)(?=\s*,|\s*$)/g).join("\n"));

Demo (Refer to credits if you want to know how the above regex works)

As far as the code goes, it matches each field while ignoring the commas inside quoted strings, and then joins the resulting array items with a newline character \n.
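Note that the pattern above only matches unquoted fields that contain no whitespace. A looser sketch, assuming fields are never empty and quoted fields contain no escaped quotes, treats a field as either a double-quoted run or a run of non-comma characters. `fieldPattern`, `samples` and `parsed` are names chosen for illustration.

```javascript
// Sketch: a field is either a double-quoted run or a run of non-comma characters.
// Assumes no empty fields and no escaped quotes ("") inside quoted fields.
const fieldPattern = /"[^"]*"|[^,]+/g;

const samples = [
  '3693,"Toxic Avenger, The (1985)",Comedy|Horror',
  '161944,The Last Brickmaker in America (2001),Drama',
];

// match() with a global regex returns all matched fields as an array.
const parsed = samples.map((s) => s.match(fieldPattern));
for (const fields of parsed) {
  console.log(fields.join('\n'));
}
```

Quoted fields keep their surrounding quotes here, which matches the output asked for in the question.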

Credits for Regex


5 Comments

This removes the , at the end of each line. join(",\n") may help.
It has a problem: it can't parse the string '161944,The Last Brickmaker in America (2001),Drama'. It drops the text when there are no " " around it.
I added one more example. Could you check it out, please?
@alim You need to tweak your regex - jsfiddle.net/cc38s1a6/1 example, but you need to tweak that so it works with any types of quotes
@alim Try this console.log(str.match(/(["'].*?["']|[^",\s]+)(?=\s*,|\s*$)/g).join("\n")); (not 100% sure though if it might break something else), this will accept single and double quoted strings
3

You could use a CSV parser such as Papa Parse or, if you feel that a third-party library is not needed, you may take a look at this function.
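As a rough idea of what a library-free version could look like, here is a minimal sketch of a single-line CSV parser. `parseCsvLine` is a hypothetical helper written for this example, not the function linked above. It assumes double quotes as the only quote character, "" as an escaped quote, and no line breaks inside fields.

```javascript
// Hypothetical helper: parses one CSV line, honouring double-quoted fields
// and treating "" inside a quoted field as an escaped quote.
function parseCsvLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;            // skip the second quote of the pair
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;    // commas are ordinary characters here
      }
    } else if (ch === '"') {
      inQuotes = true;      // opening quote
    } else if (ch === ',') {
      fields.push(current); // field separator outside quotes
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current); // the last field has no trailing comma
  return fields;
}

console.log(parseCsvLine('3693,"Toxic Avenger, The (1985)",Comedy|Horror'));
console.log(parseCsvLine('161944,The Last Brickmaker in America (2001),Drama'));
```

Unlike the output shown in the question, this sketch strips the surrounding quotes from quoted fields. For real data files a tested parser is usually the safer choice.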
