
I am trying to split a string on ",". The split method works as expected for the following example1:

example1 = "1,'aaa',337.5,17195,.02,0,0,'yes','abc'"
example1.split(",")
Result: ['1', "'aaa'", '337.5', '17195', '.02', '0', '0', "'yes'", "'abc'"]

But I have a scenario where there are commas inside the single quotes, and I do not want to split on those.

example2 = "1,'aaa',337.5,17195,.02,0,0,'yes','abc, def, xyz'"
example2.split(",")
Result: ['1', "'aaa'", '337.5', '17195', '.02', '0', '0', "'yes'", "'abc", ' def', " xyz'"]

But I am trying to get this result instead:

['1', "'aaa'", '337.5', '17195', '.02', '0', '0', "'yes'", "'abc, def, xyz'"]

How can I achieve this with the string split function?

  • list(ast.literal_eval(example2)) should work; that's a valid Python tuple literal. But some context would help figure out the best solution. Where does that string come from? Commented Jan 4, 2019 at 11:15
  • Is that some sort of established format like CSV? If not, why not? If you have control over this, you should switch to using established formats, probably JSON, precisely to avoid reinventing this wheel. Commented Jan 4, 2019 at 11:16

2 Answers


You should first try to use built-ins or the standard library to read in your data as a list, for instance directly from a CSV file via the csv module.
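As a minimal sketch of that first suggestion, assuming the data arrives as lines of a file: csv.reader accepts any iterable of lines, so an io.StringIO object stands in here for a real file opened with open(path, newline=""). The second row is made-up sample data for illustration only.

```python
import csv
import io

# Stand-in for an open file; in practice this would be open(path, newline="").
# The second row is hypothetical sample data.
data = io.StringIO(
    "1,'aaa',337.5,17195,.02,0,0,'yes','abc, def, xyz'\n"
    "2,'bbb',12.0,4,.5,1,0,'no','single'\n"
)

# quotechar="'" tells the reader that commas inside single quotes
# belong to the field rather than separating fields.
rows = list(csv.reader(data, quotechar="'"))

for row in rows:
    print(row)
```

Each row comes back as a list of strings with the surrounding quote characters removed.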

If your string comes from a source you cannot control, wrapping it in square brackets gives a valid list literal, so you can use ast.literal_eval:

from ast import literal_eval

example2 = "1,'aaa',337.5,17195,.02,0,0,'yes','abc, def, xyz'"

res = literal_eval(f'[{example2}]')

# [1, 'aaa', 337.5, 17195, 0.02, 0, 0, 'yes', 'abc, def, xyz']
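If the input may not always be well formed, literal_eval raises ValueError or SyntaxError on anything that is not a plain Python literal. A small wrapper along these lines could guard against that; parse_fields is a hypothetical helper name, and returning None on failure is just one possible choice.

```python
from ast import literal_eval


def parse_fields(line):
    """Parse a comma-separated line of Python literals into a list.

    Hypothetical helper: returns None when the line is not made up
    of valid literals instead of letting the exception propagate.
    """
    try:
        return list(literal_eval(f"[{line}]"))
    except (ValueError, SyntaxError):
        return None


good = parse_fields("1,'aaa',337.5,17195,.02,0,0,'yes','abc, def, xyz'")
bad = parse_fields("1,abc")  # a bare name is not a literal

print(good)
print(bad)
```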

Note that this converts numeric fields to int or float as appropriate. If you would rather keep them as strings, as per @JonClements' comment, you can pass the string to csv.reader instead (the surrounding quote characters are stripped):

import csv

res = next(csv.reader([example2], quotechar="'")) 

# ['1', 'aaa', '337.5', '17195', '.02', '0', '0', 'yes', 'abc, def, xyz']
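One caveat worth sketching: csv.reader only treats the quote character as a quote when it is the first character of a field. If the input has a space after each comma (an assumption here, the string in the question does not), passing skipinitialspace=True lets the quotes be recognised again. The spaced string below is hypothetical.

```python
import csv

# Hypothetical input with a space after each delimiter.
spaced = "1, 'aaa', 'abc, def, xyz'"

# Without skipinitialspace, each quote follows a space, so it is treated
# as an ordinary character and the quoted commas are split on.
plain = next(csv.reader([spaced], quotechar="'"))

# With skipinitialspace=True the leading space is dropped first,
# so the quote opens a quoted field as intended.
trimmed = next(csv.reader([spaced], quotechar="'", skipinitialspace=True))

print(plain)
print(trimmed)  # ['1', 'aaa', 'abc, def, xyz']
```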

6 Comments

Note you can also use: next(csv.reader([example2], quotechar="'")) if preserving the elements as strings is definitely required - as opposed to ast.literal_eval which'll convert to other Python types.
@JonClements, Excellent point. I made the assumption OP wants numeric data as numbers.
Well - that makes more sense to me as well... but just going by their "But I am trying to get this result instead:"
Thanks jpp and @JonClements.. currently preserving them as strings, but converting it to appropriate types is good as well. I will explore that option.
@ArunNalpet literal_eval will work fine... it's only the f-string that needs Python 3.6+. On an older version, try: res = literal_eval('[{}]'.format(example2)) instead...

Assuming that you want to keep the single quotes around the elements ("'aaa'" instead of 'aaa', as in your expected output), here's how you could do it with a function:

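def spl(st, ch):
    """Split st on ch, ignoring any ch that sits inside single quotes."""
    res = []
    temp = []  # characters of the field currently being built
    in_quote = False
    for x in st:
        if x == "'":
            in_quote = not in_quote  # toggle on every single quote

        if not in_quote and x == ch:
            # Separator outside quotes: close the current field.
            res.append("".join(temp))
            temp = []
        else:
            temp.append(x)

    res.append("".join(temp))  # the last field has no trailing separator
    return res


example2 = "1,'aaa',337.5,17195,.02,0,0,'yes','abc, def, xyz'"

print(spl(example2, ','))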

Output:

['1', "'aaa'", '337.5', '17195', '.02', '0', '0', "'yes'", "'abc, def, xyz'"]
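For comparison, the same quote-preserving split can be sketched with a regular expression instead of a hand-written loop. This swaps in a different technique from the function above: it splits only on commas that are followed by an even number of single quotes, and it assumes the quotes are balanced and never escaped.

```python
import re

# Split on a comma only if the rest of the string contains an even number
# of single quotes, i.e. the comma is not inside a quoted section.
# Assumes balanced, unescaped quotes.
OUTSIDE_QUOTES = re.compile(r",(?=(?:[^']*'[^']*')*[^']*$)")

example1 = "1,'aaa',337.5,17195,.02,0,0,'yes','abc'"
example2 = "1,'aaa',337.5,17195,.02,0,0,'yes','abc, def, xyz'"

parts1 = OUTSIDE_QUOTES.split(example1)
parts2 = OUTSIDE_QUOTES.split(example2)

print(parts1)
print(parts2)
```

The lookahead rescans the remainder of the string for every comma, so for very long lines the character loop or csv.reader is likely the better fit.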
