4

I have a string of items separated by commas. Each item is surrounded by double quotes ("), but the items can also contain commas (,), so using split(',') splits in the wrong places.

How can I split this text properly in Python?

An example of such a string

"coffee", "water, hot"

What I want to achieve

["coffee", "water, hot"]

5
  • Can you show a few more examples of what you're looking for? Commented Sep 25, 2022 at 5:26
  • 1
    @Dennis Sure, but before that I updated the question with the result I want to achieve. Does that help? Commented Sep 25, 2022 at 5:31
  • 1
    @Dennis Now that I see the example and the result side by side, I think I can just wrap the string in [] and convert it to a list? Haha, I'll try that. Commented Sep 25, 2022 at 5:32
  • 1
    Does this answer your question? How do I split a string on a delimiter in Bash? Commented Sep 25, 2022 at 5:36
  • 1
    @Kaneki21 Thank you for the suggestion. It splits the items into individual lists and doesn't have conflicting delimiter in individual items. So unfortunately, that's not the answer. Commented Sep 25, 2022 at 5:40

6 Answers

3

You can split on a separator that contains more than one character. '"coffee", "water, hot"'.split('", "') gives ['"coffee', 'water, hot"']. From there you can remove the leading quote from the first item and the trailing quote from the last item.
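The idea above can be sketched as a small helper. The name split_quoted is hypothetical, and the sketch assumes the input always starts and ends with a double quote and that items are separated by exactly '", "':

```python
def split_quoted(s):
    """Split '"a", "b, c"' into ['a', 'b, c'] (hypothetical helper)."""
    s = s.strip()
    # Assumption: the whole string is wrapped in one opening and one closing quote.
    if len(s) < 2 or s[0] != '"' or s[-1] != '"':
        raise ValueError('expected a string that starts and ends with a double quote')
    # Drop the outer quotes first, then split on the multi-character separator.
    return s[1:-1].split('", "')


print(split_quoted('"coffee", "water, hot"'))  # ['coffee', 'water, hot']
```

As noted in the comments, this still splits incorrectly if an item itself contains the sequence '", "'.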


2 Comments

I thought of this and actually tried it, but it seemed like a slightly ugly solution and gave me the vibe that it might break things when dealing with lots of data, haha. Thank you though; if I can't find something better, I'll go with this.
If "coffee" were ", ", you'd split incorrectly. Admittedly that seems unlikely, though.
2
import ast

s = '"coffee", "water, hot"'

result = ast.literal_eval(f'[{s}]')

print(result)

5 Comments

Update: This doesn't work when the text is not surrounded by quotes ("). Gives invalid syntax.
@stackyname It's pretty easy to simply check for quotes before doing the eval. This is the most complete answer of them all in my opinion.
@user56700 thank you. And what do you suggest to do if there is no quotes?
@stackyname The same as always: Look up which format was used and then use a parser for that format. This time I actually didn't fully, as I'm hacking the brackets around it and guessing that it's otherwise appropriate for literal_eval, but I was displeased by the many previous answers doing imho worse hacks. Anyway, this question says "Each item is surrounded by quotes", so if you want to ask about a different format, I suggest asking another question.
@KellyBundy Yeah, you're right. It's a different topic. I just wanted to know if there was a solution using this approach. Thanks again.
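For the unquoted case raised in these comments, one possible sketch is to try ast.literal_eval first and fall back to a plain comma split when it fails. The helper name parse_items and the fallback behavior are assumptions for illustration, not something the answer prescribes:

```python
import ast


def parse_items(s):
    """Parse quoted items with literal_eval; fall back to a naive split (assumed behavior)."""
    try:
        # Wrapping in brackets makes the text look like a Python list literal.
        return list(ast.literal_eval(f'[{s}]'))
    except (SyntaxError, ValueError):
        # Unquoted text is not a valid literal, so split on commas instead.
        return [part.strip() for part in s.split(',')]


print(parse_items('"coffee", "water, hot"'))  # ['coffee', 'water, hot']
print(parse_items('coffee, water'))           # ['coffee', 'water']
```

Note that the fallback cannot distinguish separator commas from commas inside an item, which is why a different input format really needs its own parser.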
1

You can use re.findall

import re

s = '"coffee", "water, hot"'
re.findall('"(.*?)"', s) # ['coffee', 'water, hot']

1 Comment

It's not uncommon for strings to contain quotes, usually escaped like \", which you'd match.
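If escaped quotes are a concern, a pattern that treats a backslash plus the following character as part of the item may help. This is a sketch under the assumption that quotes are escaped as \" and nothing else needs special handling; find_quoted is a hypothetical name:

```python
import re

# Inside the quotes, accept either an ordinary character or a backslash escape.
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def find_quoted(s):
    """Return the quoted items in s, with backslash escapes removed."""
    items = QUOTED.findall(s)
    # Turn \" back into " (and, more generally, \x into x).
    return [re.sub(r'\\(.)', r'\1', item) for item in items]


print(find_quoted(r'"coffee", "water, \"hot\""'))  # ['coffee', 'water, "hot"']
```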
0

First, I defined a function del_quote to remove the quotes and surrounding spaces. Then I split the string using '",' as the separator, mapped del_quote over the result, and converted it to a list.

def del_quote(s):
    # Drop every double quote, then trim surrounding whitespace.
    return s.replace('"', '').strip()

x = '"coffee", "water, hot"'
result = list(map(del_quote, x.split('",')))
print(result)


0

You can almost use csv for this:

import csv
from io import StringIO
sio = StringIO()
sio.write('"coffee", "water, hot"')
sio.seek(0)
reader = csv.reader(sio)
print(next(reader))
# Prints ['coffee', ' "water', ' hot"']

The problem is that there is a space before the opening quote of "water, hot", so csv does not recognize it as a quoted field. If you replace '", "' with '","' first, then csv will work, and you will get ['coffee', 'water, hot'].
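Another option is the csv module's skipinitialspace parameter, which tells the reader to ignore whitespace immediately after a delimiter. A minimal sketch, passing the string in a list since csv.reader accepts any iterable of lines:

```python
import csv

s = '"coffee", "water, hot"'

# skipinitialspace=True makes the reader ignore the space after each comma,
# so the quote that follows is recognized as the start of a quoted field.
row = next(csv.reader([s], skipinitialspace=True))
print(row)  # ['coffee', 'water, hot']
```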


-1

How about:

string = '"coffee", "water, hot"'

stringList = string.split("'")

string = stringList[0]

print(string)
