I have a huge list in Python that looks like this:

('foo','bar','foo/bar','foo1','bar/1')

Each value above demonstrates the character variety that the list contains: alphanumeric plus slash. I need a way to turn that list into a list of tuples, like this:

(('foo','foo'),('bar','bar'),('foo/bar','foo/bar'),('foo1','foo1'),('bar/1','bar/1'))

So what better way to do this than a regex search and replace, right? (Correct me if I'm wrong.)

I am therefore trying to match anything between the quotes except for the commas, because technically, they are also between quotes. I used lookahead and lookbehind to match anything:

(?<=')(.*?)(?=')

But that only matches the values within the quotes and the commas. What I need is to match the value plus the quotes except the commas, and use a replacing regex to make the list look like the tuple above.

I can't do this by hand because the list is huge.

Any thoughts?

  • You say you have a huge list, then you show us a tuple, then you talk about parsing it with a regex, which implies that it's a str. Which is it? Commented Feb 21, 2013 at 20:07
  • OK, let me clear it up. It is what you see. I need to convert the first to the second. It's a list of str, but I need to convert it to a tuple of str. Commented Feb 21, 2013 at 20:08
  • Also, it seems like all you're trying to do is: tuple((element, element) for element in huge_list). Or, even more simple: tuple(zip(huge_list, huge_list)). Am I missing something? Commented Feb 21, 2013 at 20:08
  • And to convert a list to a tuple, you just call tuple. Commented Feb 21, 2013 at 20:09
  • @Robᵩ: It definitely isn't a duplicate of that meta question. And it isn't even really an instance of an XY question. The OP didn't ask "How can I fix this regex?"; he asked how to solve his actual problem, then showed what he tried (the regex). Maybe he tried the wrong thing, but that doesn't make it a bad question. Commented Feb 21, 2013 at 22:18

1 Answer

OK, you have a huge list of strings. You want a tuple, where for each element of the list, you have the pair (element, element).

That's exactly what zip does, except that it returns a list of such pairs in 2.x, or an iterator in 3.x. Either way, you can convert that to a tuple just by calling tuple. So:

tuple(zip(huge_list, huge_list))
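
As a quick check, here is that call run on the five sample values from the question. The name huge_list is only a stand-in for the real data, which is assumed to be an ordinary list of str:

```python
# Sample values from the question; the real list is assumed to be much longer.
huge_list = ['foo', 'bar', 'foo/bar', 'foo1', 'bar/1']

# zip pairs each element with itself; tuple() materializes the result
# (required in 3.x, where zip returns an iterator).
pairs = tuple(zip(huge_list, huge_list))

print(pairs)
# (('foo', 'foo'), ('bar', 'bar'), ('foo/bar', 'foo/bar'), ('foo1', 'foo1'), ('bar/1', 'bar/1'))
```

The slashes need no special handling, because nothing here ever looks inside the strings.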

More generally, if you want to transform a sequence element by element, you can use a comprehension or a generator expression. There are no "tuple comprehensions", but just passing a generator expression to the tuple function does the same thing. So:

tuple((element, element) for element in huge_list)
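
If it helps to see the two spellings side by side, the sketch below wraps the generator expression in a small helper and compares it with the zip version. The helper name pair_up and the sample values are made up for illustration:

```python
def pair_up(items):
    """Return a tuple of (item, item) pairs, one per element of items."""
    # The generator expression is consumed lazily by tuple(), so no
    # intermediate list is built.
    return tuple((item, item) for item in items)


sample = ['foo', 'bar/1']

print(pair_up(sample))
# (('foo', 'foo'), ('bar/1', 'bar/1'))

# Both spellings produce the same tuple.
assert pair_up(sample) == tuple(zip(sample, sample))
```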

Or, if you wanted a tuple of (s[0], s[1:]) pairs instead of (s, s) pairs:

tuple((element[0], element[1:]) for element in huge_list)
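
On the question's sample values, that variant would give something like the following. One caveat, assuming the list might contain empty strings: element[0] raises IndexError on '', while the slice element[:1] just returns ''.

```python
huge_list = ['foo', 'bar', 'foo/bar', 'foo1', 'bar/1']

# Split each string into its first character and the remainder.
head_tail = tuple((element[0], element[1:]) for element in huge_list)

print(head_tail)
# (('f', 'oo'), ('b', 'ar'), ('f', 'oo/bar'), ('f', 'oo1'), ('b', 'ar/1'))

# Slicing instead of indexing tolerates empty strings.
safe_head_tail = tuple((element[:1], element[1:]) for element in ['foo', ''])
print(safe_head_tail)
# (('f', 'oo'), ('', ''))
```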

And so on.

Meanwhile, I can't think of any situation where converting an object into its repr to run a regex transformation on it and re-parse it would be a good idea in Python. This isn't just a "Now they have two problems" issue; parsing the resulting string (and, even if you don't care about safety, figuring out how to deal with things where eval(repr(x)) != x) is going to be a harder problem than whatever you started with. So, if you ever spot yourself trying to make that work, take a step back.
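
To make the eval(repr(x)) != x point concrete, here is a small sketch of one case where the round trip fails. It uses ast.literal_eval rather than eval as the parser (my substitution, since it is the safer of the two), and the sample values are invented:

```python
import ast

# For a plain list of str, the repr happens to round-trip.
values = ['foo', 'bar/1']
assert ast.literal_eval(repr(values)) == values

# But not every object's repr is a parseable literal. float('nan') has
# the repr 'nan', so the list's repr is '[nan]', which cannot be read back.
tricky = [float('nan')]
try:
    ast.literal_eval(repr(tricky))
    round_trip_ok = True
except (ValueError, SyntaxError):
    round_trip_ok = False

print(repr(tricky), round_trip_ok)
# [nan] False
```

Working on the objects directly, as in the zip and generator-expression versions, sidesteps this whole class of problem.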
