3

I'm trying to write a regex for the following situation. I have a file containing hundreds of dictionaries stored as strings.

For example:

{'a':1'}
{{'a':1, 'b':2}{'c':3}}
{'a':4, 'b':6}

I read the file and removed the newlines. Now I'm trying to split the result based on a regex.

{'a':1'}{{'a':1, 'b':2}{'c':3}}{'a':4, 'b':6}

I tried re.split("({.*?})", str), but this doesn't work because the whole second dict isn't matched as one piece. How can I write a regex that matches every line and returns a list of dictionaries?

4
  • Where is this data coming from? Also, is that single quote after 1 intentional? Thanks. Commented Apr 6, 2016 at 3:31
  • Your input data are malformed: {{'a':1, 'b':2}{'c':3}} is not valid Python syntax. If it's a single dictionary with nested dictionaries, then it's missing keys and a comma; if you treat it as two separate dictionaries, then you have extra braces. Commented Apr 6, 2016 at 3:35
  • Any chance your file is JSON? Commented Apr 6, 2016 at 3:37
  • Each line is a JSON response. Every response is appended to a file. I'm trying to read the file and store each response in a dictionary, hence trying to split the string into dictionaries. Commented Apr 6, 2016 at 3:55

2 Answers

3

You could simply do:

(\{[^{}]+\})
# look for an opening {
# and anything that is not { or }
# as well as an ending }

In Python this would be:

import re
rx = r'(\{[^{}]+\})'
string = "{'a':1'}{{'a':1, 'b':2}{'c':3}}{'a':4, 'b':6}"
matches = re.findall(rx, string)
print(matches)
# ["{'a':1'}", "{'a':1, 'b':2}", "{'c':3}", "{'a':4, 'b':6}"]

See a demo on regex101.com.

0

Python's built-in re module cannot match arbitrarily nested structures by itself; you would have to do some looping or recursion separately.
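
As a rough illustration of the looping approach, here is a minimal sketch that tracks brace depth and cuts the string at every point where the depth returns to zero. The function name split_top_level is hypothetical, and the sketch assumes that braces never appear inside the string values themselves; if they can, a real parser is needed instead.

```python
def split_top_level(text):
    """Return each top-level {...} group in text, nested braces included.

    Assumption: no '{' or '}' occurs inside quoted values.
    """
    groups = []
    depth = 0      # how many braces are currently open
    start = None   # index where the current top-level group began
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                # Back at the top level: the group is complete.
                groups.append(text[start:i + 1])
    return groups


sample = "{'a':1'}{{'a':1, 'b':2}{'c':3}}{'a':4, 'b':6}"
groups = split_top_level(sample)
print(groups)
# ["{'a':1'}", "{{'a':1, 'b':2}{'c':3}}", "{'a':4, 'b':6}"]
```

Unlike a pattern that excludes braces, this keeps the second entry from the question in one piece. It only splits the text; each piece still has to be parsed (or rejected) afterwards.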

However, you commented above that each line is a JSON response. Why not use json.loads() on each line?

import json

with open('path_to_file', 'r') as f:
    data = [json.loads(line) for line in f]

data is now a list of dictionaries.
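
Note that json.loads() raises an exception on any line that is not valid JSON, which would abort the list comprehension. A possible variant, sketched here under assumptions, collects the lines that fail so they can be inspected separately. The helper name load_json_lines and the sample lines are hypothetical; the samples use double quotes because JSON requires them.

```python
import json


def load_json_lines(lines):
    """Parse each non-empty line as JSON; collect the lines that fail."""
    parsed, rejected = [], []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            parsed.append(json.loads(line))
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            rejected.append((number, line))
    return parsed, rejected


# Hypothetical sample input; the middle line is deliberately malformed.
sample_lines = ['{"a": 1}', '{{"a": 1, "b": 2}{"c": 3}}', '{"a": 4, "b": 6}']
parsed, rejected = load_json_lines(sample_lines)
print(parsed)
print(rejected)
```

Because a file object iterates line by line, the same function could be called with an open file in place of the list.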

2 Comments

json.loads() will fail because some of the dictionaries don't have the right structure. I'm trying to split the string into dictionaries before calling json.loads().
So it's not JSON then. Does the data have a well-defined structure that can be parsed? What is the expected output for your sample data?
