0

I have many strings like this:

"[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]"

Since I'm working with a dataframe, I need to convert them into JSON (or at least that's what the format looks like) so I can access and flatten the data. How can this be achieved?

EDIT: I realised that it's not JSON, but I still don't know how to convert it into a dictionary (or similar) so I can manipulate it.

3 Answers

2

You can use ast.literal_eval:

import ast
x = ast.literal_eval("[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]")
x[0]["name"]  # evaluates to 'Romance'

From the documentation:

Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.

This can be used for safely evaluating strings containing Python values from untrusted sources without the need to parse the values oneself. It is not capable of evaluating arbitrarily complex expressions, for example involving operators or indexing.
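As a rough illustration of the behaviour described in that excerpt, here is a minimal sketch. The helper name `parse_literal` is made up for this example, and catching only `ValueError` and `SyntaxError` is an assumption that covers the common failure cases, not every possible one.

```python
import ast


def parse_literal(text):
    """Return the parsed Python literal, or None if text is not a plain literal.

    Hypothetical helper for illustration only.
    """
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Raised for anything that is not a literal (calls, operators on
        # names, and so on) or that is not valid Python syntax.
        return None


# A string made only of literals parses into ordinary lists and dicts.
genres = parse_literal("[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]")
names = [genre["name"] for genre in genres]

# A function call is not a literal, so it is rejected instead of executed.
rejected = parse_literal("__import__('os').getcwd()")
```

Note that refusing to run code is not the same as being safe against every input: a very large or deeply nested string can still exhaust memory or the recursion limit.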

0

It looks like the data is almost JSON, but JSON requires double quotes around keys and string values, whereas this data uses single quotes. You can fix this by running:

data_string = "[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]"
json_string = data_string.replace("'", '"')

You now have a JSON string!

If you need to convert the string to Python structures, you can do the following:

import json

data = json.loads(json_string)
print(data[0]['id']) # 10749

2 Comments

This works in this case, but things other than single quotes could make a string non-JSON. For example, there might be a trailing comma after the last item in a sequence, which is invalid JSON.
There could also be single quotes inside the strings themselves, and this would convert them to double quotes as well.
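To make the second comment concrete, here is a small sketch with a made-up record whose name contains an apostrophe. The record is an assumption for illustration; it is not taken from the question's data.

```python
import ast
import json

# Hypothetical record: str() renders the name with double quotes because it
# contains an apostrophe, so raw mixes both quote styles.
raw = str([{"id": 1, "name": "Bridget Jones's Diary"}])

# Blind replacement also turns the apostrophe into a double quote, which ends
# the string early and leaves invalid JSON.
try:
    json.loads(raw.replace("'", '"'))
    replace_worked = True
except json.JSONDecodeError:
    replace_worked = False

# ast.literal_eval reads the original Python literal directly.
parsed = ast.literal_eval(raw)
```

Here the replace approach fails while `ast.literal_eval` recovers the original name unchanged.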
-1

Since this could be a repetitive task, it's probably a good idea to wrap it in a function.

import json  # Import json module to work with json data
import ast


data = "[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]"


def clean_data_for_json_loads(input_data):
    """Parse a Python-literal string and return it as a JSON string."""
    # literal_eval accepts only Python literals; it does not run arbitrary code
    evaluated_data = ast.literal_eval(input_data)
    return json.dumps(evaluated_data)

json_string = clean_data_for_json_loads(data)


# json.loads parses JSON from a string (the "s" stands for string);
# json.load reads from a file object instead
json_data = json.loads(json_string)
print(json_data)
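If the end goal is the flattening mentioned in the question, the parsed values can be turned into one row per genre. This is only a sketch: the `column` list stands in for a dataframe column, and the key names `row`, `genre_id` and `genre_name` are invented for the example.

```python
import ast

# Hypothetical column values, one string per dataframe row.
column = [
    "[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]",
    "[{'id': 18, 'name': 'Drama'}]",
    "[]",
]

rows = []
for row_index, cell in enumerate(column):
    # Each cell parses to a list of dicts; emit one flat record per dict.
    for genre in ast.literal_eval(cell):
        rows.append(
            {"row": row_index, "genre_id": genre["id"], "genre_name": genre["name"]}
        )
```

A list of flat dicts like `rows` can be passed to the `pandas.DataFrame` constructor to get a table back.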

2 Comments

This won't work, the data isn't in JSON format. Don't you see the single quotes that should be double quotes?
Thanks for the feedback. I had left part of it out; it's updated now.
