Help, I have a JSON file and I can't get it to load into a pandas DataFrame in Python. The first question should probably be: is this actually a JSON file? It is the back-end data for a chart rendered on a web page, which I pulled from the Chrome network inspector.

It seems there is code in front of the data that is not JSON, or not a table, so it is messing up the import.

http://pastebin.com/ne4RRrgP

Can you please help?

The code below loads the file into Python:

import json
from pprint import pprint
with open('data2.json') as data_file:
    data = json.load(data_file)

and

pprint(data)

does print the data, but I can't then convert it to a pandas DataFrame.

Edit:

OK, this must be a JavaScript file that I took to be JSON.

  • A JSON validator (http://jsonlint.com/) says that this is not valid JSON. Also, why do you want to convert that huge HTML-like string and JavaScript to a DataFrame? Commented May 9, 2016 at 8:43
  • Is it possible to add the URL for this data to the question? Commented May 9, 2016 at 8:47
  • Sorry, that's not possible, as it's a login site and I don't want to share the password. A chart is generated from the date and price data. I searched for how the chart is generated and came to this table. I'm assuming it's JSON, but it may not be. The thing is, there is possibly extra code at the start and end that I need to remove. I found this file by inspecting the data loaded in Chrome when the chart renders. Commented May 9, 2016 at 9:20

2 Answers

Just use pandas.read_json, which parses JSON straight into a DataFrame and accepts a remote URL as well as a local filename:

import pandas as pd
pandas_dataframe = pd.read_json('data2.json')

Hope that helps.

3 Comments

Unfortunately it doesn't work. Check it: print pd.read_json('http://pastebin.com/raw/ne4RRrgP')
@jezrael Use the "raw" link -- pd.read_json('http://pastebin.com/raw/ne4RRrgP')
For me it returns ValueError: Unrecognized escape sequence when decoding 'string'

The data was not formatted correctly. I had to edit the text by removing the header and the backslashes, and then it loads correctly into pandas. It now looks like an AJAX response with JSON data inside.

{"status": "success", "chart": "\n\n\n\n\n\n\n\n\n\n    \n\n\n\n\n\n\n\n\n<div id=\"id-freight-cash-prices-chart\" style=\"min-height: 480px\"></div>\n\n\n    <script>\n        $(function () {\n\n            var series = [{\"type\": \"line\", \"data\": [{\"date\": 1262563200000, \"last\": \"319.00000\"}, {\"date\": 1262649600000, \"last\": \"318.00000\"}, {\"date\": 1262736000000, \"last\": \"320.00000\"}, {\"date\": 1262822400000, \"last\": \"321.00000\"},

Thanks for all your help
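
For anyone who would rather not edit the file by hand, here is a rough sketch of doing the same thing in code. It assumes the response has the shape shown above: valid outer JSON with a "chart" string, and inside it a JavaScript line starting with `var series =` followed by a JSON array. The function name `chart_series_to_frame` and the shortened sample payload are made up for illustration; the real file may need a different marker or extra cleanup.

```python
import json

import pandas as pd


def chart_series_to_frame(response_text, marker="var series ="):
    """Pull the embedded series array out of the AJAX response (hypothetical helper)."""
    # The outer response is JSON; decoding it also unescapes the \" and \n
    # sequences inside the "chart" string.
    payload = json.loads(response_text)
    chart = payload["chart"]

    # Find the first '[' after the marker and decode exactly one JSON value
    # from there; raw_decode ignores whatever JavaScript follows the array.
    start = chart.index("[", chart.index(marker))
    series, _ = json.JSONDecoder().raw_decode(chart, start)

    # Assumption: the first series holds the points we want.
    df = pd.DataFrame(series[0]["data"])
    df["date"] = pd.to_datetime(df["date"], unit="ms")  # epoch milliseconds
    df["last"] = df["last"].astype(float)               # prices arrive as strings
    return df


# Shortened stand-in for the real response, built only to exercise the function.
sample = json.dumps({
    "status": "success",
    "chart": '<script>\n var series = [{"type": "line", "data": ['
             '{"date": 1262563200000, "last": "319.00000"}, '
             '{"date": 1262649600000, "last": "318.00000"}]}];\n</script>',
})

df = chart_series_to_frame(sample)
print(df)
```

With the real file, something like `chart_series_to_frame(open('data2.json').read())` should be the equivalent call, provided the file is the untouched response and not the hand-edited version.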
