Does anyone know an effective way to read this JSON file one line at a time in Python?

I want to write each line into a relational database, but json.loads() raises an "Extra data" error unless I pass it each line separately.

Thanks

{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Seoul (Incheon)","departure.timezone":"Asia/Seoul","departure.iata":"ICN","departure.icao":"RKSI","departure.terminal":"1","departure.gate":"38","departure.scheduled":"2023-04-14T12:00:00+00:00","departure.estimated":"2023-04-14T12:00:00+00:00","arrival.airport":"Fukuoka","arrival.timezone":"Asia/Tokyo","arrival.iata":"FUK","arrival.icao":"RJFF","arrival.terminal":"I","arrival.scheduled":"2023-04-14T13:20:00+00:00","arrival.estimated":"2023-04-14T13:20:00+00:00","airline":"Korean Air","flight.number":"5077","flight.iata":"KE5077","flight.icao":"KAL5077","flight.codeshared.airline_name":"jin air","flight.codeshared.airline_iata":"lj","flight.codeshared.airline_icao":"jna","flight.codeshared.flight_number":"223","flight.codeshared.flight_iata":"lj223","flight.codeshared.flight_icao":"jna223","destination":"Tokyo","country":"Japan","arrival_airport":"Fukuoka","schedule_arrive":"2023-04-14T13:20:00+00:00","temperature":16,"description":1,"wind_speed":24,"wind_degree":240,"humidity":72,"feelslike":16,"visibility":10,"cloud_cover":25}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Seoul (Incheon)","departure.timezone":"Asia/Seoul","departure.iata":"ICN","departure.icao":"RKSI","departure.terminal":"1","departure.gate":"E01","departure.scheduled":"2023-04-14T12:00:00+00:00","departure.estimated":"2023-04-14T12:00:00+00:00","arrival.airport":"Taiwan Taoyuan International (Chiang Kai Shek International)","arrival.timezone":"Asia/Taipei","arrival.iata":"TPE","arrival.icao":"RCTP","arrival.terminal":"2","arrival.gate":"C8","arrival.baggage":"7B","arrival.scheduled":"2023-04-14T13:35:00+00:00","arrival.estimated":"2023-04-14T13:35:00+00:00","airline":"Thai Airways International","flight.number":"6397","flight.iata":"TG6397","flight.icao":"THA6397","flight.codeshared.airline_name":"eva air","flight.codeshared.airline_iata":"br","flight.codeshared.airline_icao":"eva","flight.codeshared.flight_number":"169","flight.codeshared.flight_iata":"br169","flight.codeshared.flight_icao":"eva169","destination":"Taipei","country":"Taiwan","arrival_airport":"Taiwan Taoyuan International (Chiang Kai Shek International)","schedule_arrive":"2023-04-14T13:35:00+00:00","temperature":22,"description":2,"wind_speed":6,"wind_degree":310,"humidity":88,"feelslike":25,"visibility":6,"cloud_cover":50}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Cologne/bonn","departure.timezone":"Europe/Berlin","departure.iata":"CGN","departure.icao":"EDDK","departure.delay":22,"departure.scheduled":"2023-04-14T03:55:00+00:00","departure.estimated":"2023-04-14T03:55:00+00:00","arrival.airport":"Vienna International","arrival.timezone":"Europe/Vienna","arrival.iata":"VIE","arrival.icao":"LOWW","arrival.scheduled":"2023-04-14T05:23:00+00:00","arrival.estimated":"2023-04-14T05:23:00+00:00","airline":"UPS Airlines","flight.number":"274","flight.iata":"5X274","flight.icao":"UPS274","destination":"Vienna","country":"Austria","arrival_airport":"Vienna International","schedule_arrive":"2023-04-14T05:23:00+00:00","temperature":7,"description":3,"wind_speed":17,"wind_degree":330,"humidity":87,"feelslike":5,"visibility":10,"cloud_cover":75}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Seoul (Incheon)","departure.timezone":"Asia/Seoul","departure.iata":"ICN","departure.icao":"RKSI","departure.terminal":"1","departure.gate":"E01","departure.scheduled":"2023-04-14T12:00:00+00:00","departure.estimated":"2023-04-14T12:00:00+00:00","arrival.airport":"Taiwan Taoyuan International (Chiang Kai Shek International)","arrival.timezone":"Asia/Taipei","arrival.iata":"TPE","arrival.icao":"RCTP","arrival.terminal":"2","arrival.gate":"C8","arrival.baggage":"7B","arrival.scheduled":"2023-04-14T13:35:00+00:00","arrival.estimated":"2023-04-14T13:35:00+00:00","airline":"Thai Airways International","flight.number":"6397","flight.iata":"TG6397","flight.icao":"THA6397","flight.codeshared.airline_name":"eva air","flight.codeshared.airline_iata":"br","flight.codeshared.airline_icao":"eva","flight.codeshared.flight_number":"169","flight.codeshared.flight_iata":"br169","flight.codeshared.flight_icao":"eva169","destination":"Taipei","country":"Taiwan","arrival_airport":"Taiwan Taoyuan International (Chiang Kai Shek International)","schedule_arrive":"2023-04-14T13:35:00+00:00","temperature":22,"description":4,"wind_speed":6,"wind_degree":310,"humidity":88,"feelslike":25,"visibility":6,"cloud_cover":50}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Guangzhou Baiyun International","departure.timezone":"Asia/Shanghai","departure.iata":"CAN","departure.icao":"ZGGG","departure.terminal":"2","departure.scheduled":"2023-04-14T10:10:00+00:00","departure.estimated":"2023-04-14T10:10:00+00:00","arrival.airport":"Xiamen","arrival.timezone":"Asia/Shanghai","arrival.iata":"XMN","arrival.icao":"ZSAM","arrival.terminal":"3","arrival.scheduled":"2023-04-14T11:40:00+00:00","arrival.estimated":"2023-04-14T11:40:00+00:00","airline":"Hebei Airlines","flight.number":"8312","flight.iata":"NS8312","flight.icao":"HBH8312","flight.codeshared.airline_name":"xiamen airlines","flight.codeshared.airline_iata":"mf","flight.codeshared.airline_icao":"cxa","flight.codeshared.flight_number":"8306","flight.codeshared.flight_iata":"mf8306","flight.codeshared.flight_icao":"cxa8306","destination":"Shanghai","country":"China","arrival_airport":"Xiamen","schedule_arrive":"2023-04-14T11:40:00+00:00","temperature":16,"description":5,"wind_speed":4,"wind_degree":170,"humidity":94,"feelslike":16,"visibility":10,"cloud_cover":75}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Hangzhou","departure.timezone":"Asia/Shanghai","departure.iata":"HGH","departure.icao":"ZSHC","departure.terminal":"3","departure.scheduled":"2023-04-14T09:20:00+00:00","departure.estimated":"2023-04-14T09:20:00+00:00","arrival.airport":"Nanning","arrival.timezone":"Asia/Shanghai","arrival.iata":"NNG","arrival.icao":"ZGNN","arrival.terminal":"T2","arrival.scheduled":"2023-04-14T12:05:00+00:00","arrival.estimated":"2023-04-14T12:05:00+00:00","airline":"Loong Air","flight.number":"3479","flight.iata":"GJ3479","flight.icao":"CDC3479","flight.codeshared.airline_name":"xiamen airlines","flight.codeshared.airline_iata":"mf","flight.codeshared.airline_icao":"cxa","flight.codeshared.flight_number":"8351","flight.codeshared.flight_iata":"mf8351","flight.codeshared.flight_icao":"cxa8351","destination":"Shanghai","country":"China","arrival_airport":"Nanning","schedule_arrive":"2023-04-14T12:05:00+00:00","temperature":16,"description":6,"wind_speed":7,"wind_degree":210,"humidity":94,"feelslike":16,"visibility":10,"cloud_cover":75}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Hangzhou","departure.timezone":"Asia/Shanghai","departure.iata":"HGH","departure.icao":"ZSHC","departure.terminal":"3","departure.scheduled":"2023-04-14T09:20:00+00:00","departure.estimated":"2023-04-14T09:20:00+00:00","arrival.airport":"Nanning","arrival.timezone":"Asia/Shanghai","arrival.iata":"NNG","arrival.icao":"ZGNN","arrival.terminal":"T2","arrival.scheduled":"2023-04-14T12:05:00+00:00","arrival.estimated":"2023-04-14T12:05:00+00:00","airline":"Hebei Airlines","flight.number":"8353","flight.iata":"NS8353","flight.icao":"HBH8353","flight.codeshared.airline_name":"xiamen airlines","flight.codeshared.airline_iata":"mf","flight.codeshared.airline_icao":"cxa","flight.codeshared.flight_number":"8351","flight.codeshared.flight_iata":"mf8351","flight.codeshared.flight_icao":"cxa8351","destination":"Shanghai","country":"China","arrival_airport":"Nanning","schedule_arrive":"2023-04-14T12:05:00+00:00","temperature":16,"description":7,"wind_speed":7,"wind_degree":210,"humidity":94,"feelslike":16,"visibility":10,"cloud_cover":75}
{"flight_date":"2023-04-14","flight_status":"scheduled","departure.airport":"Yancheng","departure.timezone":"Asia/Shanghai","departure.iata":"YNZ","departure.icao":"ZSYN","departure.scheduled":"2023-04-14T11:55:00+00:00","departure.estimated":"2023-04-14T11:55:00+00:00","arrival.airport":"Changsha","arrival.timezone":"Asia/Shanghai","arrival.iata":"CSX","arrival.icao":"ZGHA","arrival.terminal":"2","arrival.scheduled":"2023-04-14T14:00:00+00:00","arrival.estimated":"2023-04-14T14:00:00+00:00","airline":"Chongqing Airlines","flight.number":"2005","flight.iata":"OQ2005","flight.icao":"CQN2005","destination":"Shanghai","country":"China","arrival_airport":"Changsha","schedule_arrive":"2023-04-14T14:00:00+00:00","temperature":16,"description":8,"wind_speed":7,"wind_degree":210,"humidity":94,"feelslike":16,"visibility":10,"cloud_cover":75}

1 Answer

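You could open() the file and read it back line by line. Each line is a complete JSON document on its own, so json.loads() works when given one line at a time: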

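import json

data = []
with open(path) as fh:  # where path is your JSON file
    for line in fh:     # file objects are iterable, one line per iteration
        if line.strip():  # skip blank lines, which json.loads() rejects
            data.append(json.loads(line))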
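Since the goal is to get each line into a relational database, here is a minimal sketch of one way to do that with the standard-library sqlite3 module. The function name `load_ndjson_into_sqlite`, the table name `flights`, and the idea of passing an explicit column list are all assumptions for illustration, not something from the question. Keys missing from a record (for example `departure.gate` on some lines) are stored as NULL. The column names are interpolated into the SQL text, so they should come from your own code, not from untrusted input.

```python
import json
import sqlite3


def load_ndjson_into_sqlite(lines, conn, columns):
    """Insert one row per non-blank JSON line; keys missing from a record become NULL."""
    # Quote the names because keys such as departure.iata contain dots.
    quoted = ", ".join(f'"{c}"' for c in columns)
    marks = ", ".join("?" for _ in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS flights ({quoted})")
    # Parse lazily, so only one line is held in memory at a time.
    records = (json.loads(line) for line in lines if line.strip())
    rows = (tuple(record.get(c) for c in columns) for record in records)
    conn.executemany(f"INSERT INTO flights ({quoted}) VALUES ({marks})", rows)
    conn.commit()
```

Usage would look something like `load_ndjson_into_sqlite(fh, sqlite3.connect("flights.db"), ["flight_date", "departure.iata", "arrival.iata"])` inside the `with open(path) as fh:` block, where the database filename and the column selection are again placeholders.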

Alternatively, you might have the data source emit valid JSON by wrapping the lines in [] and adding commas between them, which turns the whole file into a single list (or pick a friendlier format directly if you control the data source):

>>> json.loads("""{"foo":1}\n{"bar":2}""")      # current
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.9/json/decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 10)
>>> json.loads("""[{"foo":1},\n{"bar":2}]""")   # JSON list
[{'foo': 1}, {'bar': 2}]
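If the data source cannot be changed, the same wrapping can be done on the reading side. The following is a rough sketch, assuming every non-blank line is one complete JSON object (as in the question's file); the helper name `ndjson_to_json_array` is made up for illustration. Note that this builds the whole document in memory, unlike the line-by-line loop.

```python
import json


def ndjson_to_json_array(text):
    """Join the non-blank lines with commas and wrap them in [] to form one JSON list."""
    # Split on "\n" only; JSON strings never contain a raw newline character.
    lines = [line for line in text.split("\n") if line.strip()]
    return "[" + ",\n".join(lines) + "]"


records = json.loads(ndjson_to_json_array('{"foo":1}\n{"bar":2}\n'))
print(records)  # [{'foo': 1}, {'bar': 2}]
```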

2 Comments

Newline-delimited JSON is a pretty standard format though, and it makes memory-efficient parsing trivial.
@juanpa.arrivillaga Oh, definitely. Still, they might find they can just export to a format directly compatible with their database (even just a block of SQL statements) if they control the data source.
