1

I need to write a string from HTTP into CSV file. My columns have to be: LATITUDE,LONGITUDE, OSM_ID, HIGHWAY, UPDATED_AT

This is a sample of the output of HTTP link starting from top:

{
  "datetime": "2018-06-08T08:26:09.375Z",
  "success": true,
  "bbox": {
    "xmin": "12.335513",
    "ymin": "42.035682",
    "xmax": "12.758896",
    "ymax": "42.050826"
  },
  "data": [
    {
      "aggregate_id": 30201274,
      "ppe": 0.316954620298806,
      "geom": {
        "type": "Point",
        "coordinates": [
          12.532972800901,
          42.045435384225
        ]
      },
      "osm_id": "37015042",
      "highway": "motorway",
      "updated_at": "2018-01-20T03:27:11.047Z"
    },
    {
      "aggregate_id": 30201275,
      "ppe": 0.318124963244448,
      "geom": {
        "type": "Point",
        "coordinates": [
          12.5329908742,
          42.045615145535
        ]
      },
      "osm_id": "37015042",
      "highway": "motorway",
      "updated_at": "2018-01-20T03:27:11.047Z"
    },
    {
      "aggregate_id": 30201276,
      "ppe": 0.204792151096739,
      "geom": {
        "type": "Point",
        "coordinates": [
          12.533008947499,
          42.045794906844
        ]
      },
      "osm_id": "37015042",
      "highway": "motorway",
      "updated_at": "2018-01-20T03:27:11.047Z"
    },
    {
      "aggregate_id": 30201277,
      "ppe": 0.194797261691664,
      "geom": {
        "type": "Point",
        "coordinates": [
          12.533030586679,
          42.045974206816
        ]
      },
      "osm_id": "37015042",
      "highway": "motorway",
      "updated_at": "2018-01-20T03:27:11.047Z"
    }
  ]
}

Each row is separated by ','.

I wrote this code

import pandas as pd
import csv
import urllib.request

from urllib.request import urlopen

CSV_URL = 'http://www.smartroadsense.it/bb/12.335513/42.035682/12.758896/42.050826'

request = urllib.request.Request(CSV_URL)
response = urllib.request.urlopen(request)
response.read().decode('utf-8')

#write into csv
colNames = ["longitude","latitude","ppe","osm_id","highway","updated_at"]
data = pd.read_csv(CSV_URL, names=colNames, sep=',')

The problem is how to split the string from http into rows. Someone can help me?

3
  • The question has nothing to do with machine-learning - kindly do not spam the tag (edited & removed). Commented Jun 8, 2018 at 10:55
  • It has nothing to do with requests either (tag removed). Commented Jun 8, 2018 at 11:10
  • Your source string's format is not CSV, it's JSON. Use json.load() to parse it into a Python dict, then build your csv from this dict (you don't need panda for this, the stdlib's csv module will be enough). Commented Jun 8, 2018 at 11:13

4 Answers 4

2

Here's a simple implementation using pandas and requests:

import pandas as pd
import requests

url = 'http://www.smartroadsense.it/bb/12.335513/42.035682/12.758896/42.050826'
response = requests.get(url).json()

df = pd.DataFrame(response['data'])

You can extract the longitude and latitude from the geom column like this:

df['longitude'] = df.apply(lambda row: dict(row['geom'])['coordinates'][0], axis=1)
df['latitude'] = df.apply(lambda row: dict(row['geom'])['coordinates'][1], axis=1)

Finally, save the required columns to a csv file like so:

df[['latitude', 'longitude', 'osm_id', 'highway', 'updated_at']].to_csv('output.csv', index=False)
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks for your time! I have another question: could you explain me the following instruction, please? 'lambda row: dict(row['geom'])['coordinates'][0]', in particular I didn't understand how the program can read each row.
the apply function (with axis=1) applies some transformation to all the rows of the DataFrame.
1

Here is another way to do it :

import pandas as pd
import urllib.request
from urllib.request import urlopen
CSV_URL = "http://www.smartroadsense.it/bb/12.335513/42.035682/12.758896/42.050826"

request = urllib.request.Request(CSV_URL)
response = urllib.request.urlopen(request)
test = json.loads(response.read())["data"]

def transform_row(x):
  x["longitude"] = x["geom"]["coordinates"][0]
  x["latitude"] = x["geom"]["coordinates"][1]
  del x["geom"]
  return x

res = list(map(transform_row, test))
pd.DataFrame(res)[['latitude', 'longitude', 'osm_id', 'highway', 'updated_at']].to_csv('output.csv', index=False)

Comments

0

A straightforward approach.

  1. Fetch the data
  2. Order the data
  3. Export to CSV
import csv
import requests


def main():
    url = 'http://www.smartroadsense.it/bb/12.335513/42.035682/12.758896/42.050826'
    data = fetch_data(url)
    rows = [row(r) for r in data]
    fields = ['LATITUDE', 'LONGITUDE', 'OSM_ID', 'HIGHWAY', 'UPDATED_AT']
    with open('output.csv', "w") as output:
        writer = csv.writer(output, lineterminator='\n')
        writer.writerows([fields])
        writer.writerows(rows)


def row(data):
    r = list()
    r.append(data['geom']['coordinates'][0])  # LATITUDE
    r.append(data['geom']['coordinates'][1])  # LONGITUDE
    r.append(data['osm_id'])  # OSM_ID
    r.append(data['highway'])  # HIGHWAY
    r.append(data['updated_at'])  # UPDATED_AT
    return r


def fetch_data(url):
    response = requests.get(url).json()
    return response['data']

Comments

0

Adding to other answers, here is another way of doing it with csv.writer:

from requests import get
from csv import writer

CSV_URL = 'http://www.smartroadsense.it/bb/12.335513/42.035682/12.758896/42.050826'

response = get(CSV_URL).json()

headers = ['LATITUDE', 'LONGITUDE', 'OSM_ID', 'HIGHWAY', 'UPDATED_AT']
columns = ["osm_id", "highway", "updated_at"]

with open('output.csv', 'w') as out:
    csv_writer = writer(out)
    csv_writer.writerow(headers)
    for row in response['data']:
        items = row['geom']['coordinates'] + [row[col] for col in columns]
        csv_writer.writerow(items)

print(open('output.csv').read())

Which outputs this csv file:

LATITUDE,LONGITUDE,OSM_ID,HIGHWAY,UPDATED_AT
12.532972800901,42.045435384225,37015042,motorway,2018-01-20T03:27:11.047Z
12.5329908742,42.045615145535,37015042,motorway,2018-01-20T03:27:11.047Z
12.533008947499,42.045794906844,37015042,motorway,2018-01-20T03:27:11.047Z
12.533030586679,42.045974206816,37015042,motorway,2018-01-20T03:27:11.047Z
...

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.