Parsing .DAT file with Python

Question

I need to convert a .dat file that's in a specific format into a .csv file.

The .dat file has multiple rows with a repeating structure. Each record is enclosed in braces and its fields are tagged. Below is a sample record; the same structure repeats throughout the file:

{"name":"ABSDSDSRF","ID":"AFJDKGFGHF","lat":37,"lng":-122,"type":0,"HAC":5,"verticalAccuracy":4,"course":266.8359375,"area":"san_francisco"}

Can anyone provide a starting point for the script?

The problem is that it is in a .DAT file and I don't know how to bring it in as a JSON file. Can you provide a starting script?
– Gonzalo68, Commented Aug 13, 2015 at 18:36
In Notepad it does. I am using Wing as my environment, though. :)
– Gonzalo68, Commented Aug 13, 2015 at 18:38

Cody Bouche · Accepted Answer · 2015-08-13 19:17:23Z

3

This will create a CSV, assuming each line in your .DAT file is JSON. Just order the header list to your liking.

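import csv, json

header = ['ID', 'name', 'type', 'area', 'HAC', 'verticalAccuracy', 'course', 'lat', 'lng']

# Python 3: the csv module wants a text-mode file opened with newline=''
# (the Python 2 equivalent was mode 'wb').
with open('file.DAT') as datfile:
    with open('output.csv', 'w', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=header)
        writer.writeheader()
        for line in datfile:
            # Each line holds one JSON object; all of its keys must be in header.
            writer.writerow(json.loads(line))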

answered Aug 13, 2015 at 19:17

Cody Bouche


kamy22 · Accepted Answer · 2015-08-13 18:41:51Z

2

Your row is in JSON format, so you can use:

import json
data = json.loads('{"name":"ABSDSDSRF","ID":"AFJDKGFGHF","lat":37,"lng":-122,"type":0,"HAC":5,"verticalAccuracy":4,"course":266.8359375,"area":"san_francisco"}')

print(data.get('name'))
print(data.get('ID'))

This is only a starting point. You have to iterate over the whole .dat file and, at the end, write an exporter to save the data to the CSV file.
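The two remaining steps (iterating over the file and exporting to CSV) could be sketched roughly as follows. This is a minimal sketch, assuming Python 3 and one JSON object per line; the function name `dat_lines_to_csv` is made up for illustration, and the sample reuses the row from the question. It works on any iterable of lines, so an open file object can be passed in place of the list.

```python
import csv
import io
import json


def dat_lines_to_csv(lines, out):
    """Write one CSV row per non-empty JSON line; return the number of rows."""
    # Skip blank lines so a trailing newline in the file does not break parsing.
    records = [json.loads(line) for line in lines if line.strip()]

    # Build the header from the keys, in order of first appearance.
    header = []
    for record in records:
        for key in record:
            if key not in header:
                header.append(key)

    writer = csv.DictWriter(out, fieldnames=header)
    writer.writeheader()
    writer.writerows(records)
    return len(records)


# The sample row from the question, followed by a blank line.
sample = [
    '{"name":"ABSDSDSRF","ID":"AFJDKGFGHF","lat":37,"lng":-122,"type":0,'
    '"HAC":5,"verticalAccuracy":4,"course":266.8359375,"area":"san_francisco"}\n',
    '\n',
]

buffer = io.StringIO()
row_count = dat_lines_to_csv(sample, buffer)
csv_text = buffer.getvalue()
```

With a real file, something like `with open('file.dat') as src, open('output.csv', 'w', newline='') as dst: dat_lines_to_csv(src, dst)` should do the same job; the file names here are placeholders.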

answered Aug 13, 2015 at 18:41

kamy22


1 Comment

Gonzalo68 Over a year ago

So I have a .dat file that contains the information. I get a ValueError: Extra data: line 2 column 1 - line 45670 column 77 (char 216 - 11091968) when putting it into Python.

wwii · Accepted Answer · 2015-08-13 19:22:06Z

1

Use a regex to find all of the data items. Use ast.literal_eval to convert each data item into a dictionary. Collect the items in a list.

import re, ast
result = []
s = '''{"name":"ABSDSDSRF","ID":"AFJDKGFGHF","lat":37,"lng":-122,"type":0,"HAC":5,"verticalAccuracy":4,"course":266.8359375,"area":"san_francisco"}'''

item = re.compile(r'{[^}]*?}')
for match in item.finditer(s):
    d = ast.literal_eval(match.group())
    result.append(d)

If each data item is on a separate line in the file, you don't need the regex; you can just iterate over the file.

with open('file.dat') as f:
    for line in f:
        line = line.strip()
        line = ast.literal_eval(line)
        result.append(line)
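Two caveats apply to the approach above: the pattern `{[^}]*?}` stops at the first closing brace, so it would cut off a record containing a nested object, and `ast.literal_eval` rejects the JSON literals `true`, `false`, and `null`. Neither occurs in the sample row. If they might occur in the real file, one alternative is the standard library's `json.JSONDecoder.raw_decode`, which parses one object at a time from a given offset. The sketch below is only an illustration; the helper name `iter_json_objects` and the test string are made up.

```python
import json


def iter_json_objects(text):
    """Yield each JSON value found back to back in text, whatever separates them."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # Returns the parsed value and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Hypothetical input: a nested object, a newline, then two objects on one line.
text = '{"a": {"b": 1}}\n{"a": null} {"a": 3}'
objects = list(iter_json_objects(text))
```

Because the decoder tracks its own position, this does not depend on each record sitting on its own line.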

edited Aug 13, 2015 at 19:22

answered Aug 13, 2015 at 19:13

wwii


daniel kullmann · Accepted Answer · 2015-08-13 18:40:32Z

0

Use json.load:

import json
with open(filename) as fh:
    data = json.load(fh)
    ...

answered Aug 13, 2015 at 18:40

daniel kullmann


1 Comment

Gonzalo68 Over a year ago

I get this error: ValueError: Extra data: line 2 column 1 - line 45670 column 77 (char 216 - 11091968)
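That error is what `json.load` and `json.loads` report when the input holds more than one JSON document: the parser finishes the first object at the end of line 1 and then finds more text starting at line 2, column 1. A small reproduction, assuming Python 3 and using a shortened, made-up two-row string in place of the real file, together with the line-by-line workaround:

```python
import json

# Two JSON objects separated by a newline, like consecutive rows of the .dat file.
two_rows = '{"lat":37,"lng":-122}\n{"lat":37,"lng":-122}\n'

# Parsing the whole text at once fails after the first object.
try:
    json.loads(two_rows)
    error_message = None
except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
    error_message = str(exc)

# Parsing each non-empty line separately succeeds.
rows = [json.loads(line) for line in two_rows.splitlines() if line.strip()]
```

So the file as a whole is not a single JSON document, but each of its lines appears to be one.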
