I need to convert a .dat file that's in a specific format into a .csv file.

The .dat file has multiple rows with a repeating structure. Each record is enclosed in braces and consists of tagged (key/value) fields. Below is a sample record; the same layout repeats throughout the data file:

{"name":"ABSDSDSRF","ID":"AFJDKGFGHF","lat":37,"lng":-122,"type":0,"HAC":5,"verticalAccuracy":4,"course":266.8359375,"area":"san_francisco"}

Can anyone provide a starting point for the script?

  • That's JSON. You can parse it using the json module. Commented Aug 13, 2015 at 18:34
  • The problem is that it is in a .DAT file and I don't know how to bring it in as a JSON file. Can you provide a starting script? Commented Aug 13, 2015 at 18:36
  • If you open the .dat file in Sublime, does it show raw JSON? Commented Aug 13, 2015 at 18:37
  • In Notepad it does. I am using Wing as my environment, though. :) Commented Aug 13, 2015 at 18:38
  • The file extension really doesn't matter for anything. Commented Aug 13, 2015 at 18:42

4 Answers

3

This creates a CSV file, assuming each line in your .DAT file is a JSON object. Order the header list to your liking:

import csv, json

header = ['ID', 'name', 'type', 'area', 'HAC', 'verticalAccuracy', 'course', 'lat', 'lng']

with open('file.DAT') as datfile:
    with open('output.csv', 'wb') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=header)
        writer.writeheader()
        for line in datfile:
            writer.writerow(json.loads(line))
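
On Python 3, the csv module expects a text-mode file opened with newline='' rather than 'wb'. Below is a minimal sketch of the same approach under that assumption. The function name dat_to_csv and the blank-line skipping are illustrative additions, not part of the answer above.

```python
import csv
import json

HEADER = ['ID', 'name', 'type', 'area', 'HAC', 'verticalAccuracy', 'course', 'lat', 'lng']

def dat_to_csv(dat_path, csv_path, header=HEADER):
    """Write one CSV row per JSON line in dat_path; return the row count."""
    count = 0
    with open(dat_path) as datfile, open(csv_path, 'w', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=header)
        writer.writeheader()
        for line in datfile:
            line = line.strip()
            if not line:  # skip blank lines (assumed to be harmless padding)
                continue
            writer.writerow(json.loads(line))
            count += 1
    return count
```

Calling dat_to_csv('file.DAT', 'output.csv') would mirror the file names used above. A record containing a key that is not in the header makes DictWriter raise a ValueError by default.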

2

Each row is in JSON format, so you can use:

import json
data = json.loads('{"name":"ABSDSDSRF","ID":"AFJDKGFGHF","lat":37,"lng":-122,"type":0,"HAC":5,"verticalAccuracy":4,"course":266.8359375,"area":"san_francisco"}')

# Parenthesized print works on both Python 2 and Python 3
print(data.get('name'))
print(data.get('ID'))

This is only a starting point. You still have to iterate over the whole .dat file and, at the end, write an exporter that saves the data to the CSV file.
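
One way the iteration and export steps could look, as a sketch on Python 3 that assumes one JSON object per line. The helper names load_records and export_csv are hypothetical, and the header is derived from the keys found in the data rather than fixed in advance.

```python
import csv
import json

def load_records(path):
    """Parse every non-empty line of the file as a JSON object."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

def export_csv(records, csv_path):
    """Write the records to csv_path, using keys in first-seen order as columns."""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(csv_path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)  # missing keys are written as empty cells
    return fieldnames
```

Collecting all records first keeps the sketch short, at the cost of holding the whole file in memory.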

1 Comment

So I have a .dat file that contains the information. I get a ValueError: Extra data: line 2 column 1 - line 45670 column 77 (char 216 - 11091968) when loading it in Python.

1

Use a regex to find all of the data items. Use ast.literal_eval to convert each data item into a dictionary. Collect the items in a list.

import re, ast
result = []
s = '''{"name":"ABSDSDSRF","ID":"AFJDKGFGHF","lat":37,"lng":-122,"type":0,"HAC":5,"verticalAccuracy":4,"course":266.8359375,"area":"san_francisco"}'''

item = re.compile(r'{[^}]*?}')
for match in item.finditer(s):
    d = ast.literal_eval(match.group())
    result.append(d)

If each data item is on a separate line in the file, you don't need the regex; you can just iterate over the file.

with open('file.dat') as f:
    for line in f:
        line = line.strip()
        line = ast.literal_eval(line)
        result.append(line)
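
Because the sample is JSON, the standard json module can also split records that are not neatly one per line. The sketch below uses json.JSONDecoder.raw_decode to pull objects off the text one after another. The function name iter_json_objects is hypothetical, and the sketch assumes the file is small enough to read into memory. Unlike ast.literal_eval, this also accepts the JSON literals true, false, and null, and it copes with braces nested inside a record.

```python
import json

def iter_json_objects(text):
    """Yield each JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it here
        while pos < end and text[pos].isspace():
            pos += 1
        if pos == end:
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj
```

With the file contents read into a string s, list(iter_json_objects(s)) would give the same kind of list of dictionaries as result above.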

0

Use json.load:

import json
with open(filename) as fh:
    # json.load expects the whole file to be a single JSON document
    data = json.load(fh)
    ...

1 Comment

I get this error: ValueError: Extra data: line 2 column 1 - line 45670 column 77 (char 216 - 11091968)
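
That error is what json.load reports when the input holds more than one top-level JSON value, which would fit a file with one record per line (an assumption based on the line numbers in the message). Below is a small sketch on Python 3 that reproduces the failure with an in-memory string and then parses the same text line by line. The two-record sample string is made up for illustration.

```python
import json

# Two records, one per line (made-up sample in the question's format)
text = '{"name":"A","lat":37}\n{"name":"B","lat":38}\n'

try:
    json.loads(text)  # a single call expects exactly one JSON document
except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
    error_message = str(exc)

# Parsing each non-empty line separately avoids the problem
records = [json.loads(line) for line in text.splitlines() if line.strip()]
```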
