Python: 3.x

Hi. I have the CSV file below, which has a header and rows. The row count may vary from file to file. I am trying to convert this CSV to a list of dicts, but the data from the first row is being repeated.

"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"
1,3,9294899
1,3,9294933

Code:

parserd_list = []
output_dict = {}
with open("files\\CUCMdummy.csv") as myfile:
    firstline = True
    for line in myfile:
        if firstline:
            mykeys = ''.join(line.split()).split(',')
            firstline = False
        else:
            values = ''.join(line.split()).split(',')
            for n in range(len(mykeys)):
                output_dict[mykeys[n].rstrip('"').lstrip('"')] = values[n].rstrip('"').lstrip('"')
                print(output_dict)
                parserd_list.append(output_dict)
#print(parserd_list)

(My CSV files generally have more than 20 columns; this is just a small sample.)

(I used rstrip/lstrip to get rid of the double quotes.)

Output I am getting:

{'cdrRecordType': '1'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933'}

This is the output of the print inside the for loop, and the final output is the same.

I don't know what mistake I am making. Could someone please help me correct it?

Thanks in advance.

  • You are reusing and appending the same dictionary again and again. Move output_dict = {} to directly before the for n in range(len(mykeys)): loop. Additionally, you should append the dict to the list once, after this loop, and not on each iteration. Commented Feb 26, 2020 at 4:35
  • Hi Michael, thanks for your help. I moved output_dict out of the for loop, and it says 'n' is not defined. I used p = len(mykeys) in place of 'n', and it says "list index out of range". Commented Feb 26, 2020 at 4:51
  • You shouldn't move the output_dict[mykeys[n]... assignment out of the for loop; move the appending of the dict to the list instead. Commented Feb 26, 2020 at 4:53
  • Excellent. Now I've got it. It works after moving the append step out of the for loop. Thanks. Commented Feb 26, 2020 at 4:59
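
Putting the comments above together, the corrected loop might look like the sketch below. It reads the sample data from an in-memory string (io.StringIO) instead of files\\CUCMdummy.csv so that it runs anywhere, and the function name parse_rows is made up for illustration. It also uses strip() rather than the question's ''.join(line.split()), which is an assumption that whitespace inside a field should be kept.

```python
import io

SAMPLE = '''"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"
1,3,9294899
1,3,9294933
'''


def parse_rows(lines):
    """Return one dict per data row, keyed by the header row (hypothetical helper)."""
    parsed_list = []
    mykeys = None
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        fields = [field.strip('"') for field in line.strip().split(',')]
        if mykeys is None:
            mykeys = fields  # the first non-blank line is the header
            continue
        output_dict = {}  # a fresh dict for every row
        for n in range(len(mykeys)):
            output_dict[mykeys[n]] = fields[n]
        parsed_list.append(output_dict)  # append once, after the inner loop
    return parsed_list


rows = parse_rows(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

Each row now ends up in its own dictionary, so the list holds two distinct entries instead of repeated references to one.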

3 Answers

7

Instead of manually parsing a CSV file, you should use the csv module.

This will result in a simpler script and will facilitate gracefully handling edge cases (e.g. header row, inconsistently quoted fields, etc.).

import csv

with open('example.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row)

Output:

$ python3 parse-csv.py
OrderedDict([('cdrRecordType', '1'), ('globalCallID_callManagerId', '3'), ('globalCallID_callId', '9294899')])
OrderedDict([('cdrRecordType', '1'), ('globalCallID_callManagerId', '3'), ('globalCallID_callId', '9294933')])
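
The OrderedDict rows above are what Python 3.6 and 3.7 print; from Python 3.8 on, csv.DictReader yields plain dict rows. If a plain list of dicts is wanted regardless of version, a sketch along these lines should work. The sample data is inlined through io.StringIO purely so the example is self-contained; with a real file you would pass the open file object instead.

```python
import csv
import io

SAMPLE = '''"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"
1,3,9294899
1,3,9294933
'''

# DictReader takes its keys from the first row and handles the quoting.
reader = csv.DictReader(io.StringIO(SAMPLE))

# dict(row) turns an OrderedDict row (Python 3.6/3.7) into a plain dict;
# on 3.8+ it is just a shallow copy.
parsed_list = [dict(row) for row in reader]
print(parsed_list)
```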

If you're intent on parsing manually, here's an approach for doing so:

parsed_list = []
with open('example.csv') as myfile:
    firstline = True
    for line in myfile:
        # Strip leading/trailing whitespace and split into a list of values.
        values = line.strip().split(',')

        # Remove surrounding double quotes from each value, if they exist.
        values = [v.strip('"') for v in values]

        # Use the first line as keys.
        if firstline:
            keys = values
            firstline = False
            # Skip to the next iteration of the for loop.
            continue

        parsed_list.append(dict(zip(keys, values)))

for p in parsed_list:
    print(p)

Output:

$ python3 manual-parse-csv.py
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933'}
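
One edge case the manual split(',') approach does not cover is a quoted field that itself contains a comma. The sketch below compares the two approaches on an invented row (the value "Smith, John" is not from the question's data):

```python
import csv
import io

# Hypothetical row whose first field contains a comma inside the quotes.
row_text = '"Smith, John",3,9294899'

# Manual parsing splits on every comma, including the quoted one.
manual = [v.strip('"') for v in row_text.strip().split(',')]

# csv.reader understands the quoting and keeps the field intact.
parsed = next(csv.reader(io.StringIO(row_text)))

print(manual)
print(parsed)
```

The manual version yields four fields where the csv module yields three, which is the kind of mismatch that would silently shift values under the wrong keys.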

3

Use csv.DictReader:

import csv

# The csv module documentation recommends opening the file with newline=''.
with open("files\\CUCMdummy.csv", mode='r', newline='') as myFile:
    reader = list(csv.DictReader(myFile, delimiter=',',quotechar='"'))

0

The indentation of your code is wrong.

These two lines:

  print(output_dict)
  parserd_list.append(output_dict)

can simply be un-indented to the same level as the for loop above them, so that they run once per row rather than once per key. On top of this, you need to create a new dict for each line of the file.

You can do this by putting output_dict = {} right before the for loop over the keys.
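
To see why the new dict matters: list.append stores a reference to the dictionary, not a copy, so reusing one dictionary leaves every list entry pointing at the same object. A minimal illustration, using one column name from the question and otherwise invented variable names:

```python
values = ('9294899', '9294933')

# Reusing a single dict: both list entries refer to the same object.
shared = {}
aliased = []
for value in values:
    shared['globalCallID_callId'] = value
    aliased.append(shared)

# Creating a new dict per iteration: each entry is independent.
fresh = []
for value in values:
    row = {}
    row['globalCallID_callId'] = value
    fresh.append(row)

print(aliased)
print(fresh)
```

In the first list both entries show the last value written, because they are the same dictionary; in the second, each row keeps its own value.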

As mentioned above, there are libraries that will make life easier. But if you want to stick with building the dictionaries yourself, you can also read all the lines of the file, close it, and then process the lines like this:

with open("scratch.txt") as myfile:
    data = myfile.readlines()

keys = data[0].replace('"','').strip().split(',')

output_dicts = []
for line in data[1:]:
    values = line.strip().split(',')
    output_dicts.append(dict(zip(keys, values)))

print(output_dicts)


[{'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899', 'cdrRecordType': '1'}, {'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933', 'cdrRecordType': '1'}]

2 Comments

Hi LeKhan, perfect. I got the required output. But could you please help me get rid of the double quotes in the final output?
You've done this before using rstrip and lstrip. My personal approach would be something like this: keys = data[0].replace('"','').strip().split(','). Hope that helps :)
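
If the values can be quoted as well, the same replace('"', '') idea can be applied to every line, not just the header. A rough sketch, assuming fields never contain commas or embedded quotes; the sample lines and the helper name split_unquoted are invented here to show quoted values:

```python
# Invented sample: the second line has quoted values, the third does not.
data = [
    '"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"\n',
    '"1","3","9294899"\n',
    '1,3,9294933\n',
]


def split_unquoted(line):
    """Drop all double quotes, trim the newline, and split on commas."""
    return line.replace('"', '').strip().split(',')


keys = split_unquoted(data[0])
output_dicts = [dict(zip(keys, split_unquoted(line))) for line in data[1:]]
print(output_dicts)
```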
