I have a list of dictionaries like this:

list_of_dict = [
    {'text': '"Some text1"', 
     'topics': ['Availability', 'Waits'], 
     'categories': ['Scheduler']},
    {'text': 'Alot to improve'},
    {'text': 'More text '}
    ]

I am writing it to a csv file as follows:

with open("text.csv", 'wb') as resultFile:
            wr = csv.writer(resultFile, dialect='excel')
            wr.writerow(['text', 'topics', 'categories'])

for d in list_of_dict:
    with open("text.csv", 'a') as f:
            w = csv.DictWriter(f, d.keys())
            w.writerow(d)

This writes to the csv file as follows:

text            |  topics                   | categories
Some text1      | ['Availability', 'Waits'] | ['Scheduler']
Alot to improve |
More text       |

However, I want a separate column for each topic and each category. If a row's topics list or categories list contains that topic or category, the cell should contain True; otherwise it should contain False.

Desired output:

text             | Availability | Waits | Scheduler |
Some text1       | True         | True  | True      |
Alot to improve  | False        | False | False     |
More text        | False        | False | False     |

How can this be done? Thanks!

1 Answer

For each row, it is probably easiest to start with a default dictionary in which every required column is set to False. Then, as each entry in list_of_dict is read, check whether it contains the 'topics' or 'categories' keys and update the row accordingly:

import csv

list_of_dict = [
    {'text': '"Some text1"', 'topics': ['Availability', 'Waits'], 'categories': ['Scheduler']},
    {'text': 'Alot to improve'},
    {'text': 'More text '}]

all_topics = ["Availability", "Waits"]
all_categories = ["Scheduler"]
fieldnames = ["text"] + all_topics + all_categories

with open("text.csv", 'w', newline='') as f_output:  # Python 3; on Python 2 use open("text.csv", 'wb')
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames, dialect='excel')
    csv_output.writeheader()

    for d in list_of_dict:
        # Build a default row
        row = {v:False for v in all_topics + all_categories}
        row['text'] = d['text'].strip('"')

        if 'topics' in d:
            row.update({topic:True for topic in d['topics']})
        if 'categories' in d:
            row.update({category:True for category in d['categories']})

        csv_output.writerow(row)

Giving you a text.csv file:

text,Availability,Waits,Scheduler
Some text1,True,True,True
Alot to improve,False,False,False
More text ,False,False,False
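
If the full set of topics and categories is not known in advance, the column names could be collected from the data first instead of being hard-coded. The following is a minimal sketch of that idea, assuming Python 3 and assuming no topic shares a name with a category (the two would otherwise collapse into one column). The helper names collect_values and build_rows are hypothetical, and the output goes to an in-memory buffer purely so the result is easy to inspect.

```python
import csv
import io

list_of_dict = [
    {'text': '"Some text1"', 'topics': ['Availability', 'Waits'], 'categories': ['Scheduler']},
    {'text': 'Alot to improve'},
    {'text': 'More text '},
]


def collect_values(rows, key):
    """Return the distinct values found under `key`, in first-seen order."""
    seen = []
    for row in rows:
        for value in row.get(key, []):
            if value not in seen:
                seen.append(value)
    return seen


def build_rows(rows, columns):
    """Yield one flat dict per input row, with True/False for each column."""
    for d in rows:
        present = set(d.get('topics', [])) | set(d.get('categories', []))
        flat = {column: column in present for column in columns}
        flat['text'] = d['text'].strip('"')
        yield flat


# Derive the columns from the data rather than listing them by hand.
all_topics = collect_values(list_of_dict, 'topics')
all_categories = collect_values(list_of_dict, 'categories')
fieldnames = ['text'] + all_topics + all_categories

# In-memory buffer for illustration; for a real file use
# open("text.csv", "w", newline="") instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, dialect='excel')
writer.writeheader()
writer.writerows(build_rows(list_of_dict, fieldnames[1:]))
csv_text = buffer.getvalue()
```

Using dict.get with an empty-list default removes the need for the separate `if 'topics' in d` checks, at the cost of one extra pass over the data to discover the columns.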