Parsing text event file in Python

Question

I have a large text file of event data that I am trying to parse into a CSV file. The structure looks like this:

START
USER: a
TIME: 1000
CLICKS: 1
COMMAND A: 2
COMMAND B: 1
END
START
USER: b
TIME: 00
CLICKS: 1
COMMAND A: 2
COMMAND B: 1
COMMAND C: 1
END

The events are delimited by the START and END tags. I am trying to create a CSV file with one row per event and one column per attribute. In the example above, the columns would be USER, TIME, CLICKS, COMMAND A, COMMAND B, and COMMAND C, and each cell would hold the value after the colon.

I know that this code will read an individual event:

with open('sampleIVTtxt.txt', 'r') as input_data:
    # Skip ahead to the first START tag
    for line in input_data:
        if line.strip() == 'START':
            break
    # Read the lines of the event until its END tag
    for line in input_data:
        if line.strip() == 'END':
            break

Where I am stuck is how to parse the lines within the event block and store them as columns and values in a csv. I'm thinking for each line within the event block I need to parse out the column name using regex and then store those names in an array and use writerow(namesarray) to create the columns. But I'm not sure how to loop through the whole txt file and store subsequent event values in those columns.

I am new to Python, so any help would be appreciated.

I think it would help if you (1) format your post correctly, and (2) add a python tag. Oh, and (3) post what you got and point out where you are stuck.
– Jongware, Commented May 4, 2015 at 23:55
Thank you for your response. I've edited the question with tags and provided more detail on where I'm stuck.
– user1735330, Commented May 5, 2015 at 0:51
Yes, I will know all the columns that could exist for an event. However, not all events will have input for each column. Basically, if a COMMAND A was not used, there will be no line for it in that event block, so I would want the row to just have a 0 or null cell for that column.
– user1735330, Commented May 5, 2015 at 1:04

kaz · Accepted Answer · 2015-05-05 01:41:45Z


Something like:

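import csv

# Open both files in one with statement so the writer is still usable
# while the input is being read (newline='' is the Python 3 csv convention).
with open('sampleIVTtxt.csv', 'w', newline='') as csvfile, \
        open('sampleIVTtxt.txt', 'r') as input_data:
    fieldnames = ['USER', 'TIME', 'CLICKS', 'COMMAND_A', 'COMMAND_B', 'COMMAND_C']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()

    for line in input_data:
        thisLine = line.strip()
        if thisLine == 'START':
            myDict = {}  # start collecting a new row for this event
        elif thisLine.startswith('USER:'):
            myDict['USER'] = thisLine[6:]  # text after "USER: "
        # ....and so on for the other fields....
        elif thisLine == 'END':
            writer.writerow(myDict)  # fields that were never set are left blank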
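
The per-field `elif` branches can also be collapsed into a single branch that splits each line at its first colon. What follows is a minimal sketch of that variant, not the code from the answer. It assumes Python 3 and that the column list from the question is complete. The names `FIELDNAMES`, `parse_events`, and `events_to_csv` are made up for illustration, and filling missing columns with `'0'` is one possible reading of the asker's "0 or null" requirement.

```python
import csv

# Columns as they appear in the question's sample (assumed to be complete).
FIELDNAMES = ['USER', 'TIME', 'CLICKS', 'COMMAND A', 'COMMAND B', 'COMMAND C']


def parse_events(lines):
    """Yield one dict per START...END block found in an iterable of lines."""
    event = None
    for line in lines:
        text = line.strip()
        if text == 'START':
            event = {}
        elif text == 'END':
            if event is not None:
                yield event
            event = None
        elif event is not None and ':' in text:
            # Split at the first colon only, so values may contain colons.
            key, value = text.split(':', 1)
            event[key.strip()] = value.strip()


def events_to_csv(in_path, out_path, fieldnames=FIELDNAMES):
    """Write every event found in in_path as one CSV row in out_path."""
    with open(in_path, 'r') as src, open(out_path, 'w', newline='') as dst:
        # restval fills columns an event did not report;
        # extrasaction='ignore' skips keys that are not in fieldnames.
        writer = csv.DictWriter(dst, fieldnames=fieldnames,
                                restval='0', extrasaction='ignore')
        writer.writeheader()
        for event in parse_events(src):
            writer.writerow(event)
```

Calling `events_to_csv('sampleIVTtxt.txt', 'sampleIVTtxt.csv')` would then write one row per event. Because the file is read line by line through a generator, a large input does not need to fit in memory.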

edited May 5, 2015 at 1:41

answered May 5, 2015 at 1:12


3 Comments

user1735330 Over a year ago

Thanks kaz, I am getting an "invalid syntax" error on the line myDict{'USER': thisLine[6:]}. Does this part: elif "USER" in thisLine: myDict{'USER': thisLine[6:]} check if there is a row with "USER" and if so, store the value in the column called user?

kaz Over a year ago

sorry, been a while in Python - wrong syntax. I'll edit it. And yes, that is the approach - except I first store all the data for a row in a dictionary, then use a csv writer that uses that dictionary to write the values to the appropriate columns.

user1735330 Over a year ago

Thanks kaz, I am still tweaking my code but I think this answer will get me what I am looking for. I appreciate the help!
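
The dictionary-then-writer approach described in the comments above can be tried in isolation. This small sketch, assuming Python 3, shows how `csv.DictWriter` places each dictionary value under the matching column and what happens when an event has no entry for a column. The helper name `rows_to_csv_text` and the sample rows are hypothetical. It writes to an in-memory buffer so no files are needed.

```python
import csv
import io


def rows_to_csv_text(rows, fieldnames, missing=''):
    """Render a list of dicts as CSV text, filling absent keys with `missing`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            restval=missing, lineterminator='\n')
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


fieldnames = ['USER', 'CLICKS', 'COMMAND A']
rows = [
    {'COMMAND A': '2', 'USER': 'a', 'CLICKS': '1'},  # key order does not matter
    {'USER': 'b', 'CLICKS': '1'},                    # no COMMAND A for this event
]
text = rows_to_csv_text(rows, fieldnames, missing='0')
print(text)
# USER,CLICKS,COMMAND A
# a,1,2
# b,1,0
```

Leaving `missing` at its default of an empty string produces an empty cell instead of `0`.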
