converting text file to excel python 3

Question

I have text files that I'm trying to convert to Excel files in Python 3. Each text file holds a series of accounts. One text file looks like this, for example:

PRODUCE_NAME: abc

PRODUCE_NUMBER: 12345

DATE: 12/1/13

PRODUCE_NAME: efg

PRODUCE_NUMBER: 987

DATE: 2/16/16

TIME: 12:54:00

PRODUCE_NAME: xyz

PRODUCE_NUMBER: 0046

DATE: 7/15/10

COLOR: blue.

I would like the Excel file to look like this: (image not preserved)

Some code:

import re
import pandas as pd

# open text file
with open("Comp_file_1.txt", "r", encoding='windows-1252') as op_file:
    text_file = op_file.read()

# lists filled in by the steps below
col_list_start = []
col_list_end = []
col_list_group = []
index4 = []

##############################################################
# location of CAP WORD: and group them

for mj in re.finditer(r"[A-Z]\w+(:)", text_file):
    col_list_start.append(mj.start(0))
    col_list_end.append(mj.end(0))
    col_list_group.append(mj.group())

#############################################################
# Location of the end of file and delete index 0 of start

location = -1
endline = len(text_file)  # fallback if the file contains no "."
while True:
    # Advance location by 1.
    location = text_file.find(".", location + 1)

    # Break if not found.
    if location == -1:
        break

    # Remember the position of the last "." found.
    endline = location

col_list_start.append(int(endline))
del col_list_start[0]

##############################################################
# cut out the index of the rows - abc , 12345, 12/1/13

for m in range(len(col_list_end)):
    index4.append(text_file[col_list_end[m]:col_list_start[m]].strip())

##############################################################
# group the values by header

group_excel_list = {}
for k, v in zip(col_list_group, index4):
    group_excel_list.setdefault(k, []).append(v)

The grouped dictionary (group_excel_list) looks like this:
key                 value
{"PRODUCE_NAME:": [abc, efg, xyz]}
{"PRODUCE_NUMBER:": [12345, 987, 0046]}
{"DATE:": [12/1/13, 2/16/16, 7/15/10]}
{"TIME:": [12:54:00]}
{"COLOR:": [blue]}

df = pd.DataFrame(data=[group_excel_list], columns = col_list_group)

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter("Comp_file_1" + '.xlsx', engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')

# Close the Pandas Excel writer and output the Excel file.
writer.close()

I'm getting just one row in the DataFrame, with each cell holding a whole list:

Header - PRODUCE_NAME: PRODUCE_NUMBER: DATE:
row 0 - [abc, efg, xyz] [12345, 987, 0046] [12/1/13, 2/16/16, 7/15/10]

Whatever help you can give would be appreciated.

Refer to this post stackoverflow.com/questions/19677104/…

– Akshay Bahadur, Commented Dec 1, 2017 at 17:07

user8956387 · Accepted Answer · 2017-12-01 17:12:20Z

Read in your data from your text file (a .txt file where the columns are separated by tabs; this was the case with my data, but yours might of course be different):

import csv

fileNumber = 1  # placeholder: number of the input file to read
data = []

with open("file_%02d.txt" % fileNumber, 'r', newline='') as f:
    reader = csv.reader(f, dialect='excel', delimiter='\t')
    # reads the rows from your imported data file and appends them to a list
    for row in reader:
        print(row)
        data.append(row)

Write your data to an external file:

import pandas as pd
# replace the placeholder names with your own, one per column in data
newData = pd.DataFrame(data, columns = ['name1','name2',...,'nameN'])
newData.to_csv("new_file_%02d.csv" % fileNumber, sep = ';')

This is more or less off the top of my head, but it should do the trick. You can write out data that is in a list; just make sure that the number of elements in each row matches the number of column names.
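For the "KEY: value" layout shown in the question, the rows first have to be built before they can be written anywhere. The following is only a rough sketch under a few assumptions that are not stated in the question: every account starts with a PRODUCE_NAME line, each field sits on its own line, and the single trailing "." marks the end of the file. The names SAMPLE, LINE_RE, parse_records and first_key are made up for this illustration.

```python
import re

# Sample text copied from the question.
SAMPLE = """PRODUCE_NAME: abc

PRODUCE_NUMBER: 12345

DATE: 12/1/13

PRODUCE_NAME: efg

PRODUCE_NUMBER: 987

DATE: 2/16/16

TIME: 12:54:00

PRODUCE_NAME: xyz

PRODUCE_NUMBER: 0046

DATE: 7/15/10

COLOR: blue.
"""

# One "KEY: value" pair per line; KEY is capitals, digits and underscores.
LINE_RE = re.compile(r"^([A-Z][A-Z0-9_]*):\s*(.*)$")


def parse_records(text, first_key="PRODUCE_NAME"):
    """Return one dict per account; a new account starts at first_key (assumption)."""
    text = text.rstrip()
    if text.endswith("."):
        text = text[:-1]  # drop the end-of-file marker
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # blank line or something that is not KEY: value
        key, value = match.groups()
        if key == first_key or not records:
            records.append({})
        records[-1][key] = value
    return records


records = parse_records(SAMPLE)

# Union of all headers, in the order they first appear.
columns = []
for record in records:
    for key in record:
        if key not in columns:
            columns.append(key)

# One row per account; a missing field becomes an empty cell.
rows = [[record.get(col, "") for col in columns] for record in records]

print(columns)
for row in rows:
    print(row)
```

If pandas is installed, passing the same list of dicts to pd.DataFrame(records) should give one row per account with the missing fields left empty (NaN), and df.to_excel("Comp_file_1.xlsx", index=False) would then write the sheet.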

I hope I helped a little!

Akshay Bahadur · Accepted Answer · 2017-12-01 17:25:30Z


I'm sorry that I can't remember the precise method, but if you create a file using f = open(...) etc. and write it as a comma-separated values (.csv) file, then you can load it straight into Excel: all the items separated by commas go into separate columns, and all the lines separated by newlines go into separate rows. (Again, sorry I can't remember the exact procedure.)
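A minimal sketch of that idea with Python's standard csv module, in case it helps. The file name accounts.csv and the sample rows are only placeholders for illustration.

```python
import csv

# Header row followed by one row per account (sample values from the question).
rows = [
    ["PRODUCE_NAME", "PRODUCE_NUMBER", "DATE"],
    ["abc", "12345", "12/1/13"],
    ["efg", "987", "2/16/16"],
]

# newline="" lets the csv module handle line endings itself.
with open("accounts.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```

Each inner list becomes one line of the file, with the items separated by commas.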


edited Dec 1, 2017 at 17:25 by Akshay Bahadur

answered Dec 1, 2017 at 17:00 by 13ros27

3 Comments

orthoeng2 Over a year ago

Thank you 13ros27, I will look into this method. Do you know if it will line up the rows when a header is missing? Most of the headers are the same, but some of the accounts have one more or one fewer, so I would have blanks. I just need them to line up in the right rows.

13ros27 Over a year ago

If you want to add blanks, the way to do that is to have two commas in a row, because Excel should then count the double comma as a blank column (it was a few years ago when I tried this).
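One way to get those adjacent commas without placing them by hand is csv.DictWriter, which fills in a default value for any header a row does not have. This is a sketch only; the records and the fieldnames list are sample data based on the question, and the output goes to an in-memory buffer rather than a real file.

```python
import csv
import io

# Two accounts with different sets of headers.
records = [
    {"PRODUCE_NAME": "abc", "PRODUCE_NUMBER": "12345", "DATE": "12/1/13"},
    {"PRODUCE_NAME": "xyz", "PRODUCE_NUMBER": "0046", "DATE": "7/15/10", "COLOR": "blue"},
]
fieldnames = ["PRODUCE_NAME", "PRODUCE_NUMBER", "DATE", "TIME", "COLOR"]

buffer = io.StringIO()
# restval is written for every header that a record is missing.
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
writer.writeheader()
writer.writerows(records)

csv_text = buffer.getvalue()
print(csv_text)
```

The second data line comes out as xyz,0046,7/15/10,,blue, so the empty TIME field keeps COLOR in its own column.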

13ros27 Over a year ago

Thank you Akshay Bahadur for finding that link
