
I have a text file that I'm trying to convert to an Excel file in Python 3. The text files contain a series of accounts. One text file looks like this, for example:

PRODUCE_NAME: abc

PRODUCE_NUMBER: 12345

DATE: 12/1/13

PRODUCE_NAME: efg

PRODUCE_NUMBER: 987

DATE: 2/16/16

TIME: 12:54:00

PRODUCE_NAME: xyz

PRODUCE_NUMBER: 0046

DATE: 7/15/10

COLOR: blue.

I would like the Excel file to look like this: [image]

Some code:

import re
import pandas as pd

# open text file
with open("Comp_file_1.txt", "r", encoding="windows-1252") as op_file:
    text_file = op_file.read()

##############################################################
# location of CAP WORD: and group them

col_list_start, col_list_end, col_list_group = [], [], []
for mj in re.finditer(r"[A-Z]\w+(:)", text_file):
    col_list_start.append(mj.start(0))
    col_list_end.append(mj.end(0))
    col_list_group.append(mj.group())

#############################################################
# Location of the end of file and delete index 0 of start

location = -1
endline = len(text_file)  # fallback if the file contains no "."
while True:
    # Advance location by 1.
    location = text_file.find(".", location + 1)

    # Break if not found.
    if location == -1:
        break

    # Remember the position of the last "." found so far.
    endline = location

col_list_start.append(endline)
del col_list_start[0]

##############################################################
# cut out the index of the rows - abc , 12345, 12/1/13

index4 = []
for m in range(len(col_list_end)):
    # text between the end of one header and the start of the next
    index4.append(text_file[col_list_end[m]:col_list_start[m]].strip())

##############################################################
# makes a data frame
# and groups the data frame

group_excel_list = {}
for k, v in zip(col_list_group, index4):
    group_excel_list.setdefault(k, []).append(v)

The grouped dictionary (group_excel_list) looks like this:
key                 value
{"PRODUCE_NAME:": [abc, efg, xyz]}
{"PRODUCE_NUMBER:": [12345, 987, 0046]}
{"DATE:": [12/1/13, 2/16/16, 7/15/10]}
{"TIME:": [12:54:00]}
{"COLOR:": [blue]}

df = pd.DataFrame(data=[group_excel_list], columns = col_list_group)

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter("Comp_file_1" + '.xlsx', engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')

# Close the Pandas Excel writer and output the Excel file.
writer.close()  # writer.save() was removed in pandas 2.0

I'm getting just one row in the DataFrame, with each cell holding a whole list:

Header: PRODUCE_NAME: PRODUCE_NUMBER: DATE:
Row 0: [abc, efg, xyz] [12345, 987, 0046] [12/1/13, 2/16/16, 7/15/10]

Whatever help you can give would be appreciated.


2 Answers


Read in your data from your text file (a .txt file where the columns are separated by tabs; this was the case with my data, but yours might of course be different):

import csv

fileNumber = 1  # index of the input file; example value, set it for your data
data = []

with open("file_%02d.txt" % fileNumber, "r", newline="") as f:
    reader = csv.reader(f, dialect="excel", delimiter="\t")
    # read the rows from the imported data file and append them to a list
    for row in reader:
        print(row)
        data.append(row)

Write your data to an external file:

import pandas as pd
newData = pd.DataFrame(data, columns = ['name1','name2',...,'nameN'])
newData.to_csv("new_file_%02d.csv" % fileNumber, sep = ';')

This is more or less off the top of my head, but it should do the trick. You can write out data that is held in a list; just make sure that the number of elements in each row of the list matches the number of column names.
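As a rough sketch of how the list-to-DataFrame step above could apply to the KEY: value layout in the question: the code below assumes every account starts with a PRODUCE_NAME line and that the file ends with a single ".". The names parse_accounts, column_order and write_excel are made up for this illustration, not part of any library.

```python
import re

SAMPLE = """PRODUCE_NAME: abc
PRODUCE_NUMBER: 12345
DATE: 12/1/13
PRODUCE_NAME: efg
PRODUCE_NUMBER: 987
DATE: 2/16/16
TIME: 12:54:00
PRODUCE_NAME: xyz
PRODUCE_NUMBER: 0046
DATE: 7/15/10
COLOR: blue.
"""

# A header is a run of capitals/underscores at the start of a line, then ":".
FIELD = re.compile(r"^([A-Z][A-Z_]*):\s*(.*)$")


def parse_accounts(text, first_key="PRODUCE_NAME"):
    """Return one dict per account; a new account starts at first_key (assumption)."""
    text = text.rstrip()
    if text.endswith("."):
        text = text[:-1]  # drop the end-of-file marker
    records = []
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if not match:
            continue  # skip blank or unrecognised lines
        key, value = match.groups()
        if key == first_key or not records:
            records.append({})
        records[-1][key] = value
    return records


def column_order(records):
    """Collect the headers in order of first appearance."""
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    return columns


def write_excel(records, path):
    """Write one row per account; headers an account lacks become empty cells."""
    import pandas as pd  # imported here so the parsing part runs without pandas

    df = pd.DataFrame(records, columns=column_order(records))
    df.to_excel(path, index=False)


records = parse_accounts(SAMPLE)
```

Calling write_excel(records, "Comp_file_1.xlsx") should then give one row per account, since a list of dicts (one per row) is what the DataFrame constructor lines up by key. Writing .xlsx needs an Excel engine such as openpyxl or xlsxwriter installed.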

I hope I helped a little!


I'm sorry that I can't remember the precise method, but if you create a file with f = open(...) and write it as a comma-separated values (.csv) file, there is a way of loading it straight into Excel: items separated by commas go into separate columns, and items separated by line breaks go into separate rows. (Again, sorry I can't remember the exact procedure.)
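One possible way to produce such a file is Python's built-in csv module. This is only a minimal sketch: the rows are sample values taken from the question, rows_to_csv is a hypothetical helper name, and the text is written to an in-memory buffer so the result can be inspected.

```python
import csv
import io

# Each inner list is one row; the first row holds the column headers.
rows = [
    ["PRODUCE_NAME", "PRODUCE_NUMBER", "DATE"],
    ["abc", "12345", "12/1/13"],
    ["efg", "987", "2/16/16"],
]


def rows_to_csv(rows):
    """Return the rows as comma-separated text, one line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(rows)

# To create a real file instead of a string:
# with open("Comp_file_1.csv", "w", newline="") as f:
#     csv.writer(f).writerows(rows)
```

Excel can open the resulting .csv file directly, although it may reinterpret some values when it does (for example, showing 0046 as 46).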


3 Comments

Thank you 13ros27, I will look into this method. Do you know if it will line up the rows when a header is missing? Most of the headers are the same, but some of the accounts have one more or one fewer, so I would have blanks. I just need them to line up in the right rows.
If you want to add blanks, the way to do that is to put two commas in a row, because Excel should then count the double comma as a blank column (it did a few years ago when I tried this).
Thank you Akshay Bahadur for finding that link
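The double-comma idea from the comments above can be sketched with csv.DictWriter, whose restval argument supplies the value written for any header a row lacks. The account dicts below are sample data modelled on the question, and the full header list is assumed to be known up front.

```python
import csv
import io

# Every header that can occur, in the column order wanted in Excel.
fieldnames = ["PRODUCE_NAME", "PRODUCE_NUMBER", "DATE", "TIME", "COLOR"]

# One dict per account; some accounts lack TIME or COLOR.
accounts = [
    {"PRODUCE_NAME": "abc", "PRODUCE_NUMBER": "12345", "DATE": "12/1/13"},
    {"PRODUCE_NAME": "efg", "PRODUCE_NUMBER": "987", "DATE": "2/16/16",
     "TIME": "12:54:00"},
    {"PRODUCE_NAME": "xyz", "PRODUCE_NUMBER": "0046", "DATE": "7/15/10",
     "COLOR": "blue"},
]

buffer = io.StringIO()
# restval="" writes an empty field for every missing header, which
# shows up in the file as consecutive commas.
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="",
                        lineterminator="\n")
writer.writeheader()
writer.writerows(accounts)

lines = buffer.getvalue().splitlines()
```

Here the first account is written as abc,12345,12/1/13,, so its TIME and COLOR columns stay blank and the later columns still line up under their headers.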
