How to replace string occurrences in a Python file?

Question

I have a CSV file that contains lines of SQL entries. Each entry contains a message made up of tag=value pairs delimited by SOH ("\x01").

8=Fix1.1<SOH>9=70<SOH>35=AE<SOH>10=237
8=Fix1.1<SOH>9=71<SOH>35=AE<SOH>10=238
8=Fix1.1<SOH>9=72<SOH>35=AE<SOH>10=239
8=Fix1.1<SOH>9=73<SOH>35=AE<SOH>10=240

(<SOH> is a placeholder for the actual character because Stack Overflow wouldn't let me include the \x01 character in the text)

Issue:

The code snippet below replaces SOH with "," as expected, but I'm having trouble removing the tag part from each line.

# Read in the file
with open('file.txt', 'r') as file:
    filedata = file.read()

# Replace the SOH delimiter, then the "=" between tag and value
filedata = filedata.replace('\x01', ',')
filedata2 = filedata.replace("=", ",")

# Write the file out again
with open('file.txt', 'w') as file:
    file.write(filedata2)

Output:

8,Fix1.1,9,70,35,AE,10,237
8,Fix1.1,9,71,35,AE,10,238
8,Fix1.1,9,72,35,AE,10,239
8,Fix1.1,9,73,35,AE,10,240

I've also tried regex = re.compile("[=]") and then looping over the lines to modify them, but printing the result just shows all the "=" matches.

Desired output:

Fix1.1,70,AE,237
Fix1.1,71,AE,238
Fix1.1,72,AE,239
Fix1.1,73,AE,240

pho · Accepted Answer · 2021-09-27 19:14:07Z

Use csv.reader with delimiter="\x01" to split by the SOH character. Then, as you read each line, split each element by "=" and keep only the values.

import csv

filedata = []
with open('file.txt', 'r') as file:
    reader = csv.reader(file, delimiter="\x01")
    for row in reader:
        # Split each item in row
        # Keep only second element of each split
        values = [item.split("=", 1)[1] for item in row]
        filedata.append(values)

print(filedata)

which gives

[['Fix1.1', '70', 'AE', '237'], 
 ['Fix1.1', '71', 'AE', '238'], 
 ['Fix1.1', '72', 'AE', '239'], 
 ['Fix1.1', '73', 'AE', '240']]
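
To try the parsing step without touching a file, the same logic can be sketched against an in-memory string. This is only an illustration: the sample lines are copied from the question, io.StringIO stands in for the opened file, and the names SOH, sample and rows are made up for this example.

```python
import csv
import io

SOH = "\x01"

# Two of the sample messages from the question, joined with real SOH characters.
sample = "\n".join([
    SOH.join(["8=Fix1.1", "9=70", "35=AE", "10=237"]),
    SOH.join(["8=Fix1.1", "9=71", "35=AE", "10=238"]),
]) + "\n"

rows = []
for row in csv.reader(io.StringIO(sample), delimiter=SOH):
    # Each item looks like "tag=value"; keep only the part after the first "=".
    rows.append([item.split("=", 1)[1] for item in row])

print(rows)
```

Passing 1 as the second argument to split() means a value that itself contains "=" is kept whole.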

You can write this list of lists to a file using csv.writer.writerows().

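# newline='' lets the csv module control line endings
# (avoids extra blank lines on Windows)
with open('outfile.txt', 'w', newline='') as f:
    w = csv.writer(f)
    w.writerows(filedata)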

and your output file has:

Fix1.1,70,AE,237
Fix1.1,71,AE,238
Fix1.1,72,AE,239
Fix1.1,73,AE,240

If all you care about is to read the original file and write the new file, you can combine these two operations in one loop:

import csv

with open('file.txt', 'r') as file_in, open('outfile.txt', 'w', newline='') as file_out:
    reader = csv.reader(file_in, delimiter="\x01")
    writer = csv.writer(file_out)
    for row in reader:
        # Split each item in row
        # Keep only second element of each split
        values = [item.split("=", 1)[1] for item in row]
        # Instead of appending to a container list,
        # write the row straight to the output file
        writer.writerow(values)
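
One caveat with item.split("=", 1)[1] is that it raises IndexError for an item that has no "=" in it. If the input might contain such items, a variant using str.partition could be sketched as below. The function name strip_tags is hypothetical, str.partition is swapped in for split, and in-memory buffers are used here in place of the real files (with real files, open both with newline='').

```python
import csv
import io

def strip_tags(file_in, file_out):
    """Copy SOH-delimited tag=value rows to file_out as CSV, keeping only the values."""
    reader = csv.reader(file_in, delimiter="\x01")
    writer = csv.writer(file_out)
    for row in reader:
        # partition("=") returns (tag, "=", value); when "=" is missing,
        # the value part is an empty string instead of an IndexError.
        writer.writerow([item.partition("=")[2] for item in row])

SOH = "\x01"
src = io.StringIO(
    f"8=Fix1.1{SOH}9=70{SOH}35=AE{SOH}10=237\n"
    f"bad{SOH}9=71\n"  # first item has no "=" on purpose
)
dst = io.StringIO()
strip_tags(src, dst)
print(dst.getvalue())
```

Whether a missing "=" should become an empty field or be treated as an error depends on the data, so this is a design choice rather than a requirement.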

edited Sep 27, 2021 at 19:14

answered Sep 27, 2021 at 15:27

8 Comments

test1 1 Over a year ago

Many thanks @Pranav Hosangadi, can confirm this works. One question please: how can I keep the data in a CSV file as opposed to a list? I used the code below, but got TypeError: a bytes-like object is required, not 'str'. This CSV was generated by a query to the DB (output as a tuple/list), which is then converted to CSV. I'm trying to keep it in CSV format so I can do a line-by-line comparison with another CSV, as the csv module is easier to control than a list.

pho Over a year ago

@test11 I'm not sure what you mean by that. When you use writerows() it's written to an output file. How did you manage to get that error?

test1 1 Over a year ago

with open('test.csv', 'wb') as f:
    write = csv.writer(f)
    write.writerow(filedata)
    write.writerows(row)

pho Over a year ago

writerows() (notice the S at the end of the function) takes multiple rows, like filedata. writerow() takes a single row, so you have to iterate over each row in filedata before using writerow() (Or just use writerows(filedata)). Which line throws the error? @test11

test1 1 Over a year ago

Many thanks for your help @PranavHosangadi - much appreciated!
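
For reference, the TypeError quoted in the comments above is what Python 3 typically raises when csv.writer is handed a file opened in binary mode ('wb'), because the csv module writes str, not bytes. A minimal sketch, assuming Python 3 and using in-memory buffers in place of test.csv (io.BytesIO behaves like a file opened with 'wb', io.StringIO like one opened with 'w'); the flag binary_mode_failed is made up for this example:

```python
import csv
import io

filedata = [["Fix1.1", "70", "AE", "237"],
            ["Fix1.1", "71", "AE", "238"]]

# Binary buffer, comparable to open('test.csv', 'wb'):
# csv.writer tries to write str into it, which fails.
try:
    csv.writer(io.BytesIO()).writerows(filedata)
    binary_mode_failed = False
except TypeError:
    binary_mode_failed = True

# Text buffer, comparable to open('test.csv', 'w', newline=''):
out = io.StringIO()
writer = csv.writer(out)
writer.writerows(filedata)     # writerows(): a list of rows
writer.writerow(filedata[0])   # writerow(): one single row
print(out.getvalue())
```

With a real file, opening it as open('test.csv', 'w', newline='') and calling writerows(filedata) should be enough.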
