I need to extract names/strings from a .txt file line by line. I am trying to use regex to do this.
E.g., from the line below I want to extract the names "Victor Lau" and "Siti Zuzan" and the string "TELEGRAPHIC TRANSFER" into three different lists, then output them to an Excel file. You may also see the txt file.
TELEGRAPHIC TRANSFER 0008563668 040122 BRH BDVI0093 VICTOR LAU 10,126.75- .00 10,126.75- SITI ZUZAN 16:15:09
I have tried this code:
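import os
import re
import numpy as np
import pandas as pd
# directory, List4, List5 and List6 are assumed to be defined earlier
for file in os.listdir(directory):
    filename = os.fsdecode(file)
    if filename.endswith((".txt", ".TXT")) and "AllBanks" in filename:
        # os.listdir returns bare names, so join with the directory before opening
        with open(os.path.join(directory, filename)) as AllBanks:
            for line in AllBanks:
                try:
                    match4 = re.search(r'( [a-zA-Z]+ [a-zA-Z]+ [a-zA-Z]+ )|( [a-zA-Z]+ [a-zA-Z]+)', line)
                    List4.append(match4.group(0).strip())
                except AttributeError:  # re.search returned None (no match on this line)
                    List4.append('NA')
df = pd.DataFrame(np.column_stack([List4,List5,List6]),columns=['a', 'b', 'c'])
df.to_excel('AllBanks.xlsx', index=False)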
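To make the goal concrete, here is a minimal sketch of one possible approach: a single anchored pattern with named groups that pulls all three values out of a line at once. The names `AMOUNT`, `LINE_RE` and `parse_line` are hypothetical, and the column layout is an assumption inferred from the one sample line above, so real files may need a different pattern.

```python
import re

# Assumed layout (inferred from the single sample line only):
# DESCRIPTION  reference  date  branch  code  NAME  amount amount amount  NAME  HH:MM:SS
AMOUNT = r"[\d,]*\.\d{2}-?"  # e.g. 10,126.75- or .00

LINE_RE = re.compile(
    r"^\s*(?P<description>[A-Za-z][A-Za-z ]*?)"  # leading words, e.g. TELEGRAPHIC TRANSFER
    r"\s+\d+\s+\d{6}\s+\S+\s+\S+\s+"             # reference, date, branch, code
    r"(?P<first_name>[A-Za-z][A-Za-z ]*?)"       # name before the amounts
    rf"\s+{AMOUNT}\s+{AMOUNT}\s+{AMOUNT}\s+"     # three amount columns
    r"(?P<second_name>[A-Za-z][A-Za-z ]*?)"      # name after the amounts
    r"\s+\d{2}:\d{2}:\d{2}\s*$"                  # trailing time stamp
)


def parse_line(line):
    """Return (description, first_name, second_name), or three 'NA' values if the line does not fit."""
    match = LINE_RE.match(line)
    if match is None:
        return ("NA", "NA", "NA")
    return (
        match.group("description"),
        match.group("first_name"),
        match.group("second_name"),
    )


sample = (
    "TELEGRAPHIC TRANSFER 0008563668 040122 BRH BDVI0093 VICTOR LAU "
    "10,126.75- .00 10,126.75- SITI ZUZAN 16:15:09"
)

descriptions, first_names, second_names = [], [], []
for line in [sample, "some header line"]:
    description, first_name, second_name = parse_line(line)
    descriptions.append(description)
    first_names.append(first_name)
    second_names.append(second_name)

print(descriptions)   # ['TELEGRAPHIC TRANSFER', 'NA']
print(first_names)    # ['VICTOR LAU', 'NA']
print(second_names)   # ['SITI ZUZAN', 'NA']
```

Because every line contributes exactly one entry to each list, the three lists always have the same length, which is what a column-wise `pd.DataFrame` needs before `to_excel`. Calling `.title()` on the names would give "Victor Lau" instead of "VICTOR LAU" if that casing is wanted.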