I have been trying to figure this one out all day. I have a large text file (546 MB) that I am trying to parse in Python, pulling out the text between an opening tag and a closing tag, and I keep running into memory problems. With the help of the good folks on this board, this is what I have so far:
answer = ''
output_file = open('/Users/Desktop/Poetrylist.txt','w')
with open('/Users/Desktop/2e.txt','r') as open_file:
    for each_line in open_file:
        if each_line.find('<A>'):
            start_position = each_line.find('<A>')
            start_position = start_position + 3
            end_position = each_line[start_position:].find('</W>')
            answer = each_line[start_position:end_position] + '\n'
            output_file.write(answer)
output_file.close()
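Two details in the loop above may matter independently of the memory error. `str.find` returns -1 when the tag is missing (which counts as true in an `if`) and 0 when the line starts with the tag (which counts as false), so the test does the opposite of what it appears to. Also, an index returned by `find` on a slice is relative to that slice, not to the full line. A minimal sketch of a per-line extraction that avoids both, assuming one `<A>`...`</W>` pair per line; `extract_from_line` is a hypothetical helper name, and the tags are taken from the code above:

```python
def extract_from_line(line, open_tag="<A>", close_tag="</W>"):
    """Return the text between the first open_tag and the next close_tag.

    Returns None when either tag is missing from the line.
    """
    start = line.find(open_tag)
    if start == -1:  # find() signals "not found" with -1, not False
        return None
    start += len(open_tag)
    # Passing a start offset searches in place: no slice copy is made, and
    # the result is an index into the full line.
    end = line.find(close_tag, start)
    if end == -1:
        return None
    return line[start:end]
```

Inside the loop this would be used as `answer = extract_from_line(each_line)`, writing `answer + '\n'` only when `answer` is not `None`.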
I am getting this error message:
Traceback (most recent call last):
  File "C:\Users\Adam\Desktop\OEDsearch3.py", line 9, in <module>
    end_position = each_line[start_position:].find('</W>')
MemoryError
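One possible cause, stated as an assumption because the file itself is not shown: if the file contains few or no newline characters, `for each_line in open_file` hands back most of the 546 MB as a single string, and `each_line[start_position:]` then has to copy it. A sketch that reads fixed-size chunks instead of lines follows. The names `iter_tagged_text` and `extract_to_file` and the 1 MB chunk size are illustrative choices, not anything from the original script, and the file is assumed to be readable with the default text encoding:

```python
def iter_tagged_text(stream, open_tag="<A>", close_tag="</W>",
                     chunk_size=1024 * 1024):
    """Yield each piece of text found between open_tag and close_tag.

    Reads the stream in fixed-size chunks, so memory use is governed by
    the chunk size and the longest tagged span, not by line length.
    """
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of file; an open tag that was never closed is dropped
        buffer += chunk
        pos = 0
        while True:
            start = buffer.find(open_tag, pos)
            if start == -1:
                # No open tag left. Keep only a short tail in case an open
                # tag is split across two chunks.
                tail_start = max(pos, len(buffer) - (len(open_tag) - 1))
                buffer = buffer[tail_start:]
                break
            end = buffer.find(close_tag, start + len(open_tag))
            if end == -1:
                # Open tag seen but not yet closed: keep everything from the
                # open tag onward and wait for more data.
                buffer = buffer[start:]
                break
            yield buffer[start + len(open_tag):end]
            pos = end + len(close_tag)


def extract_to_file(in_path, out_path):
    """Write every extracted piece to out_path, one per line."""
    with open(in_path, "r") as in_file, open(out_path, "w") as out_file:
        for text in iter_tagged_text(in_file):
            out_file.write(text + "\n")
```

With the paths from the script above, this would be called as `extract_to_file('/Users/Desktop/2e.txt', '/Users/Desktop/Poetrylist.txt')`. If an opening tag is never closed, the buffer keeps growing until the end of the file, so this sketch only helps when tagged spans are reasonably short.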
I have little to no programming experience, and I am trying to figure this out for a poetry project I am working on. Any help is greatly appreciated.