I'm using Beautiful Soup 4 to extract text from HTML files. With get_text() I can easily extract just the text, but when I try to write that text to a plain text file, I get the message "416". Here's the code I'm using:
from bs4 import BeautifulSoup
markup = open("example1.html")
soup = BeautifulSoup(markup)
f = open("example.txt", "w")
f.write(soup.get_text())
And the output to the console is 416 but nothing gets written to the text file. Where have I gone wrong?
You need to close the file, or use a with statement to have that handled for you.

416 is the return value of f.write() (the number of characters written), which the interactive console prints. The writes are buffered by default: flush the application buffers (f.flush()) or close the file (f.close(), or use a with statement that does it for you) to be able to see something in the file from outside the Python process.

Note: this does not ensure that the data is actually saved physically to disk. Depending on your OS, filesystem, and drive, that may take a while (usually it doesn't matter unless there is a power failure). os.fsync() might flush the OS buffers.
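The following is a minimal sketch of that advice, not the only way to do it. The helper name write_text, its sync parameter, and the sample string are assumptions made for illustration; the string stands in for soup.get_text() so that the snippet runs without an HTML file.

```python
import os
import tempfile


def write_text(path, text, sync=False):
    """Write text to path and return the count reported by f.write()."""
    with open(path, "w", encoding="utf-8") as f:
        # f.write() returns the number of characters written; this is the
        # kind of number the interactive console echoes (416 in the question).
        written = f.write(text)
        if sync:
            f.flush()             # push Python's own buffer to the OS
            os.fsync(f.fileno())  # ask the OS to flush its buffers too
    # Leaving the with block closes the file, which also flushes it.
    return written


# Hypothetical stand-in for soup.get_text(); any str works here.
sample = "Hello, world!\n"
out_path = os.path.join(tempfile.mkdtemp(), "example.txt")
count = write_text(out_path, sample, sync=True)

# The text is now visible to other readers of the file.
with open(out_path, encoding="utf-8") as f:
    content = f.read()
print(count, repr(content))
```

With the original code, the equivalent call would be write_text("example.txt", soup.get_text()). The sync flag is only worth setting when the data must survive a crash or power failure; for ordinary scripts, closing the file is enough.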