I'm trying to read a large CSV file of unknown structure with pandas. I ran into some errors, so I added the following arguments:
df = pd.read_csv(csv_file, engine="python", error_bad_lines=False, warn_bad_lines=True)
This works well: it skips the offending lines, and the errors are printed to the terminal correctly, such as:
Skipping line 31175: field larger than field limit (131072)
However, I’d like to save all errors to a variable instead of printing them. How can I do it?
Note that this is part of a large program, and I can't redirect all of its log output from file=sys.stdout to something else. I need a case-specific solution.
Thanks!
stderr (not stdout) to a file.
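One case-specific way to collect these messages in a variable is to redirect stderr only for the duration of the one call, using contextlib.redirect_stderr from the standard library. The following is a minimal sketch, not a definitive solution. It assumes the "Skipping line ..." messages are written to sys.stderr, which appears to be the case for pandas versions that accept error_bad_lines and warn_bad_lines; newer versions may report bad lines through the warnings module instead, in which case this will not catch them. The helper name call_capturing_stderr and the stand-in fake_reader are hypothetical and exist only so the example runs without pandas or a CSV file.

```python
import contextlib
import io
import sys


def call_capturing_stderr(func, *args, **kwargs):
    """Call func and return (result, lines written to sys.stderr during the call)."""
    buffer = io.StringIO()
    # sys.stderr is swapped only inside this block and restored afterwards,
    # so the rest of the program keeps logging exactly as before.
    with contextlib.redirect_stderr(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue().splitlines()


# Hypothetical stand-in for the parser: it writes one message to stderr,
# the way the question's output suggests pandas does.
def fake_reader(path):
    sys.stderr.write("Skipping line 31175: field larger than field limit (131072)\n")
    return path


result, errors = call_capturing_stderr(fake_reader, "data.csv")
print(errors)

# With pandas (assumption: a version that still accepts these arguments):
# df, errors = call_capturing_stderr(
#     pd.read_csv, csv_file, engine="python",
#     error_bad_lines=False, warn_bad_lines=True,
# )
```

Because the redirection is scoped to the with block, nothing else in the program has to change its logging. Note that redirect_stderr replaces sys.stderr for the whole process while the block runs, so output from other threads during that window would end up in the buffer as well.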