Apologies if this question is similar to others posted on SO, but I have tried many of the answers given and could not achieve what I am attempting to do.
I have some code that calls an external module:
import trafilatura
# after obtaining article_html
text = trafilatura.extract(article_html, language='en')
This sometimes prints out a warning on the console, which comes from the following code in the trafilatura module:
# at the top of the file
LOGGER = logging.getLogger(__name__)
# in the method that I'm calling
LOGGER.warning('HTML lang detection failed')
I'd like to keep this and other messages produced by the module from being printed directly to the console, and instead store them somewhere so that I can edit the messages and decide what to do with them. (Specifically, I want to save the messages in slightly modified form, but only under certain circumstances.) I am not using the logging library in my own code.
I have tried the following solution suggestions:
import contextlib
import io
buf = io.StringIO()
with contextlib.redirect_stderr(buf):  # I also tried redirect_stdout
    text = trafilatura.extract(article_html, language='en')
and
import io
import sys
buf = io.StringIO()
sysout = sys.stdout
syserr = sys.stderr
sys.stdout = sys.stderr = buf
text = trafilatura.extract(article_html, language='en')
sys.stdout = sysout
sys.stderr = syserr
However, in both cases buf remains empty and trafilatura still prints its logging messages to the console. When I test the redirects above with other calls (e.g. print("test")), they capture the output just fine, so apparently LOGGER.warning() from trafilatura is not writing to the current sys.stderr or sys.stdout?
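A possible explanation, offered as a sketch rather than a confirmed diagnosis: a logging.StreamHandler keeps a reference to the stream object it was given (or to sys.stderr as it was) when the handler was created, so replacing sys.stderr afterwards does not affect it. The logger name redirect_demo and the console stand-in below are hypothetical and exist only for the demonstration. Whether such a handler is present in this setup (for example because some imported code called logging.basicConfig()) is an assumption.

```python
import contextlib
import io
import logging

# Stand-in for the real console stream, so the demo is self-contained.
console = io.StringIO()

# Hypothetical logger name, used only for this demonstration.
demo_logger = logging.getLogger("redirect_demo")
demo_logger.propagate = False  # keep the demo isolated from root handlers

# The handler stores a reference to its stream at construction time.
demo_logger.addHandler(logging.StreamHandler(console))

buf = io.StringIO()
with contextlib.redirect_stderr(buf):
    # redirect_stderr only rebinds the name sys.stderr; the handler
    # still holds the stream object it was created with.
    demo_logger.warning("HTML lang detection failed")

redirected = buf.getvalue()   # what the redirect captured
on_console = console.getvalue()  # what the handler actually wrote
```

If this matches the situation, redirected stays empty while the message lands in the handler's original stream, which would be consistent with buf remaining empty above.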
I thought I could set a different output stream target for trafilatura's LOGGER, but it uses a NullHandler, so I could neither figure out its stream target nor work out how to change it:
# from trafilatura's top-level __init__.py
logging.getLogger(__name__).addHandler(NullHandler())
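For reference, one direction that might work is to attach a custom handler to the package logger instead of redirecting streams. The sketch below assumes the logger is named trafilatura (as the __init__.py line suggests). ListHandler and the child logger name trafilatura.demo_child are hypothetical, and the warning is emitted by hand so that the example runs without the library installed.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical helper: collects log records in a list instead of printing."""

    def __init__(self, level=logging.NOTSET):
        super().__init__(level)
        self.records = []

    def emit(self, record):
        self.records.append(record)


# Assumed name of the package logger, taken from the __init__.py line above.
module_logger = logging.getLogger("trafilatura")
capture = ListHandler()
module_logger.addHandler(capture)
# Stop records from travelling on to root handlers that may print them.
module_logger.propagate = False

# Stand-in for the library: submodule loggers created with
# getLogger(__name__) are children of the package logger.
logging.getLogger("trafilatura.demo_child").warning("HTML lang detection failed")

# Edit the stored messages and decide afterwards what to do with them.
messages = [f"[edited] {record.getMessage()}" for record in capture.records]
```

Setting propagate to False is a design choice in this sketch: it keeps the records away from any handler on the root logger, at the cost of hiding them from other logging configuration.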
Any ideas? Thanks in advance.