
I am calling an API from my project, and the call is taking too long. I suspected that one of the reasons could be the large number of log statements I have put across the project, with the I/O reads/writes taking time.

I am using logging. My guess was that since a LOG_LEVEL discards logs of lower priority, the API call should complete faster at a higher-priority level. But the time is almost the same (the difference being on the order of tenths of a second).

The only reference regarding LOG_LEVEL and performance I have found is this:

The beauty of this is that if you set the log level to WARN, info and debug messages have next to no performance impact.

Some points I should note here:

  1. I have not configured my logs to stream to any log service, like Kibana.

  2. I have checked for situations like this; I am not doing any preprocessing in the log messages.

  3. I have done basic logger initialization, i.e.,

import logging
logger = logging.getLogger(__name__)

and have not used any file to write logs into, as follows. LOG_LEVEL is given as an environment variable.

logging.basicConfig(filename="file_name.log")

Assuming everything else is optimal (and even if it is not, higher-priority logging should still take less time), am I wrong in my guess that the extra time is due to log reads/writes? If not, why does using a higher-priority LOG_LEVEL flag not decrease the time?

In which default location does the logging module store the logs?

  • Did you try without logs? Does it perform better? Commented Sep 16, 2019 at 16:01
  • If you are facing a performance problem, why don't you use a profiler instead of guessing? Commented Sep 16, 2019 at 17:07
  • @itzMEonTV Log is an integral part. Log has to be there. Commented Sep 17, 2019 at 5:29
  • @Leon the problem here is not to know how my program is executing, but to be able to debug if some mistake is made by the end users. Commented Sep 17, 2019 at 5:30
  • 1
    @thepurpleowl I understand that. My comment was related to the second sentence in your question: The API call is taking too long, so I thought one of reasons could be lots and lots of logs that I have put across the project and the IO reads/writes are taking time. Commented Sep 17, 2019 at 6:31

2 Answers


What's the difference between log level performances?

Setting the log level can affect performance, but the effect may not be noticeable until you operate at scale.

When you set the level, you create an early gate that stops the logging process from continuing, and very little happens before this is checked for any individual log call. For example, here is how CRITICAL logs are handled in the source:

if self.isEnabledFor(CRITICAL):
    self._log(CRITICAL, msg, args, **kwargs)

The logger does much more in _log than just this check, so there are time gains from setting a log level. But the path is fairly optimized, so once you have initiated a logger at all, you probably won't notice much difference unless the number of calls differs by a large amount.
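As a rough illustration, a micro-benchmark sketch (throwaway logger name and an in-memory stream are my choices here, so nothing hits disk) can time a suppressed call against an emitted one:

```python
import io
import logging
import timeit

# Throwaway logger writing to an in-memory buffer so no real I/O occurs.
logger = logging.getLogger("level_bench")
logger.addHandler(logging.StreamHandler(io.StringIO()))
logger.propagate = False  # keep records away from the root logger

logger.setLevel(logging.WARNING)  # DEBUG disabled: only the level check runs
suppressed = timeit.timeit(lambda: logger.debug("msg %s", 1), number=10_000)

logger.setLevel(logging.DEBUG)    # DEBUG enabled: the full record path runs
emitted = timeit.timeit(lambda: logger.debug("msg %s", 1), number=10_000)

print(f"suppressed: {suppressed:.4f}s  emitted: {emitted:.4f}s")
```

On a typical machine the suppressed calls are several times cheaper, but both cost microseconds per call, which is why changing LOG_LEVEL alone rarely moves an API's wall-clock time by much.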

If you removed the logging calls entirely instead of just setting the level, you would gain more, because even the check itself takes some amount of time.

Where are logs stored by default?

By default, without setting a file, a StreamHandler [source] is used, and without a specific stream it writes to sys.stderr. When you set a file, a FileHandler is created, which inherits from StreamHandler [source].

How do I optimize?

For the question you didn't ask, which is How do I speed up logging?, I would suggest looking at this, which gives some advice. Part of that advice is what I pointed out above, but it also tells you to explicitly check your log level, and you can even cache that result and check the cache instead, which should reduce the time even further.
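A sketch of that advice, where the hypothetical expensive_repr stands in for any costly message-building work:

```python
import logging

logger = logging.getLogger("guard_demo")
logger.setLevel(logging.WARNING)  # DEBUG is disabled

calls = 0
def expensive_repr():
    """Stand-in for costly work such as serializing a large object."""
    global calls
    calls += 1
    return "huge dump"

# Without a guard, the argument is evaluated even though the record
# is then discarded by the level check inside debug():
logger.debug("state: %s", expensive_repr())

# With an explicit guard, the expensive work is skipped entirely:
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("state: %s", expensive_repr())

print(calls)  # still 1: the guarded call never ran expensive_repr

# If the level never changes at runtime, the check itself can be cached:
DEBUG_ENABLED = logger.isEnabledFor(logging.DEBUG)
```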

Check out this answer for even more on optimizing logging.

And finally, if you want to pin down the speed issues in your code, whether they come from logging or not, you need to use a profiler. There are built-in profiling functions in Python; check here.




One log level isn't inherently more performant than another. However, whether a level is enabled for logging, whether loggers are nested (in your example, this would happen if __name__ contained dots, like mypackage.core.logs), and the version of Python you are running can all affect performance. This is because three things happen when you make a logging call:

  1. The logger determines if the logging level is enabled.

    This happens for every call. In versions of Python before 3.7, this check was not cached, and nested loggers took longer to determine whether they were enabled. How much longer? In some benchmarks, twice as long. That said, this depends heavily on logger nesting, and even when logging millions of messages it may only save a few seconds of system time.

  2. The logger processes the record.

    This is where the optimizations outlined in the documentation come into play. They allow the record creation to skip some steps.

  3. The logger sends the record to the handler.

    This may be the default StreamHandler, a FileHandler, a SysLogHandler, or any number of built-in or custom handlers. In your example, you would be using a FileHandler to write to file_name.log in the current directory. This may be fine for smaller applications; larger applications would benefit from using an external logger such as syslog or the systemd journal. The main reason is that these operate in a separate process and are optimized for processing a large number of logs.
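For step 2, the stdlib's deferred %-formatting is worth illustrating: arguments are only merged into the message if the record is actually handled. A small sketch (the Probe class is hypothetical, counting how often it is formatted):

```python
import logging

logger = logging.getLogger("fmt_demo")
logger.setLevel(logging.WARNING)  # DEBUG records are discarded

class Probe:
    """Hypothetical object that counts how often it is str()-formatted."""
    formats = 0
    def __str__(self):
        Probe.formats += 1
        return "probe"

p = Probe()
logger.debug("p=%s", p)   # deferred: the record is dropped before formatting
logger.debug(f"p={p}")    # eager: the f-string formats p before debug() runs

print(Probe.formats)  # only the f-string triggered __str__
```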
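For step 3, the stdlib also offers QueueHandler/QueueListener (Python 3.2+) to move handler I/O off the calling thread, similar in spirit to the external-logger suggestion above. A sketch, writing to a temporary file of my choosing:

```python
import logging
import logging.handlers
import os
import queue
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "app.log")

# The application logger only enqueues records: cheap and non-blocking.
q = queue.Queue(-1)
logger = logging.getLogger("offload_demo")
logger.setLevel(logging.INFO)
logger.addHandler(logging.handlers.QueueHandler(q))
logger.propagate = False

# A listener thread drains the queue and does the slow file write.
file_handler = logging.FileHandler(log_path)
listener = logging.handlers.QueueListener(q, file_handler)
listener.start()

logger.info("handled off the hot path")
listener.stop()          # drains the queue before returning
file_handler.close()

with open(log_path) as f:
    print(f.read())
```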

