24

Is it advantageous to call logging functions with format string + args list vs. formatting inline?

I've seen (and written) logging code that uses inline string formatting:

logging.warn("%s %s %s" % (arg1, arg2, arg3))

and yet I assume it's better (performance-wise, and more idiomatic) to use:

logging.warn("%s %s %s", arg1, arg2, arg3)

because the second form avoids string formatting operations prior to invoking the logging function. If the current logging level would filter out the log message, no formatting is necessary, reducing computing time and memory allocations.

Am I on the right track here, or have I missed something?

9
  • 2
    I think you're on the right track. Commented Aug 14, 2012 at 15:32
  • Do you have a reference about the timing of the string formatting taking place at different times? I would have expected both take place before the call to warn is made. Not disputing what you are saying, just curious to find out more. Commented Aug 14, 2012 at 15:42
  • Well, in the first form, we're doing string formatting (and tuple creation) prior to the call to logging.warn - in the second case, we're just passing a list of args to logging.warn (which is merely tuple creation?) - so we avoid a string formatting operation at the invocation. As I mentioned in the question, I assume that if the current logging level would filter out the log message, no formatting would be done, again avoiding the string formatting operation. My conjecture is this should save processing time and memory allocations. Commented Aug 14, 2012 at 15:48
  • @Inactivist Ah .. ok, thanks for the additional information. I guess you can draw your own conclusions from the timings I provided with the information you already have. I guess another approach might be to perhap time the actual calls and see how they compare? Commented Aug 14, 2012 at 15:50
  • 4
    possible duplicate of Lazy logger message string evaluation Commented Aug 14, 2012 at 16:17

3 Answers 3

25

IMHO, for messages that are very likely to be displayed, such as those given to error or warn it does not make much of a difference.

For messages that are less likely displayed, I would definitely go for the second version, mainly for performance reasons. I often give large objects as a parameter to info, which implement a costly __str__ method. Clearly, sending this pre-formatted to info would be a performance waste.

UPDATE

I just checked the source code of the logging module and, indeed, formatting is done after checking the log level. For example:

class Logger(Filterer):
    # snip
    def debug(self, msg, *args, **kwargs):
        # snip
        if self.isenabledfor(debug):
            self._log(debug, msg, args, **kwargs)

One can observe that msg and args are untouched between calling log and checking the log level.

UPDATE 2

Spired by Levon, let me add some tests for objects that have a costly __str__ method:

$ python -m timeit -n 1000000 -s "import logging" -s "logger = logging.getLogger('foo')" -s "logger.setLevel(logging.ERROR)" "logger.warn('%s', range(0,100))"
1000000 loops, best of 3: 1.52 usec per loop
$ python -m timeit -n 1000000 -s "import logging" -s "logger = logging.getLogger('foo')" -s "logger.setLevel(logging.ERROR)" "logger.warn('%s' % range(0,100))"
1000000 loops, best of 3: 10.4 usec per loop

In practice, this could give a fairly high performance boost.

Sign up to request clarification or add additional context in comments.

3 Comments

Another time this is likely to make a difference is if the logging is in a performance critical loop.
That's what I expected based on observation. Thanks for going source diving!
Python 3.6 f-string formatting shows the same time penalty: python -m timeit -n 1000000 -s "import logging" -s "logger = logging.getLogger('foo')" -s "logger.setLevel(logging.ERROR)" -s "x=list(range(0,100))" "logger.warn(f'{x}')" yields 10 usec per loop, same as with classic style string formatting. Passing the formatting parameters directly to the log statement yields 2.12 usec.
9

In case this is helpful, here is a quick timing test for just the two formatting options:

In [61]: arg1='hello'
In [62]: arg2='this'
In [63]: arg3='is a test'

In [70]: timeit -n 10000000 "%s %s %s" % (arg1, arg2, arg3)
10000000 loops, best of 3: 284 ns per loop

In [71]: timeit -n 10000000  "%s %s %s", arg1, arg2, arg3
10000000 loops, best of 3: 119 ns per loop

seems to give the second approach the edge.

11 Comments

And what's about subsequent call of % withing log function?
So, the most important lesson I can learn from your answer: timeit is my friend! :D
@Inactivist I've found it to be useful .. often to verify (or contradict) my assumptions about what's speedy and what's not :)
Is there a similar simple way to determine differences in memory consumption between the two variants?
@Inactivist I've wondered about this too, and unfortunately have not found one. If there's one, I'd love to know though.
|
6

Avoiding inline string formatting does save some time if the current logging level filters the log message (as I expected) -- but not much:

In [1]: import logging

In [2]: logger = logging.getLogger('foo')

In [3]: logger.setLevel(logging.ERROR)

In [4]: %timeit -n 1000000 logger.warn('%s %s %s' % ('a', 'b', 'c'))
1000000 loops, best of 3: 1.09 us per loop

In [12]: %timeit -n 1000000 logger.warn('%s %s %s', 'a', 'b', 'c')
1000000 loops, best of 3: 946 ns per loop

So, as user1202136 pointed out, the overall performance difference depends on how long it takes to format the string (which could be significant depending on the cost of calling __str__ on arguments being passed to the logging function.)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.