
Python's interpreter can be run with an -O flag. This option generates "optimized" bytecode (written to .pyo files) and, given twice, discards docstrings as well. From Python's man page:

-O Turn on basic optimizations. This changes the filename extension for compiled (bytecode) files from .pyc to .pyo. Given twice, causes docstrings to be discarded.

This option's two major features as I see it are:

  • Strip all assert statements. This trades defense against corrupt program state for speed. But don't you need a ton of assert statements for this to make a difference? Do you have any code where this is worthwhile (and sane)?

  • Strip all docstrings. In what application is the memory usage so critical, that this is a win? Why not push everything into modules written in C?

What is the use of this option? Does it have a real-world value?
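The assert-stripping effect is easy to observe from a subprocess; a minimal sketch using only the standard library:

```python
import subprocess
import sys

code = "assert False, 'boom'; print('still running')"

# Without -O, the assert fires and nothing reaches stdout.
plain = subprocess.run([sys.executable, "-c", code],
                       capture_output=True, text=True)

# With -O, the assert is stripped at compile time, so execution continues.
optimized = subprocess.run([sys.executable, "-O", "-c", code],
                           capture_output=True, text=True)

print(plain.returncode)          # non-zero: AssertionError
print(optimized.stdout.strip())  # still running
```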

    You can use it to flip the blinkenlights on your test suite by making them sneakily ignore the assertions. Hurrah! You've finished the project! (Note: Don't do this) Commented Dec 2, 2014 at 8:55

7 Answers


Another effect of the -O flag is that it sets the __debug__ builtin variable to False.

So, basically, your code can have a lot of "debugging" paths like:

if __debug__:
    # output all your favourite debugging information
    # and then more

which, when running under -O, won't even be included as bytecode in the .pyo file; a poor man's C-ish #ifdef.

Remember that docstrings are dropped only when the flag is -OO.


7 Comments

Wow. I thought you wanted to know what is the real world use of this option. Thanks for finding my answer next to useless. By the way, if you want someone to justify the choices of Guido and the rest of the Python core team, you shouldn't be asking questions here; finally, you can rely on a specific mode being used, the programmer can control whether optimization is used or not; ask a relevant question in SO as to how. I hereby declare your assumptions next to wrong and my time next to lost. Cheers. Sorry for disappointing you.
There is no reason for me to be disappointed about getting lots of answers to my question -- I like the conversations in stackoverflow. I mean what I say but I talk about the example you showed. The fact that you showed it or you yourself are not judged negatively at all.
python-ldap uses __debug__; it controls whether the debug trace statement logic is used or not. In fairness, checking against __debug__ is a lot faster than doing a hash look-up against the local values in memory, then doing another hash look-up to see if it is in debug mode. However, since .pyo files are generally not created for most people, you generally shouldn't bother with __debug__ and should have another means of switching between debug and non-debug mode.
Incidentally, a variety of real-world open-source frameworks already leverage __debug__ – including distlib, html5lib, IPython, Jinja2, matplotlib, python-ldap, speechd, and too many official CPython stdlib modules to count (e.g., imaplib, pickletools, statistics, unittest). __debug__ absolutely has its place. I'd like to see it leveraged more, honestly.
@CecilCurry: lilydjwg said they “haven't see people using __debug__ in real code yet”, so it was subjective, and that's why I never answered. In any case, I often find that using one's lack of experience (“I never experienced…”) as an argument against facts (in this case: actual uses of __debug__) is an alternative way of saying “I don't like it, so you shouldn't too”.

On stripping assert statements: this is a standard option in the C world, where many people believe part of the definition of ASSERT is that it doesn't run in production code. Whether stripping them out or not makes a difference depends less on how many asserts there are than on how much work those asserts do:

def foo(x):
    assert x in huge_global_computation_to_check_all_possible_x_values()
    # ok, go ahead and use x...

Most asserts are not like that, of course, but it's important to remember that you can do stuff like that.
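A hypothetical but realistic shape for such an assert: an O(n) invariant check guarding an O(log n) operation, so stripping it under -O changes the function's asymptotic cost. The function names here are illustrative, not from the answer above:

```python
def is_sorted(lst):
    # O(n) invariant check -- cheap-looking, but it dwarfs the
    # O(log n) search it guards when called in a hot loop
    return all(lst[i] <= lst[i + 1] for i in range(len(lst) - 1))

def binary_search(lst, x):
    # Stripped entirely under -O, turning the call back into O(log n).
    assert is_sorted(lst), "binary_search requires a sorted list"
    lo, hi = 0, len(lst)
    while lo < hi:
        mid = (lo + hi) // 2
        if lst[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo
```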

As for stripping docstrings, it does seem like a quaint holdover from a simpler time, though I guess there are memory-constrained environments where it could make a difference.

4 Comments

history is important, good point. However, I don't want to see toy examples, I want to see what asserts are used in real-world code and if it makes a difference.
Memory speed is growing far slower than CPU speed, especially if you consider that we keep adding processors faster than adding memory bandwidth. So, memory is the new disk and L2 cache is the new memory. And L2 caches are tiny (compared to memory), and they actually keep getting smaller. (Core2 has 6144KiB, i7 only 256KiB, for example.) So, counting bytes is actually becoming useful again.
OpenGL libs like PyOpenGL and pyglet do some very expensive safety check assertions at runtime unless you specify -O.
If you use strict Contract Programming, you will likely have asserts at the beginning and end of every function you write.
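A minimal contract-style sketch of the pattern the last comment describes (the function is hypothetical): preconditions and postconditions as asserts, all of which vanish under -O.

```python
def sqrt_floor(n):
    # precondition (contract: caller must supply a non-negative int)
    assert isinstance(n, int) and n >= 0, "n must be a non-negative int"
    r = int(n ** 0.5)
    # floating-point rounding can be off by one; correct it
    while r * r > n:
        r -= 1
    while (r + 1) * (r + 1) <= n:
        r += 1
    # postcondition (contract: r is the integer square root)
    assert r * r <= n < (r + 1) * (r + 1)
    return r
```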

If you have assertions in frequently called code (e.g. in an inner loop), stripping them can certainly make a difference. Extreme example:

$ python    -c 'import timeit;print timeit.repeat("assert True")'
[0.088717937469482422, 0.088625192642211914, 0.088654994964599609]
$ python -O -c 'import timeit;print timeit.repeat("assert True")'
[0.029736995697021484, 0.029587030410766602, 0.029623985290527344]

In real scenarios, savings will usually be much less.

Stripping the docstrings might reduce the size of your code, and hence your working set.

In many cases, the performance impact will be negligible, but as always with optimizations, the only way to be sure is to measure.

6 Comments

this question is about real-world code. btw, this is more practical: python -mtimeit "" "assert(True)" (setup in first argument)
This seems to be a strange example to me. You reduce code that is trivial to code that is nonexistant—that doesn't show much about practical speed gains I think. A realistic use case would be an operation that makes a lot of assumptions that are expensive to check compared to performing the operation, but you believe they should always be satisfied. For example, if I'm trying to return the roots of a parabola, I could check that b**2 - 4*a*c > 0 to ensure real roots, if that's what I am interested in. Many useful formulae have lots of constraints.
Also, assert is a statement that I meant to be used like "assert True", not assert(True). This becomes important when you add the message, as assert a == b, "Must be true" is very different than assert(a == b, "Must be true"), and in particular the latter always passes.
@kaizer.se: no stmt is first argument, setup is second; in your example, the assert would be in the setup, so that -O has no measurable effect
@Mike: of course it's strange, as most examples are reduced to the most extreme. Basically, the optimized version measures the overhead of the timeit loop, and the unoptimized version shows the overhead of assert itself. Real-life savings may be more or less, depending on what's more expensive: your working code or the assertions. Often, but not always, assertions are relatively trivial, thus one may claim that usually the savings will be less. Thanks for the reminder about the parentheses, I removed them!

I have never encountered a good reason to use -O. I have always assumed its main purpose is in case at some point in the future some meaningful optimization is added.

1 Comment

Well, it does do a couple things, they just aren't typically all that useful.

But don't you need a ton of assert statements for this to make a difference? Do you have any code where this is worthwhile (and sane?)

As an example, I have a piece of code that gets paths between nodes in a graph. I have an assert statement at the end of the function to check that the path doesn't contain duplicates:

assert not any(a == b for a, b in zip(path, path[1:]))

I like the peace of mind and clarity that this simple statement gives during development. In production, the code processes some big graphs and this single line can take up to 66% of the run time. Running with -O therefore gives a significant speed-up.
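A self-contained sketch of that pattern (the graph representation and function name here are hypothetical, not the answerer's actual code): the assert verifies the no-adjacent-duplicates invariant on every returned path, and disappears entirely under -O.

```python
def walk(graph, start, goal):
    # hypothetical iterative DFS returning one path from start to goal
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            # invariant: no node appears twice in a row
            assert not any(a == b for a, b in zip(path, path[1:]))
            return path
        for nxt in graph.get(node, []):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None
```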



I imagine that the heaviest users of -O are py2exe, py2app, and similar packaging tools.

I've personally never found a use for -O directly.

2 Comments

and why does py2exe use it?
When creating the stand-alone executable, there is no need for docstrings. They only take up space in memory.

You've pretty much figured it out: It does practically nothing at all. You're almost never going to see speed or memory gains, unless you're severely hurting for RAM.

1 Comment

or if __debug__: r = long_running_function(); assert n - 0.01 < r; assert r < n + 0.01, testing tolerances of a heuristic (n being the result of the heuristic): generally useful when programming, but useless (and harmful, since it might never complete on real data) when actually using the heuristic, because the whole point of the heuristic is to avoid the calculation. So a function can go from never halting to completing in milliseconds. That sounds like a hell of a gain!
