The % operator for string formatting is described here.

Usually, when the format string contains no conversion specifier, the operator raises TypeError: not all arguments converted during string formatting. For instance, "" % 1 fails. So far, so good.

Sometimes, it won't fail, though, if the argument on the right of the % operator is something empty: "" % [], or "" % {} or "" % () will silently return the empty string, and it looks fair enough.

With "%s" instead of the empty string, the empty object is converted to a string, except in the last case: "%s" % () fails, because an empty tuple supplies no arguments. I think that is one of the known problems of the % operator, which the format method solves.

There is also the case of a non-empty dictionary, like "" % {"a": 1}, which works because it's really supposed to be used with named type specifiers, like in "%(a)d" % {"a": 1}.
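
The cases listed so far can be checked with a small script. This is only a sketch: try_format is a hypothetical helper written for this illustration, not part of any library, and it simply reports the error message instead of raising.

```python
def try_format(fmt, arg):
    """Return the result of fmt % arg, or the error message if it fails."""
    try:
        return fmt % arg
    except TypeError as exc:
        return "TypeError: " + str(exc)


cases = [
    ("", 1),            # no specifier, non-container argument: fails
    ("", []),           # empty list: silently accepted
    ("", {}),           # empty dict: silently accepted
    ("", ()),           # empty tuple: zero positional arguments, accepted
    ("%s", []),         # the list itself is formatted
    ("%s", {}),         # the dict itself is formatted
    ("%s", ()),         # empty tuple supplies no argument: fails
    ("", {"a": 1}),     # non-empty dict: unused keys are ignored
    ("%(a)d", {"a": 1}),  # named specifier looks up the key "a"
]

for fmt, arg in cases:
    print(repr(fmt), "%", repr(arg), "->", repr(try_format(fmt, arg)))
```

Only the first and seventh cases should report a TypeError; the others return a string.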

However, there is one case I don't understand: "" % b"x" will return the empty string, no exception raised. Why?

  • Sorry, I can't reproduce that: "" % b"x" gives me TypeError: not all arguments converted during string formatting. What's your Python version? Commented Feb 17, 2015 at 11:06
  • @MarcusMüller It's Python 3.4.2, 32 bits, on Windows. Your comment makes me think it may be a bug after all. (yes, thanks for the tag) Commented Feb 17, 2015 at 11:07
  • Added python-3.x tag. I've tried with Python 2.7. Commented Feb 17, 2015 at 11:07

2 Answers

I'm not 100% sure, but after a quick look in the sources, I guess the reason is the following:

When there's only one argument on the right of %, Python checks whether it has a __getitem__ method and, if so, assumes it is a mapping and expects us to use named formats like %(name)s. Otherwise, Python creates a single-element tuple from the argument and performs positional formatting. The argument count is not checked for mappings; therefore, since bytes and lists do have __getitem__, they won't fail:

>>> "xxx" % b'a'
'xxx'
>>> "xxx" % ['a']
'xxx'

Consider also:

>>> class X: pass
... 
>>> "xxx" % X()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting

>>> class X:
...    def __getitem__(self,x): pass
... 
>>> "xxx" % X()
'xxx'
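
To make the mapping path more visible, here is a minimal sketch in which __getitem__ records what it is asked for. The Echo class and its requested attribute are invented for this illustration; the point is only that the key passed to __getitem__ is the placeholder name as a string.

```python
class Echo:
    """Hypothetical mapping-like object that returns each key in upper case."""

    def __init__(self):
        self.requested = []  # keys that % asked for, in order

    def __getitem__(self, key):
        self.requested.append(key)
        return key.upper()


echo = Echo()
result = "%(greeting)s, %(name)s!" % echo
print(result)           # GREETING, NAME!
print(echo.requested)   # ['greeting', 'name']

# With no placeholders, __getitem__ is never called and nothing is raised.
silent = Echo()
print("xxx" % silent, silent.requested)
```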

Strings are an exception to this rule: they have __getitem__, but are still "tuplified" for positional formatting:

>>> "xxx" % 'a'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting

Of course, this "sequences as mappings" logic doesn't make much sense, because formatting keys are always strings:

>>> "xxx %(0)s" % ['a']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not str

but I doubt anyone is going to fix that, given that % is abandoned anyways.


4 Comments

With support for % formatting for bytes objects (which does fix this case), I suspect this can still be considered a bug though.
The idea looks nice, however "%d" % b"a" raises an exception (one could expect that it would return "97", using __getitem__ to return an integer)
@Jean-ClaudeArbaut: no, __getitem__ is only invoked for named placeholders like %(name)s. Unnamed placeholders are always positional.
Yes, I saw it after my comment. Actually, it would make sense if "%(0)d" was accepted, but it's not. I didn't pay enough attention to the end of your post at first :-)

The offending line is in unicodeobject.c. It treats every object that is a "mapping", and is explicitly not a tuple or a string (or a subclass thereof), as a "dictionary", and for those it is not an error if not all arguments are converted.

The PyMapping_Check is defined as:

int
PyMapping_Check(PyObject *o)
{
    return o && o->ob_type->tp_as_mapping &&
        o->ob_type->tp_as_mapping->mp_subscript;
}

That is, any type that defines tp_as_mapping with a non-null mp_subscript counts as a mapping.

And bytes does define that, as does any other object with __getitem__. Thus, in Python 3.4 at least, no object with __getitem__ (tuples and strings aside) fails with "not all arguments converted" as the right-hand argument of the % format operator.
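
As a rough illustration of the check described above, the following sketch approximates it in pure Python and compares the prediction with what % actually does. The names treated_as_mapping and silently_ignored are made up for this example, and hasattr(type(arg), "__getitem__") is only an approximation of PyMapping_Check: it should agree for the built-in types shown and for classes written in Python, but a C type that only fills the sequence slot could behave differently.

```python
def treated_as_mapping(arg):
    """Approximate: PyMapping_Check(arg) and arg is neither a tuple nor a str."""
    has_subscript = hasattr(type(arg), "__getitem__")  # assumption, see above
    return has_subscript and not isinstance(arg, (tuple, str))


def silently_ignored(arg):
    """True if a format string without placeholders accepts arg without error."""
    try:
        return "xxx" % arg == "xxx"
    except TypeError:
        return False


samples = [1, None, "a", (1,), [], ["a"], {}, {"a": 1}, b"x"]
for arg in samples:
    print(repr(arg), treated_as_mapping(arg), silently_ignored(arg))
```

The empty tuple is deliberately left out of the samples: "xxx" % () also succeeds, but for a different reason (zero positional arguments, all of them consumed).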

Now this is a change from Python 2.7, where b"x" is an ordinary str and is therefore excluded. The reason for the broad check is that there is no way to detect all possible types that could be used for %(name)s formatting, except by accepting every type that implements __getitem__, with the most obvious mistakes taken out. When Python 3 was published, no one added bytes to the exclusions, though it clearly doesn't support strings as arguments to __getitem__; but list isn't excluded either.

Another oversight is that a list cannot be used to supply positional parameters for formatting.
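
A short sketch of that last point, with the usual workaround of converting the sequence to a tuple first. format_positional is a hypothetical helper name used only for this example.

```python
def format_positional(fmt, seq):
    """Convert any sequence to a tuple so that % treats it positionally."""
    return fmt % tuple(seq)


items = ["a", "b"]

# The list is taken as a single argument, so the second %s has nothing left.
try:
    "%s %s" % items
except TypeError as exc:
    print("list failed:", exc)

print(format_positional("%s %s", items))  # a b

# With a single %s, the whole list is formatted as one value.
print("%s" % items)  # ['a', 'b']
```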
