7

i have a <img src=__string__> but string might contain ", what should I do to escape it?

Example:

__string__ = test".jpg
<img src="test".jpg">

doesn't work.

1

5 Answers 5

15

In Python 3.2 a new html module was introduced, which is used for escaping reserved characters from HTML markup.

It has one function html.escape(s, quote=True). If the optional flag quote is true, the characters (") and (') are also translated.

Usage:

>>> import html
>>> html.escape('x > 2 && x < 7')
'x &gt; 2 &amp;&amp; x &lt; 7'
Sign up to request clarification or add additional context in comments.

1 Comment

Your answer sounds as if html was not available for Python 2, but it is.
13

If your value being escaped might contain quotes, the best thing is to use the quoteattr method: http://docs.python.org/library/xml.sax.utils.html#module-xml.sax.saxutils

This is referenced right beneath the docs on the cgi.escape() method.

3 Comments

+1, quoteattr is exactly the right function to use for this (and the online Python docs are pretty clear about this, too!).
That's cool. But worth noting if your string contains both single and double quotes, you'll get a URL with &quot; in it, which is not likely to resolve to the resource you are targeting.
This function is insufficient. I was able to inject HTML this way. django.utils.html.escape worked, though.
4
import cgi
s = cgi.escape('test".jpg', True)

http://docs.python.org/library/cgi.html#cgi.escape

Note that the True flag tells it to escape double quotes. If you need to escape single quotes as well (if you're one of those rare individuals who use single quotes to surround html attributes) read the note in that documentation link about xml.sax.saxutils.quoteattr(). The latter does both kinds of quotes, though it is about three times as slow:

>>> timeit.Timer( "escape('asdf\"asef', True)", "from cgi import escape").timeit()
1.2772219181060791
>>> timeit.Timer( "quoteattr('asdf\"asef')", "from xml.sax.saxutils import quoteattr").timeit()
3.9785079956054688

2 Comments

cgi.escape does not escape single quotes. For this reason it is dangerous to use it for HTML escaping, because the attribute the variable is being put into may be single quoted. If the attribute is single quoted, a cross-site scripting vulnerability could easily be found.
I explicitly mentioned the single quote issue in my answer.
2

If the URL you're using (as an img src here) might contain quotes, you should use URL quoting.

For python, use the urllib.quote method before passing the URL string to your template:

img_url = 'test".jpg'
__string__ = urllib.quote(img_url)

2 Comments

thanks, but if its not url or unicode, it fails for title attribute
@Timmy, what do you mean by "it fails for title attribute"? The call to urllib.quote returns "test%22.jpg", which I believe is what you want.
-3

The best way to escape XML or HTML in python is probably with triple quotes. Note that you can also escape carriage returns.

"""<foo bar="1" baz="2" bat="3">
<ack/>
</foo>
"""

2 Comments

I don't think that answers the question. He's wanting to know how to properly escape quotes inside __string__, since he's using quotes around __string__.
that does not answer the question.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.