Bringing sanity to the world of messed-up data

Project maintained by Alir3z4



sanitize is a Python module for making sure various things (e.g. HTML) are safe to use. It was originally written by Mark Pilgrim and is distributed under the BSD license.


>>> from sanitize import HTML
>>> HTML('<b>hello')
>>> HTML('<img>')
'<img />'
>>> HTML('<b><b><b>hello')
>>> HTML('<img src="foo"/')
>>> HTML('<input type="checkbox" checked>')
'<input type="checkbox" checked="checked" />'
>>> # dangerous tags (a small sample)
>>> HTML('safe<applet code="foo.class" codebase=""></applet> <b>description</b>')
'safe <b>description</b>'
>>> HTML('safe<frameset rows="*"><frame src=""></frameset> <b>description</b>')
'safe <b>description</b>'
>>> # bad protocols (a small sample)
>>> HTML('<a href="java' + chr(1) + 'script:foo">bar</a>')
'<a href="#foo">bar</a>'
>>> HTML('<a href="vbscript:foo">bar</a>')
'<a href="#foo">bar</a>'
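
The bad-protocol examples above can be imitated with a small standalone sketch. This is not sanitize's own code: the function name `neutralize_href`, the `SAFE_SCHEMES` allowlist, and the exact stripping rules are assumptions made for illustration. It only reproduces the visible behaviour of the two `href` examples, where an unsafe scheme is replaced by `#`.

```python
import re

# Assumption: a small allowlist of schemes; the real module's list may differ.
SAFE_SCHEMES = {"http", "https", "ftp", "mailto"}

# ASCII control characters and whitespace, which may be used to disguise a scheme.
_CONTROL_OR_SPACE = re.compile(r"[\x00-\x20\x7f]")

# A URL scheme: a letter followed by letters, digits, '+', '.' or '-', then ':'.
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):(.*)$", re.S)


def neutralize_href(url):
    """Return url unchanged if it has no scheme or an allowlisted one.

    Otherwise return '#' followed by whatever came after the scheme.
    """
    # Strip disguising characters only for detection, not from safe URLs.
    cleaned = _CONTROL_OR_SPACE.sub("", url)
    match = _SCHEME.match(cleaned)
    if match is None:
        return url  # no scheme: treat as a relative URL
    scheme, rest = match.groups()
    if scheme.lower() in SAFE_SCHEMES:
        return url
    return "#" + rest


print(neutralize_href("java" + chr(1) + "script:foo"))  # -> #foo
print(neutralize_href("vbscript:foo"))                  # -> #foo
print(neutralize_href("http://example.com/"))           # -> http://example.com/
```

An allowlist is used here rather than a blocklist, so unknown schemes are rejected by default. Whether sanitize itself works this way is not stated in this README.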

For more usage examples, see tests/.


python-sanitize is available on PyPI.

Install it with pip:

$ pip install sanitize

Or with easy_install:

$ easy_install sanitize

Alternatively, clone python-sanitize's Git repository:

$ git clone git://

Then install it by running:

$ python setup.py install


To run unit tests:

$ python setup.py test


sanitize is distributed under the BSD license.