Handling HTML Entity Encoding in Python

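Python 2
import HTMLParser

# Numeric character reference for U+3039 (〹), written as a raw string
char = r"&#12345;"
http_parser = HTMLParser.HTMLParser()
# unescape() decodes character references and returns the decoded string
uChar = http_parser.unescape(char)
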
Python 3 (html.unescape is available from Python 3.4 onward)
from html import unescape

# "下一页" means "next page"; the u prefix is not needed in Python 3
s = 'position.php?&amp;start=10#a" id="next">下一页</a>'

print(s)

print(unescape(s))

"""
position.php?&amp;start=10#a" id="next">下一页</a>
position.php?&start=10#a" id="next">下一页</a> 
"""
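
The two variants above can be folded into a single import that runs on either interpreter. The following is only a minimal sketch: `decode_entities` and `samples` are hypothetical names introduced here for illustration, and the fallback branch assumes a Python 2 interpreter where the `html` module does not provide `unescape`.

```python
try:
    # Python 3.4+: standard-library function
    from html import unescape
except ImportError:
    # Python 2 fallback: unescape() is a method on an HTMLParser instance
    import HTMLParser
    unescape = HTMLParser.HTMLParser().unescape


def decode_entities(text):
    """Return text with HTML character references decoded (hypothetical helper)."""
    return unescape(text)


# Named, decimal, and hexadecimal references all go through the same call
samples = ["&amp;", "&lt;a&gt;", "&#12345;", "&#x3039;"]
decoded = [decode_entities(s) for s in samples]
print(decoded)
```

Both `&#12345;` and `&#x3039;` refer to the same code point (U+3039), so the last two entries decode to the same character.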

Reference: "Python处理HTML实体编码" (Handling HTML Entity Encoding in Python)