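A common scenario is using Python to work with a web API that returns JSON. Take the API of "ip.taobao.com" as an example; details are at http://ip.taobao.com/instructions.php
It is an interface for looking up the location of an IP address. The response includes the country, region and ISP, all in Chinese.
I fetch the API data like this: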
import json, urllib
myjson = json.loads(urllib.urlopen(url).read())
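This line uses two modules, urllib and json: urllib requests the page and fetches the JSON-formatted data, while json parses that text into Python objects. But when you print the variable "myjson", all the Chinese text shows up as \uXXXX Unicode escapes. You can use: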
newjson = json.dumps(myjson, ensure_ascii=False)
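This makes the Chinese text display properly. By default (ensure_ascii=True) json.dumps escapes every non-ASCII character, and Chinese characters are outside ASCII, so they come out as escape sequences instead of readable text. The help text for json.dumps describes what happens when ensure_ascii is set to False: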
>>> help(json.dumps)
Help on function dumps in module json:
dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, encoding='utf-8', default=None, **kw)
Serialize ``obj`` to a JSON formatted ``str``.
If ``skipkeys`` is ``True`` then ``dict`` keys that are not basic types
(``str``, ``unicode``, ``int``, ``long``, ``float``, ``bool``, ``None``)
will be skipped instead of raising a ``TypeError``.
If ``ensure_ascii`` is ``False``, then the return value will be a
``unicode`` instance subject to normal Python ``str`` to ``unicode``
coercion rules instead of being escaped to an ASCII ``str``.
If ``check_circular`` is ``False``, then the circular reference check
for container types will be skipped and a circular reference will
result in an ``OverflowError`` (or worse).
If ``allow_nan`` is ``False``, then it will be a ``ValueError`` to
serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``) in
strict compliance of the JSON specification, instead of using the
JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).
If ``indent`` is a non-negative integer, then JSON array elements and
object members will be pretty-printed with that indent level. An indent
level of 0 will only insert newlines. ``None`` is the most compact
representation.
If ``separators`` is an ``(item_separator, dict_separator)`` tuple
then it will be used instead of the default ``(', ', ': ')`` separators.
``(',', ':')`` is the most compact JSON representation.
``encoding`` is the character encoding for str instances, default is UTF-8.
``default(obj)`` is a function that should return a serializable version
of obj or raise TypeError. The default simply raises TypeError.
To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
``.default()`` method to serialize additional types), specify it with
the ``cls`` kwarg.
>>>
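As a minimal sketch of the ensure_ascii behavior described in the help text, the snippet below serializes the same dictionary twice and compares the results. It is written for Python 3 (an assumption; the code in this post targets Python 2, where the second call returns a ``unicode`` object rather than ``str``), and the `record` dictionary is made-up sample data, not a real API response.

```python
import json

# Illustrative sample data (hypothetical, not an actual API response).
record = {"country": "中国", "isp": "电信"}

# Default: every non-ASCII character is escaped as \uXXXX.
escaped = json.dumps(record, sort_keys=True)

# ensure_ascii=False: non-ASCII characters are written out as-is.
readable = json.dumps(record, ensure_ascii=False, sort_keys=True)

print(escaped)   # {"country": "\u4e2d\u56fd", "isp": "\u7535\u4fe1"}
print(readable)  # {"country": "中国", "isp": "电信"}
```

Both strings are valid JSON and decode back to the same data; the only difference is how readable the serialized text is.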
Finally, here is a screenshot showing that the Chinese text now displays correctly:
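The whole flow can also be reproduced in text form. The sketch below is an assumption-laden illustration, not the exact code used above: it targets Python 3, the helper name `pretty_json` is hypothetical, and `sample_body` is a hand-written stand-in for the response bytes (the field names are guesses based on the country, region and ISP information mentioned earlier). In real use the bytes would come from something like `urllib.request.urlopen(url).read()`.

```python
import json


def pretty_json(raw_bytes):
    """Decode a UTF-8 JSON response body and re-serialize it readably."""
    parsed = json.loads(raw_bytes.decode("utf-8"))
    # ensure_ascii=False keeps Chinese characters instead of \uXXXX escapes;
    # indent and sort_keys only affect layout, not the data.
    return json.dumps(parsed, ensure_ascii=False, indent=2, sort_keys=True)


# Hypothetical response body standing in for the real API output.
sample_body = (
    '{"code": 0, "data": {"country": "中国", "region": "浙江", "isp": "电信"}}'
).encode("utf-8")

report = pretty_json(sample_body)
print(report)
```

Keeping the decoding and formatting in one small function means the network call can be swapped in or out without touching the display logic.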