Continued from Python Tutorial Chinese-English Bilingual Document, Part 2


CHAPTER NINE


CLASSES

Classes provide a means of bundling data and functionality together. Creating a new class creates a new type of object, allowing new instances of that type to be made. Each class instance can have attributes attached to it for maintaining its state. Class instances can also have methods (defined by its class) for modifying its state.

Compared with other programming languages, Python’s class mechanism adds classes with a minimum of new syntax and semantics. It is a mixture of the class mechanisms found in C++ and Modula-3. Python classes provide all the standard features of Object Oriented Programming: the class inheritance mechanism allows multiple base classes, a derived class can override any methods of its base class or classes, and a method can call the method of a base class with the same name. Objects can contain arbitrary amounts and kinds of data. As is true for modules, classes partake of the dynamic nature of Python: they are created at runtime, and can be modified further after creation.

In C++ terminology, normally class members (including the data members) are public (except see below Private Variables), and all member functions are virtual. As in Modula-3, there are no shorthands for referencing the object’s members from its methods: the method function is declared with an explicit first argument representing the object, which is provided implicitly by the call. As in Smalltalk, classes themselves are objects. This provides semantics for importing and renaming. Unlike C++ and Modula-3, built-in types can be used as base classes for extension by the user. Also, like in C++, most built-in operators with special syntax (arithmetic operators, subscripting etc.) can be redefined for class instances.

(Lacking universally accepted terminology to talk about classes, I will make occasional use of Smalltalk and C++ terms. I would use Modula-3 terms, since its object-oriented semantics are closer to those of Python than C++, but I expect that few readers have heard of it.)

9.1 A Word About Names and Objects

Objects have individuality, and multiple names (in multiple scopes) can be bound to the same object. This is known as aliasing in other languages. This is usually not appreciated on a first glance at Python, and can be safely ignored when dealing with immutable basic types (numbers, strings, tuples). However, aliasing has a possibly surprising effect on the semantics of Python code involving mutable objects such as lists, dictionaries, and most other types. This is usually used to the benefit of the program, since aliases behave like pointers in some respects. For example, passing an object is cheap since only a pointer is passed by the implementation; and if a function modifies an object passed as an argument, the caller will see the change. This eliminates the need for two different argument passing mechanisms as in Pascal.
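The aliasing behaviour described above can be seen in a few lines. This is only an illustrative sketch; the names original, alias and add_item are made up for the example.

```python
# Two names bound to the same list object: an alias, not a copy.
original = [1, 2, 3]
alias = original
alias.append(4)
print(original)              # the change is visible through both names
print(alias is original)     # True: one object, two names


def add_item(items, value):
    """Mutate the list passed in; the caller sees the change."""
    items.append(value)


add_item(original, 5)
print(original)

# Immutable objects cannot be changed in place, so aliasing is harmless:
a = "spam"
b = a
b = b + " eggs"              # rebinds b to a new string; a is untouched
print(a)
```

Because only a reference is passed to add_item, no copy of the list is made, and the mutation is visible to the caller.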

9.2 Python Scopes and Namespaces

Before introducing classes, I first have to tell you something about Python’s scope rules. Class definitions play some neat tricks with namespaces, and you need to know how scopes and namespaces work to fully understand what’s going on. Incidentally, knowledge about this subject is useful for any advanced Python programmer.

Let’s begin with some definitions.

A namespace is a mapping from names to objects. Most namespaces are currently implemented as Python dictionaries, but that’s normally not noticeable in any way (except for performance), and it may change in the future. Examples of namespaces are: the set of built-in names (containing functions such as abs(), and built-in exception names); the global names in a module; and the local names in a function invocation. In a sense the set of attributes of an object also form a namespace. The important thing to know about namespaces is that there is absolutely no relation between names in different namespaces; for instance, two different modules may both define a function maximize without confusion, because users of the modules must prefix it with the module name.

By the way, I use the word attribute for any name following a dot; for example, in the expression z.real, real is an attribute of the object z. Strictly speaking, references to names in modules are attribute references: in the expression modname.funcname, modname is a module object and funcname is an attribute of it. In this case there happens to be a straightforward mapping between the module’s attributes and the global names defined in the module: they share the same namespace! [1]

Attributes may be read-only or writable. In the latter case, assignment to attributes is possible. Module attributes are writable: you can write modname.the_answer = 42. Writable attributes may also be deleted with the del statement. For example, del modname.the_answer will remove the attribute the_answer from the object named by modname.
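As a small sketch of writable module attributes, the snippet below builds a throwaway module object with types.ModuleType instead of importing a real one. The module name modname is hypothetical and simply mirrors the text.

```python
import types

# A throwaway module object standing in for an imported module.
modname = types.ModuleType("modname")

modname.the_answer = 42                  # module attributes are writable
print(modname.the_answer)
print("the_answer" in vars(modname))     # True: stored in the module's namespace

del modname.the_answer                   # writable attributes can be deleted
print(hasattr(modname, "the_answer"))    # False
```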

Namespaces are created at different moments and have different lifetimes. The namespace containing the built-in names is created when the Python interpreter starts up, and is never deleted. The global namespace for a module is created when the module definition is read in; normally, module namespaces also last until the interpreter quits. The statements executed by the top-level invocation of the interpreter, either read from a script file or interactively, are considered part of a module called __main__, so they have their own global namespace. (The built-in names actually also live in a module; this is called builtins.)

The local namespace for a function is created when the function is called, and deleted when the function returns or raises an exception that is not handled within the function. (Actually, forgetting would be a better way to describe what actually happens.) Of course, recursive invocations each have their own local namespace.

A scope is a textual region of a Python program where a namespace is directly accessible. “Directly accessible” here means that an unqualified reference to a name attempts to find the name in the namespace.

Although scopes are determined statically, they are used dynamically. At any time during execution, there are at least three nested scopes whose namespaces are directly accessible:

  • the innermost scope, which is searched first, contains the local names
  • the scopes of any enclosing functions, which are searched starting with the nearest enclosing scope, contain non-local, but also non-global names
  • the next-to-last scope contains the current module’s global names
  • the outermost scope (searched last) is the namespace containing built-in names
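The search order listed above can be sketched as follows. All function names here are invented for the illustration.

```python
name = "global"              # module-level (global) binding


def outer():
    name = "enclosing"       # binding in the enclosing function's scope

    def inner_with_local():
        name = "local"       # the innermost scope is searched first
        return name

    def inner_without_local():
        return name          # not local, so the enclosing scope is used

    return inner_with_local(), inner_without_local()


def reads_global():
    return name              # no local or enclosing binding: global is used


def reads_builtin():
    return len("abc")        # len is found last, in the built-in namespace


print(outer())
print(reads_global())
print(reads_builtin())
```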

If a name is declared global, then all references and assignments go directly to the next-to-last scope, which contains the module’s global names. To rebind variables found outside of the innermost scope, the nonlocal statement can be used; if not declared nonlocal, those variables are read-only (an attempt to write to such a variable will simply create a new local variable in the innermost scope, leaving the identically named outer variable unchanged).

Usually, the local scope references the local names of the (textually) current function. Outside functions, the local scope references the same namespace as the global scope: the module’s namespace. Class definitions place yet another namespace in the local scope.

It is important to realize that scopes are determined textually: the global scope of a function defined in a module is that module’s namespace, no matter from where or by what alias the function is called. On the other hand, the actual search for names is done dynamically, at run time. However, the language definition is evolving towards static name resolution, at “compile” time, so don’t rely on dynamic name resolution! (In fact, local variables are already determined statically.)

A special quirk of Python is that, if no global or nonlocal statement is in effect, assignments to names always go into the innermost scope. Assignments do not copy data; they just bind names to objects. The same is true for deletions: the statement del x removes the binding of x from the namespace referenced by the local scope. In fact, all operations that introduce new names use the local scope: in particular, import statements and function definitions bind the module or function name in the local scope.
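To make the point about binding operations concrete, here is a minimal sketch that inspects a function's local namespace with locals(). The function bind_locally and its inner names are hypothetical.

```python
def bind_locally():
    import math                  # binds the name "math" in the local scope

    def helper():                # a def binds "helper" in the local scope too
        return math.pi

    x = helper()                 # plain assignment: another local binding
    before = sorted(locals())    # names bound so far
    del x                        # removes the local binding of x
    return before, sorted(locals())


before, after = bind_locally()
print(before)
print(after)
print("helper" in globals())     # False: nothing leaked into the global scope
```

The import and the def behave like assignments: each binds a name in the function's local namespace, and none of them touches the module's globals.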

The global statement can be used to indicate that particular variables live in the global scope and should be rebound there; the nonlocal statement indicates that particular variables live in an enclosing scope and should be rebound there.

9.2.1 Scopes and Namespaces Example

This is an example demonstrating how to reference the different scopes and namespaces, and how global and nonlocal affect variable binding:

def scope_test():
    def do_local():
        spam = "local spam"

    def do_nonlocal():
        nonlocal spam
        spam = "nonlocal spam"

    def do_global():
        global spam
        spam = "global spam"
    
    spam = "test spam"
    do_local()
    print("After local assignment:", spam)
    do_nonlocal()
    print("After nonlocal assignment:", spam)
    do_global()
    print("After global assignment:", spam)

scope_test()
print("In global scope:", spam)

The output of the example code is:

After local assignment: test spam
After nonlocal assignment: nonlocal spam
After global assignment: nonlocal spam
In global scope: global spam

Note how the local assignment (which is default) didn’t change scope_test’s binding of spam. The nonlocal assignment changed scope_test’s binding of spam, and the global assignment changed the module-level binding.

You can also see that there was no previous binding for spam before the global assignment.

9.3 A First Look at Classes

Classes introduce a little bit of new syntax, three new object types, and some new semantics.

9.3.1 Class Definition Syntax

The simplest form of class definition looks like this:

class ClassName:
    <statement-1>
    .
    .
    .
    <statement-N>

Class definitions, like function definitions (def statements), must be executed before they have any effect. (You could conceivably place a class definition in a branch of an if statement, or inside a function.)

In practice, the statements inside a class definition will usually be function definitions, but other statements are allowed, and sometimes useful; we’ll come back to this later. The function definitions inside a class normally have a peculiar form of argument list, dictated by the calling conventions for methods; again, this is explained later.

When a class definition is entered, a new namespace is created, and used as the local scope; thus, all assignments to local variables go into this new namespace. In particular, function definitions bind the name of the new function here.

When a class definition is left normally (via the end), a class object is created. This is basically a wrapper around the contents of the namespace created by the class definition; we’ll learn more about class objects in the next section. The original local scope (the one in effect just before the class definition was entered) is reinstated, and the class object is bound here to the class name given in the class definition header (ClassName in the example).
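The following sketch shows the class namespace being filled in and then wrapped by the class object, and a class statement placed inside an if branch. The attribute names and the use_fast flag are assumptions made up for the illustration.

```python
class ClassName:
    greeting = "hello"           # assignment lands in the class namespace

    def shout(self):             # def binds "shout" in the same namespace
        return self.greeting.upper()


# Once the class body finishes, the class object wraps that namespace and
# is bound to the name ClassName in the surrounding scope.
namespace = vars(ClassName)
print("greeting" in namespace, "shout" in namespace)

# A class statement runs like any other statement, so it can sit in a branch.
use_fast = True                  # hypothetical flag, for illustration only
if use_fast:
    class Impl:
        label = "fast"
else:
    class Impl:
        label = "slow"

print(Impl.label)
```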

9.3.2 Class Objects

Class objects support two kinds of operations: attribute references and instantiation.

Attribute references use the standard syntax used for all attribute references in Python: obj.name. Valid attribute names are all the names that were in the class’s namespace when the class object was created. So, if the class definition looked like this:

class MyClass:
    """A simple example class"""
    i = 12345
    def f(self):
        return 'hello world'

then MyClass.i and MyClass.f are valid attribute references, returning an integer and a function object, respectively. Class attributes can also be assigned to, so you can change the value of MyClass.i by assignment. __doc__ is also a valid attribute, returning the docstring belonging to the class: “A simple example class”.
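A short sketch of these attribute references, reusing the MyClass definition from the text:

```python
class MyClass:
    """A simple example class"""
    i = 12345

    def f(self):
        return 'hello world'


print(MyClass.i)                   # 12345
print(type(MyClass.f).__name__)    # function
print(MyClass.__doc__)             # A simple example class

MyClass.i = 54321                  # class attributes can be assigned to
print(MyClass.i)                   # 54321
```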

Class instantiation uses function notation. Just pretend that the class object is a parameterless function that returns a new instance of the class. For example (assuming the above class):

x = MyClass()

creates a new instance of the class and assigns this object to the local variable x.

The instantiation operation (“calling” a class object) creates an empty object. Many classes like to create objects with instances customized to a specific initial state. Therefore a class may define a special method named __init__(), like this:

def __init__(self):
    self.data = []

When a class defines an __init__() method, class instantiation automatically invokes __init__() for the newly-created class instance. So in this example, a new, initialized instance can be obtained by:

x = MyClass()

Of course, the __init__() method may have arguments for greater flexibility. In that case, arguments given to the class instantiation operator are passed on to __init__(). For example,

>>> class Complex:
...     def __init__(self, realpart, imagpart):
...         self.r = realpart
...         self.i = imagpart
...
>>> x = Complex(3.0, -4.5)
>>> x.r, x.i
(3.0, -4.5)

9.3.3 Instance Objects

Now what can we do with instance objects? The only operations understood by instance objects are attribute references. There are two kinds of valid attribute names, data attributes and methods.

Data attributes correspond to “instance variables” in Smalltalk, and to “data members” in C++. Data attributes need not be declared; like local variables, they spring into existence when they are first assigned to. For example, if x is the instance of MyClass created above, the following piece of code will print the value 16, without leaving a trace:

x.counter = 1
while x.counter < 10:
    x.counter = x.counter * 2
print(x.counter)
del x.counter

The other kind of instance attribute reference is a method. A method is a function that “belongs to” an object. (In Python, the term method is not unique to class instances: other object types can have methods as well. For example, list objects have methods called append, insert, remove, sort, and so on. However, in the following discussion, we’ll use the term method exclusively to mean methods of class instance objects, unless explicitly stated otherwise.)

Valid method names of an instance object depend on its class. By definition, all attributes of a class that are function objects define corresponding methods of its instances. So in our example, x.f is a valid method reference, since MyClass.f is a function, but x.i is not, since MyClass.i is not. But x.f is not the same thing as MyClass.f: it is a method object, not a function object.

9.3.4 Method Objects

Usually, a method is called right after it is bound:

x.f()

In the MyClass example, this will return the string ‘hello world’. However, it is not necessary to call a method right away: x.f is a method object, and can be stored away and called at a later time. For example:

xf = x.f
while True:
    print(xf())

will continue to print hello world until the end of time.

What exactly happens when a method is called? You may have noticed that x.f() was called without an argument above, even though the function definition for f() specified an argument. What happened to the argument? Surely Python raises an exception when a function that requires an argument is called without any, even if the argument isn’t actually used…

Actually, you may have guessed the answer: the special thing about methods is that the instance object is passed as the first argument of the function. In our example, the call x.f() is exactly equivalent to MyClass.f(x). In general, calling a method with a list of n arguments is equivalent to calling the corresponding function with an argument list that is created by inserting the method’s instance object before the first argument.

If you still don’t understand how methods work, a look at the implementation can perhaps clarify matters. When a non-data attribute of an instance is referenced, the instance’s class is searched. If the name denotes a valid class attribute that is a function object, a method object is created by packing (pointers to) the instance object and the function object just found together in an abstract object: this is the method object. When the method object is called with an argument list, a new argument list is constructed from the instance object and the argument list, and the function object is called with this new argument list.
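The packing described above can be observed from Python itself. A bound method exposes the instance and the function it wraps as __self__ and __func__. The sketch below reuses the MyClass example.

```python
import types


class MyClass:
    """A simple example class"""
    i = 12345

    def f(self):
        return 'hello world'


x = MyClass()
xf = x.f                                            # a bound method object

print(isinstance(MyClass.f, types.FunctionType))    # True: plain function
print(isinstance(xf, types.MethodType))             # True: method object
print(xf.__self__ is x)                             # True: the packed instance
print(xf.__func__ is MyClass.f)                     # True: the packed function

# Calling the method inserts the instance before the other arguments.
print(xf() == MyClass.f(x))                         # True
```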

9.3.5 Class and Instance Variables

Generally speaking, instance variables are for data unique to each instance and class variables are for attributes and methods shared by all instances of the class:

class Dog:
    kind = 'canine' # class variable shared by all instances
    def __init__(self, name):
        self.name = name # instance variable unique to each instance

>>> d = Dog('Fido')
>>> e = Dog('Buddy')
>>> d.kind # shared by all dogs
'canine'
>>> e.kind # shared by all dogs
'canine'
>>> d.name # unique to d
'Fido'
>>> e.name # unique to e
'Buddy'

As discussed in A Word About Names and Objects, shared data can have possibly surprising effects when mutable objects such as lists and dictionaries are involved. For example, the tricks list in the following code should not be used as a class variable because just a single list would be shared by all Dog instances:

class Dog:
    
    tricks = [] # mistaken use of a class variable
    
    def __init__(self, name):
        self.name = name

    def add_trick(self, trick):
        self.tricks.append(trick)

>>> d = Dog('Fido')
>>> e = Dog('Buddy')
>>> d.add_trick('roll over')
>>> e.add_trick('play dead')
>>> d.tricks # unexpectedly shared by all dogs
['roll over', 'play dead']

Correct design of the class should use an instance variable instead:

class Dog:

    def __init__(self, name):
        self.name = name
        self.tricks = []    # creates a new empty list for each dog

    def add_trick(self, trick):
        self.tricks.append(trick)

>>> d = Dog('Fido')
>>> e = Dog('Buddy')
>>> d.add_trick('roll over')
>>> e.add_trick('play dead')
>>> d.tricks
['roll over']
>>> e.tricks
['play dead']

9.4 Random Remarks

Data attributes override method attributes with the same name; to avoid accidental name conflicts, which may cause hard-to-find bugs in large programs, it is wise to use some kind of convention that minimizes the chance of conflicts. Possible conventions include capitalizing method names, prefixing data attribute names with a small unique string (perhaps just an underscore), or using verbs for methods and nouns for data attributes.
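A minimal sketch of such a conflict, using a made-up Counter class: assigning an instance data attribute named count hides the method of the same name until the data attribute is deleted.

```python
class Counter:
    def count(self):
        return 0


c = Counter()
print(c.count())         # 0: the method defined on the class

c.count = 5              # an instance data attribute with the same name
print(c.count)           # 5: the data attribute now hides the method
print(callable(c.count)) # False: c.count() would raise TypeError here

del c.count              # removing the data attribute
print(c.count())         # 0: the method is reachable again
```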

Data attributes may be referenced by methods as well as by ordinary users (“clients”) of an object. In other words, classes are not usable to implement pure abstract data types. In fact, nothing in Python makes it possible to enforce data hiding; it is all based upon convention. (On the other hand, the Python implementation, written in C, can completely hide implementation details and control access to an object if necessary; this can be used by extensions to Python written in C.)

Clients should use data attributes with care: clients may mess up invariants maintained by the methods by stamping on their data attributes. Note that clients may add data attributes of their own to an instance object without affecting the validity of the methods, as long as name conflicts are avoided. Again, a naming convention can save a lot of headaches here.

There is no shorthand for referencing data attributes (or other methods!) from within methods. I find that this actually increases the readability of methods: there is no chance of confusing local variables and instance variables when glancing through a method.

Often, the first argument of a method is called self. This is nothing more than a convention: the name self has absolutely no special meaning to Python. Note, however, that by not following the convention your code may be less readable to other Python programmers, and it is also conceivable that a class browser program might be written that relies upon such a convention.

Any function object that is a class attribute defines a method for instances of that class. It is not necessary that the function definition is textually enclosed in the class definition: assigning a function object to a local variable in the class is also ok. For example:

# Function defined outside the class
def f1(self, x, y):
    return min(x, x+y)

class C:
    f = f1

    def g(self):
        return 'hello world'

    h = g

Now f, g and h are all attributes of class C that refer to function objects, and consequently they are all methods of instances of C, with h being exactly equivalent to g. Note that this practice usually only serves to confuse the reader of a program.

Methods may call other methods by using method attributes of the self argument:

class Bag:
    def __init__(self):
        self.data = []

    def add(self, x):
        self.data.append(x)

    def addtwice(self, x):
        self.add(x)
        self.add(x)

Methods may reference global names in the same way as ordinary functions. The global scope associated with a method is the module containing its definition. (A class is never used as a global scope.) While one rarely encounters a good reason for using global data in a method, there are many legitimate uses of the global scope: for one thing, functions and modules imported into the global scope can be used by methods, as well as functions and classes defined in it. Usually, the class containing the method is itself defined in this global scope, and in the next section we’ll find some good reasons why a method would want to reference its own class.

Each value is an object, and therefore has a class (also called its type). It is stored as object.__class__.
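A quick check of this, with a made-up Dog class alongside built-in values:

```python
class Dog:
    pass


d = Dog()
print(d.__class__ is Dog)           # True
print(type(d) is d.__class__)       # True: type() reports the same class
print((42).__class__ is int)        # True: built-in values have a class too
print("spam".__class__.__name__)    # str
print(Dog.__class__ is type)        # True: classes themselves are objects
```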

9.5 Inheritance

Of course, a language feature would not be worthy of the name “class” without supporting inheritance. The syntax for a derived class definition looks like this:

class DerivedClassName(BaseClassName):
    <statement-1>
    .
    .
    .
    <statement-N>

The name BaseClassName must be defined in a scope containing the derived class definition. In place of a base class name, other arbitrary expressions are also allowed. This can be useful, for example, when the base class is defined in another module:

class DerivedClassName(modname.BaseClassName):

Execution of a derived class definition proceeds the same as for a base class. When the class object is constructed, the base class is remembered. This is used for resolving attribute references: if a requested attribute is not found in the class, the search proceeds to look in the base class. This rule is applied recursively if the base class itself is derived from some other class.

There’s nothing special about instantiation of derived classes: DerivedClassName() creates a new instance of the class. Method references are resolved as follows: the corresponding class attribute is searched, descending down the chain of base classes if necessary, and the method reference is valid if this yields a function object. 派生类的实例没有什么特别的: DerivedClassName() 创建一个类的新实例. 方法引用的解析如下: 搜索相应的类属性, 如有必要, 沿着基类链下行(逐级搜索), 如果这会产生一个函数对象, 则方法引用有效.

Derived classes may override methods of their base classes. Because methods have no special privileges when calling other methods of the same object, a method of a base class that calls another method defined in the same base class may end up calling a method of a derived class that overrides it. (For C++ programmers: all methods in Python are effectively virtual.) 派生类可以覆盖其基类的方法. 因为方法在调用同一对象的其他方法时没有特殊权限, 所以基类中调用同一基类里另一个方法的方法, 最终可能会调用到覆盖它的派生类方法. (给 C++ 程序员: Python 中所有的方法实际上都是虚方法.)
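The "effectively virtual" behaviour described above can be sketched with two small classes. The names Report, FancyReport, header and render are hypothetical and chosen only for illustration:

```python
class Report:
    def header(self):
        return "Plain header"

    def render(self):
        # render() looks up header() on the instance, so an override
        # defined in a derived class is the one that gets called.
        return self.header() + " / body"


class FancyReport(Report):
    def header(self):
        return "Fancy header"


print(Report().render())       # Plain header / body
print(FancyReport().render())  # Fancy header / body
```

FancyReport never redefines render(), yet the inherited render() ends up calling the derived header().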

An overriding method in a derived class may in fact want to extend rather than simply replace the base class method of the same name. There is a simple way to call the base class method directly: just call BaseClassName.methodname(self, arguments). This is occasionally useful to clients as well. (Note that this only works if the base class is accessible as BaseClassName in the global scope.) 派生类中的覆盖方法实际上可能想要扩展而不是简单地替换基类的同名方法. 有一种简单的方法可以直接调用基类方法: 只要调用 BaseClassName.methodname(self, arguments). 这对客户端代码偶尔也很有用. (注意, 只有当基类在全局作用域中可以通过 BaseClassName 访问时, 这种方法才有效.)
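A minimal sketch of extending a base class method through the BaseClassName.methodname(self, arguments) form. The class and method names here (Logger, TimestampLogger, format) are hypothetical:

```python
class Logger:
    def format(self, message):
        return "LOG: " + message


class TimestampLogger(Logger):
    def format(self, message):
        # Extend rather than replace: call the base class method directly,
        # passing the instance explicitly as the first argument.
        base_text = Logger.format(self, message)
        return "[12:00] " + base_text


log = TimestampLogger()
print(log.format("started"))          # [12:00] LOG: started
# A client can use the same form to bypass the override.
print(Logger.format(log, "started"))  # LOG: started
```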

Python has two built-in functions that work with inheritance:

  • Use isinstance() to check an instance’s type: isinstance(obj, int) will be True only if obj.__class__ is int or some class derived from int.
  • Use issubclass() to check class inheritance: issubclass(bool, int) is True since bool is a subclass of int. However, issubclass(float, int) is False since float is not a subclass of int.

Python 有两个用于继承的内置函数:

  • 使用 isinstance() 检查实例的类型: 只有当 obj.__class__ 是 int 或某个派生自 int 的类时, isinstance(obj, int) 才为 True.
  • 使用 issubclass() 检查类继承: issubclass(bool, int) 为真, 因为 bool 是 int 的子类. 然而, issubclass(float, int) 为假, 因为 float 不是 int 的子类.
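The two built-in functions can be tried out as follows. Animal and Dog are hypothetical classes added for illustration; the bool, int and float checks are the ones named in the text:

```python
class Animal:
    pass


class Dog(Animal):
    pass


rex = Dog()

print(isinstance(rex, Dog))      # True
print(isinstance(rex, Animal))   # True: Dog is derived from Animal
print(isinstance(True, int))     # True: bool is derived from int
print(issubclass(bool, int))     # True
print(issubclass(float, int))    # False
print(rex.__class__ is Dog)      # True
```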

9.5.1 Multiple Inheritance

9.5.1 多继承

Python supports a form of multiple inheritance as well. A class definition with multiple base classes looks like this:

class DerivedClassName(Base1, Base2, Base3):
    <statement-1>
    .
    .
    .
    <statement-N>

Python也支持多继承的形式. 具有多个基类的类定义如下所示:

class DerivedClassName(Base1, Base2, Base3):
    <statement-1>
    .
    .
    .
    <statement-N>

For most purposes, in the simplest cases, you can think of the search for attributes inherited from a parent class as depth-first, left-to-right, not searching twice in the same class where there is an overlap in the hierarchy. Thus, if an attribute is not found in DerivedClassName, it is searched for in Base1, then (recursively) in the base classes of Base1, and if it is not found there, it is searched for in Base2, and so on. 在最简单的情况下, 对于大多数用途, 你可以把从父类继承的属性的搜索看作深度优先, 从左到右, 并且当层次结构中存在重叠时, 不会在同一个类中搜索两次. 因此, 如果一个属性在 DerivedClassName 中没有找到, 就会在 Base1 中查找, 然后(递归地)在 Base1 的基类中查找; 如果仍未找到, 再到 Base2 中查找, 依此类推.

In fact, it is slightly more complex than that; the method resolution order changes dynamically to support cooperative calls to super(). This approach is known in some other multiple-inheritance languages as call-next-method and is more powerful than the super call found in single-inheritance languages. 实际上, 情况要稍微复杂一些; 方法解析顺序会动态变化, 以支持对 super() 的协作调用. 这种方式在其他一些多继承语言中称为 call-next-method, 比单继承语言中的 super 调用更强大.

Dynamic ordering is necessary because all cases of multiple inheritance exhibit one or more diamond relationships (where at least one of the parent classes can be accessed through multiple paths from the bottommost class). For example, all classes inherit from object, so any case of multiple inheritance provides more than one path to reach object. To keep the base classes from being accessed more than once, the dynamic algorithm linearizes the search order in a way that preserves the left-to-right ordering specified in each class, that calls each parent only once, and that is monotonic (meaning that a class can be subclassed without affecting the precedence order of its parents). Taken together, these properties make it possible to design reliable and extensible classes with multiple inheritance. For more detail, see https://www.python.org/download/releases/2.3/mro/. 动态排序是必要的, 因为所有多继承的情况都会呈现一个或多个菱形关系(其中至少有一个父类可以从最底层的类通过多条路径访问). 例如, 所有类都继承自 object, 因此任何多继承的情况都提供了不止一条到达 object 的路径. 为了防止基类被访问不止一次, 动态算法以如下方式将搜索顺序线性化: 保留每个类中指定的从左到右的顺序, 每个父类只调用一次, 并且是单调的(意味着可以在不影响其父类优先顺序的情况下对类进行子类化). 总之, 这些特性使得设计具有多重继承的可靠且可扩展的类成为可能. 更多细节请看 https://www.python.org/download/releases/2.3/mro/.
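A small diamond hierarchy makes the linearized search order visible. The class names Base, Left, Right and Bottom and the method who() are hypothetical; __mro__ and super() are standard Python:

```python
class Base:
    def who(self):
        return ["Base"]


class Left(Base):
    def who(self):
        return ["Left"] + super().who()


class Right(Base):
    def who(self):
        return ["Right"] + super().who()


class Bottom(Left, Right):
    def who(self):
        return ["Bottom"] + super().who()


# The method resolution order lists each class exactly once.
print([cls.__name__ for cls in Bottom.__mro__])
# ['Bottom', 'Left', 'Right', 'Base', 'object']

# Cooperative super() calls follow that same order, so Base runs once.
print(Bottom().who())
# ['Bottom', 'Left', 'Right', 'Base']
```

Note that inside Left, super().who() reaches Right rather than Base when the instance is a Bottom, which is the sense in which the order depends on the class of the instance.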

9.6 Private Variables

9.6 私有变量

“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice. "私有"实例变量, 即只能从对象内部访问的变量, 在 Python 中并不存在. 但是, 大多数 Python 代码都遵循如下惯例: 带有下划线前缀的名称(例如 _spam)应该被当作 API 的非公开部分(不论它是函数, 方法还是数据成员). 它应被视为实现细节, 可能不经通知就发生变化.

Since there is a valid use-case for class-private members (namely to avoid name clashes with names defined by subclasses), there is limited support for such a mechanism, called name mangling. Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped. This mangling is done without regard to the syntactic position of the identifier, as long as it occurs within the definition of a class. 由于类私有成员存在合理的使用场景(即避免与子类定义的名称发生冲突), Python 对这种机制提供了有限的支持, 称为 name mangling(命名编码). 任何 __spam 形式的标识符(至少两个前导下划线, 至多一个尾部下划线)在文本上会被替换为 _classname__spam, 这里的 classname 是去掉了前导下划线的当前类名. 只要标识符出现在类的定义中, 这种编码就会进行, 而与标识符的语法位置无关.

Name mangling is helpful for letting subclasses override methods without breaking intraclass method calls. For example:

class Mapping:
    def __init__(self, iterable):
        self.items_list = []
        self.__update(iterable)

    def update(self, iterable):
        for item in iterable:
            self.items_list.append(item)

    __update = update # private copy of original update() method

class MappingSubclass(Mapping):
    def update(self, keys, values):
        # provides new signature for update()
        # but does not break __init__()
        for item in zip(keys, values):
            self.items_list.append(item)

Name mangling(命名编码)有助于让子类重写方法, 而不破坏类内部的方法调用. 例如:

class Mapping:
    def __init__(self, iterable):
        self.items_list = []
        self.__update(iterable)

    def update(self, iterable):
        for item in iterable:
            self.items_list.append(item)

    __update = update # private copy of original update() method

class MappingSubclass(Mapping):
    def update(self, keys, values):
        # provides new signature for update()
        # but does not break __init__()
        for item in zip(keys, values):
            self.items_list.append(item)

The above example would work even if MappingSubclass were to introduce a __update identifier since it is replaced with _Mapping__update in the Mapping class and _MappingSubclass__update in the MappingSubclass class respectively. 即使 MappingSubclass 引入了 __update 标识符, 上面的示例也能正常工作, 因为它在 Mapping 类中被替换为 _Mapping__update, 在 MappingSubclass 类中被替换为 _MappingSubclass__update.

Note that the mangling rules are designed mostly to avoid accidents; it still is possible to access or modify a variable that is considered private. This can even be useful in special circumstances, such as in the debugger. 注意, mangling 规则设计大多是为了避免意外; 它始终有可能访问或修改变量, 虽是被视为私有的. 在特定的场合它也是有用的, 比如在调试时.

Notice that code passed to exec() or eval() does not consider the classname of the invoking class to be the current class; this is similar to the effect of the global statement, the effect of which is likewise restricted to code that is byte-compiled together. The same restriction applies to getattr(), setattr() and delattr(), as well as when referencing __dict__ directly. 请注意, 传递给 exec() 或 eval() 的代码不会将调用类的类名视为当前类; 这类似于 global 语句的效果, 其效果同样仅限于一起进行字节编译的代码. 同样的限制也适用于 getattr(), setattr() 和 delattr(), 以及直接引用 __dict__ 时.
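The last two notes can be checked with a short sketch. Account is a hypothetical class; the point is only that the mangled name stays reachable from outside, while a string passed to getattr() or hasattr() is not mangled:

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance      # stored on the instance as _Account__balance

    def balance(self):
        return self.__balance         # mangled the same way inside the class


acct = Account(100)

print(sorted(vars(acct)))                  # ['_Account__balance']
print(acct._Account__balance)              # 100: still accessible, e.g. while debugging
print(hasattr(acct, "__balance"))          # False: the string is looked up as written
print(getattr(acct, "_Account__balance"))  # 100
```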

9.7 Odds and Ends

9.7 杂项

Sometimes it is useful to have a data type similar to the Pascal “record” or C “struct”, bundling together a few named data items. An empty class definition will do nicely:

class Employee:
    pass

john = Employee() # Create an empty employee record

# Fill the fields of the record
john.name = 'John Doe'
john.dept = 'computer lab'
john.salary = 1000

有时, 使用类似于 Pascal 的 “record” 或 C 的 “struct” 的数据类型是很有用的, 将一些命名数据项捆绑在一起. 一个空的类定义可以很好地完成:

class Employee:
    pass

john = Employee() # Create an empty employee record

# Fill the fields of the record
john.name = 'John Doe'
john.dept = 'computer lab'
john.salary = 1000

A piece of Python code that expects a particular abstract data type can often be passed a class that emulates the methods of that data type instead. For instance, if you have a function that formats some data from a file object, you can define a class with methods read() and readline() that get the data from a string buffer instead, and pass it as an argument. 一段需要特定抽象数据类型的 Python 代码, 通常可以改为接收一个模拟该数据类型方法的类. 例如, 如果你有一个函数用于格式化来自文件对象的数据, 你可以定义一个带有 read() 和 readline() 方法的类, 让它改从字符串缓冲区获取数据, 并将其作为参数传递.
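One way the file-object case might look, as a sketch. StringSource and first_line_upper are hypothetical names invented for this example; in practice the standard library's io.StringIO already fills this role:

```python
class StringSource:
    """Hypothetical stand-in for a file object, backed by a string."""

    def __init__(self, text):
        self._lines = text.splitlines(keepends=True)
        self._pos = 0

    def readline(self):
        # Like a real file, return '' once the data is exhausted.
        if self._pos >= len(self._lines):
            return ""
        line = self._lines[self._pos]
        self._pos += 1
        return line

    def read(self):
        # Return everything that has not been read yet.
        rest = "".join(self._lines[self._pos:])
        self._pos = len(self._lines)
        return rest


def first_line_upper(source):
    """Hypothetical formatter: it only needs an object with readline()."""
    return source.readline().strip().upper()


print(first_line_upper(StringSource("hello\nworld\n")))  # HELLO
```

The function never checks the type of its argument, so any object with a suitable readline() method will do.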

Instance method objects have attributes, too: m.__self__ is the instance object with the method m(), and m.__func__ is the function object corresponding to the method. 实例方法对象也有属性: m.__self__ 是拥有方法 m() 的实例对象, m.__func__ 是该方法对应的函数对象.
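These two attributes can be inspected directly. Greeter is a hypothetical class used only to obtain a bound method:

```python
class Greeter:
    def hello(self):
        return "hello"


g = Greeter()
m = g.hello                          # a bound method object

print(m.__self__ is g)               # True: the instance the method is bound to
print(m.__func__ is Greeter.hello)   # True: the underlying function object
print(m.__func__(m.__self__))        # hello: the same as calling m()
```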

9.8 Iterators

9.8 迭代器

By now you have probably noticed that most container objects can be looped over using a for statement:

for element in [1, 2, 3]:
    print(element)
for element in (1, 2, 3):
    print(element)
for key in {'one':1, 'two':2}:
    print(key)
for char in "123":
    print(char)
for line in open("myfile.txt"):
    print(line, end='')

现在, 你可能已经注意到, 大多数容器对象都可以用 for 语句来循环:

for element in [1, 2, 3]:
    print(element)
for element in (1, 2, 3):
    print(element)
for key in {'one':1, 'two':2}:
    print(key)
for char in "123":
    print(char)
for line in open("myfile.txt"):
    print(line, end='')

This style of access is clear, concise, and convenient. The use of iterators pervades and unifies Python. Behind the scenes, the for statement calls iter() on the container object. The function returns an iterator object that defines the method __next__() which accesses elements in the container one at a time. When there are no more elements, __next__() raises a StopIteration exception which tells the for loop to terminate. You can call the __next__() method using the next() built-in function; this example shows how it all works:

>>> s = 'abc'
>>> it = iter(s)
>>> it
<iterator object at 0x00A1DB50>
>>> next(it)
'a'
>>> next(it)
'b'
>>> next(it)
'c'
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    next(it)
StopIteration

这种访问风格清晰, 简明, 方便. 迭代器的使用遍布 Python 并使之统一. 在后台, for 语句会在容器对象上调用 iter(). 该函数返回一个迭代器对象, 它定义了 __next__() 方法, 用于逐个访问容器中的元素. 当没有更多元素时, __next__() 抛出一个 StopIteration 异常, 告诉 for 循环终止. 你可以使用 next() 内置函数来调用 __next__() 方法; 下面的例子展示了它是如何工作的:

>>> s = 'abc'
>>> it = iter(s)
>>> it
<iterator object at 0x00A1DB50>
>>> next(it)
'a'
>>> next(it)
'b'
>>> next(it)
'c'
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    next(it)
StopIteration

Having seen the mechanics behind the iterator protocol, it is easy to add iterator behavior to your classes. Define an __iter__() method which returns an object with a __next__() method. If the class defines __next__(), then __iter__() can just return self:

class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self
    
    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

>>> rev = Reverse('spam')
>>> iter(rev)
<__main__.Reverse object at 0x00A1DB50>
>>> for char in rev:
...     print(char)
...
m
a
p
s

看过迭代器协议背后的机制, 很容易地添加迭代器行为到你的类里. 定义一个 __iter__()方法, 它返回一个有 __next__() 方法的对象. 如果类定义了 __next__(), 那么 __iter__() 可以只返回它自己:

class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self
    
    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

>>> rev = Reverse('spam')
>>> iter(rev)
<__main__.Reverse object at 0x00A1DB50>
>>> for char in rev:
...     print(char)
...
m
a
p
s

9.9 Generators

9.9 生成器

Generators are a simple and powerful tool for creating iterators. They are written like regular functions but use the yield statement whenever they want to return data. Each time next() is called on it, the generator resumes where it left off (it remembers all the data values and which statement was last executed). An example shows that generators can be trivially easy to create:

def reverse(data):
    for index in range(len(data)-1, -1, -1):
        yield data[index]

>>> for char in reverse('golf'):
...     print(char)
...
f
l
o
g

生成器是创建迭代器的简单而强大的工具. 它们写起来像常规函数, 但在需要返回数据时使用 yield 语句. 每次对它调用 next() 时, 生成器会从上次离开的地方恢复执行(它会记住所有的数据值以及最后执行的是哪条语句). 下面的例子表明, 生成器非常容易创建:

def reverse(data):
    for index in range(len(data)-1, -1, -1):
        yield data[index]
# yield data[index] 对应上一节的 return self.data[self.index]
# range(len(data)-1, -1, -1) 对应上一节的 self.index = self.index - 1

>>> for char in reverse('golf'):
...     print(char)
...
f
l
o
g

Anything that can be done with generators can also be done with class-based iterators as described in the previous section. What makes generators so compact is that the __iter__() and __next__() methods are created automatically. 任何可以用生成器完成的也可以用基于类的迭代器完成, 像前一节描述的. 使得生成器如此紧凑的是 __iter__() 和 __next__() 方法是自动创建的.

Another key feature is that the local variables and execution state are automatically saved between calls. This makes the function easier to write and much clearer than an approach using instance variables like self.index and self.data. 另一个关键特性是局部变量和执行状态会在调用之间自动保存. 这使得函数更容易编写, 并且比使用 self.index 和 self.data 这样的实例变量的方式清晰得多.

In addition to automatic method creation and saving program state, when generators terminate, they automatically raise StopIteration. In combination, these features make it easy to create iterators with no more effort than writing a regular function. 除了自动创建方法和保存程序状态之外, 生成器在终止时还会自动抛出 StopIteration. 综合起来, 这些特性使得创建迭代器并不比编写一个常规函数更费力.
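The three properties above (automatic __iter__() and __next__(), saved local state, automatic StopIteration) can be observed on a small generator. countdown is a hypothetical function written for this sketch:

```python
def countdown(n):
    # The local variable n survives between successive next() calls.
    while n > 0:
        yield n
        n -= 1


gen = countdown(2)

print(iter(gen) is gen)   # True: __iter__() was created automatically
print(next(gen))          # 2
print(next(gen))          # 1

try:
    next(gen)             # the function body has finished
except StopIteration:
    print("StopIteration raised automatically")
```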

9.10 Generator Expressions

9.10 生成器表达式

Some simple generators can be coded succinctly as expressions using a syntax similar to list comprehensions but with parentheses instead of square brackets. These expressions are designed for situations where the generator is used right away by an enclosing function. Generator expressions are more compact but less versatile than full generator definitions and tend to be more memory friendly than equivalent list comprehensions. 一些简单的生成器可以简洁地编码, 类似于使用列表推导式语法的表达式, 但用圆括号替代方括号. 这些表达式适用于通过封闭函数立即使用生成器的情况. 生成器表达式更紧凑但比全生成器定义功能少, 并且比等效列表推导更具内存友好性.

Examples:

>>> sum(i*i for i in range(10)) # sum of squares
285

>>> xvec = [10, 20, 30]
>>> yvec = [7, 5, 3]
>>> sum(x*y for x,y in zip(xvec, yvec)) # dot product
260

>>> from math import pi, sin
>>> sine_table = {x: sin(x*pi/180) for x in range(0, 91)}
>>> unique_words = set(word for line in page for word in line.split())
>>> valedictorian = max((student.gpa, student.name) for student in graduates)
>>> data = 'golf'
>>> list(data[i] for i in range(len(data)-1, -1, -1))
['f', 'l', 'o', 'g']

例如:

>>> sum(i*i for i in range(10)) # sum of squares
285

>>> xvec = [10, 20, 30]
>>> yvec = [7, 5, 3]
>>> sum(x*y for x,y in zip(xvec, yvec)) # dot product
260

>>> from math import pi, sin
>>> sine_table = {x: sin(x*pi/180) for x in range(0, 91)}
>>> unique_words = set(word for line in page for word in line.split())
>>> valedictorian = max((student.gpa, student.name) for student in graduates)
>>> data = 'golf'
>>> list(data[i] for i in range(len(data)-1, -1, -1))
['f', 'l', 'o', 'g']

CHAPTER TEN


BRIEF TOUR OF THE STANDARD LIBRARY

标准库简介

10.1 Operating System Interface

10.1 操作系统接口

The os module provides dozens of functions for interacting with the operating system:

>>> import os
>>> os.getcwd() # Return the current working directory
'C:\\Python36'
>>> os.chdir('/server/accesslogs') # Change current working directory
>>> os.system('mkdir today') # Run the command mkdir in the system shell
0

模块 os 提供了许多与操作系统互动的函数:

>>> import os
>>> os.getcwd() # Return the current working directory
'C:\\Python36'
>>> os.chdir('/server/accesslogs') # Change current working directory
>>> os.system('mkdir today') # Run the command mkdir in the system shell
0

Be sure to use the import os style instead of from os import *. This will keep os.open() from shadowing the built-in open() function which operates much differently. 一定要使用 import os 形式而不是 from os import *. 这样可以避免 os.open() 遮蔽内置函数 open(), 两者的行为大不相同.

The built-in dir() and help() functions are useful as interactive aids for working with large modules like os:

>>> import os
>>> dir(os)
<returns a list of all module functions>
>>> help(os)
<returns an extensive manual page created from the module's docstrings>

内置函数 dir() 和 help() 在使用 os 这样的大型模块时是很有用的交互式辅助工具:

>>> import os
>>> dir(os)
['DirEntry', 'F_OK', 'MutableMapping', 'O_APPEND', 'O_BINARY', 'O_CREAT', 'O_EXCL', 'O_NOINHERIT', 'O_RANDOM', 'O_RDONLY', 'O_RDWR', 'O_SEQUENTIAL', 'O_SHORT_LIVED', 'O_TEMPORARY', 'O_TEXT', 'O_TRUNC', 'O_WRONLY', 'P_DETACH', 'P_NOWAIT', 'P_NOWAITO', 'P_OVERLAY', 'P_WAIT', 'PathLike', 'R_OK', 'SEEK_CUR', 'SEEK_END', 'SEEK_SET', 'TMP_MAX', 'W_OK', 'X_OK', '_Environ', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_execvpe', '_exists', '_exit', '_fspath', '_get_exports_list', '_putenv', '_unsetenv', '_wrap_close', 'abc', 'abort', 'access', 'altsep', 'chdir', 'chmod', 'close', 'closerange', 'cpu_count', 'curdir', 'defpath', 'device_encoding', 'devnull', 'dup', 'dup2', 'environ', 'errno', 'error', 'execl', 'execle', 'execlp', 'execlpe', 'execv', 'execve', 'execvp', 'execvpe', 'extsep', 'fdopen', 'fsdecode', 'fsencode', 'fspath', 'fstat', 'fsync', 'ftruncate', 'get_exec_path', 'get_handle_inheritable', 'get_inheritable', 'get_terminal_size', 'getcwd', 'getcwdb', 'getenv', 'getlogin', 'getpid', 'getppid', 'isatty', 'kill', 'linesep', 'link', 'listdir', 'lseek', 'lstat', 'makedirs', 'mkdir', 'name', 'open', 'pardir', 'path', 'pathsep', 'pipe', 'popen', 'putenv', 'read', 'readlink', 'remove', 'removedirs', 'rename', 'renames', 'replace', 'rmdir', 'scandir', 'sep', 'set_handle_inheritable', 'set_inheritable', 'spawnl', 'spawnle', 'spawnv', 'spawnve', 'st', 'startfile', 'stat', 'stat_float_times', 'stat_result', 'statvfs_result', 'strerror', 'supports_bytes_environ', 'supports_dir_fd', 'supports_effective_ids', 'supports_fd', 'supports_follow_symlinks', 'symlink', 'sys', 'system', 'terminal_size', 'times', 'times_result', 'truncate', 'umask', 'uname_result', 'unlink', 'urandom', 'utime', 'waitpid', 'walk', 'write']
>>> help(os)
Help on module os:

NAME
    os - OS routines for NT or Posix depending on what system we're on.

DESCRIPTION
    This exports:
      - all functions from posix or nt, e.g. unlink, stat, etc.
      - os.path is either posixpath or ntpath
      - os.name is either 'posix' or 'nt'
      - os.curdir is a string representing the current directory (always '.')
      - os.pardir is a string representing the parent directory (always '..')
      - os.sep is the (or a most common) pathname separator ('/' or '\\')
      - os.extsep is the extension separator (always '.')
      - os.altsep is the alternate pathname separator (None or '/')
      - os.pathsep is the component separator used in $PATH etc
      - os.linesep is the line separator in text files ('\r' or '\n' or '\r\n')
      - os.defpath is the default search path for executables
      - os.devnull is the file path of the null device ('/dev/null', etc.)

    Programs that import and use 'os' stand a better chance of being
    portable between different platforms.  Of course, they must then
    only use functions that are defined by all platforms (e.g., unlink
    and opendir), and leave all pathname manipulation to os.path
    (e.g., split and join).

CLASSES
    builtins.Exception(builtins.BaseException)
        builtins.OSError
    builtins.object
        nt.DirEntry
    builtins.tuple(builtins.object)
        nt.times_result
        nt.uname_result
        stat_result
        statvfs_result
        terminal_size

    class DirEntry(builtins.object)
-- More  --

For daily file and directory management tasks, the shutil module provides a higher level interface that is easier to use:

>>> import shutil
>>> shutil.copyfile('data.db', 'archive.db')
'archive.db'
>>> shutil.move('/build/executables', 'installdir')
'installdir'

为了日常文件和目录管理任务, shutil 模块提供了一个高水平的接口, 且很容易使用:

>>> import shutil
>>> shutil.copyfile('data.db', 'archive.db')
'archive.db'
>>> shutil.move('/build/executables', 'installdir')
'installdir'

10.2 File Wildcards

10.2 文件通配符

The glob module provides a function for making file lists from directory wildcard searches:

>>> import glob
>>> glob.glob('*.py')
['primes.py', 'random.py', 'quote.py']

glob 模块提供函数从目录通配符搜索生成文件列表:

>>> import glob
>>> glob.glob('*.py')
['primes.py', 'random.py', 'quote.py']

10.3 Command Line Arguments

10.3 命令行参数

Common utility scripts often need to process command line arguments. These arguments are stored in the sys module’s argv attribute as a list. For instance the following output results from running python demo.py one two three at the command line:

>>> import sys
>>> print(sys.argv)
['demo.py', 'one', 'two', 'three']

The getopt module processes sys.argv using the conventions of the Unix getopt() function. More powerful and flexible command line processing is provided by the argparse module.

常见的工具脚本经常需要处理命令行参数. 这些参数以列表形式保存在 sys 模块的 argv 属性中. 例如, 在命令行中执行 python demo.py one two three 后可以得到以下输出结果:

>>> import sys
>>> print(sys.argv)
['demo.py', 'one', 'two', 'three']

getopt 模块使用 Unix getopt() 函数的惯例来处理 sys.argv . argparse 模块提供了更强大, 更灵活的命令行处理.
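As a rough sketch of what argparse-based processing might look like: the option names (--lines, filenames) and the file names are made up for illustration, while ArgumentParser, add_argument and parse_args are the module's standard calls:

```python
import argparse

parser = argparse.ArgumentParser(description="Show the first lines of some files.")
parser.add_argument("filenames", nargs="+", help="files to read")
parser.add_argument("-l", "--lines", type=int, default=10,
                    help="number of lines to show")

# An explicit argument list is parsed here so the sketch runs anywhere;
# a real script would call parser.parse_args() to read sys.argv[1:].
args = parser.parse_args(["--lines", "5", "alpha.txt", "beta.txt"])

print(args.lines)      # 5
print(args.filenames)  # ['alpha.txt', 'beta.txt']
```

Unlike raw sys.argv handling, the parser converts --lines to an int and generates a -h/--help message from the help strings.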

10.4 Error Output Redirection and Program Termination

10.4 错误输出重定向和程序终止

The sys module also has attributes for stdin, stdout, and stderr. The latter is useful for emitting warnings and error messages to make them visible even when stdout has been redirected:

>>> sys.stderr.write('Warning, log file not found starting a new one\n')
Warning, log file not found starting a new one

The most direct way to terminate a script is to use sys.exit().

sys 还有 stdin, stdout 和 stderr 属性. 后者(stderr)对于发出警告和错误消息非常有用, 即使在重定向stdout时也可以看到它们:

>>> sys.stderr.write('Warning, log file not found starting a new one\n')
Warning, log file not found starting a new one

终止脚本最直接的方式是使用 sys.exit().
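The two ideas can be combined in a small script skeleton. This is only a sketch: main, the usage message and the exit status 2 are hypothetical choices, and SystemExit is caught here solely so the example can show the status instead of ending the interpreter:

```python
import sys


def main(argv):
    """Hypothetical entry point: return an exit status instead of exiting."""
    if len(argv) < 2:
        # Diagnostics go to stderr so they stay visible when stdout is redirected.
        print("usage: demo.py FILE", file=sys.stderr)
        return 2
    return 0


try:
    sys.exit(main(["demo.py"]))   # sys.exit() raises SystemExit
except SystemExit as exc:
    print("exit status:", exc.code)   # exit status: 2
```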

10.5 String Pattern Matching

10.5 字符串模式匹配

The re module provides regular expression tools for advanced string processing. For complex matching and manipulation, regular expressions offer succinct, optimized solutions:

>>> import re
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'

re 模块为高级字符串处理提供正则表达式工具. 为了复杂的匹配和操作, 正则表达式提供了简洁优化的解决方案:

>>> import re
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'

When only simple capabilities are needed, string methods are preferred because they are easier to read and debug:

>>> 'tea for too'.replace('too', 'two')
'tea for two'

当只需要简单的功能时, 字符串方法是首选的, 因为它们更容易阅读和调试:

>>> 'tea for too'.replace('too', 'two')
'tea for two'

10.6 Mathematics

10.6 数学

The math module gives access to the underlying C library functions for floating point math:

>>> import math
>>> math.cos(math.pi / 4)
0.70710678118654757
>>> math.log(1024, 2)
10.0

math 模块提供了对底层 C 库浮点数学函数的访问:

>>> import math
>>> math.cos(math.pi / 4)
0.70710678118654757
>>> math.log(1024, 2)
10.0

The random module provides tools for making random selections:

>>> import random
>>> random.choice(['apple', 'pear', 'banana'])
'apple'
>>> random.sample(range(100), 10) # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random() # random float
0.17970987693706186
>>> random.randrange(6) # random integer chosen from range(6)
4

random 模块提供了进行随机选择的工具:

>>> import random
>>> random.choice(['apple', 'pear', 'banana'])
'apple'
>>> random.sample(range(100), 10) # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random() # random float 0~1 的随机浮点数
0.17970987693706186
>>> random.randrange(6) # random integer chosen from range(6)  从 range(6) (即 0 到 5) 中取随机整数
4

The statistics module calculates basic statistical properties (the mean, median, variance, etc.) of numeric data:

>>> import statistics
>>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
>>> statistics.mean(data)
1.6071428571428572
>>> statistics.median(data)
1.25
>>> statistics.variance(data)
1.3720238095238095

statistics 模块计算数字数据的基本统计功能(均值, 中位数, 方差, 等等):

>>> import statistics
>>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
>>> statistics.mean(data)
1.6071428571428572
>>> statistics.median(data)
1.25
>>> statistics.variance(data)
1.3720238095238095

The SciPy project https://scipy.org has many other modules for numerical computations. SciPy项目https://scipy.org 有许多其他的数学计算模块.

10.7 Internet Access

10.7 互联网访问

There are a number of modules for accessing the internet and processing internet protocols. Two of the simplest are urllib.request for retrieving data from URLs and smtplib for sending mail:

>>> from urllib.request import urlopen
>>> with urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl') as response:
...     for line in response:
...         line = line.decode('utf-8') # Decoding the binary data to text.
...         if 'EST' in line or 'EDT' in line: # look for Eastern Time
...             print(line)

<BR>Nov. 25, 09:43:32 PM EST

>>> import smtplib
>>> server = smtplib.SMTP('localhost')
>>> server.sendmail('soothsayer@example.org', 'jcaesar@example.org',
... """To: jcaesar@example.org
... From: soothsayer@example.org
...
... Beware the Ides of March.
... """)
>>> server.quit()

(Note that the second example needs a mailserver running on localhost.)

这里有数个访问网络和处理网络协议的模块. 两个最简单的是: urllib.request, 从 URLs 取回数据; smtplib, 发送邮件.

10.8 Dates and Times

10.8 日期和时间

The datetime module supplies classes for manipulating dates and times in both simple and complex ways. While date and time arithmetic is supported, the focus of the implementation is on efficient member extraction for output formatting and manipulation. The module also supports objects that are timezone aware.

>>> # dates are easily constructed and formatted
>>> from datetime import date
>>> now = date.today()
>>> now
datetime.date(2003, 12, 2)
>>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
'12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'

>>> # dates support calendar arithmetic
>>> birthday = date(1964, 7, 31)
>>> age = now - birthday
>>> age.days
14368

datetime 模块提供了以简单和复杂两种方式操作日期和时间的类. 虽然支持日期和时间的运算, 但实现的重点是高效地提取成员以用于输出格式化和操作. 该模块还支持带时区信息的对象.

>>> # dates are easily constructed and formatted
>>> from datetime import date
>>> now = date.today()
>>> now
datetime.date(2003, 12, 2)
>>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
'12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'

>>> # dates support calendar arithmetic
>>> birthday = date(1964, 7, 31)
>>> age = now - birthday
>>> age.days
14368

10.9 Data Compression

10.9 数据压缩

Common data archiving and compression formats are directly supported by modules including: zlib, gzip, bz2, lzma, zipfile and tarfile.

>>> import zlib
>>> s = b'witch which has which witches wrist watch'
>>> len(s)
41
>>> t = zlib.compress(s)
>>> len(t)
37
>>> zlib.decompress(t)
b'witch which has which witches wrist watch'
>>> zlib.crc32(s)
226805979

通用的数据打包和压缩格式由以下模块直接支持, 包括: zlib, gzip, bz2, lzma, zipfile 和 tarfile.

>>> import zlib
>>> s = b'witch which has which witches wrist watch'
>>> len(s)
41
>>> t = zlib.compress(s)
>>> len(t)
37
>>> zlib.decompress(t)
b'witch which has which witches wrist watch'
>>> zlib.crc32(s)
226805979

10.10 Performance Measurement

10.10 性能评估

Some Python users develop a deep interest in knowing the relative performance of different approaches to the same problem. Python provides a measurement tool that answers those questions immediately. 一些 Python 用户对了解同一问题的不同方法的相对性能产生了浓厚的兴趣. Python 提供了一个评估工具, 为这些问题提供了直接答案.

For example, it may be tempting to use the tuple packing and unpacking feature instead of the traditional approach to swapping arguments. The timeit module quickly demonstrates a modest performance advantage:

>>> from timeit import Timer
>>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
0.57535828626024577
>>> Timer('a,b = b,a', 'a=1; b=2').timeit()
0.54962537085770791

例如, 使用元组打包和解包特性代替交换参数的传统方法是很诱人的. timeit 模块快速地显示适度的性能优势:

>>> from timeit import Timer
>>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
0.57535828626024577
>>> Timer('a,b = b,a', 'a=1; b=2').timeit()
0.54962537085770791

In contrast to timeit’s fine level of granularity, the profile and pstats modules provide tools for identifying time critical sections in larger blocks of code. 相对于 timeit 的精细粒度, profile 和 pstats 模块提供了在较大代码块中识别时间关键部分的工具.
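A minimal profiling sketch follows. It uses cProfile, the C implementation of the same interface as profile, which is a substitution made here rather than something the text names; slow_sum is a hypothetical function to profile:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum of squares below n."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Collect the report in a string buffer and show the five entries
# with the largest cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
print(buffer.getvalue())
```

The timings in the report vary from run to run and between machines, so no sample output is shown.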

10.11 Quality Control

10.11 质量控制

One approach for developing high quality software is to write tests for each function as it is developed and to run those tests frequently during the development process. 开发高质量软件的一个途径是在开发每个函数时为其编写测试, 并且在开发过程中经常运行这些测试.

The doctest module provides a tool for scanning a module and validating tests embedded in a program’s docstrings. Test construction is as simple as cutting-and-pasting a typical call along with its results into the docstring. This improves the documentation by providing the user with an example and it allows the doctest module to make sure the code remains true to the documentation:

def average(values):
    """Computes the arithmetic mean of a list of numbers.
    
    >>> print(average([20, 30, 70]))
    40.0
    """
    
    return sum(values) / len(values)

import doctest
doctest.testmod() # automatically validate the embedded tests

doctest 模块提供了一个工具, 用于扫描模块并验证嵌入在程序文档字符串中的测试. 构造测试很简单, 只需将典型调用及其结果剪切并粘贴到文档字符串中. 这通过为用户提供示例改进了文档, 并且让 doctest 模块能够确保代码与文档保持一致:

def average(values):
    """Computes the arithmetic mean of a list of numbers.
    
    >>> print(average([20, 30, 70]))
    40.0
    """
    
    return sum(values) / len(values)

import doctest
doctest.testmod() # 自动验证嵌入式测试

The unittest module is not as effortless as the doctest module, but it allows a more comprehensive set of tests to be maintained in a separate file:

import unittest

class TestStatisticalFunctions(unittest.TestCase):

    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
        with self.assertRaises(ZeroDivisionError):
            average([])
        with self.assertRaises(TypeError):
            average(20, 30, 70)

unittest.main() # Calling from the command line invokes all tests

unittest 模块不像 doctest 模块那么简单, 但它允许在单独的文件中维护更全面的测试集:

import unittest

class TestStatisticalFunctions(unittest.TestCase):

    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
        with self.assertRaises(ZeroDivisionError):
            average([])
        with self.assertRaises(TypeError):
            average(20, 30, 70)

unittest.main() # Calling 从命令行调用所有测试

10.12 Batteries Included

Python has a “batteries included” philosophy. This is best seen through the sophisticated and robust capabilities of its larger packages. For example:

  • The xmlrpc.client and xmlrpc.server modules make implementing remote procedure calls into an almost trivial task. Despite the modules’ names, no direct knowledge or handling of XML is needed.
  • The email package is a library for managing email messages, including MIME and other RFC 2822-based message documents. Unlike smtplib and poplib, which actually send and receive messages, the email package has a complete toolset for building or decoding complex message structures (including attachments) and for implementing internet encoding and header protocols.
  • The json package provides robust support for parsing this popular data interchange format. The csv module supports direct reading and writing of files in Comma-Separated Value format, commonly supported by databases and spreadsheets. XML processing is supported by the xml.etree.ElementTree, xml.dom and xml.sax packages. Together, these modules and packages greatly simplify data interchange between Python applications and other tools.
  • The sqlite3 module is a wrapper for the SQLite database library, providing a persistent database that can be updated and accessed using slightly nonstandard SQL syntax.
  • Internationalization is supported by a number of modules including gettext, locale, and the codecs package.
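
To give a feel for how little code data interchange takes, the following sketch passes the same records through json, csv and sqlite3. The records, the items table and its column names are hypothetical and invented purely for illustration.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample records, invented for illustration.
records = [
    {"name": "spam", "price": 2.5},
    {"name": "eggs", "price": 1.25},
]

# json: serialize the records to text and parse them back.
text = json.dumps(records)
round_tripped = json.loads(text)

# csv: write the records to an in-memory file, then read them back.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["name", "price"])
writer.writeheader()
writer.writerows(records)
rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))

# sqlite3: store the records in an in-memory database and query them with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, price REAL)")
conn.executemany("INSERT INTO items VALUES (:name, :price)", records)
total = conn.execute("SELECT SUM(price) FROM items").fetchone()[0]
conn.close()

print(round_tripped == records, rows[0], total)
```

Note that csv returns every field as a string, so numeric columns need converting after reading, whereas json and sqlite3 preserve the numeric types.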

  1. Except for one thing. Module objects have a secret read-only attribute called __dict__ which returns the dictionary used to implement the module’s namespace; the name __dict__ is an attribute but not a global name. Obviously, using this violates the abstraction of namespace implementation, and should be restricted to things like post-mortem debuggers.