A Brief Look at How Condition Works in the Python Standard Library threading Module

Environment used in this article: Python 3.5.2

How Condition is used in the threading module

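On Linux, CPython implements multithreading on top of the pthread library. The threading.py module exposes the thread-related operations, and one of its most important pieces is the Condition class: several other primitives in the module (Event, Semaphore and Barrier, for example) are implemented by operating on such a condition variable. Let's start with a sample program: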

import threading
import time
from threading import Condition


def test_v1():
    t = Condition()
    a = 1

    def test_th(n):
        print('thread start ', n)
        with t:
            print('thread start wait ', n)
            if t.wait(5):
                print('thread ', n)
            else:
                print("************************* ", n)
                print("sleep ", n)
                time.sleep(n)
                print("sleep over ", n)

            if n == 2:
                t.notify()

    threads = [threading.Thread(target=test_th, args=(i, )) for i in range(1, 10)]
    for th in threads:
        th.start()

    with t:
        print("start notify ")
        t.notify(2)


if __name__ == '__main__':
    test_v1()

Running it produces the following output:

thread start  1
thread start  2
thread start wait  2
thread start  3
thread start wait  3
thread start  4
thread start wait  4
thread start wait  1
thread start  5
thread start wait  5
thread start  6
thread start wait  6
thread start  7
thread start wait  7
thread start  8
thread start wait  8
thread start  9
start notify 
thread start wait  9
thread  2
thread  3
thread  4
*************************  1
sleep  1
sleep over  1
*************************  5
sleep  5
sleep over  5
*************************  6
sleep  6
sleep over  6
*************************  7
sleep  7
sleep over  7
*************************  8
sleep  8
sleep over  8
*************************  9
sleep  9
sleep over  9

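First, a walk through the execution flow of the sample. It is a typical multithreaded scenario. A single Condition instance is created and nine threads are started. Each thread prints 'thread start', acquires the condition's lock with "with t:", prints 'thread start wait', and then blocks in t.wait(5). wait() releases the lock while it blocks, which is why the threads can reach the waiting state one after another. If a thread is not notified within 5 seconds, wait() returns False and the thread runs the else branch.

After starting the threads, the main thread prints 'start notify' and calls t.notify(2), which wakes two of the waiting threads. Waiters are kept in a FIFO queue, and in this run the first three waiters were threads 2, 3 and 4, so threads 2 and 3 are woken. When thread 2 runs, it calls t.notify() (n defaults to 1), which wakes the next waiter, thread 4. In total, three threads are woken by a notification.

No other thread calls notify, so the remaining threads wait until t.wait(5) times out and then run the sleep-and-print code. That code runs inside the with block, so each thread holds the lock while it sleeps and the timed-out threads finish one at a time. On Linux these primitives are built on the pthread library, so all of the calls ultimately rely on pthread mutexes and condition variables.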
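The return value of wait() is what separates the two branches in the sample. As a minimal sketch (the function name demo_wait_result and the Event used for synchronization are illustrative choices, not part of the original sample), the following shows wait() returning True after a notification and False after a timeout:

```python
import threading


def demo_wait_result():
    """Return what wait() reports when notified and when it times out."""
    cond = threading.Condition()
    waiting = threading.Event()      # hypothetical helper: set once the waiter holds the lock
    results = {}

    def waiter():
        with cond:
            waiting.set()
            # wait() releases the lock, blocks, and re-acquires it before returning.
            results["notified"] = cond.wait(5)

    worker = threading.Thread(target=waiter)
    worker.start()
    waiting.wait()
    # The lock only becomes available once the waiter is blocked inside wait(),
    # so this notify() cannot be missed.
    with cond:
        cond.notify()
    worker.join()

    with cond:
        # Nobody notifies this time, so wait() gives up after the timeout.
        results["after_timeout"] = cond.wait(0.05)
    return results


if __name__ == "__main__":
    print(demo_wait_result())
```

Here "notified" ends up True and "after_timeout" ends up False, which corresponds to the if and else branches of test_th.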

Implementation of Condition in the threading module

The main implementation of the Condition class is as follows:

class Condition:
    """Class that implements a condition variable.

    A condition variable allows one or more threads to wait until they are
    notified by another thread.

    If the lock argument is given and not None, it must be a Lock or RLock
    object, and it is used as the underlying lock. Otherwise, a new RLock object
    is created and used as the underlying lock.

    """

    def __init__(self, lock=None):
        if lock is None:
            lock = RLock()                  # no lock passed in: create an RLock by default
        self._lock = lock
        # Export the lock's acquire() and release() methods
        self.acquire = lock.acquire         # expose the lock's acquire method
        self.release = lock.release         # expose the lock's release method
        # If the lock defines _release_save() and/or _acquire_restore(),
        # these override the default implementations (which just call
        # release() and acquire() on the lock).  Ditto for _is_owned().
        try:
            self._release_save = lock._release_save
        except AttributeError:
            pass
        try:
            self._acquire_restore = lock._acquire_restore
        except AttributeError:
            pass
        try:
            self._is_owned = lock._is_owned
        except AttributeError:
            pass
        self._waiters = _deque()            # queue of waiter locks, one per waiting thread

    def __enter__(self):
        return self._lock.__enter__()       # delegate to the lock's context manager

    def __exit__(self, *args):
        return self._lock.__exit__(*args)

    def __repr__(self):
        return "<Condition(%s, %d)>" % (self._lock, len(self._waiters))

    def _release_save(self):
        self._lock.release()           # No state to save; just release the lock

    def _acquire_restore(self, x):
        self._lock.acquire()           # Ignore saved state; just acquire the lock

    def _is_owned(self):
        # Return True if lock is owned by current_thread.
        # This method is called only if _lock doesn't have _is_owned().
        if self._lock.acquire(0):           # non-blocking attempt to take the lock
            self._lock.release()            # it succeeded, so the lock was not held
            return False
        else:
            return True                     # the attempt failed: the lock is already held

    def wait(self, timeout=None):
        """Wait until notified or until a timeout occurs.

        If the calling thread has not acquired the lock when this method is
        called, a RuntimeError is raised.

        This method releases the underlying lock, and then blocks until it is
        awakened by a notify() or notify_all() call for the same condition
        variable in another thread, or until the optional timeout occurs. Once
        awakened or timed out, it re-acquires the lock and returns.

        When the timeout argument is present and not None, it should be a
        floating point number specifying a timeout for the operation in seconds
        (or fractions thereof).

        When the underlying lock is an RLock, it is not released using its
        release() method, since this may not actually unlock the lock when it
        was acquired multiple times recursively. Instead, an internal interface
        of the RLock class is used, which really unlocks it even when it has
        been recursively acquired several times. Another internal interface is
        then used to restore the recursion level when the lock is reacquired.

        """
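        if not self._is_owned():                 # the caller must hold the condition's lock
            raise RuntimeError("cannot wait on un-acquired lock")
        waiter = _allocate_lock()                # allocate a new plain lock (implemented in C) for this waiter
        waiter.acquire()                         # take it once so the next acquire() blocks
        self._waiters.append(waiter)             # add it to the queue of waiters
        saved_state = self._release_save()       # release the condition's lock while waiting
        gotit = False
        try:    # restore state no matter what (e.g., KeyboardInterrupt)
            if timeout is None:                  # no timeout given
                waiter.acquire()                 # block until notify() releases the waiter lock
                gotit = True                     # woken by a notification
            else:
                if timeout > 0:                  # positive timeout
                    gotit = waiter.acquire(True, timeout)    # block until notified or timed out
                else:
                    gotit = waiter.acquire(False)            # timeout <= 0: non-blocking attempt, returns at once
            return gotit
        finally:
            self._acquire_restore(saved_state)   # re-acquire the condition's lock before returning
            if not gotit:
                try:
                    self._waiters.remove(waiter) # not notified (timed out): drop the waiter from the queue
                except ValueError:
                    pass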
                except ValueError:
                    pass

    def wait_for(self, predicate, timeout=None):
        """Wait until a condition evaluates to True.

        predicate should be a callable which result will be interpreted as a
        boolean value.  A timeout may be provided giving the maximum time to
        wait.

        """
        endtime = None
        waittime = timeout
        result = predicate()
        while not result:
            if waittime is not None:
                if endtime is None:
                    endtime = _time() + waittime
                else:
                    waittime = endtime - _time()
                    if waittime <= 0:
                        break
            self.wait(waittime)
            result = predicate()
        return result

    def notify(self, n=1):
        """Wake up one or more threads waiting on this condition, if any.

        If the calling thread has not acquired the lock when this method is
        called, a RuntimeError is raised.

        This method wakes up at most n of the threads waiting for the condition
        variable; it is a no-op if no threads are waiting.

        """
        if not self._is_owned():                             # the caller must hold the lock
            raise RuntimeError("cannot notify on un-acquired lock")
        all_waiters = self._waiters                          # the queue of all waiters
        waiters_to_notify = _deque(_islice(all_waiters, n))  # take the first n waiter locks (FIFO)
        if not waiters_to_notify:                            # nobody is waiting: nothing to do
            return
        for waiter in waiters_to_notify:                     # for each selected waiter
            waiter.release()                                 # release its lock, which unblocks its wait()
            try:
                all_waiters.remove(waiter)                   # and remove it from the queue
            except ValueError:
                pass

    def notify_all(self):
        """Wake up all threads waiting on this condition.

        If the calling thread has not acquired the lock when this method
        is called, a RuntimeError is raised.

        """
        self.notify(len(self._waiters))                      # wake every waiter

    notifyAll = notify_all
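The core trick in the class above is that every waiting thread blocks on its own freshly allocated lock, and notify() simply releases some of those locks. The following is a simplified sketch of that idea, not the standard library code: the names MiniCondition and demo are hypothetical, only a plain Lock is supported, and the ownership checks (_is_owned) are left out.

```python
import threading
from collections import deque


class MiniCondition:
    """Illustrative condition variable built only from plain locks."""

    def __init__(self):
        self._lock = threading.Lock()    # the lock callers hold around wait()/notify()
        self._waiters = deque()          # one locked Lock per blocked thread

    def __enter__(self):
        self._lock.acquire()
        return self

    def __exit__(self, *exc_info):
        self._lock.release()

    def wait(self, timeout=None):
        waiter = threading.Lock()
        waiter.acquire()                 # taken once so the next acquire blocks
        self._waiters.append(waiter)
        self._lock.release()             # let other threads in while we block
        notified = False
        try:
            if timeout is None:
                notified = waiter.acquire()
            elif timeout > 0:
                notified = waiter.acquire(True, timeout)
            else:
                notified = waiter.acquire(False)
            return notified
        finally:
            self._lock.acquire()         # always return holding the lock
            if not notified and waiter in self._waiters:
                self._waiters.remove(waiter)

    def notify(self, n=1):
        # Releasing a waiter's lock is what unblocks that thread's wait().
        for _ in range(min(n, len(self._waiters))):
            self._waiters.popleft().release()


def demo():
    cond = MiniCondition()
    blocked = threading.Event()
    seen = []

    def worker():
        with cond:
            blocked.set()
            seen.append(cond.wait(5))    # woken by notify()

    t = threading.Thread(target=worker)
    t.start()
    blocked.wait()
    with cond:
        cond.notify()
    t.join()
    with cond:
        seen.append(cond.wait(0.05))     # nobody notifies: times out
    return seen
```

Because the waiter is appended to the queue before the main lock is released, a notify() issued by a thread that acquires the lock afterwards always finds it.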

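When no lock is passed in, Condition creates an RLock instance by default. RLock is normally the C implementation (_thread.RLock), with the pure-Python _PyRLock as a fallback: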

try:
    _CRLock = _thread.RLock
except AttributeError:
    _CRLock = None

...

def RLock(*args, **kwargs):
    """Factory function that returns a new reentrant lock.

    A reentrant lock must be released by the thread that acquired it. Once a
    thread has acquired a reentrant lock, the same thread may acquire it again
    without blocking; the thread must release it once for each time it has
    acquired it.

    """
    if _CRLock is None:
        return _PyRLock(*args, **kwargs)
    return _CRLock(*args, **kwargs)
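The docstring above describes the reentrancy contract. As a small sketch (the function names are illustrative), the following contrasts an RLock, which the owning thread may acquire repeatedly, with a plain Lock, which it may not:

```python
import threading


def rlock_is_reentrant(depth=3):
    """Acquire an RLock `depth` times from one thread, then release it as often."""
    lock = threading.RLock()
    acquired = [lock.acquire(blocking=False) for _ in range(depth)]
    for _ in range(depth):
        lock.release()
    try:
        lock.release()           # one release too many
    except RuntimeError:
        balanced = True          # the lock was already fully released
    else:
        balanced = False
    return all(acquired) and balanced


def plain_lock_is_reentrant():
    """A plain Lock cannot be taken twice, even by the same thread."""
    lock = threading.Lock()
    lock.acquire()
    second = lock.acquire(blocking=False)   # fails instead of deadlocking
    lock.release()
    return second
```

This is also why Condition.wait() uses _release_save() rather than release() for an RLock: a single release() would not free a lock that was acquired several times.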

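When a Condition is used in a with statement, the __enter__ and __exit__ methods of the underlying RLock are called to acquire and release the lock. The methods and type definition of RLock in C are as follows: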

static PyMethodDef rlock_methods[] = {
    {"acquire",      (PyCFunction)rlock_acquire,
     METH_VARARGS | METH_KEYWORDS, rlock_acquire_doc},
    {"release",      (PyCFunction)rlock_release,
     METH_NOARGS, rlock_release_doc},
    {"_is_owned",     (PyCFunction)rlock_is_owned,
     METH_NOARGS, rlock_is_owned_doc},
    {"_acquire_restore", (PyCFunction)rlock_acquire_restore,
     METH_VARARGS, rlock_acquire_restore_doc},
    {"_release_save", (PyCFunction)rlock_release_save,
     METH_NOARGS, rlock_release_save_doc},
    {"__enter__",    (PyCFunction)rlock_acquire,
     METH_VARARGS | METH_KEYWORDS, rlock_acquire_doc},   /* entering the with block acquires the lock, same as acquire() */
    {"__exit__",    (PyCFunction)rlock_release,          /* leaving the with block releases the lock, same as release() */
     METH_VARARGS, rlock_release_doc},
    {NULL,           NULL}              /* sentinel */
};


static PyTypeObject RLocktype = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "_thread.RLock",                    /*tp_name*/
    sizeof(rlockobject),                /*tp_size*/
    0,                                  /*tp_itemsize*/
    /* methods */
    (destructor)rlock_dealloc,          /*tp_dealloc*/
    0,                                  /*tp_print*/
    0,                                  /*tp_getattr*/
    0,                                  /*tp_setattr*/
    0,                                  /*tp_reserved*/
    (reprfunc)rlock_repr,               /*tp_repr*/
    0,                                  /*tp_as_number*/
    0,                                  /*tp_as_sequence*/
    0,                                  /*tp_as_mapping*/
    0,                                  /*tp_hash*/
    0,                                  /*tp_call*/
    0,                                  /*tp_str*/
    0,                                  /*tp_getattro*/
    0,                                  /*tp_setattro*/
    0,                                  /*tp_as_buffer*/
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */
    0,                                  /*tp_doc*/
    0,                                  /*tp_traverse*/
    0,                                  /*tp_clear*/
    0,                                  /*tp_richcompare*/
    offsetof(rlockobject, in_weakreflist), /*tp_weaklistoffset*/
    0,                                  /*tp_iter*/
    0,                                  /*tp_iternext*/
    rlock_methods,                      /*tp_methods*/
    0,                                  /* tp_members */
    0,                                  /* tp_getset */
    0,                                  /* tp_base */
    0,                                  /* tp_dict */
    0,                                  /* tp_descr_get */
    0,                                  /* tp_descr_set */
    0,                                  /* tp_dictoffset */
    0,                                  /* tp_init */
    PyType_GenericAlloc,                /* tp_alloc */
    rlock_new                           /* tp_new */
};
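Since __enter__ and __exit__ map straight onto acquire and release, the effect of "with cond:" can be observed from Python. A minimal sketch (function names are illustrative; a plain Lock is passed in only so that locked() can be inspected):

```python
import threading


def with_block_holds_lock():
    lock = threading.Lock()
    cond = threading.Condition(lock)   # pass a plain Lock so locked() can be inspected
    with cond:                         # __enter__ acquires the underlying lock
        inside = lock.locked()
    after = lock.locked()              # __exit__ released it
    return inside, after


def wait_without_lock_fails():
    cond = threading.Condition()
    try:
        cond.wait(0)                   # the lock is not held here
    except RuntimeError:
        return True                    # "cannot wait on un-acquired lock"
    return False
```

The second function exercises the _is_owned() check at the top of wait(): calling wait() outside the with block raises RuntimeError.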

Next, look at what rlock_acquire does:

static PyObject *
rlock_acquire(rlockobject *self, PyObject *args, PyObject *kwds)
{
    _PyTime_t timeout;
    long tid;
    PyLockStatus r = PY_LOCK_ACQUIRED;

    if (lock_acquire_parse_args(args, kwds, &timeout) < 0)    /* parse and validate the arguments; with none given, timeout is -1 (block indefinitely) */
        return NULL;

    tid = PyThread_get_thread_ident();                        /* identifier of the calling thread */
    if (self->rlock_count > 0 && tid == self->rlock_owner) {  /* the calling thread already owns the lock */
        unsigned long count = self->rlock_count + 1;
        if (count <= self->rlock_count) {
            PyErr_SetString(PyExc_OverflowError,
                            "Internal lock count overflowed");
            return NULL;
        }
        self->rlock_count = count;                            /* bump the recursion count */
        Py_RETURN_TRUE;                                       /* and return True without blocking */
    }
    r = acquire_timed(self->rlock_lock, timeout);             /* otherwise acquire the underlying lock, honoring the timeout */
    if (r == PY_LOCK_ACQUIRED) {                              /* first acquisition by this thread */
        assert(self->rlock_count == 0);                       /* nobody may own it at this point */
        self->rlock_owner = tid;                              /* record the owner */
        self->rlock_count = 1;                                /* and start the count at 1 */
    }
    else if (r == PY_LOCK_INTR) {
        return NULL;
    }

    return PyBool_FromLong(r == PY_LOCK_ACQUIRED);            /* True if the lock was acquired */
}



static PyLockStatus
acquire_timed(PyThread_type_lock lock, _PyTime_t timeout)
{
    PyLockStatus r;
    _PyTime_t endtime = 0;
    _PyTime_t microseconds;

    if (timeout > 0)
        endtime = _PyTime_GetMonotonicClock() + timeout;      /* compute the deadline from the current time */

    do {
        microseconds = _PyTime_AsMicroseconds(timeout, _PyTime_ROUND_CEILING);    /* convert the timeout to microseconds */

        /* first a simple non-blocking try without releasing the GIL */
        r = PyThread_acquire_lock_timed(lock, 0, 0);          /* non-blocking attempt */
        if (r == PY_LOCK_FAILURE && microseconds != 0) {
            Py_BEGIN_ALLOW_THREADS
            r = PyThread_acquire_lock_timed(lock, microseconds, 1);
            Py_END_ALLOW_THREADS
        }

        if (r == PY_LOCK_INTR) {
            /* Run signal handlers if we were interrupted.  Propagate
             * exceptions from signal handlers, such as KeyboardInterrupt, by
             * passing up PY_LOCK_INTR.  */
            if (Py_MakePendingCalls() < 0) {
                return PY_LOCK_INTR;
            }

            /* If we're using a timeout, recompute the timeout after processing
             * signals, since those can take time.  */
            if (timeout > 0) {
                timeout = endtime - _PyTime_GetMonotonicClock();

                /* Check for negative values, since those mean block forever.
                 */
                if (timeout < 0) {
                    r = PY_LOCK_FAILURE;
                }
            }
        }
    } while (r == PY_LOCK_INTR);  /* Retry if we were interrupted. */

    return r;
}


PyLockStatus
PyThread_acquire_lock_timed(PyThread_type_lock lock, PY_TIMEOUT_T microseconds,
                            int intr_flag)
{
    PyLockStatus success;
    pthread_lock *thelock = (pthread_lock *)lock;            /* the lock passed in */
    int status, error = 0;

    dprintf(("PyThread_acquire_lock_timed(%p, %lld, %d) called\n",
             lock, microseconds, intr_flag));

    status = pthread_mutex_lock( &thelock->mut );            /* lock the internal mutex that protects the locked flag */
    CHECK_STATUS("pthread_mutex_lock[1]");

    if (thelock->locked == 0) {                              /* the lock is free */
        success = PY_LOCK_ACQUIRED;                          /* so it is acquired immediately */
    } else if (microseconds == 0) {                          /* held, and the caller asked not to wait */
        success = PY_LOCK_FAILURE;                           /* so fail immediately */
    } else {
        struct timespec ts;
        if (microseconds > 0)
            MICROSECONDS_TO_TIMESPEC(microseconds, ts);
        /* continue trying until we get the lock */

        /* mut must be locked by me -- part of the condition
         * protocol */
        success = PY_LOCK_FAILURE;
        while (success == PY_LOCK_FAILURE) {                 /* loop until acquired, timed out or interrupted */
            if (microseconds > 0) {                          /* finite timeout */
                status = pthread_cond_timedwait(
                    &thelock->lock_released,
                    &thelock->mut, &ts);                     /* wait for a release signal until the deadline */
                if (status == ETIMEDOUT)
                    break;
                CHECK_STATUS("pthread_cond_timed_wait");
            }
            else {
                status = pthread_cond_wait(
                    &thelock->lock_released,
                    &thelock->mut);                          /* no timeout: block until a release is signaled */
                CHECK_STATUS("pthread_cond_wait");
            }

            if (intr_flag && status == 0 && thelock->locked) {
                /* We were woken up, but didn't get the lock.  We probably received
                 * a signal.  Return PY_LOCK_INTR to allow the caller to handle
                 * it and retry.  */
                success = PY_LOCK_INTR;
                break;
            } else if (status == 0 && !thelock->locked) {
                success = PY_LOCK_ACQUIRED;                  /* woken normally and the lock is free: acquired */
            } else {
                success = PY_LOCK_FAILURE;
            }
        }
    }
    if (success == PY_LOCK_ACQUIRED) thelock->locked = 1;    /* mark the lock as held */
    status = pthread_mutex_unlock( &thelock->mut );          /* unlock the internal mutex */
    CHECK_STATUS("pthread_mutex_unlock[1]"); 

    if (error) success = PY_LOCK_FAILURE;
    dprintf(("PyThread_acquire_lock_timed(%p, %lld, %d) -> %d\n",
             lock, microseconds, intr_flag, success));
    return success;
}
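The owner-and-count bookkeeping in rlock_acquire can be expressed in a few lines of Python. This is a simplified sketch in the spirit of the C code, not a copy of it: the class name SketchRLock is hypothetical, and a plain threading.Lock stands in for rlock_lock.

```python
import threading


class SketchRLock:
    """Illustrative reentrant lock: a plain lock plus an owner id and a count."""

    def __init__(self):
        self._block = threading.Lock()   # stands in for rlock_lock
        self._owner = None               # stands in for rlock_owner
        self._count = 0                  # stands in for rlock_count

    def acquire(self, blocking=True, timeout=-1):
        me = threading.get_ident()
        if self._count > 0 and self._owner == me:
            self._count += 1             # already the owner: just count, never block
            return True
        got = self._block.acquire(blocking, timeout)
        if got:                          # first acquisition by this thread
            self._owner = me
            self._count = 1
        return got

    def release(self):
        if self._count == 0 or self._owner != threading.get_ident():
            raise RuntimeError("cannot release un-acquired lock")
        self._count -= 1
        if self._count == 0:             # last release: really free the lock
            self._owner = None
            self._block.release()
```

Only the owning thread ever takes the fast path, so another thread calling acquire(blocking=False) while the lock is held simply gets False back.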

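The flow of PyThread_acquire_lock_timed is as follows: holding the internal mutex, it first checks whether the lock is free. If it is not, the timeout decides what happens next: 0 fails immediately, a positive value waits on the condition variable until the deadline, and a negative value blocks until the lock is released. This is the standard pthread condition-variable pattern. The flow of rlock_release is left for the reader. Back in wait(), the line waiter = _allocate_lock() calls the C function thread_PyThread_allocate_lock: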

static PyObject *
thread_PyThread_allocate_lock(PyObject *self)
{
    return (PyObject *) newlockobject();
}



static lockobject *
newlockobject(void)
{
    lockobject *self;
    self = PyObject_New(lockobject, &Locktype);
    if (self == NULL)
        return NULL;
    self->lock_lock = PyThread_allocate_lock();
    self->locked = 0;
    self->in_weakreflist = NULL;
    if (self->lock_lock == NULL) {
        Py_DECREF(self);
        PyErr_SetString(ThreadError, "can't allocate lock");
        return NULL;
    }
    return self;
}

What it creates is a lock object, whose methods, type definition and acquire/release functions are as follows:

static PyMethodDef lock_methods[] = {
    {"acquire_lock", (PyCFunction)lock_PyThread_acquire_lock,
     METH_VARARGS | METH_KEYWORDS, acquire_doc},
    {"acquire",      (PyCFunction)lock_PyThread_acquire_lock,
     METH_VARARGS | METH_KEYWORDS, acquire_doc},
    {"release_lock", (PyCFunction)lock_PyThread_release_lock,
     METH_NOARGS, release_doc},
    {"release",      (PyCFunction)lock_PyThread_release_lock,
     METH_NOARGS, release_doc},
    {"locked_lock",  (PyCFunction)lock_locked_lock,
     METH_NOARGS, locked_doc},
    {"locked",       (PyCFunction)lock_locked_lock,
     METH_NOARGS, locked_doc},
    {"__enter__",    (PyCFunction)lock_PyThread_acquire_lock,
     METH_VARARGS | METH_KEYWORDS, acquire_doc},
    {"__exit__",    (PyCFunction)lock_PyThread_release_lock,
     METH_VARARGS, release_doc},
    {NULL,           NULL}              /* sentinel */
};

static PyTypeObject Locktype = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "_thread.lock",                     /*tp_name*/
    sizeof(lockobject),                 /*tp_size*/
    0,                                  /*tp_itemsize*/
    /* methods */
    (destructor)lock_dealloc,           /*tp_dealloc*/
    0,                                  /*tp_print*/
    0,                                  /*tp_getattr*/
    0,                                  /*tp_setattr*/
    0,                                  /*tp_reserved*/
    (reprfunc)lock_repr,                /*tp_repr*/
    0,                                  /*tp_as_number*/
    0,                                  /*tp_as_sequence*/
    0,                                  /*tp_as_mapping*/
    0,                                  /*tp_hash*/
    0,                                  /*tp_call*/
    0,                                  /*tp_str*/
    0,                                  /*tp_getattro*/
    0,                                  /*tp_setattro*/
    0,                                  /*tp_as_buffer*/
    Py_TPFLAGS_DEFAULT,                 /*tp_flags*/
    0,                                  /*tp_doc*/
    0,                                  /*tp_traverse*/
    0,                                  /*tp_clear*/
    0,                                  /*tp_richcompare*/
    offsetof(lockobject, in_weakreflist), /*tp_weaklistoffset*/
    0,                                  /*tp_iter*/
    0,                                  /*tp_iternext*/
    lock_methods,                       /*tp_methods*/
};



static PyObject *
lock_PyThread_acquire_lock(lockobject *self, PyObject *args, PyObject *kwds)
{
    _PyTime_t timeout;
    PyLockStatus r;

    if (lock_acquire_parse_args(args, kwds, &timeout) < 0)
        return NULL;

    r = acquire_timed(self->lock_lock, timeout);    /* same acquire_timed path as described above */
    if (r == PY_LOCK_INTR) {
        return NULL;
    }

    if (r == PY_LOCK_ACQUIRED)
        self->locked = 1;
    return PyBool_FromLong(r == PY_LOCK_ACQUIRED);
}




static PyObject *
lock_PyThread_release_lock(lockobject *self)
{
    /* Sanity check: the lock must be locked */
    if (!self->locked) {
        PyErr_SetString(ThreadError, "release unlocked lock");
        return NULL;
    }

    PyThread_release_lock(self->lock_lock);
    self->locked = 0;
    Py_INCREF(Py_None);
    return Py_None;
}



void
PyThread_release_lock(PyThread_type_lock lock)
{
    pthread_lock *thelock = (pthread_lock *)lock;
    int status, error = 0;

    (void) error; /* silence unused-but-set-variable warning */
    dprintf(("PyThread_release_lock(%p) called\n", lock));

    status = pthread_mutex_lock( &thelock->mut );               /* first lock the internal mutex */
    CHECK_STATUS("pthread_mutex_lock[3]");

    thelock->locked = 0;                                        /* mark the lock as free */

    /* wake up someone (anyone, if any) waiting on the lock */
    status = pthread_cond_signal( &thelock->lock_released );    /* wake one waiting thread */
    CHECK_STATUS("pthread_cond_signal");

    status = pthread_mutex_unlock( &thelock->mut );             /* unlock the internal mutex */
    CHECK_STATUS("pthread_mutex_unlock[3]");
}
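The acquire and release functions above follow a common pattern: a boolean flag protected by a mutex, plus a condition variable that is signaled when the flag is cleared. As a rough Python analogue (the class name FlagLock is hypothetical, and threading.Condition stands in for the mut and lock_released pair, which is somewhat circular since Condition is itself built on locks):

```python
import threading


class FlagLock:
    """Illustrative lock: a flag guarded by a mutex and a condition variable."""

    def __init__(self):
        # One Condition plays the role of both `mut` and `lock_released`.
        self._released = threading.Condition(threading.Lock())
        self._locked = False             # plays the role of thelock->locked

    def acquire(self, timeout=None):
        """timeout=None blocks; 0 is a non-blocking try; > 0 waits that many seconds."""
        with self._released:
            # Wait until the flag is clear, or give up when the timeout expires.
            free = self._released.wait_for(lambda: not self._locked, timeout)
            if free:
                self._locked = True      # mark the lock as held
            return free

    def release(self):
        with self._released:
            self._locked = False         # mark the lock as free
            self._released.notify()      # wake one waiter, like pthread_cond_signal
```

A short usage example: after lock = FlagLock() and lock.acquire(), a second lock.acquire(timeout=0.05) returns False, and once lock.release() has been called, lock.acquire(timeout=0) succeeds.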

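This covers the lock's basic attributes and its acquire and release paths. All of these operations are built around a pthread condition variable plus a mutex, which together provide the waiting and wake-up mechanism between threads. (thread_pthread.h also contains a semaphore-based variant of these functions, chosen at build time when POSIX semaphores are usable; the listings above are the mutex-plus-condition-variable variant.) That completes the analysis of the basic implementation of Python's thread synchronization.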

Summary

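From the threading module's implementation we can see that, on Linux, Python's multithreading primitives are implemented with the pthread library, mainly by combining condition variables and mutexes to provide communication and wake-up between Python threads. Interested readers can consult the pthread documentation for details. My knowledge is limited, so corrections are welcome.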