Thread Pools: Principles and Implementation

I. Introduction to Thread Pools

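The usual way to use multithreading is to create a new thread when one is needed, run a specific task in it, and let the thread exit when the task completes. For most applications this is sufficient, since they rarely need to create large numbers of threads that each run one short task and are then destroyed.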

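Some web, email, and database applications are different. A ring-back tone service, for example, must be ready to handle a huge number of connection requests at any moment, yet each request may involve only a very simple task that takes little processing time. Such an application can end up constantly creating and destroying threads. Creating a thread is much cheaper than creating a process, but when threads are created frequently and each one runs only briefly, the overhead that creation and destruction place on the processor becomes significant.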

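A thread pool reduces this overhead. Thread pools generally rely on pre-creation: a fixed number of threads is created when the application starts. While running, the application requests an idle thread from the pool whenever it has a task to execute. When the task finishes, the thread is not destroyed but returned to the pool, which manages it from then on. If every pre-created thread is already in use when a new task arrives, the pool creates a new thread dynamically to serve the request. Conversely, during periods when the application has few tasks to run, most threads in the pool sit idle, and the pool should destroy some of them to save system resources. A thread pool therefore needs a manager that adjusts the number of threads dynamically according to some policy.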

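With this technique, the cost of creating and destroying threads is amortized over the tasks the pool executes: the more tasks are run, the smaller the share of that cost borne by each one.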
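To make the amortization concrete, here is a minimal sketch in C. The function name amortized_overhead_us and the cost figures used with it are illustrative assumptions, not measurements: the function simply divides the one-time cost of pre-creating the pool's threads by the number of tasks executed.

```c
#include <assert.h>

/* Hypothetical helper: per-task share of the one-time thread creation cost.
 * create_cost_us: assumed cost of creating and later destroying one thread
 * pool_threads:   number of threads pre-created by the pool
 * tasks:          number of tasks the pool has executed (must be > 0) */
double amortized_overhead_us(double create_cost_us, int pool_threads, long tasks)
{
    assert(tasks > 0);
    return create_cost_us * pool_threads / (double)tasks;
}
```

With an assumed cost of 100 microseconds per thread and 4 pooled threads, the per-task overhead falls from 100 microseconds after 4 tasks to 0.4 microseconds after 1,000 tasks, whereas a thread-per-task design pays the full 100 microseconds every time.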

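If the cost of creating and destroying a thread is negligible compared with the cost of the task it runs, a thread pool is unnecessary. FTP and Telnet servers, whose sessions are long-lived, are examples.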

II. Thread Pool Design

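The following implements a simple thread pool in C. To make the library more convenient to use, the C implementation borrows some object-oriented ideas. Unlike Objective-C, it merely uses struct to emulate a C++ class, an approach that appears throughout the Linux kernel.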
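The struct-as-class idea can be sketched independently of the thread pool. In the minimal example below, the type counter and the functions create_counter, counter_next and counter_destroy are hypothetical names chosen for illustration. The struct carries its own function pointers, and each "method" receives the instance explicitly, which is the same shape the pool uses with its this parameter.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct counter_s counter;        /* hypothetical example type */

struct counter_s {
    int  (*next)(counter *self);         /* "method": increment and return the value */
    void (*destroy)(counter *self);      /* "destructor" */
    int  value;                          /* "member variable" */
};

static int counter_next(counter *self)
{
    return ++self->value;
}

static void counter_destroy(counter *self)
{
    free(self);
}

/* "Constructor": allocates the object and wires up its function pointers. */
counter *create_counter(int start)
{
    counter *c = (counter *)malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->next = counter_next;
    c->destroy = counter_destroy;
    c->value = start;
    return c;
}
```

A caller writes c->next(c) where C++ would write c->next(); the instance pointer that C++ passes implicitly is passed by hand.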

The user-facing interfaces of the library are:


typedef struct tp_work_desc_s tp_work_desc;     //information a worker thread needs to run a task
typedef struct tp_work_s tp_work;               //the task a thread executes
typedef struct tp_thread_info_s tp_thread_info; //per-thread state: id, idle or busy, assigned task
typedef struct tp_thread_pool_s tp_thread_pool; //the pool's operations and state

//thread parm
struct tp_work_desc_s{
        ……
};

//base thread struct
struct tp_work_s{
        //main process function. user interface
        void (*process_job)(tp_work *this, tp_work_desc *job);
};

tp_thread_pool *creat_thread_pool(int min_num, int max_num);


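tp_work_desc_s holds the information a worker thread needs to run a task. It is passed to the thread as its argument, varies from one application to another, and must be defined by the user. tp_work_s is the task we want the thread to execute. When requesting a thread, you must specify both structures, that is, which task the thread should perform and what additional information the task needs. The interface function creat_thread_pool creates a thread pool instance; the caller specifies the minimum number of threads, min_num, and the maximum, max_num, that the pool may hold. The minimum is the number of threads pre-created when the pool is initialized, and it directly affects how effective the pool is. If it is too small, the pre-created threads are soon all in use and new threads must be created to keep up with requests; if it is too large, many threads may sit idle. Choose it according to the actual needs of the application. The structure that describes the thread pool is: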

     

//main thread pool struct
struct tp_thread_pool_s{
        TPBOOL (*init)(tp_thread_pool *this);
        void (*close)(tp_thread_pool *this);
        void (*process_job)(tp_thread_pool *this, tp_work *worker, tp_work_desc *job);
        int  (*get_thread_by_id)(tp_thread_pool *this, int id);
        TPBOOL (*add_thread)(tp_thread_pool *this);
        TPBOOL (*delete_thread)(tp_thread_pool *this);
        int (*get_tp_status)(tp_thread_pool *this);

        int min_th_num;                //min thread number in the pool
        int cur_th_num;                //current thread number in the pool
        int max_th_num;                //max thread number in the pool
        pthread_mutex_t tp_lock;
        pthread_t manage_thread_id;    //manage thread id num
        tp_thread_info *thread_info;   //work thread relative thread info
};

tp_thread_info_s records each thread's id, whether it is idle, and the task it is executing. Users do not need to deal with it.

     

//thread info
struct tp_thread_info_s{
        pthread_t       thread_id;     //thread id num
        TPBOOL          is_busy;       //thread status: true-busy; false-idle
        pthread_cond_t  thread_cond;
        pthread_mutex_t thread_lock;
        tp_work         *th_work;
        tp_work_desc    *th_job;
};

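The tp_thread_pool_s structure contains the interfaces and variables for operating on the pool. After creat_thread_pool returns a pool instance, you must first call its init interface explicitly to initialize it. During initialization the pool pre-creates the specified minimum number of threads. They all start out blocked, so they consume no CPU time, though they do occupy some memory. init also creates a management thread that runs for the lifetime of the pool. It periodically examines the pool's status and, if too many threads are idle, deletes some of them, but it never lets the total number of threads fall below the specified minimum.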
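How a pre-created thread can stay blocked without using CPU time, and then wake up to run a task, can be sketched with a mutex and a condition variable. The example below is a self-contained illustration and not the library's code: worker_slot, job_fn, slot_worker, add_one and run_one_job_demo are hypothetical names. The worker sleeps in pthread_cond_wait inside a loop that re-checks its predicate, so a signal sent before the worker starts waiting is not lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void (*job_fn)(void *arg);       /* hypothetical job signature */

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    job_fn          job;                 /* NULL means no task is pending */
    void           *arg;
    int             quit;                /* set to ask the worker to exit */
} worker_slot;

/* Worker: sleep on the condition variable until a job or a quit request arrives. */
static void *slot_worker(void *p)
{
    worker_slot *s = (worker_slot *)p;
    for (;;) {
        pthread_mutex_lock(&s->lock);
        while (s->job == NULL && !s->quit)   /* re-check after every wakeup */
            pthread_cond_wait(&s->cond, &s->lock);
        if (s->job == NULL) {                /* quit requested, nothing left to run */
            pthread_mutex_unlock(&s->lock);
            return NULL;
        }
        job_fn job = s->job;
        void *arg = s->arg;
        s->job = NULL;
        pthread_mutex_unlock(&s->lock);
        job(arg);                            /* run the task outside the lock */
    }
}

static void add_one(void *arg)
{
    ++*(int *)arg;
}

/* Hands one job to the worker, asks it to exit afterwards, and returns the result. */
int run_one_job_demo(void)
{
    worker_slot s;
    pthread_t tid;
    int value = 41;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.cond, NULL);
    s.job = NULL;
    s.arg = NULL;
    s.quit = 0;
    if (pthread_create(&tid, NULL, slot_worker, &s) != 0)
        return -1;

    pthread_mutex_lock(&s.lock);
    s.arg = &value;
    s.job = add_one;
    s.quit = 1;                              /* exit once this job has run */
    pthread_cond_signal(&s.cond);
    pthread_mutex_unlock(&s.lock);

    pthread_join(tid, NULL);                 /* value is safe to read after the join */
    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.lock);
    return value;
}
```

The task pointer is written while the lock is held, and the worker tests it under the same lock. That ordering is what makes the hand-off safe whether the worker is already waiting or has not reached pthread_cond_wait yet.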

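To run a task, define your own tp_work_desc_s and tp_work_s structures and execute them through the pool's process_job interface. That is everything you need to know to use this thread pool. When the pool is no longer needed, destroy it with the close interface.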

III. Implementation

thread-pool.h (header file):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>	//memset
#include <unistd.h>	//sleep
#include <sys/types.h>
#include <pthread.h>
#include <signal.h>

#ifndef TPBOOL
typedef int TPBOOL;
#endif

#ifndef TRUE
#define TRUE 1
#endif

#ifndef FALSE
#define FALSE 0
#endif

#define BUSY_THRESHOLD 0.5	//(busy thread)/(all thread threshold)
#define MANAGE_INTERVAL 5	//tp manage thread sleep interval

typedef struct tp_work_desc_s tp_work_desc;
typedef struct tp_work_s tp_work;
typedef struct tp_thread_info_s tp_thread_info;
typedef struct tp_thread_pool_s tp_thread_pool;

//thread parm
struct tp_work_desc_s{
	char *inum;	//call in
	char *onum;	//call out
	int chnum;	//channel num
};

//base thread struct
struct tp_work_s{
	//main process function. user interface
	void (*process_job)(tp_work *this, tp_work_desc *job);
};

//thread info
struct tp_thread_info_s{
	pthread_t		thread_id;	//thread id num
	TPBOOL  		is_busy;	//thread status: true-busy; false-idle
	pthread_cond_t          thread_cond;	
	pthread_mutex_t		thread_lock;
	tp_work			*th_work;
	tp_work_desc		*th_job;
};

//main thread pool struct
struct tp_thread_pool_s{
	TPBOOL (*init)(tp_thread_pool *this);
	void (*close)(tp_thread_pool *this);
	void (*process_job)(tp_thread_pool *this, tp_work *worker, tp_work_desc *job);
	int  (*get_thread_by_id)(tp_thread_pool *this, int id);
	TPBOOL (*add_thread)(tp_thread_pool *this);
	TPBOOL (*delete_thread)(tp_thread_pool *this);
	int (*get_tp_status)(tp_thread_pool *this);
	
	int min_th_num;		//min thread number in the pool
	int cur_th_num;		//current thread number in the pool
	int max_th_num;         //max thread number in the pool
	pthread_mutex_t tp_lock;
	pthread_t manage_thread_id;	//manage thread id num
	tp_thread_info *thread_info;	//work thread relative thread info
};

tp_thread_pool *creat_thread_pool(int min_num, int max_num);

thread-pool.c (implementation file):

#include "thread-pool.h"

static void *tp_work_thread(void *pthread);
static void *tp_manage_thread(void *pthread);

static TPBOOL tp_init(tp_thread_pool *this);
static void tp_close(tp_thread_pool *this);
static void tp_process_job(tp_thread_pool *this, tp_work *worker, tp_work_desc *job);
static int  tp_get_thread_by_id(tp_thread_pool *this, int id);
static TPBOOL tp_add_thread(tp_thread_pool *this);
static TPBOOL tp_delete_thread(tp_thread_pool *this);
static int  tp_get_tp_status(tp_thread_pool *this);

/**
  * user interface. creat thread pool.
  * para:
  * 	min_num: min thread number to be created in the pool
  * 	max_num: max thread number allowed in the pool
  * return:
  * 	thread pool struct instance on success, NULL on failure
  */
tp_thread_pool *creat_thread_pool(int min_num, int max_num){
	tp_thread_pool *this;

	//reject sizes that would overflow the thread info array
	if(min_num < 0 || max_num < 1 || min_num > max_num)
		return NULL;

	//calloc returns zeroed memory, so no separate memset is needed
	this = (tp_thread_pool*)calloc(1, sizeof(tp_thread_pool));
	if(NULL == this)
		return NULL;
	
	//init member function pointer
	this->init = tp_init;
	this->close = tp_close;
	this->process_job = tp_process_job;
	this->get_thread_by_id = tp_get_thread_by_id;
	this->add_thread = tp_add_thread;
	this->delete_thread = tp_delete_thread;
	this->get_tp_status = tp_get_tp_status;

	//init member var
	this->min_th_num = min_num;
	this->cur_th_num = this->min_th_num;
	this->max_th_num = max_num;
	pthread_mutex_init(&this->tp_lock, NULL);

	//zeroed thread info array: every slot starts idle with no task assigned
	this->thread_info = (tp_thread_info*)calloc(this->max_th_num, sizeof(tp_thread_info));
	if(NULL == this->thread_info){
		pthread_mutex_destroy(&this->tp_lock);
		free(this);
		return NULL;
	}

	return this;
}


/**
  * member function reality. thread pool init function.
  * para:
  * 	this: thread pool struct instance pointer
  * return:
  * 	true: successful; false: failed
  */
TPBOOL tp_init(tp_thread_pool *this){
	int i;
	int err;
	
	//creat work thread and init work thread info
	for(i=0;i<this->min_th_num;i++){
		pthread_cond_init(&this->thread_info[i].thread_cond, NULL);
		pthread_mutex_init(&this->thread_info[i].thread_lock, NULL);
		//every pre-created thread starts idle with no task assigned
		this->thread_info[i].is_busy = FALSE;
		this->thread_info[i].th_work = NULL;
		this->thread_info[i].th_job = NULL;
		
		err = pthread_create(&this->thread_info[i].thread_id, NULL, tp_work_thread, this);
		if(0 != err){
			printf("tp_init: creat work thread failed\n");
			return FALSE;
		}
		//pthread_t is opaque; the cast assumes an integer type, as on Linux
		printf("tp_init: creat work thread %lu\n", (unsigned long)this->thread_info[i].thread_id);
	}

	//creat manage thread
	err = pthread_create(&this->manage_thread_id, NULL, tp_manage_thread, this);
	if(0 != err){
		printf("tp_init: creat manage thread failed\n");
		return FALSE;
	}
	printf("tp_init: creat manage thread %lu\n", (unsigned long)this->manage_thread_id);

	return TRUE;
}

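/**
  * member function reality. thread pool entirely close function.
  * para:
  * 	this: thread pool struct instance pointer
  * return:
  */
void tp_close(tp_thread_pool *this){
	int i;

	//stop the manage thread first so it cannot delete workers during shutdown.
	//kill() takes a process id, not a pthread_t, so threads are cancelled instead
	pthread_cancel(this->manage_thread_id);
	pthread_join(this->manage_thread_id, NULL);
	printf("tp_close: cancel manage thread %lu\n", (unsigned long)this->manage_thread_id);
	
	//close work thread. a busy thread exits at its next cancellation point
	for(i=0;i<this->cur_th_num;i++){
		pthread_cancel(this->thread_info[i].thread_id);
		pthread_join(this->thread_info[i].thread_id, NULL);
		//note: a thread cancelled inside pthread_cond_wait owns thread_lock again
		//when it exits, unless a cleanup handler in the thread releases it
		pthread_mutex_destroy(&this->thread_info[i].thread_lock);
		pthread_cond_destroy(&this->thread_info[i].thread_cond);
		printf("tp_close: cancel work thread %lu\n", (unsigned long)this->thread_info[i].thread_id);
	}

	pthread_mutex_destroy(&this->tp_lock);
	
	//free thread struct
	free(this->thread_info);
	this->thread_info = NULL;
}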

/**
  * member function reality. main interface opened. 
  * after getting own worker and job, user may use the function to process the task.
  * para:
  * 	this: thread pool struct instance pointer
  *	worker: user task reality.
  *	job: user task para
  * return:
  */
void tp_process_job(tp_thread_pool *this, tp_work *worker, tp_work_desc *job){
	int i;

	//hold tp_lock for the whole call so the set of threads cannot change meanwhile
	pthread_mutex_lock(&this->tp_lock);

	//fill this->thread_info's relative work key
	for(i=0;i<this->cur_th_num;i++){
		pthread_mutex_lock(&this->thread_info[i].thread_lock);
		if(!this->thread_info[i].is_busy){
			printf("tp_process_job: %d thread idle, thread id is %lu\n", i, (unsigned long)this->thread_info[i].thread_id);
			//set busy state and hand over the task while still holding thread_lock
			this->thread_info[i].is_busy = TRUE;
			this->thread_info[i].th_job = job;
			this->thread_info[i].th_work = worker;
			pthread_cond_signal(&this->thread_info[i].thread_cond);
			pthread_mutex_unlock(&this->thread_info[i].thread_lock);
			pthread_mutex_unlock(&this->tp_lock);
			return;
		}
		pthread_mutex_unlock(&this->thread_info[i].thread_lock);
	}//end of for

	//if all current thread are busy, new thread is created here
	if( !this->add_thread(this) ){
		//pool is at max_th_num or thread creation failed: the job is not run
		printf("tp_process_job: no thread available, job dropped\n");
		pthread_mutex_unlock(&this->tp_lock);
		return;
	}
	i = this->cur_th_num - 1;

	//hand the task to the new thread and send cond to it
	pthread_mutex_lock(&this->thread_info[i].thread_lock);
	this->thread_info[i].th_job = job;
	this->thread_info[i].th_work = worker;
	printf("tp_process_job: informing new working thread %d, thread id is %lu\n", i, (unsigned long)this->thread_info[i].thread_id);
	pthread_cond_signal(&this->thread_info[i].thread_cond);
	pthread_mutex_unlock(&this->thread_info[i].thread_lock);
	pthread_mutex_unlock(&this->tp_lock);
}

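/**
  * member function reality. get real thread by thread id num.
  * para:
  * 	this: thread pool struct instance pointer
  *	id: thread id num
  * return:
  * 	seq num in thread info struct array, -1 if not found
  * note:
  *	pthread_t is an opaque type and is wider than int on 64-bit Linux, so an id
  *	truncated to int may never match. where the full pthread_t is available,
  *	compare with pthread_equal instead.
  */
int tp_get_thread_by_id(tp_thread_pool *this, int id){
	int i;

	for(i=0;i<this->cur_th_num;i++){
		if(id == this->thread_info[i].thread_id)
			return i;
	}

	return -1;
}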

/**
  * member function reality. add new thread into the pool.
  * para:
  * 	this: thread pool struct instance pointer
  * return:
  * 	true: successful; false: failed
  */
static TPBOOL tp_add_thread(tp_thread_pool *this){
	int err;
	tp_thread_info *new_thread;
	
	if( this->max_th_num <= this->cur_th_num )
		return FALSE;
		
	//take the next unused slot of the preallocated thread info array
	new_thread = &this->thread_info[this->cur_th_num];
	
	//init new thread's cond & mutex
	pthread_cond_init(&new_thread->thread_cond, NULL);
	pthread_mutex_init(&new_thread->thread_lock, NULL);

	//init status is busy: the caller assigns a task right after creation
	new_thread->is_busy = TRUE;
	new_thread->th_work = NULL;
	new_thread->th_job = NULL;

	//add current thread number in the pool.
	this->cur_th_num++;
	
	err = pthread_create(&new_thread->thread_id, NULL, tp_work_thread, this);
	if(0 != err){
		//the slot is part of the array, so undo the bookkeeping instead of calling free()
		this->cur_th_num--;
		pthread_mutex_destroy(&new_thread->thread_lock);
		pthread_cond_destroy(&new_thread->thread_cond);
		return FALSE;
	}
	printf("tp_add_thread: creat work thread %lu\n", (unsigned long)new_thread->thread_id);
	
	return TRUE;
}

/**
  * member function reality. delete idle thread in the pool.
  * only delete last idle thread in the pool.
  * para:
  * 	this: thread pool struct instance pointer
  * return:
  * 	true: successful; false: failed
  */
static TPBOOL tp_delete_thread(tp_thread_pool *this){
	tp_thread_info *last;

	//current thread num can't < min thread num
	if(this->cur_th_num <= this->min_th_num) return FALSE;

	last = &this->thread_info[this->cur_th_num-1];

	//if last thread is busy, do nothing. otherwise mark it busy so no job is handed to it
	pthread_mutex_lock(&last->thread_lock);
	if(last->is_busy){
		pthread_mutex_unlock(&last->thread_lock);
		return FALSE;
	}
	last->is_busy = TRUE;
	pthread_mutex_unlock(&last->thread_lock);

	//cancel the idle thread and wait for it to exit.
	//kill() takes a process id, not a pthread_t, so it cannot be used here
	pthread_cancel(last->thread_id);
	pthread_join(last->thread_id, NULL);
	pthread_mutex_destroy(&last->thread_lock);
	pthread_cond_destroy(&last->thread_cond);

	//after deleting idle thread, current thread num -1
	this->cur_th_num--;

	return TRUE;
}

/**
  * member function reality. get current thread pool status:idle, normal, busy, .etc.
  * para:
  * 	this: thread pool struct instance pointer
  * return:
  * 	0: idle; 1: normal or busy(don't process)
  */
static int  tp_get_tp_status(tp_thread_pool *this){
	float busy_num = 0.0;
	int i;

	//get busy thread number
	for(i=0;i<this->cur_th_num;i++){
		if(this->thread_info[i].is_busy)
			busy_num++;
	}

	//0.2? or other num?
	if(busy_num/(this->cur_th_num) < BUSY_THRESHOLD)
		return 0;//idle status
	else
		return 1;//busy or normal status	
}

//cancellation cleanup handler: release a mutex held by a cancelled thread
static void tp_unlock_mutex(void *mutex){
	pthread_mutex_unlock((pthread_mutex_t*)mutex);
}

/**
  * internal interface. real work thread.
  * para:
  * 	pthread: thread pool struct pointer
  * return:
  *	NULL
  */
static void *tp_work_thread(void *pthread){
	pthread_t curid;//current thread id
	int i;
	int nseq = -1;//current thread seq in the this->thread_info array
	tp_thread_pool *this = (tp_thread_pool*)pthread;//main thread pool struct instance
	tp_thread_info *info;
	tp_work *work;
	tp_work_desc *job;

	//get current thread id
	curid = pthread_self();
	
	//get current thread's seq in the thread info struct array.
	//pthread_equal is used because pthread_t may not fit in the int taken by get_thread_by_id
	for(i=0;i<this->cur_th_num;i++){
		if(pthread_equal(curid, this->thread_info[i].thread_id)){
			nseq = i;
			break;
		}
	}
	if(nseq < 0)
		return NULL;
	info = &this->thread_info[nseq];
	printf("entering working thread %d, thread id is %lu\n", nseq, (unsigned long)curid);

	while( TRUE ){
		//wait cond for processing real job. the loop re-checks after every wakeup;
		//th_work is expected to be NULL while no task is pending
		pthread_mutex_lock(&info->thread_lock);
		pthread_cleanup_push(tp_unlock_mutex, &info->thread_lock);
		while(NULL == info->th_work)
			pthread_cond_wait(&info->thread_cond, &info->thread_lock);
		work = info->th_work;
		job = info->th_job;
		pthread_cleanup_pop(1);//runs the handler, which unlocks thread_lock
		
		printf("%lu thread do work!\n", (unsigned long)curid);

		//process
		work->process_job(work, job);

		//thread state be set idle after work
		pthread_mutex_lock(&info->thread_lock);
		info->th_work = NULL;
		info->th_job = NULL;
		info->is_busy = FALSE;
		pthread_mutex_unlock(&info->thread_lock);
		
		printf("%lu do work over\n", (unsigned long)curid);
	}
	return NULL;
}

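/**
  * internal interface. manage thread pool to delete idle thread.
  * para:
  * 	pthread: thread pool struct pointer
  * return:
  *	NULL
  */
static void *tp_manage_thread(void *pthread){
	tp_thread_pool *this = (tp_thread_pool*)pthread;//main thread pool struct instance
	int oldstate;
	int unused;

	//1?
	sleep(MANAGE_INTERVAL);

	do{
		//do not accept cancellation while tp_lock is held
		pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
		pthread_mutex_lock(&this->tp_lock);
		if( this->get_tp_status(this) == 0 ){
			do{
				if( !this->delete_thread(this) )
					break;
			}while(TRUE);
		}//end for if
		pthread_mutex_unlock(&this->tp_lock);
		pthread_setcancelstate(oldstate, &unused);

		//1?
		sleep(MANAGE_INTERVAL);
	}while(TRUE);

	return NULL;
}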
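Stopping a worker that is blocked in pthread_cond_wait takes some care. If such a thread is cancelled with pthread_cancel, it re-acquires the mutex before it terminates, so the mutex stays locked unless a cleanup handler releases it. The following is a minimal, self-contained sketch of that pattern, separate from the library code above; the names demo_lock, demo_cond, unlock_mutex, blocked_worker and cancel_blocked_worker are made up for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  demo_cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;                    /* never set here: the worker stays blocked */

/* Cleanup handler: runs when the thread is cancelled between push and pop. */
static void unlock_mutex(void *m)
{
    pthread_mutex_unlock((pthread_mutex_t *)m);
}

static void *blocked_worker(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&demo_lock);
    pthread_cleanup_push(unlock_mutex, &demo_lock);
    while (!ready)
        pthread_cond_wait(&demo_cond, &demo_lock);   /* cancellation point */
    pthread_cleanup_pop(1);              /* normal path: run the handler to unlock */
    return NULL;
}

/* Returns 0 if the worker was cancelled and left the mutex unlocked. */
int cancel_blocked_worker(void)
{
    pthread_t tid;
    void *result;

    if (pthread_create(&tid, NULL, blocked_worker, NULL) != 0)
        return -1;
    pthread_cancel(tid);
    pthread_join(tid, &result);
    if (result != PTHREAD_CANCELED)
        return -2;
    /* The handler released the mutex, so it can be taken again. */
    if (pthread_mutex_trylock(&demo_lock) != 0)
        return -3;
    pthread_mutex_unlock(&demo_lock);
    return 0;
}
```

By contrast, kill() takes a process id rather than a pthread_t, and SIGKILL always terminates the whole process, so neither is suitable for stopping a single thread.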

IV. Introduction to Database Connection Pools

A database connection is a critical, limited, and expensive resource, and this is especially evident in multi-user web applications.

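Each database connection object corresponds to one physical database connection. Opening a physical connection for every operation and closing it afterwards degrades system performance. A database connection pool addresses this by establishing a sufficient number of database connections when the application starts and organizing them into a pool (put simply, a "pool" holding many pre-built database connection objects). The application then requests, uses, and releases connections from the pool dynamically. Concurrent requests beyond the number of connections in the pool should wait in a request queue. The application can also grow or shrink the number of pooled connections dynamically according to how heavily they are used.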

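Connection pooling reuses memory-consuming resources as much as possible, which saves memory, improves the server's efficiency, and lets it serve more clients. Using a connection pool improves program efficiency, and the pool's own management mechanism can also be used to monitor the number of database connections, their usage, and so on.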

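     1)  The minimum connection count is the number of database connections the pool keeps open at all times, so if the application uses few database connections, many of these connections are wasted.
     2)  The maximum connection count is the largest number of connections the pool can open. If the number of connection requests exceeds it, later requests are placed in a wait queue, which delays the database operations that follow.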
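The behaviour at the maximum connection count can be sketched in C with a mutex and a condition variable. This is an illustration under stated assumptions rather than a real database client: fake_conn stands in for an actual connection handle, POOL_SIZE is an arbitrary limit, and all function names are hypothetical. A caller that finds every connection in use blocks in conn_pool_acquire until another caller releases one.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_SIZE 2                      /* assumed maximum number of connections */

typedef struct { int id; int in_use; } fake_conn;   /* stand-in for a real DB handle */

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  released;            /* signalled when a connection is returned */
    fake_conn       conns[POOL_SIZE];
} conn_pool;

void conn_pool_init(conn_pool *p)
{
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->released, NULL);
    for (int i = 0; i < POOL_SIZE; i++) {
        p->conns[i].id = i;
        p->conns[i].in_use = 0;
    }
}

/* Caller must hold p->lock. Marks and returns a free connection, or NULL. */
static fake_conn *take_free(conn_pool *p)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!p->conns[i].in_use) {
            p->conns[i].in_use = 1;
            return &p->conns[i];
        }
    }
    return NULL;
}

/* Non-blocking: returns NULL when every connection is in use. */
fake_conn *conn_pool_try_acquire(conn_pool *p)
{
    pthread_mutex_lock(&p->lock);
    fake_conn *c = take_free(p);
    pthread_mutex_unlock(&p->lock);
    return c;
}

/* Blocking: waits until some caller releases a connection. */
fake_conn *conn_pool_acquire(conn_pool *p)
{
    fake_conn *c;
    pthread_mutex_lock(&p->lock);
    while ((c = take_free(p)) == NULL)
        pthread_cond_wait(&p->released, &p->lock);
    pthread_mutex_unlock(&p->lock);
    return c;
}

void conn_pool_release(conn_pool *p, fake_conn *c)
{
    pthread_mutex_lock(&p->lock);
    assert(c->in_use);
    c->in_use = 0;
    pthread_cond_signal(&p->released);
    pthread_mutex_unlock(&p->lock);
}
```

Threads blocked on a condition variable are woken in no guaranteed order, so this sketch gives a set of waiters rather than a strict first-come, first-served queue; a pool that needs fairness would have to keep an explicit queue of requests.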
