Contents

Boost::pool overview

How boost::pool works

Selected source code

Summary

Lessons learned

Serious performance problems when using boost::pool<>



Boost::pool overview

How boost::pool works

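pool requests large regions of memory from the system according to a growth rule. Each region is called a block and is represented in the source by PODptr.

The PODptr structure divides a block into three parts: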


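The first part is the large data area (later formatted into many small chunks).

The second part is only sizeof(void*) bytes, i.e. the size of a pointer, and stores the pointer to the next PODptr.

The third part stores the size of the next PODptr.

For the last PODptr, the next pointer is null.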

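The data area of a PODptr is formatted by simple_segregated_storage into many small pieces called chunks. The chunk size is fixed when the boost::object_pool is defined: roughly sizeof(T) > sizeof(void*) ? sizeof(T) : sizeof(void*) (note void*, not void; the real alloc_size() additionally rounds this value up for alignment).

While a chunk is unused, its first sizeof(void*) bytes serve as a pointer to the next unused chunk. In other words, the free chunks form a singly linked list.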


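pool::malloc removes a node from this singly linked list and always returns the first chunk, so as long as no new block has to be requested, malloc is O(1).

pool::free(ptr) pushes ptr onto the head of the singly linked list, which is also O(1); it does not need to work out which PODptr ptr belongs to.

pool::ordered_free(ptr) walks the list to find the position that keeps it sorted by address and inserts ptr there, as in insertion sort, so it is O(n) in the length of the free list.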
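The three operations above can be sketched on a toy free list. This is only an illustration under simplifying assumptions, not Boost's code: the names FreeList, pop, push_front and push_ordered are hypothetical, and the chunks come from a fixed static buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Toy intrusive free list: each free chunk stores the address of the next
// free chunk in its first sizeof(void*) bytes. All names are illustrative.
struct FreeList {
    void* first = nullptr;

    static void*& nextof(void* p) { return *static_cast<void**>(p); }

    // Like malloc: unlink and return the head chunk, O(1).
    void* pop() {
        void* ret = first;
        if (ret) first = nextof(ret);
        return ret;
    }

    // Like free: push the chunk onto the head, O(1); order is not preserved.
    void push_front(void* chunk) {
        nextof(chunk) = first;
        first = chunk;
    }

    // Like ordered_free: walk the list to keep it sorted by address, O(n).
    void push_ordered(void* chunk) {
        std::less<void*> lt;
        if (first == nullptr || lt(chunk, first)) { push_front(chunk); return; }
        void* it = first;
        while (nextof(it) != nullptr && lt(nextof(it), chunk)) it = nextof(it);
        nextof(chunk) = nextof(it);
        nextof(it) = chunk;
    }
};

// Takes all four chunks, returns them out of order with push_ordered, and
// checks that the list ends up sorted by address again.
inline bool demo_free_list() {
    alignas(void*) static char block[4 * 16];
    FreeList fl;
    for (int i = 3; i >= 0; --i) fl.push_front(block + i * 16);
    void* c0 = fl.pop();
    void* c1 = fl.pop();
    void* c2 = fl.pop();
    void* c3 = fl.pop();
    assert(fl.first == nullptr);
    fl.push_ordered(c0);
    fl.push_ordered(c1);
    fl.push_ordered(c3);
    fl.push_ordered(c2);   // has to walk past c0 and c1 before inserting
    return fl.first == block
        && FreeList::nextof(block) == block + 16
        && FreeList::nextof(block + 16) == block + 32
        && FreeList::nextof(block + 32) == block + 48
        && FreeList::nextof(block + 48) == nullptr;
}
```

The sorted order costs a list walk on every return, which is the price ordered_free pays so that contiguous runs can be found later.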

Selected source code

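/*
This is a member function of simple_segregated_storage. At first sight it is puzzling: isn't it just *ptr?
In fact a void* cannot be dereferenced: *ptr would have type void, and C++ does not allow an object of type void.
So ptr is cast to void** first, and the function returns a reference to the pointer stored at the start of the chunk.
*/
static void * & nextof(void * const ptr)
{
    return *(static_cast<void **>(ptr));
}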

simple_segregated_storage


// segregate splits a memory block of sz bytes into chunks of partition_sz bytes each.
// The first sizeof(void*) bytes of every chunk point to the next chunk (the list's "next"), and the last chunk points to end.

template <typename SizeType>
void * simple_segregated_storage<SizeType>::segregate(
    void * const block,
    const size_type sz,
    const size_type partition_sz,
    void * const end)
{
    // Locate the last chunk
    char * old = static_cast<char *>(block)
        + ((sz - partition_sz) / partition_sz) * partition_sz;

    nextof(old) = end; // the last chunk points to end

    if (old == block)
        return block; // the block holds a single chunk, so we are done
    // Format the remaining chunks so that each one points to the next
    for (char * iter = old - partition_sz; iter != block;
        old = iter, iter -= partition_sz)
        nextof(iter) = old;

    nextof(block) = old;

    return block;
}

// Adding a block segregates it into chunks and pushes them onto the head of the list.
// No ordering is kept, so no list search is needed: O(1) apart from segregating the block itself.
void add_block(void * const block,
    const size_type nsz, const size_type npartition_sz)
{
    first = segregate(block, nsz, npartition_sz, first);
}
// Uses find_prev to locate where this block belongs in the ordered list, then links it in there. O(n).
void add_ordered_block(void * const block,
    const size_type nsz, const size_type npartition_sz)
{
    void * const loc = find_prev(block);
    if (loc == 0)
        add_block(block, nsz, npartition_sz);
    else
        nextof(loc) = segregate(block, nsz, npartition_sz, nextof(loc));
}

// Nothing special here: compares addresses to find the last free chunk that precedes ptr
// in the ordered free list (returns 0 if ptr belongs at the head), as in insertion sort.
template <typename SizeType>
void * simple_segregated_storage<SizeType>::find_prev(void * const ptr)
{
    if (first == 0 || std::greater<void *>()(first, ptr))
        return 0;

    void * iter = first;
    while (true)
    {
        if (nextof(iter) == 0 || std::greater<void *>()(nextof(iter), ptr))
            return iter;

        iter = nextof(iter);
    }
}
// Member variable of simple_segregated_storage: the head pointer of the free list.
void * first;
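To make the effect of segregate concrete, here is a small standalone sketch. It re-implements nextof and segregate as free functions (a simplification; in Boost they are members of simple_segregated_storage) and adds a hypothetical helper count_chunks that walks the resulting list.

```cpp
#include <cassert>
#include <cstddef>

// Same trick as nextof() above: view the first bytes of a chunk as a void*.
inline void*& nextof(void* p) { return *static_cast<void**>(p); }

// Free-function version of segregate(): threads a free list through `block`
// (sz bytes) in chunks of partition_sz bytes; the last chunk points to `end`.
inline void* segregate(void* block, std::size_t sz, std::size_t partition_sz,
                       void* end = nullptr) {
    char* old = static_cast<char*>(block)
              + ((sz - partition_sz) / partition_sz) * partition_sz;
    nextof(old) = end;
    if (old == block) return block;
    for (char* iter = old - partition_sz; iter != block;
         old = iter, iter -= partition_sz)
        nextof(iter) = old;
    nextof(block) = old;
    return block;
}

// Hypothetical helper: counts the chunks reachable from `first`.
inline std::size_t count_chunks(void* first) {
    std::size_t n = 0;
    for (void* p = first; p != nullptr; p = nextof(p)) ++n;
    return n;
}
```

With a 70-byte buffer and 16-byte chunks, the list holds four chunks and the trailing 6 bytes are simply left unused, because the chunk count is rounded down.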

The following code takes memory from the simple_segregated_storage free list:

template <typename SizeType>
void * simple_segregated_storage<SizeType>::malloc_n(const size_type n,
    const size_type partition_size)
{
    if(n == 0)
        return 0;
    void * start = &first;
    void * iter;
    do
    {
        if (nextof(start) == 0)
            return 0;
        // try_malloc_n looks for n chunks of partition_size bytes following start
        // (start itself excluded) and returns a pointer to the last of them
        iter = try_malloc_n(start, n, partition_size);
    } while (iter == 0);
    // The first chunk of the run is the return value
    void * const ret = nextof(start);
    // Classic removal from a singly linked list: link the chunk before the run to
    // whatever follows the run's last chunk, taking the whole run out of the list.
    nextof(start) = nextof(iter);
    return ret;
}

// On success, start is left just before a run of n contiguous chunks of partition_size bytes
// and the last chunk of that run is returned. On failure, start is moved to where the run broke.
template <typename SizeType>
void * simple_segregated_storage<SizeType>::try_malloc_n(
    void * & start, size_type n, const size_type partition_size)
{
    void * iter = nextof(start);
    // Check whether the chunks after start form n contiguous chunks of partition_size bytes
    while (--n != 0)
    {
        void * next = nextof(iter);
        // If next != static_cast<char *>(iter) + partition_size, the following chunk is in use
        // or we have reached the end of a large block.
        if (next != static_cast<char *>(iter) + partition_size)
        {
            // next == 0 (end-of-list) or non-contiguous chunk found
            start = iter;
            return 0;
        }
        iter = next;
    }
    return iter;
}
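The search for a contiguous run can be tried out on a hand-built free list. The sketch below copies the logic of malloc_n and try_malloc_n into a hypothetical Storage struct; the buffer, the chunk size of 16 bytes and the demo function are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

inline void*& nextof(void* p) { return *static_cast<void**>(p); }

// Toy storage with the same search logic as malloc_n / try_malloc_n.
struct Storage {
    void* first = nullptr;

    // Looks for n address-contiguous free chunks after `start`. On failure,
    // moves `start` to the chunk where the run broke and returns nullptr.
    static void* try_malloc_n(void*& start, std::size_t n, std::size_t psz) {
        void* iter = nextof(start);
        while (--n != 0) {
            void* next = nextof(iter);
            if (next != static_cast<char*>(iter) + psz) { start = iter; return nullptr; }
            iter = next;
        }
        return iter;
    }

    void* malloc_n(std::size_t n, std::size_t psz) {
        if (n == 0) return nullptr;
        void* start = &first;            // the head pointer acts as a pseudo-chunk
        void* iter;
        do {
            if (nextof(start) == nullptr) return nullptr;
            iter = try_malloc_n(start, n, psz);
        } while (iter == nullptr);
        void* ret = nextof(start);
        nextof(start) = nextof(iter);    // unlink the whole run
        return ret;
    }
};

// Free list over a 6-chunk buffer with chunk 2 "in use":
// 0 -> 1 -> 3 -> 4 -> 5 -> null. Returns the index where a 3-chunk run starts.
inline int demo_malloc_n() {
    constexpr std::ptrdiff_t psz = 16;
    alignas(void*) static char buf[6 * psz];
    auto at = [](int i) -> void* { return buf + i * psz; };
    Storage s;
    s.first = at(0);
    nextof(at(0)) = at(1);
    nextof(at(1)) = at(3);
    nextof(at(3)) = at(4);
    nextof(at(4)) = at(5);
    nextof(at(5)) = nullptr;
    void* run = s.malloc_n(3, static_cast<std::size_t>(psz));
    if (run == nullptr) return -1;
    assert(nextof(at(1)) == nullptr);    // chunks 3..5 were unlinked
    return static_cast<int>((static_cast<char*>(run) - buf) / psz);
}
```

The first attempt fails at chunk 1 because chunk 2 is missing from the list, so the search restarts from there and finds chunks 3 to 5. Note that this only works when the free list is sorted by address, which is why the ordered_ variants exist.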

class PODptr

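[Figure: layout of the block described by PODptr]

As the figure above illustrates, class PODptr describes one block. Blocks need not all have the same size, but each consists of three parts: chunk data + next ptr + next block size.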

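  • The chunk data part is managed as a simple_segregated_storage: a contiguous region of memory split into chunks.
  • next ptr points to the next block, and next block size gives the size of that next block.
  • In other words, the PODptr structures form a linked list, while inside each PODptr simple_segregated_storage lays the chunks out as a sequential array.
  • The size of a PODptr is not fixed; for the growth policy see void * pool<UserAllocator>::malloc_need_resize().
  • Right after initialization, every chunk points to the next chunk.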
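The block layout described in this list can be mimicked with a small view type. This is a sketch under assumptions, not the real PODptr: BlockView and its member names are hypothetical, the trailer uses plain sizeof(char*) and sizeof(std::size_t) slots without Boost's extra alignment padding, and std::memcpy is used to read and write the trailer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical view over a block laid out as
// [ chunk data | next block pointer | next block size ].
struct BlockView {
    char* ptr;
    std::size_t sz;

    bool valid() const { return ptr != nullptr; }
    char* begin() const { return ptr; }
    char* next_size_slot() const { return ptr + sz - sizeof(std::size_t); }
    char* next_ptr_slot() const { return next_size_slot() - sizeof(char*); }
    char* end() const { return next_ptr_slot(); }   // end of the chunk data
    std::size_t element_size() const {
        return static_cast<std::size_t>(end() - begin());
    }

    // Reads the trailer to find the following block.
    BlockView next() const {
        BlockView n{nullptr, 0};
        std::memcpy(&n.ptr, next_ptr_slot(), sizeof n.ptr);
        std::memcpy(&n.sz, next_size_slot(), sizeof n.sz);
        return n;
    }
    // Writes the trailer so that this block points to n.
    void set_next(const BlockView& n) const {
        std::memcpy(next_ptr_slot(), &n.ptr, sizeof n.ptr);
        std::memcpy(next_size_slot(), &n.sz, sizeof n.sz);
    }
};

// Links two blocks of different sizes and sums their chunk-data bytes.
inline std::size_t demo_block_chain() {
    const std::size_t trailer = sizeof(char*) + sizeof(std::size_t);
    std::vector<char> a(4 * 16 + trailer), b(8 * 16 + trailer);
    BlockView first{a.data(), a.size()};
    BlockView second{b.data(), b.size()};
    second.set_next(BlockView{nullptr, 0});   // last block: null next pointer
    first.set_next(second);
    std::size_t total = 0;
    for (BlockView it = first; it.valid(); it = it.next())
        total += it.element_size();
    return total;
}
```

Storing the next block's size in the current block is what lets blocks grow over time while the list can still be walked without any side table.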

class pool

// pool derives from simple_segregated_storage
template <typename UserAllocator>
class pool: protected simple_segregated_storage < typename UserAllocator::size_type >;

// Returns a reference to the base class so that its functions can be called; effectively just a type conversion
simple_segregated_storage<size_type> & store()
{ //! \returns pointer to store.
    return *this;
}

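When pool::malloc is asked for a single chunk and the free list is not empty, it calls the base-class malloc through store() and returns that chunk; otherwise it requests a new large block first. The code is simple, so it is not reproduced here.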
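Since that code is not shown, here is a rough sketch of the same idea: pop from the free list when possible, otherwise allocate a block, split it into chunks and retry. MiniPool and all its member names are hypothetical, it uses std::malloc directly instead of a UserAllocator, and it keeps the block list in a std::vector rather than inside the blocks themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Minimal fixed-size pool, for illustration only.
class MiniPool {
public:
    explicit MiniPool(std::size_t chunk_size, std::size_t next_size = 4)
        : chunk_(round_up(chunk_size)), next_size_(next_size) {}
    ~MiniPool() { for (void* b : blocks_) std::free(b); }
    MiniPool(const MiniPool&) = delete;
    MiniPool& operator=(const MiniPool&) = delete;

    // Fast path: take the head of the free list. Slow path: add a block first.
    void* allocate() {
        if (first_ == nullptr && !refill()) return nullptr;
        void* ret = first_;
        first_ = nextof(ret);
        return ret;
    }
    // Unordered free: push the chunk onto the head of the list.
    void deallocate(void* chunk) { nextof(chunk) = first_; first_ = chunk; }
    std::size_t block_count() const { return blocks_.size(); }

private:
    static void*& nextof(void* p) { return *static_cast<void**>(p); }

    // A chunk must hold a pointer and keep later chunks pointer-aligned.
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = sizeof(void*);
        if (n < a) n = a;
        return (n + a - 1) / a * a;
    }

    bool refill() {
        char* block = static_cast<char*>(std::malloc(chunk_ * next_size_));
        if (block == nullptr) return false;
        blocks_.push_back(block);
        for (std::size_t i = next_size_; i-- > 0;)   // push back to front
            deallocate(block + i * chunk_);
        next_size_ *= 2;                             // next block is twice as large
        return true;
    }

    std::size_t chunk_;
    std::size_t next_size_;
    void* first_ = nullptr;
    std::vector<void*> blocks_;
};
```

A freed chunk is handed out again by the very next allocate call, which is why the unordered path is so cheap.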

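The code below allocates n contiguous chunks. If no run of n contiguous free chunks exists, new memory has to be allocated. The unused part of the new block is added to the ordered chunk list via add_ordered_block, and the new block itself is inserted into the PODptr list at the position given by its address.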

template <typename UserAllocator>
void * pool<UserAllocator>::ordered_malloc(const size_type n)
{ //! Gets address of a chunk n, allocating new memory if not already available.
    //! \returns Address of chunk n if allocated ok.
    //! \returns 0 if not enough memory for n chunks.

    const size_type partition_size = alloc_size();
    const size_type total_req_size = n * requested_size;
    const size_type num_chunks = total_req_size / partition_size +
        ((total_req_size % partition_size) ? true : false);

    void * ret = store().malloc_n(num_chunks, partition_size);

#ifdef BOOST_POOL_INSTRUMENT
    std::cout << "Allocating " << n << " chunks from pool of size " << partition_size << std::endl;
#endif
    if ((ret != 0) || (n == 0))
        return ret;

#ifdef BOOST_POOL_INSTRUMENT
    std::cout << "Cache miss, allocating another chunk...\n";
#endif

    // Not enough memory in our storages; make a new storage,
    BOOST_USING_STD_MAX();

    // Compute the size of the block to request. next_size is doubled after each new block
    // (further down); integer::static_lcm computes the least common multiple.
    next_size = max BOOST_PREVENT_MACRO_SUBSTITUTION(next_size, num_chunks);
    size_type POD_size = static_cast<size_type>(next_size * partition_size +
        integer::static_lcm<sizeof(size_type), sizeof(void *)>::value + sizeof(size_type));
    char * ptr = (UserAllocator::malloc)(POD_size);
    if (ptr == 0)
    {
        if(num_chunks < next_size)
        {
            // Try again with just enough memory to do the job, or at least whatever we
            // allocated last time:
            next_size >>= 1;
            next_size = max BOOST_PREVENT_MACRO_SUBSTITUTION(next_size, num_chunks);
            POD_size = static_cast<size_type>(next_size * partition_size +
                integer::static_lcm<sizeof(size_type), sizeof(void *)>::value + sizeof(size_type));
            ptr = (UserAllocator::malloc)(POD_size);
        }
        if(ptr == 0)
            return 0;
    }
    const details::PODptr<size_type> node(ptr, POD_size);

    // Split up block so we can use what wasn't requested.
    if (next_size > num_chunks)
        store().add_ordered_block(node.begin() + num_chunks * partition_size,
            node.element_size() - num_chunks * partition_size, partition_size);

    BOOST_USING_STD_MIN();
    if(!max_size)
        next_size <<= 1;
    else if( next_size*partition_size/requested_size < max_size)
        next_size = min BOOST_PREVENT_MACRO_SUBSTITUTION(next_size << 1, max_size*requested_size/ partition_size);

    // insert it into the list,
    // handle border case.
    // Keep the list of large blocks sorted by address
    if (!list.valid() || std::greater<void *>()(list.begin(), node.begin()))
    {
        node.next(list);
        list = node;
    }
    else
    {
        details::PODptr<size_type> prev = list;

        while (true)
        {
            // if we're about to hit the end, or if we've found where "node" goes.
            if (prev.next_ptr() == 0
                || std::greater<void *>()(prev.next_ptr(), node.begin()))
                break;

            prev = prev.next();
        }

        node.next(prev.next());
        prev.next(node);
    }

    // and return it.
    return node.begin();
}
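The growth rule near the end of ordered_malloc is easy to isolate. The sketch below restates it as a pure function so the resulting block sizes can be inspected; grow and block_sizes are hypothetical names, and block_sizes assumes requested_size equals partition_size (both set to 8) to keep the arithmetic simple.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Mirrors the rule above: double next_size after every new block, optionally
// capped by max_size (0 means "no cap"). Sizes are counted in chunks.
inline std::size_t grow(std::size_t next_size, std::size_t max_size,
                        std::size_t requested_size, std::size_t partition_size) {
    if (max_size == 0) return next_size << 1;
    if (next_size * partition_size / requested_size < max_size)
        return std::min(next_size << 1, max_size * requested_size / partition_size);
    return next_size;
}

// Chunk counts of the first `blocks` blocks for a given starting next_size.
inline std::vector<std::size_t> block_sizes(std::size_t start, std::size_t max_size,
                                            int blocks) {
    std::vector<std::size_t> out;
    std::size_t next = start;
    for (int i = 0; i < blocks; ++i) {
        out.push_back(next);
        next = grow(next, max_size, 8, 8);   // assumption: requested == partition
    }
    return out;
}
```

Without a cap the block sizes form a geometric series, so the number of calls to the underlying allocator grows only logarithmically with the number of chunks handed out.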

The code below releases unused blocks. (A block is not released if any one of its chunks is in use.)

template <typename UserAllocator>
bool pool<UserAllocator>::release_memory()
{ //! pool must be ordered. Frees every memory block that doesn't have any allocated chunks.
    //! \returns true if at least one memory block was freed.

    // ret is the return value: it will be set to true when we actually call
    // UserAllocator::free(..)
    bool ret = false;

    // This is a current & previous iterator pair over the memory block list
    details::PODptr<size_type> ptr = list;
    details::PODptr<size_type> prev;

    // This is a current & previous iterator pair over the free memory chunk list
    // Note that "prev_free" in this case does NOT point to the previous memory
    // chunk in the free list, but rather the last free memory chunk before the
    // current block.
    void * free_p = this->first;
    void * prev_free_p = 0;

    const size_type partition_size = alloc_size();

    // Search through all the allocated memory blocks
    while (ptr.valid())
    {
        // At this point:
        // ptr points to a valid memory block
        // free_p points to either:
        // 0 if there are no more free chunks
        // the first free chunk in this or some next memory block
        // prev_free_p points to either:
        // the last free chunk in some previous memory block
        // 0 if there is no such free chunk
        // prev is either:
        // the PODptr whose next() is ptr
        // !valid() if there is no such PODptr

        // If there are no more free memory chunks, then every remaining
        // block is allocated out to its fullest capacity, and we can't
        // release any more memory
        if (free_p == 0)
            break;

        // We have to check all the chunks. If they are *all* free (i.e., present
        // in the free list), then we can free the block.
        bool all_chunks_free = true;

        // Iterate 'i' through all chunks in the memory block
        // if free starts in the memory block, be careful to keep it there
        void * saved_free = free_p;
        for (char * i = ptr.begin(); i != ptr.end(); i += partition_size)
        {
            // If this chunk is not free
            if (i != free_p)
            {
                // We won't be able to free this block
                all_chunks_free = false;

                // free_p might have travelled outside ptr
                free_p = saved_free;
                // Abort searching the chunks; we won't be able to free this
                // block because a chunk is not free.
                break;
            }

            // We do not increment prev_free_p because we are in the same block
            free_p = nextof(free_p);
        }

        // post: if the memory block has any chunks, free_p points to one of them
        // otherwise, our assertions above are still valid

        const details::PODptr<size_type> next = ptr.next();

        if (!all_chunks_free)
        {
            if (is_from(free_p, ptr.begin(), ptr.element_size()))
            {
                std::less<void *> lt;
                void * const end = ptr.end();
                do
                {
                    prev_free_p = free_p;
                    free_p = nextof(free_p);
                } while (free_p && lt(free_p, end));
            }
            // This invariant is now restored:
            // free_p points to the first free chunk in some next memory block, or
            // 0 if there is no such chunk.
            // prev_free_p points to the last free chunk in this memory block.

            // We are just about to advance ptr. Maintain the invariant:
            // prev is the PODptr whose next() is ptr, or !valid()
            // if there is no such PODptr
            prev = ptr;
        }
        else
        {
            // All chunks from this block are free

            // Remove block from list
            if (prev.valid())
                prev.next(next);
            else
                list = next;

            // Remove all entries in the free list from this block
            // Key point: once a block is released, the last free chunk before it
            // (or the list head) is re-pointed past the released block.
            if (prev_free_p != 0)
                nextof(prev_free_p) = free_p;
            else
                this->first = free_p;

            // And release memory
            (UserAllocator::free)(ptr.begin());
            ret = true;
        }

        // Increment ptr
        ptr = next;
    }

    next_size = start_size;
    return ret;
}
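The core test in release_memory, "is every chunk of this block present in the sorted free list?", can be modelled without any pointers. The sketch below is a deliberately abstract model, not the Boost algorithm itself: blocks are ranges of chunk indices, the free list is a sorted vector of indices, and Block and releasable are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified model: a block covers the chunk indices [begin, end).
struct Block { std::size_t begin, end; };

// Returns the blocks whose every chunk appears in the sorted free list.
// Blocks are assumed sorted by address, so one pass over both lists suffices.
inline std::vector<Block> releasable(const std::vector<Block>& blocks,
                                     const std::vector<std::size_t>& free_sorted) {
    std::vector<Block> out;
    std::size_t f = 0;                       // cursor into the free list
    for (const Block& b : blocks) {
        // Skip free chunks that lie before this block.
        while (f < free_sorted.size() && free_sorted[f] < b.begin) ++f;
        // Walk the block and the free list in lockstep.
        std::size_t i = b.begin, g = f;
        while (i != b.end && g < free_sorted.size() && free_sorted[g] == i) {
            ++i;
            ++g;
        }
        if (i == b.end) out.push_back(b);    // every chunk was free
        // Skip the remaining free chunks of this block before moving on.
        while (g < free_sorted.size() && free_sorted[g] < b.end) ++g;
        f = g;
    }
    return out;
}
```

The lockstep walk only works because both the block list and the free list are sorted by address, which is why release_memory requires an ordered pool.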

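pool summary

pool implements its memory management on top of the chunk list maintained inside simple_segregated_storage, which is really the core of pool. pool maintains two linked lists in total:

  • The chunk list inside simple_segregated_storage. Allocating a single chunk just takes one chunk from this list, which is O(1).
  • The member variable details::PODptr<size_type> list, which maintains the list of large blocks. Memory within one block is contiguous, but different blocks should be treated as non-contiguous. This list acts as an index over block addresses and exists mainly to speed up lookups: for an ordered pool, it is used when memory is handed back (as in release_memory) to tell quickly which block an address belongs to. Without it, addresses would have to be compared chunk by chunk.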

class object_pool

class object_pool: protected pool<UserAllocator>;

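object_pool derives from pool. The difference is that pool hands out memory of a fixed size, whereas object_pool hands out objects of a fixed type and calls their constructors and destructors. There are two main functions, construct and destroy.

construct calls the constructor through placement new, a well-known technique.

The one thing to watch is that the malloc and free called by construct and destroy are ordered_malloc and ordered_free, so each destroy costs O(n) in the length of the free list.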

element_type * construct(Arg1&, ... ArgN&){...}
element_type * construct()
{
    element_type * const ret = (malloc)();
    if (ret == 0)
        return ret;
    try { new (ret) element_type(); }
    catch (...) { (free)(ret); throw; }
    return ret;
}
element_type * malloc BOOST_PREVENT_MACRO_SUBSTITUTION()
{
    return static_cast<element_type *>(store().ordered_malloc());
}

destroy calls the destructor explicitly and then returns the memory to the free list.

void destroy(element_type * const chunk)
{
    chunk->~T();
    (free)(chunk);
}
void free BOOST_PREVENT_MACRO_SUBSTITUTION(element_type * const chunk)
{
    store().ordered_free(chunk);
}
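The construct/destroy pattern does not depend on the pool itself, so it can be shown on top of any raw allocator. In the sketch below std::malloc and std::free stand in for ordered_malloc and ordered_free; the free functions construct and destroy, the variadic template (Boost's own code uses fixed-arity overloads) and the Tracked type are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

// Allocate raw memory, then build the object in place with placement new.
template <typename T, typename... Args>
T* construct(Args&&... args) {
    void* raw = std::malloc(sizeof(T));          // stand-in for ordered_malloc()
    if (raw == nullptr) return nullptr;
    try {
        return new (raw) T(std::forward<Args>(args)...);
    } catch (...) {
        std::free(raw);                          // hand the memory back, rethrow
        throw;
    }
}

// Run the destructor explicitly, then hand the raw memory back.
template <typename T>
void destroy(T* obj) {
    obj->~T();
    std::free(obj);                              // stand-in for ordered_free()
}

// Hypothetical type that counts live instances, to observe ctor/dtor calls.
struct Tracked {
    static int& live() { static int n = 0; return n; }
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n)) { ++live(); }
    ~Tracked() { --live(); }
};
```

Separating allocation from construction is what allows the same chunk to be reused for many objects without going back to the system allocator.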

class singleton_pool

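The singleton memory pool. A few points are worth noting:

  • When the singleton is used from a single thread only (so no synchronization is needed), defining the macro BOOST_POOL_NO_MT removes the synchronization overhead.
#if !defined(BOOST_HAS_THREADS) || defined(BOOST_NO_MT) || defined(BOOST_POOL_NO_MT)
typedef null_mutex default_mutex;

  • The singleton itself is implemented as shown below. The nested class object_creator calls the private function get_pool(), and create_object.do_nothing(); ensures that the static object static object_creator create_object; is instantiated before main.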
class singleton_pool
{
public:
    ...
private:
    typedef boost::aligned_storage<sizeof(pool_type), boost::alignment_of<pool_type>::value> storage_type;
    static storage_type storage;

    static pool_type& get_pool()
    {
        static bool f = false;
        if(!f)
        {
            // This code *must* be called before main() starts,
            // and when only one thread is executing.
            f = true;
            new (&storage) pool_type;
        }

        // The following line does nothing else than force the instantiation
        // of singleton<T>::create_object, whose constructor is
        // called before main() begins.
        create_object.do_nothing();

        return *static_cast<pool_type*>(static_cast<void*>(&storage));
    }

    struct object_creator
    {
        object_creator()
        { // This constructor does nothing more than ensure that instance()
            // is called before main() begins, thus creating the static
            // T object before multithreading race issues can come up.
            singleton_pool<Tag, RequestedSize, UserAllocator, Mutex, NextSize, MaxSize>::get_pool();
        }
        inline void do_nothing() const
        {
        }
    };
    static object_creator create_object;
};
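The initialization idiom used here can be reduced to a small standalone sketch. Everything in it is hypothetical (Registry stands in for pool_type, Singleton for singleton_pool), and it keeps a pointer to the constructed object instead of casting the storage address, which is a simplification of the code above.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for pool_type.
struct Registry { int value = 42; };

template <typename Tag>
class Singleton {
public:
    static Registry& get() {
        static Registry* instance = nullptr;
        if (instance == nullptr)
            instance = new (&storage_) Registry;   // build in static raw storage
        // Using creator_ here forces the compiler to instantiate it, so its
        // constructor runs during static initialization of the program.
        creator_.do_nothing();
        return *instance;
    }

private:
    struct Storage { alignas(Registry) unsigned char bytes[sizeof(Registry)]; };
    struct Creator {
        Creator() { Singleton<Tag>::get(); }       // touch the instance early
        void do_nothing() const {}
    };
    static Storage storage_;
    static Creator creator_;
};

template <typename Tag>
typename Singleton<Tag>::Storage Singleton<Tag>::storage_;
template <typename Tag>
typename Singleton<Tag>::Creator Singleton<Tag>::creator_;

// Each tag type gets its own independent instance.
struct MyTag {};
```

The object is never destroyed, which sidesteps destruction-order problems at program exit at the cost of relying on the process teardown to reclaim the memory.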

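Summary

  • Where it fits: frequently allocating and freeing memory of the same size, for example frequently creating objects of the same class.
  • Advantages: it avoids memory fragmentation, is very fast, and avoids frequent calls to the system allocator.

The boost::pool source consists of only a few files; it is concise and not hard to read. Because the code long predates modern C++ (C++11 and later), the compiler-compatibility code is best skipped. What matters is the design idea: how two hand-built linked lists make memory management more efficient.

The data structures are simple. The range of use cases is fairly narrow, and it cannot compare with a GC.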

Lessons learned

Serious performance problems when using boost::pool<>