
Reposted

hackernew 2023-10-27 11:23:39


RocketMQ Message Storage (3) - MappedFileQueue

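The previous article covered the MappedFile class, which manages reads and writes to a single file through a MappedByteBuffer, that is, memory-mapped (zero-copy) I/O.
Since MappedFile manages a single file, there has to be a class that manages these MappedFile objects: MappedFileQueue.
The relationship between the two is easy to picture as file (MappedFile) and directory (MappedFileQueue).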
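
As a reminder of what memory-mapped access looks like, here is a minimal sketch that uses only the JDK. It is not RocketMQ's MappedFile implementation; the class name MmapSketch and its methods are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class, not part of RocketMQ.
final class MmapSketch {

    /** Maps a fixed-size file into memory, writes text through the mapping, and reads it back. */
    static String writeAndReadBack(Path file, int fileSize, String text) {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping in READ_WRITE mode grows the file to fileSize if it is shorter.
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, fileSize);
            buffer.put(payload);   // write through the mapping
            buffer.force();        // ask the OS to flush the mapped region to disk
            byte[] readBack = new byte[payload.length];
            buffer.position(0);
            buffer.get(readBack);  // read the same bytes back from the mapping
            return new String(readBack, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the round trip against a temporary file. */
    static String demo() {
        try {
            Path file = Files.createTempFile("mmap-sketch", ".dat");
            try {
                return writeAndReadBack(file, 64, "hello");
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```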

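MappedFileQueue can feel quite abstract at first. After some thought I decided to put the diagram at the top of the article: get a rough picture in your head now, then refer back to it while we go through the source code.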

[Figure: MappedFileQueue overview diagram]

1. Properties

Let's go straight to the code and, as usual, start with the properties. The core ones are listed below:

    // Directory managed by this MappedFileQueue
    //     1. CommitLog directory: ../store/commitlog
    //     2. ConsumeQueue directory: ../store/consumequeue/xxx_topic/x
    private final String storePath;

    // Size of each file in the directory
    //     1. CommitLog file: 1 GB by default
    //     2. ConsumeQueue file: 6,000,000 bytes by default
    private final int mappedFileSize;

    // All MappedFile objects managed under this directory
    private final CopyOnWriteArrayList<MappedFile> mappedFiles = new CopyOnWriteArrayList<MappedFile>();

    // Service that creates MappedFile objects on its own thread (this is what allows asynchronous creation)
    private final AllocateMappedFileService allocateMappedFileService;

    // Flush position of the directory
    //     (start offset encoded in the last MappedFile's file name + that file's flushedPosition)
    private long flushedWhere = 0;

    // Store time of the last message written under this directory
    private volatile long storeTimestamp = 0;

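Most of these properties are straightforward. The one worth emphasizing is flushedWhere; read the following together with the figure above.

The MappedFile files in a MappedFileQueue directory are written sequentially: a new MappedFile is created only after the current one is full, and each MappedFile is named after the physical offset at which it starts.

A simple example (for illustration only): suppose each file is 64 bytes and the first file is named 00000. Once that file is full, a second file has to be created, and it is named 00064. From then on, writes can only go to the second file. After 32 bytes have been written to it and flushed, flushedWhere = 00064 + 00032 = 00096.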
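
The arithmetic in this example can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class FlushedWhereSketch and its methods are hypothetical, and the 20-digit zero padding reflects how file names look on disk rather than anything shown in the excerpt above (the example uses 5 digits for brevity).

```java
// Hypothetical illustration class, not part of RocketMQ.
final class FlushedWhereSketch {

    /** Formats a physical start offset as a zero-padded file name (20 digits is an assumption). */
    static String offsetToFileName(long offset) {
        return String.format("%020d", offset);
    }

    /** flushedWhere = start offset of the last file + number of bytes already flushed inside it. */
    static long flushedWhere(String lastFileName, int flushedPositionInFile) {
        return Long.parseLong(lastFileName) + flushedPositionInFile;
    }

    public static void main(String[] args) {
        int fileSize = 64;                                   // toy file size from the example
        String secondFile = offsetToFileName(fileSize);      // the second file starts at offset 64
        System.out.println(secondFile);
        System.out.println(flushedWhere(secondFile, 32));    // 64 + 32
    }
}
```

Because the file name encodes the start offset, a global position such as flushedWhere can always be split back into "which file" (position / fileSize) and "where inside that file" (position % fileSize).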

2. Core methods

1. load

    /**
     * Used during broker startup to load the data on the local disk.
     * Reads the files under "storePath", creates a MappedFile object for each of them,
     * and adds it to the list.
     */
    public boolean load() {
        // Create the directory object
        File dir = new File(this.storePath);

        // List all files in the directory
        File[] files = dir.listFiles();

        if (files != null) {
            // ascending order
            // Sort by file name
            Arrays.sort(files);
            for (File file : files) {

                // Every file must have exactly the configured size
                if (file.length() != this.mappedFileSize) {
                    log.warn(file + "\t" + file.length()
                        + " length not matched message store config value, please check it manually");
                    return false;
                }

                try {
                    // Create the MappedFile object for the current File
                    MappedFile mappedFile = new MappedFile(file.getPath(), mappedFileSize);

                    // Set the write, flush and commit positions. mappedFileSize is only a
                    // placeholder here; the accurate values are set during the recover phase.
                    mappedFile.setWrotePosition(this.mappedFileSize);
                    mappedFile.setFlushedPosition(this.mappedFileSize);
                    mappedFile.setCommittedPosition(this.mappedFileSize);

                    // Add it to the list
                    this.mappedFiles.add(mappedFile);
                    log.info("load " + file.getPath() + " OK");
                } catch (IOException e) {
                    log.error("load file " + file + " error", e);
                    return false;
                }
            }
        }

        return true;
    }

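The code above is easy to follow. To summarize what the method does:

1. Build a File object from the given directory path (for example ../store/commitlog). Note that this is a directory.
2. List all files in that directory and sort them into the File[] files array. Note that this is the set of files.
3. Iterate over the sorted files, create a MappedFile object for each one, assign its initial position values, and add it to the mappedFiles list.

About step 3: the position values assigned to each MappedFile are only placeholders and carry no real meaning yet.

On a normal startup the Broker first calls load() to load every MappedFile in the directory, and then the recover-related methods overwrite these placeholders with the accurate values.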
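
The list-sort-validate flow of load() can be tried out in isolation. The sketch below is a simplified stand-in, not RocketMQ code: it records file names instead of creating MappedFile objects, and the class LoadSketch with its helpers createFile and tempDir are hypothetical names introduced for the demo.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class, not part of RocketMQ.
final class LoadSketch {

    /**
     * Lists the files in dir, sorts them by name, and appends their names to loaded.
     * Returns false as soon as a file does not have the expected size, like load() does.
     */
    static boolean load(File dir, long expectedFileSize, List<String> loaded) {
        File[] files = dir.listFiles();
        if (files == null) {
            return true; // missing directory: nothing to load
        }
        Arrays.sort(files); // File is Comparable, so this sorts by path in ascending order
        for (File file : files) {
            if (file.length() != expectedFileSize) {
                return false;
            }
            loaded.add(file.getName());
        }
        return true;
    }

    /** Demo helper: creates a file of exactly the given size. */
    static void createFile(File dir, String name, long size) {
        try (RandomAccessFile raf = new RandomAccessFile(new File(dir, name), "rw")) {
            raf.setLength(size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: creates an empty temporary directory. */
    static File tempDir() {
        try {
            return Files.createTempDirectory("load-sketch").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File dir = tempDir();
        createFile(dir, "00064", 64); // created out of order on purpose
        createFile(dir, "00000", 64);
        List<String> loaded = new ArrayList<>();
        System.out.println(load(dir, 64, loaded) + " " + loaded);
    }
}
```

Sorting by name works as an ordering by offset only because the names are zero-padded to a fixed width.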

2. getLastMappedFile

This method has 3 overloads; let's look at the one with the most parameters.

    /**
     * Gets the MappedFile that is currently being written sequentially.
     *   (Needed both when storing messages and when storing ConsumeQueue data.)
     *   Note: if the last MappedFile is full, or there is no MappedFile at all, a new one is created.
     *
     * @param startOffset   starting offset of the file
     * @param needCreate    whether to create a new MappedFile when one is needed
     *                      (the list is empty or the last file is full)
     * @return the MappedFile to write to
     */
    public MappedFile getLastMappedFile(final long startOffset, boolean needCreate) {

        // Controls whether a MappedFile has to be created; when it does, this value becomes the file name.
        // A file is created in two cases:
        //  1. the list contains no MappedFile
        //  2. the last MappedFile in the list (the one being written sequentially) is full
        long createOffset = -1;

        // Get the last MappedFile in the list
        MappedFile mappedFileLast = getLastMappedFile();

        // Case 1: the list contains no MappedFile
        if (mappedFileLast == null) {

            // createOffset must be 0 or a multiple of mappedFileSize
            createOffset = startOffset - (startOffset % this.mappedFileSize);
        }

        // Case 2: the last MappedFile in the list (the one being written sequentially) is full
        if (mappedFileLast != null && mappedFileLast.isFull()) {

            // Previous file name as a long + mappedFileSize
            createOffset = mappedFileLast.getFileFromOffset() + this.mappedFileSize;
        }

        // Logic for creating the new MappedFile
        if (createOffset != -1 && needCreate) {

            // Absolute path of the next file to create
            String nextFilePath = this.storePath + File.separator + UtilAll.offset2FileName(createOffset);

            // Absolute path of the file to create after that one
            String nextNextFilePath = this.storePath + File.separator
                + UtilAll.offset2FileName(createOffset + this.mappedFileSize);
            MappedFile mappedFile = null;

            // Create the MappedFile through allocateMappedFileService
            if (this.allocateMappedFileService != null) {
                // When mappedFileSize >= 1 GB, the MappedFile created here also runs its warm-up method
                mappedFile = this.allocateMappedFileService.putRequestAndReturnMappedFile(nextFilePath,
                    nextNextFilePath, this.mappedFileSize);
            }

            // Create the MappedFile directly (no warm-up here)
            else {
                try {
                    mappedFile = new MappedFile(nextFilePath, this.mappedFileSize);
                } catch (IOException e) {
                    log.error("create mappedFile exception", e);
                }
            }

            // Add the new MappedFile to the list and return it
            if (mappedFile != null) {
                if (this.mappedFiles.isEmpty()) {
                    mappedFile.setFirstCreateInQueue(true);
                }
                this.mappedFiles.add(mappedFile);
            }

            return mappedFile;
        }

        // Reaching this point means no MappedFile was created
        return mappedFileLast;
    }

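The code above is long and may be a little hard to follow.

The first thing to understand is the purpose of the method: obtaining the MappedFile that is currently being written sequentially.

As in the flushedWhere example from the properties section, the MappedFile currently being written is always the last file in the MappedFile list. The code therefore calls getLastMappedFile() directly to get the last MappedFile, and at that point there are 3 cases:

1. The MappedFile exists and still has free space. (This is the best case: just return it.)
2. The MappedFile exists but is already full. (A new MappedFile must be created.)
3. The MappedFile does not exist, which means the directory contains no files at all. (A new MappedFile must be created.)

Cases 2 and 3 require a new MappedFile, and there are two ways to create one:

1. Through allocateMappedFileService, which creates it on another thread. (Files of 1 GB or more are warmed up.)
2. Directly with new MappedFile(). (No warm-up.)

Warm-up will be covered in a later article; the literal meaning is enough for now.

To recap, the steps of the method are:

1. Get the last file in the directory, mappedFileLast.
2. Use mappedFileLast to decide whether a new MappedFile is needed:
   - If no new file is needed (or needCreate is false), return mappedFileLast directly.
   - If a new file is needed, the creation method depends on whether allocateMappedFileService is present:
     - with allocateMappedFileService: creation with warm-up
     - without it: plain creation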
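
The decision at the heart of the method, computing createOffset, can be isolated as a small pure function. This is a sketch for illustration only: the class CreateOffsetSketch and its parameters are hypothetical, and the file size of 64 is the toy value from the earlier example.

```java
// Hypothetical illustration class, not part of RocketMQ.
final class CreateOffsetSketch {

    /** Sentinel meaning "no new file is needed", matching the -1 used in the method above. */
    static final long NO_CREATE = -1L;

    /**
     * @param lastFileFromOffset start offset of the last file, or null when the list is empty
     * @param lastFileFull       whether the last file is full (ignored when the list is empty)
     * @param startOffset        offset the caller wants to write at
     * @param fileSize           fixed size of every file in the directory
     * @return the start offset (and thus the name) of the file to create, or NO_CREATE
     */
    static long createOffset(Long lastFileFromOffset, boolean lastFileFull, long startOffset, int fileSize) {
        if (lastFileFromOffset == null) {
            // Empty list: round startOffset down to a multiple of fileSize
            return startOffset - (startOffset % fileSize);
        }
        if (lastFileFull) {
            // Last file is full: the next file starts right after it
            return lastFileFromOffset + fileSize;
        }
        return NO_CREATE; // last file still has room
    }

    public static void main(String[] args) {
        System.out.println(createOffset(null, false, 100, 64)); // empty list, startOffset 100
        System.out.println(createOffset(64L, true, 0, 64));     // last file (from offset 64) is full
        System.out.println(createOffset(64L, false, 0, 64));    // last file still has room
    }
}
```

With a file size of 64, the three calls in main yield 64, 128 and -1 respectively: 100 rounds down to 64, a full file starting at 64 is followed by one starting at 128, and a file with free space needs no successor.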

3. deleteExpiredFileByTime

    /**
     * Called to delete expired files in the CommitLog directory.
     * @param expiredTime  expiration time
     * @param deleteFilesInterval  pause between deleting two files
     * @param intervalForcibly  interval after which resources are forcibly released; passed to mf.destroy
     * @param cleanImmediately  true to force deletion regardless of the expiration time
     * @return the number of deleted files
     */
    public int deleteExpiredFileByTime(final long expiredTime,
        final int deleteFilesInterval,
        final long intervalForcibly,
        final boolean cleanImmediately) {

        // Get the mfs array (effectively the MappedFile list converted to an array)
        Object[] mfs = this.copyMappedFiles(0);

        if (null == mfs)
            return 0;

        // Subtract 1 so that the MappedFile currently being written is never deleted
        int mfsLength = mfs.length - 1;

        // Number of deleted files
        int deleteCount = 0;

        // Files that have been deleted
        List<MappedFile> files = new ArrayList<MappedFile>();

        if (null != mfs) {
            for (int i = 0; i < mfsLength; i++) {
                MappedFile mappedFile = (MappedFile) mfs[i];

                // Point in time up to which the current file is kept alive
                long liveMaxTimestamp = mappedFile.getLastModifiedTimestamp() + expiredTime;

                // The file is deleted if either condition holds:
                //     condition 1: the file has reached its maximum lifetime
                //     condition 2: disk usage has reached its limit, which forces deletion
                if (System.currentTimeMillis() >= liveMaxTimestamp || cleanImmediately) {

                    // Delete the file
                    if (mappedFile.destroy(intervalForcibly)) {
                        files.add(mappedFile);
                        deleteCount++; // count the deleted file

                        if (files.size() >= DELETE_FILES_BATCH_MAX) {
                            break;
                        }

                        // After deleting a file, sleep before deleting the next one
                        if (deleteFilesInterval > 0 && (i + 1) < mfsLength) {
                            try {
                                Thread.sleep(deleteFilesInterval);
                            } catch (InterruptedException e) {
                            }
                        }
                    } else {
                        break;
                    }
                } else {
                    //avoid deleting files in the middle
                    break;
                }
            }
        }

        // Remove the deleted MappedFile objects from the list
        deleteExpiredFile(files);

        return deleteCount;
    }

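Although the code above is long, it is easy to understand: it walks through the MappedFile list of the directory, finds the files that meet the deletion condition, and calls mf.destroy() on each of them.

The only thing to note is that this method is used for deleting CommitLog files.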
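
To make the selection rules easier to see (skip the last file, stop at the first file that has not expired, cap the batch size), here is a sketch that works on plain timestamps instead of MappedFile objects. The class ExpirySketch is hypothetical, and the batch limit of 10 is an assumption, since the excerpt references DELETE_FILES_BATCH_MAX without showing its value. The sketch also leaves out destroy() and the pause between deletions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of RocketMQ.
final class ExpirySketch {

    // Assumed value for illustration only; the excerpt does not show DELETE_FILES_BATCH_MAX.
    static final int BATCH_MAX = 10;

    /** Returns the indexes of the files that would be deleted, oldest first. */
    static List<Integer> selectExpired(long[] lastModified, long now, long expiredTime,
                                       boolean cleanImmediately) {
        List<Integer> selected = new ArrayList<>();
        int candidates = lastModified.length - 1; // never touch the file being written
        for (int i = 0; i < candidates; i++) {
            boolean expired = now >= lastModified[i] + expiredTime;
            if (!expired && !cleanImmediately) {
                break; // avoid deleting files in the middle of the sequence
            }
            selected.add(i);
            if (selected.size() >= BATCH_MAX) {
                break; // at most one batch per call
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        long[] lastModified = {0L, 10L, 1000L, 0L};
        System.out.println(selectExpired(lastModified, 100L, 50L, false)); // [0, 1]
        System.out.println(selectExpired(lastModified, 100L, 50L, true));  // [0, 1, 2]
    }
}
```

Stopping at the first non-expired file keeps the remaining files contiguous, so the directory never ends up with a hole in its offset range.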
