Contents

  • 1. Performance testing
  • 1.1 FIO
  • 1.2 rust_echo_bench
  • 2. io_uring
  • 2.1 io_uring_setup
  • 2.2 io_uring_enter
  • 2.3 io_uring_register
  • 2.4 Usage: a cat program as an example
  • 3. liburing
  • 3.1 liburing API
  • 3.2 Test code
  • 4. References



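io_uring is an asynchronous I/O interface introduced in Linux kernel 5.1. Unlike the user-space frameworks DPDK and SPDK, io_uring is part of the kernel. It shares memory between user space and the kernel through mmap, and uses memory barriers to implement two lock-free ring queues in that shared memory:

  • submission queue ring (SQ)
  • completion queue ring (CQ)

The user program submits I/O tasks to the kernel through the SQ; the kernel places completed tasks in the CQ, and the user program reads the results from the CQ. Because the user program and the kernel share the data in the ring queues, submitting tasks and returning results needs no additional data copy. In addition, io_uring provides two polling modes, which avoid the system call when submitting tasks and the interrupt notification after I/O completes.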
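To make the idea of a lock-free ring built on memory barriers more concrete, the following is a minimal single-producer/single-consumer sketch using C11 atomics. It is a user-space illustration only and is not the kernel's implementation; the names `spsc_ring`, `ring_push`, and `ring_pop` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_ENTRIES 8u                 /* must be a power of two */
#define RING_MASK    (RING_ENTRIES - 1u)

/* Hypothetical single-producer/single-consumer ring. */
struct spsc_ring {
    _Atomic unsigned head;              /* advanced only by the consumer */
    _Atomic unsigned tail;              /* advanced only by the producer */
    int slots[RING_ENTRIES];
};

/* Producer side: returns false when the ring is full. */
static bool ring_push(struct spsc_ring *r, int value)
{
    assert(r != NULL);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_ENTRIES)
        return false;
    r->slots[tail & RING_MASK] = value;
    /* Release: the slot write becomes visible before the new tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty (head == tail). */
static bool ring_pop(struct spsc_ring *r, int *value)
{
    assert(r != NULL && value != NULL);
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail)
        return false;
    *value = r->slots[head & RING_MASK];
    /* Release: finish reading the slot before handing it back. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

The acquire load on the opposite index pairs with the other side's release store, which is what lets one side write and the other read without a lock. In io_uring the user program plays the producer role on the SQ and the consumer role on the CQ.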

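1. Performance testing

1.1 FIO

IOPS is the number of I/O requests a system can process per unit of time and is used to measure storage device performance. Here we use FIO, a disk benchmarking tool, to get a direct feel for the performance advantage of asynchronous I/O with io_uring.

# Install fio
sudo apt install fio
# Run a job
fio job_file

FIO is driven by a job file that defines in advance the mode in which it will run.

Basic FIO parameters:

  • rw, readwrite: the I/O pattern. Random read randread, random write randwrite, sequential read read, sequential write write, sequential mixed read/write rw (readwrite), random mixed read/write randrw
  • bs, blocksize: the I/O block size. Default 4k
  • size: the amount of data to transfer
  • ioengine: the I/O engine. Synchronous psync, asynchronous io_uring
  • iodepth: the queue depth to maintain when the I/O engine is asynchronous
  • direct: whether to use non-buffered I/O. Default false (buffered I/O)

The posix.fio job file is as follows: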

[global]
thread=1
group_reporting=1
direct=1
verify=0
time_based=1
runtime=10
bs=16K
size=16384
iodepth=64
rw=randwrite
filename=Cat
ioengine=io_uring
[test]
stonewall
description="variable bs"

Results: psync reached about 8k IOPS and io_uring about 19.0k IOPS, which shows the performance advantage of asynchronous I/O.

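1.2 rust_echo_bench

Parameters for server performance testing:

  • number of connections
  • size of each request
  • duration

Difference between epoll and io_uring events:

  • epoll: an event is registered once and stays armed; it does not need to be changed afterwards.
  • io_uring: a request is submitted once and completes once; it must be submitted again to fire again.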
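The epoll half of this comparison can be observed directly. The sketch below registers an eventfd with epoll a single time and then signals it several times; every signal is reported without re-registering. The function name `count_epoll_wakeups` is hypothetical, and the io_uring side is not shown because it needs kernel support that may be unavailable in sandboxed environments.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Register an eventfd with epoll once, then signal it `rounds` times.
 * Returns how many times epoll_wait reported it readable, or -1 on error.
 */
static int count_epoll_wakeups(int rounds)
{
    assert(rounds >= 0);

    int efd = eventfd(0, EFD_NONBLOCK);
    if (efd < 0)
        return -1;
    int epfd = epoll_create1(0);
    if (epfd < 0) {
        close(efd);
        return -1;
    }

    struct epoll_event ev = { .events = EPOLLIN, .data = { .fd = efd } };
    int woke = -1;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) == 0) {   /* armed once */
        woke = 0;
        for (int i = 0; i < rounds; i++) {
            uint64_t counter = 1;
            struct epoll_event out;
            /* Make the eventfd readable. */
            if (write(efd, &counter, sizeof(counter)) != (ssize_t)sizeof(counter))
                break;
            /* The same registration reports every new event. */
            if (epoll_wait(epfd, &out, 1, 1000) == 1 && out.data.fd == efd)
                woke++;
            /* Drain the counter so the next round starts from "not readable". */
            if (read(efd, &counter, sizeof(counter)) != (ssize_t)sizeof(counter))
                break;
        }
    }
    close(epfd);
    close(efd);
    return woke;
}
```

With io_uring, by contrast, the equivalent request would have to be placed on the SQ again after each completion, which is the "submitted once, completes once" behavior described above.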

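Next, we compare a synchronous epoll server with an asynchronous io_uring server. For the code, see the liburing test code.

# Install rust_echo_bench
git clone https://github.com/haraldh/rust_echo_bench.git
cd rust_echo_bench
cargo run --release
# Run the test
cargo run --release -- --address "127.0.0.1:9999" --number 1000 --duration 60 --length 512

Results: for network I/O, the advantage of io_uring is not obvious. For disk I/O, io_uring has a certain advantage.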

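2. io_uring

io_uring provides three system calls: io_uring_setup, io_uring_enter, and io_uring_register.

2.1 io_uring_setup

This call creates the following in the kernel:

  • Submission queue (SQ): each item refers to an sqe (submission queue entry), which describes one task
  • Completion queue (CQ): each item is a cqe (completion queue entry), which describes the result of one task
  • The SQEs array (Submission Queue Entries)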

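SQ and CQ are ring buffers with two members, head and tail; a queue is empty when head == tail. Each SQ slot stores an index into the SQEs array, while the actual requests are stored in the SQEs array itself, so a batch of requests that are not contiguous in the SQEs array can be submitted together. SQ and CQ provide no locks or other synchronization of their own: placing an sqe into the SQ and taking a cqe out of the CQ both rely on memory barriers.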
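The indirection described above can be sketched in plain user-space C. The types `fake_sqe` and `fake_sq` and both functions are hypothetical stand-ins, not the kernel's structures, and memory barriers are left out so that only the indexing logic is visible.

```c
#include <assert.h>
#include <stddef.h>

#define ENTRIES 8u                       /* power of two */
#define MASK    (ENTRIES - 1u)

/* Hypothetical stand-ins for the real SQ ring and SQE array. */
struct fake_sqe { int opcode; };
struct fake_sq {
    unsigned head, tail;                 /* free-running counters */
    unsigned array[ENTRIES];             /* ring slots hold indices into sqes[] */
    struct fake_sqe sqes[ENTRIES];       /* the requests themselves */
};

/* Queue the request stored at sqes[sqe_index]; returns -1 if the ring is full. */
static int sq_queue(struct fake_sq *sq, unsigned sqe_index)
{
    assert(sq != NULL && sqe_index < ENTRIES);
    if (sq->tail - sq->head == ENTRIES)
        return -1;
    sq->array[sq->tail & MASK] = sqe_index;
    sq->tail++;
    return 0;
}

/* Consume one entry the way a reader of the ring would; NULL if empty. */
static struct fake_sqe *sq_consume(struct fake_sq *sq)
{
    assert(sq != NULL);
    if (sq->head == sq->tail)            /* head == tail means empty */
        return NULL;
    unsigned idx = sq->array[sq->head & MASK];
    sq->head++;
    return &sq->sqes[idx];
}
```

Queuing indices 5, 1, and 6 in that order submits three requests that are not adjacent in the SQEs array, and they are consumed in the order they were queued.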

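The call returns an fd used to manage the io_uring instance. The user program maps this fd into memory with mmap, which gives user space and the kernel a shared memory region.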

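/*
- Parameter 1, entries: the requested SQ length. By default the CQ is twice as long as the SQ
- Parameter 2, params: configures io_uring; the kernel also returns the SQ/CQ layout through it
 */
int io_uring_setup(unsigned entries, struct io_uring_params *params);
struct io_uring_params {
    __u32 sq_entries;
    __u32 cq_entries;
    __u32 flags;
    __u32 sq_thread_cpu;
    __u32 sq_thread_idle;
    __u32 features;     /* filled in by the kernel, e.g. IORING_FEAT_SINGLE_MMAP */
    __u32 wq_fd;
    __u32 resv[3];
    struct io_sqring_offsets sq_off;
    struct io_cqring_offsets cq_off;
};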
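The ring sizes the kernel reports back in sq_entries and cq_entries are not always the requested value: entries is rounded up to a power of two, and by default the CQ gets twice as many entries as the SQ. The sketch below only mirrors that arithmetic in user space; `round_up_pow2`, `ring_sizes`, and `default_ring_sizes` are hypothetical names, and the values returned by the kernel in params remain the authoritative ones.

```c
#include <assert.h>

/* Hypothetical helper: smallest power of two >= n. */
static unsigned round_up_pow2(unsigned n)
{
    assert(n >= 1 && n <= (1u << 30));
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

struct ring_sizes {
    unsigned sq_entries;
    unsigned cq_entries;
};

/* Default sizing: SQ rounded up to a power of two, CQ twice the SQ. */
static struct ring_sizes default_ring_sizes(unsigned entries)
{
    struct ring_sizes rs;
    rs.sq_entries = round_up_pow2(entries);
    rs.cq_entries = 2 * rs.sq_entries;
    return rs;
}
```

A power-of-two size is what makes the `index & ring_mask` wrap-around used later in the cat program work.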

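2.2 io_uring_enter

A call performs two operations:

  • Submit I/O requests: the user program appends sqe indices at the tail of the SQ, then calls io_uring_enter to submit them to the kernel
  • Wait for I/O completion: the kernel places completed I/O in the CQ, and the user program polls the CQ for the results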

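/*
- Parameter 1, fd: the fd returned by io_uring_setup
- Parameter 2, to_submit: how many sqes to submit to the kernel in this call
- Parameter 3, min_complete: ask the kernel to wait until at least min_complete tasks have completed before returning
- Parameter 4, flags: controls the behavior of the call, e.g. IORING_ENTER_GETEVENTS
- Parameter 5, sig: optional signal mask to apply while waiting; may be NULL
 */
int io_uring_enter(unsigned int fd, unsigned int to_submit,
                   unsigned int min_complete, unsigned int flags, sigset_t *sig);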

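2.3 io_uring_register

Registers files or user buffers for asynchronous I/O.

For files, registration lets the kernel hold a long-lived reference to the file. Normally, each time an sqe passes an fd to the kernel, the kernel has to look up the file behind that fd, and it releases the reference once the sqe has been processed. In high-IOPS scenarios this overhead slows requests down. Registering a set of already-open files in advance avoids it.

For buffers, registration keeps the memory mapped long-term. Normally the kernel maps the pages before a read or write and unmaps them when it completes. Similarly, pre-registration avoids the repeated map and unmap.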

/*
- Parameter 1, fd: the fd returned by io_uring_setup
- Parameter 2, opcode: the registration type.
	Files: IORING_REGISTER_FILES;
	User buffers: IORING_REGISTER_BUFFERS
- Parameter 3, arg:
	Files: points to an array of fds;
	User buffers: points to an array of struct iovec.
- Parameter 4, nr_args: the length of the arg array
 */
int io_uring_register(unsigned int fd, unsigned int opcode,
                      void *arg, unsigned int nr_args);
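For the buffer case, arg is an array of struct iovec describing memory that stays allocated for as long as it is registered. The following sketch only prepares such an array; it deliberately does not issue the system call, so it runs on kernels without io_uring. The helper names `alloc_fixed_buffers` and `free_fixed_buffers` and the 4096-byte alignment are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/uio.h>

/*
 * Fill iov[0..count) with 4096-byte-aligned buffers of buf_len bytes each.
 * Returns 0 on success, -1 on allocation failure (nothing is leaked).
 */
static int alloc_fixed_buffers(struct iovec *iov, unsigned count, size_t buf_len)
{
    assert(iov != NULL && buf_len > 0);
    for (unsigned i = 0; i < count; i++) {
        void *buf = NULL;
        if (posix_memalign(&buf, 4096, buf_len) != 0) {
            /* Roll back the buffers allocated so far. */
            while (i > 0)
                free(iov[--i].iov_base);
            return -1;
        }
        iov[i].iov_base = buf;
        iov[i].iov_len = buf_len;
    }
    return 0;
}

/* Release buffers once they are no longer registered. */
static void free_fixed_buffers(struct iovec *iov, unsigned count)
{
    assert(iov != NULL);
    for (unsigned i = 0; i < count; i++)
        free(iov[i].iov_base);
}
```

The filled array and its length would then be passed as arg and nr_args together with the opcode IORING_REGISTER_BUFFERS.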

2.4 Usage: a cat program as an example

Next, we wrap the io_uring system calls and implement our own uring_cat program.

// gcc -o uring_cat uring_cat.c
// ./uring_cat filename
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <sys/uio.h>
#include <linux/fs.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <linux/io_uring.h>
#define URING_QUEUE_DEPTH		1024
#define BLOCK_SZ    1024
// SQ ring: pointers into the mmap'ed submission ring
struct app_io_sq_ring {
    unsigned *head;
    unsigned *tail;
    unsigned *ring_mask;
    unsigned *ring_entries;
    unsigned *flags;
    unsigned *array;
};
// CQ ring: pointers into the mmap'ed completion ring
struct app_io_cq_ring {
    unsigned *head;
    unsigned *tail;
    unsigned *ring_mask;
    unsigned *ring_entries;
    struct io_uring_cqe *cqes;
};
// Submitter: ring fd, SQ, CQ, and the SQE array
struct submitter {
    int ring_fd;
    struct app_io_sq_ring sq_ring;
    struct app_io_cq_ring cq_ring;
    struct io_uring_sqe *sqes;
};
// Per-file bookkeeping: file size plus one iovec per block
struct file_info {
    off_t file_sz;
    struct iovec iovecs[];
};
// Invoke io_uring_setup through the generic syscall() wrapper:
// 1. the syscall number and arguments are loaded into registers
// 2. the CPU traps into the kernel
// 3. the kernel dispatches through sys_call_table[__NR_io_uring_setup]
int io_uring_setup(unsigned entries, struct io_uring_params *p)
{
    return (int) syscall(__NR_io_uring_setup, entries, p);
}
int io_uring_enter(int ring_fd, unsigned int to_submit,
                   unsigned int min_complete, unsigned int flags)
{
    return (int) syscall(__NR_io_uring_enter, ring_fd, to_submit, min_complete,
                         flags, NULL, 0);
}
int app_setup_uring(struct submitter *s) {
    struct io_uring_params p;
    memset(&p, 0, sizeof(p));

    // Create the SQ, CQ, and SQE array
    s->ring_fd = io_uring_setup(URING_QUEUE_DEPTH, &p);
    if (s->ring_fd < 0) return -1;

    // Compute the SQ and CQ ring sizes from the offsets returned in sq_off and cq_off
    int sring_sz = p.sq_off.array + p.sq_entries * sizeof(unsigned);
    int cring_sz = p.cq_off.cqes + p.cq_entries * sizeof(struct io_uring_cqe);

    // IORING_FEAT_SINGLE_MMAP: the kernel maps the SQ and CQ rings with a single mmap.
    // Both rings then share one region, so use the larger of the two sizes for both.
    if (p.features & IORING_FEAT_SINGLE_MMAP) {
        if (cring_sz > sring_sz) {
            sring_sz = cring_sz;
        }
        cring_sz = sring_sz;
    }

    // 1. Map the SQ ring into user space; sq_ptr points to its start
    void *sq_ptr = mmap(0, sring_sz, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE,
                        s->ring_fd, IORING_OFF_SQ_RING);
    if (sq_ptr == MAP_FAILED) return -1;

    // 2. Map the CQ ring into user space; cq_ptr points to its start
    void *cq_ptr;
    if (p.features & IORING_FEAT_SINGLE_MMAP) {
        // Shared region: both pointers refer to the same mapping
        cq_ptr = sq_ptr;
    } else {
        // Separate regions: mmap the CQ ring on its own, using the CQ ring size
        cq_ptr = mmap(0, cring_sz, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE,
                      s->ring_fd, IORING_OFF_CQ_RING);
        if (cq_ptr == MAP_FAILED) return -1;
    }

    struct app_io_sq_ring *sring = &s->sq_ring;
    struct app_io_cq_ring *cring = &s->cq_ring;

    sring->head = sq_ptr + p.sq_off.head;
    sring->tail = sq_ptr + p.sq_off.tail;
    sring->ring_mask = sq_ptr + p.sq_off.ring_mask;
    sring->ring_entries = sq_ptr + p.sq_off.ring_entries;
    sring->flags = sq_ptr + p.sq_off.flags;
    sring->array = sq_ptr + p.sq_off.array;

    // 3. Map the SQE array into user space
    s->sqes = mmap(0, p.sq_entries * sizeof(struct io_uring_sqe),
                   PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,
                   s->ring_fd, IORING_OFF_SQES);
    if (s->sqes == MAP_FAILED) {
        return -1;
    }

    cring->head = cq_ptr + p.cq_off.head;
    cring->tail = cq_ptr + p.cq_off.tail;
    cring->ring_mask = cq_ptr + p.cq_off.ring_mask;
    cring->ring_entries = cq_ptr + p.cq_off.ring_entries;
    cring->cqes = cq_ptr + p.cq_off.cqes;

    return 0;
}
off_t get_file_size(int fd) {
    struct stat st;
    if (fstat(fd, &st) < 0) {
        perror("fstat");
        return -1;
    }
    if (S_ISBLK(st.st_mode)) {
        unsigned long long bytes;
        if (ioctl(fd, BLKGETSIZE64, &bytes) != 0) {
            perror("ioctl");
            return -1;
        }
        return bytes;
    } else if (S_ISREG(st.st_mode))
        return st.st_size;
    return -1;
}
// Write len bytes from buf to stdout
void output_to_console(char *buf, int len) {
    while (len--) {
        fputc(*buf++, stdout);
    }
}
void read_from_cq(struct submitter *s) {
    struct file_info *fi;
    struct app_io_cq_ring *cring = &s->cq_ring;
    struct io_uring_cqe *cqe;

    unsigned head = *cring->head;

    while (1) {
        // Acquire-load the tail so the cqe written by the kernel is visible
        // before we read it (this takes the place of the read barrier).
        unsigned tail = __atomic_load_n(cring->tail, __ATOMIC_ACQUIRE);
        if (head == tail) break;

        cqe = &cring->cqes[head & *cring->ring_mask];
        fi = (struct file_info *)cqe->user_data;

        if (cqe->res < 0) {
            // The read failed, so the buffers hold no valid data
            fprintf(stderr, "Error: %d\n", cqe->res);
            head++;
            continue;
        }

        int blocks = fi->file_sz / BLOCK_SZ;
        if (fi->file_sz % BLOCK_SZ) blocks++;

        // Print every block, starting from block 0
        for (int i = 0; i < blocks; i++) {
            output_to_console(fi->iovecs[i].iov_base, fi->iovecs[i].iov_len);
            printf("------------------------i : %d, blocks: %d\n", i, blocks);
        }
        head++;

        printf("head: %u, tail: %u, blocks: %d\n", head, tail, blocks);
    }

    // Release-store the new head so the kernel sees the slots as free only
    // after we have finished reading them (this takes the place of the write barrier).
    __atomic_store_n(cring->head, head, __ATOMIC_RELEASE);

    printf("exit read_from_cq\n");
}
int submit_to_sq(char *file_path, struct submitter *s) {
    int filefd = open(file_path, O_RDONLY);
    if (filefd < 0) {
        return -1;
    }

    struct app_io_sq_ring *sring = &s->sq_ring;

    off_t filesz = get_file_size(filefd);
    if (filesz < 0) return -1;

    off_t bytes_remaining = filesz;
    int blocks = filesz / BLOCK_SZ;
    if (filesz % BLOCK_SZ) blocks++;

    struct file_info *fi = malloc(sizeof(struct file_info) + sizeof(struct iovec) * blocks);
    if (!fi) return -2;

    fi->file_sz = filesz;

    // Allocate one aligned buffer per block and record it in the iovec array
    unsigned current_block = 0;
    while (bytes_remaining) {
        off_t bytes_to_read = bytes_remaining;
        if (bytes_to_read > BLOCK_SZ) bytes_to_read = BLOCK_SZ;

        fi->iovecs[current_block].iov_len = bytes_to_read;

        void *buf;
        if (posix_memalign(&buf, BLOCK_SZ, BLOCK_SZ)) {
            return 1;
        }
        fi->iovecs[current_block].iov_base = buf;

        current_block++;
        bytes_remaining -= bytes_to_read;
    }

    // Take the next free slot at the SQ tail
    unsigned tail = *sring->tail;
    unsigned index = tail & *sring->ring_mask;

    // Describe one readv request covering all blocks of the file
    struct io_uring_sqe *sqe = &s->sqes[index];
    memset(sqe, 0, sizeof(*sqe));
    sqe->fd = filefd;
    sqe->flags = 0;
    sqe->opcode = IORING_OP_READV;
    sqe->addr = (unsigned long)fi->iovecs;
    sqe->len = blocks;
    sqe->off = 0;
    sqe->user_data = (unsigned long long)fi;

    sring->array[index] = index;

    // Publish the new tail with release semantics so the kernel sees the
    // filled sqe before it sees the updated tail.
    __atomic_store_n(sring->tail, tail + 1, __ATOMIC_RELEASE);

    // Submit 1 sqe and wait for at least 1 completion
    int ret = io_uring_enter(s->ring_fd, 1, 1, IORING_ENTER_GETEVENTS);
    if (ret < 0) {
        return 1;
    }

    return 0;
}
int main(int argc, char *argv[]) {
    struct submitter *s = malloc(sizeof(struct submitter));
    if (!s) {
        perror("malloc");
        return -1;
    }
    memset(s, 0, sizeof(struct submitter));

    // 1. setup
    if (app_setup_uring(s)) return 1;

    int i = 1;
    for (i = 1; i < argc; i++) {
        // 2. submit
        if (submit_to_sq(argv[i], s)) {
            //fprintf(stderr, "Error reading file\n");
            return 1;
        }

        read_from_cq(s);
    }

    return 0;
}
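The block arithmetic that uring_cat performs when it splits a file into iovecs can be checked in isolation. The sketch below restates it as two small functions; `block_count` and `block_len` are hypothetical names that do not appear in the program above, and BLOCK_SZ is repeated here only so the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SZ 1024

/* Number of BLOCK_SZ-sized blocks needed to hold file_sz bytes. */
static size_t block_count(size_t file_sz)
{
    return (file_sz + BLOCK_SZ - 1) / BLOCK_SZ;
}

/* Length of block i: BLOCK_SZ for every block except possibly the last. */
static size_t block_len(size_t file_sz, size_t i)
{
    assert(i < block_count(file_sz));
    size_t left = file_sz - i * BLOCK_SZ;
    return left < BLOCK_SZ ? left : BLOCK_SZ;
}
```

For a 2500-byte file this gives three blocks of 1024, 1024, and 452 bytes, which matches the iov_len values the program stores before submitting the single readv request.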

<span >unsigned</span> <span >*</span>head<span >;</span>
<span >unsigned</span> <span >*</span>tail<span >;</span>

<span >unsigned</span> <span >*</span>ring_mask<span >;</span>
<span >unsigned</span> <span >*</span>ring_entries<span >;</span>

<span >unsigned</span> <span >*</span>flags<span >;</span>
<span >unsigned</span> <span >*</span>array<span >;</span>

<span >unsigned</span> <span >*</span>head<span >;</span>
<span >unsigned</span> <span >*</span>tail<span >;</span>

<span >unsigned</span> <span >*</span>ring_mask<span >;</span>
<span >unsigned</span> <span >*</span>ring_entries<span >;</span>

<span >struct</span> <span >io_uring_cqe</span> <span >*</span>cqes<span >;</span>

<span >int</span> ring_fd<span >;</span>

<span >struct</span> <span >app_io_sq_ring</span> sq_ring<span >;</span>
<span >struct</span> <span >app_io_cq_ring</span> cq_ring<span >;</span>

<span >struct</span> <span >io_uring_sqe</span> <span >*</span>sqes<span >;</span>

<span >struct</span> <span >io_uring_params</span> p<span >;</span>
<span >memset</span><span >(</span><span >&</span>p<span >,</span> <span >0</span><span >,</span> <span >sizeof</span><span >(</span>p<span >)</span><span >)</span><span >;</span>

<span >// 创建sq, cq, sqes</span>
s<span >-></span>ring_fd <span >=</span> <span >io_uring_setup</span><span >(</span>URING_QUEUE_DEPTH<span >,</span> <span >&</span>p<span >)</span><span >;</span>
<span >if</span> <span >(</span>s<span >-></span>ring_fd <span ><</span> <span >0</span><span >)</span> <span >return</span> <span >-</span><span >1</span><span >;</span>

<span >// 获取初始的sq,cq的大小,sq_off, cq_off起始偏移地址</span>
<span >int</span> sring_sz <span >=</span> p<span >.</span>sq_off<span >.</span>array <span >+</span> p<span >.</span>sq_entries <span >*</span> <span >sizeof</span><span >(</span><span >unsigned</span><span >)</span><span >;</span>
<span >int</span> cring_sz <span >=</span> p<span >.</span>cq_off<span >.</span>cqes <span >+</span> p<span >.</span>cq_entries <span >*</span> <span >sizeof</span><span >(</span><span >struct</span> <span >io_uring_cqe</span><span >)</span><span >;</span>

<span >// io_uring特性:IORING_FEAT_SINGLE_MMAP:内核通过一次mmap完成sq, cq的映射</span>
<span >// 即sq,cq共用1块内存,则两者大小必须设置相同</span>
<span >if</span> <span >(</span>p<span >.</span>features <span >&</span> IORING_FEAT_SINGLE_MMAP<span >)</span> <span >{<!-- --></span>
	<span >if</span> <span >(</span>cring_sz <span >></span> sring_sz<span >)</span> <span >{<!-- --></span>
		sring_sz <span >=</span> cring_sz<span >;</span>
	<span >}</span>
	cring_sz <span >=</span> sring_sz<span >;</span>
<span >}</span>

<span >// 1、将 sq 的映射到用户空间,sq_ptr 指向sq首地址</span>
<span >void</span> <span >*</span>sq_ptr <span >=</span> <span >mmap</span><span >(</span><span >0</span><span >,</span> sring_sz<span >,</span> PROT_READ<span >|</span>PROT_WRITE<span >,</span> MAP_SHARED<span >|</span>MAP_POPULATE<span >,</span>
					s<span >-></span>ring_fd<span >,</span> IORING_OFF_SQ_RING<span >)</span><span >;</span>
<span >if</span> <span >(</span>sq_ptr <span >==</span> MAP_FAILED<span >)</span> <span >return</span> <span >-</span><span >1</span><span >;</span>

<span >// 2、将 cq 的映射到用户空间,cq_ptr 指向cq首地址</span>
<span >void</span> <span >*</span>cq_ptr<span >;</span>
<span >// 若共用一块内存,则两个指针指向相同</span>
<span >if</span> <span >(</span>p<span >.</span>features <span >&</span> IORING_FEAT_SINGLE_MMAP<span >)</span> <span >{<!-- --></span>
	cq_ptr <span >=</span> sq_ptr<span >;</span>
<span >}</span> <span >else</span> <span >{<!-- --></span>
<span >// 若使用两块内存,则重新对cq进行mmap,</span>
	cq_ptr <span >=</span> <span >mmap</span><span >(</span><span >0</span><span >,</span> sring_sz<span >,</span> PROT_READ<span >|</span>PROT_WRITE<span >,</span> MAP_SHARED<span >|</span>MAP_POPULATE<span >,</span>
					s<span >-></span>ring_fd<span >,</span> IORING_OFF_CQ_RING<span >)</span><span >;</span>
	<span >if</span> <span >(</span>cq_ptr <span >==</span> MAP_FAILED<span >)</span> <span >return</span> <span >-</span><span >1</span><span >;</span>

<span >}</span>

<span >struct</span> <span >app_io_sq_ring</span> <span >*</span>sring <span >=</span> <span >&</span>s<span >-></span>sq_ring<span >;</span>
<span >struct</span> <span >app_io_cq_ring</span> <span >*</span>cring <span >=</span> <span >&</span>s<span >-></span>cq_ring<span >;</span>

sring<span >-></span>head <span >=</span> sq_ptr <span >+</span> p<span >.</span>sq_off<span >.</span>head<span >;</span>
sring<span >-></span>tail <span >=</span> sq_ptr <span >+</span> p<span >.</span>sq_off<span >.</span>tail<span >;</span>

sring<span >-></span>ring_mask <span >=</span> sq_ptr <span >+</span> p<span >.</span>sq_off<span >.</span>ring_mask<span >;</span>
sring<span >-></span>ring_entries <span >=</span> sq_ptr <span >+</span> p<span >.</span>sq_off<span >.</span>ring_entries<span >;</span>

sring<span >-></span>flags <span >=</span> sq_ptr <span >+</span> p<span >.</span>sq_off<span >.</span>flags<span >;</span>
sring<span >-></span>array <span >=</span> sq_ptr <span >+</span> p<span >.</span>sq_off<span >.</span>array<span >;</span>

<span >// 3、将 seqs 映射到用户空间</span>
s<span >-></span>sqes <span >=</span> <span >mmap</span><span >(</span><span >0</span><span >,</span> p<span >.</span>sq_entries <span >*</span> <span >sizeof</span><span >(</span><span >struct</span> <span >io_uring_sqe</span><span >)</span><span >,</span> 
	PROT_READ <span >|</span> PROT_WRITE<span >,</span> MAP_SHARED <span >|</span> MAP_POPULATE<span >,</span> s<span >-></span>ring_fd<span >,</span> IORING_OFF_SQES<span >)</span><span >;</span>
<span >if</span> <span >(</span>s<span >-></span>sqes <span >==</span> MAP_FAILED<span >)</span> <span >{<!-- --></span>
	<span >return</span> <span >1</span><span >;</span>
<span >}</span>

cring<span >-></span>head <span >=</span> cq_ptr <span >+</span> p<span >.</span>cq_off<span >.</span>head<span >;</span>
cring<span >-></span>tail <span >=</span> cq_ptr <span >+</span> p<span >.</span>cq_off<span >.</span>tail<span >;</span>
cring<span >-></span>ring_mask <span >=</span> cq_ptr <span >+</span> p<span >.</span>cq_off<span >.</span>ring_mask<span >;</span>
cring<span >-></span>ring_entries <span >=</span> cq_ptr <span >+</span> p<span >.</span>cq_off<span >.</span>ring_entries<span >;</span>
cring<span >-></span>cqes <span >=</span> cq_ptr <span >+</span> p<span >.</span>cq_off<span >.</span>cqes<span >;</span>

<span >return</span> <span >0</span><span >;</span>

<span >struct</span> <span >file_info</span> <span >*</span>fi<span >;</span>

<span >struct</span> <span >app_io_cq_ring</span> <span >*</span>cring <span >=</span> <span >&</span>s<span >-></span>cq_ring<span >;</span>
<span >struct</span> <span >io_uring_cqe</span> <span >*</span>cqe<span >;</span>

<span >unsigned</span> head <span >=</span> <span >*</span>cring<span >-></span>head<span >;</span>

<span >while</span> <span >(</span><span >1</span><span >)</span> <span >{<!-- --></span>

	<span >//read_barrier();</span>

	<span >if</span> <span >(</span>head <span >==</span> <span >*</span>cring<span >-></span>tail<span >)</span> <span >break</span><span >;</span>

	cqe <span >=</span> <span >&</span>cring<span >-></span>cqes<span >[</span>head <span >&</span> <span >*</span>s<span >-></span>cq_ring<span >.</span>ring_mask<span >]</span><span >;</span>
	fi <span >=</span> <span >(</span><span >struct</span> <span >file_info</span><span >*</span><span >)</span>cqe<span >-></span>user_data<span >;</span>

	<span >if</span> <span >(</span>cqe<span >-></span>res <span ><</span> <span >0</span><span >)</span> <span >{<!-- --></span>
		<span >fprintf</span><span >(</span><span >stderr</span><span >,</span> <span >"Error: %d\n"</span><span >,</span> cqe<span >-></span>res<span >)</span><span >;</span>
	<span >}</span>

	<span >int</span> blocks <span >=</span> fi<span >-></span>file_sz <span >/</span> BLOCK_SZ<span >;</span>
	<span >if</span> <span >(</span>fi<span >-></span>file_sz <span >%</span> BLOCK_SZ<span >)</span> blocks <span >++</span><span >;</span>

	
	<span >int</span> i <span >=</span> <span >0</span><span >;</span>
	<span >while</span> <span >(</span><span >++</span>i <span ><</span> blocks<span >)</span> <span >{<!-- --></span>
		<span >output_to_console</span><span >(</span>fi<span >-></span>iovecs<span >[</span>i<span >]</span><span >.</span>iov_base<span >,</span> fi<span >-></span>iovecs<span >[</span>i<span >]</span><span >.</span>iov_len<span >)</span><span >;</span>
		<span >printf</span><span >(</span><span >"------------------------i : %d, blocks: %d\n"</span><span >,</span> i<span >,</span> blocks<span >)</span><span >;</span>
	<span >}</span>
	head <span >++</span><span >;</span>

	<span >printf</span><span >(</span><span >"head: %d, tail: %d, blocks: %d\n"</span><span >,</span> 
		head<span >,</span> <span >*</span>cring<span >-></span>tail<span >,</span> blocks<span >)</span><span >;</span>
<span >}</span>

<span >*</span>cring<span >-></span>head <span >=</span> head<span >;</span>

<span >printf</span><span >(</span><span >"exit read_from_cq\n"</span><span >)</span><span >;</span>
<span >//write_barrier();</span>

// Body of submit_to_sq(file_path, s): queue one READV request covering the whole file
int filefd = open(file_path, O_RDONLY);
if (filefd < 0) {
	return -1;
}

struct app_io_sq_ring *sring = &s->sq_ring;

off_t filesz = get_file_size(filefd);
if (filesz < 0) return -1;

off_t bytes_remaining = filesz;
// Number of BLOCK_SZ-sized blocks, rounded up
int blocks = filesz / BLOCK_SZ;
if (filesz % BLOCK_SZ) blocks++;

// file_info is followed by one iovec per block
struct file_info *fi = malloc(sizeof(struct file_info) + sizeof(struct iovec) * blocks);
if (!fi) return -2;

fi->file_sz = filesz;

unsigned current_block = 0;	// must start at 0: it indexes fi->iovecs
while (bytes_remaining) {
	off_t bytes_to_read = bytes_remaining;
	if (bytes_to_read > BLOCK_SZ) bytes_to_read = BLOCK_SZ;

	fi->iovecs[current_block].iov_len = bytes_to_read;

	// Each block gets its own BLOCK_SZ-aligned buffer
	void *buf;
	if (posix_memalign(&buf, BLOCK_SZ, BLOCK_SZ)) {
		return 1;
	}
	fi->iovecs[current_block].iov_base = buf;

	current_block++;
	bytes_remaining -= bytes_to_read;
}

// Pick the next free SQ slot: index = tail & ring_mask
unsigned next_tail = 0, tail = 0, index = 0;

next_tail = tail = *sring->tail;
next_tail++;

index = tail & *s->sq_ring.ring_mask;

// Fill in the sqe: read `blocks` iovecs starting at file offset 0
struct io_uring_sqe *sqe = &s->sqes[index];
sqe->fd = filefd;
sqe->flags = 0;
sqe->opcode = IORING_OP_READV;
sqe->addr = (unsigned long)fi->iovecs;
sqe->len = blocks;
sqe->off = 0;

// user_data is returned unchanged in the matching cqe
sqe->user_data = (unsigned long long)fi;
sring->array[index] = index;
tail = next_tail;

// Publish the new tail so the kernel can see the entry
if (*sring->tail != tail) {
	*sring->tail = tail;
}

// Submit 1 sqe and wait for at least 1 completion
int ret = io_uring_enter(s->ring_fd, 1, 1, IORING_ENTER_GETEVENTS);
if (ret < 0) {
	return 1;
}

return 0;

// Body of main: one submit/read cycle per file named on the command line
struct submitter *s = malloc(sizeof(struct submitter));
if (!s) {
	perror("malloc");
	return -1;
}
memset(s, 0, sizeof(struct submitter));

// 1. setup
if (app_setup_uring(s)) return 1;

int i = 1;
for (i = 1; i < argc; i++) {
	// 2. submit
	if (submit_to_sq(argv[i], s)) {
		//fprintf(stderr, "Error reading file\n");
		return 1;
	}

	// 3. read the completion from the CQ
	read_from_cq(s);
}

return 0;
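The submission path in this example comes down to three steps: pick the slot `tail & ring_mask`, fill it, then publish the new tail. The sketch below models only that index arithmetic on a plain in-process ring. `toy_sq`, `toy_sq_init`, `toy_sq_push` and `TOY_ENTRIES` are hypothetical names, not part of io_uring, and the C11 release/acquire operations stand in for the commented-out `write_barrier()` calls, which is an assumption about the intended ordering rather than the original code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TOY_ENTRIES 8u              /* ring size; masking requires a power of two */
static_assert((TOY_ENTRIES & (TOY_ENTRIES - 1)) == 0, "ring size must be a power of two");

/* Hypothetical in-process stand-in for the mmap'ed SQ ring. */
struct toy_sq {
    _Atomic uint32_t head;          /* advanced by the consumer (the kernel in io_uring) */
    _Atomic uint32_t tail;          /* advanced by the producer (the application) */
    uint32_t ring_mask;             /* TOY_ENTRIES - 1 */
    uint32_t array[TOY_ENTRIES];    /* slot -> sqe index */
};

static void toy_sq_init(struct toy_sq *sq) {
    atomic_init(&sq->head, 0);
    atomic_init(&sq->tail, 0);
    sq->ring_mask = TOY_ENTRIES - 1;
}

/* Queue one sqe index. Returns the slot used, or -1 if the ring is full. */
static int toy_sq_push(struct toy_sq *sq, uint32_t sqe_index) {
    uint32_t tail = atomic_load_explicit(&sq->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&sq->head, memory_order_acquire);

    /* head and tail are free-running counters; their difference is the fill level */
    if (tail - head == TOY_ENTRIES)
        return -1;

    uint32_t slot = tail & sq->ring_mask;
    sq->array[slot] = sqe_index;

    /* Release store: the slot contents become visible before the new tail does */
    atomic_store_explicit(&sq->tail, tail + 1, memory_order_release);
    return (int)slot;
}
```

Because the counters are never reduced modulo the ring size, a full ring (`tail - head == TOY_ENTRIES`) and an empty ring (`tail == head`) stay distinguishable, and the mask is applied only when indexing.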

3、liburing

Because the raw io_uring interface is cumbersome to use, its author wrapped it in a library, liburing.

# Install liburing
git clone https://github.com/axboe/liburing.git
cd liburing
./configure
make && sudo make install

3.1、liburing api

// Initialize io_uring; calls io_uring_setup internally
int io_uring_queue_init_params(unsigned entries, struct io_uring *ring,
				struct io_uring_params *p);
// Submit the queued sqes to the kernel; calls io_uring_enter internally
// 1. Submission: the indices of the prepared sqes are published in the SQ and handed to the kernel; the call does not wait for the I/O to complete
// 2. Completion: when an I/O finishes, the kernel posts a cqe for it to the CQ
int io_uring_submit(struct io_uring *ring);
// Fetch completed cqes
// Non-blocking: returns the number of cqes stored in cqes, 0 if none are ready
unsigned io_uring_peek_batch_cqe(struct io_uring *ring,
				struct io_uring_cqe **cqes, unsigned count);
// Blocking: waits until wait_nr cqes are available or the timeout ts expires
int io_uring_wait_cqes(struct io_uring *ring, struct io_uring_cqe **cqe_ptr,
				unsigned wait_nr, struct __kernel_timespec *ts,
				sigset_t *sigmask);
// Mark nr cqes as consumed by advancing the CQ head by nr
static inline void io_uring_cq_advance(struct io_uring *ring, unsigned nr);
// Counterpart of libaio's io_prep_pwritev
static inline void io_uring_prep_writev(struct io_uring_sqe *sqe, int fd,
				const struct iovec *iovecs, unsigned nr_vecs, off_t offset);
// Counterpart of libaio's io_prep_preadv
static inline void io_uring_prep_readv(struct io_uring_sqe *sqe, int fd,
				const struct iovec *iovecs, unsigned nr_vecs, off_t offset);
// Tear down the io_uring instance and release its resources
void io_uring_queue_exit(struct io_uring *ring);
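To make the relationship between peeking at completions and advancing the CQ head concrete, here is a toy, single-threaded model of a completion ring. It is not liburing's implementation: `toy_cq`, `toy_cqe`, `toy_cq_post`, `toy_cq_peek_batch` and `toy_cq_advance` are hypothetical names chosen for illustration. Peeking hands out pointers to ready entries without consuming them; the slots are only freed once the head is advanced.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CQ_ENTRIES 4u           /* power of two, chosen for illustration */

struct toy_cqe {
    uint64_t user_data;             /* copied from the request that completed */
    int32_t  res;                   /* result: byte count or negative errno */
};

struct toy_cq {
    uint32_t head;                  /* next entry the application will consume */
    uint32_t tail;                  /* next entry the producer will fill */
    struct toy_cqe cqes[TOY_CQ_ENTRIES];
};

/* Producer side (the kernel's role): post one completion. Returns -1 if full. */
static int toy_cq_post(struct toy_cq *cq, uint64_t user_data, int32_t res) {
    if (cq->tail - cq->head == TOY_CQ_ENTRIES)
        return -1;
    struct toy_cqe *e = &cq->cqes[cq->tail & (TOY_CQ_ENTRIES - 1)];
    e->user_data = user_data;
    e->res = res;
    cq->tail++;
    return 0;
}

/* Non-blocking: store up to count ready entries in out[]; the head is not moved. */
static unsigned toy_cq_peek_batch(struct toy_cq *cq, struct toy_cqe **out, unsigned count) {
    unsigned ready = cq->tail - cq->head;
    if (ready > count)
        ready = count;
    for (unsigned i = 0; i < ready; i++)
        out[i] = &cq->cqes[(cq->head + i) & (TOY_CQ_ENTRIES - 1)];
    return ready;
}

/* Consume nr entries, making their slots reusable by the producer. */
static void toy_cq_advance(struct toy_cq *cq, unsigned nr) {
    assert(nr <= cq->tail - cq->head);
    cq->head += nr;
}
```

In this model, calling `toy_cq_peek_batch` twice without an intervening `toy_cq_advance` returns the same entries again, which is why an event loop has to advance the head after each batch it has processed.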

3.2、Test code

A simple test server, iouring_server, written with liburing:

// gcc -o iouring_server iouring_server.c -luring
#include <liburing.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

#define ENTRIES_LENGTH		4096
#define MAX_CONNECTIONS		1024
#define BUFFER_LENGTH		1024

// One buffer per connection, indexed by fd
char buf_table[MAX_CONNECTIONS][BUFFER_LENGTH] = {0};

// Event type carried with each request
enum {
	READ,
	WRITE,
	ACCEPT,
};

// Connection info; copied into the 64-bit sqe->user_data, so it must fit in 8 bytes
struct conninfo {
	int connfd;	// fd
	int type;	// event type
};
void set_read_event(struct io_uring *ring, int fd, void *buf, size_t len, int flags) {
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

	// Queue a recv on fd
	io_uring_prep_recv(sqe, fd, buf, len, flags);

	struct conninfo ci = {
		.connfd = fd,
		.type = READ
	};

	// Stash fd and event type in user_data; they come back in the cqe
	memcpy(&sqe->user_data, &ci, sizeof(struct conninfo));
}
void set_write_event(struct io_uring *ring, int fd, const void *buf, size_t len, int flags) {
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

	// Queue a send on fd
	io_uring_prep_send(sqe, fd, buf, len, flags);

	struct conninfo ci = {
		.connfd = fd,
		.type = WRITE
	};

	// Stash fd and event type in user_data; they come back in the cqe
	memcpy(&sqe->user_data, &ci, sizeof(struct conninfo));
}
void set_accept_event(struct io_uring *ring, int fd,
		struct sockaddr *cliaddr, socklen_t *clilen, unsigned flags) {
	// Get a free sqe from the SQ
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

	// Queue an accept on the listening fd
	io_uring_prep_accept(sqe, fd, cliaddr, clilen, flags);

	// Used to dispatch on the event type when the completion arrives
	struct conninfo ci = {
		.connfd = fd,
		.type = ACCEPT
	};

	memcpy(&sqe->user_data, &ci, sizeof(struct conninfo));
}
int main() {
	int listenfd = socket(AF_INET, SOCK_STREAM, 0);
	if (listenfd == -1) return -1;

	struct sockaddr_in servaddr, clientaddr;
	memset(&servaddr, 0, sizeof(servaddr));
	servaddr.sin_family = AF_INET;
	servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
	servaddr.sin_port = htons(9999);

	if (-1 == bind(listenfd, (struct sockaddr*)&servaddr, sizeof(servaddr))) {
		return -2;
	}

	listen(listenfd, 10);

	struct io_uring_params params;
	memset(&params, 0, sizeof(params));

	// Initialize the rings; calls io_uring_setup internally
	struct io_uring ring;
	if (io_uring_queue_init_params(ENTRIES_LENGTH, &ring, &params) < 0) return -3;

	socklen_t clilen = sizeof(clientaddr);
	set_accept_event(&ring, listenfd, (struct sockaddr*)&clientaddr, &clilen, 0);

	while (1) {

		// Wraps io_uring_enter: hands the queued sqes to the kernel without
		// waiting for them; the kernel posts a cqe to the CQ as each one finishes
		io_uring_submit(&ring);

		// Step 1: block until at least one cqe is available
		struct io_uring_cqe *cqe;
		int ret = io_uring_wait_cqe(&ring, &cqe);
		if (ret < 0) continue;

		// Step 2: collect up to 10 ready cqes without blocking
		struct io_uring_cqe *cqes[10];
		unsigned cqecount = io_uring_peek_batch_cqe(&ring, cqes, 10);

		unsigned i = 0;
		unsigned count = 0;
		for (i = 0; i < cqecount; ++i) {

			cqe = cqes[i];
			count++;

			struct conninfo ci;
			memcpy(&ci, &cqe->user_data, sizeof(ci));

			if (ci.type == ACCEPT) {

				int connfd = cqe->res;	// new fd, or negative errno
				if (connfd >= 0 && connfd < MAX_CONNECTIONS) {
					set_read_event(&ring, connfd, buf_table[connfd], BUFFER_LENGTH, 0);
				} else if (connfd >= 0) {
					close(connfd);	// no buffer slot for this fd
				}
				// io_uring requests are one-shot: re-arm accept after each completion
				set_accept_event(&ring, listenfd, (struct sockaddr*)&clientaddr, &clilen, 0);

			} else if (ci.type == READ) {

				int bytes_read = cqe->res;
				if (bytes_read <= 0) {
					// Peer closed the connection (0) or recv failed (< 0)
					close(ci.connfd);
				} else {
					// Echo back what was received
					char *buffer = buf_table[ci.connfd];
					set_write_event(&ring, ci.connfd, buffer, bytes_read, 0);
				}

			} else if (ci.type == WRITE) {
				// Send finished: wait for the next message on this connection
				char *buffer = buf_table[ci.connfd];
				set_read_event(&ring, ci.connfd, buffer, BUFFER_LENGTH, 0);
			}
		}

		// The cqes of this batch are consumed: advance the CQ head so the slots can be reused
		io_uring_cq_advance(&ring, count);
	}
}
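The three set_*_event helpers depend on struct conninfo fitting into the 64-bit user_data field of an sqe. A minimal sketch of that round trip, independent of liburing, is shown here; `pack_conninfo` and `unpack_conninfo` are hypothetical helper names, while the struct and enum mirror the ones in the test server.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { READ, WRITE, ACCEPT };

struct conninfo {
    int connfd;     /* fd the request was issued on */
    int type;       /* READ, WRITE or ACCEPT */
};

/* The memcpy trick only works if the struct is no larger than user_data. */
static_assert(sizeof(struct conninfo) <= sizeof(uint64_t),
              "conninfo must fit in the 64-bit user_data field");

/* Encode fd and event type into a value suitable for sqe->user_data. */
static uint64_t pack_conninfo(int fd, int type) {
    struct conninfo ci = { .connfd = fd, .type = type };
    uint64_t user_data = 0;
    memcpy(&user_data, &ci, sizeof(ci));
    return user_data;
}

/* Decode the value read back from cqe->user_data. */
static struct conninfo unpack_conninfo(uint64_t user_data) {
    struct conninfo ci;
    memcpy(&ci, &user_data, sizeof(ci));
    return ci;
}
```

If the per-connection state ever grows beyond 8 bytes, the static assertion fails at compile time; a common alternative is to heap-allocate the state and store the pointer in user_data, as the cat example does with its file_info pointer.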

4、References