Parsing a TS Stream to Obtain the Width and Height of H.266 Video

YNXZ, 2025-02-18 13:53:45. Original work by 51CTO blog author YNXZ; contact the author for permission before reposting.

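Obtaining the width and height of H.266 video from a TS stream involves the following steps:

Parse the TS stream:

TS (Transport Stream) is a container format for carrying video and audio data. First parse the TS stream to extract the PES (Packetized Elementary Stream) packets that carry the video.

Locate the SPS (Sequence Parameter Set):

In H.266/VVC, as in H.265/HEVC, the SPS carries the key parameters of the video sequence, including the resolution. The SPS is carried as a NAL unit. Its NAL unit type is 15 (SPS_NUT) in H.266; the value 33 is the SPS type in H.265.

Parse the SPS:

The SPS structure is complex, and parsing it means implementing the SPS syntax defined in the H.266 standard. Many fields, including the picture dimensions, are coded as unsigned Exp-Golomb values, ue(v), so an Exp-Golomb decoder is required.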
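The Exp-Golomb reads mentioned above can be sketched as a small bit reader. This is a minimal illustration, not a full SPS parser; the class name BitReader and its method names are hypothetical helpers chosen for this example.

```python
class BitReader:
    """MSB-first bit reader over a bytes object (hypothetical helper)."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current position, in bits

    def read_bit(self) -> int:
        if self.pos >= len(self.data) * 8:
            raise EOFError("read past end of data")
        byte = self.data[self.pos // 8]
        bit = (byte >> (7 - self.pos % 8)) & 1
        self.pos += 1
        return bit

    def read_bits(self, n: int) -> int:
        """u(n): fixed-length unsigned integer."""
        value = 0
        for _ in range(n):
            value = (value << 1) | self.read_bit()
        return value

    def read_ue(self) -> int:
        """ue(v): count leading zero bits, then read that many suffix bits."""
        leading_zeros = 0
        while self.read_bit() == 0:
            leading_zeros += 1
        return (1 << leading_zeros) - 1 + self.read_bits(leading_zeros)

    def read_se(self) -> int:
        """se(v): signed mapping of ue(v): 1 -> 1, 2 -> -1, 3 -> 2, ..."""
        k = self.read_ue()
        return (k + 1) // 2 if k % 2 else -(k // 2)


reader = BitReader(bytes([0xA6, 0x40]))  # bits: 1 010 011 00100 ...
values = [reader.read_ue() for _ in range(4)]
```

The reader expects RBSP data, that is, NAL unit payload bytes with emulation prevention bytes already removed.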

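The concrete steps are:

Find the SPS NALU: in an Annex B byte stream every NAL unit is preceded by the start code 00 00 00 01 or 00 00 01, followed by the two-byte NAL unit header. For an H.266 SPS, nal_unit_type is 15 (0x0F), stored in the upper five bits of the second header byte.
Decode the SPS: the SPS carries sps_pic_width_max_in_luma_samples and sps_pic_height_max_in_luma_samples (the H.266 counterparts of H.265's pic_width_in_luma_samples and pic_height_in_luma_samples), which give the coded width and height of the video.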
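As a sketch of the header check in the first step, the two-byte H.266 NAL unit header can be unpacked as follows. The field layout (1-bit forbidden_zero_bit, 1-bit nuh_reserved_zero_bit, 6-bit nuh_layer_id, 5-bit nal_unit_type, 3-bit nuh_temporal_id_plus1) follows the H.266 specification; the names VvcNalHeader, parse_vvc_nal_header and is_vvc_sps are hypothetical.

```python
from typing import NamedTuple

VVC_SPS_NUT = 15  # nal_unit_type of the SPS in H.266/VVC


class VvcNalHeader(NamedTuple):
    forbidden_zero_bit: int
    nuh_reserved_zero_bit: int
    nuh_layer_id: int
    nal_unit_type: int
    nuh_temporal_id_plus1: int


def parse_vvc_nal_header(nal: bytes) -> VvcNalHeader:
    """Unpack the two header bytes that follow the start code."""
    if len(nal) < 2:
        raise ValueError("a VVC NAL unit header is two bytes long")
    b0, b1 = nal[0], nal[1]
    return VvcNalHeader(
        forbidden_zero_bit=b0 >> 7,
        nuh_reserved_zero_bit=(b0 >> 6) & 1,
        nuh_layer_id=b0 & 0x3F,
        nal_unit_type=b1 >> 3,
        nuh_temporal_id_plus1=b1 & 0x07,
    )


def is_vvc_sps(nal: bytes) -> bool:
    return parse_vvc_nal_header(nal).nal_unit_type == VVC_SPS_NUT


# An SPS in layer 0 with nuh_temporal_id_plus1 = 1 starts with 0x00 0x79.
header = parse_vvc_nal_header(bytes([0x00, 0x79]))
```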

Compute the width and height:

Take pic_width_in_luma_samples and pic_height_in_luma_samples from the SPS and subtract the conformance_window offsets to obtain the actual display width and height.

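Because the H.266 standard is complex and SPS parsing is detail-heavy, it is usually better to rely on an existing library or tool for these steps. FFmpeg is the usual choice: recent releases include VVC support (a native decoder arrived in FFmpeg 7.0), but availability depends on the version and build configuration, so use a recent release or look for a third-party library that supports H.266.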

In video coding, the SPS (Sequence Parameter Set) holds the global parameters of a video sequence. pic_width_in_luma_samples and pic_height_in_luma_samples give the picture width and height in luma samples. The actual display size, however, may be reduced by the conformance_window parameters. The adjusted width and height are computed as follows:

Get the original dimensions:

original_width = pic_width_in_luma_samples
original_height = pic_height_in_luma_samples

Apply the conformance_window parameters:

conformance_window consists of four parameters:

left_offset
right_offset
top_offset
bottom_offset

Compute the actual width and height:

Actual width:

actual_width = original_width - (left_offset + right_offset)

Actual height:

actual_height = original_height - (top_offset + bottom_offset)

Suppose pic_width_in_luma_samples is 1920 and pic_height_in_luma_samples is 1080, and the conformance_window parameters are:

left_offset = 8
right_offset = 8
top_offset = 4
bottom_offset = 4

The calculation then looks like this:

original_width = 1920
original_height = 1080

left_offset = 8
right_offset = 8
top_offset = 4
bottom_offset = 4

actual_width = original_width - (left_offset + right_offset)
actual_height = original_height - (top_offset + bottom_offset)

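print(f"Actual width: {actual_width}")
print(f"Actual height: {actual_height}")

The result is:

Actual width: 1904
Actual height: 1072

Note that this calculation assumes all offsets are expressed in luma samples. In the H.265 and H.266 syntax the offsets are expressed in chroma sample units, so they must be multiplied by SubWidthC and SubHeightC (both 2 for 4:2:0) before being subtracted. The displayed size can also be affected by other factors such as the sample and display aspect ratio (SAR/DAR); only the direct effect of conformance_window is considered here.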
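When the offsets come straight from the bitstream, the chroma scaling has to be applied. The sketch below assumes the usual SubWidthC/SubHeightC table indexed by chroma_format_idc and ignores separate colour planes; the function name cropped_size and the table name CHROMA_SCALE are hypothetical.

```python
# (SubWidthC, SubHeightC) per chroma_format_idc:
# 0 = monochrome, 1 = 4:2:0, 2 = 4:2:2, 3 = 4:4:4
CHROMA_SCALE = {0: (1, 1), 1: (2, 2), 2: (2, 1), 3: (1, 1)}


def cropped_size(width, height, left, right, top, bottom, chroma_format_idc=1):
    """Apply conformance window offsets given in chroma sample units."""
    sub_w, sub_h = CHROMA_SCALE[chroma_format_idc]
    return (width - sub_w * (left + right),
            height - sub_h * (top + bottom))


# A typical 1080p case: coded height 1088, bottom offset 4 in 4:2:0 units.
size_420 = cropped_size(1920, 1088, 0, 0, 0, 4, chroma_format_idc=1)
# With 4:4:4 the offsets are already in luma samples, as in the example above.
size_444 = cropped_size(1920, 1080, 8, 8, 4, 4, chroma_format_idc=3)
```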

Here are several brainstormed ideas for extracting the width and height of an H.266 (Versatile Video Coding, VVC) video from an MPEG2 Transport Stream (MPEG2-TS) encapsulated within an HTTP payload. Each approach considers the layered structure of the data—HTTP payload, MPEG2-TS, and H.266 bitstream—and proposes methods to retrieve the resolution parameters.

Idea 1: Custom Parser for Step-by-Step Extraction

Develop a custom solution to parse through each layer of the data manually and extract the width and height from the H.266 Sequence Parameter Set (SPS).

Process:

Extract MPEG2-TS Packets from HTTP Payload:

Read the HTTP payload as a sequence of 188-byte TS packets, each starting with a sync byte (0x47).
Verify alignment and handle potential misalignment by searching for the sync byte if needed.

Parse the Program Association Table (PAT):

Identify TS packets with PID 0x0000, which contain the PAT.
Parse the PAT to find the PID of the Program Map Table (PMT).

Parse the Program Map Table (PMT):

Use the PMT PID to locate PMT packets.
Identify the PID of the H.266 video stream by examining the stream types (stream_type 0x33 identifies H.266/VVC, just as 0x24 identifies H.265).

Extract Video TS Packets:

Filter TS packets matching the video PID.

Reassemble Packetized Elementary Stream (PES) Packets:

Detect the start of PES packets using the payload_unit_start_indicator in the TS header.
Concatenate payloads to form complete PES packets.

Extract H.266 NAL Units:

Parse the PES payload for H.266 Network Abstraction Layer (NAL) units, identified by start codes (0x000001 or 0x00000001).

Locate the SPS NAL Unit:

Check NAL unit types to find the SPS (nal_unit_type 15 in H.266, as opposed to 33 in H.265).

Parse the SPS:

Decode the SPS bitstream to extract width and height (pic_width_in_luma_samples and pic_height_in_luma_samples in H.265, sps_pic_width_max_in_luma_samples and sps_pic_height_max_in_luma_samples in H.266; both pairs are Exp-Golomb coded as ue(v)).
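The first steps of this process (packet alignment and TS header fields) can be sketched as below. It is a minimal illustration that assumes plain 188-byte packets without the 4-byte timestamps or FEC trailers found in some variants; TsPacket and iter_ts_packets are hypothetical names.

```python
from typing import Iterator, NamedTuple

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


class TsPacket(NamedTuple):
    pid: int
    payload_unit_start: bool
    continuity_counter: int
    payload: bytes


def iter_ts_packets(data: bytes) -> Iterator[TsPacket]:
    """Yield TS packets, resynchronising on the sync byte when misaligned."""
    pos = 0
    while pos + TS_PACKET_SIZE <= len(data):
        nxt = pos + TS_PACKET_SIZE
        # Accept a sync byte only if the following packet also starts with one.
        if data[pos] != SYNC_BYTE or (nxt < len(data) and data[nxt] != SYNC_BYTE):
            pos += 1
            continue
        pkt = data[pos:nxt]
        pos = nxt
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        pusi = bool(pkt[1] & 0x40)
        adaptation_field_control = (pkt[3] >> 4) & 0x03
        offset = 4
        if adaptation_field_control & 0x02:  # adaptation field present
            offset += 1 + pkt[4]
        has_payload = bool(adaptation_field_control & 0x01)
        payload = pkt[offset:] if has_payload else b""
        yield TsPacket(pid, pusi, pkt[3] & 0x0F, payload)


# Two packets on PID 0x100, preceded by two bytes of junk.
packet = bytes([0x47, 0x41, 0x00, 0x10]) + b"\xAA" * 184
packets = list(iter_ts_packets(b"\x12\x34" + packet + packet))
```

Filtering by PID is then a matter of keeping the packets whose pid matches the video PID found in the PMT.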

Pros:

Full control over the process.
Works without external dependencies if the H.266 SPS syntax is known.

Cons:

Requires detailed knowledge of H.266 SPS syntax (per the VVC standard).
Time-consuming to implement, especially the bitstream parsing.

Idea 2: Leverage FFmpeg or Libavcodec

Use FFmpeg or its libraries (e.g., libavformat and libavcodec) to simplify parsing and extract the resolution.

Process:

Feed HTTP Payload to FFmpeg:

Save the HTTP payload to a temporary file or use a custom AVIOContext to feed the TS data directly into FFmpeg.

Parse the TS Container:

Use avformat_open_input to interpret the MPEG2-TS and identify streams.

Locate the H.266 Stream:

Iterate through the streams in the AVFormatContext to find the video stream (check codec ID for H.266/VVC support).

Extract Codec Parameters:

Access the AVCodecParameters for the H.266 stream, which includes width and height after parsing the SPS internally.

Sample Code (Conceptual):

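#include <stdio.h>
#include <string.h>
#include <libavformat/avformat.h>
#include <libavutil/mem.h>

typedef struct { const uint8_t *data; size_t size, pos; } MemReader;  /* cursor over the payload */

/* AVIO read callback: copy the next chunk of the in-memory payload. */
static int read_mem(void *opaque, uint8_t *buf, int buf_size) {
    MemReader *r = opaque;
    size_t left = r->size - r->pos, n = left < (size_t)buf_size ? left : (size_t)buf_size;
    if (n == 0) return AVERROR_EOF;
    memcpy(buf, r->data + r->pos, n);
    r->pos += n;
    return (int)n;
}

void extract_resolution(const uint8_t *payload, size_t payload_size) {
    MemReader reader = { payload, payload_size, 0 };
    uint8_t *io_buf = av_malloc(4096);  /* the AVIO buffer must come from av_malloc() */
    AVIOContext *avio_ctx = io_buf ? avio_alloc_context(io_buf, 4096, 0, &reader, read_mem, NULL, NULL) : NULL;
    AVFormatContext *fmt_ctx = avio_ctx ? avformat_alloc_context() : NULL;
    if (!fmt_ctx) goto end;
    fmt_ctx->pb = avio_ctx;
    if (avformat_open_input(&fmt_ctx, NULL, NULL, NULL) == 0 &&  /* frees fmt_ctx on failure */
        avformat_find_stream_info(fmt_ctx, NULL) >= 0) {
        for (unsigned i = 0; i < fmt_ctx->nb_streams; i++) {
            AVCodecParameters *codecpar = fmt_ctx->streams[i]->codecpar;
            if (codecpar->codec_id == AV_CODEC_ID_VVC) {
                printf("Width: %d, Height: %d\n", codecpar->width, codecpar->height);
                break;
            }
        }
    }
    avformat_close_input(&fmt_ctx);  /* safe when fmt_ctx is NULL; does not free a custom pb */
end:
    if (avio_ctx) av_freep(&avio_ctx->buffer);  /* FFmpeg may have replaced io_buf internally */
    else av_free(io_buf);
    avio_context_free(&avio_ctx);
}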

Pros:

FFmpeg handles TS and H.266 parsing, reducing implementation effort.
Supports H.266 if compiled with VVC support (increasingly available in recent versions).

Cons:

Dependency on FFmpeg and its build configuration.
May require adapting the HTTP payload into a format FFmpeg expects.

Idea 3: Check for Associated Metadata (e.g., HLS/DASH Manifest)

If the HTTP payload is part of a streaming protocol like HLS or DASH, extract resolution from a manifest file instead of parsing the TS.

Process:

Identify the Streaming Context:

Check if the HTTP payload is a segment of an HLS .ts file or DASH MPD segment.

Access the Manifest:

For HLS, retrieve the .m3u8 playlist; for DASH, retrieve the .mpd file.

Parse Resolution:

Look for resolution tags (e.g., RESOLUTION=1920x1080 in HLS) or <Representation> attributes in DASH.
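For the HLS case, reading the RESOLUTION attribute can be sketched with a regular expression. This is a minimal illustration rather than a full playlist parser; the function name hls_resolutions and the sample playlist are made up for this example.

```python
import re

RESOLUTION_RE = re.compile(r"RESOLUTION=(\d+)x(\d+)")


def hls_resolutions(playlist_text: str) -> list:
    """Return (width, height) for every variant stream that declares RESOLUTION."""
    sizes = []
    for line in playlist_text.splitlines():
        if line.startswith("#EXT-X-STREAM-INF:"):
            match = RESOLUTION_RE.search(line)
            if match:
                sizes.append((int(match.group(1)), int(match.group(2))))
    return sizes


sample = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
high/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1200000,RESOLUTION=1280x720
low/index.m3u8
"""
sizes = hls_resolutions(sample)
```

The manifest value is whatever the packager declared, so it may differ from the cropped size signalled in the SPS.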

Pros:

Faster and simpler if metadata is available.
Avoids low-level bitstream parsing.

Cons:

Assumes access to a manifest, which may not be provided with a standalone HTTP payload.
Less reliable if the task strictly requires extraction from the TS itself.

Idea 4: Parse PMT Descriptors for Video Info

Investigate if the PMT or other TS tables (e.g., DVB descriptors) contain width and height directly, bypassing H.266 bitstream parsing.

Process:

Parse PAT and PMT:

Follow steps from Idea 1 to locate the PMT.

Check Descriptors:

Look for video-specific descriptors (e.g., maximum_bitrate_descriptor or custom H.266 descriptors) that might include resolution.

Extract Parameters:

If present, read width and height directly.

Pros:

Potentially quicker than parsing the H.266 bitstream.
Leverages container-level metadata.

Cons:

Unlikely for H.266, as resolution is typically in the SPS.
Standard-specific and not universally supported.

Idea 5: Minimal Parser with H.266 Library

Combine a lightweight TS/PES parser with an existing H.266 bitstream parsing library.

Process:

Custom TS and PES Parsing:

Extract the H.266 bitstream as in Idea 1 (steps 1–6).

Use an H.266 Parser:

Feed the NAL units into an open-source H.266 parser (e.g., from a VVC reference implementation like VTM) to decode the SPS.

Pros:

Balances custom code with reuse of existing tools.
Avoids full dependence on large frameworks like FFmpeg.

Cons:

Requires finding a suitable H.266 parsing library.
Still needs TS parsing implementation.

Recommended Approach

The most reliable and practical method is a hybrid of Idea 1 and Idea 2:

Use a lightweight custom parser to extract the H.266 bitstream from the TS (steps 1–6 of Idea 1).
Leverage FFmpeg’s libavcodec or an H.266-specific parser to interpret the SPS and retrieve width and height. This balances control, accuracy, and development effort, ensuring compatibility with H.266’s structure within MPEG2-TS.

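Key steps for extracting the width and height

Parse the MPEG2-TS (MP2T) container in each HTTP stream and extract the video elementary stream.
Identify the video format (H.264, H.265, or H.266) by inspecting the NAL unit headers.
Find the corresponding SPS NAL unit (type 7 for H.264, type 33 for H.265, type 15 for H.266).
Parse the SPS to obtain the width and height:

H.264: computed from the macroblock counts and the cropping offsets.
H.265 and H.266: read the width and height in luma samples directly, then subtract the conformance window offsets.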

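Steps for parsing the MP2T container

First, extract the video elementary stream from the MP2T stream:

Read the 188-byte MP2T packets, locating the sync byte 0x47.
Find the PAT packets with PID 0x0000 and obtain the PMT PID.
Parse the PMT packets to find the PID of the video stream (stream type 0x1B for H.264, 0x24 for H.265, 0x33 for H.266).
Collect all packets with that video PID and assemble the video elementary stream.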
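The PAT and PMT lookups can be sketched as follows. This is a simplified illustration: it assumes each section fits in a single TS payload that starts with a pointer_field, and it does not verify the CRC_32. The helper names (_section, parse_pat, parse_pmt, find_video) are hypothetical.

```python
VIDEO_STREAM_TYPES = {0x1B: "H.264", 0x24: "H.265", 0x33: "H.266"}


def _section(payload: bytes) -> bytes:
    """Return the PSI section from a TS payload that starts a section."""
    start = 1 + payload[0]  # skip pointer_field
    section_length = ((payload[start + 1] & 0x0F) << 8) | payload[start + 2]
    return payload[start:start + 3 + section_length]


def parse_pat(payload: bytes) -> dict:
    """Map program_number -> PMT PID."""
    sec = _section(payload)
    programs = {}
    # 8 header bytes, then 4-byte entries, then the 4-byte CRC_32.
    for i in range(8, len(sec) - 4, 4):
        program_number = (sec[i] << 8) | sec[i + 1]
        pid = ((sec[i + 2] & 0x1F) << 8) | sec[i + 3]
        if program_number != 0:  # program 0 points at the network PID
            programs[program_number] = pid
    return programs


def parse_pmt(payload: bytes) -> list:
    """Return (stream_type, elementary_PID) for every stream in the program."""
    sec = _section(payload)
    program_info_length = ((sec[10] & 0x0F) << 8) | sec[11]
    i = 12 + program_info_length
    streams = []
    while i + 5 <= len(sec) - 4:
        pid = ((sec[i + 1] & 0x1F) << 8) | sec[i + 2]
        es_info_length = ((sec[i + 3] & 0x0F) << 8) | sec[i + 4]
        streams.append((sec[i], pid))
        i += 5 + es_info_length
    return streams


def find_video(streams: list):
    """Return (codec name, PID) of the first known video stream, or None."""
    for stream_type, pid in streams:
        if stream_type in VIDEO_STREAM_TYPES:
            return VIDEO_STREAM_TYPES[stream_type], pid
    return None


# Hand-built sections: program 1 -> PMT PID 0x1000, one H.266 stream on PID 0x100.
pat = bytes([0x00, 0x00, 0xB0, 0x0D, 0x00, 0x01, 0xC1, 0x00, 0x00,
             0x00, 0x01, 0xF0, 0x00, 0, 0, 0, 0]) + b"\xFF" * 10
pmt = bytes([0x00, 0x02, 0xB0, 0x12, 0x00, 0x01, 0xC1, 0x00, 0x00,
             0xE1, 0x00, 0xF0, 0x00, 0x33, 0xE1, 0x00, 0xF0, 0x00, 0, 0, 0, 0])
```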

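Identifying the video format

Identify the video format from the NAL unit header (the stream_type in the PMT, when available, is the more reliable indicator):

H.264: one-byte NAL unit header, nal_unit_type 0-31, SPS type 7; the header byte is 0x67 when nal_ref_idc=3.
H.265: two-byte NAL unit header, nal_unit_type 0-63, SPS type 33; the header bytes are 0x42 0x01 (nuh_layer_id=0, nuh_temporal_id_plus1=1).
H.266: two-byte NAL unit header, nal_unit_type 0-31, SPS type 15; the header bytes are 0x00 0x79 (nuh_layer_id=0, nuh_temporal_id_plus1=1).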

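Parsing the SPS for width and height

H.264:

Find the SPS NAL unit (type 7).
Parse the RBSP and extract pic_width_in_mbs_minus1, pic_height_in_map_units_minus1, frame_mbs_only_flag, and the cropping offsets.
Compute (for 4:2:0): width = (pic_width_in_mbs_minus1 + 1) * 16 - (left crop + right crop) * 2; height = (2 - frame_mbs_only_flag) * (pic_height_in_map_units_minus1 + 1) * 16 - (top crop + bottom crop) * 2 * (2 - frame_mbs_only_flag).

H.265 and H.266:

Find the SPS NAL unit (type 33 for H.265, type 15 for H.266).
Parse the RBSP and read pic_width_in_luma_samples (width) and pic_height_in_luma_samples (height); in H.266 the fields are named sps_pic_width_max_in_luma_samples and sps_pic_height_max_in_luma_samples.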

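A surprising detail

H.265 and H.266 (VVC) both use a two-byte NAL unit header, unlike the one-byte header of H.264, but they arrange the fields differently: H.265 puts nal_unit_type in the first byte, H.266 in the second. A header parser written for one codec will misread the other.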

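Detailed survey

This part examines in detail how to extract the width and height from the Sequence Parameter Set (SPS) of H.264, H.265, and H.266 video, each carried in the MPEG2-TS (MP2T) container of a separate HTTP stream. The complete steps and analysis follow, from container parsing through SPS extraction.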

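Parsing the MP2T container

MP2T is a transport stream format used for broadcast and streaming; each packet is 188 bytes long. The parsing process is:

Sync and PID extraction: every packet starts with the sync byte 0x47 and carries a 13-bit PID (packet identifier). First find the PAT (Program Association Table) packets with PID 0x0000 and obtain the PID of the PMT (Program Map Table).
PAT and PMT parsing: the PAT lists program numbers and the corresponding PMT PIDs. The PMT in turn lists the type and PID of each elementary stream: the stream type is 0x1B (27) for H.264, 0x24 (36) for H.265, and 0x33 (51) for H.266.
Video stream extraction: using the video PID from the PMT, collect all packets with that PID, strip the transport header and any adaptation field, and concatenate the payloads. The result is a sequence of PES packets; stripping the PES headers yields the video elementary stream.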
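The last step, turning the TS payloads of the video PID into elementary stream bytes, can be sketched as below. It assumes the input has already been filtered to one PID and reduced to (payload_unit_start_indicator, payload) pairs, and that the PES packets use the standard video header layout; reassemble_pes and pes_payload are hypothetical names.

```python
from typing import Iterable, Iterator, Tuple


def reassemble_pes(packets: Iterable[Tuple[bool, bytes]]) -> Iterator[bytes]:
    """Group the TS payloads of one PID into complete PES packets."""
    current = bytearray()
    started = False
    for pusi, payload in packets:
        if pusi:  # a new PES packet begins in this TS packet
            if started and current:
                yield bytes(current)
            current = bytearray()
            started = True
        if started:  # data before the first PUSI cannot be framed, so skip it
            current += payload
    if started and current:
        yield bytes(current)


def pes_payload(pes: bytes) -> bytes:
    """Strip the PES header and return the elementary stream bytes."""
    if len(pes) < 9 or pes[0:3] != b"\x00\x00\x01":
        raise ValueError("not a PES packet")
    header_data_length = pes[8]  # PES_header_data_length
    return pes[9 + header_data_length:]


pes1 = b"\x00\x00\x01\xE0\x00\x00\x80\x80\x05" + b"\x21\x00\x01\x00\x01" + b"ABCDEF"
pes2 = b"\x00\x00\x01\xE0\x00\x00\x80\x00\x00" + b"XYZ"
items = [(False, b"junk"), (True, pes1[:10]), (False, pes1[10:]), (True, pes2)]
es_chunks = [pes_payload(p) for p in reassemble_pes(items)]
```

Concatenating the returned chunks gives the Annex B byte stream that contains the NAL units.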

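Identifying the video format

The video elementary stream may be H.264, H.265, or H.266; it can be identified from the NAL (Network Abstraction Layer) unit header:

H.264: NAL units start with 0x00 0x00 0x01 or 0x00 0x00 0x00 0x01, followed by a one-byte NAL unit header. From the most significant bit: forbidden_zero_bit (1 bit, always 0), nal_ref_idc (2 bits, 0-3), nal_unit_type (5 bits, 0-31). The SPS has nal_unit_type 7; with nal_ref_idc=3 the header byte is 0x67.
H.265: the NAL unit header is two bytes. From the most significant bit: forbidden_zero_bit (1 bit), nal_unit_type (6 bits, 0-63), nuh_layer_id (6 bits), nuh_temporal_id_plus1 (3 bits). The SPS has nal_unit_type 33; with nuh_layer_id=0 and nuh_temporal_id_plus1=1 the header bytes are 0x42 0x01.
H.266 (VVC): the NAL unit header is two bytes. From the most significant bit: forbidden_zero_bit (1 bit), nuh_reserved_zero_bit (1 bit), nuh_layer_id (6 bits), nal_unit_type (5 bits, 0-31), nuh_temporal_id_plus1 (3 bits). The SPS has nal_unit_type 15; with nuh_layer_id=0 and nuh_temporal_id_plus1=1 the header bytes are 0x00 0x79.

When identifying the format, search for the codec-specific SPS type: 7 for H.264, 33 for H.265, 15 for H.266.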
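The three header layouts translate into three small extraction functions. This is a sketch; the function names and the SPS_TYPE table are hypothetical, and nal is assumed to start at the first header byte after the start code.

```python
SPS_TYPE = {"H.264": 7, "H.265": 33, "H.266": 15}


def h264_nal_type(nal: bytes) -> int:
    """One byte: forbidden_zero_bit(1) nal_ref_idc(2) nal_unit_type(5)."""
    return nal[0] & 0x1F


def h265_nal_type(nal: bytes) -> int:
    """Two bytes: forbidden_zero_bit(1) nal_unit_type(6) nuh_layer_id(6) tid(3)."""
    return (nal[0] >> 1) & 0x3F


def h266_nal_type(nal: bytes) -> int:
    """Two bytes: forbidden(1) reserved(1) nuh_layer_id(6) nal_unit_type(5) tid(3)."""
    return nal[1] >> 3


NAL_TYPE_READERS = {"H.264": h264_nal_type, "H.265": h265_nal_type, "H.266": h266_nal_type}


def is_sps(codec: str, nal: bytes) -> bool:
    return NAL_TYPE_READERS[codec](nal) == SPS_TYPE[codec]


# The same bytes mean different things under different codecs:
# an H.265 SPS header (0x42 0x01) read as H.264 looks like type 2.
misread = h264_nal_type(bytes([0x42, 0x01]))
```

Because the same bytes decode to a plausible type under more than one codec, the codec is best taken from the PMT stream_type and the header check used only as confirmation.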

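Extracting and parsing the SPS

Once the SPS NAL unit has been found, parse its RBSP (Raw Byte Sequence Payload) to extract the width and height:

Parsing the H.264 SPS

SPS NAL unit: nal_unit_type=7; the header byte after the start code is 0x67 (nal_ref_idc=3).
RBSP parsing: the data after the NAL unit header, with emulation prevention bytes removed, is the RBSP. Decode it with fixed-length and Exp-Golomb reads to extract parameters including:

profile_idc: profile indicator
constraint_set flags: constraint set flags
level_idc: level indicator
seq_parameter_set_id: sequence parameter set ID
pic_width_in_mbs_minus1: width in macroblocks minus 1
pic_height_in_map_units_minus1: height in map units minus 1
frame_mbs_only_flag: frame-macroblocks-only flag
frame_cropping_flag: frame cropping flag; when it is 1, cropping offsets follow
frame_crop_left_offset, frame_crop_right_offset, frame_crop_top_offset, frame_crop_bottom_offset: cropping offsets

Width and height calculation:

Width = (pic_width_in_mbs_minus1 + 1) * 16 - (frame_crop_left_offset + frame_crop_right_offset) * 2
Height = (2 - frame_mbs_only_flag) * (pic_height_in_map_units_minus1 + 1) * 16 - (frame_crop_top_offset + frame_crop_bottom_offset) * 2 * (2 - frame_mbs_only_flag)
These formulas assume 4:2:0 chroma sampling, where the crop unit is 2 luma samples horizontally and 2 * (2 - frame_mbs_only_flag) luma samples vertically.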
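The H.264 calculation can be written as a small function that takes the already-decoded SPS fields. This sketch generalises the crop units to other chroma formats and ignores separate_colour_plane_flag; the function name h264_frame_size is hypothetical.

```python
# (SubWidthC, SubHeightC) per chroma_format_idc; 1 is 4:2:0.
SUB_SAMPLING = {0: (1, 1), 1: (2, 2), 2: (2, 1), 3: (1, 1)}


def h264_frame_size(pic_width_in_mbs_minus1, pic_height_in_map_units_minus1,
                    frame_mbs_only_flag, crop_left=0, crop_right=0,
                    crop_top=0, crop_bottom=0, chroma_format_idc=1):
    """Return (width, height) in luma samples after frame cropping."""
    sub_w, sub_h = SUB_SAMPLING[chroma_format_idc]
    crop_unit_x = sub_w
    crop_unit_y = sub_h * (2 - frame_mbs_only_flag)
    width = (pic_width_in_mbs_minus1 + 1) * 16 - crop_unit_x * (crop_left + crop_right)
    height = ((2 - frame_mbs_only_flag) * (pic_height_in_map_units_minus1 + 1) * 16
              - crop_unit_y * (crop_top + crop_bottom))
    return width, height


# Progressive 1080p: 120 x 68 macroblocks, 8 luma rows cropped at the bottom.
progressive = h264_frame_size(119, 67, 1, crop_bottom=4)
# Interlaced 1080i: 34 map units per field, crop unit of 4 luma rows.
interlaced = h264_frame_size(119, 33, 0, crop_bottom=2)
```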

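Parsing the H.265 SPS

SPS NAL unit: nal_unit_type=33; the header bytes after the start code are 0x42 0x01.
RBSP parsing: as with H.264, decode with Exp-Golomb reads; the parameters include:

pic_width_in_luma_samples: width in luma samples, ue(v)
pic_height_in_luma_samples: height in luma samples, ue(v)

Width and height: pic_width_in_luma_samples and pic_height_in_luma_samples give the coded size directly. When conformance_window_flag is 1, subtract the conformance window offsets (scaled by SubWidthC and SubHeightC) to obtain the display size.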

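Parsing the H.266 (VVC) SPS

SPS NAL unit: nal_unit_type=15; the header bytes are 0x00 0x79 (nuh_layer_id=0, nuh_temporal_id_plus1=1).
RBSP parsing: similar to H.265; the parameters include:

sps_pic_width_max_in_luma_samples: maximum picture width in luma samples, ue(v)
sps_pic_height_max_in_luma_samples: maximum picture height in luma samples, ue(v)

Width and height: the two values above are read in the same way as in H.265, although the fields that precede them in the SPS differ. The conformance window offsets (sps_conf_win_left_offset and its siblings) are applied as in H.265.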

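Implementation notes

Tool support: tools such as FFmpeg can parse both the MP2T container and the video stream; extracting the values manually from the SPS, as required here, means implementing an Exp-Golomb decoder.
Complexity: the H.264 calculation involves macroblocks and cropping; H.265 and H.266 are more direct because the sample dimensions are read as they are.
H.266 specifics: the two-byte NAL unit header places nal_unit_type in the second byte, after nuh_reserved_zero_bit and nuh_layer_id in the first, so an H.265 header parser cannot be reused unchanged.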

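Summary and comparison

Video format	SPS NAL unit type	How width and height are obtained	Notes
H.264	7	Computed from macroblock counts and cropping offsets	One-byte NAL header; requires Exp-Golomb decoding
H.265	33	Read directly: pic_width_in_luma_samples, pic_height_in_luma_samples	Two-byte NAL header; ue(v) fields
H.266	15	Read directly: sps_pic_width_max_in_luma_samples, sps_pic_height_max_in_luma_samples	Two-byte NAL header with a different field layout from H.265

The one-byte NAL unit header sets H.264 apart from the other two, and the field layout of the two-byte header is what distinguishes H.266 from H.265 at this level; the way the width and height are obtained is similar in H.265 and H.266.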

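Key points

Extracting the H.265 stream: extracting the H.265 video bitstream from an MPEG2-TS stream usually starts with TS demuxing, which identifies and separates the data on the video PID.
Identifying NALU boundaries: use the NALU start codes (0x000001 or 0x00000001) to delimit each NAL unit.
Parsing the NALU header: the header of each NAL unit carries the NALU type and related fields. For H.265 the header consists of a 1-bit forbidden_zero_bit, a 6-bit nal_unit_type, a 6-bit nuh_layer_id, and a 3-bit nuh_temporal_id_plus1. For example, NALU type 32 is the VPS, 33 the SPS, and 34 the PPS.
Extracting specific NALU types: filter for the NALU types you need (such as SPS and PPS), then parse their payloads to extract coding parameters (for example resolution, frame rate, and color space information).
Decoding and using parameters: the extracted SPS and PPS data are typically used to configure the decoder context so that the subsequent video frame data can be decoded correctly.
Implementation examples: NALU parsing and decoding can be done with custom parsing functions or with an existing library such as FFmpeg.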

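Once the H.265 (HEVC) video bitstream has been extracted from the MPEG2-TS byte stream, the next step is to parse the NAL units in the bitstream. The core steps are:

Demuxing and NALU splitting

TS demuxing: first demux the MPEG2-TS stream, identify the PID that carries the video, and extract the video byte stream.
NALU boundary identification: H.265 NAL units are delimited by start codes (0x000001 or 0x00000001). Use these markers to split the continuous bitstream into individual NAL units.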
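Start-code splitting can be sketched as below. The function name split_annexb is hypothetical; the sketch relies on the rule that a NAL unit never ends in a 0x00 byte, so trailing zeros can be attributed to the next start code.

```python
def split_annexb(stream: bytes) -> list:
    """Split an Annex B byte stream into NAL units with start codes removed."""
    nals = []
    i = stream.find(b"\x00\x00\x01")
    while i != -1:
        start = i + 3
        nxt = stream.find(b"\x00\x00\x01", start)
        end = len(stream) if nxt == -1 else nxt
        # Trailing zero bytes belong to a 4-byte start code or to padding.
        nal = stream[start:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
        i = nxt
    return nals


stream = (b"\x00\x00\x00\x01\x42\x01\xAA"    # 4-byte start code, H.265 SPS header
          b"\x00\x00\x00\x01\x44\x01\xBB"    # 4-byte start code, H.265 PPS header
          b"\x00\x00\x01\x26\x01\xCC")       # 3-byte start code, slice data
nal_units = split_annexb(stream)
```

Emulation prevention guarantees that the pattern 00 00 01 never appears inside a NAL unit, which is why a plain byte search is sufficient.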

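NALU header parsing

The header of each NAL unit carries key information, the most important being nal_unit_type.
For H.265, the 6-bit nal_unit_type in the NALU header identifies the type of the unit. Common types include:

32: Video Parameter Set (VPS)
33: Sequence Parameter Set (SPS)
34: Picture Parameter Set (PPS)

By reading the NALU header you can tell whether the NAL unit is the SPS or PPS you need.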

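Extracting key information such as SPS and PPS

When a NALU of type 33 (SPS) or 34 (PPS) is detected, parse the payload that follows the header.
The SPS holds sequence-level parameters such as resolution, frame rate, and bitstream restrictions; the PPS holds configuration used while decoding pictures.
These fields are bit-packed and must be parsed according to the syntax defined in the HEVC standard (ITU-T H.265), for example by reading the bitstream field by field with Exp-Golomb decoding.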
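Before the fields can be read bit by bit, the emulation prevention bytes have to be removed from the NAL unit payload: the encoder inserts 0x03 after every pair of zero bytes so that start codes cannot appear inside a NAL unit. A minimal sketch, with the hypothetical function name nal_to_rbsp:

```python
def nal_to_rbsp(nal_payload: bytes) -> bytes:
    """Remove emulation prevention bytes: 00 00 03 becomes 00 00."""
    out = bytearray()
    zeros = 0  # number of consecutive zero bytes just written
    for byte in nal_payload:
        if zeros >= 2 and byte == 0x03:
            zeros = 0  # drop the inserted 0x03
            continue
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


rbsp = nal_to_rbsp(b"\x00\x00\x03\x01\x00\x00\x03\x00")
```

The same rule applies to H.264, H.265, and H.266; pass in the bytes that follow the one-byte or two-byte NAL unit header.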

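Decoder configuration and decoding of subsequent frames

Before decoding, the VPS, SPS, and PPS are normally extracted first and used to configure the decoder context.
Only with these parameters can the decoder correctly parse the subsequent slice data (IDR, P, and B frames, for example) and reconstruct the picture.

Sample code and tools

Using the FFmpeg library:
In C/C++, for example, FFmpeg's av_parser_parse2() function can split the stream into NALUs automatically, and the result can then be decoded with an AVCodecContext.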

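// Example: initialize the decoder and parser, then parse NALU data.
// nalu_data and nalu_size are assumed to be supplied by the caller.
const AVCodec *codec = avcodec_find_decoder(AV_CODEC_ID_HEVC);
AVCodecContext *codec_ctx = avcodec_alloc_context3(codec);
AVCodecParserContext *parser = av_parser_init(codec->id);
AVPacket *packet = av_packet_alloc();
// After reading NALU data, call the parser; ret is the number of input bytes consumed
int ret = av_parser_parse2(parser, codec_ctx, &packet->data, &packet->size,
                           nalu_data, nalu_size, AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);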

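Custom NALU parsing:
To handle the stream manually, split it into NALUs by searching for start codes, then read the NALU type as follows:

// header points to the first byte of the two-byte H.265 NALU header
uint8_t nal_unit_type = (header[0] >> 1) & 0x3F;
if(nal_unit_type == 33) {
    // handle the SPS
} else if(nal_unit_type == 34) {
    // handle the PPS
}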
