Optimizing FFmpeg's avformat_find_stream_info

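Original post by fengyuzaitu, 2017-11-18 14:29:41. Category: Video and Audio.

© Copyright belongs to the author. This is an original work by fengyuzaitu on the 51CTO blog; contact the author for permission before reposting.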

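Function description

avformat_find_stream_info probes the input to determine the format of each stream: video codecs such as H.265, H.264, H.263, or MPEG-4, and audio codecs such as AAC, PCM, or MP2. Picture width, height, and bit depth are obtained once a suitable decoder has been found, by decoding video frames.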

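Background

When fast start-up of live playback is not critical, you can simply cap the probe duration and the amount of data probed:
   pFormatContext->probesize = 500 * 1024;  // probe at most 500 KiB of input
   pFormatContext->max_analyze_duration = 5 * AV_TIME_BASE;  // AV_TIME_BASE is the number of FFmpeg time units per second, so this is 5 seconds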
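To get a feel for what these two limits mean on a live stream, here is a minimal sketch that estimates how long it takes for probesize bytes to arrive at a given bitrate. The helper names and the bitrates are illustrative assumptions, nothing here calls FFmpeg, and it assumes probing ends at whichever limit is reached first (in practice it can also end earlier, once all stream parameters are known).

```cpp
#include <cstdint>

// Seconds needed for `probe_bytes` of data to arrive at `bitrate_bps`
// (bits per second). Both inputs are illustrative values.
double seconds_to_fill_probe(std::int64_t probe_bytes, std::int64_t bitrate_bps) {
    return static_cast<double>(probe_bytes) * 8.0 / static_cast<double>(bitrate_bps);
}

// Rough upper bound on probing time, assuming the probe stops at whichever
// of the two limits (bytes read, duration analyzed) is hit first.
double worst_case_probe_seconds(std::int64_t probe_bytes,
                                std::int64_t bitrate_bps,
                                double max_analyze_seconds) {
    const double fill = seconds_to_fill_probe(probe_bytes, bitrate_bps);
    return fill < max_analyze_seconds ? fill : max_analyze_seconds;
}
```

For a hypothetical 4.096 Mbit/s camera stream, 500 * 1024 bytes take one second to arrive, so the size limit alone already uses up a 1-second start-up budget.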

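Drawbacks

Lowering probesize and max_analyze_duration does shorten probing, but at the cost of reliability: sometimes no stream information is detected and the stream cannot be played.

This shows up when packets are lost (the video is carried over UDP) or when the network path is complex and crosses several network segments.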

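Optimization

The requirement is a start-up time of no more than 1 second, and the camera's video parameters may be specified in advance:
Only the video stream is considered for now; audio will be added later. The input is known to be video: H264 1920*1080 25fps, so the decoder can be opened directly without calling avformat_find_stream_info.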

A direction tried earlier

This direction turned out to be unnecessary.

AVStream* CDecoder::CreateStream(AVFormatContext* pFormatContext, int nCodecType)
{
    // Append a new stream to the context and tag it with the requested media type.
    AVStream *st = avformat_new_stream(pFormatContext, NULL);
    if (!st)
        return NULL;
    st->codecpar->codec_type = (AVMediaType)nCodecType;
    return st;
}



/* FLV_TAG_TYPE_* come from FFmpeg's internal libavformat/flv.h
   (audio = 8, video = 9, meta = 18). */
int CDecoder::GetVideoExtraData(AVFormatContext* pFormatContext, int nVideoIndex)
{
    int type, size;
    int ret = -1;

    if (!pFormatContext || !pFormatContext->pb ||
        nVideoIndex < 0 || nVideoIndex >= (int)pFormatContext->nb_streams)
        return ret;

    AVIOContext *pb = pFormatContext->pb;
    AVCodecParameters *par = pFormatContext->streams[nVideoIndex]->codecpar;

    /* Walk the FLV tags; the 4 bytes skipped after each tag are PreviousTagSize. */
    for (;; avio_skip(pb, 4)) {
        type = avio_r8(pb);    /* tag type */
        size = avio_rb24(pb);  /* size of the tag body */
        avio_skip(pb, 7);      /* timestamp (3 + 1 bytes) and stream ID (3 bytes) */

        if (0 == size)
            break;

        if (FLV_TAG_TYPE_AUDIO == type || FLV_TAG_TYPE_META == type) {
            /* Audio or meta tag: skip its body. */
            avio_skip(pb, size);
        }
        else if (FLV_TAG_TYPE_VIDEO == type) {
            /* First video tag: 5 header bytes followed by the SPS/PPS record.
               Copy the record straight out of the I/O buffer, then stop. */
            size -= 5;
            if (size <= 0 || pb->buf_end - pb->buf_ptr < size + 5)
                break;
            par->extradata = (uint8_t*)av_mallocz(size + AV_INPUT_BUFFER_PADDING_SIZE);
            if (NULL == par->extradata)
                break;
            memcpy(par->extradata, pb->buf_ptr + 5, size);
            par->extradata_size = size;
            ret = 0;
            break;
        }
        else {
            /* Unknown tag type: something is wrong. */
            break;
        }
    }

    return ret;
}
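The loop in GetVideoExtraData reads the fixed 11-byte header in front of every FLV tag body. The same layout can be sketched on a plain byte buffer, independent of FFmpeg. FlvTagHeader and parse_flv_tag_header are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Fields of the 11-byte header that precedes every FLV tag body.
struct FlvTagHeader {
    std::uint8_t  type;       // 8 = audio, 9 = video, 18 = script data
    std::uint32_t data_size;  // length of the tag body in bytes
    std::uint32_t timestamp;  // milliseconds; the extension byte holds bits 24..31
    std::uint32_t stream_id;  // 3-byte stream ID field
};

// Parse one tag header from `p`. Returns false if fewer than 11 bytes are available.
bool parse_flv_tag_header(const std::uint8_t* p, std::size_t len, FlvTagHeader& out) {
    if (p == nullptr || len < 11)
        return false;
    out.type      = p[0];
    out.data_size = (std::uint32_t(p[1]) << 16) | (std::uint32_t(p[2]) << 8) | p[3];
    out.timestamp = (std::uint32_t(p[4]) << 16) | (std::uint32_t(p[5]) << 8) | p[6];
    out.timestamp |= std::uint32_t(p[7]) << 24;  // timestamp extension byte
    out.stream_id = (std::uint32_t(p[8]) << 16) | (std::uint32_t(p[9]) << 8) | p[10];
    return true;
}
```

Widening each byte to 32 bits before shifting avoids the signed-overflow problem that shifting a plain int left by 24 can cause.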


int CDecoder::InitDecode(AVFormatContext *pFormatContext)
{
    int video_index = -1;
    int audio_index = -1;
    int ret = -1;

    if (!pFormatContext)
        return ret;

    /*
    Get the video stream index; if there is no video stream, create it.
    Likewise for audio.
    */
    if (0 == pFormatContext->nb_streams) {
        CreateStream(pFormatContext, AVMEDIA_TYPE_VIDEO);
        CreateStream(pFormatContext, AVMEDIA_TYPE_AUDIO);
        video_index = 0;
        audio_index = 1;
    }
    else if (1 == pFormatContext->nb_streams) {
        if (AVMEDIA_TYPE_VIDEO == pFormatContext->streams[0]->codecpar->codec_type) {
            CreateStream(pFormatContext, AVMEDIA_TYPE_AUDIO);
            video_index = 0;
            audio_index = 1;
        }
        else if (AVMEDIA_TYPE_AUDIO == pFormatContext->streams[0]->codecpar->codec_type) {
            CreateStream(pFormatContext, AVMEDIA_TYPE_VIDEO);
            video_index = 1;
            audio_index = 0;
        }
    }
    else if (2 == pFormatContext->nb_streams) {
        if (AVMEDIA_TYPE_VIDEO == pFormatContext->streams[0]->codecpar->codec_type) {
            video_index = 0;
            audio_index = 1;
        }
        else if (AVMEDIA_TYPE_VIDEO == pFormatContext->streams[1]->codecpar->codec_type) {
            video_index = 1;
            audio_index = 0;
        }
    }

    /* Error: no video stream found, or a stream could not be created. */
    if ((video_index != 0 && video_index != 1) || pFormatContext->nb_streams < 2)
        return ret;

    AVStream *ast = pFormatContext->streams[audio_index];
    AVStream *vst = pFormatContext->streams[video_index];

    // Init the audio codec (AAC).
    ast->codecpar->codec_id = AV_CODEC_ID_AAC;
    ast->codecpar->sample_rate = 44100;
    ast->codecpar->bits_per_coded_sample = 16;
    ast->codecpar->channels = 2;
    ast->codecpar->channel_layout = AV_CH_LAYOUT_STEREO;  /* value 3 */
    ast->pts_wrap_bits = 32;
    ast->time_base.den = 1000;
    ast->time_base.num = 1;

    // Init the video codec (H264).
    vst->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
    vst->codecpar->codec_id = AV_CODEC_ID_H264;
    vst->codecpar->format = AV_PIX_FMT_YUVJ420P;  /* value 12 */
    vst->codecpar->bits_per_raw_sample = 8;
    vst->codecpar->profile = 66;                  /* H.264 Baseline */
    vst->codecpar->level = 42;
    vst->codecpar->width = 1920;
    vst->codecpar->height = 1080;
    vst->codecpar->sample_aspect_ratio.num = 0;
    vst->codecpar->sample_aspect_ratio.den = 1;

    vst->pts_wrap_bits = 64;
    vst->time_base.den = 1200000;
    vst->time_base.num = 1;
    vst->avg_frame_rate.den = 1;
    vst->avg_frame_rate.num = 25;
    /* Needs changing per source: different cameras have different frame rates.
       'r_frame_rate' is new in FFmpeg 2.3.3. */
    vst->r_frame_rate.num = 25;  /* 25/1, i.e. 25 fps; 1/25 would be reported as 0.04 tbr */
    vst->r_frame_rate.den = 1;

    /* H264 needs SPS/PPS for decoding, so read them from the first video tag. */
    ret = GetVideoExtraData(pFormatContext, video_index);

    /* Update the AVFormatContext info: only streams[0] stays visible to callers. */
    pFormatContext->nb_streams = 1;

    /* Empty the buffer. */
    pFormatContext->pb->buf_ptr = pFormatContext->pb->buf_end;

    return ret;
}
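The frame-rate and time-base fields set in InitDecode are rationals of the form num/den. A minimal sketch of how such a ratio evaluates, using a stand-in struct rather than FFmpeg's own AVRational (Rational and to_double are hypothetical names):

```cpp
// Stand-in for a num/den rational; FFmpeg's own type for this is AVRational.
struct Rational {
    int num;
    int den;
};

// Value of the rational as a floating-point number.
double to_double(Rational r) {
    return static_cast<double>(r.num) / static_cast<double>(r.den);
}
```

A frame rate of 25/1 evaluates to 25 frames per second, while 1/25 evaluates to 0.04, which is the figure that appears as "0.04 tbr" in the log later in this post. It is therefore worth checking which way round num and den are assigned.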

Existing approach

AVDictionary* pOptions = NULL;
pFormatCtx->probesize = 200 * 1024;
pFormatCtx->max_analyze_duration = 3 * AV_TIME_BASE;

// Retrieve stream information
if (avformat_find_stream_info(pFormatCtx, &pOptions) < 0)
{
    return -1; // Couldn't find stream information
}

Call logic

InitDecode(pFormatCtx);

Result

In testing, this made start-up 1200 ms faster.

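Test results

When probing an ES stream, avformat_open_input returns very quickly; PS is the exception. With av_log_set_callback set up to write the log to a file, calling avformat_open_input to probe a PS input prints the following: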
Probing mp3 score:1 size:2048
Probing mp3 score:1 size:4096
Probing mp3 score:1 size:8192
Probing mp3 score:1 size:16384
Probing h264 score:51 size:32768
Format h264 probed with size=32768 and score=51
Input #0, h264, from '':
Duration: N/A, bitrate: N/A
Stream #0:0, 0, 1/1200000: Video: h264 (Baseline), yuvj420p, 1920x1080, 25 fps, 0.04 tbr, 1200k tbn
deprecated pixel format used, make sure you did set range correctly
non-existing PPS 0 referenced
non-existing PPS 0 referenced
nal_unit_type: 1, nal_ref_idc: 3
non-existing PPS 0 referenced
non-existing PPS 0 referenced
decode_slice_header error
non-existing PPS 0 referenced
non-existing PPS 0 referenced
non-existing PPS 0 referenced
no frame!

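Tracing the source shows that the line Probing h264 score:51 size:32768 is printed from av_probe_input_format3, and the format it matched is:
name h264
long_name raw H.264 video
raw_codec_id 28
So if the AVInputFormat is specified up front, the time spent probing the input format can be saved.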

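FFmpeg latency optimization for a known H.264 ES stream
With reference to the article at http://blog.csdn.net/rain_2011_kai/article/details/7746805:
if the codec ID of the video being sent and the frame width and height are known, can FFmpeg's stream-probing calls be skipped altogether? avformat_open_input and avformat_find_stream_info together take more than 500 ms.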

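Manually specifying the decoder format has little effect
Values of the pb fields as shown in the debugger:

buffer   0x00000000002cbdc0  unsigned char *

buf_end  0x00000000002d3626  unsigned char *

pos 262349

buffer 2932160

buf_end 2962982

buffsize 35840

buf_end-buffer 30822

Where the value of pos comes from deserves a closer look.

pFormatContext->pb->pos = pFormatContext->pb->buf_end;

This does not compile with the current version, because pos is a 64-bit integer and buf_end is a character pointer.

Even so, the values above do not make the relationship between them clear, and although the decoder format was specified manually, the result is not satisfactory.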
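On the relationship left open above: as far as can be told from libavformat/aviobuf.c, when reading, pos is the file offset that corresponds to buf_end, and the current read position is pos - (buf_end - buf_ptr). The sketch below models only that arithmetic with a stand-in struct, fed with the debugger values. IoState and its helpers are hypothetical, not FFmpeg types, and the reading of pos is an assumption to be checked against the FFmpeg version in use.

```cpp
#include <cstdint>

// Stand-in for the read-side fields of an I/O context, with the pointers
// replaced by plain integer addresses so the arithmetic is easy to follow.
struct IoState {
    std::int64_t pos;      // file offset corresponding to buf_end
    std::int64_t buffer;   // address of the start of the buffer
    std::int64_t buf_ptr;  // address of the next byte to be read
    std::int64_t buf_end;  // address one past the last valid byte
};

// File offset of the next byte to be read.
std::int64_t tell(const IoState& s) {
    return s.pos - (s.buf_end - s.buf_ptr);
}

// File offset of the first byte held in the buffer.
std::int64_t buffer_start_offset(const IoState& s) {
    return s.pos - (s.buf_end - s.buffer);
}
```

With pos = 262349 and buf_end - buffer = 30822, the buffered bytes would cover file offsets 231527 up to 262349. Under this reading, discarding the buffer by setting buf_ptr = buf_end needs no change to pos.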