FFmpeg 4.0: Audio Demuxing and Decoding

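Original

liu149339750 2019-05-30 11:21:07 © Copyright

Tags: ffmpeg  Category: Open Source

© Copyright belongs to the author. This is an original work by 51CTO blog author liu149339750; contact the author for permission before reposting, otherwise legal action may be taken.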

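The functions used below are based on FFmpeg 4.0 (4.x), the latest release series at the time of writing.

Raw audio data is PCM-encoded. For background on PCM encoding, see this article: https://www.jianshu.com/p/cfb3d4dc3676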
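
Because a raw PCM file carries no header, its size follows directly from the audio parameters. The following is a minimal sketch of that arithmetic, which can serve as a sanity check on the decoded output. The function and parameter names are illustrative only and are not part of FFmpeg:

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of headerless PCM audio.
 * All names here are hypothetical helpers, not FFmpeg API. */
static uint64_t pcm_byte_count(uint32_t sample_rate, uint32_t channels,
                               uint32_t bytes_per_sample, uint32_t seconds)
{
    assert(channels > 0 && bytes_per_sample > 0);
    /* samples per second x channels x bytes per sample x duration */
    return (uint64_t)sample_rate * channels * bytes_per_sample * seconds;
}
```

For example, one second of 44.1 kHz, 16-bit (2-byte), stereo audio gives pcm_byte_count(44100, 2, 2, 1) = 176400 bytes.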

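The goal of this article is to extract the audio part of a source file and decode it into a PCM file. The walkthrough below leaves out application logic and shows each step in order using the most basic calls, so that the usage is easy to follow: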

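I. Getting the media file information

1. Declare and allocate the format context:
	avformat_context = avformat_alloc_context();
2. Open the file and read its header:
	avformat_open_input(&avformat_context,src_name,NULL,NULL);
If avformat_context has not been allocated (that is, it is NULL when passed in), this function allocates it for you.
3. Some formats have no header, so data has to be read and analyzed to obtain the stream information:
	avformat_find_stream_info(avformat_context,NULL);

II. Decoder setup

4. Find the stream you want. av_find_best_stream can be used in place of the loop below: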

	int audio_stream_index = -1;	// -1 means "no audio stream found"
	unsigned int i;
	// Simplified stand-in for av_find_best_stream: take the first audio stream.
	for(i = 0;i<avformat_context->nb_streams;i++) {
		if(avformat_context->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
			audio_stream_index = (int)i;
			av_log(NULL,AV_LOG_INFO,"find audio stream index = %d\n",audio_stream_index);
			break;
		}
	}
	if(audio_stream_index < 0)
		return;	// the file contains no audio stream
	AVStream *stream = avformat_context->streams[audio_stream_index];

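5. Get the decoder ID from the stream parameters and look up the decoder:
	codec = avcodec_find_decoder(stream->codecpar->codec_id);
6. Allocate the decoding context:
	codec_ctx = avcodec_alloc_context3(codec);
7. Copy the parameters from the stream information into the decoding context:
	ret = avcodec_parameters_to_context(codec_ctx,stream->codecpar);
8. Open the decoder:
	avcodec_open2(codec_ctx, codec, NULL);

III. Demuxing and decoding the stream

9. Allocate the packet (pkt) and the frame (frame):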

	frame = av_frame_alloc();
	pkt = av_packet_alloc();

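10. Read a packet:
	av_read_frame(avformat_context,pkt)
11. Send the packet to the decoder:
	avcodec_send_packet(codec_ctx,pkt);
12. Receive the decoded frames with avcodec_receive_frame(codec_ctx,frame). Note that one packet may decode into several frames, so the call has to be made in a loop. An example that receives all frames for one packet: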

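	// Drain every frame the decoder can produce for the packet just sent.
	while(1) {
		ret = avcodec_receive_frame(codec_ctx,frame);
		if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
			break;	// decoder needs more input, or has been fully flushed
		if (ret < 0) {
			av_log(NULL,AV_LOG_ERROR,"error during decoding\n");
			break;
		}
		// Bytes per sample for the decoder's output sample format.
		int data_size = av_get_bytes_per_sample(codec_ctx->sample_fmt);
		LOGV("data_size = %d,line0 = %d,codec_ctx->channels = %d\n",data_size * frame->nb_samples,frame->linesize[0],codec_ctx->channels);
		int i,ch = 0;
		// Planar data: write sample i of every channel in turn (interleaving).
		for(i = 0;i<frame->nb_samples;i++)
		{
			for(ch = 0;ch<codec_ctx->channels;ch++)
			{
				fwrite(frame->extended_data[ch] + data_size*i,1,data_size,outfile);
			}
		}
	}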

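13. Every sample format has a fixed sample size.
	int data_size = av_get_bytes_per_sample(codec_ctx->sample_fmt);
returns the number of bytes per sample for that format. For planar formats, frame->linesize[0] holds the size in bytes of a single channel's plane. This corresponds to data_size*frame->nb_samples, although the value may be somewhat larger because of alignment padding, so use data_size*frame->nb_samples when counting valid bytes.
14. Every decoded frame contains multiple samples and may contain multiple channels. The code below applies to planar formats: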

        for(i = 0;i<frame->nb_samples;i++)
        {
        	for(ch = 0;ch<codec_ctx->channels;ch++)
        	{
        		fwrite(frame->extended_data[ch] + data_size*i,1,data_size,outfile);
        	}
        }
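
The same interleaving can be done into a memory buffer first, which makes it easy to check and allows one write per frame. The following is a minimal sketch using plain arrays. Here planes[ch] stands in for frame->extended_data[ch], and the helper name interleave_planar is hypothetical, not an FFmpeg function:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Interleave planar audio into a packed buffer.
 * planes[ch] points at the samples of channel ch; out must hold
 * nb_samples * channels * bytes_per_sample bytes.
 * Returns the number of bytes written to out. */
static size_t interleave_planar(uint8_t *const *planes, int channels,
                                int nb_samples, int bytes_per_sample,
                                uint8_t *out)
{
    assert(channels > 0 && nb_samples >= 0 && bytes_per_sample > 0);
    size_t written = 0;
    for (int i = 0; i < nb_samples; i++) {
        for (int ch = 0; ch < channels; ch++) {
            /* copy sample i of channel ch */
            memcpy(out + written,
                   planes[ch] + (size_t)i * bytes_per_sample,
                   (size_t)bytes_per_sample);
            written += (size_t)bytes_per_sample;
        }
    }
    return written;
}
```

With two channels of three 2-byte samples each, the output holds sample 0 of the left channel, sample 0 of the right channel, then sample 1 of each, and so on, for 12 bytes in total.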

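Audio sample formats are stored in one of two ways, packed or planar. The difference shows in whether the format name ends in P (for example AV_SAMPLE_FMT_S16 versus AV_SAMPLE_FMT_S16P). P stands for planar: each channel is stored separately, so the data is laid out as LLLLLLRRRRRRLLLLLLRRRRRRLLLLLLRRRRRRL... (each LLLLLLRRRRRR is one audio frame). Formats without P are packed, i.e. interleaved, and are laid out as LRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRL... (each LR is one audio sample).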
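
To make the two orderings concrete, here is a small sketch that prints which channel each sample slot of a frame belongs to. It is purely illustrative: describe_layout is a hypothetical helper, and in a real planar frame each channel lives in its own buffer rather than in one contiguous string.

```c
#include <assert.h>
#include <string.h>

/* Fill out with the channel label of every sample slot in one frame.
 * is_planar != 0: all samples of channel 0, then all of channel 1, ...
 * is_planar == 0: channels alternate sample by sample (packed).
 * labels holds one character per channel, e.g. "LR" for stereo.
 * out must have room for nb_samples * channels + 1 characters. */
static void describe_layout(int is_planar, const char *labels, int channels,
                            int nb_samples, char *out)
{
    assert(channels > 0 && nb_samples >= 0);
    assert(strlen(labels) >= (size_t)channels);
    size_t n = 0;
    if (is_planar) {
        for (int ch = 0; ch < channels; ch++)
            for (int i = 0; i < nb_samples; i++)
                out[n++] = labels[ch];
    } else {
        for (int i = 0; i < nb_samples; i++)
            for (int ch = 0; ch < channels; ch++)
                out[n++] = labels[ch];
    }
    out[n] = '\0';
}
```

For a stereo frame of three samples, the planar call yields "LLLRRR" and the packed call yields "LRLRLR".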

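Decoded audio is usually planar nowadays (FFmpeg's native AAC decoder, for example, outputs AV_SAMPLE_FMT_FLTP). In a packed format all the data sits in a single array, with the left and right channels alternating. In a planar format each channel's data sits in its own array, so when generating a PCM file, keep in mind that one complete sample point consists of the left and right samples together. For audio, use extended_data. The data array is mainly intended for video; it also works for audio when there are few channels (it has only AV_NUM_DATA_POINTERS entries, 8 in FFmpeg 4.x), but this is not the officially recommended approach.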
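
The decoding loop shown earlier only handles planar output. For a packed format no per-sample loop is needed, because the samples are already interleaved in the first plane. The following is a minimal sketch under that assumption. The helper write_packed_frame is hypothetical, and plane0 stands in for frame->extended_data[0]:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write one packed (interleaved) frame to a raw PCM file.
 * plane0 already contains the samples of all channels interleaved,
 * so the whole frame goes out in a single fwrite.
 * Returns the number of bytes actually written. */
static size_t write_packed_frame(const uint8_t *plane0, int channels,
                                 int nb_samples, int bytes_per_sample,
                                 FILE *outfile)
{
    assert(channels > 0 && nb_samples >= 0 && bytes_per_sample > 0);
    size_t total = (size_t)nb_samples * (size_t)channels
                 * (size_t)bytes_per_sample;
    return fwrite(plane0, 1, total, outfile);
}
```

In a real decoder loop, av_sample_fmt_is_planar(codec_ctx->sample_fmt) can be used to choose between this single write and the per-sample interleaving used for planar formats.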