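Today we'll look at how to capture audio.

Operating systems come in many flavors: on the desktop there are macOS, Windows, and Linux, and on phones there are Android, iOS, and others. Calling each system's native audio/video functions to do the capture would be costly. FFmpeg already wraps the corresponding APIs for us.

In this post we go through the audio capture flow and the APIs involved. At the end we implement audio capture with an example.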

In simple terms, the audio capture flow is shown in the figure below.

[Figure: audio capture flow]

Expressed in terms of FFmpeg, the same flow can be drawn as the following flow chart.

[Figure: FFmpeg audio capture flow chart]

FFmpeg flow walkthrough

1. To operate on devices, we first have to register them

/**
 * Initialize libavdevice and register all the input and output devices.
 */
void avdevice_register_all(void);

2. Choose the capture method

/**
 * Find AVInputFormat based on the short name of the input format.
 */
ff_const59 AVInputFormat *av_find_input_format(const char *short_name);

The argument passed to av_find_input_format is one of avfoundation, dshow, or alsa:

avfoundation: macOS

dshow: Windows

alsa: Linux
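The platform-to-name mapping above can be sketched as a compile-time switch. This is only an illustration: `capture_format_name` is a hypothetical helper, not part of FFmpeg, and it assumes the usual predefined compiler macros (`__APPLE__`, `_WIN32`, `__linux__`). Note that Android also defines `__linux__`, so a real project would need a finer check.

```c
#include <stddef.h>

/* Hypothetical helper (not an FFmpeg function): returns the short name to
 * pass to av_find_input_format() for the platform this file is compiled on,
 * or NULL if none of the three platforms listed above applies. */
static const char *capture_format_name(void)
{
#if defined(__APPLE__)
    return "avfoundation";   /* macOS */
#elif defined(_WIN32)
    return "dshow";          /* Windows (DirectShow) */
#elif defined(__linux__)
    return "alsa";           /* Linux */
#else
    return NULL;             /* unknown platform: caller must handle this */
#endif
}
```

The result would then be handed to av_find_input_format, after checking it for NULL.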

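AVInputFormat is FFmpeg's demuxer object.

Each file format corresponds to one AVInputFormat structure, so many instances exist while the program is running.

The next field links all supported input container formats into a linked list, which makes traversal and lookup easy.

priv_data_size gives the size of the private context that belongs to the specific container format.

Let's look at what the AVInputFormat struct contains.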

typedef struct AVInputFormat {
    /* Short name of the format */
    const char *name;

    /* Long name of the format */
    const char *long_name;

    /**
     * Can use flags: AVFMT_NOFILE, AVFMT_NEEDNUMBER, AVFMT_SHOW_IDS,
     * AVFMT_NOTIMESTAMPS, AVFMT_GENERIC_INDEX, AVFMT_TS_DISCONT, AVFMT_NOBINSEARCH,
     * AVFMT_NOGENSEARCH, AVFMT_NO_BYTE_SEEK, AVFMT_SEEK_TO_PTS.
     */
    int flags;

    /* If extensions are defined, no probe is done. Extension-based format
     * guessing is usually best avoided because it is not reliable enough. */
    const char *extensions;

    /* Codec tags */
    const struct AVCodecTag * const *codec_tag;

    const AVClass *priv_class; ///< AVClass for the private context

    /**
     * Comma-separated list of mime types.
     * It is used to check for matching mime types while probing.
     * @see av_probe_input_format2
     */
    const char *mime_type;

    /*****************************************************************
     * No fields below this line are part of the public API. They
     * may not be used outside of libavformat and can be changed and
     * removed at will.
     * New public fields should be added right above.
     *****************************************************************
     */
    /* Links all supported input container formats into a linked list */
#if FF_API_NEXT
    ff_const59 struct AVInputFormat *next;
#endif

    /**
     * Raw demuxers store their codec ID here.
     */
    int raw_codec_id;

    /**
     * Size of the private data, so that it can be allocated in the wrapper.
     */
    int priv_data_size;

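    /**
     * Tell if a given file has a chance of being parsed as this format.
     * The buffer provided is guaranteed to be AVPROBE_PADDING_SIZE bytes
     * big, so you do not have to check for that unless you need more.
     */
    int (*read_probe)(const AVProbeData *);

    /**
     * Read the format header and initialize the AVFormatContext structure.
     * Return 0 if OK. avformat_new_stream should be called to create new streams.
     */
    int (*read_header)(struct AVFormatContext *);

    /**
     * Read one packet and put it in pkt. pts and flags are also set.
     * avformat_new_stream can be called only if the flag AVFMTCTX_NOHEADER
     * is used, and only in the calling thread (not in a background thread).
     * Returns 0 on success, < 0 on error.
     * On error, pkt must be unreferenced by the caller.
     */
    int (*read_packet)(struct AVFormatContext *, AVPacket *pkt);

    /* Close the stream. The AVFormatContext and AVStreams are not freed here. */
    int (*read_close)(struct AVFormatContext *);

    /* Seek to a given timestamp relative to the frames in stream stream_index */
    int (*read_seek)(struct AVFormatContext *,
                     int stream_index, int64_t timestamp, int flags);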

    /**
     * Get the next timestamp in stream[stream_index].time_base units.
     * @return the timestamp or AV_NOPTS_VALUE if an error occurred
     */
    int64_t (*read_timestamp)(struct AVFormatContext *s, int stream_index,
                              int64_t *pos, int64_t pos_limit);


    /* Resume playback; only meaningful for the RTSP format */
    int (*read_play)(struct AVFormatContext *);

    /* Pause playback; only meaningful for the RTSP format */
    int (*read_pause)(struct AVFormatContext *);

    /* Seek to timestamp ts, with the result constrained to min_ts..max_ts */
    int (*read_seek2)(struct AVFormatContext *s, int stream_index, int64_t min_ts, int64_t ts, int64_t max_ts, int flags);

    /* Return the device list and its properties */
    int (*get_device_list)(struct AVFormatContext *s, struct AVDeviceInfoList *device_list);

    /* Initialize the device capabilities submodule */
    int (*create_device_capabilities)(struct AVFormatContext *s, struct AVDeviceCapabilitiesQuery *caps);

    /* Free the device capabilities submodule */
    int (*free_device_capabilities)(struct AVFormatContext *s, struct AVDeviceCapabilitiesQuery *caps);
} AVInputFormat;

3. Open the device

/**
 * Open an input stream and read the header. The codecs are not opened.
 * The stream must be closed with avformat_close_input().
 *
 * @param ps Pointer to user-supplied AVFormatContext (allocated by avformat_alloc_context).
 *           May be a pointer to NULL, in which case an AVFormatContext is allocated by this
 *           function and written into ps.
 *           Note that a user-supplied AVFormatContext will be freed on failure.
 * @param url URL of the stream to open.
 * @param fmt If non-NULL, this parameter forces a specific input format.
 *            Otherwise the format is autodetected.
 * @param options  A dictionary filled with AVFormatContext and demuxer-private options.
 *                 On return this parameter will be destroyed and replaced with a dict containing
 *                 options that were not found. May be NULL.
 *
 * @return 0 on success, a negative AVERROR on failure.
 *
 * @note If you want to use custom IO, preallocate the format context and set its pb field.
 */
int avformat_open_input(AVFormatContext **ps, const char *url, ff_const59 AVInputFormat *fmt, AVDictionary **options);

4. Read data from the audio device into an AVPacket

/**
 * Return the next frame of a stream.
 * This function returns what is stored in the file, and does not validate
 * that what is there are valid frames for the decoder. It will split what is
 * stored in the file into frames and return one for each call. It will not
 * omit invalid data between valid frames so as to give the decoder the maximum
 * information possible for decoding.
 *
 * On success, the returned packet is reference-counted (pkt->buf is set) and
 * valid indefinitely. The packet must be freed with av_packet_unref() when
 * it is no longer needed. For video, the packet contains exactly one frame.
 * For audio, it contains an integer number of frames if each frame has
 * a known fixed size (e.g. PCM or ADPCM data). If the audio frames have
 * a variable size (e.g. MPEG audio), then it contains one frame.
 *
 * pkt->pts, pkt->dts and pkt->duration are always set to correct
 * values in AVStream.time_base units (and guessed if the format cannot
 * provide them). pkt->pts can be AV_NOPTS_VALUE if the video format
 * has B-frames, so it is better to rely on pkt->dts if you do not
 * decompress the payload.
 *
 * @return 0 if OK, < 0 on error or end of file. On error, pkt will be blank
 *         (as if it came from av_packet_alloc()).
 *
 * @note pkt will be initialized, so it may be uninitialized, but it must not
 *       contain data that needs to be freed.
 */
int av_read_frame(AVFormatContext *s, AVPacket *pkt);

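5. Release the AVPacket when you are done with it

Four functions are involved:

av_init_packet(AVPacket *pkt)

av_packet_unref(AVPacket *pkt)

av_packet_alloc()

Allocates the packet structure first, then initializes it.

av_packet_free(AVPacket **pkt)

Calls av_packet_unref first, then frees the structure.

av_init_packet and av_packet_unref form one pair.

av_packet_alloc and av_packet_free form the other pair.

They must be used in matching pairs, otherwise memory will leak.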

6. Close the input device

/**
 * Close an opened input AVFormatContext. Free it and all its contents
 * and set *s to NULL.
 */
void avformat_close_input(AVFormatContext **s);