Multimedia Development (18): Common FFmpeg Structures

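Besides AVPacket and AVFrame, which were covered earlier, several other structures keep showing up when you work through an FFmpeg pipeline. Which ones? The diagram for this post lays them out.

The centerpiece of that diagram is AVFormatContext. AVFormatContext is one of FFmpeg's basic structures and corresponds to the container (wrapper) format.

This post introduces the common FFmpeg data structures that surround this "format context", going through the structures in the diagram one at a time.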

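(I) AVCodec

AVCodec is the FFmpeg structure that holds information about a codec. In other words, an AVCodec represents an encoder or a decoder.

As before, let's use a debugger to look at what an AVCodec variable actually contains.

(1) Demo code

The demo directory contains four items: a prebuilt FFmpeg static library, moments.mp4, makefile, and show_avcodec.c.

The FFmpeg static library was built in advance (the macOS build in this case, since I'm on a Mac right now); see the earlier posts for how to build it.

moments.mp4 is the test video file (MP4 container).

makefile is the build script for the demo code; you can also compile directly with gcc.

show_avcodec.c is the demo code itself: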

#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"

void show_avcodec(const char* filepath) {
    av_register_all();  // needed by older FFmpeg only; deprecated since 4.0
    av_log_set_level(AV_LOG_DEBUG);
    AVFormatContext* formatContext = avformat_alloc_context();
    int status = 0;
    int success = 0;
    int videostreamidx = -1;
    AVCodecContext* codecContext = NULL;
    status = avformat_open_input(&formatContext, filepath, NULL, NULL);
    if (status == 0) {
        status = avformat_find_stream_info(formatContext, NULL);
        if (status >= 0) {
            for (unsigned int i = 0; i < formatContext->nb_streams; i ++) {  // find the first video stream
                if (formatContext->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
                    videostreamidx = (int)i;
                    break;
                }
            }
            if (videostreamidx > -1) {
                codecContext = formatContext->streams[videostreamidx]->codec;  // legacy field, deprecated in newer FFmpeg
                AVCodec* codec = avcodec_find_decoder(codecContext->codec_id);
                if (codec) {
                    status = avcodec_open2(codecContext, codec, NULL);
                    if (status == 0) {
                        success = 1;
                        avcodec_close(codecContext);  // close the decoder once we are done inspecting it
                    }
                }
            }
        }
        else {
            av_log(NULL, AV_LOG_DEBUG, "avformat_find_stream_info error\n");
        }
        avformat_close_input(&formatContext);  // also sets formatContext to NULL
    }
    else {
        // avformat_open_input frees the context and sets it to NULL on failure
        av_log(NULL, AV_LOG_ERROR, "avformat_open_input error: %d\n", status);
    }
    av_log(NULL, AV_LOG_INFO, "decoder opened: %s\n", success ? "yes" : "no");
}

int main(int argc, char *argv[])
{
    show_avcodec("moments.mp4");
    return 0;
}

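This is the usual FFmpeg routine: avformat_open_input gives you the AVFormatContext, avformat_find_stream_info fills in the AVStream entries, each AVStream leads to its AVCodecContext, from which you look up the AVCodec, and finally you open the codec with avcodec_open2.

(2) Build and debug

Contents of the makefile: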

exe=showavcodec
srcs=show_avcodec.c 
$(exe):$(srcs)
    gcc -o $(exe) $(srcs) -Iffmpeg/include/ -Lffmpeg -lffmpeg -liconv -lz -g
clean:
    rm -f $(exe) *.o

Run make to build. The build also produces the debug symbol bundle, showavcodec.dSYM.

We only want a quick look at the contents of AVCodec, so gdb is enough:

gdb showavcodec
b 25
r

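At the breakpoint, print the AVCodec variable (for example with p *codec) to see its values.

(3) Contents of AVCodec

AVCodec is the codec structure, defined in libavcodec/avcodec.h.

In this example the AVCodec is a decoder.

The meaning of most AVCodec fields is clear from their names or from FFmpeg's detailed comments.

For example, name is the codec's short name and long_name is its descriptive name.

By design, AVCodec is an abstraction of a codec, and every codec supplies its own concrete implementation behind it.

For example, the H.264 decoder, ff_h264_decoder (libavcodec/h264.c), and the LAME MP3 encoder, ff_libmp3lame_encoder (libavcodec/libmp3lame.c).

FFmpeg relies on these concrete implementations to do the actual encoding and decoding.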
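
To make the "abstraction plus concrete implementations" idea more tangible, here is a minimal sketch in plain C. It is not FFmpeg code: MiniCodec, mini_find_decoder, the MINI_ID_* values, and the toy decode functions are all made-up names, and the real AVCodec has many more fields. It only illustrates how a table of descriptors with function pointers lets a caller look up an implementation by id, in the spirit of avcodec_find_decoder.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for AVCodec: descriptive fields plus
 * a function pointer that each concrete codec fills in. */
typedef struct MiniCodec {
    const char *name;       /* short name, e.g. "h264" */
    const char *long_name;  /* descriptive name */
    int id;                 /* plays the role of codec_id */
    int (*decode)(const unsigned char *in, int in_size);
} MiniCodec;

enum { MINI_ID_H264 = 1, MINI_ID_MP3 = 2 };

/* Toy "implementations": they only report how many bytes they pretend to consume. */
static int toy_h264_decode(const unsigned char *in, int in_size) {
    (void)in;
    assert(in_size >= 0);
    return in_size;
}

static int toy_mp3_decode(const unsigned char *in, int in_size) {
    (void)in;
    assert(in_size >= 0);
    return in_size / 2;
}

/* The registry: one descriptor per concrete codec. */
static const MiniCodec mini_codecs[] = {
    { "h264", "toy H.264 decoder", MINI_ID_H264, toy_h264_decode },
    { "mp3",  "toy MP3 decoder",   MINI_ID_MP3,  toy_mp3_decode  },
};

/* Look a codec up by id; returns NULL when the id is unknown. */
const MiniCodec *mini_find_decoder(int id) {
    for (size_t i = 0; i < sizeof(mini_codecs) / sizeof(mini_codecs[0]); i++) {
        if (mini_codecs[i].id == id) {
            return &mini_codecs[i];
        }
    }
    return NULL;
}
```

A caller would write something like mini_find_decoder(MINI_ID_H264)->decode(data, size) without knowing which implementation sits behind the pointer, which is the role AVCodec plays for FFmpeg.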

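(II) AVCodecContext

AVCodecContext can be thought of as the context in which an AVCodec is used. Besides its link to the AVCodec, it carries a good deal of other information.

It can be inspected the same way as AVCodec, using the demo code above. The relevant part is: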

if (videostreamidx > -1) {
    codecContext = formatContext->streams[videostreamidx]->codec;
    AVCodec* codec = avcodec_find_decoder(codecContext->codec_id);
    if (codec) {
        status = avcodec_open2(codecContext, codec, NULL);
        if (status == 0) {
            success = 1;
        }
    }
}

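Again gdb is enough. Set a breakpoint after codecContext is assigned and print it (for example with p *codecContext) to see part of the AVCodecContext.

Some fields deserve attention:

width/height  width and height of the video
codec_id  id of the codec, used to look up the matching codec
extradata  for H.264 video, holds the SPS and PPS parameter sets
profile  codec profile (for H.264, for instance Baseline, Main, or High)
sample_rate  audio sample rate
channels  number of audio channels
sample_fmt  audio sample format

Like AVCodec, AVCodecContext is defined in libavcodec/avcodec.h.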
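
To show what "extradata holds the SPS and PPS" can look like in practice, here is a small sketch that walks extradata laid out in the avcC form commonly found in MP4 files. That layout is an assumption about the input: H.264 streams from other containers may carry Annex B start codes instead, in which case this function simply reports failure. AvccSummary and avcc_summarize are hypothetical names, not FFmpeg API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result of a shallow walk over avcC-style extradata. */
typedef struct AvccSummary {
    int profile_idc;      /* H.264 profile, e.g. 66 = Baseline, 77 = Main, 100 = High */
    int level_idc;
    int nal_length_size;  /* bytes used by each NAL length prefix (1..4) */
    int sps_count;
    int pps_count;
} AvccSummary;

/* Skip `count` parameter sets, each stored as a 2-byte big-endian length plus payload.
 * Returns 0 on success, -1 if the data runs out. */
static int skip_parameter_sets(const uint8_t *data, size_t size, size_t *pos, int count) {
    for (int i = 0; i < count; i++) {
        if (*pos + 2 > size) return -1;
        size_t len = ((size_t)data[*pos] << 8) | data[*pos + 1];
        *pos += 2;
        if (len > size - *pos) return -1;
        *pos += len;
    }
    return 0;
}

/* Returns 0 on success, -1 if the data is not well-formed avcC. */
int avcc_summarize(const uint8_t *data, size_t size, AvccSummary *out) {
    assert(data != NULL && out != NULL);
    if (size < 7 || data[0] != 1) {  /* configurationVersion must be 1 */
        return -1;
    }
    out->profile_idc     = data[1];
    out->level_idc       = data[3];
    out->nal_length_size = (data[4] & 0x03) + 1;
    out->sps_count       = data[5] & 0x1F;

    size_t pos = 6;
    if (skip_parameter_sets(data, size, &pos, out->sps_count) < 0) return -1;
    if (pos + 1 > size) return -1;
    out->pps_count = data[pos++];
    if (skip_parameter_sets(data, size, &pos, out->pps_count) < 0) return -1;
    return 0;
}
```

With the demo above, the inputs would be codecContext->extradata and codecContext->extradata_size; checking the result against codecContext->profile is a quick sanity test.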

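(III) AVStream

With AVCodec and AVCodecContext covered, next comes AVStream.

Roughly, the three relate like this: an AVStream holds an AVCodecContext (its codec field), and the AVCodecContext in turn refers to an AVCodec.

An AVStream corresponds to one media stream: audio, video, subtitles, and so on. FFmpeg uses the notion of a stream to wrap each kind of media.

The demo code and build steps for inspecting AVStream are the same as in the AVCodec section above.

Set a breakpoint and print the stream (for example with p *formatContext->streams[videostreamidx]) to see the contents of the AVStream.

Some AVStream fields:

index, index of the stream
codec, the AVCodecContext that belongs to the stream
time_base, the time base (a fraction of a second) in which the stream's timestamps are expressed
duration, duration of the stream, in time_base units
metadata, metadata of the stream
nb_frames, number of frames in the stream

AVStream is defined in libavformat/avformat.h.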
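
Since time_base is a fraction, a stream's duration or a timestamp only becomes seconds after multiplying by it. The sketch below shows that arithmetic in plain C. MiniRational is a hypothetical stand-in for FFmpeg's AVRational, and the two helper names are made up; in real code the conversion is usually written as value * av_q2d(stream->time_base).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational: the fraction num/den. */
typedef struct MiniRational {
    int num;
    int den;
} MiniRational;

/* Convert a value counted in time_base ticks (a timestamp or a duration) to seconds. */
double ticks_to_seconds(int64_t ticks, MiniRational time_base) {
    assert(time_base.den != 0);
    return (double)ticks * time_base.num / time_base.den;
}

/* Average frame rate estimated from nb_frames and the stream duration.
 * Returns 0.0 when the duration is not positive. */
double average_fps(int64_t nb_frames, int64_t duration_ticks, MiniRational time_base) {
    double seconds = ticks_to_seconds(duration_ticks, time_base);
    return seconds > 0.0 ? (double)nb_frames / seconds : 0.0;
}
```

For instance, with a time base of 1/12800, a duration of 128000 ticks is 10 seconds, and 250 frames over that duration average out to 25 frames per second.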

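(IV) AVFormatContext

AVFormatContext is the main character. It is the format context and corresponds to the container (wrapper) format.

Using the same demo code, set a breakpoint right after the call to avformat_open_input.

There you can print the fields of the AVFormatContext (for example with p *formatContext).

The metadata field of AVFormatContext records information about the media file (artist, album, and the like). It can be read like this: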

if (formatCtx->metadata) {
    AVDictionaryEntry *item = NULL;
    while((item = av_dict_get(formatCtx->metadata, "", item, AV_DICT_IGNORE_SUFFIX))){
        printf("key:%s value:%s \n", item->key, item->value);
    }

    // Or look up a single key directly:
    AVDictionaryEntry *tag = NULL;
    tag = av_dict_get(formatCtx->metadata, "artist", NULL, 0);
    if (tag) {
        printf("artist:%s \n", tag->value);
    }
}

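Notes on some AVFormatContext fields:

iformat/oformat, the input/output format, used when demuxing or muxing.
pb, the I/O context, which provides the data access interface (read, write, seek, and so on).
nb_streams, number of streams (media is muxed as streams).
streams, the array of streams.
filename, the file name (newer FFmpeg versions use url instead).
start_time, start time of the media, in AV_TIME_BASE units (divide by AV_TIME_BASE to get seconds).
duration, duration of the media, in AV_TIME_BASE units.
bit_rate, the bit rate.
probesize, maximum amount of data to read while detecting the container format; set it before avformat_open_input, or leave the default.
max_analyze_duration, maximum duration of data to analyze while detecting the codec parameters; set it before avformat_find_stream_info, or leave the default. The larger it is, the longer the analysis takes.
metadata, the metadata.

AVFormatContext is defined in libavformat/avformat.h.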
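
Because start_time and duration are counted in AV_TIME_BASE units, they need a small conversion before they can be shown to a user. Here is a sketch of that step. MINI_TIME_BASE mirrors the value of AV_TIME_BASE (1000000, that is, microseconds), and format_duration is a hypothetical helper, not an FFmpeg function. It assumes the duration is known and non-negative; FFmpeg uses AV_NOPTS_VALUE for an unknown duration, which the caller would need to check first.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Same value as FFmpeg's AV_TIME_BASE: ticks per second. */
#define MINI_TIME_BASE 1000000

/* Write "HH:MM:SS.mmm" for a duration given in MINI_TIME_BASE units.
 * Returns what snprintf returns: the length of the formatted text. */
int format_duration(int64_t duration, char *out, size_t out_size) {
    assert(duration >= 0 && out != NULL);
    int64_t total_ms = duration / (MINI_TIME_BASE / 1000);
    int ms       = (int)(total_ms % 1000);
    int64_t sec  = total_ms / 1000;
    int s        = (int)(sec % 60);
    int m        = (int)((sec / 60) % 60);
    long long h  = (long long)(sec / 3600);
    return snprintf(out, out_size, "%02lld:%02d:%02d.%03d", h, m, s, ms);
}
```

Passing formatContext->duration from the demo would give the length of moments.mp4 in a readable form.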

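(V) AVIOContext

AVIOContext is the structure that carries I/O state. Within FFmpeg's structures it sits inside AVFormatContext, as the member named pb.

pb is where the data flows through. It is used both for reading (demuxing, on the decode path) and for writing (muxing, on the encode path).

(1) When decoding

When decoding, pb supplies the raw input data. Typically you call avio_alloc_context to create the AVIOContext, passing your own read and seek functions (custom implementations that read data and move the read position), and then assign the result (that is, pb) to the AVFormatContext. For example: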

pb = avio_alloc_context(readBuf, readBufLen, 0, this, myReadFunc, NULL, mySeekFunc);
mFormatCtx->pb = pb;

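Alternatively, when decoding, the AVIOContext fields can be filled in by hand (avio_alloc_context is generally the safer route, since it initializes every field for you):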

mIOContext.read_packet = myReadFunc;
mIOContext.seek = mySeekFunc;
const int MAX_PRO_SIZE = 32*1024;
unsigned char* probuf = (unsigned char*)av_malloc(MAX_PRO_SIZE); 
mIOContext.buffer = probuf;
mIOContext.buf_ptr = probuf;
mIOContext.buffer_size = MAX_PRO_SIZE;
mIOContext.buf_end = probuf;  // read mode starts with an empty buffer, so buf_end == buf_ptr
mIOContext.max_packet_size = MAX_PRO_SIZE;
formatContext->pb = &mIOContext;

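The functions myReadFunc and mySeekFunc only need to match the callback signatures declared in AVIOContext (see the comments in the header file).

When decoding, pb must be set before avformat_open_input is called.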
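
As an illustration of what such callbacks might look like, here is a sketch of a read function and a seek function backed by a block of memory. The function signatures follow the read_packet and seek members of AVIOContext, but everything else is an assumption made for the example: MemSource, mem_read, and mem_seek are hypothetical names, MINI_AVSEEK_SIZE copies the value of FFmpeg's AVSEEK_SIZE so the block compiles without FFmpeg headers, and MINI_EOF is a placeholder for AVERROR_EOF, which real callbacks should return at end of data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */
#include <string.h>

#define MINI_AVSEEK_SIZE 0x10000  /* same value as FFmpeg's AVSEEK_SIZE */
#define MINI_EOF (-1)             /* placeholder; real callbacks return AVERROR_EOF */

/* What the opaque pointer refers to in this sketch: a block of memory and a cursor. */
typedef struct MemSource {
    const uint8_t *data;
    int64_t size;
    int64_t pos;
} MemSource;

/* Same shape as AVIOContext.read_packet: copy up to buf_size bytes into buf. */
int mem_read(void *opaque, uint8_t *buf, int buf_size) {
    MemSource *src = (MemSource *)opaque;
    assert(src != NULL && buf != NULL && buf_size >= 0);
    int64_t left = src->size - src->pos;
    if (left <= 0) {
        return MINI_EOF;
    }
    int n = (int)(left < buf_size ? left : buf_size);
    memcpy(buf, src->data + src->pos, (size_t)n);
    src->pos += n;
    return n;
}

/* Same shape as AVIOContext.seek: returns the new position, or a negative value on error. */
int64_t mem_seek(void *opaque, int64_t offset, int whence) {
    MemSource *src = (MemSource *)opaque;
    assert(src != NULL);
    if (whence == MINI_AVSEEK_SIZE) {
        return src->size;  /* the caller only wants the total size, no seeking */
    }
    int64_t target;
    switch (whence) {
    case SEEK_SET: target = offset;             break;
    case SEEK_CUR: target = src->pos + offset;  break;
    case SEEK_END: target = src->size + offset; break;
    default:       return -1;
    }
    if (target < 0 || target > src->size) {
        return -1;
    }
    src->pos = target;
    return target;
}
```

In a real program, a pointer to the MemSource would be passed as the opaque argument of avio_alloc_context, with mem_read and mem_seek in the read and seek slots.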

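(2) When encoding

When encoding to a file, pb is the destination for the muxed output (the encoded data, carried in AVPackets). For example, pb can be opened directly with avio_open2: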

if (!(mFormatContext->oformat->flags & AVFMT_NOFILE)) {  // AVFMT_NOFILE is a flag of the output format
    err = avio_open2(&mFormatContext->pb, url, AVIO_FLAG_WRITE, &mFormatContext->interrupt_callback, NULL);
    // ...
}

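When writing the file, use av_write_frame to write a packet, or avio_write to write raw bytes into pb.

When encoding to a file, pb must be set before avformat_write_header is called.

(3) AVIOContext fields

Only some of them are listed here:

buffer, start address of the buffer in which AVIOContext caches data.
buffer_size, size of the buffer.
buf_ptr, current position within the buffer.
buf_end, end of the valid data, which may fall short of the end of the buffer.
opaque, a private pointer handed to the read, write, and seek callbacks. When FFmpeg opens the I/O itself it points to a URLContext; with custom callbacks it points to whatever those callbacks need.
read_packet/write_packet/seek, the read, write, and seek callbacks; they can be supplied to avio_alloc_context to customize I/O.

AVIOContext is defined in libavformat/avio.h.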
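
To see how buffer, buf_ptr, and buf_end work together on the read side, here is a cut-down model in plain C. It is a sketch under assumptions, not FFmpeg's implementation: MiniIO, mini_io_init, mini_io_read_byte, ByteSource, and byte_source_read are hypothetical names, and the real AVIOContext does considerably more (seeking, checksums, write mode, and so on).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int (*MiniReadFn)(void *opaque, uint8_t *buf, int buf_size);

/* Hypothetical, cut-down model of the read side of AVIOContext. */
typedef struct MiniIO {
    uint8_t *buffer;         /* start of the cache */
    int buffer_size;         /* capacity of the cache */
    uint8_t *buf_ptr;        /* next byte to hand out */
    uint8_t *buf_end;        /* one past the last valid byte; may be short of the cache end */
    void *opaque;            /* passed back to read_packet untouched */
    MiniReadFn read_packet;  /* refills the cache */
} MiniIO;

/* Toy data source used as opaque: a byte array consumed front to back. */
typedef struct ByteSource {
    const uint8_t *data;
    int size;
    int pos;
} ByteSource;

int byte_source_read(void *opaque, uint8_t *buf, int buf_size) {
    ByteSource *src = (ByteSource *)opaque;
    int left = src->size - src->pos;
    if (left <= 0) {
        return -1;  /* end of data */
    }
    int n = left < buf_size ? left : buf_size;
    memcpy(buf, src->data + src->pos, (size_t)n);
    src->pos += n;
    return n;
}

void mini_io_init(MiniIO *io, uint8_t *buffer, int buffer_size,
                  void *opaque, MiniReadFn read_packet) {
    assert(io != NULL && buffer != NULL && buffer_size > 0 && read_packet != NULL);
    io->buffer = buffer;
    io->buffer_size = buffer_size;
    io->buf_ptr = buffer;
    io->buf_end = buffer;  /* empty cache: the first read triggers a refill */
    io->opaque = opaque;
    io->read_packet = read_packet;
}

/* Returns the next byte (0..255), or -1 once the source is exhausted. */
int mini_io_read_byte(MiniIO *io) {
    if (io->buf_ptr >= io->buf_end) {  /* cache drained: ask the callback for more */
        int n = io->read_packet(io->opaque, io->buffer, io->buffer_size);
        if (n <= 0) {
            return -1;
        }
        io->buf_ptr = io->buffer;
        io->buf_end = io->buffer + n;  /* only n bytes are valid */
    }
    return *io->buf_ptr++;
}
```

The last refill in a run typically returns fewer bytes than buffer_size, which is the case where buf_end stops short of the end of the buffer.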

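That covers the common FFmpeg structures.

To sum up, this post introduced FFmpeg's common structures, including AVFormatContext, AVIOContext, AVStream, AVCodecContext, and AVCodec, and used a debugger to look at some of their field values.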
