Advanced Guide to Audio and Video Development (Part 2)

阿新 • Published: 2020-07-24

ffplay offers three ways to synchronize audio and video:

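Use the audio clock as the master time base (ffplay's default). In tests on Ubuntu 16 playback occasionally stuttered, but the result was better than with the two modes below.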

ffplay 32037.mp4 -sync audio

Use the video clock as the master time base. (Audio is rendered repeatedly, producing drawn-out sounds.)

ffplay 32037.mp4 -sync video

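Use an external clock as the master time base. (Playback occasionally stutters, and audio renders incorrectly with distorted pitch.)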

ffplay 32037.mp4 -sync ext

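First, note that every video frame or audio frame the player receives carries a timestamp (the PTS, or presentation timestamp) that marks when it should actually be presented. The alignment strategy is as follows: compare the current video playback time with the current audio playback time. If video is running ahead, slow it down by increasing the delay or by repeating a frame; if video is lagging, catch up to the audio position by reducing the delay or by dropping frames. The key lies in comparing the audio and video times and in computing the delay. The comparison uses a threshold: only when the difference exceeds the preset threshold is an adjustment made (dropping or repeating frames). That is the whole alignment strategy.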
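
The strategy above can be sketched as a small decision function. This is a simplified illustration of the idea, not ffplay's actual implementation; the names SyncAction, SyncDecision and decideSync, and the choice to extend the delay by the measured drift, are assumptions made for this example. All times are in seconds.

```cpp
// What to do with the next video frame (hypothetical names for this sketch).
enum class SyncAction { Present, Repeat, Drop };

struct SyncDecision {
    SyncAction action;
    double delay;  // seconds to wait before presenting the next video frame
};

// videoPts:      PTS of the video frame about to be shown
// audioClock:    current audio playback position (the master clock)
// frameDuration: nominal gap between two video frames
// threshold:     drift tolerated before any correction is applied
inline SyncDecision decideSync(double videoPts, double audioClock,
                               double frameDuration, double threshold) {
    const double diff = videoPts - audioClock;  // > 0 means video is ahead of audio
    if (diff > threshold) {
        // Video is too fast: hold the current frame longer (repeat it).
        return {SyncAction::Repeat, frameDuration + diff};
    }
    if (diff < -threshold) {
        // Video is too slow: skip this frame without waiting to catch up.
        return {SyncAction::Drop, 0.0};
    }
    // Within tolerance: present the frame at the normal pace.
    return {SyncAction::Present, frameDuration};
}
```

With a 40 ms frame duration and a 10 ms threshold, a frame whose PTS is 100 ms ahead of the audio clock is held back, while one that is 100 ms behind is dropped.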

3. Using ffmpeg commands to render, convert, merge, and split audio and video files: see p. 113

Let us first agree on the terminology:

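· Container/File: a multimedia file in a specific format, such as MP4, FLV, or MOV.
· Stream: a continuous run of data on the timeline, such as a segment of audio data, video data, or subtitle data. It may be compressed or uncompressed; compressed data must be associated with a specific codec.
· Frame/Packet: a media stream usually consists of a large number of frames. For compressed data, a frame corresponds to the codec's smallest processing unit, and frames belonging to different media streams are stored interleaved in the container.
· Codec: a codec converts between compressed data and raw data, working frame by frame.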

5. Using the FFmpeg API

5.1 What extern "C" means

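As an object-oriented language, C++ supports function overloading, whereas C, a procedural language, does not. The same function therefore ends up with a different symbol in the symbol table depending on whether it is compiled as C++ or as C. Take this function:
void decode(float position, float duration)
Compiled as C, its symbol is _decode; compiled as C++, a typical compiler generates something like _decode_float_float. Compilation itself succeeds either way, but at link time, without the extern "C" linkage specification the linker looks for the symbol _decode_float_float, while with extern "C" it looks for _decode. FFmpeg is written in C, so the symbols produced when building FFmpeg are all C-style symbols, and FFmpeg headers must therefore be wrapped in extern "C" when they are included from C++.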
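
A minimal sketch of the difference follows. The functions decode and decode_c are hypothetical names used only for illustration; the #ifdef __cplusplus guard is the usual pattern for a header that must work from both C and C++.

```cpp
#ifdef __cplusplus
extern "C" {
#endif

// Declared with C linkage: the symbol name carries no parameter types,
// so a C library and a C++ caller agree on it. It cannot be overloaded.
int decode_c(float position, float duration);

#ifdef __cplusplus
}
#endif

// C++ linkage: overloading works because each overload gets its own
// mangled symbol that encodes the parameter types.
inline int decode(float) { return 1; }
inline int decode(float, float) { return 2; }

// Definition matching the C-linkage declaration above.
extern "C" int decode_c(float position, float duration) {
    return position < duration ? 1 : 0;
}
```

For FFmpeg the same wrapper goes around the includes in a C++ source file, for example extern "C" { #include <libavformat/avformat.h> }.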

5.2 Registering protocols, formats, and codecs

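To use the FFmpeg API, first call FFmpeg's registration functions for protocols, formats, and codecs, so that all formats and codecs are registered with the FFmpeg framework. If network operations are needed, the network protocols should be initialized as well, so that the matching format can be looked up later. The code is as follows: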
avformat_network_init();
av_register_all();
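The documentation also lists avcodec_register_all(), which registers all codecs with the FFmpeg framework. av_register_all() already calls avcodec_register_all() internally, however, so calling av_register_all() alone is enough. Note that both registration functions were deprecated in FFmpeg 4.0, where explicit registration is no longer required.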

5.3 Opening the media source and setting a timeout callback

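With the formats and codecs registered, the next step is to open the media file. The file may be on the local disk or may be a link to a network media resource; network links involve different protocols, for example video sources served over RTMP or HTTP. The code for opening the media resource and setting the timeout callback is as follows: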
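AVFormatContext *formatCtx = avformat_alloc_context();
// interrupt_callback is polled during blocking I/O; a non-zero return aborts the operation
AVIOInterruptCB int_cb = {interrupt_callback, (__bridge void *)(self)};
formatCtx->interrupt_callback = int_cb;
if (avformat_open_input(&formatCtx, path, NULL, NULL) < 0) {  // takes AVFormatContext **
    // failed to open the source
}
if (avformat_find_stream_info(formatCtx, NULL) < 0) {
    // failed to read the stream information
}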
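
The interrupt_callback function itself is not shown above. One possible shape, assuming a simple deadline-based timeout, is sketched below without any FFmpeg dependency. TimeoutState and makeTimeout are hypothetical names for this example; the only part taken from FFmpeg is the callback signature int (*)(void *), where a non-zero return asks the blocking call to abort.

```cpp
#include <chrono>

// Hypothetical state object handed to the callback through the opaque pointer.
struct TimeoutState {
    std::chrono::steady_clock::time_point deadline;
};

// Same signature as the callback stored in AVIOInterruptCB.
// Returns 1 once the deadline has passed, which aborts the pending I/O.
extern "C" int interrupt_callback(void *opaque) {
    const TimeoutState *state = static_cast<const TimeoutState *>(opaque);
    return std::chrono::steady_clock::now() >= state->deadline ? 1 : 0;
}

// Builds a state that expires timeoutMs milliseconds from now.
inline TimeoutState makeTimeout(int timeoutMs) {
    return TimeoutState{std::chrono::steady_clock::now() +
                        std::chrono::milliseconds(timeoutMs)};
}
```

In this variant the opaque pointer stored in AVIOInterruptCB would be the address of a TimeoutState that outlives the format context, rather than the bridged self pointer used above.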
5.4 Finding the streams and opening the matching decoders
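The previous step opened the media file, which is like opening up an electrical cable: inside the cable there is a red wire and a blue wire. The streams in a media file are much the same, with the red wire standing for the audio stream and the blue wire for the video stream. In this step we locate each stream, find the decoder that matches it, and open that decoder.
Finding the audio and video streams: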
for (int i = 0; i < formatCtx->nb_streams; i++) {
    AVStream *stream = formatCtx->streams[i];
    if (AVMEDIA_TYPE_VIDEO == stream->codec->codec_type) {
        // video stream
        videoStreamIndex = i;
    } else if (AVMEDIA_TYPE_AUDIO == stream->codec->codec_type) {
        // audio stream
        audioStreamIndex = i;
    }
}
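Opening the audio stream decoder:
AVStream *audioStream = formatCtx->streams[audioStreamIndex];
AVCodecContext *audioCodecCtx = audioStream->codec;
AVCodec *codec = avcodec_find_decoder(audioCodecCtx->codec_id);
if (!codec) {
    // no matching audio decoder was found
}
int openCodecErrCode = 0;
if ((openCodecErrCode = avcodec_open2(audioCodecCtx, codec, NULL)) < 0) {
    // failed to open the audio decoder
}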
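Opening the video stream decoder:
AVStream *videoStream = formatCtx->streams[videoStreamIndex];
AVCodecContext *videoCodecCtx = videoStream->codec;
AVCodec *codec = avcodec_find_decoder(videoCodecCtx->codec_id);
if (!codec) {
    // no matching video decoder was found
}
int openCodecErrCode = 0;
if ((openCodecErrCode = avcodec_open2(videoCodecCtx, codec, NULL)) < 0) {
    // failed to open the video decoder
}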
5.5 Initializing the structures for decoded data
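Now that the audio and video decoder information is known, the next step is to allocate the memory that will hold the decoded data, along with the objects needed for format conversion.
Building the audio format conversion object and the object that holds decoded audio data: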
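SwrContext *swrContext = NULL;
if (audioCodecCtx->sample_fmt != AV_SAMPLE_FMT_S16) {
    // The decoder output is not in the sample format we need, so set up a resampler.
    // outputChannel is passed as the output channel layout; in_ch_layout, in_sample_fmt
    // and in_sample_rate describe the decoder's output.
    swrContext = swr_alloc_set_opts(NULL,
            outputChannel, AV_SAMPLE_FMT_S16, outSampleRate,
            in_ch_layout, in_sample_fmt, in_sample_rate, 0, NULL);
    if (!swrContext || swr_init(swrContext)) {
        if (swrContext) {
            swr_free(&swrContext);
        }
    }
}
// The frame that receives decoded audio is needed whether or not resampling is used.
audioFrame = avcodec_alloc_frame();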
Building the video format conversion object and the object that holds decoded video data:
AVPicture picture;
// Allocate a YUV420P picture that receives the converted output
bool pictureValid = avpicture_alloc(&picture,
        PIX_FMT_YUV420P,
        videoCodecCtx->width,
        videoCodecCtx->height) == 0;
if (!pictureValid) {
    // allocation failed
    return false;
}
// Conversion context: same dimensions, decoder pixel format to YUV420P
swsContext = sws_getCachedContext(swsContext,
        videoCodecCtx->width,
        videoCodecCtx->height,
        videoCodecCtx->pix_fmt,
        videoCodecCtx->width,
        videoCodecCtx->height,
        PIX_FMT_YUV420P,
        SWS_FAST_BILINEAR,
        NULL, NULL, NULL);
// Frame that receives the decoded video
videoFrame = avcodec_alloc_frame();
5.6 Reading the stream content and decoding it
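Once the decoders are open, we can read a portion of the stream data (compressed data), feed it to the decoder as input, and have the decoder turn it into raw data, which can then be written to a file: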
AVPacket packet;
int gotFrame = 0;
while (true) {
    if (av_read_frame(formatCtx, &packet) < 0) {
        // end of file (or read error)
        break;
    }
    int len = 0;
    int packetStreamIndex = packet.stream_index;
    if (packetStreamIndex == videoStreamIndex) {
        len = avcodec_decode_video2(videoCodecCtx, videoFrame,
                &gotFrame, &packet);
        if (len >= 0 && gotFrame) {
            self->handleVideoFrame();
        }
    } else if (packetStreamIndex == audioStreamIndex) {
        len = avcodec_decode_audio4(audioCodecCtx, audioFrame,
                &gotFrame, &packet);
        if (len >= 0 && gotFrame) {
            // handleAudioFrame is the assumed audio counterpart of handleVideoFrame
            self->handleAudioFrame();
        }
    }
    // Release the packet's buffer on every iteration, including the error path
    av_free_packet(&packet);
    if (len < 0) {
        // decoding error
        break;
    }
}
5.7 Processing the decoded raw data
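Decoding produces raw data: PCM for audio and YUV for video. Below, this data is converted into the format we need and written to a file.
Processing the raw audio data: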
void *audioData;
int numFrames;
if (swrContext) {
    // Size the output buffer for 16-bit samples (nb_samples * channels leaves headroom)
    int bufSize = av_samples_get_buffer_size(NULL, channels,
            (int)(audioFrame->nb_samples * channels),
            AV_SAMPLE_FMT_S16, 1);
    if (!swrBuffer || swrBufferSize < bufSize) {
        swrBufferSize = bufSize;
        swrBuffer = realloc(swrBuffer, swrBufferSize);
    }
    uint8_t *outbuf[2] = { (uint8_t *)swrBuffer, NULL };
    // swr_convert returns the number of samples written per channel
    numFrames = swr_convert(swrContext, outbuf,
            (int)(audioFrame->nb_samples * channels),
            (const uint8_t **)audioFrame->data,
            audioFrame->nb_samples);
    audioData = swrBuffer;
} else {
    // Already S16: use the decoded samples directly
    audioData = audioFrame->data[0];
    numFrames = audioFrame->nb_samples;
}
Once the raw audio data is available, it can be written straight to a file, for example audio.pcm.
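
As a rough sketch of that write step, assuming interleaved signed 16-bit samples (the AV_SAMPLE_FMT_S16 output used above): the number of bytes to write is the per-channel sample count times the channel count times two. The helper names pcmByteCount and appendPcm are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Bytes occupied by numFrames sample frames of interleaved 16-bit PCM.
inline std::size_t pcmByteCount(int numFrames, int channels) {
    return static_cast<std::size_t>(numFrames) * channels * sizeof(int16_t);
}

// Appends one decoded chunk to an already opened file (e.g. audio.pcm).
// Returns true only if every byte was written.
inline bool appendPcm(std::FILE *file, const void *audioData,
                      int numFrames, int channels) {
    const std::size_t bytes = pcmByteCount(numFrames, channels);
    return std::fwrite(audioData, 1, bytes, file) == bytes;
}
```

Here numFrames and audioData correspond to the values computed in the snippet above, and channels is the output channel count.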
Processing the raw video data:
// Tightly packed copies of the Y, Cb and Cr planes
uint8_t *luma;
uint8_t *chromaB;
uint8_t *chromaR;
if (videoCodecCtx->pix_fmt == AV_PIX_FMT_YUV420P ||
        videoCodecCtx->pix_fmt == AV_PIX_FMT_YUVJ420P) {
    // Already YUV420P: copy the planes straight from the decoded frame
    luma = copyFrameData(videoFrame->data[0],
            videoFrame->linesize[0],
            videoCodecCtx->width,
            videoCodecCtx->height);
    chromaB = copyFrameData(videoFrame->data[1],
            videoFrame->linesize[1],
            videoCodecCtx->width / 2,
            videoCodecCtx->height / 2);
    chromaR = copyFrameData(videoFrame->data[2],
            videoFrame->linesize[2],
            videoCodecCtx->width / 2,
            videoCodecCtx->height / 2);
} else {
    // Convert to YUV420P first, then copy the planes from the converted picture
    sws_scale(swsContext,
            (const uint8_t **)videoFrame->data,
            videoFrame->linesize,
            0,
            videoCodecCtx->height,
            picture.data,
            picture.linesize);
    luma = copyFrameData(picture.data[0],
            picture.linesize[0],
            videoCodecCtx->width,
            videoCodecCtx->height);
    chromaB = copyFrameData(picture.data[1],
            picture.linesize[1],
            videoCodecCtx->width / 2,
            videoCodecCtx->height / 2);
    chromaR = copyFrameData(picture.data[2],
            picture.linesize[2],
            videoCodecCtx->width / 2,
            videoCodecCtx->height / 2);
}
The YUV data can likewise be written straight to a file, for example video.yuv.
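
The copyFrameData helper is used above without being defined. A plausible version, given here as an assumption rather than the original implementation, copies a plane row by row so that the padding implied by linesize is dropped and the result is tightly packed:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Copies a plane whose rows start linesize bytes apart into a new buffer
// with exactly width bytes per row. The caller releases it with free().
inline uint8_t *copyFrameData(const uint8_t *src, int linesize,
                              int width, int height) {
    width = std::min(linesize, width);  // never read past a source row
    uint8_t *dst = static_cast<uint8_t *>(
            std::malloc(static_cast<std::size_t>(width) * height));
    if (!dst) {
        return nullptr;
    }
    for (int row = 0; row < height; ++row) {
        std::memcpy(dst + static_cast<std::size_t>(row) * width,
                    src + static_cast<std::size_t>(row) * linesize,
                    static_cast<std::size_t>(width));
    }
    return dst;
}
```

Writing the three packed planes to video.yuv in the order Y, Cb, Cr then yields a planar YUV420P file.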
5.8 Releasing all resources
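When decoding has finished, or when you no longer want to continue decoding partway through, the program can exit. On exit, every FFmpeg resource that was used, including the connection resources FFmpeg holds to the outside, must be released.
Releasing the audio resources: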
if (swrBuffer) {
    free(swrBuffer);
    swrBuffer = NULL;
    swrBufferSize = 0;
}
if (swrContext) {
    swr_free(&swrContext);
    swrContext = NULL;
}
if (audioFrame) {
    av_free(audioFrame);
    audioFrame = NULL;
}
if (audioCodecCtx) {
    avcodec_close(audioCodecCtx);
    audioCodecCtx = NULL;
}
Releasing the video resources:
if (swsContext) {
    sws_freeContext(swsContext);
    swsContext = NULL;
}
if (pictureValid) {
    avpicture_free(&picture);
    pictureValid = false;
}
if (videoFrame) {
    av_free(videoFrame);
    videoFrame = NULL;
}
if (videoCodecCtx) {
    avcodec_close(videoCodecCtx);
    videoCodecCtx = NULL;
}
Releasing the connection resources:
if (formatCtx) {
    avformat_close_input(&formatCtx);
    formatCtx = NULL;
}
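That is the complete process of decoding with FFmpeg: opening the file stream, parsing the format, parsing the streams and opening the decoders, decoding and processing the data, and finally releasing all resources.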
