Audio resampling with FFmpeg

阿新 • Published: 2019-02-10

1. Overview

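During audio playback, the audio stream sometimes does not match what the playback side requires, so properties of the sound such as the number of channels, the sample rate, and the sample storage format have to be converted before playback. This conversion is audio resampling. FFmpeg provides SwrContext (in libswresample) to perform it.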

typedef struct SwrContext SwrContext;

2. Basic concepts

2.1 Number of channels

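When sound is recorded, separate recording devices at different positions in space each capture their own signal; on playback, a matching number of loudspeakers reproduces them. Multiple channels are used to give a richer sense of presence. Ordinary stereo has 2 channels, and FFmpeg's AV_CH_LAYOUT_SURROUND layout has 3 (front left, front right, front center). Digital audio is a stream of samples, and for stereo each sampling instant produces two samples, one per channel. This is somewhat like the separate Y, U, and V components in video.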

2.2 Sample rate

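To process an analog signal on a computer, it must be converted to a digital signal by sampling it at a fixed rate; each sample value is one point on the sound waveform. Playback runs at that same sample rate. The higher the sample rate, the more continuous the sound, but because human hearing has limited resolution, beyond a certain rate the sound is already perceived as continuous. For example, 22050 Hz and 44100 Hz are the sample rates commonly used for radio-quality audio and CD audio respectively. The sample rate plays a role similar to the frame rate in video.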

2.3 Bit rate (bps or kbps)

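The bit rate is the amount of data needed per unit of time, measured in bits per second. It reflects how much information is spent describing the video or audio: the higher the bit rate, the more information it carries. In video, a higher image resolution makes each frame larger, so real-time decoding is more likely to starve for data, more transmission bandwidth is needed, and more storage is needed. In audio, a higher bit rate means each sample is described more accurately.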
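
For uncompressed PCM the bit rate follows directly from the other properties: sample rate times bits per sample times number of channels. The sketch below only illustrates that arithmetic; pcm_bit_rate is a hypothetical helper, not an FFmpeg function, and the formula does not apply to compressed formats such as AAC.

```c
#include <stdint.h>

/* Bit rate of uncompressed PCM in bits per second:
 * sample_rate (Hz) * bits_per_sample * channels.
 * Hypothetical helper for illustration only. */
static int64_t pcm_bit_rate(int sample_rate, int bits_per_sample, int channels)
{
    return (int64_t)sample_rate * bits_per_sample * channels;
}
```

For CD-style audio, pcm_bit_rate(44100, 16, 2) gives 1411200 bps, about 1411 kbps.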

2.4 Frame

In video, a frame is one sampled picture. In audio, a frame usually contains several samples; an AAC frame, for example, contains 1024 samples.
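
The frame size and the sample rate together determine how much playback time one audio frame covers. A minimal sketch, assuming the sample count is per channel; frame_duration_sec is a hypothetical helper name.

```c
/* Playback time in seconds covered by one audio frame that holds
 * nb_samples samples per channel at the given sample rate (Hz).
 * Hypothetical helper for illustration only. */
static double frame_duration_sec(int nb_samples, int sample_rate)
{
    return (double)nb_samples / sample_rate;
}
```

With 1024 samples at 44100 Hz this comes to roughly 23.2 ms per frame.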

2.5 Sample format

The sample format is how a single audio sample is stored. The sample formats in FFmpeg are:

enum AVSampleFormat {
    AV_SAMPLE_FMT_NONE = -1,
    AV_SAMPLE_FMT_U8,          ///< unsigned 8 bits
    AV_SAMPLE_FMT_S16,         ///< signed 16 bits
    AV_SAMPLE_FMT_S32,         ///< signed 32 bits
    AV_SAMPLE_FMT_FLT,         ///< float
    AV_SAMPLE_FMT_DBL,         ///< double
 
    AV_SAMPLE_FMT_U8P,         ///< unsigned 8 bits, planar
    AV_SAMPLE_FMT_S16P,        ///< signed 16 bits, planar
    AV_SAMPLE_FMT_S32P,        ///< signed 32 bits, planar
    AV_SAMPLE_FMT_FLTP,        ///< float, planar
    AV_SAMPLE_FMT_DBLP,        ///< double, planar
 
    AV_SAMPLE_FMT_NB           ///< Number of sample formats. DO NOT USE if linking dynamically
};

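Take AV_SAMPLE_FMT_S16 and AV_SAMPLE_FMT_S16P as an example. AV_SAMPLE_FMT_S16 stores each sample as a signed 16-bit value in interleaved (packed) order, while AV_SAMPLE_FMT_S16P stores each sample as a signed 16-bit value in planar order. Suppose there are two channels, with channel 1 carrying c1 c1 c1 c1 ... and channel 2 carrying c2 c2 c2 c2 ...

Planar storage: c1 c1 c1 c1 ... c2 c2 c2 c2 ...

Interleaved storage: c1 c2 c1 c2 c1 c2 ...

In an AVFrame, planar audio keeps each channel in its own plane (data[0], data[1], and so on), while interleaved audio keeps all samples in data[0]. For audio only linesize[0] is set: it is the size in bytes of one plane for planar formats (all planes have the same size) and of the whole buffer for interleaved formats.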
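
The difference between the two layouts can be made concrete by converting one into the other. The following is a sketch in plain C that does not use any FFmpeg API; interleaved_to_planar_s16 is a hypothetical helper, and the caller is assumed to provide one output buffer per channel.

```c
#include <stddef.h>
#include <stdint.h>

/* Split interleaved (packed) signed 16-bit samples into one plane per channel.
 * src holds nb_samples * nb_channels values in the order c1 c2 c1 c2 ...
 * planes[ch] must have room for nb_samples values.
 * Hypothetical helper for illustration only. */
static void interleaved_to_planar_s16(const int16_t *src, int16_t **planes,
                                      int nb_samples, int nb_channels)
{
    for (int i = 0; i < nb_samples; i++) {
        for (int ch = 0; ch < nb_channels; ch++) {
            /* sample i of channel ch sits at offset i * nb_channels + ch */
            planes[ch][i] = src[(size_t)i * nb_channels + ch];
        }
    }
}
```

In real code libswresample does this conversion when the input and output sample formats differ only in packed versus planar layout.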

3. FFmpeg example

The example is doc/examples/resampling_audio.c from the ffmpeg 2.4 source tree, with some modifications to make it easier to study.

The listings need the following headers (with MSVC, _USE_MATH_DEFINES must be defined before including math.h for M_PI to be available):

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <libavutil/opt.h>
#include <libavutil/channel_layout.h>
#include <libavutil/samplefmt.h>
#include <libswresample/swresample.h>

3.1 Returning the format string for a sample format

 static int get_format_from_sample_fmt(const char **fmt,
	enum AVSampleFormat sample_fmt)
{
	int i;
	struct sample_fmt_entry 
	{
		enum AVSampleFormat sample_fmt; const char *fmt_be, *fmt_le;
	} sample_fmt_entries[] = 
	{
		{ AV_SAMPLE_FMT_U8,  "u8",    "u8"    },
		{ AV_SAMPLE_FMT_S16, "s16be", "s16le" },
		{ AV_SAMPLE_FMT_S32, "s32be", "s32le" },
		{ AV_SAMPLE_FMT_FLT, "f32be", "f32le" },
		{ AV_SAMPLE_FMT_DBL, "f64be", "f64le" },
	};
	*fmt = NULL;
 
	for (i = 0; i < FF_ARRAY_ELEMS(sample_fmt_entries); i++) 
	{
		struct sample_fmt_entry *entry = &sample_fmt_entries[i];
		if (sample_fmt == entry->sample_fmt) 
		{
			*fmt = AV_NE(entry->fmt_be, entry->fmt_le);
			return 0;
		}
	}
 
	fprintf(stderr,
		"Sample format %s not supported as output format\n",
		av_get_sample_fmt_name(sample_fmt));
	return AVERROR(EINVAL);
}
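
In the listing, AV_NE(be, le) is a compile-time macro that selects the big-endian or little-endian name to match the host byte order, so the name given to ffplay matches the raw bytes written to the file. The sketch below shows the same selection done at run time in plain C; host_is_big_endian and native_fmt_name are hypothetical helpers, not part of FFmpeg.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host, 0 on a little-endian host,
 * by inspecting the first byte of a known 16-bit value. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Run-time counterpart of the selection in the listing: pick the
 * big-endian or little-endian format name for the current host. */
static const char *native_fmt_name(const char *fmt_be, const char *fmt_le)
{
    return host_is_big_endian() ? fmt_be : fmt_le;
}
```

On a typical x86 machine native_fmt_name("s16be", "s16le") selects the little-endian name.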

3.2 Generating the sound

/**
* Fill dst buffer with nb_samples, generated starting from t.
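* This acts as the sound source: it produces a sine wave.
* dst receives the sound data for the caller; nb_samples is the number of samples to generate per channel;
* nb_channels is the number of channels, i.e. how many values are written per sampling instant;
* t is the start time and is advanced as samples are written.
* A sine wave is a simple sound source; in practice, complex sounds are built by superposing waveforms.
* Sampling starts at time t at sample_rate, from a 440 Hz waveform; every channel gets the same data.
*/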
static void fill_samples(double *dst, int nb_samples, int nb_channels, int sample_rate, double *t)
{
	int i, j;
	// sampling interval tincr
	double tincr = 1.0 / sample_rate, *dstp = dst;
	// for a sine wave y = A*sin(ωx + φ) + h the period is T = 2π/|ω|, so ω = 2π * 440 gives a frequency of 440 Hz
	const double c = 2 * M_PI * 440.0;
 
	/* generate sin tone with 440Hz frequency and duplicated channels */
	// fill every channel, using interleaved storage
	for (i = 0; i < nb_samples; i++) 
	{
		*dstp = sin(c * *t);
		for (j = 1; j < nb_channels; j++)
		{
			dstp[j] = dstp[0];
		}
		dstp += nb_channels;
		*t += tincr;
	}
}
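
The generator above writes doubles in the range [-1, 1], while the main function below requests AV_SAMPLE_FMT_S16 output. libswresample performs that conversion internally. The following is only a sketch of one common convention (scale by 32768, round to nearest, clip to the 16-bit range) and is not claimed to match the library bit for bit; double_to_s16 is a hypothetical helper.

```c
#include <stdint.h>

/* Convert a sample in [-1.0, 1.0] to signed 16-bit.
 * Assumed convention: scale by 32768, round half away from zero,
 * then clip to [-32768, 32767]. Hypothetical helper for illustration. */
static int16_t double_to_s16(double x)
{
    double scaled = x * 32768.0;
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;   /* round to nearest */
    if (scaled > 32767.0)
        scaled = 32767.0;                     /* clip positive overflow */
    if (scaled < -32768.0)
        scaled = -32768.0;                    /* clip negative overflow */
    return (int16_t)scaled;                   /* truncation finishes the rounding */
}
```

The clipping step matters because +1.0 scales to 32768, which does not fit in a signed 16-bit value.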

3.3 The main function

int main(int argc, char **argv)
{
	// AV_CH_LAYOUT_STEREO: stereo channel layout; AV_CH_LAYOUT_SURROUND: surround channel layout
	int64_t src_ch_layout = AV_CH_LAYOUT_STEREO, dst_ch_layout = AV_CH_LAYOUT_SURROUND;
	// sample rates
	int src_rate = 48000, dst_rate = 44100;
	uint8_t **src_data = NULL, **dst_data = NULL;
	int src_nb_channels = 0, dst_nb_channels = 0;
	int src_linesize, dst_linesize;
	// number of samples per channel generated on each pass
	int src_nb_samples = 1024, dst_nb_samples, max_dst_nb_samples;
	// sample storage formats
	enum AVSampleFormat src_sample_fmt = AV_SAMPLE_FMT_DBL, dst_sample_fmt = AV_SAMPLE_FMT_S16;
	const char *dst_filename = NULL;
	FILE *dst_file;
	int dst_bufsize;
	const char *fmt;
	// resampling context
	struct SwrContext *swr_ctx;
	double t;
	int ret;
 
	if (argc != 2) 
	{
		fprintf(stderr, "Usage: %s output_file\n"
			"API example program to show how to resample an audio stream with libswresample.\n"
			"This program generates a series of audio frames, resamples them to a specified "
			"output format and rate and saves them to an output file named output_file.\n",
			argv[0]);
		exit(1);
	}
	dst_filename = argv[1];
 
	dst_file = fopen(dst_filename, "wb");
	if (!dst_file) 
	{
		fprintf(stderr, "Could not open destination file %s\n", dst_filename);
		exit(1);
	}
 
	/* create resampler context */
	// allocate the resampling context
	swr_ctx = swr_alloc();
	if (!swr_ctx) 
	{
		fprintf(stderr, "Could not allocate resampler context\n");
		ret = AVERROR(ENOMEM);
		goto end;
	}
 
	/* set options */
	// source channel layout
	av_opt_set_int(swr_ctx, "in_channel_layout",    src_ch_layout, 0);
	// source sample rate
	av_opt_set_int(swr_ctx, "in_sample_rate",       src_rate, 0);
	// source sample format
	av_opt_set_sample_fmt(swr_ctx, "in_sample_fmt", src_sample_fmt, 0);
 
	// destination channel layout
	av_opt_set_int(swr_ctx, "out_channel_layout",    dst_ch_layout, 0);
	// destination sample rate
	av_opt_set_int(swr_ctx, "out_sample_rate",       dst_rate, 0);
	// destination sample format
	av_opt_set_sample_fmt(swr_ctx, "out_sample_fmt", dst_sample_fmt, 0);
 
	/* initialize the resampling context */
	if ((ret = swr_init(swr_ctx)) < 0) 
	{
		fprintf(stderr, "Failed to initialize the resampling context\n");
		goto end;
	}
 
	/* allocate source and destination samples buffers */
	// number of source channels
	src_nb_channels = av_get_channel_layout_nb_channels(src_ch_layout);
	// allocate the source buffer; for this packed format src_linesize = src_nb_channels * src_nb_samples * sizeof(double)
	ret = av_samples_alloc_array_and_samples(&src_data, &src_linesize, src_nb_channels,
		src_nb_samples, src_sample_fmt, 0);
	if (ret < 0) 
	{
		fprintf(stderr, "Could not allocate source samples\n");
		goto end;
	}
 
	/* compute the number of converted samples: buffering is avoided
	* ensuring that the output buffer will contain at least all the
	* converted input samples */
	// compute the destination sample count: it differs from the source count, but both cover the same duration, so
	// src_nb_samples / src_rate = dst_nb_samples / dst_rate
	max_dst_nb_samples = dst_nb_samples = av_rescale_rnd(src_nb_samples, dst_rate, src_rate, AV_ROUND_UP);
 
	/* buffer is going to be directly written to a rawaudio file, no alignment */
	dst_nb_channels = av_get_channel_layout_nb_channels(dst_ch_layout);
	ret = av_samples_alloc_array_and_samples(&dst_data, &dst_linesize, dst_nb_channels,
		dst_nb_samples, dst_sample_fmt, 0);
	if (ret < 0) 
	{
		fprintf(stderr, "Could not allocate destination samples\n");
		goto end;
	}
 
	t = 0;
	do {
		/* generate synthetic audio */
		fill_samples((double *)src_data[0], src_nb_samples, src_nb_channels, src_rate, &t);
 
		/* compute destination number of samples */
	// swr_get_delay(swr_ctx, src_rate): delay buffered inside the resampler, expressed in samples at the source rate
		dst_nb_samples = av_rescale_rnd(swr_get_delay(swr_ctx, src_rate) +
			src_nb_samples, dst_rate, src_rate, AV_ROUND_UP);
		if (dst_nb_samples > max_dst_nb_samples) 
		{
			av_freep(&dst_data[0]);
			ret = av_samples_alloc(dst_data, &dst_linesize, dst_nb_channels,
				dst_nb_samples, dst_sample_fmt, 1);
			if (ret < 0)
				break;
			max_dst_nb_samples = dst_nb_samples;
		}
 
		/* convert to destination format */
	// ret is the number of samples per channel actually produced
		ret = swr_convert(swr_ctx, dst_data, dst_nb_samples, (const uint8_t **)src_data, src_nb_samples);
		if (ret < 0) 
		{
			fprintf(stderr, "Error while converting\n");
			goto end;
		}
		dst_bufsize = av_samples_get_buffer_size(&dst_linesize, dst_nb_channels,
			ret, dst_sample_fmt, 1);
		if (dst_bufsize < 0) 
		{
			fprintf(stderr, "Could not get sample buffer size\n");
			goto end;
		}
		printf("t:%f in:%d out:%d\n", t, src_nb_samples, ret);
		fwrite(dst_data[0], 1, dst_bufsize, dst_file);
	} while (t < 10);
 
	if ((ret = get_format_from_sample_fmt(&fmt, dst_sample_fmt)) < 0)
		goto end;
 
	fprintf(stderr, "Resampling succeeded. Play the output file with the command:\n"
		"ffplay -f %s -channel_layout %lld -channels %d -ar %d %s\n",
		fmt, (long long)dst_ch_layout, dst_nb_channels, dst_rate, dst_filename);
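	// keep the console window open until Enter is pressed, then fall through
	// to the cleanup below so that the output file is closed and flushed
	getchar();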
 
end:
	fclose(dst_file);
 
	if (src_data)
		av_freep(&src_data[0]);
	av_freep(&src_data);
 
	if (dst_data)
		av_freep(&dst_data[0]);
	av_freep(&dst_data);
 
	swr_free(&swr_ctx);
	return ret < 0;
}
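
The destination sample count in main comes from the equal-duration relation src_nb_samples / src_rate = dst_nb_samples / dst_rate, rounded up so the output buffer is never too small. The sketch below reproduces that arithmetic in plain C. rescale_samples_up is a hypothetical stand-in for av_rescale_rnd(..., AV_ROUND_UP) and, unlike the library function, makes no attempt to guard against overflow for very large inputs.

```c
#include <assert.h>
#include <stdint.h>

/* ceil(nb_samples * dst_rate / src_rate) in integer arithmetic.
 * Simplified, hypothetical stand-in for av_rescale_rnd with AV_ROUND_UP;
 * no overflow protection. */
static int64_t rescale_samples_up(int64_t nb_samples, int dst_rate, int src_rate)
{
    assert(nb_samples >= 0 && dst_rate > 0 && src_rate > 0);
    return (nb_samples * dst_rate + src_rate - 1) / src_rate;
}
```

With the rates used in the example, 1024 source samples at 48000 Hz correspond to 940.8 samples at 44100 Hz, so rescale_samples_up(1024, 44100, 48000) yields 941. Inside the loop the example adds the resampler's buffered delay to the source count before rescaling, which is why the buffer may need to grow.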

Build environment: Win7 32-bit + VS2010

FFmpeg version: ffmpeg-2.4
