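Audio Encoding with Audio Converter
Requirements
On iOS, encode captured raw audio data (PCM) into a compressed format (AAC, ...).
This example captures PCM data with Audio Unit, compresses it to AAC, and saves it in the sandbox as a recording. The encoded audio format, sample rate, encoder type, and other parameters are adjustable.
How it works
Audio Converter, part of the Audio Toolbox framework, can encode audio data, that is, convert PCM data into a compressed format.
Prerequisites:
GitHub (with code): Audio Encoding
Jianshu: Audio Encoding
Juejin: Audio Encoding
Blog: Audio Encoding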
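1. Initialization
1.1. Initializing the encoder
Initialize the encoder instance by specifying the source data format, the destination format after encoding, and whether to use hardware or software encoding. The steps are as follows.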
- (instancetype)initWithSourceFormat:(AudioStreamBasicDescription)sourceFormat destFormatID:(AudioFormatID)destFormatID sampleRate:(float)sampleRate isUseHardwareEncode:(BOOL)isUseHardwareEncode {
    if (self = [super init]) {
        mSourceFormat   = sourceFormat;
        mAudioConverter = [self configureEncoderBySourceFormat:sourceFormat
                                                    destFormat:&mDestinationFormat
                                                  destFormatID:destFormatID
                                                    sampleRate:sampleRate
                                           isUseHardwareEncode:isUseHardwareEncode];
    }
    return self;
}
1.2. Configuring the destination ASBD (audio stream description)
AudioStreamBasicDescription destinationFormat = {};
destinationFormat.mSampleRate = sampleRate;
if (destFormatID == kAudioFormatLinearPCM) {
    // The encoder must output a compressed format, so PCM is rejected as a destination.
    NSLog(@"Linear PCM is not a valid destination format for encoding!");
    return NULL;
} else {
    destinationFormat.mFormatID = destFormatID;
    // For iLBC, the number of channels must be 1.
    destinationFormat.mChannelsPerFrame = (destFormatID == kAudioFormatiLBC ? 1 : sourceFormat.mChannelsPerFrame);
    // Use the AudioFormat API to fill out the rest of the description.
    // `size` is assumed to be a UInt32 declared earlier in this method (not shown).
    size = sizeof(destinationFormat);
    // AudioFormatGetProperty takes (propertyID, specifierSize, specifier, &dataSize, data).
    if (![self checkError:AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &destinationFormat) withErrorString:@"AudioFormatGetProperty couldn't fill out the destination data format"]) {
        return NULL;
    }
}
memcpy(destFormat, &destinationFormat, sizeof(AudioStreamBasicDescription));
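Encoding audio simply means converting PCM into a compressed format such as AAC (a VBR format). The kAudioFormatProperty_FormatInfo property fills in the remaining parameters of the specified audio format automatically.
Note: if the audio format is iLBC, the channel count must be 1.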
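To make the relationship between the PCM source and the compressed destination concrete, here is a small plain-C sketch. It assumes AAC-LC packets of 1024 frames each (the value the format info normally reports for AAC, but treat it as an assumption here) and 16-bit interleaved PCM input. The function names are hypothetical and not part of the article's code.

```c
#include <stdint.h>

/* Assumption: each AAC packet encodes 1024 PCM frames. */
enum { kAssumedAACFramesPerPacket = 1024 };

/* Bytes of interleaved PCM the encoder consumes to emit one AAC packet. */
static uint32_t pcm_bytes_per_aac_packet(uint32_t channels, uint32_t bitsPerChannel) {
    uint32_t bytesPerFrame = channels * (bitsPerChannel / 8);
    return kAssumedAACFramesPerPacket * bytesPerFrame;
}

/* Duration of one AAC packet in milliseconds at the given sample rate. */
static double aac_packet_duration_ms(double sampleRate) {
    return 1000.0 * kAssumedAACFramesPerPacket / sampleRate;
}
```

Under these assumptions, 16-bit stereo input needs 4096 bytes of PCM per AAC packet, which is useful when choosing how much captured data to hand to the encoder at a time.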
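1.3. Choosing the encoder type
The AudioClassDescription structure describes the audio encoder the system should use. Its most important role is to select hardware or software encoding. The number of encoders, that is, the length of the array, is determined by the current channel count.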
// Encoder count is determined by the number of channels.
AudioClassDescription requestedCodecs[destinationFormat.mChannelsPerFrame];
const OSType subtype = destFormatID;
for (int i = 0; i < destinationFormat.mChannelsPerFrame; i++) {
    AudioClassDescription codec = {
        kAudioEncoderComponentType,   // mType: an encoder component
        subtype,                      // mSubType: the destination format ID
        isUseHardwareEncode ? kAppleHardwareAudioCodecManufacturer : kAppleSoftwareAudioCodecManufacturer,
    };
    requestedCodecs[i] = codec;
}
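Note: hardware encoding uses the device's dedicated codec hardware to encode efficiently and reduce CPU load. Software encoding is the traditional approach of doing the work on the CPU.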
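1.4. Creating the encoder
AudioConverterNewSpecific: creates an audio converter instance using the specified encoders. The third and fourth parameters are the number of encoder descriptions and the descriptions themselves. As above, the count matches the channel count.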
// Create the AudioConverterRef.
AudioConverterRef converter = NULL;
// Arguments: source format, destination format, codec count, codec descriptions, out converter.
if (![self checkError:AudioConverterNewSpecific(&sourceFormat, &destinationFormat, destinationFormat.mChannelsPerFrame, requestedCodecs, &converter) withErrorString:@"AudioConverterNewSpecific failed"]) {
    return NULL;
} else {
    printf("Audio converter created successfully\n");
}
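The snippets in this article call a `checkError:withErrorString:` helper whose body is not shown. Many Core Audio status codes are four-character codes packed into a 32-bit integer, so a helper of this kind usually prints them as text when possible. The plain-C sketch below is only an assumption about how that formatting could be done, not the author's implementation. `format_status` is a hypothetical name.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: render a 32-bit status code as a quoted four-character
 * code when all four bytes are printable, otherwise as a signed decimal. */
static void format_status(int32_t status, char *out, size_t outSize) {
    uint32_t u = (uint32_t)status;
    unsigned char c[4] = {
        (unsigned char)(u >> 24), (unsigned char)(u >> 16),
        (unsigned char)(u >> 8),  (unsigned char)u
    };
    if (isprint(c[0]) && isprint(c[1]) && isprint(c[2]) && isprint(c[3])) {
        snprintf(out, outSize, "'%c%c%c%c'", c[0], c[1], c[2], c[3]);
    } else {
        snprintf(out, outSize, "%d", (int)status);
    }
}
```

A wrapper could call this whenever the status is not `noErr` and log the formatted code next to the caller's message, which makes failures such as a rejected format combination much easier to look up.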
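1.5. Setting the bit rate
The bit rate can be set manually. Unless you have special requirements, the suggested values below, chosen according to the sample rate, are usually fine.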
/*
 If encoding to AAC, set the bit rate. kAudioConverterEncodeBitRate is a UInt32 value containing
 the number of bits per second to aim for when encoding data. When you explicitly set the bit rate
 and the sample rate, this tells the encoder to stick with both, but some combinations (also
 depending on the number of channels) are not allowed. If you do not explicitly set a bit rate,
 the encoder picks a suitable value for you depending on the sample rate and number of channels.
 The bit rate also scales with the number of channels, so one bit rate per sample rate can be used
 for mono, and for stereo or more you can multiply that number by the number of channels.
 */
if (destinationFormat.mFormatID == kAudioFormatMPEG4AAC) {
    UInt32 outputBitRate = 64000;
    UInt32 propSize = sizeof(outputBitRate);
    if (destinationFormat.mSampleRate >= 44100) {
        outputBitRate = 192000;
    } else if (destinationFormat.mSampleRate < 22000) {
        outputBitRate = 32000;
    }
    outputBitRate *= destinationFormat.mChannelsPerFrame;
    // Set the bit rate depending on the sample rate chosen.
    if (![self checkError:AudioConverterSetProperty(converter, kAudioConverterEncodeBitRate, propSize, &outputBitRate) withErrorString:@"AudioConverterSetProperty kAudioConverterEncodeBitRate failed!"]) {
        return NULL;
    }
    // Get it back and print it out.
    AudioConverterGetProperty(converter, kAudioConverterEncodeBitRate, &propSize, &outputBitRate);
    printf("AAC Encode Bitrate: %u\n", (unsigned int)outputBitRate);
}
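The bit-rate choice above is plain arithmetic, so it can be pulled into a pure function and checked on its own. The sketch below mirrors the thresholds used in the snippet. The function name `suggested_aac_bitrate` is hypothetical, and the values are the article's suggestions rather than limits imposed by the encoder.

```c
#include <stdint.h>

/* Mirrors the thresholds in the snippet above: pick a per-channel rate from
 * the sample rate, then scale it by the channel count. */
static uint32_t suggested_aac_bitrate(double sampleRate, uint32_t channels) {
    uint32_t perChannel = 64000;
    if (sampleRate >= 44100.0) {
        perChannel = 192000;
    } else if (sampleRate < 22000.0) {
        perChannel = 32000;
    }
    return perChannel * channels;
}
```

For example, 44.1 kHz stereo gives 384000 bits per second and 16 kHz mono gives 32000. Whether the encoder accepts a given combination still has to be checked through the result of `AudioConverterSetProperty`.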
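1.6. Checking whether the converter can resume after an interruption
kAudioConverterPropertyCanResumeFromInterruption: indicates whether the converter can resume after an interruption.
If the property is not implemented, or getting it returns an error, the current codec is not a hardware codec. If the query returns 1, the encoder can resume after an interruption; otherwise it cannot.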
/*
 Can the Audio Converter resume after an interruption?
 This property may be queried at any time after construction of the Audio Converter, after setting its output format.
 There's no clear reason to prefer construction time, interruption time, or potential resumption time, but we prefer
 construction time since it means less code to execute during or after interruption time.
 */
BOOL canResumeFromInterruption = YES;
UInt32 canResume = 0;
size = sizeof(canResume);
OSStatus error = AudioConverterGetProperty(converter, kAudioConverterPropertyCanResumeFromInterruption, &size, &canResume);
if (error == noErr) {
    /*
     We received a valid return value from the GetProperty call.
     If the property's value is 1, then the codec CAN resume work following an interruption.
     If the property's value is 0, then interruptions destroy the codec's state and we're done.
     */
    if (canResume == 0) {
        canResumeFromInterruption = NO;
    }
    printf("Audio Converter %s continue after interruption!\n", (!canResumeFromInterruption ? "CANNOT" : "CAN"));
} else {
    /*
     If the property is unimplemented (kAudioConverterErr_PropertyNotSupported, or paramErr returned in the case of PCM),
     then the codec being used is not a hardware codec, so we're not concerned about codec state.
     We are always going to be able to resume conversion after an interruption.
     */
    if (error == kAudioConverterErr_PropertyNotSupported) {
        printf("kAudioConverterPropertyCanResumeFromInterruption property not supported - see comments in source for more info.\n");
    } else {
        printf("AudioConverterGetProperty kAudioConverterPropertyCanResumeFromInterruption result %d, paramErr is OK if PCM\n", (int)error);
    }
    error = noErr;
}
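The branching above reduces to a small decision rule, which the following plain-C sketch restates without any Apple headers. The names are hypothetical and the constant is a local stand-in for `noErr`. It only reproduces the logic of the snippet: a failed query is treated as a software codec that can always resume.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for noErr so the sketch builds without Apple headers. */
enum { kSketchNoErr = 0 };

/* queryStatus: result of querying the can-resume property.
 * canResumeValue: the value returned by the query (only valid on success). */
static bool can_resume_after_interruption(int32_t queryStatus, uint32_t canResumeValue) {
    if (queryStatus != kSketchNoErr) {
        /* Property not implemented: treated as a software codec, always resumable. */
        return true;
    }
    return canResumeValue != 0;
}
```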
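2. Encoding
2.1. Estimating the output size
kAudioConverterPropertyMaximumOutputPacketSize: queries the maximum size of an encoded output packet. It is commonly used to estimate the upper bound of the encoded data and to allocate the output buffer accordingly.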
UInt32 outputSizePerPacket = destFormat.mBytesPerPacket;
if (outputSizePerPacket == 0) {
    // If the destination format is VBR, we need to get the max size per packet from the converter.
    UInt32 size = sizeof(outputSizePerPacket);
    if (![self checkError:AudioConverterGetProperty(audioConverter, kAudioConverterPropertyMaximumOutputPacketSize, &size, &outputSizePerPacket) withErrorString:@"AudioConverterGetProperty kAudioConverterPropertyMaximumOutputPacketSize failed!"]) {
        return;
    }
}
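2.2. Preallocating memory for the encoded audio data
The buffer list can be allocated using the maximum size computed in 2.1, or using the size of the raw audio data. The exact size of the encoded data cannot be known in advance, so either the estimated maximum or the raw data size works: the encoder writes the actual number of valid bytes into the buffer list when it finishes.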
// Set up output buffer list.
AudioBufferList fillBufferList = {};
fillBufferList.mNumberBuffers = 1;
fillBufferList.mBuffers[0].mNumberChannels = destFormat.mChannelsPerFrame;
fillBufferList.mBuffers[0].mDataByteSize = theOutputBufferSize;
fillBufferList.mBuffers[0].mData = malloc(theOutputBufferSize * sizeof(char));
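Sections 2.1 and 2.2 together amount to two small calculations: which per-packet size to use, and how many packets a given output buffer can hold. A plain-C sketch of both follows. The function names are hypothetical, and the converter's reported maximum is passed in as an ordinary number instead of being queried.

```c
#include <stdint.h>

/* Per-packet output size: the fixed size for constant-size formats, otherwise
 * the maximum reported by the converter (passed in here as a plain number). */
static uint32_t output_size_per_packet(uint32_t bytesPerPacket, uint32_t maxOutputPacketSize) {
    return bytesPerPacket != 0 ? bytesPerPacket : maxOutputPacketSize;
}

/* How many whole packets fit into an output buffer of the given size. */
static uint32_t packets_that_fit(uint32_t bufferSize, uint32_t sizePerPacket) {
    return sizePerPacket == 0 ? 0 : bufferSize / sizePerPacket;
}
```

If `packets_that_fit` returns 0 for the chosen buffer, the buffer is too small to hold even one packet and should be enlarged before encoding.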
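2.3. Encoding the audio data
AudioConverterFillComplexBuffer: encodes the audio data. It requires a callback (a C function), which is passed as the second parameter. The callback's main job is to supply the data to be encoded: we assign the raw audio data to the callback's ioData parameter, which is our last chance to touch the raw data before it is encoded. Once the callback has run and the call returns, encoding is complete and the new data has been written to the fifth parameter, the fillBufferList defined above.
- userInfo: a custom structure used to pass data to the encoding callback. Here it carries the raw audio data information.
- ioOutputDataPackets: on input, the capacity of the output buffer expressed in packets of the output format; on return, the number of converted packets actually written.
- outputPacketDescriptions: if non-NULL, receives the packet descriptions of the converter's output. It must be allocated in advance so the converter can fill it in.
Finally, the resulting AAC data is handed to the caller through a completion callback.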
OSStatus EncodeConverterComplexInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData) {
    // Hand the raw PCM data carried in the user info over to the converter.
    XDXConverterInfoType *info = (XDXConverterInfoType *)inUserData;
    ioData->mNumberBuffers              = 1;
    ioData->mBuffers[0].mData           = info->sourceBuffer;
    ioData->mBuffers[0].mNumberChannels = info->sourceChannelsPerFrame;
    ioData->mBuffers[0].mDataByteSize   = info->sourceDataSize;
    return noErr;
}
- (void)encodeFormatByConverter:(AudioConverterRef)audioConverter sourceBuffer:(void *)sourceBuffer sourceBufferSize:(UInt32)sourceBufferSize sourceFormat:(AudioStreamBasicDescription)sourceFormat dest:(AudioStreamBasicDescription)destFormat completeHandler:(void(^)(AudioBufferList *destBufferList,UInt32 outputPackets,AudioStreamPacketDescription *outputPacketDescriptions))completeHandler {
    ...
    XDXConverterInfoType userInfo   = {0};
    userInfo.sourceBuffer           = sourceBuffer;
    userInfo.sourceDataSize         = sourceBufferSize;
    userInfo.sourceChannelsPerFrame = sourceFormat.mChannelsPerFrame;
    UInt32 numberOutputPackets = 1;
    UInt32 theOutputBufferSize = sourceBufferSize;
    UInt32 ioOutputDataPackets = numberOutputPackets;
    AudioStreamPacketDescription outputPacketDescriptions;
    // Convert data
    OSStatus status = AudioConverterFillComplexBuffer(audioConverter, EncodeConverterComplexInputDataProc, &userInfo, &ioOutputDataPackets, &fillBufferList, &outputPacketDescriptions);
    // If interrupted in the process of the conversion call, we must handle the error appropriately.
    if (status != noErr) {
        if (status == kAudioConverterErr_HardwareInUse) {
            printf("Audio Converter returned kAudioConverterErr_HardwareInUse!\n");
        } else {
            if (![self checkError:status withErrorString:@"AudioConverterFillComplexBuffer error!"]) {
                return;
            }
        }
    } else {
        if (ioOutputDataPackets == 0) {
            // This is the EOF condition.
            status = noErr;
        }
        completeHandler(&fillBufferList, ioOutputDataPackets, &outputPacketDescriptions);
    }
}
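The converter works on a pull model: it asks the input callback for data whenever it needs more, and it may do so more than once per call. The mock below is not the AudioToolbox API. It is a plain-C sketch with hypothetical names (`SketchInput`, `sketch_input_proc`, `sketch_pull`) that illustrates one way a callback can hand each captured buffer over exactly once and then report that no data is left, assuming a fixed number of PCM bytes per packet.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-call state shared with the input callback. */
typedef struct {
    const uint8_t *sourceBuffer;  /* PCM bytes not yet handed over */
    uint32_t sourceDataSize;      /* remaining byte count */
    uint32_t bytesPerPacket;      /* PCM bytes per packet (one frame for PCM) */
} SketchInput;

/* Mock input callback: hands the remaining PCM over exactly once.
 * Returns the number of packets supplied; 0 means no more data for now. */
static uint32_t sketch_input_proc(SketchInput *in, const uint8_t **outData, uint32_t *outSize) {
    uint32_t packets = in->bytesPerPacket ? in->sourceDataSize / in->bytesPerPacket : 0;
    *outData = packets ? in->sourceBuffer : NULL;
    *outSize = packets * in->bytesPerPacket;
    if (packets) {
        in->sourceBuffer   += *outSize;  /* mark the data as consumed */
        in->sourceDataSize -= *outSize;
    }
    return packets;
}

/* Mock converter: keeps pulling until it has packetsNeeded input packets or
 * the callback runs dry. Returns how many packets it collected. */
static uint32_t sketch_pull(SketchInput *in, uint32_t packetsNeeded) {
    uint32_t collected = 0;
    while (collected < packetsNeeded) {
        const uint8_t *data = NULL;
        uint32_t size = 0;
        uint32_t got = sketch_input_proc(in, &data, &size);
        if (got == 0) {
            break;  /* input exhausted */
        }
        collected += got;
    }
    return collected;
}
```

The design point is the bookkeeping inside the callback: because the consumed bytes are subtracted from the shared state, a second pull in the same call cannot feed the same PCM data to the encoder twice.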
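3. Module integration
Audio encoding depends on audio capture, so this example uses Audio Unit capture: PCM data is captured with Audio Unit and then encoded to AAC with this module. For details, see the following links.
- GitHub (with code): Audio Unit Capture
- Jianshu: Audio Unit Capture
- Juejin: Audio Unit Capture
- Blog: Audio Unit Capture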
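3.1. Initializing the encoder
As shown below, declare an encoder property in the audio capture class and initialize it. Only the source data format, the destination format, and the choice of hardware or software encoding need to be set.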
@property (nonatomic,strong) XDXAduioEncoder *audioEncoder;
...
self->_audioEncoder = [[XDXAduioEncoder alloc] initWithSourceFormat:m_audioDataFormat
destFormatID:kAudioFormatMPEG4AAC
sampleRate:44100
isUseHardwareEncode:YES];
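3.2. Encoding the audio data
In the Audio Unit callback that captures PCM audio data, feed the PCM data to the encoder, then write the resulting AAC data to a file in the completion handler.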
static OSStatus AudioCaptureCallback(void *inRefCon, AudioUnitRenderActionFlags *ioActionFlags, const AudioTimeStamp *inTimeStamp, UInt32 inBusNumber, UInt32 inNumberFrames, AudioBufferList *ioData) {
    // Pull the captured PCM data into m_buffList.
    AudioUnitRender(m_audioUnit, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, m_buffList);
    XDXAudioCaptureManager *manager = (__bridge XDXAudioCaptureManager *)inRefCon;
    void   *bufferData = m_buffList->mBuffers[0].mData;
    UInt32  bufferSize = m_buffList->mBuffers[0].mDataByteSize;
    // The handler receives the packet count as its second argument, matching the signature in 2.3.
    [manager.audioEncoder encodeAudioWithSourceBuffer:bufferData
                                     sourceBufferSize:bufferSize
                                      completeHandler:^(AudioBufferList * _Nonnull destBufferList, UInt32 outputPackets, AudioStreamPacketDescription * _Nonnull outputPacketDescriptions) {
        if (manager.isRecordVoice) {
            [[XDXAudioFileHandler getInstance] writeFileWithInNumBytes:destBufferList->mBuffers->mDataByteSize
                                                          ioNumPackets:outputPackets
                                                              inBuffer:destBufferList->mBuffers->mData
                                                          inPacketDesc:outputPacketDescriptions];
        }
        // Release the output buffer allocated by the encoder in 2.2.
        free(destBufferList->mBuffers->mData);
    }];
    return noErr;
}
3.3. Freeing memory
Remember to free the memory once the encoded audio data has been used.
free(destBufferList->mBuffers->mData);
4. File recording
For this part, see the separate article: Audio File Recording
- Jianshu: Audio File Record
- Juejin: Audio File Record
- Blog: Audio File Record
5. Releasing encoder resources
If you need to release the encoder, make sure it has completely finished its work before doing so.
- (void)freeEncoder {
    if (mAudioConverter) {
        AudioConverterDispose(mAudioConverter);
        mAudioConverter = NULL;
    }
}