Encoding Audio to AAC on iOS
The advantage of hardware encoding is that it leverages functionality integrated into the hardware chip to complete the encoding task at high speed and with low power consumption.
The iOS platform also provides hardware encoding capability; during app development you only need to call the corresponding SDK interface.
That SDK interface is AudioConverter.
This article describes how to call AudioConverter on iOS to encode audio into AAC.
As its name suggests, AudioConverter is a format converter; here 小程 uses it to convert PCM data into AAC data.
For background on media formats (codec formats and container formats), readers can follow the "廣州小程" public account and look up the related articles under its "音視頻->基礎概念與流程" menu.
AudioConverter performs the conversion in memory and does not need to write to files, whereas the ExtAudioFile interface operates on files and internally uses AudioConverter to convert formats; in other words, in certain scenarios readers can use the ExtAudioFile interface instead.
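For instance, here is a minimal sketch (not from the original article) of that alternative: creating an .m4a file with ExtAudioFile and letting it encode PCM to AAC on every write. The PCM client format is whatever your capture produces; the helper name createAacFile is illustrative.

#include <AudioToolbox/AudioToolbox.h>

// Sketch: open an .m4a file that stores AAC, fed with PCM via ExtAudioFileWrite.
ExtAudioFileRef createAacFile(CFURLRef url, AudioStreamBasicDescription pcmDesc)
{
    AudioStreamBasicDescription aacDesc = {0};
    aacDesc.mFormatID = kAudioFormatMPEG4AAC;
    aacDesc.mSampleRate = pcmDesc.mSampleRate;
    aacDesc.mChannelsPerFrame = pcmDesc.mChannelsPerFrame;

    ExtAudioFileRef file = NULL;
    // Create an .m4a file whose on-disk data format is AAC.
    OSStatus ret = ExtAudioFileCreateWithURL(url, kAudioFileM4AType, &aacDesc,
                                             NULL, kAudioFileFlags_EraseFile, &file);
    if (ret != noErr) {
        return NULL;
    }
    // Declare what we will feed it; ExtAudioFile converts PCM -> AAC on write.
    ExtAudioFileSetProperty(file, kExtAudioFileProperty_ClientDataFormat,
                            sizeof(pcmDesc), &pcmDesc);
    return file;
}

Each subsequent ExtAudioFileWrite(file, frameCount, &bufferList) call then encodes and writes a chunk of PCM, and ExtAudioFileDispose(file) finalizes the file.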
How do you use AudioConverter? As a rule, calling an interface requires reading the corresponding header file and understanding the documentation comments there.
小程 demonstrates below how to convert PCM data into AAC data.
After the demonstration code, 小程 gives only a brief explanation; readers who need it should read the code patiently to understand it and apply it to their own development scenarios.
The following example shows a PCM-to-AAC implementation (for instance, for saving recorded audio as AAC).
#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>

// Defined further below: builds a 7-byte ADTS header for one AAC packet.
char* newAdtsDataForPacketLength(int packetLength, int samplerate, int channelCount, int* ioHeaderLen);

typedef struct {
    void *source;
    UInt32 sourceSize;
    UInt32 channelCount;
    AudioStreamPacketDescription *packetDescriptions;
} FillComplexInputParam;

// Input callback: hands the source PCM data to the converter.
OSStatus audioConverterComplexInputDataProc(AudioConverterRef inAudioConverter,
                                            UInt32 *ioNumberDataPackets,
                                            AudioBufferList *ioData,
                                            AudioStreamPacketDescription **outDataPacketDescription,
                                            void *inUserData)
{
    FillComplexInputParam *param = (FillComplexInputParam *)inUserData;
    if (param->sourceSize <= 0) {
        // No data left for this call: report zero packets and a non-noErr
        // status so the converter stops pulling.
        *ioNumberDataPackets = 0;
        return -1;
    }
    ioData->mBuffers[0].mData = param->source;
    ioData->mBuffers[0].mNumberChannels = param->channelCount;
    ioData->mBuffers[0].mDataByteSize = param->sourceSize;
    // For packed 16-bit LPCM one packet equals one frame, so report how many
    // frames the buffer actually holds (bytes / (2 bytes * channels)).
    *ioNumberDataPackets = param->sourceSize / (2 * param->channelCount);
    param->sourceSize = 0;
    param->source = NULL;
    return noErr;
}

typedef struct _tagConvertContext {
    AudioConverterRef converter;
    int samplerate;
    int channels;
} ConvertContext;

// init: creates the converter with AudioConverterNewSpecific and sets
// properties such as the encoding bit rate.
void* convert_init(int sample_rate, int channel_count)
{
    // Source format: packed, signed 16-bit interleaved PCM.
    AudioStreamBasicDescription sourceDes;
    memset(&sourceDes, 0, sizeof(sourceDes));
    sourceDes.mSampleRate = sample_rate;
    sourceDes.mFormatID = kAudioFormatLinearPCM;
    sourceDes.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger;
    sourceDes.mChannelsPerFrame = channel_count;
    sourceDes.mBitsPerChannel = 16;
    sourceDes.mBytesPerFrame = sourceDes.mBitsPerChannel / 8 * sourceDes.mChannelsPerFrame;
    sourceDes.mBytesPerPacket = sourceDes.mBytesPerFrame;
    sourceDes.mFramesPerPacket = 1;
    sourceDes.mReserved = 0;

    // Target format: AAC; let the system fill in the remaining fields.
    AudioStreamBasicDescription targetDes;
    memset(&targetDes, 0, sizeof(targetDes));
    targetDes.mFormatID = kAudioFormatMPEG4AAC;
    targetDes.mSampleRate = sample_rate;
    targetDes.mChannelsPerFrame = channel_count;
    UInt32 size = sizeof(targetDes);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &targetDes);

    // Enumerate the available AAC encoders and pick Apple's software codec
    // (match kAppleHardwareAudioCodecManufacturer to request the hardware one).
    AudioClassDescription audioClassDes;
    memset(&audioClassDes, 0, sizeof(AudioClassDescription));
    AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                               sizeof(targetDes.mFormatID), &targetDes.mFormatID, &size);
    int encoderCount = size / sizeof(AudioClassDescription);
    AudioClassDescription descriptions[encoderCount];
    AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                           sizeof(targetDes.mFormatID), &targetDes.mFormatID, &size, descriptions);
    for (int pos = 0; pos < encoderCount; pos++) {
        if (targetDes.mFormatID == descriptions[pos].mSubType
            && descriptions[pos].mManufacturer == kAppleSoftwareAudioCodecManufacturer) {
            memcpy(&audioClassDes, &descriptions[pos], sizeof(AudioClassDescription));
            break;
        }
    }

    ConvertContext *convertContex = malloc(sizeof(ConvertContext));
    OSStatus ret = AudioConverterNewSpecific(&sourceDes, &targetDes, 1, &audioClassDes, &convertContex->converter);
    if (ret == noErr) {
        // Remember the parameters; the ADTS header needs them later.
        convertContex->samplerate = sample_rate;
        convertContex->channels = channel_count;
        AudioConverterRef converter = convertContex->converter;
        UInt32 tmp = kAudioConverterQuality_High;
        AudioConverterSetProperty(converter, kAudioConverterCodecQuality, sizeof(tmp), &tmp);
        UInt32 bitRate = 96000;
        UInt32 size = sizeof(bitRate);
        ret = AudioConverterSetProperty(converter, kAudioConverterEncodeBitRate, size, &bitRate);
    }
    else {
        free(convertContex);
        convertContex = NULL;
    }
    return convertContex;
}

// converting: encodes one chunk of PCM; call repeatedly for successive chunks.
void convert(void* convertContext, void* srcdata, int srclen, void** outdata, int* outlen)
{
    ConvertContext *convertCxt = (ConvertContext *)convertContext;
    if (convertCxt && convertCxt->converter) {
        UInt32 theOutputBufSize = srclen;
        UInt32 packetSize = 1; // request one AAC packet (1024 PCM frames) per call
        void *outBuffer = malloc(theOutputBufSize);
        memset(outBuffer, 0, theOutputBufSize);
        AudioStreamPacketDescription *outputPacketDescriptions =
            (AudioStreamPacketDescription *)malloc(sizeof(AudioStreamPacketDescription) * packetSize);

        FillComplexInputParam userParam;
        userParam.source = srcdata;
        userParam.sourceSize = srclen;
        userParam.channelCount = convertCxt->channels;
        userParam.packetDescriptions = NULL;

        AudioBufferList outputBuffers;
        outputBuffers.mNumberBuffers = 1;
        outputBuffers.mBuffers[0].mNumberChannels = convertCxt->channels;
        outputBuffers.mBuffers[0].mData = outBuffer;
        outputBuffers.mBuffers[0].mDataByteSize = theOutputBufSize;

        OSStatus ret = AudioConverterFillComplexBuffer(convertCxt->converter,
                                                       audioConverterComplexInputDataProc,
                                                       &userParam,
                                                       &packetSize,
                                                       &outputBuffers,
                                                       outputPacketDescriptions);
        if (ret == noErr) {
            if (outputBuffers.mBuffers[0].mDataByteSize > 0) {
                NSData *rawAAC = [NSData dataWithBytes:outputBuffers.mBuffers[0].mData
                                                length:outputBuffers.mBuffers[0].mDataByteSize];
                *outdata = malloc([rawAAC length]);
                memcpy(*outdata, [rawAAC bytes], [rawAAC length]);
                *outlen = (int)[rawAAC length];
                // For testing: prepend an ADTS header to each packet and append
                // the result to an ADTS AAC file.
#if 1
                int headerLength = 0;
                char *packetHeader = newAdtsDataForPacketLength((int)[rawAAC length],
                                                                convertCxt->samplerate,
                                                                convertCxt->channels,
                                                                &headerLength);
                NSData *adtsPacketHeader = [NSData dataWithBytes:packetHeader length:headerLength];
                free(packetHeader);
                NSMutableData *fullData = [NSMutableData dataWithData:adtsPacketHeader];
                [fullData appendData:rawAAC];
                NSFileManager *fileMgr = [NSFileManager defaultManager];
                NSString *filepath = [NSHomeDirectory() stringByAppendingFormat:@"/Documents/test%p.aac", convertCxt->converter];
                NSFileHandle *file = nil;
                if (![fileMgr fileExistsAtPath:filepath]) {
                    [fileMgr createFileAtPath:filepath contents:nil attributes:nil];
                }
                file = [NSFileHandle fileHandleForWritingAtPath:filepath];
                [file seekToEndOfFile];
                [file writeData:fullData];
                [file closeFile];
#endif
            }
        }
        free(outBuffer);
        if (outputPacketDescriptions) {
            free(outputPacketDescriptions);
        }
    }
}

// uninit
// ...
int freqIdxForAdtsHeader(int samplerate)
{
    /**
     Sampling frequency index:
      0: 96000 Hz   1: 88200 Hz   2: 64000 Hz   3: 48000 Hz
      4: 44100 Hz   5: 32000 Hz   6: 24000 Hz   7: 22050 Hz
      8: 16000 Hz   9: 12000 Hz  10: 11025 Hz  11: 8000 Hz
     12: 7350 Hz   13: Reserved  14: Reserved  15: frequency is written explicitly
     */
    int idx = 4;
    if (samplerate >= 7350 && samplerate < 8000) { idx = 12; }
    else if (samplerate >= 8000 && samplerate < 11025) { idx = 11; }
    else if (samplerate >= 11025 && samplerate < 12000) { idx = 10; }
    else if (samplerate >= 12000 && samplerate < 16000) { idx = 9; }
    else if (samplerate >= 16000 && samplerate < 22050) { idx = 8; }
    else if (samplerate >= 22050 && samplerate < 24000) { idx = 7; }
    else if (samplerate >= 24000 && samplerate < 32000) { idx = 6; }
    else if (samplerate >= 32000 && samplerate < 44100) { idx = 5; }
    else if (samplerate >= 44100 && samplerate < 48000) { idx = 4; }
    else if (samplerate >= 48000 && samplerate < 64000) { idx = 3; }
    else if (samplerate >= 64000 && samplerate < 88200) { idx = 2; }
    else if (samplerate >= 88200 && samplerate < 96000) { idx = 1; }
    else if (samplerate >= 96000) { idx = 0; }
    return idx;
}

int channelIdxForAdtsHeader(int channelCount)
{
    /**
     Channel configuration:
     0: Defined in AOT Specific Config
     1: 1 channel: front-center
     2: 2 channels: front-left, front-right
     3: 3 channels: front-center, front-left, front-right
     4: 4 channels: front-center, front-left, front-right, back-center
     5: 5 channels: front-center, front-left, front-right, back-left, back-right
     6: 6 channels: front-center, front-left, front-right, back-left, back-right, LFE-channel
     7: 8 channels: front-center, front-left, front-right, side-left, side-right, back-left, back-right, LFE-channel
     8-15: Reserved
     */
    int ret = 2;
    if (channelCount == 1) {
        ret = 1;
    }
    else if (channelCount == 2) {
        ret = 2;
    }
    return ret;
}

/**
 * Add an ADTS header at the beginning of each and every AAC packet.
 * This is needed because the encoder generates packets of raw AAC data.
 *
 * Note: the frame-length field must count the ADTS header itself, so the
 * header length is added to packetLength inside this function.
 * See: http://wiki.multimedia.cx/index.php?title=ADTS
 * Also: http://wiki.multimedia.cx/index.php?title=MPEG-4_Audio#Channel_Configurations
 **/
char* newAdtsDataForPacketLength(int packetLength, int samplerate, int channelCount, int* ioHeaderLen)
{
    int adtsLength = 7;
    char *packet = malloc(sizeof(char) * adtsLength);
    int profile = 2; // AAC LC (stored in the header as profile - 1)
    int freqIdx = freqIdxForAdtsHeader(samplerate);
    int chanCfg = channelIdxForAdtsHeader(channelCount); // MPEG-4 Audio Channel Configuration
    NSUInteger fullLength = adtsLength + packetLength;   // frame length = header + payload
    // fill in ADTS data
    packet[0] = (char)0xFF; // 11111111 = syncword (high 8 bits)
    packet[1] = (char)0xF9; // 1111 1 00 1 = syncword (low 4 bits), ID=MPEG-2, layer=00, no CRC
    packet[2] = (char)(((profile-1)<<6) + (freqIdx<<2) + (chanCfg>>2));
    packet[3] = (char)(((chanCfg&3)<<6) + (fullLength>>11));
    packet[4] = (char)((fullLength&0x7FF) >> 3);
    packet[5] = (char)(((fullLength&7)<<5) + 0x1F); // buffer fullness 0x7FF signals VBR
    packet[6] = (char)0xFC;
    *ioHeaderLen = adtsLength;
    return packet;
}
In the code above, two functions matter most. One is the initialization function, which creates the AudioConverterRef; the other is the conversion function, which should be called repeatedly to convert successive chunks of PCM data. Note that the initialization code picks Apple's software AAC codec (kAppleSoftwareAudioCodecManufacturer); to request the hardware codec instead, match kAppleHardwareAudioCodecManufacturer when selecting the AudioClassDescription.
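As a rough usage sketch (not from the original article): assuming 16-bit mono PCM at 44100 Hz and a hypothetical pcm_read() data source, the calling sequence could look like this. One AAC packet is encoded from 1024 PCM frames, so each call feeds 1024 frames:

// Hypothetical calling sequence for convert_init/convert above.
extern int pcm_read(void *buf, int bytes); // stand-in for the recording source

void encode_loop(void)
{
    const int samplerate = 44100;
    const int channels = 1;
    const int frames = 1024;                      // one AAC packet = 1024 PCM frames
    const int chunkBytes = frames * 2 * channels; // 16-bit samples
    short pcm[1024];

    void *ctx = convert_init(samplerate, channels);
    if (!ctx) {
        return;
    }
    while (pcm_read(pcm, chunkBytes) == chunkBytes) {
        void *aac = NULL;
        int aacLen = 0;
        convert(ctx, pcm, chunkBytes, &aac, &aacLen);
        if (aac) {
            // consume aac/aacLen (e.g. prepend an ADTS header and write it out),
            // then release the buffer that convert() allocated
            free(aac);
        }
    }
    // finish by disposing the converter (AudioConverterDispose) and freeing ctx
}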
In addition, the example saves the AAC data converted from the PCM, and the saved file can be played back.
Note that AudioConverter outputs raw audio data only; whether to assemble it into ADTS AAC or package it into Apple's m4a file format is up to the program.
To explain: ADTS AAC is one way of representing AAC data, in which a frame header (carrying the frame length, sample rate, channel count, and so on) is placed in front of each raw AAC frame; with that header added, each AAC frame can be played independently. Moreover, ADTS AAC has no container around it: there is no particular file header or file structure.
ADTS is short for Audio Data Transport Stream.
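For reference, here is how the seven header bytes produced by newAdtsDataForPacketLength above are laid out (the no-CRC variant used in this article):

byte 0: AAAAAAAA   A = syncword 0xFFF (12 bits in total, spilling into byte 1)
byte 1: AAAABCCD   B = ID (1 = MPEG-2), CC = layer (00), D = protection_absent (1 = no CRC)
byte 2: EEFFFFGH   EE = profile - 1 (AAC LC = 2), FFFF = sampling frequency index,
                   G = private bit, H = channel configuration (high bit)
byte 3: HHIJKLMM   HH = channel configuration (low 2 bits), I-L = original/home/copyright bits,
                   MM = frame length (top 2 of 13 bits; header + payload)
byte 4: MMMMMMMM   frame length (middle 8 bits)
byte 5: MMMNNNNN   MMM = frame length (low 3 bits), NNNNN = buffer fullness (high 5 bits)
byte 6: NNNNNNPP   NNNNNN = buffer fullness (low bits; 0x7FF signals VBR),
                   PP = number of raw data blocks - 1 (00 = one AAC frame)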
Of course, readers can also package the converted AAC data into the m4a format. In this container the file header comes first, followed by the bare audio data: {packet-table}{audio_data}{trailer}. The packet table in the header describes every packet; the audio data that follows is raw and carries no per-packet information.
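As a hedged sketch of that route (the function name writeM4a and its parameters are illustrative, not from the article), the AudioFile API can write raw AAC packets plus their packet descriptions into an .m4a container:

#include <AudioToolbox/AudioToolbox.h>

// Sketch: write raw AAC packets into an .m4a; the container keeps the packet table.
void writeM4a(CFURLRef url, AudioStreamBasicDescription aacDesc,
              const void *packets, UInt32 byteCount,
              AudioStreamPacketDescription *descs, UInt32 packetCount)
{
    AudioFileID file = NULL;
    if (AudioFileCreateWithURL(url, kAudioFileM4AType, &aacDesc,
                               kAudioFileFlags_EraseFile, &file) != noErr) {
        return;
    }
    UInt32 ioNumPackets = packetCount;
    // The packet descriptions (as returned by AudioConverterFillComplexBuffer)
    // tell the container where each variable-size AAC packet begins and ends.
    AudioFileWritePackets(file, false, byteCount, descs, 0, &ioNumPackets, packets);
    AudioFileClose(file);
}

In practice you would also copy the converter's magic cookie (read via kAudioConverterCompressionMagicCookie) into the file's kAudioFilePropertyMagicCookieData so that players can initialize the decoder.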
This concludes the walkthrough of converting PCM into AAC data on iOS.
To sum up, this article showed how to use the AudioConverter interface provided by iOS to convert PCM data into the AAC format. It also showed how to save the result as an ADTS AAC file, which readers can use to verify that the converted AAC data is correct.