
Using mp4v2 to mux H264 + AAC into an mp4 file


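Our recording program needs a new feature: recording CMMB TV programs. Our board sends out an RTP stream (H264 video and AAC audio), so the recording program has to do two things:

(1) receive and parse the RTP packets, separating out the H264 and AAC data streams;

(2) wrap the H264 video and AAC audio in some container format and save the result to a file for users to view.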
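
Step (1) starts with reading the fixed RTP header. The following is a minimal sketch of that parsing, assuming the standard RFC 3550 header layout; the names `rtp_packet_t` and `rtp_parse` are hypothetical and are not taken from the recording program. Which payload type carries H264 and which carries AAC depends on the sender, so the sketch only exposes the field and leaves the mapping to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of interest from one RTP packet (hypothetical helper type). */
typedef struct {
    uint8_t        payload_type; /* 7-bit PT, used to tell video from audio */
    uint8_t        marker;       /* marker bit */
    uint16_t       seq;          /* sequence number */
    uint32_t       timestamp;    /* media timestamp */
    uint32_t       ssrc;         /* synchronization source */
    const uint8_t *payload;      /* points into the caller's buffer */
    size_t         payload_len;
} rtp_packet_t;

/* Parse the RTP header in buf. Returns 0 on success, -1 if malformed. */
int rtp_parse(const uint8_t *buf, size_t len, rtp_packet_t *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 12 || (buf[0] >> 6) != 2)          /* version must be 2 */
        return -1;

    int    has_pad = (buf[0] >> 5) & 1;
    int    has_ext = (buf[0] >> 4) & 1;
    size_t cc      = buf[0] & 0x0F;              /* number of CSRC entries */
    size_t off     = 12 + 4 * cc;
    if (off > len)
        return -1;

    if (has_ext) {                               /* skip the header extension */
        if (off + 4 > len)
            return -1;
        size_t ext_words = ((size_t)buf[off + 2] << 8) | buf[off + 3];
        off += 4 + 4 * ext_words;
        if (off > len)
            return -1;
    }

    size_t end = len;
    if (has_pad) {                               /* last byte holds the pad count */
        size_t pad = buf[len - 1];
        if (pad == 0 || pad > len - off)
            return -1;
        end -= pad;
    }

    out->marker       = buf[1] >> 7;
    out->payload_type = buf[1] & 0x7F;
    out->seq          = (uint16_t)((buf[2] << 8) | buf[3]);
    out->timestamp    = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                        ((uint32_t)buf[6] << 8)  |  (uint32_t)buf[7];
    out->ssrc         = ((uint32_t)buf[8] << 24) | ((uint32_t)buf[9] << 16) |
                        ((uint32_t)buf[10] << 8) |  (uint32_t)buf[11];
    out->payload      = buf + off;
    out->payload_len  = end - off;
    return 0;
}
```

The payload still has to be depacketized according to the payload format in use (for example, H264 NAL units may be fragmented across packets); that part is not shown here.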

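For step one there was already some code to refer to, so it was finished quickly.

For step two we decided to package the streams as mp4 and, after some research, to use the open source library mp4v2 to build the mp4 file.

With the technical route settled, it was time to get to work.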

(I) Basics of the mp4 format.

There are plenty of introductions to the mp4 format online. The following are useful references:

(1) Two ISO standards:

[ISO/IEC 14496-12]: ISO base media file format -- "is a general format forming the basis for a number of other more specific file formats. This format contains the timing, structure, and media information for timed sequences of media data, such as audio-visual presentations"

[ISO/IEC 14496-14]: MP4 file format -- "This specification defines MP4 as an instance of the ISO Media File format [ISO/IEC 14496-12 and ISO/IEC 15444-12]."

Together these define the mp4 file format.

A write-up explaining the two standards above is also worth reading first to get the general picture; consult the ISO documents for the details.
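
As a small illustration of the structure those standards describe: an ISO base media file is a sequence of boxes, each starting with a 32-bit big-endian size followed by a four-character type. The sketch below reads one box header from a memory buffer. The names `box_header_t` and `box_read_header` are hypothetical, and the code is only a reading aid, not a full parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One parsed box header (hypothetical helper type). */
typedef struct {
    uint64_t size;        /* total box size in bytes, header included */
    char     type[5];     /* four-character code, NUL-terminated */
    size_t   header_len;  /* 8, or 16 when a 64-bit size is used */
} box_header_t;

/* Read a 32-bit big-endian integer. */
static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse the box header at buf. Returns 0 on success, -1 on error. */
int box_read_header(const uint8_t *buf, size_t len, box_header_t *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 8)
        return -1;

    uint64_t size = rd32(buf);
    size_t   hdr  = 8;
    memcpy(out->type, buf + 4, 4);
    out->type[4] = '\0';

    if (size == 1) {              /* size 1: a 64-bit "largesize" follows */
        if (len < 16)
            return -1;
        size = ((uint64_t)rd32(buf + 8) << 32) | rd32(buf + 12);
        hdr  = 16;
    } else if (size == 0) {       /* size 0: box runs to the end of the data */
        size = len;
    }
    if (size < hdr)
        return -1;

    out->size       = size;
    out->header_len = hdr;
    return 0;
}
```

Walking a file is then a matter of reading a header, advancing by `size`, and repeating; container boxes such as moov hold further boxes in their payload.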

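(II) Technical validation. This mainly means writing proof-of-concept code to check that the approach is feasible.

A first piece of validation code was finished quickly, but the file it produced was broken and would not play.

The muxing part of the code is as follows: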

static void* writeThread(void* arg)
{
    rtp_s* p_rtp = (rtp_s*) arg;
    if (p_rtp == NULL)
    {
        printf("ERROR!\n");
        return NULL;
    }

    // Create the mp4 file
    MP4FileHandle file = MP4CreateEx("test.mp4", MP4_DETAILS_ALL, 0, 1, 1, 0, 0, 0, 0);
    if (file == MP4_INVALID_FILE_HANDLE)
    {
        printf("open file failed.\n");
        return NULL;
    }

    MP4SetTimeScale(file, 90000);

    // Add the h264 track
    MP4TrackId video = MP4AddH264VideoTrack(file, 90000, 90000 / 25, 320, 240,
                                            0x64, // sps[1] AVCProfileIndication
                                            0x00, // sps[2] profile_compat
                                            0x1f, // sps[3] AVCLevelIndication
                                            3);   // 4 bytes length before each NAL unit
    if (video == MP4_INVALID_TRACK_ID)
    {
        printf("add video track failed.\n");
        MP4Close(file);
        return NULL;
    }
    MP4SetVideoProfileLevel(file, 0x7F);

    // Add the aac audio track
    MP4TrackId audio = MP4AddAudioTrack(file, 48000, 1024, MP4_MPEG4_AUDIO_TYPE);
    if (audio == MP4_INVALID_TRACK_ID) // the original checked "video" here by mistake
    {
        printf("add audio track failed.\n");
        MP4Close(file);
        return NULL;
    }
    MP4SetAudioProfileLevel(file, 0x2);

    int ncount = 0;
    int done = 0;
    while (1)
    {
        pthread_mutex_lock(&p_rtp->mutex);
        frame_t* pf = p_rtp->p_frame_header; // oldest queued frame
        if (pf != NULL)
        {
            if (pf->i_type == 1) // video
            {
                MP4WriteSample(file, video, pf->p_frame, pf->i_frame_size, MP4_INVALID_DURATION, 0, 1);
            }
            else if (pf->i_type == 2) // audio
            {
                MP4WriteSample(file, audio, pf->p_frame, pf->i_frame_size, MP4_INVALID_DURATION, 0, 1);
            }

            ncount++;

            // Remove the frame from the queue and free it
            p_rtp->i_buf_num--;
            p_rtp->p_frame_header = pf->p_next;
            if (p_rtp->i_buf_num <= 0)
            {
                p_rtp->p_frame_buf = p_rtp->p_frame_header;
            }
            free_frame(&pf);
            pf = NULL;

            // Stop after 1000 frames, but only after the mutex is released below
            if (ncount >= 1000)
            {
                done = 1;
            }
        }
        else
        {
            //printf("BUFF EMPTY, p_rtp->i_buf_num:%d\n", p_rtp->i_buf_num);
        }
        pthread_mutex_unlock(&p_rtp->mutex);
        if (done)
        {
            break;
        }
        usleep(10000);
    }

    MP4Close(file);
    return NULL;
}
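
The three profile arguments passed to MP4AddH264VideoTrack are hard-coded above, with comments noting that they correspond to sps[1], sps[2] and sps[3]. As a hedged sketch of how they could be read from the stream instead, the helper below pulls those three bytes out of an SPS NAL unit. The names `avc_profile_t` and `avc_profile_from_sps` are hypothetical, and the input is assumed to be a single SPS NAL with its start code already removed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three bytes that follow the NAL header in an SPS (hypothetical type). */
typedef struct {
    uint8_t profile_idc;    /* sps[1], AVCProfileIndication */
    uint8_t profile_compat; /* sps[2], constraint flags */
    uint8_t level_idc;      /* sps[3], AVCLevelIndication */
} avc_profile_t;

/*
 * sps points at an SPS NAL unit without start code, so sps[0] is the NAL
 * header byte. Returns 0 on success, -1 if the data is too short or is not
 * an SPS (nal_unit_type 7).
 */
int avc_profile_from_sps(const uint8_t *sps, size_t len, avc_profile_t *out)
{
    assert(sps != NULL && out != NULL);
    if (len < 4 || (sps[0] & 0x1F) != 7)
        return -1;
    out->profile_idc    = sps[1];
    out->profile_compat = sps[2];
    out->level_idc      = sps[3];
    return 0;
}
```

With an SPS beginning 0x67 0x64 0x00 0x1F this yields exactly the values hard-coded above (0x64, 0x00, 0x1f).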


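Symptom: no picture and no sound; the file would not play at all.

So the hard part began: tracking down the cause.

(1) Play the generated mp4 file with vlc and look at the verbose output: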

vlc -vvv test.mp4
[0x8e9357c] mp4 stream debug: found Box: ftyp size 24
[0x8e9357c] mp4 stream debug: found Box: free size 136
[0x8e9357c] mp4 stream debug: skip box: "free"
[0x8e9357c] mp4 stream debug: found Box: mdat size 985725
[0x8e9357c] mp4 stream debug: skip box: "mdat"
[0x8e9357c] mp4 stream debug: found Box: moov size 5187
[0x8e9357c] mp4 stream debug: found Box: mvhd size 108
[0x8e9357c] mp4 stream debug: read box: "mvhd" creation 734515d-06h:22m:03s modification 734515d-06h:22m:23s time scale 90000 duration 694977d-48h:00m:29s rate 1.000000 volume 1.000000 next track id 3

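As you can see, vlc (actually the libmp4 code it calls) parses all the boxes correctly, and the size of mdat is correct too.
But the line that follows:
skip box: "mdat"
is rather odd. The box was parsed correctly, so why skip mdat? After all, mdat is where the actual audio and video data lives. If it is skipped, there is no data to decode later, so of course the file cannot play.

(2) Having found a suspicious spot, keep tracing.

Looking at the vlc source, in modules/demux/mp4/libmp4.c, the skip message is printed by MP4_ReadBoxSkip(), and that function is referenced at line 2641 of libmp4.c: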

/* Nothing to do with this box */
{ FOURCC_mdat,  MP4_ReadBoxSkip,        MP4_FreeBox_Common },
{ FOURCC_skip,  MP4_ReadBoxSkip,        MP4_FreeBox_Common },
{ FOURCC_free,  MP4_ReadBoxSkip,        MP4_FreeBox_Common },
{ FOURCC_wide,  MP4_ReadBoxSkip,        MP4_FreeBox_Common },

And in libmp4.h:

#define FOURCC_mdat VLC_FOURCC( 'm', 'd', 'a', 't' )
#define FOURCC_skip VLC_FOURCC( 's', 'k', 'i', 'p' )
#define FOURCC_free VLC_FOURCC( 'f', 'r', 'e', 'e' )
#define FOURCC_wide VLC_FOURCC( 'w', 'i', 'd', 'e' )

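Judging from this code, when vlc parses the file with libmp4 it deliberately skips four box types: mdat, skip, free and wide.
Why?

(3) Next, look at the Open() function in modules/demux/mp4/mp4.c (the entry point of the demux module). Its main job is to initialize a demux_sys_t structure, defined as follows: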

struct demux_sys_t
{
    MP4_Box_t    *p_root;        /* container for the whole file */
    mtime_t      i_pcr;
    uint64_t     i_time;         /* time position of the presentation in movie timescale */
    uint64_t     i_timescale;    /* movie time scale */
    uint64_t     i_duration;     /* movie duration */
    unsigned int i_tracks;       /* number of tracks */
    mp4_track_t  *track;         /* array of track */
    float        f_fps;          /* number of frame per seconds */

    /* */
    MP4_Box_t    *p_tref_chap;

    /* */
    input_title_t *p_title;
};

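It seems the module only gathers basic information about the mp4 file (tracks, moov, duration, timescale and so on) and does not decode any data itself, so it has no need to look inside the mdat box.

In summary: the vlc output is normal. libmp4 skipping the mdat box is not the reason the mp4 file fails to play; the module simply does not decode data, so it does not need to read that box.
Since the problem is not here, where is it?

(4) Keep reading the vlc output:
AVC: nal size -1710483062
no frame!
[0x8e93eb4] avcodec decoder warning: cannot decode one frame (3595 bytes)
This shows that vlc actually decodes the data through avcodec (ffmpeg), and that our video is in AVC (H264) format.
The error message pins the problem on a wrong H264 NAL size, which seems to have little to do with the mp4 file itself.

Never mind that for now; let's look at the code first.

vlc uses ffmpeg as a library, so we have to read the ffmpeg code: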

libavcodec/h264.c:
static int decode_nal_units(H264Context *h, const uint8_t *buf, int buf_size){

    ...

    for(;;){
        if(buf_index >= next_avc) {
            if(buf_index >= buf_size) break;
            nalsize = 0;
            for(i = 0; i < h->nal_length_size; i++)
                nalsize = (nalsize << 8) | buf[buf_index++];
            if(nalsize <= 0 || nalsize > buf_size - buf_index){
                av_log(h->s.avctx, AV_LOG_ERROR, "AVC: nal size %d\n", nalsize);
                break;
            }
            next_avc= buf_index + nalsize;
        }

    ...

}

So this is exactly where the error is reported.
But why? From the ffmpeg message we know that the nalsize it read is negative.
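
To make the negative number less mysterious, here is a small sketch that mimics the length-reading loop (it is not the ffmpeg code itself, and `read_nal_size` is a hypothetical name). It assembles up to four bytes as a big-endian integer and then views the result as a signed 32-bit value, which is how the size is printed in the log. If the four bytes are not really a length but arbitrary data with the top bit set, the result is negative. The byte sequence 9A 0C 19 8A used in the check below was chosen for illustration because it reproduces the value from the vlc log; it is an assumption, not data taken from the actual file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a big-endian length field of nal_length_size bytes (1 to 4) and
 * return it reinterpreted as a signed 32-bit integer.
 */
int32_t read_nal_size(const uint8_t *buf, size_t nal_length_size)
{
    assert(buf != NULL);
    assert(nal_length_size >= 1 && nal_length_size <= 4);

    uint32_t v = 0;
    for (size_t i = 0; i < nal_length_size; i++)
        v = (v << 8) | buf[i];

    int32_t s;
    memcpy(&s, &v, sizeof s);   /* same bits, signed view */
    return s;
}
```

Note also that a start code 00 00 00 01 read as a length gives 1, which passes the range check; the parser would then treat the next byte as a one-byte NAL and read its next "length" from the middle of the real data.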
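I suspected the h264 stream itself was faulty, so I opened the generated mp4 file in Elecard, where the video played perfectly well. So the h264 stream seems to be fine?
Frustrating...

The relevant exchange is as follows: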
Ottavio Campana
"question about MP4AddH264VideoTrack.
What's the meaning of the profile_compat and
sampleLenFieldSizeMinusOne fields?"

Jeremy Noring
"Usually an NALU is prefixed by the start code 0x00000001. To write it 
as a sample in MP4 file format, just replace the start code with size 
of the NALU(without 4-byte start code) in big endian. You also need to 
specify how many bytes of the size value requires. Take libmp4v2 for 
example, the last parameter in MP4AddH264VideoTrack(.., uint8_t 
sampleLenFieldSizeMinusOne) indicate the number of byes minus one." 

...so each sample you and to mp4v2 should be prefixed with a size code 
(in big-endian, of course). I use a 4 byte size code, so 
sampleLenFieldSizeMinusOne gets set to 3. This seems to work; my 
files playback on just about everything. Perhaps one of the project 
maintainers can clarify this, and it'd also be good to update the 
documentation of that call to make this clear.”

Ottavio Campana 
that's the code I used as reference to write my program :-( 
but my doubt is that there must be something wrong somewhere, because 
boxes seem to be correctly written, but when I try to decode them I 
get errors like 
[h264 @ 0xb40fa0]AVC: nal size -502662121 
have you ever seen an error like this?

Ottavio Campana 
> Not sure, but it looks you're not converting it to big-endian before 
> prefixing it to your sample. 
well, eventually using ffmpeg to dump the read frames, I discovered 
that I had to strip che NALU start code, i.e. the 0x00000001, and to 
put the NALU size at its place. 
It works perfectly now, but I still wonder why I had to put the size 
at the begin of the data, since it is a parameter which is passed to 
MP4WriteSample, so I expected the function to add it.

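The key points from this are:
(1) in the h264 stream, each NAL starts with the four bytes 0x00000001 (the start code);
(2) in an mp4 h264 track, the first four bytes must instead be the NAL length, in big-endian order (four bytes because sampleLenFieldSizeMinusOne is 3);
(3) mp4v2 most likely does not handle this case, so the first four bytes of every NAL written to the mp4 file are still 0x00000001.

Easy enough: I'll overwrite the first four bytes of each sample (that is, each NAL) with the NAL length and see what happens: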

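/* Needs <arpa/inet.h> for htonl() and <string.h> for memcpy(). */
if (pf->i_frame_size >= 4)
{
    /* Overwrite the 4-byte start code with the NAL length (start code excluded), big-endian. */
    uint32_t nal_len = htonl(pf->i_frame_size - 4);
    memcpy(pf->p_frame, &nal_len, sizeof(nal_len)); /* memcpy avoids an unaligned pointer cast */
}
MP4WriteSample(file, video, pf->p_frame, pf->i_frame_size, MP4_INVALID_DURATION, 0, 1);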

After testing, it indeed worked!
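
The in-place patch above relies on each sample holding exactly one NAL that begins with a 4-byte start code. As a more general sketch, under the assumption that a frame may contain several NAL units with either 3-byte or 4-byte start codes, the function below rewrites such a buffer into 4-byte length-prefixed form. The names `find_start_code` and `annexb_to_avcc` are hypothetical; this is not part of mp4v2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Index of the next 00 00 01 pattern at or after pos, or len if none. */
static size_t find_start_code(const uint8_t *p, size_t len, size_t pos)
{
    for (; pos + 3 <= len; pos++)
        if (p[pos] == 0 && p[pos + 1] == 0 && p[pos + 2] == 1)
            return pos;
    return len;
}

/*
 * Convert start-code-delimited NAL units in `in` to 4-byte big-endian
 * length-prefixed NAL units in `out`. Returns the number of bytes written,
 * or 0 if no NAL was found or `out` is too small.
 */
size_t annexb_to_avcc(const uint8_t *in, size_t in_len,
                      uint8_t *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    size_t written = 0;
    size_t sc = find_start_code(in, in_len, 0);

    while (sc < in_len) {
        size_t nal_start = sc + 3;
        size_t next      = find_start_code(in, in_len, nal_start);
        size_t nal_end   = next;

        /* The extra zero of a following 4-byte start code (and any other
         * trailing zero bytes) is not part of this NAL. */
        while (nal_end > nal_start && in[nal_end - 1] == 0)
            nal_end--;

        size_t n = nal_end - nal_start;
        if (n > 0) {
            if (out_cap - written < n + 4)
                return 0;
            out[written + 0] = (uint8_t)(n >> 24);
            out[written + 1] = (uint8_t)(n >> 16);
            out[written + 2] = (uint8_t)(n >> 8);
            out[written + 3] = (uint8_t)n;
            memcpy(out + written + 4, in + nal_start, n);
            written += n + 4;
        }
        sc = next;
    }
    return written;
}
```

The converted buffer could then be handed to MP4WriteSample in place of the raw frame, with sampleLenFieldSizeMinusOne still set to 3.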

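(6) With video solved, audio still had a problem: the sound played too fast.

I tried adjusting the parameters:

MP4TrackId audio = MP4AddAudioTrack(file, 48000, 1024, MP4_MPEG4_AUDIO_TYPE);

The third parameter, sampleDuration, is the duration of each sample in units of the track timescale (the second parameter). Every example I found online used 1024.

I tried several other values: 128, 256, 512 and 4096 did not work, and in the end 2048 turned out to play correctly.

(Why 2048? I don't know. Perhaps because our audio is stereo? I'll look into it when I have time...)

The correct code is: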

MP4TrackId audio = MP4AddAudioTrack(file, 48000, 2048, MP4_MPEG4_AUDIO_TYPE);
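
The arithmetic behind sampleDuration can be sketched as follows. One possible explanation for 2048, not verified against this stream, is that each AAC frame covers 2048 samples at 48000 Hz, as happens with HE-AAC (SBR), or equivalently that the core sampling rate is 24000 Hz while the track timescale is 48000. The channel count does not change the number of samples per channel in a frame, so stereo alone would not account for it. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Duration in milliseconds of a sample lasting `ticks` timescale units. */
double sample_duration_ms(uint32_t ticks, uint32_t timescale)
{
    assert(timescale != 0);
    return 1000.0 * ticks / timescale;
}

/*
 * Timescale ticks covered by one frame of `samples_per_frame` samples
 * at `sample_rate` Hz.
 */
uint32_t frame_ticks(uint32_t samples_per_frame, uint32_t sample_rate,
                     uint32_t timescale)
{
    assert(sample_rate != 0);
    return (uint32_t)((uint64_t)samples_per_frame * timescale / sample_rate);
}
```

With a timescale of 48000, a duration of 1024 ticks is about 21.3 ms per frame and 2048 ticks is about 42.7 ms. If each frame really carries about 42.7 ms of audio, declaring 1024 makes playback run at double speed, which matches the "too fast" symptom. The same formula gives the video value used earlier: one frame at 25 fps on a 90000 timescale is 90000 / 25 = 3600 ticks.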

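At this point the rtp stream has been successfully muxed into an mp4 file, which proves the approach is technically feasible.

Note: many parameters are hard-coded for our specific application and are for reference only.

(III) Merging the feature into the recording program.

Omitted.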