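RM/RMVB File Format Summary
The RM file format is a standard tagged file format. It groups tagged chunks into a header section, a data section, and an index section, organized as follows: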
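.RMF header chunk
    RealMedia File Header
PROP properties header
    Properties
MDPR media properties header (may occur more than once)
    Media Properties (media properties header 1)
    Media Properties (media properties header 2)
    ... (further media properties headers, 3 to n)
CONT content description header
    Content Properties (content description header)
DATA data section (may occur more than once)
    Data Chunk Header
    Data Packets
    ... (further data packets)
    Data Chunk Header
    Data Packets
    ... (further data packets)
    ... (further data sections)
INDX index section (may occur more than once)
    Index Section
    ... (further index sections)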
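Because RM is a tagged file format, the order of the chunks is not fixed; only the RM file header must be the first chunk in the file. The header section of a typical RM file contains: the RM file header (which must be the first chunk), the properties header, the media properties headers, and the content description header. After the RM file header, the other headers may appear in any order. All headers are required except the index header.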
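RM/RMVB File Format: .RMF (File Header)
Every RM file begins with the RM file header, which identifies the file as RMF. An RM file contains exactly one RM file header. Because the contents of this header may change between RMF versions, the structure carries a version field that indicates which additional fields are present. The following structure shows how the RM file header is stored: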
RealMedia_File_Header
{
UINT32 object_id;
UINT32 size;
UINT16 object_version;
if((object_version==0)||(object_version==1))
{
UINT32 file_version;
UINT32 num_headers;
}
}
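The structure contains the following members:
object_id: the unique identifier of an RM file (".RMF"). Every RM file starts with this identifier. 32 bits.
size: the size of the RM file header, 32 bits. When all five members of the structure above are present, the value is 18 bytes.
object_version: the version of the RM file header, which decides which of the remaining members are present. 16 bits.
file_version: the version of the RM file. 32 bits.
num_headers: the number of headers in the header section that follow the RM file header. 32 bits. Note: it is unclear exactly which headers num_headers counts. One file has the value 7 but contains only 1 PROP, 3 MDPR, and 1 CONT, which makes 6 with .RMF included. Another file with the value 7 contains 1 PROP, 3 MDPR, and 1 CONT (6 with .RMF included), or 9 if its 3 INDX chunks are counted as well.
Figure: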
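RM/RMVB File Format: PROP (Properties Header)
Properties header (PROP)
The properties header describes the general media properties of an RM file. Components of the RM system use it to configure themselves for handling the data in an RM file or RM stream. An RM file contains exactly one properties header. The following structure shows how the properties header is stored: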
Properties
{
UINT32 object_id; [0-3]
UINT32 size;[4-7]
UINT16 object_version;[8-9]
if(object_version==0)
{
UINT32 max_bit_rate;[10 - 13]
UINT32 avg_bit_rate;[14 - 17]
UINT32 max_packet_size;[18 - 21]
UINT32 avg_packet_size;[22 - 25]
UINT32 num_packets;[26 - 29]
UINT32 duration;[30 - 33]
UINT32 preroll; [34 - 37]
UINT32 index_offset; [38 - 41]
UINT32 data_offset; [42 - 45]
UINT16 num_streams; [46 - 47]
UINT16 flags; [48 - 49]
}
}
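The structure contains the following members:
object_id: the unique identifier of a properties header ("PROP"). 32 bits.
size: the size of the properties header, 32 bits. With all members present this is 10 + 40 = 50 bytes.
object_version: the version of the properties header, which decides which of the remaining members are present. In this structure the value is 0. 16 bits.
max_bit_rate: the maximum bit rate required to deliver the file over a network. 32 bits.
avg_bit_rate: the average bit rate required to deliver the file over a network. 32 bits.
max_packet_size: the size of the largest media data packet, in bytes. 32 bits.
avg_packet_size: the average size of a media data packet, in bytes. 32 bits.
num_packets: the number of media data packets. 32 bits.
duration: the playing time of the media file, in milliseconds. 32 bits.
preroll: the number of milliseconds to buffer before playback starts. 32 bits.
index_offset: the offset from the start of the file to the index header. This value may be 0, which means the file has no index section. 32 bits.
data_offset: the offset from the start of the file to the data section. 32 bits. (Note: an RM file can contain more than one data chunk header. This value gives the offset of the first one only; the offsets of the others can be obtained from the next_data_header field of each data chunk header.)
num_streams: the total number of media properties headers (MDPR) in the main header section. 16 bits.
flags: a bit mask carrying information about the file. 16 bits. The documented bits are listed in the table below; all other bits should be set to 0. Note: the values 0x0009 and 0x000b have been seen in real files, so the table is evidently incomplete. File information bit mask: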
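Bit  Flag          Description
0    Save_Enabled  If 1, the file may be saved to disk
1    Perfect_Play  If 1, extra buffering is recommended
2    Live          If 1, the media stream comes from a live broadcast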
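The bit table can be exercised with a small decoder. The following is a minimal sketch, not part of any Real or Helix API: the function name decode_prop_flags is hypothetical, and the flag names are borrowed from the Helix reference quoted later in this document. Bits that the table does not document are reported separately instead of being guessed at.

```python
# Documented bits of the PROP flags field; everything else is undocumented here.
PROP_FLAGS = {0: "Save_Enabled", 1: "Perfect_Play", 2: "Live"}


def decode_prop_flags(flags):
    """Return (names of documented flags that are set, undocumented set bits)."""
    known, unknown = [], []
    for bit in range(16):  # the field is 16 bits wide
        if flags & (1 << bit):
            if bit in PROP_FLAGS:
                known.append(PROP_FLAGS[bit])
            else:
                unknown.append(bit)
    return known, unknown


if __name__ == "__main__":
    # The two values reported above as seen in real files.
    for value in (0x0009, 0x000B):
        print(hex(value), decode_prop_flags(value))
```

Both observed values set bit 3, which is the bit the table does not cover.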
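Example figure:
RM/RMVB File Format: MDPR (Media Properties Header)
The media properties header describes the specific properties of each media stream in an RM file. Components of the RM system use it to configure themselves for handling the media data in an RM stream. There is one media properties header for each media stream in an RM file.
Overall layout of the media properties header
The following structure shows how the media properties header is stored: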
Media_Properties
{
UINT32 object_id;[0 - 3]
UINT32 size;[4 - 7]
UINT16 object_version;[8 - 9]
if(object_version==0)
{
UINT16 stream_number; [10 - 11]
UINT32 max_bit_rate; [12 - 15]
UINT32 avg_bit_rate; [16 - 19]
UINT32 max_packet_size;[20 - 23]
UINT32 avg_packet_size;[24 - 27]
UINT32 start_time;[28 - 31]
UINT32 preroll; [32 - 35]
UINT32 duration; [36 -39]
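UINT8 stream_name_size;[40 - 40]
UINT8[stream_name_size] stream_name;[starts at 41, stream_name_size bytes]
UINT8 mime_type_size;[1 byte, follows stream_name]
UINT8[mime_type_size] mime_type;[mime_type_size bytes]
// mime_type determines how Type_Specific_Data is interpreted
UINT32 type_specific_len;[4 bytes, follows mime_type]
UINT8[type_specific_len] Type_Specific_Data;[type_specific_len bytes]
// described in detail later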
}
}
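The media properties header contains the following members:
object_id: the unique identifier of a media properties header, "MDPR".
size: the size of the media properties header.
object_version: the version number of the media properties header.
stream_number: the stream identifier. It tells which data stream (video or audio) of the RM file this media properties header describes; every data packet in the data section carries a matching identifier that shows which media stream the packet belongs to. Present only when the version is 0.
max_bit_rate: the maximum bit rate required to deliver this media stream over a network. Present only when the version is 0.
avg_bit_rate: the average bit rate required to deliver this media stream over a network. Present only when the version is 0.
max_packet_size: the size of the largest data packet of the stream, in bytes. Present only when the version is 0.
avg_packet_size: the average size of a data packet of the stream, in bytes. Present only when the version is 0.
start_time: a time offset in milliseconds that is added to packet timestamps. Present only when the version is 0.
preroll: the counterpart of start_time, a time offset in milliseconds that is subtracted from packet timestamps. Present only when the version is 0.
duration: the duration of the stream. Present only when the version is 0.
stream_name_size: the number of bytes occupied by the stream name. Present only when the version is 0. 8 bits.
stream_name: the name of the stream. Present only when the version is 0. Variable size.
mime_type_size: the number of bytes occupied by the next member (mime_type). Present only when the version is 0. 8 bits.
mime_type: a MIME-style type/subtype string associated with the stream. Present only when the version is 0. Variable size. Values seen: 1. audio/x-pn-realaudio; 2. video/x-pn-realvideo; 3. logical-fileinfo.
type_specific_len: the number of bytes occupied by the next member (type_specific_data). Present only when the version is 0.
Type_Specific_Data: data specific to the stream type, generally used to set up processing of the stream. Present only when the version is 0. Variable size.
Figure 1: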
RealMedia-rmff
Multimedia container format developed by Real and used almost exclusively for codecs developed by Real. The old .ra files are just for audio. The newer RealMedia (.rm) files are for audio and video.
RA Format
This is the old audio-only RealAudio file format. A very similar structure is also used to describe audio streams in RM files. The audio data part is just a stream of bytes with no structure. There is no index in .ra files, but seeking is possible because the codecs are CBR.
RealAudio 1.0 file (.ra version 3)
This is from the very first version of RealAudio (1995). These files can only contain 8kbps VSELP audio data. A FourCC (lpcJ) may be present, but it is ignored. Byte order is big-endian.
byte[4] Header signature ('.', 'r', 'a', 0xfd)
word Version (always 3)
word Header size, not including first 8 bytes
byte[10] Unknown
dword Data size
byte Title string length
byte[] Title string
byte Author string length
byte[] Author string
byte Copyright string length
byte[] Copyright string
byte Comment string length
byte[] Comment string
byte Unknown *
byte Fourcc string length (always 4) *
byte[] Fourcc string (always "lpcJ") *
Audio data
Notes:
Fields marked with * may be missing. Based on the only known sample with no FourCC, it's assumed that all these fields are either present or missing. To determine if they are missing, check the header size (bytes 6-7).
The informative fields (title, author, copyright and comment) can have zero length.
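The header-size check described in the notes can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference parser: the names read_pstring and parse_ra3_header are hypothetical, and the sketch assumes the audio data starts right after the header, at byte 8 + header size.

```python
import struct


def read_pstring(buf, pos):
    """Read a string prefixed by a one-byte length; return (bytes, next position)."""
    length = buf[pos]
    return buf[pos + 1:pos + 1 + length], pos + 1 + length


def parse_ra3_header(buf):
    """Parse a .ra version 3 header laid out as in the field list above."""
    if buf[:4] != b".ra\xfd":
        raise ValueError("missing .ra signature")
    version, header_size = struct.unpack_from(">HH", buf, 4)  # bytes 4-5 and 6-7
    if version != 3:
        raise ValueError("not a version 3 header")
    (data_size,) = struct.unpack_from(">I", buf, 18)  # follows 10 unknown bytes
    info = {"header_size": header_size, "data_size": data_size}
    pos = 22
    for key in ("title", "author", "copyright", "comment"):
        info[key], pos = read_pstring(buf, pos)
    header_end = 8 + header_size  # header size excludes the first 8 bytes
    info["fourcc"] = None
    if pos < header_end:  # bytes left over: the optional fields are present
        pos += 1  # unknown byte
        info["fourcc"], pos = read_pstring(buf, pos)
    info["audio_offset"] = header_end  # assumption: audio follows the header
    return info
```

The strings are returned as raw bytes because the text does not say which character encoding they use.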
RealAudio 2.0 file (.ra version 4)
This is the second version of the RealAudio file format. It is distinguished from the above by the value in byte 5 (=0x04). This type of file must contain a valid FourCC to identify the audio codec.
Possible FourCC values are 28_8, dnet and sipr.
byte[4] Header signature ('.', 'r', 'a', 0xfd)
word Version (always 4)
word Unused (always 0)
byte[4] ra4 signature (always ".ra4")
dword Data size - 0x27
word Version2 (always equal to version)
dword Header size - 16
word Codec flavor
dword Coded frame size
byte[12] Unknown
word Sub packet h
word Frame size
word Subpacket size
word Unknown
word Samplerate
word Unknown
word Sample size
word Channels
byte Interleaver ID string length (always 4)
byte[] Interleaver ID string
byte FourCC string length (always 4)
byte[] FourCC string
byte[3] Unknown
byte Title string length
byte[] Title string
byte Author string length
byte[] Author string
byte Copyright string length
byte[] Copyright string
byte Comment string length
byte[] Comment string
Audio Data
Notes:
The 0x27 in data size is the size of the fixed-length part of the header (up to channels).
The informative fields (title, author, copyright and comment) can have zero length.
.ra version 5
While the .ra header can contain version 5, there are no known RealAudio files with this format, and it's not known if they really exist.
RealMedia Format
This is the newer format which stores both audio and video. All multi-byte numbers are stored in big-endian format. A RealMedia file consists of a series of chunks. Each chunk has the following format:
dword chunk type (FOURCC)
dword chunk size, including 8-byte preamble
word chunk version
byte[] chunk payload
Real chunk types:
.RMF: RealMedia file header (only one per file, must be the first chunk)
PROP: File properties (only one per file)
MDPR: Stream properties (one for each stream)
CONT: Content description/metadata (typically one per file)
DATA: File data
INDX: File index (typically one per stream)
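Because every chunk starts with the same 10-byte preamble, a file can be walked chunk by chunk without understanding the payloads. The following is a minimal sketch; iter_chunks is a hypothetical name, and the sketch assumes every chunk size is accurate and covers the whole chunk, which may not hold for every file (for example streamed captures).

```python
import struct


def iter_chunks(buf):
    """Yield (fourcc, version, payload, offset) for each top-level chunk in buf."""
    pos = 0
    while pos + 10 <= len(buf):
        # dword type, dword size (includes the preamble), word version; big-endian
        fourcc, size, version = struct.unpack_from(">4sIH", buf, pos)
        if size < 10 or pos + size > len(buf):
            break  # truncated or malformed chunk: stop rather than guess
        yield fourcc, version, buf[pos + 10:pos + size], pos
        pos += size
```

A caller would typically check that the first chunk yielded is .RMF and then dispatch on the four-character code.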
RealMedia file header (.RMF)
This must be the first chunk in a RealMedia file. Only one .RMF can be present in a file. The only useful information carried by .RMF is the number of headers. A .RMF chunk has the following format
dword chunk type ('.RMF')
dword chunk size (typically 0x12)
word chunk version (always 0, for every known file)
dword file version
dword number of headers
Notes:
All known sample files have version equal to 0.
There is a sample with chunk size = 0x10, in that case file version is a word. Note that the sample has chunk version = 0 like all other files.
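The two layouts mentioned in the notes can be told apart by the chunk size. A minimal sketch, assuming those are the only two sizes that need handling (parse_rmf_header is a hypothetical name):

```python
import struct


def parse_rmf_header(buf):
    """Parse a .RMF chunk, accepting the usual 0x12 size and the 0x10 variant."""
    fourcc, size, chunk_version = struct.unpack_from(">4sIH", buf, 0)
    if fourcc != b".RMF":
        raise ValueError("missing .RMF signature")
    if size == 0x12:
        file_version, num_headers = struct.unpack_from(">II", buf, 10)
    elif size == 0x10:
        # Variant from the notes: file version stored as a word, not a dword.
        file_version, num_headers = struct.unpack_from(">HI", buf, 10)
    else:
        raise ValueError("unexpected .RMF chunk size %#x" % size)
    return {
        "size": size,
        "chunk_version": chunk_version,
        "file_version": file_version,
        "num_headers": num_headers,
    }
```

Keying on the size rather than the chunk version follows the observation that the 0x10 sample still has chunk version 0.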
File properties header (PROP)
This chunk contains some information about the general properties of a RealMedia file. Only one PROP chunk can be present in a file. A PROP chunk has the following format
dword Chunk type ('PROP')
dword Chunk size (typically 0x32)
word Chunk version (always 0, for every known file)
dword Maximum bit rate
dword Average bit rate
dword Size of largest data packet
dword Average size of data packet
dword Number of data packets in the file
dword File duration in ms
dword Suggested number of ms to buffer before starting playback
dword Offset of the first INDX chunk from the start of the file
dword Offset of the first DATA chunk from the start of the file
word Number of streams in the file
word Flags (bitfield, see below)
Flags:
bit 0: file can be saved on disk
bit 1: PerfectPlay can be used (extra buffering)
bit 2: the file is a live broadcast
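The fixed layout of PROP maps directly onto one unpack call. A minimal sketch, with hypothetical names PropHeader and parse_prop; the field names are shortened forms of the descriptions above, not identifiers from any SDK.

```python
import struct
from collections import namedtuple

PropHeader = namedtuple(
    "PropHeader",
    "max_bit_rate avg_bit_rate max_packet_size avg_packet_size num_packets "
    "duration_ms preroll_ms index_offset data_offset num_streams flags",
)


def parse_prop(chunk):
    """Parse a version 0 PROP chunk (10-byte preamble plus 40 bytes of fields)."""
    fourcc, size, version = struct.unpack_from(">4sIH", chunk, 0)
    if fourcc != b"PROP" or version != 0:
        raise ValueError("not a version 0 PROP chunk")
    # Nine dwords followed by two words, all big-endian.
    return PropHeader(*struct.unpack_from(">9I2H", chunk, 10))
```

An index_offset of 0 in the result would mean the file carries no INDX chunk, as described for this field elsewhere in the document.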
Media properties header (MDPR)
This chunk contains information about the properties of a RealMedia stream. This header defines the type of a stream and the codec used. All codec-related data is in the type specific part of this header. Many fields share the same meanings as the ones in the PROP chunk, but in this case they are specific to one stream. There is one MDPR chunk for every stream in the file. An MDPR chunk has the following format
dword Chunk type ('MDPR')
dword Chunk size
word Chunk version (always 0, for every known file)
word Stream number
dword Maximum bit rate
dword Average bit rate
dword Size of largest data packet
dword Average size of data packet
dword Stream start offset in ms
dword Preroll in ms (to be subtracted from timestamps?)
dword Stream duration in ms
byte Size of stream description string
byte[] Stream description string
byte Size of stream mime type string
byte[] Mime type string
dword Size of type specific part of the header
byte[] Type specific data, meaning and format depends on mime type
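Unlike PROP, the MDPR chunk has variable-length strings, so the tail has to be read step by step. A minimal sketch (parse_mdpr is a hypothetical name; strings are kept as raw bytes, and the type specific data is returned uninterpreted):

```python
import struct


def parse_mdpr(chunk):
    """Parse a version 0 MDPR chunk into a dict of its fields."""
    fourcc, size, version = struct.unpack_from(">4sIH", chunk, 0)
    if fourcc != b"MDPR" or version != 0:
        raise ValueError("not a version 0 MDPR chunk")
    keys = ("stream_number", "max_bit_rate", "avg_bit_rate", "max_packet_size",
            "avg_packet_size", "start_time", "preroll", "duration")
    info = dict(zip(keys, struct.unpack_from(">H7I", chunk, 10)))
    pos = 40  # 10-byte preamble + one word + seven dwords
    name_len = chunk[pos]
    info["stream_name"] = chunk[pos + 1:pos + 1 + name_len]
    pos += 1 + name_len
    mime_len = chunk[pos]
    info["mime_type"] = chunk[pos + 1:pos + 1 + mime_len]
    pos += 1 + mime_len
    (specific_len,) = struct.unpack_from(">I", chunk, pos)
    pos += 4
    info["type_specific_data"] = chunk[pos:pos + specific_len]
    return info
```

A demuxer would then branch on mime_type to decide how to read type_specific_data.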
Audio (audio/)
audio/x-pn-realaudio and audio/x-pn-multirate-realaudio
These mimetypes are used to specify streams with RealAudio codecs. There are 3 known versions of this datablock: ra3, ra4, ra5. ra3 is used only with the old 14_4 codec, ra4 and ra5 can be used with all the other codecs. The audio block has this format
byte[4] Header signature ('.', 'r', 'a', 0xfd)
word Version (3, 4 or 5)
#if version == 3
word Header size, not including first 8 bytes
byte[10] Unknown
dword Data size
byte Title string length
byte[] Title string
byte Author string length
byte[] Author string
byte Copyright string length
byte[] Copyright string
byte Comment string length
byte[] Comment string
byte Unknown *
byte Fourcc string length (always 4) *
byte[] Fourcc string (always "lpcJ") *
#elseif version == 4 or version == 5
word Unused (always 0)
byte[4] ra signature (".ra4" or ".ra5", depending on version)
dword Unknown (maybe data size)
word Version2 (always equal to version)
dword Header size
word Codec flavor
dword Coded frame size
byte[12] Unknown
word Sub packet h
word Frame size
word Subpacket size
word Unknown
#if version == 5
byte[6] Unknown
#endif
word Samplerate
word Unknown
word Sample size
word Channels
#if version == 4
byte Interleaver ID string length (always 4)
byte[] Interleaver ID string
byte FourCC string length (always 4)
byte[] FourCC string
#endif
#if version == 5
dword Interleaver ID
dword FourCC
#endif
byte[3] Unknown
#if version == 5
byte Unknown
#endif
dword Codec extradata length
byte[] Codec extradata
#endif
audio/X-MP3-draft-00
This is used to store MP3 audio in the rm container. When this mimetype is used, the type-specific part of the MDPR header is not used and its length is set to 0. The MP3 frames are stored in ADU format (see RFC 3119 for details) with no interleaving (at least this is true in the only known sample).
audio/x-ralf-mpeg4
This is used to store ralf lossless audio. This is the only known RealAudio codec that does not use the x-pn-realaudio mimetype. The format of this type-specific data is not known.
Content description header (CONT)
This chunk contains some text information (like title, author, ...) about the content of the file. This header has an informative purpose only and it's not needed to demux the file. A CONT chunk has the following format
dword Chunk type ('CONT')
dword Chunk size
word Chunk version (always 0, for every known file)
word Title string length
byte[] Title string
word Author string length
byte[] Author string
word Copyright string length
byte[] Copyright string
word Comment string length
byte[] Comment string
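The CONT chunk uses 16-bit length prefixes, unlike the 8-bit prefixes in the .ra headers. A minimal sketch of reading the four strings (parse_cont is a hypothetical name; the strings stay as bytes because the text does not name an encoding):

```python
import struct


def parse_cont(chunk):
    """Parse a CONT chunk into a dict of four byte strings."""
    fourcc, size, version = struct.unpack_from(">4sIH", chunk, 0)
    if fourcc != b"CONT":
        raise ValueError("not a CONT chunk")
    pos = 10
    fields = {}
    for key in ("title", "author", "copyright", "comment"):
        (length,) = struct.unpack_from(">H", chunk, pos)  # word length prefix
        fields[key] = chunk[pos + 2:pos + 2 + length]
        pos += 2 + length
    return fields
```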
Data header (DATA)
This chunk contains a group of data packets. Packets from each stream are interleaved, except for multirate files. A DATA chunk has the following format
dword Chunk type ('DATA')
dword Chunk size
word Chunk version (always 0, for every known file)
dword Number of data packets in this chunk
dword Offset of the next DATA chunk (from the start of the file)
byte[] Data packets
Each data packet has this format
word Packet version (0 or 1 in available samples)
word Packet size
word Stream number
dword Timestamp (in ms)
byte Unknown
byte Flags (bitfield, see below)
#if version == 1
byte Unknown
#endif
byte[] Stream-specific data
Flags:
bit 0: reliable packet (refers to network transmission method)
bit 1: keyframe
Note: The previous description of the data packet comes from working demuxer code, the description in official Real docs (somewhere on Helix site) is a bit different:
word Packet version
word Packet size
word Stream number
dword Timestamp
#if version == 0
byte Packet group
byte Flags
#endif
#if version == 1
word ASM rule
byte ASM flags
#endif
byte[] Stream-specific data
where packet group is "The packet group to which the packet belongs. If packet grouping is not used, set this field to 0 (zero)", asm rule is "The ASM rule assigned to this packet" and asm flags "Contains HX_ flags that dictate stream switching points".
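A minimal sketch of reading one packet, following the demuxer-derived layout (the first of the two descriptions above). The name parse_packet is hypothetical. The sketch assumes that the packet size field counts the whole packet including its own header; the text does not state this, so treat it as an assumption to verify against real files.

```python
import struct


def parse_packet(buf, pos=0):
    """Parse the data packet starting at pos; return a dict with its fields."""
    version, length, stream, timestamp, _unknown, flags = struct.unpack_from(
        ">HHHIBB", buf, pos)
    if version == 0:
        header_len = 12
    elif version == 1:
        header_len = 13  # one extra unknown byte after the flags
    else:
        raise ValueError("unsupported packet version %d" % version)
    return {
        "version": version,
        "stream_number": stream,
        "timestamp_ms": timestamp,
        "reliable": bool(flags & 0x01),  # bit 0
        "keyframe": bool(flags & 0x02),  # bit 1
        "payload": buf[pos + header_len:pos + length],  # assumes length covers header
        "next_pos": pos + length,
    }
```

Calling the function again with next_pos steps through the packets of a DATA chunk one at a time.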
Index header (INDX)
This chunk contains index entries. It comes after all the DATA chunks. An index chunk contains data for a single stream, and a file can have more than one INDX chunk. An INDX chunk has the following format
dword Chunk type ('INDX')
dword Chunk size
word Chunk version (always 0, for every known file)
dword Number of entries in this chunk
word Stream number
dword Offset of the next INDX chunk (from the start of the file)
byte[] Index entries
Each index entry has this format
word Entry version (always 0, for every known file)
dword Timestamp (in ms)
dword Packet offset in file (from the start of the file)
dword Packet number
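An index chunk is a flat array of 14-byte entries, which makes seeking a matter of a sorted lookup. A minimal sketch; parse_indx and seek_offset are hypothetical names, and the lookup assumes the entries are stored in increasing timestamp order, which the text does not guarantee.

```python
import struct
from bisect import bisect_right


def parse_indx(chunk):
    """Return (stream_number, next_indx_offset, entries) for an INDX chunk.

    Each entry is a (timestamp_ms, packet_offset, packet_number) tuple.
    """
    fourcc, size, version, count, stream, next_indx = struct.unpack_from(
        ">4sIHIHI", chunk, 0)
    if fourcc != b"INDX":
        raise ValueError("not an INDX chunk")
    entries = []
    pos = 20  # size of the fixed part before the entries
    for _ in range(count):
        _entry_version, ts, offset, number = struct.unpack_from(">HIII", chunk, pos)
        entries.append((ts, offset, number))
        pos += 14
    return stream, next_indx, entries


def seek_offset(entries, target_ms):
    """File offset of the last indexed packet at or before target_ms, else None."""
    timestamps = [entry[0] for entry in entries]  # assumed sorted
    i = bisect_right(timestamps, target_ms)
    return entries[i - 1][1] if i else None
```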
Codecs
Codecs in RealMedia are identified by the following four character codes:
Audio
lpcJ - RealAudio 1.0 (VSELP)
28_8 - RealAudio 2.0 (LD-CELP)
dnet - AC3
sipr - Sipro
cook - Cook
atrc - ATRAC
ralf - RealAudio Lossless Format
raac - LC-AAC
racp - HE-AAC
Video
RV10 - H.263
RV13 - H.263
RV20 - H.263+
RV30 - H.264 precursor
RV40 - H.264 precursor
static const PVCodecTag rm_codec_tags[] = {
{ CODEC_ID_RV10, MKTAG('R','V','1','0') },
{ CODEC_ID_RV20, MKTAG('R','V','2','0') },
{ CODEC_ID_RV20, MKTAG('R','V','T','R') },
{ CODEC_ID_RV30, MKTAG('R','V','3','0') },
{ CODEC_ID_RV40, MKTAG('R','V','4','0') },
{ CODEC_ID_AC3, MKTAG('d','n','e','t') },
{ CODEC_ID_RA_144, MKTAG('l','p','c','J') },
{ CODEC_ID_RA_288, MKTAG('2','8','_','8') },
{ CODEC_ID_COOK, MKTAG('c','o','o','k') },
{ CODEC_ID_ATRAC3, MKTAG('a','t','r','c') },
{ CODEC_ID_SIPR, MKTAG('s','i','p','r') },
{ CODEC_ID_AAC, MKTAG('r','a','a','c') },
{ CODEC_ID_AAC, MKTAG('r','a','c','p') },
{ CODEC_ID_NONE },
};
Appendix E: RealMedia File Format (RMFF) Reference
The Helix architecture supports RealMedia File Format (RMFF), which enables Helix to deliver high-quality multimedia content over a variety of network bandwidths. Third-party developers can convert their media formats into RMFF, enabling Helix Universal Server to deliver the files to RealPlayer or other applications built with the Helix Client and Server Software Development Kit. Third-party developers can thereby use Helix to transport content over the Internet to their own applications.
RealMedia File Format is a standard tagged file format that uses four-character codes to identify file elements. These codes are 32-bit, represented by a sequence of one to four ASCII alphanumeric characters, padded on the right with space characters. The data type for four-character codes is FOURCC. Use the HX_FOURCC macro to convert four characters into a four-character code.
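The definition of HX_FOURCC is not reproduced in this text, so the following is only a sketch of the idea: it packs up to four ASCII characters into a 32-bit value, pads with spaces on the right as described, and puts the first character in the most significant byte to match the big-endian byte order used in RealMedia files. The function names fourcc and fourcc_to_str are hypothetical.

```python
def fourcc(code):
    """Pack one to four ASCII characters into a 32-bit integer code."""
    raw = code.encode("ascii")
    if not 1 <= len(raw) <= 4:
        raise ValueError("a four-character code needs 1 to 4 characters")
    # Pad on the right with spaces, first character in the high byte.
    return int.from_bytes(raw.ljust(4, b" "), "big")


def fourcc_to_str(value):
    """Turn a 32-bit code back into its four-character string."""
    return value.to_bytes(4, "big").decode("ascii")
```

With this convention, fourcc(".RMF") is 0x2E524D46, the same four bytes that open every RealMedia file.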
The basic building block of a RealMedia File is a chunk, which is a logical unit of data, such as a stream header or a packet of data. Each chunk contains the following fields:
Four-character code specifying the chunk identifier
32-bit value specifying the size of the data member in the chunk
Blob of opaque chunk data
Depending on its type, a top-level chunk can contain subobjects. This document describes the tagged chunks contained in RMFF, as well as the format of the data stored in each type of tagged chunk.
Tagged File Formats
RealMedia File Format organizes tagged chunks into a header section, a data section, and an index section. The organization of these tagged chunks is shown in the following figure.
Sections of a RealMedia File
Header Section
Because RMFF is a tagged file format, the order of the chunks is not explicit, except that the RealMedia File Header must be the first chunk in the file. However, most applications write the standard headers into the file's header section. The following chunks are typically found in the header section of RMFF:
RealMedia File Header (this must be the first chunk of the file)
Properties Header
Media Properties Header
Content Description Header
After the RealMedia File Header object, the other headers may appear in any order. All headers are required except the Index Header. The following sections describe the individual header objects.
RealMedia File Header
Each RealMedia file begins with the RealMedia File Header, which identifies the file as RMFF. There is only one RealMedia File Header in a RealMedia file. Because the contents of the RealMedia File Header may change with different versions of RMFF, the header structure supports an object version field for determining what additional fields exist. The following pseudo-structure describes the RealMedia File Header:
RealMedia_File_Header
{
UINT32 object_id;
UINT32 size;
UINT16 object_version;
if ((object_version == 0) || (object_version == 1))
{
UINT32 file_version;
UINT32 num_headers;
}
}
The RealMedia File Header contains the following fields:
object_id: The unique object ID for a RealMedia File (.RMF). All RealMedia files begin with this identifier. The size of this member is 32 bits.
size: The size of the RealMedia header section in bytes. The size of this member is 32 bits.
object_version: The version of the RealMedia File Header object. All files created according to this specification have an object_version number of 0 (zero) or 1. The size of this member is 16 bits.
file_version: The version of the RealMedia file. The Helix Client and Server SDK only covers files with a file version of either 0 (zero) or 1. This member is present on all RealMedia_File_Header objects with an object_version of 0 (zero) or 1. The size of this member is 32 bits.
num_headers: The number of headers in the header section that follow the RealMedia File Header. This member is present on all RealMedia_File_Header objects with an object_version of 0 (zero) or 1. The size of this member is 32 bits.
Properties Header
The Properties Header describes the general media properties of the RealMedia File. Components of the RealMedia system use this object to configure themselves for handling the data in the RealMedia file or stream. There is only one Properties Header in a RealMedia file. The following pseudo-structure describes the Properties header:
Properties
{
UINT32 object_id;
UINT32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT32 max_bit_rate;
UINT32 avg_bit_rate;
UINT32 max_packet_size;
UINT32 avg_packet_size;
UINT32 num_packets;
UINT32 duration;
UINT32 preroll;
UINT32 index_offset;
UINT32 data_offset;
UINT16 num_streams;
UINT16 flags;
}
}
The Properties Header contains the following fields:
object_id: The unique object ID for a Properties Header ('PROP'). The size of this member is 32 bits.
size: The size of the Properties Header in bytes. The size of this member is 32 bits.
object_version: The version of the Properties Header object. All files created according to this specification have an object_version number of 0 (zero). The size of this member is 16 bits.
max_bit_rate: The maximum bit rate required to deliver this file over a network. This member is present on all Properties objects with an object_version of 0 (zero). The size of this member is 32 bits.
avg_bit_rate: The average bit rate required to deliver this file over a network. This member is present on all Properties objects with an object_version of 0 (zero). The size of this member is 32 bits.
max_packet_size: The largest packet size (in bytes) in the media data. This member is present on all Properties objects with an object_version of 0 (zero). The size of this member is 32 bits.
avg_packet_size: The average packet size (in bytes) in the media data. This member is present on all Properties objects with an object_version of 0 (zero). The size of this member is 32 bits.
num_packets: The number of packets in the media data. This member is present on all Properties objects with an object_version of 0 (zero). The size of this member is 32 bits.
duration: The duration of the file in milliseconds. This member is present on all Properties objects with an object_version of 0 (zero). The size of this member is 32 bits.
preroll: The number of milliseconds to prebuffer before starting playback. This member is present on all Properties objects with an object_version of 0 (zero). The size of this member is 32 bits.
index_offset: The offset in bytes from the start of the file to the start of the index header object. This value can be 0 (zero), which indicates that no index chunks are present in this file. This member is present on all Properties objects with an object_version of 0 (zero). The size of this member is 32 bits.
data_offset: The offset in bytes from the start of the file to the start of the Data Section. This member is present on all Properties objects with an object_version of 0 (zero). The size of this member is 32 bits.
Note: There can be a number of Data_Chunk_Headers in a RealMedia file. The data_offset value specifies the offset in bytes to the first Data_Chunk_Header. The offsets to the other Data_Chunk_Headers can be derived from the next_data_header field in a Data_Chunk_Header.
num_streams: The total number of media properties headers in the main headers section. This member is present on all Properties objects with an object_version of 0 (zero). The size of this member is 16 bits.
flags: Bit mask containing information about this file. The size of this member is 16 bits. The following bits carry information; all of the rest should be zero:
Bit  Flag          Description
0    Save_Enabled  If 1, clients are allowed to save this file to disk.
1    Perfect_Play  If 1, clients are instructed to use extra buffering.
2    Live          If 1, these streams are from a live broadcast.
Media Properties Header
The Media Properties Header describes the specific media properties of each stream in a RealMedia file. Components of the RealMedia system use this object to configure themselves for handling the media data in each stream. There is one Media Properties Header for each media stream in a RealMedia file. The following pseudo-structure describes the Media Properties header:
Media_Properties
{
UINT32 object_id;
UINT32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT16 stream_number;
UINT32 max_bit_rate;
UINT32 avg_bit_rate;
UINT32 max_packet_size;
UINT32 avg_packet_size;
UINT32 start_time;
UINT32 preroll;
UINT32 duration;
UINT8 stream_name_size;
UINT8[stream_name_size] stream_name;
UINT8 mime_type_size;
UINT8[mime_type_size] mime_type;
UINT32 type_specific_len;
UINT8[type_specific_len] type_specific_data;
}
}
The Media Properties Header contains the following members:
object_id: The unique object ID for a Media Properties Header ("MDPR"). The size of this member is 32 bits.
size: The size of the Media Properties Header in bytes. The size of this member is 32 bits.
object_version: The version of the Media Properties Header object. The size of this member is 16 bits.
stream_number: The stream_number (synchronization source identifier) is a unique value that identifies a physical stream. Every data packet that belongs to a physical stream contains the same stream_number. The stream_number enables a receiver of multiple physical streams to distinguish which packets belong to each physical stream. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is 16 bits, as declared in the pseudo-structure above.
max_bit_rate: The maximum bit rate required to deliver this stream over a network. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is 32 bits.
avg_bit_rate: The average bit rate required to deliver this stream over a network. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is 32 bits.
max_packet_size: The largest packet size (in bytes) in the stream of media data. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is 32 bits.
avg_packet_size: The average packet size (in bytes) in the stream of media data. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is 32 bits.
start_time: The time offset in milliseconds to add to the time stamp of each packet in a physical stream. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is 32 bits.
preroll: The time offset in milliseconds to subtract from the time stamp of each packet in a physical stream. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is 32 bits.
duration: The duration of the stream in milliseconds. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is 32 bits.
stream_name_size: The length of the following stream_name member in bytes. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is 8 bits.
stream_name: A nonunique alias or name for the stream. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is variable.
mime_type_size: The length of the following mime_type field in bytes. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is 8 bits.
mime_type: A nonunique MIME style type/subtype string for data associated with the stream. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is variable.
type_specific_len: The length of the following type_specific_data in bytes. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is 32 bits.
type_specific_data: The type_specific_data is typically used by the data type renderer to initialize itself in order to process the physical stream. This member is present on all MediaProperties objects with an object_version of 0 (zero). The size of this member is variable.
Logical Stream Organization
A RealMedia file can contain a higher-level grouping of physical streams. This grouping is called a logical stream. A logical stream:
Identifies which physical streams are grouped together into a logical stream.
Contains name-value properties that can be used to identify properties of the logical stream (such as language, packet grouping, and so on).
A logical stream is represented with a Media Properties Header. The mime type of the physical stream is preceded with "logical-". For example, the mime type for an ASM-compatible RealAudio stream is audio/x-pn-multirate-realaudio. A logical stream consisting of a set of RealAudio physical streams would therefore have the mime type logical-audio/x-pn-multirate-realaudio. An example of a logical stream is shown in the following figure.
Logical Stream Organization
In this example there is one logical stream, one low bit rate audio stream and one high bit rate audio stream. This results in a RealMedia file with three Media Property Headers and three data sections. The type_specific_data field of the logical stream's Media Property Header contains a LogicalStream structure. This structure contains all of the information required to interpret the logical stream and its collection of physical streams. The structure refers to the low bit rate and high bit rate audio streams. The LogicalStream structure also contains the data_offsets to the start of the data section for each physical stream.
The logical stream number assigned to the logical stream is determined from the stream_number field in the Media Properties Header.
There is also one special logical stream of MIME type "logical-fileinfo" containing information about the entire file. There should only be one media header with this type. Behavior of players and editing tools is undefined if you have more than one.
The ASM rules contained in the logical-fileinfo stream are used to define precisely how bandwidth will be divided between the streams in the file. The logical-fileinfo may also contain a name-value pair that specifies which stream combinations should be served to older players.
LogicalStream Structure
The following sample shows the LogicalStream structure:
LogicalStream
{
ULONG32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT16 num_physical_streams;
UINT16 physical_stream_numbers[num_physical_streams];
ULONG32 data_offsets[num_physical_streams];
UINT16 num_rules;
UINT16 rule_to_physical_stream_number_map[num_rules];
UINT16 num_properties;
NameValueProperty properties[num_properties];
}
};
The LogicalStream structure contains the following fields:
size: The size of the LogicalStream structure in bytes. The size of this structure member is 32 bits.
object_version: The version of the LogicalStream structure. The size of this structure member is 16 bits.
num_physical_streams: The number of physical streams that make up this logical stream. The physical stream numbers are stored in a list immediately following this field. These physical stream numbers refer to the stream_number field found in the Media Properties Object for each physical stream belonging to this logical stream. The size of this structure member is 16 bits.
physical_stream_numbers[]: The list of physical stream numbers that comprise this logical stream. The size of this structure member is variable.
data_offsets[]: The list of data offsets indicating the start of the data section for each physical stream. The size of this structure member is variable.
num_rules: The number of ASM rules for the logical stream. Each physical stream in the logical stream must have at least one ASM rule associated with it, or it will never get played. The mapping of ASM rule numbers to physical stream numbers is stored in a list immediately following this member. These physical stream numbers refer to the stream_number field found in the Media Properties Object for each physical stream belonging to this logical stream. The size of this structure member is 16 bits.
rule_to_physical_stream_number_map[]: The list of physical stream numbers that map to each rule. Each entry in the map corresponds to a 0-based rule number. The value in each entry is set to the physical stream number for the rule. For example, rule_to_physical_stream_number_map[0] = 5 means that physical stream 5 corresponds to rule 0. All of the ASM rules referenced by this array are stored in the first name-value pair of this logical stream, which must be called "ASMRuleBook" and be of type "string". Each rule is separated by a semicolon. The size of this structure member is variable.
num_properties: The number of NameValueProperty structures contained in this structure. These name/value structures can be used to identify properties of this logical stream (for example, language). The size of this structure member is 16 bits.
properties[]: The list of NameValueProperty structures (see NameValueProperty Structure below for more details). As mentioned above, the first name-value pair must be a string named "ASMRuleBook" and must contain the ASM rules for this logical stream. The size of this structure member is variable.
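The field layout above can be read sequentially. The following is a minimal Python sketch, not a reference implementation: it assumes big-endian (network) byte order, which this section does not state, and the function name parse_logical_stream and the dictionary keys are hypothetical. It stops before the properties list and only reports where that list begins.

```python
import struct

def parse_logical_stream(buf: bytes, pos: int = 0) -> dict:
    """Parse a version-0 LogicalStream up to (not including) properties[].

    Assumption: all integers are big-endian.
    """
    size, version = struct.unpack_from(">IH", buf, pos)
    if version != 0:
        raise ValueError(f"unsupported LogicalStream version {version}")
    cur = pos + 6
    (num_streams,) = struct.unpack_from(">H", buf, cur)
    cur += 2
    stream_numbers = struct.unpack_from(f">{num_streams}H", buf, cur)
    cur += 2 * num_streams
    data_offsets = struct.unpack_from(f">{num_streams}I", buf, cur)
    cur += 4 * num_streams
    (num_rules,) = struct.unpack_from(">H", buf, cur)
    cur += 2
    # Entry i holds the physical stream number for 0-based rule i.
    rule_to_stream = struct.unpack_from(f">{num_rules}H", buf, cur)
    cur += 2 * num_rules
    (num_properties,) = struct.unpack_from(">H", buf, cur)
    cur += 2
    return {
        "size": size,
        "physical_stream_numbers": stream_numbers,
        "data_offsets": data_offsets,
        "rule_to_stream": rule_to_stream,
        "num_properties": num_properties,
        "properties_offset": cur,  # first NameValueProperty starts here
    }
```

With the example from the text, rule_to_stream[0] would be 5 when physical stream 5 serves rule 0.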
NameValueProperty Structure
The following sample shows the NameValueProperty structure:
NameValueProperty
{
ULONG32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT8 name_length;
UINT8 name[name_length];
INT32 type;
UINT16 value_length;
UINT8 value_data[value_length];
}
}
The NameValueProperty structure contains the following fields:
size: The size of the NameValueProperty structure in bytes. The size of this structure member is 32 bits.
object_version: The version of the NameValueProperty structure. The size of this structure member is 16 bits.
name_length: The length of the name data. The size of this structure member is 8 bits.
name[]: The name string data. Each element is 8 bits; the total size is name_length bytes.
type: The type of the value data. This member can take on one of three values (any other value is undefined), as shown in the following table:
  type  Description                       value_length
  0     32-bit unsigned integer property  4
  1     buffer                            variable
  2     string                            variable
  The size of this structure member is 32 bits.
value_length: The length of the value data. The size of this structure member is 16 bits.
value_data[]: The value data. Each element is 8 bits; the total size is value_length bytes.
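As an illustration of the type table, here is a small Python sketch that decodes one NameValueProperty. It assumes big-endian byte order and ASCII text, and it strips trailing NUL bytes from strings in case a writer stored a terminator; none of these details are stated in this section. The function name is hypothetical.

```python
import struct

def parse_name_value_property(buf: bytes, pos: int = 0):
    """Return (name, type, value, next_offset) for a version-0 property."""
    size, version = struct.unpack_from(">IH", buf, pos)
    if version != 0:
        raise ValueError(f"unsupported NameValueProperty version {version}")
    cur = pos + 6
    name_length = buf[cur]
    cur += 1
    name = buf[cur:cur + name_length].decode("ascii", errors="replace")
    cur += name_length
    ptype, value_length = struct.unpack_from(">iH", buf, cur)
    cur += 6
    raw = buf[cur:cur + value_length]
    if ptype == 0:    # 32-bit unsigned integer, value_length must be 4
        if value_length != 4:
            raise ValueError("integer property must be 4 bytes long")
        value = struct.unpack(">I", raw)[0]
    elif ptype == 1:  # opaque buffer
        value = raw
    elif ptype == 2:  # string (assumed ASCII, optional NUL terminator)
        value = raw.rstrip(b"\0").decode("ascii", errors="replace")
    else:
        raise ValueError(f"undefined property type {ptype}")
    # The size field covers the whole structure, so it locates the next one.
    return name, ptype, value, pos + size
```

Because the first property of a logical stream is the "ASMRuleBook" string, its individual rules can then be obtained with value.split(";").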
Content Description Header
The Content Description Header contains the title, author, copyright, and comments information for the RealMedia file. All text data is in ASCII format. The following pseudo-structure describes the Content Description Header:
Content_Description
{
UINT32 object_id;
UINT32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT16 title_len;
UINT8[title_len] title;
UINT16 author_len;
UINT8[author_len] author;
UINT16 copyright_len;
UINT8[copyright_len] copyright;
UINT16 comment_len;
UINT8[comment_len] comment;
}
}
The Content Description Header contains the following fields:
object_id: The unique object ID for the Content Description Header ("CONT"). The size of this member is 32 bits.
size: The size of the Content Description Header in bytes. The size of this member is 32 bits.
object_version: The version of the Content Description Header object. The size of this member is 16 bits.
title_len: The length of the title data in bytes. Note that the title data is not null-terminated. This member is present on all Content Description Header objects with an object_version of 0 (zero). The size of this member is 16 bits.
title: An array of ASCII characters that represents the title information for the RealMedia file. This member is present on all Content Description Header objects with an object_version of 0 (zero). The size of this member is variable.
author_len: The length of the author data in bytes. Note that the author data is not null-terminated. This member is present on all Content Description Header objects with an object_version of 0 (zero). The size of this member is 16 bits.
author: An array of ASCII characters that represents the author information for the RealMedia file. This member is present on all Content Description Header objects with an object_version of 0 (zero). The size of this member is variable.
copyright_len: The length of the copyright data in bytes. Note that the copyright data is not null-terminated. This member is present on all Content Description Header objects with an object_version of 0 (zero). The size of this member is 16 bits.
copyright: An array of ASCII characters that represents the copyright information for the RealMedia file. This member is present on all Content Description Header objects with an object_version of 0 (zero). The size of this member is variable.
comment_len: The length of the comment data in bytes. Note that the comment data is not null-terminated. This member is present on all Content Description Header objects with an object_version of 0 (zero). The size of this member is 16 bits.
comment: An array of ASCII characters that represents the comment information for the RealMedia file. This member is present on all Content Description Header objects with an object_version of 0 (zero). The size of this member is variable.
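Since the four text fields share the same length-prefixed layout, they can be read in a loop. The sketch below is one possible reading in Python, assuming big-endian byte order (not stated in this section); the function name is hypothetical.

```python
import struct

def parse_content_description(buf: bytes, pos: int = 0) -> dict:
    """Read title, author, copyright and comment from a version-0 CONT header."""
    object_id, size, version = struct.unpack_from(">4sIH", buf, pos)
    if object_id != b"CONT":
        raise ValueError("not a Content Description Header")
    if version != 0:
        raise ValueError(f"unsupported CONT version {version}")
    cur = pos + 10
    fields = {}
    for key in ("title", "author", "copyright", "comment"):
        (length,) = struct.unpack_from(">H", buf, cur)
        cur += 2
        # The text is not null-terminated; the length prefix is authoritative.
        fields[key] = buf[cur:cur + length].decode("ascii", errors="replace")
        cur += length
    return fields
```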
Data Section
The data section of the RealMedia file consists of a Data Chunk Header that describes the contents of the data section, followed by a series of interleaved media data packets. Note that the size field of the Data Chunk Header is the size of the entire
data chunk, including the media data packets.
Data Chunk Header
The Data Chunk Header marks the start of the data chunk. There is usually only one data chunk in a RealMedia file; however, extremely large files may contain multiple data chunks. The following pseudo-structure describes the Data Chunk Header:
Data_Chunk_Header
{
UINT32 object_id;
UINT32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT32 num_packets;
UINT32 next_data_header;
}
}
The Data Chunk Header contains the following fields:
object_id: The unique object ID for the Data Chunk Header ("DATA"). The size of this member is 32 bits.
size: The size of the Data Chunk in bytes. The size includes the size of the header plus the size of all the packets in the data chunk. The size of this member is 32 bits.
object_version: The version of the Data Chunk Header object. The size of this member is 16 bits.
num_packets: The number of packets in the data chunk. This member is present on all Data Chunk Header objects with an object_version of 0 (zero). The size of this member is 32 bits.
next_data_header: The offset from the start of the file to the next data chunk. A non-zero value refers to the file offset of the next data chunk. A value of zero means there are no more data chunks in this file. This field is not typically used. This member is present on all Data Chunk Header objects with an object_version of 0 (zero). The size of this member is 32 bits.
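A version-0 Data Chunk Header is a fixed 18 bytes (4 + 4 + 2 + 4 + 4). As a rough sketch, assuming big-endian byte order and using hypothetical names, it can be unpacked in one call:

```python
import struct

DATA_HEADER = struct.Struct(">4sIHII")  # 18 bytes for object_version 0

def parse_data_chunk_header(buf: bytes, pos: int = 0) -> dict:
    """Unpack a version-0 Data Chunk Header found at offset pos."""
    object_id, size, version, num_packets, next_data_header = \
        DATA_HEADER.unpack_from(buf, pos)
    if object_id != b"DATA":
        raise ValueError("not a Data Chunk Header")
    if version != 0:
        raise ValueError(f"unsupported DATA version {version}")
    return {
        "num_packets": num_packets,
        "next_data_header": next_data_header,  # 0 means no further chunks
        "first_packet_offset": pos + DATA_HEADER.size,
        "end_offset": pos + size,  # size covers the header and all packets
    }
```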
Data Packet Header
Following a data chunk header are num_packets data packets. These packets can all be from the same stream, or packets from different streams can follow one another. These packets, whether from the same stream or from different streams, should have non-decreasing
timestamps. That is, the timestamp of a packet should be greater than or equal to the timestamp of the previous packet in the file.
The following pseudo-structure describes the details of the packet:
Media_Packet_Header
{
UINT16 object_version;
if ((object_version == 0) || (object_version == 1))
{
UINT16 length;
UINT16 stream_number;
UINT32 timestamp;
if (object_version == 0)
{
UINT8 packet_group;
UINT8 flags;
}
else if (object_version == 1)
{
UINT16 asm_rule;
UINT8 asm_flags;
}
UINT8[length] data;
}
else
{
StreamDone();
}
}
The Media Packet Header contains the following fields:
object_version: The version of the Media Packet Header object. The size of this member is 16 bits.
length: The length of the packet in bytes. This member is present on all Media Packet Header objects with an object_version of 0 (zero) or 1. The size of this member is 16 bits.
stream_number: The 16-bit alias used to associate data packets with their Media Properties Header. This member is present on all Media Packet Header objects with an object_version of 0 (zero) or 1. The size of this member is 16 bits.
timestamp: The time stamp of the packet in milliseconds. This member is present on all Media Packet Header objects with an object_version of 0 (zero) or 1. The size of this member is 32 bits.
packet_group: The packet group to which the packet belongs. If packet grouping is not used, set this field to 0 (zero). This member is present on all Media Packet Header objects with an object_version of 0 (zero). The size of this member is 8 bits.
flags: Flags describing the properties of the packet. The following flags are defined:
  HX_RELIABLE_FLAG: If this flag is set, the packet is delivered reliably.
  HX_KEYFRAME_FLAG: If this flag is set, the packet is part of a key frame or in some way marks a boundary in your data stream.
  This member is present on all Media Packet Header objects with an object_version of 0 (zero). The size of this member is 8 bits.
asm_rule: The ASM rule assigned to this packet. Only present if object_version equals 1. The size of this member is 16 bits.
asm_flags: Contains HX_ flags that dictate stream switching points. Only present if object_version equals 1. The size of this member is 8 bits.
data: The application-specific media data. This member is present on all Media Packet Header objects with an object_version of 0 (zero) or 1. The size of this member is variable.
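The version-dependent branch of the pseudo-structure can be sketched in Python as follows. This is an illustration under stated assumptions: integers are taken to be big-endian, and the hypothetical parameter length_includes_header reflects the assumption that the length field counts the header bytes as well as the payload. The text above only says "length of the packet", so set it to False if length turns out to cover the payload alone.

```python
import struct

def parse_media_packet(buf: bytes, pos: int = 0, length_includes_header: bool = True):
    """Parse one media packet; return None for unknown versions."""
    (version,) = struct.unpack_from(">H", buf, pos)
    if version not in (0, 1):
        return None  # the pseudo-structure calls StreamDone() here
    length, stream_number, timestamp = struct.unpack_from(">HHI", buf, pos + 2)
    cur = pos + 10
    packet = {"version": version, "stream_number": stream_number,
              "timestamp": timestamp}
    if version == 0:
        packet["packet_group"], packet["flags"] = struct.unpack_from(">BB", buf, cur)
        cur += 2
    else:
        packet["asm_rule"], packet["asm_flags"] = struct.unpack_from(">HB", buf, cur)
        cur += 3
    header_size = cur - pos  # 12 bytes for version 0, 13 for version 1
    # Assumption: length counts the header too, unless the caller says otherwise.
    data_length = length - header_size if length_includes_header else length
    packet["data"] = buf[cur:cur + data_length]
    packet["next_offset"] = cur + data_length
    return packet
```

Calling the function repeatedly with next_offset walks the packets of a data chunk until num_packets have been read.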
Index Section
The index section of the RealMedia file consists of an Index Chunk Header that describes the contents of the index section, followed by a series of index records. Note that the size field of the Index Chunk Header is the size of the entire index chunk, including
the index records.
Index Section Header
The Index Chunk Header marks the start of the index chunk. There is usually one index chunk per stream in a RealMedia file. The following pseudo-structure describes the Index chunk header.
Index_Chunk_Header
{
u_int32 object_id;
u_int32 size;
u_int16 object_version;
if (object_version == 0)
{
u_int32 num_indices;
u_int16 stream_number;
u_int32 next_index_header;
}
}
The Index Chunk Header contains the following fields:
object_id: The unique object ID for the Index Chunk Header ("INDX"). The size of this member is 32 bits.
size: The size of the Index Chunk in bytes. The size of this member is 32 bits.
object_version: The version of the Index Chunk Header object. The size of this member is 16 bits.
num_indices: The number of index records in the index chunk. This member is present on all Index Chunk Header objects with an object_version of 0 (zero). The size of this member is 32 bits.
stream_number: The stream number with which the index records in this index chunk are associated. This member is present on all Index Chunk Header objects with an object_version of 0 (zero). The size of this member is 16 bits.
next_index_header: The offset from the start of the file to the next index chunk. This member enables RealMedia file format readers to find all the index chunks quickly. A value of zero for this member indicates there are no more index headers in this file. This member is present on all Index Chunk Header objects with an object_version of 0 (zero). The size of this member is 32 bits.
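Because each header points at the next one, all index chunks can be enumerated by following next_index_header until it is zero. A minimal sketch, assuming big-endian byte order and that the offset of the first index chunk is already known (for example from the index_offset field of the PROP header); the function name is hypothetical:

```python
import struct

INDEX_HEADER = struct.Struct(">4sIHIHI")  # 20 bytes for object_version 0

def iter_index_chunks(buf: bytes, first_offset: int):
    """Yield (file_offset, stream_number) for each chained index chunk."""
    offset = first_offset
    visited = set()  # guards against a corrupt file whose chain loops
    while offset and offset not in visited:
        visited.add(offset)
        (object_id, size, version, num_indices,
         stream_number, next_index_header) = INDEX_HEADER.unpack_from(buf, offset)
        if object_id != b"INDX" or version != 0:
            break
        yield offset, stream_number
        offset = next_index_header  # 0 terminates the chain
```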
Index Record
The index section of a RealMedia file consists of a series of index record objects. Each index record contains information for quickly finding a packet of a particular time stamp for a physical stream. The following pseudo-structure describes the details
of each index record:
IndexRecord
{
UINT16 object_version;
if (object_version == 0)
{
u_int32 timestamp;
u_int32 offset;
u_int32 packet_count_for_this_packet;
}
}
An Index Record contains the following fields:
object_version: The version of the Index Record object. The size of this member is 16 bits.
timestamp: The time stamp (in milliseconds) associated with this record. This member is present on all Index Record objects with an object_version of 0 (zero). The size of this member is 32 bits.
offset: The offset from the start of the file at which this packet can be found. This member is present on all Index Record objects with an object_version of 0 (zero). The size of this member is 32 bits.
packet_count_for_this_packet: The packet number of the packet for this record. This is the same number of packets that would have been seen had the file been played from the beginning to this point. This member is present on all Index Record objects with an object_version of 0 (zero). The size of this member is 32 bits.
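To show how index records support seeking, the sketch below reads the 14-byte records and picks the last one at or before a target time. It assumes big-endian byte order and that records are stored in ascending timestamp order; both are assumptions, and the function names are hypothetical.

```python
import struct
from bisect import bisect_right

INDEX_RECORD = struct.Struct(">HIII")  # 14 bytes for object_version 0

def parse_index_records(buf: bytes, pos: int, num_indices: int) -> list:
    """Return a list of (timestamp_ms, file_offset, packet_count) tuples."""
    records = []
    for _ in range(num_indices):
        version, timestamp, offset, packet_count = INDEX_RECORD.unpack_from(buf, pos)
        if version != 0:
            raise ValueError(f"unsupported IndexRecord version {version}")
        records.append((timestamp, offset, packet_count))
        pos += INDEX_RECORD.size
    return records

def find_seek_record(records: list, target_ms: int):
    """Return the last record with timestamp <= target_ms, or None."""
    timestamps = [record[0] for record in records]
    i = bisect_right(timestamps, target_ms)
    return records[i - 1] if i else None
```

A reader would then jump to the returned file offset and start decoding packets from there.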
Metadata Section
The metadata section of the RealMedia file consists of a tag containing a set of named metadata properties that describe the media file. These properties can be text, integers, or any binary data. The tag is preceded by a header that identifies the size
of the entire metadata section. Following the tag, the footer identifies the size of the tag. Since the metadata section is found at the end of the file, the footer can be used to expedite seeking backwards. At the end of the metadata section, and the file
itself, is an ID3v1 tag.
Metadata Section Header
The Metadata Section Header marks the start of the metadata section. There is one metadata section in a RealMedia file and it is expected to be at the end of the file. The following structure describes the Metadata section header:
MetadataSectionHeader
{
u_int32 object_id;
u_int32 size;
}
The Metadata Section Header contains the following fields:
object_id: The unique object ID for the Metadata Section Header ("RMMD"). The size of this member is 32 bits.
size: The size of the full metadata section in bytes. The size of this member is 32 bits.
Metadata Tag
The metadata tag of a RealMedia file consists of a series of properties. The properties are represented as a tree hierarchy with one unnamed root property. Each property contains a type and value, as well as multiple (optional) sub-properties. The following
structure describes the details of the metadata tag:
MetadataTag
{
u_int32 object_id;
u_int32 object_version;
u_int8[] properties;
}
The Metadata Tag contains the following fields:
object_id: The unique object ID for the Metadata Tag ("RJMD"). The size of this member is 32 bits.
object_version: The version of the Metadata Tag. The size of this member is 32 bits.
properties[]: The MetadataProperty structure that makes up the metadata tag (see "Metadata Property Structure" for more details). As mentioned above, the properties are represented as one unnamed root metadata property with multiple sub-properties, each with its own optional sub-properties. These are nested, as in a tree.
Metadata Property Structure
The following sample describes the details of the MetadataProperty structure:
MetadataProperty
{
u_int32 size;
u_int32 type;
u_int32 flags;
u_int32 value_offset;
u_int32 subproperties_offset;
u_int32 num_subproperties;
u_int32 name_length;
u_int8[name_length] name;
u_int32 value_length;
u_int8[value_length] value;
PropListEntry[num_subproperties] subproperties_list;
MetadataProperty[num_subproperties] subproperties;
}
The MetadataProperty structure contains the following fields:
size: The size of the MetadataProperty structure in bytes. The size of this member is 32 bits.
type: The type of the value data. The data in the value array can be one of the following types:
  MPT_TEXT: The value is string data.
  MPT_TEXTLIST: The value is a separated list of strings; the separator is specified as a sub-property/type descriptor.
  MPT_FLAG: The value is a boolean flag, either 1 byte or 4 bytes; check the size value.
  MPT_ULONG: The value is a four-byte integer.
  MPT_BINARY: The value is a byte stream.
  MPT_URL: The value is string data.
  MPT_DATE: The value is a string representation of the date in the form YYYYmmDDHHMMSS (m = month, M = minutes).
  MPT_FILENAME: The value is string data.
  MPT_GROUPING: This property has sub-properties, but its own value is empty.
  MPT_REFERENCE: The value is a large buffer of data; use sub-properties/type descriptors to identify the MIME type.
  The size of this member is 32 bits.
flags: Flags describing the property. The following flags are defined (these can be used in combination):
  MPT_READONLY: Read only, cannot be modified.
  MPT_PRIVATE: Private, do not expose to users.
  MPT_TYPE_DESCRIPTOR: Type descriptor used to further define the type of the value.
  The size of this member is 32 bits.
value_offset: The offset to value_length, relative to the beginning of the MetadataProperty structure. The size of this member is 32 bits.
subproperties_offset: The offset to subproperties_list, relative to the beginning of the MetadataProperty structure. The size of this member is 32 bits.
num_subproperties: The number of sub-properties for this MetadataProperty structure. The size of this member is 32 bits.
name_length: The length of the name data, including the null terminator. The size of this member is 32 bits.
name[]: The name of the property (string data). The size of this member is designated by name_length.
value_length: The length of the value data. The size of this member is 32 bits.
value[]: The value of the property (the data depends on the type specified for the property). The size of this member is designated by value_length.
subproperties_list[]: The list of PropListEntry structures. The PropListEntry structure identifies the offset for each property (see "PropListEntry Structure" for more details). The size of this member is num_subproperties * sizeof(PropListEntry).
subproperties[]: The sub-properties. Each sub-property is a MetadataProperty structure with its own size, name, value, sub-properties, and so on. The size of this member is variable.
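The offsets in this structure make a recursive reader fairly direct. The following Python sketch is only an illustration under several assumptions not confirmed by the text: integers are big-endian, subproperties_list holds exactly one 8-byte PropListEntry per sub-property, and names are ASCII. The function name and dictionary keys are hypothetical, and the numeric values of the MPT_ constants are not given here, so the value is returned as raw bytes.

```python
import struct

PROP_HEADER = struct.Struct(">7I")  # size through name_length: 28 bytes

def parse_metadata_property(buf: bytes, pos: int = 0) -> dict:
    """Recursively parse a MetadataProperty starting at offset pos."""
    (size, ptype, flags, value_offset, subproperties_offset,
     num_subproperties, name_length) = PROP_HEADER.unpack_from(buf, pos)
    name_start = pos + PROP_HEADER.size
    # name_length includes the null terminator, so strip it.
    raw_name = buf[name_start:name_start + name_length]
    name = raw_name.rstrip(b"\0").decode("ascii", errors="replace")
    # value_offset points at value_length, relative to this property.
    (value_length,) = struct.unpack_from(">I", buf, pos + value_offset)
    value_start = pos + value_offset + 4
    value = buf[value_start:value_start + value_length]
    children = []
    for i in range(num_subproperties):
        # Assumption: one PropListEntry (offset, num_props_for_name) per child.
        entry_offset, _num_props_for_name = struct.unpack_from(
            ">II", buf, pos + subproperties_offset + 8 * i)
        # Entry offsets are relative to the containing property.
        children.append(parse_metadata_property(buf, pos + entry_offset))
    return {"name": name, "type": ptype, "flags": flags,
            "value": value, "children": children}
```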
PropListEntry Structure
The following sample describes the details of the PropListEntry structure:
PropListEntry
{
u_int32 offset;
u_int32 num_props_for_name;
}
The PropListEntry structure contains the following fields:
offset: The offset for this indexed sub-property, relative to the beginning of the containing MetadataProperty. The size of this member is 32 bits.
num_props_for_name: The number of sub-properties that share the same name. For example, a lyrics property could have multiple versions as differentiated by the language sub-property type descriptor. The size of this member is 32 bits.
Metadata Section Footer
The metadata section footer marks the end of the metadata section of a RealMedia file. The metadata section footer contains the size of the metadata tag. Since the metadata section is at the end of the file, the section footer lies at a fixed offset of 140 bytes
from the end of the file (the 12-byte footer followed by the 128-byte ID3v1 tag). The size of the metadata tag enables a file reader to quickly locate the beginning of the metadata tag relative to the end of the file. The following structure describes the Metadata Section Footer:
MetadataSectionFooter
{
u_int32 object_id;
u_int32 object_version;
u_int32 size;
}
The MetadataSectionFooter contains the following fields:
object_id: The unique object ID for the Metadata Section Footer ("RMJE"). The size of this member is 32 bits.
object_version: The version of the metadata tag. The size of this member is 32 bits.
size: The size of the preceding metadata tag. The size of this member is 32 bits.
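The backward seek described above might look like the following Python sketch. It assumes big-endian byte order and that the footer's size field equals the distance from the start of the "RJMD" tag to the start of the footer; both points are assumptions, and the function name is hypothetical.

```python
import struct

ID3V1_SIZE = 128
FOOTER = struct.Struct(">4sII")  # object_id, object_version, size: 12 bytes

def locate_metadata_tag(data: bytes):
    """Return (tag_offset, tag_size) of the RJMD tag, or None if absent."""
    footer_pos = len(data) - (ID3V1_SIZE + FOOTER.size)  # 140 bytes from the end
    if footer_pos < 0:
        return None
    object_id, version, tag_size = FOOTER.unpack_from(data, footer_pos)
    if object_id != b"RMJE":
        return None
    # Assumption: the metadata tag ends exactly where the footer begins.
    tag_start = footer_pos - tag_size
    if tag_start < 0 or data[tag_start:tag_start + 4] != b"RJMD":
        return None
    return tag_start, tag_size
```

Checking both the "RMJE" and "RJMD" identifiers keeps the function from misreading files that have no metadata section.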
ID3v1 Tag
The ID3v1 Tag is at the end of the metadata section and is expected to be at the end of the entire file. It has a fixed size of 128 bytes and begins with the characters "TAG". Further information about the informal ID3v1 standard can be found at http://id3.org/id3v1.html.
What is ID3 (v1)?
The audio formats MPEG layer I, layer II and layer III (MP3) have no native way of saving information about the contents, except for some simple yes/no parameters like "private", "copyrighted" and "original home" (meaning this is the original file and not a copy).
A solution to this problem was introduced with the program "Studio3" by Eric Kemp alias NamkraD in 1996. By adding a small chunk of extra data at the end of the file, one could get the MP3 file to carry information about the audio and not just the audio itself.
The placement of the tag, as the data was called, was probably chosen because there was little chance that it would disturb decoders. To make the tag easy to detect, a fixed size of 128 bytes was chosen. The tag has the following layout:
Internal layout of an ID3v1 tagged file.
Song Title: 30 characters
Artist: 30 characters
Album: 30 characters
Year: 4 characters
Comment: 30 characters
Genre: 1 byte
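As a rough illustration of this layout, the Python sketch below unpacks the last 128 bytes of a file, with the three-character "TAG" identifier placed before the song title. The handling of padding (NUL bytes or trailing spaces) and the Latin-1 text encoding are assumptions, and the function name is hypothetical.

```python
import struct

# "TAG" + title + artist + album + year + comment + genre = 128 bytes
ID3V1 = struct.Struct(">3s30s30s30s4s30sB")

def parse_id3v1(data: bytes):
    """Return the ID3v1 fields from the end of data, or None if no tag."""
    if len(data) < ID3V1.size:
        return None
    tag, title, artist, album, year, comment, genre = ID3V1.unpack(data[-ID3V1.size:])
    if tag != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Assumption: fields are padded with NUL bytes or spaces.
        return raw.split(b"\0", 1)[0].decode("latin-1").rstrip()

    return {"title": text(title), "artist": text(artist), "album": text(album),
            "year": text(year), "comment": text(comment), "genre": genre}
```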
If we sum the sizes of all these fields, we see that 30+30+30+4+30+1 equals 125 bytes and not 128 bytes. The missing three bytes can be found at the very beginning of the tag, before the song title. These three bytes are always "TAG" and identify
the data as an ID3 tag. The easiest way to find an ID3v1/1.1 tag is to look for