White Paper: Media Container File Formats


Meet the digital equivalent of Tupperware for your music and video files

When can a file encapsulate more than one type of data? When it’s a metafile, wrapper, or container file. You might think of a container file as a package or envelope in which other files are housed. Zip files, which can contain documents, photos, videos, software programs, and many other types of files, are one type of container that you encounter frequently.

We’ll limit our discussion here to media container formats. A pure container format specifies how the data is stored, but not necessarily how that data was compressed or encoded, or even what is required to play it back. This can lead to confusion when dealing with container files wrapped around media because there’s a chance that the media player you’re using is capable of opening the container but not equipped with the codec required to decode the streams inside. Although a container can theoretically hold any type of data, most are optimized during development to wrap around particular data groups, e.g., digital audio for music; static images for digital photographs; or digital video interleaved with digital audio, plus subtitles, closed-caption information, and chapter data for movies. Container formats that support video also include the information required to synchronize the various data streams in the file during playback.

The MP4 container, which is based on Apple's QuickTime technology, encapsulates audio, video, and synchronization information in a series of packages within packages.

Container files store data in chunks, packets, or segments: three terms that describe essentially the same concept. A chunk’s primary content is known as its payload, and most container formats arrange their chunks in sequence, with a header at the beginning of each chunk that describes the type (and usually the size) of the data contained in the payload. This arrangement makes it easier for a player to skip past damaged data and resume at the next intact chunk in the event of file corruption or dropped frames.
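The header-plus-payload arrangement can be sketched in a few lines of Python. This is a minimal illustration, not any real container format: the 4-byte type tag, the 4-byte big-endian length field, and the names write_chunk and read_chunks are all assumptions made for the example.

```python
import struct

def write_chunk(kind: bytes, payload: bytes) -> bytes:
    """Prefix a payload with a header: 4-byte type tag + 4-byte big-endian length."""
    if len(kind) != 4:
        raise ValueError("type tag must be exactly 4 bytes")
    return kind + struct.pack(">I", len(payload)) + payload

def read_chunks(data: bytes):
    """Yield (type, payload) pairs by walking the headers from start to end."""
    offset = 0
    while offset + 8 <= len(data):
        kind = data[offset:offset + 4]
        (size,) = struct.unpack_from(">I", data, offset + 4)
        start = offset + 8
        payload = data[start:start + size]
        if len(payload) < size:
            break  # truncated final chunk: stop rather than guess at its contents
        yield kind, payload
        offset = start + size

# Three chunks of different (hypothetical) types stored back to back.
stream = (
    write_chunk(b"vide", b"frame-1")
    + write_chunk(b"audi", b"pcm")
    + write_chunk(b"text", b"Hello")
)
chunks = list(read_chunks(stream))
```

Because each header states how long its payload is, a reader can hop from chunk to chunk without understanding what is inside any of them, which is what lets a player ignore chunk types it doesn't recognize.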

Common Media Containers

WAV is a common example of a container format that’s used exclusively for audio. It originated on the Windows platform, although the container is also compatible with the Linux and Macintosh operating systems. WAV is built on RIFF (Resource Interchange File Format), a general-purpose chunk-based file structure, and WAV containers typically host uncompressed linear pulse code modulation (LPCM) audio. When you rip a CD to your hard drive, the audio is typically converted from the Red Book audio format and saved as a WAV file, although most people then convert that file to another, less storage-intensive format using a lossy codec such as MP3, or a lossless one such as FLAC.
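To make the RIFF structure concrete, the following sketch builds a tiny LPCM WAV file in memory with Python’s standard wave module and then walks its chunks by hand. The helper names make_wav and riff_chunks are invented for this example, and the fixed offset used to read the format fields assumes the "fmt " chunk comes first, which holds for files written by the wave module but not for every WAV file.

```python
import io
import struct
import wave

def make_wav(frames: bytes, channels=2, sample_width=2, rate=44100) -> bytes:
    """Write raw LPCM frames into an in-memory WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(rate)
        w.writeframes(frames)
    return buf.getvalue()

def riff_chunks(data: bytes):
    """Return the RIFF form type and a list of (chunk id, payload size)."""
    riff, _total, form = struct.unpack_from("<4sI4s", data, 0)
    if riff != b"RIFF":
        raise ValueError("not a RIFF file")
    offset, found = 12, []
    while offset + 8 <= len(data):
        cid, size = struct.unpack_from("<4sI", data, offset)
        found.append((cid, size))
        offset += 8 + size + (size & 1)  # RIFF pads odd-sized chunks to an even length
    return form, found

wav_bytes = make_wav(b"\x00\x00" * 8)  # four silent 16-bit stereo frames
form, found = riff_chunks(wav_bytes)

# Fields of the "fmt " payload: format tag, channels, sample rate,
# byte rate, block align, bits per sample. A format tag of 1 means PCM.
fmt_fields = struct.unpack_from("<HHIIHH", wav_bytes, 20)
```

The "fmt " chunk describes how to interpret the samples, and the "data" chunk holds the samples themselves, so the container and the audio encoding remain separate concerns even in a format this simple.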

If you’ve ever ripped a movie from a DVD (or just examined the directory structure on a DVD), you’ve encountered VOB files (the acronym stands for Video Object). VOB files are containers that house a DVD’s digital video and audio streams, plus menus and data streams such as subtitles. Because each VOB file is limited to 1GB, a feature-length title is usually split across several VOB files. VOB files are in turn based on the MPEG Program Stream, a container format that multiplexes packetized digital audio, video, and data streams (these are individually known as elementary streams). Elementary streams are packetized by dividing each stream into sequential runs of bytes and prefixing each run with a packet header.
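The packetize-then-multiplex idea can be sketched as follows. This is a deliberately simplified model, not a valid MPEG Program Stream: it keeps only a start code, a one-byte stream ID, and a two-byte length in each packet header, and it omits the timestamps, pack headers, and other fields that real streams carry. The function names and the tiny payload size are assumptions chosen for illustration.

```python
import struct
from itertools import zip_longest

START_CODE = b"\x00\x00\x01"  # three-byte prefix that marks the start of a packet

def packetize(stream_id: int, es: bytes, payload_size: int = 8):
    """Split an elementary stream into packets, each with a small header."""
    packets = []
    for i in range(0, len(es), payload_size):
        payload = es[i:i + payload_size]
        header = START_CODE + struct.pack(">BH", stream_id, len(payload))
        packets.append(header + payload)
    return packets

def multiplex(*packet_lists) -> bytes:
    """Interleave packets from several streams into one byte sequence."""
    out = bytearray()
    for group in zip_longest(*packet_lists):
        for packet in group:
            if packet is not None:
                out += packet
    return bytes(out)

def demultiplex(program: bytes) -> dict:
    """Rebuild each elementary stream by reading the packet headers."""
    streams, offset = {}, 0
    while offset < len(program):
        if program[offset:offset + 3] != START_CODE:
            raise ValueError("lost synchronization at byte %d" % offset)
        stream_id, length = struct.unpack_from(">BH", program, offset + 3)
        start = offset + 6
        streams.setdefault(stream_id, bytearray()).extend(program[start:start + length])
        offset = start + length
    return {sid: bytes(buf) for sid, buf in streams.items()}

video = b"V" * 20   # stand-in for a video elementary stream
audio = b"A" * 10   # stand-in for an audio elementary stream
program = multiplex(packetize(0xE0, video), packetize(0xC0, audio))
recovered = demultiplex(program)
```

Interleaving keeps the audio that belongs with a given stretch of video physically close to it in the file, so a player reading sequentially can feed both decoders without seeking back and forth.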

Movies on Blu-ray discs, on the other hand, utilize a container based on the MPEG Transport Stream. Just like MPEG-PS, MPEG-TS multiplexes packetized digital audio, video, and data streams and synchronizes their output; the key difference is that MPEG-TS is designed for less reliable delivery. It uses small fixed-size packets whose headers help a receiver detect lost or damaged data and resynchronize. MPEG-TS is also used in the U.S. for ATSC digital television broadcasts.
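One way to see how a transport stream copes with unreliable delivery is to look at its packet header. Each packet is 188 bytes long, begins with the sync byte 0x47, and carries a 13-bit packet identifier (PID) and a 4-bit continuity counter. The sketch below parses those fields and flags gaps in the counter. The helper names are invented for the example, and it assumes every packet carries a payload, which real streams do not guarantee.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47

def make_packet(pid: int, counter: int) -> bytes:
    """Build a dummy packet: 4-byte header plus 184 filler bytes (illustration only)."""
    header = bytes([
        SYNC_BYTE,
        (pid >> 8) & 0x1F,        # upper 5 bits of the PID; flag bits left at zero
        pid & 0xFF,               # lower 8 bits of the PID
        0x10 | (counter & 0x0F),  # 0x10 = "payload only", then the 4-bit counter
    ])
    return header + b"\xff" * (TS_PACKET_SIZE - 4)

def parse_ts_header(packet: bytes) -> dict:
    """Extract the header fields needed to spot damaged or missing packets."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid transport stream packet")
    return {
        "transport_error": bool(packet[1] & 0x80),
        "payload_unit_start": bool(packet[1] & 0x40),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "continuity_counter": packet[3] & 0x0F,
    }

def find_gaps(packets):
    """Return (pid, expected, actual) for every break in a PID's counter sequence."""
    last, gaps = {}, []
    for packet in packets:
        h = parse_ts_header(packet)
        pid, cc = h["pid"], h["continuity_counter"]
        if pid in last:
            expected = (last[pid] + 1) & 0x0F  # the counter wraps around after 15
            if cc != expected:
                gaps.append((pid, expected, cc))
        last[pid] = cc
    return gaps

# Counter value 2 is missing on PID 0x100, as if one packet had been lost in transit.
received = [make_packet(0x100, 0), make_packet(0x100, 1), make_packet(0x100, 3)]
gaps = find_gaps(received)
```

The counter only reveals that something went missing. Repairing the damage is left to other layers, such as the error-correcting codes applied to a broadcast signal.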

Apple’s QuickTime container (which uses the file extension MOV) can host multiple audio, video, effects, and text tracks (for subtitles). MOV files are unusual among media containers in that each track can contain either a digital media stream or a reference to a media stream contained in a separate file. This latter feature renders QuickTime very well suited to editing because the media doesn’t need to be rewritten after an edit. QuickTime also forms the basis of the MPEG-4 Part 14 container (which uses the file extension MP4). Both MOV and MP4 containers can use the same MPEG-4 codecs, but MP4 is more widely supported because it’s an international standard.
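The "packages within packages" layout shared by MOV and MP4 files can be explored with a short recursive walker. In both formats each package (an atom in QuickTime terms, a box in MPEG-4 terms) starts with a 4-byte big-endian size and a 4-byte type code. The sketch below is a simplified reader: the set of types it treats as nested containers is a small hand-picked subset, the sample data is synthetic, and the helper names are invented for the example.

```python
import struct

# A few box types whose payload is itself a sequence of boxes (partial list).
CONTAINER_BOXES = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}

def box(kind: bytes, payload: bytes = b"") -> bytes:
    """Build a box: 4-byte size (header included) + 4-byte type + payload."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload

def walk_boxes(data: bytes, start=0, end=None, depth=0):
    """Yield (depth, type, size) for each box, descending into known containers."""
    end = len(data) if end is None else end
    offset = start
    while offset + 8 <= end:
        size, kind = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:            # the real size is a 64-bit field after the type
            if offset + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:          # the box extends to the end of the enclosing data
            size = end - offset
        if size < header or offset + size > end:
            break                # malformed or truncated: stop rather than guess
        yield depth, kind.decode("latin-1"), size
        if kind in CONTAINER_BOXES:
            yield from walk_boxes(data, offset + header, offset + size, depth + 1)
        offset += size

# Synthetic file: a type box, a movie box holding one track, and a media data box.
sample = (
    box(b"ftyp", b"isom" + b"\x00" * 4 + b"isom")
    + box(b"moov", box(b"mvhd", b"\x00" * 4) + box(b"trak", box(b"tkhd", b"\x00" * 4)))
    + box(b"mdat", b"media")
)
layout = list(walk_boxes(sample))
```

In this layout the descriptive information (moov and its children) sits apart from the media samples (mdat), which is consistent with the way a track can point at media stored somewhere else.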

Other popular container formats include AVI (Audio Video Interleave), an aging but ubiquitous Microsoft standard that can contain many types of audiovisual data, including MPEG-4; Ogg, the standard container for audio encoded with the open-source Vorbis codec and video encoded with the open-source Theora codec; and RealMedia, the standard container for RealNetworks’ RealVideo and RealAudio files.

But no discussion of media container formats would be complete without mentioning the Matroska Multimedia Container. This ambitious open-standard and royalty-free file format (its ownership resides in the public domain) can hold an unlimited number of media tracks in a single file. Unlike the other container formats we’ve covered, many of which are tied in practice to particular codecs, Matroska containers can harbor audio and video encoded using virtually any codec (MPEG-4, H.264, MP3, FLAC, WMA, and more, including Dolby TrueHD and DTS-HD, the HD audio formats used on Blu-ray discs). Matroska uses the MKV extension for files that contain video, MKA for audio-only files, and MKS for subtitle-only files. Matroska containers also support chapter divisions, subtitles, menus, and metadata tags.
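Because file extensions can be wrong, software usually identifies a container from the signature bytes near the start of the file. The sketch below checks for the widely documented signatures of the containers covered here. The function name and the labels it returns are assumptions for illustration, the check covers only plain 188-byte transport streams (not variants that add extra bytes to each packet), and identifying the container says nothing about which codecs the streams inside use.

```python
def sniff_container(head: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "WAV (RIFF)"
    if head[:4] == b"RIFF" and head[8:12] == b"AVI ":
        return "AVI (RIFF)"
    if head[:4] == b"OggS":
        return "Ogg"
    if head[:4] == b"\x1a\x45\xdf\xa3":
        return "Matroska (EBML)"
    if head[:4] == b"\x00\x00\x01\xba":
        return "MPEG Program Stream"
    if head[:4] == b".RMF":
        return "RealMedia"
    if head[4:8] == b"ftyp":
        return "MP4/MOV family"
    # Transport streams have no file header; look for two sync bytes 188 bytes apart.
    if len(head) > 188 and head[0] == 0x47 and head[188] == 0x47:
        return "MPEG Transport Stream"
    return "unknown"

# Example inputs: only the leading bytes matter, so short stand-ins are enough.
examples = {
    "wav": b"RIFF\x24\x00\x00\x00WAVEfmt ",
    "mp4": b"\x00\x00\x00\x14ftypisom",
    "mkv": b"\x1a\x45\xdf\xa3\x01\x00\x00\x00",
    "ts": b"\x47" + b"\x00" * 187 + b"\x47",
}
guesses = {name: sniff_container(data) for name, data in examples.items()}
```

A player that recognizes the container this way still has to read the track information inside it to learn which decoders it needs.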
