MPEG-4 Tools

Last update : September 16, 2013

To create and modify MPEG-4 Multimedia files, you need different MPEG-4 tools, e.g. an encoder, a multiplexer and a packager :

MPEG-4 Tools : Video encoder

x264  (Wikipedia) is a free software library (libx264) and application (x264.exe) for encoding video streams into the H.264/MPEG-4 AVC format, and is released under the terms of the GNU GPL. X264 provides best-in-class performance, compression, and features, gives the best quality and has the most advanced psychovisual optimizations. A comparison with other H264 codecs is available at the MSU Graphics & Media Lab (Video Group) of Lomonosov Moscow State University. The leader in this comparison for software encoders is x264, followed by MainConcept, DivX H.264 and Elecard.

X264.exe is a command line tool. A typical command to enter in the Command Prompt Window looks as follows :

x264.exe --crf 18 --ref 3 --bframes 2 --subme 3 --keyint 100 --sar 1:1 --output %1.mkv %1
pause

All available parameters can be listed with the command x264 –fullhelp. The purpose and use of all x264 settings is also explained on the MeWiki website.

The fourcc code of the X264 codec is X264.

MPEG-4 Tools : Multiplexer

To encode videos, x264 is not sufficient. Audio, subtitles and metadata should be added, and all these data need to be multiplexed. Therefore other tools are needed. FFmpeg is one of these tools. FFmpeg is a free software project that produces libraries and programs for handling multimedia data. It includes libavcodec, the leading audio/video codec library and libavformat, an audio/video container mux and demux library. FFmpeg is published under the GNU Lesser General Public License 2.1+ or GNU General Public License 2+, depending on which options are enabled. The ffmpeg component is a command-line tool to convert one video file format to another. X264 is added as an external library to FFmpeg. Zeranoe has great static builds of FFmpeg for Windows with libx264 included. Other useful external libraries are the Fraunhofer AAC library for AAC encoding and the LAME library for MP3 encoding.

A very comprehensive documentation about ffmpeg , the libraries, utilities and tools is available at the FFmpeg website.

MPEG-4 Tools : Packager

A third command-line tool performing some manipulations on ISO media files like mp4 is MP4Box, the multimedia packager from GPAC (Project on Advanced Content). Dynamic Adaptive Streaming over HTTP (DASH) is one example. GPAC officially started as an open-source project in 2003 with the initial goal to develop from scratch, in ANSI C, clean software compliant to the MPEG-4 Systems standard, a small and flexible alternative to the MPEG-4 reference software. The GPAC framework is being developed at École nationale supérieure des télécommunications (ENST) as part of research work on digital media. A general documentation about MP4Box is available at the GPAC website.

MP4Box is a command-line tool, the following GUI’s are available :

  • MeGUI, by several authors (version 2356, released on June 8, 2013)
  • My MP4Box GUI, by Matthew Bodin (version 0.6.0.6, released on January 4, 2013)
  • Java MP4Box Gui, by Rune André Liland (version 1.7, released on May 18, 2013)
  • Yamb, by kurtnoise version 2.1.0.0 beta 2, released on June 29, 2009)

The following list provides links to additional posts about MPEG-4 tools :

MPEG-4 containers : mp4, m4a, m4p, m4v

Last update : September 16, 2013

To play H264 encoded movies, the encoded video and audio files must be packaged in a specific type of container following the MPEG-4 Part 12 specification. Stream packaging, also known as muxing, is the procedure to combine multiple elements that enable control of the distribution delivery process into a single multiplexed media file.

The most common container for H264 encoded videos is specified by MPEG-4 Part 14 (standard ISO 14496-14) and has the file extension mp4. This format, often called MP4 container,  is based on the quicktime format mov. Audio-only MPEG-4 files have a m4a extension, or m4p when they are encrypted.

Apple introduced the extension m4v for it’s iTunes applications. It’s very close to .mp4, some differences are the optional Apple’s DRM copyright protection, and the treatment of AC3 (Dolby Digital) audio which is not standardized for MP4 container

The following command line tools are available to create and modify MP4 files by combining (multiplexing) previously encoded video or audio tracks, as well as subtitles, chapter information and meta data.

  • AtomicParsley : lightweight program for reading, parsing and setting metadata into MPEG-4 files
  • MP4Creator : tool from Cisco’s mpeg4ip suite that combines video, audio, text and other media to create MPEG-4 streams.

GUI’s for both programs are also available. Atomicparsleygui is a GUI for AtomicParsley, MP4Muxer is a GUI for MP4Creator. All these programs have not been updated during the last four years. A better choice for MPEG-4 tools today is FFmpeg and MP4Box.

More informations about the MPEG-4 containers are listed hereafter :

MPEG-4 Part 2 and Part 10

Last update : September 16, 2013

MPEG-4 Part 2

MPEG-4 Part 2 (MPEG-4 Visual) is a video compression technology developed by MPEG, similar to previous standards such as MPEG-1 and MPEG-2 and compatible with H.263. Several popular codecs including DivX and Xvid implement this standard.

MPEG-4 Part 10

MPEG-4 Visual should not be confused with MPEG-4 Part 10 which is commonly referred to as H.264 or AVC (Advanced Video Coding), and was jointly developed by ITU-T and MPEG.

AVC is currently one of the most commonly used formats for the recording, compression, and distribution of high definition video.

Smart editing of MPEG-4/H264 videos

Last update : August 31, 2013

To edit a video, you need to cut & join numerous clips. This process is called smart editing and is particularly difficult if the video is encoded with H264.

H.264/MPEG-4 Part 10 or AVC (Advanced Video Coding) is is a block-oriented motion-compensation-based codec standard developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG).

H.264 is used in Blu-ray Discs, cable television services, real-time videoconferencing, mobile devices, … The standard defines 17 sets of capabilities, which are referred to as profiles, targeting specific classes of applications.

The highly compressed video file formats like MPEG-2 or MPEG-4 were exclusively designed for playback or distribution… not for editing. The term “compressing”  is misleading, because these video file formats are made in a similar fashion and achieve their very high compression rates by throwing away information. They rely on a system of deleting data that is unnecessarily repeated in frame after frame of the videos. The data is replaced by a reference to an earlier or later frame.

In MPEG-2 streams there are three types of pictures :

  • I-Pictures (Intra-coded picture) : these are easiest to think of as a complete picture and are slightly compressed like a JPEG photo file is compressed.
  • P-Pictures (Predicted picture) : these are incomplete pictures and only contain the infomation that has changed since the last I-Picture or the last P-Picture.
  • B-Pictures (Bi-predictive picture) : these ones are the most highly compressed because they can use information from previous I- or P-Pictures and forward I- or P-Pictures for reference in playback.

The following picture (courtesy Wikipedia) shows the relations between  I, P and B frames.

I, P and B frames

The I, P, and B pictures are arranged in groups of pictures (GOP) in a way so that the video file can be played back by a video device or software. There are generally two types of GOP’s : short GOP’s and long GOP’s. The sequence of the transmitted frames is not linear; P-frames are send before the related B-frames.

The following picture shows a typical sequence of a short GOP (I =blue, P = red, B  = yellow). The term VOP (video object plane) is used in relation with the video codecs (IVOP, PVOB, BVOP).

Short GOP sequence

You easily understand that cutting or joining such video sequences without disturbing the sequence of frames or the synchronization with the sound can be very complex.

In the case of  MPEG-4 streams it’s even more difficult.

The granularity of the establishment of prediction types in MPEG-4 is brought down to a lower level called the slice level of the representation. A slice is a spatially distinct region of a picture that is encoded separately from any other region in the same picture. In that standard, instead of I pictures, P pictures, and B pictures, there are I slices, P slices, and B slices. Motion estimation provides for the searching of sub-macro blocks of variable size, from 16×16 down to 4×4 blocks. Motion vectors allow up to quarter pixel accuracy for luminance, and up to 1/8th pixel for chrominance. MPEG-4 carries out intra-prediction for intra coded blocks before the transform, performed on either 4×4 or 16×16 blocks and allowing up to 9 directional modes for direction dependent prediction. Residual data transforms are executed on 4×4 blocks with modified integer discrete cosine transform (DCT) which avoids rounding errors. The employment of an adaptive in-loop filter increases subjective quality of video. The standard provides two alternative and more efficient processes of entropy coding. Context-adaptive variable length coding (CAVLC) utilizes multiple variable length codeword tables for transform coefficient encoding considering spatial neighborhood of the coded block. Context-adaptive binary arithmetic coding (CABAC) in addition provides highly efficient automatic adjustment for underlying probability model of encoded data. Long GOP’S are usual in MPEG-4.

Cutting and joining MPEG-4 videoclips without re-encoding (lossless) to keep a high quality and without creating visual or audial drops at the edges of the movies is very challenging. Only a few software tools are capable to do such a task which is called “smart editing“.

smart editing

MPEG-4 editor tool Machete

QuickTime pro MPEG-4 player & editor



A very simple MPEG-4 editor is Machete from Machetesoft. It’s a try-before buy-program, the current version is 4.0 build 33 released on March 22, 2013. The software is available at regnow for 15,99 euros.

Cutting videoclips or inserting other videoclips with Machete is only possible at the location of key-frames (I-pictures). Unfortunately a typical MPEG-4 videoclip has only very few key-frames (every 5 to 10 seconds).

A wellknown MPEG-4 player and editor is the pro-version of QuickTime from Apple. The current version is 7.7.3, build 1680.64. The selected part of a videoclip can be trimmed with the menu  “Edit > Trim to Selection”. The trimmed videoclip can be saved with the same parameters without re-compression with the menu “File > Export …;  Exporter > MPEG-4 sequence ; options >video and audio format > pass through”. A videoclip can be copied to the clipboard and added to another movie with the same features. I expected a clean export to MPEG-4 without affecting the audio or video streams, but this is not the case.

AvsPmod Editor with AviSynth

A very powerful and versatile video post-production tool is AviSynth, created by Ben Rudiak-Gould. It’s not a software you may usually think of programs (.exe and GUI), but it’s a video processing engine that works in the background. AviSynth uses scripts which tell the program what to do and what video to produce.

A Wiki on the main website provides some documentation and user guides about AviSynth. A more comprehensive documentation is available at the website of AnimeMusicVideos.org, a community dedicated to the creation, discussion, and general enjoyment of fan-made anime music videos. The AviSynth tutorial is part of a very useful documentation “Technical Guides to All Things Audio and Video” available on the same website.

The AviSynth syntax to program video scripts is available at the official wiki-website.

A package (AMVapp v3.1) including a lot of accessories and complementary software tools,  described in the technical guides, is available at the AimeMusicVideos website. One of the tools is a text editor specifically designed for making AviSynth, called AvsP. It has been written in Python by qwerpoi. The most recent version is 2.0.2 released on October 27th, 2007. An enhanced version called AvsPmod has been created by Zarxrax, the latest version is 2.5.1 released on June 25, 2013.

The AvsPmod editor shows not only the resolution, framerate, colorspace, frame number, time-code and aspect ratio of the videoclip in the bottom bar, but also the position and color of the videopixel defined by the mouse pointer. To play back the video in real-time, you need an external directshow media player (for example Windows Media Player) which is activated with the AvsPmod preview button (4th button from the left). The VLC-player doesn’t work because it’s not a directshow player.

VirtualDub

VirtualDub is a video capture/processing utility for 32-bit and 64-bit Windows platforms, written by Avery Lee and licensed under the GNU General Public License (GPL). The current stable version is v1.9.11. An unofficial VirtualDub support forum is available at the website. A modified version of VirtalDub called VirtualDubMod has been discontinued since 2005.

To play and edit MPEG-4 videos in VirtualDub, specific plugins and filters are required. The most straightforward solution is to combine VirtualDub with AviSynth and AvsPmod.

Avidemux 2.5

Avidemux2.5 is a free video editor designed for simple cutting, filtering and encoding tasks. It supports many file types using a variety of codecs. The tool is available for Linux, BSD, Mac OS X and Microsoft Windows under the GNU GPL license. The current version is 2.6.5 released on August 29, 2013. A detailed up-to-date documentation is available on the wiki-website.

The tool shows for each frame the type of picture (I, P or B). To save a selection of a clip in the default copy mode without re-encoding, the  marker A and B must be key-frames (I-pictures). An automatic search for key-frames and for black-frames (pictures without content, often inserted between movies and commercials) is provided. To join videoclips use the menu “File > Append”. The Smart-Copy feature doesn’t work for videos encoded with the H264 codec. For some other codecs you’re asked whether you want to use Smart-Copy or not if you cut your video, and the first frame of a segment is not an I-frame, and you try to save it .

Womble MPEG Video Wizard DVD 5.0

Womble MPEG Video Wizard DVD 5.0 is a commercial MPEG editor with DVD authoring and full MPEG-4 and AC-3 encoder support. The price for a single user personal license is $99. A free trial download is available. The features of this program are smart rendering, no re-encoding, fast HD MPEG editing with frame accuracy, automatic Ad detection and removal, movie conversion to iPod’s and PSP’s, intuitive User Interface (UI) and batch processing.

The current release is 5.0.1.108 from June 2013.

A tutorial “How to Edit Out Commercials? ” is available at the Womble website.
Another commercial video editing tool is SmartCutter from FameRing. The company states that SmartCutter is the world’s first H.264 AVCHD MPEG2 frame accurate cutter without re-encoding! The price is 40$, a free trial is available. Other tools as a video browser and a video framer or bundled versions are also offered. The current version is 1.8.1 released on August 28, 2013.

A tutorial how to edit H.264/AVCHD/MPEG2 videos without re-encoding is available oh the FameRing website. The name FAME stands for Frame Accurate Movie EngineeRing.

Smart Cutter from FameRing

The record function of the VLC media player can be used to do a simple cutting of video clips.

My favorite editing tools are now AVS Video Remaker and AVS Video Editor from AVS4YOU, a project of Online Media Technologies Ltd, an english IT high-tech company, founded in 2004 and specialized in developing innovative video and audio solutions for end-users and professional developers. AVS4YOU is a collection of software tools (currently there are 20 tools available) for which you can purchase either an unlimited access license or a one-year access license and use aLL of the tools with that license.

I did a lot of tests with other low-price commercial and shareware video-editors and I experienced serious problems with most of them when loading my MPEG-4/H264 test videos.

Progressive video download, pseudo streaming and realtime streaming

Last update : January 30, 2013
In the past, audio and video on the Web was primarily a download-and-play technology. You had to first download an entire media file before it could play. Today, streaming technologies allow watching audio and video files almost immediately, while the data is being sent, without having to wait for the whole file to download.

There are three methods of delivering streaming audio and video content over the Web.

The first method uses a standard HTTP server to deliver the audio and video data to a media player. Unlike the download-and-play client, a special streaming client embedded in the webpage starts playing the audio or video while it is downloading, after only a few seconds wait for buffering, the process of collecting the first part of a media file before playing. This streaming method is called progressive media download.

The second method is called pseudostreaming. Pseudostreaming is a protocol that can be installed on regular HTTP servers. It uses a server side script for Flash-to-server communication. The player sends a HTTP request to the server with a start time parameter in the request URL’s query string and the server script responds with the video stream so that its start position corresponds to the requested parameter. This start time parameter is usually named simply start. The video viewer skips the nondownloaded parts of the videos.

Both FLV and MP4 video can be played back with  pseudostreaming. The following scripts or tools are available :

  • The H264 streaming module for Apache, Lighttpd, IIS and NginX.
  • The mod_flv_streaming module for Lighttpd.
  • PHP/ASP scripts such as XmooV PHP.
  • Content delivery networks such as Bitgravity, Edgecast or Limelight.

There is one major advantage to streaming with a Web server rather than with a streaming media server—utilizing existing infrastructure.

The third method uses a separate streaming media server specialized to the audio/video streaming task. A streaming server offers the following advantages :

  • More efficient use of the network bandwidth
  • Better audio and video quality to the user
  • Advanced features like detailed reporting and multi-stream multimedia content
  • Supports large number of users
  • Multiple delivery options
  • Content copyright protection

The following protocols are commonly used by streaming servers :

  • UDP – this protocol provides the most efficient network throughput. The only downside to UDP is that many network administrators close their firewalls to UDP traffic, limiting the potential audience of UDP-based streams
  • TCP – this protocol provides an adequate, though not necessarily efficient, protocol for delivering streaming media content to flow through the firewalls
  • HTTP + TCP – this combination has the benefit of working with all firewalls that let Web traffic through (port 80) and provides much more control (fast forward, rewind, etc) than a standard Web server, but also adds some overhead to the raw TCP stream that decreases scalability.
  • Multicast – this protocol enables hundreds or thousands of users to play a single stream, but will only work on networks with Multicast-enabled routers. Multicast is becoming prevalent on corporate networks, but is still very rare on the Internet

Useful informations and tutorials about streaming are available at the streamingmedia.com website.

In 2009, Amazon CloudFront, the easy-to-use content delivery service, introduced the ability to stream audio and video files. Streaming with Amazon CloudFront is exceptionally easy: with only a few clicks on the AWS Management Console or a simple API call, you’ll be able to stream your content using a world-wide network of edge locations running Adobe’s Flash® Media Server. Like all AWS services, Amazon CloudFront streaming requires no up-front commitments or long-term contracts. There are no additional charges for streaming with Amazon CloudFront; you simply pay normal rates for the data that you transfer using the service.

FFmpeg : record, convert and stream audio and video

Last update : August 29, 2013

ffmpeg

FFmpeg Command Line Tool

FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video. It includes libavcodec – the leading audio/video codec library.

The latest version is 2.0.1 released August 11, 2013. Version 0.6.x released in 2010 featured a lot of improvements that are relevant for HTML5 video. The H.264 and Theora decoders were significantly faster, the vorbis decoder has seen important updates and the release supported Google’s newly released libvpx library for the VP8 codec and WEBM container.

FFmpeg is free software licensed under the LGPL or GPL depending on the configuration options. Companies that violate the license terms are tracked and listed on the Hall of Shame and eventually sued.

ffdshow wrapper for Windows DirectShow

FFmpeg is developed under GNU/Linux, but it can be compiled under most operating systems. Windows distributions are available at the website ffmpeg.zeranoe.com. On Windows there are however some limitations, for instance up to now it’s not possible to capture audio in realtime.

To make the libavcodec decoders available to DirectShow-based applications (a proprietary Windows technology), you can use ffdshow. This DirectShow filter is a DirectShow-wrapper around the libavcodec (ffmpeg) decoders. Non DirectShow-based applications like Avidemux use libavcodec/ffmpeg through it’s native interface. There exist a fork of the original ffdshow project called ffdshow tryouts.

Paul Glagla developed a utility Filmerit (version 3.0.8 published on May 14, 2007) to show DirectShow filters and diagnose errors. Another similar tool called InstalledCodec (version 1.30) which allows to enable/disable codec drivers and DirectShow filters is available on the NirSoft website.

FFmpeg is a command-line based tool. There are however several graphical user interfaces (GUI) available :

  • SUPER from eRigthSoft
  • Avanti, a dedicated “workbench” for FFmpeg/Avisynth
  • HandBrake, an open-source, GPL-licensed, multiplatform, multithreaded video transcoder; I upgraded to version 0.9.5 in june 2011
  • WinFF, a free tool published under the GNU public license for Windows and Ubuntu

SUPER from eRigthSoft

Avanti GUI

HandBrake

FFmpeg or the libraries are also used by other video frameworks :

  • VLC from Videolan

A full list of all projects using FFmpeg is available on the official website.

A similar project as FFmpeg, using several components of this project, is MEncoder.

The following list provides some useful links about FFmpeg :