Square and non square video pixels

Last update : August 27, 2013

Whereas in the graphic and computer world we have square video pixels, in the old video world (PAL and NTSC) we have non square video pixels (Recommendation ITU-R BT.601-4). Video pixels in the HD world are, fortunately, square.

The term which describes this squareness or non-squareness is the pixel aspect ratio, expressed as a fraction of horizontal (x) pixel size divided by vertical (y) pixel size.

The PAL (576i) pixel aspect ratio (PAR) is 59/54 (about 1.093); the NTSC (480i) pixel aspect ratio is 10/11 (about 0.909).

The pixel aspect ratio must not be confused with the display aspect ratio (DAR), whose common values are 4:3 and 16:9 (anamorphic format).

When a video file is converted from one size or format to another, the resulting video geometry will be stretched or squished if the pixel aspect ratio is not accommodated. Usually the errors are small and the result suffers no great damage if the correct conversion factor is ignored. The difference can become critical if filters are applied or other synthetic effects are added.
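To get a feeling for how small these errors are, here is a minimal sketch that computes the width error when non-square pixels are displayed as if they were square. It only uses the two pixel aspect ratios quoted above; the function name is my own choice for illustration.

```python
from fractions import Fraction

# Pixel aspect ratios quoted in the text (horizontal size / vertical size).
PAR = {"PAL": Fraction(59, 54), "NTSC": Fraction(10, 11)}


def width_error_percent(par):
    """Relative width error when pixels with the given PAR are shown as square.

    A negative value means the picture looks too narrow,
    a positive value means it looks too wide.
    """
    return float((1 / par - 1) * 100)


if __name__ == "__main__":
    for system, par in PAR.items():
        print(system, round(width_error_percent(par), 1), "%")
```

With these values a PAL picture shown with square pixels comes out about 8.5 % too narrow and an NTSC picture 10 % too wide.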

More detailed information is available in the Lurker’s Guide to Video by Chris Pirazzi.

Another problem is that the commonly used digital video resolutions don’t exactly represent the actual 4:3 or 16:9 picture aspect ratios. All commonly used modern digital video standards are based on their counterparts in analog video standards to avoid too many compatibility issues. The most used sampling rate in PAL and NTSC video systems is 13.5 MHz.

PAL has a line length of 64 µs, of which 52 µs contains actual image information, the rest is reserved for horizontal blanking. 52 µs × 13.5 MHz = 702 samples per scanline. In the vertical direction, there are 574 complete lines and 2 half lines, giving a total of 576 scanlines. Thus, the active image area for a 4:3 or 16:9 frame at 13.5 MHz sampling is 702×576 pixels.

For NTSC, the same calculation gives an image area of 711×486 pixels.

Instead of using 702 or 711 samples per line, the digital video standard defines 720 samples (= pixels) per line to allow for small deviations from the ideal timing values and to use a common sampling rate of 13.5 MHz.
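The PAL arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation given in the text (52 µs of active line, 13.5 MHz sampling, 720 stored samples); the variable and function names are mine.

```python
from fractions import Fraction

SAMPLING_RATE_MHZ = Fraction(27, 2)  # 13.5 MHz, kept exact as a fraction
PAL_LINE_US = 64                     # full PAL line length in microseconds
PAL_ACTIVE_LINE_US = 52              # part of the line carrying picture information
SAMPLES_PER_LINE = 720               # samples stored per line by the digital standard


def samples_in(duration_us, rate_mhz=SAMPLING_RATE_MHZ):
    """Number of samples taken in duration_us microseconds (µs x MHz = samples)."""
    return duration_us * rate_mhz


active_samples = samples_in(PAL_ACTIVE_LINE_US)           # active picture samples
total_samples = samples_in(PAL_LINE_US)                   # samples over the whole line
margin_samples = SAMPLES_PER_LINE - active_samples        # extra samples kept as tolerance
stored_line_us = SAMPLES_PER_LINE / SAMPLING_RATE_MHZ     # time covered by 720 samples

if __name__ == "__main__":
    print(active_samples, total_samples, margin_samples, float(stored_line_us))
```

The 720 stored samples therefore cover slightly more than the 52 µs of active picture, which is where the tolerance for timing deviations comes from.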

When converting videos from one size to another, cropping the video or adding black side edges is necessary to keep the correct image aspect ratio. Fortunately, some video conversion software takes care of this.
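As an illustration of such a conversion, the following sketch crops a 720-sample PAL line to the 702 active samples and then fits the picture into a square-pixel target, adding black bars where needed. The function names and the 1280×720 target are assumptions chosen for the example; real conversion tools may round or crop differently.

```python
from fractions import Fraction

PAR_PAL = Fraction(59, 54)  # pixel aspect ratio quoted in the text


def side_crop(stored_width, active_width):
    """Pixels to remove on each side to keep only the active picture."""
    return (stored_width - active_width) // 2


def fit_with_bars(src_w, src_h, par, dst_w, dst_h):
    """Scale a source frame into a square-pixel target without distorting it.

    Returns (scaled_w, scaled_h, pad_x, pad_y), where pad_x and pad_y are
    the black bars added on each side (left/right and top/bottom).
    """
    dar = Fraction(src_w, src_h) * par           # displayed width / height
    if dar >= Fraction(dst_w, dst_h):
        scaled_w = dst_w                         # source is wider: bars on top and bottom
        scaled_h = round(dst_w / dar)
    else:
        scaled_h = dst_h                         # source is narrower: bars left and right
        scaled_w = round(dst_h * dar)
    return scaled_w, scaled_h, (dst_w - scaled_w) // 2, (dst_h - scaled_h) // 2


if __name__ == "__main__":
    print(side_crop(720, 702))
    print(fit_with_bars(702, 576, PAR_PAL, 1280, 720))
```

For the 702×576 PAL picture the displayed ratio works out to 767/576, very close to 4:3, so a 16:9 target receives black bars on the left and right.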

More details about square and non square video pixels and a conversion table are available in the Quick Guide to Digital Video Resolution and Aspect Ratio Conversions maintained by Jukka Aho. Another useful tutorial about Pixel Aspect Ratio is available at the doom9.net website.

Ken Burns video effect

The Ken Burns video effect is a popular name for a panning and zooming effect used in video production from still imagery. The name refers to Kenneth Lauren “Ken” Burns, an American director and producer of documentary films known for his style of using archival footage and photographs.

The technique is principally used in historical documentaries where film or video material is not available. The effect is included in several movie editing software packages, for instance on the Windows platform in AVS Video Editor.

Will McGugan published a tutorial about the Ken Burns effect in JavaScript and canvas on his blog.
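The core of the effect is easy to sketch: for every output frame, a crop rectangle inside the still image is interpolated between a start rectangle and an end rectangle, and the cropped area is then scaled to the output size. The code below only computes the rectangles, with plain linear interpolation; it is not taken from the tutorial mentioned above, and real editors usually add easing.

```python
def ken_burns_rects(start, end, frames):
    """Yield one crop rectangle (x, y, width, height) per output frame.

    start and end are rectangles inside the still image; the rectangles
    in between are linearly interpolated, which gives a combined pan and zoom.
    """
    if frames < 2:
        raise ValueError("need at least two frames")
    for i in range(frames):
        t = i / (frames - 1)  # runs from 0.0 (first frame) to 1.0 (last frame)
        yield tuple(s + (e - s) * t for s, e in zip(start, end))


if __name__ == "__main__":
    # Zoom from the full 1600x1200 photo into its centre over 5 frames.
    for rect in ken_burns_rects((0, 0, 1600, 1200), (400, 300, 800, 600), 5):
        print(rect)
```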


Last update : January 30, 2013
FOURCC is short for “four character code” – an identifier for a video codec, compression format, color or pixel format used in media files. Another way to write FOURCC is 4CC. To find out which FOURCCs are used within a media file, you need an application specialized in opening and inspecting media files. GSpot, MediaInfo and AVIcodec are some of the tools supporting FOURCC.

A list of a few hundred video-codecs is available at the FOURCC website, a list of RGB- and YUV-pixel formats is available at the same site.
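Because a FOURCC is just four ASCII characters, it fits into a 32-bit integer, and that is how it is commonly stored in container headers. The sketch below assumes little-endian byte order, as used in AVI files; the function names are hypothetical.

```python
import struct


def fourcc_to_int(code):
    """Pack a four-character code into a 32-bit little-endian integer."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a FOURCC has exactly four characters")
    return struct.unpack("<I", raw)[0]


def int_to_fourcc(value):
    """Turn a 32-bit little-endian integer back into its four characters."""
    return struct.pack("<I", value).decode("ascii")


if __name__ == "__main__":
    number = fourcc_to_int("H264")
    print(hex(number), int_to_fourcc(number))
```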

For audio codecs, FOURCCs are not used; instead an audio tag (or audio identifier) identifies one specific audio codec or one type of audio compression scheme. An audio tag is an integer value, often written in hexadecimal.
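A small sketch shows the relation between the decimal and the hexadecimal notation of an audio tag. The two entries in the table are widely documented WAVE format tags and serve only as an illustration; the lookup table and function name are my own.

```python
# Two widely documented WAVE format tags, used here only as an illustration.
AUDIO_TAGS = {
    0x0001: "PCM",
    0x0055: "MPEG Layer 3 (MP3)",
}


def describe_audio_tag(tag):
    """Show an audio tag in decimal and hexadecimal together with its name."""
    name = AUDIO_TAGS.get(tag, "unknown")
    return f"{tag} (0x{tag:04X}): {name}"


if __name__ == "__main__":
    print(describe_audio_tag(85))
    print(describe_audio_tag(int("0x0001", 16)))
```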

A great website about video- and audio-codecs is MovieCodec Forums/Downloads.


Frameserving is a process by which video data is transferred from one program to another. No intermediate or temporary files are created. The program that opens the source file(s) and outputs the video data is called the frameserver. The program that receives the data could be any type of video application. The input looks like a relatively small, uncompressed video file. This feature of frameserving enables you to open certain types of files in an application that wouldn’t normally support them.

There are three programs that are commonly used as frameservers. Those programs are Avisynth, VirtualDub, and VFAPI.
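The idea of frameserving can be illustrated with a Python generator: frames are produced one at a time, only when the receiving program asks for them, and nothing is written to disk in between. This is merely an analogy in plain Python, not the way Avisynth, VirtualDub or VFAPI are implemented; the frame format (a list of pixel values) is a stand-in.

```python
def frameserver(source_frames, process):
    """Yield processed frames one at a time; no intermediate file is created."""
    for frame in source_frames:
        yield process(frame)


def invert(frame):
    """Example processing step: invert 8-bit pixel values."""
    return [255 - pixel for pixel in frame]


if __name__ == "__main__":
    source = [[0, 255], [10, 20]]          # two tiny stand-in frames
    for frame in frameserver(source, invert):
        print(frame)                        # the client consumes frames as they arrive
```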

Further information about frameserving is available at the AviSynth wiki-website.

Display extended information about media files

Last update : January 30, 2013

I set up a list of useful software tools to display extended information about media files :

  • MediaInfo : open source (GPL or LGPL licence) tool with user friendly user interface;
  • FLV MetaData Viewer (FLVMDV) : free tool that adds an ‘FLV Details’ tab to the Windows file properties dialog
  • TSPE (Transport Stream Packet Editor) : shareware from BitStreamTools
  • CodecVisa : powerful real-time analyzer for the H.264/AVC and VP8 video codecs
  • GSpot : codec information appliance
  • StreamEye tools : powerful applications from elecard, designed to analyze video quality, troubleshoot problems in the encoded stream for further video compression optimization, and ensure compliance to the video standards
  • VideoInspector : free video tool from KC Software
  • TSReader : MPEG-2 Transport Stream analysis and recording software, available as a free lite, a paid standard and a paid professional version

Smart editing of MPEG-4/H264 videos

Last update : August 31, 2013

To edit a video, you need to cut & join numerous clips. Doing this without re-encoding is called smart editing and is particularly difficult if the video is encoded with H.264.

H.264/MPEG-4 Part 10 or AVC (Advanced Video Coding) is a block-oriented motion-compensation-based codec standard developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG).

H.264 is used in Blu-ray Discs, cable television services, real-time videoconferencing, mobile devices, … The standard defines 17 sets of capabilities, which are referred to as profiles, targeting specific classes of applications.

Highly compressed video file formats like MPEG-2 or MPEG-4 were designed exclusively for playback or distribution, not for editing. The term “compressing” is misleading, because these formats all work in a similar fashion and achieve their very high compression rates by throwing away information. They rely on a system of deleting data that is unnecessarily repeated in frame after frame of the video. The data is replaced by a reference to an earlier or later frame.

In MPEG-2 streams there are three types of pictures :

  • I-Pictures (Intra-coded picture) : these are easiest to think of as a complete picture and are slightly compressed like a JPEG photo file is compressed.
  • P-Pictures (Predicted picture) : these are incomplete pictures and only contain the information that has changed since the last I-Picture or the last P-Picture.
  • B-Pictures (Bi-predictive picture) : these ones are the most highly compressed because they can use information from previous I- or P-Pictures and forward I- or P-Pictures for reference in playback.

The following picture (courtesy Wikipedia) shows the relations between  I, P and B frames.

I, P and B frames

The I, P, and B pictures are arranged in groups of pictures (GOP) in such a way that the video file can be played back by a video device or software. There are generally two types of GOPs : short GOPs and long GOPs. The sequence of the transmitted frames is not linear; P-frames are sent before the related B-frames.
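The non-linear transmission order can be sketched in a few lines: every run of B-frames is moved behind the reference frame (I or P) that follows it in display order, because the decoder needs that reference first. This is a simplified model with frame labels as strings; real encoders handle more cases (for example B-frames that refer to the next GOP).

```python
def display_to_coded_order(frames):
    """Reorder frame labels from display order to transmission order.

    Each run of B-frames is sent after the I- or P-frame that follows it
    in display order, since the B-frames depend on that later reference.
    """
    coded, pending_b = [], []
    for frame in frames:
        if frame.startswith("B"):
            pending_b.append(frame)      # hold back until the next reference frame
        else:
            coded.append(frame)          # reference frame goes out first
            coded.extend(pending_b)      # then the B-frames that were waiting for it
            pending_b.clear()
    coded.extend(pending_b)              # simplification: trailing B-frames are appended
    return coded


if __name__ == "__main__":
    print(display_to_coded_order(["I0", "B1", "B2", "P3", "B4", "B5", "P6"]))
```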

The following picture shows a typical sequence of a short GOP (I = blue, P = red, B = yellow). The term VOP (video object plane) is used in relation with the video codecs (IVOP, PVOP, BVOP).

Short GOP sequence

As you can imagine, cutting or joining such video sequences without disturbing the sequence of frames or the synchronization with the sound can be very complex.

In the case of MPEG-4 streams it’s even more difficult.

The granularity at which prediction types are assigned is brought down in MPEG-4 to a lower level, the slice level of the representation. A slice is a spatially distinct region of a picture that is encoded separately from any other region in the same picture. In that standard, instead of I pictures, P pictures, and B pictures, there are I slices, P slices, and B slices.

Motion estimation searches sub-macroblocks of variable size, from 16×16 down to 4×4 blocks. Motion vectors allow up to quarter-pixel accuracy for luminance and up to 1/8-pixel accuracy for chrominance. MPEG-4 carries out intra-prediction for intra-coded blocks before the transform, on either 4×4 or 16×16 blocks, with up to 9 directional modes for direction-dependent prediction. Residual data is transformed on 4×4 blocks with a modified integer discrete cosine transform (DCT), which avoids rounding errors. An adaptive in-loop filter increases the subjective quality of the video.

The standard provides two alternative, more efficient entropy coding processes. Context-adaptive variable length coding (CAVLC) uses multiple variable-length codeword tables for transform coefficient encoding, taking the spatial neighborhood of the coded block into account. Context-adaptive binary arithmetic coding (CABAC) in addition provides highly efficient automatic adjustment of the underlying probability model of the encoded data.

Long GOPs are usual in MPEG-4.

Cutting and joining MPEG-4 videoclips without re-encoding (lossless), to keep a high quality and without creating visual or audio dropouts at the edges of the movies, is very challenging. Only a few software tools are capable of doing such a task, which is called “smart editing”.

smart editing

MPEG-4 editor tool Machete

QuickTime pro MPEG-4 player & editor

A very simple MPEG-4 editor is Machete from Machetesoft. It’s a try-before-you-buy program; the current version is 4.0 build 33, released on March 22, 2013. The software is available at regnow for 15,99 euros.

Cutting videoclips or inserting other videoclips with Machete is only possible at the location of key-frames (I-pictures). Unfortunately a typical MPEG-4 videoclip has only very few key-frames (every 5 to 10 seconds).
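What this restriction means in practice can be sketched as follows: a requested cut position is moved back to the last key-frame at or before it. The key-frame positions and the 25 frames/s rate in the example are made-up values; this is not how Machete itself is implemented.

```python
from bisect import bisect_right


def snap_to_keyframe(keyframes, frame):
    """Return the last key-frame at or before the requested frame number.

    keyframes must be a list of frame numbers sorted in ascending order.
    """
    index = bisect_right(keyframes, frame)
    if index == 0:
        raise ValueError("no key-frame at or before the requested frame")
    return keyframes[index - 1]


if __name__ == "__main__":
    keyframes = [0, 250, 500, 750]        # hypothetical: one key-frame every 10 s at 25 frames/s
    wanted = 612
    actual = snap_to_keyframe(keyframes, wanted)
    print(actual, (wanted - actual) / 25, "seconds before the requested position")
```

With key-frames 5 to 10 seconds apart, the real cut point can thus lie several seconds away from the position you wanted.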

A well-known MPEG-4 player and editor is the pro-version of QuickTime from Apple. The current version is 7.7.3, build 1680.64. The selected part of a videoclip can be trimmed with the menu “Edit > Trim to Selection”. The trimmed videoclip can be saved with the same parameters without re-compression with the menu “File > Export …;  Exporter > MPEG-4 sequence ; options > video and audio format > pass through”. A videoclip can be copied to the clipboard and added to another movie with the same features. I expected a clean export to MPEG-4 without affecting the audio or video streams, but this is not the case.

AvsPmod Editor with AviSynth

A very powerful and versatile video post-production tool is AviSynth, created by Ben Rudiak-Gould. It is not what you usually think of as a program (an .exe with a GUI), but a video processing engine that works in the background. AviSynth uses scripts which tell the engine what to do and what video to produce.

A Wiki on the main website provides some documentation and user guides about AviSynth. A more comprehensive documentation is available at the website of AnimeMusicVideos.org, a community dedicated to the creation, discussion, and general enjoyment of fan-made anime music videos. The AviSynth tutorial is part of a very useful documentation “Technical Guides to All Things Audio and Video” available on the same website.

The AviSynth syntax to program video scripts is available at the official wiki-website.

A package (AMVapp v3.1) including a lot of accessories and complementary software tools, described in the technical guides, is available at the AnimeMusicVideos website. One of the tools is a text editor specifically designed for writing AviSynth scripts, called AvsP. It has been written in Python by qwerpoi. The most recent version is 2.0.2, released on October 27th, 2007. An enhanced version called AvsPmod has been created by Zarxrax; the latest version is 2.5.1, released on June 25, 2013.

The AvsPmod editor shows not only the resolution, framerate, colorspace, frame number, time-code and aspect ratio of the videoclip in the bottom bar, but also the position and color of the video pixel under the mouse pointer. To play back the video in real-time, you need an external DirectShow media player (for example Windows Media Player), which is activated with the AvsPmod preview button (4th button from the left). The VLC player doesn’t work because it’s not a DirectShow player.


VirtualDub is a video capture/processing utility for 32-bit and 64-bit Windows platforms, written by Avery Lee and licensed under the GNU General Public License (GPL). The current stable version is v1.9.11. An unofficial VirtualDub support forum is available at the website. A modified version of VirtualDub called VirtualDubMod has been discontinued since 2005.

To play and edit MPEG-4 videos in VirtualDub, specific plugins and filters are required. The most straightforward solution is to combine VirtualDub with AviSynth and AvsPmod.

Avidemux 2.5

Avidemux is a free video editor designed for simple cutting, filtering and encoding tasks. It supports many file types using a variety of codecs. The tool is available for Linux, BSD, Mac OS X and Microsoft Windows under the GNU GPL license. The current version is 2.6.5, released on August 29, 2013. Detailed, up-to-date documentation is available on the wiki-website.

The tool shows the picture type (I, P or B) of each frame. To save a selection of a clip in the default copy mode without re-encoding, the markers A and B must be placed on key-frames (I-pictures). An automatic search for key-frames and for black frames (pictures without content, often inserted between movies and commercials) is provided. To join videoclips, use the menu “File > Append”. The Smart-Copy feature doesn’t work for videos encoded with the H.264 codec. For some other codecs, if you cut your video so that the first frame of a segment is not an I-frame and then try to save it, you are asked whether you want to use Smart-Copy.

Womble MPEG Video Wizard DVD 5.0

Womble MPEG Video Wizard DVD 5.0 is a commercial MPEG editor with DVD authoring and full MPEG-4 and AC-3 encoder support. The price for a single user personal license is $99. A free trial download is available. The features of this program are smart rendering, no re-encoding, fast HD MPEG editing with frame accuracy, automatic Ad detection and removal, movie conversion to iPod’s and PSP’s, intuitive User Interface (UI) and batch processing.

The current release is from June 2013.

A tutorial “How to Edit Out Commercials? ” is available at the Womble website.
Another commercial video editing tool is SmartCutter from FameRing. The company states that SmartCutter is the world’s first H.264 AVCHD MPEG2 frame accurate cutter without re-encoding! The price is $40; a free trial is available. Other tools, such as a video browser and a video framer, and bundled versions are also offered. The current version is 1.8.1, released on August 28, 2013.

A tutorial on how to edit H.264/AVCHD/MPEG2 videos without re-encoding is available on the FameRing website. The name FAME stands for Frame Accurate Movie EngineeRing.

Smart Cutter from FameRing

The record function of the VLC media player can be used to do a simple cutting of video clips.

My favorite editing tools are now AVS Video Remaker and AVS Video Editor from AVS4YOU, a project of Online Media Technologies Ltd, an English IT high-tech company, founded in 2004 and specialized in developing innovative video and audio solutions for end-users and professional developers. AVS4YOU is a collection of software tools (currently there are 20 tools available) for which you can purchase either an unlimited access license or a one-year access license and use all of the tools with that license.

I did a lot of tests with other low-price commercial and shareware video-editors and I experienced serious problems with most of them when loading my MPEG-4/H264 test videos.

TerraTec Magix Movie Software

Today I installed the TerraTec Magix Movie on DVD (version with different video devices.

TV Edit

TV Record

The program allows you to

  • duplicate DVDs and CDs
  • record files from digital video, from analog video, from audio and from a PC-region
  • import videos from different sources
  • edit videos by adding titles, motions and special effects
  • create and burn DVDs with menus and navigation indexes

I did several tests to digitize videos from a VHS recorder on my Media-PC with a Dual Core Intel Pentium R 3 GHz CPU running Windows 7. The recording parameters were the following :

  • interlaced
  • MPEG main profile
  • 720 x 576 pixels (maximum resolution)
  • frame ratio 4:3
  • frame rate : 25 frames/s
  • I-frames : 12
  • P-frames : 3
  • video : YUY2
  • variable bit rate per second (max=9500, min=3000, mean=4500)
  • audio MPEG-Layer 2
  • audio sample rate per second : 48000
  • audio bitrate : 128 Kbit/s
  • VCR checked
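These parameters allow a rough estimate of the disk space needed for a recording. The sketch below assumes that the video bit rates in the list are given in Kbit/s (the list does not state the unit) and ignores container overhead; the function name is my own.

```python
def recording_size_mb(video_kbit_s, audio_kbit_s, seconds):
    """Estimate the size of a recording in megabytes (1 MB = 1,000,000 bytes).

    Assumes both rates are in Kbit/s and ignores any container overhead.
    """
    total_bits = (video_kbit_s + audio_kbit_s) * 1000 * seconds
    return total_bits / 8 / 1_000_000


if __name__ == "__main__":
    # Mean video rate and audio rate from the list above, for one hour of tape.
    print(round(recording_size_mb(4500, 128, 3600), 1), "MB per hour")
```

Under these assumptions one hour of VHS material takes roughly 2 GB.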

There are no visible differences between the different quality values ranging from 1 to 15. With high quality values I noticed however some CPU performance problems (sound noise, frame skips).

The TerraTec TV device H5 with composite interface provides a good sound quality and a reasonable video quality with a visible jitter between frames.

  • video driver : TerraTec H5 Analog Capture (USB – DShow) *
  • audio drivers : TerraTec H5 Analog Capture

The second available audio driver Ligne (TerraTec H5) provides no sound.

The TerraTec video device G3 with composite or Scart interface provides a reasonable sound quality and a good video quality.

  • video driver : TerraTec G3 Analog Capture (USB – DShow) *
  • audio drivers : TerraTec G3

The sound levels are not displayed in a reliable manner on the screen. With the second available audio driver Ligne (TerraTec G3), the audio levels are always shown, but they are very high, and there is a lot of echo.

The inbuilt TV-Card SAA 7131 is shown in the driver window, but without providing an image or a sound.

  • video driver : 713x BDA Analog Capture (DShow) *
  • audio drivers : 713x BDA Analog Audio Capture

In general the software is not very stable and crashes frequently in Windows 7. The same is true for Windows XP. New versions of the Magix Movie on DVD software are available in the USA (version 8) and in Germany (version 9), but they don’t have optimal support for the TerraTec devices.

CTX944 Dual DVB-T/DVB-S TV card

The CTX944 dual TV card supports analog TV, DVB-T, DVB-S and S-Video composite Video-In/Audio-In. This card is compliant with the PCI 2.2 Medion card used in the Medion MD8800 PC launched by ALDI in November 2005. The CTX948 is the single version of the CTX944.

The CTX944 and CTX484 cards are manufactured by Creatix, a German multimedia company. The main chips are Philips SAA7131E, TDA8275A and TDA8263.

The SAA7131E combines a digital global standard low IF demodulator for analog TV with a PCI audio and video decoder. The IF demodulator is an alignment-free digital multi standard vision and sound low IF signal PLL demodulator for positive and negative video modulation. It can be used worldwide for M/N, B/G/H, I, D/K and L/L’ standards.

The analog/DVB-T section is controlled by the Philips TDA8275A chip, the DVB-S section by the Philips TDA8263 chip.

The card is supported by the following software tools :

The Wilderness Downtown by Chris Milk and Arcade Fire

The Wilderness Downtown is an interactive interpretation of Arcade Fire’s song “We Used To Wait” and was built entirely with the latest open web technologies, including HTML5 video, audio, and canvas.

Choreographed windows, interactive flocking, custom rendered maps, real-time compositing, procedural drawing, 3D canvas rendering… this Chrome Experiment uses all of them.

The Wilderness Downtown is an outstanding browser-dominating Net Artwork. This experimental, interactive film by Chris Milk, is a lovely visual poem to accompany Arcade Fire’s excellent “We Used To Wait” from their album The Suburbs.

This Chrome Experiment has been done with some of Chris Milk’s friends from Google, among them the technology director Aaron Koblin.

Progressive video download, pseudo streaming and realtime streaming

Last update : January 30, 2013
In the past, audio and video on the Web was primarily a download-and-play technology. You had to first download an entire media file before it could play. Today, streaming technologies allow watching audio and video files almost immediately, while the data is being sent, without having to wait for the whole file to download.

There are three methods of delivering streaming audio and video content over the Web.

The first method uses a standard HTTP server to deliver the audio and video data to a media player. Unlike the download-and-play client, a special streaming client embedded in the webpage starts playing the audio or video while it is downloading, after only a few seconds wait for buffering, the process of collecting the first part of a media file before playing. This streaming method is called progressive media download.

The second method is called pseudostreaming. Pseudostreaming is a protocol that can be installed on regular HTTP servers. It uses a server-side script for Flash-to-server communication. The player sends an HTTP request to the server with a start time parameter in the request URL’s query string, and the server script responds with the video stream so that its start position corresponds to the requested parameter. This start time parameter is usually named simply start. The video viewer can thus skip the parts of the video that have not been downloaded yet.
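From the player’s point of view, seeking with pseudostreaming comes down to requesting the same file again with a different start value. The sketch below builds such a request URL; the host name is a placeholder, and it is an assumption that the server expects the value in seconds (some modules may expect another unit).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def pseudostream_url(video_url, start_seconds):
    """Return video_url with a `start` query parameter added or replaced."""
    parts = urlsplit(video_url)
    # Keep all other query parameters, drop any earlier `start` value.
    query = [(key, value) for key, value in parse_qsl(parts.query) if key != "start"]
    query.append(("start", str(start_seconds)))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical URL; the viewer jumps to the two-minute mark.
    print(pseudostream_url("http://example.com/video.flv", 120))
```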

Both FLV and MP4 video can be played back with  pseudostreaming. The following scripts or tools are available :

  • The H264 streaming module for Apache, Lighttpd, IIS and NginX.
  • The mod_flv_streaming module for Lighttpd.
  • PHP/ASP scripts such as XmooV PHP.
  • Content delivery networks such as Bitgravity, Edgecast or Limelight.

There is one major advantage to streaming with a Web server rather than with a streaming media server—utilizing existing infrastructure.

The third method uses a separate streaming media server specialized to the audio/video streaming task. A streaming server offers the following advantages :

  • More efficient use of the network bandwidth
  • Better audio and video quality to the user
  • Advanced features like detailed reporting and multi-stream multimedia content
  • Supports large number of users
  • Multiple delivery options
  • Content copyright protection

The following protocols are commonly used by streaming servers :

  • UDP – this protocol provides the most efficient network throughput. The only downside to UDP is that many network administrators close their firewalls to UDP traffic, limiting the potential audience of UDP-based streams
  • TCP – this protocol provides an adequate, though not necessarily efficient, protocol for delivering streaming media content to flow through the firewalls
  • HTTP + TCP – this combination has the benefit of working with all firewalls that let Web traffic through (port 80) and provides much more control (fast forward, rewind, etc) than a standard Web server, but also adds some overhead to the raw TCP stream that decreases scalability.
  • Multicast – this protocol enables hundreds or thousands of users to play a single stream, but will only work on networks with Multicast-enabled routers. Multicast is becoming prevalent on corporate networks, but is still very rare on the Internet

Useful information and tutorials about streaming are available at the streamingmedia.com website.

In 2009, Amazon CloudFront, the easy-to-use content delivery service, introduced the ability to stream audio and video files. Streaming with Amazon CloudFront is exceptionally easy: with only a few clicks on the AWS Management Console or a simple API call, you’ll be able to stream your content using a world-wide network of edge locations running Adobe’s Flash® Media Server. Like all AWS services, Amazon CloudFront streaming requires no up-front commitments or long-term contracts. There are no additional charges for streaming with Amazon CloudFront; you simply pay normal rates for the data that you transfer using the service.