Understanding Color Sampling
By Barry Green

You’ve seen the numbers: 4:2:0, 4:4:4, 3:1:1, 3:1.5:1.5, 4:1:1, 4:2:2… What does it all mean?  And how does it affect your video?  What’s better, what’s worse, and when does it matter?

What those numbers are referring to is a technique in digital video commonly called “color sampling.”  The concept can be a bit confusing to those not used to working with video in the digital domain.  In video, it’s common practice not to record the color of every pixel, but rather to average the color of neighboring pixels together to cut down on bandwidth.  Color sampling is, in effect, a form of compression.  The more the engineers can compress the color channels, the less bandwidth the signal occupies and the easier it is to record, transmit, or broadcast.

The concept came about because of the way the human eye works.  The eye uses rods and cones to sense light.  Rods are very numerous and sense brightness (i.e., light and dark, or black and white and the shades of gray in between), but rods can’t perceive color differences.  Cones, on the other hand, see color, but cones are relatively few in number, so our color vision is coarser: we can see changes in brightness much more acutely than we can detect changes in color.  Long ago, video engineers decided to take advantage of that quirk of human physiology and developed the now-common color sampling systems to save bandwidth, under the idea that “we wouldn’t really notice anyway.”

And overall, they’re basically right, but there are a few circumstances where low color sampling really makes a difference in your ability to work; greenscreen is a prime example.  The whole concept of working with greenscreen is that you’re relying on the camera’s ability to separate one color (green) from the rest; if you’ve got low-resolution color sampling, your task becomes commensurately more difficult, and a higher-color-sampling system would make the task that much easier.

So let’s discuss how color gets stored, and which recording formats use which color sampling system.  First, we’ll start with a base image from a computer program, Photoshop.  This image forms the foundation for all the subsequent discussions, and it’s a pixel-level image that stores its own color for every pixel (i.e., no color sub-sampling, or, put another way, “4:4:4”).
[Figure: the 4:4:4 base image, magnified so each pixel is visible]

In the picture above, each square represents one individual pixel in a Photoshop document.  Every pixel is encoded with its own color.  The picture is magnified enormously so we can clearly see each pixel.

As said before, this image represents 4:4:4 color.  And this is what a non-video-savvy person would expect their video to be encoded like, right?  Because in the real world, every sampled pixel should have its own corresponding color; it only makes sense.  But no: very few video formats use 4:4:4 color, and no video system in common use (such as DV, DVCAM, DVCPRO, AVC-HD, Digital8, HDV, DVCPRO-HD, HDCAM, etc.) uses it.  They all use some manner of color sampling.  So, with the preliminary introductions out of the way, let’s talk about the actual color sampling systems!

4:1:1
[Figure: the base image after 4:1:1 color sampling]

Look at the picture above.  That’s an example of what 4:1:1 color sampling does to our image.  If you look at the center you’ll see that it’s the same 4:4:4 image, but I’ve run that image through a nonlinear editor, compressing it to DV@4:1:1 color sampling, and then magnified the result.  4:1:1 is used by NTSC DV and by PAL DVCPRO25.

What you can see happening is that each group of four pixels is being “averaged together” to all be set to the same color!  Sounds preposterous, I know, but that’s exactly what it’s doing.  Look at the upper-left block of four pixels.  There’s red, orange, blue, and a bluish-gray.  Orange is in the reddish family, so we have basically two reddish and two bluish pixels, and red + blue = purple.  So all four pixels get forced to become purple, even though none of the source pixels are purple!  But this is exactly what happens in 4:1:1 DV.  We can still discern between the individual pixels because they retain their own brightness characteristics, but as for color, they all get rounded together.
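To make that averaging concrete, here is a minimal Python sketch of what happens to one block of four pixels.  It is a simplification under stated assumptions, not how a DV codec is actually implemented: the RGB values for the red, orange, blue, and bluish-gray pixels are guesses, the conversion uses the full-range BT.601 (JPEG-style) equations, and real encoders filter the color channels rather than taking a plain average.  The function names are made up for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Split a pixel into brightness (Y) and two color-difference values (Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + 0.564334 * (b - y)
    cr = 128 + 0.713267 * (r - y)
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr, rounded and clamped to the 0..255 range."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(max(0, min(255, round(v))) for v in (r, g, b))


def subsample_411_row(rgb_row):
    """Give every run of four pixels one shared color; each pixel keeps its own Y."""
    ycc = [rgb_to_ycbcr(*p) for p in rgb_row]
    out = []
    for start in range(0, len(ycc), 4):
        group = ycc[start:start + 4]
        cb = sum(p[1] for p in group) / len(group)
        cr = sum(p[2] for p in group) / len(group)
        out.extend((p[0], cb, cr) for p in group)
    return out


# Assumed stand-ins for the upper-left block: red, orange, blue, bluish-gray.
block = [(255, 0, 0), (255, 165, 0), (0, 0, 255), (119, 136, 153)]
shared = subsample_411_row(block)
result_rgb = [ycbcr_to_rgb(*p) for p in shared]
print(result_rgb)
```

All four output pixels carry the same Cb and Cr values, so they differ only in brightness, which is why the individual pixels are still distinguishable after the color has been merged.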

And the other blocks show similar color merging.  The upper-right block turns yellow, green, cyan and orange pixels all to shades of green.  And, curiously, the lower left block does the same, even though its source pixels were green, yellow, brown-ish and purple.  And on the last block, the colors are so varied that the color sampling algorithm turns them all to brown, even though none of the source pixels are brown!  That’s a result of averaging dissimilar colors together, and it’s loosely like mixing paints: if you keep glopping different colors into a big blob of paint, sooner or later it’ll all turn brown.  Color sampling will do the same thing if the underlying pixels are different enough.

Now, you may be thinking: “but DV looks great, how can this be true?”  The reason is that in the video domain we typically don’t see such color fluctuations at a pixel-by-pixel level.  Look at any object in the real world and you’ll likely see that for the most part there’s a decent-sized swath of similar color, or at least colors within the same basic family; in cases like that DV performs spectacularly.  It’s when you get down to fine pixel-level color transitions that the blockiness comes into play (most notably during chroma keying).

4:2:0
[Figure: the base image after 4:2:0 color sampling]

4:2:0 is a very common color sampling system.  It’s used by PAL DV, by DVD, by HDV and AVC-HD and XDCAM-HD.  It delivers the same color resolution as 4:1:1, but in a different pattern.  In 4:2:0 you still have blocks of four pixels being averaged to one color, but in 4:2:0 it’s done in a 2x2 grid, whereas in 4:1:1 it’s done in a 4x1 block.

4:2:0 suffers from the same blocky color issues and poor color fidelity that 4:1:1 does, and for the same reason (i.e., it’s still averaging four pixels together to deliver one single color).  4:2:0 causes similar problems with chroma keying and with shooting under solid-color lights (such as shooting a play under a red wash or a blue wash of light), just in a different pattern.
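The difference between the two patterns is easy to sketch in Python.  The function below works on a single color plane stored as plain numbers and averages it in blocks of a given width and height: 4x1 for 4:1:1, 2x2 for 4:2:0.  The function name and the sample values are made up for illustration, and a plain average is an assumption; real codecs filter and position their color samples more carefully.

```python
def average_chroma_blocks(plane, block_w, block_h):
    """Replace every block_w x block_h block of a color plane with its mean."""
    height, width = len(plane), len(plane[0])
    out = [row[:] for row in plane]
    for top in range(0, height, block_h):
        for left in range(0, width, block_w):
            rows = range(top, min(top + block_h, height))
            cols = range(left, min(left + block_w, width))
            mean = sum(plane[r][c] for r in rows for c in cols) / (len(rows) * len(cols))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


# A made-up 4x4 color plane in which every pixel has a different value.
plane = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]

chroma_411 = average_chroma_blocks(plane, 4, 1)  # four pixels across, one row
chroma_420 = average_chroma_blocks(plane, 2, 2)  # two pixels across, two rows

for row in chroma_411:
    print(row)
print()
for row in chroma_420:
    print(row)
```

Both results hold only four distinct color values for sixteen pixels; what changes is whether the merged pixels sit side by side on one line or in a square spanning two lines.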

4:2:2
[Figure: the base image after 4:2:2 color sampling]

4:2:2 is the color sampling format used by most professional formats.  Digital Betacam, DVCPRO50, DVCPRO-HD, and MPEG-IMX are all examples of formats that use 4:2:2.  In 4:2:2 every pair of pixels has its colors averaged together, in a 2x1 pattern.  The immediate benefit is twice the color resolution, which obviously leads to much better results when chroma keying.  Every scan line has its own color information, and every pixel pair has its own discrete color information, so the only possible color error would be limited to 1 pixel’s worth, and even then it could be said that it’s a half-pixel’s worth of error because the color of that pixel will be made of at least half of the proper underlying color.

 

Color Accuracy

On the surface, 4:2:2 would seem to deliver twice as much color resolution, but the benefits go far beyond that.  Look at the color accuracy!  In the other color sampling systems blocks of four pixels get averaged together and the resulting color accuracy isn’t very good.  In fact, let’s revisit them and count the color-accurate pixels, starting with 4:1:1.
[Figure: the 4:1:1 image, revisited]

Let’s count the color-accurate pixels.  In the upper-left block, none of the newly-minted purple pixels are accurate to their original source pixels, are they?  In the upper right block, I’d say that the green pixel is accurately represented, and none of the others are.  In the lower left block, none of the pixels are accurately represented, but if I were to be generous I’d say that the leftmost green pixel is, at least, green...  And in the lower right block, again none of the pixels match their source pixel.  So out of a group of 16 pixels, only two are accurately represented!

How about with 4:2:0, is that any better?  Let’s count…
[Figure: the 4:2:0 image, revisited]

Well, at first glance the answer would seem to be “no”.  In the leftmost block, none of the pixels are accurate at all.  In the next block, an argument could be made that maybe the upper-right pixel is vaguely accurate?  And maybe, in a stretch, the lower left pixel?  In the third block, no pixel is remotely accurate, and the same with the fourth block.  So, again, out of 16 pixels, only two are even somewhat close to the original.

How does 4:2:2 resolution fare with color accuracy?
[Figure: the 4:2:2 image, revisited]

Well now we’re talking.  Look at how much more accurate 4:2:2 color is!  By my quick count, I’d say that about 8 of the pixels are either an excellent match, or at least as good a match as how the 4:1:1 and 4:2:0 systems were performing.  That means that in this (admittedly extreme) case, we’re getting up to four times as much color accuracy!  In a real-world circumstance the difference in results would not be so exaggerated, because real-world video subjects usually don’t have so much variation in color between pixels (unless, perhaps, you’re shooting a tourist who’s wearing an alarmingly colorful Hawaiian shirt, for example!)
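The same kind of count can be roughed out in Python.  This is only a sketch under assumptions: the 4x4 plane of color values is invented (it is not the image in this article), and “accurate” is defined here as landing within a chosen tolerance of the original value, which is a stand-in for judging the colors by eye.

```python
def block_mean(plane, r, c, block_w, block_h):
    """Mean of the block_w x block_h block that contains pixel (r, c)."""
    top, left = r - r % block_h, c - c % block_w
    cells = [plane[y][x]
             for y in range(top, top + block_h)
             for x in range(left, left + block_w)]
    return sum(cells) / len(cells)


def count_accurate(plane, block_w, block_h, tol):
    """Count pixels whose averaged color stays within tol of the original.

    Assumes the plane's width and height are multiples of the block size.
    """
    return sum(
        abs(plane[r][c] - block_mean(plane, r, c, block_w, block_h)) <= tol
        for r in range(len(plane))
        for c in range(len(plane[0]))
    )


# Invented color values with heavy pixel-to-pixel variation.
plane = [
    [10, 20, 200, 90],
    [30, 180, 60, 240],
    [120, 130, 15, 220],
    [250, 40, 100, 110],
]

for name, (w, h) in {"4:1:1": (4, 1), "4:2:0": (2, 2), "4:2:2": (2, 1)}.items():
    print(name, count_accurate(plane, w, h, tol=10), "of 16 pixels within tolerance")
```

With these particular made-up values the 2x1 pattern keeps more pixels within tolerance than either four-pixel pattern, but the exact counts depend entirely on the sample data and the tolerance chosen.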

Before we close this article on color sampling, let’s examine the numbering system and address 3:1:1.  First, to understand what the numbers stand for, I defer to the work of Charles Poynton, the unparalleled expert in the field of video:
http://poynton.com/PDFs/Chroma_subsampling_notation.pdf

4 is the leading digit used in chroma sampling notation.  The reason “3” ever entered the equation was because of Sony’s HDCAM, which employs 3:1:1 color sampling (3:1:1 is basically halfway between 4:1:1 and 4:2:2; each block of 3 pixels gets set to the same color).  The notation has nothing to do with the prefiltering of the recording format; it refers to the ratio of color samples to luma samples.  Unfortunately some people insist that the leading “3” of HDCAM has to do with the prefiltering from 1920x1080 down to 1440x1080 for recording: 1440 is 75% of 1920, and 3 is 75% of 4, so they conclude that the 3 references the prefiltering.  It doesn’t.  It indicates that there are three luma samples for every one chroma sample.
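The ratios are easy to work out in a few lines of Python.  This sketch assumes a common reading of the J:a:b notation, which is not spelled out in this article: the numbers describe a region J pixels wide and two lines tall, with a color samples on the first line and b on the second.  The function names are made up for illustration.

```python
from fractions import Fraction


def chroma_ratio(notation):
    """Color samples per luma sample, for each of the two color channels."""
    j, a, b = (Fraction(part) for part in notation.split(":"))
    return (a + b) / (2 * j)


def relative_data(notation):
    """Total samples relative to full color at the same luma resolution."""
    ratio = chroma_ratio(notation)
    # One luma channel at full resolution plus two reduced color channels,
    # compared with three full-resolution channels.
    return (1 + 2 * ratio) / 3


for scheme in ["4:4:4", "4:2:2", "4:2:0", "4:1:1", "3:1:1"]:
    print(scheme, chroma_ratio(scheme), relative_data(scheme))
```

Under that reading, 4:1:1 and 4:2:0 both come out to one color sample per four luma samples, 4:2:2 to one per two, and 3:1:1 to one per three, which matches the descriptions above.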

In summary, if you’ve seen blocky color edges in your video, now you know why, and you know that a better color sampling video system can reduce the appearance of blockies.  But there are some things you can do in post as well to minimize the impact of low color sampling.  On the PC you can usually find a chroma blurring filter which will smooth out the blocky edges; on the Mac I’d recommend looking to Nattress’s G-Chroma Smoother plug-in.
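For a sense of what a chroma blurring filter does, here is a minimal Python sketch.  It is an assumption about the general idea (a small box blur applied to the color channel only, leaving brightness untouched), not the algorithm used by G-Chroma Smoother or by any particular editing program, and the function name is made up.

```python
def blur_chroma_row(chroma, radius=1):
    """Box-blur one row of color values; the luma channel is left alone."""
    out = []
    for i in range(len(chroma)):
        window = chroma[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


# A blocky 4:1:1-style row: two runs of four pixels, each run sharing one color value.
blocky = [100, 100, 100, 100, 200, 200, 200, 200]
smoothed = blur_chroma_row(blocky)
print(smoothed)
```

The hard step between the two blocks becomes a short ramp, which softens the blocky edge at the cost of slightly smearing the color across the boundary.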

 
