roxics
Veteran
I will chime in with some general descriptions that partially answer some of your questions, and probably partially repeat what you and others have said.
Bit depth
(8 bit, 10 bit, 12 bit, etc.)
This is the number of bits per channel. There are three channels per pixel. So multiply it by 3 to know the number of bits per pixel. For example, 8-bit would be 24 bits per pixel (or 3 bytes per pixel).
A photodiode is at first analog. It has a noise floor, which represents the blackest shade it could reproduce, and on the other end its full-well capacity, which represents the whitest shade it could reproduce. Suppose its noise floor is 5 electrons, and its full-well capacity is 40,000 electrons. That means its brightest signal is 8,000 times as much as its darkest signal. This could also be expressed as about 13 f-stops, because 2^13 = 8,192.
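Here is a quick sketch of that arithmetic, in case it helps. The function name is just something I made up for illustration; the 5 and 40,000 electron figures are the example numbers from the paragraph above, not measurements of any real sensor.

```python
import math

def dynamic_range_stops(noise_floor_e, full_well_e):
    """Return (contrast ratio, f-stops) for a photodiode.

    Each f-stop is a doubling of signal, so the number of stops
    is the base-2 logarithm of the brightest/darkest ratio.
    """
    ratio = full_well_e / noise_floor_e
    return ratio, math.log2(ratio)

ratio, stops = dynamic_range_stops(5, 40_000)
print(f"{ratio:.0f}:1 contrast, {stops:.2f} stops")
```

The result lands just under 13 stops, which is why rounding to 13 is fair.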
To record a photodiode's experience over the course of time, you could translate it to similar waves on magnetic tape (analog) or number its shades from noise floor to full well (digital). Suppose you thought 100 steps between darkest and lightest were enough. So darkest is 1, and lightest is 100. To save this to disk, you would need 7 bits (because 2^7 = 128, while 2^6 = 64 would fall short). A bit is a 1 or a 0. Therefore it has 2 possible values. So two bits in a row could have 4 values (00, 01, 10, 11), three bits in a row could have 8 values (000, 001, 010, 100, 011, 110, 101, 111), and so on. And now you see why the number of bits becomes the exponent. 2^x = how many distinct values you can count with x bits.
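If you want to check the bit counts yourself, something like this would do it. The helper name is hypothetical; it just finds the smallest number of bits whose 2^x covers the number of shades you asked for.

```python
def bits_needed(levels):
    """Smallest number of bits that can label `levels` distinct shades."""
    # Labels run from 0 to levels - 1, so we need enough bits to hold
    # the largest label. int.bit_length() gives exactly that.
    return max(1, (levels - 1).bit_length())

for levels in (2, 4, 8, 100, 128, 256):
    bits = bits_needed(levels)
    print(f"{levels} levels -> {bits} bits (room for {2 ** bits} values)")
```

So 100 levels needs 7 bits, with 28 of the 128 possible values left unused.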
If you say 7 bits (100 levels), that actually gives you 1 million colors, because again it is 7 bits per channel, and 100 x 100 x 100 = 1 million. Seems like enough, doesn't it? Actually it is for a lot of cases, but not us picky photographers.
If you allow yourself 8 bits, you literally double the number of shades per channel, because 2^8 = 256. But that's not all! Your total color palette is now 256 x 256 x 256, or 16.8 million colors (a number which I bet you have seen many times in advertising, especially in the 2000s).
And so, now you can see why upping the ante to 10 or 12 or 16 bits will give you a ludicrous number of colors. Let's see:
2^10 is 1,024. Then 1,024^3 = 1 billion colors.
2^12 is 4,096. Then 4,096^3 = 68 billion colors.
2^16 is 65,536. Then 65,536^3 = 281 trillion colors.
(I'm not saying you won't need something like 10 bit, especially if you plan to push and pull the image in post. I think 10-bit is enough if distributed logarithmically rather than linearly. Norman Koren does a nice explanation of why, if you can follow: http://www.normankoren.com/digital_t...l#Human_vision)
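The palette sizes listed above can be reproduced with a couple of lines. This is only a sketch of the arithmetic (shades per channel, cubed for three channels); the function name is mine.

```python
def palette_size(bits_per_channel, channels=3):
    """Return (shades per channel, total colors) for a given bit depth."""
    shades = 2 ** bits_per_channel
    return shades, shades ** channels

for bits in (8, 10, 12, 16):
    shades, colors = palette_size(bits)
    print(f"{bits}-bit: {shades:,} shades per channel, {colors:,} colors")
```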
Chroma subsampling
(4:1:1, 4:2:0, 4:2:2, etc.)
The three digits actually refer to YCrCb, not RGB.
Y means luminance (I don't know why)
Cr means the red minus the luminance.
Cb means the blue minus the luminance.
Somehow engineers find this easier to work with than RGB.
(and the three digits don't map to Y, Cr, and Cb, they map to some weird formula: https://en.wikipedia.org/wiki/Chroma_subsampling)
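To make the conversion a bit more concrete, here is a rough sketch. I am assuming the BT.601 weights here (0.299, 0.587, 0.114), which is one common choice; other standards use different weights, and real video pipelines also add offsets and integer scaling that I have left out. In this convention the two chroma channels are the blue and red values with the luminance subtracted, then scaled.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert normalized RGB (0.0 to 1.0) to (Y, Cb, Cr).

    Assumes BT.601 weights. Y ends up in 0..1, Cb and Cr in -0.5..0.5.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b  # weighted brightness
    cb = (b - y) / 1.772                   # blue minus luminance, scaled
    cr = (r - y) / 1.402                   # red minus luminance, scaled
    return y, cb, cr

# Any gray has (near) zero chroma: all the information sits in Y.
print(rgb_to_ycbcr(0.5, 0.5, 0.5))
# Pure red pushes Cr to its maximum.
print(rgb_to_ycbcr(1.0, 0.0, 0.0))
```

Notice that for gray pixels Cb and Cr carry nothing, which is part of why they tolerate being thrown away better than Y does.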
So if you convert RGB to YCrCb, and just leave it alone after that, then it is said to be 4:4:4. All three channels are at full-resolution.
So if your resolution was HD, then the Y layer would be 1920x1080, the Cr layer would be 1920x1080, and the Cb channel 1920x1080.
But now they can exploit the human eye's partiality toward luminance and just throw away parts of Cr and Cb.
So the mildest cut would be 4:2:2.
The Y layer would still be 1920x1080, but the Cr and Cb layers would each be 960x1080 (each chroma sample gets stretched across two pixels horizontally on playback, to bring it back to full width).
Then 4:2:0 would be 1920x1080 for the luminance, but only 960x540 for Cr and Cb.
4:1:1 just cuts it horizontally. Vertical is full resolution. So Y would be 1920x1080, and Cr and Cb would each be 480x1080, which is kind of weird. It leads to more color artifacts than 4:2:0, which is probably why we left it in the dust. (It was easier, I think, to encode to magnetic tape, because you just quarter the carrier frequency, which is why it was used in DV).
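The layer sizes above all follow from two divisors per scheme, one horizontal and one vertical. Here is a small sketch of that; the table and function names are made up for illustration, and I have only included the four schemes discussed here.

```python
# (horizontal divisor, vertical divisor) applied to the Cr and Cb layers
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),  # nothing thrown away
    "4:2:2": (2, 1),  # half width, full height
    "4:2:0": (2, 2),  # half width, half height
    "4:1:1": (4, 1),  # quarter width, full height
}

def plane_sizes(width, height, scheme):
    """Return ((Y width, Y height), (chroma width, chroma height))."""
    h_div, v_div = CHROMA_DIVISORS[scheme]
    return (width, height), (width // h_div, height // v_div)

for scheme in CHROMA_DIVISORS:
    luma, chroma = plane_sizes(1920, 1080, scheme)
    print(f"{scheme}: Y {luma[0]}x{luma[1]}, Cr/Cb {chroma[0]}x{chroma[1]}")
```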
So now for some storage comparisons.
1920x1080 12-bit 4:4:4 would be:
1920 pixels per line
x 1080 lines per frame
x 12 bits per channel
x 3 channels per pixel
= 74,649,600 bits per frame
or about 9 megabytes per frame
1920x1080 12-bit 4:2:2 would be:
(1920 x 1080 x 12) + (960 x 1080 x 12) + (960 x 1080 x 12)
= 49,766,400 bits per frame (exactly 2/3 of the 4:4:4 version)
or about 6 megabytes per frame
1920x1080 12-bit 4:2:0 would be:
(1920 x 1080 x 12) + (960 x 540 x 12) + (960 x 540 x 12)
= 37,324,800 bits per frame (exactly 1/2 of the 4:4:4 version)
or about 4.5 megabytes per frame
1920x1080 8-bit 4:2:0 would be:
(1920 x 1080 x 8) + (960 x 540 x 8) + (960 x 540 x 8)
= 24,883,200 bits per frame
or about 3 megabytes per frame
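All four of those calculations are the same formula with different inputs: one full-size Y layer plus two reduced chroma layers, times the bit depth. A sketch, with hypothetical names, and counting a megabyte as 1,000,000 bytes:

```python
# (horizontal divisor, vertical divisor) applied to the Cr and Cb layers
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
}

def frame_bits(width, height, bit_depth, scheme):
    """Uncompressed bits per frame: one Y layer plus two chroma layers."""
    h_div, v_div = CHROMA_DIVISORS[scheme]
    luma_samples = width * height
    chroma_samples = (width // h_div) * (height // v_div)
    return (luma_samples + 2 * chroma_samples) * bit_depth

for depth, scheme in ((12, "4:4:4"), (12, "4:2:2"), (12, "4:2:0"), (8, "4:2:0")):
    bits = frame_bits(1920, 1080, depth, scheme)
    print(f"{depth}-bit {scheme}: {bits:,} bits, {bits / 8 / 1e6:.2f} MB per frame")
```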
These are then usually further compressed with something like the Discrete Cosine Transform, maybe 10:1. So then 8-bit 4:2:0 could be, say, just 300 KB per frame. (Best explanation ever of DCT: https://www.youtube.com/watch?v=Q2aEzeMDHMA)
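For what it's worth, the 300 KB figure is just the uncompressed size divided by the ratio. The 10:1 below is the same ballpark guess as above, not a property of any particular codec; real ratios vary a lot with content and settings.

```python
def compressed_frame_bytes(raw_bits, ratio=10):
    """Estimate compressed bytes per frame for an assumed compression ratio."""
    return raw_bits / 8 / ratio

raw_bits = 24_883_200  # 8-bit 4:2:0 1080p frame, from the figures above
size = compressed_frame_bytes(raw_bits)
print(f"about {size / 1000:.0f} KB per frame")
```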
Wow! Great explanation. Simple to follow through. I really appreciate it.
Makes more sense now.