Video bit depth vs dynamic range

bdemenil · May 2, 2016

When I first learned of cameras recording in 10bit and 12bit, I assumed that greater luminance and color precision go hand in hand with greater dynamic range. Now, after some research, I am not sure that is the case. If bit rate remains constant, wouldn't increasing dynamic range actually reduce precision? I.e., if the camera captures more low-light information and more highlight information, but encodes this larger amount of information using the same bit length, wouldn't that mean that the amount of range allocated to each interval in the data sample would necessarily increase?

On a separate but related topic: if a single-layer sensor (like the ones most cameras today use) captures in black and white, then is it true that the precision and dynamic range of colors and luminance are linked? I.e., more precise luminance = more precise colors? And more dynamic range in luminance = more precise colors in the darks and lights?

And is dynamic range in color even an issue? Don't good cameras already capture all visible color? Would you want to capture UV and IR?

Ben

Patryk_Rebisz · May 2, 2016

There is lots of "he said, she said" going on. There is no such thing as have-it-all. For instance, the most precise NASA lenses produce B&W images, because when you don't have to worry about various colors (which occupy different parts of the spectrum and thus bend at different angles) you get a sharper image.

Likewise with the whole 8, 10, etc. bit question... A doc we shot a few years back (on a low-end Canon 7D) produced unexpected results when encoding to one codec versus another. Even a very high-end codec like ProRes 4444 produced unwanted results on certain shots (the ones that have delicate gradation of tones), while another high-end codec like DNxHD 175 10bit produced perfect results. You might scream "BS!" as ProRes 4444 is billed as visually lossless, but our tests showed otherwise. So as far as our doc was concerned, DNxHD produced vastly superior results to ProRes.

Bottom line, trust only your eyes (and ears) - there is more going on in it all than "simple" math.

TheDingo · May 2, 2016

Patryk_Rebisz said:
Bottom line, trust only your eyes (and ears) - there is more going on in it all than "simple" math.

+1

Bassman2003 · May 2, 2016

How about this scenario:

If you had an 8bit version of DNxHD vs a 10bit version of DNxHD, how much difference would that show?

And further, what if you added a camera with low dynamic range and a camera with higher dynamic range?

How would the camera with higher dynamic range recording to an 8bit codec fare against the lower dynamic range camera recording to a 10bit codec?

Yes, this is splitting hairs, but this is what buyers need to know when considering all of the camera and recording options out there.

Thanks!

AndreeOnline · May 2, 2016

I don't think this subject is fluffy at all. These are technical terms that mean actual things. Wikipedia is your friend.

The way you phrase your question, I think you should read up on the topic a bit before "discussing" it here on the forum. You'll get speculation here, dressed up as factual information. And since the topic isn't subjective, just read up on it.

Patryk_Rebisz · May 2, 2016

AndreeOnline said:
I don't think this subject is fluffy at all. These are technical terms that mean actual things. Wikipedia is your friend.

The way you phrase your question, I think you should read up on the topic a bit before "discussing" it here on the forum. You'll get speculation here, dressed up as factual information. And since the topic isn't subjective, just read up on it.

Well, the topic is "subjective" in a way, as every component in your chain can be the thing that screws it up. Your editing software can interpret the gamma one way in one version and differently in another (Premiere CC 2016, for instance), and the compression of what comes from the camera might influence the choice of master codec (as was the case with our film). One can "objectively" claim (based on the reality of mathematical numbers) that going to a 10 bit codec when your source material is 8 bit doesn't add anything, but then that someone would be wrong.

bdemenil · May 2, 2016

AndreeOnline said:
I don't think this subject is fluffy at all. These are technical terms that mean actual things. Wikipedia is your friend.

The way you phrase your question, I think you should read up on the topic a bit before "discussing" it here on the forum. You'll get speculation here, dressed up as factual information. And since the topic isn't subjective, just read up on it.

If I had found a clear answer elsewhere, I wouldn't have posted here. It sounds like you know something on the subject, so perhaps you'd care to shed some light.

To the other posters, I'm asking about the physical properties of samples - which are definite and not open to interpretation. To simplify the discussion, let's assume we are using raw uncompressed images.

AndreeOnline · May 2, 2016

bdemenil said:
I'm asking about the physical properties of samples - which are definite and not open to interpretation.

But as you can see above, you are entering a realm where stuff "based on the reality of mathematical numbers" will be wrong.

When I wrote my post I didn't actually know that my foresight was THAT good, but yes... there you have it. Well, no one will get hurt by turning this topic over in a friendly manner once or twice. Enjoy yourselves! =)

bdemenil · May 2, 2016

AndreeOnline said:
But as you can see above, you are entering a realm where stuff "based on the reality of mathematical numbers" will be wrong.

When I wrote my post I didn't actually know that my foresight was THAT good, but yes... there you have it. Well, no one will get hurt by turning this topic over in a friendly manner once or twice. Enjoy yourselves! =)

I'm not sure how the math can be susceptible to interpretation. I'm not asking what looks better, just what the mathematical properties are. If you'd prefer not to contribute that's fine. But then why are you posting?

morgan_moore · May 2, 2016

I think your OP is basically correct.

You need more bits to capture wide DR. With too much DR you don't have enough bits per stop, and therefore not enough tonal detail.

Also log encoding makes better use of the bits so 10bit log might equal 12bit linear (maybe???)

One real world element you miss is compression ratio.

Most of the artifacts we see are not due to a lack of bits but to too much compression.

S

dmitrizigany · May 3, 2016

Isn't this what the whole thread about the raw output of the FS7 is about? People seem to be in agreement (on the Internet? No way!?) that the raw of the FS7 is subpar, as 12bit raw is too small a container for the camera's 14+ stop dynamic range.
That is, if I have understood that discussion correctly.

combatentropy · May 3, 2016

bdemenil said:
When I first learned of cameras recording in 10bit and 12bit, I assumed that greater luminance and color precision go hand in hand with greater dynamic range. Now, after some research, I am not sure that is the case. If bit rate remains constant, wouldn't increasing dynamic range actually reduce precision? I.e., if the camera captures more low-light information and more highlight information, but encodes this larger amount of information using the same bit length, wouldn't that mean that the amount of range allocated to each interval in the data sample would necessarily increase?

On a separate but related topic: if a single-layer sensor (like the ones most cameras today use) captures in black and white, then is it true that the precision and dynamic range of colors and luminance are linked? I.e., more precise luminance = more precise colors? And more dynamic range in luminance = more precise colors in the darks and lights?

And is dynamic range in color even an issue? Don't good cameras already capture all visible color? Would you want to capture UV and IR?

Bit depth and dynamic range, the perennial question. They are related but not how you expect, sort of like a movie's twist ending.

Dynamic range. A sensor is always generating a certain level of noise. This is called the noise floor. A sensor also has only so much room in each of its pixels. This is called the full-well capacity. Suppose a sensor has a noise floor of 10 electrons and a well capacity of 40,000 electrons. To determine the dynamic range, you divide the well capacity by the noise floor: 40,000 ÷ 10 = 4,000. Its dynamic range is 4,000:1. To convert that to f-stops, you say, "2 to the what equals 4,000?" Answer: just under 12. So, about 12 f-stops. (Mathematicians call it the binary logarithm: the binary logarithm of 4,000 is about 11.97.)
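The arithmetic in that paragraph can be sketched in a few lines of Python. The function name is made up for illustration, and the 40,000 and 10 electron figures are just the example numbers from the post, not measurements of any real sensor.

```python
import math

def dynamic_range_stops(full_well_electrons, noise_floor_electrons):
    """Return (contrast ratio, dynamic range in stops) for a sensor."""
    ratio = full_well_electrons / noise_floor_electrons
    # One stop is a doubling of light, so stops = binary logarithm of the ratio.
    return ratio, math.log2(ratio)

ratio, stops = dynamic_range_stops(40_000, 10)
print(f"{ratio:.0f}:1 is about {stops:.2f} stops")
```

The result lands a hair under 12 because 2 to the 12th is 4,096, slightly more than 4,000.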

Bit depth. The sensor converts light into an electrical signal. This signal is analog. Ew. So it gets fed into an analog-to-digital converter. Guess what it does. The signal gets converted into numbers. The bit depth is how high the converter can count: n bits gives 2 to the nth power values. So 10 bit is 2 to the 10th power (1,024). The monkeys in the audience would point out that I have simplified the truth. The truth is that the converter is probably 12- or 14- or 16-bit, but down the pipe this can get reduced, and it only matters what ends up on tape. (The monkeys in the audience will point out that we don't use tape anymore.) Anyway, somehow you end up with a recorded signal which is so-many-bits (8, 10, or 12, usually). This determines how many steps of brightness you have. If you have too few steps, you get banding, especially if you push and pull it in post. 10 bits per channel is usually enough, especially if recorded logarithmically instead of linearly.

Now we are almost ready to understand the relationship between dynamic range and bit depth.

Think about your TV (or computer screen). It too has a dynamic range. I'm afraid to say it is probably only about 5 or 6 stops. You want it that way. A TV that had 10 or 12 stops of dynamic range would have to have a very bright upper limit. Watching a 12-stop TV render a scene at high noon would be like looking out your window at high noon. It would cause you to squint. You might even get a sunburn. I prefer my TV set to top out at 6 stops, because I want my living room to be comfortable. If I wanted real life, I would go outside. I want my TV shows nice and subdued. Don't challenge me, mentally or physically.

Anyway, so if your camera is capable of capturing 12 stops but it's going to be shown on a screen with just 6, what happened to the other 6? That's a great question, and most people's ignorance of the transfer is what leads to milky or crumby imagery. The short answer is, if you don't make the decision, your editing software and people's TVs will make the decision for you. The medium answer is: Learn to use the Curves control on your editor, and apply an S-curve. Or find a really good LUT. The long answer is outside of the scope of this forum post.

Bit depth has an important connection to dynamic range, but not so much the dynamic range of your camera as the dynamic range of the TV. The human eye can discern a certain number of levels of brightness per f-stop. Some say 30. If an image has fewer than 30 steps per f-stop, you can start to see banding. If a screen has 6 stops, then you need 180 steps. Since 8-bit is 256 steps, 8-bit is plenty. But that's only if you didn't do any significant pushing or pulling in post. When you apply LUTs and curves and raise exposure or roll off the highlights, you are essentially moving steps around from one f-stop to another. If you do it too much, you can starve a certain brightness level of steps, and banding will appear. So that's why, in your camera, you want to start off with 10 bits, and you want to record them logarithmically. Norman Koren can explain why, if you can cut through his article: Tonal quality and dynamic range in digital cameras.
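Here is a rough sketch of the "starving" effect, under simplifying assumptions: the grade is modelled as a plain linear stretch of the bottom quarter of the code range up to a full 8-bit output, which is much cruder than a real curve or LUT. The function name is hypothetical, and the 30 steps per stop figure is the one quoted in the post.

```python
def levels_after_stretch(low, high, out_bits=8):
    """Stretch source codes low..high to the full output range and
    count how many distinct output codes result."""
    out_max = 2 ** out_bits - 1
    stretched = {round((code - low) * out_max / (high - low))
                 for code in range(low, high + 1)}
    return len(stretched)

steps_needed = 30 * 6                       # 30 steps per stop on a 6-stop screen
from_8bit = levels_after_stretch(0, 63)     # bottom quarter of an 8-bit source
from_10bit = levels_after_stretch(0, 255)   # bottom quarter of a 10-bit source
print(steps_needed, from_8bit, from_10bit)  # 180 64 256
```

With an 8-bit source the stretched shadows cover the whole output with only 64 distinct levels, well short of 180. The same stretch from a 10-bit source still fills all 256 output codes.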

Bassman2003 · May 3, 2016

Thanks for the posts. Great that this can be brought into the open, since the topic gets thrown around a lot. There seem to be three major players: Bit Depth, Dynamic Range and Bit Rate (compression). All three would be the holy grail, two would be grand, and one is what we usually get in most cameras. To get all three is usually out of the budget, so which two would you prefer to have?

Razz16mm · May 3, 2016

With regard to any standard REC709 compliant HD codec, the dynamic range is set by the analog reference voltage values that are encoded in the digital file. Bit depth alone does not change video DR at all, any more than measuring a one meter ruler in millimeters instead of centimeters changes the length of the stick. You will see this value listed on Sony camera spec sheets as 57dB signal to noise ratio, or 9.5 stops, close to the theoretical maximum for REC709 video. Log curves are a means of altering linear voltage values captured by the sensor, which can cover a much broader DR than the video codec standards, to bring them within a more usable range for video.
Digital video encoding is just a ruler used to define a discrete scale of binary values for a predefined analog voltage standard.
All video systems are in fact analog systems. They capture light as a range of voltage values per pixel at one end and output an analog image as light from your display at the other end.
Digital encoding just provides a means of storing, modifying, and transporting values of light in between.
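For anyone wanting to check the 57dB to 9.5 stops conversion quoted above, a minimal sketch follows. It assumes the spec-sheet figure uses the usual amplitude convention for video signals (dB = 20 × log10 of the ratio), under which one stop is about 6.02 dB. The function name is made up for illustration.

```python
import math

def snr_db_to_stops(snr_db):
    """Convert a signal-to-noise ratio in dB to stops,
    assuming the 20*log10 (amplitude) dB convention."""
    db_per_stop = 20 * math.log10(2)   # a factor of 2 is about 6.02 dB
    return snr_db / db_per_stop

print(f"{snr_db_to_stops(57):.2f} stops")
```

That comes out just under 9.5 stops, in line with the figure in the post.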

morgan_moore · May 3, 2016

Bassman2003 said:
which two would you prefer to have?

I'd go for bit depth and lack of compression.

Bit depth, lack of compression: BMC
DR, compression: Sony
All three: Canon, at 2 or 3x the cost?

S

AndreeOnline · May 3, 2016

bdemenil said:
I'm not sure how the math can be susceptible to interpretation. I'm not asking what looks better, just what the mathematical properties are.

Yes, I agree with this. That's why I said what I said in my original post—it's not fluffy. There are several good articles you can read about bit depth and DR on Wikipedia.

You put the burden of what I quoted on me, when it was in fact from higher up in the thread. It was an example of a direction that gets kind of fuzzy.

I recognize this topic as one of those where many will want to help you, well intended, but in many cases they won't. Stuff gets injected that adds to the confusion… like more subjective, experience based interpretations. That's why I wanted to direct you towards more technical documents.

The two things are separate. A sensor's DR is what it is: a Signal to Noise measure. Contrast. Full well capacity relative to the noise floor. Your DR is set in stone before the codec comes into play.

Can you fit 12 stops (4096 values) into an 8bit (256 values) codec? Yes.

In linear light, the last stop of your 12 stop signal takes up 2048 values, just by itself. The first stop only gets 1 value. By evening out the values over all the stops (in reality they are not all treated quite equally) you get 256/12 = 21 values per stop. Kind of like bending a tall stick to fit in a small room. This is your log profile.

You can debate how many values you need per stop in order to not get banding. By going to 10bit with your 12 stops you get around 85 values per stop. Much better. Trying to cram 14 stops into 8 bits gets you 18 values per stop, which is decidedly too low. Of course... the log profile makes sure that the shadow stops get fewer values than skin tones, let's say (so not quite an even distribution).
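The per-stop averages above are simple division, sketched here for reference. This assumes an idealised log encoding that spreads the whole code range evenly over the stops, which, as noted, real log profiles do not quite do. The function name is hypothetical.

```python
def average_codes_per_stop(bit_depth, stops):
    """Average number of code values per stop if the full code
    range were shared evenly across all stops."""
    return 2 ** bit_depth / stops

for bits, stops in [(8, 12), (10, 12), (8, 14)]:
    avg = average_codes_per_stop(bits, stops)
    print(f"{bits}-bit, {stops} stops: {avg:.1f} values per stop")
```

This gives roughly 21, 85 and 18 values per stop for the three cases discussed.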

Saving an 8bit encoded signal to a higher bit depth adds nothing in terms of quality and won't increase DR. It won't hurt you, you just get value duplication. But of course, even if you shot 8bit, you still want to do post at a much higher bit depth to handle all the calculations. Resolve does 32bit floating point for instance.
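A small sketch of why promoting 8-bit material to a 10-bit container adds no new tonal levels. It assumes the simplest possible promotion, a 2-bit left shift; real converters may scale slightly differently, but the count of distinct levels stays the same. The function name is made up for illustration.

```python
def promote_8_to_10(code8):
    """Place an 8-bit code in a 10-bit container by shifting left 2 bits."""
    return code8 << 2

codes10 = [promote_8_to_10(code) for code in range(256)]
distinct = len(set(codes10))
print(distinct, "distinct levels out of", 2 ** 10, "available; max code", max(codes10))
```

Only 256 of the 1,024 available codes are ever used, so the 10-bit file carries exactly the tonal steps the 8-bit source had.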

There are techniques to combat banding like introducing a small amount of noise and such, but that is another topic.

Samuel H · May 3, 2016

There are already some very good answers in this thread but I'll try to have my go as well...

If you have 10 stops of dynamic range and your video is 8-bit, you'll have an average of 25.6 brightness values per stop of light. But usually there's a built-in s-curve (non-log gamma curves) or at least a knee at the bottom (all sorts of log use this AFAIK), which means you'll probably have 30 values per stop of light in the midtones but only 15 in the shadows, and maybe just 8 in the darkest couple of stops the camera can capture.

If the camera can record 15 stops of DR and the video remains 8-bit, all those numbers fall: the average is 17 brightness values per stop of light, going to 10 and 5 in the shadows. Those are very low numbers. You can already see the banding issues in your head.

Now, move to 10-bit, and all those numbers get 4 times bigger. Even with 15 stops of DR, you have 68 values per stop of light on average, and 40 to 20 in the shadows.

Some complications:
- Compression is much more important than bit depth. And so is workflow. You can have horrible banding in 10-bit video with just 10 stops of DR, and you can see no issues in footage coming from an 8-bit camera recording 13 stops of light.
- Sometimes the codec doesn't use the full color space. Technicolor CineStyle on Canon DSLRs, and slog3 on a7SII and a6300, both leave a big chunk of the brightness range unused. This means fewer values per stop of light in the part that is used.
- I'm always talking about gamma- or log-encoded 8-bit or 10-bit. If you are storing linear values, then you need A LOT more bits, because it's very inefficient in the shadows. 12-bit linear RAW is barely good enough for 10 stops of dynamic range (the bottom two stops get 4 and 8 values each).
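The linear-encoding point in the last bullet can be checked with a short sketch. It assumes an idealised linear encoding where the brightest stop occupies the top half of the code range and each stop below it gets half as many codes. The function name is hypothetical.

```python
def linear_codes_per_stop(bit_depth, stops):
    """Code values available to each stop in a linear encoding,
    listed from the brightest stop down to the darkest."""
    total = 2 ** bit_depth
    # Each stop down halves the light, and so halves the code values it spans.
    return [total >> (i + 1) for i in range(stops)]

per_stop = linear_codes_per_stop(12, 10)
print(per_stop)  # [2048, 1024, 512, 256, 128, 64, 32, 16, 8, 4]
```

The bottom two stops of a 10-stop scene end up with 8 and 4 values in 12-bit linear, matching the numbers above. Stretch the same 12 bits over 12 stops and the darkest stop is left with a single value.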

Bassman2003 · May 3, 2016

Thanks for your responses.

Patryk_Rebisz · May 3, 2016

Samuel H said:
- Compression is much more important than bit depth. And so is workflow. You can have horrible banding in 10-bit video with just 10 stops of DR, and you can see no issues in footage coming from an 8-bit camera recording 13 stops of light.
- Sometimes the codec doesn't use the full color space. Technicolor CineStyle on Canon DSLRs, and slog3 on a7SII and a6300, both leave a big chunk of the brightness range unused. This means fewer values per stop of light in the part that is used.
- I'm always talking about gamma- or log-encoded 8-bit or 10-bit. If you are storing linear values, then you need A LOT more bits, because it's very inefficient in the shadows. 12-bit linear RAW is barely good enough for 10 stops of dynamic range (the bottom two stops get 4 and 8 values each).

this.
