AI partially generating images

Why aren't we seeing AI turn something like files from an FX3 into Alexa 35-esque images?

I know that compared to total image generation, something like this would be a fraction of the market, but is anyone else surprised there isn't yet an AI gap filler? If FX3s all shot the EXACT same quality as an A35, the only differences between cameras (still potentially huge differences to some) would be size, I/O, and physical build quality. If the Arri image comes partly from a more powerful computer than the FX3's, this would essentially give us the current gold standard by spreading the computing across a few more stages.

Or what about recording to SD cards but ending up with the highest-quality files available? I guess Resolve will implement similar tools soon enough, but I'm surprised there's not more hand-in-hand positioning of the AI tech.
 
I think you have a good idea, if it's possible. Can AI make up for a lack of dynamic range and color accuracy? Making lesser cameras look like top-of-the-line cameras seems more difficult than just generating new images. I'm sure the camera companies are neck-deep in all of this, looking for an edge. They've successfully used AI to improve autofocus; image enhancement can't be far behind.
 
I wonder if we'll see a large camera company throw the kitchen sink at a product release. If generational leaps in IQ are on the table, a company's reputation will take a hit for holding back, but maybe the product will be so good no one will care.

If, say, Resolve can bridge the gap, this would inform several new camera body designs, rather than needing to bend the knee to the computer. They might be small changes, or it'd probably just be a mirrorless minus the associated downsides (no micro-jitters, no need for a large PL lens because lenses are chosen in post, etc.), which would obviously be welcomed by a lot of people.
 
You bring up one of the tough things about AI. It is supposed to do everything, but in practice, it kind of does everything in a mediocre fashion.

I wasn't thinking about lens choices etc.; I was just thinking about improving the image quality from a sensor point of view, in-camera. Sort of like how smartphones can get away with 10x the processing compared to mirrorless cameras. Color fidelity and accuracy are not an issue with modern cameras, but representing a huge contrast-ratio scene is. So I say keep it small and focused, so maybe the AI can win!
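
For a concrete sense of what that extra processing buys, here's a minimal sketch (Python with OpenCV; the file names and three-shot bracket are assumptions) of the exposure-fusion trick smartphone pipelines lean on: merge a bracketed burst so the output covers more scene contrast than any single frame could.

```python
import cv2
import numpy as np

# Illustrative sketch of the multi-frame trick smartphones lean on: merge a
# bracketed burst so one output covers more scene contrast than any single
# exposure. The file names and three-shot bracket are assumptions.
exposures = [cv2.imread(name) for name in ("under.png", "mid.png", "over.png")]

# Mertens exposure fusion needs no camera response curve or tone mapper; it
# weights each frame per-pixel by contrast, saturation, and well-exposedness.
fused = cv2.createMergeMertens().process(exposures)  # float output, roughly 0..1

cv2.imwrite("fused.png", np.clip(fused * 255, 0, 255).astype("uint8"))
```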
 
It really depends on the standard to which you hold your work (e.g., against the sadly prevalent "good enough" mantra).

It's possible today for AI to generate only the content needed to fill in parts of the image where highlights are blown or shadows are crushed, but that's all you'll be getting: a generated guess of what was there, not the scene as it would've been shot on a higher-dynamic-range camera.

Because physics will always limit how many photons a camera can collect relative to its circuitry noise, if you want the look of an Arri, ultimately you have to shoot with one.
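
To make that "generated guess" concrete, here's a minimal sketch in Python/OpenCV, with classical inpainting standing in for a generative model (the thresholds and file names are illustrative assumptions): build a mask of clipped pixels, then synthesize something plausible inside it.

```python
import cv2
import numpy as np

# Build a mask of "unrecoverable" pixels: blown highlights and crushed shadows
# where the sensor clipped and no real detail survives. The 250/5 thresholds
# on the 8-bit scale are illustrative assumptions.
def clipped_mask(img_bgr, hi=250, lo=5):
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    mask = ((gray >= hi) | (gray <= lo)).astype(np.uint8) * 255
    # Dilate slightly so the fill blends past the hard clip boundary.
    return cv2.dilate(mask, np.ones((5, 5), np.uint8))

frame = cv2.imread("frame.png")  # hypothetical input frame
mask = clipped_mask(frame)

# Classical inpainting stands in for a generative model here; either way the
# filled region is a plausible guess, not recovered scene information.
filled = cv2.inpaint(frame, mask, inpaintRadius=7, flags=cv2.INPAINT_TELEA)
cv2.imwrite("frame_filled.png", filled)
```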

I might be feeling the sunk cost of trillions of taxpayer dollars, but fingers crossed, maybe in a few years it won't smear the same half-cooked cartoon aesthetic over everything.
 
Image quality is not easily discussed because the whole is often greater than the sum of its parts.

It may also be that it's not really economically viable for any potential manufacturer to put in the time and effort to produce an AI-powered camera that's an Alexa 35 competitor.

Regarding software solutions, I agree with this:
It's possible today for AI to generate only the content needed to fill in parts of the image where highlights are blown or shadows are crushed, but that's all you'll be getting: a generated guess of what was there, not the scene as it would've been shot on a higher-dynamic-range camera.

Because physics will always limit how many photons a camera can collect relative to its circuitry noise, if you want the look of an Arri, ultimately you have to shoot with one.
 
It's possible today for AI to generate only the content needed to fill in parts of the image where highlights are blown or shadows are crushed, but that's all you'll be getting: a generated guess of what was there, not the scene as it would've been shot on a higher-dynamic-range camera.
Yeah, and in the narrative/advertising space, I can see the guess versions being good enough if they feel there's a net positive: "not the same as what we shot at all, but the sky went from white to deep cyan, so who cares?"

I haven't been actively looking, but I'm still surprised we haven't seen even bad versions of this yet. Even Topaz's video tools (https://www.topazlabs.com/topaz-video) don't include any highlight-recovery tricks.
 
Why aren't we seeing AI turn something like files from an FX3 into Alexa 35-esque images?

Part of the problem is that it can only work from what's there and what it's been trained on. How would you train an AI to expand dynamic range or perform color-space alterations? It's not impossible, but those specifications would need to be clearly defined and isolated for it to learn how to do them properly.
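
One way that training could be defined, sketched loosely below in PyTorch: paired footage of the same scene from both cameras, so the "specification" is the target camera itself. The tiny residual network, L1 loss, and synthetic stand-in data are all illustrative assumptions, not anyone's shipping pipeline.

```python
import torch
import torch.nn as nn

# Synthetic stand-in for a real paired dataset: rig both cameras on the same
# scene, align the frames, and each sample becomes an (input, target) pair.
paired_loader = [(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
                 for _ in range(8)]

class UpliftNet(nn.Module):
    """Tiny illustrative net regressing one camera's frames toward another's."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        # Predict a residual, so the net only learns the delta between the
        # cameras (tone curve, color response), not the whole image.
        return x + self.body(x)

net = UpliftNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
loss_fn = nn.L1Loss()

for source_frame, target_frame in paired_loader:
    opt.zero_grad()
    loss = loss_fn(net(source_frame), target_frame)
    loss.backward()
    opt.step()
```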

I haven't gotten deep into it, but I have messed around with recreating "cinematic" scenes. I was impressed, but it's a long way off from being any kind of replacement like some are predicting.

Where it could be extremely useful is in visual FX. Being able to alter real footage to include cost-prohibitive effects would be a boon to low-budget filmmakers. I don't know that it would necessarily put VFX artists out of work, either; I think they would be the primary beneficiaries if the tools were made with their use cases in mind. Instead of lengthy renders, they could design assets that could be animated or modeled using AI, cutting down development time. What once took weeks or months would take minutes to mock up and submit.
 
Part of the problem is that it can only work from what's there and what it's been trained on. How would you train an AI to expand dynamic range or perform color-space alterations? It's not impossible, but those specifications would need to be clearly defined and isolated for it to learn how to do them properly.
I don't think it'll necessarily do anything properly; it'll either be "that's good enough," or the result will be quite different but we won't mind, because it looks "better" than not applying the tech. Maybe a ChatGPT/BM camera would be a regular camera with a built-in secondary 360 camera purely to assist in filling the gaps.

I'd accept that it's not possible if there's nothing there already. But it's more about the end result, so it doesn't necessarily matter that getting there a certain way isn't an option, and not many people will care that a different path was taken.
 