Does AI have limits? You come to this question very quickly when you begin to think about phone cameras. They have tiny lenses, which would once have limited both the amount of light and the resolution of photos. Once upon a time, the limited aperture would have meant long exposure times (and camera shake). It would also have created a resolution problem: with so small an aperture you could not resolve distant details (that is the diffraction limit at work). How does a phone camera wriggle out of this problem and produce photos as sharp as these two?
There are two parts to the answer. One is physics: the chemistry of silver halide crystals is replaced by the electronics of CCDs. The pixels can be made smaller, and there are clever constructions for noise reduction. As a result, you get closer to the ideal of “one photon, one pixel”, although not very close, at least on the cheap phone that I use. The other is mathematics: there is a lot of computation between what the CCD records and what you see. First there is the subtraction of the background noise. Then the AI begins to make statistical inferences about the image. Earlier I’d mentioned computational super-resolution: the improvement of lens resolution by making assumptions about the nature of the lighting. In both the photos above I looked at another technique that the AI uses: image averaging.
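To make that first step concrete, here is a minimal sketch in Python/NumPy of dark-frame subtraction, one standard way of removing the sensor’s background bias. All the numbers are invented for illustration; I do not know exactly how my phone implements this.

```python
# A minimal sketch of dark-frame subtraction, with invented numbers.
import numpy as np

rng = np.random.default_rng(0)

# Pretend sensor readout: a flat, evenly lit scene, plus a fixed per-pixel
# background ("dark current") pattern, plus random shot/readout noise.
scene = np.full((480, 640), 100.0)                    # true light signal (arbitrary units)
dark_pattern = rng.normal(10.0, 2.0, scene.shape)     # fixed background bias of each pixel
readout = scene + dark_pattern + rng.normal(0.0, 5.0, scene.shape)

# A dark frame is read out with no light falling on the sensor,
# so it records only the background (plus its own random noise).
dark_frame = dark_pattern + rng.normal(0.0, 5.0, scene.shape)

corrected = readout - dark_frame
print(round(readout.mean(), 1))    # ~110: biased upwards by the background
print(round(corrected.mean(), 1))  # ~100: the background bias is gone
```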
When I looked at this scene of the Sahyadris, there was a wall of dragonflies between me and the far cliffs. Each hovered in the air to find prey, then quickly darted to it. The light was not too bad, and on a normal camera many would be blurred, but some would perhaps be sharp because they would be caught hovering. I wondered what the 65 megapixels of my phone camera would catch. Surprise! It caught nothing, although the EXIF told me that the exposure was 1/1912 of a second. Nothing at all, as you can see in this full-size zoomed segment of the photo. I went over the photo segment by segment at full size. Nothing! What happened?
The phone camera took multiple readouts (frames) from the CCD sensor and then added them together to form the final image. This image averaging gives noise reduction: pixels are averaged over frames, and random noise cancels out. But the random darting of the dragonflies also mimicked noise, and was removed. The exposure time written in the EXIF is probably a sum over the exposure times of the frames; a shorter reported exposure perhaps means that a smaller number of frames was averaged over.
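A toy simulation shows why this happens. The sketch below (Python/NumPy, with made-up numbers rather than anything from the phone’s real pipeline) averages frames of a static background carrying random noise and a single dark speck that jumps to a new spot in every frame, a stand-in for a darting dragonfly. The noise shrinks roughly as the square root of the number of frames, and the speck is diluted into the background, just as the dragonflies were.

```python
# A toy demonstration of plain frame averaging, with made-up numbers.
import numpy as np

rng = np.random.default_rng(1)
n_frames, h, w = 16, 200, 200
background = np.full((h, w), 120.0)        # the distant cliffs: identical in every frame

frames, speck_positions = [], []
for _ in range(n_frames):
    frame = background + rng.normal(0.0, 10.0, (h, w))   # random readout noise
    r, c = rng.integers(0, h), rng.integers(0, w)        # a "dragonfly" at a new spot each frame
    frame[r, c] = 10.0                                   # a dark speck against the cliffs
    frames.append(frame)
    speck_positions.append((r, c))

average = np.mean(frames, axis=0)
r0, c0 = speck_positions[0]
print(round(np.std(frames[0]), 1), "->", round(average.std(), 1))  # noise drops ~sqrt(16) = 4x
print(round(average[r0, c0], 1))  # ~113, not 10: the moving speck is averaged away like noise
```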


Do I have an image that tells me that the camera is doing image averaging? Yes, the image comparator above tells me that. The “original image” (compressed for the purposes of this blog to 640×480 pixels) is shown on the left. The photo was taken from the car as it slowed for a speed bump. The EXIF data tells me that this was shot at f/1.7 with an exposure of 1/2422 of a second. In that time I estimate that the car would have moved by a hair over 1/2 mm. The original looks sharp here, and looked even sharper on my phone. But the full-size zoom shows strange artifacts. The lettering on the signboard is very blurred, as it would be if multiple images were added together. But the narrow leftmost pole supporting the roof of the shack is perfectly clear. Similarly, the edges of the sun umbrella are clean. This is clear evidence that the AI has selectively added parts of images. It is not simple image averaging but adaptive multi-frame averaging that is at work.
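One crude way such a selective merge could be done is sketched below; the agreement rule, the threshold, and the test data are my own guesses for illustration, not the phone’s actual algorithm. Each pixel is averaged only over the frames that agree with a reference frame, so static regions get full noise reduction while regions that changed fall back towards the reference.

```python
# A crude sketch of adaptive multi-frame averaging. The agreement rule and the
# threshold below are my own guesses for illustration, not the phone's algorithm.
import numpy as np

def adaptive_average(frames, threshold=15.0):
    """Average each pixel over the frames that agree with the reference frame;
    frames that disagree at a pixel (e.g. because something moved) are left out there."""
    reference = frames[0]
    stack = np.stack(frames)                                  # shape: (n_frames, height, width)
    weights = (np.abs(stack - reference) < threshold).astype(float)
    weights[0] = 1.0                                          # the reference frame always counts
    return (stack * weights).sum(axis=0) / weights.sum(axis=0)

# Demo: eight noisy frames of a static scene, with a patch that moved in one frame.
rng = np.random.default_rng(2)
base = rng.uniform(50.0, 200.0, (64, 64))
frames = [base + rng.normal(0.0, 5.0, base.shape) for _ in range(8)]
frames[3][10:20, 10:20] += 80.0                               # the "moving" region
merged = adaptive_average(frames)
print(round(np.abs(merged - base).mean(), 2))                 # small: noise reduced, motion rejected
```

If anything like this is happening, how sharp a region ends up depends on how well the frames can be reconciled there, which fits the crisp pole and the smeared lettering.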


Now let’s get back to the photo of moss on a brick wall to see how much detail I could get from it. It was taken in full sunlight. At f/3.2 my macro lens required an exposure of 1/200 of a second to capture the moss in the comparison photo on the right. The phone camera lens has about 1/25 of the area, so if I’d masked my macro lens to keep only a phone-camera-sized area free, the exposure would have climbed to 1/8 of a second. But the phone camera reported f/1.7 (the lens is fixed), with an exposure of 1/264 of a second. Yet when I looked at the phone camera output at full size, I saw the blur on the left! Why?
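The arithmetic can be checked in a couple of lines; the snippet below uses only the figures quoted above, and shows that the phone’s reported exposure is roughly 33 times shorter than the naive single-exposure estimate.

```python
# A back-of-envelope check using only the figures quoted above.
macro_exposure = 1 / 200       # seconds, at f/3.2 on the macro lens
area_ratio = 25                # the macro lens gathers ~25x the light of the phone lens
masked_exposure = macro_exposure * area_ratio
print(masked_exposure)         # 0.125 s, i.e. 1/8 of a second

phone_exposure = 1 / 264       # what the phone's EXIF actually reported
print(round(masked_exposure / phone_exposure))  # ~33x shorter than the naive estimate
```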
First, keep in mind that the exposure time of the photo of moss implies averaging over about seven times as many frames as that of the cliff. You might expect so much averaging to give a cleaner image, not a blurred one. But I suspect that the blur in this second photo is due to image averaging interacting with computational super-resolution: the improved lens resolution that AI gives. Since the small details in the zoomed view are almost entirely due to computation, little changes in the illumination can change the inferred image. Averaging over these changing inferences can then give the blurred details that you see. In the second zoom into the same photo you can see that the deep shadows look noisier and equally blurred. This is also what you might expect from averaging over super-resolved frames: CCD noise is removed, but inferred details are blurred by averaging over differing inferences.
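Here is a toy one-dimensional illustration of the mechanism I am suspecting; everything in it is invented. If each frame’s super-resolution step infers a razor-sharp edge at a slightly different position because of noise, then the average of those sharp inferences is a soft ramp, that is, blur.

```python
# A toy 1D illustration of sharp inferences averaging into blur; all numbers are invented.
import numpy as np

rng = np.random.default_rng(3)
x = np.arange(200)
true_edge = 100.0                                     # the true position of a fine edge

renders = []
for _ in range(16):
    inferred = true_edge + rng.normal(0.0, 3.0)       # noise nudges the inferred position a little
    renders.append(np.where(x < inferred, 0.0, 1.0))  # each frame renders a razor-sharp edge

average = np.mean(renders, axis=0)
ramp_width = np.count_nonzero((average > 0.05) & (average < 0.95))
print(ramp_width)   # typically ~8-12 pixels: every frame was sharp, but their average is blurred
```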
Phone photography changes our expectations of the interplay between camera hardware and the final image so dramatically that it is worth rethinking what photography means. I intend to explore this a bit in this series.
Very interesting indeed! Thank you for posting. You mention the left or the right photo, but on my iPad I see the images above and below each other. It would perhaps be helpful to name them first and second, wouldn’t it?
Thanks for that comment. I’ll check why that happened.
When I viewed the article on the web, the photo comparator (with the slider) was there but when I viewed it on the WordPress app (Android), the photos appeared one below the other. So it’s to do with how the app is programmed.
Thanks for that check
This is an excellent analysis, Sir, and it begins to explain a lot of what I have been experiencing with my mobile camera pictures. I had to read and reread it to absorb some points, which I will now try to correlate with what had happened to some of my pictures. Please do continue with this series – it is a scientific revelation.
Thanks. I hope the next one I write is easier to follow and you do not have to read it many times to understand what I am trying to say.
No, no… please do not water down the language – it will dilute the technicality of the subject. I think what I wanted to say did not come across correctly. I meant that I had to reread because, as I read it, my mind started correlating it with my experiences – so I had to go over some lines again to make sure that I had understood correctly. And sometimes when facts hit home, you want to reread and soak in it a while. 🙂
Thanks for the feedback
I had a similar experience when I went to Himeji Castle in Japan. On the uppermost floor there was a lookout from which visitors could view the city below. One problem, though: a wire mesh had been installed, probably to prevent things (or people) from falling. This made taking a clear shot of the cityscape impossible, or at least that’s what I thought until I used my mobile phone camera. Almost like magic, the wire mesh disappeared in the photo!
Quite likely, in your case the AI was working deliberately to remove the visual obstruction. Try it again through wire meshes to check. In this example, the AI’s noise-reduction method was fooled.
Very interesting. Took time to understand but very helpful for novices.
Glad you liked it