Midweek mobile 12

Push anything to extremes and flaws will begin to show up. Cell phone cameras now boast of photos with around 100 million pixels, taken through toy lenses a few millimeters across. The images that the phone gives you are the result of statistical computations based on fuzzy data. The enormous amount of computation needed for building an image (yes, images are now built, not captured) drains your phone’s battery faster than other uses would. How do you actually get to see the flaws of the optics? One way is to push the camera to extremes. Here I look at low-light photography, where the camera’s ISO boost begins to amplify the problems that the image always has.

The featured photo was taken with the 4.7 mm lens at the back of my phone, with an exposure of 1/13 seconds (this means averaging over an enormous number of captures). The original image had 9248 pixels on the long side. When I compress it down to 1250 pixels for WordPress, the result is a crisp picture. Examine it at larger scales, though, and flaws emerge. The detail shown in the photo above takes a segment which is 830 pixels on the long side and compresses it to 640 for WordPress. The camera chose an ISO of 15047, and there is quite a bit of noise in the detail. You can see lens flare below the arch. Above the arch you can see some of the railing blown out. Those pixels are saturated and nothing you do can bring information out of them. Elsewhere, the railings are full of digital artifacts such as aliasing.
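To see what the ISO boost does to the data, here is a toy calculation in Python (the numbers are invented for illustration, not read off my phone): a dim signal is multiplied by a large gain, the read noise is multiplied along with it, and the brightest pixels clip at the saturation value, which is exactly the combination of grain and blown-out railing visible in the detail.

    import numpy as np

    rng = np.random.default_rng(1)
    full_well = 1023                                  # saturation value of a hypothetical 10-bit readout
    scene = np.array([2.0, 5.0, 20.0, 200.0])         # dim shadows up to a bright railing (arbitrary units)
    read_noise = rng.normal(0, 1.5, scene.shape)      # sensor read noise, the same size whatever the scene
    gain = 16                                         # the "ISO boost"

    raw = np.clip((scene + read_noise) * gain, 0, full_well)
    print(raw)    # the shadows are dominated by amplified noise; the railing clips at 1023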

In the slideshow above you see an even more extreme case. This is a photo taken in a dark wood on a new moon night looking for owls using flashlights (yes, this was how I spent my Diwali). The camera chose an ISO of 17996 and an exposure of 1/10 seconds. In the most contrasty bits of the photo you can easily see the noise in the image even without sliding into the detailed view. The lens flare in the detail looks cloudy; the AI has tried to erase it without success. It has performed surprisingly well on the face. I’m really impressed with the technique of computational super-resolution that it applies.

I close with a less extreme example from earlier in the evening. Here the software chose an ISO of 844 and an exposure of 1/25 seconds. Details are less grainy, as you can see when you zoom into the picture. The road signs are quite clear, if a little too dark to read easily, but the darker areas of the photo have clear digital artifacts, some of which you can see in the zoom. And you can see the liquor shop in its prime location at a crossroad blazing with light, open for its business of making the roads less safe to drive on.

Phone photography changes our expectation of the interaction of camera hardware and image so dramatically that it is worth rethinking what photography means. I intend to explore this a bit in this series.

Midweek mobile 8

Puri, an ancient temple town, is the perfect place for street photos. No camera can be more discreet these days than a phone; bystanders can seldom tell whether you are taking a selfie or a photo of the street. Gone are the days when you saw a photographer and walked around them. These days you could land up photo-bombing a selfie. I walked about taking more shots than I could ever use. I had a few destinations in mind, and since these small lanes are a little confusing, I had my maps and location service on. I knew that all this could eat charge like a hungry spider. This time I was going to track exactly how much.

Normally I charge the battery fully, and it gives me a low battery alert when the charge has fallen to 15% of its capacity. On average I have to charge my phone every three days, which works out to about 2.3% of the charge in a typical waking hour. After an hour of walking, I saw that maps and camera had each been on for the whole hour. Maps had eaten 3% of the charge, but the camera had eaten just over 10%. This was just the camera software, since the display is counted separately. This agreed with my previous experience that I would need to charge the phone after a day’s shooting.
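The back-of-envelope is short enough to write out; the eight hours for a day’s shooting is my assumption, the rest are the numbers measured above.

    # Rough battery arithmetic from the measurements above.
    baseline_per_hour = 2.3    # % of charge used in a typical hour of ordinary use
    maps_per_hour = 3.0        # % per hour with maps and location service on
    camera_per_hour = 10.0     # % per hour for the camera software (display counted separately)

    hours_of_shooting = 8      # an assumed day's shooting
    drain = hours_of_shooting * (maps_per_hour + camera_per_hour)
    print(f"{drain:.0f}% of a full charge")    # about 104%: more than one full charge in a day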

To understand why, back up a little. These photos are zoomed in by a factor of about 4 to 8. With a DSLR setup you would not expect to capture the details of the old man’s mustache using a 15 mm diameter lens with a focal length of 26 mm. The sensor on my phone is about an eighth the size of the sensor in most DSLRs, and therefore catches that much less light. The sharpness that you see comes from the number of output pixels in the image. That pixel density is due to intense computation, including two components that I’ve explored before: computational super-resolution and image averaging over a very large number of frames. Running the software that tries to compensate for these hardware limitations is what uses up so much charge. Algorithms will improve, mass market hardware will become better, and processors will run cooler. But until then, the carbon footprint of phone photography will remain several times larger than that of communication.

Phone photography changes our expectation of the interaction of camera hardware and image so dramatically that it is worth rethinking what photography means. I intend to explore this a bit in this series.

Midweek mobile 5

Panoramas have had a cult following since before the days of photography. Special cameras and odd-format film plates were developed for this purpose even in the 19th century CE. I came to the field in the early days of commercial digital photography, when enthusiasts would spend days stitching together individual exposures with the early versions of photo-editors. When I tried my hand at it I rapidly realized two important points. First, as you rotate the camera, the illumination changes, and you have to compensate for the different quality of light in the editor. Second, as you image the same object from slightly different angles, its shape on the film changes slightly, and you come across mismatched edges when you try to stitch the images together. You can understand this as a simple problem in perspective, but it was hard to compensate for it with the photo editor tools that were then available.

Now, a smart camera does all this “in the box” for you. On my drive in the Sahyadris, I stopped near a village and took a panorama of a wooded cliff above rice fields. All I had to do was stand firm, hold the phone in my hands and twist my torso smoothly. The demon in the box took care of the rest: the shading of light across the image, and the automatic adjustment of the distortions of objects due to changing angles. The second step, the one which was hard to do by hand, has a simple mathematical representation (a projective transformation, or homography) which the AI solves rapidly. The result seems to be good enough that you can wrap it around the walls of your home.
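If you want to play with the same pipeline outside a phone, OpenCV packages the whole chain (feature matching, homography estimation, exposure compensation and blending) behind a single call. A minimal sketch in Python, assuming a handful of overlapping frames on disk (the file names are placeholders):

    import cv2

    # Overlapping frames taken while rotating the camera; placeholder file names.
    frames = [cv2.imread(name) for name in ("pano_1.jpg", "pano_2.jpg", "pano_3.jpg")]

    stitcher = cv2.Stitcher_create()    # estimates homographies, compensates exposure, blends
    status, pano = stitcher.stitch(frames)
    if status == 0:                     # 0 is Stitcher::OK
        cv2.imwrite("panorama.jpg", pano)
    else:
        print("stitching failed with status", status)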

But my phone records panoramas only as 40 megapixel images, not the 65 megapixels it uses for other photos. This is due to the physical limitations of the small aperture and sensor which the phone camera uses. I’ve discussed earlier how multiple frames are read out from the CCD and averaged to produce a single image. The same thing is done for panoramas. But since the camera is moving while the image is synthesized, the number of frames available for any single segment is limited. When you examine a regular image and a panorama at the same scale, you can see this clearly. In the image comparison above, both photos use the same number of pixels from the original image. I can zoom in less when I use a panorama. This is the result of using a smaller number of frames for image averaging in the panorama, and of the restriction that the smaller number of frames places on computational super-resolution. So really, you cannot wrap the panorama from a cheap phone camera around the walls of your home. At least not until the sensor and frame readouts, or the processor and algorithms, improve.

Phone photography changes our expectation of the interaction of camera hardware and image so dramatically that it is worth rethinking what photography means. I intend to explore this a bit in this series.

Midweek mobile 4

Does AI have limits? You come to this question very quickly when you begin to think about phone cameras. They have tiny lenses which would once have limited both the amount of light and the resolution of photos. Once upon a time, the limited aperture would have meant long exposure times (and camera shake). It would also have created a resolution problem: you could not get distant details with a limited aperture (that’s diffraction for you). How does a phone camera wriggle out of this problem and produce photos as sharp as these two?

There are two parts to the answer. One is physics: the chemistry of silver halide crystals is replaced by the electronics of CCDs. The pixels can be made smaller, and there are clever constructions for noise reduction. As a result, you get closer to the ideal of “one photon, one pixel”, although not very close, at least on the cheap phone that I use. The other is mathematics: there is a lot of computation between what you see and what the CCD gives. First there is the subtraction of the background noise. Then there is the AI which begins to make statistical inferences about the image. Earlier I’d mentioned computational super-resolution: the improvement of lens resolution by making assumptions about the nature of the lighting. In both the photos above I looked at another technique that the AI uses: image averaging.

When I looked at this scene of the Sahyadris, there was a wall of dragonflies between me and the far cliffs. Each hovered in the air to find prey, then quickly darted to it. The light was not too bad, and on a normal camera, many would be blurred, but some would perhaps be sharp because they would be caught hovering. I wondered what the 65 megapixels of my phone camera would catch. Surprise! It caught nothing, although the EXIF told me that the exposure was 1/1912 seconds. Nothing at all, as you can see in this full size zoomed segment of the photo. I went over the photo segment by segment at full size. Nothing! What happened?

The phone camera took multiple readouts (frames) from the CCD sensor and then added them together to form the final image. This image averaging gives noise reduction: pixels are averaged over frames and random noise is cancelled. But the random darting of the dragonflies also mimicked noise, and was removed. The exposure time written in the EXIF is probably a sum over the exposure times of the frames. The shorter reported exposure perhaps means that a smaller number of frames was averaged over.
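A few lines of numpy show both effects at once. In this toy model (invented numbers, not the phone’s actual pipeline) a static scene with sensor noise and one small mover are averaged over sixteen frames: the noise drops by roughly the square root of the number of frames, while the mover is smeared into near invisibility.

    import numpy as np

    rng = np.random.default_rng(0)
    scene = np.zeros((64, 64))                             # a static dark scene
    frames = []
    for k in range(16):
        frame = scene + rng.normal(0, 10.0, scene.shape)   # sensor noise in every readout
        frame[32, (8 + 3 * k) % 64] += 200.0               # a "dragonfly" darting from pixel to pixel
        frames.append(frame)

    stacked = np.mean(frames, axis=0)
    print("noise std, one frame :", np.std(frames[0]).round(1))
    print("noise std, averaged  :", np.std(stacked).round(1))    # down by roughly sqrt(16) = 4
    print("dragonfly, one frame :", frames[0].max().round(0))
    print("dragonfly, averaged  :", stacked.max().round(0))      # 200 spread over 16 frames, about 12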

Do I have an image that tells me that the camera is doing image averaging? Yes, the image comparator above tells me that. The “original image” (compressed for the purposes of this blog to 640×480 pixels) is shown on the left. The photo was taken from the car as it slowed for a speed bump. The EXIF data tells me that this was shot at f/1.7 with an exposure of 1/2422 seconds. In that time I estimate that the car would have moved by a hair over 1/2 mm. The original looks sharp here, and looked even sharper on my phone. But the full size zoom shows strange artifacts. The lettering on the signboard is very blurred, as it would be if multiple images were added together. But the narrow leftmost pole supporting the roof of the shack is perfectly clear. Similarly, the edges of the sun umbrella are clean. This is clear evidence that the AI has selectively added parts of images. More than simple image averaging, there is clearly adaptive multiframe image averaging at work.
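The phone’s burst pipeline is proprietary, so the sketch below is only my guess at the idea, not its implementation: compare each frame to a reference, and average a pixel only over the frames in which it has not changed.

    import numpy as np

    def selective_average(frames, threshold=30.0):
        """Guess at adaptive multiframe averaging: average each pixel over only
        those frames in which it agrees with the reference frame. Where the scene
        has moved (lettering seen from a moving car), the pixel falls back to the
        reference instead of being smeared."""
        stack = np.stack([f.astype(float) for f in frames])
        reference = stack[0]
        agree = np.abs(stack - reference) < threshold      # per-pixel, per-frame mask
        summed = np.where(agree, stack, 0.0).sum(axis=0)
        counts = agree.sum(axis=0)                         # at least 1: the reference agrees with itself
        return summed / np.maximum(counts, 1)

Real pipelines presumably align the frames first and decide patch by patch rather than pixel by pixel, but the principle, averaging only where the frames agree, fits what the signboard and the umbrella show.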

A 1450×1088 pixel section from each of two photos, reduced to 640×480 pixels, is shown here for comparison: the left from a phone camera, the right from a macro lens.

Now let’s get back to the photo of moss on a brick wall to see how much detail I could get from it. It was taken in full sunlight. At f/3.2 my macro lens required an exposure of 1/200 of a second to capture the moss in the comparison photo on the right. The phone camera lens has about 1/25 of the area, so if I’d masked my macro lens to keep only a phone camera sized area free, the exposure would have climbed to 1/8 of a second. But the phone camera reported f/1.7 (the lens is fixed), with an exposure of 1/264 seconds. Yet when I looked at the phone camera output at full size, I saw the blur on the left! Why?
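(The scaling in that estimate is just the ratio of light-gathering areas; two lines of Python reproduce it, using the 1/25 area ratio quoted above.)

    macro_exposure = 1 / 200      # seconds, at f/3.2 on the macro lens
    area_ratio = 1 / 25           # phone lens area relative to the macro lens
    print(macro_exposure / area_ratio)    # 0.125 s, i.e. 1/8 of a second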

First, keep in mind that the exposure time of the photo of moss implies averaging about 7 times as many frames as that of the cliff. You might expect so much averaging to remove blur. But I suspect that the blur in this second photo is due to image averaging interacting with computational super-resolution: the improved lens resolution that AI gives. Since the small details in the zoomed view are almost entirely due to computation, little changes in the illumination can change the inferred image. Averaging over the results can then give the blurred details that you see. In the second zoom into the same photo you can see that the deep shadows look noisier and equally blurred. This is also what you might expect from averaging over super-resolved frames: CCD noise is removed, but inferred details are blurred by averaging over inferences.
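This is only a guess at the mechanism, but a toy model shows how it could work: take a sharp, one-pixel-wide inferred detail, let its position jitter by a pixel from estimate to estimate, add sensor noise, and average. The noise is beaten down, the detail is smeared.

    import numpy as np

    rng = np.random.default_rng(0)
    detail = np.zeros((32, 32))
    detail[:, 16] = 1.0                       # a sharp, one-pixel-wide inferred detail

    estimates = []
    for k in range(8):
        jitter = rng.integers(-1, 2)          # the inference lands a pixel off each time
        noisy = np.roll(detail, jitter, axis=1) + rng.normal(0, 0.2, detail.shape)
        estimates.append(noisy)
    merged = np.mean(estimates, axis=0)

    print("noise std, one estimate  :", np.std(estimates[0][:, :8]).round(3))
    print("noise std, merged        :", np.std(merged[:, :8]).round(3))      # noise is averaged away
    print("detail peak, one estimate:", estimates[0][:, 15:18].max().round(2))
    print("detail peak, merged      :", merged[:, 15:18].max().round(2))     # the detail is smeared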

Phone photography changes our expectation of the interaction of camera hardware and image so dramatically that it is worth rethinking what photography means. I intend to explore this a bit in this series.

Midweek mobile 2

A mobile camera is not a good camera in the ways that photographers were used to thinking of. The lens is a toy. Four centuries’ worth of lens technology have been junked by two related developments. The most important? That about 95% of the world looks at photos on tiny screens when distributing likes. So you don’t need the sharpness that photographers of old wanted; sell megapixels instead. That translates to about 10 Mbytes for the featured photo when my camera saves it. I know from experience that even on my large screen I can easily compress it down to about 200 kbytes and most people would not be able to tell the difference. That means I can retain only 2% of what is recorded. And on my phone I could easily throw away another 90% of the information (retain just 0.2% of the original) and no one would be able to tell. Then why so many megapixels? Because when you start from a large format photo and compress it down to a small screen, everything looks sharp.
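You can check the numbers on your own photos with a few lines of Pillow; the file name below is a placeholder, and the exact sizes will depend on the photo and on the JPEG quality you pick.

    import os
    from PIL import Image

    src = "featured.jpg"        # placeholder: a full-resolution phone photo, landscape orientation
    img = Image.open(src)

    small = img.resize((1250, img.height * 1250 // img.width))    # long side down to 1250 pixels
    small.save("featured_small.jpg", quality=75, optimize=True)

    print(os.path.getsize(src) // 1024, "kbytes original")
    print(os.path.getsize("featured_small.jpg") // 1024, "kbytes after resizing and recompression")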

You might remember that when you last changed your phone the picture quality changed a lot. Is that all due to more pixels? In large part, yes. I dropped my old phone too often and was forced to change it sooner than I normally do. In three years the number of pixels in a photo from a less-than-mid-range phone had gone up from around 10 million to about 65 million. Now look at the featured photo. The architectural details look sharp, considering that the subject is more than 300 meters away, and that the photo was taken from a car making a sharp turn at a reasonable speed. But look at the near-full-size blow-up in the photo above. You can see that at this zoom the details are pretty blurred. I have gained the clarity of the featured photo purely by not looking at it at full scale.

But that’s not the only change when you get a new phone. You also get a different AI translating the sensor output into an image. And this technology, which makes a guess at what is being seen, is improving rapidly. As a result, the distortions of a bad lens can still be interpreted better, and result in a reasonable image. Note that this phone can remove digital noise much better than a five-year-old phone would have done. The darker areas of the photo are much cleaner (the detailed view above has been cropped out of the featured photo). Also notice that the new generation of AI deals with non-white faces better than before, getting an impressive image of the man walking towards the camera. This improvement is a response to accusations of biased training of AI.

But another detail is technically very impressive. Notice the level of detail? I can see very clearly that he is not wearing a mask. This resolution is better than a fundamental limit imposed on lenses by the wave nature of light (something called the Rayleigh resolution limit). This computational super-resolution is a statistical trick which improves the image by making a guess about the nature of the ambient light. The downside of all this AI? This much computation has a carbon cost. When I use my phone only for communication, the battery lasts three and a half days. Street photography can drain the charge in a few hours.
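The limit is easy to put a number on. The diameter of the diffraction (Airy) spot a lens can form is about 2.44λN, where λ is the wavelength of light and N the f-number; the pixel pitch below is my assumed typical value for a high-megapixel phone sensor, not a specification of my phone.

    wavelength = 550e-9     # meters, green light
    f_number = 1.7          # the f-number the phone reports
    pixel_pitch = 0.8e-6    # meters; assumed typical pitch for a high-megapixel phone sensor

    airy_diameter = 2.44 * wavelength * f_number
    print(round(airy_diameter * 1e6, 2), "micrometers")            # about 2.3 micrometers
    print(round(airy_diameter / pixel_pitch, 1), "pixels across")  # the blur spot covers about 3 pixels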

Phone photography changes our expectation of the interaction of camera hardware and image so dramatically that it is worth rethinking what photography means. I intend to explore this a bit in this series.