In science and technology, there has been a long and steady push toward improving the accuracy of measurements of all kinds, along with parallel efforts to enhance the accuracy of images. The accompanying goal is to reduce uncertainty in the estimates that can be made and the conclusions drawn from the data (visual or otherwise) collected. However, uncertainty cannot be completely eliminated. And since we must live with it, at least to some extent, there is much to be gained by measuring uncertainty as precisely as possible.
Put another way, we would like to know just how uncertain we really are.
This issue is addressed in a new study led by Swami Sankaranarayanan, a postdoctoral researcher at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), and his co-authors: Anastasios Angelopoulos and Stephen Bates of the University of California at Berkeley; Yaniv Romano of the Technion, Israel Institute of Technology; and Phillip Isola, an associate professor of electrical engineering and computer science at MIT. These researchers have not only succeeded in obtaining accurate measures of uncertainty, they have also found a way to display that uncertainty in terms the average person could grasp.
Their paper, presented in December at the Neural Information Processing Systems conference in New Orleans, concerns computer vision, a field of artificial intelligence that involves training computers to glean information from digital images. The focus of this work is on images that are partially smudged or corrupted (owing to missing pixels, for instance), as well as on methods, computer algorithms in particular, designed to uncover the part of the signal that is distorted or otherwise obscured. An algorithm of this type, Sankaranarayanan explains, “takes the blurry image as input and gives you a clean image as output,” a process that typically happens in two steps.
First, there is an encoder, a kind of neural network specifically trained by the researchers for the task of de-blurring fuzzy images. The encoder takes a distorted image and, from that, creates an abstract (or “latent”) representation of a clean image in a form, consisting of a list of numbers, that is comprehensible to a computer but would not make sense to most humans. The next step relies on a decoder, of which there are a couple of types and which is again usually a neural network. Sankaranarayanan and his colleagues worked with a kind of decoder called a “generative” model. In particular, they used an off-the-shelf version called StyleGAN, which takes the numbers of the encoded representation (of a cat, for instance) as its input and then constructs a complete, cleaned-up image (of that particular cat). So the entire process, including the encoding and decoding stages, yields a crisp picture from an originally muddy rendering.
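To make that pipeline concrete, here is a minimal sketch of the encode-then-decode data flow in Python. The ToyEncoder and ToyDecoder classes are hypothetical stand-ins invented for this illustration, not the trained encoder or the StyleGAN decoder the researchers actually used; only the shape of the data flow is meant to match the description above.

```python
# Minimal sketch of the encode-then-decode restoration pipeline described above.
# Both networks are toy stand-ins so the flow is runnable end to end; the real
# system uses a trained encoder and a pretrained generative decoder (StyleGAN).
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Maps a degraded image to a latent vector (the 'list of numbers')."""
    def __init__(self, latent_dim=512):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, latent_dim))

    def forward(self, x):
        return self.net(x)

class ToyDecoder(nn.Module):
    """Stand-in for a generative decoder: latent vector -> full image."""
    def __init__(self, latent_dim=512):
        super().__init__()
        self.net = nn.Linear(latent_dim, 3 * 64 * 64)

    def forward(self, z):
        return self.net(z).view(-1, 3, 64, 64)

encoder, decoder = ToyEncoder(), ToyDecoder()
blurry = torch.rand(1, 3, 64, 64)      # a degraded input image (random placeholder)
latent = encoder(blurry)                # abstract latent representation
restored = decoder(latent)              # reconstructed clean image
print(latent.shape, restored.shape)     # torch.Size([1, 512]) torch.Size([1, 3, 64, 64])
```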
But how much faith can someone place in the accuracy of the resulting image? And, as addressed in the December 2022 paper, what is the best way to represent the uncertainty in that picture? The standard approach is to create a “saliency map,” which assigns a probability value, somewhere between 0 and 1, to indicate the confidence the model has in the correctness of each pixel, taken one at a time. This strategy has a drawback, according to Sankaranarayanan, “because the prediction is performed independently for each pixel. But meaningful things happen within groups of pixels, not within an individual pixel,” he adds, which is why he and his colleagues propose an entirely different way of assessing uncertainty.
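As a rough illustration of the per-pixel approach being critiqued, the snippet below builds a confidence map in which every pixel receives its own score between 0 and 1. The random “reconstructions” and the variance-based scoring are assumptions made purely for this example; they are not the specific construction used in the paper or in any particular saliency-map method.

```python
# Illustrative per-pixel confidence map: each pixel gets an independent score
# in [0, 1], derived here from the spread across several candidate restorations.
import numpy as np

rng = np.random.default_rng(0)
reconstructions = rng.random((10, 64, 64))       # 10 candidate restorations (stand-ins)

per_pixel_std = reconstructions.std(axis=0)       # spread at each pixel, independently
confidence_map = 1.0 - per_pixel_std / per_pixel_std.max()   # rescale to [0, 1]

print(confidence_map.shape)                       # (64, 64): one score per pixel
print(float(confidence_map.min()), float(confidence_map.max()))
```

Because each score is computed pixel by pixel, the map says nothing about whether an entire region, such as a face, can be trusted as a whole, which is the shortcoming Sankaranarayanan describes.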
Their approach centers around the “semantic features” of an image—groups of pixels that, when brought together, have meaning, forming a human face, say, or a dog, or any other recognizable object. The goal, Sankaranarayanan asserts, “is to estimate uncertainty in a way that relates to pixel clusters that humans can easily interpret.”
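One way to picture the shift from individual pixels to semantic features, reusing the same kind of made-up data as above, is to summarize uncertainty over a whole region at once. The rectangular “face mask” below is a hypothetical stand-in for a semantically meaningful pixel cluster; the paper’s actual method works with features of a generative model rather than a hand-drawn mask.

```python
# Tiny illustration of region-level rather than per-pixel uncertainty: the spread
# is summarized over one semantically meaningful pixel cluster (a mock face region).
import numpy as np

rng = np.random.default_rng(2)
reconstructions = rng.random((10, 64, 64))        # candidate restorations (stand-ins)

face_mask = np.zeros((64, 64), dtype=bool)        # hypothetical semantic region
face_mask[16:48, 16:48] = True

region_values = reconstructions[:, face_mask].mean(axis=1)   # one summary per sample
region_uncertainty = region_values.std()                     # spread over the region
print(f"uncertainty for the face region: {region_uncertainty:.4f}")
```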
While the standard method may yield a single image, offering a “best guess” as to what the true image should be, the uncertainty in that representation is usually hard to discern. The new paper argues that for use in the real world, uncertainty should be presented in a way that holds meaning for people who are not experts in machine learning. Rather than producing a single image, the authors devised a procedure for generating a range of images, each of which might be correct. Moreover, they can set precise bounds on that range, or interval, and provide a probabilistic guarantee that the true image lies somewhere within it. A narrower range can be provided if the user is comfortable with, say, 90 percent certainty, and a narrower range still if more risk is acceptable.
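The snippet below sketches how a user-chosen confidence level can translate into bounds of a particular width, in the spirit of split conformal prediction. It operates on raw pixel values with simulated calibration data, so it is only a schematic of the idea of a probabilistic guarantee, not the authors’ semantic-feature procedure.

```python
# Schematic of calibrating an image interval to a chosen coverage level,
# in the spirit of split conformal prediction (raw pixels, simulated data).
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.10                                     # target: 90 percent coverage

# Hypothetical calibration set: model predictions and the corresponding clean images.
preds_cal = rng.random((200, 64, 64))
truth_cal = preds_cal + rng.normal(scale=0.05, size=preds_cal.shape)

# Per-example score: worst-case pixel error (other score choices are possible).
scores = np.abs(truth_cal - preds_cal).max(axis=(1, 2))
q = np.quantile(scores, np.ceil((len(scores) + 1) * (1 - alpha)) / len(scores))

# For a new prediction, [lower, upper] contains the true image with probability
# at least 1 - alpha, under the usual exchangeability assumption.
pred_new = rng.random((64, 64))
lower, upper = pred_new - q, pred_new + q
print(f"interval half-width at 90% coverage: {q:.3f}")
```

Accepting more risk (a larger alpha) lowers the calibrated quantile q and therefore narrows the interval, which is the trade-off described above.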
The authors believe their paper presents the first algorithm, designed for a generative model, that can establish uncertainty intervals relating to meaningful (semantically interpretable) features of an image and that comes with “a formal statistical guarantee.” While that marks an important milestone, Sankaranarayanan considers it merely a step toward “the ultimate goal. So far, we have been able to do this for simple things, like restoring images of human faces or of animals, but we want to extend this approach into more critical domains, such as medical imaging, where our ‘statistical guarantee’ could be especially important.”
Let’s say the film, or radiograph, of a chest X-ray is not clear, he adds, “and you want to reconstruct the image. If you’re given a range of images, you want to know that the true image is contained within that range, so you are not missing anything critical,” information that might reveal whether or not a patient has lung cancer or pneumonia. In fact, Sankaranarayanan and his colleagues have already begun working with radiologists to see whether their algorithm for predicting pneumonia could be useful in a clinical setting.
He says their work may also be relevant to law enforcement. “The image from the surveillance camera might be blurry, and you want to improve on that. There are already models for doing that, but it’s not easy to measure the uncertainty. And you don’t want to make a mistake in a life-or-death situation.” The tools he and his colleagues are developing could help identify a guilty person as well as help exonerate an innocent one.
Sankaranarayanan notes that much of what we do, and much of what happens in the world around us, is shrouded in uncertainty. Gaining a firmer grasp of that uncertainty could therefore help us in countless ways. For one thing, it can tell us more about exactly what it is we do not know.
Angelopoulos was supported by the National Science Foundation. Bates was supported by the Foundations of Data Science Institute and the Simons Institute. Romano was supported by the Israel Science Foundation and by a Career Advancement Fellowship from the Technion. Sankaranarayanan and Isola’s research for this project was sponsored by the U.S. Air Force Research Laboratory and the U.S. Air Force Artificial Intelligence Accelerator and was accomplished under Cooperative Agreement No. FA8750-19-2-1000. The MIT SuperCloud and the Lincoln Laboratory Supercomputing Center also provided resources that contributed to the results reported in this work.