New technology called RedEye could provide users with continuous vision. It’s a first step toward allowing devices to see what their owners see and keep track of what they need to remember.
“The concept is to allow our computers to assist us by showing them what we see throughout the day,” says group leader Lin Zhong, professor of electrical and computer engineering at Rice University, who describes the technology in a recent paper. “It would be like having a personal assistant who can remember someone you met, where you met them, what they told you, and other specific information like prices, dates, and times.”
RedEye is an example of the kind of technology the computing industry is developing for use with wearable, hands-free, always-on devices that are designed to support people in their daily lives, Zhong says. The trend, which is sometimes referred to as “pervasive computing” or “ambient intelligence,” centers on technology that can recognize and even anticipate what someone needs and provide it right away.
See what we see, hear what we hear
“The pervasive-computing movement foresees devices that are personal assistants, which help us in big and small ways at almost every moment of our lives,” he says. “But a key enabler of this technology is equipping our devices to see what we see and hear what we hear. Smell, taste, and touch may come later, but vision and sound will be the initial sensory inputs.”
The bottleneck for continuous vision is energy consumption because today’s best smartphone cameras, though relatively inexpensive, are battery killers, especially when they are processing real-time video.
Researchers began studying the problem in the summer of 2012 when they worked at Microsoft Research’s Mobility and Networking Research Group in Redmond, Wash. They measured the energy profiles of commercially available, off-the-shelf image sensors and determined that existing technology would need to be about 100 times more energy efficient for continuous vision to become commercially viable. That finding motivated team member Robert LiKamWa’s doctoral thesis, which pursues software and hardware support for efficient computer vision.
In a paper a year later, LiKamWa and colleagues showed they could improve the power consumption of off-the-shelf image sensors tenfold simply through software optimization.
“RedEye grew from that because we still needed another tenfold improvement in energy efficiency, and we knew we would need to redesign both the hardware and software to achieve that,” LiKamWa says. The energy bottleneck was the conversion of images from analog to digital format.
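The arithmetic behind that remaining gap can be written out in a few lines. This is only an illustrative sketch: the 100-fold and tenfold figures come from the article, while the 300-milliwatt baseline is a made-up number used to make the result concrete.

```python
# Figures quoted in the article: about 100x needed overall, about 10x from software.
required_total_gain = 100.0
software_gain = 10.0

# Efficiency gains multiply, so the gap left for hardware is a quotient.
remaining_gain = required_total_gain / software_gain

# Hypothetical baseline sensor power in milliwatts (an assumption, not a measurement).
baseline_mw = 300.0
after_software_mw = baseline_mw / software_gain
target_mw = baseline_mw / required_total_gain

print(f"Remaining gain needed: {remaining_gain:.0f}x")
print(f"Power after software optimization: {after_software_mw:.0f} mW")
print(f"Power target for continuous vision: {target_mw:.0f} mW")
```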
“Real-world signals are analog, and converting them to digital signals is expensive in terms of energy,” he says. “There’s a physical limit to how much energy savings you can achieve for that conversion. We decided a better option might be to analyze the signals while they were still analog.”
The main drawback of processing analog signals—and the reason digital conversion is the standard first step for most image-processing systems today—is that analog signals are inherently noisy. To make RedEye attractive to device makers, the team needed to demonstrate that it could reliably interpret analog signals.
“We needed to show that we could tell a cat from a dog, for instance, or a table from a chair,” LiKamWa says.
The researchers decided to attack the problem using a combination of the latest techniques from machine learning, system architecture, and circuit design. In the case of machine learning, RedEye uses a technique called a “convolutional neural network,” an algorithmic structure inspired by the organization of the animal visual cortex.
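For readers unfamiliar with the term, the core operations of a convolutional layer can be sketched in plain Python. This is a toy illustration of the general technique, not RedEye's implementation: the image, the hand-picked edge kernel, and the function names are all assumptions made for the example, and a real network learns many kernels from training data.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Keep positive responses and zero out the rest."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Keep the strongest response in each non-overlapping size x size block."""
    return [
        [max(fmap[r + i][c + j] for i in range(size) for j in range(size))
         for c in range(0, len(fmap[0]) - size + 1, size)]
        for r in range(0, len(fmap) - size + 1, size)
    ]


# Toy 6x6 image: dark on the left, bright in the two rightmost columns.
image = [[0.0, 0.0, 0.0, 0.0, 1.0, 1.0] for _ in range(6)]
# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1.0, 0.0, 1.0]] * 3

features = max_pool(relu(conv2d(image, kernel)))
print(features)  # the 6x6 image is summarized as a 2x2 feature map
```

The pooled output is much smaller than the input yet still records where the edge is, which is the property that makes early, compact feature extraction attractive.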
Graduate student Yunhui Hou brought new ideas related to system architecture and circuit design, drawing on previous experience working with specialized circuits called analog-to-digital converters (ADCs) at Hong Kong University of Science and Technology.
“We bounced ideas off one another regarding architecture and circuit design, and we began to understand the possibilities for doing early processing in order to gather key information in the analog domain,” LiKamWa says.
“Conventional systems extract an entire image through the analog-to-digital converter and conduct image processing on the digital file. If you can shift that processing into the analog domain, then you will have a much smaller data bandwidth that you need to ship through that ADC bottleneck.”
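A rough way to see the bandwidth argument is to count how many values must pass through the ADC in each approach. The sketch below assumes a hypothetical 224-by-224 color sensor and two made-up convolution-and-pooling stages performed before conversion; none of these shapes are taken from the RedEye design.

```python
def stage_output_shape(height, width, kernel, stride, pool, channels_out):
    """Output shape of one convolution (no padding) followed by pooling."""
    conv_h = (height - kernel) // stride + 1
    conv_w = (width - kernel) // stride + 1
    return conv_h // pool, conv_w // pool, channels_out


# Hypothetical sensor resolution and color channels (illustrative only).
height, width, channels = 224, 224, 3

# Conventional readout: every pixel value is digitized.
full_frame_conversions = height * width * channels

# Analog-domain processing: only the final feature values are digitized.
# Each tuple is (kernel size, stride, pooling size, output channels).
h, w, c = height, width, channels
for kernel, stride, pool, out_ch in [(7, 2, 2, 16), (3, 1, 2, 32)]:
    h, w, c = stage_output_shape(h, w, kernel, stride, pool, out_ch)

feature_conversions = h * w * c
reduction = full_frame_conversions / feature_conversions

print(f"Full frame: {full_frame_conversions} conversions")
print(f"Features:   {feature_conversions} conversions")
print(f"Reduction:  {reduction:.1f}x fewer values through the ADC")
```

With these assumed shapes the ADC handles roughly one-seventh as many values. The actual savings depend on the network and sensor design, so the ratio here should be read as an illustration of the idea rather than a reported result.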
Convolutional neural networks are the state-of-the-art way to perform object recognition, and the combination of these techniques with analog-domain processing presents some unique privacy advantages for RedEye.
“The upshot is that we can recognize objects—like cats, dogs, keys, phones, computers, faces, etc.—without actually looking at the image itself,” LiKamWa says.
“We’re just looking at the analog output from the vision sensor. We have an understanding of what’s there without having an actual image. This increases energy efficiency because we can choose to digitize only the images that are worth expending energy to create. It also may help with privacy implications because we can define a set of rules where the system will automatically discard the raw image after it has finished processing.
“That image would never be recoverable. So, if there are times, places, or specific objects a user doesn’t want to record—and doesn’t want the system to remember—we should design mechanisms to ensure that photos of those things are never created in the first place.”
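One way such rules might look in software is a gate that decides, from the recognition result alone, whether a frame is ever digitized. The sketch below is hypothetical: `CapturePolicy`, `process_frame`, the label names, and the confidence threshold are invented for illustration and do not describe RedEye's actual mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class CapturePolicy:
    """Hypothetical user rules about which recognized objects may be recorded."""
    blocked_labels: set = field(default_factory=set)
    min_confidence: float = 0.8

    def should_digitize(self, label, confidence):
        # Blocked objects are never digitized, however confident the match.
        if label in self.blocked_labels:
            return False
        # Uncertain detections are not worth the energy of a full conversion.
        return confidence >= self.min_confidence


def process_frame(label, confidence, policy, digitize):
    """Call digitize() only when the policy allows it; otherwise no image exists."""
    if policy.should_digitize(label, confidence):
        return digitize()
    return None


policy = CapturePolicy(blocked_labels={"face"})
kept = process_frame("keys", 0.93, policy, digitize=lambda: "digitized frame")
dropped = process_frame("face", 0.99, policy, digitize=lambda: "digitized frame")
print(kept, dropped)
```

Because the check runs before conversion, a blocked frame is never turned into a digital image, which is the property the quote describes.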
The National Science Foundation and a Texas Instruments Graduate Research Fellowship to LiKamWa supported the work, which the team presented at the International Symposium on Computer Architecture in Seoul, South Korea.
Source: Rice University