The deep neural network models that power today’s most demanding machine learning applications are becoming extremely large and complex, pushing the limits of traditional electronic computing hardware.
Photonic hardware, which can use light to perform machine learning calculations, offers a faster and more energy-efficient alternative. However, some types of neural network computations cannot be performed on photonic devices and require the use of off-chip electronics or other techniques that impede speed and efficiency.
Building on a decade of research, scientists at MIT and elsewhere have developed new photonic chips that overcome these obstacles. They demonstrated a fully integrated photonic processor that can perform all the key computations of a deep neural network optically on a chip.
The optical device completed the key calculations for a machine learning classification task in less than 0.5 nanoseconds while achieving more than 92 percent accuracy, performance on par with traditional hardware.
The chip is made up of interconnected modules that form an optical neural network and is manufactured using a commercial foundry process, which could potentially allow the technology to be scaled up and integrated into electronics.
In the long term, photonic processors have the potential to enable faster, more energy-efficient deep learning for computationally demanding applications such as lidar, scientific research in astronomy and particle physics, and high-speed communications.
“It is often not just how well a model performs, but how quickly you can get an answer. Now that we have an end-to-end system, we can start thinking about applications and algorithms at a higher level,” says Saumil Bandyopadhyay ’17, MEng ’18, PhD ’23, a visiting scientist in the Quantum Photonics and AI Group within the Research Laboratory of Electronics (RLE), a postdoctoral researcher at NTT Research, Inc., and first author of a paper on the new chip.
Bandyopadhyay is joined on the paper by Alexander Sludds ’18, MEng ’19, PhD ’23; Nicholas Harris PhD ’17; Darius Bunandar PhD ’19; Stefan Krastanov, a former RLE research scientist who is now an assistant professor at the University of Massachusetts Amherst; Ryan Hamerly, a visiting scientist at RLE and senior scientist at NTT Research; Matthew Streshinsky, a former silicon photonics lead at Nokia who is now co-founder and CEO of Enosemi; Michael Hochberg, president of Periplous, LLC; and Dirk Englund, a professor in the Department of Electrical Engineering and Computer Science, principal investigator of the Quantum Photonics and Artificial Intelligence Group and of RLE, and senior author of the paper. The research is published today in Nature Photonics.
Machine learning with light
Deep neural networks consist of many interconnected layers of nodes, or neurons, that manipulate input data to produce an output. A key operation in a deep neural network is matrix multiplication, a linear-algebra operation that transforms data as it is passed from layer to layer.
However, in addition to these linear operations, deep neural networks perform nonlinear operations that help the model learn more complex patterns. Nonlinear operations like activation functions give deep neural networks the power to solve complex problems.
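These two steps can be seen in a generic neural-network layer: a matrix multiplication followed by a nonlinear activation. The minimal NumPy sketch below is purely illustrative (the weights and the ReLU activation are made-up examples, not taken from the paper):

```python
import numpy as np

def layer(x, W, b):
    """One deep-network layer: a linear transform (matrix
    multiplication) followed by a nonlinear activation."""
    z = W @ x + b              # linear operation: mixes the inputs
    return np.maximum(z, 0.0)  # nonlinearity (ReLU) enables learning complex patterns

# Tiny example with hypothetical weights: 3 inputs feeding 2 neurons.
W = np.array([[0.5, -1.0, 0.25],
              [1.0,  0.5, -0.5]])
b = np.array([0.1, -0.2])
x = np.array([1.0, 2.0, 3.0])
print(layer(x, W, b))  # negative pre-activations are clamped to zero
```

Without the nonlinear step, stacking many layers would collapse into a single matrix multiplication, which is why nonlinearity is essential for learning complex patterns.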
In 2017, Englund’s group, along with researchers in the lab of Marin Soljačić, the Cecil and Ida Green Professor of Physics, demonstrated an optical neural network on a single photonic chip that could perform matrix multiplication with light.
At the time, however, the device could not perform nonlinear operations on the chip; the optical data had to be converted into electrical signals and sent to a digital processor.
“Nonlinearity in optics is quite challenging because photons don’t interact with each other very easily. That makes it very power-consuming to trigger optical nonlinearities, so it becomes challenging to build a system that can do it in a scalable way,” Bandyopadhyay explains.
They overcame this challenge by designing a device called a nonlinear optical function unit (NOFU), which combines electronics and optics to implement nonlinear operations on the chip.
The researchers built an optical deep neural network on a photonic chip using three layers of devices that perform linear and nonlinear operations.
Fully integrated network
First, their system encodes the parameters of a deep neural network into light. A series of programmable beamsplitters, demonstrated in a 2017 paper, then performs matrix multiplication on these inputs.
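The reason beamsplitter meshes can carry out matrix multiplication is that any weight matrix factors into unitary matrices, which lossless interferometer meshes can implement directly, joined by a diagonal of per-channel attenuations. A hedged NumPy sketch of that factorization (the 4x4 matrix is a random stand-in; real device programming and calibration are far more involved):

```python
import numpy as np

# Hypothetical 4x4 weight matrix standing in for a trained layer.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))

# Singular value decomposition: W = U @ diag(s) @ Vh.
# U and Vh are unitary, so each can in principle be realized as a mesh
# of programmable 2x2 beamsplitters (Mach-Zehnder interferometers);
# the singular values s become per-channel attenuations or gains.
U, s, Vh = np.linalg.svd(W)

# The factorization reproduces the original weight matrix exactly.
W_rebuilt = U @ np.diag(s) @ Vh
print(np.allclose(W, W_rebuilt))  # True
```

The point of the sketch is only that an arbitrary linear layer reduces to components (unitaries plus attenuations) that photonic hardware can natively express.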
The data is then passed to a programmable NOFU, which implements the nonlinear function by siphoning a small amount of light into a photodiode and converting the optical signal into an electrical current. This process eliminates the need for external amplifiers and consumes very little energy.
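As the article describes it, the NOFU taps off a small fraction of the light, converts it to a photocurrent, and uses that current to modulate the light that remains, yielding a nonlinear input-output response. The toy numerical model below illustrates only that principle; the tap ratio and the modulator response curve are invented for illustration and are not the device’s measured behavior:

```python
import numpy as np

def nofu(power_in, tap=0.1):
    """Toy model of a nonlinear optical function unit (NOFU).

    A small fraction `tap` of the incoming optical power is diverted
    to a photodiode; the resulting photocurrent drives a modulator
    that attenuates the remaining (1 - tap) of the light. All values
    here are illustrative assumptions, not the device's response.
    """
    photocurrent = tap * power_in              # light siphoned to the photodiode
    transmission = 1.0 / (1.0 + photocurrent)  # assumed current-driven modulator
    return (1.0 - tap) * power_in * transmission

powers = np.linspace(0.0, 10.0, 5)
print(nofu(powers))  # response saturates at high power: a nonlinear transfer function
```

Because the photocurrent is generated directly from the tapped light, no separate amplifier stage is needed in this picture, which is consistent with the low energy consumption the article describes.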
“We stay in the optical domain all the way to the end when we want to read out the answer. This allows us to achieve ultra-low latency,” says Bandyopadhyay.
Achieving such low latency also made it possible to efficiently train the deep neural network on the chip, a process known as in situ training that typically consumes enormous amounts of energy on digital hardware.
“This is particularly useful for systems that do in-domain processing of optical signals, such as navigation and telecommunications, but also for systems that want to learn in real time,” he says.
The photonic system achieved more than 96 percent accuracy during training tests and more than 92 percent accuracy during inference, comparable to traditional hardware, while performing its key computations in less than 0.5 nanoseconds.
“This work demonstrates that computing (at its heart, the mapping of inputs to outputs) can be compiled onto new architectures of linear and nonlinear physics that enable fundamentally different scaling laws between computation and the effort required,” says Englund.
The entire circuit was fabricated using the same infrastructure and foundry processes that produce CMOS computer chips, which could enable manufacturing at scale with proven techniques that introduce very little error.
Bandyopadhyay says scaling up the device and integrating it with real-world electronics such as cameras and communication systems will be a major focus of future research. Additionally, the researchers hope to explore algorithms that can take advantage of optics to train the system faster and with better energy efficiency.
This research was funded in part by the U.S. National Science Foundation, the U.S. Air Force Office of Scientific Research, and NTT Research.