The future of deep learning is photonic (ieee.org)
103 points by pcaversaccio on Aug 1, 2021 | 26 comments



Lightmatter is apparently shipping their photonic accelerator to early adopters right now https://lightmatter.co/products/envise/

They claim quite big performance uplift over current GPUs and even better performance per watt figures.

However I haven’t seen independent benchmarks for their Envise ASIC yet.

They seem to be using a different method, with an external laser that "powers" the chips over fiber-optic cables rather than integrating solid-state lasers into the silicon. It's also not based on LCD technology like early attempts at photonic computing, which were essentially a sandwich of LCD screens, solid-state detectors, and amplifiers.


Yep, and the CEO claimed 10x faster than current tech while using 10% of the electrical power, which works out to roughly 100x better performance per watt. If it's really 100x more efficient, that is huge. But it's the CEO saying it, so it's hard to say how true it is.

https://youtu.be/t1R7ElXEyag


It's an interesting tech. However, the big issue with photonics is that building complex logic and memory is hard to near impossible; it's quite good at doing relatively basic operations at scale. That's why the tech can work with components as simple as a stack of LCD screens that hold your input values as a pixel mask and a detector at the end. Since the LCD matrix can be quite big, say 1024x1024 pixels, you are able to do basic bitwise ops like XOR on huge matrices. The challenge was always to A) get photonics fast enough that its scale can outpace the switching frequency of traditional semiconductors and B) get to a point where you can actually do useful operations.
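To make that concrete, here's a toy NumPy model of the mask-and-detector idea (the 1024x1024 size is from above; the stacked-mask AND is just my illustrative assumption, not any specific product):

    import numpy as np

    # Two 1024x1024 binary pixel masks stacked in the light path with a
    # detector behind them: light only survives where BOTH pixels are
    # transparent, so one "shot" is an elementwise AND over the whole array.
    # Other bitwise ops (XOR etc.) would need extra passes with inverted masks.
    N = 1024
    mask_a = np.random.randint(0, 2, size=(N, N))   # input pattern A
    mask_b = np.random.randint(0, 2, size=(N, N))   # input pattern B

    transmitted = mask_a & mask_b        # per-pixel light that makes it through
    detector_count = transmitted.sum()   # an integrating detector sees the total

    print(f"popcount(A AND B) = {detector_count} of {N*N} pixels lit")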

My gut feeling is that Envise can actually do only a few basic operations in photonics, however if these operations are sufficiently tailored for specific tasks in ML that might be good enough.

It's not that different from what NVIDIA did with their Tensor cores: they aren't general-purpose ALUs, they only do a few things, but they are very fast at those tasks.


The CEO of Lightmatter says their chip only does a matrix vector multiply, which he says is a core operation in deep learning. He also says photonics is not good for normal logic operations.
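For scale (my own toy numbers, nothing from Lightmatter): a fully connected layer is dominated by exactly that multiply, and the leftover work is cheap enough to stay in ordinary electronics.

    import numpy as np

    # y = act(W @ x + b): the W @ x step is ~n^2 multiply-adds, everything
    # after it is ~n cheap scalar ops, so an accelerator that only does the
    # matrix-vector product still covers almost all of the arithmetic.
    def dense_layer(W, x, b):
        z = W @ x                        # the part a photonic MAC array would handle
        return np.maximum(z + b, 0.0)    # bias + ReLU stay on the host CPU

    n = 4096
    W = np.random.randn(n, n).astype(np.float32)
    x = np.random.randn(n).astype(np.float32)
    b = np.zeros(n, dtype=np.float32)
    y = dense_layer(W, x, b)
    print(f"{n*n:,} MACs in the matmul vs ~{2*n:,} scalar ops after it")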

But that’s fine, because I am mostly concerned with accelerating deep learning. I’m a robotics engineer and when I look at large neural networks like GPT-3 I get the sense that robotics could work well with very massive networks, even orders of magnitude larger than GPT-3 (imagine not just ingesting text and producing a stream of words, but encoding a multidimensional world state for a robot and producing a desired action based on all current and past signals).

But to put massive neural networks orders of magnitude larger than GPT-3 into a robot requires a significant step change in the efficiency and scale of neural network compute.

So I don't mind if their chip doesn't do standard logic well, because a regular Intel chip is great at that. I just want to see significantly more powerful neural network compute. And if the Lightmatter CEO is to be believed (I don't know), their tech could be a boon for machine learning and robotics some day.


Photonics doesn’t really scale well for this application due to relatively large area requirements and the fact that array size is limited. It’s really better suited to helping with data movement than for computation.


While I agree about the area, especially for the LCD / screen-mask approach, I'm not sure an array-size limit is a valid concern, since all traditional ASIC silicon is limited in the same manner if you want to do things quickly.

ALUs, SFUs/FFUs, and the rest have their own defined sizes for operands, accumulators, and the like. A 32-bit ALU can't work on larger operands unless it's capable of some complicated tricks, which are often done in the compiler rather than by the instruction decoder / scheduler in hardware, and if you need more than 32 bits you are going to see a major slowdown because you need to break things into smaller pieces to fit your hardware.
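For example (a toy illustration of the splitting, not any particular ISA or compiler): a 64-bit add on a 32-bit ALU becomes two adds plus explicit carry handling, which is exactly the kind of code the compiler quietly emits for you.

    # 64-bit addition using only 32-bit-wide operations, the way it would be
    # lowered for a 32-bit ALU: low halves first, then propagate the carry.
    MASK32 = (1 << 32) - 1

    def add64_on_32bit_alu(a, b):
        lo = (a & MASK32) + (b & MASK32)            # low 32-bit add
        carry = lo >> 32                            # did the low half overflow?
        hi = (a >> 32) + (b >> 32) + carry          # high 32-bit add plus carry
        return ((hi & MASK32) << 32) | (lo & MASK32)

    a, b = 0xDEADBEEFCAFEBABE, 0x0BADF00D12345678
    assert add64_on_32bit_alu(a, b) == (a + b) & ((1 << 64) - 1)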

Same thing with matrices: if you have hardware that can multiply matrices efficiently then it almost certainly has a defined size, and it's up to you to optimize your workload to fit that fixed matrix size of, say, 256x256.
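Roughly like this (a sketch assuming the 256x256 tile size from above and dimensions that divide evenly):

    import numpy as np

    TILE = 256  # the fixed matrix size the hardware actually multiplies

    def tiled_matmul(A, B):
        # Decompose a larger matmul into TILE x TILE pieces and accumulate;
        # each innermost @ is one "native" fixed-size hardware operation.
        n = A.shape[0]
        C = np.zeros((n, n), dtype=A.dtype)
        for i in range(0, n, TILE):
            for j in range(0, n, TILE):
                for k in range(0, n, TILE):
                    C[i:i+TILE, j:j+TILE] += A[i:i+TILE, k:k+TILE] @ B[k:k+TILE, j:j+TILE]
        return C

    A = np.random.randn(512, 512).astype(np.float32)
    B = np.random.randn(512, 512).astype(np.float32)
    assert np.allclose(tiled_matmul(A, B), A @ B, atol=1e-2)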


I think we have different definitions as to what constitutes a large array. The ability to create photonic arrays is much more limited than what we have in electronics. This is due to the rather large size of the elements (~0.3mm per device) as well as the fact that there is loss in each element. A 256-element row, for example, would be ~76mm on one side. That's not including any other elements or devices, including optical coupling for light sources (no native lasers in silicon).

The other dimension could be smaller but will have practical limits due to thermal and/or cross-talk between elements depending on the technology used.

Optical computing is analog, so the cascaded loss matters. If we assume a 1x256 array, bias the MZMs so that a "1" propagates to the end of the array, and assume each MZM has only 0.12dB of loss, there would be about 30dB of loss at the other end. So the "1" would have ~1000x less power at the far end. This is only a toy problem to show how things would scale in a very simple way. In reality these devices will have much more loss per device.
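For anyone who wants to plug in their own numbers, here's the same toy arithmetic in a few lines (same assumptions as above: 256 elements, 0.12dB per MZM, ~0.3mm per device):

    n_devices = 256
    loss_per_device_db = 0.12     # optimistic insertion loss per MZM
    device_pitch_mm = 0.3         # rough size per device

    total_loss_db = n_devices * loss_per_device_db     # ~30.7 dB end to end
    surviving_fraction = 10 ** (-total_loss_db / 10)   # analog power that remains
    row_length_mm = n_devices * device_pitch_mm        # ~77 mm of silicon per row

    print(f"cascaded loss: {total_loss_db:.1f} dB "
          f"(~1/{1/surviving_fraction:.0f} of the launched power)")
    print(f"row length:    {row_length_mm:.1f} mm")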



By the time any of this is viable commercially, I really wonder where commercial GPUs and accelerators will be.

And they're going to have the home team advantage when that happens. So that means that unless these things are as accessible from tensorflow/pytorch or whatever the heck the framework du jour is at that point (hopefully more like Jax but better), they will get no traction.

Evidence? Every single demonstrably superior CPU architecture that failed to dislodge x86 over the past 50 years. Sure, it's finally happening, but it's also 50 years later.


Luckily chip companies know this now, and all the alternative deep learning accelerators have some level of support for mainstream frameworks. Getting their support upstreamed to the main framework projects is another matter, though, as is reaching the same quality of implementation as the CUDA and CPU backends...


We've already seen disruptive architectures like Google's Tensor Processing Units. x86 has already been upended for ML, and photonics based processing units will simply be another PCI card you plug into your computer just like a GPU.


How disruptive are TPUs really though? My understanding is that essentially everything is still trained on Nvidia architecture.


If you design your network on a TPU, you will tend to use operators that work well on a TPU. And in the end you will have a network that works best on the TPU.

Lather rinse repeat for any other architecture. You can even make a network that runs best on Graphcore that way, but it won't be fun to do it. You might even get Graphcore to pay top dollar for it though as they both need some good publicity and they have lots of VC left to squander.

This also tends to be true of video games where the platform on which they were developed is the best place to play them rather than their many ports.


We’ve been doing this for years with DSP and networking. So kind of ho-hum from a HW perspective.

If you ask me the thing that makes these things even remotely interesting is the willingness from the SW side to support new HW architectures. Without that you can’t have any innovation in HW.


That's only because Google is stupidly refusing to sell their devices.


Not to mention it is only happening due to the emergence of mobile with a different architecture (new field, existing moat is meaningless etc etc)


Better CPU arches were never faster, definitely not by much.


Going to disagree. I ran circles around contemporary x86 back in the early '90s because of specific instructions Intel denied they would ever need in their processor roadmap. But it really didn't matter and that's one of the most important lessons of my career.

They did similar goofy thinking with respect to the magic transcendental unit on GPUs so it's not like they ever learned. It's not entirely about clock rate.


It is true that photonic systems can multiply/add very efficiently. The limiting factor for optical neural networks is activation functions. Non-linear relationships are hard for photonics. State-of-the-art nonlinear features require either optical elements that are difficult to manufacture or op-amps, which slow the computational potential of the optical neural network by orders of magnitude.
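A back-of-the-envelope way to see the imbalance (toy numbers of my own choosing): in an n-wide dense layer the optics absorb ~n^2 multiply-adds, but each of the n outputs still needs its own nonlinearity, i.e. a detect/convert/re-modulate step, on every layer.

    n, layers = 4096, 48            # assumed layer width and network depth
    macs_per_layer = n * n          # work the analog optical matmul absorbs
    acts_per_layer = n              # nonlinearities that must leave the optical domain

    print(f"MACs per layer:        {macs_per_layer:,}")
    print(f"activations per layer: {acts_per_layer:,}")
    print(f"total O/E/O steps:     {acts_per_layer * layers:,} across the network")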


People have been talking about photonics for decades. Is photonics actually well-suited to deep learning? Or is this just another "new thing is happening, time to remind people of the old thing that keeps never happening" take?


Nobody was talking seriously about photonic accelerators 10 years ago. Optical computing last had a hype cycle in the late 80s/early 90s.


Can we make fiber optics cables smaller than 2nm?

Or is the game plan to make them bigger & nobody cares cause no heat & fast?


How could you make a 2nm optical cable? The wavelengths involved are two to three orders of magnitude larger than that.


Not even remotely an expert on chip design, but deep learning dataflow is a lot more predictable and linear than what a CPU or even a GPU doing actual graphics needs to do. Speed of light latency is probably not an issue since the relevant distances are not one side of the die to the other but rather the distance between adjacent components.


I don't think the size of silicon transistors has anything to do with this. There is no direct comparison. So they will probably be bigger, but no one cares.


Photonic integrated circuits are almost as old as regular integrated circuits.



