I think we have barely scratched the surface of inference efficiency for post-trained / generative models.
A uniquely efficient hardware stack, for either training or inference, would be a great moat in an industry that seems to offer few moats.
I keep waiting to hear of more adoption of Cerebras Systems' wafer-scale chips. They may be held back by not offering the full hardware stack, i.e. their own data centers optimized around wafer-scale compute units. (They do partner with AWS, as a third-party provider, in competition with AWS's own silicon.)
I hope we never find good moats. I hope that progress in AI is never bottlenecked on technology that centralizes control over the ecosystem to one or a handful of vendors. I want to be able to run the models myself and train them myself. I don't want to be beholden to one company because they managed to hire up all the people building fancy optical chips and kept the research for themselves.
From a “business is interesting” perspective, I love being surprised by clever competitive moves.
From an “AI is the ultimate technology of technologies and, if competitively compounding, the ultimate enabler of power centralization” perspective, I don’t want any serious moats either.
Re: Cerebras, they filed an S-1 [1] last year when attempting to go public. It showed something like a $60M+ loss for the first 6 months of 2024. The IPO didn’t happen because the CEO’s past included some financial missteps and the banks didn’t want to deal with this. At the time, the majority of their revenue also came from a single source in Abu Dhabi. They did end up benefiting from the slew of open source model releases, which enabled them to become an inference provider via APIs rather than needing to provide the full stack for training.
Google is already there with TPUs. The reason they can add AI to every single Google search is not just that Google has near-infinite cash, but also that inference costs far less for Google than it does for anyone else.
Reading the specs on the new TPU designs and how they incorporate optical switching fabrics and other data-center-level technology just to function, I think the moat is already there.
The raw materials (diffractive optical elements and single-mode fibers) are all quite easy to manufacture from a materials perspective. The primary limitation on miniaturization is the single-mode fibers, which are limited by the optical wavelength you are using and the index of the fiber. For a conventional silica optical fiber, this is probably around ~100 nm diameter at a minimum (a rough back-of-envelope check is sketched below). Newer materials can definitely change this 2-3x, but I'm not aware of anything more fundamental.
So in general this would be something that you would potentially be able to see in cars, but unlikely in consumer electronics or handhelds without a modification in the operational principle (e.g. time-multiplexing to reduce the required number of fibers).
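As a rough sanity check on that ~100 nm figure (the specific wavelengths and indices below are just illustrative assumptions, not measured values): the smallest guided-mode scale is on the order of the diffraction limit, roughly wavelength / (2 * index), so it shrinks with shorter wavelengths and higher-index materials.

    # Back-of-envelope diffraction-limit estimate of the smallest
    # guided-mode scale: ~ wavelength / (2 * refractive index).
    # Wavelengths and indices are illustrative assumptions.
    def min_feature_nm(wavelength_nm: float, refractive_index: float) -> float:
        return wavelength_nm / (2.0 * refractive_index)

    print(min_feature_nm(400, 1.45))   # silica at violet light   -> ~138 nm
    print(min_feature_nm(1550, 1.45))  # silica at telecom bands  -> ~534 nm
    print(min_feature_nm(1550, 3.5))   # high-index (e.g. silicon) -> ~221 nm

The ~2-3x gain from higher-index materials falls out of the index ratio (~3.5 / 1.45 ≈ 2.4); the exact minimum depends on geometry and how much mode leakage you can tolerate.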
My personal opinion is that competing on low-power and small-scale is a lost cause for photonic computing. In terms of absolute energy efficiency and absolute miniaturization, photonics will never win. But at larger energy scales and larger systems, photonics can reach a regime where higher parallel throughput will dominate.
Not cheap, unless that one specific model is going to be used across tens of millions of devices, with no updates, for the physical lifetime of the device.