U.S. focuses on invigorating ‘chiplets’ to stay cutting-edge in tech (nytimes.com)
85 points by adapteva on May 12, 2023 | 48 comments




Some folks I work with are interested in chiplets for secure/defense purposes. If you don't trust the fab but you do trust the integrator, the fab can make multiple little modules with well-defined interfaces, and your integrator can instrument the interfaces more easily than an entire chip.


Yes, there have been a fair number of public DoD studies around the virtues of disaggregation when it comes to security. Minimize the people/things you have to trust (e.g. RoT).


I’m sure they are smart, and are applying big-time brainpower to the project. But my basically-layman gut take is that it seems surprising that inside-the-package is a reasonable place to have an attack surface.

And is it really impossible to sneak an antenna into a chiplet?


The threat model I'm thinking of is the untrusted 3rd party fab.

For example, you want to use the fancy new process node that's only available from SketchySemi in some other country. They're happy to do business with you, but you're worried about hardware trojans.

Chiplets help isolate the untrusted component so you can focus your scrutiny on its I/O.

SketchySemi could put antennas inside the chiplet they sell to you, but you have freedom to orient the chiplet however you want, wrap it in shielding, throttle its power, etc.


> SketchySemi could put antennas inside the chiplet they sell to you, but you have freedom to orient the chiplet however you want, wrap it in shielding, throttle its power, etc.

TEMPEST shielding is also a thing:

* https://en.wikipedia.org/wiki/Tempest_(codename)


Might be time to reread Vinge's Zones of Thought books.

Sub in chiplets for motes. Also, antennae are not the only side-channel attack out there.


That makes total sense. The paranoid approach to software involves writing N separate pieces using teams that can't talk to each other to minimise how many people know what the overall system can do. Applying the same reasoning to hardware seems to end up with this conclusion (and probably some enthusiasm for FPGAs).

I'm curious to what extent software built in this isolated-silos scheme actually works; my best guess is that it's OK but slow and expensive to build. The same idea might apply here, i.e. it's a way to make hardware take longer to put together.


> teams that can't talk to each other

Is there such a thing?


Sure. Just look at most mid-size companies!


Yeah, once you get past 50 engineers.


The chiplets with the interfaces could still be black boxes, right?


I am impressed by the fabrication technology for these heterogeneous systems. It was already complex enough to fabricate a CPU, but chiplets require incredible precision for placing the die on the interposer wafer for soldering. And the coplanarity of the whole assembly is critical, or else it will be impossible to cool effectively. It's incredible that any of it works.


Agreed... although I'm not sure it's any more impressive than automated fabs that churn out nanometer-precision transistors, with billion-transistor chips costing less than a cup of coffee per chip. :-)


Why chiplets? Can anyone explain to me why I'd want chiplets as opposed to a processor embedded in a Field Programmable Gate Array? Seems like development would be infinitely cheaper and the possibilities far more dynamic? Chiplets seem slow and expensive to develop. Once built they'd be completely static and probably necessarily replaced by the subsequent version in 3-5 years.


If an FPGA can get the job done, it's usually the right answer. However FPGA programmability comes with a 25x-100x penalty in terms of area, cost, power, performance compared to application specific devices. Sometimes 100x matters...


Kinda ironic how the article is about chiplets, but all the factory photos are of an old-school wire-bonded PGA packaging line... But maybe that's the point: you can't find a chiplet factory within the borders of the USA to photograph for a story?


https://archive.is/pUgCO

Will this eventually give me more FPS in cawadooty?


Your computer may be able to run minesweeper one day


I'm not the OP, but I'd be curious for an informed HN reader's take on this.


By using several small chips instead of a big monolithic one, it's possible to reduce costs in 2 ways:

1) the yield is better for a small die. For a given defect density, a big chip has a higher probability of containing a defect than a small one. Basic example: you use 4 chips instead of one, and one defect that would kill the big chip will only kill one of the four small chips. It's more subtle than this; there are simulators on the web to see the impact of size on cost, for those interested;

2) parts of the chip can use cheaper nodes. For example, the I/Os can not only use less advanced and cheaper nodes, but those nodes often have better support for analog IPs.

On the flip side, communications that were internal to the big monolithic die now must cross those small-die boundaries. And communication is expensive: you would certainly not want to handle this through a PCB. Instead, short-range local interconnects are used that are much more power-efficient than a PCB interconnect (but not as good as in-die). These require sophisticated packaging, which adds to the cost. Still, for complex chips the net effect is positive; see what AMD did (with Intel now following).
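To make point 1 above concrete, here is a minimal sketch of the kind of calculation those online yield simulators perform, using a simple Poisson defect model (the defect density and die areas below are made-up, illustrative numbers):

  import math

  def poisson_yield(area_mm2, defects_per_mm2):
      # Fraction of dies with zero defects, assuming randomly scattered defects.
      return math.exp(-area_mm2 * defects_per_mm2)

  D = 0.001        # assumed defect density: 0.1 defects/cm^2 = 0.001 defects/mm^2
  big_die = 800    # one monolithic 800 mm^2 die
  small_die = 200  # vs. four 200 mm^2 chiplets

  print(f"monolithic die yield:   {poisson_yield(big_die, D):.1%}")      # ~44.9%
  print(f"single chiplet yield:   {poisson_yield(small_die, D):.1%}")    # ~81.9%
  print(f"all four chiplets good: {poisson_yield(small_die, D)**4:.1%}") # ~44.9%, same odds --
  # but a defect now scraps one 200 mm^2 die instead of the whole 800 mm^2 die,
  # so the fraction of wafer area ending up in sellable parts is ~82% vs ~45%.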


Couldn’t they use fiber-optic connections to remove the overhead of the interconnect communication?

I saw a demo from Kodak once where they printed a fiber-optic-backed motherboard, and it was lightning fast even when they increased the distance. No clue what the heck happened to that tech, but I do recall one of the folks giving a lengthy explanation about how fiber optics could replace some of the metal used in CPUs because they could make microscopic glass.

Edit: turns out I was on to something with this; there is work being done on this exact problem [0]. I still wonder WTF happened with that Kodak tech though.

[0]: https://spectrum.ieee.org/amp/optical-interconnects-26589434...


At the distances (millimeters) involved, optical is slower, more expensive, and consumes more power than electrical signaling.


As long as the processing is being done with electrons (CMOS), there will be a pretty big hit for converting back and forth to photons. In-package chiplet communication is actually not that inefficient as long as you can put the dies right next to each other (like on a chip). Intel demonstrated this with AIB. With parallel interfaces, you can get to 0.1-0.5 pJ/bit transferred, which is good enough for most.
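Rough numbers on what that energy-per-bit figure means in practice. A minimal back-of-the-envelope sketch, assuming an illustrative 1 Tbit/s of die-to-die bandwidth (the 2 pJ/bit entry is a made-up stand-in for a longer-reach, off-package link, just for comparison):

  # power = bandwidth * energy per bit
  bandwidth_bits_per_s = 1e12                # assumed 1 Tbit/s die-to-die link
  for pj_per_bit in (0.1, 0.5, 2.0):         # quoted in-package range, plus a higher stand-in
      watts = bandwidth_bits_per_s * pj_per_bit * 1e-12
      print(f"{pj_per_bit:4.1f} pJ/bit -> {watts:.1f} W at 1 Tbit/s")
  # 0.1 pJ/bit -> 0.1 W, 0.5 pJ/bit -> 0.5 W, 2.0 pJ/bit -> 2.0 W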


Am I right in understanding that chiplets tend not to last very long in comparison to single chips? I have a friend who repairs computers, and he claims that "combines" (that is what he calls chiplets) tend to break the connections between the chip and the substrate, and he cannot repair this; all he can do is replace the whole BGA part.


One chiplet probably can't be sanely replaced if it dies, but equally we couldn't really cut out and replace part of a single die either. So that seems like a wash.

I could believe they're more vulnerable to mechanical damage. It also seems possible that mechanical stresses from thermal expansion are more of a problem. I suppose we won't really know for a while yet.


This is what I would expect. With thermal conduction differing across the planar barrier, disparities in heat flow to the sinks, and the possibility of one chiplet running hotter than the other parts, there might be too much stress on the joined area. This also assumes that the package is not experiencing any additional mechanical stress from motion or vibration.


Not sure what your friend is referring to, but packages with exposed dies are definitely more fragile than ones with built in heat sinks.


I'm not at all an expert in hardware, but I have experience using chiplet-based chips in production and optimizing for them. Those chips are better from a performance-per-$ point of view, but they don't necessarily achieve the highest absolute performance. The main limiting factor appears to be the latency of communication between chiplets. If you write anything with shared mutable memory you are affected by this. Simple atomic operations like compare-exchange are much slower if the threads run on different chiplets.

However, I fully expect this to be the future. Performance per $ is what really matters to the bean counters, and we software engineers will just have to write better software to work around it, perhaps with something like NUMA-aware scheduling that understands chiplets.
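For example, on Linux one way to approximate chiplet-aware scheduling today is to pin the threads that share hot data onto cores belonging to the same chiplet, so the compare-exchange traffic stays on one die. A minimal sketch; the core-to-chiplet mapping below is hypothetical (the real mapping varies by part and can be read from the L3 cache topology via lscpu or hwloc):

  import os

  # Hypothetical layout: cores 0-7 live on chiplet 0, cores 8-15 on chiplet 1.
  CHIPLET_CORES = {0: range(0, 8), 1: range(8, 16)}

  def pin_to_chiplet(pid, chiplet_id):
      # Restrict a process (0 = the caller) to one chiplet's cores so that
      # atomics and shared cache lines stay within a single die.
      os.sched_setaffinity(pid, set(CHIPLET_CORES[chiplet_id]))

  pin_to_chiplet(0, 0)
  print(os.sched_getaffinity(0))   # e.g. {0, 1, 2, 3, 4, 5, 6, 7}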


I wonder if it would be better to scale down cluster paradigms (MPI stuff), rather than trying to somehow scale up shared-memory paradigms.


Dropping cache coherency is a big lever for performance. That's definitely more annoying to program against than a magically coherent model though.


“Big lever” is a nice expression, successfully expresses both the power and unwieldiness of a thing.


As a summary: the last 50 years have achieved smaller, cheaper, faster through monolithic integration (Moore's Law). The cost and complexity of design and manufacturing are now making that approach impractical. Disaggregated design and manufacturing through small chiplets is "the next thing". A bit hyperbolic, but it gets to the point...


Chiplets are the microservices of the semiconductor world. It’s good in that smaller individual chips are cheaper to produce, and the whole package is more scalable, but it’s bad in that there are interfaces that reduce performance vs a monolith


This isn't really true. Chiplets can be used to "break apart" what would traditionally be one chip, but also to more tightly integrate things that would previously have been discrete components on the motherboard, and to use the right process for a given functionality.

Consider AMD's approach. They use multiple CPU dies in a single package to build very high core count systems that would previously have required multiple sockets. Bringing these into one package can make communication more energy-efficient and faster, as well as simplifying other aspects of the system. They also use different processes for different dies. The "IO die" is fabricated on a slightly older process as it is not performance-critical, while the best process is reserved for building cores.


They break things apart and preserve interface protocols, but leaving the die isn't free. AMD's approach has all of the upsides and downsides that come with microservices, and taking effective advantage of it requires a similar level of planning.


I think this is a poor comparison because microservices exist to help code mirror the org chart while chiplets exist for physical engineering reasons.

Microservices are supposed to improve testability, reduce complexity, etc.; it is an organizational choice. Chiplets add complexity (silicon interposer, tougher packaging, NUMA, etc.); it's an engineering choice, traded off against better yield, chips hitting the maximum reticle size, and so on.


Chiplets also exist for organizational reasons, e.g.

  trust boundaries
  open vs. closed IP
  old vs. new processes
  onshore vs. offshore
  supply chain resilience


I'm mostly meaning: will the best chips (CPUs and GPUs) in 2030 use this method, or not?


The best chips TODAY use chiplets. Check out Epyc, Ponte Vecchio, and all high-end GPUs and ML accelerators using HBM memory. A chiplet is generally defined as a chip/die with custom interfaces for in-package communication.


I think what they are asking is if chiplets have a durable advantage, or a transitory one.


The current leading-edge processor products moved to chiplets because it was the optimal/only solution from a cost and performance perspective. If we assume that reticle sizes will stay roughly similar and process scaling will continue to slow, then it would seem that the future bends further toward chiplets...


Yes this is basically it. Reticle size is at the limit. Barring some sort of quantum optical revolution, we need chiplets to make more complex chips.

Also the amount of heat in a given area is rising. We need to spread it out or find some really innovative ways of dumping the heat.

I suspect as time goes on, prices per transistor will become absurdly cheap but you'll need to do things like have redundant hardware running half the time to get rid of the heat.


Barring some unexpectedly big breakthroughs in things like 3d stacking or in improving yields for increasingly complex process nodes, chips in 2030 will definitely be using this method.


I am curious how Cerebras's 850,000-core CPU fits into all these chip/chiplet categories.


Cerebras is the anti-chiplet approach, working at wafer scale and stitching together reticle-sized islands of logic using on-wafer wiring.


What is the opposite of a chiplet? A chiplot?


invigorating chiplets' _______?



