
I'm not the OP, but I'd be curious for an informed HN reader's take on this.



By using several small chips instead of a big monolithic one, it's possible to reduce costs in 2 ways:

1) the yield is better for a small die. For a given defect density, a big chip has a higher probability of containing a defect than a small one. Basic example: you use 4 chips instead of one, and a defect that would have killed the big chip now only kills one of the four small chips. It's more subtle than this; there are simulators on the web to see the impact of size on cost for those interested (see the sketch after this list);

2) parts of the chip can use cheaper nodes. For example, the I/Os can not only use less advanced and cheaper nodes, but those nodes often have better support for analog IP.
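
As a rough illustration of the yield math those simulators run, here is a minimal sketch using the simple Poisson yield model; the defect density and die areas below are made-up numbers, not real process data.

  // Poisson yield model: Y = exp(-defect_density * die_area).
  // All numbers are illustrative assumptions, not real process data.
  #include <cmath>
  #include <cstdio>

  int main() {
      const double defects_per_cm2 = 0.2; // assumed defect density
      const double big_die_cm2     = 6.0; // one big monolithic die
      const double small_die_cm2   = 1.5; // one of four chiplets

      std::printf("big die yield:   %.1f%%\n",
                  100.0 * std::exp(-defects_per_cm2 * big_die_cm2));
      std::printf("small die yield: %.1f%%\n",
                  100.0 * std::exp(-defects_per_cm2 * small_die_cm2));
      // Chiplets can be tested before assembly, so the package is built
      // from known-good dies rather than paying the big-die yield hit.
      return 0;
  }

With these made-up numbers the big die yields about 30% while each small die yields about 74%, which is the effect described in point 1.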

On the flip side, communications that were internal to the big monolithic die now must cross the boundaries between those small dies. And communication is expensive: you would certainly not want to handle it through a PCB. Instead, more local short-range interconnects are used that are much more power efficient than a PCB interconnect (but not as good as in-die). These require sophisticated packaging, which adds to the cost. Still, for complex chips the net effect is positive; see what AMD did (with Intel now following).


Couldn’t they realistically use fiber optic connections to remove the overhead of the interconnect communication?

I saw a demo from Kodak once where they printed a fiber-optic-backed motherboard, and it was lightning fast even when they increased the distance. No clue what the heck happened to that tech, but I do recall one of the folks giving a lengthy explanation about how fiber optics could replace some of the metal used in CPUs because they could make microscopic glass.

Edit: turns out I was on to something with this; there is work being done on this exact problem[0]. I still wonder WTF happened to that Kodak tech, though.

[0]: https://spectrum.ieee.org/amp/optical-interconnects-26589434...


At the distances (millimeters) involved, optical is slower, more expensive, and consumes more power than electrical signaling.


As long as the processing is being done with electrons (CMOS), there will be a pretty big hit for converting back and forth to photons. In-package chiplet communication is actually not that inefficient as long as you can put the dies right next to each other (like on a chip). Intel demonstrated this with AIB. With parallel interfaces, you can get to 0.1-0.5 pJ/bit transferred, which is good enough for most.
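
For a rough sense of scale, here is that per-bit figure applied to an assumed link rate; the 512 Gbit/s number is made up for illustration, and only the 0.1-0.5 pJ/bit range comes from the comment above.

  // Back-of-envelope power cost of an in-package parallel link.
  // The link rate is an assumption; the pJ/bit range is quoted above.
  #include <cstdio>

  int main() {
      const double bits_per_second = 512e9; // assumed 512 Gbit/s link
      const double pj_per_bit_lo   = 0.1;
      const double pj_per_bit_hi   = 0.5;

      // energy per bit * bits per second = watts (1 pJ = 1e-12 J)
      std::printf("link power: %.2f W to %.2f W\n",
                  bits_per_second * pj_per_bit_lo * 1e-12,
                  bits_per_second * pj_per_bit_hi * 1e-12);
      return 0;
  }

That works out to roughly 0.05-0.26 W for the whole link, which is why it is "good enough for most".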


Am I right in understanding that chiplets tend not to live very long in comparison to single chips? I have a friend who repairs computers, and he claims that "combines" (that is what he calls chiplets) tend to break the connections between chip and board, and he cannot repair this; all he can do is replace the whole BGA package.


One chiplet probably can't be sanely replaced if it dies, but equally we couldn't really cut out and replace part of a single die either. So that seems like a wash.

I could believe they're more vulnerable to mechanical damage. It also seems possible that mechanical stresses from thermal expansion are more of a problem. I suppose we won't really know for a while yet.


This is what I would expect. With thermal conduction differing across the planar boundary, disparities in heat flow to the heat sink, and the possibility of one chiplet running hotter than the other parts, there might be too much stress on the joined area. This also assumes that the package is not experiencing any additional mechanical stress from motion or vibration.


Not sure what your friend is referring to, but packages with exposed dies are definitely more fragile than ones with built-in heat sinks.


I'm not at all an expert in hardware, but I have experience using chiplet-based chips in production and optimizing for them. Those chips are better from a performance per $ point of view, but they don't necessarily achieve the highest absolute performance. The main limiting factor appears to be the latency of communication between chiplets. If you write anything with shared mutable memory, you are affected by this. Simple atomic operations like compare-exchange are much slower if the threads run on different chiplets.
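
A rough sketch of how you could see that effect yourself on Linux: pin two threads either to the same chiplet or to different chiplets and bounce a compare-exchange between them. The core numbers are assumptions about the topology; check lscpu (or similar) for your actual layout.

  // Sketch: compare-exchange ping-pong between two pinned threads.
  // Core IDs are assumptions about which cores share a chiplet.
  #include <atomic>
  #include <chrono>
  #include <cstdio>
  #include <pthread.h>
  #include <thread>

  static std::atomic<int> turn{0};

  static void pin_to_core(int core) {
      cpu_set_t set;
      CPU_ZERO(&set);
      CPU_SET(core, &set);
      pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
  }

  static void worker(int core, int me, int iters) {
      pin_to_core(core);
      for (int i = 0; i < iters; ++i) {
          int expected = me;
          // Wait for our turn, then hand the cache line to the other thread.
          while (!turn.compare_exchange_weak(expected, 1 - me))
              expected = me;
      }
  }

  int main() {
      const int iters = 1000000;
      // Assumed layout: cores 0 and 1 on one chiplet, core 8 on another.
      const int others[] = {1, 8};
      for (int other : others) {
          turn = 0;
          auto start = std::chrono::steady_clock::now();
          std::thread a(worker, 0, 0, iters), b(worker, other, 1, iters);
          a.join(); b.join();
          auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                        std::chrono::steady_clock::now() - start).count();
          std::printf("cores 0 and %d: %.0f ns per round trip\n",
                      other, (double)ns / iters);
      }
      return 0;
  }

You would expect the cross-chiplet pairing to be noticeably slower per round trip; that gap is the latency hit being described here.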

However, I fully expect this to be the future. Performance per $ is what really matters to the bean counters, and we software engineers will just have to write better software to work around it, perhaps with something like NUMA-aware scheduling that understands chiplets.


I wonder if it would be better to scale down cluster paradigms (MPI stuff), rather than trying to somehow scale up shared-memory paradigms.
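
For a flavor of what that looks like, here is a minimal message-passing sketch using MPI with one process per chiplet; the rank-to-chiplet placement would come from the launcher's process-binding options (which vary by MPI implementation), not from the code itself.

  // Sketch: explicit message passing instead of shared mutable state.
  // Assumes the launcher binds one rank per chiplet/NUMA domain.
  #include <mpi.h>
  #include <cstdio>

  int main(int argc, char** argv) {
      MPI_Init(&argc, &argv);
      int rank = 0;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      int counter = 0;
      if (rank == 1) {
          counter = 42;
          // Ownership is explicit: rank 1 sends, rank 0 receives.
          MPI_Send(&counter, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
      } else if (rank == 0) {
          MPI_Recv(&counter, 1, MPI_INT, 1, 0, MPI_COMM_WORLD,
                   MPI_STATUS_IGNORE);
          std::printf("rank 0 received %d\n", counter);
          // No cross-chiplet cache-line ping-pong; data moves in messages.
      }

      MPI_Finalize();
      return 0;
  }

The point is that data movement between chiplets becomes explicit in the program rather than hidden behind cache coherency.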


Dropping cache coherency is a big lever for performance. That's definitely more annoying to program against than a magically coherent model though.


“Big lever” is a nice expression; it successfully expresses both the power and the unwieldiness of a thing.


As a summary: the last 50 years have achieved smaller, cheaper, faster through monolithic integration (Moore's Law). The cost and complexity of design and manufacturing are now making that approach impractical. Disaggregated design and manufacturing through small chiplets is "the next thing". A bit hyperbolic, but it gets to the point...


Chiplets are the microservices of the semiconductor world. It’s good in that smaller individual chips are cheaper to produce and the whole package is more scalable, but it’s bad in that there are interfaces that reduce performance versus a monolith.


This isn't really true. Chiplets can be used to "break apart" what would traditionally be one chip, but also to more tightly integrate things that would previously have been discrete components on the motherboard, and to use the right process for a given functionality.

Consider AMD's approach. They use multiple CPU dies in a single package to build very high core count systems that would previously have required multiple sockets. Bringing these into one package can make communication more energy-efficient and faster, as well as simplifying other aspects of the system. They also use different processes for different dies: the "IO die" is fabricated on a slightly older process, as it is not performance-critical, while the best process is reserved for the dies containing the cores.


They break things apart and preserve interface protocols, but leaving the die isn't free. AMD's approach has all of the upsides and downsides that come with microservices, and taking effective advantage of it requires a similar level of planning.


I think this is a poor comparison because microservices exist to help code mirror the org chart while chiplets exist for physical engineering reasons.

Microservices are supposed to improve testability, reduce complexity, etc.; it is an organizational choice. Chiplets add complexity (silicon interposers, tougher packaging, NUMA, etc.); it's an engineering choice made as a tradeoff for better yield, for chips hitting the maximum reticle size, and so on.


Chiplets also exist for organizational reasons, e.g.

  trust boundaries
  open vs. closed IP
  old vs. new processes
  onshore vs. offshore
  supply chain resilience


What I mostly mean is: will the best chips (CPUs and GPUs) in 2030 use this method, or not?


The best chips TODAY use chiplets. Check out Epyc, Ponte Vecchio, and all the high-end GPUs and ML accelerators using HBM memory. A chiplet is generally defined as a chip/die with custom interfaces for in-package communication.


I think what they are asking is if chiplets have a durable advantage, or a transitory one.


The current leading-edge processor products moved to chiplets because it was the optimal/only solution from a cost and performance perspective. If we assume that reticle sizes will stay roughly the same and process scaling will continue to slow, then it would seem that the future bends further toward chiplets...


Yes this is basically it. Reticle size is at the limit. Barring some sort of quantum optical revolution, we need chiplets to make more complex chips.

Also the amount of heat in a given area is rising. We need to spread it out or find some really innovative ways of dumping the heat.

I suspect as time goes on, prices per transistor will become absurdly cheap but you'll need to do things like have redundant hardware running half the time to get rid of the heat.


Barring some unexpectedly big breakthroughs in things like 3d stacking or in improving yields for increasingly complex process nodes, chips in 2030 will definitely be using this method.




