Also, we are starting to see weird stuff happen with GPU/processor and memory prices going up as inflationary pressures are able to outrun chip technology advances. We might start to see commodity computer products hoarded as a store of value.
Advances have mostly been made in storage speed (SSDs), battery performance, and display quality (resolution, colours, and lately refresh rate), plus vector processing for machine learning.
I think this is a misconception. Computers have been "fast enough for users" since the mid-eighties. Anything is fast enough if you cannot imagine it working better. What happened is that companies came up with more advanced software products for consumers, and that drove the demand for faster consumer hardware.
The reality is that all the cutting-edge computing today requires a modern $600 GPU. The difference is that in the past we would have expected this kind of hardware to "trickle down" to normal users in a couple of years. Today we don't. Instead of marketing better computers, companies market cloud services. It's a really shitty phenomenon. We're seeing the reversal of the PC revolution.
Exactly. Is memory cheap enough that I can load the entire contents of my hard disk into it at a reasonable price? Is Optane (or something similar) cheap enough that syncs from memory back onto something persistent happen at close to real time? Are processors fast/parallel enough that I can actually start running all the programs I will ever want to run and just switch to them when necessary? Do I have enough cores that even if most of my running processes start glitching out and using 100% CPU, the rest of them will still run buttery smooth? Add to that that the resolution of monitors will increase, and details displayed even at lower resolutions will only get more intricate. And things currently on the high end can only get cheaper.
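For fun, the back-of-the-envelope on that first question, as a Python sketch (the per-gigabyte prices are my rough guesses at current street pricing, not quotes):

    # Back-of-envelope: cost to mirror a whole disk in DRAM vs an
    # Optane-class tier. Both prices are rough guesses, not quotes.
    DISK_GB = 1024                # assumed 1 TB consumer drive
    DRAM_USD_PER_GB = 8.0         # assumed DDR4 street price
    OPTANE_USD_PER_GB = 1.2       # assumed Optane-class price

    print(f"DRAM mirror of the disk:   ${DISK_GB * DRAM_USD_PER_GB:,.0f}")
    print(f"Optane-class tier instead: ${DISK_GB * OPTANE_USD_PER_GB:,.0f}")

So "reasonable price" is still workstation money for pure DRAM, which is exactly why the cheaper persistent tier matters.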
Additionally, software has gotten better. Nginx is just plain much more streamlined than Apache, and simple caching techniques have really increased the bang for your buck you get with hardware these days, at least in the server space.
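To illustrate how far "simple caching" goes, here's a toy memoization sketch in Python; it's obviously not what nginx does internally, but the principle of serving repeats from memory is the same:

    from functools import lru_cache
    import time

    # Memoize an expensive lookup so repeated requests are served from
    # memory instead of redoing the disk/DB/template work every time.
    @lru_cache(maxsize=4096)
    def render_page(path: str) -> str:
        time.sleep(0.05)              # stand-in for the expensive backend work
        return f"<html>contents of {path}</html>"

    start = time.perf_counter()
    for _ in range(100):
        render_page("/index.html")    # 1 miss, 99 hits
    print(f"100 requests in {time.perf_counter() - start:.3f}s")

One cache turns a five-second workload into a ~50 ms one.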
AMD will sell CPUs with at least 12 cores on a single die when 7nm arrives and maybe even up to 16 if we're lucky.
EUV is expected to scale to 1 nm at a minimum. It wouldn't surprise me if high-end consumer desktops end up with 64 cores and servers with 2 x 4 x 64 cores for a total of 512 cores per server before the ride finally ends.
People have known that we were reaching the limits of Moore's law. It has also been, in my experience, the common sentiment that the rapid progress of fab technology has been killing the potential for architectural innovation.
I did look at price trends on PCPartPicker (https://pcpartpicker.com/trends/), but didn't really notice clear upward trends for CPU prices. I saw an uptick in GPU prices that I imagine was due to the recent crypto craze. Memory definitely looked like prices were going up across the board. Maybe these windows are too narrow to really identify the kinds of trends you are proposing. I would love to hear more on this topic from someone who knows more about economics and the industry than I do.
Edit: autocorrect apparently doesn’t know about gpus
All guns are pointed at _memory_
Memory is an industry with almost no competition, a cash cow unseen in history, comparable only to oil. The SEL empire is built not on top of Galaxy Notes, but on a pile of memory chips.
The easiest way to get an order-of-magnitude improvement right away is to put more memory on die, closer to the execution units, and eliminate the I/O bottleneck, but no memory company will sell you the secret sauce.
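Rough roofline-style arithmetic of why on-die memory is the easy order of magnitude (all three numbers below are illustrative assumptions, not real specs):

    # Roofline sketch: a streaming kernel that touches 8 bytes per FLOP.
    # All figures are illustrative assumptions, not measured specs.
    PEAK_FLOPS = 1e12      # assumed 1 TFLOP/s of execution units
    DRAM_BW    = 50e9      # assumed 50 GB/s off-chip memory bus
    ONDIE_BW   = 1e12      # assumed 1 TB/s if the memory sat on die

    bytes_per_flop = 8
    for name, bw in [("off-chip DRAM", DRAM_BW), ("on-die memory", ONDIE_BW)]:
        achievable = min(PEAK_FLOPS, bw / bytes_per_flop)
        print(f"{name}: {achievable / 1e9:.0f} GFLOP/s achievable")

Same execution units, roughly 20x more delivered performance, purely from feeding them.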
Not only is memory made on proprietary equipment, but decades of research were done entirely behind the closed doors of the Hynix/SEL/Micron triopoly hydra, unlike in the wider semi community, where even Intel's process leaks out a bit in their research papers.
SEL makes a lot of money not only selling you the well-known rectangular pieces, but also by effectively forcing all top-tier players to buy their fancy interface IP if they want to jump on the bandwagon of the next DDR generation earlier than others: https://www.design-reuse.com/samsung/ddr-phy-c-342/ . This makes them want to keep the memory chip a separate piece even more.
Many companies have tried to break the cabal, or to work around it, but with no results. Even Apple's only way around this was to put a whopping 13 megs of SRAM on die.
Changing the classical von Neumann-style CPU for a GPU or the trendy neural-net streaming processor changes little when it comes to _hardware getting progressively worse_ at running synchronous algorithms because of memory starvation.
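You can even see the starvation from userland. A crude NumPy sketch (random indexing here is just a stand-in for a dependent, pointer-chasing access pattern, and the array is sized to blow past every cache level):

    import time
    import numpy as np

    N = 1 << 25                        # 32M int64s = 256 MB, far past any cache
    data = np.arange(N, dtype=np.int64)
    seq = np.arange(N)                 # walk the array in order
    rnd = np.random.permutation(N)     # same work, but nearly every access misses

    for name, idx in [("sequential", seq), ("random", rnd)]:
        t0 = time.perf_counter()
        total = data[idx].sum()        # gather, then reduce
        print(f"{name:10s} sum in {time.perf_counter() - t0:.2f}s")

Identical instruction count, wildly different wall time; the gap is all memory.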
You see, the first-gen Google TPU is rumored to have had the severest memory starvation problem, as do embedded GPUs without the steroid-pumped memory buses of gaming-grade hardware.
When the PS3 came out, its outstanding results on typical PC benchmark tasks were wrongly attributed to its 8 DSP-like cores, even though those were not used in any way. It was all down to the PS3 reverting to skinnier memory that is friendlier to synchronous operation. The amazing SPU performance was thanks to that too: DSP-style loads benefited enormously from the nearly synchronous memory behaviour.
Probably didn't work out for the reasons you mention.
Next question: what prevents Intel or AMD from combining a logic process with a DRAM process, or Samsung and others from combining logic into their DRAM chips?
Integrating CMOS logic and DRAM might be impossible to do at a price/speed better than using separate chips. Combining two processes increases the price, CPU/GPU makers don't have the latest DRAM knowledge, and the reverse is also true, so a partnership is required.
Then there are technological problems: the two processes have different yields, and CPUs/GPUs operate at high temperatures, which DRAM tolerates poorly (hot cells leak faster and need more frequent refresh), so DRAM technology needs to adjust, or there needs to be a halfway solution.
It's possible that at some point in the future a new technology called STT-MRAM (STT = spin-transfer torque) could replace low-density DRAM and SRAM, and it could be integrated with logic because it can use existing CMOS manufacturing techniques and processes. It will take time.
If Intel wants to add more cache, they simply paste the template again. What's limiting on-die caches is the competition for space: chip yields sink roughly in proportion to die size.
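The standard first-order way to put numbers on that is a Poisson yield model, Y = exp(-D*A); for small D*A it decays roughly linearly, which is where the rule of thumb comes from. D below is an illustrative guess, since real defect densities are closely guarded:

    import math

    # Poisson yield model: Y = exp(-D * A), with defect density D in
    # defects/cm^2 and die area A in cm^2. D is an illustrative guess.
    D = 0.2

    for area_cm2 in (1.0, 2.0, 4.0):   # e.g. doubling die size for more cache
        y = math.exp(-D * area_cm2)
        print(f"{area_cm2:.0f} cm^2 die -> {y:.0%} yield")

Every doubling of die area for extra cache compounds the yield loss, which is why the competition for space is so brutal.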
> The REX Neo architecture gains its performance and efficiency improvements with a reexamining of the on chip memory system, but retains general programmability with breakthrough software tools.
https://insidehpc.com/2017/02/rex-neo-energy-efficient-new-p... (check out the linked video)
It did not seem to be an issue in the paper on the first generation.
An in-depth look at Google's first Tensor Processing Unit (TPU ...
Even with monstrous HBM2 memory, they still have it.
It is probably hard to predict which matrix set to prefetch when you deal with a neural net, so you get cache misses there too.
The other mammoth problem, however, is scaling the deep-trench capacitors.
However, if you have a great idea for logic and DRAM on the same die, you are unlikely to be able to build it economically, since you can't get the DRAM IP.
I'd like to see this progress and get more data on the difference in power required using a CPU and traditional algorithms versus a TPU and new algorithms like the ones Jeff outlined.
The Case for Learned Index Structures
As far as grand new ideas: the Fleet architecture by Ivan Sutherland.
As far as master-of-the-obvious ideas: make a hybrid unit that contains flash, RAM, and an FPGA-based processor. Persistent by default. Connect a lot of them and get rid of disks, caches, and memory bottlenecks. Do something similar to what XMOS does for peripherals (simulate them) to simplify hardware even further.
You could even combine the two ideas above. Make a computer where programming isn't just about instructions, but about re-configuring hardware and information routing within the system.
Fuse an i7 processor with 32 GB of RAM. This should be more than sufficient for all normal consumer needs at this time.
You could still allow for additional memory, but it would function as level-2 memory that is slower to access.
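In toy form, that's just a software-managed cache in front of the slow tier (the class name and the LRU promotion policy here are made up for illustration):

    from collections import OrderedDict

    # Toy model: a small fast "fused" tier backed by a larger, slower
    # expansion tier. Hot pages get promoted on use; cold ones are
    # LRU-evicted back to the slow tier.
    class TieredMemory:
        def __init__(self, fast_capacity: int):
            self.fast = OrderedDict()   # fused on-package RAM (fast, small)
            self.slow = {}              # expansion DIMMs (slow, large)
            self.cap = fast_capacity

        def read(self, page: int):
            if page in self.fast:
                self.fast.move_to_end(page)          # keep hot pages hot
                return self.fast[page]
            value = self.slow.pop(page)              # slow-tier access
            self.fast[page] = value                  # promote on use
            if len(self.fast) > self.cap:
                cold, v = self.fast.popitem(last=False)
                self.slow[cold] = v                  # demote the coldest page
            return value

    mem = TieredMemory(fast_capacity=2)
    mem.slow = {p: f"page{p}" for p in range(5)}     # preload the big tier
    for p in [0, 1, 0, 2, 0]:                        # page 0 stays hot
        mem.read(p)
    print(list(mem.fast))                            # -> [2, 0]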
When HBM first came out, I mused at how much faster/more efficient a CPU+HBM chip could be. I wonder what's stopping it from happening.
Thinking aloud here: is the lull in chip design due to the shift in focus to software over hardware after the boom of internet companies circa 2000? Just compare entry-level EE vs. CS job openings or pay, for example. Or maybe it's just that, like markets, R&D operates in cycles. Regardless, it would be interesting to see a comparison of dollars invested in R&D on each side over the last n years.