Ask HN: What'd be possible with 1000x faster CPUs?
54 points by xept on Sept 21, 2022 | 94 comments
Imagine an unlikely scientific breakthrough made general-purpose CPUs many orders of magnitude faster, and they became widely available, probably alongside petabyte-scale RAM modules and an appropriately fast memory bus. Besides making bloatware possible on a previously unimaginable scale, what other interesting, maybe revolutionary applications, impossible or at least impractical today, would crop up?



Video engineer here. Many seemingly network-restricted tasks could be unlocked by faster CPUs doing advanced compression and decompression.

1. Video Calls

In video calls, encoding and decoding are actually a significant cost, not just networking. Right now the peak is Zoom's 30 video streams onscreen, but with 1000x CPUs you could have hundreds of high-quality streams with advanced face detection and superscaling [1]. Advanced computer vision models could analyze each face, creating a face mesh of vectors, then send those vector changes across the wire instead of video frames. The receiving computers could then reconstruct the face for each frame. This could completely turn video calling into a CPU-restricted task.
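
For concreteness, a rough Python sketch of that keyframe-plus-landmark-deltas idea; detect_landmarks and render_face are hypothetical stand-ins for a real face-mesh model and renderer (think MediaPipe-style landmark arrays), not any particular library:

    def encode_stream(frames, detect_landmarks):
        """Send one real keyframe, then only per-frame landmark deltas."""
        prev = None
        for frame in frames:
            pts = detect_landmarks(frame)        # (N, 3) NumPy-style array of mesh points
            if prev is None:
                yield ("key", frame, pts)        # full reference image, sent once
            else:
                yield ("delta", pts - prev)      # a few KB instead of a video frame
            prev = pts

    def decode_stream(packets, render_face):
        """Rebuild frames from the reference image plus accumulated deltas."""
        ref_frame, pts = None, None
        for packet in packets:
            if packet[0] == "key":
                _, ref_frame, pts = packet
                yield ref_frame
            else:
                pts = pts + packet[1]
                yield render_face(ref_frame, pts)  # warp the reference onto the new mesh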

2. Incredibly Realistic and Vast Virtual Worlds

Imagine the most advanced movie-quality CGI being generated for each frame: something like the new Lion King, or Avatar-like worlds, being created before you through your VR headset. With extremely advanced eye tracking and graphics, VR would hit that next level of realism. AR and VR use cases could explode with incredibly light headsets.

To be imaginative, you could have everything from huge concerts to regular meetings take place in the real world but be scanned and sent to VR participants in real time. The entire space, including the room and whiteboard or live audience, could be rendered in realtime for all VR participants.

[1] https://developer.nvidia.com/maxine-getting-started


> In video calls, encoding and decoding are actually a significant cost, not just networking. Right now the peak is Zoom's 30 video streams onscreen, but with 1000x CPUs you could have hundreds of high-quality streams with advanced face detection and superscaling [1]. Advanced computer vision models could analyze each face, creating a face mesh of vectors, then send those vector changes across the wire instead of video frames. The receiving computers could then reconstruct the face for each frame. This could completely turn video calling into a CPU-restricted task.

Interesting. How do you see this as different from the deep-learning-based video coding demonstrated recently? [1]

[1] https://dl.acm.org/doi/10.1145/3368405


Realistically, AI network training at the level being done by corporations with big server farms becomes accessible to solo devs and hobbyists (let's count GPUs as general purpose). So if you want your own network for Stable Diffusion or Leela Chess, you can train it on your own PC. I think that is the most interesting obvious consequence.

Also, large scale data hoarding becomes far more affordable (I assume the petabyte ram modules also mean exabyte disk drives). So you can be your own Internet Archive, which is great. Alternatively, you can be your own NSA or Google/Facebook in terms of tracking everyone, which is less great.


I think when that hardware is attainable and the tech democratized, things are going to get very bizarre very quickly. I'm hitting a wall in my imagination of what a society where this is common even looks like, and it scares me.


Just like any of your great-grandparents would, at first, be absolutely scared of the actual you and the stuff you do on a daily basis.


I imagine limitless tailor-made entertainment and control on a per-user basis.

"Play me Frank Zappa's new album featuring Kanye West."


> Also, large scale data hoarding becomes far more affordable (I assume the petabyte ram modules also mean exabyte disk drives).

It will also mean data in general will be bigger and scale accordingly.


Imagine just saving every web page your computer ever browses, forever.


Atlassian products would be twice as fast.


40 thousand years of evolution and we've barely even tapped the vastness of Atlassian's functionality potential.


Instead of Electron, we'd be bundling an entire OS with our chat apps.


That would be nice, because many OSes are much smaller than Electron.


Electron basically IS an entire OS, since Chromium has APIs for doing just about anything, including accessing the filesystem, USB devices, and 500 other things.


If _accessing_ the filesystem counts toward being an OS, and not _implementing_ the filesystem, then I guess Qt and the stdlib of every lang is also "kind of an OS"


That's splitting hairs. Paravirtualised IO on a virtual machine doesn't make the guest OS running inside it any less of an OS just because it has a simpler interface to the outside world than a SATA/SAS/NVMe controller.


Oh, we are not far away from that. Most devs consider it completely fine to run a Docker instance per project.


many apps already have "wget docker image" as the first step


Some applications depend on approximately solving optimization problems that are hard even at small sizes. The poster child here is combinatorial optimization (more or less equivalently, NP-complete problems); concrete examples are SMT solvers and their applications to software verification [1]. Non-convex problems are sometimes similarly bad.

Non-smooth and badly conditioned optimization problems scale much better with size, but getting high-precision solutions is hard. These are important for the simulations mentioned elsewhere, and not just for architecture and games, but also for automating design, inspections, etc. [2]

[1] https://ocamlpro.github.io/verification_for_dummies/

[2] https://www.youtube.com/watch?v=1ALvgx-smFI&t=14s
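
For anyone who hasn't touched an SMT solver, here's a toy example using Z3's Python bindings, just to show the kind of constraint search whose cost blows up with problem size (not tied to the verification tutorial above):

    # pip install z3-solver
    from z3 import Ints, Solver, sat

    x, y = Ints("x y")
    s = Solver()
    # Find a Pythagorean pair: tiny here, but this style of search
    # explodes combinatorially as the constraints and variables grow.
    s.add(x > 0, y > 0, x < y, x * x + y * y == 25 * 25)
    if s.check() == sat:
        print(s.model())   # e.g. [x = 7, y = 24]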


1 million Science per Minute Factorio bases.


Microsoft Teams may work without locking up my PC. Hopefully.


and Windows 12


The thing is, computing has been getting steadily faster, just not at quite the pace it was before and in a different way.

With GPUs we have proven that parallelism can be just as good or even better than speed increases in enhancing computation. And even there, speed increases have kept trickling in.

I don't think it's realistic to say that more speed advances are unlikely. We have already been through many different paradigm shifts in computing, from mechanical to nanoscale. There are new paradigms coming up such as memristors and optical computing.

It seems like 1000x will make Stable Diffusion-style video generation feasible.

We will be able to use larger, currently slow AI models in realtime for things like streaming compression or games.

Real global illumination in graphics could become standard.

Much more realistic virtual reality. For example, imagine a realistic forest stream that your avatar is wading through, with realtime accurate simulation of the water, and complex models for animal cognition of the birds and squirrels around you.

I think with this type of speed increase we will see fairly general purpose AI, since it will allow average programmers to easily and inexpensively experiment with combining many, many different AI models together to handle broader sets of tasks and eventually find better paradigms.

It also could allow for emphasis on iteration in AI, and that could move the focus away from parallel-specific types of computation back to more programmer-friendly imperative styles, for example if combined with many smaller neural networks to enable program synthesis, testing and refinement in real time.

Here's a weird one: imagine something like emojis in VR, but in 3d, animated, and customized on the fly for the context of what you are discussing, automatically based on an AI you have given permission to.

Or, hook the AI directly into your neocortex. Hook it into several people's neocortices and then train an animated AI 3d scene generation system to respond to their collective thoughts and visualizations. You could make serialized communication almost obsolete.


However, 1000x is really not very much. With a 1000x uplift, we could certainly get better weather predictions, but not necessarily paradigm-altering improvement. In a real sense, we already have a 1000x speedup: it's what you get in a contemporary "supercomputer", whatever that is in a given market at a given point in history.

Let's say we had perfect 1000x improvement in compute, storage, and IO such that everything remains balanced. A fluid-dynamics or atmospheric simulation can only increase resolution by about 10x if a 3D volumetric grid is refined uniformly, or only about 5x if we spread it uniformly over 4D to also improve temporal resolution. Or maybe you decide to increase the 2D geographic reach of a model by 30x and leave the height and temporal resolution alone. These growth factors are not life-changing unless you happen to be close to a non-linear boundary where you cross a threshold from impractical to practical.
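
The arithmetic behind those numbers is just the d-th root of the compute multiplier when the extra budget is spread evenly over d dimensions:

    # Resolution gain when a compute multiplier is spread evenly over d dimensions.
    def resolution_gain(compute_multiplier, dims):
        return compute_multiplier ** (1.0 / dims)

    for dims, label in [(2, "2D geographic reach"),
                        (3, "3D volumetric grid"),
                        (4, "4D space + time")]:
        print(f"{label}: ~{resolution_gain(1000, dims):.1f}x")
    # 2D geographic reach: ~31.6x
    # 3D volumetric grid: ~10.0x
    # 4D space + time: ~5.6x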

I'm not sure we can say how much a video game would improve. There are so many "dimensions" that are currently limited and it's hard to say where that extra resource budget should go. Maybe you currently can simulate a dozen interesting NPCs and now you could have a crowd of 10,000 of them. But you still couldn't handle a full stadium full of these interesting behaviors without another 10x of resources...


I work on an open source multiplayer game that's limited by single thread CPU speed so I can give a perspective of what would improve for us at least.

The fastest thing to change is that we'd increase player limits per server; per-player CPU costs are significant, and we could bring the player limits to maybe 500 before network speeds start being a consideration. Certain AI improvements that are currently not viable, like goal-oriented AI design and pathfinding improvements, could be added that would make new kinds of gameplay possible. Hell, with even just 10x I would be very tempted to try unifying our atmospheric and chemistry simulations so they use the same data structures, thus allowing chemical reactions between gases that aren't basically masses of nonstandard performance hacks on the back end.

In short, though, even minor performance improvements would vastly change what we could accomplish. 1000x is extreme, and you would see very different games making use of techniques that today are mostly relegated to games built around them as a gimmick, with sacrifices made elsewhere.


>With GPUs we have proven that parallelism can be just as good or even better than speed increases in enhancing computation.

Not really, no. It's just that certain classes of problems can be very readily parallelized and it's relatively easy to figure out how to do something 1000x in parallel compared to figuring out how to achieve a 1000x single thread speedup.

>Much more realistic virtual reality. For example, imagine a realistic forest stream that your avatar is wading through, with realtime accurate simulation of the water, and complex models for animal cognition of the birds and squirrels around you.

I'm not sure 1000x would do much more than scratch the surface of that, especially if you're already tying a lot of it up with higher fidelity rendering.



I wish CPUs got 10x slower for a while, to allow some room for software product optimisation.


Exactly, 1000x faster CPUs would result in new software filling in the extra speed in no time at all


First thing that comes to mind is using your mobile device as your main workstation would become a lot more realistic.


In a lot of respects, the limiting factor in using mobiles as workstations is the software and OS. You can add a Bluetooth keyboard and mouse and then cast it to a screen, but all you will get is a bigger phone, not a workstation. Mobile CPUs are not that bad nowadays.


My main workstation up until 2005 or so probably had less computing power than the smartphone you use today.


8 cores, 2.8 GHz, 11 GB RAM, 256 GB storage, liquid cooling, and camera zoom at the level of a toy microscope. This is more powerful than some gaming PCs from just a while ago.

It runs fine, but any less gets laggy, so I suspect apps like Facebook and TikTok are just going to continue to swallow up any more power.


Infinite arbitrary precision real time Mandelbrot zoom generation :-)


Can't you already do this with a good shader program? Well, a Google search finds one that claims 'almost infinite'.


Only if you roll your own arbitrary-precision type on the GPU, which is much harder given the constraints.


The best thing I know of is that you could emulate 256 bits with 4x 64-bit floats (doubles) and then use the derivative of the Mandelbrot iteration to approximate the fractal around interesting points.
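
A minimal sketch of that perturbation idea in Python, using mpmath for the single high-precision reference orbit instead of hand-rolled 4x64-bit arithmetic; real renderers add glitch detection and series approximation on top:

    from mpmath import mp, mpc   # arbitrary precision for the one reference orbit

    def reference_orbit(c_ref, max_iter, precision_bits=256):
        """High-precision orbit of a single reference point, stored as doubles."""
        mp.prec = precision_bits
        z, orbit = mpc(0), []
        for _ in range(max_iter):
            orbit.append(complex(z))   # a double-precision copy is enough downstream
            z = z * z + c_ref
            if abs(z) > 2:
                break
        return orbit

    def perturbed_escape(dc, orbit, max_iter):
        """Iterate only the offset dz in plain doubles: dz' = 2*Z*dz + dz^2 + dc."""
        dz = 0j
        for n, Z in enumerate(orbit[:max_iter]):
            if abs(Z + dz) > 2:
                return n               # this nearby pixel escaped after n iterations
            dz = 2 * Z * dz + dz * dz + dc
        return max_iter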


>'almost infinite'

I mean one of the fundamental attributes of infinity is that you can never be 'almost there'.


Likely we would see 8192-bit keys for SSH.


It would be nice for the architecture field. We deal with lots of crappy, unoptimized software that's 20-30 years old. So if you like nice buildings and better energy performance (which requires simulations), give us faster CPUs.

Imagine you're working on an airport: thousands of sheets, all of them PDFs, and hundreds or thousands of people flipping through PDFs and waiting 2-3+ seconds for the screen to refresh. CPUs, baby, we need CPUs.


Is there any way I can contact you? I have an aspirational semi-related project.


Real-time ray tracing was the goal in the old days. Are we there yet at adequate quality?


No, we're not there yet. Ray tracing in games is still merely augmenting traditional rasterization, and requires heavy post-processing to denoise because we cannot yet run with enough rays per pixel to get a stable, accurate render.


I feel like we are - I can run Minecraft RTX at 4K with an acceptable framerate using DLSS 2.0 on a 3090. Minecraft RTX uses pure raytracing (no rasterization). It also isn't using A-SVGF or ReSTIR, so there are two pretty big improvements that could still be made.

Minecraft RTX does suffer really badly with ghosting when you destroy a light source, but my intuition says that A-SVGF would fix that entirely.

That being said, some of the newest techniques, like ReSTIR PT (a generalized form), have only been published for a couple of months, so current games don't have that yet. But in 3-6 months I would start to expect some games to go with a 100% RT approach.


Still orders of magnitude away from full ray tracing; it's used only as a part of traditional rendering, with a ton of hacks on top.

Actually, there has always been a lingering suspicion that brute-force simulation might get sidestepped by some other clever technique long before it's achieved, to get both photorealism and ease of creation. ML style transfer could potentially become such a technique (or not).


Unlikely to be done on CPU


Much more complicated redstone CPUs in Minecraft.


One thing I'd like to see would be smart traffic lights. For example, as soon as a person finishes crossing the road, when there is no one else waiting, the light switches back to green immediately.


This could totally be done with existing CV tech; think pedestrian detection in self-driving cars.


Assuming that a CPU at today's speeds would require vastly less power, we would have very powerful, very efficient mobile devices such as smartwatches.

Probably using AI a lot more, on-device for every single camera.


I’d just not discover my accidentally quadratic code and ship it. It would save me a lot of debugging time.
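
For anyone wondering, "accidentally quadratic" usually looks something like this toy Python example:

    # Accidentally quadratic: `x not in seen` rescans the whole list every time.
    def dedupe_slow(items):
        seen = []
        for x in items:
            if x not in seen:          # O(len(seen)) per check -> O(n^2) overall
                seen.append(x)
        return seen

    # Same result in linear time: set membership is O(1) on average.
    def dedupe_fast(items):
        seen, out = set(), []
        for x in items:
            if x not in seen:
                seen.add(x)
                out.append(x)
        return out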


Your question is missing the factor of power: do we get 1000x at current power usage, or 1000x at 1000x the power?

Also, 1000x parallelism or 1000x single-core?


Be able to run Emacs as fast as I can run Vim?


Consider that you can easily emulate Vim inside Emacs but NOT the inverse, and you'll understand what those extra cycles do.


For sure Emacs is a great operating system. If only it came with a decent text editor!


OMG, moment, moment. Let me get my popcorn and I'll be with you guys right away.


I'll switch to Emacs the day they implement an Acme or Sam emulator; until then, ed.


Cheaper employees. With faster CPUs, they won't need to understand leetcode-level optimization, i.e. they won't need expensive or sophisticated training. Just find someone with a pulse and stick them in front of the computer. Less-than-ideal big-Os won't be an issue with this kind of speed.


Simulation? Like fluid dynamics. I heard that was CPU intensive.


Incredible biodiversity monitoring: everywhere, all the time.


More bloat


I guess it depends on what you mean by faster.

Higher IPC, higher clock, more cores, more cache, more cache levels, more memory bandwidth, faster memory access, faster decode, etc.

One idea I imagine would be possible with a 1000x speedup would be real-time software-defined radio capture, analysis, and injection.


React Native could now handle 500,000,000 third-party jankfest lines rather than just 100,000,000.


If I dare to be optimistic for once, cure cancer via simulated protein folding.


Current encryption standards would become obsolete overnight; internet/network connectivity would become insecure.

This would lead to complete chaos until we update our security standards.


Less time spent on optimization in software development. That might sound horrible at first, but it also means that fewer resources need to be spent programming something.


Single-shard MMO with no instancing requirements.


As per Ray Kurzweil: https://www.kurzweilai.net/images/chart03.jpg

With 1000x CPU computing, each computer will have computing power equivalent to a human brain.

So a brain-computer interface or Jarvis-like AI may become possible.


Weather forecasts would be as good as they are now, perhaps 1-2 days further ahead.


A Ruby on Rails renaissance.


Windows updates in the background would take 3 hours instead of 4.

The average Node.js manifest file would contain 12,000x more dependencies.

Also, we would see a ton more AI being done on the local CPU: anything from genuine OS improvements to super-realistic cat filters on Teams/Zoom.

And finally, I think people would need to figure out storage and network bottlenecks, because there is only so much you can do with compute before you end up stalled waiting for more data.


We have always been memory-bound, in one way or another, even today.

The difference in performance between an application using RAM with random access patterns and an application using RAM sequentially is far more than you expect it to be if you haven't actually measured it: an order of magnitude or more for sequential stuff over random access. Having your data already in the L1 cache before you need it is worth the effort it takes to make that happen.
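
A rough way to see this for yourself with NumPy; exact numbers vary by machine, and the gathered copy also pays an allocation cost on top of the cache misses:

    import time
    import numpy as np

    n = 50_000_000
    data = np.ones(n, dtype=np.float64)
    random_order = np.random.permutation(n)

    t0 = time.perf_counter()
    _ = data.sum()                     # streams through RAM in order
    t1 = time.perf_counter()
    _ = data[random_order].sum()       # gathers the same elements in a random order
    t2 = time.perf_counter()

    print(f"sequential: {t1 - t0:.3f}s  random: {t2 - t1:.3f}s")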


Indeed, but in the case of your average application it is not only lack of will/expertise to optimize, but also simply that the program domain has a much more random memory allocation pattern. Most programs are not operating in a single hot loop on terabytes of data.


> The average Node.js manifest file would contain 12,000x more dependencies.

This is absolutely true


> Windows updates in the background would take 3 hours instead of 4.

And macOS updates will still find a way to take your machine offline for an hour.


How many decimals of pi could be generated within X time using such a machine?
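
One way to get a baseline today: time mpmath computing pi at increasing precision and extrapolate; dedicated programs used for digit records are far faster than this sketch.

    import time
    from mpmath import mp

    for digits in (10_000, 100_000, 1_000_000):
        mp.dps = digits
        start = time.perf_counter()
        _ = +mp.pi                     # force evaluation at the current precision
        print(f"{digits:>9} digits: {time.perf_counter() - start:.2f}s")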


1000x better porno


Good code.


Whole brain simulation, AGI.


A truck with 1000000000hp still won't beat a Ferrari on a race track; nothing guarantees faster hardware would solve any of our AI problems.


A 1000000000hp-equivalent electric truck, generating that much torque, would probably lift off and fly to the Moon, or dig itself so deep it would melt in lava. In the meantime, a Cybertruck with 3 motors (or 4) may soon (2023?) challenge a Ferrari.


Training time is a massive constraint on advancement of the science, so at the very least the field would progress much faster and be much more accessible to researchers.


Faster processing alone won't make training 1000x faster; the bottleneck is more on the memory size/bandwidth side.


I feel people are overlooking the OP's mention of parallel improvements in storage and speed of access. While there are physical limits to this, I feel like capabilities will continue to expand not so much in terms of pure speed as in better automation of parallelization and resource allocation.


The thread is about the whole stack 1000x'ing. Not just processing speed.


Still not even close to a brain though.


I think AGI requires different topological/conceptual paradigms rather than pure speed/processing capacity. But the latter is necessary to experiment and create recognizable results.

A lot of the current excitement around AI image construction and SD's availability is the intuitive sense that these tools have succeeded in emulating some key aspects of our visual cortex - given a set of object classifiers they can create imaginary views that are recognizable to us. It's sort of an illusion - Stable Diffusion has no aesthetic or experiential preferences of its own and so its activity is reflexive rather than conscious, and we don't understand if or how consciousness is emergent from complex reflexivity.

But the key point is that it's doing such a good job at this 'narrow' task of visual synthesis, and other models are doing such a good job at the 'narrow' tasks of textual or audible synthesis, that it's competitive with a human in an idiot-savant kind of way. And we know from our own experience that skill and learning are protean - we may disagree on the value of different types of learning, but we don't question the similarity of the underlying mechanism. Thus I might think that becoming an expert on, say, the fictional universe of Star Wars is a waste of time, but the processes of knowledge acquisition, recall, and synthesis are not fundamentally different from those used to learn history or engineering ('experimentation' can exist in terms of consensus establishment in a fandom about whether an innovation is canonical or parodic).

So if we can train models with a billion semantically-tagged media objects and have them generate new media objects that meaningfully reflect the tags we supply, it means we have a decent general environmental-feature detection, recall, and resynthesis tool. Being able to take an existing model and tune it on workstations instead of needing a whole datacenter substantially widens the field of possibilities. So what happens if we connect it to sensors and actuators and train our model to navigate a dynamic landscape, which includes 'internal' signals that can't be directly responded to? Consider a virtual or lab environment which is complex and dynamic, and includes energy units (batteries). Our model has internal batteries and feedback mechanisms, but their state can only be altered through external activity and their signals are heavily weighted. Sensory subsystems attached to the model have some precomputed models of their own.

My idea is that the brain is a 'system of systems' and that consciousness emerges from the instrumentation of the time cost of model tuning vs the rate of environmental variation.


depending on the brain


We can’t do those slowly though


Java might run at a decent speed... Might, but probably won't (jk, sorry, I couldn't help myself...) [edit: Grammarly decided to remove some text when fixing spelling...]


Java runs at 90% the speed of C for most common benchmarks.

It uses 50x the RAM to do so. But you're dead wrong to think Java is slow.

The only reason physics game engines are written in C++ is because physics game engines are written in C++.


>Java runs at 90% the speed of C for most common benchmarks.

It's not 90%; it's more like several times slower: https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

>The only reason physics game engines are written in C++ is because physics game engines are written in C++.

They are written in C++ because of latency requirements that are nearly impossible to meet in a GCed language.


From what I know, the major C++ engines (Unity, Unreal) have GC in them. Using GC does not automatically mean that latency is out the window.


You missed the jk (joking part), didn't you? The only Java apps I use are JDownloader and IKVM apps for servers... and well, they are slow...


Have you ever used any service from AWS? Then you were using Java.



