Date: 2017-06-09
PCIe signals are generated by transceivers -- devices within chips that specialize in signal conditioning, e.g. echo cancelling, emphasis/de-emphasis, and dynamic impedance matching. These transceivers, and the analog and digital techniques they implement, get better with time. This is easily measurable by looking at the bit error rate of the data or at eye diagrams (see slide 15). As data rates increase, things like drive strengths, impedance mismatches, and a number of other properties of the silicon will "close the eye", meaning the transmitted "0"s and "1"s are no longer different enough for a receiver to distinguish them often enough to successfully decode a packet. (PCIe is packet based; it's surprisingly similar to Ethernet.) But essentially, as our understanding of semiconductor devices and the processes for manufacturing them improve, we're able to "open the eye" more, at which point the industry decides to increase data rates.
It helps that this isn't happening just for PCIe; there are lots of breakthroughs that benefit (and may have originated with) other high speed links.
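To put a rough number on "closing the eye": a minimal sketch, assuming purely Gaussian noise and a simple two-level signal, which is a textbook simplification and not how a real PCIe receiver is characterized. Under that assumption the bit error rate follows from the Q factor (eye opening relative to noise) as 0.5 * erfc(Q / sqrt(2)). The function name is made up for illustration.

```python
import math


def bit_error_rate(q_factor: float) -> float:
    """Estimate BER for a two-level signal with Gaussian noise.

    q_factor is the separation between the "0" and "1" levels divided by
    the combined noise standard deviation (an idealized eye opening).
    """
    return 0.5 * math.erfc(q_factor / math.sqrt(2))


if __name__ == "__main__":
    # A wider eye (larger Q) pushes the error rate down very quickly.
    for q in (3.0, 5.0, 7.03):
        print(f"Q = {q:4.2f} -> BER ~ {bit_error_rate(q):.2e}")
```

A Q of about 7 lands near the 1e-12 error rate that is commonly quoted as the target for links like this, which is why small losses in eye opening matter so much.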
Optical PCIe would be hugely handicapped by lack of a standard optical PCB construction method. You'd have to print waveguides onto the PCB. And then it stops working if you get dust in the socket.
Multimedia is the driving force behind increased data usage, and I think we'll continue to need more throughput until we no longer get any benefits from higher resolutions (aka when we have substantially more pixels than rods and cones in our eyes). At the moment a phone with a 4K display saturates your eyes at any distance greater than 2 feet from your face. I think a 16x PCIe 4.0 link will likely provide more than enough bandwidth to generate fully immersive VR experiences, so the question then becomes... why and when will we need optical PCIe 5.0 to double the data rate of PCIe 4.0...
There are advantages to optics, including that light moves faster than electrons (important for HPC, where the figure of merit is latency in µs between nodes, etc.) and typically has higher fidelity. But the size of these structures is orders of magnitude larger than conventional semiconductors.
Both for power consumption and EMC reasons you want to minimize the maximum slew rate of the signal on the link (and thus the voltage), while at the same time you need the voltage to stay large enough that the receiver (which is for all purposes an analog design) can be implemented in widespread digital CMOS processes.
On the other hand, both conventional parallel PCI and conventional (<=2.0) USB are limited by physical factors, in both cases the physical length of the link/bus and the propagation velocity of the wires used (i.e. the speed of light divided by some small-ish constant). In both standards this limit was intentionally introduced by Intel as a cost-reducing measure (in PCI's case it means that the motherboard does not have to contain about 60 or so discrete resistors; the real impact on the cost of implementing USB is somewhat questionable).
The electrical level isn't too magic, but there are still a lot of things that have to be tuned (the chapter on tuning in the Mindshare book on PCIe is about 100pp). For a commodity consumer bus, the relative reliability and speed of PCIe is kind of a miracle.
I'd guess the speed improvements come from much more precise timing and voltages, so you can get better guarantees about interference. If the voltage is +/-10%, the field will be bigger and interfere more. If the timings are +/-10%, the field will still be there for the next signal when you don't want it.
Anyway, I'm sure there are much more knowledgeable people who can give you much better insight, but I think that's the physics 101 kinda answer.
I'm not sure I understand what you're saying; it almost sounds like you're referring to impedance.
What goes at nearly the speed of light is the "message" that electrons should move a certain way. If you want an analogy, if you blow into a flute, even though the air is moving slowly, the sound travels fast.
Imagine people going to the shopping mall. Once in a while someone goes in and an hour or two later they come out the other door 10m away. But they travelled a lot more than 10m. You just didn't notice from outside.
My memory may be off, but as I recall the Fermi speed in copper is only around 0.5% of c, rather far off. What does propagate at speeds on the order of c is the EM field, which is what most people are actually talking about when they think of electricity moving down a wire. But it's still only around 60% of c in a copper transmission line.
https://en.wikipedia.org/wiki/Signal_velocity
Signals propagate over the transmission line as a wave by alternately transferring energy between the electric and magnetic fields, i.e. between L & C.
Which is where the delay comes from...
It is fairly trivial to derive the wave propagation equation for the above model (assuming, of course, that leakage conductance is zero). When considering lossy transmission lines, though, things get quite complicated, but you can (almost) always get away with numerical techniques...
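For the lossless L & C model described above, the standard results are a propagation velocity of 1/sqrt(L*C) and a characteristic impedance of sqrt(L/C), with L and C taken per unit length. A small sketch of that, where the 250 nH/m and 100 pF/m figures are illustrative values I picked to give a 50 ohm line, not numbers from any PCIe document:

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s, in vacuum


def lossless_line(inductance_per_m: float, capacitance_per_m: float):
    """Return (velocity in m/s, characteristic impedance in ohms).

    Valid only for the ideal lossless line: series resistance and
    leakage conductance are both assumed to be zero.
    """
    velocity = 1.0 / math.sqrt(inductance_per_m * capacitance_per_m)
    impedance = math.sqrt(inductance_per_m / capacitance_per_m)
    return velocity, impedance


def delay_ns(length_m: float, velocity: float) -> float:
    """One-way propagation delay over a trace of the given length."""
    return length_m / velocity * 1e9


if __name__ == "__main__":
    # Illustrative per-unit-length values (assumed, not measured).
    v, z0 = lossless_line(250e-9, 100e-12)
    print(f"velocity: {v:.3e} m/s ({v / SPEED_OF_LIGHT:.0%} of c)")
    print(f"impedance: {z0:.1f} ohm")
    print(f"delay over 25 cm: {delay_ns(0.25, v):.2f} ns")
```

With those values the wave moves at roughly two thirds of c, in the same ballpark as the figure quoted earlier in the thread.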
Somewhat related, I've noticed comm tech tends to follow a fairly consistent evolution: new enabling material/process, improvements to interconnect, algorithm optimizations, repeat.
The problem is how to take care of the electromagnetic noise. That's why PCIe uses differential-pair wiring (the positive and negative wires run next to each other on the board) instead of the single-ended wiring (a single wire with a common ground as the negative) used in the original PCI and PCI-X, so that if some noise hits the first wire, the same noise hits the second wire of the pair.
Improvements in coding and decoding the signals with error correction also help recover from any errors caused by the electromagnetic noise.
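A toy illustration of why the differential pair helps, assuming the idealized case where exactly the same noise couples onto both wires (real boards only approximate this). The voltage swing, noise amplitude, and function names are all made up for the sketch:

```python
import random


def receive_differential(bits, swing=0.4, noise_amplitude=0.5, seed=0):
    """Send each bit as +level/-level on two wires; decode from the difference."""
    rng = random.Random(seed)
    decoded = []
    for bit in bits:
        level = swing / 2 if bit else -swing / 2
        noise = rng.uniform(-noise_amplitude, noise_amplitude)
        positive = level + noise   # noise hits the first wire...
        negative = -level + noise  # ...and the same noise hits the second
        decoded.append(1 if positive - negative > 0 else 0)
    return decoded


def receive_single_ended(bits, swing=0.4, noise_amplitude=0.5, seed=0):
    """Send each bit on one wire against ground; noise adds directly."""
    rng = random.Random(seed)
    decoded = []
    for bit in bits:
        level = swing / 2 if bit else -swing / 2
        noise = rng.uniform(-noise_amplitude, noise_amplitude)
        decoded.append(1 if level + noise > 0 else 0)
    return decoded


if __name__ == "__main__":
    source = random.Random(42)
    bits = [source.randint(0, 1) for _ in range(1000)]
    diff_errors = sum(a != b for a, b in zip(bits, receive_differential(bits)))
    single_errors = sum(a != b for a, b in zip(bits, receive_single_ended(bits)))
    print(f"differential errors: {diff_errors}, single-ended errors: {single_errors}")
```

The common noise cancels in the subtraction, so the differential receiver recovers every bit even though the noise is larger than the signal swing, while the single-ended one gets a large fraction wrong.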
The associated graphic is some random “Maximum PC Magazine via Getty Images” thing. You mean they have nobody that can take a photo of the inside of a PC?
Then the source of the article is not PCI-SIG itself but a TechReport article: https://techreport.com/news/32064/pcie-4-0-specification-fin... which, frankly, is much more packed with info and deets. In contrast to Engadget there's a nice info-graphic showing the evolution in bandwidth over the years _plus_ there's a table with PCI specs 1 through 5. Finally the source there is PCI-SIG itself: http://pcisig.com/ and from there you can root through the spec revisions: http://pcisig.com/specifications/review-zone – being as someone else pointed out here "PCI Express Base Specification Revision 4.0, Version 0.9" and "PCI Express Base Specification Revision 5.0, Version 0.3"
I mean I do like Engadget, I've been going there for years, but sometimes I ask myself why I do. Their tech event live-blogs are pretty damn decent I guess, but I wish their articles had more meat on them like Ars Technica or AnandTech or TechReport or …
They're probably using un-openable iMacs.
Yeah seriously. I couldn't believe this line:
"Backward compatibility means that manufacturers won't have to redesign old systems, either"
What does that even mean? That you can take a PCI-e 4.0 device and plug it into an older slot? Isn't that universally the case with computers, otherwise the bus would be called something else...
I'm unaware of any hardware vendor that puts effort into redesigning a product they're already shipping, if not to resolve a deficiency (and even then, it'd better be important to re-spin the hardware).
Last time I checked, even very high end cards are far from being bottlenecked by a 16x 3.0 bus.
Though it may be that expensive high-end VR could be the luxury home theater of tomorrow, and you put 4 GPUs in a box to get 90FPS 8K seamless VR. Faster PCIe could be nice to have in that case.
New PCIe specs will likely only affect data transfer like for storage applications, at least for the time being.
While Vulkan fully supports it, Metal and DX12 don't.
The real issue for real time AR/VR/144FPS+ is knowing what you can/can't offload, what the transfer latency is, etc. This will change based on cards, generations, CPUs, library versions, driver versions, and vendors.
It is a nightmare.
Even with SLI/XFire, where you know there are identical cards and drivers, you still see ~10-20% perf gain for 50% more resources.
From the Ars Technica review of the 1060: "GPU Boost 3.0, Fast Sync, HDR, VR Works Audio, Ansel, and preemption make a return too, as well as the ability to render multiple viewpoints in a single render-pass."
https://arstechnica.com/gadgets/2016/07/nvidia-gtx-1060-revi...
From my limited understanding, VR is difficult because of the tolerances required. For regular gaming, slight frame drops were annoying, but didn't break the experience. Thus, it was reasonable to ship a game that was able to hit 60fps 99% of the time, and just write off the remaining 1% of the time. For VR, not only do we need to hit at least 75fps, the tolerance for frame drops is much much lower (a stutter while you're watching a monitor is annoying, the same stutter in VR could make you lose your balance). To aim to hit 75fps and guarantee that you'll hit that 99.9%, 99.99%, or 99.999% of the time is where the difficulty lies. I'm sure most of the HN audience has experience with just how difficult it is to tack on an additional 9.
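To make the "additional 9" concrete, here is a back-of-the-envelope sketch using the 75fps figure from the comment above. It only counts frames per hour and says nothing about how drops are distributed in practice; the function names are mine.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def dropped_frames_per_hour(fps: float, on_time_fraction: float) -> float:
    """Expected number of late frames in one hour at a given hit rate."""
    frames_per_hour = fps * 3600
    return frames_per_hour * (1.0 - on_time_fraction)


if __name__ == "__main__":
    fps = 75
    print(f"budget per frame at {fps}fps: {frame_budget_ms(fps):.2f} ms")
    for target in (0.99, 0.999, 0.9999, 0.99999):
        drops = dropped_frames_per_hour(fps, target)
        print(f"{target:.3%} on time -> about {drops:.1f} late frames per hour")
```

At 99% that is still a few thousand late frames an hour, and each extra nine only cuts the count by a factor of ten.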
Another more well-known source: https://www.techpowerup.com/reviews/NVIDIA/GeForce_GTX_1080_... at 3840x2160 there was an average of 171% performance across games when they excluded the ones which didn't scale with SLI at all.
Linus Tech Tips has a rant about how Intel is trying to segment the market by only offering more PCI-E lanes on the most-expensive CPUs despite mobo support with their new X299 platform (they've done this in the past as well): https://www.youtube.com/watch?v=TWFzWRoVNnE
Basically, if you want the full 44 lanes, you have to get the top-spec processor for $1k ($1k for the CPU itself). The i9s above the 7900X don't even have details yet, since Intel is working on adapting their Xeons to their new HEDT platform to keep up with AMD. Here's the breakdown: https://www.cinema5d.com/wp-content/uploads/2017/05/Intel-co... I've also heard speculation that they're crippling the lanes on cheaper chips because they're so worried about cannibalizing their server market. Don't expect ECC support on these either. Honestly, if any Xeon BIOSes and CPUs supported unlocked multipliers, they'd be a better deal, but I get that overclocking and max stability don't really mix. OR they could totally wow everyone and come out with some 5.0GHz (all-core) 16-core chip that isn't overclockable any further but can run ECC. Sell THAT for $2k to workstation users and rich gamers. Make it 2P capable as well in case you need 32 cores. Maybe AMD will do it with ThreadRipper, which BTW has 64 PCI-E 3.0 lanes ( https://www.pcper.com/news/Processors/Computex-2017-AMD-Thre... ).
But returning to video cards, in GPGPU configurations transferring data between the board and system memory can quickly become a bottleneck.
Bottom line is that gen4 is about 3 years late. PCI specs (or more generally x86 IO interfaces) seem to consistently lag their requirements. Hence why we lived with AGP (or VLB for that matter).
Once that's done, add a few more x4 slots. Even consumer CPUs have enough lanes for it, no need to waste x16 on the GPU. TB3 is x4, U.2 is x4, everything big and useful is x4, so more of that is helpful. The only things that need x8 are HPC and 40Gbps+ Ethernet, both of which are clearly server land.
And the future is actually less bandwidth requirement:
> Performance doesn't even drop with newer DirectX 12 and Vulkan games, including titles like "DOOM," which are known to utilize virtual texturing ("mega textures," an API feature analogous to Direct3D tiled-resources). If anything, mega textures has reduced the GPU's bandwidth load on the PCI-Express bus.
So: 3.0 x4 will be plenty for the foreseeable future.
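For anyone wanting to sanity-check the x4 vs x8 claims, a quick sketch of raw link bandwidth from the per-lane transfer rate and line encoding (8b/10b for gen 1 and 2, 128b/130b from gen 3 on). The numbers are per direction and ignore packet and protocol overhead, so real throughput is lower; the table and function names are just for this example.

```python
# generation -> (transfer rate in GT/s per lane, payload bits, encoded bits)
PCIE_GENERATIONS = {
    "1.0": (2.5, 8, 10),
    "2.0": (5.0, 8, 10),
    "3.0": (8.0, 128, 130),
    "4.0": (16.0, 128, 130),
    "5.0": (32.0, 128, 130),
}


def link_bandwidth_gbps(generation: str, lanes: int) -> float:
    """Raw one-direction bandwidth in Gb/s after line encoding."""
    rate, payload_bits, encoded_bits = PCIE_GENERATIONS[generation]
    return rate * payload_bits / encoded_bits * lanes


if __name__ == "__main__":
    for gen, lanes in (("3.0", 4), ("3.0", 8), ("3.0", 16), ("4.0", 16)):
        gbps = link_bandwidth_gbps(gen, lanes)
        print(f"PCIe {gen} x{lanes}: {gbps:6.1f} Gb/s ({gbps / 8:5.2f} GB/s)")
```

A 3.0 x4 link comes out a bit under 32 Gb/s, which is why 40Gbps Ethernet is one of the few things that actually needs x8.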
Have they fixed some problem with the development process that will make it take 1/3 of the time?
They might as well wait a year and have 5.0.
https://www.extremetech.com/computing/250640-pci-sig-announc...
It's also possible that the timing is a result of them expecting far more stalls in the process for 5.0 than cropped up, though there's obviously still plenty of time.
But holy shit, $4k a year for the privilege of downloading a few specs? This is even more ludicrous than the USB or Bluetooth messes.
But yeah, that's what I meant by USB being better than PCIe. At the very least the specs are publicly downloadable.
Doesn't sound as solid as a socket though, they support some weight of the card in addition to carrying data. I'd be concerned about USB sockets getting torqued off if you looked at them wrong while installing a card, or if you took out the mounting screws without having the card properly supported.
If all that weight was hinging on one socket an accommodation to have some kind of flange wouldn't be too hard to incorporate.
Then we could eliminate the cost of the USB components by building contacts into the support socket that connect to pads on the PCB!
I dunno, ease of inserting and removal just isn't on my list of desires for a GPU. I install one maybe every 2 years, and I don't want to worry about it falling out and smashing around inside my case if I pick up my computer or something.
Whatever mounting shenanigans are necessary to lock it in place with weaker connectors feels like a solution looking for a problem. The somewhat high force of PCIe slots is a feature, not a bug.
[1] USB-C to USB-C cables don't have the symmetry they appear to have. Each end can be independently upside down.
IIRC, this means that certain USB-C configurations require a captive cable because they can only tolerate reversal of one end, not both.
https://www.youtube.com/watch?v=dx596o8t_TY
https://www.youtube.com/watch?v=WgbAESoFDAY