10-03-2022, 07:53 PM
(This post was last modified: 10-03-2022, 07:55 PM by Chick3nman.)
>PCIe lanes - Current technology is that the RTX 3090 Tis have the ability to run x8 PCIe 4.0 lanes due to their 1008 GB/s mind-melting speed (I'm going to look back in 10 years at me saying this and laugh). You mentioned it's best not to dip below x4 PCIe 3.0 lanes, and I'm wondering if that includes stringing together high-performance GPUs like the 3090 Ti that has such high-speed capability in this case?
The issue with PCIe lanes and trying to give good recommendations here is a bit complicated, I think. There are many, many variables at play that often get boiled down to just "x4 lanes". The vast majority of modern GPUs will negotiate and operate at a pretty high speed (even x16 5.0 now) regardless of other factors, but that isn't the end of the story. CPUs have a limited number of PCIe lanes; motherboards can have chipset PCIe lanes that add to the CPU's count but are not quite the same; motherboards and backplanes can have PLX chips that effectively switch PCIe communications to add even more lanes, but again, this may not behave quite how you would expect. There are other devices that contend for lanes as well, including NVMe storage, Thunderbolt connections, etc. We even see lane bifurcation and duplication through cheaper PLX-style chips, leading to weird cases of more than one device per lane. All of these things complicate what we mean vs. what we say when we try to discuss the hardware. For this I will try to be as explicit as possible and clear some of that up.
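If you want to see what your cards actually negotiated after all of that topology shakes out, the driver will tell you. Below is a rough sketch using NVIDIA's NVML library (nvidia-smi exposes the same fields, e.g. nvidia-smi --query-gpu=pcie.link.gen.current,pcie.link.width.current --format=csv); it's only an illustration and assumes the NVML header/library from the driver or CUDA toolkit is installed, and the build command will vary by system:

```c
/* pcie_link_check.c - print the PCIe link each NVIDIA GPU actually negotiated.
   Rough sketch only; build is something like:
   gcc pcie_link_check.c -I/usr/local/cuda/include -lnvidia-ml -o pcie_link_check */
#include <stdio.h>
#include <nvml.h>

int main(void)
{
    if (nvmlInit() != NVML_SUCCESS) {
        fprintf(stderr, "failed to init NVML\n");
        return 1;
    }

    unsigned int count = 0;
    nvmlDeviceGetCount(&count);

    for (unsigned int i = 0; i < count; i++) {
        nvmlDevice_t dev;
        char name[NVML_DEVICE_NAME_BUFFER_SIZE];
        unsigned int gen = 0, width = 0;

        if (nvmlDeviceGetHandleByIndex(i, &dev) != NVML_SUCCESS) continue;

        nvmlDeviceGetName(dev, name, sizeof(name));
        nvmlDeviceGetCurrPcieLinkGeneration(dev, &gen); /* PCIe generation negotiated right now */
        nvmlDeviceGetCurrPcieLinkWidth(dev, &width);    /* link width negotiated right now (x1/x4/x8/x16) */

        printf("GPU %u (%s): PCIe gen %u, x%u\n", i, name, gen, width);
    }

    nvmlShutdown();
    return 0;
}
```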
Hashcat's usage of PCIe lanes is mostly relatively small, fast, low-latency data loading and device status queries through the runtime, with the occasional device data return such as cracked hashes. The things that can slow us down specifically on the PCIe bus are latency increases or limitations in bandwidth/transaction rate. These issues can happen for a number of reasons, such as increased error rates and TX/RX resends due to interference or poor signal quality (this happens mostly with bad risers), delays from switching due to weak/poor PLX-style chips, contention with other devices, etc. It is always best to have your GPU attached via a high quality physical connection (no risers) backed by a high speed link to the CPU (no low-end PLX chips). Once you have achieved those things, you almost never need to worry about the speed, because most motherboards won't let you plug in more GPUs than your CPU can handle as it is. And if they do, it's highly likely that you are achieving at least x4 3.0 or better. It's usually when people start to put GPUs on risers and add more cards than would normally physically fit that they find out that only certain slots will run at the same time, or that with so many cards their lane count is cut down to x1 per card. At the point where you are running into the issue of PCIe bandwidth, you've likely already done a bunch of other stuff that could contribute to degraded performance/stability, so I'm not sure I would focus on it anyway.
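For a rough sense of scale, here's some back-of-the-envelope math using the theoretical per-lane rates from the PCIe specs (not measured numbers, and real throughput is a bit lower once protocol overhead is counted):

```c
/* pcie_math.c - theoretical per-direction PCIe throughput for a few common links.
   Uses the spec transfer rates and line encoding only. */
#include <stdio.h>

static double lane_gbps(int gen)
{
    switch (gen) {
        case 1: return 2.5  * (8.0 / 10.0)    / 8.0; /* 2.5 GT/s, 8b/10b   -> ~0.25 GB/s per lane */
        case 2: return 5.0  * (8.0 / 10.0)    / 8.0; /* 5 GT/s,   8b/10b   -> ~0.50 GB/s per lane */
        case 3: return 8.0  * (128.0 / 130.0) / 8.0; /* 8 GT/s,   128b/130b -> ~0.99 GB/s per lane */
        case 4: return 16.0 * (128.0 / 130.0) / 8.0; /* 16 GT/s,  128b/130b -> ~1.97 GB/s per lane */
        case 5: return 32.0 * (128.0 / 130.0) / 8.0; /* 32 GT/s,  128b/130b -> ~3.94 GB/s per lane */
        default: return 0.0;
    }
}

int main(void)
{
    const int configs[][2] = { {3, 1}, {3, 4}, {3, 16}, {4, 8}, {4, 16} }; /* {generation, width} */

    for (size_t i = 0; i < sizeof(configs) / sizeof(configs[0]); i++) {
        int gen = configs[i][0], width = configs[i][1];
        printf("PCIe %d.0 x%-2d: ~%6.2f GB/s per direction\n", gen, width, lane_gbps(gen) * width);
    }
    return 0;
}
```

Even x4 3.0, at roughly 3.9 GB/s per direction, is far more bus than hashcat's small transfers need. And for what it's worth, the 1008 GB/s figure on the 3090 Ti is its on-card memory bandwidth, which never has to cross the PCIe bus at all.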
To summarize, I wouldn't worry about PCIe lanes until you've covered all the other stuff, because by the time you do cover all the other stuff, the PCIe lanes will almost surely not be a problem. Modern GPUs will run fast enough in an "approved" configuration for it to never be an issue. It's only when you start getting creative and trying to slot in extra cards where they wouldn't normally fit that it becomes a problem worth considering.
Also, to touch on another subject that gets brought up a lot about GPUs and their operation in hashcat: the GPUs are treated as separate devices and do not cooperate directly with each other. Each card is initialized and run by the host individually. Technologies like NVLink/SLI/CrossFire/etc. are not currently in use and are not likely to be added due to limited benefit and significant complexity for the workload. As long as your host system (CPU, RAM, etc.) can comfortably run more GPUs, you can continue to add them without worry, including some mixing of different cards, though using all the same model of card is generally suggested and will simplify a number of things should you run into issues.
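To illustrate that "separate devices" point, here's a minimal host-side sketch against the CUDA runtime API. This is just the general pattern, not hashcat's actual code: each card gets selected, fed its own small chunk of data, and synchronized on its own, with nothing passing from GPU to GPU.

```c
/* multi_gpu.c - each GPU is opened and driven by the host on its own; nothing goes
   card-to-card and no NVLink/SLI is involved. Build is roughly:
   nvcc multi_gpu.c -o multi_gpu   (or gcc with the CUDA include/lib paths and -lcudart) */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        fprintf(stderr, "no CUDA devices found\n");
        return 1;
    }

    for (int i = 0; i < count; i++) {
        struct cudaDeviceProp prop;
        unsigned char host_buf[4096] = { 0 };  /* stand-in for a small chunk of work/data */
        void *dev_buf = NULL;

        cudaSetDevice(i);                      /* the host selects this card and talks to it directly */
        cudaGetDeviceProperties(&prop, i);

        cudaMalloc(&dev_buf, sizeof(host_buf));                                  /* per-device allocation */
        cudaMemcpy(dev_buf, host_buf, sizeof(host_buf), cudaMemcpyHostToDevice); /* small host->device copy */
        cudaDeviceSynchronize();                                                 /* wait on this device only */
        cudaFree(dev_buf);

        printf("GPU %d: %s (PCIe bus %02x:%02x)\n", i, prop.name, prop.pciBusID, prop.pciDeviceID);
    }
    return 0;
}
```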