Posts: 21
Threads: 6
Joined: Feb 2019
(02-19-2019, 10:07 PM)NoReply Wrote: If you can afford it, going with a Threadripper is a good idea imo, since the smallest one (1900X) is available for a decent price (~280 € in my country) and they boast 64 PCIe lanes, which is more than a workstation Xeon has. The drawback is that the boards are quite pricey and there are very few to choose from.
From a quick lookup I would say a Gigabyte Aorus Pro AMD X399 (https://www.gigabyte.com/Motherboard/X39...PRO-rev-10) has enough connectivity for 8 cards, but the details need checking (e.g. does every M.2 slot connect to the PCIe interface, or is it SATA only).
Another plus is that you get 8 RAM slots, which scales well. Also, the recommendation here on the forum is to have RAM >= total VRAM to prevent CL_OUT_OF_RESOURCES errors.
Awesome. Thanks for the info. I'll do some more research on those and will look to push to 64 GB RAM also.
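(A side note on the RAM >= VRAM rule quoted above: here's a rough Python sketch of how you could sanity-check it on a Linux box with NVIDIA cards. It just shells out to nvidia-smi and reads /proc/meminfo, so treat it as illustrative, not official tooling.)

import subprocess

def total_vram_mib():
    # nvidia-smi reports per-GPU memory in MiB with this query/format combo
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        text=True,
    )
    return sum(int(v) for v in out.split())

def system_ram_mib():
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1]) // 1024  # value is reported in kB
    raise RuntimeError("MemTotal not found in /proc/meminfo")

vram = total_vram_mib()
ram = system_ram_mib()
print(f"total VRAM: {vram} MiB, system RAM: {ram} MiB")
print("looks fine" if ram >= vram else "RAM < VRAM: CL_OUT_OF_RESOURCES is more likely")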
Posts: 413
Threads: 2
Joined: Dec 2015
On the topic of Threadripper having 64 PCIe lanes, please note that not all of those lanes are usable and the number is often misleading. Only 60 of them are available for use; 4 are dedicated to the chipset on the board. Of the remaining 60, only 48 or 32, depending on the motherboard, can be used for GPUs; 12 are reserved for NVMe storage or other IO. On top of all that, a maximum of 7 directly attached PCIe devices can be used with a Threadripper CPU at one time (including storage, GPUs, network, etc.). Any more PCIe devices must be added via a PLX multiplexing chip. You may be unhappy to find that 8 GPUs simply won't work because your motherboard can't multiplex PCIe connections and Threadripper doesn't support more than 7 directly attached devices.
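To put rough numbers on that, here is a back-of-the-envelope lane budget in Python based on the figures above. The exact split is board-dependent and real slots only come in x1/x4/x8/x16 widths, so the per-card number is just an upper bound.

# Rough PCIe lane budget for first-gen Threadripper, using the numbers above.
TOTAL_LANES = 64
CHIPSET_LANES = 4          # eaten by the X399 chipset link
RESERVED_IO_LANES = 12     # NVMe / other onboard IO (board-dependent)
MAX_DIRECT_DEVICES = 7     # directly attached PCIe devices, all types combined

gpu_lanes = TOTAL_LANES - CHIPSET_LANES - RESERVED_IO_LANES  # 48 at best

for gpus in (4, 6, 7, 8):
    # note: the 7-device ceiling also has to cover storage and network devices
    status = "ok" if gpus <= MAX_DIRECT_DEVICES else "needs a PLX switch"
    print(f"{gpus} GPUs -> at most {gpu_lanes // gpus} lanes each ({status})")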
Posts: 21
Threads: 6
Joined: Feb 2019
Well, damn. This is turning out to be much more involved than I thought.
Posts: 413
Threads: 2
Joined: Dec 2015
For what it's worth, take a look at this thread:
https://community.amd.com/thread/228930
This shows the trouble they went through to make Threadripper work with more than 7 GPUs. The final number of working cards was around 12 on a 1950X after a ton of tweaking, and it was not stable. The resolution of the thread was to just buy an Intel-based system because it "just works".
Posts: 21
Threads: 6
Joined: Feb 2019
(02-19-2019, 11:13 PM)Chick3nman Wrote: For what it's worth, take a look at this thread:
https://community.amd.com/thread/228930
This shows the trouble they went through to make Threadripper work with more than 7 GPUs. The final number of working cards was around 12 on a 1950X after a ton of tweaking, and it was not stable. The resolution of the thread was to just buy an Intel-based system because it "just works".
So the dual xeons are pretty much necessary in order to get the most PCIe lanes possible?
Posts: 413
Threads: 2
Joined: Dec 2015
(02-19-2019, 11:16 PM)FrostByte Wrote: So the dual xeons are pretty much necessary in order to get the most PCIe lanes possible?
Not dual Xeons specifically, just something built for real workstation use. A board with PLX chips is not the end of the world; you likely won't care about the performance difference from switching like that. But the Threadripper platforms I can find have a hard limit and seemingly no real workstation boards.
Posts: 21
Threads: 6
Joined: Feb 2019
(02-20-2019, 12:18 AM)Chick3nman Wrote: (02-19-2019, 11:16 PM)FrostByte Wrote: So the dual xeons are pretty much necessary in order to get the most PCIe lanes possible?
Not dual Xeons specifically, just something built for real workstation use. A board with PLX chips is not the end of the world; you likely won't care about the performance difference from switching like that. But the Threadripper platforms I can find have a hard limit and seemingly no real workstation boards.
What about this:
TYAN FT77CB7079 - 10x 3.5” SATA/SAS - 8x NVIDIA GPU - Dual 1-Gigabit Ethernet - 2000W Redundant (2+1)
2 x Six-Core Intel® Xeon® Processor E5-2603 v4 1.70GHz 15MB Cache (85W)
4 x 16GB PC4-19200 2400MHz DDR4 ECC Registered DIMM
480GB Intel® SSD D3-S4610 Series 2.5" SATA 6.0Gb/s Solid State Drive
No Operating System
Those CPUs carry 40 PCIe lanes apiece.
The only downside is that it's a 4U rackmount. It's a configuration similar to the one in this blog post -- https://www.shellntel.com/blog/2017/2/8/...rd-cracker
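Back-of-the-envelope, that dual-socket pairing looks much roomier than Threadripper. Here's a rough sketch of the lane math in Python; the reserved-lane figure is just my assumption for storage/NIC, and the real topology depends on how the Tyan chassis routes its risers:

# Rough lane budget for the dual E5-2603 v4 config above.
LANES_PER_CPU = 40
CPUS = 2
RESERVED_FOR_IO = 16   # assumption: storage, NIC, etc. -- not from the spec sheet
GPUS = 8

usable = LANES_PER_CPU * CPUS - RESERVED_FOR_IO
print(f"{usable} usable lanes / {GPUS} GPUs = x{usable // GPUS} per card")
# -> 64 / 8 = x8 per card looks plausible on this platform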
Posts: 21
Threads: 6
Joined: Feb 2019
(02-19-2019, 06:19 PM)undeath Wrote: Depends on the attacks you run. For wordlist attacks (-a 0) with no or few rules, your bandwidth will limit the cards. With those cards you might even see a bottleneck with slower algorithms such as WPA in such a case.
I came across this post by Atom - https://hashcat.net/forum/thread-7267-post-39112.html
It looks like starting with 4.1.0, hashcat copies the wordlist data to GPU memory, which makes a significant difference for GPUs on risers (PCIe x1). His example lists several algorithms with noticeable differences. I know it's a rather old post since we're on 5.1.0 now, but I wanted to point it out in case something has changed since then, and to ask whether we should be avoiding PCIe x1.
Posts: 2,301
Threads: 11
Joined: Jul 2010
That technique reduces the bottleneck when using slower PCIe bandwidths but doesn't remove it. There certainly is a tradeoff to be made. But if you do intend to run straight wordlist attacks with no or few rules (this also includes stdin attacks) on those hash types you listed, you should not go as low as x1. x8 per card would be preferable; x4 might still work.
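As a very rough illustration of why x1 starves a fast hash on a straight wordlist, here is some ballpark arithmetic in Python. All of the constants are assumptions for illustration (PCIe 3.0 x1 bandwidth, average candidate size, and an order-of-magnitude NTLM speed), not measurements from a real rig.

# Ballpark: how many candidates per second a PCIe x1 link can feed a card
# vs. what a fast hash like NTLM can consume. All constants are rough guesses.
PCIE3_X1_BYTES_PER_SEC = 985e6    # approx. usable PCIe 3.0 x1 bandwidth
AVG_CANDIDATE_BYTES = 9           # average password length plus separator
GPU_NTLM_HASHES_PER_SEC = 30e9    # order of magnitude for a modern GPU

bus_rate = PCIE3_X1_BYTES_PER_SEC / AVG_CANDIDATE_BYTES
print(f"x1 link can feed ~{bus_rate:.2e} candidates/s")
print(f"GPU could consume ~{GPU_NTLM_HASHES_PER_SEC:.2e} NTLM candidates/s")
print(f"bus supplies ~{bus_rate / GPU_NTLM_HASHES_PER_SEC:.2%} of what the card wants")
# Rules and masks amplify every transferred word on the GPU itself,
# which is why they hide this bus bottleneck so effectively.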
Posts: 24
Threads: 8
Joined: Jun 2018
FWIW, I'm running several repurposed mining rigs with hashcat; all cards are still on x1 risers.
The main bottlenecks I've hit are the low-RAM situation and craptastic CPUs (4 GB RAM and a Celeron, yeah baby). Even with those abysmal specs I can still peg the GPUs on a fast hash with a large wordlist and ruleset. Using a straight wordlist without much RAM or a fast CPU, and over PCIe x1, is going to be horrible; rules or masks are needed.
Another hitch is hashcat running out of RAM; this happens quickly when using a large wordlist with a large ruleset (e.g. rocktastic and oneruletorulethemall). The solution is adding a large swap space.
Now, this is very slow to start up, but once it does, the cards (6x GTX 1070) run flat out. This setup sucks, but it does work!
I'd throw as much money as you can at the rig, but if you want to save some $$ and grief, I've had very nice results with 32 GB RAM and an i5. The cards will still starve on a straight wordlist over PCIe x1 doing NTLM, but really, with a hash that fast it's silly not to use a ruleset by default (just be sure your rules have the ':' at the top).
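(The ':' is hashcat's pass-through rule, so the unmodified word from the wordlist still gets tried. For what it's worth, here's a tiny Python sketch of the kind of check I mean; the filename is just an example and it assumes a plain one-rule-per-line file.)

# Ensure a hashcat rule file contains the pass-through rule ':' so the
# original, unmangled word is still tried. The path is just an example.
path = "myrules.rule"

with open(path) as f:
    rules = [line.rstrip("\n") for line in f]

# Skip blank lines and '#' comments when looking for the pass-through rule.
meaningful = [r.strip() for r in rules if r.strip() and not r.lstrip().startswith("#")]

if ":" not in meaningful:
    with open(path, "w") as f:
        f.write(":\n" + "\n".join(rules) + "\n")
    print(f"prepended ':' to {path}")
else:
    print(f"{path} already has the pass-through rule")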
If you want to get experimental with a kw generator or PRINCE, you'll need to make sure you have CPU cycles and RAM to work with.
I've only noticed x1 being an issue with the faster hashes, but then again I haven't tried *everything*.