Cost-optimized GPU farm
#1
Hello guys!

First of all, I'd like to thank you for the great software. I had a lot of headaches and sleepless nights with hashcat and VCL, but the results are very satisfying! :)

Recently I started hitting a wall with scaling. I have various types of hardware running, mostly desktop boards, all with HD79** cards, interconnected with InfiniBand and Ethernet (for PXE boot and scrypt mining). At some point it became a complete mess because of the wiring and desktop hardware reliability issues. Bitcoin-style plastic baskets are not very convenient to manage as cases.

I'm looking into rackmountable solutions now. The ultimate option is the FT77, but I don't like it for a couple of reasons, which I describe below. I've made a minimal comparison of rackmountable solutions based on density and cost efficiency. Here's what I came up with:

1) Running generic desktop hardware, the cost to set up one single-chip GPU is ~300$. Mounting these in cheap rackmountable 4U cases gives 3 GPU/4U density (a 4th GPU isn't possible because the InfiniBand card takes up one slot). No PSU redundancy, no IPMI...

2) Supermicro 7046GT-TRF (http://www.supermicro.nl/products/system...gt-trf.cfm). The barebone is 2k$ and the CPUs are cheap LGA1366 parts from eBay, so a running rig comes to ~2.5k$. I have one running, by the way: it boots the generic Wheezy kernel without issues, with 4x HD7970. That gives about 625$ per card and 4 GPU/4U density.

3) Tyan FT77. I live in Europe, and getting one here costs around 6k$, thanks to the exclusive PNY distributor. If you got one cheaper here, please tell me how you did it. Say CPUs + RAM come to around 1k$. One case will run 7 GPUs + an InfiniBand card, which gives 1k$ per card to set up and 7 GPU/4U density. One thing that scares me off is having to run a backported kernel to get it up and running. I'm used to booting a universal Debian image over PXE: when you need to update something, you just update one image and you're done. I don't want to maintain an Ubuntu image just for FT77s.

4) Supermicro 2027GR-TRFH (http://www.supermicro.nl/products/system...r-trfh.cfm). This is one big "if", since I don't have one to test and no one has tried AMD cards in it. It's 2U, but the manufacturer states it will fit 6(!) double-width GPUs; in my case that means 5 GPUs + an InfiniBand card. This barebone costs ~2k$ + 1k$ for CPUs and RAM. Assuming the GPUs fit (custom passive heatsinks may be needed, since the case is designed for passive Teslas and Phis), we get 600$ per card and 10 GPU/4U density! Sounds too good to be true :)
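
The per-card and density figures in the four options above can be reproduced with a short Python sketch. The prices and GPU counts are the rough numbers quoted in this post; the `Option` class, its field names, and the 900$ platform total for option 1 (back-calculated as 3 x ~300$) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One rack option. Costs are host hardware only (GPUs excluded), in USD."""
    name: str
    platform_cost: float  # barebone/board + CPUs + RAM, rough figure from the post
    gpus: int             # usable GPU slots after reserving one for InfiniBand
    rack_units: int       # chassis height in U

    @property
    def cost_per_gpu(self) -> float:
        return self.platform_cost / self.gpus

    @property
    def gpus_per_4u(self) -> float:
        # Normalize density to a 4U footprint so 2U and 4U chassis compare directly.
        return self.gpus * 4 / self.rack_units


OPTIONS = [
    # 900 is an assumption: 3 GPUs x ~300$ setup cost each.
    Option("Desktop hardware in 4U case", 900, 3, 4),
    Option("Supermicro 7046GT-TRF", 2500, 4, 4),
    Option("Tyan FT77", 6000 + 1000, 7, 4),
    # Untested with AMD cards according to the post.
    Option("Supermicro 2027GR-TRFH", 2000 + 1000, 5, 2),
]


def summarize(options):
    """Return (name, cost per GPU, GPUs per 4U) for each option."""
    return [(o.name, o.cost_per_gpu, o.gpus_per_4u) for o in options]


if __name__ == "__main__":
    for name, cost, density in summarize(OPTIONS):
        print(f"{name:30s} {cost:7.0f} $/GPU  {density:4.1f} GPU/4U")
```

Normalizing everything to a 4U footprint is what makes the 2U chassis comparable with the others; swapping in local prices only requires editing `OPTIONS`.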

If you have any comments or want to hit me, go ahead. As for the 4th solution, maybe somebody has tried it already. If no one has, maybe I will :)
Reply
#2
Stay away from the 2027GR-TRFH. I know it looks tempting, six GPUs in 2U, but it does not work for our purposes.

First, it requires passive GPUs with a special mounting bracket (which Teslas have.) You can pull the shroud, fan, and PCI bracket off a 7970, but then there's no way to mount it in the carrier without the Tesla-specific mounting bracket.

Second, the chassis can only handle six GPUs if you're using low-power GPUs which only take one 6-pin or 8-pin PCI power cable. If your GPU requires both an 8-pin and a 6-pin like the 7970, then you can only use at most three GPUs in this chassis.

Third, this chassis was designed for GPUs which have their PCI power connectors on the rear of the card. GPUs which have their PCI power connectors on the top of the card, like the 7970, will not work in this chassis because you will be unable to plug in the power cords!

So yeah, as you said, this chassis was designed for Teslas. It will not work with 7970s, no matter how hard you try.
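
The power-cable limit in the second point is simple arithmetic, sketched below in Python. The figure of six available PCIe power leads is an assumption inferred from the numbers in this post (six single-connector GPUs, or three GPUs needing two connectors each), not taken from Supermicro documentation, and `max_gpus` is a hypothetical helper.

```python
def max_gpus(slots: int, power_leads: int, leads_per_gpu: int) -> int:
    """GPUs a chassis can power: limited by slots and by available power leads."""
    return min(slots, power_leads // leads_per_gpu)


# Assumed chassis figures: 6 double-width slots, 6 PCIe power leads.
SLOTS = 6
POWER_LEADS = 6

if __name__ == "__main__":
    # A card with a single 6-pin or 8-pin connector.
    print("single-connector cards:", max_gpus(SLOTS, POWER_LEADS, 1))
    # A card needing both an 8-pin and a 6-pin, like the 7970.
    print("dual-connector cards:  ", max_gpus(SLOTS, POWER_LEADS, 2))
```

This covers the power budget only; the mounting bracket and connector position issues are mechanical and still rule out the 7970 regardless.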

7047GR-TPRF is a pretty sweet chassis though, and you can easily run 4x 7970 in there.
Reply
#3
(12-03-2013, 12:24 AM)epixoip Wrote: Stay away from the 2027GR-TRFH. I know it looks tempting, six GPUs in 2U, but it does not work for our purposes.

Thanks for clarifying about the 2027GR, you saved me from throwing away 2 grand for nothing :)
Yeah, then the Supermicro 7047/7046 looks like the best choice for me at the moment.
I'm still looking for something with higher GPU density per rack unit. Apart from Supermicro and Tyan, I found OSS (http://www.onestopsystems.com/expansion_enclosures.php), but they are not answering emails and I can't find a vendor to buy one from for testing. There is Magma too, but their pricing is insane.
Maybe there are alternative solutions I missed. Will keep digging :)
Reply
#4
You don't really gain anything with expansion enclosures, since you're still going to hit the 8-GPU limit, plus you have the drawback of increased startup latency due to sharing 8 devices over one lane.

The Tyan FT77 really is the best option for rack density at the moment.
Reply