4 graphics cards -> OK, 5 graphics cards -> not OK
#1
Hi,

First I would like to say thank you for the quick support in your IRC channel :D
Now I have run into a problem which I guess is better discussed here...

What is bothering me:

If I install 4x HD 5970 on my motherboard (890FXA-GD70), everything runs OK.
clinfo output is OK,
aticonfig --odgt --adapter=all is able to display the temps of all GPUs.
oclHashcat runs OK.

When I install a 5th HD 5970, run aticonfig --adapter=all --initial -f and reboot:

clinfo output:
Code:
Segmentation fault (core dumped)

aticonfig --odgt --adapter=all is not able to display the temps of the additional graphics card (look at the bottom of the output):
Code:
pozi@ubuntu:~$ aticonfig --odgt --adapter=all

Adapter 0 - AMD Radeon HD 5900 Series
            Sensor 0: Temperature - 60.50 C

Adapter 1 - AMD Radeon HD 5900 Series
            Sensor 0: Temperature - 67.00 C

Adapter 2 - AMD Radeon HD 5900 Series
            Sensor 0: Temperature - 61.00 C

Adapter 3 - AMD Radeon HD 5900 Series
            Sensor 0: Temperature - 62.50 C

Adapter 4 - AMD Radeon HD 5900 Series
            Sensor 0: Temperature - 59.50 C

Adapter 5 - AMD Radeon HD 5900 Series
            Sensor 0: Temperature - 61.50 C

Adapter 6 - AMD Radeon HD 5900 Series
            Sensor 0: Temperature - 30.50 C

Adapter 7 - AMD Radeon HD 5900 Series
            Sensor 0: Temperature - 36.00 C
ERROR - Get temperature failed for Adapter 8 - AMD Radeon HD 5900 Series
ERROR - Get temperature failed for Adapter 9 - AMD Radeon HD 5900 Series

but aticonfig --lsa recognises all of them:

Code:
pozi@ubuntu:~$ aticonfig --lsa
  0. 1b:00.0 AMD Radeon HD 5900 Series
  1. 1a:00.0 AMD Radeon HD 5900 Series
  2. 17:00.0 AMD Radeon HD 5900 Series
  3. 16:00.0 AMD Radeon HD 5900 Series
  4. 0f:00.0 AMD Radeon HD 5900 Series
  5. 0e:00.0 AMD Radeon HD 5900 Series
  6. 0b:00.0 AMD Radeon HD 5900 Series
  7. 0a:00.0 AMD Radeon HD 5900 Series
  8. 07:00.0 AMD Radeon HD 5900 Series
  9. 06:00.0 AMD Radeon HD 5900 Series

and lspci too...
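
For anyone who wants to check this kind of output automatically, here is a minimal Python sketch that pulls the failing adapter numbers out of saved aticonfig --odgt output. The function names and the SAMPLE text are made up for illustration, and the patterns only assume the line layout seen in this thread (the dash separators are treated as optional, since pasted output sometimes loses them).

```python
import re

# Patterns are based only on the output shown in this thread; the "-"
# separators are optional because pasted output may lose them.
ADAPTER_RE = re.compile(r"^Adapter (\d+)")
TEMP_RE = re.compile(r"Temperature\s*-?\s*([\d.]+)\s*C")
FAIL_RE = re.compile(r"Get temperature failed for Adapter (\d+)")


def failed_adapters(odgt_output):
    """Return the adapter numbers for which the temperature read failed."""
    return [int(m.group(1)) for m in FAIL_RE.finditer(odgt_output)]


def reported_temperatures(odgt_output):
    """Map adapter number -> temperature in C for adapters that reported one."""
    temps = {}
    current = None
    for raw_line in odgt_output.splitlines():
        line = raw_line.strip()
        adapter = ADAPTER_RE.match(line)
        if adapter:
            current = int(adapter.group(1))
            continue
        temp = TEMP_RE.search(line)
        if temp and current is not None:
            temps[current] = float(temp.group(1))
    return temps


# Shortened sample in the same layout as the output above (illustrative only).
SAMPLE = """\
Adapter 6 - AMD Radeon HD 5900 Series
            Sensor 0: Temperature - 30.50 C

Adapter 7 - AMD Radeon HD 5900 Series
            Sensor 0: Temperature - 36.00 C
ERROR - Get temperature failed for Adapter 8 - AMD Radeon HD 5900 Series
ERROR - Get temperature failed for Adapter 9 - AMD Radeon HD 5900 Series
"""

if __name__ == "__main__":
    print("failed adapters:", failed_adapters(SAMPLE))
    print("temperatures:", reported_temperatures(SAMPLE))
```

To use it on a real machine, the output of aticonfig --odgt --adapter=all would first have to be saved to a file or captured as a string; that step is left out here.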

The problem is not related to a particular graphics card or slot: I tried every possible combination, and it occurs only when I install the 5th card.

What I did:
set the PCIe latency timer in the BIOS to 96 and 128; neither solves the problem

Other information:

PSU: LEPA 1600W 80 Plus (there is plenty of wattage unused)
I use PCIe risers; I tested them all and they are working as they are supposed to.
OS: Ubuntu Server 12.04.4 LTS
Catalyst: I have tried 14.9 and 14.12, same problem with both of them
oclHashcat: version 1.31

Can someone please give me some advice? It seems a shame not to use all 5 graphics cards.

<offer to pay removed by philsmd>

Thanks in advance!
martin.po21
#2
Please do not offer to pay here. Users of this forum are willing to help if they can and you do not need to offer payment for a solution to your problem. It won't help.

Topic-specific answer: the interesting thing is that the BIOS seems to recognize all the GPUs, which is already good. Indeed, it seems to be related to the driver somehow.

Could you give the output of:
lspci | grep VGA
too?

Did you try another operating system, like Windows, with the same 5-GPU system? Maybe that could help to identify the problem...
I'm sure some other people (like epixoip) have more ideas....
And yes, in general we tell people to avoid risers wherever possible, since they lead to very strange problems (but I'm not sure if that is also the case here).

P.S. Some system log files (and dmesg) could also help to troubleshoot this problem.
#3
Thanks for the reply.
About the reward... please do not get me wrong, I am just desperate and out of ideas :(

lspci | grep VGA output recognises all 5 of them:

Code:
pozi@ubuntu:~$ lspci | grep VGA
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hemlock [Radeon HD 5970]
0b:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hemlock [Radeon HD 5970]
0f:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hemlock [Radeon HD 5970]
17:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hemlock [Radeon HD 5970]
1b:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hemlock [Radeon HD 5970]
pozi@ubuntu:~$ 
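
Side note: the VGA filter shows 5 entries while aticonfig --lsa shows 10 adapters. A rough Python sketch for comparing the two listings by PCI bus number is below. The bus addresses are copied from the outputs in this thread and the helper names are made up. My guess is that the second GPU on each card is reported under a different device class and is therefore dropped by the grep, but that is an assumption and not something confirmed here.

```python
import re

# Matches PCI addresses such as "1b:00.0" and captures the bus number.
BUS_RE = re.compile(r"\b([0-9a-f]{2}):[0-9a-f]{2}\.[0-7]\b")


def bus_numbers(listing):
    """Return the set of PCI bus numbers found in a text listing."""
    return set(BUS_RE.findall(listing))


def not_listed_as_vga(lsa_listing, vga_listing):
    """Bus numbers that aticonfig --lsa shows but the VGA-filtered lspci does not."""
    return sorted(bus_numbers(lsa_listing) - bus_numbers(vga_listing))


# Addresses copied from the two outputs posted in this thread.
LSA_LISTING = (
    "1b:00.0 1a:00.0 17:00.0 16:00.0 0f:00.0 "
    "0e:00.0 0b:00.0 0a:00.0 07:00.0 06:00.0"
)
VGA_LISTING = "07:00.0 0b:00.0 0f:00.0 17:00.0 1b:00.0"

if __name__ == "__main__":
    print(not_listed_as_vga(LSA_LISTING, VGA_LISTING))
```

Running plain lspci without the grep should show how the remaining five devices are labelled.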

dmesg:
http://pastebin.com/ip62JBTa

syslog is empty.

Please tell me which log files I should also post; I am a bit new to Linux systems (half a year) and do not understand them completely yet.

No, I did not try it on another OS, so I cannot tell.
Would Win7 Ultimate be OK for the test? Unfortunately I will only be able to do this next weekend; today I have to go to the city where my school is :/

Again, thank you.
Please tell me if you need more information.

EDIT: after searching the log files I found one that mentions the 5900 card quite a few times, and one failure is mentioned:
Xorg.0.log:
http://pastebin.com/p7CzMdPq
#4
That's simple to explain. AMD OpenCL simply does not support more than 8 GPUs. Since you have dual-GPU cards, you can only use 4.

You can say thanks to AMD.
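
The arithmetic, as a small Python sketch. The limit of 8 devices is the figure stated above; the function names are just for illustration.

```python
AMD_OPENCL_DEVICE_LIMIT = 8  # limit as stated in this post


def total_gpus(cards, gpus_per_card):
    """Number of GPU devices the driver has to handle."""
    return cards * gpus_per_card


def max_cards(gpus_per_card, device_limit=AMD_OPENCL_DEVICE_LIMIT):
    """Largest number of identical cards that stays within the device limit."""
    return device_limit // gpus_per_card


def fits(cards, gpus_per_card, device_limit=AMD_OPENCL_DEVICE_LIMIT):
    """True if this many cards stays within the device limit."""
    return total_gpus(cards, gpus_per_card) <= device_limit


if __name__ == "__main__":
    # HD 5970 is a dual-GPU card, so gpus_per_card is 2.
    for cards in (4, 5):
        print(cards, "cards ->", total_gpus(cards, 2), "GPUs, fits:", fits(cards, 2))
```

With 2 GPUs per card, 4 cards give 8 devices (at the limit) and 5 cards give 10 (over it), which matches adapters 8 and 9 failing in the first post.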
#5
I was afraid it would be something like this...
Is there anything that can be done about it?

Thanks for the explanation :)
#6
Sure, use multiple computers.
#7
That's not a bad idea... actually I have enough spare parts for another one... is there a way to link them up somehow?
#8
Yes, check: http://hashcat.net/forum/thread-3159.html
#9
Thanks for the help, everyone!
You are great :D
#10
Let us hope AMD will support more than 8 in the upcoming drivers....