Sounds incredible:
http://www.tomshardware.com/news/nvidia-...31557.html
Fifteen billion transistors, fab'd in 16 nanometers FinFET tech, new unified memory, etc. etc.
My big question is: What would this kind of power look like harnessed by oclHashCat? What would some of the hashing numbers jump to?
Wow, big day for NVidia...
It doesn't look like it will be that big of a jump over Maxwell. Based on the numbers Nvidia published today, it looks like it might be ~16% faster than the Titan X.
(04-05-2016, 10:06 PM)epixoip Wrote: It doesn't look like it will be that big of a jump over Maxwell. Based on the numbers Nvidia published today, it looks like it might be ~16% faster than the Titan X.
You missed the big bump in its base clock. In fact, the performance jump is at least 55%.
Compared to the Titan X, the base clock rises from 1000MHz to 1328MHz (+32.8%). Compared to the M40 it's even +40%.
The shaders grow by 16.67%.
Titan X -> GP100 +55%
M40 -> GP100 +63%
And it doesn't end there. Technically the chip has 3840 shaders, but to improve yields on the new 16nm process NVidia has limited them to 3584. And the professional cards always run at lower clock speeds.
So a successor to the Titan X might even see a jump of +25% in shaders and +40% in clock rate. A whopping +75% in performance would be the result.
And that doesn't even account for the internal architectural improvements.
In short: that new thing is a monster (and will sadly carry a monster price tag).
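Here's a quick Python sketch to reproduce those percentages (my own back-of-the-envelope estimate; it assumes throughput scales linearly with shader count times clock, which only roughly holds for compute-bound hashcat kernels, and the "full die" line is purely hypothetical):

Code:
# rough throughput proxy: shader count * clock (MHz)
def speedup(old_shaders, old_mhz, new_shaders, new_mhz):
    return (new_shaders * new_mhz) / (old_shaders * old_mhz)

# factory base clocks
print(f"Titan X -> GP100: +{(speedup(3072, 1000, 3584, 1328) - 1) * 100:.0f}%")  # ~ +55%
print(f"M40     -> GP100: +{(speedup(3072,  948, 3584, 1328) - 1) * 100:.0f}%")  # ~ +63%

# hypothetical full-die consumer part: +25% shaders, +40% clock
print(f"full die, +40% clock: +{(1.25 * 1.40 - 1) * 100:.0f}%")                  # +75%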
Nope, I didn't miss the big bump in base clock. First, base clock is irrelevant: oclHashcat runs at the boost clock until it runs up against PowerMizer. Second, we run all of our Titan Xs at 1515 MHz, so the bump in clocks doesn't mean very much to us. Third, a 40% increase in clock rate does NOT translate to a 40% increase in performance, so you are exaggerating the impact of a higher base clock. Also, the Titan X and M40 are literally the exact same GPU, so I don't know why you have two different figures there.
The one thing you didn't notice is the 300W TDP in spite of 256 cores being disabled, which is a bit hard to believe considering the die shrink. Why double the transistors for only a 25% increase in cores? What is drawing 50W more power? Why do we have cores disabled on a flagship GPU?
First, we can only compare the factory specs provided by NVidia. No one knows how well or poorly the new chip will overclock. The boost clock is already at 1480+ MHz; even a slight overclock of 10-15% would give a big performance boost over the Titan X. So the standard base and boost rates are somewhat comparable.
Second, under hashcat a 40% increase in clock rate really does mean roughly 40% more performance. I agree with you for many other applications, but hashcat in particular scales almost linearly with clock rate (maybe not with all kernels, but at least the ones I use).
Third, the factory specs of the Titan X and M40 differ; that's why there are two numbers. It's 1000MHz vs. 948MHz. I expect the consumer GP100 will also have a slightly higher clock rate.
About your question: each core has new functions and a much better DP rate. The cache is also bigger (4MB instead of 3MB), and there are now 1792 FP64 cores instead of 96. That's where all the transistors went.
Since the factory clock rate went up by 32.8%, the extra 50W seems reasonable. Look at it this way: double the transistors plus a higher clock rate for only 20% more power. That's not too shabby.
As I mentioned, they disabled a few cores to get a better yield rate. They want to get more usable GPUs out of one wafer, so they disable one of the 15 SMX whether it's broken or not. They did the same with the GK110, where the first series had 2688 cores and the B-version 2880 (Titan and Titan Black). The latter also saw a small die shrink after the production process had been optimised.
Ok, let's compare factory specs to factory specs then:
Titan X - 3072 * 1215000000 = 3732480000000
Tesla P100 - 3584 * 1480000000 = 5304320000000
5304320000000 / 3732480000000 * 100 =~ 142.11
So 42% faster, stock vs. stock. Pretty much splits the difference between your 75% estimate and my 16% estimate. BUT...
If we're already at 300W at 1480 MHz with 256 cores disabled, then there's zero room for overclocking, assuming this has a 25A power feed (kind of hard to tell since it's not a standard PCI-e card, and I don't know anything about this proprietary connector). We can reasonably assume, though, that the GTX variant will have 6-pin + 8-pin power, which means it will not be able to draw more than 300W without violating the PCI-e spec (and Nvidia is not AMD, so that won't happen). So if the 300W TDP for this chip is accurate, the GTX variant will have zero overclocking potential unless it has 2x 8-pin power.
And there's another problem as well: a 300W TDP is a big deal, since the Tyan FT77, Supermicro 7048GR-TF, etc. cannot accept GPUs with TDP > 300W. So even if the card does have 2x 8-pin instead of 8-pin + 6-pin, it won't make any difference in dense enterprise applications (which is obviously what I care about); we'll have to do everything we can to keep power consumption < 300W. That means absolutely no overclocking, and possibly even downclocking like we do with the R9 290X.
So if we have a GTX variant of this chip that pulls 300W at 1480 MHz, and the Titan X can overclock nicely up to 1515 MHz without overvolting and without drawing > 300W, then the real-world picture is quite a bit different:
Titan X - 3072 * 1515000000 = 4654080000000
Tesla P100 - 3584 * 1480000000 = 5304320000000
5304320000000 / 4654080000000 * 100 = 113.97
That's only 14% faster in the real world.
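For anyone who wants to plug in their own clocks, here's the same math as a small Python sketch (it uses the crude shaders-times-clock approximation from above and ignores IPC or architectural differences):

Code:
def ops(shaders, mhz):
    # crude peak-throughput proxy: shader count * clock
    return shaders * mhz * 1_000_000

titan_x_stock = ops(3072, 1215)  # stock Titan X boost under oclHashcat
titan_x_oc    = ops(3072, 1515)  # our overclocked Titan X
p100          = ops(3584, 1480)  # Tesla P100, factory boost spec

print(f"stock vs. stock: +{(p100 / titan_x_stock - 1) * 100:.0f}%")  # ~ +42%
print(f"vs. 1515 MHz OC: +{(p100 / titan_x_oc - 1) * 100:.0f}%")     # ~ +14%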
Now maybe the GTX variant with 3584 cores doesn't draw 300W, since it likely won't have all the FP64 cores the Tesla has. So maybe it only draws around 250W and there is indeed some overclocking potential. We'll have to wait and see. But that still leaves doubts about how much power the chip will draw when all the cores are enabled.
Anyway, my point is, you have to take all the factors into consideration. I'm analyzing the big picture here, not just what's immediately on paper.
(04-08-2016, 02:44 AM)epixoip Wrote: Ok, let's compare factory specs to factory specs then:
Titan X - 3072 * 1215000000 = 3732480000000
Tesla P100 - 3584 * 1480000000 = 5304320000000
5304320000000 / 3732480000000 * 100 =~ 142.11
So 42% faster, stock vs. stock. Pretty much splits the difference between your 75% estimate and my 16% estimate. BUT...
The calculation is not correct, since the Titan X on paper has a boost clock of 1075MHz (your 1215MHz is already an overclocked product).
Titan X - 3072 * 1075000000 = 3302400000000
Tesla P100 - 3584 * 1480000000 = 5304320000000
5304320000000 / 3302400000000 * 100 =~ 160.62
So 60% faster, without considering any other optimizations (FP16, new instructions, more instructions per cycle, etc.).
My 75% was the estimate for all shaders enabled plus an even higher boost clock.
Titan X - 3072 * 1075000000 = 3302400000000
Titan P100 - 3840 * 1505000000 = 5779200000000
5779200000000 / 3302400000000 * 100 =~ 175.00
There you go, 75% on a possible new Titan P100.
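For completeness, here's a quick Python check of both of those numbers (again only the naive shaders-times-clock model, and the "Titan P100" line is a purely hypothetical full-die consumer card):

Code:
def ops(shaders, mhz):
    return shaders * mhz  # relative throughput proxy

titan_x    = ops(3072, 1075)  # Titan X at its spec-sheet boost clock
tesla_p100 = ops(3584, 1480)  # Tesla P100, 256 shaders disabled
titan_p100 = ops(3840, 1505)  # hypothetical full-die part at a higher boost

print(f"Tesla P100 vs. Titan X:   +{(tesla_p100 / titan_x - 1) * 100:.1f}%")  # ~ +60.6%
print(f"'Titan P100' vs. Titan X: +{(titan_p100 / titan_x - 1) * 100:.1f}%")  # +75.0%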
About the Titan X doing 1515MHz: in the tests I read, they could never bring the Titan X over 1300MHz (boost) without crashing or hitting the power limit. So how do you know your Titan X is really clocking 1515MHz? Because if that's just a number shown somewhere, you cannot make serious calculations based on it.
No offence here, I'm just in pedantic mode.
The 300W limit is a problem indeed for the Tyan.
But let's see what comes up. It's obviously a big step forward.
Well, 1075 MHz is the "Average Boost," not the "Max Boost." A stock, non-SC Titan X will run at 1215 MHz when cracking with oclHashcat. And as you know, FP16 etc. are worthless for password cracking.
The 1515 MHz is reported by "nvidia-settings -q GPUCurrentClockFreqs" while running oclHashcat. I have no reason to believe this is in any way unreliable.
Anyway it sounds like all this is moot because apparently Nvidia just announced there will not be a GTX card released based on the GP100. So now we're back to having no idea what a GTX Pascal chip will look like.
(04-08-2016, 04:44 PM)epixoip Wrote: Well, 1075 MHz is the "Average Boost," not the "Max Boost." A stock, non-SC Titan X will run at 1215 MHz when cracking with oclHashcat.
You're right. 1152MHz is the correct value.
Titan X - 3072 * 1152000000 = 3538944000000
Tesla P100 - 3584 * 1480000000 = 5304320000000
5304320000000 / 3538944000000 * 100 =~ 149.88
So about 50% faster. It does not matter what the Titan X does or shows under hashcat; we're comparing factory specs since we know them. Anything else is speculation. Who knows, maybe NVidia is being conservative with the clocks and that thing overclocks like hell?
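Plugging the 1152MHz figure into the same shaders-times-clock sketch (same caveats as before; it's only a rough proxy):

Code:
titan_x = 3072 * 1152  # Titan X, spec-sheet boost clock
p100    = 3584 * 1480  # Tesla P100, factory boost clock
print(f"+{(p100 / titan_x - 1) * 100:.1f}%")  # ~ +49.9%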
But I fully agree, the GP100 will be a $4000+ card and the GTX might be something else. At the end of July the GP104 might be shipping (GTX 1080). I still expect a 40-50% jump over the GTX 980. Knowing the GP100 specs, that sounds more than realistic to me.
It certainly does matter what the Titan X shows under hashcat, because these are the hashcat forums and how the card performs under hashcat is all we care about...
Anyway, I think you're crazy; 50% is way off base from what will happen in reality. I'm still expecting 20% at most, and that's being optimistic.
Also where are you seeing the GTX 1080 will be shipping at the end of July?