hashcat v3.00 + NTLM performance
It is stunning: "NTLM performance on my i7-6700 CPU increased from 95.64MH/s to 1046.1 MH/s, which is by the way new world record for cracking NTLM on CPU."

Is there some place where I can find more about how the NTLM hash calculation was optimized to achieve this massive increase?
It was always that fast in oclHashcat; I just hadn't backported the optimizations to hashcat-legacy. If you want to know more about the partial reversal it uses, read this: https://hashcat.net/events/p13/js-ocohaaaa.pdf
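
For anyone who wants to see the idea in code, here is a rough sketch of what partial reversal can look like for NTLM. It is not hashcat's kernel code, just a minimal pure-Python illustration under stated assumptions. NTLM is MD4 over the UTF-16LE password, so the first two characters live entirely in message word X[0]. In MD4's third round X[0] is consumed only at step 32, so steps 33 to 47 depend only on the other words. If the inner loop varies just the first two characters, those 15 steps can be undone once from the target digest, and every candidate then needs only steps 0 to 32 plus a state compare. The names run_steps, undo_steps and crack_first_two are hypothetical, and the sketch assumes a single-block password (at most 27 characters), BMP-only characters, and a known suffix after the first two characters. The real kernels are more elaborate; the slides linked above are the reference.

```python
import struct

MASK = 0xFFFFFFFF
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)

def rotl(x, s):
    return ((x << s) | (x >> (32 - s))) & MASK

def rotr(x, s):
    return ((x >> s) | (x << (32 - s))) & MASK

def F(x, y, z):
    return ((x & y) | (~x & z)) & MASK

def G(x, y, z):
    return (x & y) | (x & z) | (y & z)

def H(x, y, z):
    return x ^ y ^ z

# One entry per MD4 step: (boolean function, additive constant, word index, rotation).
ROUND3_ORDER = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
STEPS = (
    [(F, 0x00000000, i, (3, 7, 11, 19)[i % 4]) for i in range(16)]
    + [(G, 0x5A827999, (i % 4) * 4 + i // 4, (3, 5, 9, 13)[i % 4]) for i in range(16)]
    + [(H, 0x6ED9EBA1, ROUND3_ORDER[i], (3, 9, 11, 15)[i % 4]) for i in range(16)]
)

def run_steps(state, X, start, stop):
    """Apply MD4 steps start..stop-1 to a copy of the (a, b, c, d) state."""
    s = list(state)
    for i in range(start, stop):
        f, k, w, r = STEPS[i]
        j = (-i) % 4  # register updated by this step: a, d, c, b, a, ...
        t = s[j] + f(s[(j + 1) % 4], s[(j + 2) % 4], s[(j + 3) % 4]) + X[w] + k
        s[j] = rotl(t & MASK, r)
    return s

def undo_steps(state, X, start, stop):
    """Invert MD4 steps stop-1 down to start; needs only the words those steps use."""
    s = list(state)
    for i in range(stop - 1, start - 1, -1):
        f, k, w, r = STEPS[i]
        j = (-i) % 4
        t = rotr(s[j], r) - f(s[(j + 1) % 4], s[(j + 2) % 4], s[(j + 3) % 4]) - X[w] - k
        s[j] = t & MASK
    return s

def ntlm_block(password):
    """Pad a short password into one 64-byte MD4 block, returned as 16 LE words."""
    data = password.encode("utf-16-le")
    assert len(data) <= 55, "sketch handles single-block inputs only"
    padded = data + b"\x80" + b"\x00" * (55 - len(data)) + struct.pack("<Q", 8 * len(data))
    return struct.unpack("<16I", padded)

def ntlm(password):
    """Plain, unoptimized NTLM: all 48 steps plus the final feed-forward addition."""
    s = run_steps(IV, ntlm_block(password), 0, 48)
    return struct.pack("<4I", *[(a + b) & MASK for a, b in zip(s, IV)]).hex()

def crack_first_two(target_hex, suffix, alphabet):
    """Brute-force the first two characters, given the rest of the password."""
    digest = struct.unpack("<4I", bytes.fromhex(target_hex))
    X = list(ntlm_block("\0\0" + suffix))  # X[0] is a placeholder, filled in below
    final = [(d - iv) & MASK for d, iv in zip(digest, IV)]  # strip the feed-forward
    goal = undo_steps(final, X, 33, 48)  # done once; steps 33..47 never read X[0]
    for c0 in alphabet:
        for c1 in alphabet:
            X[0] = ord(c0) | (ord(c1) << 16)  # two UTF-16LE code units in one word
            if run_steps(IV, X, 0, 33) == goal:  # 33 steps per candidate, not 48
                return c0 + c1 + suffix
    return None
```

As a sanity check, ntlm("password") gives the well-known hash 8846f7eaee8fb117ad06bdd830b7586c, and crack_first_two("8846f7eaee8fb117ad06bdd830b7586c", "ssword", "abcdefghijklmnopqrstuvwxyz") recovers "password". The saving in this toy version is only the 15 skipped steps per candidate; it says nothing about the actual speedup figures quoted above.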