05-30-2017, 08:32 AM
For future searchers, note that later work does incur a small performance penalty (perhaps in the 8-12% range?) at the previous default of 16 due to adding support for big-endian architectures, so the default was dropped from 16 to 8.
~