09-21-2011, 06:02 PM
Hash-IT, I have been thinking about better dictionaries for GPGPU-based rule engines myself, and wrote a tool to collect words from various sources: wlc. The latest version can parse Wikipedia xml.bz2 dumps, but that takes a long time because of their size. You can download some preprocessed dicts from the DWPA site. Keep in mind that all the dicts there are unique as a whole.
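
To give an idea of what parsing such a dump involves, here is a minimal Python sketch. It is not the actual wlc code, only an illustration under some assumptions: the function name `collect_words`, the `min_len` parameter and the word regex are made up for this example, and it simply treats everything inside `<text>` elements as a source of words, wiki markup included.

```python
import bz2
import re
import xml.etree.ElementTree as ET

# Assumption: a "word" is a run of word characters excluding digits and underscore.
WORD_RE = re.compile(r"[^\W\d_]+")


def collect_words(xml_stream, min_len=1):
    """Yield each distinct word found in <text> elements, in first-seen order.

    xml_stream is a binary file object with the decompressed XML.
    min_len is a hypothetical filter for dropping words that are too short.
    """
    seen = set()
    # iterparse streams the document, so the dump is never loaded whole.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        # Dump elements carry an XML namespace; compare the local name only.
        if elem.tag.rsplit("}", 1)[-1] == "text" and elem.text:
            for word in WORD_RE.findall(elem.text):
                if len(word) >= min_len and word not in seen:
                    seen.add(word)
                    yield word
        # Drop the element's content once it has been processed.
        elem.clear()
```

It could be used roughly like this, with `dump.xml.bz2` standing in for whatever dump file you have: open it with `bz2.open("dump.xml.bz2", "rb")`, pass the file object to `collect_words(...)`, and write each yielded word to the output dict. The `seen` set keeps the output unique, at the cost of holding the whole vocabulary in memory.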