fgets-sse2 v0.01 released
#5
Sorry for reviving this thread.

Out of curiosity, I translated the len utility to use fgets_sse2 instead of fgets and am seeing some odd behavior (len is the stock build, ./len is my fgets_sse2 build). Output:

Code:
mangix@Mangix ~/testing
$ len 2 6 < enwik8 | wc -l
33205

mangix@Mangix ~/testing
$ ./len 2 6 < enwik8 | wc -l
33070

Oddly enough, when I use rockyou.txt everything is fine:

Code:
mangix@Mangix ~/testing
$ len 2 6 < rockyou.txt | wc -l
2227662

mangix@Mangix ~/testing
$ ./len.exe 2 6 < rockyou.txt | wc -l
2227662

When I piped the two outputs to separate files and compared them, it seems that the version using fgets_sse2 is not keeping lines which have a . at the end. Maybe it's an issue with newline characters being handled differently. enwik8 is available here: http://mattmahoney.net/dc/enwik8.zip

Edit: I found a second problem. When you feed it wordlists whose lines end in \r\n, the \r gets treated as part of the word. Looks like filtering is needed.
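
For illustration, the kind of filtering meant here could look roughly like the sketch below. It is not taken from fgets_sse2 or from the len source: strip_line_ending is a hypothetical helper name, and the sketch assumes the caller already knows the line length and that the buffer has room for a terminating NUL at buf[len], as it would after a plain fgets call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of fgets_sse2): trims one trailing "\n"
 * and, if present before it, one "\r" from buf[0..len).
 * Assumes buf holds at least len + 1 bytes. Returns the new length. */
static size_t strip_line_ending(char *buf, size_t len)
{
    assert(buf != NULL);

    if (len > 0 && buf[len - 1] == '\n')
        len--;                      /* Unix-style or second half of "\r\n" */
    if (len > 0 && buf[len - 1] == '\r')
        len--;                      /* Windows-style carriage return */

    buf[len] = '\0';                /* re-terminate at the trimmed length */
    return len;
}
```

With something like this applied before the length check, a line such as "password\r\n" would be measured as 8 characters rather than 9 or 10, so a wordlist with \r\n endings would be filtered the same way as one with plain \n endings.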


Messages In This Thread
fgets-sse2 v0.01 released - by atom - 01-06-2013, 08:39 PM
RE: fgets-sse2 v0.01 released - by epixoip - 01-06-2013, 08:42 PM
RE: fgets-sse2 v0.01 released - by undeath - 01-06-2013, 08:44 PM
RE: fgets-sse2 v0.01 released - by M@LIK - 01-07-2013, 05:46 AM
RE: fgets-sse2 v0.01 released - by Mangix - 08-06-2013, 12:22 AM
RE: fgets-sse2 v0.01 released - by arthur - 10-16-2013, 09:07 PM
RE: fgets-sse2 v0.01 released - by arthur - 10-17-2013, 09:50 AM