fgets-sse2 v0.01 released
#6
Hi guys, is this still alive?

Have stumbled upon this one and gave it a try with VS on Windows. It seems to work fine for me, but I don't get anywhere near a 4x speedup: *only* about double compared to fgets from the VS runtime, and about 30% compared to doing the same thing over a buffer read with fread, without intrinsics. I guess this SSE2-optimized fgets is similar in concept to using fread, but the performance comes from parallel processing with SSE2.
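To illustrate what I mean by parallel processing, here is a minimal sketch of scanning a buffer for a newline 16 bytes at a time with SSE2 intrinsics. This is my own illustration and not the code from fgets-sse2; the function name `find_newline_sse2` is hypothetical, and I am assuming the buffer has already been filled (for example by fread).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper, not taken from fgets-sse2.
 * Returns the index of the first '\n' in buf[0..len), or len if there is none. */
static size_t find_newline_sse2(const char *buf, size_t len)
{
    const __m128i nl = _mm_set1_epi8('\n');  /* 16 copies of '\n' */
    size_t i = 0;

    /* Compare 16 bytes per iteration. */
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        /* One bit per byte: bit k is set when byte k equals '\n'. */
        unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, nl));
        if (mask != 0) {
            /* Find the lowest set bit without compiler-specific builtins. */
            size_t bit = 0;
            while ((mask & 1u) == 0) {
                mask >>= 1;
                ++bit;
            }
            return i + bit;
        }
    }

    /* Scalar loop for the remaining tail of fewer than 16 bytes. */
    for (; i < len; ++i) {
        if (buf[i] == '\n')
            return i;
    }
    return len;
}
```

The unaligned load keeps the sketch simple; a tuned version would probably align the reads and use a bit-scan instruction instead of the shift loop.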

With fread:

duration 0: 0.957747
duration 1: 0.869212
duration 2: 0.842793
Mean value: 0.889917

With fgets:

duration 0: 1.258209
duration 1: 1.258632
duration 2: 1.258991
Mean value: 1.258611

With SSE2-optimized fgets:

duration 0: 0.661175
duration 1: 0.661901
duration 2: 0.661526
Mean value: 0.661534
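
For anyone who wants to check the ratios, this is a small sketch of the arithmetic behind the mean values and the speedup figures. The helper names `mean` and `speedup` are just for illustration and are not part of my test code.

```c
#include <stddef.h>

/* Arithmetic mean of n timing samples. */
static double mean(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += v[i];
    return n ? sum / (double)n : 0.0;
}

/* Speedup factor: how many times faster the optimized run is than the baseline. */
static double speedup(double baseline, double optimized)
{
    return baseline / optimized;
}

/* Example use with the timings listed above:
 *   const double t_fgets[] = { 1.258209, 1.258632, 1.258991 };
 *   const double t_sse2[]  = { 0.661175, 0.661901, 0.661526 };
 *   double s = speedup(mean(t_fgets, 3), mean(t_sse2, 3));
 */
```

With the numbers above this gives roughly 1.90x over fgets and roughly 1.35x over the plain fread loop.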

I used 100 MB of randomly generated ASCII with a relatively uniform distribution of newlines. I had to make some changes to the code to get it to compile with VS (just in the struct declarations). My test code can be seen here: http://www.nextpoint.se/?p=580 and if someone would like the modified code, I can make it available.
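
For reference, test data of this kind can be produced with something like the sketch below. This is not my actual generator, just an assumption of one simple way to do it: the function name `fill_random_ascii` and the `avg_line_len` parameter are hypothetical, and `rand()` is used only to keep the example short.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical generator: fills buf with printable ASCII (0x20..0x7E).
 * Each byte becomes '\n' with probability 1/avg_line_len, so newlines
 * are spread fairly evenly through the buffer. */
static void fill_random_ascii(char *buf, size_t len,
                              unsigned avg_line_len, unsigned seed)
{
    srand(seed);  /* fixed seed gives the same data on every run */
    for (size_t i = 0; i < len; ++i) {
        if (avg_line_len > 0 && (unsigned)rand() % avg_line_len == 0)
            buf[i] = '\n';
        else
            buf[i] = (char)(' ' + rand() % 95);
    }
}
```

Writing the buffer to a file with fwrite and then reading it back with each of the three variants keeps the input identical across runs.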


Messages In This Thread
fgets-sse2 v0.01 released - by atom - 01-06-2013, 08:39 PM
RE: fgets-sse2 v0.01 released - by epixoip - 01-06-2013, 08:42 PM
RE: fgets-sse2 v0.01 released - by undeath - 01-06-2013, 08:44 PM
RE: fgets-sse2 v0.01 released - by M@LIK - 01-07-2013, 05:46 AM
RE: fgets-sse2 v0.01 released - by Mangix - 08-06-2013, 12:22 AM
RE: fgets-sse2 v0.01 released - by arthur - 10-16-2013, 09:07 PM
RE: fgets-sse2 v0.01 released - by arthur - 10-17-2013, 09:50 AM