Timestamp:
Nov 20, 2012, 5:29:13 PM (6 years ago)
Author:
linmengl
Message:

use store_unaligned for speed

File:
1 edited

  • proto/CSV/csv2xml/Report on Perf.txt

    r2611 r2663  
    28  28     |        B2. XML logic and I/O time
    29  29     -            (I/O unstable, logic simple)
     30
     31  Nov. 6th
     32
     33  When I turn off XML output and fwrite each buffer directly, it is much faster: every test case takes less than 20 cyc/byte. This implies that issuing many small fwrite calls is quite slow. Consider batching the small writes into one large fwrite.
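
The batching idea can be sketched in C as follows. This is only an illustration, not the code from csv2xml: the names `batch_writer`, `batch_write`, `batch_flush` and the 1 MiB capacity are hypothetical. Small pieces of output are copied into a staging buffer, and fwrite is called only when the buffer is full or the caller flushes.

```c
#include <stdio.h>
#include <string.h>

#define BATCH_CAPACITY (1 << 20)  /* hypothetical 1 MiB staging buffer */

typedef struct {
    FILE  *out;                   /* destination stream */
    size_t used;                  /* bytes currently staged */
    size_t flushes;               /* number of fwrite calls issued so far */
    char   data[BATCH_CAPACITY];
} batch_writer;

/* Hand all staged bytes to stdio in a single fwrite call. */
static int batch_flush(batch_writer *w) {
    if (w->used == 0) return 0;
    size_t n = fwrite(w->data, 1, w->used, w->out);
    int ok = (n == w->used);
    w->flushes++;
    w->used = 0;
    return ok ? 0 : -1;
}

/* Append len bytes; flush first if they would not fit. */
static int batch_write(batch_writer *w, const void *src, size_t len) {
    if (len > BATCH_CAPACITY - w->used && batch_flush(w) != 0) return -1;
    if (len > BATCH_CAPACITY) {   /* oversized chunk: write it straight through */
        w->flushes++;
        return fwrite(src, 1, len, w->out) == len ? 0 : -1;
    }
    memcpy(w->data + w->used, src, len);
    w->used += len;
    return 0;
}
```

The struct is large, so it should be declared static or allocated on the heap rather than placed on the stack. The `flushes` counter is there only to make the number of fwrite calls easy to check.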
     34
     35  Nov. 8th
     36
     37  I tried a static buffer of 16KB, 32KB, or larger; it doesn't speed things up significantly. fwrite already has a default buffer, whose size is 8KB. Maybe a single thread cannot make it any faster. Next I want to measure the ratio of output size to input size.
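
One standard way to give stdio a larger static buffer is `setvbuf`. The log does not say whether this is how the static buffer was set up, so the sketch below is an assumption; `open_with_static_buffer` and `OUT_BUF_SIZE` are hypothetical names.

```c
#include <stdio.h>

#define OUT_BUF_SIZE (32 * 1024)   /* one of the sizes mentioned above */

static char out_buf[OUT_BUF_SIZE]; /* static so it outlives the stream */

/* Open path for writing and ask stdio to use out_buf with full buffering.
   setvbuf must be called before any other operation on the stream. */
static FILE *open_with_static_buffer(const char *path) {
    FILE *f = fopen(path, "wb");
    if (f == NULL) return NULL;
    if (setvbuf(f, out_buf, _IOFBF, sizeof out_buf) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}
```

Because `out_buf` is shared, this sketch supports only one open output stream at a time.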
     38
     39  Nov. 13th
     40
     41  I tried parsing segment by segment: parse a dozen blocks, write the result to a buffer, then flush the buffer to the file with fwrite's default buffering turned off. It doesn't help much yet... WTF!
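
A minimal C sketch of that arrangement, assuming stdio buffering is disabled with `setvbuf` and each segment's output is written with one call. The helper names `disable_stdio_buffer` and `flush_segment` are hypothetical and do not come from csv2xml.

```c
#include <stdio.h>

/* Turn off stdio's own buffer so each fwrite is passed on as soon as possible.
   Must be called before any other operation on the stream. */
static int disable_stdio_buffer(FILE *f) {
    return setvbuf(f, NULL, _IONBF, 0);
}

/* Write the output of one parsed segment with a single fwrite call. */
static int flush_segment(FILE *f, const char *seg, size_t len) {
    return fwrite(seg, 1, len, f) == len ? 0 : -1;
}
```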
     42
     43  I wrote a better class for segment iterators with Nigel's help. File-off performance improved, but file-on performance didn't change.
     44
     45  Nov. 15th
     46  At the beginning of today, the test results are:
     47
     48  File         Size        File On  File Off  Mask Only  Size(Output)/Size(Input)
     49
     50  2006scores   (1.8MB)     40       7         7          2.3
     51  scaledwps    (2.0MB)     34       8         7          3.55
     52  gen1000      (28MB)      80       12        3          6.46
     53  gen10000     (280MB)     167      12        3          6.46
     54  L2_2012-01   (103.2MB)   39       6         3          2.39
     55  L2_2012-02   (130.5MB)   42       6         3          2.28
     56
     57  Nov. 20th
     58  After switching to the SIMD store, store_unaligned, I get these test results:
     59
     60  File         Size        File On
     61
     62  2006scores   (1.8MB)     39
     63  scaledwps    (2.0MB)     32
     64  gen1000      (28MB)      84
     65  gen10000     (280MB)     125
     66  L2_2012-01   (103.2MB)   32
     67  L2_2012-02   (130.5MB)   39
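
For readers unfamiliar with unaligned SIMD stores, here is a rough C sketch of the general technique. It uses the Intel SSE2 intrinsic `_mm_storeu_si128` as a stand-in; the project's own `store_unaligned` is a different function whose definition is not shown here. `copy_unaligned` is a hypothetical helper, and an x86 target with SSE2 is assumed.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <string.h>

/* Copy len bytes to dst, 16 at a time, using unaligned loads and stores.
   dst and src need no particular alignment and must not overlap. */
static void copy_unaligned(char *dst, const char *src, size_t len) {
    size_t i = 0;
    for (; i + 16 <= len; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), v);
    }
    memcpy(dst + i, src + i, len - i);  /* scalar tail, fewer than 16 bytes */
}
```

The point of the unaligned variant is that output can be appended at any byte offset in the buffer, 16 bytes per store, without first padding to a 16-byte boundary.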