source: proto/CSV/csv2xml/Report on Perf.txt @ 2663

Last change on this file since 2663 was 2663, checked in by linmengl, 6 years ago

use store_unaligned to for speed

File size: 2.5 KB
Report on Perf

Appendix I:  CPU cyc/kElem, using BUFFER_PROFILING

When file output is off, the cyc/kElem figure is much more stable (the cumulative speed does not suddenly increase).

File         Size       File On  File Off  Mask Only

2006scores   (1.8MB)    37361    10100     2785
scaledwps    (2.0MB)    52664    16350     2450
gen1000      (28MB)     120480   21494     1302      (delim, eol intensive)
gen10000     (280MB)    144629   21458     1214      (10 copies of the above)
L2_2012-01   (103.2MB)  40943    7839      1134
L2_2012-02   (130.5MB)  29058    7399      1153


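The unit in the table above is cycles per thousand elements. A minimal sketch of that normalization follows, assuming a raw cycle count and an element count are available from the profiler. The function names are hypothetical and are not part of BUFFER_PROFILING.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: normalizes a raw cycle count to cycles per
// thousand elements (cyc/kElem). The element count must be nonzero.
double cyc_per_kelem(std::uint64_t cycles, std::uint64_t elems) {
    assert(elems > 0);
    return 1000.0 * static_cast<double>(cycles) / static_cast<double>(elems);
}

// Converts a cyc/kElem figure to cycles per single element.
double cyc_per_elem(double cyc_kelem) {
    return cyc_kelem / 1000.0;
}
```

If one element is one input byte (an assumption, the notes do not say), dividing a cyc/kElem figure by 1000 gives cyc/byte.
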
Appendix II

   | for each block
   |
   |    A. Transposing and mask parsing
   -       (Fast, 1~3 cyc/byte)
   |
   |    B. Split buffer according to masks and output into XML
   |        B1. Split (eol, delim, &, hide_symbol)
   -            (Linear scan, forward iterators)
   |
   |        B2. XML logic and I/O time
   -            (I/O unstable, logic simple)

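Step B1 above can be illustrated with a small sketch: given a bit mask that marks delimiter positions in a block, scan the set bits from low to high and cut the block into fields. This is not the csv2xml implementation. The 64-byte block size and the names make_mask, lowest_set_bit and split_by_mask are assumptions for illustration, and make_mask is a byte-at-a-time stand-in for the real transposition in step A.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Index of the lowest set bit; the mask must be nonzero.
static unsigned lowest_set_bit(std::uint64_t mask) {
    assert(mask != 0);
    unsigned pos = 0;
    while ((mask & 1u) == 0) { mask >>= 1; ++pos; }
    return pos;
}

// Stand-in for step A: bit i is set when block[i] == marker.
// Only the first 64 bytes of the block are considered.
std::uint64_t make_mask(const std::string& block, char marker) {
    std::uint64_t mask = 0;
    for (std::size_t i = 0; i < block.size() && i < 64; ++i)
        if (block[i] == marker) mask |= (std::uint64_t{1} << i);
    return mask;
}

// Sketch of step B1: one forward pass over the set bits of the mask.
// Every set bit must index a byte inside the block.
std::vector<std::string> split_by_mask(const std::string& block,
                                       std::uint64_t delim_mask) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (delim_mask != 0) {
        std::size_t pos = lowest_set_bit(delim_mask);
        fields.push_back(block.substr(start, pos - start));
        start = pos + 1;
        delim_mask &= delim_mask - 1;  // clear the lowest set bit
    }
    fields.push_back(block.substr(start));  // text after the last delimiter
    return fields;
}
```

The scan touches each delimiter once and never moves backwards, which matches the "linear scan, forward iterators" note in the outline.
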
Nov. 6th

When I turn off XML output and fwrite each buffer directly, it is much faster: every test case takes less than 20 cyc/byte. This implies that issuing many small fwrite calls is quite slow. Consider batching the small writes into one large fwrite.

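The batching idea can be sketched as follows. BatchedWriter is a hypothetical class, not code from csv2xml, and the call counter exists only to make the effect visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: collects many small writes in memory and
// hands them to fwrite in a few large calls.
class BatchedWriter {
public:
    BatchedWriter(std::FILE* out, std::size_t capacity)
        : out_(out), capacity_(capacity) {
        assert(out != nullptr);
        buf_.reserve(capacity);
    }
    ~BatchedWriter() { flush(); }
    BatchedWriter(const BatchedWriter&) = delete;
    BatchedWriter& operator=(const BatchedWriter&) = delete;

    void write(const char* data, std::size_t n) {
        if (buf_.size() + n > capacity_) flush();  // no room: drain first
        if (n >= capacity_) {                      // too large to batch
            std::fwrite(data, 1, n, out_);
            ++fwrite_calls_;
            return;
        }
        buf_.insert(buf_.end(), data, data + n);
    }

    void flush() {
        if (buf_.empty()) return;
        std::fwrite(buf_.data(), 1, buf_.size(), out_);
        ++fwrite_calls_;
        buf_.clear();
    }

    std::size_t fwrite_calls() const { return fwrite_calls_; }

private:
    std::FILE* out_;
    std::size_t capacity_;
    std::vector<char> buf_;
    std::size_t fwrite_calls_ = 0;
};
```

With a 1024-byte capacity, a thousand 9-byte writes reach fwrite in 9 calls instead of 1000. Error handling for short writes is left out of the sketch.
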
Nov. 8th

I tried a static buffer of 16KB, 32KB, or larger; it doesn't speed things up significantly. fwrite already has a default buffer, whose size is 8KB. Maybe a single thread cannot make it any faster. Next I want to measure the ratio of output size to input size.

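Both steps in this entry can be sketched with standard C stdio calls: setvbuf to give a stream a larger caller-owned buffer, and fseek/ftell to measure file sizes for the output/input ratio. The helper names are hypothetical, and the use of long limits file_size to what ftell can report on the platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Gives the stream a caller-owned, fully buffered buffer. Call this
// before any other operation on the stream; the storage must stay
// alive until the stream is closed.
bool use_large_buffer(std::FILE* f, std::vector<char>& storage) {
    assert(!storage.empty());
    return std::setvbuf(f, storage.data(), _IOFBF, storage.size()) == 0;
}

// Size of an open file in bytes, or -1 on error. The current
// position is restored afterwards.
long file_size(std::FILE* f) {
    long here = std::ftell(f);
    if (here < 0 || std::fseek(f, 0, SEEK_END) != 0) return -1;
    long size = std::ftell(f);
    std::fseek(f, here, SEEK_SET);
    return size;
}

// Ratio Size(Output)/Size(Input); returns 0 for an empty input.
double expansion_ratio(long in_bytes, long out_bytes) {
    return in_bytes > 0 ? static_cast<double>(out_bytes) / in_bytes : 0.0;
}
```

A ratio well above 1 means the XML output is several times larger than the CSV input, so the write side does correspondingly more I/O per input byte.
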
Nov. 13th

I tried parsing segment by segment: parse a dozen blocks, write the result to a buffer, then flush it to the file, with fwrite's default buffering turned off. It doesn't help much yet... WTF!

I wrote a better class for segment iterators with Nigel's help. File-off performance improved, but file-on performance didn't change.

Nov. 15th
At the beginning of today, the test results are:

File         Size       File On  File Off  Mask Only  Size(Output)/Size(Input)

2006scores   (1.8MB)    40       7         7          2.3
scaledwps    (2.0MB)    34       8         7          3.55
gen1000      (28MB)     80       12        3          6.46
gen10000     (280MB)    167      12        3          6.46
L2_2012-01   (103.2MB)  39       6         3          2.39
L2_2012-02   (130.5MB)  42       6         3          2.28

Nov. 20th
After switching to the SIMD store, store_unaligned, I have this test result:

File         Size       File On

2006scores   (1.8MB)    39
scaledwps    (2.0MB)    32
gen1000      (28MB)     84
gen10000     (280MB)    125
L2_2012-01   (103.2MB)  32
L2_2012-02   (130.5MB)  39
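
For illustration, an unaligned SIMD store lets a 16-byte register be written to any byte offset in the output buffer. The sketch below is not the store_unaligned used in csv2xml. It assumes the SSE2 intrinsics _mm_loadu_si128 and _mm_storeu_si128 as a stand-in, falls back to memcpy where SSE2 is not available, and copy_unaligned is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Copies n bytes from src to dst. With SSE2, whole 16-byte chunks go
// through unaligned loads and stores, so neither pointer needs any
// particular alignment. The ranges must not overlap.
void copy_unaligned(char* dst, const char* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    std::size_t i = 0;
#if defined(__SSE2__)
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i), v);
    }
#endif
    // Remaining tail, or the whole range when SSE2 is not available.
    if (i < n) std::memcpy(dst + i, src + i, n - i);
}
```

Output fields have variable length, so the write position in the output buffer is rarely 16-byte aligned; that is presumably why an unaligned store is the variant that applies here.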