source: trunk/lib/symtab/ReadMe @ 1229

Last change on this file since 1229 was 1229, checked in by vla24, 8 years ago

Reorganized SymbolTable library

Symbol Table Development Notes

1. Integrate the packed, length-partitioned symbol table for lengths 1 through 8. - DONE.

   Avoid memory copies for lengths above 8. - DONE.

   Use symbol values to avoid unaligned loads and to give the prefetching hardware the opportunity to load values into cache automatically. - DONE.

2. Map lengths to indices to eliminate the allocation of a length 0 entry in every array. OPTIONAL.

3. Refactor the packed symbols code to do 'fast' matching for single symbol occurrences. TODO.

4. Re-examine the design decision to sort by Length AND Key rather than strictly by Length. TODO.
   This may fall into the category of JIC (Just In Case) design.
   This may impact the final SymbolTable interface.
   This decision may add complexity to any effort to parallelize the length-sorted symbol table across cores.

5. Avoid maintenance of large arrays across buffers: Buffer Data (symbol values, document index), Sort Data, Flatten Data (optional). DONE.

6. Develop a 'finalize' method to destroy temporary data structures and build gid <-> symbol value mappings, etc. TODO.

7. Finalize the SymbolTable interface.

8. Examine the idea of creating a C++-templated symbol table with template specialization based on both symbol length L
   and prefix length N.

9. Set using fw<L>::value.

10. Move away from convert<3>, convert<5>, convert<6>, convert<7>, which mask padding with 1's. Instead, require that values are already padded/masked with 1's. DONE - INIT_ONES HeapArray compile-time flag.

11. Write packed gids directly to the flattened global gid array (1D). DONE.


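Items 1, 2 and 10 above can be illustrated with a minimal sketch. It is not the library's code: pack_symbol and length_to_index are hypothetical names, and the use of a single 64-bit word for symbols of length 1 through 8 is an assumption based on the length range in item 1. The sketch shows a symbol being packed into a fixed-width value whose unused bytes are pre-filled with 1 bits (the convention item 10 attributes to INIT_ONES), plus a length-to-index mapping that leaves no slot for length 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the symtab library): pack a symbol of
// 1..8 bytes into a 64-bit value. The value starts as all 1 bits, so any
// bytes not overwritten by the symbol remain padded with 1's and no
// per-length masking step is needed afterwards.
// Precondition: 1 <= len <= 8 and s points to at least len readable bytes.
inline std::uint64_t pack_symbol(const char* s, std::size_t len) {
    std::uint64_t v = ~std::uint64_t{0};  // padding of all 1 bits
    std::memcpy(&v, s, len);              // copy only the symbol's bytes
    return v;
}

// Hypothetical helper: map symbol lengths 1..8 to array indices 0..7,
// so per-length arrays never allocate an unused entry for length 0.
// Precondition: len >= 1.
constexpr std::size_t length_to_index(std::size_t len) {
    return len - 1;
}
```

With this convention, two occurrences of the same symbol pack to the same value regardless of what follows them in the input buffer, so packed values can be compared with a single integer comparison.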
A Multicore Symbol Table Design

1. Re-examine the design decision to assign gids from 0 to n and then to use these gid values for symbol value lookup.
2. Determine how to partition data for multicore.

   Ideas:

   A symbol table for each length or group of lengths.

   Partitioning symbols by length prevents gid contention.

   Symbol lookup could then proceed with a length parameter, and gids could be local to groups of lengths.
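
The ideas above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SymbolTable interface: LengthGroupTable and PartitionedSymbolTable are hypothetical names, one group per length is assumed, and std::unordered_map stands in for whatever per-length structure the library actually uses. Each group hands out its own gids starting at 0, so groups share no counter, and lookup takes the length as a parameter alongside the gid.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical table for one symbol length. gids are local to this
// group and are assigned densely from 0 in insertion order.
class LengthGroupTable {
public:
    // Returns the existing gid for a known symbol, or assigns the next one.
    std::uint32_t insert(const std::string& symbol) {
        auto it = gids_.find(symbol);
        if (it != gids_.end()) return it->second;
        const auto gid = static_cast<std::uint32_t>(values_.size());
        gids_.emplace(symbol, gid);
        values_.push_back(symbol);
        return gid;
    }

    // gid -> symbol value; throws std::out_of_range for an unknown gid.
    const std::string& value(std::uint32_t gid) const { return values_.at(gid); }

private:
    std::unordered_map<std::string, std::uint32_t> gids_;
    std::vector<std::string> values_;
};

// Hypothetical wrapper: one group per length 1..max_length.
class PartitionedSymbolTable {
public:
    explicit PartitionedSymbolTable(std::size_t max_length) : groups_(max_length) {}

    // Routes the symbol to the group for its length and returns the
    // group-local gid. Empty or over-long symbols throw std::out_of_range.
    std::uint32_t insert(const std::string& symbol) {
        return groups_.at(symbol.size() - 1).insert(symbol);
    }

    // Lookup needs the length as well, because gids repeat across groups.
    const std::string& lookup(std::size_t length, std::uint32_t gid) const {
        return groups_.at(length - 1).value(gid);
    }

private:
    std::vector<LengthGroupTable> groups_;
};
```

Because no state is shared between groups, each group could in principle be owned by a single core without locking. The sketch itself does not create any threads.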