1 | STANDARD LIBRARY LENGTH SORTING |
---|

2 | |
---|

3 | A symbol table using STL C library. |
---|

4 | To compile : make symtab_stl |
---|

5 | Data structure : lib/symtab/symtab.* |
---|

6 | |
---|

7 | --------------------------------- |
---|

8 | |
---|

9 | HASH TABLE (Jestress) |
---|

10 | |
---|

11 | This symbol table uses Jestress hashing function. |
---|

12 | To compile : make symtab_hash |
---|

13 | Data structure : lib/symtab/hash_symbol_table.* |
---|

14 | |
---|

15 | --------------------------------- |
---|

16 | |
---|

17 | LENGTH SORTING |
---|

18 | |
---|

19 | 1. MASK-EQUAL-AND/OR |
---|

20 | This is Ken's implementation. We are using mask-equal-and/or method |
---|

21 | To compile : make symtab_ls |
---|

22 | Data structure : lib/symtab/ls_symbol_table*.* |
---|

23 | |
---|

24 | 2. USE_IDENTITY_SORT f(L) = L |
---|

25 | Input: |
---|

26 | - a bitstream that marks start positions |
---|

27 | - a bitstream that marks end positions |
---|

28 | Sequentially scan symbols and group the symbols based on their length. |
---|

29 | To compile : make symtab_id |
---|

30 | Data structure : lib/symtab/pbgs_identity_symbol_table.h |
---|

31 | |
---|

32 | --------------------------------- |
---|

33 | |
---|

34 | PARALLEL BITSTREAM-BASED LENGTH SORTING |
---|

35 | |
---|

36 | 1. USE_IDENTITY_SORT f(L) = L |
---|

37 | Data structure : lib/symtab/pbgs_identity_symbol_table.h |
---|

38 | There are 2 different implementations: |
---|

39 | a. We evaluate the bitstreams using interpose32 and advance32. |
---|

40 | To compile : make symtab_pbgs_id_adv |
---|

41 | b. We evaluate the bitstreams using one shift right for each bitutil.advance call |
---|

42 | To compile : make symtab_pbgs_id |
---|

43 | |
---|

44 | 2. USE_LOG_SORT f(L) = log(L) |
---|

45 | To compile : make symtab_pbgs_log |
---|

46 | Data structure : lib/symtab/pbgs_log_symbol_table.h |
---|

47 | |
---|

48 | 3. USE_DIV_SORT f(L) = L/4 |
---|

49 | NOTE: This implementation is not ready yet |
---|

50 | To compile : make symtab_pbgs_div |
---|

51 | Data structure : lib/symtab/pbgs_div_symbol_table.h |
---|

52 | |
---|

53 | |
---|

54 | HASH TABLE IMPLEMENTATION NOTES: |
---|

55 | - All the 3 strategies are using the same hash table implementation (bitstream_hash_table.h) and the same comparison implementation (equal_compare.h). |
---|

56 | |
---|

57 | |
---|

58 | XMLWF IMPLEMENTATION NOTES: |
---|

59 | - XMLWF makes use of overlapping buffer so that the system could look ahead or look back as needed. |
---|

60 | - For string buffer implementation, the system can look 32 bytes back/ahead (1 byte = 1 character) |
---|

61 | - For hash value implementation, the system can look 128 bits back/ahead (1 bit = 1 character/position) |
---|