1 | \section{Conclusion}\label{sec:Concl} |
---|

2 | \paragraph*{Contributions} |
---|

3 | A new class of regular expression matching algorithm has been |
---|

4 | introduced based on the concept of bit-parallel data streams |
---|

5 | together with the MatchStar operation. The algorithm is |
---|

6 | fully general for nondeterministic regular expression matching; |
---|

7 | however it does not address the nonregular extensions |
---|

8 | found in Perl-compatible backtracking implementations. |
---|

9 | Taking advantage of the SIMD features available on commodity |
---|

10 | processors, its implementation in a grep too offers consistently good performance in |
---|

11 | contrast to available alternatives. While lacking some |
---|

12 | special optimizations found in other engines to deal with |
---|

13 | repeated substrings or to perform skipping actions based |
---|

14 | on fixed substrings, it nevertheless performs competitively |
---|

15 | in all cases. The algorithm tends to scale very well with regular |
---|

16 | expression complexity, often with order-of-magnitude |
---|

17 | performance advantage over even the best of its competitors. |
---|

18 | |
---|

19 | A parallelized algorithm for long-stream addition has also |
---|

20 | been introduced in the paper making a key contribution |
---|

21 | to the scalability of the bit-parallel matching technique |
---|

22 | overall and that of MatchStar in particular. This |
---|

23 | algorithm has enabled straightforward extension of the |
---|

24 | matching algorithm to the implementation using 256-bit |
---|

25 | AVX2 technology as well as 4096-bit SIMT implementation |
---|

26 | on an AMD GPU. |
---|

27 | |
---|

28 | \paragraph*{Future Work} |
---|

29 | |
---|

30 | An important area of future work is to develop and assess |
---|

31 | multicore versions of the algorithm to handle regular expression |
---|

32 | matching problems involving larger rulesets than are |
---|

33 | typically encountered in the grep problem. Such implementations |
---|

34 | could have useful application in tokenization and network |
---|

35 | intrusion detection for example. |
---|

36 | |
---|

37 | Another area of interest is to extend the capabilities of |
---|

38 | the underlying method with addition features for substring |
---|

39 | capture, zero-width assertions and possibly |
---|

40 | backreference matching. Extending Unicode support beyond |
---|

41 | basic Unicode character handling to include full Unicode |
---|

42 | character class support and normalization forms is also |
---|

43 | worth investigating. |
---|

44 | |
---|