Changeset 4501 for docs/Working


Timestamp: Feb 11, 2015, 5:46:19 PM (4 years ago)
Author: nmedfort
Message: Strip unicode agnostic optimizations
Location: docs/Working/icGrep
Files: 2 edited

  • docs/Working/icGrep/architecture.tex

    r4499 r4501
    13 13  The layering enables further optimization based on information available at each stage.
    14 14  %
    15     The initial \REParser{} validates and transforms the input \RegularExpression{} into an abstract syntax tree (AST).
       15  The \REParser{} validates and transforms the input \RegularExpression{} into an abstract syntax tree (AST).
    16 16  %
    17 17  %The AST is a minimalistic representation that, unlike traditional \RegularExpression{}, is not converted into a NFA or DFA for further processing.
     
    19 19  %Instead, \icGrep{} passes the AST into the transformation module, which includes a set of \RegularExpression{} specific optimization passes.
    20 20  %
    21     Successive \RegularExpression{} Transformations exploit knowledge domain
       21  Successive \RegularExpression{} Transformations exploit domain
    22 22  knowledge to optimize the regular expressions.
    23 23  %
    24     An initial \emph{Nullable} pass, determines whether the \RegularExpression{}
    25     contains prefixes or suffixes that may be removed or
    26     modified whilst matching the same lines of text as the original expression.
       24  %An initial \emph{Nullable} pass, determines whether the \RegularExpression{}
       25  %contains prefixes or suffixes that may be removed or
       26  %modified whilst matching the same lines of text as the original expression.
    27 27  %
    28     For example, ``\verb|a*bc+|'' is equivalent to ``\verb|bc|'' because the Kleene Star (Plus) operator matches zero (one) or more instances of a
    29     specific character.
       28  %For example, ``\verb|a*bc+|'' is equivalent to ``\verb|bc|'' because the Kleene Star (Plus) operator matches zero (one) or more instances of a
       29  %specific character.
    30 30  %
    31 31  The aforementioned \texttt{toUTF8} transformation also applies during this phase to generate code unit classes.
     
    37 37  %This is described in more detail in \S\ref{sec:Unicode:toUTF8}.
    38 38  %
    39     A final \emph{Simplification} pass flattens nested structures into their simplest legal form.
       39  %A final \emph{Simplification} pass flattens nested structures into their simplest legal form.
    40 40  %
    41     For example, ``\verb`a(b((c|d)|e))`'' becomes ``\verb`ab(c|d|e)`'' and ``\verb`([0-9]{3,5}){3,5}`'' becomes ``\verb`[0-9]{9,25}`''.
       41  %For example, ``\verb`a(b((c|d)|e))`'' becomes ``\verb`ab(c|d|e)`'' and ``\verb`([0-9]{3,5}){3,5}`'' becomes ``\verb`[0-9]{9,25}`''.
    42 42  %
    43 43  
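
The nested-repetition rewrite in the commented-out Simplification example, where `([0-9]{3,5}){3,5}` becomes `[0-9]{9,25}`, can be sketched as below. This is only an illustration and not icGrep's code: the function name `merge_bounds` and the tuple representation of the bounds are assumptions made here. The sketch also checks a condition the text does not state, namely that the merged range is only valid when the lengths produced by consecutive outer repetition counts leave no gaps.

```python
def merge_bounds(inner, outer):
    """Collapse (x{a,b}){c,d} into x{a*c, b*d} when that is safe.

    Hypothetical helper for illustration; returns None when the
    nested form cannot be expressed as a single bounded repetition.
    """
    a, b = inner
    c, d = outer
    # k repetitions of the group yield between k*a and k*b copies of x.
    # The ranges for k and k+1 join without a gap only if
    # (k+1)*a <= k*b + 1 for every k in [c, d).
    for k in range(c, d):
        if (k + 1) * a > k * b + 1:
            return None
    return (a * c, b * d)


if __name__ == "__main__":
    print(merge_bounds((3, 5), (3, 5)))  # the example from the diff
    print(merge_bounds((3, 3), (1, 2)))  # x{3} once or twice: 3 or 6 copies only
```

Under these assumptions the diff's example merges to `(9, 25)`, while `(x{3}){1,2}` is left alone because it matches exactly 3 or 6 copies and nothing in between.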