Timestamp:
Jan 30, 2013, 5:29:40 PM (7 years ago)
Author:
nmedfort
Message:

more edits

File:
1 edited

  • docs/Working/icXML/multithread.tex

--- r2525
+++ r2871
 % which naturally enables pipeline parallel processing.
 
-As discussed in section \ref{background:xerces}, Xerces can be considered a complex finite-state machine.
-As an application class, finite-state machines are considered very difficult to parallelize
-and have been termed ``embarassingly sequential.'' \cite{Asanovic:EECS-2006-183}.
-However, \icXML{} is designed to organize processing into logical layers that
-are separable.   In particular, layers within the \PS{} are designed to operate
+As discussed in section \ref{background:xerces}, Xerces can be considered an FSM application.
+These are ``embarrassingly sequential''~\cite{Asanovic:EECS-2006-183} and notoriously difficult to parallelize.
+However, \icXML{} is designed to organize processing into logical layers.
+In particular, layers within the \PS{} are designed to operate
 over significant segments of input data before passing their outputs on for
 subsequent processing.  This fits well into the general model of pipeline
     
 
 The most straightforward division of work in \icXML{} is to separate
-the \PS{} and the \MP{} into distinct logical layers in a two-stage pipeline.
+the \PS{} and the \MP{} into distinct logical layers forming two separate stages.
+The resultant application, {\it\icXMLp{}}, is a coarse-grained software-pipeline application.
 In this case, the \PS{} thread $T_1$ reads 16k of XML input $I$ at a time and produces the
 content, symbol and URI streams, then stores them in a pre-allocated shared data structure $S$.
     
 \subfigure[]{
 \includegraphics[width=0.48\textwidth]{plots/threads_timeline2.pdf}
+\label{threads_timeline2}
 }
 \caption{Thread Balance in Two-Stage Pipelines}
-\label{threads_timeline2}
 
 \end{figure}
     
 % and the first thread has to wait for the second thread to finish reading the shared data before it can reuse the memory space.
 
-Overall, our design assumption is that an accelerated Xerces parser will be
-most significant for applications that themselves perform substantial
-processing on the parsed XML data delivered.  Our design is intended for
-a range of applications ranging between two design points.   The first
-design point is one in which XML parsing cost handled by the
-\PS{} dominates at 67\% of the overall
-cost, with the cost of application processing (including the driver logic
-withinn the \MP{}) still being quite significant
-at 33\%.   The second is almost the reverse scenario, the cost of application processing
-dominates at 60\% of the overall cost, while the overall cost of parsing represents
-an overhead of 40\%.
+Overall, our design is intended to benefit a range of applications.
+Conceptually, we consider two design points.
+In the first, the parsing performed by the \PS{} dominates at 67\% of the overall cost,
+with the cost of application processing (including the driver logic within the \MP{}) at 33\%.
+The second is almost the opposite scenario: the cost of application processing dominates at 60\%,
+while the cost of XML parsing represents an overhead of 40\%.
 
-Our design is also predicated on a goal of using the Parabix
-framework to achieve achieving a 50\% to 100\% improvement
-in the parsing engine itself.   Our best case scenario is
+Our design is predicated on a goal of using the Parabix
+framework to achieve a 50\% to 100\% improvement in the parsing engine itself.
+The best case scenario assumes
 a 100\% improvement of the \PS{} for the design point in which
 XML parsing dominates at 67\% of the total application cost.
-In this case, single-threaded \icXML{} should achieve a 50\% speedup
+In this case, single-threaded \icXML{} should achieve a 1.5x speedup over Xerces
 so that the total application cost reduces to 67\% of the original. 
-However, with our two-stage pipeline model, our ideal scenario
-gives us two well-balanced threads each performing about 33\% of the
-original work.   In this case, Amdahl's law predicts that
-we could expect up to a 3X speedup, at best.
+However, in \icXMLp{}, the ideal scenario gives us two well-balanced threads,
+each performing about 33\% of the original work.
+In this case, Amdahl's law predicts that we could expect up to a 3x speedup at best.
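The projected figures follow directly from the stated cost fractions. As a worked sketch (normalizing the original Xerces execution time to 1, and using only the percentages given above):

```latex
% Design point: parsing 67\% of total cost, application processing 33\%.
% Single-threaded, with a 100\% (2x) improvement of the Parser Stage:
\[
  t_{\mathrm{single}} = \frac{0.67}{2} + 0.33 = 0.665
  \quad\Rightarrow\quad
  \mathrm{speedup} = \frac{1}{0.665} \approx 1.5\times
\]
% Two-stage pipeline: throughput is limited by the slower of the two
% well-balanced threads, each doing about a third of the original work:
\[
  t_{\mathrm{pipeline}} = \max(0.335,\, 0.33) = 0.335
  \quad\Rightarrow\quad
  \mathrm{speedup} = \frac{1}{0.335} \approx 3\times
\]
```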
 
 At the other extreme of our design range, we consider an application
-in which core parsing cost is 40\%.   Assuming the 2X speedup of
+in which core parsing cost is 40\%.   Assuming the 2x speedup of
 the \PS{} over the corresponding Xerces core, single-threaded
 \icXML{} delivers a 25\% speedup.   However, the most significant