Changeset 1650

Nov 2, 2011, 8:26:45 PM

Minor edit. Prefer runtime over run-time.

4 edited


  • docs/HPCA2012/03b-research.tex

    r1639 r1650  
    47 47  modules are then compiled to low-level C/C++ code using our Pablo
    48 48  compiler.  This code is then linked in with the general Transposition
    49     code available in the Parabix run-time library, as well as the
       49  code available in the Parabix runtime library, as well as the
    50 50  hand-written postprocessing code that completes the well-formed
  • docs/HPCA2012/06-scalability.tex

    r1421 r1650  
    81 81  superscalar microprocessor. It includes a 32kB L1 data cache and a
    82 82  512kB L2 shared cache.  Migration of Parabix-XML to the Android platform
    83     only required developing a Parabix run-time library for ARM \NEON{}.
       83  only required developing a Parabix runtime library for ARM \NEON{}.
    84 84  The majority of the runtime functionality was ported
    85 85  directly. However, a small subset of key SIMD instructions (e.g., bit
  • docs/HPCA2012/07-avx.tex

    r1421 r1650  
     3  3  In this section, we discuss the scalability and performance advantages
     4  4  of our 256-bit AVX (Advanced Vector Extensions) Parabix-XML port.  The
     5     Parabix run-time libraries originally targeted the 128-bit SSE2 SIMD
        5  Parabix runtime libraries originally targeted the 128-bit SSE2 SIMD
     6  6  technology, available on all modern 64-bit Intel and AMD processors.
     7  7  It was recently been ported to AVX, which is commercially
    43 43  to take advantage of the 3-operand form of AVX instructions
    44 44  while retaining a uniform 128-bit SIMD processing width.  The second
    45     involved rewriting the Parabix run-time library to
       45  involved rewriting the Parabix runtime library to
    46 46  leverage the 256-bit AVX instructions wherever possible and to simulate
    47 47  the remaining operations using pairs of 128-bit operations. Figure