Dec 13, 2011, 5:37:36 PM

Minor fixes; figure placement

1 edited


  • docs/HPCA2012/final_ieee/11-conclusions.tex

    r1774 r1775  
    1010% Future research
    12 In this paper we presented Parabix, a software runtime framework for
     12This paper presents Parabix as a software runtime framework for
    1313exploiting SIMD data units found on commodity processors for text
    1414processing.  The Parabix framework allows programmers to focus on exposing the
    1616abstract SIMD machine without worrying about or having to change code
    1717to handle processor specifics (e.g., 128-bit SIMD SSE vs 256-bit SIMD
    18 on AVX). We applied Parabix technology to a widely deployed
    19 application, XML parsing and demonstrate the efficiency gains that can
     18on AVX). Parabix technology was applied to XML parsing
     19to demonstrate the efficiency gains that can
    2020be obtained on commodity processors. Compared to the conventional XML
    21 parsers, Expat and Xerces, we achieve 2$\times$---7$\times$
     21parsers, Expat and Xerces, a 2$\times$---7$\times$
    2222improvement in performance and average 4$\times$ improvement in
    23 energy. We achieve high compute efficiency with an overall 9$\times$---15$\times$
    24 reduction in branches, 7$\times$---15$\times$ reduction in branch mispredictions,
    25 % ?\times$ reduction in LLC misses, and increase in data parallelism
    26 and process up to 128 characters with a single operation. We used the
    27 Parabix framework and XML parsers to study the features of the new 256-bit
    28 AVX extension in Intel processors. We find that while the move to
    29 3-operand instructions deliver significant benefit the wider
    30 operations in some cases have higher overheads compared to the
    31 existing 128-bit SSE operations. We also compare Intel's SIMD
    32 extensions against the ARM \NEON{}. Note that Parabix allowed us to
     23energy was achieved. Furthermore, computational efficiency was
     24greatly increased, with an overall 9$\times$---15$\times$
     25reduction in branches and 7$\times$---15$\times$ reduction in branch mispredictions.
     27The Parabix framework and XML parsers were also used to study the
     28features of the new 256-bit AVX extension in Intel processors.  While the move to
     293-operand instructions delivers significant benefits, the
     30advantage of loads and bitwise logic with 256 bits at a time was
     31negated by the need to convert to 128-bit SIMD registers for
     32integer operations.  We expect this will be remedied with AVX2.
     33Intel's SIMD
     34extensions were also compared with the ARM \NEON{}. Note that Parabix allowed us to
    3335perform these studies without having to change the application source.
    34 Finally, we parallelized the Parabix XML parser to take advantage of
    35 the SIMD units in every core on the chip. We demonstrate that the
     36Finally, the Parabix XML parser was parallelized
     37to take advantage of the SIMD units in every core on the chip, demonstrating that the
    3638benefits of thread-level-parallelism are complementary to the
    37 fine-grain parallelism we exploit; parallelized Parabix achieves a
     39fine-grain parallelism we exploit.   In this study, our parallelized Parabix achieves a
    3840further 2$\times$ improvement in performance.